Enriching Image Collections with Iconclass: Can It Be Automated?
A Datalab Case Study
Ein junges Mädchen im Arbeitsanzuge
Few would take issue with the idea that interoperability benefits from metadata standardisation. Few will also deny that complying with a standardised vocabulary makes it easier to search across multiple databases. But what if you would like to adopt a standard such as Iconclass, but you are already using a locally developed vocabulary? Is it then possible to automatically add an extra layer of Iconclass concepts to existing metadata? Or, if it cannot be automated, can artificial intelligence help us to speed up the process and achieve a significant reduction in costs?
Those are the questions we have asked ourselves. As there are more online catalogues that use their own, local vocabulary than use Iconclass, positive answers could have important consequences. It could mean that the acceptance of Iconclass as a standard will further increase. However, these questions cannot be answered “in theory”; answers can only be found in a case study, which is what we are presenting here.
Badisches Landesmuseum (BLM) Karlsruhe
The catalogue we use for our case study is that of Badisches Landesmuseum Karlsruhe. The reasons to choose this database are mainly practical:
The museum is already involved in the Datalab project with institutions that are using Iconclass for subject access to their collection,
it is easy to access the museum’s keyword dataset and use the keywords for searches in the website’s frontend,
the complete list of keywords at the centre of this exercise is moderate in size – in all, some 7,000 unique keywords,
there is an interest in adding an Iconclass layer to the content information the museum wants to provide.
Why add Iconclass?
An obvious question that needs to be answered first is whether an additional layer of Iconclass concepts would really improve access to an online collection of images, e.g. from a museum or a library. Would it merely help staff members? Would it also be useful for researchers? Or could it even benefit the general public? Among the many answers to that question, three are particularly relevant here:
1. Iconclass creates groups, which helps to discover new information.
2. Iconclass efficiently connects data collections that are distributed across the internet.
3. Iconclass is a community-driven open standard.
Iconclass creates groups
When we try to make sense of the subject matter of an image, our eyes and our mind alternate between zooming in and zooming out. One moment we take in a whole picture and juxtapose its iconography with images we have seen before; the next moment we focus on a particular detail and compare that with other details that are stored in our memory. Here is an example to illustrate the general idea.
The story depicted is that of Cain killing his brother Abel (Genesis 4:8). The various weapons used by Cain – club, jaw-bone, spade and axe – and the brothers’ postures and gestures are examples of details that represent various iconographic traditions.
Investigating those traditions pivots around the comparison of the images in front of us with those we retrieve from our own memory and from databases that provide iconographic information. The retrieval of images from our own memory is a process of neuropsychology unique to each one of us as individual observers. This is fascinating in its own right but not relevant to our present topic, which is the retrieval of information from the artificial memory of machines.
Retrieval from computer memory relies on textual metadata. The retrieval process may nowadays often include some form of pattern recognition, but until computers can read our minds, we rely on textual metadata to tell them what exactly we are looking for.
What Iconclass brings to the table is organisation. By systematically grouping metadata according to themes, motifs and storylines, it facilitates efficient retrieval and stimulates knowledge discovery. That is not unlike how AI’s large language models organise the flood of data scraped from the internet. Iconclass is exceptional, however, because it was originally designed for the arts and humanities, even though it also covers far more general themes. With Iconclass, we can group specific themes like the stories from the Old Testament, but also generic ones like the different methods people may use to hurt or kill each other. Because its concepts are all linked to alphanumeric codes – like labels on the drawers of a filing cabinet – all images to which we assign the descriptor
71A82 the killing of Abel: Cain slays him with a stone, a club or a jaw-bone, alternatively with a spade or another tool as weapon
can easily be grouped. The encoding will automatically store those images as part of the more general concept
71A8 the story of Cain and Abel (Genesis 4:3-17)
In this broader group, the killing of Abel will then automatically fall in line between
71A81 the sacrifice of Cain and Abel: Abel offers a lamb, Cain usually a sheaf of corn
71A83 the curse and flight of Cain.
The basic principle is illustrated in this clipping from the Iconclass browser.
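The grouping principle can be sketched in a few lines of Python. This is a minimal illustration, not part of the Iconclass toolset: it assumes plain notations without bracketed text or keys, where every shorter prefix of a notation is itself a broader concept. The function name and the object identifiers are hypothetical.

```python
def broader_notations(notation: str) -> list[str]:
    """Return the broader notations of a plain Iconclass notation.

    Assumption: the notation has no bracketed text and no key, so each
    shorter prefix stands for a broader concept. Most general comes first.
    """
    return [notation[:i] for i in range(1, len(notation))]


# Hypothetical object identifiers, each mapped to the notation assigned to it.
tagged = {"obj-a": "71A83", "obj-b": "71A81", "obj-c": "71A82"}

# 71A82 is filed under 71A8 (Cain and Abel), which is filed under 71A, and so on.
print(broader_notations("71A82"))  # ['7', '71', '71A', '71A8']

# Sorting by notation puts the killing of Abel between the sacrifice and the curse.
in_story_order = sorted(tagged, key=tagged.get)
print(in_story_order)  # ['obj-b', 'obj-c', 'obj-a']
```

Because the order is carried by the code itself, no separate table of parent and child relations is needed for this simple case.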
Grouping is not limited to a narrative or to details within a main theme. A formal motif like a gesture or a posture can be the focal point of a group in its own right. Lifting a cap, hat or crown as a gesture of salute or respect, for example, can be the criterion for gathering images in a group.
Resting the head against a hand is a posture we find in thousands of images. Discovering its meaning in a variety of contexts will be easier if we bring instances of the gesture together in a group, in this case with the label
31A25311 'caput manui innixum', head held in the hand(s)
A motif like this can be described in many different ways (and languages), and pattern recognition algorithms can easily miss it in heterogeneous art collections. Retrieving it with the help of the short code 31A25311 will therefore be an efficient alternative.
Because all Iconclass concepts are encoded with hierarchically organized notations, it is easy to cast the net wider when performing searches. A truncated notation will retrieve a larger selection of images with related features from a database. Retrieving all records that contain the value 31A253* will also automatically retrieve all records that contain the value 31A25311. The harvest will be correspondingly larger.
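A truncated search of this kind amounts to a prefix match. The following Python sketch shows the idea under simple assumptions: the record identifiers and the assignment of notations to records are invented for illustration, and a trailing asterisk is taken to mean "this notation and everything below it". It does not reflect how any particular collection database implements truncation.

```python
# Hypothetical catalogue records: record identifier -> assigned notations.
records = {
    "rec-1": ["31A25311"],
    "rec-2": ["31A253", "25I1"],
    "rec-3": ["71A82"],
}


def search_truncated(records: dict[str, list[str]], pattern: str) -> list[str]:
    """Return the ids of records holding a notation that matches the pattern.

    A pattern ending in '*' matches every notation that starts with the
    part before the asterisk; any other pattern must match exactly.
    """
    if pattern.endswith("*"):
        prefix = pattern[:-1]
        found = [rid for rid, notations in records.items()
                 if any(n.startswith(prefix) for n in notations)]
    else:
        found = [rid for rid, notations in records.items()
                 if pattern in notations]
    return sorted(found)


print(search_truncated(records, "31A25311"))  # ['rec-1']
print(search_truncated(records, "31A253*"))   # ['rec-1', 'rec-2']
```

The exact query returns one record; the truncated query returns that record plus the one tagged at the broader level.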
When the idea has sunk in that using a common set of pre-defined and hierarchically ordered concepts is an efficient way to group and retrieve images, the claim that Iconclass can connect databases will make sense. By definition, institutions using Iconclass as a standard will share codes for the retrieval of similar images – or similar image details – from their databases, even if their collections are very different in scope and focus and they use different types of software.
Two examples should suffice to demonstrate this: From medieval manuscripts to modern war reports, and from libraries to newspaper archives, we can find images of wounded soldiers being carried away from the battlefield ...
... and across the ages images were created about people helping others escape the horrors of war – in fiction and in real life ...
The simple rationale behind the use of Iconclass is that we can more efficiently retrieve images from the collective “memory of mankind” if their subject matter and their forms are archived in a standardised way.
These examples also illustrate that standardisation will emphasise the similarity of subject matter rather than the differences of detail. It goes without saying that the iconography of knights accompanying their comrades from the battlefield in a medieval manuscript differs radically from that of the prince of Orange being carried away at Waterloo, or that of Ukrainian soldiers being brought to a field hospital. But to ensure that they can all be retrieved with the same query, it helps if they are connected by a standardised label. Standardisation thus always brings with it some reduction of information about specific details.
So, images tagged with the same Iconclass concept are connected, at least virtually. However, that is not the only way Iconclass can connect data. Iconclass notations can, for instance, also be used as identifiers linked to Wikidata concepts, as shown here for the Wikidata concept “piggy back”. Once the appropriate Iconclass notation has been added to the set of identifiers linked to the Wikidata entity, the sample images of the Iconclass browser can be called up with a single click on the Wikidata page.
Iconclass is an open standard
Of course, accepting a particular standard also means complying with a particular way of looking at the world. A classification system like Iconclass is an artificial construct that comes with its own peculiarities and biases; in other words: with its own world view.
At the same time, the world of heritage collections and of the arts and humanities is not static. Therefore, a vocabulary like Iconclass – and its corresponding world view – cannot be static either. Over time, words and concepts can pick up new connotations and be interpreted in new ways. Definitions can even become unacceptable because they are seen as discriminatory or misogynist. And as Iconclass is applied more widely, it is also more widely tested for omissions and flaws.
All of this is unavoidable. All classifications and vocabulary systems share these issues. However, the simple fact that a vocabulary system is imperfect and has to change over time is not a problem. What is important is that Iconclass is transparent, that its biases can be addressed, that the system can cope with change and be adapted to an evolving world. The raw text data of Iconclass and the source code that builds the online browser are both deposited in the open access repository GitHub, where both are open for inspection and correction by the community of its users.
Iconclass and AI systems
The open character of the Iconclass ecosystem is particularly relevant as artificial intelligence systems are going through a paradigm shift, which leads to fierce debates and even to political action. Relevant here is that, like Iconclass, AI systems rely on some form of standardisation to make sense of the chaotic mass of metadata which they scrape from the internet. Thus, every AI application comes with its own set of biases.
However, in AI systems produced by large software companies, those biases are usually hidden from view, because most companies regard their algorithms for the organisation of data as their trade secret.
Interestingly then, the decision to use Iconclass can be seen as a political choice – for openness. If we cannot prevent Big Tech from re-using the information we create, we can at least try to make sure they work with a better quality of information.
2. From Why to How
Having argued why it would be a good idea to add an extra layer of Iconclass descriptors to an existing database, we now turn to the question of how that could be done, using the case of Badisches Landesmuseum as an example.
To answer this question we have to look in some detail at the two systems we aim to connect. So, what are the characteristics of the BLM keyword system and what are the properties of the Iconclass concepts to which we can connect them?
BLM keyword characteristics
Even a cursory glance will show that the BLM keywords are a rich and hybrid set. Most significant for our case study is that they mix specific and general iconographic descriptors with terms that do not really describe image contents. Here are some random samples:
Here, the cotton plant and the Presentation of Christ in the temple are both straightforward image descriptors; the interwar period, though relevant enough as a retrieval term, is not.
In the BLM dataset there are circa 7,000 keywords. On average, each keyword has been applied 16.5 times, creating a total of circa 115,000 data points. Obviously, the frequency with which individual keywords were applied varies widely. For example: 2,711 keywords only occur once; 196 keywords were applied to more than 100 objects while 8 keywords were each used for over 1,000 objects. When finding Iconclass equivalents for BLM keywords, the math is simple: start with the keywords that were applied most frequently. That is the fastest way to produce a web of Iconclass links.
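The arithmetic behind this prioritisation can be made explicit. As a check on the figures above, 7,000 keywords applied 16.5 times on average gives 7,000 × 16.5 = 115,500 data points, in line with the rounded total. The Python sketch below ranks keywords by usage and reports how many of the top-ranked ones are needed to cover a given share of all keyword applications. The usage counts are invented for illustration (only the 45 for Kanal echoes a figure mentioned in this article), and the function name is hypothetical.

```python
from itertools import accumulate


def keywords_needed(usage_counts: dict[str, int], target_share: float) -> list[str]:
    """Return the most frequently used keywords that together reach
    at least target_share of all keyword applications."""
    ranked = sorted(usage_counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(usage_counts.values())
    running_totals = accumulate(count for _, count in ranked)
    selected = []
    for (keyword, _), covered in zip(ranked, running_totals):
        selected.append(keyword)
        if covered >= target_share * total:
            break
    return selected


# Invented usage counts, for illustration only.
toy_counts = {
    "Stadtansicht": 120,
    "Kind": 60,
    "Kanal": 45,
    "Bollenhut": 20,
    "Hortfund": 4,
    "Kaiserstuhl": 1,
}

# Two of the six keywords already account for 70% of the 250 applications.
print(keywords_needed(toy_counts, 0.7))  # ['Stadtansicht', 'Kind']
```

With a skewed distribution like the one described above, a small number of links covers a large part of the catalogue, which is why working down the frequency list pays off quickly.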
In a strict sense, the BLM keywords are not organised in an actual system. There is no connection between terms that are semantically related, like Verheißung der Geburt Jesu and Geburt Jesu, or Verkehrszeichen and Verkehrsunfall. One of the reasons to connect them to the systematically ordered Iconclass equivalents is precisely to bring them into such semantic relationships – by proxy.
To assess whether and how keywords can be connected to Iconclass concepts, it is essential to know more about the way they are actually used in the BLM online catalogue.
Visual warranty is therefore the most instructive factor. For example: Without a visual check, we cannot decide whether to connect the keyword Kaiserstuhl to the Iconclass concept for throne, or to the concept for a landscape with high hills, which is what Kaiserstuhl actually is in the context of the BLM catalogue – a hilly region in south-west Baden-Württemberg.
A visual check is also warranted for potentially ambiguous keywords, e.g. for homonyms like Schiff or Schloß, and for all keywords that cannot be directly connected to pictorial content.
It goes without saying that these visual checks increase the quality of the connections, but do not speed up the process of making them.
Selecting Iconclass concepts
Establishing how a keyword is actually used in the BLM catalogue is the obvious first step towards selecting an Iconclass equivalent. The next step is to assess whether the BLM keyword can be juxtaposed with an Iconclass concept with the same or a similar meaning.
Even though this may seem straightforward for simple keywords like Brücke, Stadtansicht, Kind, Berg or Mode, it still requires some Iconclass experience to select a suitable equivalent. Brücke (bridge), for example, can mean an element in a landscape but it is also a construction that enables transport across a water course. And these are only a few of the options.
Stadtansicht (city view) retrieves over 100 hits, but they are essentially all extensions of one concept, i.e.:
25I1 Stadtansicht (allgemein); Vedute – or in English:
25I1 city-view in general; ‘veduta’
Kind (child) may not seem to be semantically complicated, but, iconographically speaking, it is extremely rich, which is reflected in the fact that the word refers to over 2,000 Iconclass themes, from the Madonna and Christ-child to Saturn devouring his children, and, of course, to children playing on a beach.
The situation is no different for Berg (mountain) or Mode (fashion). In fact, this is the situation for many, if not for most, of the keywords we find in the BLM dataset. In many cases, we have to choose to which one of multiple Iconclass concepts we can best link a BLM keyword. And if a keyword has been applied to a range of BLM objects, there is a good chance that there is more than one equivalence candidate in Iconclass.
If that is the case for many keywords, what then are the chances of automating the selection process? Without a proper understanding of the selection process, no functional requirements can be drawn up. To be able to translate this decision-making process into algorithms, we need an accurate analysis of how a human decides which Iconclass concept to link to a keyword. That is exactly why we kept track of how we selected the Iconclass concepts which are now linked to the 500 most frequently used BLM keywords. With the evidence thus collected, we are able to distinguish a few varieties of the decision-making process.
At the simple end of the decision-making spectrum we find German BLM keywords that correspond with only a few or maybe even one hit in the German version of Iconclass. A good example is the keyword Kanal, which occurs in some 45 BLM records, where it is mostly used to tag historical photographs of a canal in a city or landscape. The illustrations of the online browser show that the Iconclass concept 25H22 Kanal was usually applied to similar images across a range of collections.
Even an apparently trivial link such as the one between the BLM keyword Kanal and the Iconclass concept 25H22 Kanal could lead to interesting historical questions. The BLM keyword, for instance, also retrieves a Mesopotamian tablet with a cuneiform inscription. At first sight this looks like a database error. But it turns out that the inscription on the tablet – dated 2035 BC – speaks about hiring labourers to dig a canal. The map, from the collection of Rijksmuseum Amsterdam, was tagged with the Iconclass concept 25H22 canal. The reason is that the map shows the design for a fortified canal, the “fossa Eugeniana”, planned to connect Rheinberg with Venlo in Limburg. Its construction was ordered by the Spanish infanta Isabella Clara Eugenia. It was planned in circa 1625, but the canal was never realised.
Of course the connection here is simply that it makes us think about the political and bureaucratic process that led to the construction of a canal – but the topic stretches over a distance of more than 3,600 years. Maybe not an exciting discovery, but still, the juxtaposition of these two objects could raise some eyebrows and set some further investigation in motion; which does seem to be in line with the purpose of the Datalab.
The keyword Mahlzeit turns out to be more complex than Kanal, both semantically and iconographically. Like Kanal, Mahlzeit is also a keyword in Iconclass, but it retrieves a more varied set of concepts. It refers, for example, to concepts like ritual meals in ancient Egypt, family meals and festive banquets in general, but also to scenes of meals in the Bible and Ovid’s Metamorphoses. At first glance, therefore, the process of selecting the best equivalent is more complex.
Looking at the objects to which the BLM keyword Mahlzeit was assigned, two sets of pictures stand out. The collection contains quite a few photographs of people eating in the historical “Falkenstube” in Freiburg. It also contains several tiles made for a “Kachelofen”, the type of tiled stove of which a plain white variant can be seen in the background of the Falkenstube. These decorated tiles in the BLM collection all show the same New Testament theme, namely the wedding feast at Cana, where Jesus changed water into wine. Both concepts are present in Iconclass. For the type of meal we see in the Falkenstube photo we could use 41C42 Mittagessen, Lunch or 41C43 Abendessen but also the broader term 41C4 Mahlzeit (im Familienkreis). For the wedding meal at Cana, the BLM catalogue also uses the keyword Hochzeit zu Kana, which has 73C611 das Hochzeitsmahl in Kana (Johannes 2:1-11) as its equivalent in Iconclass.
When selecting an equivalent for the keyword Mahlzeit, we can ignore the subtle distinction between lunch and dinner, as we are aiming to tag individual objects. And the Hochzeit zu Kana will be connected to another Iconclass concept. Eventually then, choosing the best equivalent – 41C4 (family) meal – was not as difficult as anticipated. Still, it would not be easy to translate these considerations into a formal algorithm. What is easy, however, is to illustrate that a link to the more general equivalent 41C4 – as shown below – would link the BLM data to a substantial set of meal scenes.
But there is more to it: Other BLM keywords also stem from the domain covered by the Iconclass category 41C nutrition, nourishment. We find, for example, Ernährung, Ess- und Trinksitte, Geschirr, Hotelgeschirr, Trinkgefäß and Trinkkultur.
Although semantically related, they are not connected to each other in the BLM catalogue. By linking them to their Iconclass equivalents, they would be connected to a structure that could help the discovery of new information, albeit – again – by proxy.
There are at least three BLM keywords from the domain of the livestock industry: Viehwirtschaft, Viehmarkt and Viehherde. Most frequently used is Viehwirtschaft, which is used for pictures like these:
In Iconclass, Viehwirtschaft does not immediately retrieve a concept, but by simply using the word Vieh instead, we quickly find 47I21 Vieh, a narrower term for 47I2 Viehzucht, Viehhaltung, the Iconclass synonym for Viehwirtschaft. In other words: You just need a minimum of imagination to find an equivalent. Real Iconclass expertise is not required.
Some more Iconclass expertise is required to find an equivalent for the keyword Bollenhut. As defined on the German Wikipedia page, a Bollenhut is a straw hat typically worn since the early 19th century by Protestant women of three Black Forest villages, namely Gutach, Kirnbach and Hornberg-Reichenbach. Here are some examples of the pictures to which the keyword Bollenhut was assigned in the BLM catalogue:
Bollenhut is not a keyword or a concept in Iconclass. The system, however, has ample provisions to deal with items of costume and fashion. 41D221 head-gear (in the German version 41D221 Kopfbedeckung) is a concept that is easy to find, but it is more generic than the quite specific Bollenhut.
Fortunately, 41D221 head-gear is also one of many Iconclass concepts that can be extended with an alphabetic listing of items.
Even if Bollenhut is not included in the default list of specific types of head-gear, it is easy to create the concept we need as an equivalent. We just have to add the word between brackets at the end of the notation: 41D221(BOLLENHUT). In addition, we can use one of the auxiliary features of the Iconclass toolset – the “key” (+82) – to record the fact that the Bollenhut is a fashion item worn by women.
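How such an extended notation is put together can be sketched as follows. This is an illustrative Python helper, not an official Iconclass parser: it handles only the forms that appear in this article (a base notation, an optional name between brackets, an optional key), and it assumes that the key is appended after the bracketed name. The names `compose`, `parse` and `Notation` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Notation(NamedTuple):
    base: str            # e.g. "41D221"
    text: Optional[str]  # bracketed name, e.g. "BOLLENHUT"
    key: Optional[str]   # digits of the key, e.g. "82" for (+82)


# Simplified pattern: base, then optional (TEXT), then optional (+digits).
_PATTERN = re.compile(
    r"^(?P<base>\d[0-9A-Z]*)"
    r"(?:\((?P<text>[^()+][^()]*)\))?"
    r"(?:\(\+(?P<key>\d+)\))?$"
)


def compose(base: str, text: Optional[str] = None, key: Optional[str] = None) -> str:
    """Build a notation string from its parts; the name is upper-cased."""
    result = base
    if text:
        result += f"({text.upper()})"
    if key:
        result += f"(+{key})"
    return result


def parse(notation: str) -> Notation:
    """Split a notation into base, bracketed name and key."""
    match = _PATTERN.match(notation)
    if match is None:
        raise ValueError(f"not covered by this simplified pattern: {notation!r}")
    return Notation(match["base"], match["text"], match["key"])


print(compose("41D221", "Bollenhut"))        # 41D221(BOLLENHUT)
print(compose("41D221", "Bollenhut", "82"))  # 41D221(BOLLENHUT)(+82)
print(parse("41D221(BOLLENHUT)(+82)").base)  # 41D221
```

Keeping the base notation separate from the name and the key means that a search on 41D221 still finds the Bollenhut records alongside all other head-gear.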
Browsing around this section of the Iconclass schedules, we also find another concept which can usefully be linked to Bollenhut, namely 41D3 folk costume, regional costume. As there is no technical obstacle to connect a BLM keyword to multiple Iconclass concepts, this would then create an additional datapoint.
Finally, by adding this version of the Iconclass concept to the Wikidata page, another useful datapoint was created, with a direct link to the Iconclass browser.
A somewhat more developed level of Iconclass experience is required to find equivalents for the keywords Arslan Tasch and Elfenbeinrelief. Both words are used separately and in a fixed combination. Arslan Tasch is the name of an archaeological site in northern Syria. Ivory reliefs that were found there ended up in various museum collections. Here are three examples from the BLM collection.
Obviously, the keywords are not descriptors for the subject matter of these ivories. Still, they can be linked to Iconclass concepts. Iconclass section 61 historical events and situations; historical persons includes the concept 61E(...) names of cities and villages (with NAME) where Arslan Tash can be entered between brackets.
Elfenbeinrelief is a keyword to indicate the type of object. The material is Elfenbein – ivory, the sculptural technique is relief. Both aspects have an equivalent in Iconclass:
48C24(+321) piece of sculpture, reproduction of a piece of sculpture (+ relief ~ sculpture)
48C24(+652) piece of sculpture, reproduction of a piece of sculpture (+ bone, ivory ~ arts)
It is not hard to find these concepts in Iconclass, but it is helpful if you know that keywords can be combined and that “keys” can be switched on.
As with all Iconclass concepts, it is easy to narrow or broaden a search. So, a few clicks in the online browser will retrieve samples of pieces of sculpture in a variety of settings.
A keyword from the same domain is Hortfund (a hoard, defined in Wikidata as a “collection of valuable objects or artifacts”). The keyword does not provide information about the objects themselves, but about the fact that they were found – often excavated – together as, for example, a hoard of coins.
Iconclass did not yet contain this concept. The closest equivalent was 49K11 excavation ~ archaeology.
The BLM catalogue already provided a sufficient series of examples of Hortfund. But we may also expect to find examples in other collections, not just of actual objects stemming from a hoard find, but also reproduced in prints and illustrated books, as in this example from the Allard Pierson collection.
In short, it made perfect sense to add the concept to the Iconclass system.
3. Can enriching collection metadata with Iconclass be automated?
Even the handful of examples we have discussed make it clear that the process of selecting the closest Iconclass equivalent can be both quite simple and very complex. In some cases, a BLM keyword immediately produces a single hit in Iconclass. In other cases, there seems to be no hit at all, at least not at first sight. Even then, with sufficient Iconclass expertise, there is almost always an equivalent. And if there is not, it can be made; usually quite easily, as with Hortfund.
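The part of this process that a machine can plausibly take over is the first sorting step: counting how many Iconclass concepts a keyword retrieves. The Python sketch below illustrates that triage with a toy index built from German Iconclass labels quoted in this article. The plain substring match and the function name are assumptions made for the sake of the example; they do not describe how the Iconclass browser searches.

```python
# Toy index of German Iconclass labels quoted in this article.
labels = {
    "25H22": "Kanal",
    "41C4": "Mahlzeit (im Familienkreis)",
    "73C611": "das Hochzeitsmahl in Kana (Johannes 2:1-11)",
    "47I2": "Viehzucht, Viehhaltung",
    "47I21": "Vieh",
    "41D221": "Kopfbedeckung",
}


def triage(keyword: str, labels: dict[str, str]) -> tuple[str, list[str]]:
    """Sort a keyword into 'single hit', 'multiple hits' or 'no hit',
    using a case-insensitive substring match on the labels."""
    needle = keyword.casefold()
    hits = sorted(code for code, label in labels.items()
                  if needle in label.casefold())
    if len(hits) == 1:
        status = "single hit"
    elif hits:
        status = "multiple hits"
    else:
        status = "no hit"
    return status, hits


print(triage("Kanal", labels))           # ('single hit', ['25H22'])
print(triage("Vieh", labels))            # ('multiple hits', ['47I2', '47I21'])
print(triage("Viehwirtschaft", labels))  # ('no hit', [])
```

Such a triage only produces candidates. A single hit can still be the wrong one, as the Kaiserstuhl example shows, and the "multiple hits" and "no hit" groups are exactly where the human decisions described above are needed.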
Although the simplest cases may suggest otherwise, the wide range of complexity warns us that it is too soon to talk about “automating” the selection process, as this final example shows:
With the keyword Postkolonialismus we retrieve examples of – mainly – African arts and crafts. Among them is a series of objects produced by the Pende people in south-western Congo. The recent revision of the “peoples and nationalities” section of Iconclass makes it easy to find a context for the Pende people: 32B321(PENDE) (indigenous) peoples of sub-Saharan Africa (with NAME).
Post-colonialism is not yet a concept in Iconclass, and the word is quickly gathering a rich cloud of connotations in the heritage world. Even so, if it were to be added to Iconclass, it would probably be as a “child” of the concept 44B04 colonial system. Given the fact that the keyword mainly refers to objects of African origin, that aspect should also be present in the concepts we choose as equivalent.
48(+763) art (+ African art) presents itself as that equivalent. A side effect of this Datalab exercise is that attention was drawn to the original text of the concept definition. Published in 1977, it read: African tribal art.
A tiny yet significant step was to remove the adjective “tribal” from the present version of the online browser.
Whether all of these subtle considerations can be expressed in computer code is highly doubtful. Therefore, it seems a safe bet that in this domain – at least at this moment – artificial intelligence is a complement to human intelligence, not a viable alternative.
Easy access to the BLM keywords, efficient retrieval of objects from the digital collection, a quick overview of how keywords were actually used – those are the features offered by the Datalab interface.
An above-average amount of (human) Iconclass experience should do the rest – for now.
If this exercise really were to result in a web where German BLM keywords interact with systematically organized Iconclass concepts and multilingual Wikidata entities, a significant step towards further automation could be taken. But the paradox of efficient automation in the humanities is that it works better after a lot of time and effort has been invested in the manual optimization of data.
Voorschoten 17 December 2023