The Relatives of Karl Friedrich von Baden

The potential and challenges of clustering museum data

Making large museum collections visually accessible is a major challenge. Computer vision and clustering techniques are promising in this regard, as they help users narrow down collections in a playful way.

The Badisches Landesmuseum (Baden State Museum) carried out several experiments together with Lukáš Pilka, who used PixPlot to generate clusters from various collections. The first experiment focused on archaeological objects from the museum's own collection. In a second experiment, Pilka created a portrait gallery with contents from the Digital Curator database: he used computer vision to select male portraits, added the portrait of Karl Friedrich (1728–1811), Grand Duke of Baden, and then compared the images by formal similarity using PixPlot.
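To make the idea of comparing portraits by formal similarity more concrete, here is a minimal sketch. It does not reproduce PixPlot's internals, which this text does not describe. It assumes only that each image has already been reduced to a numeric feature vector by some computer vision model. The vectors, the identifiers such as `portrait_a`, and the helper `rank_by_similarity` are hypothetical and invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_id, features):
    """Return (image_id, score) pairs for all other images, most similar first."""
    query = features[query_id]
    scores = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in features.items()
        if other_id != query_id
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Assumption: toy three-dimensional vectors standing in for real image
# embeddings, which would normally have hundreds or thousands of dimensions.
features = {
    "karl_friedrich": [0.9, 0.1, 0.3],
    "portrait_a": [0.8, 0.2, 0.4],
    "portrait_b": [0.1, 0.9, 0.2],
    "portrait_c": [0.7, 0.0, 0.5],
}

ranking = rank_by_similarity("karl_friedrich", features)
for image_id, score in ranking:
    print(f"{image_id}: {score:.3f}")
```

A ranking of this kind only reflects visual resemblance in the feature space. Whether the nearest portraits are related in any historical sense still has to be judged by someone who knows the collection.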

The experiment highlighted both the potential and the limitations of such an approach. First, refining the clusters depends on domain expertise: in this case, subject experts have to verify whether the clusters produced by computer vision represent meaningful connections. The quality of the clusters can be improved by labeling and fine-tuning, but this process is quite resource-intensive.

Secondly, the PixPlot system is limited in its functionality and does not offer technical options for a more in-depth exploration of the data. Sorting and filtering are therefore an engaging way to gain a first insight into a collection, but going beyond that would have required domain experts willing to take part in a training process for labeling data.

This example shows how promising individual methods such as clustering and machine vision are, but making them usable in a sustainable way requires further effort and specific domain training. The approach might work better in a different context. For the xCurator solution, a different tool was chosen: Navigu from Pixolution, which combines visual search with clustering of the datasets and thus offers a new kind of access to the collections. PixPlot nevertheless has high potential for further collection research and is highly recommended for this purpose.