14.02.2024 | Blog
Easy Research in the Digital Reading Room
The first stage of the Digitaler Lesesaal of the Bundesarchiv (digital reading room of the German Federal Archives) was launched recently. This milestone provides the perfect opportunity to reflect on the use of search engine technology in the context of archive material. As a historian, I hold this Federal Archives project particularly close to my heart, and I am delighted to see that the number of users is steadily increasing.
What exactly is a digital reading room?
In general, a digital reading room either makes digital copies of archive material freely accessible on the Internet or presents archive records in digital form to a specific user inside the archive building. In short, it is a web-based research tool for the archive.
Researching historical data is complex
Researching historical data can be highly complex and requires more than just a full-text search. After all, the data is extremely heterogeneous: files, maps, images, posters, films, and sound recordings.
The Bundesarchiv was aware of this challenge and accordingly selected the search technology following comparative tests. In the decisive benchmark, the iFinder search software delivered 25 per cent more relevant hits in the hit list than the current system.
What makes the search process so challenging?
Linguistically speaking, the German language is much more complex than English. Search engines usually build a full-text index and use so-called stemmers so that singular and plural forms are found together. However, this procedure does not always help, especially with irregular inflections and multi-word terms. Here, additional linguistic procedures are necessary, such as lemmatization and compound decomposition. Extended lexicons and AI-based word embeddings also come into play to map semantic similarities.
The iFinder finds suitable hits without the user having to think about the historical meanings of specific terms. For example, when searching for the “Pariser Verträge” (Paris Treaties), linguistic pre-processing also makes it possible to find newsreels from the Nazi era, during which the official propaganda referred to the treaties pejoratively as the “Pariser Schandverträgen” (Paris Treaties of Shame) or the “Pariser Kriegsverträgen” (Paris War Treaties).
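To make the idea more tangible, here is a minimal Python sketch of dictionary-based compound decomposition combined with a lookup of base forms. It is not how the iFinder works internally. The tiny lexicon, the function names, and the sample sentences are invented purely for illustration, and a real system would rely on a full morphological lexicon and proper tokenization.

```python
# Hypothetical toy lexicon: surface forms mapped to their base form (lemma).
# A real system would use a full morphological lexicon instead.
LEMMAS = {
    "pariser": "pariser",
    "vertrag": "vertrag",
    "verträge": "vertrag",
    "verträgen": "vertrag",
    "schande": "schande",
    "schand": "schande",  # shortened form used inside compounds
    "krieg": "krieg",
    "kriegs": "krieg",    # "s" is a linking element between compound parts
}


def decompose(word):
    """Split a word into lemmas of known parts, or return None if impossible."""
    word = word.lower()
    if word in LEMMAS:
        return [LEMMAS[word]]
    # Try every split point, longest head first, and recurse on the rest.
    for i in range(len(word) - 1, 0, -1):
        head, tail = word[:i], word[i:]
        if head in LEMMAS:
            rest = decompose(tail)
            if rest is not None:
                return [LEMMAS[head]] + rest
    return None


def index_terms(text):
    """Normalized terms for a text; unknown words are kept in lower case."""
    terms = set()
    for token in text.split():  # naive whitespace tokenization, no punctuation handling
        parts = decompose(token)
        terms.update(parts if parts is not None else [token.lower()])
    return terms


def matches(query, document):
    """True if every normalized query term also occurs in the document."""
    return index_terms(query) <= index_terms(document)


print(decompose("Schandverträgen"))  # ['schande', 'vertrag']
print(matches("Pariser Verträge", "Wochenschau zu den Pariser Kriegsverträgen"))  # True
```

Because the query and the document are reduced to the same base forms, the plain query matches the pejorative compound. The price is a broader hit list, which is why such procedures are usually combined with ranking.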
User-friendliness is the be-all and end-all - especially with complex data
The right strategy when building the search index and at the time of the search query is therefore crucial for a good hit list. Hits should be complete and comprehensible without the hit list being diluted by irrelevant results. The iFinder already has all these capabilities: it is a complete product with a fully functional front end that is accessible and responsive.
The presentation, or user experience, is another crucial aspect of the digital reading room. Users will only come back to a tool they enjoy using. What sounds so simple becomes challenging when the data stocks are as large, complex, and heterogeneous as those of archive records.
Heterogeneous databases? Intelligent search in metadata delivers suitable hits
A historical file, for example, has completely different metadata than a historical film poster. Nevertheless, when researching "Fritz Lang", the director of the legendary 1927 film "Metropolis", a user would like to find letters and other correspondence, files, film posters, his films, and photos in a single hit list.
If that user then filters the search by date to narrow down the hits, things become difficult. Which date should be used for the different types of archive records? What should happen if a record doesn’t have a specific date but rather a time period, as is often the case with film works? Without an intelligent filter layer, the digital reading room is of little practical use. For example, one of the iFinder modules that we always like to recommend, the so-called "knowledge maps", is used in the advanced search. By searching across the many different metadata fields, users can navigate very specifically through the database.
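One common way to handle the date question is to treat every record as a time interval and to test whether it overlaps the period the user selected. The following Python sketch shows this under stated assumptions: the `Record` class, its field names, and the sample records are hypothetical and do not reflect the actual data model of the digital reading room.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:
    """Hypothetical common view on heterogeneous archive records."""
    title: str
    kind: str                    # e.g. "file", "poster", "film"
    start: date                  # earliest date associated with the record
    end: Optional[date] = None   # None means a single known date

    def interval(self):
        """Return (start, end); a single date becomes a one-day interval."""
        return self.start, self.start if self.end is None else self.end


def in_period(record, period_start, period_end):
    """True if the record's interval overlaps the requested period."""
    start, end = record.interval()
    return start <= period_end and end >= period_start


# Invented sample data, for illustration only.
records = [
    Record("Letter", "correspondence", date(1926, 3, 1)),
    Record("Film production file", "file", date(1925, 5, 1), date(1927, 1, 10)),
    Record("Poster", "poster", date(1931, 1, 1)),
]

hits = [r.title for r in records
        if in_period(r, date(1927, 1, 1), date(1927, 12, 31))]
print(hits)  # ['Film production file']
```

With an overlap test, a file that runs over several years is still found when the user filters for a single year, while records with one exact date behave as expected.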
If you ask an AI like Perplexity or ChatGPT today what needs to be considered when implementing a digital reading room project, you will get an immediate answer:
"Accessibility and user experience: the Digital Reading Room should be designed to be user-friendly to provide easy and intuitive access to digital content. This includes aspects such as search functions, navigation, and the presentation of digital materials."
The Digital Reading Room is therefore also about centralizing the knowledge of an archive, with a particular focus on the presentation of digitized data. In the case of the Bundesarchiv, this involves a great deal of information, which is also subject to very complex authorizations, all of which are defined in the “Bundesarchivgesetz” (BArchG, Federal Archives Act).
With the launch of the first stage of the digital reading room of the German Federal Archives, my absolute favorite project has been successfully implemented. I am looking forward to the challenging tasks of the next expansion stages, in which we will integrate further types of archive records in cooperation with the Federal Archives.
Conclusion
Overall, the use of search engine technology in the context of archive records is a complex challenge, but one that can be mastered with the right strategy and technology. The Federal Archives' Digital Reading Room is an important step in this direction and will certainly help to make historical data more accessible to researchers and interested parties.
You are welcome to try out the new digital reading room for yourself.
https://digitaler-lesesaal.bundesarchiv.de/en
Look out for the digital copies of historical film works offered here, which you can easily select using the filter on the left-hand side.
A look into the future - from the digital reading room to the virtual reading room
We are happy to contribute our accumulated knowledge of the special requirements of archives to other projects. I am also extremely excited about the cooperation that has been initiated between the Laboratory for Educational Media at the University of the Bundeswehr Munich and IntraFind. Together, we are developing the idea of the digital reading room further and exploring what a virtual reading room based on virtual reality (VR) technology could look like.