by Eva Baaren (Sound & Vision)
(co-authors: Christian Olesen, Liliana Melgar, Norah Karrouche, Kaspar Beelen, Willem Melder, Roeland Ordelman, Julia Noordegraaf)

Abstract: The development of research tools for digital (AV) collections creates new opportunities for academics to research public debates about topics such as racism and gender equality. In order for scholars to critically select, use and reflect on their sources in digital environments, the metadata of digital collections need to be as transparent and rich as possible. However, while users and policymakers tend to think that publishing metadata is easy, archives often experience technological and institutional difficulties. Based on our experiences with building the CLARIAH Media Studies research infrastructure, we argue that archives can overcome these difficulties by moving away from the notions that (1) metadata can only be used if they are complete and without error, (2) archivists are the only ones who can understand and generate metadata fields and (3) the scholarly use of digital tools and collections should be self-explanatory and easy. Instead, archives should publish their metadata despite their imperfect nature and share knowledge about their data model and its history. Also, they can benefit from engaging in projects aimed at improving data transparency in digital infrastructures. By testing the actual use and challenges of working with raw data, they can truly improve the data quality and modes of access for all users.

Eva Baaren is a Media and Innovation Researcher and works as a Liaison for researchers in humanities and social sciences at the Netherlands Institute for Sound and Vision. She currently works on the CLARIAH Media Studies project (part of the CLARIAH national infrastructure funded by NWO), where she focuses on strategies to improve the access, quality and use of media collections, including their metadata.

Christian Olesen is Principal Investigator in the project MIMEHIST: Annotating EYE’s Jean Desmet Collection (2017-2018), which embeds the Desmet Collection in the Dutch digital research infrastructure CLARIAH. He is also Postdoctoral Researcher in the project The Sensory Moving Image Archive (2017-2019), which enables artistic researchers to source digitised audio-visual collections.

Liliana Melgar is an Information Scientist currently working as Postdoctoral Researcher at the University of Amsterdam. Liliana’s main responsibility in CLARIAH is to collect researchers’ needs and requirements, systematize use cases and conduct user studies with media scholars.

Norah Karrouche is a Historian and lectures at the Vrije Universiteit Amsterdam. She participates in CLARIAH as a Co-developer and Researcher (Erasmus Universiteit Rotterdam) with special emphasis on digital oral history.

Willem Melder is part of the ICT-development team of the CLARIAH infrastructure at the Netherlands Institute for Sound and Vision. Willem has a background in artificial intelligence and speech technology and focuses on automatic processing of audio-visual data and data interoperability.

Kaspar Beelen is an Assistant Professor at the University of Amsterdam, where he is part of the CLARIAH Media Studies development team. His research interests include parliamentary culture, political representation, party formation and ideology. Kaspar focuses on quantitative text analysis and its application to the study of historical and political phenomena.

Roeland Ordelman is Senior Researcher Multimedia Retrieval at the University of Twente, Manager R&D at the Netherlands Institute for Sound and Vision and founder of a start-up company for audio search technology, Cross Media Interaction (X-MI). His aim is to enhance the exploitability of the large volumes of audio-visual content becoming available for various types of user groups. He currently leads the development team of the CLARIAH Media Suite.

Julia Noordegraaf is Professor of Digital Heritage in the Department of Media Studies at the University of Amsterdam and Director of the Amsterdam Centre for Cultural Heritage and Identity (ACHI). Her research focuses on the preservation and reuse of audio-visual and digital heritage. She is a former fellow of the Netherlands Institute for Advanced Study in the Humanities and Social Sciences and acts as board member for Media Studies in CLARIAH. Noordegraaf also leads research projects on the conservation of digital art (in the Horizon 2020 Marie Curie ITN project NACCA) and on the reuse of digital heritage in data-driven historical research (in the eHumanities project CREATE and the Amsterdam Data Science Research project Perspectives on Data Quality).