Advancing FronTier Research In the Arts and hUManities (ATRIUM) will exploit and strengthen complementarities between four leading European infrastructures (DARIAH, ARIADNE, CLARIN and OPERAS) in order to provide vastly improved access to a rich portfolio of state-of-the-art services available to researchers across countries, languages, domains and media. It builds on the shared understanding and interoperability principles established in the SSHOC cluster project and other previous collaborations. Arts and Humanities is a very diverse field, covering a range of disciplines and communities of practice that have different epistemological and methodological foundations: an archaeologist and an art historian studying a Mycenaean fresco will have distinct goals and use different approaches to describing their objects of research. A literary scholar and a linguist will come to a textual corpus with radically different senses of what a corpus is and what questions can be asked of it. Yet research infrastructures in the Arts and Humanities domain must cater to a very wide range of stakeholders and offer services that cut across discipline-specific boundaries.
ATRIUM will tackle this heterogeneity within the Arts and Humanities by going deep and wide at the same time: on the one hand, ATRIUM will make a groundbreaking contribution to the consolidation and expansion of services, including data services, specifically in the field of archaeology, while, on the other hand, facilitating access to a wide array of essential text, image and sound-based services that benefit a number of other disciplines within the Arts and Humanities, and cover all phases of the research data lifecycle (creating, processing, analysing, preserving, providing access to and reusing).
The role of the ARIADNE RI
One of the aims of ATRIUM is to enable the implementation of customisable workflows that make use of the services in the ATRIUM portfolio and of the improved interoperability between these services delivered in the project. These workflows, each typical of a different data type, will be largely based on archaeological data, since this is the most varied within the humanities domain, but all workflows will have much broader, multidisciplinary implications. The focus is on five main data types, varying in ambition and complexity from the more traditional to the innovative:
- Text-based data has always been at the heart of Arts and Humanities research. In Archaeology, there is an untapped wealth of “grey literature” (reports on primary fieldwork), which often languishes in museums and archives, meaning that academic research is often dated and lacks synthesis of new data. To make these data FAIR, they must first of all be findable and accessible. Initiatives such as ARIADNE now make some 300,000 such unpublished reports available online. Optical Character Recognition (OCR) can be a challenge, and this is also an issue with archaeological field notes, which are increasingly deposited in repositories. However, the major obstacle to findability is indexing these publications and reports, which would be a huge task to undertake manually. That is why ATRIUM will pay special attention to applying, enhancing and customising the techniques of keyword extraction and Named Entity Recognition (NER) in our workflows. Interoperability will be achieved by linking the distributed national repositories and mapping keywords to controlled vocabularies, such as the Getty Art & Architecture Thesaurus.
- Image-based data is also widely used in the Arts and Humanities. In Archaeology, it often concerns archaeological sites and monuments, fieldwork recording, and individual artefacts. There are vast analogue archives of antiquarian photographs of monuments and objects throughout Europe. ATRIUM will build on EU-funded research projects such as ArchAIDE and adapt general-purpose image recognition models such as ResNet for a limited exploration of high-level artefact recognition.
- 3D information is increasingly collected in field recording, including standing building surveys. It is also important for recording complete morphological information about artefacts. 3DHOP (3D Heritage Online Presenter) is an established open source framework for the creation of web-based visualisation tools. The 3DHOP tool is well suited to cope with the diverse datasets that are of interest in the project, and can be used to create custom interaction schemes. Applications have so far focussed on individual artefacts, but its approach is scalable and we will explore its use for larger structures, including standing buildings. Here, the relatively recent application of Building Information Modelling (BIM) in the heritage sector (HBIM) highlights the benefits of a holistic approach that understands building assets within, among other aspects, their surrounding environment, in order to achieve an optimal management strategy.
- Sound-based information is a data type of increasing interest and importance in the Arts and Humanities as a prerequisite for the creation of spoken corpora, which are essential for both linguistics and oral history. In Archaeology, sound recordings are sometimes used in reflexive observations during fieldwork. However, these recordings may be of low sound quality, may include non-native English speakers, and are more often than not unindexed. Our workflows and demonstrators will address these issues using audio data recorded on archaeological sites, feeding the text of automatically transcribed recordings back into the keyword extraction workflows.
- 2D geospatial information has always been of huge importance in Archaeology, which was an early adopter of maps and then, with the advent of computer technologies, of GIS-based approaches to study distributions of artefacts and relationships between human beings and their natural and cultural environments. ARIADNE has demonstrated the benefit of using map-based interfaces as a key means of searching and browsing archaeological datasets. In turn, the Pelagios network has demonstrated the benefit of using the concept of place as a mechanism to enable not only geo-referencing but also interlinking of data drawn from multiple Arts and Humanities disciplines. ATRIUM will investigate how the concept of place can be used to reach new audiences, with a particular focus on the creative industries and tourism. Our demonstrators will adopt the lightweight Pelagios Linked Open Data method and tools for semantic geo-annotation.
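As an illustration of the indexing step described for text-based data, mapping extracted keywords to a controlled vocabulary might look roughly like the minimal Python sketch below. It is not the ATRIUM implementation: the function names, the `ex:` concept identifiers and the three-entry vocabulary are hypothetical placeholders (they are not real Getty identifiers), and the matching is a simple normalised exact lookup rather than any particular NER or keyword extraction technique.

```python
import re
import unicodedata

# Hypothetical mini-vocabulary for illustration only. The "ex:" identifiers
# and label lists are placeholders, not real thesaurus entries.
VOCABULARY = {
    "ex:0001": ["pottery", "ceramics", "ceramic ware"],
    "ex:0002": ["fresco", "frescoes", "wall painting"],
    "ex:0003": ["hillfort", "hill fort", "hill-fort"],
}


def normalise(label):
    """Lower-case, strip diacritics and collapse punctuation to spaces."""
    text = unicodedata.normalize("NFKD", label)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def build_index(vocabulary):
    """Build a lookup from normalised label to concept identifier."""
    index = {}
    for concept_id, labels in vocabulary.items():
        for label in labels:
            index[normalise(label)] = concept_id
    return index


def map_keywords(keywords, index):
    """Split keywords into those matching a concept and those that do not."""
    mapped, unmapped = {}, []
    for keyword in keywords:
        concept_id = index.get(normalise(keyword))
        if concept_id is None:
            unmapped.append(keyword)
        else:
            mapped[keyword] = concept_id
    return mapped, unmapped


INDEX = build_index(VOCABULARY)
```

Calling `map_keywords(["Hill-Fort", "Frescoes", "amphora"], INDEX)` returns the two matched keywords with their placeholder identifiers and leaves "amphora" in the unmapped list, which in a real workflow would be a candidate for manual review or fuzzy matching.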
All developed workflows and demonstrators will be published in the SSH Open Marketplace, where they can be described in detail and contextualised with related tools, services and resources.
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 101132163