Tips and guidance for searching the Catalogue

This section comes from the Portal Guide (PDF) which explains how to use the Catalogue and services in depth and can be viewed/downloaded here.

The metadata in the Catalogue originates from many different sources, with varying levels of information and different data models. In the first instance, these have been mapped to the AO-CAT, which caters for most archaeological domains. However, some specialist domains require additional concepts and terminology to be mapped successfully (otherwise a lot of relevant information would be lost), and two approaches have been used to address this.

The first is the development of Application Profiles: extensions of the (CIDOC CRM based) AO-CAT data model that enable additional data fields to be added and mapped to the Catalogue. Examples of specialist Application Profiles developed during the ARIADNEplus project cover the domains of “heritage science” (scientific datasets such as aDNA and radiocarbon dating), “inscriptions, marks and graffiti” and “burials and mortuary data”. The alternative approach, which is more appropriate when a domain can largely be mapped to the AO-CAT but also has its own distinct terminology, is to adopt an additional Ontology to extend the vocabulary used for metadata descriptions.

In many cases, it is also possible to map the subject matter to the Getty AAT but, as this was developed as a more general thesaurus, it doesn’t always contain the level of detail used in archaeology. To mitigate this, the original subject is also included in the search on all fields in the Catalogue. When the Getty AAT filter is used, the results are hierarchical, i.e. they match the specified term(s) and all sub-terms. Consequently, it may be better to start with general terms and then narrow these down rather than starting with a very specific search term.
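
The hierarchical behaviour of the Getty AAT filter can be illustrated with a small sketch. This is not the Portal's implementation: the hierarchy below is a hand-made fragment, only the relationship between “Later western world coins” and “Coins (money)” is taken from this guide, and the remaining names (including the function names and the record layout) are assumptions made for illustration.

```python
# Hypothetical fragment of a subject hierarchy, stored as child -> parent.
# Only the first relationship comes from this guide; the rest is invented.
PARENT = {
    "later western world coins": "coins (money)",
    "coins (money)": "hypothetical broader term",
    "brooches": "hypothetical broader term",
}


def is_term_or_subterm(term, selected):
    """Return True if `term` is `selected` itself or lies beneath it."""
    current = term
    while current is not None:
        if current == selected:
            return True
        current = PARENT.get(current)  # None once the top is reached
    return False


def filter_by_subject(records, selected):
    """Keep records with at least one subject at or below `selected`."""
    return [
        r for r in records
        if any(is_term_or_subterm(s, selected) for s in r["subjects"])
    ]


records = [
    {"id": 1, "subjects": ["later western world coins"]},
    {"id": 2, "subjects": ["coins (money)"]},
    {"id": 3, "subjects": ["brooches"]},
]

general = filter_by_subject(records, "coins (money)")
specific = filter_by_subject(records, "later western world coins")
```

In this sketch the general term matches both coin records, while the specific term matches only one of them, which is why starting general and narrowing down tends to miss less.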

The other major consideration is the supplied metadata which may vary from publisher to publisher depending on how their original source data is structured, the meaning attributed to the terminology used, how much metadata is available for mapping to the AO-CAT data model (and the extent to which the metadata can be cleaned and enhanced), etc. To give some examples of how this can affect the data in the Catalogue (and ways of mitigating the differences):

  • A text search returns the most resources, since a match may be found in any of several metadata fields. Not all of these will necessarily be relevant, since a description can mention the search term merely as a feature found on an object or on a site. Filters such as the Resource type and Getty AAT Subjects should be used for more targeted searches.
  • The British Museum has supplied over 900,000 resources from the Portable Antiquities Scheme Database, of which around half relate to coins and the other half to other types of artefacts which commonly make up ‘finds’. The coins all have Resource type “Coin” and the artefacts “Artefacts”. On the other hand, the DIME database (published by Aarhus University), which also records finds by the public in Denmark, has allocated Resource type “Artefacts” to all its resources, including coins. This is not wrong; it is just another way of representing the data. To select all the resources from DIME that are coins, it is necessary to use “coins” as an initial search term and then filter by Publisher (Aarhus University), as the original subject (usually dime.find.coin) will indicate the type of artefact. The Getty AAT Subject has also been used with some granularity, so five terms are listed, of which the most common is “Later western world coins” (a sub-term of Coins (money)).
  • Be aware that not all the resources (approx. 8%) have geographical co-ordinates supplied in their metadata. This means that in most cases, when the map filter is applied, the number of resources found will automatically reduce, as all those missing location co-ordinates will be excluded. Where the location of a resource is considered sensitive, a bounding box will be shown containing a random ‘pin’ to indicate the approximate area. If there is a nearby resource, one pin will be shown – this can be clicked on to show the corresponding record. Where there is more than one resource nearby, these are shown in series, i.e. one at a time, with each successive resource in the same location.
  • The Catalogue does not provide searching by (modern-day) country as this is fairly meaningless in the context of archaeology (even more so for marine archaeology!). Recorded place names are included in the metadata. More usefully, the Map allows the selection of areas of interest, including defining an area by drawing a polygon, so that borders can be followed or ignored as required. However, it can be useful to restrict a search to a country, particularly for islands or where national boundaries have remained fairly unchanged. One way to do this is through the Publisher filter, as many of these providers are the national repository for their country's archaeological outputs. In many cases, Publishers have provided a Collection record which summarises the datasets provided. Alternatively, the When filter can also include regions (within the PeriodO definitions) which may be used for defining areas of interest.

Two approaches have been used to denote time periods – absolute start and end dates and period names. There are some obvious issues with both methods:

  1. If absolute dates are used across more than one country, it is likely that resources will be found that are not of interest, as the date range can overlap with other periods or the dates may only be approximate.
  2. PeriodO terms are used to describe archaeological time periods; it is well known that a defined period (e.g. “Bronze age”) in one country may cover a different time span in another country. However, the filter has been designed so that one or more defined periods may be used with the further option of restricting the named period to a specific region.
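
The difference between the two approaches can be shown with a minimal sketch. The records, regions and dates below are invented for illustration (negative years stand for BCE), and the function names are assumptions; this is not how the Portal itself is implemented.

```python
# Invented example resources; "Region A" and "Region B" are placeholders.
RESOURCES = [
    {"id": "a", "period": "Bronze Age", "region": "Region A", "start": -2500, "end": -800},
    {"id": "b", "period": "Bronze Age", "region": "Region B", "start": -1700, "end": -500},
    {"id": "c", "period": "Iron Age", "region": "Region A", "start": -800, "end": 43},
]


def by_dates(resources, query_start, query_end):
    """Absolute dates: keep anything whose date range overlaps the query."""
    return [
        r["id"] for r in resources
        if r["start"] <= query_end and query_start <= r["end"]
    ]


def by_period(resources, period, region=None):
    """Named period, optionally restricted to a single region."""
    return [
        r["id"] for r in resources
        if r["period"] == period and (region is None or r["region"] == region)
    ]


date_hits = by_dates(RESOURCES, -900, -700)
period_hits = by_period(RESOURCES, "Bronze Age")
regional_hits = by_period(RESOURCES, "Bronze Age", region="Region A")
```

Here the absolute date range picks up all three resources, including the Iron Age one, because the ranges overlap. The named period returns resources from both regions even though their time spans differ, and adding the region narrows the result to one.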

FAQs about Portal searches

Questions relating to geo-locations and the use of geo shapes on the Where map

Q: Why is there a very large number of resources shown in the same location on the map?

A: When a resource is considered ‘sensitive’, it will be allocated random (imprecise) geographic co-ordinates within a 1km or 10km bounding box. This indicates the approximate area it was found in. In some cases, several resources found in close proximity may be allocated the same imprecise co-ordinates. This method is commonly applied to finds, such as the British Museum Portable Antiquities Scheme (PAS) database records and the Danish DIME database records, which will be displayed as red geo shapes with a bounding box. Alternatively, red and blue geo shapes with wavy lines indicate approximate locations and may be used for resources such as shipwrecks.
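
As a rough illustration of the idea only, the sketch below picks a random point inside a bounding box. How publishers actually derive imprecise co-ordinates is not described in this guide; the function name, the box layout (min_lon, min_lat, max_lon, max_lat) and the example values are all assumptions.

```python
import random


def random_point_in_box(box, rng=random):
    """Return a random (lon, lat) inside box = (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = box
    return (rng.uniform(min_lon, max_lon), rng.uniform(min_lat, max_lat))


# Hypothetical bounding box; a seeded generator makes the example repeatable.
box = (10.0, 56.0, 10.1, 56.1)
lon, lat = random_point_in_box(box, random.Random(0))
```

If several finds share the same box and are given the same point, they will all appear at one location on the map, which is the effect described in the answer above.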

Q: Why is the number of resources found in the current zoomed-in view of the map greater than the actual total number of resources that can be seen?

A: This relates to resources that have been allocated bounding boxes to indicate the region that they apply to. In the case of reports and surveys, for example, the contents may refer to large areas such as archaeological survey areas or whole regions or countries. When the bounding boxes overlap the current map area on view, they will be included in the resources found although the allocated geo co-ordinates may be located elsewhere out of view.
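
The following minimal sketch shows why the count can exceed the number of visible resources. It assumes simple rectangular boxes given as (min_lon, min_lat, max_lon, max_lat), ignores boxes that cross the antimeridian, and uses invented co-ordinates and names; it is not the Portal's own logic.

```python
def boxes_overlap(a, b):
    """True if two (min_lon, min_lat, max_lon, max_lat) boxes touch or overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(box, point):
    """True if the (lon, lat) point lies inside the box."""
    lon, lat = point
    return box[0] <= lon <= box[2] and box[1] <= lat <= box[3]


# Current map view (hypothetical values).
view = (10.0, 55.0, 11.0, 56.0)

# A regional report: large bounding box, allocated co-ordinates outside the view.
report = {"box": (8.0, 54.0, 12.0, 58.0), "pin": (9.0, 57.0)}
# A single find: small bounding box, allocated co-ordinates inside the view.
find = {"box": (10.4, 55.4, 10.5, 55.5), "pin": (10.45, 55.45)}

counted = [r for r in (report, find) if boxes_overlap(r["box"], view)]
visible = [r for r in counted if contains(view, r["pin"])]
```

Both resources are counted because their bounding boxes overlap the view, but only the find has its allocated co-ordinates inside it, so only one can be seen.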

Questions relating to the use of dates and the time line (When)

Q: Why are some resources not found when a time period is specified?

A: Not all the metadata uploaded into the Portal includes absolute dates. Most of the resources will have one or more period names and, where possible, these are mapped to PeriodO so that named periods from one or more regions can be searched for. In some cases, such as the National Monuments Service of Ireland, dates or periods have not been provided in the corresponding metadata but may be found in the resource title and/or description.