Concepts are fundamental to knowledge organisation systems. They are to be understood as "units of thought": ideas, meanings, or (categories of) objects and events. As such, concepts exist in consciousness as abstract entities that are independent of the terms by which they are labelled.
A controlled vocabulary is a system for organising knowledge. It contains a structured set of concepts for classifying and organising data, so that the data can be found and retrieved later. These concepts relate to data descriptors that are connected by explicit relationships (hierarchical or associative). The descriptors are used to distinguish and define the characteristics of knowledge resources in a specific domain. They contain data values for general terms, individual names, and other values necessary for the structured description of data.
With the help of controlled vocabularies, resources can be queried, searched, analysed, and linked to other relevant information objects.
The most common types of controlled vocabularies are:
In practice, vocabularies are often designated without a clear distinction between these types. They are therefore increasingly referred to as 'semantic artefacts'.
The selected vocabularies should have PIDs for their entities and be accessible, interoperable, and carefully documented, and thus FAIR. Use openly licensed, well-developed, and published vocabularies that are widely recognised in the professional community. Integrating or referencing such vocabularies provides the precise meanings of the concepts and properties represented in the data.
Terminology Services 4 NFDI (TS4NFDI), one of the NFDI's basic services currently under development, is a cross-domain service for providing, curating, developing, harmonising and mapping terminologies. Its goal is to support the convergence of individual solutions and integrate them into a standardised, interoperable and sustainable suite of services.
Recommendations for vocabularies, authority data, and application ontologies in the fields of cultural research and cultural heritage are also available.
Use vocabularies relevant to your field and organise your research results accordingly from the beginning of your project. Always integrate the PIDs (as URIs) of the terms alongside their language labels into your data to ensure unambiguity, even when the data is analysed automatically, and to enable use in the context of Linked Data applications.
Please note that the URI of a term often does not match the URL of the website. For example, the URI for the city of Potsdam in GeoNames is https://sws.geonames.org/2852458/, while the URL of the website is https://www.geonames.org/2852458/potsdam.html. References to the URIs can be found via menu items such as ‘semantic view’, ‘Concept ID’, and the RDF or JSON-LD formats available.
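The difference between a term's URI and its website URL can be illustrated with a minimal Python sketch. It assumes that the pattern seen in the Potsdam example (numeric ID after the host name, URI under sws.geonames.org) holds for other GeoNames entries as well. The function names and the record layout (`uri`, `labels`) are hypothetical and not part of any GeoNames tooling.

```python
import re

# Assumption: GeoNames website URLs carry the numeric ID directly after the host,
# as in the Potsdam example above.
_GEONAMES_PAGE = re.compile(r"^https?://(?:www\.)?geonames\.org/(\d+)(?:/.*)?$")


def geonames_uri_from_page_url(page_url):
    """Derive the term URI from a GeoNames website URL (hypothetical helper)."""
    match = _GEONAMES_PAGE.match(page_url)
    if match is None:
        raise ValueError(f"Not a recognised GeoNames page URL: {page_url}")
    return f"https://sws.geonames.org/{match.group(1)}/"


def describe_term(uri, labels):
    """Store the PID (as URI) together with the language labels of a term."""
    return {"uri": uri, "labels": dict(labels)}


potsdam = describe_term(
    geonames_uri_from_page_url("https://www.geonames.org/2852458/potsdam.html"),
    {"de": "Potsdam", "en": "Potsdam"},
)
print(potsdam["uri"])
```

Keeping the URI next to the labels means that the record stays unambiguous for automated analysis even if a label is changed or translated.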
Often, widely accepted and published vocabularies can only be used in part, because terms are needed that are not yet covered by any known published vocabulary or ontology. Examples include insufficient coverage of objects or visual content from non-European cultures, or the lack of specific technical terms, e.g. for historical glass processing techniques in the Art and Architecture Thesaurus. The GND and Getty vocabularies allow their user communities to supplement the vocabulary in line with their editorial rules.
You can also create a project-specific vocabulary and publish it under an open license, preferably as Linked Open Data in a machine-readable format. Use specialised tools for thesaurus creation and publication, such as VocBench (open source), ACDH Vocabs Editor (MIT License), xTree or vocabulary modules from collection management software, as well as the editor Protégé for ontology modelling. The vocabulary can be published using tools like Skosmos, an open-licensed, web-based SKOS browser. SkoHub serves the same purpose and offers additional vocabulary management functions.
The self-created vocabulary should, as much as possible, follow the structure of a published vocabulary and be designed as a local extension of it.
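One possible way to express such a local extension is a SKOS concept that points to a broader concept in the published vocabulary. The following Python sketch builds such a concept as JSON-LD using only the standard library. All URIs under example.org, the labels, and the function name are hypothetical placeholders. Using `skos:broadMatch` for the link across vocabularies is a design choice of this sketch, not a requirement stated in this guideline.

```python
import json

SKOS_NAMESPACE = "http://www.w3.org/2004/02/skos/core#"


def local_concept(local_uri, labels, scheme_uri, published_broader_uri):
    """Build a SKOS concept (JSON-LD) that extends a published vocabulary.

    The local concept is attached to a concept of the published vocabulary
    via skos:broadMatch, so it can be read as a local extension of it.
    """
    return {
        "@context": {"skos": SKOS_NAMESPACE},
        "@id": local_uri,
        "@type": "skos:Concept",
        "skos:prefLabel": [
            {"@value": text, "@language": lang}
            for lang, text in sorted(labels.items())
        ],
        "skos:inScheme": {"@id": scheme_uri},
        "skos:broadMatch": {"@id": published_broader_uri},
    }


# Hypothetical example: all URIs below are placeholders, not real identifiers.
concept = local_concept(
    local_uri="https://example.org/project-vocab/technique-001",
    labels={"en": "example glass technique", "de": "Beispiel-Glastechnik"},
    scheme_uri="https://example.org/project-vocab",
    published_broader_uri="https://example.org/published-vocab/glassworking",
)
print(json.dumps(concept, indent=2, ensure_ascii=False))
```

In a real project, the placeholder for the published concept would be replaced by the URI of the closest matching term in the vocabulary being extended.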
The vocabulary or ontology that applies to a specific data element should be clearly specified. Even for elements where this does not apply, the value type of the element should be clearly defined in the metadata of the digital object using a publicly available vocabulary or ontology.
Provide examples of vocabularies that can be used by the expert communities you represent and that can be addressed via the platform's interfaces.
Wherever possible, enable the use of widely accepted authority data or identification systems, such as ORCID for persons, ROR for organisations, the Crossref Funder Registry for funding organisations, the DFG Classification of Scientific Disciplines, and the GND.
Ensure that the relevant attributes are stored in the metadata to guarantee unambiguity and machine readability.
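As a rough illustration, a platform could check that each metadata entry carries the identifier together with the scheme it belongs to. The sketch below is written in Python. The attribute names (`identifier`, `scheme`, `scheme_uri`) and the function name are assumptions made for this example and do not follow a specific metadata standard. The ORCID iD shown belongs to ORCID's fictitious sample researcher.

```python
# Assumption: these three attributes are what a platform needs per entry.
REQUIRED_ATTRIBUTES = ("identifier", "scheme", "scheme_uri")


def missing_attributes(entry):
    """Return the required attributes that are absent or empty in one entry."""
    return [name for name in REQUIRED_ATTRIBUTES if not entry.get(name)]


# Complete entry: name plus identifier URI and the scheme it comes from.
creator = {
    "name": "Josiah Carberry",  # ORCID's fictitious sample researcher
    "identifier": "https://orcid.org/0000-0002-1825-0097",
    "scheme": "ORCID",
    "scheme_uri": "https://orcid.org/",
}

# Incomplete entry: the scheme is named, but no identifier is stored.
organisation = {"name": "Example Institute", "scheme": "ROR"}

print(missing_attributes(creator))
print(missing_attributes(organisation))
```

An entry with an empty result is unambiguous for machines; any other result lists what still has to be recorded.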
Project "Digitization of Gandharan Artefacts (DiGA)": Elwert, Frederik / Pons, Jessie: Brücken bauen für Buddha - Das Projekt "Digitalisierung Gandharischer Artefakte" (DiGA) und die Pelagios Working Group "Linked Data Methodologies in Gandharan Buddhist Art and Texts", in: DHd 2022 Kulturen des digitalen Gedächtnisses. 8. Tagung des Verbands "Digital Humanities im deutschsprachigen Raum" (DHd 2022), Potsdam, 7. März 2022