References can enrich understanding of data by linking it to related or supplementary information. This provides a more comprehensive view of the dataset and its context. Specifically, it should be indicated whether one dataset builds on another, whether additional data is needed to complete or understand the dataset in context, and whether supplementary information is stored in a different dataset.
The best way to reference these datasets is with persistent identifiers (PIDs), and the links between them should also be expressed through a controlled, machine-readable vocabulary.
Plan the structure of your data publication from the beginning of your project and ensure the assignment or referencing of PIDs for all relevant components. External resources can also be included via PIDs.
Qualify the references whenever possible by specifying the type of relation with a controlled vocabulary. Choose the correct term from the perspective of the resource you are describing.
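As an illustration of choosing the term from the perspective of the described resource, the following minimal sketch builds a qualified reference and its reciprocal. The relation terms are a small subset of the DataCite relationType vocabulary; the DOIs, the helper names `related_identifier` and `reciprocal`, and the dictionary layout are assumptions made for this example, not a prescribed implementation.

```python
# Inverse pairs taken from the DataCite relationType vocabulary (subset only).
INVERSE_RELATION = {
    "IsSupplementTo": "IsSupplementedBy",
    "IsDerivedFrom": "IsSourceOf",
    "IsPartOf": "HasPart",
    "Requires": "IsRequiredBy",
    "IsNewVersionOf": "IsPreviousVersionOf",
}
# Make the mapping symmetric so either direction can be looked up.
INVERSE_RELATION.update({v: k for k, v in list(INVERSE_RELATION.items())})


def related_identifier(target_pid, relation_type, identifier_type="DOI"):
    """Describe a link from the resource being described to target_pid.

    relation_type must read correctly from the perspective of the
    resource whose metadata record this entry is stored in.
    """
    if relation_type not in INVERSE_RELATION:
        raise ValueError(f"unknown relation type: {relation_type}")
    return {
        "relatedIdentifier": target_pid,
        "relatedIdentifierType": identifier_type,
        "relationType": relation_type,
    }


def reciprocal(source_pid, link):
    """Build the matching entry for the metadata record of the link target."""
    return related_identifier(
        source_pid,
        INVERSE_RELATION[link["relationType"]],
        link["relatedIdentifierType"],
    )


# Hypothetical DOIs, used only for illustration.
SUPPLEMENT_DOI = "10.1234/example.supplement"
MAIN_DOI = "10.1234/example.main"

# Stored in the record of the supplement: "this dataset is a supplement to MAIN".
link = related_identifier(MAIN_DOI, "IsSupplementTo")
# Stored in the record of the main dataset: "this dataset is supplemented by ...".
back_link = reciprocal(SUPPLEMENT_DOI, link)
```

The same pair of datasets therefore yields two different terms, depending on which record the reference is written into.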
When describing collection objects, use authority data and controlled vocabularies to reference people, organisations, geographic locations, descriptive subject keywords, and related objects (e.g. preparatory or derivative works). The PIDs of these entities can be used to supplement information automatically, which provides valuable additional search and filtering options. For individuals, this supplementary information could include biographical data and name variants; for subject keywords, it could include broader terms, synonyms, or terms in multiple languages. Relationships between entities should also be qualified and indicated with a controlled vocabulary. Suitable vocabularies include the CIDOC CRM Properties, the MARC Relators, and parts of the LIDO Terminology.
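How PIDs of authority entries might be used to supplement an object record can be sketched as follows. Everything here is hypothetical sample data: the `example.org` identifiers, the in-memory `AUTHORITY` table, and the function name `search_terms` stand in for whatever authority file and indexing mechanism a real system uses. The only real vocabulary term is the MARC relator code `art` (Artist), which qualifies the relation to the person.

```python
# Hypothetical authority entries, keyed by their (made-up) PIDs.
AUTHORITY = {
    "https://example.org/authority/person/1": {
        "preferred_label": "Example Artist",
        "variants": ["E. Artist", "Artist, Example"],
    },
    "https://example.org/authority/subject/7": {
        "preferred_label": "Portrait",
        "variants": ["Likeness", "Bildnis"],  # synonym and German term
    },
}

# A collection object that stores only PIDs, not copies of the labels.
OBJECT_RECORD = {
    "title": "Example drawing",
    "references": [
        # Relation qualified with the MARC relator code "art" (Artist).
        {"pid": "https://example.org/authority/person/1", "relation": "art"},
        {"pid": "https://example.org/authority/subject/7"},
    ],
}


def search_terms(object_record, authority):
    """Collect preferred labels and variants for every referenced PID.

    References whose PID is not in the authority table are skipped,
    so an incomplete table does not break indexing.
    """
    terms = set()
    for reference in object_record["references"]:
        entry = authority.get(reference["pid"])
        if entry is None:
            continue
        terms.add(entry["preferred_label"])
        terms.update(entry["variants"])
    return sorted(terms)


terms = search_terms(OBJECT_RECORD, AUTHORITY)
```

With such an index, a search for a name variant or a term in another language would find the object even though its own record contains only the PID.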
Enable the qualified linking of metadata and data within the repository. Provide comprehensive and user-friendly tools for establishing these relationships through the forms available for research metadata. Ensure that these references and their associated PIDs are easily discoverable and reusable by both humans and machines.
Example of using controlled vocabulary for linking resources in DataCite: Bayer, Christiane / Frech, Andreas / Gabriel, Vanessa et al.: DataCite Best Practice Guide (Version 4.0), 2025
Example of a complex data publication linked through references: Bender, K.: Thematic Research Collection of the Iconography of Venus from the Middle Ages to Modern Times, heiDATA, 2018, consisting of eight datasets with 29 files