
F1. (Meta-)Data are assigned a globally unique and persistent identifier

The unique addressability and localisation of data and metadata are a necessary condition for all further FAIR steps, from access to reuse. To be findable, each digital object and dataset should be unambiguously assigned a persistent identifier (PID).

PIDs are integrated into the globally applied system for identifying resources on the web, the Uniform Resource Identifier (URI). Globally unique, actionable resource identifiers not only enable the unique addressing of digital objects but can also refer to all identifiable entities, including non-digital ones: people, organisations, documents, physical objects, or abstract concepts. The use of URIs also eliminates ambiguities in data descriptions by assigning a unique identifier to each entity mentioned in the metadata.

A unique physical museum artefact, such as a vase, can be assigned a PID. This PID can lead to a webpage with a description of the physical object, but it does not identify the webpage or any other digital representation of the object (e.g., a dataset or a digital image). These digital representations should each be assigned their own PIDs. The digital representations or their metadata will include a thematic reference to the physical object, incorporating its PID.
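The distinction between the physical object and its digital representations can be made concrete with a small data model. The following is only an illustrative sketch: the class names, the `depicts` field, and all `example.org` identifiers are hypothetical and do not come from any particular metadata standard.

```python
from dataclasses import dataclass


@dataclass
class PhysicalObject:
    pid: str    # identifies the vase itself, not a webpage about it
    label: str


@dataclass
class DigitalRepresentation:
    pid: str      # the representation's own PID
    kind: str     # e.g. "image" or "dataset"
    depicts: str  # PID of the physical object this representation refers to


# Hypothetical identifiers, used for illustration only.
vase = PhysicalObject(pid="https://example.org/object/vase-0001", label="Vase")
photo = DigitalRepresentation(
    pid="https://example.org/image/0042", kind="image", depicts=vase.pid
)
dataset = DigitalRepresentation(
    pid="https://example.org/dataset/0007", kind="dataset", depicts=vase.pid
)

# Three entities, three distinct PIDs; both representations point to the vase.
all_pids = {vase.pid, photo.pid, dataset.pid}
```

Each entity keeps its own identifier, and the link to the physical object is expressed as a reference inside the metadata of the digital representation rather than by sharing one PID.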

In this context, identifiers consist of an internet link leading to a webpage that defines the information object, such as a research dataset, a specific subject keyword, a person, or a geographical location. Identifiers help others understand precisely what you mean, and they enable machine agents to interpret your data meaningfully during searches or automated integration. Identifiers are essential for collaboration between humans and machines. Moreover, identifiers help others accurately cite your work when reusing your data.

A persistent identifier must be globally unique: no one else can reuse or reassign the same identifier to refer to anything other than your data. Globally unique identifiers are issued by registration services whose algorithms guarantee the uniqueness of newly created identifiers.

Persistent identifiers must remain valid indefinitely, even if the web address (URL) of a resource changes. Registration services ensure, as far as possible, that these links can still be resolved in the future. Keeping web links active requires time and money, so participating institutions usually have to register for a paid membership. Registration services act as central issuing authorities that assign unique prefixes for 'their' PIDs to the participating institutions. Many data platforms automatically generate globally unique and persistent identifiers for the datasets stored with them.
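The interplay of prefixes, uniqueness, and stable resolution described above can be sketched as a toy model. This is not the API of any real registration service: the class `RegistrationService`, its method names, the numeric prefixes, and all URLs are assumptions made purely for illustration.

```python
import itertools


class RegistrationService:
    """Toy model of a PID registration service (hypothetical, not a real API)."""

    def __init__(self):
        self._prefixes = {}   # institution name -> assigned prefix
        self._counters = {}   # prefix -> counter for new suffixes
        self._targets = {}    # PID -> current URL of the resource
        self._next_prefix = itertools.count(1000)

    def register_institution(self, name):
        # Each participating institution receives its own unique prefix.
        if name not in self._prefixes:
            prefix = str(next(self._next_prefix))
            self._prefixes[name] = prefix
            self._counters[prefix] = itertools.count(1)
        return self._prefixes[name]

    def mint(self, institution, url):
        # prefix + running suffix can never collide with another institution.
        prefix = self._prefixes[institution]
        pid = f"{prefix}/{next(self._counters[prefix])}"
        self._targets[pid] = url
        return pid

    def update_target(self, pid, new_url):
        # The resource moved: only the target changes, the PID stays the same.
        if pid not in self._targets:
            raise KeyError(pid)
        self._targets[pid] = new_url

    def resolve(self, pid):
        return self._targets[pid]


service = RegistrationService()
service.register_institution("Museum A")
service.register_institution("Archive B")
pid_a = service.mint("Museum A", "https://old.example.org/data/1")
pid_b = service.mint("Archive B", "https://example.net/records/1")

# The dataset moves to a new server; citations of pid_a remain valid.
service.update_target(pid_a, "https://new.example.org/datasets/1")
```

In this sketch, persistence is an organisational commitment as much as a technical feature: the PID only keeps working because someone calls `update_target` when the resource moves.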

PID providers should publish a clear policy describing the conditions under which the permanent resolution of the identifier to the correct storage location and information object is guaranteed. Obviously, locally used identifiers (e.g., automatically assigned dataset IDs within a local database) that cannot automatically reference community-supported and publicly shared identification systems are not persistent. A data provider that chooses a ‘proprietary’ identification scheme must therefore provide appropriate and accurate mappings to public identification schemes to be considered FAIR.

PID systems

PIDs are offered through various identification systems:

Digital Object Identifier (DOI)

The DOI system combines a metadata model with the Handle System (see below) as its resolution infrastructure, i.e. DOIs are Handles. The system was introduced with the support of the International DOI Foundation (IDF) and became an official ISO standard (ISO 26324) in 2012. DOI registration agencies are responsible for assigning identifiers, and each has its own commercial or non-commercial business model to cover associated costs. The DOI system itself is maintained and developed by the IDF, which is governed by its member registration agencies.

Under the Handle System, there is a central, free, and global resolution mechanism for DOI names. DOI names from any registration agency can be resolved worldwide on any Handle server, meaning that DOIs are independent, and their resolution is not reliant on a single agency. A standard metadata kernel is defined for each DOI name. A license fee is required for the assignment of DOI names, but their resolution is free.
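As a rough illustration of how a DOI name is turned into an actionable link, the sketch below splits a DOI name into prefix and suffix and builds a URL for the public `https://doi.org/` proxy. The helper names `parse_doi` and `doi_url` are hypothetical, the validation is deliberately minimal, and the DOI used in the usage lines is a made-up placeholder rather than a registered name.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def parse_doi(raw):
    """Split a DOI name into (prefix, suffix); simplified validation only."""
    doi = raw.strip()
    # Accept common ways of writing a DOI and reduce them to the bare name.
    for lead in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(lead):
            doi = doi[len(lead):]
            break
    # The prefix ends at the first slash; the suffix may contain further slashes.
    prefix, sep, suffix = doi.partition("/")
    if not (prefix.startswith("10.") and len(prefix) > 3 and sep and suffix):
        raise ValueError(f"not a DOI name: {raw!r}")
    return prefix, suffix


def doi_url(raw):
    """Build a resolvable URL; special characters in the suffix are percent-encoded."""
    prefix, suffix = parse_doi(raw)
    return DOI_RESOLVER + prefix + "/" + quote(suffix, safe="/")


# Placeholder DOI name for illustration only.
example_parts = parse_doi("doi:10.1234/example.dataset.v1")
example_url = doi_url("10.1234/abc def")
```

Because resolution goes through the shared Handle infrastructure, the same URL pattern works regardless of which registration agency assigned the DOI name.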

The DOI system is widely used in the scientific publication process.

Uniform Resource Name (URN)

URN is a standard developed by the Internet Engineering Task Force (IETF). There is no central management or resolver infrastructure. Large national libraries in Europe have established their own URN namespace, the URN:NBN, and operate a shared infrastructure for name resolution. International Standard Book Numbers (ISBN) can also be expressed as URNs through their own namespace (urn:isbn).

There are no licensing fees for assigning URNs, but a URN registration agency must establish an infrastructure for assignment and resolution. Since there is no shared resolution infrastructure or workflow for URNs, apart from specific areas like URN:NBN, it is impossible to establish general interoperability with URNs.

The URN service provided by the German National Library for public institutions is subject to certain conditions but is free of charge.
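A URN consists of the scheme `urn`, a namespace identifier such as `nbn` or `isbn`, and a namespace-specific string. The sketch below separates these parts. It is a simplified pattern, not a complete implementation of the IETF syntax, and the function name `parse_urn` as well as the NBN value in the usage lines are illustrative assumptions.

```python
import re

# Simplified: "urn:" + namespace identifier (2-32 chars) + ":" + namespace-specific string.
URN_PATTERN = re.compile(
    r"^urn:([a-z0-9][a-z0-9-]{0,30}[a-z0-9]):(.+)$", re.IGNORECASE
)


def parse_urn(urn):
    """Return (namespace identifier, namespace-specific string) of a URN."""
    match = URN_PATTERN.match(urn.strip())
    if match is None:
        raise ValueError(f"not a URN: {urn!r}")
    nid, nss = match.groups()
    # The namespace identifier is case-insensitive, so normalise it.
    return nid.lower(), nss


isbn_parts = parse_urn("URN:ISBN:0451450523")
nbn_parts = parse_urn("urn:nbn:de:example-123")  # made-up NBN value
```

Parsing only reveals which namespace a URN belongs to. Whether and where it can be resolved depends on the infrastructure of that namespace, which is why general interoperability across URNs is hard to achieve.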

Handle

Handle is a non-commercial decentralised system for resolving identifiers, operated by the Corporation for National Research Initiatives (CNRI). It is used by many higher-level systems, e.g. DOI. Various initiatives use commercial Handle licenses to set up local Handle systems, such as the European Persistent Identifier Consortium (EPIC). Many repositories currently operate their own local Handle systems.

Archival Resource Key (ARK)

ARK is not a formal standard, but all ARKs follow the same structure and workflows. There is no central resolver (directory service for name resolution) – organisations can register a Name Assigning Authority Number (NAAN) and operate their own infrastructure for resolving ARKs. The California Digital Library operates a combined ARK/DOI infrastructure called EZID, which is used by dozens of NAANs worldwide. This EZID infrastructure enables interoperability between ARKs and DOI names under the DataCite umbrella.
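The shared structure of ARKs can be illustrated with a short parsing sketch. It assumes the common form `ark:/NAAN/Name` (with or without the slash after `ark:`) and treats the NAAN simply as the first path segment. The helper names, the resolver hosts, and the NAAN `12345` in the usage lines are placeholders chosen for illustration.

```python
import re

# Matches "ark:/NAAN/Name" as well as "ark:NAAN/Name", optionally inside a URL.
ARK_PATTERN = re.compile(r"ark:/?([^/]+)/(.+)$")


def parse_ark(text):
    """Return (NAAN, name) from a bare ARK or from a URL that contains one."""
    match = ARK_PATTERN.search(text.strip())
    if match is None:
        raise ValueError(f"no ARK found in: {text!r}")
    return match.group(1), match.group(2)


def ark_url(resolver_base, naan, name):
    """Attach the ARK to the resolver of whichever organisation serves it."""
    return f"{resolver_base.rstrip('/')}/ark:/{naan}/{name}"


# Placeholder NAAN and hosts for illustration only.
naan, name = parse_ark("https://resolver.example.org/ark:/12345/x6np1wh8k")
moved = ark_url("https://archive.example.net/", naan, name)
```

Since there is no central resolver, the host part of an ARK URL is replaceable: the identity of the object lies in the `ark:` portion, which stays the same when the resolving organisation changes.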

Persistent Uniform Resource Locator (PURL)

PURLs are web addresses that serve as permanent identifiers in the context of a dynamic and changing web infrastructure. Instead of directly pointing to web resources, PURLs use an intermediary resolver to refer to the actual location of the requested web resource. This functionality ensures the continuity of addressing resources, which can be transferred from server to server without negatively impacting the systems that depend on them.

After the PURL service had been managed by OCLC for a long time, the Internet Archive took over its administration in 2016. PURL servers are operated by various organisations, and formal membership is required. The source code is freely available.

The PID Network Germany is promoting the implementation of PIDs in key national and international open science initiatives. The NFDI basic service, PID4NFDI, which is currently under development, will support the integration of PIDs into the data and services of NFDI consortia. It will consider the varying degrees of maturity of PID providers and subject-specific requirements.

The role of data producers

Use PIDs for the entities mentioned or referenced in the data, as this significantly enhances their findability and reusability. Carefully differentiate between PIDs that refer to physical and digital objects. Publish on a platform that assigns PIDs to your data. Clarify with your data platform whether a digital resource, which is to be published in different formats, requires separate PIDs. Determine the level of granularity needed for the separate addressability of the elements in your research data via PIDs.

If a large number of PIDs are required, such as for museum or archive collections, it may be necessary to become a member of a PID provider organisation or find a partner organisation that can assign PIDs. Assigning PIDs may incur costs. A common requirement is that data packages, once submitted, may only be altered in (usually legally justified) exceptional cases. Continuously evolving metadata for collections ideally requires versioning of PIDs and thus also the citable versions of the associated data. This requirement is easier to implement when it involves high-quality digital representations of collection items. Exemplary cases include the ETH Zurich image archive, which has equipped over 770,000 digital objects in its collection with DOIs, or the scientific collections of Göttingen University, which used Handles to identify their collection objects.

In your research results, refer to the PID assigned to your dataset. Also, refer to the PIDs of datasets from others that you have used, and use PIDs to reference the entities your metadata refers to. More information on this can be found in the section 'I3. (Meta)data include qualified references to other (meta)data'.

The role of data platform operators

Choose the appropriate form of a persistent identification system and assign a PID to each resource. Make the policies for managing the identification scheme publicly accessible, as Zenodo does, for example.

Further information on PIDs

Kahlert, Torsten / Hagemann-Wilholt, Stephanie: PID4NFDI Cookbook, 2025

THOR – Technical and Human Infrastructure for Open Research. Persistent Identifier Platform

Koster, Lukas: Persistent identifiers for heritage objects. Code4Lib Journal, 47, 2020

Arnold, Eckart / Müller, Stefan: Wie permanent sind Permalinks? Informationspraxis Bd. 3 Nr. 1, 2017

PIDs for collection objects: Subproject HeritagePIDs of the British programme Towards a National Collection (TaNC)

Winkler, Alexander: URIs im GLAM-Bereich – was sie sind und wie man sie verwendet, 2024

Citing research data

When research data have a persistent identifier and are cited in accordance with community standards, the corresponding digital objects or datasets are easier to find.

The role of data producers

Familiarise yourself with the citation guidelines for data that apply to your institution or publication outlet, and cite research data accordingly. This also applies to published resources that can be cited through PIDs (e.g. digitised collection objects), authority data, and vocabularies referenced in your data.

The role of data platform operators

Provide your users with easily accessible information about best practices for citing data. Make it easier for them to follow these practices, for example by offering a standardised button on the webpage labelled ‘Cite this dataset,’ which provides preformatted citations in commonly used citation styles.
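What such a 'Cite this dataset' function might do internally can be sketched as follows. The citation pattern used here (creators, year, title, version, publisher, PID) is a simplified assumption and not an official citation style; the class `DatasetMetadata`, the function `format_citation`, and all metadata values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetMetadata:
    creators: List[str]
    year: int
    title: str
    publisher: str
    pid_url: str                   # the PID in its resolvable form
    version: Optional[str] = None


def format_citation(meta):
    """Build one preformatted citation string from the dataset metadata."""
    authors = "; ".join(meta.creators)
    parts = [f"{authors} ({meta.year})", meta.title]
    if meta.version:
        parts.append(f"Version {meta.version}")
    parts.append(meta.publisher)
    # The PID goes last so that the citation ends in an actionable link.
    return ". ".join(parts) + ". " + meta.pid_url


# Made-up metadata for illustration only.
record = DatasetMetadata(
    creators=["Doe, Jane", "Roe, Richard"],
    year=2024,
    title="Example Collection Dataset",
    publisher="Example Repository",
    pid_url="https://doi.org/10.1234/example",
    version="1.0",
)
citation = format_citation(record)
```

A production implementation would more likely render the same metadata through established citation style definitions, so that users can choose between several styles.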

Recommendations for good practice around citable data publications

Data Citation Synthesis Group: Joint Declaration of Data Citation Principles, Martone, M. (ed.), FORCE11, 2014

Murdoch University Library: Chicago Author-Date Referencing Guide, 2025

Dataverse Project: Best Practices. Data Citation

Use persistent identifiers for people and institutions

Use persistent identifiers for individuals, research organisations, and institutions, e.g. Open Researcher and Contributor ID (ORCID), International Standard Name Identifier (ISNI), or the Gemeinsame Normdatei (GND); for institutions, also use the Research Organization Registry (ROR). Indicate the contributions of all project team members who should be acknowledged as responsible, whether as authors or contributors, and state their institutional affiliations. In this way, scientists support the presentation of their own research achievements. Author identification enables recognition and discoverability of individuals and institutions. It also makes it easier to establish links between datasets, research activities, publications, and researchers.

The role of data producers

Distinguish yourself from other researchers or research groups with the same name. Apply for an author identifier if you do not already have one. For example, register with ORCID and reference your ORCID iD in your dataset and in other places you find suitable for connecting research-relevant information.
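When an ORCID iD is entered into a dataset by hand, typing errors can be caught before publication, because the last character of an ORCID iD is a check digit calculated with the ISO 7064 MOD 11-2 algorithm. The following sketch shows such a check; the function names are illustrative, and the check confirms only that the iD is well-formed, not that it is registered or belongs to the intended person.

```python
import re

ORCID_FORMAT = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Check format and check digit of an ORCID iD written with hyphens."""
    if ORCID_FORMAT.fullmatch(orcid) is None:
        return False
    compact = orcid.replace("-", "")
    return orcid_check_digit(compact[:15]) == compact[15]


# 0000-0002-1825-0097 is the iD of the fictitious researcher used in ORCID's own examples.
well_formed = is_valid_orcid("0000-0002-1825-0097")
mistyped = is_valid_orcid("0000-0002-1825-0098")
```

A data platform could run the same check on submission forms before storing the iD in the metadata.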

Check if your institution is registered with ROR, and if so, use the corresponding identifier.

The role of data platform operators

Display existing identifiers for authors and institutions in the metadata, and if possible, allow linking to the associated profiles. To make complex roles relating to authorship, collaboration, and responsibilities more transparent, enable the assignment of role designations.

Further information on identifiers for people and role designations

Hagemann-Wilholt, Stephanie / Burger, Marleen / Dreyer, Britta et al.: Autor:innenidentifikation mit der ORCID iD: Warum und für wen?, in: ORCID in Deutschland – Blog, 2022

Example of the use of ORCID in reference systems: GEPRIS; Digital Humanities Lab (Leibniz-Institut für Europäische Geschichte)

Example of linking to the ORCID profiles of authors in a repository: Zenodo

Enable role assignments, see for example Contributor Types in the DataCite Metadata Schema v.4.6, Appendix 1