A2. Metadata should be accessible, even when the data is no longer available

Keeping data sets available indefinitely is a significant challenge for repositories, because they must adapt to new format standards, updates to the research software used, and changes to the repository's operating environment. Consequently, the full research data set is usually made available only for a limited time. The metadata, however, should remain available indefinitely. This allows the repository to maintain a comprehensive record of all current and no longer available data sets with much less effort than preserving the data itself.

In some cases, research data may need to be withdrawn due to restrictions that only become apparent after publication. However, the metadata should remain accessible wherever possible.

The role of data producers

Structure your data so that the metadata still provides a meaningful record of the entire data set after the retention period agreed with the data platform for the project data has ended. Even if the data itself is no longer accessible, the metadata allows other researchers to contextualise it within their own research questions.

Therefore, keep the metadata separate from the data, and do not store it only inside the same file as the bit sequence it describes. For example, metadata about digital images should not be stored exclusively in the image file header. Store it in an external metadata format so that it can be reused effectively by tools and services that work with such formats.

Where digital images are worked with directly, it can still be useful to store certain information redundantly in the image headers. This guards against information loss: details such as licensing, author, a brief description, and keywords can go in the IPTC data, and specific technical information in the EXIF data of the image files.
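
As an illustration of keeping metadata in a separate file, the following minimal Python sketch writes a JSON "sidecar" file next to a data file and reads it back without touching the data. The function names, the ".metadata.json" suffix, and the metadata fields are assumptions made for this example; in practice a repository would prescribe its own metadata schema and file naming.

```python
import json
from pathlib import Path


def write_sidecar(data_file: Path, metadata: dict) -> Path:
    """Write metadata to a JSON file stored next to, but separate from, the data file."""
    # Hypothetical naming convention: "<data file name>.metadata.json"
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return sidecar


def read_sidecar(sidecar: Path) -> dict:
    """Read the metadata back; the data file itself is not needed for this."""
    return json.loads(sidecar.read_text(encoding="utf-8"))
```

Because the sidecar is an independent file, it stays readable after the image file has been deleted or withdrawn, which is the situation this principle is concerned with.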

The role of data platform operators

Ensure that metadata and data are stored in separate files.

Provide an option to block access to data sets. Archive the metadata indefinitely and ensure that it remains accessible through its PID, even if the corresponding data is no longer accessible.
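
The behaviour described here can be sketched as a small in-memory model in Python: resolving a PID always returns the metadata, while the link to the data is only returned as long as the data is neither blocked nor removed. The class names, field names, and the example PID and URL are hypothetical and do not correspond to any particular repository software.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Record:
    pid: str                   # persistent identifier (hypothetical format)
    metadata: dict             # descriptive metadata, kept indefinitely
    data_url: Optional[str]    # location of the data; None once it is removed
    blocked: bool = False      # True when access to the data is blocked
    block_reason: str = ""     # short public explanation for the block


class Repository:
    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def add(self, record: Record) -> None:
        self._records[record.pid] = record

    def block(self, pid: str, reason: str) -> None:
        """Block access to the data; the metadata is left untouched."""
        record = self._records[pid]
        record.blocked = True
        record.block_reason = reason

    def resolve(self, pid: str) -> dict:
        """Return what a landing page for this PID would show."""
        record = self._records[pid]
        available = record.data_url is not None and not record.blocked
        return {
            "pid": record.pid,
            "metadata": record.metadata,
            "data_available": available,
            "data_url": record.data_url if available else None,
            "notice": record.block_reason,
        }
```

In this sketch, calling `block` changes only the availability of the data. A later call to `resolve` with the same PID still returns the full metadata together with a notice explaining why the data cannot be accessed.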

To guarantee the long-term accessibility of metadata beyond the lifespan of the repository, you should develop an exit strategy, which includes transferring the metadata to another repository if necessary.
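
One possible building block of such an exit strategy is a complete export of the metadata in a simple, software-independent format that another repository can ingest. The Python sketch below writes one JSON object per line (JSON Lines) and reads the file back. The function names and the choice of JSON Lines are assumptions for illustration; an actual transfer would follow whatever exchange format the receiving repository supports.

```python
import json
from pathlib import Path
from typing import Iterable, List


def export_metadata(records: Iterable[dict], target: Path) -> int:
    """Write each metadata record as one JSON line; return the number written."""
    count = 0
    with target.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False, sort_keys=True))
            handle.write("\n")
            count += 1
    return count


def import_metadata(source: Path) -> List[dict]:
    """Read the records back, as a receiving repository might do."""
    with source.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Keeping the PID inside each exported record, as in the usage assumed here, allows the receiving repository to continue serving the metadata under the same identifier.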