07 Long Term Archiving

Motivation and Terminology

Data is archived so that it remains searchable, accessible and readable in the future. The DFG's Code of Conduct on "Good Research Practice" requires that relevant research data be kept available for 10 years. Many academic institutions also require their researchers to ensure the long-term preservation of their data (e.g. within the framework of a research data policy).

"Long-term" here denotes an unspecified period of time during which technological and sociocultural changes may occur that affect the preservation, access, search and reuse of digital research data. Accordingly, digital long-term archiving comprises a series of measures that must be planned, controlled and carried out.

CC-BY: Data Stewards, Ghent University

Sustainable File Formats

Not every file format is suitable for long-term archiving. In this context, a distinction is made between proprietary and open formats. Proprietary formats are those that require commercial software (e.g. Microsoft Office, AutoCAD, SPSS, MaxQDA). Files to be archived should be unencrypted, uncompressed, patent-free and stored in an open, documented standard format. Such formats need to be migrated less often, have a longer life span and are more widely used.

Sometimes proprietary file formats are indispensable for your own work. For long-term digital preservation, however, the files should be converted into recommended formats. Always check that the conversion was successful and that the resulting file is valid, because conversion software can produce errors. Keep both the original file and the converted file. The UK Data Service provides an overview of recommended formats.
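
As an illustration of such a validity check, the following minimal sketch tests a table that has been exported to CSV. It assumes that the converted file is meant to be UTF-8 encoded and that every row should have as many fields as the header row; the function names `validate_csv` and `validate_csv_file` are hypothetical and not part of any archive's tooling. A check like this only catches structural damage. It does not prove that the values match the original file.

```python
import csv
import io


def validate_csv(text: str) -> list[str]:
    """Return the problems found in CSV text; an empty list means none were detected."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    expected = len(rows[0])  # assumption: the first row is the header
    problems = []
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != expected:
            problems.append(
                f"row {number}: expected {expected} fields, found {len(row)}"
            )
    return problems


def validate_csv_file(path: str) -> list[str]:
    """Read a file as UTF-8 (assumed target encoding) and validate its structure."""
    try:
        with open(path, encoding="utf-8", newline="") as handle:
            text = handle.read()
    except UnicodeDecodeError as error:
        return [f"not valid UTF-8: {error}"]
    return validate_csv(text)
```

Calling `validate_csv_file("converted.csv")` (a placeholder file name) right after the conversion gives a list of problems to resolve before the file is handed over to an archive.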

Requirements for Long-Term Archives

The following aspects should be considered when selecting a suitable storage for long-term archiving:

  • Technical requirements: the service provider should have a strategy for data conversion and migration. In addition, a check of the readability of the files and a virus check should be carried out at regular intervals. All steps should be documented.
  • Seal for trustworthy long-term archives: "A digital long-term archive is considered trustworthy if it operates in accordance with its goals and specifications for information preservation over long periods of time and if its users, producers, operators, partners rely on it" [59]. To allow an external assessment of whether, or to what extent, a long-term archive is trustworthy, various seals with different focuses have been developed (e.g. Nestor seal, DIN 31644 or CoreTrustSeal). They do not address every type and operating model of repository equally.
  • Cost of services: Always check whether service providers charge for data storage. The costs can depend, for example, on the amount of data, the implementation of technical standards or the affiliation of the data providers.
  • Making the data accessible: Before choosing the storage location, you should ask yourself whether the data should be accessible or just stored.
  • Longevity of the service provider: Economic and political factors influence the longevity of service providers.
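
The regular file checks mentioned under the technical requirements are often complemented by checksum comparisons (fixity checks). The sketch below is one possible approach, not the procedure of any particular archive: it records a SHA-256 checksum for every file in a directory and later reports files that are missing, changed or unexpected. The names `sha256_of`, `build_manifest` and `check_fixity` are hypothetical. A matching checksum only shows that a file is unchanged at the bit level; it does not show that the format can still be opened.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file below root (as a relative POSIX path) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def check_fixity(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the current state of root with a previously stored manifest."""
    current = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "changed": sorted(
            name
            for name in manifest.keys() & current.keys()
            if manifest[name] != current[name]
        ),
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

In practice, the manifest would be created at deposit time, stored separately from the data, and compared at regular intervals, with each run documented.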

Recommendations for a Jump Start

Archive your data directly after you have finished data processing.
Archive your data in the UFZ's DMP or a similar service.