
Research Data Management: Data sharing and preservation

Reusing data

Reusing and benefiting from existing datasets is the fundamental motive for opening and sharing data. Research data are valuable resources that often require a lot of time and money to create. It is therefore worthwhile to consider reusing existing datasets that previous studies have generated and publicly archived. Yet reusing data is not only about saving time and resources. It also improves the repeatability and verifiability of research, and thus the reliability of scientific outputs.

At the same time, optimal use and reuse of archived data become possible only when the accessibility and reusability of the data have been ensured. Properly managed research data that are openly published under an appropriate license enable and facilitate shared use. The FAIR principles in the section below give guidance on how to make your data truly open and reusable. See also the Opening data section for how to open and publish your data.

Services for searching datasets include:

  • OpenAIRE Explore is a search portal for datasets from a wide range of international repositories.
  • Fairdata Etsin is a research data finder that lets you find research datasets from all fields of science.

Aila at the Finnish Social Science Data Archive (FSD) provides access to archived data free of charge. After registering, you can search for and download relevant data easily and quickly.

When reusing data, you must follow good practices for attributing authorship and citing data. To learn more, see OpenAIRE's guide on how to reuse research data.
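As an illustration of data citation, the following minimal sketch assembles a citation string from a few metadata fields. It assumes the commonly recommended element order of creator, publication year, title, version, publisher, and identifier. The class name, function name, and all example values are hypothetical placeholders, and the repository you download from may prescribe its own citation format, which takes precedence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Minimal citation fields for a dataset (hypothetical structure)."""
    creators: List[str]
    year: int
    title: str
    publisher: str                 # usually the repository or archive
    identifier: str                # persistent identifier, e.g. a DOI or URN
    version: Optional[str] = None  # included only when the dataset is versioned


def format_citation(record: DatasetRecord) -> str:
    """Join the fields as 'Creators (Year). Title. Version. Publisher. Identifier'."""
    creators = "; ".join(record.creators)
    parts = [f"{creators} ({record.year})", record.title]
    if record.version:
        parts.append(f"Version {record.version}")
    parts.extend([record.publisher, record.identifier])
    return ". ".join(parts)


# All values below are placeholders, not a real dataset.
example = DatasetRecord(
    creators=["Example, A.", "Sample, B."],
    year=2020,
    title="Hypothetical survey dataset",
    publisher="Example Repository",
    identifier="https://doi.org/10.1234/example",
    version="1.0",
)
print(format_citation(example))
```

Keeping the persistent identifier as the last element makes it easy for readers to follow the citation back to the archived data.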

FAIR data principles

FAIR is an acronym for Findable, Accessible, Interoperable, and Reusable. These are all essential elements in making data truly open.

To be Findable:

F1. (meta)data are assigned a globally unique and persistent identifier
F2. data are described with rich metadata (defined by R1 below)
F3. metadata clearly and explicitly include the identifier of the data they describe
F4. (meta)data are registered or indexed in a searchable resource

To be Accessible:

A1. (meta)data are retrievable by their identifier using a standardised communications protocol
A1.1 the protocol is open, free, and universally implementable
A1.2 the protocol allows for an authentication and authorisation procedure, where necessary
A2. metadata are accessible, even when the data are no longer available

To be Interoperable:

I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
I2. (meta)data use vocabularies that follow FAIR principles
I3. (meta)data include qualified references to other (meta)data

To be Reusable:

R1. (meta)data are richly described with a plurality of accurate and relevant attributes
R1.1 (meta)data are released with a clear and accessible data usage license
R1.2 (meta)data are associated with detailed provenance
R1.3 (meta)data meet domain-relevant community standards

See fairdata.fi and Guidelines on FAIR Data Management in Horizon 2020.
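Some of the principles above can be checked mechanically before a dataset is deposited. The sketch below looks for a handful of metadata fields in a plain dictionary and reports which ones are missing. The field names and the mapping to principles are assumptions made for this illustration and do not come from any particular metadata schema. Passing the check does not make a dataset FAIR; it only catches obvious gaps.

```python
from typing import Dict, List

# Hypothetical field names, each mapped to the principle it loosely supports.
REQUIRED_FIELDS: Dict[str, str] = {
    "identifier": "F1: globally unique and persistent identifier",
    "description": "F2: rich descriptive metadata",
    "license": "R1.1: clear and accessible data usage license",
    "provenance": "R1.2: detailed provenance",
}


def missing_fair_fields(metadata: dict) -> List[str]:
    """Return a reminder for every required field that is absent or empty."""
    return [
        reminder
        for field, reminder in REQUIRED_FIELDS.items()
        if not str(metadata.get(field) or "").strip()
    ]


# Placeholder record: the license is empty and provenance is missing.
draft = {
    "identifier": "https://doi.org/10.1234/example",
    "description": "Survey responses collected in a hypothetical study.",
    "license": "",
}
for reminder in missing_fair_fields(draft):
    print("Missing:", reminder)
```

Principles such as I1 to I3 and R1.3 depend on community vocabularies and standards, so they usually need human judgment rather than a field check like this one.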

Opening data

Opening research data increases the visibility and impact of your research, speeds up the adoption of your research findings and the creation of innovations, and facilitates disciplinary and interdisciplinary collaboration, both within the scientific community and in wider society. Open data improves the transparency and reliability of science and helps to empower and democratize it.

Research data and related published research results produced at Hanken ought to be open and available for shared use. The discoverability and citability of research data ought to be ensured.

When opening your data, consider the following questions:

1. What part of the data will be opened and published? 

  • Data with personal information can only be published in anonymised form. Pseudonymised data is still personal data, and therefore cannot be opened without explicit consent for that purpose. See Anonymisation and Personal Data by the Finnish Social Science Data Archive (FSD).
  • Personal information can be shared subject to a license, if the original processing purpose allows it. If you plan to share data which includes personal information, contact Hanken's Data Protection Officer at dpo@hanken.fi.
  • Note that the metadata describing data that contains personal information can still be opened, even when the actual data cannot be.

2. Where will the data be opened?

  • Choose suitable repositories for sharing and opening your data already at the beginning of the project. Check that your data fulfill the repository requirements. 
  • Choose repositories which use persistent identifiers (DOI, URN).
  • Check the recommendations of the publishers, learned societies, and funders in your own field. Where have you or your colleagues published data?
  • Repositories for a specific data type can be found in re3data.org, a registry of research data repositories covering over 2,000 repositories.
  • General repositories: e.g., Aila, IDA, Zenodo, Dryad, and Figshare.
  • If you cannot open the data, you can open the metadata describing your project data, for example, at Zenodo or the national Etsin.
  • Register your dataset in Hanken's research database Haris. You can register standalone datasets or datasets that are connected to a publication. If a publication has a related dataset, we recommend creating two separate records in Haris, one for the publication and one for the dataset. The records can then be connected under the heading Relations to other content in the template. Access to the data is provided by adding a link or a DOI that points to the file location. It is not possible to upload files to a dataset record. E-mail haris@hanken.fi if you have questions about reporting datasets.

3. When will the data be available? Will you set an embargo period?

4. Will some part of the data be destroyed? For more information, see Data disposal by the Finnish Social Science Data Archive (FSD).

5. Which license will you use to open and share your data? Agreements on data ownership and other intellectual property rights must be concluded before commencing any actual research activities.

For more information, see Five steps to decide what data to keep by the Digital Curation Centre (DCC).
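Question 2 above recommends repositories that use persistent identifiers and mentions adding a link or a DOI when registering a dataset. A DOI can be turned into a resolvable link by prefixing it with https://doi.org/. The sketch below normalises a DOI that may have been copied with a `doi:` prefix or as a full URL. The function name is hypothetical, the example DOI is a placeholder, and the regular expression is a common heuristic for recognising modern DOIs rather than a complete validator.

```python
import re

# Heuristic pattern for modern DOIs: "10.", a 4-9 digit prefix, "/", and a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes that people often copy along with the DOI itself.
KNOWN_PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def doi_to_url(doi: str) -> str:
    """Return a resolvable https://doi.org/ link for a DOI string."""
    cleaned = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Does not look like a DOI: {doi!r}")
    return f"https://doi.org/{cleaned}"


# Placeholder DOI, not a real dataset.
print(doi_to_url("doi:10.1234/example"))
```

Storing the full link rather than the bare DOI means readers can reach the dataset landing page with a single click.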

Long-term preservation of data

Long-term preservation means that data is preserved for more than 25 years. When creating your data, you need to consider how long it will be preserved. Also remember to check the data preservation periods required by your discipline, funder, and publisher. A data archiving plan is part of research quality and transparency. If your data has long-term value, consider:

  • What part of the data is archived?

It is advisable to destroy special categories of personal data when the project ends. The GDPR, however, does not require the destruction of data. It requires that participants are informed about data preservation and about the basis for determining how long the data will be preserved. If you are preserving personal information, contact Hanken's Data Protection Officer at dpo@hanken.fi.

  • Where will the data be archived?
    • The Finnish Ministry of Education and Culture has established the Fairdata-PAS service for Finnish research organizations for the long-term preservation of the nationally most significant research data. Fairdata-PAS is meant for digital preservation of research data for several decades, or even centuries. The University of Helsinki provides guidelines for assessing the value of data. If you wish to sign up for the queue for Fairdata-PAS, please contact openresearch@hanken.fi.
    • You can also contact openresearch@hanken.fi if you need other kinds of archiving services.
  • How long will the data be preserved?
  • Are there any costs related to archiving? Who will cover them?
  • Will some part of the data be destroyed? See Data disposal by the Finnish Social Science Data Archive (FSD).

For more information, see Five steps to decide what data to keep by the Digital Curation Centre (DCC).