Research Data Management (RDM)

Publishing (meta)data in line with the FAIR data principles

When opening and publishing your research (meta)data, consider the following questions:

1. How to describe and publish the metadata of your research data?

  • It is strongly recommended to use the Fairdata Qvain metadata tool to describe and publish the (meta)data. Qvain is part of the Fairdata services offered by the Ministry of Education and Culture and CSC. Data described and published with Qvain are transferred automatically to research.fi and Etsin (the research dataset finder, also part of the Fairdata services).
  • You can log in to Qvain with your HAKA account, click CREATE DATASET, and fill in the form. Please see the Qvain User Guide.

  • It is through the metadata that your research data become findable and first assessed for downloads and reuse. Creating appropriate and rich metadata is the key to making your research data truly open, understandable, and reusable.
  • Note that even if you cannot publish and archive your research data because, e.g., they contain personal information, sensitive personal data, or confidential data, you can still publish the metadata of your research data.
  • The Creative Commons CC BY license is recommended for published (meta)data when possible.

2. Where will the research data be opened and published? Research data are archived and published in a national or international repository when possible. 

  • Recommended general repositories include: 
  • Look for repositories dedicated to a specific data type in re3data.org, a registry of research data repositories covering over 2,000 repositories.
  • Criteria for choosing a repository include:
    • A repository which uses persistent identifiers (e.g., DOI, URN) for (meta)data.
    • A repository which publishes machine-readable metadata and uses a known metadata standard.
    • A repository widely used by your colleagues. Also check the recommendations of the publishers, learned societies, and funders in your field.
    • A repository which allows you to choose the terms of use and internationally standardized licenses under which the data can be reused, and states them clearly as part of the metadata.
  • Define an appropriate access type (open, embargoed, or restricted) for the research data based on the characteristics of the data, your research process, the need to protect trade secrets and other confidential data, and intellectual property agreements, as well as funders’ and publishers’ requirements.
  • If your data have long-term value, consider preserving them in the Digital Preservation Service for Research Data. See Long-term preservation of data below.

3. What part of the data will be opened and published? Will some part of the data be erased and destroyed?

  • Anonymised data are published and archived in a data repository for shared reuse whenever possible. 
  • According to the Data Protection Act (1050/2018, Section 4(4)) and the GDPR (point (e) of Art. 6(1)), processing research material containing personal data, and processing personal data included in its metadata, for archiving purposes is lawful if it is necessary and proportionate to the aim of public interest pursued and to the rights of the data subject. Pseudonymised data are still personal data. Restricted access can be used as a measure for archiving pseudonymised data. The research participants need to be informed of your open data plans in the privacy notice.

4. When will the data be available? Do you need to set any embargo period?

5. Which license will you use to open and publish the (meta)data? Licensing is necessary for publishing data. It is recommended to use the Creative Commons (CC) license CC BY when possible.

6. Organize your datasets with standard and non-proprietary data formats, sensible and consistent file naming conventions, and version control. See Data formats and organizing.

7. Remember to register your datasets in Haris.
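
As an illustration of the file naming advice in point 6, the following minimal Python sketch builds consistent file names. The pattern (project, description, ISO date, two-digit version) and the function name `build_filename` are assumptions made for this example, not a convention prescribed by this guide.

```python
import re
from datetime import date


def build_filename(project: str, description: str, day: date,
                   version: int, extension: str) -> str:
    """Build a name like 'survey-2021_interview-transcripts_2021-03-15_v02.csv'.

    The pattern is a hypothetical convention chosen for this example.
    """
    def slug(text: str) -> str:
        # Lowercase the text and replace runs of other characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # ISO dates (YYYY-MM-DD) and zero-padded versions sort correctly by name.
    return (f"{slug(project)}_{slug(description)}_"
            f"{day.isoformat()}_v{version:02d}.{extension.lstrip('.').lower()}")


print(build_filename("Survey 2021", "Interview transcripts",
                     date(2021, 3, 15), 2, ".CSV"))
```

Generating names from one function, rather than typing them by hand, is one way to keep them consistent across a project; any agreed pattern that is applied systematically serves the same purpose.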

FAIR data principles

The FAIR data principles, formulated by Force11, are the guiding principles on how to make data truly open. FAIR is the acronym for "findable, accessible, interoperable, and reusable": 

[Figure: FAIR data principles and metadata]

The FAIR data principles can be summarized by the formula “Findable + Accessible + Interoperable = Reusable.” Making data reusable, and reusing and benefiting from existing datasets, are the fundamental motives of open data. A FAIR + (FAIR and Reproducible) solution is also promoted (see Christophe Bontemps and Valérie Orozco. 2021. “Toward a FAIR Reproducible Research”, in Abdelaati Daouia and Anne Ruiz-Gazen (eds.), Advances in Contemporary Statistics and Econometrics. Springer International Publishing).

FAIR is not equal to open or free. Data can be closed and paid for yet perfectly FAIR, while data that are open and free are often not FAIR, and thus regarded as being cost-inefficient and re-useless.

The FAIR data principles are mainly about metadata, which appear in almost all of the FAIR principles. It is through the metadata that your research data become visible, findable, and first assessed for downloads and reuse. Creating appropriate and rich metadata is the key to making data open, understandable, and reusable.

It is recommended to use the Fairdata services offered by the Ministry of Education and Culture and CSC. The services include:

  • IDA, Research Data Storage – Secure storage for research data.
  • Qvain, Research Metadata Tool – A metadata tool for describing and publishing datasets.
  • Etsin, Research Dataset Finder – Discover, access and download research data from all fields of science.
  • DPS, Digital Preservation Service for Research Data – Reliable preservation of digital information for decades or even centuries.

Metadata and data documentation

Data documentation means describing the data: it is data about data, and it provides information about the who, what, when, where, why, and how of the data. Investing time in documenting the data makes them easier to understand, for both others and yourself, and decreases the risk of misinterpreting them. Data documentation can take the form of a readme file (human-readable) and metadata (machine-readable):

  • Readme files are text documents (e.g., in .txt format) that provide information about data files to ensure they are interpreted correctly. A readme file explains what data a research project has, how the data were created, where they originate from, how to interpret them, what the abbreviations mean, what software is needed to use the data, and how the data have been modified. It can also include information about the title, creator, funder, relevant dates of data collection and publication, location, methodology, subject, file formats, file naming system and folder structure, data version, license, and repository.

Write a readme file about your data and data files. Put the readme file in the most obvious place in the data file folders to ensure that it is noticed immediately.
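
As a starting point, a readme file can be generated from a fixed list of fields so that nothing is forgotten. The sketch below is a minimal Python illustration; the field list follows the items named above, while the layout, the `TODO` marker, and the function names `render_readme` and `write_readme` are assumptions made for this example.

```python
from pathlib import Path

# Field names follow the items listed in this guide; the layout is an assumption.
README_FIELDS = [
    "Title", "Creator", "Funder", "Dates of data collection",
    "Date of publication", "Location", "Methodology", "Subject",
    "File formats", "File naming system and folder structure",
    "Software needed", "Abbreviations", "Data version", "License",
    "Repository",
]


def render_readme(values: dict[str, str]) -> str:
    """Return readme text with one 'Field: value' line per field.

    Fields without a value are marked TODO so that gaps are easy to spot.
    """
    lines = [f"{field}: {values.get(field, 'TODO')}" for field in README_FIELDS]
    return "\n".join(lines) + "\n"


def write_readme(folder: Path, values: dict[str, str]) -> Path:
    """Write README.txt to the top level of an existing data folder."""
    path = folder / "README.txt"
    path.write_text(render_readme(values), encoding="utf-8")
    return path


# Placeholder values for illustration only.
print(render_readme({"Title": "Example survey data", "License": "CC BY 4.0"}))
```

Writing the file to the top level of the data folder, as `write_readme` does, follows the advice to keep the readme in the most obvious place.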

  • Metadata are technical data that describe a research dataset. When making data FAIR (Findable, Accessible, Interoperable, and Reusable), metadata play the key role. Systematically describing research data is the key to making your data understandable, findable, and reusable.

Metadata should be machine-readable and machine-actionable. That is, data need to be described richly and systematically, in such a way that machines can interpret and navigate all the metadata and linked data across different websites, and retrieve and transmit the right ones for a person conducting semantic queries. Metadata standards are standard methods for data documentation, and they should be used if suitable for the data. The Fairdata Qvain metadata tool makes describing and publishing research data smooth and effortless for researchers without requiring technical skills.
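
To give a rough idea of what machine-readable means in practice, the following minimal Python sketch stores a dataset description as structured fields and serialises it to JSON. The field names, the values, and the list of required fields are hypothetical placeholders; they are not the schema used by Qvain, Metax, or any particular metadata standard.

```python
import json

# Hypothetical record: field names and values are placeholders for illustration,
# not the schema of Qvain, Metax, or any particular metadata standard.
record = {
    "identifier": "doi:10.1234/example-placeholder",
    "title": "Example survey dataset",
    "creators": [{"name": "Doe, Jane", "affiliation": "Example University"}],
    "description": "Anonymised responses to an example survey.",
    "keywords": ["survey", "example"],
    "issued": "2021-03-15",
    "license": "CC-BY-4.0",
    "access_type": "open",  # one of: open, embargoed, restricted
}

# Assumed minimum set of fields for this example.
REQUIRED = ("identifier", "title", "creators", "license", "access_type")


def missing_fields(rec: dict) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED if not rec.get(name)]


# Serialise to JSON so that software, not only people, can parse the description.
as_json = json.dumps(record, indent=2, ensure_ascii=False)
print(as_json)
print("Missing fields:", missing_fields(record))
```

Because every piece of information sits in a named field, a program can check completeness or search across many records, which free-text descriptions do not allow. A tool such as Qvain collects this kind of structured information through its form, so researchers do not need to write such records by hand.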

It is strongly recommended to use the Fairdata Qvain metadata tool to describe and publish your (meta)data. Qvain is part of the Fairdata services, which help your research data go FAIR. Data described and published with the Qvain metadata tool are transferred automatically to the Finnish metadata warehouse Metax, which is integrated with both Etsin (the research dataset finder) and the Finnish National Research Information Hub (in Finnish: Tutkimustietovaranto, a service also commissioned by the Ministry of Education and Culture and CSC).

You can log in to Qvain with your HAKA account, click CREATE DATASET, and fill in the form. Please see the Qvain User Guide.

[Figure: Qvain]

If you cannot publish and archive your research data because, e.g., they contain personal information, sensitive personal data, or confidential data, you can still publish their metadata: the metadata of data holding personal or confidential information can be published even though the actual data cannot be.

Long-term preservation of data

Long-term preservation means that data are preserved for several decades or even centuries. You can categorise your datasets according to the anticipated retention periods:

  • 1) Data to be destroyed upon the ending of the project.
  • 2) Data to be archived for a verification period, which varies across disciplines, e.g., 5–15 years.
  • 3) Data to be archived for potential reuse, e.g., for 25 years.
  • 4) Data with long-term value to be preserved by a curated facility for future generations for tens or hundreds of years.

Long-term preservation refers to the fourth category, that is, data preserved for more than 25 years. When creating your data, consider how long they will be retained. Also remember to check discipline-specific, funder, and publisher requirements for data retention periods.
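
The four categories above can be sketched as a simple lookup, for example when tagging datasets in a data management plan. In this minimal Python illustration, the thresholds of 15 and 25 years are taken from the example figures in this guide and are assumptions, as are the names `Retention` and `categorise`; actual retention periods depend on your discipline, funder, and publisher.

```python
from enum import Enum


class Retention(Enum):
    DESTROY_AT_PROJECT_END = 1
    VERIFICATION_PERIOD = 2
    POTENTIAL_REUSE = 3
    LONG_TERM_PRESERVATION = 4


def categorise(retention_years: float) -> Retention:
    """Map a planned retention period (years after project end) to a category.

    The thresholds 15 and 25 come from the example figures in this guide
    and are assumptions; check discipline-specific and funder requirements.
    """
    if retention_years < 0:
        raise ValueError("retention_years must not be negative")
    if retention_years == 0:
        return Retention.DESTROY_AT_PROJECT_END
    if retention_years <= 15:
        return Retention.VERIFICATION_PERIOD
    if retention_years <= 25:
        return Retention.POTENTIAL_REUSE
    return Retention.LONG_TERM_PRESERVATION


for years in (0, 10, 25, 100):
    print(years, categorise(years).name)
```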

The Finnish Ministry of Education and Culture has established the Fairdata-PAS service (Digital Preservation Service for Research Data, DPS for Research Data) for Finnish research organizations for the long-term preservation of the nationally most significant research data. The service is meant for the digital preservation of research datasets that have significant value to the organization or at the national level, both now and especially in the future.

If you wish to sign up for the queue for DPS for Research Data, please contact openresearch@hanken.fi.

For more information, see Digital Preservation (Fairdata-PAS): Guidelines for UH Evaluators by the University of Helsinki.

Benefits of open data and data reuse

Making research data open and reusable, and reusing and benefiting from existing datasets, are the fundamental motives of open data. The FAIR data principles can be summarized by the formula “Findable + Accessible + Interoperable = Reusable.” The openness and reuse of research data:

  • Increases the visibility and impact of your research.
  • Is recognised as part of a researcher’s academic merits. Activities related to the promotion of good data management and the appropriate opening of research data are part of academic work and are valued and included as impact merits in the research evaluation criteria for recruitment and career promotion decisions (National policy and executive plan on open access to research data, 2021, p.5; Hanken's Guidelines on Open Science and Research, 2021, p.7).
  • Speeds up the adoption of your research findings and the creation of innovations.
  • Facilitates disciplinary and interdisciplinary collaboration, both within the scientific community and in the wider social circle.
  • Improves knowledge sharing, and increases the transparency and reliability of science, both empowering and democratizing science.
  • Contributes to attaining several Sustainable Development Goals (SDGs).
  • Reusing published data from previous studies not only saves time and resources in data production, but also improves data repeatability and verifiability, research reproducibility, and the reliability of research outputs.

When reusing data, good practices for the attribution of authorship and data citation shall be followed. See Reusing and citing data.