The definition of personal data is broad under the General Data Protection Regulation of the European Union (GDPR, 2016/679). Personal data means any information relating to an identified or identifiable natural person (data subject) and encompasses all data from which a natural person can be directly or indirectly identified (GDPR, Recital 26 and Art. 4).
Direct identifiers are information that is sufficient on its own to identify a natural person. Examples are a person’s full name, personal identity code, email address containing the personal name, and biometric identifiers (e.g., fingerprint, facial image, voice pattern or manual signature).
Indirect identifiers are information that on its own is not enough to identify someone, but can be used to deduce the identity of a person when linked with other available information. Examples are a person's age, gender, educational background, economic activity, occupational status, socio-economic status, household composition, income, marital status, mother tongue, nationality, ethnic background, place of work or study, and postal code.
Some types of information are identified as strong indirect identifiers which can be used to identify an individual fairly easily, such as a postal address, phone number, vehicle registration number, bibliographic citation of a publication by the individual, email address not in the form of the personal name, web address to a web page containing personal data, very rare disease, unusual job title, position held by only one person at a time (e.g., chairperson in an organisation), a student ID number, insurance or bank account number, and IP address of a computer.
The following personal data are defined as special categories of personal data (sensitive personal data) by the GDPR (Art. 9): personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person's sex life or sexual orientation. Personal data relating to criminal convictions and offences or related security measures are also, by their nature, particularly sensitive and merit specific protection as the context of their processing could create significant risks to the fundamental rights and freedoms of the data subjects (GDPR, Art. 10).
If you collect and process any information from individuals or about individuals (e.g., consumers, company managers), assume that it is personal data. Pseudonymised data can be attributed to a natural person by the use of the additional information and are still personal data. See more information on pseudonymisation and anonymisation below.
For more information about what constitutes personal data, see:
Personal data shall be processed lawfully, fairly, and in a transparent manner to protect the fundamental rights and freedoms of the data subjects. Personal data collected and processed shall be protected with adequate organisational and technical measures to minimise the risk to the data subjects' rights in the event of unauthorised access and usage. The core requirements for data protection are also described in Hanken’s Data Protection Policy.
Here is one example of a situation where personal data were not adequately protected: University failed to sufficiently protect sensitive personal data, published on the web page of the European Data Protection Board (EDPB).
If you process personal data, follow the eight procedures below to maintain high ethical standards and comply with relevant data protection legislation:
(1) Before data collection (during research planning phase)
3.1 Request an ethical review statement when needed
3.2 Carry out a Data protection impact assessment (DPIA) when needed
(2) Data collection and analysis (during active research phase)
(3) After active research phase
(1) Before data collection (during research planning phase)
If your research proposal involves the processing of any personal data, you shall have plans in place to demonstrate compliance with EU and national data protection laws for the entire data life cycle through, for example, implementing data protection by design and default, conducting risk assessment, providing privacy notice, documenting processing activities, ensuring secure data storage and transfers, adequate data anonymisation, and data erasure.
Consider how you design your study so that your data can be the least identifiable while still accomplishing your research goals, and ensure that, by default, personal data will be processed with the highest privacy protection.
Understand the objectives of your study and define the clear, specified need for collecting personal data. Collect only the minimum amount of personal data necessary and proportionate to the accomplishment of your research tasks. Personal data shall not be collected just in case they might be useful in the future.
Conduct a data minimisation review for the whole process of data management, including defining the types and amount of personal data collected, the extent to which they may be accessed, further processed and shared, the purposes for which they are used, and the period during which they are kept. You shall minimise the processing as far as possible.
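As a minimal sketch of data minimisation at the point of storage, the following Python example keeps only the variables that the analysis needs and drops everything else. The field names and the `NEEDED_FIELDS` list are hypothetical and chosen for illustration only; which variables are necessary depends on your own research tasks.

```python
# Hypothetical list of variables the research question actually needs.
NEEDED_FIELDS = {"age_group", "occupation", "answer"}


def minimise(record, needed=NEEDED_FIELDS):
    """Keep only the fields that are needed; drop everything else."""
    return {field: value for field, value in record.items() if field in needed}


raw_response = {
    "name": "Example Person",   # not needed for the analysis
    "phone": "000 0000000",     # not needed for the analysis
    "age_group": "30-39",
    "occupation": "teacher",
    "answer": 4,
}
stored_response = minimise(raw_response)
```

Listing the needed variables explicitly, rather than listing the ones to remove, means that any new field added to a questionnaire is excluded until you decide it is necessary.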
A Data management plan (DMP) can help you plan the entire life cycle of your research data. It is an important part of Research data management (RDM) and an essential tool for following good and responsible research practices. A DMP describes which research data will be handled, and how, during and after the research project, and elaborates the key measures for ethical and legal compliance and for FAIR data production. Research funders increasingly require that a DMP be written and then updated as the research project evolves.
Researchers can use Hanken’s DMP template or other Public DMP templates (with Hanken’s DMP guidance integrated) in DMPTuuli to write and update a DMP. Please see DMPTuuli with Hanken's DMP guidance and DMP template to learn how to get access to Hanken’s DMP template and DMP guidance.
3.1 Request an ethical review statement when needed
Researchers shall bear the responsibility for ethical and moral concerns and decisions involved in the research and during the interaction between the researchers and research participants.
All research shall comply with relevant Ethical principles and guidelines and follow any applicable ethical review practices. Conduct an ethical self-assessment and identify and address ethics issues in your research proposal.
Please contact Hanken’s Research Integrity Advisor (firstname.lastname@example.org) for advice.
3.2 Carry out a Data protection impact assessment (DPIA) when needed
If a planned personal data processing "is likely to result in a high risk to the rights and freedoms of the data subjects," a Data protection impact assessment (DPIA) shall be conducted prior to the processing (GDPR, Art. 35). This is particularly relevant when a new data processing technology is being introduced and may occur when the following data will be processed:
Check also the following four lists to determine whether you are required to conduct a DPIA:
A processing operation that meets one or two of the criteria may require a DPIA to be carried out. A DPIA is a process to help you identify and minimise the data protection risks of a project. The GDPR (Art. 35 (7) (a)-(d)) requires that a DPIA contain at least:
That is, in the DPIA, you identify the need for a DPIA, describe the nature of the data and data processing including data collection, analysis, storage and disposal, specify what and how much data will be collected and processed, what types of processing might involve what risks, the sources of risks and potential impact on the data subjects, and define additional safeguard measures to reduce or eliminate the risks.
Depending on the nature and scope of your processing, you can conduct a full or light version of a DPIA. Use Hanken's DPIA template (for studies and research) to conduct a full version of the DPIA, or address the four minimum required aspects of a DPIA directly.
If your processing meets one or more of the criteria, but you consider the planned processing is not “likely to result in a high risk,” you shall justify and document the reasons for not carrying out a DPIA, and include the views of Hanken’s DPO (email@example.com) (Article 29 Data Protection Working Party, 2016/679, p. 12).
(2) Data collection and analysis (during active research phase)
Personal data shall be processed lawfully with at least one of the six lawful grounds defined by the GDPR (Art. 6): consent, contract, legal obligation, protection of vital interests, public interest or official authority, and legitimate interests. You need to rely on at least one legal basis to justify why you have the right to collect, store, and handle personal data:
For research work conducted by researchers including PhD students, the legal basis is usually scientific research carried out in the public interest (GDPR, point (e) of Art. 6 (1), and Finnish Data Protection Act (1050/2018), Chapter 2, Section 4, point (3)).
When collecting personal data, what researchers need to do to comply with good data management practices, data protection regulations, and research integrity includes:
Note that this consent (to participate in the research, required by ethical standards) is different from consent (to personal data processing, as a legal basis under the GDPR). The difference is acknowledged by TENK’s guidelines (p. 9).
If you do not ask for informed consent from the research participants, or if your study is one of the other five types described in Ethical review, you need to request an ethical review statement from Hanken’s Research Ethics Committee.
There are rare cases wherein you may not have to ask for informed consent, for example, observation studies in public places and field experiments in which the experimental setup may substantially suffer from letting the participants know about the research in advance. Furthermore, if you only use secondary register data which is anonymised or aggregated (e.g., company-level data), you do not need to inform the research participants. (The "secondary register data" means that the original data about persons have been gathered by someone else or some other party/organisation than you. For example, in your study you are analysing a company's anonymised customer database or some survey data that a governmental agency has originally gathered.)
(2) Provide all the mandated information in a privacy notice to research participants about the processing of their personal data. Transparency is an overarching principle and a fundamental requirement under the GDPR. Personal data should be processed in a fair and transparent way. Regardless of the legal basis for processing personal data, data subjects should obtain sufficient information from you about why and how their personal data are being collected, used, stored, disseminated, or otherwise processed. The GDPR (Art. 12-14) stipulates long lists of information that shall be provided to the data subjects, including the purposes and legal basis for processing, identity and contact details of the data controller and DPO, recipients of personal data, international data transfers, data retention and deletion plans, and data subjects’ rights. Furthermore, the principle of transparency requires, in particular, the information provided to the data subjects be easily accessible and easy to understand (GDPR, Recital 39).
For personal data processing in research work by researchers including PhD students, data controllership shall be determined case by case. The data controller determines the purposes and means (i.e., why and how) of the processing of the personal data and is primarily responsible for compliance with data protection laws throughout the data lifecycle. The data controller can allocate responsibilities according to the actual roles of the parties. The role of data controller or joint controller can be defined in the following cases:
Basically, there are two situations with different timings for providing the required information, depending on whether the data are collected from the research participant or from some other sources:
Both the informed consent form and privacy notice shall be provided to the research participants before you collect their personal data. Afterwards, the privacy notice shall be held available, for example, on the research project’s website and/or Hanken’s webpage, and be provided upon request to all the data subjects, Data protection authorities (DPAs), and research funders. Keep both the informed consent and privacy notice on file.
This often applies when you collect secondary data from online forums/social media. You need to ensure that data processing is fair to all the data subjects involved, that their fundamental rights are respected in compliance with ethical and privacy principles, and that relevant terms and conditions of the platform are observed. When applicable, the privacy notice ought to be given to the data subjects who are involved in the collection and processing of the data from the online forums/social media and you need to obtain consent from them.
If the provision of such information proves impossible, would involve a disproportionate effort, or would seriously impair the achievement of the objectives of your processing, you can instead make your privacy notice publicly available, for example, on your research project’s website and/or Hanken’s webpage (GDPR, Art. 14 (5)). For example, Findata requires that data applicants post privacy notices online, either on the home organizations’ pages or the research projects’ pages, before granting access rights to the secondary datasets from, e.g., Statistics Finland.
When it is difficult to provide all the required information at one time, you can adopt layered fair processing notices: bring the most important information (e.g., the purposes of the processing and the identity of the data controller) to the data subjects’ attention in a short first layer, together with a click-through link to your privacy notice with more detailed information.
For more information, see:
For studies and thesis-writing by BSc/MSc/eMBA students, consent is usually used as the legal basis unless the student is a member of a research project where one or more researchers (at the PhD level or above) are involved (GDPR, point (a) of Art. 6 (1)). When consent is used as a legal basis for processing personal data, the consent needs to meet the requirements of the GDPR. Consent to the processing of personal data should be a “freely given, specific, informed and unambiguous indication of the data subject’s wishes,” and “be presented in a manner which is clearly distinguishable from the other matters, in an intelligible and easily accessible form, using clear and plain language” (GDPR, Art. 4 and 7). Data subjects have the right to withdraw their consent at any time. See Consent of the data subject by the Office of the Data Protection Ombudsman.
When collecting personal data, what students need to do to comply with data protection laws includes:
For personal data processing in studies and thesis-writing by BSc/MSc/eMBA students, data controllership shall also be determined case by case:
Processing of special categories of personal data (sensitive personal data) shall be prohibited. Students and researchers need to rely on at least one of the ten exceptions or derogations to the prohibition in order to collect and process special categories of personal data, data of a highly personal nature, and other specially protected personal data. These exceptions or derogations are specified in the GDPR (Art. 9) and supplemented in the Data Protection Act (1050/2018, Sections 6, 7 and 29).
A personal identity code may be processed: (1) based on consent, (2) if so provided by law, (3) to perform a statutory duty, (4) to implement the rights and duties of the data subject or the controller, or (5) for scientific or historical research purposes or statistical purposes (Data Protection Act (1050/2018), Chapter 5, Section 29).
A Data protection impact assessment (DPIA) may be needed when students and researchers process special categories of personal data or data of a highly personal nature. See the instructions under "3.2 Carry out a data protection impact assessment (DPIA) when needed" and contact firstname.lastname@example.org.
What is important in your data collection and data analysis stages is that your research data are stored and backed up in a location that cannot be accessed by anyone who is not authorised, and that data transfers outside Hanken and the EU/EEA are only carried out in full compliance with relevant regulations.
See the PDF file “Instructions for handling and storing data and documents on different information security levels” on the page of Information Management at Hanken and learn what different storage solutions are allowed and suitable for different documents and data on different data security levels.
For secure storage and backup of active research data during usage, students and researchers use:
data storage services provided and maintained by Hanken, including the researcher's own account on the Hanken network like H:\, Microsoft Office365 applications (e.g., OneDrive for Business), Webropol or SPSS. If you do not have a plan for data archival after the research project, this solution is suitable. OR
data storage services provided by CSC such as IDA which is also for data archival. IDA is a Fairdata service for both data storage and data archival. The Fairdata services are offered by the Finnish Ministry of Education and Culture and produced by CSC – IT Centre for Science.
Established and well-known infrastructures are usually a more secure alternative for storing research data than, for example, the hard disk of the researcher’s personal computer, in terms of both data security and confidentiality.
In addition to Hanken's and CSC's data storage systems, you can use your own password-protected personal computer and hardware (e.g., internal/external hard drives) and password-protected joint-use computers in a room located physically at Hanken with restricted access, to store and process data during research:
Unless you have entered into a Data processing agreement (DPA) with another system/service provider who acts as a data processor, you shall NOT use other systems and internet clouds, for example, iCloud, Dropbox, Google Docs, publicly available OneDrive (for consumers), or survey platforms other than Webropol. Data processors process personal data on behalf of the data controller and do not determine the purposes and means (i.e., why and how) of the processing of the personal data. A Data processing agreement (DPA) shall be signed between the data controller and data processor. Hanken’s DPA templates are available here (Data Processing Agreement template and Data Processing Appendix template (as part of an Agreement)).
If you collect personal data from online questionnaires or surveys, use the GDPR-compliant tools and platforms such as Webropol. Webropol's user instruction is available on the page of Hanken's IT services. If the information you plan to collect contains sensitive personal data or confidential data, it may be better that you do not collect it online.
If you collect interview data by recording the interview with a mobile phone or dictaphone, or by recording teleconferences, see Security instructions for handling recorded interviews.
You can use Hanken’s video platform Panopto to transcribe research data, for both audio and video files. Please note that you are responsible for not sharing the personal data contained in Panopto with anyone else. See Transcribing qualitative data.
If you transfer personal data outside Hanken:
If you save and store your data in IDA by CSC, use the safe data transfer and sharing measures offered by IDA. See 1.8 I want to share my research data, what should I do? in FAQ of the Fairdata services by CSC.
You can use physical memory sticks or external hard drives, in cases where you or the other party do not have access to Hanken's data sharing systems (e.g., OneDrive for Business). Make sure that data are stored securely, and that you erase the personal data stored on your memory sticks and on your USB disks immediately after the transfer. You can encrypt the data on memory sticks and external hard drives.
Note that you should NOT send or share data by ordinary, non-secured email, or use systems that are not provided by Hanken or CSC (e.g., Dropbox, Google Docs, and publicly available OneDrive (for consumers)) for data transfers.
If you have a third party outside Hanken as the data processor who provides, for example, translation/interpretation, transliteration/transcription or raw data analysis services, you need to sign a Data processing agreement (DPA) with the data processor. Hanken’s DPA templates are available here (Data Processing Agreement template and Data Processing Appendix template (as part of an Agreement)).
Transfers of personal data to third countries or international organisations: For data transferred outside the EU/EEA, follow the European Commission's Rules on international data transfers (GDPR, Chapter V, Art. 44-50):
If personal data are transferred to non-EU/EEA countries, specify the countries' names in your privacy notice and the appropriate safeguards you plan to take to ensure that the level of data protection in compliance with the GDPR is not undermined. Contact email@example.com for advice.
If no personal data are transferred to or from non-EU/EEA countries, specify in the privacy notice that data transferred between project partners outside the EU/EEA will be restricted to anonymized data, that the transfer will be made via a secure channel, and that the processing and transfers of personal data will take place only inside the EU/EEA and be limited to the research.
For more information, see Transfers of personal data out of the European Economic Area by the Office of the Data Protection Ombudsman.
Special categories of personal data (sensitive personal data) are classified as being on the increased information security level (See the PDF file “Instructions for handling and storing data and documents on different information security levels” in Information Management at Hanken).
If you work with sensitive personal data, use CSC's Sensitive Data Services for Research including Sensitive Data Connect (SD Connect, for sensitive data storage and sharing) and Sensitive Data Desktop (SD Desktop) which are designed to support secure sensitive data management through web-user interfaces accessible from the user's own computer.
Protect the data with strict access control and encryption if you work with sensitive personal data or confidential data such as trade secrets, politically sensitive information, information concerning national security, and data obtained in trust and confidence:
You can ask for advice from Hanken’s Information security officer (firstname.lastname@example.org) and Data protection officer (DPO, email@example.com) to ensure that your storage and transfer solutions meet data protection requirements.
If there are changes in personal data processing, for example, if there are new, compatible processing purposes other than the initial purpose, if there are new recipients of the personal data (e.g., new research partners or translation or transcription service providers), or if there is an addition of new data variables to the categories of personal data compiled into the dataset, the privacy notice and other documentation shall be updated and the research participants be informed of the changes prior to the new processing.
If informing each research participant of the changes proves to be impossible or would require a disproportionate effort, you can update your privacy notice on your research project’s website and/or Hanken’s webpage to make the information about the changes publicly available.
It is stated by the Office of the Data Protection Ombudsman on Minimisation of personal data in scientific research that "[a]nonymisation and pseudonymisation should be performed as soon as possible, for instance right after the data have been aggregated."
Pseudonymisation means the processing of personal data in such a manner that the personal data can no longer be attributed to the individual involved without the use of additional information. Such additional information shall be kept separately from the pseudonymised data and be subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person. (GDPR, Art. 4 (5))
Pseudonymisation can be done by removing or replacing identifiers with pseudonyms, aliases or codes. The data remain pseudonymous and personal as long as the additional identifying information exists.
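As a minimal sketch of replacing identifiers with codes, the following Python example assigns each participant a random code and returns the code key separately from the pseudonymised records. The field names (`name`, `email`), the `P-` code prefix, and the function name `pseudonymise` are hypothetical and chosen for illustration only; this is not a tool provided by Hanken.

```python
import secrets


def pseudonymise(records, id_fields=("name", "email")):
    """Replace direct identifiers with random codes.

    Returns (pseudonymised_records, key_table). The key table is the
    additional information and must be stored separately from the data.
    """
    key_table = {}       # code -> original identifying values
    code_by_person = {}  # identifying values -> code, so one person keeps one code
    pseudonymised = []
    for record in records:
        identity = tuple(record.get(field) for field in id_fields)
        if identity not in code_by_person:
            code = "P-" + secrets.token_hex(4)
            while code in key_table:  # regenerate in the unlikely case of a collision
                code = "P-" + secrets.token_hex(4)
            code_by_person[identity] = code
            key_table[code] = dict(zip(id_fields, identity))
        # Keep everything except the identifying fields.
        cleaned = {k: v for k, v in record.items() if k not in id_fields}
        cleaned["participant_code"] = code_by_person[identity]
        pseudonymised.append(cleaned)
    return pseudonymised, key_table


# Invented sample data for illustration.
sample_records = [
    {"name": "Example Person A", "email": "a@example.invalid", "answer": 4},
    {"name": "Example Person B", "email": "b@example.invalid", "answer": 2},
    {"name": "Example Person A", "email": "a@example.invalid", "answer": 5},
]
pseudonymised_data, code_key = pseudonymise(sample_records)
```

The codes are random rather than derived from the identifiers, so they cannot be recomputed from a name or email address. As long as `code_key` exists, the output remains personal data, so store it in a different, access-restricted location from `pseudonymised_data`.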
Anonymisation refers to the processing of personal data in a manner that the individual concerned cannot be re-identified. Completely anonymous data do not exist, but by using various techniques and tools and following well-executed procedures, you can achieve a result where individual persons cannot be identified with reasonable effort, whether by combining different indirect identifiers in your data or by combining your data with information from external sources.
Make an anonymisation plan which describes the anonymisation measures and evaluates the disclosure risk of data subjects’ personal data. The anonymisation plan also works as documentation on how the data have been processed. You can use the Anonymisation plan template in Anonymisation and Personal Data by the Finnish Social Science Data Archive (FSD) to write an anonymisation plan.
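One common way to evaluate disclosure risk is to count how many records share each combination of indirect identifiers (a k-anonymity-style check). The Python sketch below illustrates the idea under assumptions and is not a method prescribed by the FSD or Hanken: the field names, the sample records, and the threshold `k=3` are hypothetical, and a suitable threshold depends on your data and context.

```python
from collections import Counter


def risky_combinations(records, indirect_identifiers, k=3):
    """Return combinations of indirect identifiers shared by fewer than k records."""
    counts = Counter(
        tuple(record.get(field) for field in indirect_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Invented sample data for illustration.
survey = [
    {"age_group": "30-39", "occupation": "teacher", "postal_area": "001"},
    {"age_group": "30-39", "occupation": "teacher", "postal_area": "001"},
    {"age_group": "30-39", "occupation": "teacher", "postal_area": "001"},
    {"age_group": "60-69", "occupation": "chairperson", "postal_area": "652"},
]
risky = risky_combinations(survey, ["age_group", "occupation", "postal_area"], k=3)
```

Here the single record with the combination `("60-69", "chairperson", "652")` is flagged, because a unique combination of indirect identifiers may single out one person. A check like this only covers the variables you list; it does not account for information available from external sources.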
It is recommended to avoid using open-ended questions to collect background information such as education or occupation. Instead, use a structured form to prevent interviewees from giving free-form responses that often contain identifiers. In categorising background information, utilise existing social classifications such as the Classifications by Statistics Finland.
Usually the first anonymisation measure is to remove direct and strong indirect identifiers from your data. Use pseudonyms, aliases or codes so the data subjects are not identifiable without the use of separately stored additional information. Information on the original values and techniques used to create the pseudonyms or codes should be kept organisationally and technically separate from the pseudonymised data.
Pseudonymised data can be attributed to a natural person by the use of the additional information and are still personal data. Pseudonymised data become anonymised when the separately kept identifying information used to create the pseudonyms or codes (e.g., decryption keys, codes, applications or techniques used to pseudonymise the data) has been irreversibly destroyed and cannot be linked to the pseudonymised data.
Anonymised data are no longer considered to constitute personal data and are not subject to the data protection regulations.
The table by the FSD provides good tips for recognising direct, indirect, and strong indirect identifiers and how to anonymise research data by removing, changing or categorising these different identifiers.
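Categorising indirect identifiers can be sketched in Python as follows. The example replaces an exact age with an age band and keeps only the first characters of a postal code. The band width, the top category of 80+, the number of postal code characters kept, and all function and field names are assumptions for illustration, not values taken from the FSD table.

```python
def age_band(age, width=10, top=80):
    """Replace an exact age with a band, e.g. 34 -> '30-39'; ages >= top are grouped."""
    if age >= top:
        return f"{top}+"
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def coarsen_postal_code(code, keep=2):
    """Keep only the first `keep` characters of a postal code and mask the rest."""
    return code[:keep] + "x" * max(len(code) - keep, 0)


def generalise(record):
    """Return a copy of the record with age and postal code made less precise."""
    result = dict(record)
    result["age"] = age_band(record["age"])
    result["postal_code"] = coarsen_postal_code(record["postal_code"])
    return result


# Invented sample record for illustration.
example = generalise({"age": 34, "postal_code": "00100", "answer": 4})
```

Grouping the highest ages into one open-ended category avoids leaving a very small band at the top of the range, where a single respondent could otherwise stand out.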
As publicly available information is constantly increasing, it is important to regularly assess whether a once anonymised dataset continues to be anonymous and conduct residual risk assessments.
For special categories of personal data involving pseudonymisation or anonymisation, it may be necessary to conduct a Data protection impact assessment (DPIA) in order to ensure an appropriate level of data protection and minimise the risks to the data subjects’ rights. See 3.2 Carry out a Data protection impact assessment (DPIA) when needed and contact firstname.lastname@example.org.
(3) After active research phase
Personal data that are no longer needed for the original purpose should be disposed of as soon as possible unless there are special reasons or legislation that require archiving. For example, direct identifiers such as names, email addresses, and personal identity codes should be removed immediately after they are no longer necessary to carry out the research. Storage limitation reduces the risks related to personal data processing. If it is not possible to determine the exact data retention period, specify the criteria used to determine that period to your research participants.
Make sure that personal data, dispensable data files, temporary files created when programs are used, and all their backups are deleted in due time when they are no longer needed, and that the deleted data cannot be recovered.
Deleting files using operating system tools, or even reformatting a hard drive, will not irretrievably destroy the data. It is important to permanently destroy any data that includes personal, sensitive or confidential data after data storage is no longer necessary. Save your files to OneDrive and use the deletion feature. Remember to empty the trash as well. Data in Webropol will be erased by the Computer Centre shortly after a student's user ID is inactivated. You can ask for help and support from Hanken’s Information security officer (email@example.com) and DPO (firstname.lastname@example.org) for secure data disposal measures.
Anonymised data are published and archived in a data repository for shared use when possible. Inform your research participants of your data archival plan. Data with personal information can only be published anonymised. Pseudonymised data are still personal data, and therefore cannot be opened without explicit consent from the data subjects for that purpose. Before archiving the research data, pseudonymous data should be made anonymous by irretrievably destroying the separately kept identifying information. If you plan to publish and archive personal data, contact Hanken’s DPO (email@example.com).
If a dataset cannot be made openly accessible for justified reasons, the metadata of the dataset can still be published openly. It is strongly recommended to use the Fairdata Qvain metadata tool to describe and publish your (meta)data. See Data publishing and preservation.
For more information, see:
All research carried out in Finland shall comply with the guidelines by the Finnish National Board on Research Integrity (TENK): The Finnish Code of Conduct for Research Integrity and Procedures for Handling Alleged Violations of Research Integrity in Finland 2023 (the PDF file in English, Finnish, and Swedish). The Implementation checklist for the 2023 RI guidelines helps the leadership of an organisation, research leaders, and individual researchers ensure that the main practices of research integrity are followed.
In addition to the RI guidelines, TENK has issued the guidelines on the ethical principles to be followed as well as ethical review to be arranged for research in the humanities and social and behavioural sciences: The ethical principles of research with human participants and ethical review in the human sciences in Finland (2019, in English, Finnish, and Swedish):
When engaging in international collaboration, researchers shall follow the European Code of Conduct for Research Integrity (2023) by ALLEA, the European Federation of Academies of Sciences and Humanities, and any other applicable ethical guidelines.
Researchers shall bear the responsibility for ethical and moral concerns and decisions involved in the research and during the interaction between the researchers and research participants. Follow all the applicable ethical guidelines and good data protection practices to maintain high ethical standards and comply with relevant data protection legislation. See the section above on the Guidelines and procedures of personal data processing in research and studies at Hanken.
If you have questions concerning ethical guidelines and ethical review, contact Hanken's Research Integrity Advisor (firstname.lastname@example.org).
If your study is one of these six types, you need to fill in the ethical review request e-form and submit it to Hanken’s Research Ethics Committee:
Submit your privacy notice, consent form, Data management plan (DMP), and/or Data protection impact assessment (DPIA, if you have conducted a DPIA) as the attachments to apply for the ethical review. Indicate the date in your DMP when your request for an ethical review statement was submitted to Hanken’s Research Ethics Committee.
If you have questions concerning ethical review, please contact Hanken's Research Integrity Advisor (email@example.com).
Watch TENK's video Ethical review in the human sciences in Finland.
Legal issues related to data management include data protection laws, data-sharing agreements, data ownership, open data licenses, copyright permissions for secondary data use, and other intellectual property rights (IPRs).
Agreements on data ownership and other IPRs shall be concluded before commencing any actual research activities. Agreements about authorship also need to be made before the project begins.
Describe in your DMP how you agree upon the rights of use related to the research data you will collect, produce, and reuse for your research project. Clarify the transfer of rights procedures relevant to your project. Follow the funder's or publisher's policies. If applicable, describe confidentiality issues in your project as well.
Use a license when opening your research data, code or software for shared reuse. Licensing your open research data means that you clearly define the reuse terms and possible restrictions to the future reuse of your data. This way, you are in control of who has rights to reuse the data, and how they can reuse your data. Use machine-readable licenses that follow international standards, preferably Creative Commons. Besides Creative Commons licences, there are also specific licensing models for research data.
For more information, see: