The definition of personal data is broad under the General Data Protection Regulation of the European Union (GDPR, Regulation (EU) 2016/679). Personal data means any information relating to an identified or identifiable natural person (data subject) and encompasses all data from which a natural person can be directly or indirectly identified (GDPR, Art. 4).
Direct identifiers are information that is sufficient on its own to identify a natural person. Examples are a person’s full name, social security number, email address containing the personal name, and biometric identifiers (e.g., fingerprint, facial image, voice pattern or manual signature).
Indirect identifiers are information that on its own is not enough to identify someone, but can be used to deduce the identity of a person when linked with other available information. Examples are a person's age, gender, educational background, economic activity, occupational status, socio-economic status, household composition, income, marital status, mother tongue, nationality, ethnic background, place of work or study, and postal code.
Some types of information are identified as strong indirect identifiers which can be used to identify an individual fairly easily, such as a postal address, phone number, vehicle registration number, bibliographic citation of a publication by the individual, email address not in the form of the personal name, web address to a web page containing personal data, very rare disease, unusual job title, position held by only one person at a time (e.g., chairperson in an organisation), a student ID number, insurance or bank account number, and IP address of a computer.
Special categories of personal data (sensitive personal data) are classified as being on the increased information security level (See the PDF file “Instructions for handling and storing data and documents on different information security levels” in Information Management at Hanken). The following personal data are classified as special categories of personal data by the GDPR (Art. 9): personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person's sex life or sexual orientation. Personal data relating to criminal convictions and offences or related security measures are also, by their nature, particularly sensitive and merit specific protection as the context of their processing could create significant risks to the fundamental rights and freedoms of the data subjects (GDPR, Art. 10).
If you collect and process any information from individuals or about individuals (e.g., consumers, company managers), assume that it is personal data. Pseudonymised data can be attributed to a natural person by the use of the additional information and are still personal data.
More information about what constitutes personal data, see:
Here is one example of the situations where personal data are not adequately protected: University failed to sufficiently protect sensitive personal data, published on the web page of the European Data Protection Board (EDPB).
If you process personal data, follow the eight procedures below to maintain high ethical standards and comply with relevant data protection legislation:
(1) Before data collection (during research planning phase)
3.1 Request an ethical review when needed
3.2 Carry out a Data protection impact assessment (DPIA) when needed
(2) Data collection and analysis (during active research phase)
(3) After active research phase
(1) Before data collection (during research planning phase)
Understand the objectives of your study and define a clear, specified need for collecting personal data. Collect only the minimum amount of personal data that is necessary and proportionate to accomplishing the research tasks. Personal data should not be collected just in case they might be useful in the future. Consider whether your data can be made less identifiable while still accomplishing your research goals.
Conduct a data minimisation review for the whole process of data management, including defining the types and amount of personal data collected, the extent to which they may be accessed, further processed and shared, the purposes for which they are used, and the period during which they are kept.
A Data management plan (DMP) can help you plan the entire life cycle of research data management. It is a formal document that describes what research data will be handled and how, during and after the research project, and sets out the key measures for ethical and legal compliance and for FAIR data production. Research funders increasingly require a DMP that is written and then updated in new versions throughout the data life cycle.
Researchers can use Hanken’s DMP template or other Public DMP templates (with Hanken’s DMP guidance integrated) in DMPTuuli to write and update a DMP. Please see DMPTuuli with Hanken's DMP guidance and DMP template to learn how to get access to Hanken’s DMP template and DMP guidance.
3.1 Request an ethical review when needed
If your study is one of the six types described in Ethical review, fill in the e-form Request for an ethical review for an empirical study and submit it to Hanken’s Research Ethics Committee. Attach your Privacy notice and/or Data management plan (DMP) and/or Data protection impact assessment (DPIA, if you have conducted one) to the request. Indicate in the Privacy notice the date on which the request for ethical review was submitted to Hanken’s Research Ethics Committee.
You can contact Hanken’s Research Integrity Advisor (email@example.com) for advice.
3.2 Carry out a Data protection impact assessment (DPIA) when needed
A Data protection impact assessment (DPIA) shall be conducted if the planned personal data processing is likely to result in a high risk to the rights and freedoms of the data subjects. This may occur when students and researchers process:
The contents of a DPIA shall contain at least:
That is, in the DPIA, you identify the need for a DPIA, describe the nature of data and data processing including data collection, analysis, storage and disposal, specify what and how much data will be collected and processed, what types of processing might involve risks, the sources of risks and potential impact on the data subjects, and identify additional measures to reduce or eliminate the risks.
More information about when a DPIA is needed, see:
When a DPIA is needed, students and researchers can use Hanken's DPIA template to conduct it. The DPIA shall be conducted in consultation with Hanken's DPO (firstname.lastname@example.org), who shall also monitor its performance (GDPR, Art. 35 and 39).
(2) Data collection and analysis (during active research phase)
Personal data shall be processed lawfully with at least one of the six legal grounds defined by the GDPR (Art. 6): consent, contract, legal obligation, protection of vital interests, public interest or official authority, and legitimate interest. You need to rely on a legal basis to justify why you have the right to collect, handle, and store personal data:
For research work conducted by researchers, including PhD students, the legal basis is usually scientific research carried out in the public interest (based on the Finnish Data Protection Act (1050/2018, Section 4) and the GDPR (point (e) of Art. 6 (1))).
When collecting personal data, what researchers need to do to comply with good data management practices, data protection regulations, and research integrity includes:
Note that this consent (to participate in the research, required by ethical standards) is different from consent (to personal data processing, as a legal basis under the GDPR). The difference is acknowledged by TENK’s guidelines (p. 9).
If you do not ask for informed consent from the research participants, or if your study is one of the six types described in Ethical review, you need to request an ethical review by Hanken’s Research Ethics Committee.
There are rare cases wherein you may not have to ask for informed consent, for example, observation studies in public places and field experiments in which the experimental setup may substantially suffer from letting the participants know about the research in advance. Furthermore, if you only use secondary register data which is anonymised or aggregated (e.g., company-level data), you do not need to inform the research participants. (The "secondary register data" means that the original data about persons have been gathered by someone else or some other party/organisation than you. For example, you are analysing a company's anonymised customer database in your study, or you are analysing survey data that a governmental agency has originally gathered.)
(2) Provide all the mandated information in a Privacy notice to research participants about the processing of their personal data. Transparency is an overarching principle and a fundamental requirement under the GDPR. Personal data should be processed in a fair and transparent way. Regardless of the legal basis for processing personal data, data subjects should obtain sufficient information from you about why and how their personal data are being collected, used, stored, disseminated, or otherwise processed. The GDPR (Art. 12-14) stipulates long lists of information that shall be provided to the data subjects, including the purpose and legal basis for processing, identity and contact details of the data controller and DPO, recipients of personal data, international data transfers, data retention and deletion plans, and data subjects’ rights.
The principle of transparency requires, in particular, that data subjects be informed of the identity of the controller and the purposes of the processing, and that this and any further information be easily accessible and easy to understand (GDPR, Recital 39). The data controller determines the purposes and means (i.e., why and how) of the processing of the personal data and is primarily responsible for compliance with data protection laws throughout the data lifecycle. The data controller can allocate responsibilities according to the actual roles of the parties. The role of data controller or joint controller shall be defined case by case:
Basically, there are two situations with different timings for providing the required information, depending on whether the data are collected from the research participant or from some other sources:
Both the Informed consent form and Privacy notice shall be provided to the research participants before you collect their personal data. Afterwards, the privacy notice shall be held available, for example, on the research project’s website and/or Hanken’s webpage, and be provided upon request to all the data subjects, Data protection authorities (DPAs), and research funders. Keep both the Informed consent and Privacy notice on file.
This applies when you collect secondary data from online forums/social media. You need to ensure that data processing is fair to all the data subjects involved, that their fundamental rights are respected in compliance with ethical and privacy principles, and that relevant terms and conditions of the platform are observed. When applicable, the Privacy notice ought to be given to the data subjects involved in the collection and processing of the data from the online forums/social media.
If the provision of such information proves impossible, would involve a disproportionate effort, or would seriously impair the achievement of the objectives of your processing, you can instead make your Privacy notice publicly available, for example, on your research project’s website and/or Hanken’s webpage (GDPR, Art. 14 (5)). For example, Findata requires that data applicants post Privacy notices online, either on the home organisations’ pages or the research projects’ pages, before it grants access to secondary datasets from, e.g., Statistics Finland.
Or, if it is difficult to provide all the required information at one time, you can adopt layered fair processing notices: bring the most important information (e.g., the purposes of the processing and the identity of the data controller) to the data subjects’ attention in a first short layer, together with a click-through link to your Privacy notice with more detailed information.
More information, see:
For studies and thesis-writing by BSc/MSc/eMBA students, consent is usually used as the legal basis (GDPR, point (a) of Art. 6 (1)), unless the student is a member of a research project involving one or more researchers (at the PhD level or above). When consent is used as a legal basis for processing personal data, the consent needs to meet the requirements of the GDPR. Consent to the processing of personal data should be a “freely given, specific, informed and unambiguous indication of the data subject’s wishes,” and “be presented in a manner which is clearly distinguishable from the other matters, in an intelligible and easily accessible form, using clear and plain language” (GDPR, Art. 4 and 7). Data subjects have the right to withdraw their consent at any time. For consent to be informed, the data subject should be aware at least of the identity of the controller and the purposes of the intended processing. See Consent of the data subject by the Office of the Data Protection Ombudsman.
When collecting personal data, what students need to do to comply with data protection laws includes:
For personal data processing in studies and thesis-writing by BSc/MSc/eMBA students, data controllership shall also be determined case by case:
Processing of special categories of personal data (sensitive personal data) shall be prohibited, but there are ten exceptions or derogations to the prohibition (GDPR (Art. 9), and Data Protection Act (1050/2018, Section 6 and 7)).
A Data protection impact assessment (DPIA) may be needed when students and researchers process special categories of personal data or data of a highly personal nature. See the instructions under "3.2 Carry out a data protection impact assessment (DPIA) when needed."
What is important in your data collection and data analysis stages is that your research data are stored and backed up in a location that cannot be accessed by anyone who is not authorised, and that data transfers outside Hanken and the EU/EEA are only carried out in full compliance with relevant regulations.
See the PDF file “Instructions for handling and storing data and documents on different information security levels” on the page of Information Management at Hanken and learn what different storage solutions are allowed and suitable for different documents and data on different data security levels.
For secure storage and backup of active research data during usage, researchers use:
Data storage services provided and maintained by Hanken, including the researcher's own account on the Hanken network like H:\, Microsoft Office365 applications (e.g., OneDrive for Business), Webropol or SPSS. This solution is suitable if you do not plan to archive the data after the research project.
OR data storage services provided by CSC such as IDA which is also for data archival. IDA is a Fairdata service for both data storage and data archival. The Fairdata services are offered by the Finnish Ministry of Education and Culture and produced by CSC – IT Centre for Science.
Established and well-known infrastructures are mostly a more secure alternative for storing research data than, for example, the hard disc on the researcher’s personal computer, both in terms of data security and from a confidentiality perspective.
In addition to Hanken's and CSC's data storage systems, you can use your own password-protected personal computer and hardware (e.g., internal/external hard drives) and password-protected joint-use computers in a room located physically at Hanken with restricted access, to store and process data during research.
Unless you have entered into a Data Processing Agreement (DPA) with another system/service provider, you shall NOT use other systems and internet clouds, for example, iCloud, Dropbox, Google Docs, publicly available OneDrive (for consumers) and other survey platforms than Webropol.
If you collect personal data from online questionnaires or surveys, use the GDPR-compliant tools and platforms such as Webropol. Webropol's user instruction is available on the page of Hanken's IT services. If the information you plan to collect contains sensitive personal data or confidential data such as trade secrets and information concerning national security, it may be better that you do not collect it online.
If you collect interview data by recording the interview with mobile phones or dictaphones or by recording teleconferences, see Security instructions for handling recorded interviews in this LibGuide on RDM.
You can use Hanken’s video platform Panopto to transcribe research data, for both audio files and video files. Please note that you are responsible for not sharing the research data in Panopto with anyone else. See Transcribing qualitative data.
If you transfer personal data outside Hanken:
If you save and store your data in IDA by CSC, use the safe data transfer and sharing measures offered by IDA. See 1.8 I want to share my research data, what should I do? in FAQ of the Fairdata services by CSC.
You can use physical memory sticks or external hard drives, in cases where you or the other party do not have access to Hanken's data sharing systems (e.g., OneDrive for Business). Make sure that data are stored securely, and that you erase personal data stored on your memory sticks and on your USB disks immediately after use. You can encrypt the data on memory sticks and external hard drives.
Note that you should NOT send or share data by ordinary, non-secured email, or use systems not provided by Hanken (e.g., Dropbox, Google Docs, and publicly available OneDrive (for consumers)) for data transfers.
If you have a third party outside Hanken as the Data processor who provides, for example, translation/interpretation, transliteration/transcription or data analysis services, you need to sign a Data processing agreement (DPA) with the Data processor. Use Hanken’s Template for Data processing agreement (DPA).
For more information, see Transfers of personal data out of the European Economic Area by the Office of the Data Protection Ombudsman.
If personal data is transferred to non-EU/EEA countries, specify the countries' names in your Privacy notice and the appropriate safeguards you plan to take to ensure that the level of data protection in compliance with the GDPR is not undermined.
If no personal data are transferred from or to non-EU/EEA countries, specify in the Privacy notice that data transferred between project partners outside the EU/EEA will be restricted to anonymised data and that the transfer will be made via a secure channel. Processing and transfers of personal data will take place only inside the EU/EEA and be limited to the research.
If you work with sensitive personal data, use CSC's Sensitive Data Services for Research including Sensitive Data Connect (SD Connect, for sensitive data storage and sharing) and Sensitive Data Desktop (SD Desktop) which are designed to support secure sensitive data management through web-user interfaces accessible from the user's own computer.
Protect the data with strict access control and encryption if you work with sensitive personal data or confidential data such as trade secrets, politically sensitive information, and information concerning national security:
You can ask for advice from Hanken’s Data security officer (email@example.com) and Data protection officer (DPO, firstname.lastname@example.org) to ensure that your storage and transfer solutions meet data protection requirements.
If there are changes in personal data processing, for example, if there are new (compatible) processing purposes other than the initial purpose, if there are new recipients of the personal data (e.g., new research partners or translation or transcription service providers), or if there is an addition of new data variables to the categories of personal data compiled into the dataset, the Privacy notice and other documentation shall be updated and the research participants be informed of the changes prior to the new processing.
If informing each research participant of the changes proves to be impossible or would require a disproportionate effort, you can update your Privacy notice on your research project’s website and/or Hanken’s webpage to make the information about the changes publicly available.
Completely anonymous data do not exist, but by using various techniques and tools and following well-executed procedures, you can achieve a result where individual persons cannot be identified with reasonable efforts based on the information available, e.g., by combining different indirect identifiers, or by combining the data with information from other external sources.
Make an anonymisation plan which describes the anonymisation measures and evaluates the disclosure risk of data subjects’ personal data. The anonymisation plan also works as documentation on how the data have been processed. You can use the Anonymisation plan template in Anonymisation and Personal Data by the Finnish Social Science Data Archive (FSD) to write an anonymisation plan.
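As one possible input to the disclosure-risk evaluation in an anonymisation plan, the sketch below counts how many records share each combination of indirect identifiers and flags records that fall into very small groups. This is a simple k-anonymity-style check, not a method prescribed by this guide or by FSD. The field names, the sample records and the threshold `k=3` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def group_sizes(records, indirect_identifiers):
    """Count how many records share each combination of indirect identifiers."""
    return Counter(
        tuple(record[field] for field in indirect_identifiers)
        for record in records
    )


def records_at_risk(records, indirect_identifiers, k=3):
    """Return records whose identifier combination occurs fewer than k times.

    The threshold k is an assumption; choose it in your anonymisation plan.
    """
    sizes = group_sizes(records, indirect_identifiers)
    return [
        record
        for record in records
        if sizes[tuple(record[field] for field in indirect_identifiers)] < k
    ]


# Hypothetical sample data: three similar respondents and one unusual one.
sample = [
    {"age_band": "30-39", "occupation": "teacher", "postal_area": "001"},
    {"age_band": "30-39", "occupation": "teacher", "postal_area": "001"},
    {"age_band": "30-39", "occupation": "teacher", "postal_area": "001"},
    {"age_band": "60-69", "occupation": "chairperson", "postal_area": "002"},
]

flagged = records_at_risk(sample, ["age_band", "occupation", "postal_area"])
for record in flagged:
    print("Needs further anonymisation:", record)
```

A record flagged this way is not necessarily identifiable, and an unflagged record is not necessarily safe, because the check ignores information available from external sources. It only helps locate combinations that deserve a closer look.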
Avoid using open-ended questions to collect background information such as education or occupation. Instead, use a structured form to prevent interviewees from giving free-form responses, which often contain identifiers. When categorising background information, use existing social classifications such as the Classifications by Statistics Finland.
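Categorising an indirect identifier can be as simple as replacing an exact value with a band. The following is a minimal sketch, assuming ten-year age bands and a top category of "80+"; both choices are hypothetical and are not taken from any Statistics Finland classification, which you should prefer where one exists.

```python
def age_band(age, width=10, top=80):
    """Replace an exact age with a coarse band.

    width and top are illustrative assumptions: ages at or above `top`
    are grouped together so that rare high values are not singled out.
    """
    if age < 0:
        raise ValueError("age must be non-negative")
    if age >= top:
        return f"{top}+"
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


# Hypothetical respondents: exact ages are replaced before analysis.
ages = [34, 79, 80, 85]
bands = [age_band(a) for a in ages]
print(bands)
```

Wider bands lower the disclosure risk but also reduce analytical detail, so the band width is a trade-off to document in the anonymisation plan.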
The first anonymisation measure is usually to remove direct and strong indirect identifiers from your data. Use pseudonyms, aliases or codes so the data subjects are not identifiable without the use of separately stored additional information. Information on the original values and techniques used to create the pseudonyms and codes should be kept organisationally and technically separate from the pseudonymised data.
Pseudonymised data can be attributed to a natural person by the use of the additional information and are still personal data. Pseudonymised data become anonymised when the separately kept identifying information used to create the pseudonyms and codes (for example, decryption keys, codes, applications or techniques used to pseudonymise the data) has been irreversibly destroyed and cannot be linked to the pseudonymised data.
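The separation between pseudonymised data and the identifying key can be sketched as follows. This is an illustration under stated assumptions, not a Hanken-approved procedure: the function name `pseudonymise`, the field names and the code format (`P` followed by random hexadecimal characters) are all hypothetical.

```python
import secrets


def pseudonymise(records, id_field, direct_identifier_fields):
    """Replace direct identifiers with random codes.

    Returns (pseudonymised_records, key_table). The key_table maps each
    original identifier to its code and must be stored separately from
    the pseudonymised records, with its own access control.
    """
    key_table = {}
    used_codes = set()
    pseudonymised = []
    drop = set(direct_identifier_fields) | {id_field}
    for record in records:
        original = record[id_field]
        if original not in key_table:
            code = "P" + secrets.token_hex(4)
            while code in used_codes:  # regenerate on the rare collision
                code = "P" + secrets.token_hex(4)
            used_codes.add(code)
            key_table[original] = code
        cleaned = {k: v for k, v in record.items() if k not in drop}
        cleaned["pseudonym"] = key_table[original]
        pseudonymised.append(cleaned)
    return pseudonymised, key_table


# Hypothetical interview records; the same person appears twice.
interviews = [
    {"name": "Respondent One", "email": "one@example.com", "answer": "yes"},
    {"name": "Respondent Two", "email": "two@example.com", "answer": "no"},
    {"name": "Respondent One", "email": "one@example.com", "answer": "maybe"},
]

data, key = pseudonymise(interviews, "name", ["email"])
```

The codes are random rather than derived from the names, so they cannot be recomputed from the original values. As long as `key` exists, `data` remains personal data. Removing the key table does not by itself make the data anonymous if indirect identifiers still allow identification.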
The table by FSD provides good tips for recognising direct, indirect, and strong indirect identifiers and how to anonymise research data by removing, changing or categorising these identifiers.
As publicly available information is constantly increasing, it is important to regularly assess whether a once anonymised dataset continues to be anonymous and conduct residual risk assessments.
For special categories of personal data involving pseudonymisation or anonymisation, it may be necessary to conduct a Data protection impact assessment (DPIA) in order to ensure an appropriate level of data protection and minimise the risks to the data subjects’ rights. See 3.2 Carry out a Data protection impact assessment (DPIA) when needed.
(3) After active research phase
Personal data that are no longer needed for the original purposes should be disposed of as soon as possible unless there are special reasons or legislation that require archiving. For example, direct identifiers such as names, email addresses, and social security numbers should be removed as soon as they are no longer necessary to carry out the research. Storage limitation reduces risks related to personal data processing. If it is not possible to determine the exact data retention period, tell your research participants the criteria used to determine that period.
You also need to make sure that personal data, dispensable data files, temporary files created when programs are used, and all their backups are deleted in due time when they are no longer needed, and that the deleted data cannot be recovered.
Deleting files using operating system tools, or even reformatting a hard drive, will not irretrievably destroy the data. It is important to permanently destroy any data that includes personal, sensitive or confidential data after data storage is no longer necessary. You can ask for help and support from Hanken’s Data security manager (email@example.com) and DPO (firstname.lastname@example.org) for secure data disposal measures.
Anonymised data are published and archived in a data repository for shared use when possible. Inform your research participants of your data archival plan. Data with personal information can only be published anonymised. Pseudonymised data are still personal data, and therefore cannot be opened without explicit consent for that purpose. Before archiving the research data, pseudonymous data should be made anonymous by irretrievably destroying the separately kept identifying information. If you plan to publish and archive personal data, contact Hanken’s DPO (email@example.com).
If a dataset cannot be made openly accessible for justified reasons, the metadata of the dataset can still be published openly. It is strongly recommended to use the Fairdata Qvain metadata tool to describe and publish your (meta)data. See Data publishing and preservation.
Remember to register your datasets in Hanken's research database - Haris and add the persistent identifiers (PIDs, e.g., DOI and URN) for your (meta)data.
More information, see:
If your study is one of these six types, you need to fill in the ethical review request e-form and submit it to Hanken’s Research Ethics Committee:
Submit your Privacy notice and/or Data management plan (DMP) and/or Data protection impact assessment (DPIA, if you have conducted one) as attachment(s) to the request for an ethical review. Indicate in the Privacy notice the date on which the request for ethical review was submitted to Hanken’s Research Ethics Committee.
If you have questions concerning ethical review, contact Hanken's Research Integrity Advisor Anu Helkkula (firstname.lastname@example.org).
Video: Ethical review in the human sciences in Finland by TENK
All research shall comply with the Finnish National Board on Research Integrity (TENK) guidelines Responsible conduct of research and procedures for handling allegations of misconduct in Finland (2012). The RCR guidelines are available in Finnish, Swedish, and English.
In addition to the RCR guidelines, TENK has issued guidelines on the ethical principles to be followed and on the ethical review to be arranged for research in the humanities and social and behavioural sciences: The ethical principles of research with human participants and ethical review in the human sciences in Finland (2019), available in Finnish, Swedish and English.
When engaging in international collaboration, researchers shall follow the European Code of Conduct for Research Integrity by ALLEA, the European Federation of Academies of Sciences and Humanities, and any other applicable ethical guidelines.
Researchers shall bear the responsibility for ethical and moral concerns and decisions involved in the research and during the interaction between the researcher and research participants. Follow Hanken's ethical guidelines and good data protection practices to maintain high ethical standards and comply with relevant legislation. See Guidelines and procedures of personal data processing in research and studies at Hanken.
Legal issues related to data management include data protection policy, data-sharing agreements, data ownership, open data licenses, secondary data usage copyright permissions and other Intellectual Property Rights (IPRs) issues. Agreements on data ownership and other intellectual property rights shall be concluded before commencing any actual research activities.
Use a license when opening your research data, code or software for shared reuse. Licensing your open research data means that you clearly define the reuse terms and possible restrictions to the future reuse of your data. This way, you are in control of who has rights to reuse the data, and how. Use machine-readable licenses that follow international standards, preferably Creative Commons. Besides Creative Commons licences, there are also specific licensing models for research data.
The Creative Commons CC BY 4.0 license is recommended for published datasets when possible.
More information, see: