The members of the Inter-Agency Standing Committee (IASC) Sub-Group on Data Responsibility (co-led by the OCHA Centre for Humanitarian Data, IOM, and UNHCR) have developed this FAQ to support organizations and staff around the world working with data in the COVID-19 response. The ongoing response presents a range of challenges and opportunities around the safe, ethical, and effective management of data. This resource will be updated regularly as we receive additional questions and feedback.
The World Health Organization recommends the following measures to ensure the ethical and secure use of data:
- Use anonymization and other tools as appropriate.
- Comply with informed consent agreements where such consent is needed and respect assurances about ways in which the data (anonymized or otherwise) would be used, shared, stored or protected.
- Adopt appropriate security measures to foster public trust.
- Any platforms established to share data should have an explicit ethical framework governing data collection and use.
In addition, consider the following:
- Ensure adequate de-identification of data within health data management activities. Consult relevant guidance to determine which tool is most appropriate for de-identification of the type of data you’re handling. When using digital (communication) technologies in healthcare, data protection is paramount. Determine which tools are used by healthcare professionals and only use tools that allow for the appropriate level of encryption.
- Clearly define the purpose of data management, measures for data minimisation and limitation of data retention, and the specific roles and responsibilities of different stakeholders throughout the data management process. This should include a clear overview of which parties are responsible for safeguarding data at different stages.
- When sharing data with specific recipients, be transparent regarding the appropriate use of the data, and make sure this is compatible with the original purpose for which the data was collected.
- Data can be vulnerable to interception at points of transfer between different organizations. Additionally, data may be misused intentionally or unintentionally after the transfer. Select the right method and tool for transfer, and clearly stipulate the license or terms under which data may be used (see “What are the different licenses available for data sharing and what do they cover?” for more information on this point).
Following these best practices will help ensure responsible data management in the COVID-19 response.
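As an illustration of the de-identification point above, the following is a minimal Python sketch that drops direct identifiers and replaces a record ID with a keyed hash. The field names (`name`, `phone`, `patient_id`) and the helper functions are hypothetical, not taken from any specific guidance. Keyed hashing is pseudonymization rather than anonymization: records can still be linked to each other, and anyone holding the key can recompute the tokens, so the key itself must be protected.

```python
import hashlib
import hmac

# Hypothetical field names; adjust to the dataset being handled.
DIRECT_IDENTIFIERS = {"name", "phone"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 hash; the same input and key give the same token."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict, secret_key: bytes, id_field: str = "patient_id") -> dict:
    """Return a copy of the record without direct identifiers and with a pseudonymous ID."""
    # Keep only fields that are not listed as direct identifiers.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Replace the record ID, if present, with its keyed hash.
    if id_field in cleaned:
        cleaned[id_field] = pseudonymize(str(cleaned[id_field]), secret_key)
    return cleaned
```

Removing direct identifiers does not by itself prevent re-identification through combinations of the remaining fields, so a step like this would normally be followed by a disclosure risk assessment.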
Your organization may have standard definitions for data sensitivity included in a data policy or elsewhere. Data sensitivity definitions may also be found in applicable privacy or data protection legislation. In the absence of such guidance, any data that may put certain individuals, groups or organizations at risk of harm in a particular context should be considered sensitive. While personal data can categorically be considered sensitive, more nuanced issues arise for non-personal data. For example, locations of medical facilities in conflict settings can expose patients and staff to risk, while the same data would not necessarily be considered sensitive in a natural disaster response context.
In the health sector specifically, all identifiable data concerning health, factors influencing health (for example, cultural and socio-economic details) and the history of individuals are sensitive and must be handled with care and professionalism. In addition, any data (identifiable or not) that can be voluntarily or involuntarily misused against the interests of patients, potential patients, their family, groups or communities and/or health service providers or other humanitarian organizations and their staff, or put any of them at risk for political reasons, financial gain or any other reasons shall be treated as “highly sensitive” data. Even some seemingly non-sensitive data can be highly sensitive in certain contexts (for example, details of cholera outbreaks). Finally, the metadata generated as a ‘byproduct’ of data management can create a distinct set of risks, which should not be overlooked. For more information on the risks associated with metadata, see https://www.icrc.org/en/document/digital-trails-could-endanger-people-receiving-humanitarian-aid-icrc-and-privacy
In the COVID-19 response, the following common data types may be considered sensitive and should be treated with care:
- any directly identifiable data (such as datasets containing names or telephone numbers);
- any indirectly identifiable data (such as survey results or call detail records that have not been appropriately anonymized);
- non-identifiable data on sensitive topics, including but not limited to aggregated and/or anonymized data on violence-related injuries, rape, termination of pregnancy, and patients in prisons or detention centers;
- information on the disease in a context where there is an obligation to abide by treatment or other related measures, such as quarantine;
- non-identifiable data which reveals or implies racial or ethnic origin, political opinions, religious or philosophical beliefs, offences or sex life or preferences.
Assessing the sensitivity of data requires a clear understanding of the context and the different ways in which data may lead to harm. Data Sensitivity Classifications such as this example (from the working draft OCHA Data Responsibility Guidelines) can help humanitarian organizations consistently assess and manage data sensitivity in different environments.
These classifications can be developed at the country level and/or at the sector/cluster level where necessary (e.g. the health cluster may wish to establish a sensitivity classification specific to data required for COVID-19 response interventions in certain contexts). Humanitarians operating at the national or sub-national level are encouraged to engage with the appropriate partners and coordinating bodies to ensure data management is conducted according to relevant standards for information management (IM) services in public health. This includes aligning with existing context-specific data sensitivity classifications.
Data management in the COVID-19 response should be principled and follow existing best practice in humanitarian data management. Some key measures for upholding privacy and data protection include:
- Purpose limitation: clearly specify the purpose for which data is needed, explain this to the populations from whom data will be collected, and establish safeguards to ensure that data is used only for the intended purpose.
- Privacy by design: anticipate and build in technical and procedural measures to prevent privacy-invasive events at the outset of a data management exercise.
- Transparency: provide accurate and complete information to people about what data is being collected about them, for what purpose, how it will be used, how long it will be kept and who it will be shared with.
- Necessity and proportionality: only collect data that is relevant and necessary to achieve the purpose specified, thereby abiding by the principle of data minimisation.
- Time limitations: ensure that any data processing is strictly limited in time and that data collected for COVID-19 response efforts is not retained beyond the time for which it is strictly needed to combat the pandemic.
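Two of the measures above, data minimisation and time limitation, can be partly enforced in code. The sketch below is illustrative only: the list of required fields and the 180-day retention period are hypothetical placeholders, and the actual values should come from the stated purpose of the data collection and from applicable policy or law.

```python
from datetime import date, timedelta

# Hypothetical values; derive the real ones from the documented purpose and policy.
REQUIRED_FIELDS = {"district", "age_group", "test_result"}
RETENTION_PERIOD = timedelta(days=180)


def minimise(record: dict, required: set = REQUIRED_FIELDS) -> dict:
    """Keep only the fields needed for the specified purpose."""
    return {k: v for k, v in record.items() if k in required}


def is_expired(collected_on: date, today: date,
               retention: timedelta = RETENTION_PERIOD) -> bool:
    """Return True when a record has been held longer than the retention period."""
    return today - collected_on > retention
```

A routine job could call `is_expired` on each stored record and flag expired ones for deletion or review, while `minimise` can be applied at the point of collection or before sharing.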
For additional resources and examples of best practice on data protection and privacy in the COVID-19 response, see this repository from UN Global Pulse: https://www.unglobalpulse.org/policy/covid-19-data-protection-and-privacy-resources/
For detailed recommendations on data privacy, data protection, and responsible data management in digital contact tracing, see this recent working paper from UNICEF: https://www.unicef-irc.org/publications/1096-digital-contact-tracing-surveillance-covid-19-response-child-specific-issues-iwp.html
Data on the characteristics of units of a population (e.g. individuals, households or establishments) collected by a census, survey or experiment is referred to in statistics as ‘microdata’. In humanitarian response, this type of data is gathered through exercises such as household surveys, needs assessments, and other programme monitoring activities. Such data make up an increasingly significant volume of data in the humanitarian sector, and will play a key role in the COVID-19 response.
In its raw form, microdata can contain both personal data and non-personal data on a range of topics. Most humanitarian organisations acknowledge the sensitivity of personal data such as names, biometric data, or ID numbers and anonymise data sets accordingly as a matter of standard practice. However, it is often still possible to re-identify individual respondents or groups by combining answers to different questions, even after such ‘anonymisation’ is applied.
Depending on the type of data you’re managing, there are various tools available to determine and reduce the risk of re-identification in the data. For microdata, one such approach is Statistical Disclosure Control (SDC).
SDC is a technique used to assess and lower the risk of a person or organization being re-identified from the analysis of microdata (data on the characteristics of a population). The purpose of applying disclosure control to microdata is to be able to share the data more widely in a responsible manner. An SDC process can lower the risk of re-identification to an acceptable level but the risk threshold may vary depending on the context to which the data relates. There are a variety of free and open source tools available for conducting SDC, including sdcMicro. Read this guidance note from the Centre for Humanitarian Data for more information on how to start using SDC.
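To make the idea of re-identification risk more concrete, the following Python sketch computes k-anonymity, one of the measures commonly used in disclosure risk assessment. It is not the sdcMicro interface (sdcMicro is an R package) and it is not a full SDC process; the function names and the choice of quasi-identifiers are assumptions made for illustration. A dataset is k-anonymous when every combination of quasi-identifier values (for example, age group and district) is shared by at least k records.

```python
from collections import Counter


def _group_key(record: dict, quasi_identifiers: list) -> tuple:
    """Build the combination of quasi-identifier values for one record."""
    return tuple(record[q] for q in quasi_identifiers)


def k_anonymity(records: list, quasi_identifiers: list) -> int:
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    counts = Counter(_group_key(r, quasi_identifiers) for r in records)
    return min(counts.values()) if counts else 0


def risky_records(records: list, quasi_identifiers: list, k: int = 3) -> list:
    """Return records whose quasi-identifier combination appears fewer than k times."""
    counts = Counter(_group_key(r, quasi_identifiers) for r in records)
    return [r for r in records if counts[_group_key(r, quasi_identifiers)] < k]
```

Records returned by `risky_records` are candidates for treatment such as recoding values into broader categories or suppressing them. The acceptable value of k depends on the context, in line with the point above that the risk threshold may vary.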
The World Health Organization has published technical guidance on surveillance and case definitions for COVID-19. This guidance includes resources for use in case-based reporting — including a Case-based reporting form, a Data dictionary for case-based reporting form, and Template for Line list for case-based reporting — as well as aggregated reporting, including an Aggregated weekly reporting form.
The World Health Organization maintains a real-time dashboard providing an overview of the COVID-19 situation here: https://experience.arcgis.com/experience/685d0ace521648f8a5beeeee1b9125cd
Their data is updated live and can be accessed here: https://data.humdata.org/dataset/coronavirus-covid-19-cases-and-deaths
A number of humanitarian organizations are publishing data about different aspects of the global and country-level response to COVID-19. Many of these resources are available in a dedicated COVID-19 crisis page on the Humanitarian Data Exchange.
Many national health authorities also provide updates on a daily basis. Visit your national health authority’s website for more information.
Consult the relevant guidance (such as a data policy or specific protocols for a given data management activity) or focal point within your organization to see which methods and tools are considered appropriate for the secure transfer of (sensitive) data. In general, a secure method or tool will encrypt data in transit and at rest and offer secure authentication and access restrictions, among other security features. For example, most email service providers allow you to turn on encryption of emails and their attachments.
Licenses stipulate the terms under which data is shared. This means that a license will describe how data may be used and shared further, as well as any attribution to the original source that should take place. A list of commonly used licenses is available here: https://data.humdata.org/about/license
Epidemic models are an essential tool in the hands of governments and policy makers for planning and responding to COVID-19. This crisis shows how predictive analytics can inform and maximise the impact of interventions, especially in resource-limited contexts. It also shows the importance of having models that are validated and ready to be deployed right before or at the beginning of a crisis.
Unfortunately, translating the outputs of predictive models into timely and appropriate responses in the humanitarian sector remains a challenge for several reasons:
- First, there is no common standard for documenting predictive models and their intended use that highlights the aspects critical to applying models in the humanitarian sector.
- Second, there is no common standard or mechanism for assessing the technical rigor and operational readiness of predictive models in the sector.
- Third, the development of predictive models is often led by technical specialists who may not consider important ethical concerns that the application of models in humanitarian contexts may entail.
One approach for addressing these challenges is to submit models for peer review. The Centre for Humanitarian Data recently published an updated version of its Peer Review Framework for Predictive Analytics in Humanitarian Response. The Framework aims to create standards and processes for the use of models in our sector. It is based on research with experts and stakeholders across a range of organizations that design and use predictive models. The Framework also draws on best practices from academia and the private sector.
Many individual organizations have policies and guidelines specific to the safe, ethical, and effective management of different types of data. Institutional policies on personal data protection are particularly relevant to the responsible management of health data and should serve as a primary reference for staff in the COVID-19 response.
In addition, many national and regional authorities have included provisions specific to health data management in national and regional data protection legislation and other relevant regulatory frameworks. National laws on medical practice may also include specific rules on health data management. Consult a local legal professional to ensure you are aware of and abide by all applicable data protection laws.
The World Health Organization Policy statement on data sharing by WHO in the context of public health emergencies (as of 13 April 2016) and Guidance on good data and record management practices are the primary global frameworks of reference for the management of data in public health emergencies.
The Global Health Cluster Standards for Public Health Information Services in Activated Health Clusters and other Humanitarian Health Coordination Mechanisms should also serve as a key reference for humanitarian practitioners. Although this document refers to Public Health Information Services (PHIS) in activated health clusters (HCs), these PHIS Standards are by no means restricted to health clusters, and can be applied to support government led emergency coordination or other types of humanitarian sectoral coordination mechanisms.
The WHO ‘Policy on the use and sharing of data collected in Member States by the WHO, outside the context of public health emergencies’ contains extensive annexes on security, safeguards, ethics and guidance on implementation and may also serve as a helpful reference. However, the policy excludes data shared in the context of public health emergencies, including Public Health Emergencies of International Concern (such as the COVID-19 pandemic) and data and reports from clinical trials and biological samples, and data collected by WHO prior to policy implementation.
When data is used for purposes other than informing the response (e.g. research), additional frameworks and principles may apply. Researchers should refer to the WHO Code of Conduct for responsible Research, which provides standards of good practice to guide individuals working on all research associated with WHO, including non-clinical research, in line with the principles of integrity, accountability, independence/impartiality, respect and professional commitment described in WHO’s Code of Ethics and Professional Conduct.
Data responsibility entails a set of principles, processes and tools that support the safe, ethical and effective management of data in humanitarian response. This includes data privacy, protection, and security, as well as other practical measures to mitigate risk and prevent harm.
There is a wealth of guidance available on how to responsibly manage data in public health emergencies and in humanitarian action more generally that should inform data management in the COVID-19 response. The following resources provide additional information and guidance on the safe, ethical, and effective management of data in humanitarian action:
- Recommendations on privacy and data protection in the fight against COVID-19 (Access Now)
- Handbook on Data Protection in Humanitarian Action (ICRC and Brussels Privacy Hub)
- Working Draft Data Responsibility Guidelines (OCHA Centre for Humanitarian Data)
- mHealth Data Security, Privacy, and Confidentiality: Guidelines for Program Implementers and Policymakers
For a broad range of resources related to data responsibility in development and humanitarian work, consult the Responsible Data Resource List maintained by MERL Tech and the Engine Room.
Recent changes to working conditions have increased the use of online conferencing tools throughout the humanitarian sector. These conferencing technologies are invaluable when face-to-face meetings are not possible, but they also pose a significant information security and data protection risk when not used responsibly. Some steps for reducing these risks include:
- Familiarizing yourself with your organization’s approved online conferencing tools, their features and settings
- Using only online conferencing tools that are approved, configured and verified as secure by your organization
- Using a unique access code so that only those with the code for that meeting can access the room, particularly when a sensitive topic is being discussed
- Monitoring the dashboard of participants to ensure no uninvited parties are attending throughout the call
For more information and additional recommendations, see this tip sheet developed by the ICRC, IFRC and the Centre for Humanitarian Data on the responsible use of online conferencing tools.