The Humanitarian Data Exchange (HDX) is an open platform for sharing data across crises and organisations. Launched in July 2014, HDX aims to make humanitarian data easy to find and use for analysis. Our growing collection of datasets has been accessed by users in over 200 countries and territories.
HDX is managed by OCHA's Centre for Humanitarian Data, which is located in The Hague. OCHA is part of the United Nations Secretariat and is responsible for bringing together humanitarian actors to ensure a coherent response to emergencies. The HDX team includes OCHA staff and a number of consultants who are based in North America, Europe and Africa.
For the latest content about HDX, click here. A selection is below:
- UN DESA - Humanitarian Data Exchange: Making Data Accessible for the COVID-19 Pandemic
- Thomson Reuters Foundation - In a World Awash with Data, Aid Workers Contend with Gaps
- Good Code Podcast - The UN on Humanitarian Data
- UN Radio - Saving lives with data: how the UN is developing digital tools for improved humanitarian aid
- Capacity4Dev.eu - Q&A: The challenge of open data in humanitarian response
- AcademyHealth - Sixth Annual Health Data Liberator Award Honors Work to Increase Use and Impact of Humanitarian Data
- Devex - Opinion: Humanitarian world is full of data myths. Here are the most popular
To find our logo files and other materials, please click here.
We define humanitarian data as:
- data about the context in which a humanitarian crisis is occurring (e.g., baseline/development data, damage assessments, geospatial data)
- data about the people affected by the crisis and their needs
- data about the response by organisations and people seeking to help those who need assistance.
We build and test HDX using the latest versions of Chrome and Firefox. We also test on Microsoft Edge, but do not formally support it.
Use our password recovery form to reset your account details. Enter your username or e-mail and we will send you an e-mail with a link to create a new password.
Anyone can view and download the data from the site, but registered users can access more features. After signing up you can:
- Contact data contributors to ask for more information about their data.
- Request access to the underlying data for metadata only entries (our HDX Connect feature).
- Join organisations to share data or to access private data, depending on your role within the organisation, e.g. an admin, editor, or member of the organisation (see more below).
- Request to create a new organisation and if approved, share data publicly or privately.
- Add data visualizations as showcase items alongside your organisation’s datasets.
- Follow the latest changes to data.
HDX allows registered users to follow the data they are interested in. Updates to the datasets that you follow will appear as a running list in your user dashboard (accessible from your user name in the top right of every page when you are logged in). You can follow data, organisations, locations, topics and crises.
You’ll find a ‘Request Access’ button for datasets where only metadata is provided. The HDX Connect feature makes it possible to discover what data is available or what data collection initiatives are underway. Only registered users have the ability to contact the organisation through the request access module. The administrator for the contributing organisation can decide whether to accept or deny the request. Once the connection is made, HDX is not involved in the decision to share the data. Learn more about HDX Connect here.
Organisations in HDX can be legal entities, such as WFP, or informal groups, such as the Shelter Cluster or Information Management Working Group for a specific country. Data can only be shared on HDX through an organisation. The HDX team verifies all organisations to ensure they are trusted and have relevant data to share with the HDX user community.
On an organisation’s page, click on the ‘Stats’ tab to see how many visitors an organisation has received and which datasets are most popular in terms of downloads. Here’s an example. The number of unique visitors is approximate and is based on the browser someone uses when visiting HDX. A user visiting from different browsers or from different devices will be counted separately.
You can also see a timeline of how often an individual dataset has been downloaded on each dataset page. The download timeline is located on the left side of a dataset page, just beside the dataset description. Downloads for a dataset are counted as the total number of downloads of any resource in a dataset, with repeated downloads of the same resource by the same user being counted a maximum of once per day.
There is a delay, usually less than one day, between when a user views a page or downloads a resource on HDX and when the activity is visible in these graphs and figures.
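The download-counting rule described above (repeated downloads of the same resource by the same user count at most once per day) can be sketched as follows. This is a minimal illustration, not HDX's actual implementation: the `(user_id, resource_id, day)` record shape and the function name are hypothetical.

```python
from datetime import date


def count_downloads(events):
    """Count downloads for a dataset.

    `events` is an iterable of (user_id, resource_id, day) tuples, a
    hypothetical stand-in for real download logs. Repeats of the same
    resource by the same user on the same day collapse into one download.
    """
    unique = {(user, resource, day) for user, resource, day in events}
    return len(unique)


events = [
    ("alice", "res-1", date(2021, 3, 1)),
    ("alice", "res-1", date(2021, 3, 1)),  # same user, resource and day: ignored
    ("alice", "res-1", date(2021, 3, 2)),  # next day: counted again
    ("alice", "res-2", date(2021, 3, 1)),  # different resource: counted
    ("bob", "res-1", date(2021, 3, 1)),    # different user: counted
]
print(count_downloads(events))  # prints 4
```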
You can request an organisation through the ‘Add Data’ button. We ask you to submit the following information: an organisation name, description and link to an organisation-related website (optional). We review this information and then either accept the request or ask for more information, such as a sample dataset. Approved organisations will remain inactive and will not be displayed on the ‘Organisations’ page until at least one dataset has been shared through HDX.
Registered users have an option to join an organisation during signup. You can also request membership through the organisation’s page. Please keep in mind that you need to work for the organisation to request membership. When you click the ‘Request Membership’ button, a request is sent to the organisation’s administrator(s). The requestor cannot specify the role (i.e., admin, editor or member). Instead, the person receiving the request assigns the role. If you do not see this option on the organisation page, the organisation is a closed group and is not accepting new members.
Organisation membership includes three roles:
- Administrators can add, edit and delete datasets belonging to the organisation and accept or refuse new member requests.
- Editors can add, edit and delete datasets belonging to the organisation but cannot manage membership.
- Members can view the organisation’s private datasets, but cannot add new datasets or manage membership.
The user who requests the creation of an organisation is assigned an administrator role. That person can invite other HDX users into their organisation and assign them one of the three roles above, or registered users on HDX can request membership from the organisation’s administrator(s).
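The three roles above can be summarised as a small permission table. The sketch below is only an illustration of the rules as stated in this FAQ; the action names (`view_private`, `manage_datasets`, `manage_members`) are hypothetical labels and not part of any HDX API.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "admin"
    EDITOR = "editor"
    MEMBER = "member"


# Hypothetical action names chosen for this sketch.
# Every role can view the organisation's private datasets.
PERMISSIONS = {
    Role.ADMIN: {"view_private", "manage_datasets", "manage_members"},
    Role.EDITOR: {"view_private", "manage_datasets"},
    Role.MEMBER: {"view_private"},
}


def can(role, action):
    """Return True if the given role is allowed to perform the action."""
    return action in PERMISSIONS[role]


print(can(Role.EDITOR, "manage_datasets"))  # prints True
print(can(Role.EDITOR, "manage_members"))   # prints False
```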
Organisation admins can invite new members, remove existing members or change their roles from the ‘Members’ tab on the organisation page.
Registered users can also initiate a request to join your organisation during the signup process or later on from your organisation page (if you want to disable this option, read the question below: ‘I am an organisation admin. I don’t want anyone to request membership and want to manually add/remove members.’).
Membership requests are sent to your email and also added as a notification on HDX. If you can confirm that the user works for your organisation (e.g., by using a company directory) or is in your trusted network, then you may approve the request. If you cannot verify who the user is, you should decline the request. Please do not approve membership requests for people outside your organisation or working group. For full details on managing members, please read this document. Please be aware that anyone added to your organisation on HDX can view the organisation’s private datasets.
Organisation admins have the option to make the organisation an open or closed group. By default, all organisations are open groups, which allows new users to request membership. If you don’t want to allow new members to request to join your organisation, you can turn off the ‘Allow members’ checkbox under ‘Edit organisation page’. This will make your organisation a closed group limited to its existing members. No new member will be able to send a request to join your organisation on HDX. The admin(s) of your organisation can still manually invite new members, remove existing members or change their roles from the ‘Members’ tab.
Yes. Registered users can be part of several organisations.
If your organisation is not listed, you can request to create one or you may want to join an existing organisation via your dashboard. For instance, there may be a WFP organisation that was created by its staff at headquarters in Rome. You may prefer to join that one rather than creating a separate organisation for a specific location, e.g., WFP Liberia. You can see the full list of organisations by clicking Organisations in the main navigation.
If you have previously created an organisation and no longer see it on the site, this is because you have not yet shared a public dataset. Once you share a dataset, your organisation will become active and visible on the site. For details on how to upload a dataset, see “How do I add a dataset?”.
Yes. Each administrator is able to manage datasets and membership. If a user requests membership, the request will be sent to all organisation administrators. The decision to accept or deny a membership request will be taken by whichever administrator acts first. The other administrators are not alerted to this action. We are planning to make this process more clear in future versions of the platform, so please bear with us!
HDX offers custom organisation pages to all organisations on the site. The page includes the organisation’s logo and colour palette, topline figures, space for a data visualization and the list of datasets. If you would like a custom page, send a request to firstname.lastname@example.org.
‘Group message’ lets members of an organisation send messages to all other members of their organisation. Please find more details here.
You can keep your account. On the page of the organisation that you’re a part of, click the link to ‘Leave this organisation’. If you want to change the e-mail address associated with your account, click on your username in the upper-right corner of any HDX page and then select ‘User Settings’. From there, you can update your profile.
A dataset is a collection of related data resources. A resource is an individual file within a dataset. When sharing data, you first create a dataset and then you can add one or more resources to it. A resource can either be a file uploaded to HDX (such as a CSV or XLS file) or a link to another website with a downloadable file. A resource, such as a readme file, could also contain documentation that helps users to understand the dataset.
Click on the ‘Add Data’ button from any page on HDX. You will be required to login and associate yourself with an organisation. These slides provide a walkthrough of how to add a dataset. General information about all the metadata options in HDX is available in our Guide to Metadata.
You can only edit a dataset if you are an administrator or editor of your organisation. If you have the appropriate role, on the dataset page you will find an ‘Edit’ button just below the dataset title on the right. This will allow you to edit the dataset metadata and the resources. These slides provide a walk-through of how to edit a dataset.
If your data uses the HXL standard, then HDX can automatically create customizable graphs and key figures to help you highlight the most important aspects of your dataset. We call these ‘Quick Charts’. For a Quick Chart to be generated, your dataset needs to be public and contain a CSV or XLSX resource with HXL tags. HXL is easy! Check out the 30-second tutorial.
The resource can be stored on HDX or as a remote resource at another URL. Quick Charts will be generated from the first resource with HXL tags in the list of a dataset’s resources. The system will try to generate up to three charts based on the HXL tags, and these can be changed to best tell the story in your data. You can edit each Quick Chart’s title, axis labels, and description. Don’t forget to save the changes so they become the default view that users see when viewing your dataset. Here’s a good example to get you started.
Learn more about HXL and HDX Tools in the section below.
Organisation admins and editors can add data visualizations to dataset pages to let users explore your data. The data visuals can be made using Tableau, Power BI or whatever software you prefer. The visuals will appear in the “Interactive Data” section at the top of the page.
Learn how to do this by taking a quick look at these slides.
Data Check automatically detects and highlights common humanitarian data errors in your HXL-tagged spreadsheet, including through validation against CODs and other vocabularies. You can access Data Check from:
- HDX via dataset pages (the “Validate with Data Check” option will appear under the “More” button under HXL-tagged resources)
- HDX Tools, for datasets that exist outside of HDX. For this option, you should not use Data Check to process personal or otherwise sensitive data.
Data uploaded to HDX Tools is not retained within the HDX infrastructure, while data downloaded by HDX Tools from public URLs is cached only as long as necessary for processing.
Data Check uses a generic schema that detects many kinds of common errors like possible spelling mistakes or atypical numeric values, but in some cases, an organisation will want to validate against its own more-specific rules. In that case, you can write your own, custom HXL schema and validate using the HXL Proxy (Data Check’s backend engine) directly. Information is available on these pages in the HXL Proxy wiki: HXL schemas, Validation page, and Validation service.
We define data as information that common software can read and analyse. We encourage contributions in any common data format. HDX has built-in preview support for tabular data in CSV and Microsoft Excel (xls only) formats, and for geographic data in zipped shapefile, kml and geojson formats. If multiple formats are available, each can be added as a resource to the dataset, or if you only wish to add one format, then for tabular data, csv is preferable and for geographic data, zipped shapefile is preferred.
A PDF file is not data. If you have a data visualization in PDF format, you can add it as a showcase item on the dataset page. If you wish to share documents, graphics, or other types of humanitarian information that are not related to the data you are sharing, please visit our companion sites ReliefWeb and HumanitarianResponse.
Resources can be either different formats of the same data (such as XLSX and CSV) or different releases of the same data (such as March, April, and May needs assessments). Always put the resource with the most-recent or most-important information first, because the HDX system will by default use the first resource to create visualisations such as Quick Charts or geographic preview (this default can be overridden in the dataset edit page).
If you have data that is substantially different, like a different type of assessment or data about a different province, we recommend creating a separate dataset.
For datasets: the keywords in your dataset title are matched to the search terms users enter when looking for data in HDX. Avoid using abbreviations in the title that users may not be familiar with. Also avoid using words such as current, latest or previous when referring to the time period (e.g., latest 3W), as these terms become misleading as the dataset ages. The following is a good example of a dataset title: ‘Who is Doing What Where in Afghanistan in Dec 2016’.
For resources: by default, the resource name is the name of the uploaded file. However, you can change this if needed to make it more clear to users.
For zipped shapefiles: we recommend the filename be name_of_the_file.shp.zip. However, the system does not require this construction.
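As an illustration of the dataset-title advice above, the following sketch flags the time-relative words that the guidance asks contributors to avoid. It is a hypothetical helper, not a check that HDX is known to run, and the word list contains only the three words named in the text.

```python
import re

# Words named in the guidance as becoming misleading as a dataset ages.
AGEING_WORDS = ("current", "latest", "previous")


def ageing_words_in(title):
    """Return the time-relative words from AGEING_WORDS found in a title."""
    words = re.findall(r"[a-z]+", title.lower())
    return [word for word in AGEING_WORDS if word in words]


print(ageing_words_in("Latest 3W"))
# prints ['latest']
print(ageing_words_in("Who is Doing What Where in Afghanistan in Dec 2016"))
# prints []
```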
If your resource is simply a link to a file hosted elsewhere, there is no size limit. If you are uploading a file onto HDX, the file size is limited to 300MB. If you have larger files that you want to share, e-mail us at email@example.com.
Yes. HDX allows you to drag and drop files from your computer. First, you need to click on the ‘Add Data’ link and then select files from your computer. Drop the files in the designated area. A new dataset form will appear with some fields already pre-filled.
The data that users download from HDX will always reflect updates made to the remote resource (such as a file on Dropbox or Google Drive). However, the metadata and activity stream will not automatically indicate the updated date of the data. This has to be done manually in HDX by the dataset owner. We are working to improve this functionality, so please bear with us!
The HDX system will attempt to create a map, or geographic preview, from geodata formats that it recognizes. For a geographic preview to be generated, your data needs to be in either a zipped shapefile, kml or geojson format. Ensure that the ‘File type’ field for the resource also has one of the above formats. Pro tip: HDX will automatically add the correct format if the file extension is ‘.shp.zip’, ‘.kml’, or ‘.geojson’. Here are examples of geodata points, lines, and polygons showing the preview feature.
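The extension-to-format mapping mentioned in the pro tip can be pictured with a short sketch. This is an assumption about how such a lookup might work, not HDX's code: the function name is hypothetical, and the returned labels simply reuse the format names from the text.

```python
# Extensions and format names taken from the text above. The compound
# suffix '.shp.zip' is checked as a whole, so a plain '.zip' does not match.
GEO_SUFFIXES = (
    (".shp.zip", "zipped shapefile"),
    (".kml", "kml"),
    (".geojson", "geojson"),
)


def guess_geo_format(filename):
    """Return the geodata format implied by a filename, or None."""
    name = filename.lower()
    for suffix, label in GEO_SUFFIXES:
        if name.endswith(suffix):
            return label
    return None


print(guess_geo_format("admin_boundaries.shp.zip"))  # prints zipped shapefile
print(guess_geo_format("archive.zip"))               # prints None
```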
The preview feature will continue to work when there are multiple geodata resources in a single dataset (i.e., one HDX dataset with many resources attached). The layers icon in the top-right corner of the map enables users to switch between geodata layers. Here is an example.
To generate a map preview, a dataset can have multiple resources but each resource can only include one layer within it. Resources with multiple layers (e.g., multiple shapefiles in a single zip file) are not supported. In this case, the system will only create a preview of the first layer in the resource; however, all the layers will still be available in the downloaded file. If you would like all of the layers to display, you need to create a separate resource for each layer.
Searching for datasets on HDX is done in two ways: by searching for terms that you type into the search bar found at the top of almost every page on HDX, and by filtering a list of search results.
Entering a search term causes HDX to look for matching terms in the titles, descriptions, locations and tags of a dataset. The resulting list of items can be further refined using the filter options on the left side of the search result. You can filter by location, tag, organisation, license and format as well as filtering for some special classes of datasets (like datasets with HXL tags or datasets with Quick Charts) in the ‘featured’ filters.
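The two-step behaviour described above (match a term against titles, descriptions, locations and tags, then narrow the results with filters) can be sketched in memory as follows. The dataset dictionaries, field names and sample entries are hypothetical, and simple substring matching is an assumption; the real search engine behind HDX is certainly more sophisticated.

```python
def search_datasets(datasets, term, **filters):
    """Return datasets whose title, description, locations or tags contain
    `term`, keeping only those that also satisfy every filter."""
    needle = term.lower()
    matches = []
    for dataset in datasets:
        searchable = " ".join(
            [dataset["title"], dataset["description"]]
            + dataset["locations"]
            + dataset["tags"]
        ).lower()
        if needle not in searchable:
            continue
        if all(dataset.get(field) == wanted for field, wanted in filters.items()):
            matches.append(dataset)
    return matches


# Hypothetical catalogue entries used only for illustration.
catalogue = [
    {"title": "Who is Doing What Where in Afghanistan in Dec 2016",
     "description": "Operational presence", "locations": ["Afghanistan"],
     "tags": ["3w"], "organisation": "Example Org", "format": "CSV"},
    {"title": "Afghanistan health facilities",
     "description": "Facility locations", "locations": ["Afghanistan"],
     "tags": ["health"], "organisation": "Another Org", "format": "GeoJSON"},
]

results = search_datasets(catalogue, "afghanistan", format="CSV")
print([d["title"] for d in results])
```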
In 2015, HDX migrated the Common Operational Datasets (CODs) from the COD Registry on HumanitarianResponse.info to HDX. Each of these datasets has a ‘cod’ tag. To limit search results to only CODs, use the ‘CODs’ filter in the filter panel on the left side of the dataset list. You can also find all COD datasets here.
The Data Grid is a prototype feature to help our users find the most critical and useful data. The Data Grid provides a quick way to find datasets that meet or partially meet the criteria for a set of core data categories, like internally displaced persons and refugee numbers, conflict events, transportation status, food prices, administrative divisions, health facilities, and baseline population. These categories of core data, determined from research with our users, may be customized to meet the needs of specific countries and the evolving data needs of humanitarian response. The small square to the left of the dataset name indicates if the dataset fully (solid blue) or partially (hashed blue and white) meets the criteria for the Data Grid category in which it appears. In the latter case, hovering on a dataset name displays some comments about the limitations of the dataset. Learn more in our blog post about it.
Data Grid is not available for all countries. Here is an overview.
All data on HDX must include a minimum set of metadata fields. You can read our Guide to Metadata to learn more. We encourage data contributors to include as much metadata as possible to make their data easier to understand and use for analysis.
Data quality is important to us, so we manually review every new dataset for relevance, timeliness, interpretability and comparability. We contact data contributors if we have any concerns or suggestions for improvement. You can learn more about our definition of the dimensions of data quality and our quality-assurance processes here.
This metadata field indicates how often you expect the data in your dataset to be updated. It should reflect the frequency with which you believe your data will change. This can be different from how often you check your data. It includes values like “Every day” and “Every year” as well as the following:
- Live – for datasets where updates are continuous and ongoing
- As needed – for datasets with an unpredictable, widely varying update frequency
- Never – for datasets with data that will never be changed
We recommend you choose the nearest less frequent regular value instead of “As needed” or “Never”. This helps with our monitoring of data freshness. For example, if your data will be updated every 1-6 days, pick “Every week”, or if every 2 to 9 weeks, choose “Every three months”.
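The "nearest less frequent regular value" rule can be sketched as a lookup on the longest gap you expect between updates. The list below contains only the regular values named in this FAQ, and the day counts attached to them are assumptions; HDX offers more values than these.

```python
# Regular values named in the text, with assumed lengths in days.
REGULAR_VALUES = [
    ("Every day", 1),
    ("Every week", 7),
    ("Every three months", 90),
    ("Every year", 365),
]


def suggest_update_frequency(longest_gap_days):
    """Return the first regular value that is at least as long as the
    longest expected gap between updates, or None if none is long enough."""
    for label, days in REGULAR_VALUES:
        if longest_gap_days <= days:
            return label
    return None


print(suggest_update_frequency(6))   # updated every 1-6 days: prints Every week
print(suggest_update_frequency(63))  # every 2 to 9 weeks: prints Every three months
```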
The green leaf symbol indicates that a dataset is up to date – that there has been an update to the data in the dataset (not the dataset metadata) within the expected update frequency plus some leeway. For more information on the expected update frequency metadata field and the number of days a dataset qualifies as being fresh, see here.
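The freshness rule described above reduces to a date comparison: a dataset counts as up to date if its data was last updated within the expected update frequency plus some leeway. In this sketch the function name and the leeway of 7 days used in the example are assumptions; the actual leeway values are documented by HDX separately.

```python
from datetime import date, timedelta


def is_fresh(last_data_update, today, expected_days, leeway_days):
    """Return True if the data was updated within the expected update
    frequency plus the leeway. Only data updates count, not metadata edits."""
    allowed = timedelta(days=expected_days + leeway_days)
    return today - last_data_update <= allowed


# A weekly dataset with an assumed leeway of 7 days.
print(is_fresh(date(2021, 1, 1), date(2021, 1, 10), 7, 7))  # prints True
print(is_fresh(date(2021, 1, 1), date(2021, 1, 20), 7, 7))  # prints False
```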
No. HDX will never make changes to the data that has been shared. We do add tags, or make changes to dataset titles to help make your data more discoverable by HDX users. We may also add a data visualization for the data in the dataset showcase. A list of changes appears in the activity stream on the left-hand column of the dataset page.
The HDX team manually reviews every dataset uploaded to the platform as part of a standard quality assurance (QA) process. This process exists to ensure compliance with the HDX Terms of Service, which prohibit the sharing of personal data. It also serves as a means to check different quality criteria, including the completeness of metadata, the relevance of the data to humanitarian action, and the integrity of the data file(s).
If an issue is found, the resource(s) requiring additional review will be temporarily unavailable for download and marked as ‘under review’ in the dataset page on the public HDX interface.
The Humanitarian Exchange Language (HXL) is a simple standard for messy data. It is based on spreadsheet formats such as CSV or Excel. The standard works by adding hashtags with semantic information in the row between the column header and the data, which allows software to validate, clean, merge and analyse data more easily. To learn more about HXL and who’s currently using it, visit the HXL standard site.
HDX is currently adding features to visualise HXL-tagged data.
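To show where the hashtag row sits, here is a minimal sketch that reads a small HXL-tagged CSV using only Python's standard library. The sample data is invented, and the detection rule (the first row whose non-empty cells all start with `#`) is a simplification assumed for this example; dedicated HXL libraries apply more careful rules.

```python
import csv
import io

# Invented sample: a header row, then the HXL hashtag row, then the data.
SAMPLE = """Province,Organisation,People reached
#adm1,#org,#reached
Coast,Example Org,1200
Plains,Another Org,850
"""


def read_hxl(text):
    """Return the data rows as dicts keyed by HXL hashtag."""
    rows = list(csv.reader(io.StringIO(text)))
    for index, row in enumerate(rows):
        cells = [cell.strip() for cell in row if cell.strip()]
        if cells and all(cell.startswith("#") for cell in cells):
            tags = [cell.strip() for cell in row]
            return [dict(zip(tags, data)) for data in rows[index + 1:]]
    raise ValueError("no HXL hashtag row found")


records = read_hxl(SAMPLE)
print(records[0]["#org"])  # prints Example Org
```

Because software reads the hashtags rather than the human-readable headers, the header text can be in any language or wording without affecting processing.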
HDX Tools include a number of HXL-enabled support processes that help you do more with your data, more quickly. The tools include:
- Quick Charts – Automatically generate embeddable, live data charts, graphs and key figures from your spreadsheet.
- HXL Tag Assist – See HXL hashtags in action and add them to your own spreadsheet.
- Data Check – Data cleaning for humanitarian data that automatically detects and highlights common errors, including through validation against CODs and other vocabularies.
You can find all HDX Tools through tools.humdata.org. The tools will work with data that is stored on HDX, the cloud or local machines. The only requirement is that the data includes HXL hashtags.
If your data uses HXL hashtags, then the Quick Charts tool can automatically create customizable graphs and key figures to help you highlight the most important aspects of your dataset. Quick Charts require the following:
- The first resource in your dataset (stored on HDX or remotely) must have HXL hashtags.
- That dataset must have the HDX category tag ‘HXL’ (not to be confused with the actual HXL hashtags).
For more details you can view these walkthrough slides.
Every Quick Chart on HDX includes a small link icon at the bottom that gives you HTML markup to copy into a web page or blog to add the chart. The chart will be live, and will update whenever the source data updates. If your data is not on HDX, you can also generate a Quick Chart using the standalone version of the service, available on https://tools.humdata.org.
The HXL Tag Assist tool will show you different HXL hashtags in datasets that organisations have already uploaded to HDX. You can find a quick (and portable) list of the core HXL hashtags on the HXL Postcard. The detailed list of HXL hashtags and attributes is available in the HXL hashtag dictionary. Finally, an up-to-date machine-readable version of the hashtag dictionary is available on HDX.
You can use Data Check to compare your HXL-tagged dataset against a collection of validation rules that you can configure. Data Check identifies the errors in your data such as spelling mistakes, incorrect geographical codes, extra whitespace, numerical outliers, and incorrect data types.
For more details you can view these walkthrough slides.
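To give a feel for the kinds of errors listed above, the sketch below checks one column of values for extra whitespace and, optionally, for values that are not numbers. It is a hypothetical, much-reduced illustration and not how Data Check or the HXL Proxy is implemented; it does not cover geographical codes, spelling or outliers.

```python
def check_column(values, numeric=False):
    """Return a list of (row_index, problem) pairs for one column.

    Flags leading, trailing or repeated internal whitespace, and, when
    `numeric` is True, values that cannot be read as numbers.
    """
    problems = []
    for index, value in enumerate(values):
        if value != " ".join(value.split()):
            problems.append((index, "extra whitespace"))
        if numeric:
            try:
                # Thousands separators such as "1,200" are tolerated.
                float(value.strip().replace(",", ""))
            except ValueError:
                problems.append((index, "not a number"))
    return problems


print(check_column(["Coast", " Plains", "North  East"]))
# prints [(1, 'extra whitespace'), (2, 'extra whitespace')]
print(check_column(["12", "n/a", "1,200"], numeric=True))
# prints [(1, 'not a number')]
```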
For the purpose of sharing data through HDX, we have developed the following categories to communicate data sensitivity:
- Non-Sensitive – This includes datasets containing country statistics, roadmaps, weather data and other data with no foreseeable risk associated with sharing.
- Uncertain Sensitivity – For this data, sensitivity depends on a number of factors, including other datasets collected in the same context, what technology is or could be used to extract insights, and the local context from which the data is collected or which will be impacted by use of the data.
- Sensitive – This includes any dataset containing personal data of affected populations or aid workers. Datasets containing demographically identifiable information (DII) or community identifiable information (CII) that can put affected populations or aid workers at risk are also considered sensitive data. Depending on context, satellite imagery can also fall into this third category of sensitivity.
The Working Draft of the OCHA Data Responsibility Guidelines (‘the Guidelines’) helps staff better assess and manage the sensitivity of the data they handle in different crisis contexts. We recommend that HDX users familiarize themselves with the Guidelines.
Different data can have different levels of sensitivity depending on the context. For example, locations of medical facilities in conflict settings can expose patients and staff to risk of attacks, whereas the same facility location data would likely not be considered sensitive in a natural disaster setting.
Recognizing this complexity, the Guidelines include an Information and Data Sensitivity Classification model to help colleagues assess and manage sensitivity in a standardized way.
For microdata (survey and needs-assessment data), you can manage the sensitivity level by applying a Statistical Disclosure Control (SDC) process. There are several tools available online to do SDC – we use sdcMicro.
The Centre has developed a Guidance Note on Statistical Disclosure Control that outlines the steps involved in the SDC process, potential applications for its use, case studies and key actions for humanitarian data practitioners to take when managing sensitive microdata.
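One widely used idea in statistical disclosure control is k-anonymity: each combination of indirectly identifying values should be shared by at least k records. The sketch below computes the smallest such group for an invented microdata sample. It is a Python illustration of the concept only, not the sdcMicro workflow and not a description of the checks HDX runs; the column names and records are hypothetical.

```python
from collections import Counter


def smallest_group(records, key_columns):
    """Return the size of the smallest group of records sharing the same
    values in `key_columns`. A result of 1 means at least one respondent
    is unique on those columns. Expects a non-empty list of records."""
    groups = Counter(
        tuple(record[column] for column in key_columns) for record in records
    )
    return min(groups.values())


# Invented survey records for illustration.
survey = [
    {"district": "Coast", "age_group": "18-29", "household_size": 4},
    {"district": "Coast", "age_group": "18-29", "household_size": 6},
    {"district": "Plains", "age_group": "60+", "household_size": 2},
]

print(smallest_group(survey, ["district", "age_group"]))  # prints 1
```

In this sample the single respondent in Plains aged 60+ is unique on the chosen columns, which is the kind of record an SDC process would treat as higher risk.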
HDX endeavors not to allow publicly shared data that includes community identifiable information (CII) or demographically identifiable information (DII) that may put affected people at risk. However, this type of data is more challenging to identify within datasets during our quality assurance process without deeper analysis. In cases where we suspect that survey data may have a high risk of re-identification of affected people, we run an internal statistical disclosure control process using sdcMicro. Data is made private while we run this process. If the risk level is found to be too high for public sharing on HDX given the particular context to which the data relates, HDX will notify the data contributor to determine a course of action.
HDX promotes the use of licences developed by the Creative Commons Foundation and the Open Data Foundation. The main difference between the two classes of licences is that the Creative Commons licences were developed for sharing creative works in general, while the Open Data Commons licences were developed more specifically for sharing databases. See the full list of licences here.