Open Targets

The user is responsible for reviewing and complying with the license requirements of the software, notebooks, and data referenced in this documentation.

Users are responsible for the costs associated with analyzing the Open Targets dataset and its storage in their project spaces.

Instance type availability and pricing are subject to the contract between the user or the user’s organization and DNAnexus.

Citations for Open Targets

The latest publication about Open Targets, "Open Targets Platform: facilitating therapeutic hypotheses building in drug discovery" (2025), describes recent updates to the Open Targets Platform. Users can also find more information about Open Targets in previous publications:

Open Targets is a public-private initiative led by the European Bioinformatics Institute (EBI) that comprehensively aggregates public data sources for drug discovery. The official version of the dataset is hosted on the Open Targets Platform.

Overview of the Open Targets Dataset

Open Targets is an integrated data resource that enables the systematic identification and prioritization of therapeutic targets. It combines diverse publicly available datasets with resources generated by the Open Targets consortium to compute and score target–disease associations, helping drive more informed decisions in early drug discovery. By aggregating evidence across genetics, molecular QTLs, somatic variation, expression, pathways, chemical biology, pharmacology, and literature, it provides comprehensive annotation of targets, diseases and drugs within a unified framework.

The Open Targets Platform integrates data informing multiple steps in the target identification and prioritization process, from assessing the causal and supporting evidence of a target’s role in disease, through target prioritization, to therapeutic hypothesis generation.

Further dataset information can be found in the official Open Targets documentation. The schema for each dataset can be viewed online in the Open Targets Data Download section.

On DNAnexus, we provide the complete Open Targets Platform release (version 25.09), including 38 datasets across seven major categories (target–disease associations, targets, ontology, genetics, diseases, drugs, and literature). For an overview of this release, refer to the official release blog and release notes. In December 2025, Open Targets released a new version (25.12), which is also available on DNAnexus. Refer to the release notes for version 25.12 to learn more about it.

See the “Where to Access Open Targets” section below to start accessing the dataset.

Where to Access Open Targets

The following files are available for the Open Targets datasets:

To use the dataset and notebooks, copy them into your own project space. Instructions are given in the section titled "Copying Data and Notebooks into a Project".

Running analyses on Open Targets

Copying Data and Notebooks into a Project

To utilize the dataset, please copy the data from the project listed above into your own project.

Here are the steps to copy the Open Targets data into a Project Space:

  1. Create a project for your Open Targets dataset, billed to your own organization. Tutorials on how to set up a project can be found on this page.

  2. Go to the Resources tab, find the project titled “Public Datasets Region”, and select the folder "Open-Targets".

  3. Select the data folder and the notebooks.

  4. Select "Copy" on the top right menu, and select the project that you created in Step 1.

  5. Then, go to the project space you created in Step 1 to start exploring the Open Targets dataset and notebooks.

  6. To run the JupyterLab notebooks, see the JupyterLab section of the Academy Documentation, including the pages on running a JupyterLab Notebook and Running a Spark JupyterLab Notebook.
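For users who prefer the command line, the copy in steps 2 to 4 can also be done with the `dx cp` command from the DNAnexus dx-toolkit, which copies folders recursively between projects. The sketch below only builds and prints the command. The project identifiers are hypothetical placeholders, not values from this documentation, and must be replaced with the ID of the public dataset project and the ID of the project created in Step 1.

```python
import shlex
from typing import List


def build_copy_command(source_project: str, folder: str, destination_project: str) -> List[str]:
    """Return the argument list for a `dx cp` call that copies one folder
    from the source project into the root of the destination project."""
    source = f"{source_project}:/{folder.strip('/')}"
    destination = f"{destination_project}:/"
    return ["dx", "cp", source, destination]


# Hypothetical placeholders: substitute your real project IDs.
command = build_copy_command("project-SOURCE", "Open-Targets", "project-DEST")
print(shlex.join(command))
```

To perform the copy, pass the list to `subprocess.run(command, check=True)` in an environment where dx-toolkit is installed and you are logged in with `dx login`.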

Example notebook

We prepared an example notebook that shows how to extract colocalizations for GWAS credible sets associated with autoimmune diseases. The notebook is named “autoimmune_colocalisations_spark.ipynb” and is optimized for JupyterLab with a Spark cluster.

  • Instance type: mem1_ssd1_v2_x16

  • Before running the notebook, follow the command-line instructions provided in the notebook example and run them in the terminal.
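To give a sense of what this kind of analysis involves, here is a minimal PySpark sketch. It is not the code from the provided notebook. It assumes the datasets are readable as Parquet under a hypothetical `data_root` folder, that the dataset folders are named `study`, `credible_set`, and `colocalisation`, and that the columns used below (`studyType`, `diseaseIds`, `studyLocusId`, `leftStudyLocusId`, `h4`) match the release you copied. Check these names against the schema in the Open Targets Data Download section before relying on them.

```python
def dataset_path(data_root: str, dataset: str) -> str:
    """Join the (hypothetical) data root folder and a dataset folder name."""
    return f"{data_root.rstrip('/')}/{dataset}"


def colocalisations_for_diseases(spark, data_root, disease_ids, min_h4=0.8):
    """Return colocalisation rows whose left-hand credible set belongs to a
    GWAS study annotated with one of the given disease identifiers."""
    # Imported here so dataset_path() can be used without Spark installed.
    from pyspark.sql import functions as F

    study = spark.read.parquet(dataset_path(data_root, "study"))
    credible_set = spark.read.parquet(dataset_path(data_root, "credible_set"))
    coloc = spark.read.parquet(dataset_path(data_root, "colocalisation"))

    # GWAS studies linked to the diseases of interest (assumed column names).
    gwas_studies = (
        study.filter(F.col("studyType") == "gwas")
        .withColumn("diseaseId", F.explode("diseaseIds"))
        .filter(F.col("diseaseId").isin(list(disease_ids)))
        .select("studyId", "diseaseId")
    )

    # Credible sets of those studies, keyed the way the colocalisation table
    # is assumed to reference its left-hand side.
    gwas_loci = credible_set.join(gwas_studies, on="studyId").select(
        F.col("studyLocusId").alias("leftStudyLocusId"), "studyId", "diseaseId"
    )

    # Keep colocalisations with strong support; rows with a null h4 are dropped.
    return coloc.filter(F.col("h4") >= min_h4).join(gwas_loci, on="leftStudyLocusId")
```

In a Spark JupyterLab session, the function would be called with the active `SparkSession`, the folder holding the copied data, and a list of disease identifiers for the autoimmune diseases of interest. The `min_h4` threshold of 0.8 is an illustrative default, not a recommendation from Open Targets.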

Video: Utilizing the Open Targets Dataset on the DNAnexus Platform
