# AlphaFold2

## Necessary Disclaimers and Legal

Users are responsible for reviewing and complying with the license requirements of the software, notebooks, and data referenced in this documentation.

Users are responsible for compute and storage costs incurred within their DNAnexus project spaces.

Instance type availability and pricing are subject to the agreement between the user (or their organization) and DNAnexus.

## Citations and Acknowledgments

This documentation references data and tools from the following resources:

* For AlphaFold2 predictions, please cite the original AlphaFold publication and the [nf-core/proteinfold pipeline](https://zenodo.org/records/7437038) (see the pipeline's CITATIONS.md file).

## Overview of post-folding analysis notebook

AlphaFold2, as implemented in the nf-core/proteinfold workflow, generates two primary outputs:

1. A predicted 3D structure
2. An associated confidence metric

The PDB file contains the atomic coordinates of the predicted model. Confidence information is encoded in the B-factor field as the predicted Local Distance Difference Test (pLDDT) score, which reflects residue-level confidence in the predicted local structure. Higher pLDDT values indicate greater confidence in local structural accuracy, while lower values often correspond to flexible or disordered regions. For additional background on AlphaFold2 outputs and confidence metrics, please refer to the DeepMind article [Enabling high-accuracy protein structure prediction at the proteome scale](https://deepmind.google/blog/enabling-high-accuracy-protein-structure-prediction-at-the-proteome-scale), training material from [EMBL-EBI](https://www.ebi.ac.uk/training/online/courses/alphafold/inputs-and-outputs/alphafold-inputs-and-outputs-recap/), and the [nf-core/proteinfold output documentation](https://nf-co.re/proteinfold/1.1.1/docs/output/).
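As a rough illustration of how pLDDT can be read back out of the B-factor field, the sketch below parses PDB-format text using the standard fixed-column layout (B-factor in columns 61-66). The function name `read_plddt` is hypothetical and is not taken from the notebook. The sketch assumes one pLDDT value per residue and therefore reads it from the C-alpha atom only.

```python
def read_plddt(pdb_text):
    """Return {(chain, residue_number): pLDDT} from PDB-format text.

    Assumption: the B-factor column holds pLDDT, as in AlphaFold2 output,
    and every atom of a residue carries the same value, so reading the
    C-alpha atom is enough.
    """
    scores = {}
    for line in pdb_text.splitlines():
        # Only ATOM records long enough to contain the B-factor column.
        if not line.startswith("ATOM") or len(line) < 66:
            continue
        if line[12:16].strip() != "CA":   # atom name, columns 13-16
            continue
        chain = line[21]                  # chain ID, column 22
        resnum = int(line[22:26])         # residue number, columns 23-26
        scores[(chain, resnum)] = float(line[60:66])  # B-factor, columns 61-66
    return scores
```

A typical call would pass the contents of a predicted model, for example `read_plddt(open(path).read())`, where `path` points to your own PDB file.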

The notebook, alphafold2\_plddt\_p2rank\_analysis-2026-04-08.ipynb, is available on the Platform in the following regions: [AWS US East](https://platform.dnanexus.com/panx/projects/J3JyY6j030gzQypGpk273241/data/Post_folding_analysis), [AWS Europe (Frankfurt)](https://platform.dnanexus.com/panx/projects/J780j7848VpfB6kJ8p7y29xG/data/Post_folding_analysis), [AWS Europe (London)](https://platform.dnanexus.com/panx/projects/J780fzpKpb7Gq5X4ZJfBP7QX/data/Post_folding_analysis), [Azure Amsterdam](https://platform.dnanexus.com/panx/projects/J780gY0B34pvq5X4ZJfBP7YP/data/Post_folding_analysis), and [Azure US (West)](https://platform.dnanexus.com/panx/projects/J780v289Z00G4Kx14b188ybj/data/Post_folding_analysis).

### Workflow description

This notebook evaluates the structural reliability of an AlphaFold2-predicted model and identifies confidence-supported binding regions. The workflow has four steps:

* Extract residue-level pLDDT scores from the PDB file
* Identify low-confidence regions (pLDDT < 50)
* Predict potential binding pockets using P2Rank
* Retain pockets enriched in high-confidence residues (e.g., pLDDT ≥ 70)

The final output provides a confidence-aware overview of predicted binding sites within the structure.
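The filtering logic in the steps above can be sketched as follows. This is a minimal illustration, not the notebook's implementation: the function names, the `min_fraction=0.8` enrichment threshold, and the representation of pockets as lists of `(chain, residue_number)` tuples are all assumptions made for this example. Mapping P2Rank's output files onto that representation is left out.

```python
def low_confidence_segments(plddt, cutoff=50.0):
    """Group consecutive residues with pLDDT < cutoff into (chain, start, end) runs."""
    segments = []
    current = None  # [chain, start, end] of the run being built
    for chain, resnum in sorted(plddt):
        if plddt[(chain, resnum)] < cutoff:
            if current and current[0] == chain and resnum == current[2] + 1:
                current[2] = resnum          # extend the current run
            else:
                if current:
                    segments.append(tuple(current))
                current = [chain, resnum, resnum]  # start a new run
        elif current:
            segments.append(tuple(current))  # a confident residue ends the run
            current = None
    if current:
        segments.append(tuple(current))
    return segments


def confident_pockets(pockets, plddt, high=70.0, min_fraction=0.8):
    """Keep pockets whose share of residues with pLDDT >= high reaches min_fraction.

    min_fraction is a hypothetical threshold chosen for this sketch.
    Returns {pocket_name: fraction_of_high_confidence_residues}.
    """
    kept = {}
    for name, residues in pockets.items():
        scored = [plddt[r] for r in residues if r in plddt]
        if not scored:
            continue  # no residue of this pocket has a score
        fraction = sum(s >= high for s in scored) / len(scored)
        if fraction >= min_fraction:
            kept[name] = fraction
    return kept


# Toy data for illustration only (not taken from the T1024 example).
plddt = {("A", 1): 40.0, ("A", 2): 45.0, ("A", 3): 80.0,
         ("A", 4): 90.0, ("A", 5): 30.0, ("A", 6): 75.0}
pockets = {"pocket1": [("A", 3), ("A", 4), ("A", 6)],
           "pocket2": [("A", 1), ("A", 2), ("A", 3)]}

print(low_confidence_segments(plddt))    # [('A', 1, 2), ('A', 5, 5)]
print(confident_pockets(pockets, plddt)) # {'pocket1': 1.0}
```

Reporting low-confidence regions as contiguous runs rather than single residues makes it easier to spot disordered stretches, and scoring each pocket by the fraction of confident residues keeps the filter independent of pocket size.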

## Running notebooks on the DNAnexus platform

### Copying notebooks and snapshot into a Project

To use the notebooks, copy them into your own project as follows:

1. Create a project for your analysis, billed to your own organization. Tutorials on how to set up a project can be found on this page.
2. Go to the Resources tab, find the project titled “Public Datasets AWS US (East)”, and select the following folders:
   1. “Post\_folding\_analysis” (post-folding notebook)
   2. “Notebook\_snapshot”
3. Select the notebooks and files you want to copy from these folders. For environment setup, use the snapshot **snapshot-molecular\_modeling-jupyterlab-2026-04-08.tar.gz**.
4. Select “Copy” in the top-right menu, then choose the project that you created in Step 1.
5. Go to the project you created in Step 1 to start exploring the notebooks.
6. To run the JupyterLab Notebooks, please see the [JupyterLab section of the Academy Documentation](https://academy.dnanexus.com/interactivecloudcomputing/jupyterlab).

### Instance Type Selection

* Instance wait times are subject to queue availability. Less common instance types may result in longer wait times due to their limited availability.
* Instances started with snapshots may take longer to initialize due to environment setup.
* Instance type availability and pricing are subject to the contract between the user or the user’s organization and DNAnexus.
* The notebooks are optimized for [JupyterLab with Python, R, Stata, ML, Image Processing](https://academy.dnanexus.com/interactivecloudcomputing/jupyterlab) (version 2.11). If you do not have access, please contact the Success Team at <success@dnanexus.com> or the Sales Team at <sales@dnanexus.com>.
* Recommended instance type for this demo: mem1\_ssd1\_v2\_x16.

### A note on notebooks

* Use the snapshot when starting the job (e.g., snapshot-molecular\_modeling-jupyterlab-2026-04-08.tar.gz). The snapshots can be found in the “Notebook\_snapshot” folder under “Public Datasets AWS US (East)”.
* Before running the notebooks, follow the instructions in the notebook markdown to select the correct kernel. If the required kernel is not available, activate the corresponding conda environment and register the kernel as described in the provided instructions.
* The post-folding analysis notebook uses an example output file (T1024) generated by AlphaFold2 within the nf-core/proteinfold pipeline. The example files are available in the Results folder of the Public Datasets project for each region.
* If you would like to use this dataset in your own project, follow the steps in “Copying notebooks and snapshot into a Project” and update the data path in the notebook accordingly. Alternatively, you may use the provided script to download the data directly from the Public Datasets AWS US (East) project.

