Home
AlphaPulldown is a general pipeline for high-throughput structural modeling built on AlphaFold-Multimer and AlphaFold3. It was designed for customizable high-throughput screening of protein-protein interactions. It extends AlphaFold's capabilities with additional run options, such as customizable multimeric structural templates (TrueMultimer), MMseqs2 multiple sequence alignment (MSA) via ColabFold databases, protein fragment predictions, and the ability to use crosslinking mass spectrometry data as input via AlphaLink2.
Although we recommend running AlphaPulldown via the Snakemake pipeline, it is also possible to run a two-step pipeline composed of Python scripts. For details on using the scripts, please refer to the documentation.
To enable faster usage and avoid redundant feature recalculations, we have developed a public database containing precomputed features for all major model organisms. You can browse the full list and download individual features at https://alphapulldown.s3.embl.de/index.html or https://s3.embl.de/alphapulldown/index.html.
For more details, click here.
Figure 1 Overview of the AlphaPulldown workflow
The AlphaPulldown workflow involves the following 3 steps:
-
Create and store MSA and template features:
In this step, AlphaFold uses HMMER to search preinstalled databases for each query protein sequence and computes multiple sequence alignments (MSAs) from the homologs it finds. It also searches for homologous structures to use as templates for feature generation. This step requires only CPUs.
Customizable options include:
- Use MMseqs2 instead of the default HMMER to speed up the search.
- Use a custom MSA.
- Use a custom structural template, including a multimeric one (TrueMultimer mode).
-
Structure prediction:
In this step, the AlphaFold neural network runs and produces the final protein structure. This step requires a GPU. A key strength of AlphaPulldown is its ability to flexibly define how proteins are combined for the structure prediction of protein complexes. There are three main approaches:
Figure 2 Three typical scenarios covered by AlphaPulldown
-
Single file (custom mode or homo-oligomer mode): Create a file in which each row either lists the protein sequences you want to predict together or tells the program to model a homo-oligomer with a specified number of copies.
-
Multiple files (pulldown mode): Provide several files, each containing protein sequences. AlphaPulldown automatically generates all possible combinations by pairing the rows of protein names from each file.
-
All versus all: AlphaPulldown will generate all possible non-redundant combinations of proteins in the list.
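The three approaches above differ only in how the list of prediction jobs is built. The sketch below illustrates that idea with the Python standard library. It is not AlphaPulldown's own code: the function names and protein names are hypothetical, and it assumes that "non-redundant" in all-versus-all mode means unordered pairs of distinct proteins.

```python
from itertools import combinations, product


def custom_mode(rows):
    """Each row already lists the proteins to model together."""
    return [tuple(row) for row in rows]


def homo_oligomer_mode(rows):
    """Each row is (protein, copies); repeat the protein that many times."""
    return [(name,) * copies for name, copies in rows]


def pulldown_mode(*files):
    """Pair every row of each file with every row of the other files."""
    return list(product(*files))


def all_vs_all_mode(proteins):
    """Unordered pairs of distinct proteins (assumption: no self-pairs)."""
    return list(combinations(proteins, 2))


if __name__ == "__main__":
    baits = ["baitA"]                          # hypothetical protein names
    candidates = ["prot1", "prot2", "prot3"]
    print(pulldown_mode(baits, candidates))
    print(all_vs_all_mode(candidates))
    print(homo_oligomer_mode([("prot1", 3)]))
```

With one bait and N candidates, pulldown mode yields N jobs, whereas all-versus-all over the same N candidates yields N(N-1)/2 jobs under the assumption above. This is worth checking before launching a large screen.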
AlphaPulldown also allows you to:
- Select only region(s) of proteins that you want to predict instead of the full-length sequences.
- Adjust MSA depth to control the influence of the initial MSA on the final model.
- Integrate high-throughput crosslinking data with AlphaFold modeling via AlphaLink2.
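To make the region-selection option concrete, here is a minimal sketch of extracting residue ranges from a full-length sequence. It only illustrates the concept: the helper name, the 1-based inclusive numbering, and the example sequence are assumptions, and it does not reproduce how AlphaPulldown itself parses or applies region specifications.

```python
def select_regions(sequence, regions):
    """Return the subsequence for each (start, end) range.

    Ranges are assumed to be 1-based and inclusive, as residue
    numbering usually is; this convention is an assumption here.
    """
    fragments = []
    for start, end in regions:
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"region {start}-{end} is outside 1-{len(sequence)}")
        # Convert the 1-based inclusive range to a Python slice.
        fragments.append(sequence[start - 1:end])
    return fragments


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQ"  # hypothetical sequence
    print(select_regions(seq, [(1, 5), (10, 14)]))
```

Validating the ranges up front avoids silently truncated fragments, which Python slicing would otherwise produce for an out-of-range end position.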
-
Downstream analysis of results:
The results for all predicted models can be summarized in one of the following ways:
- A table containing various scores and physical parameters of protein complex interactions.
- A Jupyter notebook with interactive 3D protein models and PAE plots.