PowerFit

About PowerFit

PowerFit is a Python package and simple command-line program to automatically fit high-resolution atomic structures in cryo-EM densities. To this end, it performs a fully exhaustive 6-dimensional cross-correlation search between the atomic structure and the density. It takes as input an atomic structure in PDB format and a cryo-EM density with its resolution, and outputs positions and rotations of the atomic structure corresponding to high correlation values. PowerFit uses the local cross-correlation function as its base score. The score can optionally be enhanced by a Laplace pre-filter and/or a core-weighted version to minimize overlapping densities from neighboring subunits. The search can further be hardware-accelerated by leveraging multi-core CPU machines out of the box or by GPU via the OpenCL framework. PowerFit is Free Software and has been successfully installed and used on Linux and MacOSX machines.
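
To illustrate what a local cross-correlation score measures, here is a toy sketch in one dimension that covers translations only. It is not PowerFit's implementation, which searches three translations and three rotations; the function name and the sample data are made up for illustration.

```python
from math import sqrt


def local_cross_correlation(density, template):
    """Score every shift of `template` along `density`.

    Each score is the Pearson correlation between the template and the
    part of the density it covers at that shift, so it lies in [-1, 1].
    """
    n = len(template)
    t_mean = sum(template) / n
    t_centered = [t - t_mean for t in template]
    t_norm = sqrt(sum(v * v for v in t_centered))

    scores = []
    for shift in range(len(density) - n + 1):
        window = density[shift:shift + n]
        w_mean = sum(window) / n
        w_centered = [w - w_mean for w in window]
        w_norm = sqrt(sum(v * v for v in w_centered))
        if t_norm == 0 or w_norm == 0:
            # A flat window or template carries no shape information.
            scores.append(0.0)
            continue
        dot = sum(a * b for a, b in zip(t_centered, w_centered))
        scores.append(dot / (t_norm * w_norm))
    return scores


if __name__ == "__main__":
    density = [0, 0, 1, 3, 1, 0, 0, 0]   # toy "map" with one peak
    template = [1, 3, 1]                 # toy "structure"
    scores = local_cross_correlation(density, template)
    best = max(range(len(scores)), key=scores.__getitem__)
    print("best shift:", best, "score:", round(scores[best], 3))
```

Because the score is normalized within the region covered by the template, it depends on how well the shapes match and not on the absolute density level.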

Requirements

Minimal requirements for the CPU version:

  • Python 3.10 or greater
  • NumPy 1.8+
  • SciPy
  • GCC (or another C-compiler)
  • FFTW3
  • pyFFTW 0.10+

To offload computations to a discrete or integrated* GPU, the following is also required:

  • OpenCL 1.1+
  • pyopencl
  • pyvkfft

Recommended for installation

  • git
  • pip

* Integrated graphics on CPUs can significantly outperform the native CPU implementation in some cases. This mostly applies to Intel devices; see the section Tested platforms.

Installation

If you have already fulfilled the requirements, installation should be as easy as opening a shell and typing

# To run on CPU
pip install powerfit-em
# To run on GPU
pip install powerfit-em[opencl]
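
To confirm that the installation succeeded, you can check for the installed distribution and the command-line entry point. This is a minimal sketch using only the Python standard library; the helper name `powerfit_install_status` is made up for this example.

```python
import shutil
from importlib import metadata


def powerfit_install_status(distribution="powerfit-em", command="powerfit"):
    """Report the installed version and the location of the command, if any."""
    try:
        version = metadata.version(distribution)
    except metadata.PackageNotFoundError:
        version = None  # the distribution is not installed in this environment
    return {"version": version, "executable": shutil.which(command)}


if __name__ == "__main__":
    status = powerfit_install_status()
    print("powerfit-em version:", status["version"] or "not installed")
    print("powerfit command:", status["executable"] or "not found on PATH")
```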

If you are starting from a clean system, follow the instructions for your particular operating system as described below. They should get you up and running in no time.

Conda

If you do not have system admin rights, you likely cannot compile pyvkfft locally. However, by installing PowerFit in a conda environment, you can still run computations on the GPU. If you are on a Linux system and have Conda or Mamba available, follow these instructions:

Steps for running on GPU with Conda

For AMD or NVIDIA GPUs you can run the following command. Note that this relies on OpenCL drivers being available system wide (under /etc/OpenCL/vendors/).

conda create -n powerfit -c conda-forge python=3.12 ocl-icd ocl-icd-system pyopencl pyvkfft
conda activate powerfit
pip install powerfit-em[opencl]

On Intel integrated graphics you can use the following command. This includes the OpenCL runtime and does not rely on your system setup:

conda create -n powerfit -c conda-forge python=3.12 ocl-icd intel-compute-runtime pyopencl pyvkfft
conda activate powerfit
pip install powerfit-em[opencl]

Some older Intel processors might need to use intel-opencl-rt instead of intel-compute-runtime.

After installation, you can check that the OpenCL installation is working by running

python -c 'import pyopencl as cl;from pyvkfft.fft import rfftn; ps=cl.get_platforms();print(ps);print(ps[0].get_devices())'
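
The one-liner above only prints the devices of the first platform. A slightly longer sketch that lists every platform and device is given below. It assumes pyopencl and pyvkfft are installed as described above; the helper name `describe_platforms` is made up for this example.

```python
def describe_platforms(platforms):
    """Return one line per platform and one indented line per device."""
    lines = []
    for platform in platforms:
        lines.append(platform.name)
        for device in platform.get_devices():
            lines.append(f"  {device.name}")
    return lines


if __name__ == "__main__":
    try:
        import pyopencl as cl
        from pyvkfft.fft import rfftn  # noqa: F401 (import check only)
    except ImportError as exc:
        print("GPU dependencies are missing:", exc)
    else:
        try:
            found = cl.get_platforms()
        except cl.Error as exc:
            # Raised, for example, when no OpenCL platform is registered.
            print("No usable OpenCL platform:", exc)
        else:
            print("\n".join(describe_platforms(found)) or "No platforms found")
```

If your GPU does not appear in the output, the OpenCL driver for it is probably not visible to the environment.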

Docker

PowerFit can be run in a Docker container.

Install Docker by following the official installation instructions.

Linux

Linux systems usually already include a Python 3.10 or greater distribution. First make sure the Python header files, pip and git are available. On Debian and Ubuntu systems, open a terminal and type

sudo apt update
sudo apt install python3-dev python3-pip git build-essential

If you are working on Fedora, this should be replaced by

sudo yum install python3-devel python3-pip git development-c development-tools

Steps for running on GPU

If you want to use the GPU version of PowerFit, you need to install the drivers for your GPU.

After installing the drivers, you need to install the OpenCL development libraries. For Debian/Ubuntu, this can be done by running

sudo apt install ocl-icd-opencl-dev ocl-icd-libopencl1

For Fedora, this can be done by running

sudo dnf install opencl-headers ocl-icd-devel

Install pyvkfft, a Python wrapper for the VkFFT library, using

pip install pyvkfft

Check that the OpenCL installation is working by running

python -c 'import pyopencl as cl;from pyvkfft.fft import rfftn; ps=cl.get_platforms();print(ps);print(ps[0].get_devices())'
# Should print the name of your GPU

Your system is now prepared. Follow the general instructions above to install PowerFit.

MacOSX

First install git by following the instructions on their website, or using a package manager such as brew

brew install git

Next install pip, the Python package manager, by following the installation instructions on the website or open a terminal and type

python -m ensurepip --upgrade

To get faster score calculation, install the pyFFTW Python package in your conda environment with conda install -c conda-forge pyfftw.

Follow the general instructions above to install PowerFit.

Windows

First install git for Windows, as it comes with a handy bash shell. Go to git-scm, download git and install it. Next, install a Python distribution such as Anaconda. After installation, open up the bash shell shipped with git and follow the general instructions written above.

Usage

After installing PowerFit the command line tool powerfit should be at your disposal. The general pattern to invoke powerfit is

powerfit <map> <resolution> <pdb>

where <map> is a density map in CCP4 or MRC format, <resolution> is the resolution of the map in ångström, and <pdb> is an atomic model in PDB format. This performs a 10° rotational search using the local cross-correlation score on a single CPU core. If you run powerfit interactively in the shell, it reports its progress during the search.

Usage in Docker

The Docker images of powerfit are available in the GitHub Container Registry.

Running PowerFit in a Docker container with data located at a hypothetical /path/to/data on your machine can be done as follows

docker run --rm -ti --user $(id -u):$(id -g) \
    -v /path/to/data:/data ghcr.io/haddocking/powerfit:v3.1.0 \
    /data/<map> <resolution> /data/<pdb> \
    -d /data/<results-dir>

For <map>, <pdb>, <results-dir> use paths relative to /path/to/data.
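
If you launch the container from a script, the same command can be assembled programmatically. The sketch below only mirrors the docker run invocation shown above; the helper name `docker_powerfit_command` is made up for this example, and the default user and group lookup assumes a POSIX system.

```python
import os
import shlex

IMAGE = "ghcr.io/haddocking/powerfit:v3.1.0"


def docker_powerfit_command(data_dir, density_map, resolution, pdb,
                            results_dir, uid=None, gid=None):
    """Build the docker run argument list for a CPU run.

    `density_map`, `pdb` and `results_dir` are paths relative to `data_dir`,
    which is mounted at /data inside the container.
    """
    # Equivalent of $(id -u):$(id -g) in the shell command (POSIX only).
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "--rm", "-ti",
        "--user", f"{uid}:{gid}",
        "-v", f"{os.path.abspath(data_dir)}:/data",
        IMAGE,
        f"/data/{density_map}", str(resolution), f"/data/{pdb}",
        "-d", f"/data/{results_dir}",
    ]
```

Printing `shlex.join(docker_powerfit_command(...))` shows the command as a single shell line, and passing the list to `subprocess.run` executes it without further quoting.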

To run the tutorial example, use

# cd into powerfit-tutorial repo
docker run --rm -ti --user $(id -u):$(id -g) \
    -v $PWD:/data ghcr.io/haddocking/powerfit:v3.1.0 \
    /data/ribosome-KsgA.map 13 /data/KsgA.pdb \
    -a 20 -p 2 -l -d /data/run-KsgA-docker

To run on an NVIDIA GPU using the NVIDIA Container Toolkit, use

docker run --rm -ti \
    --runtime=nvidia --gpus all -v /etc/OpenCL:/etc/OpenCL \
    -v $PWD:/data ghcr.io/haddocking/powerfit:v3.1.0 \
    /data/ribosome-KsgA.map 13 /data/KsgA.pdb \
    -a 20 -l -d /data/run-KsgA-docker-nv --gpu

To run on Intel integrated graphics, use

docker run --rm -ti \
    --device=/dev/dri \
    -v $PWD:/data ghcr.io/haddocking/powerfit:v3.1.0 \
    /data/ribosome-KsgA.map 13 /data/KsgA.pdb \
    -a 20 -l -d /data/run-KsgA-docker-nv --gpu

To run on an AMD GPU, use

sudo docker run --rm -ti \
    --device=/dev/kfd --device=/dev/dri \
    --security-opt seccomp=unconfined \
    --group-add video --ipc=host \
    -v $PWD:/data ghcr.io/haddocking/powerfit-rocm:v3.1.0 \
    /data/ribosome-KsgA.map 13 /data/KsgA.pdb \
    -a 20 -l -d /data/run-KsgA-docker-amd --gpu

Options

First, to see all options and their descriptions type

powerfit --help

The help text should explain all options adequately. In addition, here are some examples of common operations.

To perform a search with an approximate 24° rotational sampling interval, with Laplace pre-filtering and the core-weighted scoring function, using 1 CPU

powerfit <map> <resolution> <pdb> -a 24

To use multiple CPU cores without the Laplace pre-filter and with a 5° rotational interval

powerfit <map> <resolution> <pdb> -p 4 --no-laplace -a 5

To offload computations to the GPU, disable the core-weighted scoring function and write out the top 15 solutions

powerfit <map> <resolution> <pdb> -g --no-core-weighted -n 15

Note that all options can be combined except for the -g and -p flags: calculations are performed either on the CPU or on the GPU.
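
When driving powerfit from a script, the rule that -g and -p are mutually exclusive can be enforced before the program is started. The following is a minimal sketch that uses only the flags shown in the examples above; the helper name `powerfit_command` and its keyword arguments are made up for this example and are not part of PowerFit.

```python
def powerfit_command(density_map, resolution, pdb, angle=None, nproc=None,
                     gpu=False, laplace=True, core_weighted=True, num=None):
    """Build a powerfit argument list from the options shown above."""
    if gpu and nproc is not None:
        # Calculations run either on the CPU or on the GPU, never both.
        raise ValueError("-g and -p cannot be combined: choose CPU or GPU")
    cmd = ["powerfit", str(density_map), str(resolution), str(pdb)]
    if angle is not None:
        cmd += ["-a", str(angle)]      # rotational sampling interval in degrees
    if nproc is not None:
        cmd += ["-p", str(nproc)]      # number of CPU cores
    if gpu:
        cmd.append("-g")               # offload to the GPU
    if not laplace:
        cmd.append("--no-laplace")
    if not core_weighted:
        cmd.append("--no-core-weighted")
    if num is not None:
        cmd += ["-n", str(num)]        # number of top solutions to write out
    return cmd


if __name__ == "__main__":
    # Placeholder file names; replace them with your own map and model.
    print(powerfit_command("map.mrc", 13, "model.pdb", angle=5, nproc=4,
                           laplace=False))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to start the search.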

To run on GPU

powerfit <map> <resolution> <pdb> --gpu
...
Using GPU-accelerated search.
...

Output

When the search is finished, several output files are created

  • fit_N.pdb: the top N best fits.
  • solutions.out: all the non-redundant solutions found, ordered by their correlation score. Column 1 shows the rank and column 2 the correlation score; columns 3 and 4 show the Fisher z-score and the number of standard deviations (see N. Volkmann 2009, and Van Zundert and Bonvin 2016); columns 5 to 7 are the x, y and z coordinates of the center of the chain; columns 8 to 16 are the nine rotation matrix values.
  • lcc.mrc: a cross-correlation map, showing at each grid position the highest correlation score found during the rotational search.
  • powerfit.log: a log file, including the input parameters with date and timing information.
  • report.html and state.mvsj: an HTML report and its MolViewSpec with interactive 3D visualization of the best fits. Only written if the --report --delimiter , arguments are passed.
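
For downstream analysis, solutions.out can be read with a few lines of Python. The sketch below assumes whitespace-separated columns in the order described above, that header or comment lines are non-numeric, and that the nine values after the center coordinates form a row-major 3x3 rotation matrix. These are assumptions about the file layout, and the names `Solution` and `parse_solutions` are made up for this example.

```python
from typing import NamedTuple


class Solution(NamedTuple):
    rank: int
    cc: float          # correlation score
    fisher_z: float    # Fisher z-score
    rel_z: float       # number of standard deviations
    center: tuple      # (x, y, z) of the center of the chain
    rotation: tuple    # three rows of three values (assumed row-major)


def parse_solutions(lines, delimiter=None):
    """Parse solution lines; `delimiter=None` splits on any whitespace."""
    solutions = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(delimiter)
        if len(fields) < 16:
            continue  # not a complete solution line
        try:
            values = [float(field) for field in fields[:16]]
        except ValueError:
            continue  # header or malformed line
        rotation = tuple(tuple(values[i:i + 3]) for i in (7, 10, 13))
        solutions.append(Solution(int(values[0]), values[1], values[2],
                                  values[3], tuple(values[4:7]), rotation))
    return solutions


if __name__ == "__main__":
    try:
        with open("solutions.out") as handle:
            for solution in parse_solutions(handle)[:5]:
                print(solution.rank, solution.cc, solution.center)
    except FileNotFoundError:
        print("solutions.out not found in the current directory")
```

If the file was written with a different delimiter, pass it explicitly, for example `parse_solutions(handle, delimiter=",")`.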

Licensing

If this software was useful to your research, please cite us

G.C.P. van Zundert and A.M.J.J. Bonvin. Fast and sensitive rigid-body fitting into cryo-EM density maps with PowerFit. AIMS Biophysics 2, 73-87 (2015) https://doi.org/10.3934/biophy.2015.2.73.

For the use of image-pyramids and reliability measures for fitting, please cite

G.C.P. van Zundert and A.M.J.J. Bonvin. Defining the limits and reliability of rigid-body fitting in cryo-EM maps using multi-scale image pyramids. J. Struct. Biol. 195, 252-258 (2016) https://doi.org/10.1016/j.jsb.2016.06.011.

If you used PowerFit v1, please cite software with https://doi.org/10.5281/zenodo.1037227. For version 2 or higher, please cite software with https://doi.org/10.5281/zenodo.14185749.

Apache License Version 2.0

The elements.py module is licensed under MIT License (see header). Copyright (c) 2005-2015, Christoph Gohlke

Tested platforms

Operating System   CPU single   CPU multi   GPU
Linux              Yes          Yes         Yes
MacOSX             Yes          Yes         No
Windows            Yes          Fail        No

The GPU version has been successfully tested on Linux and with a Docker container for the following devices:

  • NVIDIA GeForce GTX 1050 Ti
  • NVIDIA GeForce RTX 4070
  • AMD Radeon RX 7800 XT
  • AMD Radeon RX 7900 XTX
  • Intel Iris Xe Graphics (on a Core i7-1185G7)

The integrated graphics of AMD Ryzen CPUs do not officially support OpenCL. If they do appear as available in PyOpenCL, be aware that using them may lead to incorrect results.

Contributing

To contribute to PowerFit, see CONTRIBUTING.md.
