Error in Cluster step #29

@wwood

Hi,

I'm trying out MetaCompass, using reads with simulated variants as input. I ran this yesterday, and both the git checkout and the database are up to date. Any ideas? Thanks. (I can supply the reads if needed.)

➜  docker build -t metacompass:2.0-beta .; docker run -it --rm -v `pwd`/RefSeq_V2_db:/data -v `pwd`/kicking_tyres:/input metacompass:2.0-beta
[+] Building 0.9s (17/17) FINISHED                                                                                                                             docker:default
 => [internal] load build definition from Dockerfile                                                                                                                     0.0s
 => => transferring dockerfile: 3.15kB                                                                                                                                   0.0s
 => [internal] load metadata for docker.io/library/ubuntu:22.04                                                                                                          0.8s
 => [internal] load .dockerignore                                                                                                                                        0.0s
 => => transferring context: 2B                                                                                                                                          0.0s
 => [ 1/13] FROM docker.io/library/ubuntu:22.04@sha256:1ec65b2719518e27d4d25f104d93f9fac60dc437f81452302406825c46fcc9cb                                                  0.0s
 => CACHED [ 2/13] RUN apt-get update && apt-get install -y     wget     curl     git     build-essential     ca-certificates     && rm -rf /var/lib/apt/lists/*         0.0s
 => CACHED [ 3/13] RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh &&     bash /tmp/miniconda.sh -b -p /opt/conda &  0.0s
 => CACHED [ 4/13] RUN conda config --remove channels defaults &&     conda config --add channels conda-forge &&     conda config --add channels bioconda &&     conda   0.0s
 => CACHED [ 5/13] WORKDIR /opt                                                                                                                                          0.0s
 => CACHED [ 6/13] RUN git clone https://github.com/marbl/MetaCompass.git                                                                                                0.0s
 => CACHED [ 7/13] WORKDIR /opt/MetaCompass                                                                                                                              0.0s
 => CACHED [ 8/13] RUN conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main                                                                0.0s
 => CACHED [ 9/13] RUN conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r                                                                   0.0s
 => CACHED [10/13] RUN conda env create -f metacompass_environment.yml                                                                                                   0.0s
 => CACHED [11/13] RUN nextflow -version                                                                                                                                 0.0s
 => CACHED [12/13] RUN mkdir -p /opt/metacompass_db                                                                                                                      0.0s
 => CACHED [13/13] RUN echo '#!/bin/bash\nconda activate metacompass\nexec "$@"' > /usr/local/bin/entrypoint.sh &&     chmod +x /usr/local/bin/entrypoint.sh             0.0s
 => exporting to image                                                                                                                                                   0.0s
 => => exporting layers                                                                                                                                                  0.0s
 => => writing image sha256:ab05ac3157ebf63ec66989da1b798529208afe8b9866e35456a64c271c6a92ce                                                                             0.0s
 => => naming to docker.io/library/metacompass:2.0-beta                                                                                                                  0.0s
root@8a8796107568:/opt/MetaCompass# ls /input
sample_0_reads.1.fq.gz  sample_0_reads.2.fq.gz

root@8a8796107568:/opt/MetaCompass# nextflow run metacompass2.nf \
  --reference_db /data \
  --forward /input/sample_0_reads.1.fq.gz \
  --reverse /input/sample_0_reads.2.fq.gz \
  --output /output --threads 8

 N E X T F L O W   ~  version 25.04.6

Launching `metacompass2.nf` [serene_cajal] DSL2 - revision: f6e5b1e425

Output dir is /opt/MetaCompass/results
executor >  local (47)
[b5/08ebc8] filter_reads     [100%] 1 of 1 ✔
[45/5dec41] map_to_gene (39) [100%] 40 of 40 ✔
[b8/769d55] select_genomes   [100%] 1 of 1 ✔
[d2/67cac2] collect_refs     [100%] 1 of 1 ✔
[36/8e4e6c] SkaniTriangle    [100%] 1 of 1 ✔
[06/52a962] Cluster          [  0%] 0 of 1 ✘
[-        ] ConcatFasta      -
[a8/c035ff] IndexReads       [100%] 1 of 1 ✔
[-        ] ClusterIndex     -
[61/b76977] interleaveReads  [100%] 1 of 1 ✔
[-        ] reduceClusters   -
[-        ] refAssembly      -
[-        ] deNovoAssembly   -
[-        ] createOutputs    -

ERROR ~ Error executing process > 'Cluster'

Caused by:
  Process `Cluster` terminated with an error exit status (1)


Command executed:

  python3 "/opt/MetaCompass/scripts/cluster.py"  skani_matrix_stool.txt 5 .

Command exit status:
  1

Command output:
  (empty)

Command error:
  /opt/MetaCompass/scripts/cluster.py:27: SyntaxWarning: invalid escape sequence '\.'
    modified_items = re.match("(GCA_[0-9]+\.[0-9]+)", matched_genome_name).group(1)
  /opt/MetaCompass/scripts/cluster.py:103: SyntaxWarning: invalid escape sequence '\.'
    modified_items = [re.match("(GCA_[0-9]+\.[0-9]+)", item).group(1) for item in items]
  Traceback (most recent call last):
    File "/opt/MetaCompass/scripts/cluster.py", line 20, in <module>
      matched_genome = matching_files.read_text()
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/opt/conda/envs/metacompass/lib/python3.12/pathlib.py", line 1027, in read_text
      with self.open(mode='r', encoding=encoding, errors=errors) as f:
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/opt/conda/envs/metacompass/lib/python3.12/pathlib.py", line 1013, in open
      return io.open(self, mode, buffering, encoding, errors, newline)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  FileNotFoundError: [Errno 2] No such file or directory: 'matching_files.txt'

Work dir:
  /opt/MetaCompass/work/06/52a962b3f692671a69dba6153dc91a

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details
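
A note on reading the output above: the two SyntaxWarnings are only warnings. The actual failure is the FileNotFoundError raised at line 20 of cluster.py, where `matching_files.txt` is opened relative to the process work dir. The sketch below is not the project's code. It is a minimal illustration, assuming the regex is meant to pull accessions such as `GCA_000001405.15` from the start of a name, of how the same pattern can be written as a raw string to silence the warning, and how a missing input can be reported with the directory that was searched. The helper names `extract_accession` and `read_required_text` are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# A raw string keeps the backslash literal, so Python 3.12 no longer emits
# "SyntaxWarning: invalid escape sequence '\.'" for this pattern.
ACCESSION_RE = re.compile(r"(GCA_[0-9]+\.[0-9]+)")


def extract_accession(name: str) -> Optional[str]:
    """Return the GCA accession at the start of `name`, or None if there is none.

    Hypothetical helper: it also avoids calling .group() on a failed match.
    """
    match = ACCESSION_RE.match(name)
    return match.group(1) if match else None


def read_required_text(path: Path) -> str:
    """Read `path`, raising an error that names the directory that was searched.

    Hypothetical helper, not part of MetaCompass.
    """
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found in {Path.cwd()}; "
            "check whether the upstream step staged it into the work dir"
        )
    return path.read_text()


if __name__ == "__main__":
    print(extract_accession("GCA_000001405.15_GRCh38"))  # GCA_000001405.15
    print(extract_accession("not_an_accession"))         # None
```

Listing the work dir shown above (including the hidden `.command.sh` and `.command.err` files Nextflow writes there) should show whether `matching_files.txt` was ever staged for the Cluster process.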

The Dockerfile (largely AI-written):

# MetaCompass Dockerfile
# Based on installation instructions from https://github.com/marbl/MetaCompass
FROM ubuntu:22.04

# Set non-interactive frontend to avoid prompts during package installation
ENV DEBIAN_FRONTEND=noninteractive

# Install system dependencies
RUN apt-get update && apt-get install -y \
    wget \
    curl \
    git \
    build-essential \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Install Miniconda
ENV CONDA_DIR=/opt/conda
ENV PATH=$CONDA_DIR/bin:$PATH
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh && \
    bash /tmp/miniconda.sh -b -p $CONDA_DIR && \
    rm /tmp/miniconda.sh

# Set up conda
# RUN conda config --set always_yes yes --set changeps1 no && \
#     conda update -q conda

# Configure conda channels
RUN conda config --remove channels defaults && \
    conda config --add channels conda-forge && \
    conda config --add channels bioconda && \
    conda config --set channel_priority strict

# Clone MetaCompass repository
WORKDIR /opt
RUN git clone https://github.com/marbl/MetaCompass.git

# Set working directory to MetaCompass
WORKDIR /opt/MetaCompass

# Accept the Anaconda TOS, then create the conda environment from the provided environment file
RUN conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
RUN conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r
RUN conda env create -f metacompass_environment.yml

# Make RUN commands use the new environment
SHELL ["conda", "run", "-n", "metacompass", "/bin/bash", "-c"]

# Verify Nextflow installation
RUN nextflow -version

# Create directory for reference database
RUN mkdir -p /opt/metacompass_db

# Set environment variables
ENV CONDA_DEFAULT_ENV=metacompass
ENV CONDA_PREFIX=$CONDA_DIR/envs/metacompass
ENV PATH=$CONDA_PREFIX/bin:$PATH

# Create a script to activate the conda environment
RUN echo '#!/bin/bash\n\
conda activate metacompass\n\
exec "$@"' > /usr/local/bin/entrypoint.sh && \
    chmod +x /usr/local/bin/entrypoint.sh

# Set entrypoint to activate conda environment
# ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]

# Default command
CMD ["/bin/bash"]

# Add labels for documentation
LABEL maintainer="MetaCompass Docker Image" \
      description="Docker image for MetaCompass v2.0-beta - Reference-guided Assembly of Metagenomes" \
      version="2.0-beta" \
      source="https://github.com/marbl/MetaCompass"

# Expose any necessary ports (if needed for web interfaces)
# EXPOSE 8080

# Notes for users:
# 1. To download the pre-built reference database (16GB), run:
#    wget https://obj.umiacs.umd.edu/metacompass-db/RefSeq_V2_db.tar.gz
#    tar -xzf RefSeq_V2_db.tar.gz
#
# 2. To run MetaCompass:
#    nextflow run metacompass2.nf \
#      --reference_db /path/to/RefSeq_V2_db \
#      --forward /path/to/forward_reads.fastq \
#      --reverse /path/to/reverse_reads.fastq \
#      --output /path/to/output_directory \
#      --threads 8
#
# 3. Hardware requirements:
#    - 90GB+ disk space for normal installation
#    - 8GB+ memory (16GB recommended) for Pilon error correction
