BACTOPIA:GATHER:GATHER_MODULE (SRR2838702)' #512

Open
happymanmohit opened this issue May 3, 2024 · 18 comments
Labels
question Further information is requested

Comments

@happymanmohit

happymanmohit commented May 3, 2024

Hi Robert,

I have a new MacBook Pro with the following specifications:
  Chip: Apple M3 Max
  Memory: 36 GB
  macOS: Sonoma 14.1

I was trying to test Bactopia using Docker, but it fails with this error:

Workflow execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 127.

The full error message was:

Error executing process > 'BACTOPIA:GATHER:GATHER_MODULE (SRR2838702)'

Caused by:
Process BACTOPIA:GATHER:GATHER_MODULE (SRR2838702) terminated with an error exit status (127)

Command executed:

MERGED="multiple-read-sets-merged.txt"
mkdir -p fastqs
mkdir -p extra

if [ "paired-end" == "paired-end" ]; then
# Paired-End Reads
cp -L 001-r1 fastqs/SRR2838702_R1.fastq.gz
cp -L 001-r2 fastqs/SRR2838702_R2.fastq.gz
touch extra/empty.fna.gz
elif [ "paired-end" == "single-end" ]; then
# Single-End Reads
cp -L 001-r1 fastqs/SRR2838702.fastq.gz
touch extra/empty.fna.gz
elif [ "paired-end" == "ont" ]; then
# Nanopore reads
cp -L 001-r1 fastqs/SRR2838702.fastq.gz
touch extra/empty.fna.gz
elif [ "paired-end" == "hybrid" ] || [ "paired-end" == "short_polish" ]; then
# Paired-End Reads
cp -L 001-r1 fastqs/SRR2838702_R1.fastq.gz
cp -L 001-r2 fastqs/SRR2838702_R2.fastq.gz
cp -L EMPTY_EXTRA extra/SRR2838702.fastq.gz
elif [ "paired-end" == "merge-pe" ] || [ "paired-end" == "hybrid-merge-pe" ] || [ "paired-end" == "short_polish-merge-pe" ]; then
# Merge Paired-End Reads
echo "This sample had reads merged." > ${MERGED}
echo "R1:" >> ${MERGED}
find -name "*r1" | sort | xargs -I {} readlink {} | xargs -I {} ls -l {} | awk '{print $5" "$9}' >> ${MERGED}
find -name "*r1" | sort | xargs -I {} readlink {} | xargs -I {} cat {} > fastqs/SRR2838702_R1.fastq.gz
echo "Merged R1:" >> ${MERGED}
ls -l fastqs/SRR2838702_R1.fastq.gz | awk '{print $5" "$9}' >> ${MERGED}

  echo "R2:" >> ${MERGED}
  find -name "*r2" | sort | xargs -I {} readlink {} | xargs -I {} ls -l {} | awk '{print $5"	"$9}' >> ${MERGED}
  find -name "*r2" | sort | xargs -I {} readlink {} | xargs -I {} cat {} > fastqs/SRR2838702_R2.fastq.gz
  echo "Merged R2:" >> ${MERGED}
  ls -l fastqs/SRR2838702_R2.fastq.gz | awk '{print $5"	"$9}' >> ${MERGED}

  if [ "paired-end" == "hybrid-merge-pe" ]; then
      cp -L EMPTY_EXTRA extra/SRR2838702.fastq.gz
  else
      touch extra/empty.fna.gz
  fi

elif [ "paired-end" == "merge-se" ]; then
# Merge Single-End Reads
echo "This sample had reads merged." > ${MERGED}
echo "SE:" >> ${MERGED}
find -name "*r1" | sort | xargs -I {} readlink {} | xargs -I {} ls -l {} | awk '{print $5" "$9}' >> ${MERGED}
find -name "*r1" | sort | xargs -I {} readlink {} | xargs -I {} cat {} > fastqs/SRR2838702.fastq.gz
echo "Merged SE:" >> ${MERGED}
ls -l fastqs/SRR2838702.fastq.gz | awk '{print $5" "$9}' >> ${MERGED}

  touch extra/empty.fna.gz

elif [ "paired-end" == "sra_accession" ] || [ "paired-end" == "sra_accession_ont" ]; then
if [ "1" == "3" ]; then
echo "Unable to download SRR2838702 from both SRA and ENA 3 times. This may or may
not be a temporary connection issue. Rather than stop the whole Bactopia run,
further analysis of SRR2838702 will be discontinued." |
sed 's/^\s*//' > SRR2838702-fastq-download-error.txt
exit
else
# Download accession from ENA/SRA
fastq-dl \
    --accession SRR2838702 \
    --provider SRA \
    --cpus 1 \
    --outdir fastqs/ \
    --prefix SRR2838702 \
    --group-by-experiment
touch extra/empty.fna.gz
fi
elif [ "false" == "true" ]; then
if [ "paired-end" == "assembly_accession" ]; then
if [ "1" == "3" ]; then
touch extra/empty.fna.gz
echo "Unable to download SRR2838702 from NCBI Assembly 3 times. This may or may
not be a temporary connection issue. Rather than stop the whole Bactopia run,
further analysis of SRR2838702 will be discontinued." |
sed 's/^\s*//' > SRR2838702-assembly-download-error.txt
exit
else
# Verify Assembly accession
check-assembly-accession.py SRR2838702 > accession.txt 2> check-assembly-accession.txt

          if [ -s "accession.txt" ]; then
              # Download from NCBI assembly and simulate reads
              mkdir fasta/
              ncbi-genome-download bacteria -o ./ -F fasta -p 1 \
                                          -u "https://ftp.ncbi.nlm.nih.gov/genomes" \
                                          -s null -A accession.txt -r 50 
              find . -name "*SRR2838702*.fna.gz" | xargs -I {} mv {} fasta/
              rename 's/(GC[AF]_\d+).*/$1.fna.gz/' fasta/*
              gzip -cd fasta/SRR2838702.fna.gz > SRR2838702-art.fna
              rm check-assembly-accession.txt
          else
              mv check-assembly-accession.txt SRR2838702-assembly-accession-error.txt
              exit
          fi
      fi
  elif [ "paired-end" == "assembly" ]; then
      if [ "false" == "true" ]; then
          gzip -cd EMPTY_EXTRA > SRR2838702-art.fna
      else 
          cat EMPTY_EXTRA > SRR2838702-art.fna
      fi
  fi

  # Simulate reads from assembly, reads are 250bp without errors
  art_illumina -p -ss MSv3 -l 250 -m 400 -s 30 --fcov 150 -ir 0 -ir2 0 -dr 0 -dr2 0 -rs 42                        -na -qL 33 -qU 40 -o SRR2838702_R --id SRR2838702 -i SRR2838702-art.fna

  mv SRR2838702_R1.fq fastqs/SRR2838702_R1.fastq
  mv SRR2838702_R2.fq fastqs/SRR2838702_R2.fastq
  pigz -p 1 --fast fastqs/*.fastq
  cp SRR2838702-art.fna extra/SRR2838702.fna
  pigz -p 1 --best extra/SRR2838702.fna

fi

# Validate input FASTQs

IS_PAIRED="unknown"
if [ "false" == "false" ]; then
ERROR=0
# Check paired-end reads have same read counts
OPTS="--sample SRR2838702 --min_basepairs 2241820 --min_reads 7472 --min_proportion 0.5 --runtype paired-end"
if [ -f "fastqs/SRR2838702_R2.fastq.gz" ]; then
# Paired-end
IS_PAIRED="true"
gzip -cd fastqs/SRR2838702_R1.fastq.gz | fastq-scan > r1.json
gzip -cd fastqs/SRR2838702_R2.fastq.gz | fastq-scan > r2.json
if ! reformat.sh in1=fastqs/SRR2838702_R1.fastq.gz in2=fastqs/SRR2838702_R2.fastq.gz qin=auto out=/dev/null 2> SRR2838702-paired-end-error.txt; then
ERROR=1
echo "SRR2838702 FASTQs contains an error. Please check the input FASTQs.
Further analysis is discontinued." |
sed 's/^\s*//' >> SRR2838702-paired-end-error.txt
else
rm -f SRR2838702-paired-end-error.txt
fi

      if [[ -s r1.json ]] && [[ -s r2.json ]]; then
          if ! check-fastqs.py --fq1 r1.json --fq2 r2.json ${OPTS}; then
              ERROR=1
          fi
      else
          NOT_GZIP=0
          if ! gzip -t fastqs/SRR2838702_R1.fastq.gz; then
              NOT_GZIP=1
          elif ! gzip -t fastqs/SRR2838702_R2.fastq.gz; then
              NOT_GZIP=1
          fi

          if [ "${NOT_GZIP}" -eq "0" ]; then
              echo "SRR2838702 FASTQs are empty. Please check the input FASTQs.
                  Further analysis is discontinued." | \
              sed 's/^\s*//' > SRR2838702-empty-error.txt
              ERROR=1
          else
              echo "SRR2838702 FASTQs failed Gzip tests. Please check the input FASTQs.
                  Further analysis is discontinued." | \
              sed 's/^\s*//' > SRR2838702-gzip-error.txt
              ERROR=1
          fi
      fi
      rm r1.json r2.json
  else
      # Single-end
      IS_PAIRED="false"
      gzip -cd fastqs/SRR2838702.fastq.gz | fastq-scan > r1.json

      if [[ -s r1.json ]]; then
          if ! check-fastqs.py --fq1 r1.json ${OPTS}; then
              ERROR=1
          fi
      else
          if ! gzip -t fastqs/SRR2838702.fastq.gz; then
              echo "SRR2838702 FASTQs failed Gzip tests. Please check the input FASTQs.
                  Further analysis is discontinued." | \
              sed 's/^\s*//' > SRR2838702-gzip-error.txt
              ERROR=1
          elif ! gzip -t fastqs/SRR2838702_R2.fastq.gz; then
              echo "SRR2838702 FASTQs are empty. Please check the input FASTQs.
                  Further analysis is discontinued." | \
              sed 's/^\s*//' > SRR2838702-empty-error.txt
              ERROR=1
          fi
      fi
      rm r1.json
  fi

  # Short polish should not be considered paired-end
  if [ "paired-end" == "short_polish" ]; then
      IS_PAIRED="false"
  fi

  # Failed validations so, let's keep them from continuing
  if [ "${ERROR}" -eq "1" ]; then
      mv fastqs/ failed-tests-fastqs/
  fi

fi

# Dump meta values to a TSV

echo "sampleruntypeoriginal_runtypeis_pairedis_compressedspeciesgenome_size" | sed 's// /g' > SRR2838702-meta.tsv
echo "SRR2838702paired-endpaired-end$IS_PAIREDtruenull358242" | sed 's// /g' >> SRR2838702-meta.tsv

# Capture versions

cat <<-END_VERSIONS > versions.yml
"BACTOPIA:GATHER:GATHER_MODULE":
art: $(echo $(art_illumina --help 2>&1) | sed 's/^.*Version //;s/ .*$//')
fastq-dl: $(echo $(fastq-dl --version 2>&1) | sed 's/fastq-dl, version //')
fastq-scan: $(echo $(fastq-scan -v 2>&1) | sed 's/fastq-scan //')
ncbi-genome-download: $(echo $(ncbi-genome-download --version 2>&1))
pigz: $(echo $(pigz --version 2>&1) | sed 's/pigz //')
END_VERSIONS

Command exit status:
127

Command output:
(empty)

Command error:
.command.sh: line 127: fastq-scan: command not found

Work dir:
/Users/karinsauer/work/4f/60d85a5d84386597c91a4c8c76de9a

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
Run times
03-May-2024 11:20:38 - 03-May-2024 11:20:46 (duration: 7.9s
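
For readers following along: exit status 127 is what a POSIX shell returns when it cannot find a command, which matches the `fastq-scan: command not found` line above. A minimal sketch, not part of Bactopia, for checking whether the tools a task script calls are visible on `PATH` in a given environment. The helper name `missing_tools` is hypothetical; the tool names in the usage comment are taken from the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every named tool that is NOT found on PATH.
# Returns 0 when all tools are found, 1 when at least one is missing.
missing_tools() {
    local status=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example (tool names taken from the failing task script):
#   missing_tools fastq-scan reformat.sh check-fastqs.py pigz
```

If the task were running inside the Bactopia container, these tools would be expected to be found there, so a 127 on `fastq-scan` suggests the script ran in an environment where they are not installed.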

@happymanmohit happymanmohit added the question Further information is requested label May 3, 2024
@rpetit3
Member

rpetit3 commented May 3, 2024

Hi @happymanmohit

Can you share the command you used?

@happymanmohit
Author

Hi @rpetit3,
Bactopia -profile test, docker
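
One thing that may be worth ruling out, and it is only a guess since the thread never confirms it: the command as typed has a space after the comma. The shell splits arguments on whitespace, so `test, docker` reaches the program as two separate words rather than one profile list. The sketch below only demonstrates that splitting; `show_args` is a hypothetical helper and says nothing about how Bactopia or Nextflow treats the extra word.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the argument count, then each argument in brackets.
show_args() {
    echo "count=$#"
    local arg
    for arg in "$@"; do
        echo "[$arg]"
    done
}

# With a space after the comma, the profile list arrives as two words.
show_args -profile test, docker

# Without the space, it arrives as a single word.
show_args -profile test,docker
```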

@rpetit3
Member

rpetit3 commented May 3, 2024

try

bactopia -profile test,arm

@happymanmohit
Author

Workflow execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 127.

The full error message was:

Error executing process > 'BACTOPIA:GATHER:GATHER_MODULE (SRR2838702)'

Caused by:
Process BACTOPIA:GATHER:GATHER_MODULE (SRR2838702) terminated with an error exit status (127)

Command executed:

MERGED="multiple-read-sets-merged.txt"
mkdir -p fastqs
mkdir -p extra

if [ "paired-end" == "paired-end" ]; then
# Paired-End Reads
cp -L 001-r1 fastqs/SRR2838702_R1.fastq.gz
cp -L 001-r2 fastqs/SRR2838702_R2.fastq.gz
touch extra/empty.fna.gz
elif [ "paired-end" == "single-end" ]; then
# Single-End Reads
cp -L 001-r1 fastqs/SRR2838702.fastq.gz
touch extra/empty.fna.gz
elif [ "paired-end" == "ont" ]; then
# Nanopore reads
cp -L 001-r1 fastqs/SRR2838702.fastq.gz
touch extra/empty.fna.gz
elif [ "paired-end" == "hybrid" ] || [ "paired-end" == "short_polish" ]; then
# Paired-End Reads
cp -L 001-r1 fastqs/SRR2838702_R1.fastq.gz
cp -L 001-r2 fastqs/SRR2838702_R2.fastq.gz
cp -L EMPTY_EXTRA extra/SRR2838702.fastq.gz
elif [ "paired-end" == "merge-pe" ] || [ "paired-end" == "hybrid-merge-pe" ] || [ "paired-end" == "short_polish-merge-pe" ]; then
# Merge Paired-End Reads
echo "This sample had reads merged." > ${MERGED}
echo "R1:" >> ${MERGED}
find -name "*r1" | sort | xargs -I {} readlink {} | xargs -I {} ls -l {} | awk '{print $5" "$9}' >> ${MERGED}
find -name "*r1" | sort | xargs -I {} readlink {} | xargs -I {} cat {} > fastqs/SRR2838702_R1.fastq.gz
echo "Merged R1:" >> ${MERGED}
ls -l fastqs/SRR2838702_R1.fastq.gz | awk '{print $5" "$9}' >> ${MERGED}

  echo "R2:" >> ${MERGED}
  find -name "*r2" | sort | xargs -I {} readlink {} | xargs -I {} ls -l {} | awk '{print $5"	"$9}' >> ${MERGED}
  find -name "*r2" | sort | xargs -I {} readlink {} | xargs -I {} cat {} > fastqs/SRR2838702_R2.fastq.gz
  echo "Merged R2:" >> ${MERGED}
  ls -l fastqs/SRR2838702_R2.fastq.gz | awk '{print $5"	"$9}' >> ${MERGED}

  if [ "paired-end" == "hybrid-merge-pe" ]; then
      cp -L EMPTY_EXTRA extra/SRR2838702.fastq.gz
  else
      touch extra/empty.fna.gz
  fi

elif [ "paired-end" == "merge-se" ]; then
# Merge Single-End Reads
echo "This sample had reads merged." > ${MERGED}
echo "SE:" >> ${MERGED}
find -name "*r1" | sort | xargs -I {} readlink {} | xargs -I {} ls -l {} | awk '{print $5" "$9}' >> ${MERGED}
find -name "*r1" | sort | xargs -I {} readlink {} | xargs -I {} cat {} > fastqs/SRR2838702.fastq.gz
echo "Merged SE:" >> ${MERGED}
ls -l fastqs/SRR2838702.fastq.gz | awk '{print $5" "$9}' >> ${MERGED}

  touch extra/empty.fna.gz

elif [ "paired-end" == "sra_accession" ] || [ "paired-end" == "sra_accession_ont" ]; then
if [ "1" == "3" ]; then
echo "Unable to download SRR2838702 from both SRA and ENA 3 times. This may or may
not be a temporary connection issue. Rather than stop the whole Bactopia run,
further analysis of SRR2838702 will be discontinued." |
sed 's/^\s*//' > SRR2838702-fastq-download-error.txt
exit
else
# Download accession from ENA/SRA
fastq-dl \
    --accession SRR2838702 \
    --provider SRA \
    --cpus 1 \
    --outdir fastqs/ \
    --prefix SRR2838702 \
    --group-by-experiment
touch extra/empty.fna.gz
fi
elif [ "false" == "true" ]; then
if [ "paired-end" == "assembly_accession" ]; then
if [ "1" == "3" ]; then
touch extra/empty.fna.gz
echo "Unable to download SRR2838702 from NCBI Assembly 3 times. This may or may
not be a temporary connection issue. Rather than stop the whole Bactopia run,
further analysis of SRR2838702 will be discontinued." |
sed 's/^\s*//' > SRR2838702-assembly-download-error.txt
exit
else
# Verify Assembly accession
check-assembly-accession.py SRR2838702 > accession.txt 2> check-assembly-accession.txt

          if [ -s "accession.txt" ]; then
              # Download from NCBI assembly and simulate reads
              mkdir fasta/
              ncbi-genome-download bacteria -o ./ -F fasta -p 1 \
                                          -u "https://ftp.ncbi.nlm.nih.gov/genomes" \
                                          -s null -A accession.txt -r 50 
              find . -name "*SRR2838702*.fna.gz" | xargs -I {} mv {} fasta/
              rename 's/(GC[AF]_\d+).*/$1.fna.gz/' fasta/*
              gzip -cd fasta/SRR2838702.fna.gz > SRR2838702-art.fna
              rm check-assembly-accession.txt
          else
              mv check-assembly-accession.txt SRR2838702-assembly-accession-error.txt
              exit
          fi
      fi
  elif [ "paired-end" == "assembly" ]; then
      if [ "false" == "true" ]; then
          gzip -cd EMPTY_EXTRA > SRR2838702-art.fna
      else 
          cat EMPTY_EXTRA > SRR2838702-art.fna
      fi
  fi

  # Simulate reads from assembly, reads are 250bp without errors
  art_illumina -p -ss MSv3 -l 250 -m 400 -s 30 --fcov 150 -ir 0 -ir2 0 -dr 0 -dr2 0 -rs 42                        -na -qL 33 -qU 40 -o SRR2838702_R --id SRR2838702 -i SRR2838702-art.fna

  mv SRR2838702_R1.fq fastqs/SRR2838702_R1.fastq
  mv SRR2838702_R2.fq fastqs/SRR2838702_R2.fastq
  pigz -p 1 --fast fastqs/*.fastq
  cp SRR2838702-art.fna extra/SRR2838702.fna
  pigz -p 1 --best extra/SRR2838702.fna

fi

# Validate input FASTQs

IS_PAIRED="unknown"
if [ "false" == "false" ]; then
ERROR=0
# Check paired-end reads have same read counts
OPTS="--sample SRR2838702 --min_basepairs 2241820 --min_reads 7472 --min_proportion 0.5 --runtype paired-end"
if [ -f "fastqs/SRR2838702_R2.fastq.gz" ]; then
# Paired-end
IS_PAIRED="true"
gzip -cd fastqs/SRR2838702_R1.fastq.gz | fastq-scan > r1.json
gzip -cd fastqs/SRR2838702_R2.fastq.gz | fastq-scan > r2.json
if ! reformat.sh in1=fastqs/SRR2838702_R1.fastq.gz in2=fastqs/SRR2838702_R2.fastq.gz qin=auto out=/dev/null 2> SRR2838702-paired-end-error.txt; then
ERROR=1
echo "SRR2838702 FASTQs contains an error. Please check the input FASTQs.
Further analysis is discontinued." |
sed 's/^\s*//' >> SRR2838702-paired-end-error.txt
else
rm -f SRR2838702-paired-end-error.txt
fi

      if [[ -s r1.json ]] && [[ -s r2.json ]]; then
          if ! check-fastqs.py --fq1 r1.json --fq2 r2.json ${OPTS}; then
              ERROR=1
          fi
      else
          NOT_GZIP=0
          if ! gzip -t fastqs/SRR2838702_R1.fastq.gz; then
              NOT_GZIP=1
          elif ! gzip -t fastqs/SRR2838702_R2.fastq.gz; then
              NOT_GZIP=1
          fi

          if [ "${NOT_GZIP}" -eq "0" ]; then
              echo "SRR2838702 FASTQs are empty. Please check the input FASTQs.
                  Further analysis is discontinued." | \
              sed 's/^\s*//' > SRR2838702-empty-error.txt
              ERROR=1
          else
              echo "SRR2838702 FASTQs failed Gzip tests. Please check the input FASTQs.
                  Further analysis is discontinued." | \
              sed 's/^\s*//' > SRR2838702-gzip-error.txt
              ERROR=1
          fi
      fi
      rm r1.json r2.json
  else
      # Single-end
      IS_PAIRED="false"
      gzip -cd fastqs/SRR2838702.fastq.gz | fastq-scan > r1.json

      if [[ -s r1.json ]]; then
          if ! check-fastqs.py --fq1 r1.json ${OPTS}; then
              ERROR=1
          fi
      else
          if ! gzip -t fastqs/SRR2838702.fastq.gz; then
              echo "SRR2838702 FASTQs failed Gzip tests. Please check the input FASTQs.
                  Further analysis is discontinued." | \
              sed 's/^\s*//' > SRR2838702-gzip-error.txt
              ERROR=1
          elif ! gzip -t fastqs/SRR2838702_R2.fastq.gz; then
              echo "SRR2838702 FASTQs are empty. Please check the input FASTQs.
                  Further analysis is discontinued." | \
              sed 's/^\s*//' > SRR2838702-empty-error.txt
              ERROR=1
          fi
      fi
      rm r1.json
  fi

  # Short polish should not be considered paired-end
  if [ "paired-end" == "short_polish" ]; then
      IS_PAIRED="false"
  fi

  # Failed validations so, let's keep them from continuing
  if [ "${ERROR}" -eq "1" ]; then
      mv fastqs/ failed-tests-fastqs/
  fi

fi

# Dump meta values to a TSV

echo "sampleruntypeoriginal_runtypeis_pairedis_compressedspeciesgenome_size" | sed 's// /g' > SRR2838702-meta.tsv
echo "SRR2838702paired-endpaired-end$IS_PAIREDtruenull358242" | sed 's// /g' >> SRR2838702-meta.tsv

# Capture versions

cat <<-END_VERSIONS > versions.yml
"BACTOPIA:GATHER:GATHER_MODULE":
art: $(echo $(art_illumina --help 2>&1) | sed 's/^.*Version //;s/ .*$//')
fastq-dl: $(echo $(fastq-dl --version 2>&1) | sed 's/fastq-dl, version //')
fastq-scan: $(echo $(fastq-scan -v 2>&1) | sed 's/fastq-scan //')
ncbi-genome-download: $(echo $(ncbi-genome-download --version 2>&1))
pigz: $(echo $(pigz --version 2>&1) | sed 's/pigz //')
END_VERSIONS

Command exit status:
127

Command output:
(empty)

Command error:
.command.sh: line 127: fastq-scan: command not found

Work dir:
/Users/karinsauer/work/86/24244425c0755b1ea2926fa40e8e79

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command

@rpetit3
Member

rpetit3 commented May 3, 2024

Just to verify, is Docker installed and running?

Also could you share the .nextflow.log file?

@happymanmohit
Author

Yes, Docker is installed and it's turned on. I tried searching for the .nextflow.log file, but it is not in the Bactopia run directories. Where else should I look for it?

@rpetit3
Member

rpetit3 commented May 3, 2024

It will be in the directory you launched bactopia from.

Did you install bactopia through conda?
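
Nextflow writes `.nextflow.log` as a hidden file in the launch directory, so a plain `ls` will not show it. A small sketch for listing it along with any rotated copies (`.nextflow.log.1` and so on); the function name `find_nextflow_logs` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list .nextflow.log and its rotated copies that sit
# directly inside the given directory (default: the current directory).
find_nextflow_logs() {
    local launch_dir="${1:-.}"
    find "$launch_dir" -maxdepth 1 -type f -name '.nextflow.log*' | sort
}

# Example: run from the directory where `bactopia` was launched.
#   find_nextflow_logs .
```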

@happymanmohit
Author

Yes, I installed it through Conda.

@rpetit3
Member

rpetit3 commented May 3, 2024

On the command line, can you try:

docker run hello-world

For some reason Nextflow is not using Docker.

@happymanmohit
Author

Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
478afc919002: Pull complete
Digest: sha256:a26bff933ddc26d5cdf7faa98b4ae1e3ec20c4985e6f87ac0973052224d24302
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:

  1. The Docker client contacted the Docker daemon.
  2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (arm64v8)
  3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
  4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/

For more examples and ideas, visit:
https://docs.docker.com/get-started/

@rpetit3
Member

rpetit3 commented May 6, 2024

Good! Now we have to figure out why Nextflow is not seeing docker.

Can you try running like so:

bactopia -profile test,arm

Then we'll want to dig up a .command.run file to see if it's using Docker there.
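
Each Nextflow task directory under the work dir contains a hidden `.command.run` wrapper script. A rough sketch of the check described above, assuming that a task launched through Docker has a `docker run` invocation in that wrapper. The function name `task_uses_docker` is hypothetical, and the path in the usage comment is the work dir from the earlier error report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a task's .command.run mentions `docker run`.
# Prints "docker" (exit 0) or "no docker" (exit 1); exit 2 if the file is missing.
task_uses_docker() {
    local run_file="$1/.command.run"
    if [ ! -f "$run_file" ]; then
        echo "no .command.run in $1" >&2
        return 2
    fi
    if grep -q 'docker run' "$run_file"; then
        echo "docker"
    else
        echo "no docker"
        return 1
    fi
}

# Example (work dir taken from the error report above):
#   task_uses_docker /Users/karinsauer/work/86/24244425c0755b1ea2926fa40e8e79
```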

@happymanmohit
Author

Hi @rpetit3, I ran the command and it ran successfully.

(bactopia) karinsauer@Karins-MacBook-Pro ~ % bactopia -profile test,arm
2024-05-06 15:43:12 INFO 2024-05-06 15:43:12:root:INFO - Checking if environment pre-builds are needed (this may take a while if building for the first time) download.py:544
N E X T F L O W ~ version 23.10.1
Launching /Users/karinsauer/miniforge3/envs/bactopia/share/bactopia-3.0.1/main.nf [sick_mayer] DSL2 - revision: 0cd9f79ba7



bactopia v3.0.1
Bactopia is a flexible pipeline for complete analysis of bacterial genomes

Core Nextflow options
runName : sick_mayer
containerEngine : docker
container : quay.io/bactopia/bactopia:3.0.1
launchDir : /Users/karinsauer
workDir : /Users/karinsauer/work
projectDir : /Users/karinsauer/miniforge3/envs/bactopia/share/bactopia-3.0.1
userName : karinsauer
profile : test,arm
configFiles : /Users/karinsauer/miniforge3/envs/bactopia/share/bactopia-3.0.1/nextflow.config

Required Parameters
r1 : https://github.com/bactopia/bactopia-tests/raw/main/data/species/portiera/illumina/SRR2838702_R1.fastq.gz
r2 : https://github.com/bactopia/bactopia-tests/raw/main/data/species/portiera/illumina/SRR2838702_R2.fastq.gz
sample : SRR2838702

Dataset Parameters
genome_size : 358242

QC Parameters
adapters : /Users/karinsauer/miniforge3/envs/bactopia/share/bactopia-3.0.1/data/EMPTY_ADAPTERS
phix : /Users/karinsauer/miniforge3/envs/bactopia/share/bactopia-3.0.1/data/EMPTY_PHIX

Prokka Parameters
proteins : /Users/karinsauer/miniforge3/envs/bactopia/share/bactopia-3.0.1/data/proteins.faa

Max Job Request Parameters
max_cpus : 2
max_memory : 6

Nextflow Profile Parameters
condadir : /Users/karinsauer/.bactopia/conda
datasets_cache : /Users/karinsauer/.bactopia/datasets
singularity_cache_dir: /Users/karinsauer/.bactopia/singularity

!! Only displaying parameters that differ from the pipeline defaults !!

If you use bactopia for your analysis please cite:


executor > local (13)
[skipped ] process > BACTOPIA:DATASETS [100%] 1 of 1, stored: 1 ✔
[14/5cfaac] process > BACTOPIA:GATHER:GATHER_MODULE (SRR2838702) [100%] 1 of 1 ✔
[2c/366585] process > BACTOPIA:GATHER:CSVTK_CONCAT (meta) [100%] 1 of 1 ✔
[74/4b7113] process > BACTOPIA:QC:QC_MODULE (SRR2838702) [100%] 1 of 1 ✔
[a8/9b67df] process > BACTOPIA:ASSEMBLER:ASSEMBLER_MODULE (SRR2838702) [100%] 1 of 1 ✔
[6e/d970d1] process > BACTOPIA:ASSEMBLER:CSVTK_CONCAT (assembly-scan) [100%] 1 of 1 ✔
[5d/f290b4] process > BACTOPIA:SKETCHER:SKETCHER_MODULE (SRR2838702) [100%] 1 of 1 ✔
[9e/58e450] process > BACTOPIA:ANNOTATOR:PROKKA_MODULE (SRR2838702) [100%] 1 of 1 ✔
[ad/439dc9] process > BACTOPIA:AMRFINDERPLUS:AMRFINDERPLUS_RUN (SRR2838702) [100%] 1 of 1 ✔
[09/f449e2] process > BACTOPIA:AMRFINDERPLUS:GENES_CONCAT (amrfinderplus-genes) [100%] 1 of 1 ✔
[f3/c19c26] process > BACTOPIA:AMRFINDERPLUS:PROTEINS_CONCAT (amrfinderplus-proteins) [100%] 1 of 1 ✔
[38/7c9ebf] process > BACTOPIA:MLST:MLST_MODULE (SRR2838702) [100%] 1 of 1 ✔
[65/ddeba3] process > BACTOPIA:MLST:CSVTK_CONCAT (mlst) [100%] 1 of 1 ✔
[d6/4a6b2b] process > BACTOPIA:DUMPSOFTWAREVERSIONS (1) [100%] 1 of 1 ✔

Bactopia Execution Summary
---------------------------
Bactopia Version : 3.0.1
Nextflow Version : 23.10.1
Command Line     : nextflow run /Users/karinsauer/miniforge3/envs/bactopia//share/bactopia-3.0.1/main.nf -w /Users/karinsauer/work/ -profile test,arm
Resumed          : false
Completed At     : 2024-05-06T15:47:00.403616-04:00
Duration         : 3m 39s
Success          : true
Exit Code        : 0
Error Report     : -
Launch Dir       : /Users/karinsauer

WARN: Graphviz is required to render the execution DAG in the given format -- See http://www.graphviz.org for more info.
Completed at: 06-May-2024 15:47:01
Duration : 3m 39s
CPU hours : 0.1
Succeeded : 13

@happymanmohit
Author

Hi @rpetit3,
can you please tell me why it is not running quality control and the other modules?

(bactopia) karinsauer@Karins-MacBook-Pro ~ % bactopia --se fastqs/raw_reads.fastq.gz --sample raw_reasds --coverage 100 --genome_size 630000 --outdir OUTDIR --max_cpus 2 -profile docker
2024-05-06 15:52:59 INFO 2024-05-06 15:52:59:root:INFO - Checking if environment pre-builds are needed (this may take a while if building for the first time) download.py:544
N E X T F L O W ~ version 23.10.1
Launching /Users/karinsauer/miniforge3/envs/bactopia/share/bactopia-3.0.1/main.nf [golden_turing] DSL2 - revision: 0cd9f79ba7



bactopia v3.0.1
Bactopia is a flexible pipeline for complete analysis of bacterial genomes

Core Nextflow options
runName : golden_turing
containerEngine : docker
container : quay.io/bactopia/bactopia:3.0.1
launchDir : /Users/karinsauer
workDir : /Users/karinsauer/work
projectDir : /Users/karinsauer/miniforge3/envs/bactopia/share/bactopia-3.0.1
userName : karinsauer
profile : docker
configFiles : /Users/karinsauer/miniforge3/envs/bactopia/share/bactopia-3.0.1/nextflow.config

Required Parameters
se : fastqs/raw_reads.fastq.gz
sample : raw_reasds

Dataset Parameters
genome_size : 630000

QC Parameters
adapters : /Users/karinsauer/miniforge3/envs/bactopia/share/bactopia-3.0.1/data/EMPTY_ADAPTERS
phix : /Users/karinsauer/miniforge3/envs/bactopia/share/bactopia-3.0.1/data/EMPTY_PHIX

Prokka Parameters
proteins : /Users/karinsauer/miniforge3/envs/bactopia/share/bactopia-3.0.1/data/proteins.faa

Optional Parameters
outdir : OUTDIR

Max Job Request Parameters
max_cpus : 2

Nextflow Profile Parameters
condadir : /Users/karinsauer/.bactopia/conda
datasets_cache : /Users/karinsauer/.bactopia/datasets
singularity_cache_dir: /Users/karinsauer/.bactopia/singularity

!! Only displaying parameters that differ from the pipeline defaults !!

If you use bactopia for your analysis please cite:


Each task will use 2 CPUs out of the available 14 CPUs. At most
7 task(s) will be run at a time, this can affect the efficiency
of Bactopia. You can use the '-qs' parameter to alter the number of
tasks to run at a time (e.g. '-qs 2', means only 2 tasks or a maximum
of 4 CPUs will be used at once)

executor > local (3)
[skipped ] process > BACTOPIA:DATASETS [100%] 1 of 1, stored: 1 ✔
[1f/7cd3df] process > BACTOPIA:GATHER:GATHER_MODULE (raw_reasds) [100%] 1 of 1 ✔
[08/c7822f] process > BACTOPIA:GATHER:CSVTK_CONCAT (meta) [100%] 1 of 1 ✔
[- ] process > BACTOPIA:QC:QC_MODULE -
[- ] process > BACTOPIA:ASSEMBLER:ASSEMBLER_MODULE -
[- ] process > BACTOPIA:ASSEMBLER:CSVTK_CONCAT -
[- ] process > BACTOPIA:SKETCHER:SKETCHER_MODULE -
[- ] process > BACTOPIA:ANNOTATOR:PROKKA_MODULE -
[- ] process > BACTOPIA:AMRFINDERPLUS:AMRFINDERPLUS_RUN -
[- ] process > BACTOPIA:AMRFINDERPLUS:GENES_CONCAT -
[- ] process > BACTOPIA:AMRFINDERPLUS:PROTEINS_CONCAT -
[- ] process > BACTOPIA:MLST:MLST_MODULE -
[- ] process > BACTOPIA:MLST:CSVTK_CONCAT -
[6f/1d93b3] process > BACTOPIA:DUMPSOFTWAREVERSIONS (1) [100%] 1 of 1 ✔
[skipping] Stored process > BACTOPIA:DATASETS

Bactopia Execution Summary
---------------------------
Bactopia Version : 3.0.1
Nextflow Version : 23.10.1
Command Line     : nextflow run /Users/karinsauer/miniforge3/envs/bactopia//share/bactopia-3.0.1/main.nf -w /Users/karinsauer/work/ --se fastqs/raw_reads.fastq.gz --sample raw_reasds --coverage 100 --genome_size 630000 --outdir OUTDIR --max_cpus 2 -profile docker
Resumed          : false
Completed At     : 2024-05-06T15:53:13.000736-04:00
Duration         : 6s
Success          : true
Exit Code        : 0
Error Report     : -
Launch Dir       : /Users/karinsauer

WARN: Graphviz is required to render the execution DAG in the given format -- See http://www.graphviz.org for more info.

@rpetit3
Member

rpetit3 commented May 6, 2024

Awesome! Nice to see it working.

The QC step ran. What other modules were you expecting?

@happymanmohit
Author

Why is there no 100% next to the modules after gather? Also, I don't see the expected results in the output directory.

@rpetit3
Member

rpetit3 commented May 7, 2024

In the bactopia/raw_reads/ folder look for a file with error.txt in the name.
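
The gather script shown earlier writes files such as `SRR2838702-gzip-error.txt` or `SRR2838702-empty-error.txt` when input validation fails. A sketch for listing and printing any such files under an output directory; `show_error_files` is a hypothetical name, and `OUTDIR` in the usage comment is the `--outdir` value from the run above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: find files whose names end in "error.txt" under the
# given directory (default: current directory) and print each with a header.
show_error_files() {
    local outdir="${1:-.}"
    find "$outdir" -type f -name '*error.txt' | sort | while IFS= read -r f; do
        echo "== $f"
        cat "$f"
    done
}

# Example (OUTDIR is the --outdir value used in the run above):
#   show_error_files OUTDIR
```

No output means no matching files were found under that directory.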

@happymanmohit
Author

I tried it again and this time it ran successfully. I have another question about further analysis using Bactopia Tools. My aim is to compare clinical strains with laboratory strains to find regions that are either repeated multiple times or absent in the clinical strains.

@rpetit3
Member

rpetit3 commented May 9, 2024

Would presence/absence of genes be able to answer your question? Or are these regions mostly intergenic?

If the presence or absence will work, you can consider running the pangenome Bactopia tool.
