How could I use PHACTS to predict the lifecycle of thousands of phages? #1

Open
loukesio opened this issue Sep 6, 2021 · 2 comments

loukesio commented Sep 6, 2021

Thank you for the nice tool!
I would like to use PHACTS to predict the life cycle of thousands of phages. Do you have any idea how I could do it?
Could I use it on our local cluster?

Thank you for your time :)

@deprekate
Owner

I haven't gotten around to converting the old Perl code into nice, user-friendly Python, but the old tarball at:
https://edwards.sdsu.edu/PHACTS/PHACTS-0.3.tar.gz
should be self-contained. All you would need to do is download and compile FASTA36 (https://fasta.bioch.virginia.edu/fasta_www2/fasta_down.shtml)

and then edit line 37 of phacts.pl so that it points to your fasta36 install:

#------------------------------------------------------------------------------
# This is the path to your FASTA35 install 
my $fasta_path = "/home3/katelyn/opt/PHACTS/fasta-36.3.8e/bin/fasta36";
#------------------------------------------------------------------------------
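If you would rather not edit the file by hand, the same change can be scripted. This is only a sketch: the function name `set_fasta_path` and the example path are made up for illustration, and it assumes the assignment looks like the `my $fasta_path = "...";` line shown above.

```python
import re

def set_fasta_path(perl_source: str, new_path: str) -> str:
    """Return perl_source with the `my $fasta_path = "...";` line pointing at new_path.

    Hypothetical helper, not part of PHACTS.
    """
    pattern = re.compile(r'^(\s*my\s+\$fasta_path\s*=\s*")[^"]*(";)', re.MULTILINE)
    # A function replacement avoids backslash-escape surprises in new_path.
    patched, count = pattern.subn(
        lambda m: m.group(1) + new_path + m.group(2), perl_source
    )
    if count != 1:
        raise ValueError(f"expected exactly one fasta_path assignment, found {count}")
    return patched

# Example use (the path is an assumption, substitute your own install):
#   from pathlib import Path
#   script = Path("phacts.pl")
#   script.write_text(set_fasta_path(script.read_text(), "/opt/fasta36/bin/fasta36"))
```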

@deprekate
Owner

As far as running phacts on hundreds of phages goes, I normally run my big jobs on clusters that have hundreds of cores.

However, you should be able to do the same on a regular computer. The easiest way would be to use the Linux command xargs. The syntax to run phacts on the two test genomes with one command would be:

$ pip install phacts
$ ls tests/ | xargs -I{} phacts.py tests/{} -o {}.txt

*Be careful with the above command: phacts does not (yet) check whether the file given to -o already exists, so it will overwrite any existing file with the same name.
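One way to guard against overwriting is to decide up front which genomes still need a run and skip any whose output file is already there. A minimal sketch, assuming outputs are named `<genome file name>.txt` as in the xargs command; `plan_jobs` and the `out_dir` argument are hypothetical, not part of phacts:

```python
from pathlib import Path

def plan_jobs(genome_dir, out_dir):
    """Return (genome, output) path pairs whose output file does not exist yet."""
    jobs = []
    for genome in sorted(Path(genome_dir).iterdir()):
        if not genome.is_file():
            continue  # ignore subdirectories
        output = Path(out_dir) / (genome.name + ".txt")
        if output.exists():
            continue  # already processed, do not overwrite
        jobs.append((genome, output))
    return jobs
```

Rerunning after an interrupted batch then only picks up the genomes that are still missing results.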

The old version of phacts had a threading implementation, but the new one does not (yet). So when you run a single phacts job, it runs its 10 replicates serially instead of in parallel, which is why bumping -r up to 50 makes it take 5 times as long.
The above xargs command runs each job serially, one after the other. It takes about 3 minutes to run the two test genomes through phacts on my laptop, so you could potentially get a few hundred genomes through with that command in about a day.
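For a rough feel of the serial run time, the figures above can be extrapolated linearly. This is only a back-of-the-envelope sketch: it assumes every genome costs about as much as the test genomes (1.5 minutes each at the default 10 replicates) and that time scales linearly with -r; real genomes will vary.

```python
MINUTES_PER_GENOME = 1.5   # 3 minutes for the 2 test genomes on a laptop
DEFAULT_REPLICATES = 10    # replicates per job unless -r is changed

def estimated_hours(n_genomes, replicates=DEFAULT_REPLICATES):
    """Estimate wall-clock hours for running n_genomes one after the other."""
    per_genome = MINUTES_PER_GENOME * replicates / DEFAULT_REPLICATES
    return n_genomes * per_genome / 60

print(estimated_hours(300))                 # 7.5
print(estimated_hours(1000))                # 25.0
print(estimated_hours(300, replicates=50))  # 37.5
```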

If you have thousands of genomes (or want to bump up the -r replicates), you can use the parallel command instead of xargs. It uses multiple cores, so if you have 8 cores, 8 jobs will run at once; when a job finishes, a new job is sent to that core. This will let you get thousands of genomes through in a reasonable amount of time.
The syntax for parallel is very similar to xargs:

$ ls tests/ | parallel -I{} phacts.py tests/{} -o {}.txt
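If parallel is not installed, the same scheduling (a fixed number of jobs at once, with a new one started whenever a slot frees up) can be sketched in Python. This is an illustration rather than anything shipped with phacts: `build_command`, `run_all`, `workers`, and `runner` are hypothetical names, and the only thing taken from above is the `phacts.py <genome> -o <output>` invocation.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def build_command(genome, output):
    # Mirrors the invocation above: phacts.py <genome> -o <output>
    return ["phacts.py", str(genome), "-o", str(output)]

def run_all(genome_dir, out_dir, workers=8, runner=subprocess.run):
    """Run one phacts job per file in genome_dir, at most `workers` at a time."""
    genomes = sorted(p for p in Path(genome_dir).iterdir() if p.is_file())
    commands = [
        build_command(g, Path(out_dir) / (g.name + ".txt")) for g in genomes
    ]
    # Threads are enough here: each worker just waits on an external process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(runner, commands))
    return commands, results

# Example: run_all("tests", ".", workers=8)
```

The `runner` argument exists so the scheduling can be tried out without phacts installed, for example by passing a function that only prints the command.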
