This repository contains the code for the Nominate Diversify Narrow Filter (NDNF) pipeline for selecting gRNAs that target HIV. The pipeline selects gRNAs for a given Cas molecule based on its PAM and protospacer length.
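To illustrate what selecting candidates by PAM and protospacer length means, here is a minimal sketch. It is not the repository's implementation: the function name nominate_protospacers and its parameters are hypothetical, and it assumes a PAM located 3' of the protospacer (as with SpCas9 and its NGG PAM) and scans the forward strand only.

```python
import re

# A subset of IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def nominate_protospacers(sequence, pam="NGG", protospacer_length=20):
    """Hypothetical helper: list (start, protospacer, pam) hits on the forward strand.

    Assumes the PAM sits immediately 3' of the protospacer.
    """
    sequence = sequence.upper()
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead keeps overlapping candidate sites from hiding each other.
    pattern = re.compile(
        r"(?=([ACGT]{%d})(%s))" % (protospacer_length, pam_regex)
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    # Toy sequence with a shortened protospacer length for readability.
    for hit in nominate_protospacers("AACCTGGTT", pam="NGG", protospacer_length=4):
        print(hit)
```

A Cas molecule with a 5' PAM or a different protospacer length would need the pattern rearranged accordingly; the sketch only shows the general idea.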
The code in the repository is organized as follows:
code/ & tests/ - Python library code and associated tests.
data/ - Directory of data files for pipeline runs.
notebooks/ - Example notebooks describing the NDNF pipeline and mutational data.
results/ - Experimental results associated with the analysis performed for the Frontiers 2023 figures.
scripts/ - Utility CLI scripts that perform the nominate and filter steps.
Code requirements can be installed from the provided requirements.conda file.
Once DVC has been installed, running make prepare will download the remaining genomes, and running make test will run all unit tests to confirm that your installation is correct.
Generated using the notebooks in results/pipeline_runs/. These can also be used as templates for one's own exploration.
Generated using the notebooks in results/mismatch_effect/, with mismatch_maker.ipynb responsible for generating the simulation data and mismatch_figure.ipynb for generating the visualization.
Generated using the notebooks in results/mutation_exploration/, with variant_maker.ipynb responsible for generating the simulation data and variant_figure.ipynb for generating the visualization.
Generated using the notebooks in results/broadVsafe/, with additional_cas.ipynb responsible for generating the simulation data and broad_vs_safe.ipynb for generating the visualization.
Generated using the notebooks in results/stability/, with pipeline_stability_exp.ipynb responsible for generating the simulation data and pipeline_stability_fig.ipynb for generating the visualization.
Generated using the notebook notebooks/train_test_split.ipynb.
Generated using the notebook notebooks/mutation_scoring.ipynb.
To replicate the Frontiers figures, run the notebooks in the following order so that the intermediate files are created properly.
notebooks/train_test_split.ipynb - Generates the random samplings used by the other notebooks.
notebooks/mutation_scoring.ipynb - Processes the RC index file used by the other notebooks.
After these two have been run, any of the notebooks in the results/ folder will work.
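The run order above can also be scripted. The following is a sketch rather than part of the repository: it assumes Jupyter with nbconvert is available in the environment (not confirmed by requirements.conda), and the helper names build_commands and run_all are hypothetical. It always places the two prerequisite notebooks first and then executes whichever results/ notebooks are requested.

```python
import subprocess

# The two notebooks that must run first, in this order (taken from the list above).
PREREQUISITES = [
    "notebooks/train_test_split.ipynb",
    "notebooks/mutation_scoring.ipynb",
]


def build_commands(result_notebooks):
    """Hypothetical helper: return nbconvert commands with prerequisites first."""
    ordered = PREREQUISITES + [
        nb for nb in result_notebooks if nb not in PREREQUISITES
    ]
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", nb]
        for nb in ordered
    ]


def run_all(result_notebooks):
    """Execute each notebook in place, stopping at the first failure."""
    for command in build_commands(result_notebooks):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them; call run_all(...) to execute.
    for command in build_commands(["results/stability/pipeline_stability_exp.ipynb"]):
        print(" ".join(command))
```

Because check=True raises on a failed notebook, a later notebook is never run against missing intermediate files.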