Releases: Alan-Collins/CRISPR_comparison_toolkit
v1.0.3
v1.0.2
v1.0.2 CCTK Blast bugfixes and performance improvement
- Check whether repeat BLAST hits overlap. If two overlapping BLAST hits are found, the first is kept and the second is skipped. Handles cases where repeats go hard on the P in CRISPR.
- Add user control of the minimum spacing between repeats (i.e., spacer length) with `-l`, `--min-sp-len`. Default value = 25. Consecutive repeat BLAST hits cause problems when retrieving spacer sequences from the blastdb, because `blastdbcmd` returns the whole contig if the start index of the requested sequence is not smaller than its stop index.
- Add user control of the minimum array length (i.e., number of spacers in the array) with `-n`, `--min-array-len`. Default value = 2. This default was previously hard-coded; the change simply gives users control over the setting.
- Don't automatically build the array network if more than 2000 arrays are identified (also impacts `cctk minced`). The user is now instructed to run `cctk network` to build the network if desired. This improves the run time of the CRISPR identification tools.
- Various code efficiency changes to improve running speed.
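
The overlap check and the minimum spacer length can be illustrated with a minimal sketch. This is not the CCTK implementation: the function names, the 0-based end-exclusive coordinates, and the choice to simply discard gaps shorter than the minimum are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) of a repeat BLAST hit,
# 0-based and end-exclusive.
Hit = Tuple[int, int]


def drop_overlapping_hits(hits: List[Hit]) -> List[Hit]:
    """Keep the first hit of any overlapping pair and skip the later one."""
    kept: List[Hit] = []
    for start, end in sorted(hits):
        if kept and start < kept[-1][1]:
            continue  # overlaps the previously kept hit, so skip it
        kept.append((start, end))
    return kept


def spacer_coords(hits: List[Hit], min_sp_len: int = 25) -> List[Hit]:
    """Return gaps between consecutive repeats that are at least min_sp_len long.

    Assumption: shorter gaps (including zero-length gaps between adjacent
    hits) are discarded, so no request with start >= stop is ever made.
    """
    spacers: List[Hit] = []
    for (_, prev_end), (next_start, _) in zip(hits, hits[1:]):
        if next_start - prev_end >= min_sp_len:
            spacers.append((prev_end, next_start))
    return spacers


hits = [(0, 30), (20, 50), (60, 90), (90, 120), (150, 180)]
repeats = drop_overlapping_hits(hits)
spacers = spacer_coords(repeats)
```

In this example the hit at (20, 50) is skipped because it overlaps (0, 30), and the zero-length gap between (60, 90) and (90, 120) is never turned into a spacer request.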
v1.0.1
v1.0.1 Bugfix for spacerblast
Fixes spacerblast bug causing failure which occurred when trying to extend spacer matches or retrieve PAM sequences beyond the ends of the target sequence.
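
A common way to avoid this kind of failure is to clamp the extended coordinates to the bounds of the target sequence. The sketch below shows that idea only; the function name and parameters are hypothetical and it is not taken from the spacerblast source.

```python
from typing import Tuple


def flanking_window(start: int, end: int, target_len: int,
                    upstream: int = 0, downstream: int = 0) -> Tuple[int, int]:
    """Extend the 0-based, end-exclusive interval [start, end) by the
    requested flanks, clamped so it never leaves the target sequence."""
    ext_start = max(0, start - upstream)
    ext_end = min(target_len, end + downstream)
    return ext_start, ext_end


target = "ACGTACGTAC"
# Ask for 5 bases on each side of a match at [2, 8); only 2 are available.
s, e = flanking_window(2, 8, len(target), upstream=5, downstream=5)
extended = target[s:e]
```

With clamping, a match near the end of a contig yields a shorter flank rather than an error.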
v1.0
Changelog - Version 1.0
CCTK Quickrun - New
- Added `quickrun` command to run a basic pipeline of `cctk minced`, `cctk crisprdiff`, and `cctk crisprtree` on a set of assemblies. All clusters containing between 3 and 15 (controlled with `-m`) arrays are plotted to give a quick sense of the dataset.
CCTK Minced
- The default repeat database now contains representatives of CRISPR subtypes I-A, I-B, I-C, I-D, I-E, I-F, I-G, II-A, II-B, II-C, III-A, and III-B.
CCTK Minced and CCTK Blast
- Appending to an existing dataset now only requires the presence of a CRISPR_spacer.fna file containing your CRISPR spacers in FASTA format. This makes it easier to convert spacers found using another tool into CCTK formats.
- Repeats are now also output by CCTK Minced and Blast. Repeat information is written to a new file called Array_repeats.txt, and repeat information for each assembly is written as an additional column in the CRISPR_summary_table.(txt|csv) files.
  The repeats associated with each array are recorded and assigned an ID composed of the array ID and a letter. The letter is used to distinguish repeat variants associated with the same array of spacers.
- The number of mismatches between each array's consensus repeat and the closest match in the built-in or user-provided repeat database is now written as an additional column in the CRISPR_summary_table.(txt|csv) files.
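
As a rough illustration of the repeat IDs and mismatch counts described above, the sketch below labels repeat variants with the array ID plus a letter and finds the closest database entry by counting mismatched positions. The exact ID format, the distance measure, and all names and sequences here are assumptions for illustration, not CCTK's actual behaviour.

```python
from string import ascii_uppercase
from typing import Dict, List, Tuple


def assign_repeat_ids(array_id: str, variants: List[str]) -> Dict[str, str]:
    """Label each repeat variant of an array as <array ID><letter>.

    Assumption: letters are appended directly; limited to 26 variants here.
    """
    return {f"{array_id}{letter}": seq
            for letter, seq in zip(ascii_uppercase, variants)}


def count_mismatches(a: str, b: str) -> int:
    """Mismatched positions over the shared length, plus the length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def closest_repeat(consensus: str, database: Dict[str, str]) -> Tuple[str, int]:
    """Return (name, mismatches) for the database entry closest to the consensus."""
    name = min(database, key=lambda n: count_mismatches(consensus, database[n]))
    return name, count_mismatches(consensus, database[name])


# Made-up sequences, much shorter than real CRISPR repeats.
repeat_db = {"repeat_1": "GTGTTC", "repeat_2": "GTTCAC"}
ids = assign_repeat_ids("7", ["GTGTTC", "GTGTTA"])
best = closest_repeat("GTGTTA", repeat_db)
```

A real comparison would probably also need to consider the reverse complement of each repeat, which this sketch leaves out.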
CCTK Minced, CCTK Blast, and CCTK network
- Details of which arrays are part of the same cluster are now written to a new file, Array_clusters.txt.
CCTK Spacerblast
- Support to define a seed region has been added. When performing a search for protospacers with less than 100% identity, a region can be defined that must not contain mismatches between the spacer and protospacer.
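
The seed-region rule can be sketched as a two-step check: the seed positions must match exactly, and the full alignment must still meet the identity threshold. The function below is a hypothetical illustration that assumes an ungapped, equal-length alignment and a seed given as 0-based, end-exclusive positions; it does not reflect how spacerblast parses or applies its seed option.

```python
from typing import Tuple


def is_acceptable_match(spacer: str, protospacer: str,
                        seed: Tuple[int, int], min_identity: float) -> bool:
    """Accept a match only if the seed is mismatch-free and overall
    identity is at least min_identity (a fraction between 0 and 1)."""
    if not spacer or len(spacer) != len(protospacer):
        raise ValueError("sequences must be non-empty and of equal length")
    seed_start, seed_end = seed
    # Any mismatch inside the seed rejects the match outright.
    if spacer[seed_start:seed_end] != protospacer[seed_start:seed_end]:
        return False
    matches = sum(a == b for a, b in zip(spacer, protospacer))
    return matches / len(spacer) >= min_identity


spacer = "ACGTACGTAC"
protospacer = "ACGTACGTAA"  # one mismatch, at the last position
seed_at_start = is_acceptable_match(spacer, protospacer, (0, 6), 0.9)
seed_at_end = is_acceptable_match(spacer, protospacer, (6, 10), 0.9)
```

The same 90% identical pair is accepted when the seed covers the first six bases and rejected when the seed covers the last four, because the single mismatch falls inside that seed.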
CCTK CRISPRtree
- Branch support can now be provided in 3 different ways:
  - As coloured circles, as in previous versions
  - As text displaying % support
  - Only included in the newick string (when using the other two modes, branch support is also included in the newick string)
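
For readers unfamiliar with how support values appear in a newick string, the sketch below writes a small tree using the widespread convention of placing the support value as the internal node label, directly after the closing parenthesis. The `Node` class and the exact layout are assumptions for illustration; the release notes do not specify how CRISPRtree formats its newick output.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: leaves have a name, internal nodes have children."""
    name: str = ""
    length: Optional[float] = None
    support: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def _subtree(node: Node) -> str:
    if node.children:
        text = "(" + ",".join(_subtree(child) for child in node.children) + ")"
        if node.support is not None:
            text += f"{node.support:g}"  # support written as the internal label
    else:
        text = node.name
    if node.length is not None:
        text += f":{node.length:g}"
    return text


def to_newick(root: Node) -> str:
    """Serialize the tree rooted at root as a newick string."""
    return _subtree(root) + ";"


tree = Node(children=[
    Node(length=0.5, support=95, children=[Node("A", 1), Node("B", 2)]),
    Node("C", 3),
])
newick = to_newick(tree)  # "((A:1,B:2)95:0.5,C:3);"
```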
v0.8.4
Font size is now scaled in both the vertical and horizontal dimensions by default for crisprdiff and crisprtree. Font size can be fixed by the user.
Bug fixes.
v0.8.0
- Fixed an issue where CRISPRtree did not detect unrelated arrays in the input and would run forever
- Improved spacing of array plotting
- Fixed minor bugs
- Improved help messages
v0.7.7
Initial release