Tutorial to detect copy number variation using CNV-Seq

About the tool

CNV-Seq allows the detection of copy number variation using NGS data.


To reduce noise around repeat regions in the genome, first keep only reads with a mapping quality of at least 4:

samtools view -b -q 4 sample.bam > sample.qfilt.bam
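
As a rough illustration of what the `-q 4` threshold does, the Python sketch below applies the same rule to plain-text SAM records: a record is kept when its MAPQ (the fifth column) is at least the threshold. The function name `passes_mapq` and the example records are made up for illustration; in practice samtools applies this filter to the BAM file directly.

```python
def passes_mapq(sam_line, min_mapq=4):
    """Return True if a SAM alignment line has MAPQ >= min_mapq (as with samtools view -q)."""
    fields = sam_line.rstrip("\n").split("\t")
    mapq = int(fields[4])  # SAM column 5 is MAPQ
    return mapq >= min_mapq


# Hypothetical records: QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL
records = [
    "read1\t0\t2L\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\t2L\t4\t0\t4M\t*\t0\t0\tACGT\tIIII",   # MAPQ 0: dropped
    "read3\t16\tX\t10\t4\t4M\t*\t0\t0\tACGT\tIIII",  # MAPQ 4: kept (threshold is inclusive)
]
kept = [r for r in records if passes_mapq(r)]
print(len(kept), "of", len(records), "records kept")
```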

Hits file

We only need to provide two "hits" files for CNV-Seq to work, one for the sample, and one for the reference.

These can be extracted as follows:

samtools view -F 4 sample.qfilt.bam | perl -lane 'print "$F[2]\t$F[3]"' > sample.qfilt.hits

This creates a two-column, tab-delimited file: the first column is the chromosome (reference sequence name) and the second is the 1-based leftmost mapping position of each read.

2L	1
2L	1
2L	4
2L	4
2L	4
2L	4
2L	4
2L	4
2L	4
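
For readers less familiar with Perl, the extraction step can be sketched in Python. This is only an illustration of the logic (skip unmapped reads, then print the reference name and position columns); `sam_to_hits` and the example records are hypothetical and not part of CNV-Seq.

```python
from collections import Counter

UNMAPPED = 0x4  # SAM flag bit for an unmapped read (what `-F 4` excludes)


def sam_to_hits(sam_lines):
    """Yield (chromosome, position) pairs for mapped reads."""
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines, if present
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & UNMAPPED:
            continue
        # Column 3 is RNAME, column 4 is the 1-based leftmost POS
        yield fields[2], int(fields[3])


# Hypothetical SAM records (only the leading columns matter here)
sam = [
    "@HD\tVN:1.6",
    "r1\t0\t2L\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",  # unmapped: skipped
    "r3\t16\t2L\t4\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t0\tX\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
hits = list(sam_to_hits(sam))
for chrom, pos in hits:
    print(f"{chrom}\t{pos}")

# Quick sanity check: number of hits per chromosome
per_chrom = Counter(chrom for chrom, _ in hits)
```

Counting hits per chromosome, as in the last line, is a cheap way to confirm that the hits file looks sensible before running CNV-Seq.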

Next, filter the hits files to keep only fully assembled chromosomes. This will output filtered files, e.g. sample.qfilt.hits.filt.
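
A filter of this kind could look something like the following Python sketch. The set of chromosome names is an assumption (the major Drosophila chromosome arms, suggested by the `2L` and `X` names in the examples here), and `filter_hits` is a hypothetical helper, not the script used in this tutorial.

```python
# Assumed set of fully assembled chromosomes; adjust to match your reference genome.
ASSEMBLED = {"2L", "2R", "3L", "3R", "4", "X", "Y"}


def filter_hits(lines, keep=ASSEMBLED):
    """Keep only hits lines whose first (chromosome) column is in `keep`."""
    for line in lines:
        chrom = line.split("\t", 1)[0]
        if chrom in keep:
            yield line


# Hypothetical hits lines, including one unplaced scaffold that should be dropped
hits = ["2L\t1\n", "2L\t4\n", "211000022278279\t15\n", "X\t100\n"]
filtered = list(filter_hits(hits))
print("".join(filtered), end="")
```

To apply it to a real file, the same generator can be fed an open file handle and its output written to a new file with the `.filt` suffix.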

Run CNV-Seq

Now run the main Perl script with the following options: --ref ref.qfilt.hits --test sample.qfilt.hits --genome-size 23542271 --window-size 10000


This produces two files: sample-vs-reference.cnv and sample-vs-reference.count.

sample-vs-reference.count shows the raw read counts in the test and reference samples for each window.

chromosome start end test ref
X 1 363 70 124
X 183 545 82 123
X 365 727 90 115
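
The rows above describe overlapping windows: each is 363 bp wide and starts 182 bp after the previous one. The Python sketch below reproduces that layout and counts hits per window. It only illustrates overlapping-window counting; the window and step values are read off the example rows, the positions are invented, and `window_counts` is not CNV-Seq's own implementation.

```python
def window_counts(positions, window, step, n_windows):
    """Count 1-based positions falling in each window [start, start + window - 1]."""
    rows = []
    for i in range(n_windows):
        start = 1 + i * step
        end = start + window - 1
        count = sum(1 for p in positions if start <= p <= end)
        rows.append((start, end, count))
    return rows


# Hypothetical read positions on one chromosome
positions = [1, 1, 4, 200, 364, 400, 700]
rows = window_counts(positions, window=363, step=182, n_windows=3)
for start, end, count in rows:
    print(start, end, count)
```

Because the windows overlap, a single read (such as the one at position 200) can contribute to two adjacent windows.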

sample-vs-reference.cnv contains the per-window statistics (log2 ratio and p-value) together with the CNV calls.

chromosome start end test ref position log2 p.value cnv cnv.size cnv.log2 cnv.p.value
X 1 363 70 124 182 -0.479319752689881 0.0462467378993667 0 NA NA NA
X 183 545 82 123 364 -0.239368959969129 0.194296291525077 0 NA NA NA
X 365 727 90 115 546 -0.00804341386267303 0.488268352520283 0 NA NA NA
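
The `position` and `log2` columns can be related back to the counts. In the rows above, `position` is the window midpoint, and each `log2` value differs from the raw log2(test/ref) by the same offset of about 0.346, which is consistent with a normalisation by total read counts. The Python sketch below makes both explicit. The library totals passed to `log2_ratio` would be hypothetical, since the tutorial does not state them, and this is not CNV-Seq's own code.

```python
import math


def midpoint(start, end):
    """Window midpoint, matching the `position` column in the example rows."""
    return (start + end) // 2


def log2_ratio(test, ref, test_total, ref_total):
    """log2 of the test/ref count ratio after scaling each by its library total (assumed normalisation)."""
    return math.log2((test / test_total) / (ref / ref_total))


# (start, end, test, ref, reported log2) taken from the example .cnv rows
rows = [
    (1, 363, 70, 124, -0.479319752689881),
    (183, 545, 82, 123, -0.239368959969129),
    (365, 727, 90, 115, -0.00804341386267303),
]

# Offset between the reported log2 and the raw log2(test/ref) for each window
offsets = [reported - math.log2(test / ref) for _, _, test, ref, reported in rows]
for (start, end, *_), offset in zip(rows, offsets):
    print(midpoint(start, end), round(offset, 3))
```

A constant offset across windows is what a single genome-wide scaling factor would produce; a negative normalised `log2` then indicates relatively fewer reads in the test sample than in the reference.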


To plot and output CNV information for all samples, run the plotting script on the .cnv files:

$ bash *.cnv

This calls a modified version of the main script provided with CNV-Seq.

The main tweaks:

  • Change the plotting colours/densities
  • Show genes of interest in a close-up plot of the Notch region
  • Save CNV details for each file parsed for later processing

By default, this produces two plots (one for the Notch region and one for chromosome X) and a cnvs.txt file. To plot different chromosomes, change the chrom variable in the script.

Visualising CNVs in IGV

To create an IGV compatible track for each sample, run

for file in *_cnvs.txt; do
	 perl $file