Tutorial to detect copy number variation using CNV-Seq

About the tool

CNV-Seq allows the detection of copy number variation using NGS data.

Input

In order to reduce the noise around repeat regions in the genome, first filter reads with a mapping quality > 4:

samtools view -b -q 4 sample.bam > sample.qfilt.bam

Hits file

We only need to provide two "hits" files for CNV-Seq to work, one for the sample, and one for the reference.

These can be extracted as follows:

samtools view -F 4 sample.qfilt.bam | perl -lane 'print "$F[2]\t$F[3]"' > sample.qfilt.hits

This creates a two column, tab delimited file, with the second column giving the corresponding 1-based leftmost mapping position of a read.

2L	1
2L	1
2L	4
2L	4
2L	4
2L	4
2L	4
2L	4
2L	4

Next, to select only fully assembled chromosomes run filt.pl. This will output filtered files eg sample.qfilt.hits.filt:

Run CNV-Seq

Now run the main perl script:

cnv-seq.pl --ref ref.qfilt.hits --test sample.qfilt.hits --genome-size 23542271 window-size 10000

Output

This produces two files sample-vs-reference.cnv and sample-vs-reference.count

sample-vs-reference.count shows the raw count data for each CNV.

chromosome	start	end	test	ref
X	1	363	70	124
X	183	545	82	123
X	365	727	90	115

sample-vs-reference.cnv contains the stats.

chromosome	start	end	test	ref	position	log2	p.value	cnv.size	cnv.log2	cnv.p.value
X	1	363	70	124	182	-0.479319752689881	0.0462467378993667	NA	NA	NA
X	183	545	82	123	364	-0.239368959969129	0.194296291525077	NA	NA	NA
X	365	727	90	115	546	-0.00804341386267303	0.488268352520283	NA	NA	NA

Plotting

To plot and output CNV info for all samples, run cnv_seq_process.sh on .cnv files:

$ bash cnv_seq_process.sh *.cnv

This calls a modified version of the main script provided with CNV-Seq.

The main tweaks:

Change the plotting colours/densities
Show genes of interest in a closeup plot of notch region e.g.
Save CNV details for each file parsed for later processing

This will by default produce two plots (one for notch region, and one for chromosome X) and a cnvs.txt file. To plot for different chromosomes alter the chrom var in the script cnv_seq_process.sh.

Visualasing CNVs in IGV

To create an IGV compatible track for each sample, run cnv2gff.pl:

for file in *_cnvs.txt; do
	 perl cnv2gff.pl $file
done

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CNV-Seq.md

CNV-Seq.md

Tutorial to detect copy number variation using CNV-Seq

Table of Contents

About the tool

Input

Hits file

Run CNV-Seq

Output

Plotting

Visualasing CNVs in IGV

Files

CNV-Seq.md

Latest commit

History

CNV-Seq.md

File metadata and controls

Tutorial to detect copy number variation using CNV-Seq

Table of Contents

About the tool

Input

Hits file

Run CNV-Seq

Output

Plotting

Visualasing CNVs in IGV