Interpreting the figure

Timothy Tickle edited this page Jun 24, 2016 · 16 revisions

Interpretation

The resulting cnv.pdf file should look like the following figure.

[Figure: Test Single-Cell CNV Data]

The rows of the figure are cell observations. Reference observations are at the very top; the matrix is separated horizontally into reference and non-reference cells. Non-reference cells are ordered by hierarchical clustering with Euclidean distance and average linkage. Reference cells are in the order given by the --ref parameter.
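The row ordering described above can be sketched as follows. This is a minimal illustration and not the tool's own code: it assumes SciPy is available, and the function name `order_rows` and its arguments are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list

def order_rows(expr, ref_idx):
    """Return a row order: reference rows first (in the order given),
    then non-reference rows in dendrogram leaf order."""
    ref_idx = list(ref_idx)
    ref_set = set(ref_idx)
    non_ref = np.array([i for i in range(expr.shape[0]) if i not in ref_set])
    if len(non_ref) < 2:
        # Nothing to cluster with fewer than two non-reference cells.
        return ref_idx + non_ref.tolist()
    # Average linkage on Euclidean distances between cells (rows).
    tree = linkage(expr[non_ref], method="average", metric="euclidean")
    return ref_idx + non_ref[leaves_list(tree)].tolist()
```

Calling `order_rows(matrix, [4, 0])` keeps rows 4 and 0 on top in that order and places the remaining rows so that similar cells sit next to each other.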

The columns of the figure are genes separated as contigs/chromosomes. Contigs are ordered as they first appear in the genomic_position file.

To create the measurements, input values are transformed to log2((TPM/10)+1) and filtered by requiring a minimum average expression per gene. The remaining gene values are then centered and thresholded to reduce outliers. Genes are then ordered by contig, and smoothing is performed with a moving average over each cell's centered expression along the genomic coordinates. Values at the ends of each contig are set to 0 because they are unreliable. The smoothed expression is then averaged over the reference observations, and this average is removed from the non-reference cells (if no reference is given, a global average is used). If a gene measurement is too close to the average, it is set to zero and is considered noise or existing genomic structure.
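The steps above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the tool's implementation: the function names, the cutoffs (`min_mean`, `clip`, `noise`), and the window size are hypothetical, centering is assumed to be per gene across cells, and the baseline is subtracted from every row for simplicity.

```python
import numpy as np

def smooth_row(row, window):
    """Centered moving average (odd window); the window // 2 values at
    each end cannot be averaged over a full window and are left at 0."""
    half = window // 2
    out = np.zeros(row.size, dtype=float)
    if row.size >= window:
        kernel = np.ones(window) / window
        out[half:row.size - half] = np.convolve(row, kernel, mode="valid")
    return out

def cnv_signal(tpm, contigs, ref_rows, min_mean=0.1, clip=3.0, window=3, noise=0.2):
    """tpm: cells x genes with columns already in genomic order.
    contigs: contig label per gene. ref_rows: indices of reference cells.
    All default cutoffs are placeholders, not the tool's defaults."""
    contigs = np.asarray(contigs)
    x = np.log2(np.asarray(tpm, dtype=float) / 10.0 + 1.0)
    keep = x.mean(axis=0) >= min_mean            # minimum average per gene
    x, contigs = x[:, keep], contigs[keep]
    x = x - x.mean(axis=0)                       # assumption: center each gene
    x = np.clip(x, -clip, clip)                  # threshold outliers
    smoothed = np.zeros_like(x)
    for c in dict.fromkeys(contigs.tolist()):    # contigs in first-seen order
        cols = np.flatnonzero(contigs == c)
        for i in range(x.shape[0]):
            smoothed[i, cols] = smooth_row(x[i, cols], window)
    # Reference average, or the global average when no reference is given.
    baseline = smoothed[ref_rows].mean(axis=0) if len(ref_rows) else smoothed.mean(axis=0)
    out = smoothed - baseline
    out[np.abs(out) < noise] = 0.0               # values near the average are noise
    return out, keep
```

The returned matrix has one column per retained gene; `keep` is the boolean mask of genes that passed the minimum-average filter.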

In the resulting example.pdf file we can see patterns of expression, consistent with genomic order, that are not in the reference. One can infer that these increases and decreases in expression result from copy number variation. In this example we see deletions of Chr1p and Chr19q, which are characteristic of oligodendroglioma.