infercnv 10x

GeorgescuC edited this page Feb 20, 2020 · 7 revisions

Leveraging 10x count data in InferCNV

Below is an example of how you might generate a counts matrix for use with inferCNV, starting from 10x data.

Here, we'll use Seurat for converting 10x count data to a compatible matrix format.

Seurat 3.0+

Seurat's recommended method to access the data:

library(Seurat)
counts_matrix = GetAssayData(seurat_obj, slot="counts")

Manual access to the data:

library(Seurat)
counts_matrix = seurat_obj@assays$RNA@counts[,colnames(seurat_obj)]

Seurat 2.X

library(Seurat)

data = Read10X(data.dir = "10x_data_dir/")
seurat_obj = CreateSeuratObject(raw.data=data, min.cells=3, min.genes=200)
counts_matrix = as.matrix(seurat_obj@raw.data[,seurat_obj@cell.names])

# use more palatable column names (cell identifiers)
cell.names <- paste0("cell_", seq_along(colnames(counts_matrix)))
colnames(counts_matrix) = cell.names
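
Renaming the columns discards the original 10x barcodes, which you may still need to match cells against annotations or other outputs. A minimal sketch of keeping a lookup table alongside the renamed matrix, using a toy matrix in place of counts_matrix (the gene names, barcodes, and the barcode_map name are made up for illustration):

```r
# Toy stand-in for counts_matrix: 2 genes x 3 cells (hypothetical values).
counts_matrix <- matrix(
  c(0, 2, 1, 0, 3, 0),
  nrow = 2,
  dimnames = list(c("GeneA", "GeneB"),
                  c("AAACCTG-1", "AAACGGG-1", "AAAGATG-1"))
)

# Record the original barcodes before overwriting the column names.
barcode_map <- data.frame(
  cell_name = paste0("cell_", seq_len(ncol(counts_matrix))),
  barcode = colnames(counts_matrix),
  stringsAsFactors = FALSE
)

# Apply the new names; barcode_map can later translate them back.
colnames(counts_matrix) <- barcode_map$cell_name
```

Whatever names you choose, the cell identifiers in the annotations you give inferCNV need to match the column names of the matrix.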

Writing the matrix to file

This step is not required, since infercnv accepts an R matrix (dense or sparse) as input, but saving the matrix can be useful for access elsewhere or in another session.

# save the matrix as an R object (faster to load and smaller on disk)
saveRDS(round(counts_matrix, digits=3), "sc.10x.counts.matrix.rds")

# write the matrix in tab-delimited text format (as.matrix() is needed if counts_matrix is sparse)
write.table(as.matrix(round(counts_matrix, digits=3)), file='sc.10x.counts.matrix', quote=FALSE, sep="\t")

Note: if the tab-delimited matrix file would be too large, you can instead save the matrix as a sparse Matrix object and use that object directly as input to inferCNV.
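
As a rough illustration of the sparse route, the sketch below converts a toy dense matrix to a sparse Matrix object with the Matrix package, saves it with saveRDS(), and reads it back. The values, dimension names, and temporary file path are assumptions for the example; in practice you would use your own counts_matrix and a path of your choice.

```r
library(Matrix)

# Toy dense counts: 3 genes x 4 cells (hypothetical values).
dense_counts <- matrix(
  c(0, 0, 5, 0, 1, 0, 0, 0, 2, 3, 0, 0),
  nrow = 3,
  dimnames = list(paste0("Gene", 1:3), paste0("cell_", 1:4))
)

# Convert to a sparse Matrix object; only non-zero entries are stored.
sparse_counts <- Matrix(dense_counts, sparse = TRUE)

# Save the sparse object and reload it, e.g. in another session.
rds_path <- tempfile(fileext = ".rds")
saveRDS(sparse_counts, rds_path)
reloaded <- readRDS(rds_path)
```

Note that GetAssayData() in Seurat 3.0+ already returns the counts as a sparse matrix, so the conversion step is mainly relevant after as.matrix() has been applied.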

Running infercnv

Now, the data is ready for use with InferCNV.

When calling infercnv::run() on 10x Genomics data, set 'cutoff=0.1' instead of the default (1), which we tend to use with Smart-seq2 and other less sparse data.
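
A sketch of how the cutoff might be passed in practice is shown below. The wrapper run_infercnv_10x and its default out_dir are hypothetical names introduced here for illustration, and you would need to supply your own annotations file, gene order file, and reference group names; only the cutoff value comes from the guidance above.

```r
# Hypothetical convenience wrapper; requires the infercnv package when called.
run_infercnv_10x <- function(counts_matrix, annotations_file, gene_order_file,
                             ref_group_names, out_dir = "infercnv_output") {
  # Build the infercnv object from the counts matrix and the supporting files.
  infercnv_obj <- infercnv::CreateInfercnvObject(
    raw_counts_matrix = counts_matrix,
    annotations_file = annotations_file,
    delim = "\t",
    gene_order_file = gene_order_file,
    ref_group_names = ref_group_names
  )

  # cutoff = 0.1 is the value suggested for sparse 10x Genomics data.
  infercnv::run(
    infercnv_obj,
    cutoff = 0.1,
    out_dir = out_dir
  )
}
```

Other infercnv::run() options are left at their defaults here; consult the package documentation before relying on them for your data.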
