GitHub - new-zbc/BNPSpace: Nonparametric Bayesian clustering with feature selection for SRT data

BNPSpace

Overview

BNPSpace is a Bayesian nonparametric spatial clustering framework with feature selection for spatial transcriptomic data. This framework integrates four components into a hierarchical model. Firstly, it employs a multivariate count-generating process based on the zero-inflated Poisson (ZIP) model for directly modeling SRT count data, thereby avoiding the need for ad hoc data normalization methods. Secondly, BNPSpace adopts a feature selection strategy to provide low-dimensional representations of SRT data in terms of a list of discriminating genes (DGs), thus facilitating direct biological interpretability. Distinctively, BNPSpace clusters samples based on the similarity of their DG count vectors using a mixture of finite mixtures (MFM) model, allowing for an automatic selection of the number of spatial domains. Moreover, BNPSpace uses an MRF prior to accounting for the geospatial profile of SRT data, which in turn enhances clustering accuracy.

User manual

The required R packages:

Rcpp
SingleCellExperiment
RcppArmadillo
flexclust

The data structure used for BNPSpace is a SingleCellExperiment object containing assay "counts" and col data "row" and "col". "row" and "col" are the coordinates of each spot.

source("demo/dataSim.R")
> sce
#class: SingleCellExperiment 
#dim: 500 1600 
#metadata(0):
#assays(2): counts logcounts
#rownames(500): gene_1 gene_2 ... gene_499 gene_500
#rowData names(0):
#colnames(1600): spot_1 spot_2 ... spot_1599 spot_1600
#colData names(4): row col label sizeFactor
#reducedDimNames(1): PCA
#mainExpName: NULL
#altExpNames(0):

The above simulation data contains two assays, original counts data and log-normalized data (logcounts). There are 500 genes and 1600 spots. PCA represents the dimension reduction features using principal component analysis. The col data names "row" and "col" are spatial coordinates, "label" is the cluster ground truth, and "sizeFactor" is the size factor of each spot.

Then, we can run BNPSpace.

library(Rcpp)
library(SingleCellExperiment)
library(RcppArmadillo)
library(flexclust)
source("R/utils.R")
source("R/main.R")
### construct spots network
Adj = find_neighbors(sce, "ST", "image")
neighbors = find_neighbor_index(Adj, "ST")
result = NSCFS(sce, neighbors, n_clusters = NULL, d = 1, n_iters = 500, seed = 1)
pred_label = result$pred_label
ARI_value = randIndex(table(pred_label, colData(sce)$label))

Finally, we draw the clustering results and feature selection results.

library(BayesSpace)
metadata(sce)$BayesSpace.data <- list()
metadata(sce)$BayesSpace.data$platform <- "ST"
metadata(sce)$BayesSpace.data$is.enhanced <- FALSE
p <- clusterPlot(sce, label= pred_label, palette=NULL, size=0.05) +
  labs(title=paste0("BNPSpace: ARI=", round(ARI_value, 3)), fill = "Domains") + 
  theme(legend.key.size = unit(0.5, 'cm'), 
        legend.key.height = unit(0.5, 'cm'), 
        legend.key.width = unit(0.5, 'cm'), 
        legend.title = element_text(size=12), 
        legend.text = element_text(size=10),
        panel.border = element_rect(colour = "black", fill=NA, size=1),
        plot.title = element_text(size = 16, face = "bold", hjust = 0.5))

data_gamma = data.frame(gene = sample(500), gamma_est = result$gamma, gamma_true = c(rep(TRUE, 20), rep(FALSE, 480)))

library(ggplot2)
PPI_plot = ggplot(data = data_gamma) + geom_point(aes(x = gene, y = gamma_est, color = gamma_true))+
  ylab("Posterior probability of inclusion (PPI)") + xlab("Gene index") + 
  labs(title = "PPIs by BNPSpace", color = "DGs") +
  geom_hline(aes(yintercept = 0.5), color = "blue") +
  theme(legend.key.size = unit(0.5, 'cm'), 
        legend.key.height = unit(0.5, 'cm'), 
        legend.key.width = unit(0.5, 'cm'), 
        legend.title = element_text(size=12),
        legend.text = element_text(size=10),
        plot.title = element_text(size = 16, face = "bold", hjust = 0.5))
library(patchwork)
p + PPI_plot + plot_layout(nrow = 1)

Tutorials

For a tutorial on a real application, please refer to Tutorial.html

If there is any bugs without being fixed timely, please contact author directly [email protected]

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
R		R
application		application
demo		demo
reproduce		reproduce
simulation1		simulation1
simulation2		simulation2
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
Tutorial.html		Tutorial.html
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

BNPSpace

Overview

User manual

Tutorials

About

Releases

Packages

Languages

License

new-zbc/BNPSpace

Folders and files

Latest commit

History

Repository files navigation

BNPSpace

Overview

User manual

Tutorials

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages