single-cell-clustering-and-batch-correction-with-omicverse
Single-cell clustering and batch correction with omicverse
Overview
This skill distills the single-cell tutorials t_cluster.ipynb and t_single_batch.ipynb. Use it when a user wants to preprocess an AnnData object, explore clustering alternatives (Leiden, Louvain, scICE, GMM, topic/cNMF models), and evaluate or harmonise batches with omicverse utilities.
Instructions
- Import libraries and set plotting defaults
  - Load `omicverse as ov`, `scanpy as sc`, and plotting helpers (`scvelo as scv` when using dentate gyrus demo data).
  - Apply `ov.plot_set()` or `ov.utils.ov_plot_set()` so figures adopt omicverse styling before embedding plots.
- Load data and annotate batches
  - For demo clustering, fetch `scv.datasets.dentategyrus()`; for integration, read the provided `.h5ad` files via `ov.read()` and set `adata.obs['batch']` identifiers for each cohort.
  - Confirm inputs are sparse numeric matrices; convert with `adata.X = adata.X.astype(np.int64)` when required for QC steps.
- Run quality control
  - Execute `ov.pp.qc(adata, tresh={'mito_perc': 0.2, 'nUMIs': 500, 'detected_genes': 250}, batch_key='batch')` to drop low-quality cells and inspect summary statistics per batch.
  - Save intermediate filtered objects (`adata.write_h5ad(...)`) so users can resume from clean checkpoints.
- Preprocess and select features
  - Call `ov.pp.preprocess(adata, mode='shiftlog|pearson', n_HVGs=3000, batch_key=None)` to normalise, log-transform, and flag highly variable genes; assign `adata.raw = adata` and subset to `adata.var.highly_variable_features` for downstream modelling.
  - Scale expression (`ov.pp.scale(adata)`) and compute PCA scores with `ov.pp.pca(adata, layer='scaled', n_pcs=50)`. Encourage reviewing variance explained via `ov.utils.plot_pca_variance_ratio(adata)`.
- Construct neighbourhood graph and baseline clustering
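
For the "Load data and annotate batches" step, a minimal sketch of reading several cohorts and tagging each with a batch label might look like the following. The helper names `load_cohorts` and `ensure_integer_counts`, the `reader` parameter, and any file names are hypothetical and introduced only for illustration; `ov.read()`, `adata.obs['batch']`, and the `np.int64` cast are taken from the step itself.

```python
def load_cohorts(paths_by_batch, reader=None):
    """Read one .h5ad file per cohort and label each with its batch.

    paths_by_batch maps a batch label to a file path. `reader` defaults to
    ov.read; it is exposed as a parameter only so that this sketch can be
    exercised without omicverse installed (an assumption of the example).
    """
    if reader is None:
        import omicverse as ov  # imported lazily; only needed for real files
        reader = ov.read

    cohorts = []
    for batch_label, path in paths_by_batch.items():
        adata = reader(path)
        # Every cell in this cohort receives the same batch identifier.
        adata.obs["batch"] = batch_label
        cohorts.append(adata)
    return cohorts


def ensure_integer_counts(adata):
    """Cast the expression matrix to int64, as suggested before QC."""
    import numpy as np

    adata.X = adata.X.astype(np.int64)
    return adata
```

A call such as `load_cohorts({"cohort_a": "cohort_a.h5ad", "cohort_b": "cohort_b.h5ad"})` (placeholder names) would return one labelled object per cohort; how the cohorts are merged afterwards is left open here because the step does not specify it.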
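
For the "Run quality control" step, the sketch below spells out how the three thresholds are usually read (an upper bound on mitochondrial fraction, lower bounds on UMIs and detected genes) and wraps the documented call with a checkpoint write. `passes_qc` and `run_qc_with_checkpoint` are hypothetical helpers, and `passes_qc` is an illustration of the threshold semantics rather than omicverse's own implementation; its handling of values exactly on a boundary is an assumption.

```python
# Thresholds copied from the quality-control step above.
QC_THRESHOLDS = {"mito_perc": 0.2, "nUMIs": 500, "detected_genes": 250}


def passes_qc(cell, tresh=QC_THRESHOLDS):
    """Return True if a cell's summary statistics satisfy all thresholds.

    `cell` is a plain dict with the same keys as `tresh`. This mirrors the
    intent of the filter only; it is not the library's code.
    """
    return (
        cell["mito_perc"] <= tresh["mito_perc"]  # not too much mitochondrial signal
        and cell["nUMIs"] >= tresh["nUMIs"]  # enough total counts
        and cell["detected_genes"] >= tresh["detected_genes"]  # enough genes seen
    )


def run_qc_with_checkpoint(adata, checkpoint_path):
    """Run ov.pp.qc with the thresholds above, then save a clean checkpoint."""
    import omicverse as ov  # imported lazily so the helper above stays dependency-free

    adata = ov.pp.qc(adata, tresh=QC_THRESHOLDS, batch_key="batch")
    adata.write_h5ad(checkpoint_path)
    return adata
```

Keeping the thresholds in one dictionary makes it easier to report exactly which cut-offs produced a given checkpoint.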
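
For the "Preprocess and select features" step, the calls can be chained roughly as sketched below. `preprocess_and_reduce` is a hypothetical wrapper that simply strings together the calls named in the step, with a `.copy()` added on the subset as a defensive assumption. `shiftlog` is a toy, pure-Python illustration of a shifted-log normalisation for a single cell; its `target_sum` value is an assumed scale factor for the example and not necessarily the library's default.

```python
import math


def shiftlog(counts, target_sum=1e4):
    """Scale one cell's counts to `target_sum`, then apply log(1 + x)."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]  # an empty cell stays all-zero
    return [math.log1p(c / total * target_sum) for c in counts]


def preprocess_and_reduce(adata, n_hvgs=3000, n_pcs=50):
    """Normalise, keep highly variable genes, scale, and run PCA."""
    import omicverse as ov  # imported lazily; needed only for real data

    adata = ov.pp.preprocess(
        adata, mode="shiftlog|pearson", n_HVGs=n_hvgs, batch_key=None
    )
    adata.raw = adata  # keep the full gene set for later inspection
    adata = adata[:, adata.var.highly_variable_features].copy()
    ov.pp.scale(adata)  # the PCA call below reads the 'scaled' layer
    ov.pp.pca(adata, layer="scaled", n_pcs=n_pcs)
    return adata
```

As a quick check of the toy helper, `shiftlog([1, 1, 2], target_sum=4)` leaves the scaled counts at 1, 1, and 2, so the result is `log(2)`, `log(2)`, and `log(3)`.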