Leiden usage workflow
example data:
# The raw scRNA-seq dataset, goolam.rds, is stored in ./scRNAseqData.
# https://hemberg-lab.github.io/scRNA.seq.datasets/mouse/edev/#goolam
da <- readRDS("./scRNAseqData/goolam.rds")
da
output:
class: SingleCellExperiment
dim: 41428 124
metadata(0):
assays(2): counts logcounts
rownames(41428): ENSMUSG00000000001 ENSMUSG00000000003 ... ERCC-00170 ERCC-00171
rowData names(19): is_feature_control is_feature_control_ERCC ... feature_symbol feature_id
colnames(124): X2cell_1_A X2cell_1_B ... ME_4cell_6_C ME_4cell_6_D
colData names(31): cell_type1 total_features ... pct_counts_top_50_features_ERCC is_cell_control
reducedDimNames(0):
mainExpName: NULL
altExpNames(0)
Step 1: Normalizing the raw scRNA-seq data and mapping it to multiple low-dimensional latent spaces. A 9_latent_data folder is produced and saved in ./OutputData.
# activate the environment
conda activate DEPF
cd DEPF/HierarchicalAutoencoder/
Rscript runHA.R
output:
[1] "goolam.rds"
output 9 of 1 latent: goolam .
output 9 of 2 latent: goolam .
output 9 of 3 latent: goolam .
output 9 of 4 latent: goolam .
output 9 of 5 latent: goolam .
output 9 of 6 latent: goolam .
output 9 of 7 latent: goolam .
output 9 of 8 latent: goolam .
output 9 of 9 latent: goolam .
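Before moving on to Step 2, it can help to confirm that Step 1 actually wrote its results. The sketch below is only an illustration: `list_latent_outputs` is a hypothetical helper (not part of DEPF), and the path `OutputData/9_latent_data` is an assumption based on the folder names mentioned above, taken relative to the current working directory. The names and format of the files inside the folder are not documented here, so the helper simply lists whatever it finds.

```r
# Hypothetical helper (not part of DEPF): list the files that Step 1 wrote.
# The default path is an assumption; adjust it to your working directory.
list_latent_outputs <- function(dir = file.path("OutputData", "9_latent_data")) {
  if (!dir.exists(dir)) {
    stop("Latent data folder not found: ", dir)
  }
  list.files(dir, full.names = TRUE)
}

# Only print the listing if the folder is present, so the snippet is safe
# to run before Step 1 has finished.
latent_dir <- file.path("OutputData", "9_latent_data")
if (dir.exists(latent_dir)) {
  print(list_latent_outputs(latent_dir))
}
```

If the folder is missing, the helper stops with an error instead of returning an empty listing, which makes a failed Step 1 easier to notice.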
Step 2: Selecting Leiden to generate a clustering ensemble.
R
source("runLeiden.R")
runLeiden(res=1, ensemble_num = 10)
output:
********************************************** .
output 9 of 1 latent: goolam .
********************************************** .
---------------------------------------------- .
resolution: 1 .
---------------------------------------------- .
Computing nearest neighbor graph
Computing SNN
[1] "============ Leiden ============"
...
...
...
********************************************** .
output 9 of 9 latent: goolam .
********************************************** .
---------------------------------------------- .
resolution: 1 .
---------------------------------------------- .
Computing nearest neighbor graph
Computing SNN
[1] "============ Leiden ============"
Leiden_resolution_1.csv is produced and saved in ./OutputData.
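To make the idea of a clustering ensemble more concrete, here is a minimal sketch that summarizes several clusterings of the same cells as a co-association matrix (the fraction of runs in which two cells share a cluster). This is a generic ensemble-clustering illustration, not DEPF's pruning method. The layout of Leiden_resolution_1.csv is not described here, so the toy `ensemble` matrix (one row per cell, one column per run) is built by hand, and `co_association` is a hypothetical helper.

```r
# Toy ensemble: 12 cells, 3 clustering runs (labels are made up).
ensemble <- cbind(
  run1 = rep(1:3, each = 4),
  run2 = rep(c(1, 2), each = 6),
  run3 = c(rep(1, 4), rep(2, 3), rep(3, 5))
)

# Hypothetical helper: fraction of runs in which each pair of cells
# is assigned to the same cluster.
co_association <- function(ensemble) {
  n <- nrow(ensemble)
  co <- matrix(0, n, n)
  for (j in seq_len(ncol(ensemble))) {
    labels <- ensemble[, j]
    # outer(..., "==") is TRUE where two cells share a label in run j.
    co <- co + outer(labels, labels, "==")
  }
  co / ncol(ensemble)
}

co <- co_association(ensemble)
print(round(co[1:6, 1:6], 2))
```

Entries close to 1 mark pairs of cells that the runs agree on; entries near 0 mark pairs that are consistently separated.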
Step 3: Performing dynamic ensemble pruning. final_clustering.csv is produced and saved in ./OutputData.
runBioFOA("Leiden", 5, 1)
output:
==========run Ensemble Pruning.==================
goolam
============load Leiden ensemble=================
========= NMI ==========
0.8400
========= ARI ==========
0.6700
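For readers who want to see what the two reported scores measure, the following is a minimal base-R sketch of the adjusted Rand index (ARI) and normalized mutual information (NMI) between two label vectors. It is not DEPF's own code. NMI has several normalizations; this sketch divides by the arithmetic mean of the two entropies, which is an assumption, since the variant used to produce the numbers above is not stated. The function names and the toy labels are hypothetical and unrelated to the goolam data.

```r
# Adjusted Rand index computed from the contingency table of two labelings.
adjusted_rand_index <- function(truth, pred) {
  stopifnot(length(truth) == length(pred), length(truth) >= 2)
  tab <- unclass(table(truth, pred))
  sum_cells <- sum(choose(tab, 2))           # pairs grouped together in both
  sum_rows <- sum(choose(rowSums(tab), 2))   # pairs grouped together in truth
  sum_cols <- sum(choose(colSums(tab), 2))   # pairs grouped together in pred
  expected <- sum_rows * sum_cols / choose(sum(tab), 2)
  max_index <- (sum_rows + sum_cols) / 2
  if (max_index == expected) return(1)       # both labelings are trivial
  (sum_cells - expected) / (max_index - expected)
}

# NMI, normalized by the arithmetic mean of the two entropies (assumption).
normalized_mutual_info <- function(truth, pred) {
  stopifnot(length(truth) == length(pred))
  p_xy <- unclass(table(truth, pred)) / length(truth)
  p_x <- rowSums(p_xy)
  p_y <- colSums(p_xy)
  entropy <- function(p) {
    p <- p[p > 0]
    -sum(p * log(p))
  }
  nz <- p_xy > 0                             # skip empty cells (0 * log 0 = 0)
  mi <- sum(p_xy[nz] * log(p_xy[nz] / outer(p_x, p_y)[nz]))
  h <- (entropy(p_x) + entropy(p_y)) / 2
  if (h == 0) return(1)                      # both labelings have one cluster
  mi / h
}

# Toy labels: one cell is assigned to the "wrong" cluster.
truth <- c(1, 1, 1, 2, 2, 2)
pred  <- c(1, 1, 2, 2, 2, 2)
ari <- adjusted_rand_index(truth, pred)
nmi <- normalized_mutual_info(truth, pred)
print(c(ARI = ari, NMI = nmi))
```

Both scores equal 1 when the two labelings are identical up to a renaming of the clusters, and ARI is close to 0 for labelings that agree only by chance.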