nCount_RNA: total number of reads (smartseq2) or UMIs (10x) per cell.
nFeature_RNA: number of genes detected per cell.
cell_name: unique name assigned to each cell.
cell_barcode_10x: unique 10x barcode ID for each cell (10x data only).
sequencing_run_10x: unique Illumina NovaSeq 6000 system sequencing run ID for each cDNA library sequenced (10x data only). Each library contained more than one channel/sample.
channel_10x: unique tissue channel/sample name (10x data only). For several tissues, more than one channel/sample was sequenced (each designated by a different subtissue name).
possibly_contaminated_barcode_10x: flags cell barcodes that may reflect cross-sample contamination (10x data only). Contamination filtering was done to resolve cross-sample contamination within an Illumina sequencing run caused by cell barcode hopping among multiplexed 10x samples (see the Methods section of the Tabula Microcebus manuscript for a full explanation).
method: smartseq2 (full-length) or 10x (3prime).
individual: lemur individuals available in the dataset.
age: age of the individual (years).
sex: sex of the individual.
tissue: tissue sampled.
tissue_system: tissue/organ system for each tissue sampled.
tissue_order: numerical ordering of each of the 27 tissues by tissue system (according to Fig. 1C in Tabula Microcebus manuscript).
subtissue: specification of the anatomical site sampled within the tissue. For tissues sampled multiple times at the same anatomical site, each 10x channel has a distinct subtissue number.
compartment_v1: functional compartment for each cell type (i.e., epithelial, endothelial, stromal, immune (hematopoietic, lymphoid, myeloid, megakaryocyte-erythroid), neural, germ).
cell_ontology_class_v1: cell type designation using the Cell Ontology.
free_annotation_v1: detailed cell type designation using free text and molecular markers. PF, proliferating; LQ, low quality.
tissue__cell_ontology_class_v1: concatenation of the tissue and cell ontology designation.
tissue__free_annotation_v1: concatenation of the tissue and free annotation designation.
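The two concatenated fields above can be rebuilt from their component fields. The following is a minimal sketch, not the dataset's own code: the double-underscore separator is an assumption suggested by the field names, and the example records and the helper `concat_fields` are hypothetical.

```python
# Hypothetical per-cell metadata records; the values are illustrative only.
cells = [
    {"tissue": "Lung", "cell_ontology_class_v1": "B cell",
     "free_annotation_v1": "B cell (PF)"},
    {"tissue": "Kidney", "cell_ontology_class_v1": "macrophage",
     "free_annotation_v1": "macrophage"},
]

SEP = "__"  # assumption: separator inferred from the field names


def concat_fields(cell, left, right, sep=SEP):
    """Join two metadata fields of one cell into a single label."""
    return f"{cell[left]}{sep}{cell[right]}"


for cell in cells:
    cell["tissue__cell_ontology_class_v1"] = concat_fields(
        cell, "tissue", "cell_ontology_class_v1")
    cell["tissue__free_annotation_v1"] = concat_fields(
        cell, "tissue", "free_annotation_v1")
```

Check the separator against the actual values in the dataset before relying on string splitting to recover the parts.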
mix_hybrid: clusters labeled as a ‘mix’ or ‘hybrid’ cell type. A ‘mix’ cell type is a cluster with a small number of cells that contains more than one cell type but could not be partitioned into separate clusters, either by subclustering with the Louvain algorithm or manually with cellxgene. A ‘hybrid’ cell type is a cluster whose cells expressed markers for more than one cell type and that was judged biologically plausible rather than a technical artifact.
low_quality: clusters that separated from a main cluster but did not express any distinguishing markers and differed only in parameters of technical quality (i.e. fewer genes and counts detected per cell) were considered low quality.
dendrogram_annotation_number: number assigned to each of the 256 cell type designations across the Tabula Microcebus, arranged by compartment and then ordered by organ system or biological relatedness (according to Fig. 2A in Tabula Microcebus manuscript). In addition, separate numbering is assigned to the hybrid and mix cell types (labeled with the prefix letters ‘H’ and ‘M’, respectively).
dendrogram_annotation_order: numerical ordering of the 256 cell type designations with the addition of the hybrid and mix cell types (according to Fig. 2B in Tabula Microcebus manuscript).
order__compartment_freeannotation_tissue, order__tissue_compartment_freeannotation: numerical ordering of the 768 molecularly distinct cell types where each cell type designation is separated by its tissue of origin (with mix cell types excluded). order__compartment_freeannotation_tissue: cell types are ordered by compartment (compartment_v1), then by free annotation (free_annotation_v1), and then by tissue (tissue_order); order__tissue_compartment_freeannotation: cell types are ordered by tissue (tissue_order), then by compartment (compartment_v1), and then by free annotation (free_annotation_v1).
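The difference between the two orderings above comes down to the priority of the sort keys. The sketch below illustrates this with hypothetical records and hypothetical `tissue_order` values. It sorts compartment and free annotation alphabetically as a stand-in, which may not match the ordering used in the dataset, and it assumes mix cell types have already been removed.

```python
# Hypothetical tissue-specific cell types; all values are illustrative only.
cell_types = [
    {"compartment_v1": "epithelial", "free_annotation_v1": "AT2 cell",
     "tissue": "Lung", "tissue_order": 2},
    {"compartment_v1": "immune", "free_annotation_v1": "B cell",
     "tissue": "Blood", "tissue_order": 1},
    {"compartment_v1": "immune", "free_annotation_v1": "B cell",
     "tissue": "Lung", "tissue_order": 2},
]


def rank(records, key_fields):
    """Return a 1-based rank per (tissue, annotation), sorted by key_fields."""
    ordered = sorted(records, key=lambda r: tuple(r[f] for f in key_fields))
    return {(r["tissue"], r["free_annotation_v1"]): i
            for i, r in enumerate(ordered, start=1)}


# Compartment first, then free annotation, then tissue.
by_compartment = rank(
    cell_types, ["compartment_v1", "free_annotation_v1", "tissue_order"])

# Tissue first, then compartment, then free annotation.
by_tissue = rank(
    cell_types, ["tissue_order", "compartment_v1", "free_annotation_v1"])
```

In this toy example the lung AT2 cell ranks first when compartment has priority, while the blood B cell ranks first when tissue has priority.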
MHC: counts for the major histocompatibility complex (MHC) genes, based on reannotation of the locus using expression data from the Tabula Microcebus (original locus annotation from NCBI’s Annotation Release 101). Counts are available only for cells sequenced by the 10x method and are NaN for cells sequenced by the smartseq2 method. Both raw counts and normalized counts (labeled with the prefix letter ‘n’) are provided.
MHC_C_I, MHC_NC_I, MHC_all_II: sum of counts from classical Class I genes, non-classical Class I genes, and all Class II genes, respectively.
nMHC_C_I, nMHC_NC_I, nMHC_all_II: sum of normalized counts from classical Class I genes, non-classical Class I genes, and all Class II genes, respectively.
counts and normalized counts from individual classical Class I genes (Mimu_168, Mimu_W03, Mimu_W04, Mimu_249, nMimu_168, nMimu_W03, nMimu_W04, nMimu_249), non-classical Class I genes (Mimu_180ps, Mimu_191, Mimu_202, Mimu_208, Mimu_218, Mimu_229ps, Mimu_239ps, nMimu_180ps, nMimu_191, nMimu_202, nMimu_208, nMimu_218, nMimu_229ps, nMimu_239ps), and Class II genes (Mimu_DMA, Mimu_DMB, Mimu_DPA, Mimu_DPB, Mimu_DQA, Mimu_DQB, Mimu_DRA, Mimu_DRB, nMimu_DMA, nMimu_DMB, nMimu_DPA, nMimu_DPB, nMimu_DQA, nMimu_DQB, nMimu_DRA, nMimu_DRB).
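The relationship between the per-gene MHC fields and the summary fields can be sketched as follows. This assumes that MHC_C_I, MHC_NC_I, and MHC_all_II are the sums over the classical Class I, non-classical Class I, and Class II genes listed above, and that the `method` field takes the values `10x` and `smartseq2`. The helper `mhc_sums` and the example cells are hypothetical.

```python
import math

CLASSICAL_I = ["Mimu_168", "Mimu_W03", "Mimu_W04", "Mimu_249"]
NONCLASSICAL_I = ["Mimu_180ps", "Mimu_191", "Mimu_202", "Mimu_208",
                  "Mimu_218", "Mimu_229ps", "Mimu_239ps"]
CLASS_II = ["Mimu_DMA", "Mimu_DMB", "Mimu_DPA", "Mimu_DPB",
            "Mimu_DQA", "Mimu_DQB", "Mimu_DRA", "Mimu_DRB"]

GROUPS = {"MHC_C_I": CLASSICAL_I,
          "MHC_NC_I": NONCLASSICAL_I,
          "MHC_all_II": CLASS_II}


def mhc_sums(cell, prefix=""):
    """Sum per-gene MHC counts by class; use prefix='n' for normalized counts.

    Cells not sequenced by 10x get NaN, since MHC counts exist only for 10x.
    """
    if cell["method"] != "10x":
        return {prefix + name: float("nan") for name in GROUPS}
    return {prefix + name: sum(cell[prefix + gene] for gene in genes)
            for name, genes in GROUPS.items()}


# Hypothetical 10x cell: every gene is zero except the few set below.
tenx_cell = {"method": "10x",
             **{g: 0 for g in CLASSICAL_I + NONCLASSICAL_I + CLASS_II}}
tenx_cell.update({"Mimu_168": 3, "Mimu_W03": 2, "Mimu_191": 1,
                  "Mimu_DRA": 5, "Mimu_DRB": 4})

# Hypothetical smartseq2 cell: no MHC counts are available.
ss2_cell = {"method": "smartseq2"}

tenx_result = mhc_sums(tenx_cell)
ss2_result = mhc_sums(ss2_cell)
```

Because NaN never compares equal to itself, test the smartseq2 values with `math.isnan` rather than `==`.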