Our goal is to make sequence data rapidly and broadly available to the scientific community as a community resource. It is our intention to publish the work of this project in a timely fashion, and we welcome collaborative interaction on the project and analyses. However, considerable investment was made in generating these data and we ask that you respect rights of first publication and acknowledgment as outlined in the Toronto agreement (Toronto International Data Release Workshop Authors. Prepublication data sharing. Nature. 2009 Sep 10;461(7261):168-70). By accessing these data, you agree not to publish any articles containing analyses of genes, cell types or transcriptomic data on a whole atlas or tissue scale prior to initial publication by the Tabula Microcebus Consortium and its collaborating scientists. If you wish to make use of restricted data for publication or are interested in collaborating on the analyses of these data, please use the Contact Us link. Redistribution of these data should include the full text of the data use policy.
We provide on Figshare the cell-by-gene count data for the Tabula Microcebus mouse lemur scRNA-seq cell atlas in Python’s h5ad and Matlab’s mat formats, as well as scripts to export the files to R’s Seurat format. The data are organized as described below. To explore and reannotate the data interactively in the browser, view the Organs Tab.
The h5ad file contains the following groups:
A mat file of the complete lemur cell atlas dataset, converted from the h5ad file, is provided in the Figshare files. We also provide a Matlab script to convert an h5ad file to a mat file: download the h5ad file of interest, the Matlab script “LCA_h5ad2Mat.m”, and the Matlab function “read_csmatrix.m” to the same folder, then run “LCA_h5ad2Mat.m”.
We provide Python and R scripts to convert a Python h5ad file into an R Seurat file. Download the h5ad file of interest and the Python script “LCA_h5ad2csv.py” to the same folder, and run from the command line << python LCA_h5ad2csv.py -i input.h5ad -o output_folder -c layer_id >>, where input.h5ad is the h5ad file of interest, output_folder is the folder to which the csv files will be exported, and layer_id selects the gene count matrix to export. If layer_id = raw_counts, the raw counts are exported (adata.layers['raw_counts']); if layer_id = log, the log-transformed data are exported (adata.X). Then run “LCA_csv2seurat.R” in R to create a Seurat object from the csv files.
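The meaning of layer_id can be illustrated with a minimal Python sketch. It only mirrors the behavior described above and is not the code of “LCA_h5ad2csv.py”, which may be implemented differently. The function name select_matrix and the toy object are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace


def select_matrix(adata, layer_id):
    """Return the matrix that a given layer_id refers to.

    Assumption: this follows the description of the -c option only;
    the actual export script is not reproduced here.
    """
    if layer_id == "raw_counts":
        return adata.layers["raw_counts"]  # raw counts
    if layer_id == "log":
        return adata.X  # log-transformed values
    raise ValueError(f"unknown layer_id: {layer_id!r}")


# Toy stand-in for a loaded h5ad object (2 cells x 2 genes).
# The values are made up and are not taken from the atlas.
toy = SimpleNamespace(
    X=[[0.0, 0.69], [1.1, 0.0]],
    layers={"raw_counts": [[0, 1], [2, 0]]},
)

raw = select_matrix(toy, "raw_counts")
log = select_matrix(toy, "log")
```

With the real data, the object would typically come from the anndata package, for example anndata.read_h5ad("input.h5ad"), assuming that package is installed.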
The mat file contains a single variable named “rawData”, a Matlab structure variable with the following fields: