Announcing 4000+ Single-cell Datasets
Our partnership with Elucidata: Datasets from 10+ sources, 91 cell lines, 508 tissues, 712 diseases, and 106 drugs. Pre-harmonized, in consistent schemas, and annotated with ontology mappings.
There are more than 80TB of semi-structured, unstandardized datasets in the public domain, scattered across several databases, which makes them hard to search and interpret. The wide array of pre-processing steps applied to them makes harmonizing multiple datasets almost impossible without complex data science skills.
Elucidata has spent thousands of human hours tackling these problems. Using Polly, the company’s in-house data harmonization engine, Elucidata transforms terabytes of public datasets into a standardized tabular schema, giving biologists a single source of truth for biomolecular data.
Today we are excited to announce our partnership with them, which will allow any user of LatchBio to explore and use that data seamlessly.
We hope that together we can make it easy for scientists to take a data-focused approach to formulating and validating hypotheses and to integrating public data into their research.
From terabytes of public data scattered across different platforms, we now have one source of data:
Harmonized
Elucidata's curation models provide rich, harmonized metadata annotations with scientific context. Point-and-click filters allow scientists to find the datasets most relevant to their research.
Consistent Schemas
All single-cell datasets are processed (identifier mapping, normalization, quality checks) through a standard pipeline and made available in a consistent h5ad format. The data is readily usable for visualization and downstream analysis.
Ontology Mapping
Every dataset is mapped to Organism (NCBI Taxonomy), Cell type (Cell Ontology), Disease (MeSH: Medical Subject Headings), Cell Line (Cellosaurus), Tissue (BRENDA Tissue Ontology), and Drug (ChEBI: Chemical Entities of Biological Interest).
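To give a feel for why shared ontology terms make datasets easy to filter, here is a minimal Python sketch. It is an illustration only, not Elucidata's or Latch's actual API or schema: the field names, the DatasetRecord class, the filter_datasets helper, and the three example records are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# The six annotation categories named above. The key spellings are
# hypothetical and chosen only for this sketch.
ONTOLOGY_FIELDS = ("organism", "cell_type", "disease", "cell_line", "tissue", "drug")


@dataclass
class DatasetRecord:
    """A made-up metadata record: one set of ontology terms per category."""
    dataset_id: str
    annotations: Dict[str, Set[str]] = field(default_factory=dict)


def matches(record: DatasetRecord, **criteria: str) -> bool:
    """Return True if the record carries every requested term."""
    for key, wanted in criteria.items():
        if key not in ONTOLOGY_FIELDS:
            raise KeyError(f"unknown annotation field: {key}")
        if wanted not in record.annotations.get(key, set()):
            return False
    return True


def filter_datasets(records: List[DatasetRecord], **criteria: str) -> List[DatasetRecord]:
    """Keep only the records that satisfy all criteria, preserving order."""
    return [r for r in records if matches(r, **criteria)]


# Invented example records; the identifiers do not refer to real datasets.
records = [
    DatasetRecord("example_001", {"organism": {"Homo sapiens"},
                                  "tissue": {"lung"},
                                  "disease": {"Lung Neoplasms"}}),
    DatasetRecord("example_002", {"organism": {"Mus musculus"},
                                  "tissue": {"liver"}}),
    DatasetRecord("example_003", {"organism": {"Homo sapiens"},
                                  "tissue": {"lung"}}),
]

hits = filter_datasets(records, organism="Homo sapiens", tissue="lung")
print([r.dataset_id for r in hits])  # ['example_001', 'example_003']
```

Because every record uses the same vocabulary for each category, a query is an exact term match, with no need to reconcile spellings such as "human" versus "Homo sapiens" across sources.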
Explore Now!
You can explore it all now on console.latch.bio/datasets
This marks the beginning of our collaboration with Elucidata. We will continue adding many more datasets across a range of modalities. Do you have a dataset that you'd like to see on Latch? Let us know!