Pablo Moreno1, Silvie Fexova1, Nancy George1, Jonathan R Manning1, Zhichiao Miao1, Suhaib Mohammed1, Alfonso Muñoz-Pomer1, Anja Fullgrabe1, Yalan Bi1, Natassja Bush1, Haider Iqbal1, Upendra Kumbham1, Andrey Solovyev1, Lingyun Zhao1, Ananth Prakash1, David García-Seisdedos1, Deepti J Kundu1, Shengbo Wang1, Mathias Walzer1, Laura Clarke1, David Osumi-Sutherland1, Marcela Karey Tello-Ruiz2, Sunita Kumari2, Doreen Ware2,3, Jana Eliasova4, Mark J Arends5, Martijn C Nawijn6, Kerstin Meyer4, Tony Burdett1, John Marioni1, Sarah Teichmann4, Juan Antonio Vizcaíno1, Alvis Brazma1, Irene Papatheodorou1.
Abstract
The EMBL-EBI Expression Atlas is an added-value knowledge base that enables researchers to answer the question of where (tissue, organism part, developmental stage, cell type) and under which conditions (disease, treatment, gender, etc.) a gene or protein of interest is expressed. Expression Atlas brings together data from >4500 expression studies from >65 different species, across different conditions and tissues. It makes these data freely available in an easy-to-visualise form, after expert curation to accurately represent the intended experimental design and re-analysis via standardised pipelines that rely on open-source, community-developed tools. Each study's metadata are annotated using ontologies. The data are re-analysed with the aim of reproducing the original conclusions of the underlying experiments. Expression Atlas is currently divided into Bulk Expression Atlas and Single Cell Expression Atlas. Bulk Expression Atlas contains data from differential studies (microarray and bulk RNA-Seq) and baseline studies (bulk RNA-Seq and proteomics), whereas Single Cell Expression Atlas is currently dedicated to Single Cell RNA-Sequencing (scRNA-Seq) studies. The resource has been in continuous development since 2009 and is available at https://www.ebi.ac.uk/gxa.
Year: 2022 PMID: 34850121 PMCID: PMC8728300 DOI: 10.1093/nar/gkab1030
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Top 15 most represented species in Expression Atlas, considering publicly available experiments across all technologies (RNA-Seq, Microarrays, Proteomics and Single Cell RNA-Seq), separated into differential and baseline studies. Together, these 15 species cover ∼94% of all studies. Varieties of Oryza sativa are counted separately; taken together, they would make up the second most represented plant species after Arabidopsis thaliana.
Figure 2. Top 10 most represented human organism parts in Single Cell Expression Atlas, by number of cells (left) and number of studies (right).
Figure 3. The 15 organism parts with the highest number of studies in Bulk Expression Atlas. These 15 organism parts, across different organisms, cover a total of 1997 studies, which represents ∼62% of all studies that have an organism part annotation.
Figure 4. Top 15 most represented species in Bulk Expression Atlas. These 15 species cover >95% of all studies in Bulk Expression Atlas, where >50% of the studies are either human or mouse studies. Counting all three varieties of rice (Oryza sativa) together, this species would be the second most represented plant species after Arabidopsis thaliana.
Figure 5. Proportion of studies loaded each year, broken down by technology, for Bulk Expression Atlas. Data for 2021 are incomplete because some loadings are still pending. Up to and including 2019, there was a clear trend towards fewer Microarray studies and more RNA-Seq studies being loaded.
Figure 6. Most represented human diseases in Expression Atlas (bulk RNA-Seq, Microarrays and Proteomics) by number of public studies available. These diseases cover ∼47% of all the studies that have a disease annotation (1095), out of a total of ∼685 different human diseases annotated across all Atlas studies. This total does not merge diseases annotated at different levels of granularity; for instance, lung cancer and lung adenocarcinoma are counted separately.
Figure 7. (A) The Single Cell Expression Atlas organ anatomogram for lung (an example is shown at https://www.ebi.ac.uk/gxa/sc/experiments/E-GEOD-130148/results/anatomogram), displaying marker genes for the different lung cell types. Hovering over specific sections of the heatmap gives more details about the gene's expression. When the user clicks on an active section of the lung anatomogram, the heatmap to the right changes to display only the cell types that exist within that part of the organ. (B) As the user moves into progressively more detailed views, they eventually reach a cellular view, where in this case type I and type II pneumocytes are shown.
Figure 8. New selectors for dimensionality reduction cell plots, where the user can choose between UMAP and t-SNE at different scales (plot options). By default, the landing page shows cell types as inferred by the authors of the study, if available (the field currently selected in ‘Colour plot by:’).