
Lung Gene Expression Analysis (LGEA): an integrative web portal for comprehensive gene expression data analysis in lung development.

Yina Du1, Joseph A Kitzmiller1, Anusha Sridharan1, Anne K Perl1, James P Bridges1, Ravi S Misra2, Gloria S Pryhuber2, Thomas J Mariani2, Soumyaroop Bhattacharya2, Minzhe Guo1, S Steven Potter3, Phillip Dexheimer4, Bruce Aronow4, Alan H Jobe1, Jeffrey A Whitsett1, Yan Xu1,4.   

Abstract

'LungGENS', our previously developed web tool for mapping single-cell gene expression in the developing lung, has been well received by the pulmonary research community. With continued support from the 'LungMAP' consortium, we extended the scope of the LungGENS database to accommodate transcriptomics data from pulmonary tissues and cells from human and mouse at different stages of lung development. The Lung Gene Expression Analysis (LGEA) web portal is an extended version of LungGENS useful for the analysis, display and interpretation of gene expression patterns obtained from single cells, sorted cell populations and whole lung tissues. The LGEA web portal is freely available at http://research.cchmc.org/pbge/lunggens/mainportal.html. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.


Keywords:  Airway Epithelium; Surfactant protein; Systemic disease and lungs; TTF-1


Year:  2017        PMID: 28070014      PMCID: PMC5520249          DOI: 10.1136/thoraxjnl-2016-209598

Source DB:  PubMed          Journal:  Thorax        ISSN: 0040-6376            Impact factor:   9.139


Introduction

We previously developed ‘LungGENS’,1 a web tool for mapping single-cell gene expression in the developing lung. LungGENS was visited by approximately 45 institutions in 30 countries during the past year. The initial phase of the LungGENS web tool was based on single-cell RNA sequencing (scRNA-seq) data from normal fetal mouse lung, analysed with our newly developed analytic pipeline ‘SINCERA’.2 With continued support from the ‘LungMAP’ consortium, transcriptomic data derived from various technical platforms, species and lung developmental stages have become available. Integrating and visualising these data sets through user-friendly web interfaces will help investigators access and interpret the data in the extended database to better understand lung development and disease. To accommodate heterogeneous data structures and types, we developed the Lung Gene Expression Analysis (LGEA) web portal, an extended version of LungGENS, which seeks to identify lung cell types and the dynamic changes in gene expression influencing lung formation and function using RNA-seq from single cells, purified cell populations and whole tissue.

Methods

The web pages and JavaScript functions of the LGEA web portal were designed and developed using HTML/CSS, JavaScript/jQuery and Java in Eclipse (http://www.eclipse.org/), a Java IDE. Apache Tomcat (http://tomcat.apache.org/) is used as the web server. JSON (JavaScript Object Notation) was adopted as a common interchange format across these programming languages to encode LGEA query results, making downstream data processing and exchange easy and language-independent. When a gene symbol or a cell type is chosen, the client initiates an HTTP request to the LGEA web server. A Java servlet on the web server handles the request, retrieves the data from the database using SQL scripts and prepares the retrieved data in JSON format. Finally, the processed HTTP response is returned to the client, which displays a page containing the query data. Oracle Database 11g (https://www.oracle.com/database/index.html) is the central component of the LGEA web portal, providing data storage and efficient database management. The relational database of the LGEA web portal follows the design of the previous LungGENS relational database, using gene symbols and their associated cell types as primary keys within the relational data tables. Interaction and visualisation in the LGEA web portal are supported by Highcharts (http://www.highcharts.com/), an interactive charting library. Highcharts is compatible with modern mobile and desktop browsers (eg, Safari, Firefox and Chrome). In addition to the interactive heatmaps, histograms, bar graphs and profile charts used to display gene expression data from individual cells, we implemented new graphical and statistical presentations in the LGEA web page design, including principal component analysis (PCA), scatter plots, box plots and Venn diagrams.
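The request flow described above (query in, SQL lookup, JSON out) can be sketched in a few lines. This is an illustration only, not the LGEA implementation: it uses Python and an in-memory SQLite database as stand-ins for the Java servlet and Oracle Database 11g, and the table name, column names, JSON field names and expression values are all hypothetical. The only detail taken from the text is that gene symbol and cell type together serve as the primary key.

```python
import json
import sqlite3


def build_demo_db():
    """Create an in-memory table keyed on (gene_symbol, cell_type).

    The schema and the values are invented for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE expression (
               gene_symbol TEXT NOT NULL,
               cell_type   TEXT NOT NULL,
               mean_expr   REAL NOT NULL,
               PRIMARY KEY (gene_symbol, cell_type)
           )"""
    )
    rows = [
        ("Sftpc", "AT2", 950.0),
        ("Sftpc", "Endothelial", 1.5),
        ("Pecam1", "Endothelial", 210.0),
    ]
    conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", rows)
    return conn


def handle_gene_query(conn, gene_symbol):
    """Look up one gene and return the result encoded as a JSON string."""
    cur = conn.execute(
        "SELECT cell_type, mean_expr FROM expression "
        "WHERE gene_symbol = ? ORDER BY cell_type",
        (gene_symbol,),  # parameter binding avoids SQL injection
    )
    payload = {
        "gene": gene_symbol,
        "expression": [{"cellType": c, "mean": m} for c, m in cur.fetchall()],
    }
    return json.dumps(payload)


if __name__ == "__main__":
    print(handle_gene_query(build_demo_db(), "Sftpc"))
```

Because the response is plain JSON, the same payload could be consumed by a browser charting library or by a script without any change on the server side.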

Results

Lung development is a highly regulated and coordinated process typified by stage-specific changes in structure and function including branching morphogenesis, angiogenesis, sacculation, alveologenesis and cytodifferentiation.3 In mice, formation and maturation of the gas exchange region of the lung begins at approximately embryonic day 15 (E15) and ends at postnatal day 30 (PN30). In addition to the mouse lung E16.5 single-cell RNA-seq data previously published in LungGENS, the LGEA database has been extended to include single-cell, sorted-cell and developmental time course data from whole lung tissues from E16.5 to PN28 and adults. The database is synchronised with ongoing studies from the research centres of the ‘LungMAP’ consortium. The LGEA web portal provides three major types of analysis using the extended database: (1) single-cell transcriptome analysis using ‘LungGENS’, (2) sorted lung cell population analysis using ‘LungSortedCells’ and (3) lung developmental time course analysis using ‘LungDTC’, as depicted in figure 1A.
Figure 1

(A) The home page of the Lung Gene Expression Analysis (LGEA) web portal provides access to data and to query results. Two integrative analytical tools ‘Gene At Glance’ (B) and ‘SigComparison’ (C) in LGEA are shown. LungSortedCells and LungDTC query functions are shown (D and E).


LungGENS

The initial release of LungGENS was hosted using scRNA-seq data obtained from fetal mouse lung at E16.5 (148 cells). The current version of LungGENS database contains additional cells sequenced from E16.5 and E18.5 mouse lung, processed using Fluidigm C1 microfluidics technology. ‘Gene Query’ and ‘Cell Type Query’ retrieve data from the expanded database to provide cell-specific gene expression patterns for each lung cell type and associated gene signatures, surface markers and transcription factors for cell types of interest. ‘Gene list query’ has been expanded to all data sets in the LGEA web portal. Users can input a list of gene symbols and retrieve predicted cell types co-expressing their gene list of interest.

LungSortedCells

The LungSortedCells database includes fluorescence-activated cell sorting (FACS) sorted cell populations enriched for endothelial, mesenchymal, immune and epithelial cells from human lung (processed by the Human Tissue Core (HTC) at the University of Rochester, supported by the LungMAP consortium) at day 1 and 20 months; sorted mouse alveolar type 2 cells at PN7 and PN28; sorted mouse mesenchymal, immune and epithelial cells at PN7 and PN28; and Pdgfra-expressing fibroblasts at E16.5, E18.5, PN7 and PN28 (processed by the ‘CCHMC’ Mouse Hub, supported by the LungMAP consortium). ‘Gene Query’ allows users to input a gene symbol of interest. The query output uses a bar graph to display the gene's expression levels across all cell types and a monocolor heatmap to provide an overview of the expression of the queried gene across all data sets in the LGEA database (figure 1D). ‘Cell type query’ identifies a list of signature genes for a given cell type, displays them using an interactive heatmap and provides a downloadable data table for the query. Transcription factors, cell surface markers and signature genes identified from scRNA-seq analysis are also listed in tabular form as a cross-reference (figure 1D). Gene symbols in the data tables open a pop-up query panel, enabling users to redirect the query gene to any data set within the LGEA database (figure 1D). Signature genes are identified using the following criteria: (1) gene A is expressed in cell B with an expression level >0.6 quantile in the whole-genome distribution; (2) gene A is expressed in cell B at least fivefold higher than the average expression of gene A in all other cell types; (3) gene A is most highly expressed in cell B, with at least 1.5-fold higher expression than the cell type expressing the next highest level of gene A; and (4) the coefficient of variation of gene A in cell B among biological replicates is <0.5.
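The four signature-gene criteria can be expressed as a short filter. The sketch below is one possible reading of them, not the authors' code. It makes several assumptions: the input is a hypothetical nested dictionary of replicate values per gene and cell type; the 'whole-genome distribution' is taken to be the mean expression of every gene within the target cell type; the quantile uses linear interpolation; and the coefficient of variation uses the sample standard deviation over at least two replicates.

```python
import statistics


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list (0 <= q <= 1)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def is_signature_gene(gene, cell, replicates):
    """Apply criteria (1)-(4) to one gene in one cell type.

    replicates: {gene: {cell_type: [replicate expression values]}}
    (a hypothetical input layout chosen for this sketch).
    """
    means = {g: {c: statistics.mean(v) for c, v in by_cell.items()}
             for g, by_cell in replicates.items()}
    target = means[gene][cell]
    others = [m for c, m in means[gene].items() if c != cell]
    if not others or target <= 0:
        return False
    # (1) above the 0.6 quantile of all genes' mean expression in this cell type
    genome = [by_cell[cell] for by_cell in means.values() if cell in by_cell]
    if target <= quantile(genome, 0.6):
        return False
    # (2) at least fivefold over the average of all other cell types
    if target < 5 * statistics.mean(others):
        return False
    # (3) at least 1.5-fold over the next-highest cell type
    if target < 1.5 * max(others):
        return False
    # (4) coefficient of variation among biological replicates below 0.5
    reps = replicates[gene][cell]
    if len(reps) < 2:
        return False
    return statistics.stdev(reps) / target < 0.5
```

Checking the cheap thresholds first and the replicate variability last keeps the function easy to read against the numbered criteria; the order does not affect the result.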

LungDTC

We collected Developmental Time Course (DTC) data sets from whole mouse lung RNA microarray experiments in three mouse strains (E15.5 to PN30),3 4 whole mouse lung RNA-seq at E16.5, E18.5, PN1, PN3, PN7, PN14 and PN28 (processed by the ‘CCHMC’ Mouse Hub) and whole Rhesus macaque lung RNA-seq (GA100, 130 and 150). We present a combination of PCA, scatter charts, line charts and downloadable differentially expressed gene tables to display dynamic gene expression patterns across important developmental time periods (figure 1E). Users can compare the expression data from distinct mouse strains and different technical platforms, including RNA microarray and RNA-seq. Dynamic profile patterns are displayed in line charts and downloadable gene tables (figure 1E), from which users can explore the expression profile of an individual gene and redirect sets of genes sharing similar expression patterns to ToppGene (https://toppgene.cchmc.org/enrichment.jsp) for gene set enrichment analysis.

LGEA Tools

To facilitate comparison and integration analyses, we developed new tools including ‘Gene at a Glance’ (figure 1B) and Signature Comparison (‘SigComparison’) (figure 1C). ‘Gene at a Glance’ enables users to input any gene of interest and displays expression information for that gene across developmental times and conditions within the LGEA database (figure 1B). ‘SigComparison’ compares signature genes between two experimental conditions within the LGEA database, displays the results using a Venn diagram and calculates the correlation of the overlapping data sets. Alternatively, users can input and compare their gene lists with the signature genes identified for specific cell types in the LGEA database or compare two gene lists independently of the LGEA database (figure 1C). In addition to hosting analytic tools developed for LungMAP, LGEA provides URL links to >60 commonly used internal and external resources. For example, by clicking the tab for ‘Lung Image’ on the LGEA homepage, users are redirected to the Lung Image web collection (https://research.cchmc.org/lungimage/) hosted by Dr Whitsett's laboratory. The Lung Image gallery contains >2000 immunofluorescence confocal microscopy images obtained from embryonic (E16.5 and E18.5) and postnatal (PN1, PN3, PN7, PN10, PN14 and PN28) mouse lungs, with protein markers representing major pulmonary cell types. The gallery also contains >1000 images of postnatal human lung from 4 months to 4 years of age. A link between each protein marker and available single-cell RNA-seq data in LungGENS is provided.
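The kind of comparison that ‘SigComparison’ reports can be approximated with set operations. The following is a minimal sketch under stated assumptions rather than the tool's actual logic: gene symbols are matched exactly, the three returned groups correspond to the regions of a two-set Venn diagram, and the correlation is assumed to be a Pearson coefficient over the expression values of the shared genes (the text does not say which correlation measure is used). All function names and sample values are hypothetical.

```python
import math


def venn_regions(list_a, list_b):
    """Return the three regions of a two-set Venn diagram as sorted lists."""
    a, b = set(list_a), set(list_b)
    return {
        "only_a": sorted(a - b),
        "shared": sorted(a & b),
        "only_b": sorted(b - a),
    }


def pearson(xs, ys):
    """Pearson correlation; returns None when either input has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return None
    return cov / (sd_x * sd_y)


def shared_correlation(expr_a, expr_b):
    """Correlate expression values over genes present in both signatures.

    expr_a, expr_b: {gene_symbol: expression value} for each condition.
    """
    shared = sorted(set(expr_a) & set(expr_b))
    if len(shared) < 2:
        return None  # a correlation needs at least two points
    return pearson([expr_a[g] for g in shared], [expr_b[g] for g in shared])
```

Exact matching means that mouse and human symbols differing only in letter case (for example Sftpc and SFTPC) would not be treated as the same gene; a cross-species comparison would need an explicit ortholog mapping first.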

Limitations and future directions

The first phase of LGEA aims to provide user-friendly tools that give the lung research community quick and easy access to transcriptomic data. Several areas still need further development to improve functionality and to support different levels of data analysis. Some of the limitations are described below. (1) Currently, LGEA only covers transcriptomic data from normal mouse and human lung. Other omics data types, including proteomic, metabolomic and lipidomic data, and data related to lung diseases are not yet included. We are actively working to expand the LGEA database to include single-cell data from idiopathic pulmonary fibrosis, cystic fibrosis and other chronic lung diseases. (2) The current version of LGEA contains single-cell data processed using Fluidigm C1 microfluidics technology, which limits the number of cells captured and, in turn, the power of statistical analysis. At present, RNA data are being produced from thousands of individual lung cells using ‘Drop-seq’.5 We are actively working on an analytic pipeline to facilitate the complex data mining of the ‘Drop-seq’ RNA-sequencing data. LungGENS will be expanded to include single-cell RNA-seq data from this new platform. Transcriptomic data from increasing numbers of single cells will increase the statistical power available for cell-type characterisation and signature gene identification, enabling identification of rare or novel cell types. (3) Current LGEA web queries are performed on data for one gene/cell at a time or for gene lists containing <500 genes at a time. (4) The current version of LGEA query only accepts official gene symbols from annotated human and mouse genomes. Since many online tools are available for gene ID conversion, including Biomart (http://central.biomart.org/), DAVID (http://david.abcc.ncifcrf.gov/) and biological DataBase network (http://biodbnet.abcc.ncifcrf.gov/db/db2db.php), we recommend that users convert other types of IDs to official gene symbols before using LGEA. (5) The current version of LGEA does not provide customised application program interfaces (API) for programmatic data access. Nevertheless, at present, a portion of the LGEA functions can be accessed directly from a programming language such as R, Python or Java, using appropriate network (HTTP) client-side APIs, since the query results are encoded in JSON format.
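Limitations (3) and (5) together suggest how a script might talk to a JSON-over-HTTP service of this kind. The sketch below uses only the Python standard library. Every service-specific detail in it is an assumption: the base URL is a placeholder and not a real LGEA endpoint, the `genes` query parameter is invented, and the response layout is unknown. Only the batch limit (fewer than 500 genes per query) and the use of HTTP with JSON come from the text.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical placeholder; the text does not document any LGEA query URL.
BASE_URL = "http://example.org/lgea/query"
# The text states that gene lists must contain <500 genes per query.
MAX_GENES = 499


def batches(genes, size=MAX_GENES):
    """Split a gene list into chunks that respect the per-query limit."""
    return [genes[i:i + size] for i in range(0, len(genes), size)]


def build_url(genes):
    """Encode one batch as a query string ('genes' is an assumed parameter)."""
    return BASE_URL + "?" + urlencode({"genes": ",".join(genes)})


def fetch_json(url, timeout=30):
    """Send an HTTP GET request and decode a JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def query_all(genes):
    """Query every batch in turn and collect the decoded responses."""
    return [fetch_json(build_url(batch)) for batch in batches(genes)]
```

Splitting long lists on the client side keeps each request within the stated limit; the gene symbols passed in are assumed to have already been converted to official symbols.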

Conclusions

The new LGEA web portal implements new features and analytical methods and provides an extended database enabling rapid analysis of (1) scRNA-seq data using ‘LungGENS’, (2) sorted lung cell populations using ‘LungSortedCells’ and (3) lung developmental time course data using ‘LungDTC’. LGEA provides useful graphical interfaces with new interactive options for using increasingly comprehensive RNA expression data sets. The LGEA web portal database will be extended to new data generated from normal and abnormal lung tissues and cells from additional species, developmental times and experimental protocols. LGEA will be broadly applicable for lung research and is freely available at http://research.cchmc.org/pbge/lunggens/mainportal.html and at the LungMAP research consortium website (http://www.lungmap.net/), supported by the National Heart, Lung, and Blood Institute (NHLBI).
  5 in total

1.  'LungGENS': a web-based tool for mapping single-cell gene expression in the developing lung.

Authors:  Yina Du; Minzhe Guo; Jeffrey A Whitsett; Yan Xu
Journal:  Thorax       Date:  2015-06-30       Impact factor: 9.139

2.  SINCERA: A Pipeline for Single-Cell RNA-Seq Profiling Analysis.

Authors:  Minzhe Guo; Hui Wang; S Steven Potter; Jeffrey A Whitsett; Yan Xu
Journal:  PLoS Comput Biol       Date:  2015-11-24       Impact factor: 4.475

3.  Expression profiling of the developing mouse lung: insights into the establishment of the extracellular matrix.

Authors:  Thomas J Mariani; Jeremy J Reed; Steven D Shapiro
Journal:  Am J Respir Cell Mol Biol       Date:  2002-05       Impact factor: 6.914

4.  Transcriptional programs controlling perinatal lung maturation.

Authors:  Yan Xu; Yanhua Wang; Valérie Besnard; Machiko Ikegami; Susan E Wert; Caleb Heffner; Stephen A Murray; Leah Rae Donahue; Jeffrey A Whitsett
Journal:  PLoS One       Date:  2012-08-20       Impact factor: 3.240

5.  Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets.

Authors:  Evan Z Macosko; Anindita Basu; Rahul Satija; James Nemesh; Karthik Shekhar; Melissa Goldman; Itay Tirosh; Allison R Bialas; Nolan Kamitaki; Emily M Martersteck; John J Trombetta; David A Weitz; Joshua R Sanes; Alex K Shalek; Aviv Regev; Steven A McCarroll
Journal:  Cell       Date:  2015-05-21       Impact factor: 41.582

  67 in total

Review 1.  The Pediatric Cell Atlas: Defining the Growth Phase of Human Development at Single-Cell Resolution.

Authors:  Deanne M Taylor; Bruce J Aronow; Kai Tan; Kathrin Bernt; Nathan Salomonis; Casey S Greene; Alina Frolova; Sarah E Henrickson; Andrew Wells; Liming Pei; Jyoti K Jaiswal; Jeffrey Whitsett; Kathryn E Hamilton; Sonya A MacParland; Judith Kelsen; Robert O Heuckeroth; S Steven Potter; Laura A Vella; Natalie A Terry; Louis R Ghanem; Benjamin C Kennedy; Ingo Helbig; Kathleen E Sullivan; Leslie Castelo-Soccio; Arnold Kreigstein; Florian Herse; Martijn C Nawijn; Gerard H Koppelman; Melissa Haendel; Nomi L Harris; Jo Lynne Rokita; Yuanchao Zhang; Aviv Regev; Orit Rozenblatt-Rosen; Jennifer E Rood; Timothy L Tickle; Roser Vento-Tormo; Saif Alimohamed; Monkol Lek; Jessica C Mar; Kathleen M Loomes; David M Barrett; Prech Uapinyoying; Alan H Beggs; Pankaj B Agrawal; Yi-Wen Chen; Amanda B Muir; Lana X Garmire; Scott B Snapper; Javad Nazarian; Steven H Seeholzer; Hossein Fazelinia; Larry N Singh; Robert B Faryabi; Pichai Raman; Noor Dawany; Hongbo Michael Xie; Batsal Devkota; Sharon J Diskin; Stewart A Anderson; Eric F Rappaport; William Peranteau; Kathryn A Wikenheiser-Brokamp; Sarah Teichmann; Douglas Wallace; Tao Peng; Yang-Yang Ding; Man S Kim; Yi Xing; Sek Won Kong; Carsten G Bönnemann; Kenneth D Mandl; Peter S White
Journal:  Dev Cell       Date:  2019-03-28       Impact factor: 12.270

2.  Transcriptional characterisation of human lung cells identifies novel mesenchymal lineage markers.

Authors:  Soula Danopoulos; Soumyaroop Bhattacharya; Thomas J Mariani; Denise Al Alam
Journal:  Eur Respir J       Date:  2020-01-23       Impact factor: 16.671

3.  Pre- and postnatal exposure of mice to concentrated urban PM2.5 decreases the number of alveoli and leads to altered lung function at an early stage of life.

Authors:  Thais de Barros Mendes Lopes; Espen E Groth; Mariana Veras; Tatiane K Furuya; Natalia de Souza Xavier Costa; Gabriel Ribeiro Júnior; Fernanda Degobbi Lopes; Francine M de Almeida; Wellington V Cardoso; Paulo Hilario Nascimento Saldiva; Roger Chammas; Thais Mauad
Journal:  Environ Pollut       Date:  2018-06-05       Impact factor: 8.071

4.  Insulin-like Growth Factor 1 Supports a Pulmonary Niche that Promotes Type 3 Innate Lymphoid Cell Development in Newborn Lungs.

Authors:  Katherine Oherle; Elizabeth Acker; Madeline Bonfield; Timothy Wang; Jerilyn Gray; Ian Lang; James Bridges; Ian Lewkowich; Yan Xu; Shawn Ahlfeld; William Zacharias; Theresa Alenghat; Hitesh Deshmukh
Journal:  Immunity       Date:  2020-02-18       Impact factor: 31.745

5.  Integrating multiomics longitudinal data to reconstruct networks underlying lung development.

Authors:  Jun Ding; Farida Ahangari; Celia R Espinoza; Divya Chhabra; Teodora Nicola; Xiting Yan; Charitharth V Lal; James S Hagood; Naftali Kaminski; Ziv Bar-Joseph; Namasivayam Ambalavanan
Journal:  Am J Physiol Lung Cell Mol Physiol       Date:  2019-08-21       Impact factor: 5.464

6.  Temporal, spatial, and phenotypical changes of PDGFRα expressing fibroblasts during late lung development.

Authors:  Mehari Endale; Shawn Ahlfeld; Erik Bao; Xiaoting Chen; Jenna Green; Zach Bess; Matthew T Weirauch; Yan Xu; Anne Karina Perl
Journal:  Dev Biol       Date:  2017-04-11       Impact factor: 3.582

7.  Understanding Interstitial Lung Disease: It's in the Mucus.

Authors:  Burton F Dickey; Jeffrey A Whitsett
Journal:  Am J Respir Cell Mol Biol       Date:  2017-07       Impact factor: 6.914

Review 8.  Building and Regenerating the Lung Cell by Cell.

Authors:  Jeffrey A Whitsett; Tanya V Kalin; Yan Xu; Vladimir V Kalinichenko
Journal:  Physiol Rev       Date:  2019-01-01       Impact factor: 37.312

9.  Reply to D'Alessandro-Gabazza et al.: Risks of Treating Idiopathic Pulmonary Fibrosis with a TAM Receptor Kinase Inhibitor.

Authors:  Milena S Espindola; David M Habiel; Cory M Hogaboam
Journal:  Am J Respir Crit Care Med       Date:  2018-10-01       Impact factor: 21.405

10.  Role for Cela1 in Postnatal Lung Remodeling and Alpha-1 Antitrypsin-Deficient Emphysema.

Authors:  Rashika Joshi; Andrea Heinz; Qiang Fan; Shuling Guo; Brett Monia; Christian E H Schmelzer; Anthony S Weiss; Matthew Batie; Harikrishnan Parameshwaran; Brian M Varisco
Journal:  Am J Respir Cell Mol Biol       Date:  2018-08       Impact factor: 6.914
