Jingjing Jin, Peng Lu, Yalong Xu, Jiemeng Tao, Zefeng Li, Shuaibin Wang, Shizhou Yu, Chen Wang, Xiaodong Xie, Junping Gao, Qiansi Chen, Lin Wang, Wenxuan Pu, Peijian Cao.
Abstract
The advent of single-cell sequencing opened a new era in transcriptomic and genomic research. To understand cell composition in single-cell studies, a variety of cell markers have been widely used to label individual cell types. However, dedicated cell marker databases for the plant research community remain very limited. To address this gap, we developed the Plant Cell Marker DataBase (PCMDB, http://www.tobaccodb.org/pcmdb/), which is based on a uniform annotation pipeline. By manually curating over 130 000 research publications, we collected a total of 81 117 cell marker genes of 263 cell types in 22 tissues across six plant species. Tissue- and cell-specific expression patterns can be visualized using multiple tools: eFP Browser, Bar, and UMAP/t-SNE graphs. PCMDB also supports several analysis tools, including SCSA and SingleR, which allow users to annotate cell types. To provide information about plant species currently unsupported in PCMDB, potential marker genes for other plant species can be searched based on homology with the supported species. PCMDB is a user-friendly hierarchical platform that contains five built-in search engines. We believe PCMDB will constitute a useful resource for researchers working on cell type annotation and the prediction of the biological function of individual cells.
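The homology-based lookup described in the abstract can be illustrated with a minimal Python sketch. It is not PCMDB's implementation: the function name `transfer_markers`, the gene identifiers, and the input format (pre-computed homolog pairs from a sequence-similarity search run elsewhere) are all assumptions made for illustration.

```python
from collections import defaultdict


def transfer_markers(known_markers, homolog_pairs):
    """Propose marker candidates in an unsupported species via homology.

    known_markers: dict mapping a supported-species gene ID to a set of
                   cell types it marks (hypothetical structure).
    homolog_pairs: iterable of (query_gene, supported_gene) tuples, assumed
                   to come from a separate sequence-similarity search.
    Returns a dict mapping each query gene to a sorted list of cell types.
    """
    candidates = defaultdict(set)
    for query_gene, supported_gene in homolog_pairs:
        # Genes without a known marker annotation contribute nothing.
        for cell_type in known_markers.get(supported_gene, ()):
            candidates[query_gene].add(cell_type)
    return {gene: sorted(types) for gene, types in candidates.items()}


# Made-up identifiers and annotations, for illustration only.
known_markers = {
    "REF_GENE_1": {"root cap"},
    "REF_GENE_2": {"guard cell", "epidermis"},
}
homolog_pairs = [
    ("QRY_GENE_A", "REF_GENE_1"),
    ("QRY_GENE_B", "REF_GENE_2"),
    ("QRY_GENE_C", "REF_GENE_9"),  # homolog has no marker annotation
]
result = transfer_markers(known_markers, homolog_pairs)
```

In this sketch a query gene inherits every cell-type label of its annotated homologs, so the output is a list of candidates to be checked, not a confirmed set of markers.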
Year: 2022 PMID: 34718712 PMCID: PMC8728192 DOI: 10.1093/nar/gkab949
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. PCMDB data collection pipeline. Note: the table contains the number of marker genes for each source. The bar plot represents the number of experimental marker genes for the top 20 cell types identified in each plant species.
Figure 2. A schematic PCMDB workflow. (A) The browse page presents a hierarchical classification of cells and tissues. (B) A statistical graph of cell markers for Arabidopsis root cap based on experimental supporting evidence. (C) Search page presenting the different search engines. (D) Detailed information page (basic information, supporting evidence, eFP image) for the MRN1 (AT5G42600) gene of Arabidopsis. (E) Expression pattern from bulk RNA-seq for the MRN1 (AT5G42600) gene of Arabidopsis. (F) Cluster map from scRNA-seq data for the MRN1 (AT5G42600) gene of Arabidopsis.
Figure 3. PCMDB tools and number of potential marker genes for currently unsupported species. (A) The result of SCSA prediction using the default example data. Top left: the number of marker genes for each cluster of the input data. Top right: the number of clusters classified into each type (Good, Uncertain, and Unknown) by SCSA. Bottom left: Z-score of the cluster label for clusters classified as Good. Bottom right: Z-scores of the top two cluster labels for clusters classified as Uncertain. (B) Heatmap based on the score of each cluster label output by SingleR using the default example data. (C) The number of potential marker genes for currently unsupported species uncovered using homology search by Arabidopsis. From inner to outer: phylogenetic tree of 67 species (colors indicate clades based on the NCBI Taxonomy Browser), followed by the numbers of marker gene candidates identified from experimental markers, bulk RNA-seq markers, and scRNA-seq-related markers.
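The Good, Uncertain, and Unknown categories in panel (A) can be illustrated with a small Python sketch that sorts a cluster's candidate labels by Z-score. The decision rule is an assumption for illustration, not SCSA's documented criterion: a cluster is called Good when its top score is at least `ratio_threshold` times the second score, Uncertain otherwise, and Unknown when no label has a positive score. The function name `classify_cluster` and the example scores are hypothetical.

```python
def classify_cluster(scores, ratio_threshold=2.0):
    """Classify one cluster from its candidate-label Z-scores.

    scores: dict mapping a candidate cell-type label to a Z-score.
    ratio_threshold: assumed cutoff on top/second score (illustrative).
    Returns (category, labels), where labels holds the top label for
    "Good" and the top two labels for "Uncertain".
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # No candidates, or no positively scored candidate: nothing to report.
    if not ranked or ranked[0][1] <= 0:
        return "Unknown", []
    if len(ranked) == 1:
        return "Good", [ranked[0][0]]
    (top_label, top), (second_label, second) = ranked[0], ranked[1]
    # A non-positive runner-up cannot compete with a positive top score.
    if second <= 0 or top / second >= ratio_threshold:
        return "Good", [top_label]
    return "Uncertain", [top_label, second_label]


# Hypothetical scores for three clusters.
good = classify_cluster({"root cap": 8.0, "cortex": 2.0})
uncertain = classify_cluster({"root cap": 5.0, "cortex": 4.0})
unknown = classify_cluster({})
```

Returning the top two labels for Uncertain clusters mirrors the bottom-right panel, which plots the top two cluster labels for that category.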