Qiang Li1, Jingjing Qi1, Xiujuan Qin1, Wanfu Dou1, Tiangang Lei1, Anhua Hu1, Ruirui Jia2, Guojin Jiang1, Xiuping Zou1, Qin Long1, Lanzhen Xu1, Aihong Peng1, Lixiao Yao1, Shanchun Chen1, Yongrui He1.
Abstract
Citrus is one of the most important commercial fruit crops worldwide. With the vast genomic data currently available for citrus, genetic relationships and molecular markers can be assessed to develop molecular breeding and genomic selection strategies. In this study, to ease access to these data, a web-based database, the citrus genomic variation database (CitGVD, http://citgvd.cric.cn/home), was developed as the first citrus-specific comprehensive database dedicated to genome-wide variations, including single nucleotide polymorphisms (SNPs) and insertions/deletions (INDELs). The current version (V1.0.0) of CitGVD is an open-access resource centered on 1,493,258,964 high-quality genomic variations and 84 phenotypes of 346 organisms curated from in-house projects and public resources. CitGVD integrates closely related information on genomic variation annotations, related gene annotations, and details regarding the organisms, and incorporates a variety of built-in tools for data access and analysis. For example, CitGWAS can be used for genome-wide association studies (GWASs) with SNPs and phenotypic data, while CitEVOL can be used for genetic structure analysis. These features make CitGVD a comprehensive web portal and bioinformatics platform for citrus-related studies. It also provides a model for analyzing genome-wide variations in a wide range of crop varieties.
Keywords: Genetic markers; Structural variation
Year: 2020 PMID: 32025315 PMCID: PMC6994598 DOI: 10.1038/s41438-019-0234-3
Source DB: PubMed Journal: Hortic Res ISSN: 2052-7276 Impact factor: 6.793
Fig. 1 Workflow highlighting the development of CitGVD.
a Data sources and pipelines to construct CitGVD. b Data processing procedures. This process included NGS data quality control, SNP calling and filtering.
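The filtering step named in panel b can be illustrated with a minimal sketch. It assumes variants have already been called and are held in memory as simple records; the `Variant` class, the `filter_variants` function, and the thresholds (`min_qual=30.0`, `max_missing=0.2`) are hypothetical choices for illustration and are not taken from the CitGVD pipeline.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    """Hypothetical in-memory record for one called SNP/INDEL."""
    chrom: str
    pos: int
    qual: float
    # One genotype call per sample; None marks a missing call.
    genotypes: List[Optional[str]]


def missing_rate(variant: Variant) -> float:
    """Fraction of samples with no genotype call at this site."""
    missing = sum(g is None for g in variant.genotypes)
    return missing / len(variant.genotypes)


def filter_variants(variants, min_qual=30.0, max_missing=0.2):
    """Keep variants that pass both thresholds (assumed values, not CitGVD's)."""
    return [
        v for v in variants
        if v.qual >= min_qual and missing_rate(v) <= max_missing
    ]


calls = [
    Variant("chr1", 101, 55.0, ["0/0", "0/1", "1/1", "0/1", "0/0"]),
    Variant("chr1", 250, 12.0, ["0/0", "0/1", "1/1", "0/1", "0/0"]),  # low quality
    Variant("chr2", 77, 60.0, ["0/0", None, None, "0/1", "0/0"]),     # 40% missing
]
kept = filter_variants(calls)
```

In practice such filters are usually applied to VCF files with dedicated tools; the sketch only shows the logic of thresholding on call quality and missing-genotype rate.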
Fig. 2 Screenshots of the gene browse function of CitGVD.
a Three query strategies (by reference genome, annotation source, or gene ID) can be used for data filtering. b Annotations from five sources, including KEGG, GO, Nr, KOG, and TrEMBL, can be retrieved in CitGVD. Gene information (c), gbrowse visualizations (d), and genomic, CDS, and peptide sequences (e) can be accessed via the cross-link on the gene ID. The annotation details of GO (f), Nr (g), and KEGG (h) can be accessed via the cross-links on the corresponding annotation IDs (b).
Fig. 3 Screenshots of the SNP/INDEL search functions of CitGVD.
a Three query strategies for SNP/INDEL searches. b A typical search result of the Multicriteria Search. c A typical search result of the Comparative Search. d Gbrowse visualization of a SNP. e Primer Design for a SNP. f Gene Search tool for a SNP/INDEL-related gene.
Fig. 4 Built-in pipelines.
a Parameters of CitTRAIT. b Output of CitTRAIT. Min, minimum value; Max, maximum value; Mean, mean value; SD, standard deviation; Med, median; CV, coefficient of variation. c Parameters of CitEVOL. CitEVOL runs on the built-in SNP data. In the calculation, the reference genome is determined first, and then the population structure, principal component analysis (PCA) results, and phylogenetic trees can be analyzed. d Phylogenetic tree data were assessed with CitEVOL and visualized with MEGA V7.2. e Parameters of CitGWAS. A mixed linear model (MLM) or a general linear model (GLM) can be used for GWASs. f Manhattan plot output from CitGWAS. g QQ plot output from CitGWAS.
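The summary statistics listed for panel b (Min, Max, Mean, SD, Med, CV) can be reproduced for any numeric trait with a short sketch. The function name `trait_summary` and the example values are hypothetical. The sketch assumes the sample standard deviation (n − 1 denominator) and defines CV as SD divided by the mean; the source does not state which conventions CitTRAIT uses.

```python
import statistics


def trait_summary(values):
    """Descriptive statistics for one phenotype across accessions.

    Assumes the sample standard deviation (n - 1 denominator) and
    CV = SD / Mean; CitTRAIT's own conventions are not documented here.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return {
        "Min": min(values),
        "Max": max(values),
        "Mean": mean,
        "SD": sd,
        "Med": statistics.median(values),
        # CV is undefined when the mean is zero.
        "CV": sd / mean if mean != 0 else float("nan"),
    }


# Hypothetical trait measurements for eight accessions.
summary = trait_summary([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

CV is often reported as a percentage; multiply by 100 if that convention is preferred.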