TGIF-DB: terse genomics interface for developing botany.

Daisuke Tsugama, Tetsuo Takano.

Abstract

OBJECTIVES: Pearl millet (Pennisetum glaucum) is a staple cereal crop for semi-arid regions. Its whole genome sequence and deduced putative gene sequences are available. However, the functions of many pearl millet genes are unknown. Situations are similar for other crop species such as garden asparagus (Asparagus officinalis), chickpea (Cicer arietinum) and Tartary buckwheat (Fagopyrum tataricum). The objective of the data presented here was to improve functional annotations of genes of pearl millet, garden asparagus, chickpea and Tartary buckwheat with gene annotations of model plants, to systematically provide such annotations as well as their sequences on a website, and thereby to promote genomics for those crops.

DATA DESCRIPTION: Sequences of genomes and transcripts of pearl millet, garden asparagus, chickpea and Tartary buckwheat were downloaded from a public database. These transcripts were associated with functional annotations of their Arabidopsis thaliana and rice (Oryza sativa) counterparts identified by BLASTX. Conserved domains in protein sequences of those species were identified by the HMMER scan with the Pfam database. The resulting data was deposited in the figshare repository and can be browsed on the Terse Genomics Interface for Developing Botany (TGIF-DB) website (http://webpark2116.sakura.ne.jp/rlgpr/).

Keywords:  Arabidopsis; Chickpea; Garden asparagus; Genomics; Pearl millet; Plant; Rice; Tartary buckwheat

Year:  2021        PMID: 33985559      PMCID: PMC8120730          DOI: 10.1186/s13104-021-05599-4

Source DB:  PubMed          Journal:  BMC Res Notes        ISSN: 1756-0500


Objective

Pearl millet (Pennisetum glaucum) is a staple cereal crop in semi-arid regions. Its whole genome has been sequenced and putative gene sequences have been deduced [1]. The functions of some pearl millet genes have been either examined experimentally or predicted from their homology to specific, targeted sets of genes with known functions (for example, [2, 3]). However, functional annotations of pearl millet genes are neither sufficient nor systematic. The situation is similar for many other plant species, such as garden asparagus (Asparagus officinalis), chickpea (Cicer arietinum) and Tartary buckwheat (Fagopyrum tataricum) (see [4–6], respectively, for analyses of their genomes). Arabidopsis thaliana and rice (Oryza sativa) are dicot and monocot model species, respectively, and their genes have better functional annotations (for example, [7, 8]). The objective of the data presented here was to improve the functional annotations of pearl millet genes by systematic homology searches against databases of Arabidopsis genes, rice genes and conserved protein domains, to develop a platform for browsing the resulting data, and thereby to promote pearl millet genomics.

Data description

The whole genome sequences of pearl millet, garden asparagus, chickpea and Tartary buckwheat, the transcript (or protein-coding) sequences deduced from those genome sequences, and the genome annotation files in the general feature format (GFF) were downloaded from the International Pearl Millet Genome Sequencing Consortium (IPMGSC) website [9], the Asparagus Genome Project website [10], the National Center for Biotechnology Information (NCBI) Chickpea Genome website (Genome ID 2992) [11] and the MBKBASE Tartary Buckwheat Genome Project website [12], respectively. The sequences and functional annotations of Arabidopsis proteins (TAIR10 versions) were downloaded from The Arabidopsis Information Resource (TAIR) website [7], and those of rice proteins (RGAP 7 versions) were downloaded from the Rice Genome Annotation Project (RGAP) website [8]. BLASTX in the BLAST+ suite [13] was run with the transcript sequences of the crop species as queries and with either the Arabidopsis protein sequences or the rice protein sequences as the database. The threshold E-value for this analysis was set to 1e-20, which is more stringent than the default value (10.0). The transcripts (or genes) of the crop species were then associated with the functional annotations of the corresponding Arabidopsis and rice proteins identified by the BLASTX search. Protein sequences of pearl millet, garden asparagus, chickpea and Tartary buckwheat were deduced from their transcript sequences, and the Pfam database [14] was searched with the hmmscan program of HMMER (version 3.3) [15] to identify conserved domains in those proteins. The threshold E-value for this analysis was set to 1e-5, which is more stringent than the default value (10). For each gene, a genomic locus sequence, which consists of exons and introns, and a promoter sequence, which is the 3-kb sequence upstream of the start codon, were extracted on the basis of the whole genome sequences and the GFF files.
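The promoter definition above (the 3-kb sequence upstream of the start codon) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the handling of minus-strand genes by reverse complementation, and the truncation near chromosome ends are assumptions made for illustration. Coordinates are taken to be 1-based and inclusive, as in GFF files.

```python
# Illustrative sketch only; names and edge-case handling are assumptions.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_promoter(chrom_seq: str, cds_start: int, cds_end: int,
                     strand: str, length: int = 3000) -> str:
    """Return up to `length` bases upstream of the start codon.

    cds_start and cds_end are the 1-based, inclusive coordinates of the
    coding region on the chromosome (GFF convention). On the '+' strand the
    start codon begins at cds_start. On the '-' strand it begins at cds_end,
    so the upstream region lies at higher coordinates and is
    reverse-complemented. The result is shorter than `length` when the gene
    sits close to a chromosome end.
    """
    if strand == "+":
        begin = max(0, cds_start - 1 - length)  # 0-based slice start
        return chrom_seq[begin:cds_start - 1]
    if strand == "-":
        end = min(len(chrom_seq), cds_end + length)
        return reverse_complement(chrom_seq[cds_end:end])
    raise ValueError(f"unknown strand: {strand!r}")
```

With the default `length=3000`, the function returns the 3-kb window described in the text; a smaller value is convenient for checking the coordinate arithmetic on toy sequences.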
The resulting data for the gene sequences and their functional annotations for pearl millet, garden asparagus, chickpea and Tartary buckwheat were deposited in the figshare repository (Data sets 1–34 in Table 1) [16]. A website, Terse Genomics Interface for Developing Botany (TGIF-DB), was developed to browse these data [17] (see Data file 1 in Table 1 for a TGIF-DB interface). The programs in the BLAST+ suite [13] and the genome browser JBrowse [18] were included as a part of TGIF-DB.
Table 1. Overview of data files/data sets

Label | Name of data file/data set | File types (file extension) | Data repository and identifier (DOI or accession number)
Data set 1 | Ao_gene_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 2 | Ao_promoter_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 3 | Ao_protein_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 4 | Ao_proteins_At_proteins_blastp.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 5 | Ao_proteins_ELM.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 6 | Ao_proteins_Os_proteins_blastp.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 7 | Ao_proteins_Pfam.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 8 | Ao_transcript_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 9 | Ao_transcripts_hairpin_miRNA_blastn.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 10 | Ca_gene_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 11 | Ca_promoter_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 12 | Ca_protein_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 13 | Ca_proteins_At_proteins_blastp.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 14 | Ca_proteins_ELM.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 15 | Ca_proteins_Os_proteins_blastp.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 16 | Ca_proteins_Pfam.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 17 | Ca_transcript_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 18 | Ft_gene_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 19 | Ft_promoter_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 20 | Ft_protein_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 21 | Ft_proteins_At_proteins_blastp.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 22 | Ft_cds_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 23 | Ft_proteins_Os_proteins_blastp.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 24 | Ft_proteins_Pfam.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 25 | Ft_transcript_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 26 | Pg_gene_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 27 | Pg_promoter_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 28 | Pg_protein_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 29 | Pg_proteins_At_proteins_blastp.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 30 | Pg_proteins_ELM.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 31 | Pg_proteins_Os_proteins_blastp.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 32 | Pg_proteins_Pfam.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 33 | Pg_transcript_seqs.fa | Fasta (.fa) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data set 34 | Pg_transcripts_hairpin_miRNA_blastn.txt | Text (.txt) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
Data file 1 | Figure_TGIF-DB_snapshots.tif | Figure (.tif) file | Figshare (https://doi.org/10.6084/m9.figshare.13565168.v2) [16]
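The annotation step in the Data description (keeping BLAST hits at or below the E-value threshold of 1e-20 and attaching the annotation of the matched Arabidopsis or rice protein) can be sketched in Python as follows. This is a hedged illustration, not the authors' pipeline. It assumes standard BLAST+ tabular output (`-outfmt 6`, with the query ID, subject ID and E-value in columns 1, 2 and 11), which may differ from the layout of the deposited `.txt` files. Keeping only the single best hit per query is also an assumption, and the function names are hypothetical.

```python
import csv
import io

E_VALUE_THRESHOLD = 1e-20  # threshold stated in the Data description


def best_hits(blast_tabular: str, max_evalue: float = E_VALUE_THRESHOLD) -> dict:
    """Map each query ID to (subject ID, E-value) for its lowest-E-value hit.

    Assumes BLAST+ tabular output (-outfmt 6) with the default 12 columns.
    Hits with an E-value above max_evalue are discarded.
    """
    best = {}
    for row in csv.reader(io.StringIO(blast_tabular), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


def annotate(hits: dict, annotations: dict) -> dict:
    """Attach the annotation of each best-hit subject to its query.

    `annotations` maps model-plant protein IDs to description strings;
    subjects without an entry receive an empty string.
    """
    return {query: (subject, annotations.get(subject, ""))
            for query, (subject, _evalue) in hits.items()}
```

In practice the threshold can also be applied at search time with the BLAST+ `-evalue` option, in which case the filter here is redundant but harmless.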

Limitations

Some proteins of the species used do not appear to have conserved domains and/or any close homolog in either Arabidopsis or rice. Some pearl millet genes have been characterized by targeted analyses (for example, [2, 3]), but information from such analyses has not been included in the data set presented.
  12 in total

1.  Pearl millet stress-responsive NAC transcription factor PgNAC21 enhances salinity stress tolerance in Arabidopsis.

Authors:  Harshraj Shinde; Ambika Dudhate; Daisuke Tsugama; Shashi K Gupta; Shenkui Liu; Tetsuo Takano
Journal:  Plant Physiol Biochem       Date:  2018-11-10       Impact factor: 4.270

2.  The Arabidopsis information resource: Making and mining the "gold standard" annotated reference plant genome.

Authors:  Tanya Z Berardini; Leonore Reiser; Donghui Li; Yarik Mezheritsky; Robert Muller; Emily Strait; Eva Huala
Journal:  Genesis       Date:  2015-08-04       Impact factor: 2.487

3.  The Tartary Buckwheat Genome Provides Insights into Rutin Biosynthesis and Abiotic Stress Tolerance.

Authors:  Lijun Zhang; Xiuxiu Li; Bin Ma; Qiang Gao; Huilong Du; Yuanhuai Han; Yan Li; Yinghao Cao; Ming Qi; Yaxin Zhu; Hongwei Lu; Mingchuan Ma; Longlong Liu; Jianping Zhou; Chenghu Nan; Yongjun Qin; Jun Wang; Lin Cui; Huimin Liu; Chengzhi Liang; Zhijun Qiao
Journal:  Mol Plant       Date:  2017-09-01       Impact factor: 13.164

4.  Draft genome sequence of chickpea (Cicer arietinum) provides a resource for trait improvement.

Authors:  Rajeev K Varshney; Chi Song; Rachit K Saxena; Sarwar Azam; Sheng Yu; Andrew G Sharpe; Steven Cannon; Jongmin Baek; Benjamin D Rosen; Bunyamin Tar'an; Teresa Millan; Xudong Zhang; Larissa D Ramsay; Aiko Iwata; Ying Wang; William Nelson; Andrew D Farmer; Pooran M Gaur; Carol Soderlund; R Varma Penmetsa; Chunyan Xu; Arvind K Bharti; Weiming He; Peter Winter; Shancen Zhao; James K Hane; Noelia Carrasquilla-Garcia; Janet A Condie; Hari D Upadhyaya; Ming-Cheng Luo; Mahendar Thudi; C L L Gowda; Narendra P Singh; Judith Lichtenzveig; Krishna K Gali; Josefa Rubio; N Nadarajan; Jaroslav Dolezel; Kailash C Bansal; Xun Xu; David Edwards; Gengyun Zhang; Guenter Kahl; Juan Gil; Karam B Singh; Swapan K Datta; Scott A Jackson; Jun Wang; Douglas R Cook
Journal:  Nat Biotechnol       Date:  2013-01-27       Impact factor: 54.908

5.  Pearl millet genome sequence provides a resource to improve agronomic traits in arid environments.

Authors:  Rajeev K Varshney; Chengcheng Shi; Mahendar Thudi; Cedric Mariac; Jason Wallace; Peng Qi; He Zhang; Yusheng Zhao; Xiyin Wang; Abhishek Rathore; Rakesh K Srivastava; Annapurna Chitikineni; Guangyi Fan; Prasad Bajaj; Somashekhar Punnuri; S K Gupta; Hao Wang; Yong Jiang; Marie Couderc; Mohan A V S K Katta; Dev R Paudel; K D Mungra; Wenbin Chen; Karen R Harris-Shultz; Vanika Garg; Neetin Desai; Dadakhalandar Doddamani; Ndjido Ardo Kane; Joann A Conner; Arindam Ghatak; Palak Chaturvedi; Sabarinath Subramaniam; Om Parkash Yadav; Cécile Berthouly-Salazar; Falalou Hamidou; Jianping Wang; Xinming Liang; Jérémy Clotault; Hari D Upadhyaya; Philippe Cubry; Bénédicte Rhoné; Mame Codou Gueye; Ramanjulu Sunkar; Christian Dupuy; Francesca Sparvoli; Shifeng Cheng; R S Mahala; Bharat Singh; Rattan S Yadav; Eric Lyons; Swapan K Datta; C Tom Hash; Katrien M Devos; Edward Buckler; Jeffrey L Bennetzen; Andrew H Paterson; Peggy Ozias-Akins; Stefania Grando; Jun Wang; Trilochan Mohapatra; Wolfram Weckwerth; Jochen C Reif; Xin Liu; Yves Vigouroux; Xun Xu
Journal:  Nat Biotechnol       Date:  2017-09-18       Impact factor: 54.908

6.  Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions.

Authors:  Jaina Mistry; Robert D Finn; Sean R Eddy; Alex Bateman; Marco Punta
Journal:  Nucleic Acids Res       Date:  2013-04-17       Impact factor: 16.971

7.  JBrowse: a dynamic web platform for genome visualization and analysis.

Authors:  Robert Buels; Eric Yao; Colin M Diesh; Richard D Hayes; Monica Munoz-Torres; Gregg Helt; David M Goodstein; Christine G Elsik; Suzanna E Lewis; Lincoln Stein; Ian H Holmes
Journal:  Genome Biol       Date:  2016-04-12       Impact factor: 13.583

8.  Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data.

Authors:  Yoshihiro Kawahara; Melissa de la Bastide; John P Hamilton; Hiroyuki Kanamori; W Richard McCombie; Shu Ouyang; David C Schwartz; Tsuyoshi Tanaka; Jianzhong Wu; Shiguo Zhou; Kevin L Childs; Rebecca M Davidson; Haining Lin; Lina Quesada-Ocampo; Brieanne Vaillancourt; Hiroaki Sakai; Sung Shin Lee; Jungsok Kim; Hisataka Numa; Takeshi Itoh; C Robin Buell; Takashi Matsumoto
Journal:  Rice (N Y)       Date:  2013-02-06       Impact factor: 4.783

9.  The Pfam protein families database in 2019.

Authors:  Sara El-Gebali; Jaina Mistry; Alex Bateman; Sean R Eddy; Aurélien Luciani; Simon C Potter; Matloob Qureshi; Lorna J Richardson; Gustavo A Salazar; Alfredo Smart; Erik L L Sonnhammer; Layla Hirsh; Lisanna Paladin; Damiano Piovesan; Silvio C E Tosatto; Robert D Finn
Journal:  Nucleic Acids Res       Date:  2019-01-08       Impact factor: 16.971

10.  Genome-wide identification and expression analysis of WRKY transcription factors in pearl millet (Pennisetum glaucum) under dehydration and salinity stress.

Authors:  Jeky Chanwala; Suresh Satpati; Anshuman Dixit; Ajay Parida; Mrunmay Kumar Giri; Nrisingha Dey
Journal:  BMC Genomics       Date:  2020-03-14       Impact factor: 3.969
