Barbara Robbertse, Pooja K Strope, Priscila Chaverri, Romina Gazis, Stacy Ciufo, Michael Domrachev, Conrad L Schoch.
Abstract
The ITS (nuclear ribosomal internal transcribed spacer) RefSeq database at the National Center for Biotechnology Information (NCBI) is dedicated to a clear association between name, specimen and sequence data. The database focuses on sequences obtained from type material stored in public collections. The initial ITS sequence curation effort, carried out together with numerous fungal taxonomy experts, attempted to cover as many orders as possible; our latest effort extends the focus to the family and genus ranks. We chose Trichoderma for several reasons, mainly because its asexual and sexual synonyms are well documented and a list of proposed names and type material was recently published. In this case study, the recent taxonomic information was used to carry out a complete taxonomic audit of the genus Trichoderma in the NCBI Taxonomy database. A name status report is available here: https://www.ncbi.nlm.nih.gov/Taxonomy/TaxIdentifier/tax_identifier.cgi. As a result, the ITS RefSeq Targeted Loci database at NCBI has been augmented with more sequences from type and verified material from Trichoderma species. Additionally, to aid cross-referencing of data from single loci and genomes, we have collected a list of quality records of the RPB2 gene obtained from type material in GenBank that could help validate future submissions. During curation, misidentified genomes were discovered, and sequence records from type material were found hidden under previous classifications. Source metadata curation, although more cumbersome, proved useful as confirmation of the type material designation. Database URL: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA177353 © Crown copyright 2017.
Year: 2017 PMID: 29220466 PMCID: PMC5641268 DOI: 10.1093/database/bax072
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Morphology of specimens in the Trichoderma/Hypocrea clade: (A) asexual structures (conidiophore and conidia) of Trichoderma harzianum (FJ967806), (B) growth in culture of a specimen in the Trichoderma harzianum complex and (C) sexual reproduction structures of Hypocrea species.
Figure 2. Bar graph showing the number of formal Trichoderma names associated with different attributes in databases at NCBI.
Figure 3. Graphical display of ITS1 length compared with ITS2 length from Trichoderma ITS RefSeq records. Grey arrows indicate the minimum lengths of ITS1 and ITS2 observed using ITSx annotation.
Two or more Trichoderma RefSeq ITS records which had identical ITS1_5.8S_ITS2 sequences
| RefSeq accession |
|---|
| NR_138453 |
| NR_134394 |
| NR_137298 |
| NR_138434 |
| NR_137305 |
| NR_131317 |
| NR_144868 |
| NR_138439 |
| NR_134371 |
| NR_137308 |
| NR_138451 |
| NR_137302 |
| NR_131281 |
| NR_138444 |
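The source does not say how the identical records listed above were detected. As a minimal sketch, assuming exact string matching on the extracted ITS1_5.8S_ITS2 region, records can be grouped by sequence and any group with two or more members reported. The record IDs and sequences below are hypothetical toy data, not real RefSeq entries.

```python
from collections import defaultdict


def group_identical(records):
    """Return sorted groups of record IDs that share an identical sequence.

    `records` maps a record ID to its sequence string. Comparison is
    case-insensitive; this is an illustrative choice, not the authors' method.
    """
    groups = defaultdict(list)
    for record_id, sequence in records.items():
        groups[sequence.upper()].append(record_id)
    # Keep only sequences shared by two or more records.
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)


# Hypothetical toy records (placeholders, not real ITS sequences).
toy_records = {
    "rec_A": "ACGTTGCA",
    "rec_B": "acgttgca",
    "rec_C": "ACGTTGCC",
    "rec_D": "ACGTTGCC",
    "rec_E": "TTTTGGGG",
}

identical_groups = group_identical(toy_records)
print(identical_groups)
```

Exact matching only finds records that are identical over the whole compared region, so the flanking 18S and 28S fragments would need to be trimmed consistently beforehand (for example with an ITS extraction tool) for the grouping to be meaningful.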
Figure 4. Graphical summary of RefSeq ITS sequence BLASTn search results (% identity and alignment length) between clades (Figure 4A) and within clades (Figure 4B), where clades were defined by Jaklitsch and Voglmayr (23).
UNITE species hypothesis clusters with more than one RefSeq ITS record.
| UNITE_public_20_11_2016 | NCBI RefSeq accession count |
|---|---|
| SH181342.07FU | 39 |
| SH190868.07FU | 20 |
| SH207825.07FU | 10 |
| SH206530.07FU | 6 |
| SH177682.07FU | 6 |
| SH177683.07FU | 6 |
| SH177684.07FU | 6 |
| SH187755.07FU | 5 |
| SH177687.07FU | 4 |
| SH187759.07FU | 3 |
| SH177689.07FU | 3 |
| SH008832.07FU | 2 |
| SH177685.07FU | 2 |
| SH177686.07FU | 2 |
| SH002402.07FU | 2 |
| SH177694.07FU | 2 |
| SH217782.07FU | 2 |
| SH181351.07FU | 2 |
| SH187756.07FU | 2 |
| SH187757.07FU | 2 |
| SH177690.07FU | 2 |
https://unite.ut.ee/repository.php (Full ‘UNITE + INSD’ dataset, version no 7.1, release date 20 November 2016).
Percentage identity of the ITS region (bases 1–527) from NR_138441 (T. viride CBS 119325 from TYPE material) compared with other sequences from Trichoderma type material from the same UNITE cluster
| RefSeq records in cluster SH181342.07FU | NCBI RefSeq name | % identity in BLASTn search |
|---|---|---|
| NR_138441.1 | | 100 |
| NR_134363.1(a) | | 99.617 |
| NR_138440.1 | | 99.616 |
| NR_138452.1(b) | | 99.432 |
| NR_138456.1(b) | | 99.432 |
| NR_138451.1(b) | | 99.432 |
| NR_137308.1(a) | | 99.431 |
| NR_144870.1(b) | | 99.419 |
| NR_134340.1 | | 99.242 |
| NR_138439.1(c) | | 99.241 |
| NR_134367.1(c) | | 99.237 |
| NR_134342.1(d) | | 99.225 |
| NR_131281.1(d) | | 99.225 |
| NR_137303.1(d) | | 99.225 |
| NR_138442.1 | | 99.053 |
| NR_144874.1(e) | | 99.031 |
| NR_137302.1(e) | | 99.031 |
| NR_138444.1 | | 99.006 |
| NR_144876.1 | | 98.868 |
| NR_134362.1 | | 98.851 |
| NR_131317.1 | | 98.846 |
| NR_134343.1 | | 98.837 |
| NR_111837.1 | | 98.491 |
| NR_138443.1 | | 98.45 |
| NR_134437.1 | | 97.938 |
| NR_134361.1 | | 97.854 |
| NR_103571.1 | | 97.736 |
| NR_134392.1 | | 97.323 |
| NR_134419.1 | | 97.164 |
| NR_134360.1 | | 96.992 |
| NR_134359.1 | | 96.798 |
| NR_138438.1 | | 96.792 |
| NR_134446.1 | | 96.786 |
| NR_134447.1 | | 96.786 |
| NR_144875.1 | | 96.737 |
| NR_130668.1 | | 96.712 |
| NR_134371.1(f) | | 96.591 |
| NR_138449.1(f) | | 96.591 |
| NR_077179.1 | | 96.408 |
Identical letters in brackets indicate ITS regions with 100% identity to each other.
https://unite.ut.ee/repository.php (Full ‘UNITE + INSD’ dataset, version no 7.1, release date 20 November 2016).
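For readers unfamiliar with the % identity column, BLASTn reports it as the number of identical positions divided by the alignment length, with gap columns counted in the length. The sketch below applies that definition to a pair of already-aligned strings. The function name and the toy alignment are hypothetical, and the code does not reproduce the searches behind the table.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over an existing pairwise alignment.

    Both strings must have equal length, with '-' marking gaps.
    Gap columns count toward the alignment length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: 10 columns, one gap and one mismatch.
toy_identity = percent_identity("ACGT-ACGTA", "ACGTTACGTT")
print(toy_identity)
```

Because alignment length is in the denominator, two hits with the same number of mismatches can show slightly different percentages when their alignments differ in length, which may explain near-duplicate values such as 99.432 and 99.431 in the table.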
Accessing ITS RefSeq records at NCBI
| NCBI Resource | URL | Additional Entrez Query text |
|---|---|---|
| Nucleotide | | ‘ITS region’ AND Trichoderma [organism] AND RefSeq [filter] |
| BioProject | | |
| (Nucleotide via BioProject) | ( | AND Trichoderma [orgn] |
| Taxonomy Browser | | |
| (Nucleotide via Taxonomy browser) | ( | AND ‘ITS region’ AND RefSeq [filter] |
| NCBI BLAST | | ‘ITS region’ AND Trichoderma [organism] AND RefSeq [filter] |
| Targeted Loci Blast | | |
| NCBI FTP | | |
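The Entrez query text in the table can also be submitted programmatically. As a minimal sketch, assuming the NCBI E-utilities ESearch endpoint and the `nuccore` database name (neither is mentioned in the source), the following builds a search URL for the Nucleotide query without sending any request. Straight double quotes replace the typographic quotes from the table, on the assumption that Entrez expects them for phrase searches.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# E-utilities ESearch endpoint (assumed here; not given in the source table).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="nuccore", retmax=20):
    """Build an ESearch URL for an Entrez query without performing a request."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)


# Query text taken from the Nucleotide row of the table.
query = '"ITS region" AND Trichoderma[organism] AND RefSeq[filter]'
url = build_esearch_url(query)

# Round-trip check: decode the URL and recover the original query term.
decoded_term = parse_qs(urlsplit(url).query)["term"][0]
print(url)
print(decoded_term)
```

Fetching the URL returns matching record identifiers; anyone running this at scale should follow NCBI's usage guidelines for E-utilities, which are outside the scope of this sketch.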
Figure 5. A phylogenetic tree generated by a FastTree analysis using a MAFFT alignment of RPB2 nucleotide sequences from type material (with asterisk) of Trichoderma and genomes labeled as Trichoderma.
Figure 6. Additions and updates to Trichoderma and Hypocrea binomial names in the NCBI Taxonomy database over the past 22 years.