
MIDORI server: a webserver for taxonomic assignment of unknown metazoan mitochondrial-encoded sequences using a curated database.

Matthieu Leray1, Shian-Lei Ho2, I-Jeng Lin2, Ryuji J Machida2.   

Abstract

Summary: We present MIDORI server, a user-friendly web platform that uses a curated reference dataset, MIDORI, for high-throughput taxonomic classification of unknown metazoan mitochondrial-encoded gene sequences. Three taxonomic assignment methods are currently implemented: RDP Classifier, SPINGO and SINTAX. Availability and implementation: The web server is freely available at http://reference-midori.info/server.php.

Year:  2018        PMID: 29878054      PMCID: PMC6198853          DOI: 10.1093/bioinformatics/bty454

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

The era of massive sequencing has transformed our ability to study Earth’s biodiversity (Taberlet ). DNA extracted from various environments (e.g. soil, water, air, food products) can be analyzed and compared to public databases of annotated reference sequences to determine the presence of microbial, plant and animal taxa. The possible applications are extremely diverse. PCR-based (i.e. metagenetics or metabarcoding) and PCR-free (i.e. metatranscriptomics or metagenomics) DNA sequencing approaches can be used to study diversity patterns of overlooked microscopic taxa (Al-Rshaidat ), study the response of biological communities to environmental changes (Ji ), investigate illegal trade of endangered wildlife (Arulandhu ) and detect species mislabelling in food products (Raclariu ). The robustness of these techniques, however, largely depends on our ability to rapidly and reliably assign taxonomy to sequences recovered from the environment. The realization by the scientific community that public repositories of genetic data (e.g. GenBank) contain a significant number of taxonomically mislabelled sequences has promoted the creation of curated databases with higher quality standards. For example, reference datasets were built for nuclear-encoded ribosomal RNA genes [e.g. PR2 (Guillou ), Silva (Quast )]. Recently, we assembled the first curated database of mitochondrial-encoded genes, MIDORI, for taxonomic assignments of metazoan sequences (Machida ). Mitochondrial genes provide higher taxonomic resolution for most metazoan groups than nuclear-encoded genes. As a result, they have been increasingly targeted in metagenetics and metagenomics studies (Leray and Knowlton, 2016).
MIDORI was built by retrieving all nucleotide sequences from GenBank BLAST NT and, after quality filtration, includes metazoan mitochondrial sequences for 13 protein-coding genes (ATP synthase sub-units 6 and 8; Cytochrome oxidase sub-units I, II and III; Cytochrome b apoenzyme; NADH dehydrogenase sub-units 1–4, 4L, 5 and 6) and two ribosomal RNA genes (Large and Small ribosomal sub-unit RNA) with species-level taxonomic information (see details in Machida ).

2 Server description

Here, we present MIDORI server, a user-friendly platform to facilitate taxonomic classification of mitochondrial-encoded gene sequences with MIDORI. The server currently performs taxonomic assignments with three algorithms that predict taxonomy using k-mer similarity: SPINGO (Allard ), RDP classifier (Wang ) and SINTAX (Edgar, 2016). A maximum of 10 000 sequences in FASTA format can be uploaded at once, and all of them must be shorter than 4000 base pairs. Each algorithm can be run using two versions of each of the 15 mitochondrial-encoded gene reference datasets: MIDORI-Unique and MIDORI-Longest. MIDORI-Unique contains all haplotypes of every species, while MIDORI-Longest contains a single haplotype per species, the longest one. For example, MIDORI-Longest for the COI gene contains the longest sequence for every species represented in the COI dataset. Using 1336 zooplankton sequences (Machida , 500 bp), we estimated the time required for assignments using the three algorithms with default settings (reference: COI-Longest). RDP classifier required considerably more computation time (630 s) than SPINGO (90 s) and SINTAX (100 s). We also compared the phyla assigned by RDP classifier and SINTAX: about 10% of assignments were inconsistent between the two methods, most likely for groups with fewer reference sequences. Furthermore, we have deposited the results of a leave-one-out test on the MIDORI website (http://www.reference-midori.info/download.php, Wang ). These results indicate that the probability of mis-assignment increases as the supporting bootstrap value decreases, which demonstrates the importance of interpreting assignment results carefully. The server is designed to give full flexibility to the user and works with recent versions of major browsers.
A range of options is available for each algorithm, such as assignment confidence cut-off (RDP), k-mer size (SPINGO) and bootstrap cut-off (SINTAX). A question mark button located next to each option displays a hint when the cursor hovers over it. The user can provide an e-mail address to receive the text-formatted result of the analysis. The server was extensively tested using mock samples and real environmental data. It is easy to use and requires neither registration nor local installation of any specific software. The RDP classifier is pre-trained with each of the reference datasets.
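The upload limits stated above (at most 10 000 FASTA sequences, each shorter than 4000 base pairs) can be checked locally before submitting a file. The following is a minimal sketch of such a check, not the server's own validation code; the function names `parse_fasta` and `check_upload` are hypothetical, and the sketch assumes a plain multi-line FASTA file with `>` header lines.

```python
from typing import Iterator, List, Tuple

MAX_SEQUENCES = 10_000  # stated limit: at most 10 000 sequences per upload
MAX_LENGTH_BP = 4_000   # stated limit: every sequence must be shorter than this


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def check_upload(text: str) -> List[str]:
    """Return a list of problems; an empty list means the file fits the limits."""
    records = list(parse_fasta(text))
    problems = []
    if not records:
        problems.append("no FASTA records found")
    if len(records) > MAX_SEQUENCES:
        problems.append(
            f"{len(records)} sequences exceeds the limit of {MAX_SEQUENCES}"
        )
    for header, seq in records:
        if len(seq) >= MAX_LENGTH_BP:
            problems.append(
                f"{header}: {len(seq)} bp is not shorter than {MAX_LENGTH_BP} bp"
            )
    return problems
```

Calling `check_upload(open("reads.fasta").read())` on a hypothetical file `reads.fasta` returns an empty list when the file respects both limits, and otherwise one message per violation.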
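Because mis-assignment becomes more likely as support values fall, a common way to use a confidence or bootstrap cut-off is to report a lineage only down to the last rank whose support meets the threshold. The sketch below illustrates that idea under stated assumptions: the tuple layout, the 0 to 1 support scale, the default cut-off of 0.8 and the function name `truncate_at_cutoff` are all hypothetical and do not describe the server's actual output format or defaults.

```python
from typing import List, Tuple

# Hypothetical representation: one (rank, taxon, support) tuple per rank,
# ordered from the highest rank down, with support on a 0-1 scale.
Assignment = List[Tuple[str, str, float]]


def truncate_at_cutoff(lineage: Assignment, cutoff: float = 0.8) -> Assignment:
    """Keep ranks from the top down until support first drops below cutoff."""
    kept = []
    for rank, taxon, support in lineage:
        if support < cutoff:
            break  # lower ranks are not reported once support is too low
        kept.append((rank, taxon, support))
    return kept


# Illustrative values only, not real server output.
example = [
    ("phylum", "Arthropoda", 0.99),
    ("class", "Hexanauplia", 0.85),
    ("order", "Calanoida", 0.60),
    ("family", "Calanidae", 0.90),
]
print(truncate_at_cutoff(example, cutoff=0.8))
```

Stopping at the first rank below the cut-off, rather than filtering ranks independently, avoids reporting a well-supported family nested inside a poorly supported order.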

3 Conclusion

As bio-monitoring and bio-surveillance increasingly rely on mitochondrial-encoded sequence data, the ability to rapidly and reliably assign metazoan sequences to taxonomic groups has become indispensable. MIDORI server enables the classification of large numbers of unknown metazoan reads to taxa represented in the curated reference database. MIDORI will be regularly updated. We also intend to implement several additional taxonomic assignment algorithms on MIDORI server in the near future [e.g. SAP (Munch ), RTAX (Soergel ), METAXA2 (Bengtsson-Palme )].

References (15 in total)

1.  Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy.

Authors:  Qiong Wang; George M Garrity; James M Tiedje; James R Cole
Journal:  Appl Environ Microbiol       Date:  2007-06-22       Impact factor: 4.792

2.  Reliable, verifiable and efficient monitoring of biodiversity via metabarcoding.

Authors:  Yinqiu Ji; Louise Ashton; Scott M Pedley; David P Edwards; Yong Tang; Akihiro Nakamura; Roger Kitching; Paul M Dolman; Paul Woodcock; Felicity A Edwards; Trond H Larsen; Wayne W Hsu; Suzan Benedick; Keith C Hamer; David S Wilcove; Catharine Bruce; Xiaoyang Wang; Taal Levi; Martin Lott; Brent C Emerson; Douglas W Yu
Journal:  Ecol Lett       Date:  2013-08-04       Impact factor: 9.492

3.  METAXA2: improved identification and taxonomic classification of small and large subunit rRNA in metagenomic data.

Authors:  Johan Bengtsson-Palme; Martin Hartmann; Karl Martin Eriksson; Chandan Pal; Kaisa Thorell; Dan Göran Joakim Larsson; Rolf Henrik Nilsson
Journal:  Mol Ecol Resour       Date:  2015-03-23       Impact factor: 7.090

4.  Zooplankton diversity analysis through single-gene sequencing of a community sample.

Authors:  Ryuji J Machida; Yasuyuki Hashiguchi; Mutsumi Nishida; Shuhei Nishida
Journal:  BMC Genomics       Date:  2009-09-17       Impact factor: 3.969

5.  Selection of primers for optimal taxonomic classification of environmental 16S rRNA gene sequences.

Authors:  David A W Soergel; Neelendu Dey; Rob Knight; Steven E Brenner
Journal:  ISME J       Date:  2012-01-12       Impact factor: 10.302

6.  Comparative authentication of Hypericum perforatum herbal products using DNA metabarcoding, TLC and HPLC-MS.

Authors:  Ancuta Cristina Raclariu; Ramona Paltinean; Laurian Vlase; Aurélie Labarre; Vincent Manzanilla; Mihael Cristin Ichim; Gianina Crisan; Anne Krag Brysting; Hugo de Boer
Journal:  Sci Rep       Date:  2017-05-02       Impact factor: 4.379

7.  Metazoan mitochondrial gene sequence reference datasets for taxonomic assignment of environmental samples.

Authors:  Ryuji J Machida; Matthieu Leray; Shian-Lei Ho; Nancy Knowlton
Journal:  Sci Data       Date:  2017-03-14       Impact factor: 6.444

8.  The SILVA ribosomal RNA gene database project: improved data processing and web-based tools.

Authors:  Christian Quast; Elmar Pruesse; Pelin Yilmaz; Jan Gerken; Timmy Schweer; Pablo Yarza; Jörg Peplies; Frank Oliver Glöckner
Journal:  Nucleic Acids Res       Date:  2012-11-28       Impact factor: 16.971

9.  The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote small sub-unit rRNA sequences with curated taxonomy.

Authors:  Laure Guillou; Dipankar Bachar; Stéphane Audic; David Bass; Cédric Berney; Lucie Bittner; Christophe Boutte; Gaétan Burgaud; Colomban de Vargas; Johan Decelle; Javier Del Campo; John R Dolan; Micah Dunthorn; Bente Edvardsen; Maria Holzmann; Wiebe H C F Kooistra; Enrique Lara; Noan Le Bescot; Ramiro Logares; Frédéric Mahé; Ramon Massana; Marina Montresor; Raphael Morard; Fabrice Not; Jan Pawlowski; Ian Probert; Anne-Laure Sauvadet; Raffaele Siano; Thorsten Stoeck; Daniel Vaulot; Pascal Zimmermann; Richard Christen
Journal:  Nucleic Acids Res       Date:  2012-11-27       Impact factor: 16.971

10.  SPINGO: a rapid species-classifier for microbial amplicon sequences.

Authors:  Guy Allard; Feargal J Ryan; Ian B Jeffery; Marcus J Claesson
Journal:  BMC Bioinformatics       Date:  2015-10-08       Impact factor: 3.169
