Robert M Bowers1, Nikos C Kyrpides1, Ramunas Stepanauskas2, Miranda Harmon-Smith1, Devin Doud1, T B K Reddy1, Frederik Schulz1, Jessica Jarett1, Adam R Rivers1,3, Emiley A Eloe-Fadrosh1, Susannah G Tringe1,4, Natalia N Ivanova1, Alex Copeland1, Alicia Clum1, Eric D Becraft2, Rex R Malmstrom1, Bruce Birren5, Mircea Podar6, Peer Bork7, George M Weinstock8, George M Garrity9, Jeremy A Dodsworth10, Shibu Yooseph11, Granger Sutton12, Frank O Glöckner13, Jack A Gilbert14,15, William C Nelson16, Steven J Hallam17, Sean P Jungbluth1,18, Thijs J G Ettema19, Scott Tighe20, Konstantinos T Konstantinidis21, Wen-Tso Liu22, Brett J Baker23, Thomas Rattei24, Jonathan A Eisen25, Brian Hedlund26,27, Katherine D McMahon28,29, Noah Fierer30,31, Rob Knight32, Rob Finn33, Guy Cochrane33, Ilene Karsch-Mizrachi34, Gene W Tyson35, Christian Rinke35, Alla Lapidus36, Folker Meyer14, Pelin Yilmaz13, Donovan H Parks35, A M Eren37, Lynn Schriml38, Jillian F Banfield39, Philip Hugenholtz35, Tanja Woyke1,4.
Abstract
We present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.
Year: 2017 PMID: 28787424 PMCID: PMC6436528 DOI: 10.1038/nbt.3893
Source DB: PubMed Journal: Nat Biotechnol ISSN: 1087-0156 Impact factor: 54.908
Figure 1: Sequencing of bacterial and archaeal genomes[3,11,13,37,85,86,87,88,89,90].
Increase in the number of SAGs and MAGs over time. Inset displays the number of isolate genomes over time for comparison. Data for figure were taken from IMG/GOLD[14] in January 2017.
Genome reporting standards for SAGs and MAGs
| Category | Criterion | Description |
|---|---|---|
| Finished | Assembly quality (a) | Single contiguous sequence without gaps or ambiguities with a consensus error rate equivalent to Q50 or better |
| High-quality draft | Assembly quality (a) | Multiple fragments where gaps span repetitive regions. Presence of the 23S, 16S, and 5S rRNA genes and at least 18 tRNAs. |
| | Completion (b) | >90% |
| | Contamination (c) | <5% |
| Medium-quality draft | Assembly quality (a) | Many fragments with little to no review of assembly other than reporting of standard assembly statistics. |
| | Completion (b) | ≥50% |
| | Contamination (c) | <10% |
| Low-quality draft | Assembly quality (a) | Many fragments with little to no review of assembly other than reporting of standard assembly statistics. |
| | Completion (b) | <50% |
| | Contamination (c) | <10% |

This is a compressed set of genome reporting standards for SAGs and MAGs. For a complete list of mandatory and optional standards, see
(a) Assembly statistics include but are not limited to: N50, L50, largest contig, number of contigs, assembly size, percentage of reads that map back to the assembly, and number of predicted genes per genome.
(b) Completion: ratio of observed single-copy marker genes to total single-copy marker genes in chosen marker gene set.
(c) Contamination: ratio of observed single-copy marker genes in ≥2 copies to total single-copy marker genes in chosen marker gene set.
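The definitions in the table footnotes can be illustrated with a minimal Python sketch. It is not the authors' implementation or any published tool: the function names (`n50_l50`, `marker_estimates`, `quality_tier`), the input shapes, and the `"unclassified"` fallback are all hypothetical choices made for illustration. N50 and L50 follow their conventional definitions, completion and contamination are the ratios given in the footnotes expressed as percentages, and the tier labels are the MIMAG draft categories. The finished category is left out because it depends on manual assembly review rather than on these numbers.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple


def n50_l50(contig_lengths: Sequence[int]) -> Tuple[int, int]:
    """Return (N50, L50): the length of the contig at which the cumulative
    length of contigs, sorted longest first, reaches half the assembly size,
    and the number of contigs needed to get there."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no contigs given")


def marker_estimates(observed: Iterable[str],
                     marker_set: Iterable[str]) -> Tuple[float, float]:
    """Return (completion %, contamination %) for one genome.

    observed   -- one entry per marker-gene hit (repeats = extra copies)
    marker_set -- the chosen single-copy marker gene set
    """
    expected = set(marker_set)
    if not expected:
        raise ValueError("marker set is empty")
    # Count copies of each expected marker; ignore hits outside the set.
    copies = Counter(m for m in observed if m in expected)
    completion = 100 * len(copies) / len(expected)
    multi_copy = sum(1 for n in copies.values() if n >= 2)
    contamination = 100 * multi_copy / len(expected)
    return completion, contamination


def quality_tier(completion: float, contamination: float,
                 has_all_rrnas: bool = False, trna_count: int = 0) -> str:
    """Map the table's thresholds to a draft tier.

    has_all_rrnas -- True if 23S, 16S, and 5S rRNA genes are all present
    """
    if (completion > 90 and contamination < 5
            and has_all_rrnas and trna_count >= 18):
        return "high-quality draft"
    if completion >= 50 and contamination < 10:
        return "medium-quality draft"
    if completion < 50 and contamination < 10:
        return "low-quality draft"
    # Assumption: the table gives no tier for contamination of 10% or more.
    return "unclassified"
```

For example, a genome with hits `["a", "b", "b", "c"]` against the marker set `["a", "b", "c", "d"]` has three of four markers observed and one of four in two or more copies. In practice these estimates come from dedicated tools with curated, lineage-specific marker sets; the sketch only mirrors the arithmetic of the footnotes.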
Figure 2: Generation of SAGs and MAGs.
Flow diagram outlining the typical pipeline for the production of both SAGs and MAGs.