Stacia R Engel, Shuai Weng, Gail Binkley, Kelley Paskov, Giltae Song, J Michael Cherry.
Abstract
In recent years, thousands of Saccharomyces cerevisiae genomes have been sequenced to varying degrees of completion. The Saccharomyces Genome Database (SGD) has long been the keeper of the original eukaryotic reference genome sequence, which was derived primarily from S. cerevisiae strain S288C. Because new technologies are pushing S. cerevisiae annotation past the limits of any system based exclusively on a single reference sequence, SGD is actively working to expand the original S. cerevisiae systematic reference sequence from a single genome to a multi-genome reference panel. We first commissioned the sequencing of additional genomes and their automated analysis using the AGAPE pipeline. Here we describe our curation strategy to produce manually reviewed high-quality genome annotations in order to elevate 11 of these additional genomes to Reference status. Database URL: http://www.yeastgenome.org/
Year: 2016 PMID: 26989152 PMCID: PMC4795930 DOI: 10.1093/database/baw020
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Curation strategy currently in use at SGD to expand the original Saccharomyces cerevisiae systematic reference sequence from a single highly-curated genome to an expertly curated multi-genome reference panel.
Numbers of automated ORF annotations for 11 different Saccharomyces strains for which the predicted translation start and/or stop generated by the AGAPE sequence analysis pipeline (9) differed from the S288C reference
| Strain | Provenance | Accession | Automated ORF calls | Start differs from S288C | Stop differs from S288C | Both differ from S288C | Total differences |
|---|---|---|---|---|---|---|---|
| CEN.PK | Lab strain | JRIV00000000 | 5379 | 35 | 19 | 39 | 93 |
| D273-10B | Lab strain | JRIY00000000 | 5383 | 37 | 18 | 40 | 95 |
| FL100 | Lab strain | JRIT00000000 | 5366 | 29 | 21 | 34 | 84 |
| JK9-3d | Lab strain | JRIZ00000000 | 5385 | 40 | 11 | 35 | 86 |
| RM11-1a | Vineyard | JRIP00000000 | 5323 | 36 | 17 | 30 | 83 |
| SEY6210 | Lab strain | JRIW00000000 | 5400 | 44 | 23 | 26 | 93 |
| Sigma1278b | Lab strain | JRIQ00000000 | 5358 | 31 | 20 | 28 | 79 |
| SK1 | Lab strain | JRIH00000000 | 5350 | 38 | 22 | 32 | 92 |
| W303 | Lab strain | JRIU00000000 | 5397 | 54 | 24 | 33 | 111 |
| X2180-1A | Lab strain | JRIX00000000 | 5387 | 37 | 24 | 35 | 96 |
| Y55 | Lab strain | JRIF00000000 | 5359 | 39 | 26 | 32 | 97 |
| Total | | | | 420 | 225 | 364 | 1009 |
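
The totals in the table above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python and is not part of the SGD or AGAPE tooling. The names `rows`, `rows_consistent`, `column_totals` and `fraction_differing` are illustrative assumptions, and the per-strain fraction is a derived quantity that the source does not report.

```python
# Values copied from the table above.
# strain -> (automated ORF calls, start, stop, both, total)
rows = {
    "CEN.PK":     (5379, 35, 19, 39, 93),
    "D273-10B":   (5383, 37, 18, 40, 95),
    "FL100":      (5366, 29, 21, 34, 84),
    "JK9-3d":     (5385, 40, 11, 35, 86),
    "RM11-1a":    (5323, 36, 17, 30, 83),
    "SEY6210":    (5400, 44, 23, 26, 93),
    "Sigma1278b": (5358, 31, 20, 28, 79),
    "SK1":        (5350, 38, 22, 32, 92),
    "W303":       (5397, 54, 24, 33, 111),
    "X2180-1A":   (5387, 37, 24, 35, 96),
    "Y55":        (5359, 39, 26, 32, 97),
}

# Each row total should equal start + stop + both.
rows_consistent = all(
    start + stop + both == total
    for _calls, start, stop, both, total in rows.values()
)

# Column sums in the order: start, stop, both, total.
column_totals = [sum(r[i] for r in rows.values()) for i in range(1, 5)]

# Derived (not in the source): share of automated ORF calls per strain
# whose predicted boundaries differ from S288C.
fraction_differing = {strain: r[4] / r[0] for strain, r in rows.items()}

for strain, frac in fraction_differing.items():
    print(f"{strain}: {frac:.2%} of automated ORF calls differ from S288C")
```

Under these assumptions, the check confirms that each row total is the sum of its three difference categories and that the column sums match the Total row.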
Numbers of ORFs in 11 different S. cerevisiae strains that the AGAPE sequence analysis pipeline (9) called on more than one contig
| Strain | Two contigs | Three contigs | Four contigs | Total |
|---|---|---|---|---|
| CEN.PK | 15 | 3 | 4 | 22 |
| D273-10B | 14 | 6 | 3 | 23 |
| FL100 | 12 | 4 | 3 | 19 |
| JK9-3d | 8 | 3 | 3 | 14 |
| RM11-1a | 12 | 3 | 3 | 18 |
| SEY6210 | 19 | 3 | 4 | 26 |
| Sigma1278b | 14 | 2 | 3 | 19 |
| SK1 | 15 | 2 | 5 | 22 |
| W303 | 36 | 1 | 2 | 39 |
| X2180-1A | 15 | 2 | 3 | 20 |
| Y55 | 25 | 2 | 5 | 32 |
| Total | 185 | 31 | 38 | 254 |
Numbers of contigs for 11 different S. cerevisiae strains in the original automated output from the AGAPE sequence analysis pipeline (9) and in the curated contig set after manual review
| Strain | Original contig set | Curated contig set |
|---|---|---|
| CEN.PK | 389 | 189 |
| D273-10B | 403 | 203 |
| FL100 | 402 | 174 |
| JK9-3d | 431 | 197 |
| RM11-1a | 325 | 169 |
| SEY6210 | 366 | 183 |
| Sigma1278b | 451 | 206 |
| SK1 | 389 | 214 |
| W303 | 415 | 236 |
| X2180-1A | 409 | 212 |
| Y55 | 413 | 198 |
| Total | 4393 | 2181 |
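
The effect of manual review on contig counts can be summarized as a relative reduction per strain. The sketch below is an illustrative calculation on the numbers in the table above, not a step of the curation workflow. The names `contigs`, `reduction` and `overall_reduction` are assumptions, and the percentages are derived values that the source does not state.

```python
# Values copied from the table above.
# strain -> (original contig count, curated contig count)
contigs = {
    "CEN.PK":     (389, 189),
    "D273-10B":   (403, 203),
    "FL100":      (402, 174),
    "JK9-3d":     (431, 197),
    "RM11-1a":    (325, 169),
    "SEY6210":    (366, 183),
    "Sigma1278b": (451, 206),
    "SK1":        (389, 214),
    "W303":       (415, 236),
    "X2180-1A":   (409, 212),
    "Y55":        (413, 198),
}

# Fraction of contigs removed by curation, per strain.
reduction = {
    strain: (original - curated) / original
    for strain, (original, curated) in contigs.items()
}

# Totals across all 11 strains, for comparison with the Total row.
total_original = sum(original for original, _ in contigs.values())
total_curated = sum(curated for _, curated in contigs.values())
overall_reduction = (total_original - total_curated) / total_original

for strain, frac in reduction.items():
    print(f"{strain}: {frac:.1%} fewer contigs after curation")
print(f"All strains: {total_original} -> {total_curated} contigs")
```

The summed counts reproduce the Total row (4393 original, 2181 curated), so the curated set contains about half as many contigs overall.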
Numbers of ORFs in 11 different S. cerevisiae strains that were marked as ‘unidentifiable’ in the original automated output from the AGAPE sequence analysis pipeline (9). These ORFs are currently undergoing manual review.
| Strain | Undefined ORFs |
|---|---|
| CEN.PK | 169 |
| D273-10B | 254 |
| FL100 | 128 |
| JK9-3d | 121 |
| RM11-1a | 344 |
| SEY6210 | 69 |
| Sigma1278b | 106 |
| SK1 | 124 |
| W303 | 158 |
| X2180-1A | 78 |
| Y55 | 148 |
| Total | 1699 |
Additional S. cerevisiae strain genome sequences are already available throughout SGD
| Location | URL |
|---|---|
| Alignment pages | |
| BLAST | |
| Downloads | |
| Sequence pages | |
| Pattern matching | |
| Variant viewer | |