Jonas Binladen, M Thomas P Gilbert, Jonathan P Bollback, Frank Panitz, Christian Bendixen, Rasmus Nielsen, Eske Willerslev.
Abstract
BACKGROUND: The invention of the Genome Sequencer 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources.
Year: 2007 PMID: 17299583 PMCID: PMC1797623 DOI: 10.1371/journal.pone.0000197
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. The application of 5′ primer tags to the GS20 sequencing-by-synthesis process.
5′ tagged PCR primers
| Forward primers | | Reverse primers | |
| Name | Sequence (5′–3′) | Name | Sequence (5′–3′) |
| 16Faa | aacggttggggtgacctcgga | 16Raa | aagctgttatccctagggtaact |
| 16Fac | accggttggggtgacctcgga | 16Rac | acgctgttatccctagggtaact |
| 16Fag | agcggttggggtgacctcgga | 16Rag | aggctgttatccctagggtaact |
| 16Fat | atcggttggggtgacctcgga | 16Rat | atgctgttatccctagggtaact |
| 16Fca | cacggttggggtgacctcgga | 16Rca | cagctgttatccctagggtaact |
| 16Fcc | cccggttggggtgacctcgga | 16Rcc | ccgctgttatccctagggtaact |
| 16Fcg | cgcggttggggtgacctcgga | 16Rcg | cggctgttatccctagggtaact |
| 16Fct | ctcggttggggtgacctcgga | 16Rct | ctgctgttatccctagggtaact |
| 16Fga | gacggttggggtgacctcgga | 16Rga | gagctgttatccctagggtaact |
| 16Fgc | gccggttggggtgacctcgga | 16Rgc | gcgctgttatccctagggtaact |
| 16Fgg | ggcggttggggtgacctcgga | 16Rgg | gggctgttatccctagggtaact |
| 16Fgt | gtcggttggggtgacctcgga | 16Rgt | gtgctgttatccctagggtaact |
| 16Fta | tacggttggggtgacctcgga | 16Rta | tagctgttatccctagggtaact |
| 16Ftc | tccggttggggtgacctcgga | 16Rtc | tcgctgttatccctagggtaact |
| 16Ftg | tgcggttggggtgacctcgga | 16Rtg | tggctgttatccctagggtaact |
| 16Ftt | ttcggttggggtgacctcgga | 16Rtt | ttgctgttatccctagggtaact |
| 16SF4a | gctacggttggggtgacctcgga | 16SR4a | gtacgctgttatccctagggtaact |
| 16SF4b | tcagcggttggggtgacctcgga | 16SR4b | tgacgctgttatccctagggtaact |
| 16SF4c | ctagcggttggggtgacctcgga | 16SR4c | tagcgctgttatccctagggtaact |
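The primer table suggests how a read could be traced back to its source PCR: every forward primer is a short tag followed by the same 3′ sequence. The following is a minimal sketch of that idea, not the authors' pipeline. It assumes that a read begins with the full tagged forward primer and that matching is exact; the names `FORWARD_CORE`, `build_tag_index` and `assign_read` are hypothetical.

```python
from itertools import product

# Shared 3' portion of the forward primers, as read from the table above.
FORWARD_CORE = "cggttggggtgacctcgga"

# The 16 dinucleotide tags plus the three tetranucleotide forward tags
# listed in the table (16SF4a, 16SF4b, 16SF4c).
DINUCLEOTIDE_TAGS = ["".join(pair) for pair in product("acgt", repeat=2)]
TETRANUCLEOTIDE_TAGS = ["gcta", "tcag", "ctag"]


def build_tag_index(tags, core):
    """Map each full tagged primer (tag + core) to its tag."""
    return {tag + core: tag for tag in tags}


def assign_read(read, index):
    """Return the tag whose tagged primer starts the read, or None.

    Assumption: exact prefix matching, no tolerance for sequencing errors.
    """
    read = read.lower()
    for primer, tag in index.items():
        if read.startswith(primer):
            return tag
    return None


FORWARD_INDEX = build_tag_index(DINUCLEOTIDE_TAGS + TETRANUCLEOTIDE_TAGS, FORWARD_CORE)

if __name__ == "__main__":
    # A made-up read that starts with primer 16Fac followed by arbitrary bases.
    print(assign_read("ACCGGTTGGGGTGACCTCGGAACGT", FORWARD_INDEX))
```

Exact matching keeps the sketch short; a real assignment step would also have to decide how to treat reads with errors inside the tag itself.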
Assigned sequence distribution
| Primer | Wolf | Cheetah | Hippopotamus | Lion | Saiga | Gibbon | Narwhal | Domestic Mouse | Musk Ox | Human | Zebra | African Buffalo | Impala | Total | Correctly assigned | Incorrectly assigned | Assignment error |
| 16FAA | 49 | 69 | 23 | | | | | | | | | | | 142 | 141 | 1 | 0.0071 |
| 16RAA | 41 | 58 | 16 | | | | | | | | | | | 115 | 115 | 0 | 0.0000 |
| 16FAC | 58 | 98 | 23 | | | | | | | | | | | 179 | 179 | 0 | 0.0000 |
| 16RAC | 21 | 72 | 20 | | | | | | | | | | | 114 | 113 | 1 | 0.0088 |
| 16FAG | 15 | 17 | 36 | | | | | | | | | | | 68 | 68 | 0 | 0.0000 |
| 16RAG | 20 | 17 | 28 | | | | | | | | | | | 65 | 65 | 0 | 0.0000 |
| 16FAT | 28 | 44 | 23 | | | | | | | | | | | 96 | 95 | 1 | 0.0105 |
| 16RAT | 18 | 56 | 19 | | | | | | | | | | | 93 | 93 | 0 | 0.0000 |
| 16FTA | | | | 13 | 64 | 49 | | | | | 1 | | | 127 | 127 | 0 | 0.0000 |
| 16RTA | | | | 7 | 39 | 40 | | | | | 0 | | | 87 | 86 | 1 | 0.0116 |
| 16FTC | | | | 28 | 47 | 19 | | | | | 19 | | | 114 | 113 | 1 | 0.0088 |
| 16RTC | | | | 32 | 58 | 7 | | | | | 14 | | | 112 | 111 | 1 | 0.0090 |
| 16FTG | | | | 12 | 57 | 31 | | | | | 5 | | | 105 | 105 | 0 | 0.0000 |
| 16RTG | | | | 19 | 55 | 12 | | | | | 1 | | | 87 | 87 | 0 | 0.0000 |
| 16FTT | | | | 15 | 54 | 35 | | | | | 6 | | | 110 | 110 | 0 | 0.0000 |
| 16RTT | | | | 21 | 48 | 27 | | | | | 4 | | | 101 | 100 | 1 | 0.0100 |
| 16FGA | | | | | | | 86 | 42 | 43 | 19 | | | | 191 | 190 | 1 | 0.0053 |
| 16RGA | | | | | | | 65 | 54 | 34 | 9 | | | | 163 | 162 | 1 | 0.0062 |
| 16FGC | | | | | | | 8 | 64 | 42 | 4 | | | | 118 | 118 | 0 | 0.0000 |
| 16RGC | | | | | | | 5 | 63 | 25 | 11 | | | | 104 | 104 | 0 | 0.0000 |
| 16FGG | | | | | | | 84 | 51 | 31 | 25 | | | | 191 | 191 | 0 | 0.0000 |
| 16RGG | | | | | | | 61 | 61 | 24 | 26 | | | | 172 | 172 | 0 | 0.0000 |
| 16FGT | | | | | | | 90 | 43 | 45 | 24 | | | | 202 | 202 | 0 | 0.0000 |
| 16RGT | | | | | | | 71 | 46 | 35 | 9 | | | | 161 | 161 | 0 | 0.0000 |
| 16FCA | | | | | | | | | | | 71 | 86 | 80 | 238 | 237 | 1 | 0.0042 |
| 16RCA | | | | | | | | | | | 54 | 96 | 81 | 232 | 231 | 1 | 0.0043 |
| 16FCC | | | | | | | | | | | 106 | 93 | 106 | 305 | 305 | 0 | 0.0000 |
| 16RCC | | | | | | | | | | | 117 | 108 | 101 | 328 | 326 | 2 | 0.0061 |
| 16FCG | | | | | | | | | | | 80 | 99 | 112 | 292 | 291 | 1 | 0.0034 |
| 16RCG | | | | | | | | | | | 96 | 82 | 108 | 286 | 286 | 0 | 0.0000 |
| 16FCT | | | | | | | | | | | 86 | 93 | 84 | 266 | 263 | 3 | 0.0114 |
| 16RCT | | | | | | | | | | | 82 | 102 | 74 | 259 | 258 | 1 | 0.0039 |
| 16SF4A | | | | | | | | | | | 43 | 2 | | 45 | 45 | 0 | 0.0000 |
| 16SR4A | | | | | | | | | | | 55 | 5 | | 60 | 60 | 0 | 0.0000 |
| 16SF4B | 29 | | | | | | | | | | 15 | 4 | | 50 | 48 | 2 | 0.0417 |
| 16SR4B | 25 | | | | | | | | | | 19 | 6 | | 50 | 50 | 0 | 0.0000 |
| 16SF4C | 51 | | | | | | | | | | 60 | 3 | | 114 | 114 | 0 | 0.0000 |
| 16SR4C | 43 | | | | | | | | | | 54 | 3 | | 100 | 100 | 0 | 0.0000 |
| Total | | | | | | | | | | | | | | 5642 | 5622 | 20 | 0.1525 |
| Mean | | | | | | | | | | | | | | 148.5 | 147.9474 | 0.5263 | 0.0040 |
| Percent GS20 sequences | | | | | | | | | | | | | | 83.4 | 83.1 | | |
| Overall misassignment rate | | | | | | | | | | | | | | | | | 0.003557453 |
| Analysis by column: | | | | | | | | | | | | | | SUM | MEAN | | |
| Correctly assigned | 398 | 431 | 188 | 147 | 422 | 220 | 470 | 424 | 279 | 127 | 988 | 782 | 746 | 5622 | 432.46 | | |
| Incorrectly assigned | 0 | 1 | 0 | 0 | 2 | 1 | 2 | 1 | 0 | 11 | 0 | 2 | 0 | 20 | 1.5385 | | |
| Species assignment error | 0.0000 | 0.0023 | 0.0000 | 0.0000 | 0.0047 | 0.0045 | 0.0043 | 0.0024 | 0.0000 | 0.0866 | 0.0000 | 0.0026 | 0.0000 | | 0.0083 | | |
Italic numbers indicate misassigned sequences.
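The error values in the table appear to be the number of incorrectly assigned sequences divided by the number of correctly assigned ones (for example 1/141 ≈ 0.0071 for 16FAA and 20/5622 ≈ 0.003557 overall). That definition is inferred from the tabulated numbers rather than stated in the text, so the sketch below should be read under that assumption; the function name `assignment_error` is hypothetical.

```python
def assignment_error(correct, incorrect):
    """Error as incorrect / correct (assumed definition, inferred from the table)."""
    if correct == 0:
        return 0.0
    return incorrect / correct


# (correctly assigned, incorrectly assigned) for a few rows of the table.
rows = {
    "16FAA": (141, 1),
    "16FCT": (263, 3),
    "16SF4B": (48, 2),
}

row_errors = {name: assignment_error(c, i) for name, (c, i) in rows.items()}
overall_error = assignment_error(5622, 20)

if __name__ == "__main__":
    for name, err in row_errors.items():
        print(f"{name}: {err:.4f}")
    print(f"overall: {overall_error:.9f}")
```

Dividing by the total (5642) instead would give a slightly smaller overall value, so the choice of denominator matters when comparing against other studies.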
Observed and Expected sequence distributions sorted by 5′ tag composition
| 5′Tag | Sequences from forward primer | Sequences from reverse primer | Total sequences | Expected sequence frequency | Deviation |
| AA | 141 | 115 | 256 | 274.75 | −18.75 |
| AC | 179 | 113 | 292 | 274.75 | 17.25 |
| AG | 68 | 65 | 133 | 274.75 | −141.75 |
| AT | 95 | 93 | 188 | 274.75 | −86.75 |
| CA | 237 | 231 | 468 | 274.75 | 193.25 |
| CC | 305 | 326 | 631 | 274.75 | 356.25 |
| CG | 291 | 286 | 577 | 274.75 | 302.25 |
| CT | 263 | 258 | 521 | 274.75 | 246.25 |
| GA | 171 | 153 | 324 | 274.75 | 49.25 |
| GC | 114 | 93 | 207 | 274.75 | −67.75 |
| GG | 166 | 146 | 312 | 274.75 | 37.25 |
| GT | 178 | 152 | 330 | 274.75 | 55.25 |
| TA | 127 | 86 | 213 | 366.33 | −153.33 |
| TC | 113 | 111 | 224 | 366.33 | −142.33 |
| TG | 105 | 87 | 192 | 366.33 | −174.33 |
| TT | 110 | 100 | 210 | 366.33 | −156.33 |
| 4A | 45 (gcta) | 60 (gtca) | 105 | 183.16 | −78.16 |
| 4B | 48 (tcag) | 50 (tgac) | 98 | 274.75 | −176.75 |
| 4C | 114 (ctag) | 100 (tagc) | 214 | 274.75 | −60.75 |
The sequence of each tetranucleotide tag is given in parentheses.
Expected sequence frequencies are calculated to account for the number of initial PCRs commencing from each different 5′tag.
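The expected frequencies can be reproduced from the table if each tag's share is taken to be proportional to the number of PCRs that started from it. The PCR counts used below (4 for the T-tags, 2 for 4A, 3 for all other tags) are an assumption inferred from the ratios 366.33 : 274.75 : 183.16, not values given in the text; the observed totals are copied from the "Total sequences" column.

```python
# Observed totals per 5' tag, copied from the "Total sequences" column.
observed = {
    "AA": 256, "AC": 292, "AG": 133, "AT": 188,
    "CA": 468, "CC": 631, "CG": 577, "CT": 521,
    "GA": 324, "GC": 207, "GG": 312, "GT": 330,
    "TA": 213, "TC": 224, "TG": 192, "TT": 210,
    "4A": 105, "4B": 98, "4C": 214,
}

# Assumed number of initial PCRs per tag (inferred, not stated in the source).
pcrs_per_tag = {tag: 3 for tag in observed}
pcrs_per_tag.update({"TA": 4, "TC": 4, "TG": 4, "TT": 4, "4A": 2})

total_sequences = sum(observed.values())
total_pcrs = sum(pcrs_per_tag.values())

# Expected count: the tag's share of all PCRs times the total number of sequences.
expected = {
    tag: total_sequences * pcrs_per_tag[tag] / total_pcrs for tag in observed
}
deviation = {tag: observed[tag] - expected[tag] for tag in observed}

if __name__ == "__main__":
    for tag in observed:
        print(f"{tag}: expected {expected[tag]:.2f}, deviation {deviation[tag]:+.2f}")
```

Under these assumptions the sketch gives 274.75 for the three-PCR tags and about 366.33 for the T-tags, which agrees with the table; the 4A value comes out as 183.17, against 183.16 in the table, a difference consistent with truncation.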