Philippe Chouvarine, Amanda M Cooksey, Fiona M McCarthy, David A Ray, Brian S Baldwin, Shane C Burgess, Daniel G Peterson.
Abstract
BACKGROUND: Distinguishing between individuals is critical to those conducting animal/plant breeding, food safety/quality research, diagnostic and clinical testing, and evolutionary biology studies. Classical genetic identification studies are based on marker polymorphisms, but polymorphism-based techniques are time- and labor-intensive and often cannot distinguish between closely related individuals. Illumina sequencing technologies provide the detailed sequence data required for rapid and efficient differentiation of related species, lines/cultivars, and individuals in a cost-effective manner. Here we describe the use of Illumina high-throughput exome sequencing, coupled with SNP mapping, as a rapid means of distinguishing between related cultivars of the lignocellulosic bioenergy crop giant miscanthus (Miscanthus × giganteus). We provide the first exome sequence database for Miscanthus species complete with Gene Ontology (GO) functional annotations.
Year: 2012 PMID: 22253803 PMCID: PMC3254643 DOI: 10.1371/journal.pone.0029850
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Outline of procedure used to identify SNPs from miscanthus samples.
SNPs per aligned bp identified in comparative analysis of cDNA regions common to all samples.
| | FF | FO | FN | I | C | F | MS |
| FF | - | 0.000413390 | 0.000388363 | 0.000470852 | 0.000349889 | 0.000546697 | 0.000533935 |
| FO | 0.000314511 | - | 0.000348281 | 0.000434378 | 0.000309330 | 0.000486504 | 0.000502400 |
| FN | 0.000319526 | 0.000370514 | - | 0.000472350 | 0.000359891 | 0.000531350 | 0.000557107 |
| I | 0.000287344 | 0.000333024 | 0.000314453 | - | 0.000306604 | 0.000462724 | 0.000500130 |
| C | 0.000356861 | 0.000409450 | 0.000387226 | 0.000479916 | - | 0.000491909 | 0.000558566 |
| F | 0.000102675 | 0.000137919 | 0.000125332 | 0.000182317 | 0.000112822 | - | 0.000236819 |
| MS | 0.000187104 | 0.000244045 | 0.000230092 | 0.000334766 | 0.000212052 | 0.00060301 | - |
Distance matrix.
| | FF | FO | FN | I | C | F | MS |
| FF | - | | | | | | |
| FO | 0.00036395 | - | | | | | |
| FN | 0.00035394 | 0.00035940 | - | | | | |
| I | 0.00037910 | 0.00038370 | 0.00039340 | - | | | |
| C | 0.00035337 | 0.00035939 | 0.00037356 | 0.00039326 | - | | |
| F | 0.00032469 | 0.00031221 | 0.00032834 | 0.00032252 | 0.00030237 | - | |
| MS | 0.00036052 | 0.00037322 | 0.00039360 | 0.00041745 | 0.00038531 | 0.00041991 | - |
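The distance matrix can be cross-checked against the SNPs-per-aligned-bp table. In the sketch below, every reported distance agrees, to within rounding, with the arithmetic mean of the two directional values for the same pair of samples. This is an observation about the tabulated numbers only; it is not a statement of the authors' weighting scheme, and the names `SNPS_PER_BP`, `REPORTED`, and `symmetric_mean` are illustrative.

```python
SAMPLES = ["FF", "FO", "FN", "I", "C", "F", "MS"]

# Directional SNPs/bp, copied from the SNPs-per-aligned-bp table
# (None marks the diagonal).
SNPS_PER_BP = [
    [None, 0.000413390, 0.000388363, 0.000470852, 0.000349889, 0.000546697, 0.000533935],
    [0.000314511, None, 0.000348281, 0.000434378, 0.000309330, 0.000486504, 0.000502400],
    [0.000319526, 0.000370514, None, 0.000472350, 0.000359891, 0.000531350, 0.000557107],
    [0.000287344, 0.000333024, 0.000314453, None, 0.000306604, 0.000462724, 0.000500130],
    [0.000356861, 0.000409450, 0.000387226, 0.000479916, None, 0.000491909, 0.000558566],
    [0.000102675, 0.000137919, 0.000125332, 0.000182317, 0.000112822, None, 0.000236819],
    [0.000187104, 0.000244045, 0.000230092, 0.000334766, 0.000212052, 0.00060301, None],
]

# Lower triangle of the reported distance matrix, rows FO through MS.
REPORTED = [
    [0.00036395],
    [0.00035394, 0.00035940],
    [0.00037910, 0.00038370, 0.00039340],
    [0.00035337, 0.00035939, 0.00037356, 0.00039326],
    [0.00032469, 0.00031221, 0.00032834, 0.00032252, 0.00030237],
    [0.00036052, 0.00037322, 0.00039360, 0.00041745, 0.00038531, 0.00041991],
]


def symmetric_mean(directional):
    """Return the lower triangle of pairwise means of directional[i][j] and directional[j][i]."""
    n = len(directional)
    return [
        [(directional[i][j] + directional[j][i]) / 2 for j in range(i)]
        for i in range(n)
    ]


MEANS = symmetric_mean(SNPS_PER_BP)  # MEANS[0] is empty (no pairs below the first row)

# Largest absolute difference between a computed mean and the reported distance.
MAX_DIFF = max(
    abs(mean - reported)
    for mean_row, reported_row in zip(MEANS[1:], REPORTED)
    for mean, reported in zip(mean_row, reported_row)
)
print(f"largest |mean - reported| = {MAX_DIFF:.2e}")
```

The comparison uses an absolute tolerance rather than exact rounding because the reported distances are given to eight decimal places.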
Figure 2. Phylogenetic tree inferred by SNP analysis in common regions of all seven samples.
Phylogeny is inferred using weighted SNPs/bp to prepare a distance matrix and generate the neighbor-joining tree for the miscanthus samples.
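As a rough illustration of how a neighbor-joining tree can be derived from such a distance matrix, the sketch below implements the textbook neighbor-joining procedure and applies it to the tabulated distances. It is not the software the authors used, and the function names (`neighbor_joining`, `from_lower_triangle`) are hypothetical; the resulting topology should be compared against the published figure rather than assumed to match it.

```python
def from_lower_triangle(rows):
    """Expand a lower-triangular list of rows into a full symmetric matrix."""
    n = len(rows) + 1
    full = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(rows, start=1):
        for j, value in enumerate(row):
            full[i][j] = full[j][i] = value
    return full


def neighbor_joining(names, dist):
    """Textbook neighbor joining. Returns (newick_string, list_of_joins)."""
    d = {a: {b: dist[i][j] for j, b in enumerate(names) if i != j}
         for i, a in enumerate(names)}
    nodes = list(names)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other active nodes.
        total = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - total[a] - total[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to their new parent node.
        len_a = 0.5 * d[a][b] + (total[a] - total[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        joins.append((a, b, len_a, len_b))
        new = f"({a}:{len_a:.8f},{b}:{len_b:.8f})"
        rest = [c for c in nodes if c not in (a, b)]
        d[new] = {}
        for c in rest:
            d[new][c] = d[c][new] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = rest + [new]
    # Two nodes remain; the tree is unrooted, so the last edge is put on one side.
    a, b = nodes
    return f"({a}:{d[a][b]:.8f},{b}:0.00000000);", joins


SAMPLES = ["FF", "FO", "FN", "I", "C", "F", "MS"]
# Lower triangle of the distance matrix as tabulated (rows FO through MS).
LOWER = [
    [0.00036395],
    [0.00035394, 0.00035940],
    [0.00037910, 0.00038370, 0.00039340],
    [0.00035337, 0.00035939, 0.00037356, 0.00039326],
    [0.00032469, 0.00031221, 0.00032834, 0.00032252, 0.00030237],
    [0.00036052, 0.00037322, 0.00039360, 0.00041745, 0.00038531, 0.00041991],
]

NEWICK, JOINS = neighbor_joining(SAMPLES, from_lower_triangle(LOWER))
print(NEWICK)
```

The Newick string can be pasted into any tree viewer. Neighbor joining can yield small negative branch lengths on noisy data, so the printed lengths are best read as approximate.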
Figure 3. Impact of k-mer size on characteristics of Miscanthus × giganteus exome assembly in Velvet.
Assisted assemblies used Sorghum bicolor transcripts as references. (A) N50 vs. k-mer size. (B) Longest contig length vs. k-mer size. (C) Sum of contig lengths (Mb) vs. k-mer size. (D) Average length of the top 100 longest contigs vs. k-mer size.
Transcript assembly metrics evaluation using Arabidopsis thaliana assemblies.
| k | Average length of the top 100 longest contigs | Length of the longest contig | N50 | Number of megablast hits with 100% identity to the standard transcript sequences produced by the contig sequences | Number of bases in the regions where our transcript contig sequences aligned, without overlapping each other, to the standard transcript sequences with 100% identity |
| 15 | 1261 | 1957 | 8 | 661 | 8571 |
| 17 | 1482 | 2365 | 110 | 73600 | 1789362 |
| 19 | | 4616 | 223 | 92409 | |
| 21 | 1886 | 4182 | 165 | 73506 | 2124487 |
| 23 | 1732 | 5050 | 235 | 47372 | 2040209 |
| 25 | 1662 | 5048 | 300 | 31027 | 1821088 |
| 27 | 1590 | 5046 | 346 | 20384 | 1493454 |
| 29 | 1457 | 5044 | 379 | 13093 | 1102776 |
| 31 | 1382 | 5042 | 416 | 7656 | 750977 |
| 33 | 1253 | 4260 | 474 | 3679 | 427093 |
| 35 | 1005 | 4250 | 510 | 1362 | 120707 |
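The contig-level metrics in the table and in Figure 3 (N50, longest contig, sum of contig lengths, mean length of the 100 longest contigs) can be computed from a list of contig lengths. The sketch below uses the common definition of N50, the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly; whether the reported values follow exactly this convention is an assumption, and `assembly_metrics` is an illustrative name.

```python
def assembly_metrics(contig_lengths, top_n=100):
    """Summarize an assembly from its contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    total = sum(lengths)

    # Walk down from the longest contig until half of the assembly is covered.
    running = 0
    n50 = lengths[-1]
    for length in lengths:
        running += length
        if 2 * running >= total:
            n50 = length
            break

    top = lengths[:top_n]
    return {
        "n50": n50,
        "longest": lengths[0],
        "total_bp": total,
        "mean_top": sum(top) / len(top),
    }


# Toy contig lengths, made up purely for illustration.
EXAMPLE = assembly_metrics([80, 70, 50, 40, 30, 20], top_n=2)
print(EXAMPLE)
```

Running the same function on the contigs produced at each k-mer size would give one row of metrics per value of k, which is how a comparison like the one tabulated here can be assembled.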
Figure 4. Distribution of GO annotation for miscanthus sequences compared to Sorghum bicolor.
Sorghum GO annotations were downloaded from AgBase (October 2010), and the Plant GO Slim was used to group and compare GO annotations from miscanthus and Sorghum bicolor, a closely related species. (A) Biological process GO terms. (B) Cellular component GO terms.