Maud I Tenaillon, Matthew B Hufford, Brandon S Gaut, Jeffrey Ross-Ibarra.
Abstract
The genome of maize (Zea mays ssp. mays) consists mostly of transposable elements (TEs) and varies in size among lines. This variation extends to other species in the genus Zea: although maize and Zea luxurians diverged only ∼140,000 years ago, their genomes differ in size by ∼50%. We used paired-end Illumina sequencing to evaluate the potential contribution of TEs to the genome size difference between these two species. We aligned the reads both to a filtered gene set and to an exemplar database of unique repeats representing 1,514 TE families; ∼85% of reads mapped against TE repeats in both species. The relative contribution of TE families to the B73 genome was highly correlated with previous estimates, suggesting that reliable estimates of TE content can be obtained from short high-throughput sequencing reads, even at low coverage. Because we used paired-end reads, we could assess whether a TE was near a gene by determining if one paired read mapped to a TE and the second read mapped to a gene. Using this method, Class 2 DNA elements were found significantly more often in genic regions than Class 1 RNA elements, but Class 1 elements were found more often near other TEs. Overall, we found that both Class 1 and 2 TE families account for ∼70% of the genome size difference between B73 and luxurians. Interestingly, the relative abundance of TE families was conserved between species (r = 0.97), suggesting genome-wide control of TE content rather than family-specific effects.
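The paired-end logic described in the abstract (one mate on a TE, the other on a gene) can be illustrated with a minimal sketch. The data layout here is hypothetical: each mate is represented as `("TE", family)`, `("GENE", gene_id)`, or `None` when unmapped, and a TE is counted as "near another TE" only when its mate hits a different TE family. That second rule is an assumption for illustration, not necessarily the criterion used in the study.

```python
from collections import Counter

def classify_pairs(pairs):
    """Count, per TE family, read pairs whose mate hits a gene or another TE.

    pairs: iterable of (hit1, hit2); each hit is ("TE", family),
    ("GENE", gene_id), or None for an unmapped mate (hypothetical layout).
    """
    near_gene = Counter()
    near_te = Counter()
    for a, b in pairs:
        # Look at the pair from both mates' point of view.
        for first, second in ((a, b), (b, a)):
            if first is None or second is None or first[0] != "TE":
                continue
            if second[0] == "GENE":
                near_gene[first[1]] += 1
            elif second[0] == "TE" and second[1] != first[1]:
                # Assumption: only a different family counts as "near a TE".
                near_te[first[1]] += 1
    return near_gene, near_te

# Toy input with made-up family and gene names.
example_pairs = [
    (("TE", "A"), ("GENE", "g1")),
    (("TE", "A"), ("TE", "B")),
    (("TE", "A"), ("TE", "A")),
    (("GENE", "g2"), ("TE", "B")),
    (None, ("TE", "A")),
]
near_gene, near_te = classify_pairs(example_pairs)
```

Comparing the two counters per family, after normalizing for coverage, is one way to ask whether a family is found more often in genic or in TE-rich regions.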
Year: 2011 PMID: 21296765 PMCID: PMC3068001 DOI: 10.1093/gbe/evr008
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Counts of the Mapping Results against the UTE and the FGS for Paired-End Illumina Reads for B73 and luxurians, 3 Data Sets of 36-bp Illumina Reads from Low-Copy-Enriched B73 Libraries, and the B73 In Silico 100-bp Reads
| | B73–104 | B73–84 | LUX-104 | LUX-84 | MS-HpaII | MI-HpaII | BbvI | In Silico Data |
| #Reads | 18,689,556 | 18,689,556 | 19,942,282 | 19,942,282 | 3,814,762 | 3,130,565 | 5,066,369 | 18,598,686 |
| #UTE hits | 12,503,392 | 11,664,486 | 13,281,920 | 12,399,664 | 254,471 | 1,029,319 | 639,215 | 12,642,456 |
| #FGS hits | 2,101,030 | 2,290,718 | 2,113,671 | 2,347,419 | 1,544,034 | 629,885 | 1,905,067 | 3,190,485 |
| #Unmapped | 4,085,134 | 4,734,352 | 4,546,691 | 5,195,199 | 2,016,257 | 1,471,361 | 2,522,087 | 2,765,745 |
| % Mapped | 78.1 | 74.7 | 77.2 | 73.9 | 47.1 | 53.0 | 50.2 | 85.1 |
| % UTE | 85.6 | 83.6 | 86.3 | 84.1 | 14.1 | 62.0 | 25.1 | 79.8 |
| % FGS | 14.4 | 16.4 | 13.7 | 15.9 | 85.9 | 38.0 | 74.9 | 20.2 |
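The derived rows of the table follow from the three count rows. The sketch below recomputes them, assuming (as the table's arithmetic suggests) that each mapped read is counted as either a UTE hit or an FGS hit, that % Mapped is relative to all reads, and that % UTE and % FGS are relative to mapped reads only. The function and variable names are illustrative.

```python
# (reads, UTE hits, FGS hits) copied from the table above.
COUNTS = {
    "B73-104": (18_689_556, 12_503_392, 2_101_030),
    "B73-84": (18_689_556, 11_664_486, 2_290_718),
    "LUX-104": (19_942_282, 13_281_920, 2_113_671),
    "LUX-84": (19_942_282, 12_399_664, 2_347_419),
    "MS-HpaII": (3_814_762, 254_471, 1_544_034),
    "MI-HpaII": (3_130_565, 1_029_319, 629_885),
    "BbvI": (5_066_369, 639_215, 1_905_067),
    "In silico": (18_598_686, 12_642_456, 3_190_485),
}

def mapping_summary(reads, ute_hits, fgs_hits):
    """Recompute the unmapped count and the three percentage rows."""
    mapped = ute_hits + fgs_hits
    return {
        "unmapped": reads - mapped,
        "pct_mapped": round(100 * mapped / reads, 1),
        "pct_ute": round(100 * ute_hits / mapped, 1),
        "pct_fgs": round(100 * fgs_hits / mapped, 1),
    }

summaries = {name: mapping_summary(*c) for name, c in COUNTS.items()}
```

For example, `summaries["MS-HpaII"]["pct_ute"]` shows how strongly the low-copy-enriched libraries deplete TE reads relative to the unfiltered B73 data.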
Correlation of RPKM between (A) 1,509 TE families estimated from B73–104 and the in silico data with 5 outliers indicated in gray; (B) 1,514 TE families estimated from B73 and luxurians. Values are shown on a log scale, with a pseudocount of 1 added to families with 0 counts.
Log scale coverage along the unique sequence length of four of the five outlier TE families. Shown is RPKM of B73 (black lines) and in silico data (gray lines). (A) RLX_osed_AC191084-2931, (B) RLX_sela_AC195130-4415, (C) RLX_teki_AC202867-7492, and (D) RLX_sari_AC184117-11.
Sorted log difference in number of TE hits normalized by coverage between (A) B73 and luxurians, where negative values indicate an excess of TE hits in luxurians; (B) the genic and nested TE regions in B73 (black dots) and luxurians (gray dots), where negative values indicate an excess of TE hits in TE-nested regions. Dashed lines indicate zero difference.
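The RPKM values and log-scale correlations mentioned in the captions above can be sketched as follows. This uses the standard RPKM definition (reads per kilobase of reference per million mapped reads). The pseudocount is applied by replacing a zero read count with 1 before normalizing, which is one reading of the caption and should be treated as an assumption. All function names are illustrative.

```python
import math

def rpkm(count, length_bp, total_mapped):
    """Reads per kilobase of reference per million mapped reads."""
    return count * 1e9 / (length_bp * total_mapped)

def log_rpkm(count, length_bp, total_mapped):
    # Assumption: the pseudocount of 1 replaces a zero read count.
    safe_count = count if count > 0 else 1
    return math.log10(rpkm(safe_count, length_bp, total_mapped))

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

Given per-family read counts and unique sequence lengths for two samples, applying `log_rpkm` to each family and passing the two resulting lists to `pearson` gives a correlation of the kind plotted in the first figure.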