Heidi J T Pagán, Jiří Macas, Petr Novák, Eve S McCulloch, Richard D Stevens, David A Ray.
Abstract
The repetitive landscapes of mammalian genomes typically display high Class I (retrotransposon) transposable element (TE) content, which usually comprises around half of the genome. In contrast, the Class II (DNA transposon) contribution is typically small (<3% in model mammals). Most mammalian genomes exhibit a precipitous decline in Class II activity beginning roughly 40 Ma. The first signs of more recently active mammalian Class II TEs were obtained from the little brown bat, Myotis lucifugus, and are reflected by higher genome content (~5%). To aid in determining taxonomic limits and potential impacts of this elevated Class II activity, we performed 454 survey sequencing of a second Myotis species as well as four additional taxa within the family Vespertilionidae and an outgroup species from Phyllostomidae. Graph-based clustering methods were used to reconstruct the major repeat families present in each species and novel elements were identified in several taxa. Retrotransposons remained the dominant group with regard to overall genome mass. Elevated Class II TE composition (3-4%) was observed in all five vesper bats, while less than 0.5% of the phyllostomid reads were identified as Class II derived. Differences in satellite DNA and Class I TE content are also described among vespertilionid taxa. These analyses present the first cohesive description of TE evolution across closely related mammalian species, revealing genome-scale differences in TE content within a single family.
Year: 2012 PMID: 22491057 PMCID: PMC3342881 DOI: 10.1093/gbe/evs038
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure. Most recent of several possible phylogenies for the surveyed taxa. Topology and vespertilionid divergence dates are taken from Lack and Van Den Bussche (2010). The date of the Artibeus lituratus/vespertilionid divergence is taken from Datzmann et al. (2010), and the M. lucifugus/M. austroriparius divergence is from Stadelmann et al. (2007).
454 Sequencing Summary
| Total Reads | Mean Read Length (bp) | Total Base Pairs | Estimated Genome Size (bp) | Percentage of Genome Coverage | Unique Reads (After Sequencing Artifact Filter) | Percentage of Replicates (After Filter) | Percentage of Unique Genome Coverage (After Filter) |
| 295660 | 397 | 1.01 × 10^8 | 2.70 × 10^9 | 3.75 | 255065 | 13.73 | 3.23 |
| 403317 | 285 | 1.15 × 10^8 | 2.42 × 10^9 | 4.75 | 317269 | 21.34 | 3.74 |
| 233826 | 368 | 8.60 × 10^7 | 2.56 × 10^9 | 3.36 | 169361 | 27.57 | 2.43 |
| 86583 | 285 | 2.47 × 10^7 | 3.26 × 10^9 | 0.76 | 67924 | 21.55 | 0.59 |
| 135978 | 280 | 3.81 × 10^7 | 2.42 × 10^9 | 1.57 | 108535 | 20.18 | 1.26 |
| 122395 | 265 | 3.24 × 10^7 | 2.26 × 10^9 | 1.44 | 99801 | 18.46 | 1.17 |
NOTE.—Percentage of Genome Coverage was approximated using mean read length and estimated genome size. A sequencing artifact filter was applied to the data before graph-based repeat discovery and RepeatMasker analyses to determine genome representation; Percentage of Unique Genome Coverage reflects the reads that remained after filtering.
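The approximation described in the note can be sketched as follows. This is a minimal illustration, not the authors' script: the function names are hypothetical, and the replicate formula (reads removed by the filter divided by total reads) is an assumption that happens to agree with the tabulated values. The inputs are the figures from the second data row of the table.

```python
def coverage_percent(reads: int, mean_read_length: float, genome_size: float) -> float:
    """Approximate percentage of the genome represented by the reads."""
    return reads * mean_read_length / genome_size * 100.0


def replicate_percent(total_reads: int, unique_reads: int) -> float:
    """Percentage of reads removed as replicates (assumed definition)."""
    return (total_reads - unique_reads) / total_reads * 100.0


# Values copied from the second data row of the sequencing summary table.
total_reads = 403317
unique_reads = 317269
mean_read_length = 285   # bp
genome_size = 2.42e9     # bp, estimated

print(f"coverage:        {coverage_percent(total_reads, mean_read_length, genome_size):.2f}%")
print(f"replicates:      {replicate_percent(total_reads, unique_reads):.2f}%")
print(f"unique coverage: {coverage_percent(unique_reads, mean_read_length, genome_size):.2f}%")
```

The same mean read length is used before and after filtering here, which is a simplification; the filtered reads may have a slightly different mean length.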
Comparison of RepeatMasker Output from Myotis austroriparius 454 Data and the WGS for M. lucifugus
| Element Class/Family | M. austroriparius: Percentage of RM Hits | M. austroriparius: Percentage of 454 Sequence Data | M. lucifugus: Percentage of RM Hits | M. lucifugus: Percentage of WGS |
| DNA/hAT | 10.75 | 2.07 | 12.95 | 2.29 |
| DNA/Helitron | 15.13 | 2.78 | 16.23 | 2.57 |
| DNA/Mariner | 3.19 | 0.68 | 3.32 | 0.67 |
| DNA/piggyBac | 1.14 | 0.27 | 0.65 | 0.16 |
| DNA/TcMar-Tigger | 0.09 | 0.02 | 0.27 | 0.05 |
| ERV/LTR | 10.49 | 2.35 | 9.17 | 2.22 |
| Non-LTR/LINE | 29.10 | 9.21 | 17.49 | 6.02 |
| Non-LTR/SINE | 30.02 | 5.31 | 39.86 | 6.27 |
| Non-LTR/unknown | 0.10 | 0.03 | 0.04 | 0.02 |
NOTE.—Percentage of RM hits = proportion of total RepeatMasker hits to any given TE type. Percentage of 454 sequence data indicates proportion of bases masked from M. austroriparius survey sequence data. Percentage of WGS indicates proportion of bases masked in the M. lucifugus WGS.
Top Clusters for Each Taxon
| Cluster Number | Original Number of Reads | Number of Reads Used in Contigs | Number of Cluster-Based Contigs | Number of SeqMan Contigs | Number of RepeatMasker Reads | Element Name | Element Family | |
| CL1 | 9595 | 8625 | 283 | 7 | 24730 | L1MAB_ML | Non-LTR/LINE | |
| | | | | | 2347 | ERV2X1A_I_ML | ERV/LTR | |
| CL2 | 3820 | 3526 | 61 | 1 | 4842 | HAL1-1A_ML | Non-LTR/LINE | |
| CL3 | 2582 | 2538 | 3 | 1 | 2814 | mtDNA | ||
| CL4 | 2469 | 2249 | 37 | 1 | 3775 | HAL1-1A_ML | Non-LTR/LINE | |
| CL5 | 1755 | 1601 | 77 | 4 | 3343 | HAL1-1A_ML | Non-LTR/LINE | |
| CL1 | 3324 | 2919 | 102 | 1 | 12262 | L1MAB_ML | Non-LTR/LINE | |
| CL2 | 2174 | 1956 | 80 | 1 | 2973 | HAL1-1A_ML | Non-LTR/LINE | |
| CL3 | 1076 | 847 | 92 | 4 | 4182 | HAL1-1A_ML | Non-LTR/LINE | |
| CL4 | 625 | 531 | 44 | 3 | 1467 | L1MAB_ML | Non-LTR/LINE | |
| CL5 | 510 | 380 | 15 | 1 | 1663 | HAL1-1A_ML | Non-LTR/LINE | |
| CL1 | 748 | 677 | 20 | 1 | 2197 | L1MAB_ML | Non-LTR/LINE | |
| CL2 | 644 | 599 | 14 | 1 | 960 | HAL1-1B_ML | Non-LTR/LINE | |
| CL3 | 563 | 330 | 16 | 3 | 2882 | VES | Non-LTR/SINE | |
| CL4 | 303 | 226 | 6 | 1 | 423 | Tandem Repeat | Satellite | |
| CL5 | 262 | 248 | 4 | 1 | 510 | L1MAB2_ML | Non-LTR/LINE | |
| CL1 | 1818 | 1093 | 34 | 4 | 10436 | VES | Non-LTR/SINE | |
| CL2 | 614 | 521 | 28 | 2 | 1397 | HAL1-1A_ML | Non-LTR/LINE | |
| CL3 | 470 | 399 | 26 | 2 | 2101 | L1MAB_ML | Non-LTR/LINE | |
| | | | | | 226 | ERV2X1A_I_ML | ERV/LTR | |
| CL4 | 432 | 357 | 10 | 1 | 229 | L1MAB_ML | Non-LTR/LINE | |
| | | | | | 512 | ERV2X1A_I_ML | ERV/LTR | |
| CL5 | 345 | 260 | 37 | 2 | 3218 | nHelitron1_Nh | DNA/Helitron | |
| CL1 | 2092 | 1634 | 65 | 3 | 2934 | Tandem Repeat | Satellite | |
| CL2 | 1596 | 1430 | 54 | 1 | 5329 | L1MAB_ML | Non-LTR/LINE | |
| CL3 | 1408 | 1151 | 88 | 6 | 4994 | nHelitron1_Ps | DNA/Helitron | |
| CL4 | 1282 | 1157 | 37 | 1 | 2002 | HAL1-1A_ML | Non-LTR/LINE | |
| CL5 | 830 | 790 | 7 | 1 | 926 | Tandem Repeat | Satellite | |
| CL1 | 5933 | 5225 | 154 | 4 | 24398 | L1-4_PVa | Non-LTR/LINE | |
| CL2 | 5299 | 4563 | 169 | 5 | 11493 | HAL1-3_ML | Non-LTR/LINE | |
| CL3 | 2688 | 2498 | 20 | 1 | 3131 | Tandem Repeat | Satellite | |
| CL4 | 2454 | 2385 | 7 | 1 | 2609 | Tandem Repeat | Satellite | |
| CL5 | 1482 | 1269 | 41 | 3 | 2321 | Tandem Repeat | Satellite |
NOTE.—Information regarding the content of the graph-based clusters is provided, including the original number of contigs, which were submitted to SeqMan. The SeqMan contigs were then submitted to CENSOR for identification and used to RepeatMask the respective taxonomic 454 data set to determine genome representation.
Characteristics and Ages of Novel TEs
| Element | Length (bp) | TIR (bp) | ORF (aa) | N | Average K2P | Standard Error | Average Age (Myr) |
| Mariner2_Ml | 803 | 28 | 235 | 349 | 0.0188 | 0.0005 | 8.5 |
| | 192 | 16 | | 404 | 0.0194 | 0.0006 | 8.8 |
| | 2294 | 25 | 347 | 23 | 0.0197 | 0.0024 | 9.0 |
| | 203 | 16 | | 127 | 0.0223 | 0.0018 | 10.1 |
| | 246 | 16 | | 61 | 0.0228 | 0.0012 | 10.4 |
| | 231 | 25 | | 518 | 0.0268 | 0.0006 | 12.2 |
| nHeliBat1_Ps | 1207 | | | 33 | 0.0416 | 0.0041 | 18.9 |
| | 213 | 16 | | 47 | 0.0509 | 0.0066 | 23.2 |
| | 184 | 29 | | 54 | 0.0639 | 0.0032 | 29.1 |
| nHeliBat1_Lb | 993 | | | 209 | 0.0905 | 0.0019 | 41.1 |
| nHeliBat1_Nh | 1183 | | | 34 | 0.0916 | 0.0055 | 41.7 |
| nHeliBat2_Ps | 220 | | | 39 | 0.1119 | 0.0113 | 50.8 |
| nHeliBat1_Cr | 364 | | | 74 | 0.1208 | 0.0041 | 54.9 |
| Mariner1_Ps | 1293 | 32 | 345 | ||||
| nMariner1_Ps | 279 | 67 | |||||
| Mariner1_Ml | 1211 | 198 | 235 | ||||
| 337 | 16 |
Note.—Elements shown in bold are lineage-specific. Names preceded by an “n” are nonautonomous. Age estimations are only shown if >20 hits of appropriate length were obtained for analysis. Final two letters denote data set from which consensus was inferred (e.g., Lb–L. borealis).
N: number of RepeatMasker hits that are at least 90% of the query length; see Materials and Methods.
Average Age: calculated using the average mammalian neutral mutation rate (2.2 × 10^−9 substitutions per site per year).
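The ages in the table are consistent with dividing the average K2P distance by the neutral mutation rate. The sketch below assumes that relationship (age = K2P / rate); the exact procedure is described in the paper's Materials and Methods, which are not part of this record, and the function name is hypothetical.

```python
NEUTRAL_RATE = 2.2e-9  # substitutions per site per year, from the table footnote


def age_myr(avg_k2p: float, rate: float = NEUTRAL_RATE) -> float:
    """Estimate element age in millions of years from average K2P distance.

    Assumes age = K2P / rate, which matches the tabulated values to rounding.
    """
    return avg_k2p / rate / 1e6


# A few (element, average K2P) pairs copied from the table.
for name, k2p in [("Mariner2_Ml", 0.0188), ("nHeliBat1_Ps", 0.0416), ("nHeliBat1_Lb", 0.0905)]:
    print(f"{name}: {age_myr(k2p):.1f} Myr")
```

Small differences from the published ages (for example, in the last decimal place) would be expected because the tabulated K2P values are themselves rounded.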
Figure. Genome representation of the TE classes. The inclusion of outgroup Artibeus suggests elevated DNA transposon activity is limited to the vesper taxa, while other aspects of their repetitive landscapes differ within the family.
Genome Representation Determined Using RepeatMasker and a Custom Repeat Library Compiled for Each Taxon
| Non-LTR/LINE (%) | Non-LTR/SINE (%) | ERV/LTR (%) | Total Class I (%) | Total Class II (%) | |
| 14.83 | 2.90 | 0.93 | 18.66 | 0.38 | |
| 11.74 | 4.02 | 0.42 | 16.18 | 2.56 | |
| 11.93 | 3.91 | 0.97 | 16.81 | 3.12 | |
| 7.16 | 6.04 | 1.02 | 14.22 | 3.11 | |
| 8.46 | 4.48 | 0.53 | 13.48 | 3.52 | |
| 9.33 | 4.18 | 0.69 | 14.20 | 4.45 |
NOTE.—Primary Class I repeat types are shown, and final two columns depict Class I versus Class II content.
Figure. Correlation of Class I and Class II TE activity. Initial data suggest that TE activity may be inversely related between the two classes such that higher Class II genome representation is accompanied by a decrease in Class I content (r = −0.85, P < 0.05).
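As a rough check of the reported correlation, the sketch below computes Pearson's r from the Total Class I and Total Class II columns of the genome representation table. Using those two columns as the inputs is an assumption, since this record does not state which values were correlated; the helper name is hypothetical.

```python
import math

# Total Class I (%) and Total Class II (%) columns, in table row order.
class_i = [18.66, 16.18, 16.81, 14.22, 13.48, 14.20]
class_ii = [0.38, 2.56, 3.12, 3.11, 3.52, 4.45]


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(class_i, class_ii)

# t statistic for testing r against zero, with n - 2 degrees of freedom.
n = len(class_i)
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))

print(f"r = {r:.2f}, t = {t_stat:.2f} (df = {n - 2})")
```

With six taxa there are four degrees of freedom, for which the two-tailed 5% critical value of t is 2.776; a |t| above that threshold corresponds to P < 0.05.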