Manee M Manee, John Jackson, Casey M Bergman.
Abstract
Highly conserved noncoding elements (CNEs) constitute a significant proportion of the genomes of multicellular eukaryotes. The function of most CNEs remains elusive, but growing evidence indicates they are under some form of purifying selection. Noncoding regions in many species also harbor large numbers of transposable element (TE) insertions, which are typically lineage specific and depleted in exons because of their deleterious effects on gene function or expression. However, it is currently unknown whether the landscape of TE insertions in noncoding regions is random or influenced by purifying selection on CNEs. Here, we combine comparative and population genomic data in Drosophila melanogaster to show that the abundance of TE insertions in intronic and intergenic CNEs is reduced relative to random expectation, supporting the idea that selective constraints on CNEs eliminate a proportion of TE insertions in noncoding regions. However, we find no evidence for differences in the allele frequency spectra for polymorphic TE insertions in CNEs versus those in unconstrained spacer regions, suggesting that the distribution of fitness effects acting on observable TE insertions is similar across different functional compartments in noncoding DNA. Our results provide evidence that selective constraints on CNEs contribute to shaping the landscape of TE insertion in eukaryotic genomes, and provide further evidence that CNEs are indeed functionally constrained and not simply mutational cold spots.
Year: 2018 PMID: 29850787 PMCID: PMC6007792 DOI: 10.1093/gbe/evy104
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
TE Insertions in Normal Recombination Regions
| Region | Coverage (bp) | % Normal Rec. Genome | # ngs_te_mapper TE | % ngs_te_mapper TE | # TEMP TE | % TEMP TE |
|---|---|---|---|---|---|---|
| Exon | 27,502,613 | 26.4 | 399 | 6.6 | 278 | 6.0 |
| Intron | 38,960,671 | 37.4 | 2,743 | 45.3 | 2,153 | 46.3 |
| Intron/exon | n.a. | n.a. | 5 | 0.1 | 7 | 0.2 |
| Intergenic | 37,804,929 | 36.3 | 2,905 | 47.9 | 2,210 | 47.5 |
| Intergenic/exon | n.a. | n.a. | 9 | 0.1 | 4 | 0.1 |
| Total | 104,268,213 | 100.0 | 6,061 | 100.0 | 4,652 | 100.0 |
Note.—Columns contain the coverage (in bp) and percent of the normally recombining genome covered for exonic, intronic, and intergenic regions followed by the number and percent of TE insertions found fully in exonic, intronic, and intergenic regions or spanning intron/exon and intergenic/exon boundaries for both ngs_te_mapper and TEMP. Overlap categories have “n.a.” for coverage and percent of the normally recombining genome covered since boundaries between compartments do not occupy any space. Regions of the reference genome identified by RepeatMasker as TE were subtracted from all compartments and any nonreference TE in these regions were excluded from all analyses. Regions of normal recombination were defined by Cridland et al. (2013).
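The depletion of TEs in exons can be read directly from the table by comparing each compartment's share of TE insertions with its share of the genome. The following is a minimal sketch, not the authors' analysis: it assumes insertions land uniformly per base pair under the null, and it leaves out the boundary-spanning categories because they have no coverage. The function name `observed_over_expected` is illustrative only.

```python
# Values copied from the table above (normally recombining regions only).
coverage_bp = {"Exon": 27_502_613, "Intron": 38_960_671, "Intergenic": 37_804_929}
observed = {
    "ngs_te_mapper": {"Exon": 399, "Intron": 2_743, "Intergenic": 2_905},
    "TEMP": {"Exon": 278, "Intron": 2_153, "Intergenic": 2_210},
}

def observed_over_expected(counts, coverage):
    """Return {region: observed / expected} assuming uniform insertion per bp.

    Assumption (not from the source): the expected count for a region is the
    total number of non-boundary insertions times the region's share of bp.
    """
    total_bp = sum(coverage.values())
    total_te = sum(counts[region] for region in coverage)
    return {
        region: counts[region] / (total_te * coverage[region] / total_bp)
        for region in coverage
    }

ratios = {method: observed_over_expected(c, coverage_bp) for method, c in observed.items()}
for method, per_region in ratios.items():
    print(method, {region: round(value, 2) for region, value in per_region.items()})
```

A ratio below 1 indicates fewer insertions than the coverage share alone would predict. A simple ratio like this gives no measure of uncertainty, which is why the figures rely on permutation-based null distributions.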
TE Insertions in Noncoding Regions with Normal Recombination
| Region | Coverage (bp) | % Normal Rec. Noncoding Genome | # ngs_te_mapper TE | % ngs_te_mapper TE | # TEMP TE | % TEMP TE |
|---|---|---|---|---|---|---|
| Intronic CNE | 14,093,340 | 18.4 | 747 | 13.2 | 500 | 11.5 |
| Intronic spacer | 24,867,331 | 32.4 | 1,842 | 32.6 | 1,458 | 33.4 |
| Intronic CNE/spacer | n.a. | n.a. | 154 | 2.7 | 195 | 4.5 |
| Intergenic CNE | 14,749,396 | 19.2 | 813 | 14.4 | 577 | 13.2 |
| Intergenic spacer | 23,055,533 | 30.0 | 1,928 | 34.1 | 1,447 | 33.2 |
| Intergenic CNE/spacer | n.a. | n.a. | 164 | 2.9 | 186 | 4.3 |
| Total | 76,765,600 | 100.0 | 5,648 | 100.0 | 4,363 | 100.0 |
Note.—Columns contain the coverage (in bp) and percent of the normally recombining noncoding genome covered by CNEs and spacers for introns and intergenic regions followed by the number and percent of TE insertions found fully in CNEs and spacers or spanning CNE/spacer boundaries for both ngs_te_mapper and TEMP. Overlap categories have “n.a.” for coverage and percent of the normally recombining noncoding genome covered since boundaries between compartments do not occupy any space. Regions of the reference genome identified by RepeatMasker as TE and any nonreference TE in these regions were excluded from all compartments. Regions of normal recombination were defined by Cridland et al. (2013).
Fig. 1.—TEs in normally recombining regions of the Drosophila melanogaster genome are depleted in exonic and intronic regions. Observed numbers of TEs in different genomic compartments are shown as vertical lines for ngs_te_mapper (red) and TEMP (blue). Empirical null distributions of the numbers of TEs in different genomic compartments in 10,000 random permutations are shown as density plots for ngs_te_mapper (red) and TEMP (blue). All permutation analyses were restricted to normally recombining regions of the D. melanogaster genome as defined by Cridland et al. (2013). Permutation analyses were conducted across all compartments (A–E), or in noncoding regions only (F and G). Observed and simulated numbers of TEs were counted in exonic regions (A), intronic regions (B and F), intergenic regions (C and G), intronic/exonic boundaries (D), and intergenic/exonic boundaries (E). Observed TEs overlapping intron/exon boundaries or intergenic/exon boundaries were excluded from permutation analyses in noncoding regions only (F and G). Regions of the reference genome identified by RepeatMasker as TE sequence and any nonreference TE in these regions were also excluded from all permutation analyses.
Fig. 2.—TEs in normally recombining regions of the Drosophila melanogaster genome are depleted in conserved noncoding elements. Observed numbers of TEs in different noncoding compartments are shown as vertical lines for ngs_te_mapper (red) and TEMP (blue). Empirical null distributions of the numbers of TEs in different noncoding compartments in 10,000 random permutations are shown as density plots for ngs_te_mapper (red) and TEMP (blue). All permutation analyses were restricted to normally recombining regions of the D. melanogaster genome as defined by Cridland et al. (2013). Permutation analyses were conducted across intronic regions only (A, C, and E) or intergenic regions only (B, D, and F). Observed and simulated numbers of TEs were counted in CNEs (A and B), CNE/spacer boundaries (C and D), or spacers (E and F). The TEMP data set has a higher number of observed and expected CNE/spacer overlaps (C and D) despite having fewer TE insertions overall, because its average target site duplication (TSD) length (7.71 bp) is larger than that of ngs_te_mapper (4.73 bp). Observed TEs overlapping intron/exon boundaries or intergenic/exon boundaries were excluded from these analyses. Regions of the reference genome identified by RepeatMasker as TE sequence and any nonreference TE in these regions were also excluded from all permutation analyses.
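The permutation logic behind these null distributions can be illustrated on a toy example. The sketch below is not the authors' pipeline. It assumes a single linear "genome" with hypothetical CNE intervals, a fixed integer TSD length, and uniform random placement of each insertion. All names (`build_mask`, `classify`, `permute_counts`, `empirical_p_lower`) and all numbers in the usage section are illustrative.

```python
import random

def build_mask(genome_len, cne_intervals):
    """Per-base mask: True where a base lies in a CNE (half-open intervals)."""
    mask = [False] * genome_len
    for start, end in cne_intervals:
        for i in range(start, end):
            mask[i] = True
    return mask

def classify(start, tsd_len, mask):
    """Label an insertion by how its TSD [start, start + tsd_len) meets CNEs."""
    covered = sum(mask[start:start + tsd_len])
    if covered == tsd_len:
        return "CNE"        # fully inside a CNE
    if covered == 0:
        return "spacer"     # fully outside any CNE
    return "boundary"       # spans a CNE/spacer boundary

def permute_counts(n_te, tsd_len, mask, n_perm, rng):
    """Null distribution of CNE counts from n_perm rounds of random placement."""
    max_start = len(mask) - tsd_len
    null = []
    for _ in range(n_perm):
        hits = sum(
            classify(rng.randint(0, max_start), tsd_len, mask) == "CNE"
            for _ in range(n_te)
        )
        null.append(hits)
    return null

def empirical_p_lower(observed, null):
    """One-sided empirical p-value for depletion, with add-one correction."""
    return (1 + sum(x <= observed for x in null)) / (1 + len(null))

# Toy usage with hypothetical values: 10 kb genome, 35% of bases in CNEs.
rng = random.Random(0)
mask = build_mask(10_000, [(1_000, 2_500), (6_000, 8_000)])
null = permute_counts(n_te=200, tsd_len=5, mask=mask, n_perm=500, rng=rng)
p = empirical_p_lower(observed=40, null=null)
print(f"mean null CNE count = {sum(null) / len(null):.1f}, p = {p:.4f}")
```

Because a longer TSD is more likely to straddle an interval edge, raising `tsd_len` in this sketch increases the share of placements labelled "boundary". That matches the caption's explanation for the larger number of CNE/spacer overlaps in the TEMP data set.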
Fig. 3.—The derived allele frequency (DAF) spectrum for TE insertions is similar across different compartments of the Drosophila melanogaster genome. DAF spectra are shown for TE insertions predicted by ngs_te_mapper (A) or TEMP (B). Allele frequency classes are shown on the X axis, and the proportion of TE insertions observed in a particular compartment of the genome at that allele frequency is shown on the Y axis. Note that the Y axis is split to allow better visualization of the proportion of higher allele frequency classes.
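A DAF spectrum of this kind can be tabulated from per-insertion presence counts. The sketch below is a minimal illustration under stated assumptions: each insertion is scored as present or absent in every sampled strain, presence is treated as the derived state, and the sample size of 8 strains and the example counts are hypothetical, not taken from the study.

```python
from collections import Counter

def daf_spectrum(derived_counts, n_strains):
    """Proportion of insertions in each derived-allele count class 1..n_strains.

    derived_counts: for each TE insertion, the number of sampled strains
    carrying it (assumed to be the derived state).
    """
    if not derived_counts:
        raise ValueError("need at least one insertion")
    tally = Counter(derived_counts)
    total = len(derived_counts)
    return [tally.get(k, 0) / total for k in range(1, n_strains + 1)]

# Hypothetical presence counts per insertion, grouped by compartment.
calls = {
    "CNE": [1, 1, 1, 2, 1, 3, 1, 1],
    "spacer": [1, 1, 2, 1, 1, 1, 4, 1, 2, 1],
}
spectra = {compartment: daf_spectrum(counts, 8) for compartment, counts in calls.items()}
for compartment, spectrum in spectra.items():
    print(compartment, [round(x, 2) for x in spectrum])
```

Comparing the resulting proportions class by class across compartments is the kind of comparison the figure presents. A formal test of whether two spectra differ would be a separate step and is not shown here.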