Robert Horvath, Emily B Josephs, Edouard Pesquet, John R Stinchcombe, Stephen I Wright, Douglas Scofield, Tanja Slotte.
Abstract
Accurate estimates of genome-wide rates and fitness effects of new mutations are essential for an improved understanding of molecular evolutionary processes. Although eukaryotic genomes generally contain a large noncoding fraction, functional noncoding regions and fitness effects of mutations in such regions are still incompletely characterized. A promising approach to characterize functional noncoding regions relies on identifying accessible chromatin regions (ACRs) tightly associated with regulatory DNA. Here, we applied this approach to identify and estimate selection on ACRs in Capsella grandiflora, a crucifer species ideal for population genomic quantification of selection due to its favorable population demography. We describe a population-wide ACR distribution based on ATAC-seq data for leaf samples of 16 individuals from a natural population. We use population genomic methods to estimate fitness effects and proportions of positively selected fixations (α) in ACRs and find that intergenic ACRs harbor a considerable fraction of weakly deleterious new mutations, as well as a significantly higher proportion of strongly deleterious mutations than comparable inaccessible intergenic regions. ACRs are enriched for expression quantitative trait loci (eQTL) and depleted of transposable element insertions, as expected if intergenic ACRs are under selection because they harbor regulatory regions. By integrating empirical identification of intergenic ACRs with analyses of eQTL and population genomic analyses of selection, we demonstrate that intergenic regulatory regions are an important source of nearly neutral mutations. These results improve our understanding of selection on noncoding regions and the role of nearly neutral mutations for evolutionary processes in outcrossing Brassicaceae species.
Keywords: ATAC-sequencing; distribution of fitness effects; functional noncoding sequences; gene expression variation; natural selection; open chromatin region
Year: 2021 PMID: 34498072 PMCID: PMC8662636 DOI: 10.1093/molbev/msab270
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. Genetic profile plots. (A) Proportion of ACRs in and around genes (TSS: transcription start site; TES: transcription end site). (B) Proportion of ACRs in and around conserved noncoding sequences (CNSs). (C) Proportion of SNPs in and around intergenic ACRs. (D) Proportion of TE insertions in and around ACRs. All plots include only mappable regions of the genome (see Materials and Methods), and the inferred boundaries of each CNS and ACR are labeled upstream boundary (UB) and downstream boundary (DB).
Fig. 2. Estimated distribution of fitness effects (DFE) of new mutations in intergenic ACRs, intergenic control regions, and 0-fold degenerate sites using a one-epoch demographic model. The estimated DFE was binned based on the mean selective effect (−N_e s) into three bins: effectively neutral (0 < −N_e s < 1), intermediate fitness effect (1 < −N_e s ≤ 10), and strongly deleterious (−N_e s > 10) mutations. (A) DFE of intergenic ACRs and 0-fold degenerate sites. (B) DFE of distal ACRs (dACRs), distal control regions (d-control), proximal ACRs (pACRs), and proximal control regions (p-control). (C) DFE of intergenic ACRs split up into unique (u), common (c), and high-frequency (h) ACRs. Error bars show the 95% CI of each estimate based on 100 bootstrap replicates. Significant differences between estimates are shown by asterisks (Kruskal–Wallis test with a Dunn post hoc test, P-value < 0.05).
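The binning described in the Fig. 2 caption can be illustrated with a minimal sketch. It assumes, purely for illustration, that the scaled selective effect (population size times selection coefficient, sign flipped so deleterious effects are positive) follows a gamma distribution; the source does not state the distribution used. The function name `bin_dfe` and the parameter values are hypothetical, and the proportions are approximated by Monte Carlo sampling rather than by the authors' method.

```python
import random

def bin_dfe(shape, mean_effect, n_draws=100_000, seed=1):
    """Approximate the share of new mutations in each fitness-effect bin.

    Assumption (not from the source): scaled effects are gamma distributed
    with the given shape and mean. Bins follow the Fig. 2 caption.
    """
    rng = random.Random(seed)
    scale = mean_effect / shape  # gamma mean = shape * scale
    counts = {"effectively_neutral": 0, "intermediate": 0, "strongly_deleterious": 0}
    for _ in range(n_draws):
        effect = rng.gammavariate(shape, scale)
        if effect < 1:
            counts["effectively_neutral"] += 1
        elif effect <= 10:
            # A draw of exactly 1 lands here; the caption leaves that boundary open.
            counts["intermediate"] += 1
        else:
            counts["strongly_deleterious"] += 1
    return {name: count / n_draws for name, count in counts.items()}

# Hypothetical parameter values, chosen only to exercise the function.
proportions = bin_dfe(shape=0.3, mean_effect=50.0)
for name, share in proportions.items():
    print(f"{name}: {share:.3f}")
```

Repeating the calculation on bootstrap resamples of the data would give the kind of confidence intervals shown as error bars, although the resampling scheme itself is not described in this record.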
Fig. 3. Comparison between the observed and expected number of cis- and trans-eQTL located in ACRs and their 500 bp surroundings in C. grandiflora. (A) The expected number of cis-eQTL found within and in the proximity of ACRs located in and around genes (5 kb up- and downstream). (B) The expected number of trans-eQTL found within and in the proximity of ACRs at least 5 kb away from genes. The expected number of cis- and trans-eQTL within and in the proximity of ACRs (gray) were based on 1,000 permutations. The observed number of cis- and trans-eQTL located within and in the proximity of ACRs is indicated by the red lines. The blue dashed lines delimit the 95% confidence interval of the permutation test.
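The logic of the permutation comparison in the Fig. 3 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the record does not say how eQTL positions were permuted, so the sketch simply redraws positions uniformly along a single toy sequence. All coordinates, the sequence length, and the function names are hypothetical; only the 500 bp padding and the 1,000 permutations come from the caption.

```python
import random

def count_in_intervals(positions, intervals):
    """Count positions that fall inside any half-open interval [start, end)."""
    return sum(any(start <= p < end for start, end in intervals) for p in positions)

def permutation_null(n_sites, intervals, seq_length, n_perm=1000, seed=1):
    """Null distribution of overlap counts from uniformly redrawn positions.

    Assumption (not from the source): sites are placed uniformly at random.
    Returns the sorted null counts and the central 95% interval.
    """
    rng = random.Random(seed)
    null = []
    for _ in range(n_perm):
        redrawn = [rng.randrange(seq_length) for _ in range(n_sites)]
        null.append(count_in_intervals(redrawn, intervals))
    null.sort()
    lower = null[int(0.025 * n_perm)]
    upper = null[int(0.975 * n_perm) - 1]
    return null, (lower, upper)

# Toy data: three made-up ACRs, each padded by 500 bp as in the caption.
acrs = [(1000, 2500), (10_000, 12_000), (40_000, 41_500)]
padded = [(max(0, start - 500), end + 500) for start, end in acrs]
eqtl_positions = [500, 1200, 2900, 10_500, 11_900, 40_100, 70_000, 85_000]

observed = count_in_intervals(eqtl_positions, padded)
null, (lower, upper) = permutation_null(len(eqtl_positions), padded, seq_length=100_000)
print(f"observed={observed}, permutation 95% interval=({lower}, {upper})")
```

An observed count above the upper bound of the interval would correspond to the enrichment pattern the figure reports, with the red line falling to the right of the blue dashed lines.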