
A high-resolution map of human evolutionary constraint using 29 mammals.

Kerstin Lindblad-Toh¹, Manuel Garber, Or Zuk, Michael F Lin, Brian J Parker, Stefan Washietl, Pouya Kheradpour, Jason Ernst, Gregory Jordan, Evan Mauceli, Lucas D Ward, Craig B Lowe, Alisha K Holloway, Michele Clamp, Sante Gnerre, Jessica Alföldi, Kathryn Beal, Jean Chang, Hiram Clawson, James Cuff, Federica Di Palma, Stephen Fitzgerald, Paul Flicek, Mitchell Guttman, Melissa J Hubisz, David B Jaffe, Irwin Jungreis, W James Kent, Dennis Kostka, Marcia Lara, Andre L Martins, Tim Massingham, Ida Moltke, Brian J Raney, Matthew D Rasmussen, Jim Robinson, Alexander Stark, Albert J Vilella, Jiayu Wen, Xiaohui Xie, Michael C Zody, Jen Baldwin, Toby Bloom, Chee Whye Chin, Dave Heiman, Robert Nicol, Chad Nusbaum, Sarah Young, Jane Wilkinson, Kim C Worley, Christie L Kovar, Donna M Muzny, Richard A Gibbs, Andrew Cree, Huyen H Dihn, Gerald Fowler, Shalili Jhangiani, Vandita Joshi, Sandra Lee, Lora R Lewis, Lynne V Nazareth, Geoffrey Okwuonu, Jireh Santibanez, Wesley C Warren, Elaine R Mardis, George M Weinstock, Richard K Wilson, Kim Delehaunty, David Dooling, Catrina Fronik, Lucinda Fulton, Bob Fulton, Tina Graves, Patrick Minx, Erica Sodergren, Ewan Birney, Elliott H Margulies, Javier Herrero, Eric D Green, David Haussler, Adam Siepel, Nick Goldman, Katherine S Pollard, Jakob S Pedersen, Eric S Lander, Manolis Kellis.

Abstract

The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.

Substances: RNA

Year: 2011 PMID: 21993624 PMCID: PMC3207357 DOI: 10.1038/nature10530

Source DB: PubMed Journal: Nature ISSN: 0028-0836 Impact factor: 49.962

Introduction

A key goal in understanding the human genome is to discover and interpret all functional elements encoded within its sequence. While only ~1.5% of the human genome encodes protein sequence[1], comparative analysis with the mouse[2], rat[3] and dog[4] genomes showed that at least 5% is under purifying selection and thus likely functional, of which ~3.5% consists of non-coding elements with likely regulatory roles. Detecting and interpreting these elements is particularly relevant to medicine, as loci identified in genome-wide association studies (GWAS) frequently lie in non-coding sequence[5]. Whereas initial comparative mammalian studies could estimate the overall proportion of the genome under evolutionary constraint, they had little power to detect most of the constrained elements – especially the smaller ones. Thus, they focused only on the top 5% of constrained sequence, corresponding to less than ~0.2% of the genome[4,6]. In 2005, we began an effort to generate sequence from a large collection of mammalian genomes with the specific goal of identifying and interpreting functional elements in the human genome based on their evolutionary signatures[7-8]. Here, we report our results to systematically characterize mammalian constraint, using 29 eutherian (placental) genomes. We identify 4.2% of the genome as constrained and ascribe potential function to ~60% of these bases using diverse lines of evidence for protein-coding, RNA, regulatory and chromatin roles, and we present evidence of exaptation and accelerated evolution. All datasets described here are publicly available in a comprehensive set at Broad Institute and UCSC (see below for links).

Sequencing, assembly and alignment

We generated genome sequence assemblies for 29 mammalian species selected to achieve maximum divergence across the four major mammalian clades (Figure 1a, Text S1 and Table S1). For nine species, we used genome assemblies based on ~7-fold coverage shotgun sequence, and for 20 species we generated ~2-fold coverage (2X), to maximize the number of species sequenced with available resources on capillary machines. Twenty genomes are first reported here, and nine were previously described (see supplement).

Figure 1

Phylogeny and constrained elements from the 29 eutherian mammalian genome sequences

a, A phylogenetic tree of all 29 mammals used in this analysis based on the substitution rates in the MultiZ alignments. Organisms with finished genome sequences are indicated in blue, high quality drafts in green and 2X assemblies in black. Substitutions per 100 bp are given for each branch, and branches with ≥ 10 substitutions are colored red, while blue indicates < 10 substitutions. b, At 10% FDR, 3.6 million constrained elements can be detected encompassing 4.2% of the genome, including a substantial fraction of newly detected bases (blue) compared to the union of the HMRD 50-bp + Siepel vertebrate elements[17] (see Figure S4b for comparison to HMRD elements only). The largest fraction of constraint can be seen in coding exons, introns and intergenic regions. For unique counts, the analysis was performed hierarchically: coding exons, 5′-UTRs, 3′-UTRs, promoters, pseudogenes, non-coding RNAs, introns, intergenic. The constrained bases are particularly enriched in coding transcripts and their promoters (Supp Fig S4c).

The power to detect constrained elements depends largely on the total branch length of the phylogenetic tree connecting the species[9]. The 29 mammals correspond to a total effective branch length of ~4.5 substitutions per site, compared to ~0.68 for human-mouse-rat-dog (HMRD), and thus should offer greater power to detect evolutionary constraint: the probability that a genomic sequence not under purifying selection will remain fixed across all 29 species is P₁<0.02 for single bases and P₁₂<10⁻²⁵ for 12-mers, compared to P₁~0.50 and P₁₂~10⁻³ for HMRD. For 2X mammals, our assisted assembly approach[10] resulted in a typical contig size N50C of 2.8 kb and a typical scaffold size N50S of 51.8 kb (Text S2 and Table S1) and high sequence accuracy (96% of bases had a <1% error rate = Q20)[11]. Compared to high-quality sequence across the 30 Mb of the ENCODE pilot project[12], we estimated average error rates of 1-3 miscalled bases per kilobase[11], which is ~50-fold lower than the typical nucleotide sequence difference between the species, enabling high-confidence detection of evolutionary constraint (Text S3). We based our analysis on whole-genome alignments by MultiZ (Text S4). The average number of aligned species was 20.9 at protein-coding positions in the human genome and 23.9 at the top 5% HMRD-conserved non-coding positions, with an average branch length of 4.3 substitutions per base in these regions (Figure S1, S2). By contrast, whole-genome average alignment depth is only 17.1 species with 2.9 substitutions per site, likely due to large deletions in non-functional regions[4]. The depth at ancestral repeats is 11.4 (Figure S1a), consistent with repeats being largely non-functional[2,4].
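The link between branch length and detection power can be illustrated with a back-of-envelope calculation. The sketch below assumes a simple Poisson model in which a neutral site accumulates substitutions at a rate equal to the total branch length, so the chance of seeing no substitution is exp(-T); this model and the function name `prob_unchanged` are illustrative assumptions, not the method used in the paper. Under this assumption the single-base values come out near 0.011 and 0.51, in line with the quoted bounds, while the 12-mer values only match the quoted figures to within a few orders of magnitude, as expected from so crude an approximation.

```python
import math


def prob_unchanged(branch_length: float, n_bases: int = 1) -> float:
    """Chance that n_bases independent neutral sites show zero substitutions.

    Assumes a Poisson substitution process with mean `branch_length`
    substitutions per site (an illustrative simplification).
    """
    return math.exp(-branch_length * n_bases)


# Branch lengths quoted in the text (substitutions per site).
for label, total_branch_length in [("29 mammals", 4.5), ("HMRD", 0.68)]:
    p1 = prob_unchanged(total_branch_length)
    p12 = prob_unchanged(total_branch_length, n_bases=12)
    print(f"{label}: P1 ~ {p1:.3f}, log10(P12) ~ {math.log10(p12):.1f}")
```

The point of the comparison is qualitative: adding branch length shrinks the chance of accidental conservation exponentially, which is why short elements become detectable.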

Detection of constrained sequence

Our analysis did not substantially change the estimate of the proportion of genome under selection. By comparing genome-wide conservation to that of ancestral repeats, we estimated the overall fraction of the genome under evolutionary constraint to be 5.36% at 50-bp windows (5.44% at 12-bp windows), using the SiPhy-ω statistic[13], a measure of overall substitution rate (Figure S3), consistent with previous similar estimates[2,4,14]. However, alternative methods[15-16] and different ways of correcting for the varying alignment depths give higher estimates (see Text S5 for details). The additional species had a dramatic effect on our ability to identify the specific elements under constraint. With 29 mammals, we identify 3.6 million elements spanning 4.2% of the genome, at a finer resolution of 12 bp (Figure 1b, Text S6, Figure S4, Table S2, S3), compared to <0.1% of the genome for HMRD 12-bp elements and 2.0% for HMRD 50-bp elements[4]. Elements previously detected using five vertebrates[17] also cover a larger fraction of the genome (~4.1%), but only cover 45% of the mammalian elements detected here, suggesting a large fraction of our elements are mammalian-specific. The mean element size (36 bp) is considerably shorter than both previously-detected HMRD elements (123 bp) and five-vertebrate elements (104 bp)[17]. For example, it is now possible to detect individual binding sites for the neuron-restrictive silencer factor (NRSF) in the promoter of the NPAS4 gene, which are beyond detection power in previous datasets (Figure 2, Figure S5). We found a similar regional distribution of 12-bp elements (including the 2.6 million newly-detected constrained elements) to previously-detected HMRD elements (r = 0.94, Figure S6). Similar results were obtained with the PhastCons[17] statistic (see Text S6).
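As a quick consistency check on the figures above, the element count and mean element size can be multiplied out and compared with the stated genome fraction. The sketch assumes a human genome size of about 3.1 Gb, which is not given in the text; `GENOME_BP` and `implied_fraction` are illustrative names.

```python
# Assumed approximate human genome size in base pairs (not stated in the text).
GENOME_BP = 3.1e9


def implied_fraction(n_elements: float, mean_len_bp: float,
                     genome_bp: float = GENOME_BP) -> float:
    """Fraction of the genome covered by n_elements of the given mean length."""
    return n_elements * mean_len_bp / genome_bp


# 3.6 million elements with a mean size of 36 bp, as reported above.
fraction = implied_fraction(3.6e6, 36)
print(f"Implied genome fraction: {fraction:.1%}")
```

Under that assumed genome size, the product lands close to the reported 4.2%, so the three quoted numbers are mutually consistent.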

Figure 2

Identification of four NRSF-binding sites in NPAS4

a, The neurological gene NPAS4 has many constrained elements overlapping introns and the upstream intergenic region. The gray shaded box contained only one constrained element using HMRD, while analysis of 29 mammalian sequences reveals four smaller elements. b, These four constrained elements in the first intron correspond to binding sites for the NRSF transcription factor, known to regulate neuronal lineages.

Using a new method, SiPhy-π, sensitive not just to the substitution rate but also to biases in the substitution pattern (e.g. Figure S7), we detected an additional 1.3% of the human genome in constrained elements (see Table S2, S3). Most of the newly-detected constrained nucleotides extend elements found by rate-based methods, but 22% consist of new elements (average length 17 bp), and are enriched in noncoding regions.

Constraint within the human population

We observed that the evolutionary constraint acting on the 29 mammals is correlated with constraint within the human population, as assessed from human polymorphism data (Text S7) and consistent with previous studies[18]. Mammalian constrained elements show a depletion in single-nucleotide polymorphisms (SNPs)[19], and more constrained elements show even greater depletion. For example, in the top 1% most-strongly-conserved non-coding regions, SNPs occur at a 1.9-fold lower rate than the genome average, and the derived alleles have a lower frequency, consistent with purifying selection at many of these sites in the human genome. Moreover, at positions with biased substitution patterns across mammals, the observed human SNPs show a similar bias to the one observed across mammals (Figure S7). Thus, not only are constrained regions less likely to exhibit polymorphism in humans, but when such polymorphisms are observed, the derived alleles in humans tend to match the alleles present in non-human mammals, indicating a preference for the same alleles across both mammalian and human evolution.

Functional annotation of constraint

We first studied the overlap of the 3.6 million evolutionarily constrained elements (ω<0.8 with P<10⁻¹⁵) with known gene annotations (Figure 1b). Roughly 30% of constrained elements were associated with protein-coding transcripts: ~25.3% overlap mature mRNAs (including 19.6% in coding exons, 1.2% in 5′-UTRs, and 4.4% in 3′-UTRs), and an additional 4.4% reside within 2 kb of transcriptional start sites (1.2% of which is within 200 bases). The majority of constrained elements however reside in intronic and intergenic regions (29.7% and 38.6%, respectively). To study their biological roles and provide potential starting points to understand these large and mostly uncharted territories, we next studied their overlap with evolutionary signatures[7-8,20-21] characteristic of specific types of features and a growing collection of public large-scale experimental data.
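The Figure 1 legend notes that unique counts were obtained hierarchically, so an element overlapping several annotation types is counted once, under the first type in a fixed order. A minimal sketch of that kind of priority assignment follows; the category labels mirror the order given in the legend, while the function name, the input format and the toy elements are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable

# Priority order taken from the Figure 1 legend; earlier entries win.
PRIORITY = [
    "coding exon", "5'UTR", "3'UTR", "promoter",
    "pseudogene", "non-coding RNA", "intron", "intergenic",
]


def assign_category(overlaps: Iterable[str]) -> str:
    """Return the highest-priority annotation among those an element overlaps."""
    overlap_set = set(overlaps)
    for category in PRIORITY:
        if category in overlap_set:
            return category
    # Elements with no recorded overlap are treated as intergenic (assumption).
    return "intergenic"


# Toy elements (hypothetical), each listed with every annotation it overlaps.
toy_elements = [
    {"intron", "3'UTR"},
    {"coding exon", "intron"},
    {"intron"},
    set(),
]
counts = Counter(assign_category(e) for e in toy_elements)
print(dict(counts))
```

Because each element receives exactly one label, the per-category percentages can be summed without double counting.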

Protein-coding genes and exons

Despite intense efforts to annotate protein-coding genes over the past decade[20,22-24], we detected 3,788 candidate new exons (a 2% increase) using evolutionary signatures characteristic of protein-coding exons[25]. Of these, 54% reside outside protein-coding genes, 19% within introns, and 13% in UTRs of known coding genes (Text S8, Table S4, S5). Our methods recovered 92% of known coding exons that were >10 codons and that fall in syntenic regions, the remainder showing non-consensus splice sites, unusual features, or poor conservation. The majority of new exon candidates (>58%) are supported by evidence of transcription measured in 16 human tissues[26] (Figure S8a) or similarity to known Pfam protein domains. 31% of intronic and 13% of intergenic predictions extend known transcripts, and 5% and 11% respectively reside in new transcript models. The newly detected exons are more tissue-specific than known exons (mean of 3 tissues, vs. 12) and are expressed at 5-fold lower levels. Directed experiments and manual curation will be required to complete the annotation of the few hundred protein-coding genes that likely remain unannotated[27]. We found apparent stop codon readthrough[28] of four genes based on continued protein-coding constraint after an initial conserved stop codon[29] and until a subsequent stop codon (Text S9, Figure S8b). Readthrough in SACM1L could be triggered by an 80-base conserved RNA stem loop predicted by RNAz[30], lying four bases downstream of the readthrough stop codon. We also detected coding regions with a very low synonymous substitution rate, indicating additional sequence constraints beyond the amino acid level (Text S9). We found >10,000 such synonymous constraint elements (SCEs) in more than one-quarter of all human genes[31]. Initial analysis suggests potential roles in splicing regulation (34% span an exon-exon junction), A-to-I editing, microRNA (miRNA) targeting, and developmental regulation. Hox genes contain several top candidates (Figure 3a), including two previously-validated developmental enhancers[32-33].

Figure 3

Examination of evolutionary signatures identifies synonymous constrained elements (SCEs) and evidence of positive selection

a, Two regions within the HOXA2 open reading frame are identified as Synonymous Constraint Elements (red), corresponding to overlapping functional elements within coding regions. Note that the synonymous rate reductions are not obvious from the base-wise conservation measure (in blue). Both elements have been characterized as enhancers driving Hoxa2 expression in distinct segments of the developing mouse hindbrain. The element in the first exon encodes Hox-Pbx binding sites and drives expression in rhombomere 4[33], while the element in the second exon contains Sox binding sites and drives expression in rhombomere 2[32]. Synonymous constraint elements are also found in most other Hox genes, and up to a quarter of all genes. b, While ~85% of genes show only negative (purifying) selection and 9% of genes show uniform positive selection, the remaining 6% of genes, including ABI2, show only localized regions of positively-selected sites. Each vertical bar covers the estimated 95% confidence interval for dN/dS at that site (with values of 0 truncated to 0.01 to accommodate the log scaling), and bars are colored according to a signed version of the SLR statistic for non-neutral evolution: blue for sites under purifying selection, gray for neutral sites, and red for sites under positive selection.

RNA structures and families of structural elements

We next used evolutionary signatures characteristic of conserved RNA secondary structures[34] to reveal 37,381 candidate structural elements (Text S10, Figure S9a), covering ~1% of constrained regions. For example, the XIST lincRNA, known to bind chromatin and enable X-inactivation[35], contains a newly-predicted structure in its 5′ end (Figure S9b,c), distinct from other known structures[36], that seems to be the source of chromatin-associated short RNAs[37]. Sequence- and structure-based clustering of predictions outside protein-coding exons revealed 1,192 novel families of structural RNAs (Text S10). We focused on a high-scoring subset consisting of 220 families with 725 instances, which also showed the highest thermodynamic stability[30] (Figure S9a, S10), DNase hypersensitivity, expression pattern correlation across tissues and intergenic expression enrichment (Figure S9a). We also expanded both known and novel families by including additional members detected by homology to existing members. Noteworthy examples include: a glycyl-tRNA family, including a new member in POP1, involved in tRNA maturation, and likely involved in feedback regulation of POP1; three intronic families of long hairpins in ion-channel genes known to undergo A-to-I RNA editing and possibly involved in regulation of the editing event; an additional member of a family of 5′UTR hairpins overlapping the start codon of collagen genes; and potential new miRNA genes that extend existing families[37]. Two of the largest novel families consist of short AU-rich hairpins of 6-7 bp that share the same strong consensus motif in their stem. These occur in the 3′UTRs of genes in several inflammatory response pathways, whose post-transcriptional regulation often involves structural AU-rich elements (AREs). Indeed, two homologous hairpins in TNF and CSF3 correspond to known mRNA-destabilization elements, suggesting roles in mRNA stability for the two families[37]. Lastly, a family of six conserved hairpin structures (Figure S9d) was found in the 3′UTR of the MAT2A gene[37], which is involved in the synthesis of S-adenosyl-methionine (SAM), the primary methyl donor in human cells. All six hairpins consist of a 12-18 bp stem and a 14-bp loop region with a deeply-conserved sequence motif (Figure S9), and may be involved in sensing SAM concentrations, which are known to affect MAT2A mRNA stability[38].

Conservation patterns in promoters

As different types of conservation in promoters may imply distinct biological functions[39], we classified the patterns of conservation within core promoters into three categories: those with uniformly ‘high’ constraint (7,635 genes, 13,996 transcripts), uniformly ‘low’ constraint (2,879 genes, 4,135 transcripts), and ‘intermittent’ constraint, consisting of alternating peaks and troughs of conservation (14,271 genes and 29,814 transcripts) (Figure S11a). ‘High’ and ‘intermittent’ constraint promoters are both associated with CpG islands (~66%), while ‘low’ constraint promoters have significantly lower overlap (~41%), and all three classes show similar overlap with functional TATA boxes (2-3%, see Text S11). These groups show distinct Gene Ontology enrichments (Figure S11b), with high-constraint promoters involved in development (Pbonf<10⁻³⁰), intermittent-constraint in basic cellular functions (Pbonf<5×10⁻⁴), and low-constraint promoters in immunity, reproduction and perception, functions expected to be under positive selection and lineage-specific adaptation[2]. High constraint may reflect cooperative binding of many densely-binding factors, as previously suggested for developmental genes[6]. Intermittent constraint promoters, whose peak-spacing distribution was suggestive of the periodicity of the DNA helix turns, may reflect loosely-interacting factors (Figure S11c,d). Low constraint may reflect rapid motif turnover, under neutral drift or positive selection.

Identifying specific instances of regulatory motifs

Data from just four species (HMRD) was sufficient to create a catalog of known and novel motifs with many conserved instances across the genome[21]. The power to discover such motifs was high, because one can aggregate data across hundreds of motif instances. Not surprisingly, the additional genomes therefore had little effect on the ability to discover new motifs (known motifs showed 99% correlation in genome-wide motif conservation scores, Figure S12 and S13). In contrast, the 29 mammalian genomes dramatically improved our ability to detect individual motif instances, making it possible to predict specific target sites for 688 regulatory motifs corresponding to 345 transcription factors (Figure S14). We chose to identify motif instances at a false discovery rate (FDR) of 60%, representing a reasonable compromise between specificity and sensitivity given the available discovery power (Text S12), and matching the experimental specificity of Chromatin Immunoprecipitation (ChIP) experiments for identifying biologically-significant targets[40]. Higher levels of stringency could be obtained by sequencing additional species. We identified 2.7 million conserved instances (Table S6), enabling the construction of a regulatory network linking 375 motifs to predicted targets, with a median of 21 predicted regulators per target gene (25th percentile: 10; 75th: 39). The numbers of target sites (average: 4,277; 25th percentile: 1,407; 75th: 10,782) are comparable to those found in ChIP experiments, and the sites have the advantage that they are detected at nucleotide resolution, enabling us to use them to interpret disease-associated variants for potential regulatory functions. However, some motifs never reached high confidence values, and others did so at very few instances. The motif-based targets show strong agreement with experimentally-defined binding sites from ChIP experiments (Table S7). For long and distinct motifs, such as CTCF and NRSF, the fraction of instances overlapping experimentally observed binding matches the fraction predicted by the confidence score (e.g. at 80% confidence 70% of NRSF motif instances overlapped bound sites, and at ~50% confidence 40% overlapped), despite potential confounding aspects such as condition-specific binding, overlapping motifs between factors, or non-specific binding. Moreover, increasing confidence levels showed increasing overlap with experimental binding (Figure S14-16). For example, YY1 enrichment for bound sites increased from 42-fold to 168-fold by focusing on conserved instances. Lastly, combining motif conservation and experimental binding led to increased enrichment for candidate tissue-specific enhancers, suggesting the two provide complementary information. Within bound regions, the evolutionary signal reveals specific motif instances with high precision (e.g., Figure 2, Figure 4, Figure S17).

Figure 4

Utilizing constraint to identify candidate mutations

Conservation can help us identify, among multiple SNPs, the ones that disrupt conserved functional elements and are likely to have regulatory roles. In this example, a SNP (rs6504340) associated with tooth development is perfectly linked to a conserved intergenic SNP, rs8073963, 7.1 kb away, which disrupts a deeply conserved Forkhead-family motif in a strong enhancer. While the SNPs shown here stem from GWAS or HapMap data, the same principle should also be applicable to associated variants detected by resequencing the region of interest.

Chromatin signatures

To suggest potential functions for the ~68% of ‘unexplained’ constrained elements outside coding regions, UTRs, or proximal promoters, we used chromatin state maps from CD4 T-cells[41] (Figure S18) and nine diverse cell types[42] (Text S13, Figure S19). In T-cells, constrained elements were most enriched for promoter-associated states (up to 5-fold), an insulator state and a specific repressed state (2.2-fold), and numerous enhancer states (1.5-2-fold), together covering 7.1% of the unexplained elements at 2.1-fold enrichment. In the nine cell types, enriched promoter, enhancer and insulator states cover 36% of unexplained elements at ~1.75-fold enrichment, with locations active in multiple cell types showing even stronger enrichment (Figure S20). Overall, chromatin states suggest possible functions (at 1.74-fold enrichment) for 37.5% (N=987,985) of unexplained conserved elements (27% of all conserved elements), suggesting meaningful association for at least 16% of unexplained constrained bases. While current experiments only provide nucleosome-scale (~200-bp) resolution, we expect higher-resolution experimental assays that more precisely pinpoint regulatory regions to show further increases in enrichment. The increase observed with additional cell types suggests that new cell types will help elucidate additional elements. Of course, further experimental tests will be required to validate the predicted functional roles.
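The enrichment figures in this section can be read as ratios of an observed overlap to a background expectation. The sketch below assumes that definition (observed fraction divided by expected fraction), which the text does not spell out; `fold_enrichment` is an illustrative helper rather than the authors' procedure. It also checks that 987,985 elements correspond to roughly 27% of the 3.6 million constrained elements.

```python
def fold_enrichment(observed_fraction: float, expected_fraction: float) -> float:
    """Ratio of observed to expected overlap (assumed definition of enrichment)."""
    if expected_fraction <= 0:
        raise ValueError("expected_fraction must be positive")
    return observed_fraction / expected_fraction


# Share of all constrained elements with a chromatin-state suggestion.
share_of_all = 987_985 / 3.6e6
print(f"Share of all constrained elements: {share_of_all:.1%}")

# If 36% of unexplained elements fall in enriched states at ~1.75-fold,
# the background expectation implied under this definition is:
implied_background = 0.36 / 1.75
print(f"Implied background overlap: {implied_background:.1%}")
print(f"Round trip: {fold_enrichment(0.36, implied_background):.2f}-fold")
```

The implied background is derived only from the quoted numbers under the stated assumption; the paper's supplementary text would be the authoritative source for how the expectation was actually computed.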

Accounting for constrained elements

Overall, ~30% of constrained elements overlap protein-coding genes, ~27% specific enriched chromatin states, ~1.5% novel RNA structures, and ~3% conserved regulatory motif instances (Text S14). Together, ~60% of constrained elements overlap one of these features, with enrichments ranging from 1.75-fold for chromatin states (compared to unannotated regions) up to 17-fold for protein-coding exons (compared to the whole genome).

Implications for interpreting disease-associated variants

In the non-protein-coding genome, SNPs associated with human diseases in genome-wide association studies are 1.37-fold enriched for constrained regions, relative to HapMap SNPs (Text S15, Table S8). This is striking, since only a small proportion of the associated SNPs are likely to be causative while the rest are merely in linkage disequilibrium (LD) with causative variants. Accordingly, constrained elements should be valuable in focusing the search for causative variants amongst multiple variants in LD. For example, in an intergenic region between HOXB1 and HOXB2 associated with tooth development phenotypes[43], the reported SNP (rs6504340) is not conserved, but a linked SNP (rs8073963) sits in a constrained element 7.1 kb away. Moreover, rs8073963 disrupts a deeply-conserved Foxo2 motif instance within a predicted enhancer (Figure 4), making it a candidate mutation for further follow-up. Similar examples of candidate causal variants are found for diverse phenotypes such as height or multiple sclerosis, and similar analyses could be applied to case-control resequencing data.
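The prioritization idea described here amounts to an interval lookup: among SNPs in LD with a reported association, keep those that fall inside a constrained element. A minimal sketch follows. The coordinates are hypothetical and chosen only so that the two SNPs lie 7.1 kb apart as in the text; the function name and the half-open interval convention are likewise assumptions.

```python
from bisect import bisect_right
from typing import List, Tuple


def in_constrained(pos: int, elements: List[Tuple[int, int]]) -> bool:
    """True if pos lies in one of the sorted, non-overlapping [start, end) elements."""
    starts = [start for start, _ in elements]
    i = bisect_right(starts, pos) - 1  # last element starting at or before pos
    return i >= 0 and elements[i][0] <= pos < elements[i][1]


# Hypothetical constrained elements and SNP positions on one chromosome.
elements = [(200, 236), (8_095, 8_107)]
snps_in_ld = {"rs6504340": 1_000, "rs8073963": 8_100}

candidates = [rsid for rsid, pos in snps_in_ld.items()
              if in_constrained(pos, elements)]
print(candidates)
```

In practice the element list would come from the published constrained-element tracks and the SNP set from an LD calculation; both are replaced by toy values here.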

Evolution of constrained elements

We next sought to identify signatures of positive selection that may accompany functional adaptations of different species to diverse environments and new ecosystems. We used the ratio dN/dS of non-synonymous to synonymous codon substitutions as evidence of positive selection (>1) or negative selection (<1). While dN/dS is typically calculated for whole genes, the additional mammals sequenced enabled analysis at the codon level – simulations predicted a 250-fold gain in sensitivity compared to HMRD, identifying 53% of positive sites at 5% FDR (Text S16). Applying this test to 6.05 million codons in 12,871 gene trees, we found evidence of strong purifying selection (dN/dS<0.5) for 84.2% of codons and positive selection (dN/dS>1.5) for 2.4% of codons (with 94.1% of sites <1 and 5.9% >1; Table S9). At 5% FDR, we found 15,383 positively-selected sites in 4,431 proteins. The genes fall into three classes based on the distribution of selective constraint: 84.8% of genes show uniformly high purifying selection, 8.9% show distributed positive selection across their length, and 6.3% show localized positive selection concentrated in small clusters (Figure 3b, Figure S21, Table S10-11). Genes with distributed positive selection were enriched in such functional categories as immune response (pBonf<10⁻¹⁶) and taste perception (pBonf<10⁻¹⁰), which are known to evolve rapidly, but also in some unexpected functions such as meiotic chromosome segregation (pBonf<10⁻²³) and DNA-dependent regulation of transcription (pBonf<10⁻¹⁹, Table S12). Localized positive selection was enriched in core biochemical processes, including microtubule-based movement (pBonf<10⁻¹⁰), DNA topological change (pBonf<10⁻⁴) and telomere maintenance (pBonf<7×10⁻³), suggesting adaptation at important functional sites. Focusing on 451 unique Pfam protein-domain annotations, we found abundant purifying selection, with 225 domains showing purifying selection for >75% of their sites, and 447 domains showing negative selection for >50% of their sites (Table S13). Domains with substantial fractions of positively-selected sites include CRAL/TRIO involved in retinal binding (2.6%), proteinase-inhibitor-cystatin involved in bone remodeling (2.2%), and secretion-related Emp24/GOLD/p24 family (1.6%).
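The codon-level thresholds quoted in this section can be expressed as a simple classifier on a per-site ratio of non-synonymous to synonymous substitution rates. The sketch below applies the 0.5 and 1.5 cut-offs from the text to point estimates only; the actual analysis relied on a statistical test with FDR control, which is not reproduced here, and the label strings and toy values are illustrative.

```python
def classify_site(dn_ds: float) -> str:
    """Label a codon site using the point-estimate thresholds quoted in the text."""
    if dn_ds < 0.5:
        return "strong purifying"
    if dn_ds > 1.5:
        return "strong positive"
    return "intermediate"


# Toy per-site estimates (hypothetical values).
site_estimates = [0.05, 0.3, 0.9, 1.2, 2.4]
labels = [classify_site(x) for x in site_estimates]
print(labels)

# The three gene classes reported above should account for all genes.
gene_class_percent = {"uniform purifying": 84.8, "distributed positive": 8.9,
                      "localized positive": 6.3}
total = sum(gene_class_percent.values())
print(f"Gene classes sum to {total:.1f}%")
```

A point-estimate cut-off ignores uncertainty at each site, which is why the reported site counts are given at a stated FDR rather than by threshold alone.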

Exaptation of mobile elements

Mobile elements provide an elegant mechanism for distributing a common sequence across the genome, which can then be retained in locations where it confers advantageous regulatory functions to the host - a process termed exaptation. Our data revealed >280,000 mobile element exaptations common to mammalian genomes covering ~7 Mb (Text S17), dramatically expanding from ~10,000 previously-recognized cases[44]. Of the ~1.1 million constrained elements that arose during the 90 million years between the divergence from marsupials and the eutherian radiation, we can trace >19% to mobile element exaptations. Often only a small fraction (median ~11%) of each mobile element is constrained, in some cases matching known regulatory motifs. Recent exaptations are generally found near ancestral regulatory elements, except in gene deserts which are abundant in ancestral elements but show few recent exaptations (p<10⁻³⁰⁰, Figure S22).
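Two of the figures in this section can be unpacked with simple arithmetic. The sketch below uses only the rounded values quoted in the text, so the results are approximate lower bounds; the variable names are illustrative.

```python
# Values quoted in the text (rounded; the counts are stated as lower bounds).
n_exaptations = 280_000
constrained_bp = 7e6            # ~7 Mb covered by exapted constrained sequence
recent_elements = 1.1e6         # elements arising before the eutherian radiation
traced_share = 0.19             # >19% traced to mobile elements

mean_bp_per_exaptation = constrained_bp / n_exaptations
traced_elements = traced_share * recent_elements

print(f"Mean constrained sequence per exaptation: ~{mean_bp_per_exaptation:.0f} bp")
print(f"Elements traced to mobile elements: ~{traced_elements:,.0f}")
```

An average on the order of tens of base pairs per exaptation fits the observation that only a small part of each mobile element is under constraint.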

Accelerated evolution in the primate lineage

Lineage-specific rapid evolution in ancestrally-constrained elements previously revealed human positive selection associated with brain and limb development[45]. Applying this signature to the human and primate lineages, we identified 563 human-accelerated regions (HARs) and 577 primate-accelerated regions (PARs) at FDR<10% (Text S18, Table S14, S15), significantly expanding the 202 previously-known HARs[46]. Fifty-four HARs (9.4%) and 49 PARs (8.5%) overlap enhancer-associated chromatin marks and experimentally validated enhancers (Text S18). Substitution patterns in HARs suggest that GC-biased gene conversion (BGC) is not responsible for the accelerated evolution in the vast majority of these regions (~15% show evidence of BGC). Genes harboring or neighboring HARs and PARs are enriched for extra-cellular signaling, receptor activity, immunity, axon guidance, cartilage development, and embryonic pattern specification (Figure S23). For example, the FGF13 locus associated with an X-linked form of mental retardation contains four HARs near the 5′-ends of alternatively-spliced isoforms of FGF13 expressed in the nervous system, epithelial tissues and tumors, suggesting human-specific changes in isoform regulation (Figure S24).

Discussion

Comparative analysis of 29 mammalian genomes reveals a high-resolution map of >3.5 million constrained elements that encompass ~4% of the human genome and suggest potential functional classes for ~60% of the constrained bases; the remaining ~40% show no overlap and remain uncharacterized. We report previously undetected exons and overlapping functional elements within protein-coding sequence, new classes of RNA structures, promoter conservation profiles, and predicted targets of transcriptional regulators. We also provide evidence of evolutionary innovation, including codon-specific positive selection, mobile element exaptation and accelerated evolution in the primate and human lineages.

By focusing our comparison on only eutherian mammals, we discover functional elements relevant to this clade, including recent eutherian innovations. This is especially important for discovering regulatory elements, which can be subject to rapid turnover[47]. Indeed, a previous comparison suggested that only 80% of 50-bp non-coding elements are shared with opossum, while the current 12-bp analysis shows ~64% of non-coding elements shared with opossum[48], and only 6% with stickleback fish. Many eutherian elements are thus likely missing from previous maps of vertebrate constraint[17].

Sequencing of additional species should enable discovery of lineage-specific elements within mammalian clades, and provide increased resolution for shared mammalian constraint. We estimate that 100-200 eutherian mammals (15-25 neutral substitutions per site) will enable single-nucleotide resolution. The majority of this branch length is present within the Laurasiatherian and Euarchontoglire branches, which also contain multiple model organisms. These are ideal next targets for sequencing as part of the Genome 10K effort[49], which aims to sequence 10,000 vertebrate species. Within the primate clade, a branch length of ~1.5 could be achieved, enabling primate-specific selection studies, albeit at lower resolution. Lastly, human-specific selection should be detectable by combining data across genomic regions and by comparing thousands of humans[50].

The constrained elements reported here can be used to prioritize disease-associated variants for subsequent study, providing a powerful lens for elucidating functional elements in the human genome that is complementary to ongoing large-scale experimental endeavors such as ENCODE and Roadmap Epigenomics. Experimental studies require prior knowledge of the biochemical activity sought and reveal regions active in specific cell types and conditions. Comparative approaches provide an unbiased catalog of shared functional regions independent of biochemical activity or condition, and thus can capture experimentally intractable or rare activity patterns. With increasing branch length, they can provide information on ancestral and recent selective pressures across mammalian clades and within the human population. Ultimately, the combination of disease genetics, comparative and population genomics and biochemical studies has important implications for understanding human biology, health and disease.
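The link drawn above between total branch length and resolution (15-25 substitutions per site for single-nucleotide resolution, ~1.5 within primates at lower resolution) can be made concrete with a back-of-the-envelope sketch. It assumes, as a simplification that is not the paper's method, that neutral substitutions at each site follow an independent Poisson process, so a neutral site stays unchanged across a tree of total branch length T with probability exp(-T). The threshold `ALPHA` and the function names are hypothetical.

```python
import math


def p_neutral_window_unchanged(branch_length, n_sites=1):
    """Chance that n_sites neutral sites all show zero substitutions
    across a tree with the given total branch length (subs/site).
    Assumes independent Poisson substitutions (a simplification)."""
    return math.exp(-branch_length * n_sites)


def min_window_length(branch_length, alpha):
    """Smallest number of perfectly conserved sites whose chance
    probability under neutrality falls to alpha or below."""
    return math.ceil(math.log(1.0 / alpha) / branch_length)


ALPHA = 1e-6  # hypothetical per-window false-positive target

# Branch lengths quoted in the text; labels are for display only.
for label, t in [("primates (~1.5)", 1.5),
                 ("100-200 eutherians (15)", 15.0),
                 ("100-200 eutherians (25)", 25.0)]:
    print(label, p_neutral_window_unchanged(t), min_window_length(t, ALPHA))
```

Under these assumptions, a branch length of 15 or more reaches the assumed threshold with a single conserved site, whereas a branch length of ~1.5 needs a window of about ten sites. Real constraint detection does not require perfect conservation, so these numbers indicate scaling only.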

References (50 in total)

1. Initial sequencing and analysis of the human genome.

Authors: E S Lander; L M Linton; B Birren; C Nusbaum; M C Zody; J Baldwin; K Devon; K Dewar; M Doyle; W FitzHugh; R Funke; D Gage; K Harris; A Heaford; J Howland; L Kann; J Lehoczky; R LeVine; P McEwan; K McKernan; J Meldrim; J P Mesirov; C Miranda; W Morris; J Naylor; C Raymond; M Rosetti; R Santos; A Sheridan; C Sougnez; Y Stange-Thomann; N Stojanovic; A Subramanian; D Wyman; J Rogers; J Sulston; R Ainscough; S Beck; D Bentley; J Burton; C Clee; N Carter; A Coulson; R Deadman; P Deloukas; A Dunham; I Dunham; R Durbin; L French; D Grafham; S Gregory; T Hubbard; S Humphray; A Hunt; M Jones; C Lloyd; A McMurray; L Matthews; S Mercer; S Milne; J C Mullikin; A Mungall; R Plumb; M Ross; R Shownkeen; S Sims; R H Waterston; R K Wilson; L W Hillier; J D McPherson; M A Marra; E R Mardis; L A Fulton; A T Chinwalla; K H Pepin; W R Gish; S L Chissoe; M C Wendl; K D Delehaunty; T L Miner; A Delehaunty; J B Kramer; L L Cook; R S Fulton; D L Johnson; P J Minx; S W Clifton; T Hawkins; E Branscomb; P Predki; P Richardson; S Wenning; T Slezak; N Doggett; J F Cheng; A Olsen; S Lucas; C Elkin; E Uberbacher; M Frazier; R A Gibbs; D M Muzny; S E Scherer; J B Bouck; E J Sodergren; K C Worley; C M Rives; J H Gorrell; M L Metzker; S L Naylor; R S Kucherlapati; D L Nelson; G M Weinstock; Y Sakaki; A Fujiyama; M Hattori; T Yada; A Toyoda; T Itoh; C Kawagoe; H Watanabe; Y Totoki; T Taylor; J Weissenbach; R Heilig; W Saurin; F Artiguenave; P Brottier; T Bruls; E Pelletier; C Robert; P Wincker; D R Smith; L Doucette-Stamm; M Rubenfield; K Weinstock; H M Lee; J Dubois; A Rosenthal; M Platzer; G Nyakatura; S Taudien; A Rump; H Yang; J Yu; J Wang; G Huang; J Gu; L Hood; L Rowen; A Madan; S Qin; R W Davis; N A Federspiel; A P Abola; M J Proctor; R M Myers; J Schmutz; M Dickson; J Grimwood; D R Cox; M V Olson; R Kaul; C Raymond; N Shimizu; K Kawasaki; S Minoshima; G A Evans; M Athanasiou; R Schultz; B A Roe; F Chen; H Pan; J Ramser; H Lehrach; R Reinhardt; W R McCombie; M de la Bastide; N Dedhia; H 
Blöcker; K Hornischer; G Nordsiek; R Agarwala; L Aravind; J A Bailey; A Bateman; S Batzoglou; E Birney; P Bork; D G Brown; C B Burge; L Cerutti; H C Chen; D Church; M Clamp; R R Copley; T Doerks; S R Eddy; E E Eichler; T S Furey; J Galagan; J G Gilbert; C Harmon; Y Hayashizaki; D Haussler; H Hermjakob; K Hokamp; W Jang; L S Johnson; T A Jones; S Kasif; A Kaspryzk; S Kennedy; W J Kent; P Kitts; E V Koonin; I Korf; D Kulp; D Lancet; T M Lowe; A McLysaght; T Mikkelsen; J V Moran; N Mulder; V J Pollara; C P Ponting; G Schuler; J Schultz; G Slater; A F Smit; E Stupka; J Szustakowki; D Thierry-Mieg; J Thierry-Mieg; L Wagner; J Wallis; R Wheeler; A Williams; Y I Wolf; K H Wolfe; S P Yang; R F Yeh; F Collins; M S Guyer; J Peterson; A Felsenfeld; K A Wetterstrand; A Patrinos; M J Morgan; P de Jong; J J Catanese; K Osoegawa; H Shizuya; S Choi; Y J Chen; J Szustakowki
Journal: Nature Date: 2001-02-15 Impact factor: 49.962

2. Initial sequencing and comparative analysis of the mouse genome.

Authors: Robert H Waterston; Kerstin Lindblad-Toh; Ewan Birney; Jane Rogers; Josep F Abril; Pankaj Agarwal; Richa Agarwala; Rachel Ainscough; Marina Alexandersson; Peter An; Stylianos E Antonarakis; John Attwood; Robert Baertsch; Jonathon Bailey; Karen Barlow; Stephan Beck; Eric Berry; Bruce Birren; Toby Bloom; Peer Bork; Marc Botcherby; Nicolas Bray; Michael R Brent; Daniel G Brown; Stephen D Brown; Carol Bult; John Burton; Jonathan Butler; Robert D Campbell; Piero Carninci; Simon Cawley; Francesca Chiaromonte; Asif T Chinwalla; Deanna M Church; Michele Clamp; Christopher Clee; Francis S Collins; Lisa L Cook; Richard R Copley; Alan Coulson; Olivier Couronne; James Cuff; Val Curwen; Tim Cutts; Mark Daly; Robert David; Joy Davies; Kimberly D Delehaunty; Justin Deri; Emmanouil T Dermitzakis; Colin Dewey; Nicholas J Dickens; Mark Diekhans; Sheila Dodge; Inna Dubchak; Diane M Dunn; Sean R Eddy; Laura Elnitski; Richard D Emes; Pallavi Eswara; Eduardo Eyras; Adam Felsenfeld; Ginger A Fewell; Paul Flicek; Karen Foley; Wayne N Frankel; Lucinda A Fulton; Robert S Fulton; Terrence S Furey; Diane Gage; Richard A Gibbs; Gustavo Glusman; Sante Gnerre; Nick Goldman; Leo Goodstadt; Darren Grafham; Tina A Graves; Eric D Green; Simon Gregory; Roderic Guigó; Mark Guyer; Ross C Hardison; David Haussler; Yoshihide Hayashizaki; LaDeana W Hillier; Angela Hinrichs; Wratko Hlavina; Timothy Holzer; Fan Hsu; Axin Hua; Tim Hubbard; Adrienne Hunt; Ian Jackson; David B Jaffe; L Steven Johnson; Matthew Jones; Thomas A Jones; Ann Joy; Michael Kamal; Elinor K Karlsson; Donna Karolchik; Arkadiusz Kasprzyk; Jun Kawai; Evan Keibler; Cristyn Kells; W James Kent; Andrew Kirby; Diana L Kolbe; Ian Korf; Raju S Kucherlapati; Edward J Kulbokas; David Kulp; Tom Landers; J P Leger; Steven Leonard; Ivica Letunic; Rosie Levine; Jia Li; Ming Li; Christine Lloyd; Susan Lucas; Bin Ma; Donna R Maglott; Elaine R Mardis; Lucy Matthews; Evan Mauceli; John H Mayer; Megan McCarthy; W Richard McCombie; Stuart McLaren; 
Kirsten McLay; John D McPherson; Jim Meldrim; Beverley Meredith; Jill P Mesirov; Webb Miller; Tracie L Miner; Emmanuel Mongin; Kate T Montgomery; Michael Morgan; Richard Mott; James C Mullikin; Donna M Muzny; William E Nash; Joanne O Nelson; Michael N Nhan; Robert Nicol; Zemin Ning; Chad Nusbaum; Michael J O'Connor; Yasushi Okazaki; Karen Oliver; Emma Overton-Larty; Lior Pachter; Genís Parra; Kymberlie H Pepin; Jane Peterson; Pavel Pevzner; Robert Plumb; Craig S Pohl; Alex Poliakov; Tracy C Ponce; Chris P Ponting; Simon Potter; Michael Quail; Alexandre Reymond; Bruce A Roe; Krishna M Roskin; Edward M Rubin; Alistair G Rust; Ralph Santos; Victor Sapojnikov; Brian Schultz; Jörg Schultz; Matthias S Schwartz; Scott Schwartz; Carol Scott; Steven Seaman; Steve Searle; Ted Sharpe; Andrew Sheridan; Ratna Shownkeen; Sarah Sims; Jonathan B Singer; Guy Slater; Arian Smit; Douglas R Smith; Brian Spencer; Arne Stabenau; Nicole Stange-Thomann; Charles Sugnet; Mikita Suyama; Glenn Tesler; Johanna Thompson; David Torrents; Evanne Trevaskis; John Tromp; Catherine Ucla; Abel Ureta-Vidal; Jade P Vinson; Andrew C Von Niederhausern; Claire M Wade; Melanie Wall; Ryan J Weber; Robert B Weiss; Michael C Wendl; Anthony P West; Kris Wetterstrand; Raymond Wheeler; Simon Whelan; Jamey Wierzbowski; David Willey; Sophie Williams; Richard K Wilson; Eitan Winter; Kim C Worley; Dudley Wyman; Shan Yang; Shiaw-Pyng Yang; Evgeny M Zdobnov; Michael C Zody; Eric S Lander
Journal: Nature Date: 2002-12-05 Impact factor: 49.962

3. Comparative analyses of multi-species sequences from targeted genomic regions.

Authors: J W Thomas; J W Touchman; R W Blakesley; G G Bouffard; S M Beckstrom-Sternberg; E H Margulies; M Blanchette; A C Siepel; P J Thomas; J C McDowell; B Maskeri; N F Hansen; M S Schwartz; R J Weber; W J Kent; D Karolchik; T C Bruen; R Bevan; D J Cutler; S Schwartz; L Elnitski; J R Idol; A B Prasad; S-Q Lee-Lin; V V B Maduro; T J Summers; M E Portnoy; N L Dietrich; N Akhter; K Ayele; B Benjamin; K Cariaga; C P Brinkley; S Y Brooks; S Granite; X Guan; J Gupta; P Haghighi; S-L Ho; M C Huang; E Karlins; P L Laric; R Legaspi; M J Lim; Q L Maduro; C A Masiello; S D Mastrian; J C McCloskey; R Pearson; S Stantripop; E E Tiongson; J T Tran; C Tsurgeon; J L Vogt; M A Walker; K D Wetherby; L S Wiggins; A C Young; L-H Zhang; K Osoegawa; B Zhu; B Zhao; C L Shu; P J De Jong; C E Lawrence; A F Smit; A Chakravarti; D Haussler; P Green; W Miller; E D Green
Journal: Nature Date: 2003-08-14 Impact factor: 49.962

4. Ultraconserved elements in the human genome.

Authors: Gill Bejerano; Michael Pheasant; Igor Makunin; Stuart Stephen; W James Kent; John S Mattick; David Haussler
Journal: Science Date: 2004-05-06 Impact factor: 47.728

5. The share of human genomic DNA under selection estimated from human-mouse genomic alignments.

Authors: F Chiaromonte; R J Weber; K M Roskin; M Diekhans; W J Kent; D Haussler
Journal: Cold Spring Harb Symp Quant Biol Date: 2003

6. Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals.

Authors: Xiaohui Xie; Jun Lu; E J Kulbokas; Todd R Golub; Vamsi Mootha; Kerstin Lindblad-Toh; Eric S Lander; Manolis Kellis
Journal: Nature Date: 2005-02-27 Impact factor: 49.962

7. L-methionine availability regulates expression of the methionine adenosyltransferase 2A gene in human hepatocarcinoma cells: role of S-adenosylmethionine.

Authors: Maria L Martínez-Chantar; M Ujue Latasa; Marta Varela-Rey; Shelly C Lu; Elena R García-Trevijano; José M Mato; Matías A Avila
Journal: J Biol Chem Date: 2003-03-26 Impact factor: 5.157

8. Quantitative estimates of sequence divergence for comparative analyses of mammalian genomes.

Authors: Gregory M Cooper; Michael Brudno; Eric D Green; Serafim Batzoglou; Arend Sidow
Journal: Genome Res Date: 2003-05 Impact factor: 9.043

9. Genome sequence of the Brown Norway rat yields insights into mammalian evolution.

Authors: Richard A Gibbs; George M Weinstock; Michael L Metzker; Donna M Muzny; Erica J Sodergren; Steven Scherer; Graham Scott; David Steffen; Kim C Worley; Paula E Burch; Geoffrey Okwuonu; Sandra Hines; Lora Lewis; Christine DeRamo; Oliver Delgado; Shannon Dugan-Rocha; George Miner; Margaret Morgan; Alicia Hawes; Rachel Gill; Robert A Holt; Mark D Adams; Peter G Amanatides; Holly Baden-Tillson; Mary Barnstead; Soo Chin; Cheryl A Evans; Steve Ferriera; Carl Fosler; Anna Glodek; Zhiping Gu; Don Jennings; Cheryl L Kraft; Trixie Nguyen; Cynthia M Pfannkoch; Cynthia Sitter; Granger G Sutton; J Craig Venter; Trevor Woodage; Douglas Smith; Hong-Mei Lee; Erik Gustafson; Patrick Cahill; Arnold Kana; Lynn Doucette-Stamm; Keith Weinstock; Kim Fechtel; Robert B Weiss; Diane M Dunn; Eric D Green; Robert W Blakesley; Gerard G Bouffard; Pieter J De Jong; Kazutoyo Osoegawa; Baoli Zhu; Marco Marra; Jacqueline Schein; Ian Bosdet; Chris Fjell; Steven Jones; Martin Krzywinski; Carrie Mathewson; Asim Siddiqui; Natasja Wye; John McPherson; Shaying Zhao; Claire M Fraser; Jyoti Shetty; Sofiya Shatsman; Keita Geer; Yixin Chen; Sofyia Abramzon; William C Nierman; Paul H Havlak; Rui Chen; K James Durbin; Amy Egan; Yanru Ren; Xing-Zhi Song; Bingshan Li; Yue Liu; Xiang Qin; Simon Cawley; Kim C Worley; A J Cooney; Lisa M D'Souza; Kirt Martin; Jia Qian Wu; Manuel L Gonzalez-Garay; Andrew R Jackson; Kenneth J Kalafus; Michael P McLeod; Aleksandar Milosavljevic; Davinder Virk; Andrei Volkov; David A Wheeler; Zhengdong Zhang; Jeffrey A Bailey; Evan E Eichler; Eray Tuzun; Ewan Birney; Emmanuel Mongin; Abel Ureta-Vidal; Cara Woodwark; Evgeny Zdobnov; Peer Bork; Mikita Suyama; David Torrents; Marina Alexandersson; Barbara J Trask; Janet M Young; Hui Huang; Huajun Wang; Heming Xing; Sue Daniels; Darryl Gietzen; Jeanette Schmidt; Kristian Stevens; Ursula Vitt; Jim Wingrove; Francisco Camara; M Mar Albà; Josep F Abril; Roderic Guigo; Arian Smit; Inna Dubchak; Edward M Rubin; Olivier Couronne; Alexander 
Poliakov; Norbert Hübner; Detlev Ganten; Claudia Goesele; Oliver Hummel; Thomas Kreitler; Young-Ae Lee; Jan Monti; Herbert Schulz; Heike Zimdahl; Heinz Himmelbauer; Hans Lehrach; Howard J Jacob; Susan Bromberg; Jo Gullings-Handley; Michael I Jensen-Seaman; Anne E Kwitek; Jozef Lazar; Dean Pasko; Peter J Tonellato; Simon Twigger; Chris P Ponting; Jose M Duarte; Stephen Rice; Leo Goodstadt; Scott A Beatson; Richard D Emes; Eitan E Winter; Caleb Webber; Petra Brandt; Gerald Nyakatura; Margaret Adetobi; Francesca Chiaromonte; Laura Elnitski; Pallavi Eswara; Ross C Hardison; Minmei Hou; Diana Kolbe; Kateryna Makova; Webb Miller; Anton Nekrutenko; Cathy Riemer; Scott Schwartz; James Taylor; Shan Yang; Yi Zhang; Klaus Lindpaintner; T Dan Andrews; Mario Caccamo; Michele Clamp; Laura Clarke; Valerie Curwen; Richard Durbin; Eduardo Eyras; Stephen M Searle; Gregory M Cooper; Serafim Batzoglou; Michael Brudno; Arend Sidow; Eric A Stone; J Craig Venter; Bret A Payseur; Guillaume Bourque; Carlos López-Otín; Xose S Puente; Kushal Chakrabarti; Sourav Chatterji; Colin Dewey; Lior Pachter; Nicolas Bray; Von Bing Yap; Anat Caspi; Glenn Tesler; Pavel A Pevzner; David Haussler; Krishna M Roskin; Robert Baertsch; Hiram Clawson; Terrence S Furey; Angie S Hinrichs; Donna Karolchik; William J Kent; Kate R Rosenbloom; Heather Trumbower; Matt Weirauch; David N Cooper; Peter D Stenson; Bin Ma; Michael Brent; Manimozhiyan Arumugam; David Shteynberg; Richard R Copley; Martin S Taylor; Harold Riethman; Uma Mudunuri; Jane Peterson; Mark Guyer; Adam Felsenfeld; Susan Old; Stephen Mockrin; Francis Collins
Journal: Nature Date: 2004-04-01 Impact factor: 49.962

10. Sequencing and comparison of yeast species to identify genes and regulatory elements.

Authors: Manolis Kellis; Nick Patterson; Matthew Endrizzi; Bruce Birren; Eric S Lander
Journal: Nature Date: 2003-05-15 Impact factor: 49.962
