Maria Warnefors¹, Adam Eyre-Walker
¹ School of Life Sciences, University of Sussex, Brighton, United Kingdom. E-mail: maria.warnefors@gmail.com
Abstract
Gene expression is governed by an intricate combination of transcription factors (TFs), microRNAs (miRNAs), splicing factors, and other regulators. Genes cannot support infinitely complex regulation due to sequence constraints and the increased likelihood of harmful errors. However, the upper limit of regulatory complexity in the genome is not known. Here, we provide evidence that human genes are currently not operating at their maximum capacity in terms of gene regulation. We analyze genes spanning the full spectrum of eukaryote evolution, from primate-specific genes to genes present in the eukaryote ancestor, and show that older genes tend to be bound by more TFs, have more conserved upstream sequences, generate more alternative isoforms, house more miRNA targets, and are more likely to be affected by nonsense-mediated decay and RNA editing. These results cannot be explained by overrepresentation of certain functional categories among younger or older genes. Furthermore, the increase in complexity is continuous over evolutionary time, without signs of saturation, leading to the conclusion that most genes, at least in the human genome, have the capacity to evolve even more complex gene regulation in the future.
The upper limit for regulatory complexity in the genome is not known, yet such a limit must exist. Taking alternative splicing as an example, although one might easily imagine a gene that produces 20 splicing isoforms, a gene with 200,000 isoforms appears highly unrealistic, due to the overwhelming amount of regulatory sequence that would be required to avoid aberrant splice variants, which may cause disease (Tazi et al. 2009), and the severe constraints that this would impose on the coding sequence (Parmley et al. 2007). It follows that genes have a maximum capacity for new isoforms and that once this maximum has been reached, the organizational difficulties of adding further isoforms will completely outweigh the beneficial effects that these isoforms may provide.
The same logic can be extended to the many other mechanisms that control gene expression, such that a single gene can only support a limited level of regulation by transcription factors (TFs), microRNAs (miRNAs), and other processes. Although these types of regulation rarely involve coding sequences, they will still be limited by a finite supply of sequences that can house regulatory elements, as well as by interference between new and old elements. At saturation, new features can therefore only become fixed if they replace preexisting ones or follow a gene duplication event.
To what extent have human genes reached their maximum regulatory capacity? This question can be addressed by analyzing the level of regulation associated with genes that arose at different evolutionary times. Four potential scenarios are illustrated in figure 1. In the first (fig. 1A), genes are continuously acquiring regulatory features and have not yet reached their maximum capacity. In the second scenario (fig. 1B), older genes are saturated in terms of gene regulation and do not show a further increase in complexity. These two scenarios assume that gene regulatory features accumulate over time. It might, however, be that different forms of regulation dominate in genes of different age categories (fig. 1C) or that regulation and age are uncorrelated (fig. 1D). This last scenario does, however, appear unlikely as evolutionary age is known to correlate with aspects of gene architecture, including gene length and intron density (Wolf et al. 2009), as well as with gene expression, such that older genes tend to be expressed in more tissues (Milinkovitch et al. 2010) and at higher levels (Wolf et al. 2009) than younger genes.
Fig. 1. Potential relationships between regulatory complexity and gene age. (A) Genes continuously increase their regulatory complexity throughout their lifetime. (B) Regulatory complexity increases over time until the maximum capacity is reached. (C) Old and young genes tend to be regulated by different regulatory mechanisms. (D) Regulatory complexity is independent of gene age.
To distinguish between these scenarios, we have collected information on a variety of regulatory mechanisms operating in the human genome and related this to the evolutionary age of the affected genes. We found that older genes tend to be bound by more TFs, have more conserved upstream sequences, use more alternative transcription start sites (TSSs), produce more alternative splicing isoforms, and use more alternative polyadenylation sites. Furthermore, older genes are more likely to be affected by miRNAs, nonsense-mediated decay (NMD), and RNA editing. Based on this and the lack of apparent saturation, we draw the conclusion that the majority of human genes could support higher levels of regulation than what we currently observe.
Materials and Methods
To group human genes according to time of origin, we used the phylostratigraphic classifications established by Domazet-Lošo and Tautz (2010), with the additional requirement that the genes should be represented in release 59 of the Ensembl database (Flicek et al. 2010). We excluded human genes shared by archaea and bacteria from our analysis as many of the regulatory mechanisms that we consider are specific to eukaryotes. The number of genes for each of the 18 age categories is shown in table 1.
Table 1
Human Genes Classified According to Time of Origin
Category  Time of Origin (Ma)  Taxon             Number of Genes
1         77.5                 Primates          163
2         91                   Euarchontoglires  24
3         97.4                 Boreoeutheria     84
4         104.7                Eutheria          294
5         176.1                Mammalia          213
6         324.5                Amniota           121
7         361.2                Tetrapoda         73
8         454.6                Euteleostomi      455
9         568.8*               Craniata          394
10        682.9*               Olfactores        33
11        797                  Chordata          168
12        842                  Deuterostomia     52
13        910                  Bilateria         728
14        1036                 Eumetazoa         1770
15        1237                 Metazoa           341
16        1302.5*              Holozoa           281
17        1368                 Opisthokonta      449
18        1628                 Eukaryota         4906
Age classifications were taken from Domazet-Lošo and Tautz (2010) and time estimates from Hedges et al. (2006). In cases where the time estimates did not match the phylogeny (marked with an asterisk), the divergence time was interpolated from those of the surrounding taxa.
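The interpolation of the asterisked divergence times can be illustrated with a short Python sketch. It assumes plain linear interpolation by category number between the nearest flanking categories that have a usable estimate; the function name `interpolate_missing` and the dictionary layout are hypothetical, and only the categories flanking the gaps in Table 1 are listed.

```python
# Divergence times (Ma) keyed by age category. None marks the asterisked
# entries of Table 1. Only the categories flanking each gap are listed.
known_times = {8: 454.6, 9: None, 10: None, 11: 797.0,
               15: 1237.0, 16: None, 17: 1368.0}


def interpolate_missing(times):
    """Fill None entries by linear interpolation between the nearest
    flanking categories that do have a time estimate (an assumption
    about how the averaging was carried out)."""
    cats = sorted(times)
    filled = dict(times)
    for i, cat in enumerate(cats):
        if times[cat] is not None:
            continue
        # Nearest category with a known time on each side of the gap.
        lo = next(c for c in reversed(cats[:i]) if times[c] is not None)
        hi = next(c for c in cats[i + 1:] if times[c] is not None)
        frac = (cat - lo) / (hi - lo)
        filled[cat] = times[lo] + frac * (times[hi] - times[lo])
    return filled


filled = interpolate_missing(known_times)
for cat in (9, 10, 16):
    print(cat, round(filled[cat], 1))
```

Under this assumption the sketch gives 1302.5 Ma for category 16 and values within 0.1 Ma of the tabulated 568.8 and 682.9 Ma for categories 9 and 10; the small difference is consistent with rounding during stepwise averaging.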
Next, we calculated eight measures of the regulatory complexity of human genes. First, we estimated the complexity of transcriptional regulation for each gene, by counting the number of TFs that bound within 10 kb upstream of the TSS in the human cell line GM12878. This data set came from ENCODE ChIP-seq experiments performed at the HudsonAlpha Institute (Birney et al. 2007) and was available through the HAIB TFBS track for the human genome (release hg18) in the UCSC Genome Browser (Kent et al. 2002). The following 20 TFs were analyzed: BATF, BCL3, BCL11, EBF, Egr-1, GABP, IRF4, NRSF, p300, PAX5c, PAX5n, Pbx3, POU2F, Sin3A, SP1, SRF, TAF1, TCF12, USF-1, and ZBT33. As a second measure of transcriptional regulation, we calculated the degree of conservation of sequences within 10 kb upstream of the TSS as the proportion of bases that were identified as conserved within primates by the phastCons program (Siepel et al. 2005). This information was taken from the Conservation track in the UCSC Genome Browser.
Our next three complexity measures were based on the number of transcripts that are generated due to alternative use of TSSs, alternative splicing, and alternative polyadenylation, respectively. To distinguish between these mechanisms, we evaluated the exon coordinates, downloaded from Ensembl release 59 (Flicek et al. 2010), for all transcripts produced by genes for which we had age information. From the same database, we also downloaded a list of transcripts that were predicted to undergo NMD. Finally, we considered the degree of miRNA regulation based on the experimentally verified miRNA targets in TarBase v5.0.1 (Papadopoulos et al. 2009), as well as the number of sites that undergo RNA editing, taken from the DARNED database (Kiran and Baranov 2010).
We investigated the relationship between gene age and regulatory complexity for each of our eight measures, by calculating the Pearson correlation. This analysis was based on the complexity values of each gene, not the averaged values, which are provided for overview in figure 2.
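The distinction between correlating per-gene values and correlating category averages can be made concrete with a small Python sketch. The gene records below are invented for illustration only, and `pearson_r` is a hypothetical helper written with the standard library; the significance test (a t test on r with n − 2 degrees of freedom) is omitted.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-gene records: (gene age in Ma, number of isoforms).
genes = [(77.5, 1), (77.5, 3), (77.5, 2),
         (910.0, 2), (910.0, 6), (910.0, 4),
         (1628.0, 3), (1628.0, 9), (1628.0, 6)]

# Correlation on the raw per-gene values (the approach described in the text).
r_raw = pearson_r([age for age, _ in genes], [value for _, value in genes])

# Correlation on one averaged value per age category (overview only).
by_age = {}
for age, value in genes:
    by_age.setdefault(age, []).append(value)
category_ages = sorted(by_age)
category_means = [sum(by_age[a]) / len(by_age[a]) for a in category_ages]
r_means = pearson_r(category_ages, category_means)

print(f"r on raw values: {r_raw:.3f}; r on category means: {r_means:.3f}")
```

Averaging removes the gene-to-gene variation within each category, so the correlation on means is inflated relative to the per-gene correlation; this is why the raw values are the appropriate input for the test.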
Fig. 2. Evolution of regulatory complexity. (A) Average number of TFs binding within 10 kb upstream of genes. (B) Average number of conserved bases within 10 kb of the TSS. (C) Average number of TSSs per gene. (D) Average number of splicing isoforms per gene. (E) Average number of polyadenylation sites per gene. (F) Average number of verified miRNA targets per gene. (G) Proportion of genes that are targeted by NMD. (H) Proportion of genes that are RNA edited. The age of the gene categories in million years is on the x axis. Note that these are averages per age category, whereas the statistical analysis described in the text was performed on raw data.
To examine whether the observed correlations persisted even when we corrected for gene function, we first grouped genes into functional categories based on gene ontology terms (Ashburner et al. 2000). To this end, we downloaded GOslim terms for “molecular function” and “biological process” from Ensembl release 59 (Flicek et al. 2010). To make sure we had sufficient power to detect any correlations, we restricted our analysis to terms that matched at least 1,000 genes in our data set. We also excluded terms that were children of any of the other included terms, with the exception of the term “binding,” which due to its generality was further divided into “protein binding” and “nucleic acid binding.” We then repeated the analysis described above for each functional category.
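The term selection described above can be sketched in Python as follows. The gene counts and parent-child relations are invented for illustration, and the names `select_terms`, `MIN_GENES`, and `SPLIT_TERMS` are hypothetical; the sketch also assumes that "binding" itself is dropped once it has been replaced by its two child terms.

```python
# Hypothetical inputs: number of annotated genes per term, and the
# ancestor terms of each term within the GOslim hierarchy.
gene_counts = {
    "binding": 9000,
    "protein binding": 5200,
    "nucleic acid binding": 2400,
    "catalytic activity": 3100,
    "hydrolase activity": 1500,
    "transporter activity": 700,
}
ancestors = {
    "protein binding": {"binding"},
    "nucleic acid binding": {"binding"},
    "hydrolase activity": {"catalytic activity"},
}

MIN_GENES = 1000           # minimum number of genes per term
SPLIT_TERMS = {"binding"}  # too general; represented by its children


def select_terms(gene_counts, ancestors,
                 min_genes=MIN_GENES, split_terms=SPLIT_TERMS):
    """Keep sufficiently large terms, dropping any term whose ancestor
    is itself kept; split terms are replaced by their children."""
    large = {term for term, n in gene_counts.items() if n >= min_genes}
    kept_parents = large - split_terms
    selected = set()
    for term in large:
        if term in split_terms:
            continue
        if ancestors.get(term, set()) & kept_parents:
            continue  # a retained ancestor already covers this term
        selected.add(term)
    return sorted(selected)


print(select_terms(gene_counts, ancestors))
```

With these toy inputs, "hydrolase activity" is dropped because its parent "catalytic activity" is retained, "transporter activity" is dropped for having too few genes, and "binding" is represented by its two children.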
Results and Discussion
We have examined the accumulation of regulatory complexity in human genes, by analyzing several aspects of gene expression in genes of different evolutionary ages. To group genes according to time of origin, we used the classifications given by Domazet-Lošo and Tautz (2010). These age estimates rely on ortholog identification by BLAST (Altschul et al. 1997), which could mean that some faster-evolving genes escape detection. However, simulations indicate that overall this strategy is reliable (Albà and Castresana 2007). In total, human genes were divided into 18 age categories, with the oldest category including human genes that were present in the eukaryote ancestor and the youngest category consisting of primate-specific genes (table 1). Divergence times for the different categories were taken from the TimeTree database (Hedges et al. 2006), except in cases of contradictory estimates, where instead we interpolated the divergence time from the surrounding categories by taking the average time (table 1). Qualitatively similar results were obtained when we excluded these categories, as well as when we performed the analysis using the category numbers rather than the time estimates.
We calculated eight measures of regulatory complexity, based on publicly available data (see Materials and Methods). To estimate the level of transcriptional regulation, we analyzed sequences within 10 kb upstream of the TSS. First, we counted the number of TFs that bind to this region in the human lymphoblastoid cell line GM12878. To exclude nonexpressed genes, only genes that were bound by at least one TF were included in the analysis. Figure 2A shows the average number of TFs that bind to genes of different ages, with a clear increase in TF binding for old relative to young genes. As the data are rather noisy and some of the age categories contain relatively few genes (table 1), differences between individual age categories should be interpreted with caution in this and the following graphs. A list of means and standard errors for all investigated regulatory mechanisms is provided as Supplementary Material online. Analysis confirmed that evolutionary age is significantly correlated with TF-binding diversity, such that older genes are typically associated with more types of TFs (P = 2 × 10^−16, Pearson correlation; note that all correlations are performed on the raw data, not the means shown in the figures). To estimate the magnitude of the increase in diversity, we fitted a linear model to the data, which showed that genes in the youngest category are typically bound by 4.1 TFs, whereas the oldest genes are bound by 5.4 TFs (table 2).
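How a fitted line yields "typical" values for the youngest and oldest categories can be shown with a minimal Python sketch. It assumes an ordinary least-squares fit of complexity on gene age; the per-gene values are invented for illustration, and `fit_line` is a hypothetical helper, not the authors' code.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


YOUNGEST_MA = 77.5   # Primates (Table 1)
OLDEST_MA = 1628.0   # Eukaryota (Table 1)

# Hypothetical per-gene data: (gene age in Ma, number of bound TFs).
genes = [(77.5, 3), (77.5, 5), (454.6, 4), (454.6, 5),
         (1036.0, 4), (1036.0, 6), (1628.0, 5), (1628.0, 6)]

intercept, slope = fit_line([a for a, _ in genes], [v for _, v in genes])

# Model estimates at the two ends of the age range, and their ratio.
pred_young = intercept + slope * YOUNGEST_MA
pred_old = intercept + slope * OLDEST_MA
print(f"youngest: {pred_young:.2f}, oldest: {pred_old:.2f}, "
      f"ratio: {pred_old / pred_young:.2f}")
```

Reading the estimates off the fitted line, rather than taking the raw means of the two end categories, uses information from all 18 categories and is less sensitive to noise in the smaller ones.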
Table 2
Differences in Complexity between the Youngest and Oldest Age Categories
Category                  Youngest Genes (Primates)  Oldest Genes (Eukaryotes)  Ratio
TF-binding sites          4.12                       5.38                       1.31
Conserved bases upstream  396                        547                        1.38
TSSs                      2.35                       4.92                       2.09
Splicing isoforms         2.76                       5.72                       2.07
Polyadenylation sites     2.26                       4.80                       2.12
miRNA sites               0.0017                     0.0573                     33.7
NMD proportion            0.058                      0.168                      2.90
RNA editing proportion    0.052                      0.161                      3.10
The estimates were obtained by fitting a linear model to the data.
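The ratios in Table 2, and the absolute gains quoted in the text, follow directly from the two columns of estimates. The short Python check below recomputes them; the values are copied from Table 2, and the dictionary name `table2` is only a convenience.

```python
# (youngest estimate, oldest estimate, reported ratio), copied from Table 2.
table2 = {
    "TF-binding sites": (4.12, 5.38, 1.31),
    "Conserved bases upstream": (396, 547, 1.38),
    "TSSs": (2.35, 4.92, 2.09),
    "Splicing isoforms": (2.76, 5.72, 2.07),
    "Polyadenylation sites": (2.26, 4.80, 2.12),
    "miRNA sites": (0.0017, 0.0573, 33.7),
    "NMD proportion": (0.058, 0.168, 2.90),
    "RNA editing proportion": (0.052, 0.161, 3.10),
}

for name, (young, old, reported) in table2.items():
    ratio = old / young   # fold change from youngest to oldest genes
    gain = old - young    # absolute gain over the same interval
    print(f"{name}: ratio {ratio:.2f} (reported {reported}), gain {gain:.4g}")
```

The recomputed ratios agree with the reported ones to within rounding, and the gains for TSSs, splicing isoforms, and polyadenylation sites come out at 2.57, 2.96, and 2.54, respectively.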
Second, we assessed the level of conservation of upstream sequences, by counting the number of bases within 10 kb of the TSS that were identified as conserved among primates by the phastCons program (Siepel et al. 2005). Again, we found a significant correlation with age, where older genes tend to have more conserved upstream sequences than younger genes (P = 1 × 10^−10), such that the upstream regions of the oldest genes contain almost 40% more conserved bases than those of the youngest genes (table 2). Thus, both TF binding and upstream conservation show a highly significant correlation with evolutionary age.
We then considered complexity in terms of alternative isoforms generated by differential use of TSSs (fig. 2C), splice sites (fig. 2D), and polyadenylation sites (fig. 2E). For each of these mechanisms, we found significant positive correlations with gene age (alternative TSSs: P < 2 × 10^−16; alternative splicing: P < 2 × 10^−16; alternative polyadenylation: P < 2 × 10^−16). Compared with the youngest genes in our data set, the oldest genes have gained 2.57 alternative start sites, 2.96 alternative splicing isoforms, and 2.54 alternative polyadenylation sites (table 2). This is consistent with the recent results of Roux and Robinson-Rechavi (2011), who also showed an accumulation in alternative splicing isoforms over time.
Notably, the patterns for these last three mechanisms are highly similar. This is to be expected since they are frequently coupled (e.g., a gene with two potential last exons will need to accommodate at least two polyadenylation sites and produce at least two alternative splicing isoforms). However, the similarity could also be a sign of ascertainment bias: if some genes have been more intensely studied, we might expect more alternative isoforms, of all three types, to have been identified in these genes. To exclude biased identification as an explanation, we analyzed cases where one of the three mechanisms acts independently of the others. Thus, we identified alternative TSSs and polyadenylation sites that occur within a single exon and therefore cannot be directly associated with an increase in splicing. We also counted the number of alternative coding sequences generated from each gene as this is not coupled directly to changes in UTR structure. As seen in figure 3, the three resulting distributions of alternative events are distinct from each other, as we would expect for unbiased data. Remarkably, the correlations remained positive and significant (alternative TSSs: P = 1 × 10^−5; alternative splicing: P < 2 × 10^−16; alternative polyadenylation: P = 3 × 10^−5), even though this analysis was performed on very limited data sets.
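One way to separate the three mechanisms from exon coordinates is sketched below in Python. This is an illustration under stated assumptions rather than the authors' procedure: transcripts are lists of (start, end) exons on the forward strand, the TSS is the start of the first exon, the polyadenylation site is the end of the last exon, and a site counts as "within a single exon" when transcripts share the other boundary of that exon. The function names are hypothetical.

```python
from collections import defaultdict


def isoform_counts(transcripts):
    """Return (distinct TSSs, distinct splicing patterns, distinct
    polyadenylation sites) for one gene."""
    tss = {t[0][0] for t in transcripts}
    polya = {t[-1][1] for t in transcripts}
    # A splicing pattern is the chain of introns (donor, acceptor).
    intron_chains = {tuple((a[1], b[0]) for a, b in zip(t, t[1:]))
                     for t in transcripts}
    return len(tss), len(intron_chains), len(polya)


def within_exon_sites(transcripts):
    """Count extra TSSs and polyadenylation sites that fall inside one
    exon, i.e. that differ while the exon's splice boundary is shared."""
    starts_by_donor = defaultdict(set)    # first-exon end -> TSSs
    ends_by_acceptor = defaultdict(set)   # last-exon start -> polyA sites
    for t in transcripts:
        starts_by_donor[t[0][1]].add(t[0][0])
        ends_by_acceptor[t[-1][0]].add(t[-1][1])
    extra_tss = sum(len(s) - 1 for s in starts_by_donor.values())
    extra_polya = sum(len(e) - 1 for e in ends_by_acceptor.values())
    return extra_tss, extra_polya


# Hypothetical gene with three transcripts.
gene = [
    [(100, 200), (300, 400), (500, 600)],
    [(150, 200), (300, 400), (500, 600)],  # alternative TSS, same first exon
    [(100, 200), (500, 650)],              # exon skipping, extended last exon
]
print(isoform_counts(gene))     # counts per mechanism
print(within_exon_sites(gene))  # sites independent of splicing
```

Because the within-exon sites are defined by a shared splice boundary, they can vary without any change in the intron chain, which is the property the independence test relies on.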
Fig. 3. Alternative isoforms arising from independent mechanisms. The average number of isoforms that are due to TSSs within a single exon (A), splicing of coding sequences (B), and polyadenylation sites within a single exon (C) for genes of different ages. The x axis shows gene age in million years.
Next, we investigated the distribution of verified miRNA-binding sites across the 18 categories (fig. 2F) and found that older genes are enriched in this type of regulation (P < 5 × 10^−11), with the number of miRNA targets per gene increasing more than 30-fold from 0.0017 to 0.0573. We also found significant positive correlations between gene age and the likelihood for genes to be targeted by the less common regulatory mechanisms NMD (P < 2 × 10^−16) and RNA editing (P < 2 × 10^−16). For both of these mechanisms, around 5% of the youngest genes are affected, whereas the proportion among the oldest genes is three times larger.
In theory, the results described above could be influenced by an uneven distribution of gene functions among the age categories. If “early” genes are predominantly of a functional type that requires a certain level or mode of regulation, whereas “late” genes have other functions and therefore different regulatory needs, then we might see a superficial correlation between age and regulatory complexity. To test this possibility, we further divided our data set according to gene ontology terms (Ashburner et al. 2000) and repeated the analysis for a number of functional categories (see Materials and Methods). In the vast majority of cases, the correlations between complexity and gene age remained positive even for functional subsets of genes (fig. 4), showing that the positive correlations that we obtained for the full data set are not due to functional bias.
Fig. 4. Correlation between complexity and age for functional subsets. Significantly positive correlations are indicated as black boxes, positive but not significant as gray boxes, and negative but not significant as white boxes. No significantly negative correlations were found.
Based on these results, we can exclude the last two possibilities shown in figure 1 (no increase in complexity with time and certain types of complexity being associated with particular time periods), as all forms of regulatory complexity investigated here show a significant increase over time. We are therefore left to determine whether the oldest human genes have reached regulatory saturation, that is, whether the pace at which genes accumulate new features has slowed down for older genes. To do this, we performed a regression analysis involving a quadratic term. However, in all eight cases, this term was either not significant or it indicated that the pace is higher for older genes. Thus, we have not found any evidence to suggest that human genes have reached saturation or that the rate at which they increase in regulatory complexity slows down over time. This partially contradicts the results of Roux and Robinson-Rechavi (2011), who showed that for nonduplicated genes, the rate of splicing isoform acquisition decreases as genes grow older. For duplicated genes, they found a linear relationship, consistent with our results, but argued that the linearity may be due to biased duplication.
Wolf et al. (2009) recently showed that the ratio of the rate of nonsynonymous to synonymous substitution (dN/dS) decreases with gene age, indicating that older genes are under stronger constraint. Rather than being the cause of the observed correlations, the decrease in dN/dS might be a consequence of the increase in the complexity of gene regulation, as regulatory elements within protein-coding sequences would be expected to constrain both nonsynonymous and synonymous sites, but might affect nonsynonymous sites more, as they also need to encode the protein sequence. However, even if the increase in constraint with evolutionary age was the cause of the increase in complexity, this does not alter the fact that regulatory complexity accumulates through time.
To summarize, we have demonstrated that older genes tend to be bound by more TFs, have more conserved upstream sequences, use more alternative TSSs, produce more alternative splicing isoforms, use more alternative polyadenylation sites, and contain more miRNA-binding sites and that they are also more likely targets of NMD and RNA editing. The differences between young and old genes are of such a magnitude that they could have a substantial impact on gene function. Furthermore, we have shown that the accumulation of new regulatory features has been an ongoing process over the past 1.5 billion years of eukaryote evolution. Therefore, although human gene regulation is a highly elaborate process, it has not reached its peak and human genes would thus be able to become even more complex in the future.
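The saturation test described in this section, a regression with a quadratic term, can be sketched in Python as follows. The sketch assumes a least-squares fit of y = b0 + b1·t + b2·t², solved through the normal equations with the standard library only; `fit_quadratic` and the synthetic data are hypothetical, and the significance test on b2 is omitted. A negative b2 would indicate that accumulation slows with age (saturation), whereas a positive b2 would indicate a faster pace in older genes.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2; returns [b0, b1, b2]."""
    # Power sums give the 3x3 normal-equation matrix and right-hand side.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


# Gene ages in billions of years (rescaled from Ma to keep the
# normal equations well conditioned).
ages_gyr = [0.0775, 0.4546, 0.91, 1.036, 1.628]

# Synthetic complexity values for two of the scenarios in figure 1.
saturating = [2 + 3 * t - 1.0 * t * t for t in ages_gyr]
accelerating = [2 + 1 * t + 0.5 * t * t for t in ages_gyr]

b_sat = fit_quadratic(ages_gyr, saturating)
b_acc = fit_quadratic(ages_gyr, accelerating)
print("saturating b2:", round(b_sat[2], 3))
print("accelerating b2:", round(b_acc[2], 3))
```

On real data the sign of b2 is only informative together with its standard error, which is why a quadratic term that is not significant is reported as the absence of evidence for saturation.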
Supplementary Material
Supplementary material is available at Genome Biology and Evolution online (http://gbe.oxfordjournals.org/).
Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D. Genome Res. 2002-06.
Siepel A, Bejerano G, Pedersen JS, et al. Genome Res. 2005-07-15.
Birney E, Stamatoyannopoulos JA, Dutta A, et al. Nature. 2007-06-14.
Wolf YI, Novichkov PS, Karev GP, Koonin EV, Lipman DJ. Proc Natl Acad Sci U S A. 2009-04-07.
Papadopoulos GL, Reczko M, Simossis VA, Sethupathy P, Hatzigeorgiou AG. Nucleic Acids Res. 2008-10-27.
Flicek P, Aken BL, Ballester B, et al. Nucleic Acids Res. 2009-11-11.