
Transcriptome and genome sequencing uncovers functional variation in humans.

Tuuli Lappalainen¹, Michael Sammeth, Marc R Friedländer, Peter A C 't Hoen, Jean Monlong, Manuel A Rivas, Mar Gonzàlez-Porta, Natalja Kurbatova, Thasso Griebel, Pedro G Ferreira, Matthias Barann, Thomas Wieland, Liliana Greger, Maarten van Iterson, Jonas Almlöf, Paolo Ribeca, Irina Pulyakhina, Daniela Esser, Thomas Giger, Andrew Tikhonov, Marc Sultan, Gabrielle Bertier, Daniel G MacArthur, Monkol Lek, Esther Lizano, Henk P J Buermans, Ismael Padioleau, Thomas Schwarzmayr, Olof Karlberg, Halit Ongen, Helena Kilpinen, Sergi Beltran, Marta Gut, Katja Kahlem, Vyacheslav Amstislavskiy, Oliver Stegle, Matti Pirinen, Stephen B Montgomery, Peter Donnelly, Mark I McCarthy, Paul Flicek, Tim M Strom, Hans Lehrach, Stefan Schreiber, Ralf Sudbrak, Angel Carracedo, Stylianos E Antonarakis, Robert Häsler, Ann-Christine Syvänen, Gert-Jan van Ommen, Alvis Brazma, Thomas Meitinger, Philip Rosenstiel, Roderic Guigó, Ivo G Gut, Xavier Estivill, Emmanouil T Dermitzakis.

Abstract

Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project--the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.

Year: 2013 PMID: 24037378 PMCID: PMC3918453 DOI: 10.1038/nature12531

Source DB: PubMed Journal: Nature ISSN: 0028-0836 Impact factor: 49.962

Introduction and data set

Interpreting functional consequences of millions of discovered genetic variants is one of the biggest challenges in human genomics[1]. While genome-wide association studies have linked genetic loci to various human phenotypes and the functional annotation of the genome is improving[2,3], we still have limited understanding of the underlying causal variants and biological mechanisms. One approach to address this challenge has been to analyze variants affecting cellular phenotypes, such as gene expression,[4-8] known to affect many human diseases and traits.[9,10] In this study, we characterize functional variation in human genomes by RNA-sequencing hundreds of samples from the 1000 Genomes Project[1], the most important reference data set of human genetic variation, thus creating the biggest RNA sequencing data set of multiple human populations to date. We not only catalogue novel loci with regulatory variation but also, for the first time, discover and characterize molecular properties of causal functional variants. We performed mRNA and small RNA sequencing on lymphoblastoid cell line (LCL) samples from 5 populations: the CEPH (CEU), Finns (FIN), British (GBR), Toscani (TSI) and Yoruba (YRI). After quality control, we had 462 and 452 individuals (89–95 per population) with mRNA and miRNA data, respectively (Fig. S1–11, Table S1). Of these, 421 are in the 1000 Genomes Phase 1 dataset[1], and the remaining 41 were imputed from SNP array data (Fig. S3, Table S2). RNA-seq was performed in seven laboratories, and the smaller amount of variation between laboratories than between individuals demonstrated that RNA sequencing is a mature technology ready for distributed data production (Mann-Whitney (MW) p < 2.2 × 10^−16 for mRNA, p = 1.34 × 10^−10 for miRNA; Fig. 1a, S11;[11]). To discover genetic regulatory variants, we mapped cis-QTLs to transcriptome traits of protein-coding and miRNA genes separately in the European (EUR) and Yoruba (YRI) populations (Fig. S12, Table S3, Table 1). 
The RNA-seq read, quantification, genotype and QTL data are available open-access (see Data Access section).

Figure 1

Transcriptome variation

a) Spearman rank correlation of replicate samples, based on mRNA exon and miRNA quantifications of 5 individuals sequenced 8 and 7 times for mRNA and miRNA, respectively, and separated by the individual or the sequencing lab being the same or different. The quantifications have been normalized only for the total number of mapped reads (see Fig. S11 for correlations after normalization). b) The proportion of expression level variation (as opposed to splicing) of the total transcription variation between individuals in each population, measured per gene. c) Proportion of genes with differential expression levels and/or transcript usage between population pairs, out of the total listed on the right-hand side. d) Network of significant miRNA families (P<0.001; yellow) and their significantly associated mRNA targets (P<0.05; purple). The edges display negative (green) and positive (red) associations.

Table 1

Numbers of transcriptome features with a QTL (FDR 5%)

QTL type	Total	EUR (n=373)	YRI (n=89)	Union
exon eQTL	12981 genes	7390	2369	7825
gene eQTL	13703 genes	3259	501	3773
transcript ratio QTL	7855 genes	620	83	639
mirQTL	644 miRNAs	57	15	60
transcribed repeat eQTL	43875 repeats	5763	1055	6069

Transcriptome variation in populations

This first uniformly processed RNA-seq data set from multiple human populations allowed high-resolution analysis of transcriptome variation. Individual and population differences in transcription can manifest in (1) overall expression levels, and (2) relative abundance of transcripts from the same gene (transcript ratios). Deconvolution of the relative contribution of these two components[12] indicates that their ratio is characteristic of each gene, with transcript ratio variation being on average more dominant (Fig. 1b, Fig. S13, S14). Population differences explain a small but significant proportion, 3%, of total variation (MW p < 2.2 × 10^−16). In addition to this genome-wide perspective on population variation, we identified 263–4379 genes with differential expression and/or transcript ratios between population pairs (PGF, JM, MGP, MB, TL, TW, MRF, A Guin, MAR, TGC, PR, ETD, RG, MS, submitted). Interestingly, continental differences between YRI-EUR population pairs have a much higher contribution of genes with different transcript usage than European population pairs (75–85% versus 6–40%; Fig. 1c, Fig. S14). This has not been observed before in humans, but it is consistent with splicing patterns capturing phylogenetic differences between species better than expression levels[13,14]. We quantify a total of 644 autosomal miRNAs in >50% of individuals, of which 60 have significant cis-mirQTLs for miRNA expression (Fig. S15, Table 1), showing that genetic effects on miRNA expression are much more widespread than the previously identified loci[15]. To complement previous studies of miRNA function in cell perturbation experiments, we analyzed miRNA-mRNA interaction in our steady-state population sample. Of 100 miRNA families, 32 correlated with the expression of predicted target exons in a highly connected network (P<0.001, Fig. 1d, Table S4), including miRNA families with important immunological or lymphocyte functions, such as miR-150, miR-155, miR-181, and miR-146[16]. 
Interestingly, 45% of the associations were positive – consistent with previous results[15] – even though based on knockout experiments miRNAs mostly downregulate genes. Analyzing the direction of causality, cis-mirQTLs had small trans-eQTL effects on predicted targets only when effects were negative (pi1 = 0.11 versus pi1 = 0, Fig. S16), suggesting that miRNAs indeed downregulate their targets. Positive correlations may be driven by other effects, which is supported by overrepresentation of transcription factors in the network (29%, Fisher p = 2.1 × 10^−7 for negative targets and 26%, p = 4.0 × 10^−4 for positive targets). This suggests feedback loops of both mRNA and miRNA genes affecting the expression of each other, and supports the idea that under steady-state conditions miRNAs confer robustness to expression programs[17]. Altogether, these results highlight the added insight into the role of miRNAs in regulatory networks from analysis of population variation.

Genetic effects on the transcriptome

Expression QTL (eQTL) analysis of protein-coding and lincRNA genes uncovered extremely widespread regulatory variation, with 3,773 genes having a classical eQTL for gene expression levels (Table 1). While the potential of RNA-seq to discover other transcriptome traits such as splicing variation is widely known[7,8,18-20], a comprehensive analysis has been lacking. To this end, we first mapped eQTLs for exon quantifications that can capture both gene expression and splicing variation, discovering as many as 7,825 genes with an eQTL, referred to as eQTLs in this paper unless otherwise specified. Regressing out the most significantly associated variant from the EUR eQTL analysis showed that as many as 34% of the genes have a second, independent eQTL for any of their exons (of which 7% are for the same exon as the first association). Thus, there is substantial allelic heterogeneity for regulatory effects on a single gene and independence of exons of the same gene (Fig. S17). To investigate genetic effects specifically on splicing, we mapped transcript ratio QTLs (trQTLs) affecting the ratio of each transcript to the gene total, discovering 639 genes with a trQTL – the largest number of genetic effects on transcript structure identified to date. The lower number relative to gene eQTLs is likely caused by higher noise in model-based transcript quantifications than in gene counts. To characterize the relationship of genetic variants affecting expression versus splicing, we regressed out the best trQTL variant from the gene eQTL analysis in 279 genes with both types of QTLs. The results showed that the causal variants are independent in ≥57% of these genes (Fig. S18), suggesting that transcriptional activity and transcript usage are usually controlled by different regulatory elements of the genome. The transcript differences driven by trQTLs involve exon skipping only in 15% of genes, with as much as 48% and 43% varying in 5’ and 3’ ends, respectively (in EUR; categories not mutually exclusive; Fig. 2b).
To further analyze transcript modifications through unannotated transcript elements, we mapped cis-eQTLs for expressed retrotransposon-derived elements (repeat elements) outside genes, known to be an important source for evolution of new transcripts.[21] We detected widespread sharing between the 5,763 cis-eQTLs discovered for repeat elements (Fig. S19, Table 1) and nearby exon eQTLs: of the best repeat eQTL variants in EUR, 49% were significant and 6% the top eQTL variants for exons of a nearby gene (3.8× and 26× enrichment; Fisher p < 2.2 × 10^−16). This suggests that retrotransposon-derived elements can share regulatory elements with nearby genes. These results provide the first genome-wide characterization of genetic effects on transcript structure through annotated and unannotated 3’ and 5’ changes, which may predominate over the exon skipping that previous studies have focused on[19]. This opens new perspectives for understanding their cellular and high-level effects, as end modifications will rarely change protein structure but may affect post-transcriptional regulation.
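
The conditional step used above (regress out the best variant, then test whether other variants still associate with what is left) can be illustrated with a minimal sketch. This is not the study's pipeline: the helper names `fit_line`, `regress_out` and `pearson_r`, the nine-sample genotypes and the expression values are all invented for illustration, and a plain correlation stands in for the full association test.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def regress_out(x, y):
    """Residuals of y after removing the linear effect of x."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


# Toy allele dosages (0/1/2) for nine samples; every value here is invented.
top_variant = [0, 0, 0, 1, 1, 1, 2, 2, 2]      # best eQTL variant
linked_variant = list(top_variant)             # perfect-LD proxy of the top variant
second_variant = [0, 1, 2, 0, 1, 2, 0, 1, 2]   # unlinked to the top variant
noise = [0.10, -0.10, 0.00, -0.05, 0.05, 0.10, -0.10, 0.00, 0.05]
expression = [1.0 * a + 0.5 * b + e
              for a, b, e in zip(top_variant, second_variant, noise)]

residual = regress_out(top_variant, expression)
r_linked = pearson_r(linked_variant, residual)        # signal removed with the top variant
r_independent = pearson_r(second_variant, residual)   # signal survives conditioning
print(round(r_linked, 3), round(r_independent, 3))
```

In this toy case the proxy in perfect linkage loses its association once the top variant is regressed out, while the unlinked variant keeps a strong one, which is the pattern interpreted in the text as a second, independent eQTL.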

Figure 2

Transcriptome QTLs

a) Enrichment of EUR exon eQTLs in functional annotations for the 1st, 2nd, 5th and 10th best associating eQTL variant per gene, relative to a matched null set of variants denoted by the horizontal line. The numbers are −log10 p-values of a Fisher test between the best eQTL and the null. b) Classification of changes caused by transcript ratio QTLs. c) The rank of the best Omni2.5M SNP among the significant EUR eQTL variants per gene. d) DGKD gene locus where an intronic SNP rs838705 is associated with calcium levels (red), and the top eQTL variant 21 kb downstream (blue) is a very likely causal variant, close to the TSS of two transcripts in the MEF2A,C binding region.

Altogether, we present the largest and the most diverse catalog of cis-regulatory variants discovered in a single tissue to date. The majority of the analyzed genes – 8,329 out of 13,970 – have one or several QTLs for different transcript traits, a resolution enabled by in-depth analysis of high-quality transcriptome and genome sequencing data. These results highlight both allelic heterogeneity of regulatory variants and phenotypic heterogeneity of diverse transcriptome traits of individual genes.

Properties of regulatory variants

To understand how eQTLs affect gene expression, we compared the properties of the top (most significant) eQTL variant per gene to a null of non-eQTL variants (matched for distance from TSS and minor allele frequency). The best eQTL variant may not always be the causal variant due to noise in genotype and phenotype data, and to estimate our ability to pinpoint causal variants, we contrasted the properties of the 1st eQTL to the 2nd, 5th and 10th best eQTL variants (Fig. 2a). First, comparing the eQTL with the best p-value to the matched null showed an enrichment of indels among top eQTLs (13% = 1.22× enrichment; Fisher p = 1.9 × 10^−3 in EUR; Fig. S20), suggesting that indels are more likely to have functional effects than SNPs. eQTLs are highly enriched in several noncoding elements from the Ensembl Regulatory Build, such as many transcription factor peaks (median enrichment 3.3×, median p = 0.009 in EUR; Fig. 2a, S21), DNase1 hypersensitive sites (3.4×, p = 1.00 × 10^−20), as well as in chromatin states of active promoters (3.5×, p = 1.08 × 10^−36) and strong enhancers (median 2.4×, median p = 1.14 × 10^−5). Within genes, splice-site (3.8×, p = 1.65 × 10^−5) and nonsynonymous (2.3×, p = 4.84 × 10^−6) enrichments point to putative regulatory functions of coding variants. Transcript ratio QTLs are overrepresented in splice sites (6.8×, p = 2.44 × 10^−7), as expected, but also for example in 3’UTRs (2.5×, p = 1.83 × 10^−6; Fig. S22) and promoters (2.4×, p = 5.79 × 10^−6). Altogether, the higher resolution of annotations and eQTLs relative to previous studies[22,23] provides important insight into the role of individual transcription factors and other regulatory elements mediating genetic regulatory effects. Functional enrichment typically decreases rapidly from the best eQTL variant towards lower ranks. 
To estimate how often the first variant is likely to be the causal regulatory variant, we calculated the annotation enrichment of the best eQTL variants relative to the null for (1) all eQTL loci, and (2) loci where the best eQTL variant is very likely causal due to having a log10 p-value >1.5 higher than the second variant (Fig. S23). The ratio of the enrichments (1) and (2) yields an approximation of the best variant being causal in 55% of EUR and 74% of YRI eQTLs, with more conservative estimates being 34% and 41%, respectively (Fig. S23). Thus, we have reasonable power to pinpoint causal regulatory variants from unbiased p-value distributions alone without annotation priors[23]. This is enabled by not relying on SNP array data[22]: in 81% of the cases the best variant is not on the Omni 2.5M chip (Fig. 2c, Fig. S25). Validating the putative causal effects, we observed that the best eQTL variants in CTCF peaks showed more allele-specific binding compared to matched null variants (p = 2.0 × 10^−3, Fig. S24) in CTCF ChIP-seq data from 6 individuals[24], and the best eQTLs were enriched in DNase1 hypersensitivity QTLs[25] (3.3×, p = 2.51 × 10^−6 in EUR, 7.9×, p < 2.2 × 10^−16 in YRI). In conclusion, we not only identify broad eQTL loci but also substantially increase our confidence to pinpoint individual causal variants and their functional mechanisms. Of the 6,473 variants in the GWAS catalog[26], 16% are eQTLs and 1.8% are trQTLs in EUR or YRI, but a high overlap is observed also by chance for a frequency-matched GWAS null (11% and 0.84%, respectively). The modest (albeit significant: eQTL chi2 p < 2.2 × 10^−16; trQTL p = 7.2 × 10^−9) enrichment[9,10] is due to eQTLs being very ubiquitous, and consequently, a GWAS variant being an eQTL does not mean that the regulatory change is necessarily driving the disease association. Our data offers a unique opportunity to address the key question of whether the causal eQTL variant is also causal for the disease. 
The enrichment of GWAS SNPs in the top eQTL ranks (p = 1.18 × 10^−7; Fig. S26) is a genome-wide signal of shared causality. To further characterize individual loci, we selected 78 eQTL regions that are likely causal signals for 91 GWAS SNPs (estimated by the RTC method)[6,9], and in these loci our best eQTL variant is the putative disease-causing variant (Fig. S27, Table S5). Figure 2d shows an example of the DGKD gene where an intronic SNP rs838705 is associated with calcium levels[27], and 21 kb downstream the top eQTL – a 2 bp insertion – is the likely causal variant affecting calcium levels. Thus, the integration of genome sequencing and cellular phenotype data helps not only to understand causal genes and biological processes but also to pinpoint putative causal genetic variants underlying GWAS associations.
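
The annotation enrichments reported in this section compare how often top eQTL variants and matched null variants fall inside an annotation, with a Fisher test on the resulting 2×2 table. The sketch below shows that calculation under stated assumptions: the function names are hypothetical, the counts are invented rather than taken from the study, and a one-sided test is used although the text does not say which alternative was tested.

```python
from math import comb


def fold_enrichment(hits_a, total_a, hits_b, total_b):
    """Ratio of the annotated fraction in set A to that in set B."""
    return (hits_a / total_a) / (hits_b / total_b)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: top eQTL variants / matched null variants.
    Columns: inside the annotation / outside it.
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, row1)


# Invented counts: 130 of 1,000 top eQTL variants and 107 of 1,000
# matched null variants are indels.
enrichment = fold_enrichment(130, 1000, 107, 1000)
p_value = fisher_exact_greater(130, 870, 107, 893)
print(f"{enrichment:.2f}x, one-sided p = {p_value:.3g}")
```

The fold enrichment and the p-value answer different questions: the first is an effect size, the second depends strongly on how many variants were tested, which is why the text reports both.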

Allelic and loss-of-function effects

Transcript differences between the two haplotypes of an individual allow quantification of regulatory variation even when eQTLs cannot be detected e.g. due to low allele frequency. We analyzed both allele-specific expression (ASE) and allele-specific transcript structure (ASTS), a novel approach based on exonic distribution of reads (Fig.S2, S28–33). This first genome-wide quantification of allelic effects on transcript structure shows that it is almost equally common as ASE, with significant (p < 0.005) ASE and ASTS in a median of 6.5% and 5.6% sites (out of 8,420 and 2,135) per individual, respectively. Furthermore, the substantial overlap of ASE and ASTS signals (Fig.3a) suggests that ASE is actually often driven by transcript structure variation. The low population frequency of the vast majority of ASE (Fig.3b) and ASTS (Fig.S30) events points to widespread rare regulatory variation that is undetectable in eQTL analysis.
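
As an illustration of how allele-specific expression at a single heterozygous site can be scored, the sketch below computes the effect size |0.5 − REF/TOTAL| used in Fig. 3b together with an exact two-sided binomial test against an even allelic ratio. The binomial test is an assumption made for this example (the text does not name the test here), the function names are hypothetical, and the read counts are invented; the down-sampling to 30 reads and the bias corrections described in the figure legend are not reproduced.

```python
from math import comb


def ase_effect_size(ref_reads, total_reads):
    """Allelic imbalance |0.5 - REF/TOTAL| at a heterozygous site."""
    return abs(0.5 - ref_reads / total_reads)


def binomial_two_sided_p(ref_reads, total_reads):
    """Exact two-sided binomial p-value against a 50:50 allelic ratio.

    Sums the probability of every outcome that is no more likely than the
    observed one; with p = 0.5 each outcome k has weight C(n, k) / 2**n.
    """
    observed = comb(total_reads, ref_reads)
    weights = (comb(total_reads, k) for k in range(total_reads + 1))
    tail = sum(w for w in weights if w <= observed)
    return tail / 2 ** total_reads


# Invented example: 24 reference reads out of 30 at one heterozygous site.
effect = ase_effect_size(24, 30)
p_value = binomial_two_sided_p(24, 30)
is_significant = p_value < 0.005   # threshold quoted in the text
print(effect, p_value, is_significant)
```

Working with integer binomial coefficients keeps the tail sum exact until the final division, which avoids rounding problems when deciding which outcomes are as extreme as the observed one.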

Figure 3

Allele-specific effects on expression and transcript structure

a) Sharing of allele-specific expression (ASE) and transcript structure (ASTS) signals: the distribution of ASTS p-values of the sites with significant (p<0.005) ASE in the same individual, and vice versa. The ASE p-values are calculated from sites sampled to exactly 30 reads. The numbers denote the pi1 statistic measuring the enrichment of low p-values. b) Frequency of significant ASE events in the population (x-axis) and their effect size (|0.5 – REF/TOTAL|), calculated per ASE SNP. Only ASE SNPs with >=20 heterozygous individuals with >=30 reads were included, and the estimates were corrected for coverage bias and false positives by sampling and permutations. c) Enrichment of variants in regulatory annotations relative to a matched null distribution for the most significant eQTL variants, and for the subset of these that are also rSNPs. Categories with highest amount of data are shown (see Fig. S36 for all categories, see also Fig. 2a).
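
The pi1 statistic mentioned in panel a estimates the fraction of tests that are not null from the shape of the p-value distribution. The sketch below uses the simplest form of that idea with a single fixed cutoff `lam`; this is a simplification chosen for illustration, since published implementations typically smooth the estimate over a range of cutoffs, and the p-values are invented.

```python
def pi1_estimate(p_values, lam=0.5):
    """Estimate pi1 = 1 - pi0 from a list of p-values.

    pi0 (the fraction of true nulls) is estimated as the share of p-values
    above `lam`, rescaled by the width of that interval (1 - lam), because
    null p-values are uniform on [0, 1].
    """
    m = len(p_values)
    n_above = sum(1 for p in p_values if p > lam)
    pi0 = min(1.0, n_above / (m * (1.0 - lam)))
    return 1.0 - pi0


# Invented mixture: 60 evenly spread null p-values plus 40 small ones.
null_p = [(i + 0.5) / 60 for i in range(60)]
signal_p = [0.001] * 40
pi1 = pi1_estimate(null_p + signal_p)
print(round(pi1, 2))
```

A pi1 near 0 means the p-values look uniform, as expected with no shared signal, while larger values indicate an excess of small p-values.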

An important caveat in ASE analysis has been the possibility that it can be driven by purely epigenetic effects rather than cis-regulatory genetic variants. We investigated this by a novel approach to quantify concordance between ASE and putative regulatory variants (prSNPs), where heterozygotes but not homozygotes for a true rSNP should have differential expression of the two haplotypes, i.e. allelic imbalance in an aseSNP (Fig. S2, S34). We calculated concordance of allelic ratios of 5,479 aseSNPs and genotypes of all SNPs +/− 100kb from TSS, with an empirical p-value from 100–1000 permutations. Assigning the prSNPs with empirical p-value <0.01 to p<0.001 as likely rSNPs yielded a total of 224,640 rSNPs (7.4% of tested, Table S6) that clustered close to TSS as expected for regulatory variants[5] and replicated the majority of eQTL signals (Fig. S35). Nearly all aseSNPs (95%) had more observed rSNPs than expected; thus ASE appears to be nearly always genetic rather than driven by genotype-independent allelic epigenetic effects. rSNP signals are widespread and robust also outside eQTL genes (Table S6, Fig. S35), indicating potential to capture novel effects. Variants that are both eQTLs and rSNPs show higher enrichment in functional annotations (Fig. 3c, S36), suggesting that integrated analysis may improve resolution to find causal regulatory variants. Altogether, we show evidence that ASE effects are mostly rare and nearly always genetic, and ASE-based analyses may complement eQTL analysis in identification of especially low-frequency regulatory variants in future studies. While QTL and prSNP analyses aim at identifying previously unknown regulatory variants, we can also quantify functional effects of predicted loss-of-function variants.[28] Our RNA-seq data captures 839 premature stop codon and 849 splice-site variants, with the much higher number than in previous studies enabling proper quantification of their transcriptome effects. 
As expected, premature stop variants often show loss of the variant allele (Fig. S37) indicating nonsense-mediated decay (NMD)[29] as in previous studies[28,30]. Variants close to the end of the transcript appear to escape NMD as predicted[29]. However, of the variants predicted to trigger NMD, in 68% (54% of rare variants with MAF < 1%) the ASE results do not support this (Fig. 4), suggesting currently unknown mechanisms of NMD escape.

Figure 4

Transcriptome effects of loss-of-function variants

A) Nonsense-mediated decay due to premature stop codon variants was measured using allele-specific expression. The distribution of non-reference allele ratios (on the y-axis) for premature stop variants sorted on the x-axis according to derived allele frequency, split to sites predicted to trigger and escape NMD. The dots denote the median across individuals, and the vertical lines show the range of ratios for variants carried by several individuals. The grey vertical lines denote derived allele frequencies of 0, 0.001 and 0.01. B) Exon inclusion scores for variable exons for individuals that carry 0, 1 or 2 copies of variants that destroy a splice motif, with p-value from Mann-Whitney test.

Finally, we modeled how genetic variants affect splicing affinity in the entire splicing motif rather than only the canonical splice site, which is the first comprehensive set of such predictions genome-wide (PGF et al., submitted). Nonreference alleles have a lower splicing affinity on average (p < 2.2 × 10^−16, Fig. S38). For the 10% of these variants predicted to destroy the motif, individuals carrying two motif-destroying alleles have 29% lower median inclusion rates of the affected exon (p < 2.2 × 10^−16, Fig. 4b), indicating that our RNA-seq data is consistent with predictions of splicing effects.

Conclusions

By integrated analysis of RNA and DNA sequencing data we were able to obtain a unique view to variation of the transcriptome and its genetic causes, moving beyond eQTL catalogs to a high-resolution view of genetic regulatory variants. We deconvoluted the effect of gene expression and transcript structure in population differences of the transcriptome, in QTLs, and in allele-specific effects, and show that these two dimensions of transcript variation appear equally common but largely independent. Genetic regulatory variation is the rule rather than the exception in the genome with widespread allelic heterogeneity, and is the major determinant of allelic expression. For the first time, we were able to predict large numbers of causal regulatory variants, and thus provide a detailed view into cellular mechanisms of regulatory and loss-of-function variation, which is essential for future functional prediction of variants discovered in personal genomes. A subset of this functional variation at the cellular level will also have effects on higher-level traits. We demonstrate how eQTL data can be used to pinpoint putative causal GWAS variants of individual loci, which is important as a new paradigm of how integration of cellular phenotypes and genome sequencing data can uncover causal variants and biological mechanisms underlying diseases. The landscape of regulatory variation in this study adds a functional dimension to the 1000 Genomes data, which is used in effectively all disease studies, and together they form an important joint reference data set of variation and function of the human genome. Ultimately, this study illustrates the power of combining genome sequence analysis with a high-depth functional readout such as the transcriptome.

Methods

Total RNA was extracted from EBV transformed lymphoblastoid cell line pellets by the TRIzol reagent (Ambion), and mRNA and small RNA sequencing of 465 unique individuals was performed on the Illumina HiSeq2000 platform, with paired-end 75bp mRNA-seq and single-end 36bp small RNA-seq. Five samples were sequenced in replicate in each of the seven sequencing laboratories. The mRNA and small RNA reads were mapped with GEM[31] and miraligner[32], respectively, with an average of 48.9M mRNA-seq reads and 1.2M miRNA reads per sample after QC. Numerous transcript features were quantified using Gencode v12[33] and miRBase v18[34] annotations: protein-coding and lincRNA genes (16,084 detected in >50% of samples), transcripts (67,603; with FluxCapacitor[7]), exons (146,498), annotated splice junctions (129,805; analyzed in detail in Ferreira et al. submitted), transcribed repetitive elements (47,409), and mature miRNAs (715). Data quality was assessed by sample correlations and read and gene count distributions, and technical variation was removed by PEER normalization[35] for the QTL and miRNA-mRNA correlation analyses[11]. The samples clustered uniformly both before and after normalization. The genotype data was obtained from 1000 Genomes Phase 1 data set for 421 samples (80× average exome and 5× whole genome read depth), and the remaining 41 samples were imputed from Omni 2.5M SNP array data. Furthermore, we did functional reannotation for all the 1000 Genomes variants using Gencode v12. QTL mapping was done with linear regression, using genetic variants with >5% frequency in 1MB window and normalized quantifications transformed to standard normal. Permutations were used to adjust FDR to 5%. Full details are provided in Supplementary Methods.
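
The QTL mapping recipe in the Methods (quantifications transformed to standard normal, linear regression on genotype, permutations for significance) can be sketched for a single variant and a single trait as follows. This is a hedged illustration only: the rank-based transform is one common way to obtain standard normal values and is assumed here, the function names and toy data are invented, and the permutation step returns a per-test empirical p-value rather than the genome-wide FDR calibration described in the Supplementary Methods.

```python
import random
from statistics import NormalDist, mean


def inverse_normal_transform(values):
    """Map values to standard normal quantiles by rank (assumes no ties)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    transformed = [0.0] * n
    standard_normal = NormalDist()
    for rank, index in enumerate(order, start=1):
        transformed[index] = standard_normal.inv_cdf((rank - 0.5) / n)
    return transformed


def regression_slope(genotypes, phenotype):
    """Slope of the least-squares line phenotype ~ genotype dosage."""
    mg, mp = mean(genotypes), mean(phenotype)
    sgg = sum((g - mg) ** 2 for g in genotypes)
    sgp = sum((g - mg) * (p - mp) for g, p in zip(genotypes, phenotype))
    return sgp / sgg


def permutation_p_value(genotypes, phenotype, n_perm=1000, seed=0):
    """Empirical p-value: share of shuffles with a slope at least as large."""
    rng = random.Random(seed)
    observed = abs(regression_slope(genotypes, phenotype))
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(regression_slope(genotypes, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Invented data: 30 samples, dosage 0/1/2, expression rising with dosage.
genotypes = [0] * 10 + [1] * 10 + [2] * 10
raw_expression = [g + ((i * 7) % 10) / 10 for i, g in enumerate(genotypes)]
z = inverse_normal_transform(raw_expression)
slope = regression_slope(genotypes, z)
p_emp = permutation_p_value(genotypes, z)
print(round(slope, 3), p_emp)
```

Adding one to both the numerator and denominator of the empirical p-value keeps it strictly above zero, so the smallest reportable value is bounded by the number of permutations.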

34 in total

1. Estimation of alternative splicing variability in human populations.

Authors: Mar Gonzàlez-Porta; Miquel Calvo; Michael Sammeth; Roderic Guigó
Journal: Genome Res Date: 2011-11-23 Impact factor: 9.043

2. Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories.

Authors: Peter A C 't Hoen; Marc R Friedländer; Jonas Almlöf; Michael Sammeth; Irina Pulyakhina; Seyed Yahya Anvar; Jeroen F J Laros; Henk P J Buermans; Olof Karlberg; Mathias Brännvall; Johan T den Dunnen; Gert-Jan B van Ommen; Ivo G Gut; Roderic Guigó; Xavier Estivill; Ann-Christine Syvänen; Emmanouil T Dermitzakis; Tuuli Lappalainen
Journal: Nat Biotechnol Date: 2013-09-15 Impact factor: 54.908

Review 3. A rule for termination-codon position within intron-containing genes: when nonsense affects RNA abundance.

Authors: E Nagy; L E Maquat
Journal: Trends Biochem Sci Date: 1998-06 Impact factor: 13.807

4. GENCODE: the reference human genome annotation for The ENCODE Project.

Authors: Jennifer Harrow; Adam Frankish; Jose M Gonzalez; Electra Tapanari; Mark Diekhans; Felix Kokocinski; Bronwen L Aken; Daniel Barrell; Amonida Zadissa; Stephen Searle; If Barnes; Alexandra Bignell; Veronika Boychenko; Toby Hunt; Mike Kay; Gaurab Mukherjee; Jeena Rajan; Gloria Despacio-Reyes; Gary Saunders; Charles Steward; Rachel Harte; Michael Lin; Cédric Howald; Andrea Tanzer; Thomas Derrien; Jacqueline Chrast; Nathalie Walters; Suganthi Balasubramanian; Baikang Pei; Michael Tress; Jose Manuel Rodriguez; Iakes Ezkurdia; Jeltje van Baren; Michael Brent; David Haussler; Manolis Kellis; Alfonso Valencia; Alexandre Reymond; Mark Gerstein; Roderic Guigó; Tim J Hubbard
Journal: Genome Res Date: 2012-09 Impact factor: 9.043

5. Dissecting the regulatory architecture of gene expression QTLs.

Authors: Daniel J Gaffney; Jean-Baptiste Veyrieras; Jacob F Degner; Roger Pique-Regi; Athma A Pai; Gregory E Crawford; Matthew Stephens; Yoav Gilad; Jonathan K Pritchard
Journal: Genome Biol Date: 2012-01-31 Impact factor: 13.583

6. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls.

Authors:
Journal: Nature Date: 2007-06-07 Impact factor: 49.962

7. An integrated encyclopedia of DNA elements in the human genome.

Authors:
Journal: Nature Date: 2012-09-06 Impact factor: 49.962

8. Extent, causes, and consequences of small RNA expression variation in human adipose tissue.

Authors: Leopold Parts; Åsa K Hedman; Sarah Keildson; Andrew J Knights; Cei Abreu-Goodger; Martijn van de Bunt; José Afonso Guerra-Assunção; Nenad Bartonicek; Stijn van Dongen; Reedik Mägi; James Nisbet; Amy Barrett; Mattias Rantalainen; Alexandra C Nica; Michael A Quail; Kerrin S Small; Daniel Glass; Anton J Enright; John Winn; Panos Deloukas; Emmanouil T Dermitzakis; Mark I McCarthy; Timothy D Spector; Richard Durbin; Cecilia M Lindgren
Journal: PLoS Genet Date: 2012-05-10 Impact factor: 5.917

9. Mapping cis- and trans-regulatory effects across multiple tissues in twins.

Authors: Elin Grundberg; Kerrin S Small; Åsa K Hedman; Alexandra C Nica; Alfonso Buil; Sarah Keildson; Jordana T Bell; Tsun-Po Yang; Eshwar Meduri; Amy Barrett; James Nisbett; Magdalena Sekowska; Alicja Wilk; So-Youn Shin; Daniel Glass; Mary Travers; Josine L Min; Sue Ring; Karen Ho; Gudmar Thorleifsson; Augustine Kong; Unnur Thorsteindottir; Chrysanthi Ainali; Antigone S Dimas; Neelam Hassanali; Catherine Ingle; David Knowles; Maria Krestyaninova; Christopher E Lowe; Paola Di Meglio; Stephen B Montgomery; Leopold Parts; Simon Potter; Gabriela Surdulescu; Loukia Tsaprouni; Sophia Tsoka; Veronique Bataille; Richard Durbin; Frank O Nestle; Stephen O'Rahilly; Nicole Soranzo; Cecilia M Lindgren; Krina T Zondervan; Kourosh R Ahmadi; Eric E Schadt; Kari Stefansson; George Davey Smith; Mark I McCarthy; Panos Deloukas; Emmanouil T Dermitzakis; Tim D Spector
Journal: Nat Genet Date: 2012-09-02 Impact factor: 38.330

10. Landscape of transcription in human cells.

Authors: Sarah Djebali; Carrie A Davis; Angelika Merkel; Alex Dobin; Timo Lassmann; Ali Mortazavi; Andrea Tanzer; Julien Lagarde; Wei Lin; Felix Schlesinger; Chenghai Xue; Georgi K Marinov; Jainab Khatun; Brian A Williams; Chris Zaleski; Joel Rozowsky; Maik Röder; Felix Kokocinski; Rehab F Abdelhamid; Tyler Alioto; Igor Antoshechkin; Michael T Baer; Nadav S Bar; Philippe Batut; Kimberly Bell; Ian Bell; Sudipto Chakrabortty; Xian Chen; Jacqueline Chrast; Joao Curado; Thomas Derrien; Jorg Drenkow; Erica Dumais; Jacqueline Dumais; Radha Duttagupta; Emilie Falconnet; Meagan Fastuca; Kata Fejes-Toth; Pedro Ferreira; Sylvain Foissac; Melissa J Fullwood; Hui Gao; David Gonzalez; Assaf Gordon; Harsha Gunawardena; Cedric Howald; Sonali Jha; Rory Johnson; Philipp Kapranov; Brandon King; Colin Kingswood; Oscar J Luo; Eddie Park; Kimberly Persaud; Jonathan B Preall; Paolo Ribeca; Brian Risk; Daniel Robyr; Michael Sammeth; Lorian Schaffer; Lei-Hoon See; Atif Shahab; Jorgen Skancke; Ana Maria Suzuki; Hazuki Takahashi; Hagen Tilgner; Diane Trout; Nathalie Walters; Huaien Wang; John Wrobel; Yanbao Yu; Xiaoan Ruan; Yoshihide Hayashizaki; Jennifer Harrow; Mark Gerstein; Tim Hubbard; Alexandre Reymond; Stylianos E Antonarakis; Gregory Hannon; Morgan C Giddings; Yijun Ruan; Barbara Wold; Piero Carninci; Roderic Guigó; Thomas R Gingeras
Journal: Nature Date: 2012-09-06 Impact factor: 49.962
