Literature DB >> 27098042

g:Profiler-a web server for functional interpretation of gene lists (2016 update).

Jüri Reimand¹, Tambet Arak², Priit Adler², Liis Kolberg², Sulev Reisberg², Hedi Peterson², Jaak Vilo³.

Abstract

Functional enrichment analysis is a key step in interpreting gene lists discovered in diverse high-throughput experiments. g:Profiler studies flat and ranked gene lists and finds statistically significant Gene Ontology terms, pathways and other gene function related terms. Translation of hundreds of gene identifiers is another core feature of g:Profiler. Since its first publication in 2007, our web server has become a popular tool of choice among basic and translational researchers. Timeliness is a major advantage of g:Profiler as genome and pathway information is synchronized with the Ensembl database in quarterly updates. g:Profiler supports 213 species including mammals and other vertebrates, plants, insects and fungi. The 2016 update of g:Profiler introduces several novel features. We have added further functional datasets to interpret gene lists, including transcription factor binding site predictions, Mendelian disease annotations, information about protein expression and complexes and gene mappings of human genetic polymorphisms. Besides the interactive web interface, g:Profiler can be accessed in computational pipelines using our R package, Python interface and BioJS component. g:Profiler is freely available at http://biit.cs.ut.ee/gprofiler/.

Entities: Chemical Disease Gene Mutation Species

Mesh：

Substances：
Transcription Factors

Year: 2016 PMID： 27098042 PMCID： PMC4987867 DOI： 10.1093/nar/gkw199

Source DB: PubMed Journal: Nucleic Acids Res ISSN： 0305-1048 Impact factor: 16.971

INTRODUCTION

Next-generation sequencing and other high-throughput technologies have revolutionized the characterization of life at molecular resolution. While collection of omics data has become dramatically cheaper and more accessible over the past decades, its interpretation remains a significant challenge. Functional enrichment analysis is a common technique to interpret gene lists. It takes advantage of previous knowledge of gene function and uses a battery of statistical techniques to determine biological processes and pathways characteristic of the genes of interest. Information about biological processes, molecular functions, and cell components and phenotypes is organized into structured vocabularies such as Gene Ontology (GO) (1) and Human Phenotype Ontology (HPO) (2). Databases such as Reactome (3) and KEGG (4) maintain well-curated collections of known molecular pathways. Other functional annotations including protein complexes (5), transcription factor (TF) binding sites (6), microRNA target sites (7) and disease associations (8) can be also used to interpret gene lists. We call all these potential annotations commonly as features or terms that help to interpret the shared properties of the genes in the input lists. Functional enrichment analysis is a common component of every omics analysis and such resources are in demand in the research community. Many tools of variable qualities are available. While data are frequently updated in some tools such as g:Profiler (9), GOstats (10) and Babelomics (11), many popular tools like DAVID (12) and Bingo (13) have not been updated in years. Tools such as Panther (14), FuncAssociate (15) and GOrilla (16) aim to support the analysis of many species, while others such as WebGestalt (17) focus on the convenient mapping of diverse gene identifiers. Babelomics provides functional enrichment analysis as part of a larger platform (11). While the majority of available tools are web services, functional enrichment analysis can be performed using Java applications (18), R packages (10) and Cytoscape plugins (13). Thus users have many alternatives to interpret their gene lists with functional information. With the ten-year continuous development of g:Profiler, we aim to address the needs of diverse research communities. Our web server provides access to a toolbox of statistical techniques, intuitive interactive analyses, numerous species and a multitude of options.

g:PROFILER WEB SERVER

The g:Profiler web server (http://biit.cs.ut.ee/gprofiler/) comprises several tools to perform functional enrichment analysis and mine additional information. These tools analyse flat or ranked gene lists for enriched features (g:GOSt; g:Cocoa), convert gene identifiers of different classes (g:Convert), map genes to orthologous genes in related species (g:Orth) and find similarly expressed genes from public microarray datasets (g:Sorter). An additional tool g:SNPense maps human single nucleotide polymorphisms (SNP) to gene names, chromosomal locations and variant consequence terms from Sequence Ontology (19,20). g:Profiler regularly synchronises with the Ensembl database for gene annotations and identifiers. It supports all species whose genomes are available in Ensembl and Ensembl Genomes (19,21) except for bacterial, archaeal and protist genomes. GO ontologies and some gene annotations are downloaded from the GO website (1). Other functional resources are updated regularly from corresponding databases. The latest versions and dates of each update are documented on our main page.

g:GOSt—functional enrichment analysis

g:GOSt performs pathway enrichment analysis and is the central tool of the g:Profiler web server. It maps a user-provided gene list to various sources of functional information and determines significantly enriched pathways, processes and other annotations. The GO (1,22) is richest of supported ontologies and is available for many species. We also use molecular pathways from the KEGG (4) and Reactome (3) databases, target sites of miRNAs from the miRBase (7) database, and predicted target sites of TFs using the TRANSFAC resource (6). Information about protein complexes and protein–protein interaction networks from the CORUM database (5) and BioGRID (23) is also used to interpret gene lists. In this update we have included protein expression data from the Human Protein Atlas (HPA) (24). Gene annotations of physiological and disease phenotypes from the HPO (2) and the Online Mendelian Inheritance in Man (OMIM) resource (8) allow users to interpret their gene lists in the context of human health. g:GOSt supports the majority of gene identifiers used by the basic and biomedical research community. This includes all identifiers that have been linked to genes in the Ensembl database (19), including genes, proteins, transcripts, accession numbers in genome databases, probesets of experimental platforms, etc. For example, g:GOSt recognises 116 types of identifiers of human genes that can be presented as input as a mixed list of genes. This flexible feature allows the user to easily navigate the jungle of numerous omics platforms and gene databases. The gene query can be also presented as a list of chromosomal coordinates. For each chromosomal region we extract all genes that are at least partially located in the given region. Analysis of genes in chromosomal regions is a useful feature for analysing GWAS and epigenomics data. g:GOSt allows researchers to analyse flat and ranked gene lists. Ranked list analysis is more powerful and is recommended in the majority of cases. In the case of ranked gene lists, first genes in the input list are more important than the following genes (e.g. have a stronger signal in the underlying experiment). g:GOSt then computes a minimum hypergeometric statistic for every term. This technique starts from the top-ranked genes in the list and determines the subset where the enrichment is the strongest. This method provides more resolution to pathway enrichment analysis as it detects both small and highly significant pathways among top-ranked genes as well as broader terms representative of the entire gene list. We enable and encourage users to provide a custom background for their query when necessary. This statistical technique is essential when the number of genes studied in the specific case is a considerably small subset of all known genes in the genome of the studied gene. For example, certain experimental platforms such as ProtoArrays only cover 1/3 of all human genes and thus the remaining genes are not part of analysis by design. Providing this fraction of genes of as statistical background provides a more accurate estimate of functional enrichment and reduces the bias towards over-interpretation. g:GOSt applies the widely applied hypergeometric distribution to estimate the significance of enriched pathways and processes in gene lists. Each default analysis of human gene lists considers more than 30 000 gene sets corresponding to a large variety of features. Thus a multiple testing correction is required to reduce false positive findings. With the first release of g:Profiler in 2007 we introduced a ontology-focused multiple testing correction method g:SCS (9). We showed that the most common multiple testing correction methods incorrectly estimate the expected number of false positive results in the enrichment analysis: the Benjamini–Hochberg False Discovery Rate tends to find more false positives while the Bonferroni correction is overly conservative. The g:SCS method is used by default in g:Profiler and users can choose to use the other two correction methods. g:GOSt allows users to filter resources used to interpret gene lists. For example, one may choose to only use biological processes of GO and Reactome pathways and filter out other databases and ontologies. Similarly, one may focus on relatively small biological processes (more than five genes and less than five hundred) and discard other gene sets prior to analysis. Such filtering speeds up calculations, improves statistical power and reduces the effect of multiple testing, as well as provides easier interpretation. We recommend users to consider data resources beforehand and select the most interesting ones for their particular analysis. The main output of g:GOSt comprises a visual matrix of functional annotations of genes. Each gene in the input list is highlighted with a coloured square if it belongs to the respective enriched term. Colours represent different evidence codes for GO as well as gene annotations to other functional resources. Several metrics are also reported for each of the enriched results, including the size of the gene set in question, overlap with the input gene list and the statistical significance (P-value). Individual results are grouped by their hierarchy relative to other results, or alternatively ordered by P-value. Additional details about the query are given below the visualization, along with statistical background sizes, the input gene list and involved protein interaction networks. The results are either provided in graphical format (PNG, PDF), text file or Excel spreadsheet. We also look for enriched modules in BioGRID protein–protein interaction (PPI) network (23). Input genes that have at least one common interaction partner in the PPI network are visualized with all their partners. We use the Cytoscape.js JavaScript library (25) to visualize these networks and provide all interaction data as text. g:GOSt results can be easily integrated with the Enrichment Map (26) method that provides network visualization of functional enrichment analysis. Enrichment Map is a useful method for simplifying complex results with many redundant processes and gene functions. g:GOSt provides a special output format (generic enrichment map) that can be directly uploaded into Cytoscape for visual network analysis.

g:Cocoa—simultaneous enrichment analysis of multiple gene lists

g:Cocoa provides means to analyse several gene lists at the same time and compare their characteristic enriched terms. This is useful in scenarios where an experimental design involves many comparisons of samples or individuals, or when one wants to compare directly different clusters of genes arising from the analysis. Each gene list is analysed for functional enrichments similarly to g:GOSt and resulting terms are aligned vertically into a matrix highlighting strongest findings for every gene list.

g:Convert—automatic conversion of gene identifiers

g:Convert provides a convenient service to translate identifiers (IDs) of genes, proteins, microarray probesets and many other types of namespaces. The seamless translation process works on a mixed set of diverse identifiers and maps these through Ensembl gene identifiers (ENSG) as reference. In cases of multiple identifiers, all relevant combinations are highlighted. At least 13 types of IDs are supported for all of the 213 species available in g:Profiler, and at least 40 types of IDs for more than 50 species.

g:Orth—mapping related genes across species

g:Orth allows the user to map a list of genes of interest to homologous genes in another related organism. Many experiments are conducted in model organisms and knowledge from such experiments is transferred to other organisms to compare or complement previous findings. g:Orth uses g:Convert to map gene IDs to Ensembl ENSG identifiers. Further mapping to orthologous genes in other organisms is also based on Ensembl data (19,21). We provide cross-references between all organisms in g:Profiler. Queries are limited to tiers according to classes of species (animals, plants, fungi).

g:Sorter—finding similar genes in transcriptomics data

g:Sorter is a tool for finding lists of co-expressed genes from public transcriptomics datasets. Thousands of microarray and RNA-seq experiments have been conducted in the past decades. The majority of published studies have been accumulated in databases like ArrayExpress (27) and Gene Expression Omnibus (28). We have downloaded 7878 datasets for 18 species from ArrayExpress and provide gene co-expression similarity searches using six most common statistical methods. The datasets can be searched rapidly with keywords. The input of g:Sorter is a single gene and a dataset of interest and the result is a sorted list of genes that are similarly expressed with the gene of interest. These lists can be integrated into functional enrichment analysis. For comprehensive global gene expression similarity queries, as well as support for more species and platforms we suggest to use Multi Experiment Matrix (MEM) tool (29).

gProfileR package in R for automated analyses

The g:Profiler web server can be accessed in GNU R using the dedicated R package gProfileR available in CRAN. The R is a core asset of the bioinformatics community with hundreds of resources and analysis packages available. We provide the R package to enable integration of our tools to diverse automated pipelines. The package accesses our web server via the internet and covers the functionality of g:GOSt, g:Cocoa, g:Convert and g:Orth.

NEW DEVELOPMENTS IN G:PROFILER IN 2016

Since our previous publication in 2011 (30), we have added several new resources for interpreting gene lists and implemented new technologies. With our data update and archiving policy, we aim to maximize reproducibility and timeliness of research.

Mapping of ambiguous gene identifiers

Gene identifier mapping is a complex problem as the community continuously replaces earlier identifiers by newer ones and multiple aliases are the rule rather than an exception. This creates ambiguities in gene list interpretation and may cause genes to be excluded. To remedy this situation, we now provide semi-manual mapping of gene identifiers in addition to our automated annotation pipeline. We determine input genes that cannot be mapped to single ENSG identifiers and provide these to the user as an optional form where correct identifiers can be selected manually or excluded from the analysis. This approach guarantees that important genes are always included in the enrichment analysis.

Transcription factor binding site predictions with TRANSFAC

TF binding in regulatory DNA determines regulation of gene expression. Thus information about TF binding sites (TFBS) can be used to interpret gene lists and enrichment of TFBS in gene promoters may indicate common regulation and biological function. We have updated our binding site predictions in gene promoters by systematically mapping TF binding motifs to regulatory DNA in multiple species including human, mouse, chicken, fly and yeast. The promoter sizes depend on the species and are depicted in Figure 1.

Figure 1.

We predicted regulatory motifs from the TRANSFAC database for nine species shown on the x-axis. We used 6 kb promoter regions (±3 kb) upstream the transcription start sites for vertebrates, 2 kb promoters for fly and worm and 1000 bp promoters for yeast. TFBS matches per promoter are given with boxplots on the y-axis where the mean number of sites per promoter per TF is depicted with a black diamond. We have updated TFBS data in g:Profiler and changed our definitions of potential regulation events. We used regulatory motifs in the TRANSFAC database version 2015.3 to make computational predictions of binding sites in gene promoters. We use the TRANSFAC internal threshold for limiting false positive matches (minFP) of TFBS. On average, each promoter has between 1.4 and 2.5 TFBS on average for included species. Thus we introduce a two-step hierarchy of terms where the upper more lenient category covers all the genes that have at least one match of the given TFBS in their promoter, while the second more stringent category covers promoters where the motif needs to be present at least twice. The second category with stronger binding sites suggests a stronger regulatory relation.

Enriched protein expression patterns from HPA

The HPA is a compendium of protein expression in 44 normal human tissues measured by immunohistochemistry (24). Protein expression levels are categorized into four groups (not detected, low, medium and high expression) with two evidence terms of presence (uncertain, supportive). To allow interpretation of gene lists using this information, we have mapped gene sets corresponding to protein expression signatures into a hierarchy of terms that reflects their tissue-specific level of expression. The most stringent terms include only highly expressed proteins per tissue, while lenient terms include highly as well as lowly expressed proteins. The HPA resource provides 713 tissue-specific groups of genes corresponding to 15 000 genes.

Enrichment of Mendelian disorders

OMIM is a collection of human genes and their relationships with Mendelian disorders and other genetic phenotypes (8). Although the majority of OMIM descriptions are included already into g:Profiler via the HPO, we have also directly added more than 4500 OMIM annotations to 3500 genes and provide methods to search for over-represented disorders. Disorders have been organized hierarchically to parental terms using information on genetic heterogeneity in OMIM data records.

Genomic and functional data for 213 species

The 2016 version of g:Profiler supports the analysis of data from 213 different organisms from Ensembl (19) and Ensembl Genomes (21). g:Profiler covers 67 vertebrate, 38 plant and 52 fungi species among others, nearly doubling from 126 in our previous update in 2011 (30) (Figure 2). This makes g:Profiler the most species-rich functional enrichment analysis tool serving different research communities in life sciences.

Figure 2.

Total number of annotations terms per eight most common organisms in g:GOSt. Each colour represents different functional resource and the largest one (GO) is depicted with orange. Supplementary Figure S1 shows a similar overview of all 213 organisms.

g:SNPense—SNP identifier mapping

With the rapid growth of whole genome sequencing technology, researchers are uncovering extensive genetic variation and large collections of known SNP are available for human and other species. In order to easily map SNP identifiers (e.g. rs4244285) to gene names, chromosomal coordinates and retrieve their functional consequences we now provide a new service called g:SNPense. Information about genome variants is retrieved from dbSNP (31) and mapped to NCBI Genes (32). Potential functional consequences of the variants are retrieved from Ensembl Variation data (19) and grouped into 35 Sequence Ontology terms of various severity (20). g:SNPense is a potential entry point to g:GOSt and its functional annotation pipeline and enrichment analysis.

Programmable access to g:Profiler

The research community increasingly requires automatic and programmable access to web tools as basic and biomedical science is becoming increasingly data intensive. In addition to the CRAN-supported R package gProfileR that we have been providing for years, we are now expanding the programmable access capabilities to new technologies. We provide an application program interface (API) with Python that can be included into user-friendly bioinformatics analysis software such as Chipster (33) and Galaxy (34), allowing users to set up their own custom analytical pipelines for large-scale analysis. We already provide the g:Profiler utility as part of the Galaxy ToolShed (35).

BioJS component for visualizing g:Profiler results as word clouds

BioJS is an open source bioinformatics project comprising a library of JavaScript components for visualising biological data in web applications (36). We have developed a BioJS component for g:Profiler (biojs-vis-gprofiler) that performs g:GOSt analysis and represents the most significant keywords as word clouds. Clicking on keywords reveals associated biological processes and enrichment statistics. This simplified solution can be used in web applications and pipelines such as our MEM tool (29) where a comprehensive visual representation is not required. Tag clouds provide an easily interpretable visualization of most common terms highlighted with different colours and font sizes (Figure 3).

Figure 3.

BioJS output from the MEM tool depicting processes and functions related to embryonic stem cells.

Data maintenance policy for reproducibility

Since the publication in 2011 (30) all 13 previous releases of g:Profiler have been saved on our web server and are accessible on the web site through the dedicated link to Archive. This allows users to continue or verify their analysis on the same set of data that was available at the time of the original analysis. With this policy, we aim to increase transparency and integrity of bioinformatics data analysis. Since December 2014, g:Profiler has been updated on a quarterly basis following each Ensembl release (Figure 4). This assures that central gene identifier indexes and GO annotations are never older than 6 months. This renders g:Profiler one of the most up to date functional enrichment tool available today. While resources such as Reactome, KEGG, HPO, OMIM and others follow their own release schedules and add new genes to their databases, we check for updates in these resources when we conduct the main cycles with Ensembl. Thus some more static resources are updated less frequently (e.g. regulatory motifs of microRNAs and TFs). In addition to the stable g:Profiler version, we also provide public access to a development server called g:Profiler Beta for power users who benefit from the latest developments and newest data sources.

Figure 4.

Since 2011 the underlying human datasets in g:Profiler have been steadily growing. The majority of the functional resources have grown as depicted over the 13 latest g:Profiler versions. The number of mapped human genes has also grown by 20% from 26 000 to 31 000. As an exception, the Reactome dataset has decreased as we filtered out gene sets corresponding to reactions starting from g:Profiler version R1353.

DISCUSSION

With the continuous development of g:Profiler, combining further resources and releasing programmable access points (web, R, Python API, BioJS), we aim to provide a state-of-the-art functional profiling and identifier mapping service. Our tool combines up-to-date functional and genomic data with sophisticated algorithms and serves the community through an intuitive and freely accessible website with efficient visualization techniques. With this update we introduce more datasets for functional interpretation of gene lists. These include physiological and disease-related gene sets from OMIM, tissue-specific protein resources from the HPA and a new release of gene regulatory predictions using the TRANSFAC resource. We also have a new SNP identifier translation service g:SNPense. Since the last publication in 2011 we have more than doubled the number of supported species to 213 species. This is the largest number of species supported by any of the publicly available functional enrichment tools. In this update we focus on programmable ways to access our service. We have developed the Python API to g:Profiler that provides means to include our analysis to bioinformatics pipelines using solutions like Galaxy or Chipster (33,34). We have increased the number of output formats on our web portal for easier downstream analysis. We have also developed the g:Profiler BioJS JavaScript component that can be incorporated into independent websites. We consider data timeliness our highest priority. Novel annotation terms and findings about gene functions appear daily and therefore it is important to promptly consider this information for functional annotation. Annotating gene lists with data from five years ago, like when using DAVID (12), provides different conclusions than current data would. This is especially important in fields of research where technological improvements have only recently allowed high-throughput analysis (e.g. single cell analysis, embryonic stem cells, precision medicine). The g:Profiler service has recently proven to be useful to a broad user community studying anything from insects to wolves and plants (37–39). The most frequent use cases of g:Profiler probably relate to cancer genomics (40–44), stem cell research (45,46) and ageing (47,48). g:Profiler is a recommended tool for interpreting cancer genomes with pathway information (49). Several bioinformatics tools have incorporated the functionality of g:Profiler through dedicated APIs. For example, the global gene expression similarity analysis tool MEM (29) uses g:Convert for identifier mapping and the BioJS library for summarising enrichment results. Our online multivariate data clustering and visualization tool ClustVis (50) uses data and name mapping services from g:Profiler. A similar approach to GO-based word clouds is also used in the R package GOsummaries (51) that combines enrichment analysis of gene lists from g:GOSt with principal component analysis of gene expression data. Future developments of g:Profiler will focus on research of precision medicine and also on supporting as many species as possible. Advances in whole genome sequencing technology create requirements for novel tools that analyse genome variation for functional enrichments, their relation to drugs and diseases, protein domains or phosphorylation sites (52,53). As the enrichment analysis and identifier mapping services are highly needed for many species, our goal is to support the research of common and uncommon model organisms.

52 in total

1. Ontologizer 2.0--a multifunctional tool for GO term enrichment analysis and data exploration.

Authors: Sebastian Bauer; Steffen Grossmann; Martin Vingron; Peter N Robinson
Journal: Bioinformatics Date: 2008-05-29 Impact factor: 6.937

2. Proteomics. Tissue-based map of the human proteome.

Authors: Mathias Uhlén; Linn Fagerberg; Björn M Hallström; Cecilia Lindskog; Per Oksvold; Adil Mardinoglu; Åsa Sivertsson; Caroline Kampf; Evelina Sjöstedt; Anna Asplund; IngMarie Olsson; Karolina Edlund; Emma Lundberg; Sanjay Navani; Cristina Al-Khalili Szigyarto; Jacob Odeberg; Dijana Djureinovic; Jenny Ottosson Takanen; Sophia Hober; Tove Alm; Per-Henrik Edqvist; Holger Berling; Hanna Tegel; Jan Mulder; Johan Rockberg; Peter Nilsson; Jochen M Schwenk; Marica Hamsten; Kalle von Feilitzen; Mattias Forsberg; Lukas Persson; Fredric Johansson; Martin Zwahlen; Gunnar von Heijne; Jens Nielsen; Fredrik Pontén
Journal: Science Date: 2015-01-23 Impact factor: 47.728

3. Enrichment map: a network-based method for gene-set enrichment visualization and interpretation.

Authors: Daniele Merico; Ruth Isserlin; Oliver Stueker; Andrew Emili; Gary D Bader
Journal: PLoS One Date: 2010-11-15 Impact factor: 3.240

4. Human embryonic stem cell-derived test systems for developmental neurotoxicity: a transcriptomics approach.

Authors: Anne K Krug; Raivo Kolde; John A Gaspar; Eugen Rempel; Nina V Balmer; Kesavan Meganathan; Kinga Vojnits; Mathurin Baquié; Tanja Waldmann; Roberto Ensenat-Waser; Smita Jagtap; Richard M Evans; Stephanie Julien; Hedi Peterson; Dimitra Zagoura; Suzanne Kadereit; Daniel Gerhard; Isaia Sotiriadou; Michael Heke; Karthick Natarajan; Margit Henry; Johannes Winkler; Rosemarie Marchan; Luc Stoppini; Sieto Bosgra; Joost Westerhout; Miriam Verwei; Jaak Vilo; Andreas Kortenkamp; Jürgen Hescheler; Ludwig Hothorn; Susanne Bremer; Christoph van Thriel; Karl-Heinz Krause; Jan G Hengstler; Jörg Rahnenführer; Marcel Leist; Agapios Sachinidis
Journal: Arch Toxicol Date: 2012-11-21 Impact factor: 5.153

5. ArrayExpress update--simplifying data submissions.

Authors: Nikolay Kolesnikov; Emma Hastings; Maria Keays; Olga Melnichuk; Y Amy Tang; Eleanor Williams; Miroslaw Dylag; Natalja Kurbatova; Marco Brandizi; Tony Burdett; Karyn Megy; Ekaterina Pilicheva; Gabriella Rustici; Andrew Tikhonov; Helen Parkinson; Robert Petryszak; Ugis Sarkans; Alvis Brazma
Journal: Nucleic Acids Res Date: 2014-10-31 Impact factor: 16.971

6. Babelomics 5.0: functional interpretation for new generations of genomic data.

Authors: Roberto Alonso; Francisco Salavert; Francisco Garcia-Garcia; Jose Carbonell-Caballero; Marta Bleda; Luz Garcia-Alonso; Alba Sanchis-Juan; Daniel Perez-Gil; Pablo Marin-Garcia; Ruben Sanchez; Cankut Cubuk; Marta R Hidalgo; Alicia Amadoz; Rosa D Hernansaiz-Ballesteros; Alejandro Alemán; Joaquin Tarraga; David Montaner; Ignacio Medina; Joaquin Dopazo
Journal: Nucleic Acids Res Date: 2015-04-20 Impact factor: 16.971

7. Ensembl 2016.

Authors: Andrew Yates; Wasiu Akanni; M Ridwan Amode; Daniel Barrell; Konstantinos Billis; Denise Carvalho-Silva; Carla Cummins; Peter Clapham; Stephen Fitzgerald; Laurent Gil; Carlos García Girón; Leo Gordon; Thibaut Hourlier; Sarah E Hunt; Sophie H Janacek; Nathan Johnson; Thomas Juettemann; Stephen Keenan; Ilias Lavidas; Fergal J Martin; Thomas Maurel; William McLaren; Daniel N Murphy; Rishi Nag; Michael Nuhn; Anne Parker; Mateus Patricio; Miguel Pignatelli; Matthew Rahtz; Harpreet Singh Riat; Daniel Sheppard; Kieron Taylor; Anja Thormann; Alessandro Vullo; Steven P Wilder; Amonida Zadissa; Ewan Birney; Jennifer Harrow; Matthieu Muffato; Emily Perry; Magali Ruffier; Giulietta Spudich; Stephen J Trevanion; Fiona Cunningham; Bronwen L Aken; Daniel R Zerbino; Paul Flicek
Journal: Nucleic Acids Res Date: 2015-12-19 Impact factor: 16.971

8. Systematic analysis of somatic mutations in phosphorylation signaling predicts novel cancer drivers.

Authors: Jüri Reimand; Gary D Bader
Journal: Mol Syst Biol Date: 2013 Impact factor: 11.429

9. WEB-based GEne SeT AnaLysis Toolkit (WebGestalt): update 2013.

Authors: Jing Wang; Dexter Duncan; Zhiao Shi; Bing Zhang
Journal: Nucleic Acids Res Date: 2013-05-23 Impact factor: 16.971

10. Ensembl Genomes 2016: more genomes, more complexity.

Authors: Paul Julian Kersey; James E Allen; Irina Armean; Sanjay Boddu; Bruce J Bolt; Denise Carvalho-Silva; Mikkel Christensen; Paul Davis; Lee J Falin; Christoph Grabmueller; Jay Humphrey; Arnaud Kerhornou; Julia Khobova; Naveen K Aranganathan; Nicholas Langridge; Ernesto Lowy; Mark D McDowall; Uma Maheswari; Michael Nuhn; Chuang Kee Ong; Bert Overduin; Michael Paulini; Helder Pedro; Emily Perry; Giulietta Spudich; Electra Tapanari; Brandon Walts; Gareth Williams; Marcela Tello-Ruiz; Joshua Stein; Sharon Wei; Doreen Ware; Daniel M Bolser; Kevin L Howe; Eugene Kulesha; Daniel Lawson; Gareth Maslen; Daniel M Staines
Journal: Nucleic Acids Res Date: 2015-11-17 Impact factor: 16.971

511 in total

1. LUZP1, a novel regulator of primary cilia and the actin cytoskeleton, is a contributing factor in Townes-Brocks Syndrome.

Authors: Laura Bozal-Basterra; María Gonzalez-Santamarta; Veronica Muratore; Aitor Bermejo-Arteagabeitia; Carolina Da Fonseca; Orhi Barroso-Gomila; Mikel Azkargorta; Ibon Iloro; Olatz Pampliega; Ricardo Andrade; Natalia Martín-Martín; Tess C Branon; Alice Y Ting; Jose A Rodríguez; Arkaitz Carracedo; Felix Elortza; James D Sutherland; Rosa Barrio
Journal: Elife Date: 2020-06-18 Impact factor: 8.140

2. An integrative systems-based analysis of substance use: eQTL-informed gene-based tests, gene networks, and biological mechanisms.

Authors: Zachary F Gerring; Angela Mina Vargas; Eric R Gamazon; Eske M Derks
Journal: Am J Med Genet B Neuropsychiatr Genet Date: 2020-12-23 Impact factor: 3.568

3. Prediction of condition-specific regulatory genes using machine learning.

Authors: Qi Song; Jiyoung Lee; Shamima Akter; Matthew Rogers; Ruth Grene; Song Li
Journal: Nucleic Acids Res Date: 2020-06-19 Impact factor: 16.971

4. The Legionella Effector Kinase LegK7 Hijacks the Host Hippo Pathway to Promote Infection.

Authors: Pei-Chung Lee; Matthias P Machner
Journal: Cell Host Microbe Date: 2018-09-12 Impact factor: 21.023

5. Transcriptomic studies provide insights into the tumor suppressive role of miR-146a-5p in non-small cell lung cancer (NSCLC) cells.

Authors: Joseph R Iacona; Nicholas J Monteleone; Alexander D Lemenze; Ashley L Cornett; Carol S Lutz
Journal: RNA Biol Date: 2019-08-23 Impact factor: 4.652

6. Engineering Controlled Peritumoral Inflammation to Constrain Brain Tumor Growth.

Authors: Tarun Saxena; Johnathan G Lyon; S Balakrishna Pai; Daniel Pare; Jessica Amero; Lohitash Karumbaiah; Sheridan L Carroll; Eric Gaupp; Ravi V Bellamkonda
Journal: Adv Healthc Mater Date: 2018-12-11 Impact factor: 9.933

7. Piwi/PRG-1 Argonaute and TGF-β Mediate Transgenerational Learned Pathogenic Avoidance.

Authors: Rebecca S Moore; Rachel Kaletsky; Coleen T Murphy
Journal: Cell Date: 2019-06-06 Impact factor: 41.582

8. Sex differences in gene expression in response to ischemia in the human left ventricular myocardium.

Authors: Gregory Stone; Ashley Choi; Oliva Meritxell; Joshua Gorham; Mahyar Heydarpour; Christine E Seidman; Jon G Seidman; Sary F Aranki; Simon C Body; Vincent J Carey; Benjamin A Raby; Barbara E Stranger; Jochen D Muehlschlegel
Journal: Hum Mol Genet Date: 2019-05-15 Impact factor: 6.150

9. Auxin Contributes to the Intraorgan Regulation of Gene Expression in Response to Shade.

Authors: Sujung Kim; Nobuyoshi Mochizuki; Ayumi Deguchi; Atsushi J Nagano; Tomomi Suzuki; Akira Nagatani
Journal: Plant Physiol Date: 2018-05-04 Impact factor: 8.340

10. Proteogenomic Landscape of Breast Cancer Tumorigenesis and Targeted Therapy.

Authors: Karsten Krug; Eric J Jaehnig; Shankha Satpathy; Lili Blumenberg; Alla Karpova; Meenakshi Anurag; George Miles; Philipp Mertins; Yifat Geffen; Lauren C Tang; David I Heiman; Song Cao; Yosef E Maruvka; Jonathan T Lei; Chen Huang; Ramani B Kothadia; Antonio Colaprico; Chet Birger; Jarey Wang; Yongchao Dou; Bo Wen; Zhiao Shi; Yuxing Liao; Maciej Wiznerowicz; Matthew A Wyczalkowski; Xi Steven Chen; Jacob J Kennedy; Amanda G Paulovich; Mathangi Thiagarajan; Christopher R Kinsinger; Tara Hiltke; Emily S Boja; Mehdi Mesri; Ana I Robles; Henry Rodriguez; Thomas F Westbrook; Li Ding; Gad Getz; Karl R Clauser; David Fenyö; Kelly V Ruggles; Bing Zhang; D R Mani; Steven A Carr; Matthew J Ellis; Michael A Gillette
Journal: Cell Date: 2020-11-18 Impact factor: 41.582