Abhijit Chakraborty1, Ferhat Ay1,2, Ramana V Davuluri3. 1. La Jolla Institute for Immunology, La Jolla, CA, 92037, USA. 2. Department of Pediatrics, UC San Diego - School of Medicine, La Jolla, 92093, CA, USA. 3. Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, 11794, USA.
Abstract
MOTIVATION: Access to large-scale genomics and transcriptomics data from various tissues and cell lines allowed the discovery of wide-spread alternative splicing events and alternative promoter usage in mammalians. Between human and mouse, gene-level orthology is currently present for nearly 16k protein-coding genes spanning a diverse repertoire of over 200k total transcript isoforms. RESULTS: Here, we describe a novel method, ExTraMapper, which leverages sequence conservation between exons of a pair of organisms and identifies a fine-scale orthology mapping at the exon and then transcript level. ExTraMapper identifies more than 350k exon mappings, as well as 30k transcript mappings between human and mouse using only sequence and gene annotation information. We demonstrate that ExTraMapper identifies a larger number of exon and transcript mappings compared to previous methods. Further, it identifies exon fusions, splits, and losses due to splice site mutations, and finds mappings between microexons that are previously missed. By reanalysis of RNA-seq data from 13 matched human and mouse tissues, we show that ExTraMapper improves the correlation of transcript-specific expression levels suggesting a more accurate mapping of human and mouse transcripts. We also applied the method to detect conserved exon and transcript pairs between human and rhesus macaque genomes to highlight the point that ExTraMapper is applicable to any pair of organisms that have orthologous gene pairs. AVAILABILITY: The source code and the results are available at https://github.com/ay-lab/ExTraMapper and http://ay-lab-tools.lji.org/extramapper. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
MOTIVATION: Access to large-scale genomics and transcriptomics data from various tissues and cell lines allowed the discovery of wide-spread alternative splicing events and alternative promoter usage in mammalians. Between human and mouse, gene-level orthology is currently present for nearly 16k protein-coding genes spanning a diverse repertoire of over 200k total transcript isoforms. RESULTS: Here, we describe a novel method, ExTraMapper, which leverages sequence conservation between exons of a pair of organisms and identifies a fine-scale orthology mapping at the exon and then transcript level. ExTraMapper identifies more than 350k exon mappings, as well as 30k transcript mappings between human and mouse using only sequence and gene annotation information. We demonstrate that ExTraMapper identifies a larger number of exon and transcript mappings compared to previous methods. Further, it identifies exon fusions, splits, and losses due to splice site mutations, and finds mappings between microexons that are previously missed. By reanalysis of RNA-seq data from 13 matched human and mouse tissues, we show that ExTraMapper improves the correlation of transcript-specific expression levels suggesting a more accurate mapping of human and mouse transcripts. We also applied the method to detect conserved exon and transcript pairs between human and rhesus macaque genomes to highlight the point that ExTraMapper is applicable to any pair of organisms that have orthologous gene pairs. AVAILABILITY: The source code and the results are available at https://github.com/ay-lab/ExTraMapper and http://ay-lab-tools.lji.org/extramapper. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Authors: Manuel Irimia; Robert J Weatheritt; Jonathan D Ellis; Neelroop N Parikshak; Thomas Gonatopoulos-Pournatzis; Mariana Babor; Mathieu Quesnel-Vallières; Javier Tapial; Bushra Raj; Dave O'Hanlon; Miriam Barrios-Rodiles; Michael J E Sternberg; Sabine P Cordes; Frederick P Roth; Jeffrey L Wrana; Daniel H Geschwind; Benjamin J Blencowe Journal: Cell Date: 2014-12-18 Impact factor: 41.582
Authors: Shin Lin; Yiing Lin; Joseph R Nery; Mark A Urich; Alessandra Breschi; Carrie A Davis; Alexander Dobin; Christopher Zaleski; Michael A Beer; William C Chapman; Thomas R Gingeras; Joseph R Ecker; Michael P Snyder Journal: Proc Natl Acad Sci U S A Date: 2014-11-20 Impact factor: 11.205
Authors: Paul Flicek; M Ridwan Amode; Daniel Barrell; Kathryn Beal; Konstantinos Billis; Simon Brent; Denise Carvalho-Silva; Peter Clapham; Guy Coates; Stephen Fitzgerald; Laurent Gil; Carlos García Girón; Leo Gordon; Thibaut Hourlier; Sarah Hunt; Nathan Johnson; Thomas Juettemann; Andreas K Kähäri; Stephen Keenan; Eugene Kulesha; Fergal J Martin; Thomas Maurel; William M McLaren; Daniel N Murphy; Rishi Nag; Bert Overduin; Miguel Pignatelli; Bethan Pritchard; Emily Pritchard; Harpreet S Riat; Magali Ruffier; Daniel Sheppard; Kieron Taylor; Anja Thormann; Stephen J Trevanion; Alessandro Vullo; Steven P Wilder; Mark Wilson; Amonida Zadissa; Bronwen L Aken; Ewan Birney; Fiona Cunningham; Jennifer Harrow; Javier Herrero; Tim J P Hubbard; Rhoda Kinsella; Matthieu Muffato; Anne Parker; Giulietta Spudich; Andy Yates; Daniel R Zerbino; Stephen M J Searle Journal: Nucleic Acids Res Date: 2013-12-06 Impact factor: 16.971