
ASTALAVISTA: dynamic and flexible analysis of alternative splicing events in custom gene datasets.

Sylvain Foissac, Michael Sammeth

Abstract

In the process of establishing more and more complete annotations of eukaryotic genomes, a constantly growing number of alternative splicing (AS) events has been reported over the last decade. Consequently, the increasing transcript coverage also revealed the real complexity of some variations in the exon-intron structure between transcript variants and the need for computational tools to address 'complex' AS events. ASTALAVISTA (alternative splicing transcriptional landscape visualization tool) employs an intuitive and complete notation system to univocally identify such events. The method extracts AS events dynamically from custom gene annotations, classifies them into groups of common types and visualizes a comprehensive picture of the resulting AS landscape. Thus, ASTALAVISTA can characterize AS for whole transcriptome data from reference annotations (GENCODE, REFSEQ, ENSEMBL) as well as for genes selected by the user according to common functional/structural attributes of interest: http://genome.imim.es/astalavista.

Year:  2007        PMID: 17485470      PMCID: PMC1933205          DOI: 10.1093/nar/gkm311

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Alternative splicing (AS) is a fundamental cellular process involved in eukaryotic gene expression (1–3). To decipher the molecular mechanisms responsible for AS, several computational studies have been presented in recent years, producing a considerable number of dedicated analyses and databases that represent the transcript diversity resulting from AS events (4,5). Beyond this global transcript diversity, it is also possible to identify local variations in the exon–intron structure among transcripts. For investigating the molecular mechanisms that give rise to AS, separating events into different classes has shown promise. Historically, four main types of AS event have been reported in the literature: exon skipping, intron retention, and alternative donor and acceptor splice sites. The frequent observation of these ‘simple’ events is related to the fact that they involve the minimum number of variable splice sites (i.e. two). However, current annotation datasets show a plethora of more complex variations that can be seen as combinations of simple events overlapping each other, indicating connections in regulation and function. Only recently have efforts been undertaken to properly identify and describe such ‘complex’ AS events. In previous work, ‘bit matrices’ identifying exonic and intronic segments of transcripts have been used to describe AS events in general (6). Alternatively, we have developed an intuitive notation system based on the relative position of alternatively used splice sites in order to univocally identify any possible AS event (Sammeth et al., submitted). In the course of investigating AS, many web resources have already been made available (7–12). In general, these tools can be considered AS-dedicated gene or genome browsers, suitable for accessing detailed information about each gene of interest but not convenient for comprehensive analyses of AS across genes. Moreover, none of them offers an exhaustive identification of AS events from custom input data.

Herein, we describe the ASTALAVISTA web server (alternative splicing transcriptional landscape visualization tool), which dynamically identifies, extracts and displays complex AS events from annotated genes. ASTALAVISTA makes it possible to investigate and compare the types and distributions of the different AS events found in the input, whether whole-genome annotations or user-provided gene sets. To our knowledge, this is the first time a tool for the exhaustive extraction of AS events from custom datasets has been made publicly available.

METHODS

ASTALAVISTA adopts a generic definition of AS events and a flexible notation system that assigns a code based on the relative position of alternatively used splice sites. In brief, given a set of annotated transcripts, the method first considers all pairwise comparisons between overlapping transcripts. A variation of the splicing structure is detected if some splice sites are used in only one of the two transcripts. Then, the relative order of the splice sites involved in such a variation, as given by their genomic coordinates, is used to build a code describing the corresponding AS event. This approach overcomes the limitations of methods that focus exclusively on simple events, and it circumvents the problem of choosing a reference transcript, i.e. defining a ‘main’ splice form against which all others are compared. Because transcripts are clustered intrinsically, the method does not depend on the assignment of transcripts to a particular gene name or locus. Furthermore, redundancy filtering is applied to obtain the list of unique AS events, regardless of how many transcript comparisons exhibit the same splicing variation. Because the notation is generic and based on relative splice site positions, AS events can be compared across different genes, chromosomes or genomes. Events describing equal variations in the exon–intron structure are thus pooled in a common group. Finally, the distribution of AS events across these groups is used to depict the AS landscape of a dataset. More details about the method and the resulting ‘AS code’ are provided on the web site.
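The pairwise comparison described above can be illustrated with a minimal Python sketch. It is not the ASTALAVISTA implementation: the function names `splice_sites` and `as_code` are hypothetical, and the symbols used in the code string ('^' for a donor, '-' for an acceptor, '0' for a transcript that uses none of the variable sites) are assumed here for illustration, since the text refers to the web site for the actual definition of the AS code. For simplicity the sketch handles plus-strand transcripts only and reports all variable sites of a transcript pair as a single event.

```python
from typing import Dict, List, Optional, Tuple

Exon = Tuple[int, int]  # (start, end) genomic coordinates, plus strand assumed


def splice_sites(exons: List[Exon]) -> Dict[int, str]:
    """Map each splice site position to '^' (donor) or '-' (acceptor)."""
    exons = sorted(exons)
    sites: Dict[int, str] = {}
    for i, (start, end) in enumerate(exons):
        if i > 0:
            sites[start] = "-"  # every exon but the first starts at an acceptor
        if i < len(exons) - 1:
            sites[end] = "^"    # every exon but the last ends at a donor
    return sites


def as_code(tx_a: List[Exon], tx_b: List[Exon]) -> Optional[str]:
    """Build a relative-position code for the splice sites used by only one transcript."""
    sites_a, sites_b = splice_sites(tx_a), splice_sites(tx_b)
    # Only sites inside the region covered by both transcripts are compared.
    lo = max(min(s for s, _ in tx_a), min(s for s, _ in tx_b))
    hi = min(max(e for _, e in tx_a), max(e for _, e in tx_b))
    variable = sorted(
        (pos, sym, owner)
        for owner, sites in enumerate((sites_a, sites_b))
        for pos, sym in sites.items()
        if lo <= pos <= hi and (sites_b, sites_a)[owner].get(pos) != sym
    )
    if not variable:
        return None  # identical splicing structure in the shared region
    parts: List[List[Tuple[int, str]]] = [[], []]
    for rank, (_pos, sym, owner) in enumerate(variable, start=1):
        parts[owner].append((rank, sym))  # rank = relative position of the site
    # Put the transcript holding the lowest rank first; an empty one becomes '0'.
    parts.sort(key=lambda p: p[0][0] if p else 0)
    return ",".join("".join(f"{r}{s}" for r, s in p) or "0" for p in parts)
```

Because the code depends only on the relative order of the variable sites and not on their coordinates, two events from different genes receive the same string whenever their exon–intron variation is the same, which is what allows events to be pooled into groups.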

WEB SITE DESCRIPTION

Input: annotation datasets

ASTALAVISTA requires a set of transcripts with known exon–intron structure (e.g. from mRNAs or ESTs). As the primary data source for custom transcript sets, the genomic positions of the exon boundaries of each transcript are provided in the gene transfer format (GTF). Each GTF line has nine required fields, of which the following are used: the feature (e.g. ‘exon’, ‘CDS’, etc.), the start and end coordinates on a chromosome (or a contig), the strand, and an identifier for the corresponding transcript. Note that no gene identifiers are necessary, owing to the intrinsic clustering of transcripts into loci. To check the GTF format requirements, an example input is available on the web server. Optionally, if protein-coding region information is provided in the input (feature ‘CDS’), transcripts and/or extracted AS events may be filtered according to the annotated CDSs. This makes it straightforward to compare the AS landscapes of coding versus non-coding transcripts, or of events located in CDSs versus UTRs. Alternatively, the user may analyze the anatomy of AS as characterized by popular human genome annotations (GENCODE, REFSEQ, ENSEMBL), or simply provide any set of human genes. In the latter case, the list of genes can be specified by identifiers from various nomenclature systems, such as REFSEQ mRNA IDs, SWISSPROT IDs, HUGO gene symbols, ENSEMBL transcript/gene/protein identifiers, etc. The AS topology can therefore be assessed differentially for custom datasets containing the respective genes of interest.
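As an illustration of the input described above, the following Python sketch collects the exon coordinates of each transcript from GTF lines. It is a minimal reader written under stated assumptions, not the server's parser: the function name `read_exons` is hypothetical, the nine tab-separated fields follow the standard GTF column order, and the transcript identifier is assumed to appear in the attribute column as `transcript_id "..."`. No gene identifier is read, in line with the note that none is required.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Key: (transcript id, chromosome or contig, strand); value: sorted exon list.
TranscriptKey = Tuple[str, str, str]

_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def read_exons(gtf_lines: Iterable[str]) -> Dict[TranscriptKey, List[Tuple[int, int]]]:
    """Group 'exon' features by transcript, keeping start and end coordinates."""
    transcripts: Dict[TranscriptKey, List[Tuple[int, int]]] = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
        chrom, _source, feature, start, end, _score, strand, _frame, attrs = fields
        if feature != "exon":
            continue  # 'CDS' and other features are ignored in this sketch
        match = _TRANSCRIPT_ID.search(attrs)
        if match is None:
            continue  # no transcript identifier: the line cannot be assigned
        transcripts[(match.group(1), chrom, strand)].append((int(start), int(end)))
    return {key: sorted(exons) for key, exons in transcripts.items()}
```

Calling `read_exons(open(path))` on a GTF file (with `path` supplied by the user) would return one sorted exon list per transcript, which is the form needed for comparing exon–intron structures between overlapping transcripts.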

Output: landscape of AS events

From the provided annotation and the specified options, the ASTALAVISTA protocol dynamically extracts AS events. As a summary, the main result page shows a list in which each event type is depicted together with its unique code in the relative-position notation. The list is ranked according to the occurrence (number or proportion) of the events. A graphical overview is provided in the form of a pie diagram that displays the distribution of events across the groups, showing each type of simple event separately and pooling all others in one group (Figure 1, left).
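The ranking and pooling described above amount to simple counting, as the following Python sketch suggests. The function names `rank_event_types` and `pie_groups` are hypothetical, and the event codes are treated as opaque strings; which codes count as ‘simple’ is passed in by the caller rather than assumed.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def rank_event_types(event_codes: Iterable[str]) -> List[Tuple[str, int, float]]:
    """Return (code, count, proportion) tuples, most frequent event type first."""
    counts = Counter(event_codes)
    total = sum(counts.values())
    return [(code, n, n / total) for code, n in counts.most_common()]


def pie_groups(event_codes: Iterable[str], simple_codes: Set[str]) -> Dict[str, int]:
    """Count each simple event type separately and pool all others as 'other'."""
    return dict(
        Counter(code if code in simple_codes else "other" for code in event_codes)
    )
```

The input to both functions would be the list of unique events after redundancy filtering, one code per event, so that an event observed in several transcript comparisons is counted once.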
Figure 1. Analysis of the AS landscape in a sample dataset. The AS landscape is described by a list of AS events grouped according to equal variations in the exon–intron structure between transcripts (left). A schematic picture illustrates every type of event, specified by the respective code in the relative splice site position notation. The list is ranked according to the observed frequency of events, and as an overview, a pie diagram shows the resulting distribution. For each type of AS event, the enumeration of all genes/transcripts involved is provided, including the corresponding identifiers and genomic coordinates (top-right). The genomic positions are dynamically linked to the UCSC genome browser for further analysis (bottom-right).

From the result summary page, the genomic coordinates of all AS events counted in a group are accessible by clicking on the corresponding list entry (Figure 1, top-right). For each AS event, the transcript identifiers and the variable splice sites giving rise to it are specified. Finally, each event is linked to the UCSC Genome Browser for further comparative analyses (Figure 1, bottom-right).

CONCLUSION AND PERSPECTIVE

ASTALAVISTA is an exploratory tool that exhaustively extracts the AS events reflected by a given input dataset, compares them and groups them according to equal exon–intron structure variations. As a key feature, arbitrarily complex combinations of hitherto described AS events can be distinguished, either visually or by their representation in a univocal notation system. The event-based model of AS makes it easy to identify genes that involve the same type of event, e.g. alternative donors or double exon skipping. In addition, the comprehensive analysis of observed AS events provides a powerful means of investigating correlations between differences in AS patterns and functional/structural features of genes, gene sets or complete genomes. In this respect, ASTALAVISTA can handle custom inputs selected by any discriminatory criterion, e.g. evolutionary conservation, pattern or intensity of expression, function/cellular localization of the gene product, etc. Although the reference datasets currently provided on the server are dedicated to reference organisms, the generic ASTALAVISTA protocol is applicable to any genome, even if the sequencing/annotation process has not been completed. In the future, the web resource will be complemented by reference annotations for more species.
REFERENCES (10 of 12 shown)

Review 1.  Alternative pre-mRNA splicing: the logic of combinatorial control.

Authors:  C W Smith; J Valcárcel
Journal:  Trends Biochem Sci       Date:  2000-08       Impact factor: 13.807

Review 2.  Mechanisms of alternative pre-messenger RNA splicing.

Authors:  Douglas L Black
Journal:  Annu Rev Biochem       Date:  2003-02-27       Impact factor: 23.643

3.  The Alternative Splicing Gallery (ASG): bridging the gap between genome and transcriptome.

Authors:  Jeremy Leipzig; Pavel Pevzner; Steffen Heber
Journal:  Nucleic Acids Res       Date:  2004-08-03       Impact factor: 16.971

4.  Automated classification of alternative splicing and transcriptional initiation and construction of visual database of classified patterns.

Authors:  Hideki Nagasaki; Masanori Arita; Tatsuya Nishizawa; Makiko Suwa; Osamu Gotoh
Journal:  Bioinformatics       Date:  2006-02-24       Impact factor: 6.937

Review 5.  Bioinformatics of alternative splicing and its regulation.

Authors:  Liliana Florea
Journal:  Brief Bioinform       Date:  2006-03       Impact factor: 11.622

Review 6.  Alternative splicing and RNA selection pressure--evolutionary consequences for eukaryotic genomes.

Authors:  Yi Xing; Christopher Lee
Journal:  Nat Rev Genet       Date:  2006-06-13       Impact factor: 53.242

Review 7.  Alternative splicing of pre-mRNA: developmental consequences and mechanisms of regulation.

Authors:  A J Lopez
Journal:  Annu Rev Genet       Date:  1998       Impact factor: 16.830

8.  ASD: a bioinformatics resource on alternative splicing.

Authors:  Stefan Stamm; Jean-Jack Riethoven; Vincent Le Texier; Chellappa Gopalakrishnan; Vasudev Kumanduri; Yesheng Tang; Nuno L Barbosa-Morais; Thangavel Alphonse Thanaraj
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

9.  ASGS: an alternative splicing graph web service.

Authors:  Durgaprasad Bollina; Bernett T K Lee; Tin Wee Tan; Shoba Ranganathan
Journal:  Nucleic Acids Res       Date:  2006-07-01       Impact factor: 16.971

10.  ASPIC: a web resource for alternative splicing prediction and transcript isoforms characterization.

Authors:  Tiziana Castrignanò; Raffaella Rizzi; Ivano Giuseppe Talamo; Paolo D'Onorio De Meo; Anna Anselmo; Paola Bonizzoni; Graziano Pesole
Journal:  Nucleic Acids Res       Date:  2006-07-01       Impact factor: 16.971
