Jose Manuel Rodriguez, Paolo Maietta, Iakes Ezkurdia, Alessandro Pietrelli, Jan-Jaap Wesselink, Gonzalo Lopez, Alfonso Valencia, Michael L Tress.
Abstract
Here, we present APPRIS (http://appris.bioinfo.cnio.es), a database that houses annotations of human splice isoforms. APPRIS has been designed to provide value to manual annotations of the human genome by adding reliable protein structural and functional data and information from cross-species conservation. The visual representation of the annotations provided by APPRIS for each gene allows annotators and researchers alike to easily identify functional changes brought about by splicing events. In addition to collecting, integrating and analyzing reliable predictions of the effect of splicing events, APPRIS also selects a single reference sequence for each gene, here termed the principal isoform, based on the annotations of structure, function and conservation for each transcript. APPRIS identifies a principal isoform for 85% of the protein-coding genes in the GENCODE 7 release for ENSEMBL. Analysis of the APPRIS data shows that at least 70% of the alternative (non-principal) variants would lose important functional or structural information relative to the principal isoform.
Year: 2012 PMID: 23161672 PMCID: PMC3531113 DOI: 10.1093/nar/gks1058
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The APPRIS system. The inputs to the web services are the peptide sequences of the isoforms, the sequences of the transcripts and cross-species alignments of either transcripts or peptides. The output of all eight individual web services is used by APPRIS to annotate alternative variants with structural, functional and conservation information and to select a principal isoform for each gene.
Figure 2. APPRIS annotations for gene DNAJC5G. (A) Snapshot of the APPRIS page for gene DNAJC5G, showing the five protein-coding transcripts annotated by Ensembl and the selection of the principal isoform by APPRIS (shown by the green tick). (B) A selection of the annotation results from the individual modules in APPRIS (Matador3D maps 3D structure to the isoforms, SPADE maps Pfam functional domains and INERTIA detects unusually evolving exons). The principal isoform is highlighted. APPRIS chooses isoform DNAJC5G-004 as the main variant based on the output of SPADE and Matador3D and designates the two ‘KNOWN’ isoforms (which are also CCDS variants) as alternative variants. (C) The 3D structure of mouse DNAJ subfamily C2 member 5 (PDB: 2CTW), to which DNAJC5G-004 has 56% identity with no gaps. The coloring on the 3D structure comes from the predominant coloring in the Pfam multiple alignment in (D). The large red arrow shows that the 16 extra residues present in the larger isoforms would have to be inserted into an important helix. (D) The multiple alignment for a section of the Pfam DnaJ family of sequences. The red arrow shows that the 16 extra residues in the CCDS variants would need to be inserted into a critical region of the functional domain of DNAJC5G.
Figure 3. APPRIS result for gene TP63. (A) The four variants of gene TP63 that score highest in APPRIS, highlighting the 5′-splice junction differences between TA and P63delta, deltaN and TP63-013, and the GYNGYN splice event that differentiates TA and deltaN from P63delta and TP63-013. (B) Annotation results from some of the individual modules in APPRIS (Matador3D maps 3D structure to the isoforms, SPADE maps Pfam functional domains and CORSAIR detects orthologous isoforms in related species). The isoforms are color-coded as in (A) and we have added two other well-known TP63 variants to the table. APPRIS chooses isoform TP63-013 as the main variant based on the output of the three modules.