
IntDb: A comprehensive database for classified introns of Saccharomyces and human.

Subhalaxmi Mohanty, Amouda Nizam, Monalisha Biswal.

Abstract

UNLABELLED: Introns (intragenic regions) are non-coding regions of many eukaryotic genes. However, their role in the regulation of transcription, embryonic development and gene stimulation (HEG) has become apparent in recent years. Current research therefore focuses on mutations in introns and their influence on various diseases. Many intron databases are available, such as YIDB, IDB, ExInt, GISSD and FUGOID, and they cover various aspects of introns, but none of them classifies introns, even though identification of the start intron is important because it mainly regulates the activities of the protein at the gene level. This led to the development of IntDb, a database that classifies introns as start, middle and stop on the basis of the position of a specific consensus site. The information provided in IntDb is useful for gene prediction, determination of splicing sites and identification of diseases. In addition, the database focuses on violations of the consensus rule and the frequency of other deviations observed in classified introns. IntDb also emphasizes GC content, length variation according to biased residues and the occurrence of consensus patterns, to help discover the potential role of introns. AVAILABILITY: The database is available for free at http://introndb.bicpu.edu.in/

Keywords:  consensus pattern; consensus rule violation; length variation; splicing sites

Year:  2012        PMID: 22493526      PMCID: PMC3314878          DOI: 10.6026/97320630008233

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

Introns (intragenic, non-coding DNA) split a eukaryotic gene into coding exons, and their role remains largely unknown. Current research focuses on the variation of intron length with respect to intron position and on intron dynamics [1]. Since the discovery of introns in eukaryotes in 1977, the correlation between intron positions and protein structure has been a main theme of research [2]. A strong positive correlation has been found between intron length and divergence, along with a strong negative correlation between intron length and GC content (first introns are more GC rich, longer and more divergent) [3]. Intron retention is a major type of alternative splicing in the human transcriptome. In 21,106 human genes, intron retention is located within the untranslated regions (UTRs) of human transcripts, and 22% of retained introns in human are also found in the mouse transcriptome [4]. Flanking intronic sequences are 88% conserved in upstream introns and 80% conserved in downstream introns, which is higher than the 77% conservation level of promoter regions [5]. According to the mechanism of excision, introns are classified as Group I, Group II, Group III, tRNA, spliceosomal, chloroplast, mitochondrial, pre-mRNA and HAC1 introns [6].

We have developed IntDb, a comprehensive database that contains the accession ID, definition, location, length and nucleotide sequence of classified introns (start, middle and stop) of Saccharomyces and human. IntDb also records the percentage GC content, the GT-AG and AT-AC consensus, other deviations, and disease-related information such as the application of homing endonucleases in human. IntDb covers classified introns and their analysis more fully than other intron databases such as YIDB, FUGOID, ExInt, GISSD and IDB (Table 1, see supplementary material).

Methodology

IntDb was developed using HTML and JavaScript as the frontend, MySQL as the backend and CGI-Perl for connectivity to retrieve data (Figure 1). Apache, which has an intuitive configuration, is used as the web server. The complete genomes of human and of the Saccharomycetaceae family were screened in the NCBI database to find accession IDs (GenBank sequences) that possess introns. Of the 18 Saccharomycetaceae species screened, only 3 were found to contain introns: Saccharomyces cerevisiae (350 IDs), S. pastorianus (7 IDs) and S. bayanus (8 IDs). In addition, 500 Homo sapiens accession IDs that possess introns were collected. Classified introns were collected using the IMAGA tool, which classifies each intron as start, middle or stop by taking the intron positions from the GenBank file. The database constitution is presented in the schema (Figure 2). Intron patterns were found using the IMAGA tool (http://imaga.bicpu.edu.in:8080/Isaga/), in which 4 sets of introns were taken in one slot with varying GA parameters (population size, maximum number of GA runs, crossover type).
Figure 1

Flow chart of IntDb Development

Figure 2

Schema of IntDb shows 4 tables
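
The positional classification can be illustrated with a short Python sketch. It is not the IMAGA implementation. It assumes that the first intron of a gene is the start intron, the last is the stop intron and the rest are middle introns, that a gene with a single intron has only a start intron, and that intron coordinates refer to the forward strand. The function name `classify_introns` and the example coordinates are hypothetical.

```python
from typing import List, Tuple


def classify_introns(spans: List[Tuple[int, int]]) -> List[Tuple[int, int, str]]:
    """Label each intron as 'start', 'middle' or 'stop' by its order in the gene.

    `spans` holds (begin, end) coordinates of the introns of one accession ID.
    Assumption: forward-strand coordinates, so sorting gives transcript order.
    """
    ordered = sorted(spans)
    labelled = []
    for index, (begin, end) in enumerate(ordered):
        if index == 0:
            # First intron; also the only intron of a single-intron gene.
            label = "start"
        elif index == len(ordered) - 1:
            label = "stop"
        else:
            label = "middle"
        labelled.append((begin, end, label))
    return labelled


# Hypothetical gene with three introns, given out of order.
for begin, end, label in classify_introns([(400, 520), (101, 250), (700, 910)]):
    print(begin, end, label)
```

Under this rule every intron-containing accession ID contributes exactly one start intron, at most one stop intron and any number of middle introns.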

Utility

IntDb contains classified introns of Saccharomyces and human, along with the accession IDs that possess introns and relevant information such as definition, number of introns, location, length, nucleotide sequence, GC percentage and consensus pattern of classified introns (Figure 3a, 3b, 3c). We have analyzed several statistical and functional aspects of introns. The statistical analysis considers 850 accession IDs from Saccharomyces (350) and human (500), and identifies the consensus pattern, GC content and length variation in classified introns (Figure 3d, 3e). Within each intron-containing accession ID, the first intron is called the start intron, the last intron the stop intron, and all introns in between are middle introns. Every collected accession ID possesses at least one start intron, hence start introns are the most numerous. The analysis covers 850 start, 38 middle and 143 stop introns. In addition, we found some patterns of start introns using the IMAGA tool. Previous research shows that first introns (start introns) are longer, but our analysis indicates that middle introns are longer on average, although they occur less frequently in genes. The functional analysis covers detailed prior studies on introns and their role in homing endonucleases and transcription. Homing endonuclease applications are presented for diseases such as xeroderma pigmentosum and cancer. A transcription regulatory role is found in the murine IL4, Hsp90 beta and beta 1 tubulin genes, in which the intron enhances gene expression by intron-mediated enhancement (IME). IntDb can be used for detection of mutation positions and identification of splicing sites, which are helpful in gene prediction programs and in the development of tools and databases.
Figure 3

A snapshot of IntDb and its important options. (a) Table of accession IDs of Saccharomyces cerevisiae that possess introns, (b) search for accession ID AJ011856 on the IntDb homepage, (c) result for query AJ011856, (d) statistical analysis of the GC content and length of start, middle and stop introns, and (e) statistical analysis of the consensus pattern of start, middle and stop introns.
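
The per-intron quantities used in the statistical analysis (GC percentage and the consensus at the intron ends) can be computed as in the following minimal Python sketch. It is an illustration, not the code behind IntDb. It assumes each intron is supplied as a DNA string in 5' to 3' orientation, and it treats any pair of end dinucleotides other than GT-AG or AT-AC as a deviation. The function names and example sequences are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in an intron sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc_count = sum(1 for base in seq if base in "GC")
    return 100.0 * gc_count / len(seq)


def consensus_class(seq: str) -> str:
    """Classify an intron by its first two and last two bases."""
    seq = seq.upper()
    if len(seq) < 4:
        return "too short"
    ends = (seq[:2], seq[-2:])
    if ends == ("GT", "AG"):
        return "GT-AG"
    if ends == ("AT", "AC"):
        return "AT-AC"
    # Anything else is counted as a deviation from the consensus rule.
    return "other"


# Hypothetical toy sequences, far shorter than real introns.
for intron in ["GTAAGTCCAG", "ATATCCTTAC", "GCAAGTAG"]:
    print(intron, gc_percent(intron), consensus_class(intron))
```

Counting the returned labels separately for start, middle and stop introns would give the frequency of consensus rule violations per class.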

Future Developments

IntDb is a generalized name, and intron data of other model organisms will be included in the future. We also plan to predict and store typical patterns that are conserved during evolution, along with their functional or malfunctional role in the gene.
  6 in total

1.  Intronic sequences flanking alternatively spliced exons are conserved between human and mouse.

Authors:  Rotem Sorek; Gil Ast
Journal:  Genome Res       Date:  2003-07       Impact factor: 9.043

2.  Detection and evaluation of intron retention events in the human transcriptome.

Authors:  Pedro Alexandre Favoretto Galante; Noboru Jo Sakabe; Natanja Kirschbaum-Slager; Sandro José de Souza
Journal:  RNA       Date:  2004-05       Impact factor: 4.942

3.  YIDB: the Yeast Intron DataBase.

Authors:  P J Lopez; B Séraphin
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

4.  An analysis of intron positions in relation to nucleotides, amino acids, and protein secondary structure.

Authors:  Gordon S Whamond; Janet M Thornton
Journal:  J Mol Biol       Date:  2006-03-29       Impact factor: 5.469

5.  Patterns and rates of intron divergence between humans and chimpanzees.

Authors:  Elodie Gazave; Tomàs Marqués-Bonet; Olga Fernando; Brian Charlesworth; Arcadi Navarro
Journal:  Genome Biol       Date:  2007       Impact factor: 13.583

6.  Longer first introns are a general property of eukaryotic gene structure.

Authors:  Keith R Bradnam; Ian Korf
Journal:  PLoS One       Date:  2008-08-29       Impact factor: 3.240

