Statistical method for predicting protein coding regions in nucleic acid sequences.


Abstract

Protein coding regions of a genome fragment can be predicted mathematically either by studying variations in statistical properties along the sequence or by searching for the signals characteristic of the junctions between coding and non-coding regions. We propose here a new statistical method using correspondence analysis. This method does not use any reference codon set but takes into account the homogeneity of codon usage along the studied genome fragment. The method is compared with previously published methods, especially the 'codon usage method' of Staden, and two examples are presented. Applications to the analysis of prokaryotic operons and eukaryotic split genes are also discussed. Use of the method has also revealed two structures not previously described: i) in the human prt gene, a strong triplet structure exists in a non-coding region; ii) in the human tp-a gene, codon usage is not uniform between the different exons.
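The idea of checking codon usage homogeneity along a fragment, without a reference codon set, can be illustrated with a minimal sketch. This is not the authors' algorithm: it does not perform a full correspondence analysis. It only computes, for each window, the chi-square distance between that window's codon profile and the fragment-wide average profile, which is the metric correspondence analysis is built on. The function names, the window size, and the step are assumptions chosen for illustration.

```python
from collections import Counter


def codon_counts(seq, frame=0):
    """Count non-overlapping triplets in `seq`, starting at offset `frame`."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    # Skip triplets containing ambiguous bases such as N.
    return Counter(c for c in codons if set(c) <= set("ACGT"))


def chi2_distance_to_average(window_counts, total_counts):
    """Chi-square distance between a window's codon profile and the average profile."""
    n_win = sum(window_counts.values())
    n_tot = sum(total_counts.values())
    if n_win == 0 or n_tot == 0:
        return 0.0
    d2 = 0.0
    for codon, tot in total_counts.items():
        avg = tot / n_tot                          # fragment-wide relative frequency
        obs = window_counts.get(codon, 0) / n_win  # relative frequency in this window
        d2 += (obs - avg) ** 2 / avg
    return d2 ** 0.5


def window_profile_distances(seq, window=30, step=30, frame=0):
    """Return (start, distance) pairs for windows read in the given frame.

    `window` and `step` are assumed to be multiples of 3 so that every
    window stays in the same reading frame as the fragment-wide counts.
    """
    total = codon_counts(seq, frame)
    results = []
    for start in range(frame, len(seq) - window + 1, step):
        counts = codon_counts(seq[start:start + window])
        results.append((start, chi2_distance_to_average(counts, total)))
    return results
```

Calling `window_profile_distances(fragment)` on a nucleotide string gives one distance per window. Windows whose codon usage departs from the rest of the fragment get larger values, while a fragment with uniform codon usage gives values near zero.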


Year: 1987 PMID: 3134115 DOI: 10.1093/bioinformatics/3.4.287

Source DB: PubMed Journal: Comput Appl Biosci ISSN: 0266-7061

Cited

11 in total

1. Use and misuse of correspondence analysis in codon usage studies.

2. Assessment of protein coding measures. (Review)

3. Metagenomic Classification Using an Abstraction Augmented Markov Model.

4. Chaos game representation of gene structure.

5. A frameshift error detection algorithm for DNA sequencing projects.

6. NRSub: a non-redundant database for Bacillus subtilis.

7. Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.

8. A novel hierarchical clustering algorithm for gene sequences.

9. Integrating overlapping structures and background information of words significantly improves biological sequence comparison.

Authors: Qi Dai; Lihua Li; Xiaoqing Liu; Yuhua Yao; Fukun Zhao; Michael Zhang
Journal: PLoS One Date: 2011-11-10 Impact factor: 3.240

10. Comparison study on k-word statistical measures for protein: from sequence to 'sequence space'.

Authors: Qi Dai; Tianming Wang
Journal: BMC Bioinformatics Date: 2008-09-23 Impact factor: 3.169