Enhanced protein domain discovery by using language modeling techniques from speech recognition.

Lachlan Coin, Alex Bateman, Richard Durbin.

Abstract

Most modern speech recognition uses probabilistic models to interpret a sequence of sounds. Hidden Markov models, in particular, are used to recognize words. The same techniques have been adapted to find domains in protein sequences of amino acids. To increase word accuracy in speech recognition, language models are used to capture the information that certain word combinations are more likely than others, thus improving detection based on context. However, to date, these context techniques have not been applied to protein domain discovery. Here we show that the application of statistical language modeling methods can significantly enhance domain recognition in protein sequences. As an example, we discover an unannotated Tf_Otx Pfam domain on the cone rod homeobox protein, which suggests a possible mechanism for how the V242M mutation on this protein causes cone-rod dystrophy.
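The context idea in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it assumes a plain bigram language model over domain names with add-one smoothing, and it adds the resulting context log-odds (in bits) to a per-domain HMM score. The function names, the toy architectures, and the scores of 24.2 and 25.0 bits are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

START = "<start>"  # pseudo-domain marking the N-terminus of a protein


def train_bigram(architectures):
    """Count domain unigrams and (previous, current) bigrams.

    `architectures` is a list of proteins, each given as an ordered
    list of domain names.
    """
    unigrams, bigrams = Counter(), Counter()
    for arch in architectures:
        prev = START
        for dom in arch:
            unigrams[dom] += 1
            bigrams[(prev, dom)] += 1
            prev = dom
    return unigrams, bigrams


def context_log_odds(dom, prev, unigrams, bigrams):
    """Return log2[P(dom | prev) / P(dom)] using add-one smoothing."""
    vocab = len(unigrams)
    total = sum(unigrams.values())
    # Number of times `prev` was followed by any domain in the training data.
    prev_count = sum(c for (p, _), c in bigrams.items() if p == prev)
    p_cond = (bigrams[(prev, dom)] + 1) / (prev_count + vocab)
    p_marg = (unigrams[dom] + 1) / (total + vocab)
    return math.log2(p_cond / p_marg)


def rescore(hmm_bits, dom, prev, unigrams, bigrams):
    """Combine an HMM bit score with the bigram context term."""
    return hmm_bits + context_log_odds(dom, prev, unigrams, bigrams)


# Toy training data (invented counts, not taken from Pfam or the paper).
TOY_ARCHITECTURES = [
    ["Homeobox", "Tf_Otx"],
    ["Homeobox", "Tf_Otx"],
    ["Homeobox"],
    ["Kinase"],
    ["Kinase", "SH2"],
]

unigrams, bigrams = train_bigram(TOY_ARCHITECTURES)

THRESHOLD_BITS = 25.0   # hypothetical acceptance threshold
candidate_bits = 24.2   # hypothetical sub-threshold HMM score for Tf_Otx

with_context = rescore(candidate_bits, "Tf_Otx", "Homeobox", unigrams, bigrams)
print(with_context >= THRESHOLD_BITS)
```

In this toy setting a Tf_Otx candidate that follows a Homeobox domain gains score from its context, while the same candidate following an unrelated domain is penalized. A real system would need a principled smoothing scheme and would have to handle overlapping candidate domains.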

Year: 2003 PMID: 12668763 PMCID: PMC404693 DOI: 10.1073/pnas.0737502100

Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
