
DNA pattern recognition using canonical correlation algorithm.

B K Sarkar, Chiranjib Chakraborty.

Abstract

We used canonical correlation analysis (CCA) as an unsupervised statistical tool that describes related views of the same semantic object in order to identify patterns. We propose a CCA-based pattern recognition technique for locating a required genetic code in a DNA sequence. Two related but different objects were considered: one was a particular pattern, and the other was the test DNA sequence. CCA found correlations between these two observations, the pattern and the test sequence. We conclude that the correlation attains its maximum value at the position where the pattern occurs. As a case study, the potential of CCA was demonstrated on sequences from HIV-1 preferred integration sites. The subsequences flanking the integration site on the left and right were taken as the two views, and statistically significant relationships between these two views were established, pointing to viral preference as an important factor in the correlation.
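The scanning idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the sequences are encoded or how the two views are built, so everything here is an assumption for illustration: the one-hot encoding, the window-by-window scan, the helper names (`one_hot`, `first_canonical_correlation`, `scan`), and the toy pattern and sequence. The first canonical correlation is computed with NumPy as the largest singular value of the product of orthonormal bases of the two centred views.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed encoding order; not specified in the abstract


def one_hot(seq):
    """Encode a DNA string as a (len(seq), 4) indicator matrix."""
    idx = [ALPHABET.index(base) for base in seq]
    return np.eye(len(ALPHABET))[idx]


def _orthonormal_basis(mat, tol=1e-10):
    """Orthonormal basis of the column space of the column-centred matrix."""
    centred = mat - mat.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, s > tol]


def first_canonical_correlation(x, y):
    """Largest canonical correlation between two views with matching rows."""
    qx, qy = _orthonormal_basis(x), _orthonormal_basis(y)
    if qx.shape[1] == 0 or qy.shape[1] == 0:
        return 0.0  # a constant view carries no correlation information
    sv = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return float(min(sv[0], 1.0))  # clamp floating-point overshoot


def scan(pattern, sequence):
    """Score every window of the sequence against the pattern."""
    m = len(pattern)
    p = one_hot(pattern)
    return [
        first_canonical_correlation(p, one_hot(sequence[i:i + m]))
        for i in range(len(sequence) - m + 1)
    ]


# Hypothetical toy data: the pattern occurs at offset 5 of the sequence.
pattern = "GATTACA"
sequence = "CCGTAGATTACATGCA"
scores = scan(pattern, sequence)
true_pos = sequence.find(pattern)
print(true_pos, round(scores[true_pos], 6))
```

Under this encoding the score at the true position reaches the maximum possible value of 1, consistent with the abstract's conclusion. Ties can occur, however: any window that is a consistent relabelling of the pattern's bases also scores 1, so a practical detector would need a further check that this sketch omits.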


Year:  2015        PMID: 26564973     DOI: 10.1007/s12038-015-9555-z

Source DB:  PubMed          Journal:  J Biosci        ISSN: 0250-5991            Impact factor:   1.826

