Kun Zhang, Wei Fan, Prescott Deininger, Andrea Edwards, Zujia Xu, Dongxiao Zhu.
Abstract
Insertion site characterisation of Alu elements is an important problem in primate-specific bioinformatics research. The problem poses three challenges: the data do not come as pre-defined feature vectors suitable for predictive model construction; general patterns must be discovered, and biological insight drawn from them, without any prior knowledge; and compact yet discriminative patterns must be obtained from a search space of 4^200. This paper provides an integrated algorithmic framework for these mining tasks. Compared with the benchmark biological study, our results provide a more refined analysis of the patterns involved in Alu insertion. In particular, we obtain a 200 nt predictive profile around the primary insertion site that not only contains the widely accepted consensus but also suggests a longer pattern, T(7)AA[G|A]AATAA. This pattern provides more insight into the favourable sequence variations allowed for preferred binding and cleavage by the L1 ORF2 endonuclease. The proposed method is general enough to be applied to other sequence detection problems, such as microRNA target prediction.
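As an illustration only, the longer pattern reported above can be written as a regular expression and scanned for in a nucleotide string. This is a minimal sketch, not the mining framework the paper describes. It assumes that "T(7)" means seven consecutive T's and that the bracketed position accepts either G or A. The names `find_sites` and `search_space` are hypothetical helpers introduced here.

```python
import re

# Assumed reading of the reported pattern:
# seven T's, then AA, then G or A, then AATAA (15 nt in total).
# A lookahead is used so that overlapping occurrences are also reported.
SITE_PATTERN = re.compile(r"(?=T{7}AA[GA]AATAA)")


def find_sites(seq: str) -> list[int]:
    """Return the 0-based start offsets of every pattern occurrence in seq."""
    return [m.start() for m in SITE_PATTERN.finditer(seq.upper())]


def search_space(length: int) -> int:
    """Number of distinct nucleotide sequences of the given length (4 letters)."""
    return 4 ** length


if __name__ == "__main__":
    example = "ACG" + "T" * 7 + "AAGAATAA" + "CC"
    print(find_sites(example))
    # Size of the space of 200 nt windows mentioned in the abstract.
    print(search_space(200))
```

A fixed regular expression of this kind can only confirm exact matches; it says nothing about how discriminative the pattern is, which is the question the predictive profile addresses.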
Year: 2009 PMID: 20090173 PMCID: PMC2922064 DOI: 10.1504/IJCBDD.2009.030763
Source DB: PubMed Journal: Int J Comput Biol Drug Des ISSN: 1756-0756