Abstract
The composition of naturally occurring DNA sequences is often strikingly heterogeneous. In this paper, the DNA sequence is viewed as a stochastic process with local compositional properties determined by the states of a hidden Markov chain. The model used is a discrete-state, discrete-outcome version of a general model for non-stationary time series proposed by Kitagawa (1987). A smoothing algorithm is described which can be used to reconstruct the hidden process and produce graphic displays of the compositional structure of a sequence. The problem of parameter estimation is approached using likelihood methods and an EM algorithm for approximating the maximum likelihood estimate is derived. The methods are applied to sequences from yeast mitochondrial DNA, human and mouse mitochondrial DNAs, a human X chromosomal fragment and the complete genome of bacteriophage lambda.
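To make the idea of smoothing over a hidden Markov chain concrete, the following is a minimal sketch of a standard scaled forward-backward smoother for a discrete-state, discrete-outcome model. It is not taken from the paper and may differ from the algorithm described there. The function name `smooth`, the two-state setup (an AT-rich and a GC-rich state), and all probability values are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: a textbook forward-backward smoother for a hidden
# Markov chain emitting DNA letters. Parameter values below are hypothetical
# and are not estimates from the paper.

ALPHABET = "ACGT"


def smooth(seq, init, trans, emit):
    """Return P(hidden state at position t | entire sequence) for every t."""
    n_states = len(init)
    obs = [ALPHABET.index(ch) for ch in seq]
    n = len(obs)

    # Forward pass, rescaled at each position to avoid numerical underflow.
    fwd, scale, prev = [], [], None
    for t, o in enumerate(obs):
        if t == 0:
            row = [init[s] * emit[s][o] for s in range(n_states)]
        else:
            row = [
                emit[s][o] * sum(prev[r] * trans[r][s] for r in range(n_states))
                for s in range(n_states)
            ]
        c = sum(row)
        prev = [x / c for x in row]
        fwd.append(prev)
        scale.append(c)

    # Backward pass, reusing the forward scaling constants.
    bwd = [[1.0] * n_states for _ in range(n)]
    for t in range(n - 2, -1, -1):
        o_next = obs[t + 1]
        bwd[t] = [
            sum(trans[s][r] * emit[r][o_next] * bwd[t + 1][r]
                for r in range(n_states)) / scale[t + 1]
            for s in range(n_states)
        ]

    # Combine both passes; renormalize to guard against rounding error.
    post = []
    for t in range(n):
        row = [fwd[t][s] * bwd[t][s] for s in range(n_states)]
        z = sum(row)
        post.append([x / z for x in row])
    return post


# Hypothetical two-state model: state 0 is AT-rich, state 1 is GC-rich.
init = [0.5, 0.5]
trans = [[0.95, 0.05], [0.05, 0.95]]
emit = [
    [0.4, 0.1, 0.1, 0.4],  # state 0: P(A), P(C), P(G), P(T)
    [0.1, 0.4, 0.4, 0.1],  # state 1: P(A), P(C), P(G), P(T)
]

seq = "ATTATAATTA" + "GCGGCCGCGC"
posterior = smooth(seq, init, trans, emit)
```

Plotting `posterior[t][1]` against position `t` would give the kind of graphic display of compositional structure that the abstract mentions. In this sketch the parameters are fixed by hand; estimating them from data, for example with an EM iteration, is a separate step not shown here.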
Year: 1989 PMID: 2706403 DOI: 10.1007/BF02458837
Source DB: PubMed Journal: Bull Math Biol ISSN: 0092-8240 Impact factor: 1.758