Rubén Armañanzas, Iñaki Inza, Roberto Santana, Yvan Saeys, Jose Luis Flores, Jose Antonio Lozano, Yves Van de Peer, Rosa Blanco, Víctor Robles, Concha Bielza, Pedro Larrañaga.
Abstract
Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics applications. Genetic algorithms, the best known and most representative evolutionary search technique, have been the subject of the majority of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain.
Year: 2008 PMID: 18822112 PMCID: PMC2576251 DOI: 10.1186/1756-0381-1-6
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
Figure 1. EDA algorithm flow chart (Figure 1-EDAChart.eps). Diagram of how an estimation of distribution algorithm works. This overview of the algorithm is further specified by the pseudocode shown in Table 1.
EDA pseudocode
| 1. Set l ← 0. Generate M points randomly to form the initial population D_0 |
| 2. Evaluate the points using the fitness function |
| 3. Select a set S_l of N ≤ M points according to a selection method |
| 4. Estimate a probabilistic model p_l(x) for the selected set S_l |
| 5. Generate M new points by sampling from p_l(x); set l ← l + 1 and return to step 2 until the stopping criterion is met |
Estimation of distribution algorithms: evolutionary computation based on learning and simulation of probabilistic graphical models.
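The pseudocode above can be made concrete. The following is a minimal sketch of the loop, instantiated as a UMDA-style EDA with a univariate model (one marginal probability per bit position); the function name, population sizes, generation count, and the OneMax fitness used in the example are illustrative choices, not taken from the paper.

```python
import random

def umda(fitness, n_bits, pop_size=100, n_select=50, generations=50, seed=0):
    """Sketch of the generic EDA loop with a univariate (UMDA-style) model."""
    rng = random.Random(seed)
    # Step 1: generate M individuals uniformly at random.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Step 2: evaluate the points with the fitness function.
        scored = sorted(population, key=fitness, reverse=True)
        # Step 3: select the N <= M best individuals (truncation selection).
        selected = scored[:n_select]
        # Step 4: estimate a univariate probabilistic model, i.e. one
        # marginal probability of a 1 per bit position.
        probs = [sum(ind[i] for ind in selected) / n_select
                 for i in range(n_bits)]
        # Step 5: sample M new individuals from the model.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
    return max(population, key=fitness)

# Example: maximise OneMax (the number of ones in the bit string).
best = umda(fitness=sum, n_bits=20)
```

Because the model is univariate, each bit is sampled independently; the multivariate EDAs discussed in the taxonomy below replace step 4 with a model that also captures dependencies between variables.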
Figure 2. EBNA and BOA paradigms (Figure 2-EBNA-BOA.eps). Diagram of probability models for the proposed EDAs in combinatorial optimization with multiple dependencies (FDA, EBNA, BOA, and EcGA).
EDAs taxonomy
| Statistical order | Advantages | Disadvantages | Examples |
| Univariate | Simplest and fastest; suited for high-cardinality problems; scalable | Ignore feature dependencies; bad performance on deceptive problems | PBIL (Baluja, 1994); UMDA (Mühlenbein and Paaß, 1996); cGA (Harik et al.) |
| Bivariate | Able to represent low-order dependencies; suited for many problems; induced models can be inspected graphically | May ignore some feature dependencies; slower than univariate EDAs | MIMIC (De Bonet et al.); dependency trees EDA (Baluja and Davies, 1997); BMDA (Pelikan and Mühlenbein, 1999); Tree-EDA / mixture of distributions EDA (Santana et al.) |
| Multivariate, parameter learning | Suited for problems with a known underlying model | May ignore complex feature dependencies; higher memory requirements than bivariate EDAs | FDA (Mühlenbein et al.); Markov network-based EDA (Shakya and McCall, 2007) |
| Multivariate, structure + parameter learning | Maximum power of generalization; flexibility to introduce user-defined dependencies; online study of the induced dependencies | Highest computation time; highest memory requirements | EcGA (Harik); EBNA (Etxeberria and Larrañaga, 1999); BOA/hBOA (Pelikan et al.); dependency networks EDA (Gámez et al.) |
A taxonomy of some representative EDAs. We highlight a set of characteristics that can guide the choice of a particular EDA suited to the goals and properties of a given problem.
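PBIL, the first univariate example in the taxonomy, illustrates the simplest end of this spectrum: rather than re-estimating its model from the selected set each generation, it maintains a single probability vector and nudges it toward the best sampled individual. A minimal sketch, with illustrative parameter values not taken from the paper:

```python
import random

def pbil(fitness, n_bits, pop_size=50, lr=0.1, generations=100, seed=0):
    """Sketch of PBIL: incremental update of a univariate probability vector."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # univariate model: one marginal per bit
    for _ in range(generations):
        # Sample a population from the current model.
        samples = [[1 if rng.random() < p else 0 for p in probs]
                   for _ in range(pop_size)]
        best = max(samples, key=fitness)
        # Move each marginal a fraction lr toward the best individual's bit.
        probs = [(1 - lr) * p + lr * b for p, b in zip(probs, best)]
    return probs

# Example: the learned marginals after maximising OneMax on 10 bits.
probs = pbil(fitness=sum, n_bits=10)
```

The update is a convex combination, so every marginal stays in [0, 1]; on an easy problem such as OneMax the vector drifts toward all-ones, while on deceptive problems this independence assumption is exactly what causes the poor performance noted in the table.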
Figure 3. Optimal protein structure (Figure 3-ProteinStructure.eps). Optimal solution of an HP model found by an EDA that uses a Markovian model.