Paola Bertolazzi, Giovanni Felici, Emanuel Weitschek.
Abstract
BACKGROUND: According to many field experts, specimen classification based on morphological keys needs to be supported by automated techniques based on the analysis of DNA fragments. The most successful results in this area have been obtained from a particular fragment of mitochondrial DNA, the gene cytochrome c oxidase I (COI), known as the "barcode". Since 2004, the Consortium for the Barcode of Life (CBOL) has promoted the collection of barcode specimens and the development of methods to analyze the barcode for several tasks, among them the identification of rules that correctly classify an individual into its species by reading its barcode.
Year: 2009 PMID: 19900303 PMCID: PMC2775153 DOI: 10.1186/1471-2105-10-S14-S7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Logic Models Extracted from Data
| Lonchophylla thomasi | v83 = t | 19 |
| Molossus molossus | v344 = t | 11 |
| Rhinophylla pumilio | v64 = t | 20 |
| Sturnira tildae | v616 = g | 8 |
Figure 1. BLOG flow chart. The flow chart of the BLOG software system.
Optimal values and Error Rates (first data set)
| 10 | 4 | 10 | 8.02% | 17.00% |
| 10 | 4 | 10 | 10.14% | 20.00% |
| 10 | 4 | 20 | 11.90% | 21.52% |
| 10 | 4 | 20 | 13.50% | 21.19% |
| average | | | 10.89% | 19.93% |
| 15 | 6 | 10 | 0.87% | 10.00% |
| 15 | 6 | 10 | 1.93% | 12.50% |
| 15 | 6 | 20 | 1.50% | 10.93% |
| 15 | 6 | 20 | 2.04% | 12.25% |
| average | | | 1.58% | 11.42% |
| 20 | 8 | 15 | 0.20% | 8.94% |
| 20 | 8 | 15 | 0.61% | 7.72% |
| average | | | 0.40% | 8.33% |
Logic Formulas for Species 1 to 5 (first data set)
| A1 | 1.00 | 0.00 | (v100 = c) and (v346 = a) and (v499 = t) and (v502 = a) |
| A2 | 0.77 | 0.00 | (v82 = t) and (v238 = t) and (v502 = c) |
| A3 | 1.00 | 0.00 | (v58 = a) and not(v100 = c) and not(v106 = a) |
| A4 | 1.00 | 0.00 | (v106 = t) and (v139 = g) |
| A5 | 1.00 | 0.00 | not(v106 = g) and not(v295 = a) and not(v295 = g) |
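Formulas of this kind can be checked mechanically against a barcode sequence. The following is a minimal sketch, not the BLOG implementation: it assumes that `v<N>` refers to the N-th site of the aligned barcode counted from 1, and that every formula is a plain conjunction of literals joined by `and`, as in the rows above. The helper names `parse_formula` and `satisfies` are hypothetical.

```python
import re

# Hypothetical helper pattern: one literal such as "(v100 = c)" or "not(v106 = a)".
LITERAL = re.compile(r"(not)?\s*\(\s*v(\d+)\s*=\s*([acgt])\s*\)")


def parse_formula(formula):
    """Parse a conjunction of literals into (position, nucleotide, negated) tuples."""
    literals = []
    for part in formula.split(" and "):
        match = LITERAL.fullmatch(part.strip())
        if match is None:
            raise ValueError(f"unrecognised literal: {part!r}")
        negated, position, nucleotide = match.groups()
        literals.append((int(position), nucleotide, negated is not None))
    return literals


def satisfies(sequence, literals):
    """Return True if the sequence satisfies every literal of the conjunction.

    Assumption: v<N> is the N-th site of the aligned barcode, counted from 1.
    """
    for position, nucleotide, negated in literals:
        observed = sequence[position - 1].lower()
        # A positive literal fails on a mismatch; a negated literal fails on a match.
        if (observed == nucleotide) == negated:
            return False
    return True
```

For example, parsing the formula listed for A3 and applying `satisfies` to a sequence with `a` at site 58, and neither `c` at site 100 nor `a` at site 106, returns `True`.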
Optimal values and Error Rates (second data set)
| 10 | 8 | 10 | 15.47% | 15.06% |
| 10 | 8 | 10 | 16.48% | 15.90% |
| 10 | 7 | 20 | 19.97% | 21.03% |
| 10 | 7 | 20 | 21.39% | 22.56% |
| average | | | 18.32% | 18.64% |
| 15 | 11 | 10 | 5.52% | 6.67% |
| 15 | 11 | 10 | 6.37% | 8.33% |
| 15 | 11 | 20 | 7.40% | 10.42% |
| 15 | 11 | 20 | 10.88% | 10.42% |
| average | | | 7.54% | 8.96% |
| 20 | 14 | 10 | 0% | 1.38% |
| 20 | 14 | 10 | 2.22% | 3.08% |
| 20 | 15 | 20 | 1.97% | 5.50% |
| 20 | 15 | 20 | 1.58% | 5.13% |
| average | | | 1.44% | 3.77% |
Logic Formulas for Species 1 to 5 (second data set)
| Ametrida centurio | 1.00 | 0.00 | (v182 = g) and (v290 = g) |
| Anoura caudifer | 1.00 | 0.00 | (v83 = c) and (v416 = a) and (v470 = c) |
| Anoura geoffroyi | 1.00 | 0.00 | (v290 = g) and (v416 = a) and (v470 = t) |
| Anoura latidens | 1.00 | 0.00 | (v266 = t) and (v377 = c) and (v416 = a) |
| Artibeus amplus | 1.00 | 0.00 | (v140 = t) and (v473 = t) and (v512 = c) and (v602 = c) |
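As an illustration of how a set of per-species formulas could be used to label a new specimen, the sketch below stores the five formulas above as site-to-nucleotide maps and returns every species whose formula the sequence satisfies. It is an assumption-laden sketch rather than the authors' procedure: `v<N>` is taken to be the N-th site (1-based) of the aligned barcode, and the names `FORMULAS` and `matching_species` are hypothetical.

```python
# Formulas transcribed from the rows above as {site: nucleotide} maps.
# Assumption: v<N> denotes the N-th site (1-based) of the aligned barcode.
FORMULAS = {
    "Ametrida centurio": {182: "g", 290: "g"},
    "Anoura caudifer": {83: "c", 416: "a", 470: "c"},
    "Anoura geoffroyi": {290: "g", 416: "a", 470: "t"},
    "Anoura latidens": {266: "t", 377: "c", 416: "a"},
    "Artibeus amplus": {140: "t", 473: "t", 512: "c", 602: "c"},
}


def matching_species(sequence, formulas=FORMULAS):
    """Return the species whose conjunction of site conditions the sequence meets."""
    seq = sequence.lower()
    matches = []
    for species, sites in formulas.items():
        # A site beyond the end of the sequence counts as not satisfied.
        if all(
            site <= len(seq) and seq[site - 1] == nucleotide
            for site, nucleotide in sites.items()
        ):
            matches.append(species)
    return matches
```

The function returns a list because nothing in the formulas themselves guarantees that they are mutually exclusive on an arbitrary sequence; an empty list means that no formula fired.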
Optimal values and Error Rates (third data set)
| 15 | 5 | 10 | 4.78% | 7.06% |
| 15 | 6 | 10 | 4.78% | 8.74% |
| 15 | 6 | 20 | 3.30% | 10.64% |
| 15 | 5 | 20 | 10.52% | 17.73% |
| average | | | 5.84% | 11.04% |
| 20 | 7 | 10 | 4.02% | 4.75% |
| 20 | 8 | 10 | 1.15% | 9.71% |
| 20 | 8 | 20 | 8.25% | 14.18% |
| 20 | 7 | 20 | 10.93% | 16.31% |
| average | | | 6.08% | 11.23% |
| 25 | 10 | 10 | 0% | 3.78% |
| 25 | 10 | 10 | 1.53% | 8.74% |
| 25 | 10 | 20 | 1.24% | 7.80% |
| 25 | 10 | 20 | 1.65% | 5.67% |
| average | | | 1.10% | 6.49% |
Logic Formulas for Species 1 to 5 (third data set)
| Ompok bimaculatus | 1.00 | 0.00 | (v400 = t) and (v556 = t) and (v607 = c) |
| Ompok pabo | 1.00 | 0.00 | (v287 = a) and (v329 = a) |
| Glyptothorax ventrolineatus | 1.00 | 0.00 | (v36 = g) and (v267 = a) and (v308 = g) and (v589 = c) |
| Glyptothorax brevipinnis | 1.00 | 0.00 | (v219 = a) and (v408 = t) |
| Parambassis ranga | 1.00 | 0.00 | (v545 = g) and (v556 = a) and (v607 = a) |