Markus Göker, Gema García-Blázquez, Hermann Voglmayr, M Teresa Tellería, María P Martín.
Abstract
BACKGROUND: Inappropriate taxon definitions may have severe consequences in many areas. For instance, biologically sensible species delimitation of plant pathogens is crucial for measures such as plant protection or biological control and for comparative studies involving model organisms. However, delimiting species is challenging in the case of organisms for which often only molecular data are available, such as prokaryotes, fungi, and many unicellular eukaryotes. Even in the case of organisms with well-established morphological characteristics, molecular taxonomy is often necessary to emend current taxonomic concepts and to analyze DNA sequences directly sampled from the environment. Typically, clustering approaches have been applied for this purpose to delineate molecular operational taxonomic units, with arbitrary choices of both the distance threshold values and the clustering algorithms.
Year: 2009 PMID: 19641601 PMCID: PMC2712678 DOI: 10.1371/journal.pone.0006319
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
GenBank Accession Numbers and Voucher Information for Query Sequences.
| Host | Geographical origin, source or herbarium number | GenBank accession no. | Closest neighbours in clustered dataset | Distance to closest neighbours |
| --- | --- | --- | --- | --- |
| | Spain, Asturias, Carcabada, MA-Fungi 27736 | FM863725 | DQ643842 (TU 5) | 0.000000 |
| | Spain, Burgos, Cornudilla, MA-Fungi 27855 | FM863718 | EU113303, EU113304, EU113305, EU113306, EU113307, EU113308, EU113310 (TU 49) | 0.000000 |
| | Spain, Gerona, Bolvir, MA-Fungi 27858 | FM863720 | EU113303, EU113304, EU113310 (TU 49) | 0.000000 |
| | Spain, Gerona, Campdevànol, MA-Fungi 27857 | FM863723 | EU113309 (TU 49) | 0.000000 |
| | Spain, Gerona, Isóvol, MA-Fungi 27859 | FM863716 | EU113303, EU113304, EU113309, EU113310 (TU 49) | 0.000000 |
| | Spain, Gerona, Puigcerdà, MA-Fungi 27854 | FM863719 | EU113309 (TU 49) | 0.000000 |
| | Spain, Huesca, Canfranc, MA-Fungi 27861 | FM863717 | EU113303, EU113304, EU113309, EU113310 (TU 49) | 0.000000 |
| | Spain, La Rioja, Ansejo, MA-Fungi 27862 | FM863721 | EU113303, EU113304, EU113305, EU113306, EU113307, EU113308, EU113310 (TU 49) | 0.000000 |
| | Spain, Lerida, Esterri d'Àneu, MA-Fungi 27856 | FM863724 | AF528556, AF528557, AY211017, EF614959, EF614961, EF614962, EF614963, EF614965, EF614966, EF614967, EF614968 (TU 49) | 0.000000 |
| | Spain, Lérida, Las Bordas, MA-Fungi 27864 | FM863722 | EU113310, EU113304, EU113303 (TU 49) | 0.000000 |
| | Spain, Asturias, Leitariegos, MA-Fungi 27850 | FM863712 | AY198286, EF614952, EF614953 (TU 73) | 0.001294 |
| | Spain, Huesca, Baños de Benasque, MA-Fungi 27849 | FM863713 | AY198286, EF614952, EF614953, EF614954 (TU 73) | 0.000000 |
| | Spain, Huesca, Baños de Panticosa, MA-Fungi 27847 | FM863715 | AY198286, EF614952, EF614953, EF614954 (TU 73) | 0.000000 |
| | Spain, Huesca, Baños de Panticosa, MA-Fungi 27848 | FM863714 | AY198286, EF614952, EF614953, EF614954 (TU 73) | 0.000000 |
Collection data (columns 1–3), GenBank accession numbers and molecular taxonomic results (columns 5–6) for the sequences newly obtained in the course of this study. The material is preserved in public collections: MA-Fungi, Real Jardín Botánico de Madrid, Spain and LISE, Estaçao Agronomica Nacional, Portugal.
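The last two table columns report, for each query sequence, the clustered sequences at minimum distance and the TU they belong to. A minimal Python sketch of that lookup follows; the function name, the tolerance parameter and the toy accessions and distances are hypothetical and are not taken from the study.

```python
def closest_neighbours(query_dists, tu_of, tol=1e-9):
    """Return (neighbour ids at minimum distance, their TU labels, that distance).

    query_dists: dict mapping reference id -> distance to the query.
    tu_of:       dict mapping reference id -> taxonomic unit (TU) number.
    tol:         tolerance for treating two distances as tied (assumption).
    """
    best = min(query_dists.values())
    hits = sorted(acc for acc, d in query_dists.items() if abs(d - best) <= tol)
    tus = sorted({tu_of[acc] for acc in hits})
    return hits, tus, best


# Hypothetical distances from one query to three clustered reference sequences.
dists = {"REF_A": 0.0, "REF_B": 0.0, "REF_C": 0.012}
tu_of = {"REF_A": 49, "REF_B": 49, "REF_C": 73}

hits, tus, best = closest_neighbours(dists, tu_of)
print(hits, tus, best)
```

If the tied neighbours fell into more than one TU, `tus` would contain several entries, which would flag an ambiguous assignment.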
Figure 1. Optimization plots.
Modified Rand Index (MRI) plot based on the poa alignment, uncorrected distances, the globally optimal F value (1.0) and two suboptimal F values (0.0 and 0.5). Axes: x-axis, T values examined (values larger than 0.25 gave the same result because all sequences were assigned to a single cluster); y-axis, resulting MRI values for taxonomy-based optimization (thick lines) and host-based optimization (thin lines). Colours: black, F = 1.0; dark grey, F = 0.5; light grey, F = 0.0.
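The optimization in this legend scores each clustering against a reference partition (taxonomy or host) and keeps the threshold T with the highest score. This excerpt does not define the Modified Rand Index, so the Python sketch below uses the Hubert-Arabie adjusted Rand index as a stand-in; that choice, the function name and the toy partitions are all assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items (n >= 2)."""
    n = len(labels_a)
    # Pairs of items placed together in both partitions, in A only, in B only.
    sum_ab = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # both partitions are all-singletons or one single cluster
    return (sum_ab - expected) / (max_index - expected)


# Hypothetical reference taxonomy and cluster labels obtained at three T values.
reference = ["sp1", "sp1", "sp2", "sp2", "sp3"]
clusterings = {
    0.00: [0, 1, 2, 3, 4],  # every sequence in its own cluster
    0.05: [0, 0, 1, 1, 2],  # clusters match the reference
    0.30: [0, 0, 0, 0, 0],  # everything merged into one cluster
}

scores = {t: adjusted_rand(reference, labels) for t, labels in clusterings.items()}
best_t = max(scores, key=scores.get)
print(scores, best_t)
```

In this toy case both extremes (all singletons, one big cluster) score zero, and the intermediate threshold reproduces the reference exactly.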
Figure 2. Dependency of the number of molecular taxonomic units (TU) on T and F.
The subset of the data with correctly formatted taxon names was analysed. Axes: x-axis, T values examined (values larger than 0.25 gave the same result because all sequences were assigned to a single cluster); y-axis, natural logarithm of the resulting number of clusters (TU) for three selected values of F. Colours: black, F = 1.0; dark grey, F = 0.5; light grey, F = 0.0.
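How the number of clusters shrinks as T grows can be illustrated with a short Python sketch. The excerpt does not define F, so the sketch uses plain single-linkage clustering via union-find and does not model F at all; the `<=` comparison, the function name and the toy distance matrix are likewise assumptions.

```python
from math import log


def single_linkage_count(dist, threshold):
    """Number of single-linkage clusters when pairs with dist <= threshold are joined."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})


# Hypothetical symmetric distance matrix for four sequences.
dist = [
    [0.00, 0.02, 0.10, 0.30],
    [0.02, 0.00, 0.12, 0.28],
    [0.10, 0.12, 0.00, 0.26],
    [0.30, 0.28, 0.26, 0.00],
]

counts = {t: single_linkage_count(dist, t) for t in (0.0, 0.05, 0.10, 0.25, 0.30)}
log_counts = {t: log(k) for t, k in counts.items()}  # natural log, as on the y-axis
print(counts)
```

Once T exceeds the largest distance needed to connect all sequences, the count stays at one and its logarithm at zero, which is the plateau the legend describes.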
Figure 3. Maximum-likelihood tree, bottom part.
Phylogram as inferred with RAxML and rooted with the Pseudoperonospora sequences present in the dataset. Branches are scaled in terms of the number of substitutions per site. Numbers above/below the branches are maximum likelihood and maximum parsimony bootstrap support values from 100 replicates. The sequence labels contain the “organism” entry and the accession number from the GenBank files; for the validity of these entries, the corrected “organism” names and the revised taxonomy, see supporting file S2. Taxonomic unit (TU) numbers from optimal clustering settings are provided in square brackets. These numbers are only used to circumscribe the TU; they do not indicate relationships between the TU (e.g. TU 16 is not closer to TU 15 than to TU 91). Red labels denote accessions affected by type I conflicts, blue labels by type II conflicts, mauve labels by both type I and II conflicts and green labels by database errors due to incorrect data submission. The red (type I) or blue (type II) lines connect the accessions affected by the respective conflict, with the conflict subtype given to the right. Type I conflicts concern the presence of the same taxon in different clusters (TU); type II conflicts concern the presence of several taxa within the same cluster (TU). Subtypes: Ia, different TU correspond to different hosts; Ib-Ic, different TU correspond to the same host; Ib, different TU are caused by sequencing/alignment artefacts; Ic, different TU are caused by high genetic variability; IIa, different taxa within a TU occur on the same host species/genus; (IIa), different taxa within a TU occur on different host genera within the same family; IIb, different taxa within a TU occur on different host families. The tree is continued in Fig. 4.
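The two conflict types defined in this legend can be detected mechanically from a list of (accession, taxon, TU) assignments. The Python sketch below is a minimal illustration; the function name and the toy records are hypothetical, and the subtypes (Ia to IIb) are not covered because they require host information.

```python
from collections import defaultdict


def find_conflicts(records):
    """Detect type I and type II conflicts.

    records: iterable of (accession, taxon, tu) tuples.
    Returns (type1, type2):
      type1 maps each taxon found in more than one TU to its sorted TU list;
      type2 maps each TU containing more than one taxon to its sorted taxon list.
    """
    tus_by_taxon = defaultdict(set)
    taxa_by_tu = defaultdict(set)
    for _accession, taxon, tu in records:
        tus_by_taxon[taxon].add(tu)
        taxa_by_tu[tu].add(taxon)
    type1 = {t: sorted(s) for t, s in tus_by_taxon.items() if len(s) > 1}
    type2 = {u: sorted(s) for u, s in taxa_by_tu.items() if len(s) > 1}
    return type1, type2


# Hypothetical assignments: taxon A is split over TU 1 and TU 2 (type I),
# and TU 2 holds both taxon A and taxon B (type II).
records = [
    ("ACC1", "taxon A", 1),
    ("ACC2", "taxon A", 2),
    ("ACC3", "taxon B", 2),
    ("ACC4", "taxon C", 3),
]

type1, type2 = find_conflicts(records)
print(type1, type2)
```

An accession such as the hypothetical `ACC2` is involved in both conflict types at once, which corresponds to the mauve labels in the tree.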
Figure 4. Maximum-likelihood tree, central part.
Phylogram as inferred with RAxML; continuation of Fig. 3 (connections indicated by arrowheads). For a description of the sequence labels and the colouring, see legend to Fig. 3. The tree is continued in Fig. 5.
Figure 5. Maximum-likelihood tree, top part.
Phylogram as inferred with RAxML; continuation of Fig. 4 (connections indicated by arrowheads). For a description of the sequence labels and the colouring, see legend to Fig. 3.