Eiko E Kuramae, Vincent Robert, Carlos Echavarri-Erasun, Teun Boekhout.
Abstract
BACKGROUND: The construction of robust and well-resolved phylogenetic trees is important for our understanding of many, if not all, biological processes, including speciation and the origin of higher taxa, genome evolution, metabolic diversification, multicellularity, the origin of lifestyles, and pathogenicity. Many older phylogenies were not well supported because the single gene or few genes used in the reconstructions carried insufficient phylogenetic signal. Importantly, single-gene phylogenies were not always found to be congruent. The phylogenetic signal may, therefore, be increased by enlarging the number of genes included in phylogenetic studies. Unfortunately, concatenation of many genes does not take into consideration the evolutionary history of each individual gene. Here, we describe an approach to select phylogenetically informative proteins for use in the Tree of Life (TOL) and barcoding projects by comparing the cophenetic correlation coefficients (CCC) among the distance matrices of individual proteins, using the fungi as an example. The method demonstrated that both the quality and the number of concatenated proteins are important for a reliable estimation of the TOL. Approximately 40-45 concatenated proteins seem to be needed to resolve the fungal TOL.
Year: 2007 PMID: 17688684 PMCID: PMC2045111 DOI: 10.1186/1471-2148-7-134
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
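Table 1. Correlation values of the KOG distance matrices compared with that of KOG2671, KOG functional category, and, for the corresponding single-protein KOGs, the systematic name, systematic deletion phenotype, and chromosome number of the ORFs of Saccharomyces cerevisiae (Sce) [19].
| Correlation with KOG2671 | KOG | Systematic name (Sce) | Systematic deletion | Chromosome |
| --- | --- | --- | --- | --- |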
| 1.00 | KOG2671 | YOL124c | viable | XV |
| 0.93 | KOG0340 | YHR169w | inviable | VIII |
| 0.91 | KOG4089 | YDR405w | viable | IV |
| 0.91 | KOG0173 | YOR157C | inviable | XV |
| 0.91 | KOG2728 | YIL083c | inviable | IX |
| 0.90 | KOG3111 | YJL121c | viable | X |
| 0.89 | KOG3800 | YDR460w | inviable | IV |
| 0.89 | KOG3024 | YOR164c | viable | XV |
| 0.89 | KOG0816 | YKL009w | viable | XI |
| 0.89 | KOG2905 | YGR005c | inviable | VII |
| 0.89 | KOG3013 | YHR069c | inviable | VIII |
| 0.89 | KOG1416 | YNL062c | inviable | XIV |
| 0.88 | KOG2299 | YNL072w | viable | XIV |
| 0.88 | KOG3045 | YDR083w | viable | IV |
| 0.88 | KOG3003 | YOR232w | inviable | XV |
| 0.87 | KOG4018 | YDR152w | viable | IV |
| 0.87 | KOG3786 | YLR418c | viable | XII |
| 0.87 | KOG3789 | YEL062w | viable | V |
| 0.86 | KOG0809 | YOL018c | viable | XV |
| 0.86 | KOG4093 | YPL225w | viable | XVI |
| 0.86 | KOG3015 | YJL180c | viable | X |
| 0.86 | KOG2487 | YPR056w | inviable | XVI |
| 0.85 | KOG0438 | YEL050c | viable | V |
| 0.85 | KOG0645 | YDR267c | inviable | IV |
| 0.85 | KOG2851 | YIR008c | inviable | IX |
| 0.85 | KOG2267 | YKL045w | inviable | XI |
| 0.84 | KOG2732 | YJR006w | inviable | X |
| 0.84 | KOG2021 | YKL205w | viable | XI |
| 0.83 | KOG0991 | YOL094c | inviable | XV |
| 0.83 | KOG3224 | YPR040w | viable | XVI |
| 0.83 | KOG2994 | YML021c | viable | XIII |
| 0.82 | KOG3103 | YGR172c | inviable | VII |
| 0.82 | KOG1598 | YGR246c | inviable | VII |
| 0.82 | KOG0436 | YGR171c | viable | VII |
| 0.81 | KOG2326 | YMR106C | viable | XIII |
| 0.81 | KOG1355 | YNL220w | viable | XIV |
| 0.81 | KOG1741 | YPR166c | viable | XVI |
| 0.80 | KOG3381 | YHR122w | inviable | VIII |
| 0.79 | KOG3244 | YDR204w | viable | IV |
| 0.79 | KOG1534 | YLR243w | inviable | XII |
| 0.78 | KOG3229 | YKL041w | viable | XI |
| 0.77 | KOG3438 | YNL113w | inviable | XIV |
| 0.77 | KOG1069 | YGR095c | inviable | VII |
| 0.76 | KOG3364 | YIL065c | viable | IX |
| 0.76 | KOG0989 | YJR068w | inviable | X |
| 0.75 | KOG3911 | YDR087c | inviable | IV |
| 0.73 | KOG3104 | YDR005c | viable | IV |
| 0.73 | KOG0304 | YNR052c | viable | XIV |
| 0.73 | KOG3341 | YPL002c | viable | XVI |
| 0.72 | KOG3059 | YPL076w | inviable | XVI |
| 0.71 | KOG3259 | YJR017c | inviable | X |
| 0.71 | KOG3313 | YGR078c | viable | VII |
| 0.70 | KOG1750 | YNR036c | viable | XIV |
| 0.70 | KOG0396 | YIL097w | viable | IX |
| 0.70 | KOG3240 | YPR113w | inviable | XVI |
| 0.69 | KOG1173 | YKL022c | inviable | XI |
| 0.68 | KOG2626 | YLR015w | viable | XII |
| 0.66 | KOG1299 | YGL095c | viable | VII |
| 0.65 | KOG3327 | YJR057w | inviable | X |
| 0.62 | KOG1746 | YOR103c | inviable | XV |
| 0.61 | KOG3159 | YJL046w | viable | X |
| 0.56 | KOG0325 | YLR239c | viable | XII |
| 0.50 | KOG3063 | YJL053w | viable | X |
| 0.50 | KOG0282 | YDR364c | viable | IV |
| 0.48 | KOG2874 | YCL059c | inviable | III |
| 0.44 | KOG4017 | YMR201c | viable | XIII |
| 0.36 | KOG3228 | YDR163w | viable | IV |
| 0.35 | KOG0551 | YBR155w | inviable | II |
| 0.24 | KOG0285 | YPL151c | inviable | XVI |
| 0.08 | KOG2441 | YAL032c | inviable | I |
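The ranking in the table above can be illustrated with a minimal sketch. It assumes that each correlation value is a Pearson correlation over the off-diagonal entries of two pairwise distance matrices built on the same set of genomes; the exact CCC procedure used by the authors may differ. The function names, the toy matrices, and the labels `KOG_A` and `KOG_B` are hypothetical and do not come from the paper.

```python
import math


def upper_triangle(matrix):
    """Flatten the strictly upper-triangular entries of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def matrix_correlation(d1, d2):
    """Correlate two distance matrices over their off-diagonal entries (assumed metric)."""
    return pearson(upper_triangle(d1), upper_triangle(d2))


def select_proteins(reference, candidates, threshold=0.50):
    """Return (correlation, name) pairs above the threshold, best first."""
    scored = [(matrix_correlation(reference, m), name) for name, m in candidates.items()]
    return sorted((pair for pair in scored if pair[0] > threshold), reverse=True)


# Toy 3-genome distance matrices (hypothetical data, not from the paper).
reference = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
candidates = {
    "KOG_A": [[0, 2, 4], [2, 0, 6], [4, 6, 0]],  # same ordering of distances
    "KOG_B": [[0, 3, 2], [3, 0, 1], [2, 1, 0]],  # reversed ordering of distances
}

selected = select_proteins(reference, candidates)
```

With a real data set, `reference` would hold the KOG2671 distance matrix and `candidates` the remaining KOG matrices; the proteins returned by `select_proteins` would then be the candidates for concatenation.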
Figure 1. Phylogenetic relationship of 33 complete fungal genomes. The same tree topology is given by concatenation of 30, 40, 50, 60 and 64 KOG proteins with correlation values >0.50 when compared to the reference KOG2671 distance matrix. Asp. = Aspergillus, Can. = Candida, Cry. = Cryptococcus, Sac. = Saccharomyces, Ash. = Ashbya. Phyla: I = Ascomycota, II = Basidiomycota, III = Rhyzomycota. Subphyla: IA = Saccharomycotina, IB = Pezizomycotina, IC = Taphrinomycotina, IIA = Agaricomycotina, IIB = Ustilaginomycotina, IIIA = Mucormycotina. Classes: IB1 = Sordariomycetes, IB2 = Leotiomycetes, IB3 = Eurotiomycetes, IB4 = Dothideomycetes. Support values indicated on the branches were obtained by bootstrap analysis using 100 replicates. * indicates support values of 98–100%.
Figure 2. Graph representing the number of concatenated KOGs (x-axis) per functional KOG category (information storage and processing; cellular processes and signaling; metabolism; poorly characterized), and the correlation values between the KOG2671 distance matrix and the distance matrix of each of the 70 KOGs (right y-axis). The left y-axis shows the cumulative values of each KOG functional category when the KOGs are concatenated. The KOG protein corresponding to each number on the x-axis is listed in Table 1, and the corresponding functional category is given in Supplemental Table 1.
Genome sources, genome size (Mb), and number of KOGs assigned to each genome used in the study
| Strain | Genome size (Mb) | Number of KOGs | Source | |
| --- | --- | --- | --- | --- |
| ATCC10895 | 7 | 2,592 | Zoologisches Institut der Univ. Basel, Switzerland | |
| Af293 | 30 | 3,182 | TIGR | |
| FGSC A4 | 31 | 2,982 | Broad Institute | |
| B05.10 | 38 | 3,191 | Broad Institute | |
| | 100 | 4,235 | Wellcome Trust Sanger Institute | |
| SC5314 | 16 | 2,636 | Stanford University | |
| CBS138 | 13 | 2,505 | Genolevures | |
| ATCC6260 | 12 | 2,750 | Broad Institute | |
| ATCC42720 | 16 | 2,742 | Broad Institute | |
| CBS148.51 | 36 | 3,144 | Broad Institute | |
| RS | 28.78 | 3,137 | Broad Institute | |
| Okayama 7 (#130). | 37.5 | 3,210 | Broad Institute | |
| JEC21 | 24 | 2,876 | TIGR | |
| H99 | 20 | 3,074 | Broad Institute | |
| CBS767 | 12.22 | 2,760 | Genolevures | |
| PH-1 (NRRL 31084) | 36 | 3,063 | Broad Institute | |
| CLIB210 | 10.69 | 2,596 | Genolevures | |
| 70-15 | 40 | 2,917 | Broad Institute | |
| N-150 | 40 | 2,962 | Broad Institute | |
| RP78 | 30 | 2,945 | DOE Joint Genome Institute | |
| RA99–880 | 40 | 3,310 | Broad Institute | |
| MCYC623 | 12 | 2,560 | Stanford University | |
| NRRL Y-12630 | 10.2 | 2,390 | Stanford University | |
| RM11-1a | 12 | 2,665 | Broad Institute | |
| S288c | 12.07 | 2,668 | Wellcome Trust Sanger Institute | |
| NRRL Y-12651 | 10.2 | 1,747 | Stanford University | |
| IFO1802 | 10.6 | 1,855 | Stanford University | |
| IFO1815 | 12 | 2,557 | Stanford University | |
| NRRL Y-17217 | 12 | 2,592 | Stanford University | |
| S288C | 13 | 2,668 | Stanford University | |
| Urs Leupold 972 h- | 14 | 2,762 | Wellcome Trust Sanger Institute | |
| 1980 | 38 | 3,219 | Broad Institute | |
| SN15 | 37.1 | 3,324 | Broad Institute | |
| 521 | 20 | 2,850 | Broad Institute | |
| CLIB99 | 20–21 | 2,699 | Genolevures |