Abstract
Cytochrome P450 monooxygenases (CYPs) are ubiquitous throughout the tree of life and play diverse roles in metabolism including the synthesis of secondary metabolites as well as the degradation of recalcitrant organic substrates. The genomes of budding yeasts (phylum Ascomycota, sub-phylum Saccharomycotina) typically contain fewer families of CYPs than filamentous fungi. There are currently five CYP families among budding yeasts with known function while at least another six CYP families with unknown function ("orphan CYPs") have been described. The current study surveyed the genomes of 372 species of budding yeasts for CYP-encoding genes in order to determine the taxonomic distribution of individual CYP families across the sub-phylum as well as to identify novel CYP families. Families CYP51 and CYP61 (represented by the ergosterol biosynthetic genes ERG11 and ERG5, respectively) were essentially ubiquitous among the budding yeasts while families CYP52 (alkane/fatty acid hydroxylases) and CYP56 (N-formyl-l-tyrosine oxidase) displayed several instances of gene loss at the genus or family level. Phylogenetic analysis suggested that the three orphan families CYP5217, CYP5223 and CYP5252 diverged from a common ancestor gene following the origin of the budding yeast sub-phylum. The genomic survey also identified eight CYP families that had not previously been reported in budding yeasts.
Keywords: CYPome; enzyme; metabolism; orphan gene; yeast
Year: 2019 PMID: 31398949 PMCID: PMC6723986 DOI: 10.3390/microorganisms7080247
Source DB: PubMed Journal: Microorganisms ISSN: 2076-2607
Figure 1. Currently known cytochrome P450 monooxygenase (CYP) enzymatic activities among budding yeasts: (a) Δ22-sterol desaturase; (b) 14α-lanosterol demethylase; (c) N-formyl-l-tyrosine oxidase; (d) n-alkane hydroxylase; (e) fatty acid ω-hydroxylase; (f) fatty acid (ω-1)-hydroxylase; (g) pulcherrimic acid synthase.
CYP family reference sequences.
| CYP Family | Species | Accession Number |
|---|---|---|
| 51 (ERG11) | | NP_011871 |
| 52 ( | | XP_002546278 |
| 56 ( | | NP_010690 |
| 61 (ERG5) | | NP_013728 |
| 501 | | XP_001485863 |
| 504 | | XP_001483214 |
| 548 | | XP_501196 |
| 5217 | | XP_715414 |
| 5223 | | XP_503945 |
| 5251 ( | | XP_453057 |
| 5252 | | CDR39009 |
Figure 2. Taxonomic distribution of CYP families among the budding yeasts. Genera are organized with respect to clade [24]. The number of species surveyed within each genus (n) is indicated. The color of the circles represents the occurrence of each CYP category (either individual CYP families or larger clusters of related CYP families) within each individual genus. Suspected pseudogenes were scored as “absent”. The CYP category “other” denotes CYP families detected exclusively in a small number of basal budding yeast genera. These CYP families are described in greater detail in Table 2.
Figure 3. Average gene copy number of CYP52 family genes per haploid genome among budding yeasts. Genus names are color-coded with respect to clade [24]. The number of species surveyed within each genus (n) is indicated. Error bars indicate one standard deviation.
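The quantity plotted in Figure 3 is a per-genus mean with an error bar of one standard deviation. A minimal Python sketch of that arithmetic follows. The genus labels and copy numbers are made-up placeholders, not data from this survey, and the use of the sample standard deviation (rather than the population form) is an assumption, since the caption does not say which was used.

```python
from statistics import mean, stdev


def copy_number_summary(copies_by_genus):
    """Return {genus: (mean copy number, standard deviation, n species)}.

    copies_by_genus maps a genus label to a list with one CYP52 copy
    number per surveyed species (hypothetical input layout).
    """
    summary = {}
    for genus, copies in copies_by_genus.items():
        # A standard deviation is undefined for a single species;
        # report 0.0 so that no error bar would be drawn (assumption).
        sd = stdev(copies) if len(copies) > 1 else 0.0
        summary[genus] = (mean(copies), sd, len(copies))
    return summary


# Placeholder values for illustration only.
example = {"GenusA": [2, 4, 6], "GenusB": [1]}
for genus, (avg, sd, n) in copy_number_summary(example).items():
    print(f"{genus} (n = {n}): {avg:.2f} ± {sd:.2f}")
```

Genera represented by a single species get no meaningful error bar, which is why n is carried alongside each mean.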
CYP genes in basal budding yeast taxa.
| Species | Protein Accession | Top Assigned |
|---|---|---|
| | | |
| | ODQ70065 | XP_659488 (31%/50%) |
| | ODQ69649 | XP_659488 (32%/50%) |
| | ODQ71312 | XP_659488 (32%/51%) |
| | ODQ70063 | XP_659488 (34%/55%) |
| | ODQ72725 | XP_659488 (34%/51%) |
| | ODV92859 | XP_659488 (27%/47%) |
| | | |
| | ODQ75312 | XP_682188 (46%/65%) 1 |
| | | |
| | ODQ74007 2 | XP_681077 (38%/55%) 3 |
| | | |
| | ODQ75272 | XP_681884 (44%/60%) |
| | | |
| | CDO53625 | XP_661721 (22%/41%) |
| | ODQ69471 | XP_661721 (23%/40%) |
| | ODQ74469 | XP_661721 (23%/36%) |
| | ODQ69396 | XP_660953 (27%/45%) |
| | ODQ72272 | XP_661721 (22%/40%) |
| | ODQ72285 | XP_682522 (31%/48%) |
| | | |
| | ODQ75335 | XP_664439 (46%/58%) |
1 The top Asp. nidulans hit overall (43%/59%) is a hypothetical CYP (XP_663370), which has not been assigned a CYP family [23] and the corresponding gene (AN3861) is not currently listed in the Aspergillus genome database (http://www.aspergillusgenome.org/).

2 Potentially truncated N-terminus due to suspected gene prediction error.

3 The top Asp. nidulans hit overall (56%/73%) is a hypothetical CYP (XP_661465), which has not been assigned a CYP family [23] and the corresponding gene (AN5766) is not currently listed in the Aspergillus genome database.
Figure 4. Unrooted phylogram of CYP56 gene products in representative species of sub-phylum Saccharomycotina as well as selected non-Saccharomycotina species. Individual species are color-coded with respect to clade [24]. The analysis was based on 327 aligned amino acid positions of the corresponding proteins. Colored circles represent branch support with the color indicating the proportion of retained nodes among 1000 bootstrap replicates. Branches lacking circles indicate branch support less than 500. Accession numbers for individual protein sequences are displayed when available. Protein sequences lacking accession numbers were derived through conceptual translation of genomic sequences. (Genomic coordinates for protein sequences derived through conceptual translation of genomic sequences are listed in Table A1 of Appendix A).
Figure 5. Unrooted phylogram of CYP52 gene products in representative species of sub-phylum Saccharomycotina as well as selected non-Saccharomycotina species. Individual species are color-coded with respect to clade [24]. The analysis was based on 309 aligned amino acid positions of the corresponding proteins. Colored circles represent branch support with the color indicating the proportion of retained nodes among 1000 bootstrap replicates. Branches lacking circles indicate branch support less than 500. Accession numbers for individual protein sequences are displayed when available. Protein sequences lacking accession numbers were derived through conceptual translation of genomic sequences. Named protein sequences are indicated in bold font. (Genomic coordinates for protein sequences derived through conceptual translation of genomic sequences are listed in Table A1 of Appendix A).
Figure 6. CYP-containing sophorolipid biosynthetic gene clusters in Starmerella and Wickerhamiella species. Genomic accession numbers and corresponding genomic sequence coordinates are indicated. (a) conserved sophorolipid biosynthetic gene clusters within the genus Starmerella; (b) putative sophorolipid biosynthetic gene cluster in the species W. versatilis.
Figure 7. Unrooted phylogram of CYP501 and CYP504 gene products in representative species of sub-phylum Saccharomycotina as well as selected non-Saccharomycotina species. Individual species are color-coded with respect to clade [24]. The analysis was based on 252 aligned amino acid positions of the corresponding proteins. Colored circles represent branch support with the color indicating the proportion of retained nodes among 1000 bootstrap replicates. Branches lacking circles indicate branch support less than 500. Accession numbers for individual protein sequences are displayed when available. Protein sequences lacking accession numbers were derived through conceptual translation of genomic sequences. Named protein sequences are indicated in bold font. (Genomic coordinates for protein sequences derived through conceptual translation of genomic sequences are listed in Table A1 of Appendix A).
Figure 8. Unrooted phylogram of CYP548, CYP630, CYP5217, CYP5223, CYP5252 and associated gene products in representative species of sub-phylum Saccharomycotina as well as selected non-Saccharomycotina species. Individual species are color-coded with respect to clade [24]. The analysis was based on 217 aligned amino acid positions of the corresponding proteins. Colored circles represent branch support with the color indicating the proportion of retained nodes among 1000 bootstrap replicates. Branches lacking circles indicate branch support less than 500. Accession numbers for individual protein sequences are displayed when available. Protein sequences lacking accession numbers were derived through conceptual translation of genomic sequences. (Genomic coordinates for protein sequences derived through conceptual translation of genomic sequences are listed in Table A1 of Appendix A).
Figure 9. Unrooted tree of CYP5078 gene products in representative species of sub-phylum Saccharomycotina as well as selected non-Saccharomycotina species. Individual species are color-coded with respect to clade [24]. The analysis was based on 403 aligned amino acid positions of the corresponding proteins. Colored circles represent branch support with the color indicating the proportion of retained nodes among 1000 bootstrap replicates. Branches lacking circles indicate branch support less than 500. Accession numbers for individual protein sequences are displayed when available. Protein sequences lacking accession numbers were derived through conceptual translation of genomic sequences. (Genomic coordinates for protein sequences derived through conceptual translation of genomic sequences are listed in Table A1 of Appendix A).
Genomic sequences used for conceptual translation into protein sequences for phylogenetic analysis. Numbers in parentheses refer to numbered sequences in the corresponding figures.
| Species | Genomic Accession 1 | Strand | Coordinates |
|---|---|---|---|
| CYP56 | | | |
| | PPJB01000050 2 | + | 74966–76441 |
| | BCKZ01000006 | − | 381233–382128, 382188–382782 |
| | PPLZ02000015 | + | 151344–152804 |
| | PPJE01000011 | + | 4500–5072, 5133–5489, 5549–6016, 6083–6148 |
| | PJEZ01000006 2 | + | 67720–69054 |
| | PPNN01000011 2 | − | 275828–277339 |
| | UIDE01000004 | − | 435148–435696, 435779–436135, 436192–436770 |
| | BCIF01000002 2 | + | 1110008–1111489 |
| | PPNH01000004 | − | 4783–6303 |
| | PPNC01000057 | + | 57884–59377 |
| | PPLR01000028 2 | + | 76692–78167 |
| | BCGE01000005 2 | − | 792413–793876 |
| | PPMT01000005 | − | 193708–195165 |
| | PPHZ01000005 | + | 324089–325603 |
| CYP52 | | | |
| | BCKZ01000001 | − | 888959–890518 |
| | PPJD01000017 | + | 195224–196783 |
| | PPJD01000017 | + | 197357–198934 |
| | QBLK01000108 | − | 23137–24687 |
| | QBLK01000108 | − | 21081–22637 |
| | JNFV01000009 2 | − | 683357–684952 |
| | JNFV01000003 2 | + | 6643–8226 |
| | JNFV01000009 2 | + | 7380–8951 |
| | JNFV01000001 2 | + | 2891–4453 |
| | NRDR01000004 | − | 591579–593135 |
| | NRDR01000004 | − | 318040–319596 |
| | NRDR01000025 | + | 62522–64117 |
| | NRDR01000003 | − | 308477–310045 |
| | NRDR01000005 | − | 9970–11538 |
| | NRDR01000005 | − | 6608–8176 |
| | PPXM02000002 | − | 1137995–1139512 |
| | NRED01000001 | − | 1258441–1259970 |
| | NRED01000011 | − | 25516–27039 |
| | NRED01000004 | + | 480766–482298 |
| | PPMC02000012 | − | 276041–277603 |
| CYP501, CYP504 | | | |
| | BCKZ01000004 | − | 123913–125610 |
| | NW_017962913 2 | + | 1016180–1017844 |
| | PPJM02000008 | − | 17335–18903 |
| | LCTY01000003 | − | 955844–957598 |
| | PPHV01003366 | + | 46403–47908 |
| | PPLG02000002 | + | 204898–206562 |
| | PPLG02000014 | − | 73651–75282 |
| | NC_027864 | − | 1260033–1261667 |
| | NW_017566986 | − | 510782–512500 |
| | PPIG01000060 2 | + | 55413–56858 |
| | NC_009047 2 | − | 132488–134149 |
| CYP548, CYP630, CYP5217, CYP5223, CYP5252 | | | |
| | BCKZ01000023 | − | 227089–228648 |
| | BCIP01000010 | − | 36354–38219 |
| | NW_017962915 2 | − | 986327–988084 |
| | PPJN02000012 | + | 34235–35698 |
| | PPJJ02000009 | − | 104592–106094 |
| | PPJG01000004 | − | 306605–308191 |
| | PPJC01001041 | − | 443–1888 |
| | LCTY01000007 | + | 286500–288059 |
| | PPHV01003001 | − | 5839–7305 |
| | NW_003101576 2 | − | 792130–793887 |
| | PPJD01000014 | + | 6451–7920 |
| | PPJD01000011 | − | 56654–58084 |
| | NW_017963729 2 | + | 1954183–1955925 |
| | PPJW01000001 | + | 112103–112111, 112147–112381, 112443–113550, 113613–113838 |
| | PPKU01000003 3 | + | 174994–176508 |
| | NW_017566985 | + | 474706–476289 |
| | NC_009047 2 | + | 1074603–1076360 |
| | BCGN01000006 | − | 1022779–1024251 |
| | BCGN01000011 | + | 318748–320439 |
| | PPXM02000001 | − | 657214–658683 |
| CYP5078 | | | |
| | BCGA01000001 | + | 1475803–1477413 |
| | PPLV01000007 | + | 15809–17437 |
| | PPKY02000013 | + | 3245–4852 |
| | PPKZ02000016 | − | 79288–80895 |
| | PPLW02000009 | − | 377701–379308 |
| | PPHW01000114 | − | 32141–33748 |
| | PPJQ02000001 | + | 628395–629990 |
| | PPJQ02000011 | − | 223225–224796 |
| | PPJQ02000011 | + | 248970–249275, 249341–249604, 249662–250591 |
| | PPJQ02000063 | + | 4147–4461, 4542–4781, 4846–5757 |
| | PPKO01000004 | + | 345713–347290 |
| | PPHT01000003 | + | 42457–44139 |
| | PPKQ01000003 | + | 112748–114325 |
| | PPIX02000004 | + | 115779–117362 |
| | PPKJ01000015 | + | 25059–26642 |
| | PPKH02000011 | − | 45513–47093 |
| | PPKG01000007 3 | − | 78966–80543 |
| | PPKF01000009 3 | − | 77489–79066 |
| | PPID01000125 | − | 23315–24868 |
| | PPID01000036 | + | 26664–28286 |
| | BCGN01000002 | + | 250317–251900 |
1 Translation table 1 was used unless otherwise indicated.

2 Translation table 12.

3 Translation table 27.
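The table gives, for each sequence, a strand, one or more coordinate ranges and a translation table, which is enough to repeat a conceptual translation. The following Python sketch shows one way to do that under stated assumptions: coordinates are taken to be 1-based and inclusive, multiple ranges are treated as exons to be joined, and only translation tables 1 (standard) and 12 (CUG read as serine) are implemented. The function names are hypothetical, the contig in the usage lines is a toy sequence, and this is not the authors' pipeline.

```python
import re
from itertools import product

BASES = "TCAG"
# Amino acids of the standard genetic code (translation table 1),
# listed in TCAG codon order: TTT, TTC, TTA, TTG, TCT, ...
AA_TABLE_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def codon_table(table_id=1):
    """Build a codon -> amino acid map for translation table 1 or 12."""
    codons = ["".join(c) for c in product(BASES, repeat=3)]
    table = dict(zip(codons, AA_TABLE_1))
    if table_id == 12:
        table["CTG"] = "S"  # alternative yeast nuclear code: CUG = Ser
    elif table_id != 1:
        raise ValueError("only tables 1 and 12 are sketched here")
    return table


def parse_segments(text):
    """Parse '381233–382128, 382188–382782' into [(start, end), ...]."""
    return [(int(a), int(b))
            for a, b in re.findall(r"(\d+)\s*[–-]\s*(\d+)", text)]


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def conceptual_translation(contig, segments, strand="+", table_id=1):
    """Join the segments, orient them by strand, translate to first stop.

    segments are 1-based inclusive plus-strand coordinates in ascending
    order (assumed convention); strand may be '+', '-' or '−'.
    """
    coding = "".join(contig[start - 1:end] for start, end in segments)
    if strand in ("-", "−"):
        coding = reverse_complement(coding)
    table = codon_table(table_id)
    protein = []
    for i in range(0, len(coding) - 2, 3):
        residue = table.get(coding[i:i + 3], "X")  # X for ambiguous codons
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Toy contig: two exons (positions 3-8 and 13-18) split by a 4-nt spacer.
toy = "GG" + "ATGCTG" + "TTTT" + "AAATAA" + "CC"
exons = parse_segments("3–8, 13–18")
print(conceptual_translation(toy, exons, "+", table_id=1))
print(conceptual_translation(toy, exons, "+", table_id=12))
```

The two printed peptides differ only at the CTG codon, which is the practical consequence of choosing table 12 over table 1. A quick sanity check on any row is that its ranges sum to a multiple of three; for example, 381233–382128 plus 382188–382782 gives 1491 nucleotides.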