Thomas Rolland, Cécile Neuvéglise, Christine Sacerdot, Bernard Dujon.
Abstract
Horizontal gene transfer (HGT) has occasionally been reported in eukaryotic genomes, but such events appear much less numerous than in prokaryotes, where they play important functional and evolutionary roles. In yeasts, few independent cases have been described, some of which correspond to major metabolic functions, but no systematic screening of horizontally transferred genes has been attempted so far. Taking advantage of the synteny conservation among five newly sequenced and annotated genomes of Saccharomycetaceae, we carried out a systematic search for HGT candidates among genes present in only one species within conserved synteny blocks. Out of 255 species-specific genes, we discovered 11 candidates for HGT, based on their similarity with bacterial proteins and on reconstructed phylogenies. This corresponds to a minimum of six transfer events because some horizontally acquired genes appear to rapidly duplicate in yeast genomes (e.g. YwqG genes in Kluyveromyces thermotolerans and serine recombinase genes of the IS607 family in Saccharomyces kluyveri). We show that the resulting copies are subject to strong functional selective pressure. The mechanisms of DNA transfer and integration are discussed in relation to the generally small size of HGT candidates. Our results on a limited set of species expand by 50% the number of previously published HGT cases in hemiascomycetous yeasts, suggesting that this type of event is more frequent than usually thought. Our restrictive method does not exclude the possibility that additional HGT events exist. Indeed, ancestral events common to several yeast species must have been overlooked, and the absence of homologs in present databases leaves open the question of the origin of the 244 remaining species-specific genes inserted within conserved synteny blocks.
Year: 2009 PMID: 19654869 PMCID: PMC2715888 DOI: 10.1371/journal.pone.0006515
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
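The screening strategy summarized in the abstract (genes found in only one species inside conserved synteny blocks, then filtered by similarity to bacterial proteins) can be illustrated with a minimal Python sketch. The data layout, the function names and the `best_hit_kingdom` lookup are assumptions made for illustration; they do not reproduce the authors' pipeline, which also relied on reconstructed phylogenies.

```python
def species_specific_genes(intervening, homolog_species):
    """List intervening genes that have homologs in a single species only.

    intervening:     {block_id: {species: [gene, ...]}}, genes lying between
                     the conserved anchors of each synteny block
                     (hypothetical layout).
    homolog_species: {gene: set of species in which a homolog is found}
                     (hypothetical, assumed to be precomputed).
    """
    specific = []
    for block_id, per_species in intervening.items():
        for species, genes in per_species.items():
            for gene in genes:
                # Species-specific: the only species with a homolog is its own.
                if homolog_species[gene] == {species}:
                    specific.append((block_id, species, gene))
    return specific


def hgt_candidates(specific, best_hit_kingdom):
    """Keep species-specific genes whose best database hit is bacterial.

    best_hit_kingdom: {gene: kingdom of the best similarity hit}
                      (hypothetical); genes without any hit are dropped.
    """
    return [entry for entry in specific
            if best_hit_kingdom.get(entry[2]) == "Bacteria"]
```

In this sketch, genes with no database hit at all are simply excluded from the candidates, which mirrors the open question raised in the abstract about the species-specific genes that lack homologs in present databases.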
Figure 1. Phylogenetic tree of the Saccharomycetaceae, with published HGT cases.
The tree was constructed by maximum likelihood using PHYML, from alignments of conserved protein families with only one member per species and corresponding to true orthologs as defined by SONS [13]. Alignments were performed using the MAFFT algorithm and further cleaned with Gblocks before concatenation (53 families, 19144 residues). Bootstrap values are indicated next to the nodes. Triangles represent cases of HGT (black are published cases and red are cases discussed in this paper). References: (1) Dujon et al. 2004; (2) Gojkovic et al. 2004; (3) Hall et al. 2005; (4) Hall and Dietrich, 2007; (5) Wei et al. 2007; (6) Fitzpatrick et al. 2008; (7) This work. Len.: length in amino-acids (for multigene families, the average is shown). Dup.: Number of copies of transferred gene. FUN: unknown function. ¥ In cases of duplication and/or ancestral HGT, the Génolevures families replace individual gene names. * Highly similar orthologs of S. cerevisiae genes YOL164W and YJL217W have been identified in K. thermotolerans, suggesting more ancestral HGT events. ** The gene is present in four copies in K. thermotolerans, but only once in K. waltii.
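The concatenation step mentioned in the legend (cleaned alignments of one-member-per-species families joined into a single matrix) might look like the following Python sketch. It covers only the concatenation; running MAFFT, Gblocks or PHYML is not shown, and the function name and input layout are assumptions.

```python
def concatenate_alignments(families, species_order):
    """Join per-family alignments into one concatenated sequence per species.

    families:      list of {species: aligned_sequence} dicts, one per protein
                   family (hypothetical layout), already aligned and cleaned.
    species_order: list of species names expected in every family.
    """
    parts = {species: [] for species in species_order}
    for index, family in enumerate(families):
        lengths = {len(seq) for seq in family.values()}
        # Require exactly one member per species and equal aligned lengths.
        if set(family) != set(species_order) or len(lengths) != 1:
            raise ValueError(f"family {index} is not a one-per-species alignment")
        for species in species_order:
            parts[species].append(family[species])
    return {species: "".join(chunks) for species, chunks in parts.items()}
```

The concatenated length is the sum of the per-family alignment lengths, so a check of that total against the figure reported in the legend (19144 residues for 53 families) would be a natural sanity test on real data.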
Figure 2. Size distribution of conserved synteny blocks among the five protoploid Saccharomycetaceae, and intervening gene numbers.
Synteny blocks common to all five protoploid Saccharomycetaceae species (inset) were classified by their number of anchoring points (abscissa; see Materials and Methods). The number of blocks in each size category is read on the left axis. The mean number of intervening genes per species and per block category is represented by dotted lines (right axis), and the totals for each species are given in the inset. All: total number of intervening genes. S-S: total number of species-specific intervening genes. HGT: total number of horizontally transferred genes.
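The kind of summary plotted in Figure 2 (blocks grouped by number of anchoring points, with the mean number of intervening genes per species in each group) can be sketched in Python as follows. The record layout and the fixed `bin_width` are assumptions; the size categories actually used in the figure are not given here.

```python
from collections import defaultdict
from statistics import mean


def summarize_blocks(blocks, bin_width=10):
    """Group synteny blocks by anchor count and average intervening genes.

    blocks:    list of {"anchors": int, "intervening": {species: count}}
               records (hypothetical layout).
    bin_width: width of each size category (an assumed value).
    Returns {bin_lower_bound: {"n_blocks": int,
                               "mean_intervening": {species: mean count}}}.
    """
    by_bin = defaultdict(list)
    for block in blocks:
        lower = (block["anchors"] // bin_width) * bin_width
        by_bin[lower].append(block)

    summary = {}
    for lower, members in sorted(by_bin.items()):
        species = sorted({sp for b in members for sp in b["intervening"]})
        summary[lower] = {
            "n_blocks": len(members),
            # A species absent from a block contributes zero intervening genes.
            "mean_intervening": {
                sp: mean(b["intervening"].get(sp, 0) for b in members)
                for sp in species
            },
        }
    return summary
```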
Figure 3. Single HGT candidates in K. lactis, K. thermotolerans and S. kluyveri.
(A) Part of the conserved synteny block surrounding the KLLA0B05269g gene (previously reported in [20]). Only protein coding and tRNA genes are shown. Orthologous genes are colored, intervening genes are white, tRNA genes are indicated by short hatched arrows, pseudogenes are identified by dotted lines. Arrows represent gene orientation. The segmental duplication between chromosomes B and D of K. lactis is shown at the bottom of the figure. KLLA0D10109g is similar to S. cerevisiae YOR120W GCY1, KLLA0D10131g similar to S. cerevisiae YKL221W MCH2, and KLLA0D10153g and KLLA0B05379g are slightly similar to S. cerevisiae YKL222C protein of unknown function. (B) Part of the conserved synteny block surrounding the KLLA0C09218g gene. Same legend as (A). (C) Part of the conserved synteny block surrounding KLTH0E10032g gene. Same legend as (A). Triangles represent single gene species-specific deletions. (D) Part of the conserved synteny block surrounding SAKL0E22528g. Same legend as (A). ZYRO0G10604s is the centromere of chromosome G in Z. rouxii (dotted oval). Note the local inversion of two genes in Z. rouxii. (E) Phylogenetic tree reconstructed from sequence alignment of 81 sites of the KLLA0B05269g product, bacterial proteins (green) and products of S. cerevisiae SMI1 gene and its K. lactis homolog (YGR229C/SACE0G11066p and KLLA0E15775p, in black). Bootstrap values are indicated next to the nodes and branch length scale is shown at bottom left. (F) Phylogenetic tree reconstructed from sequence alignment of 65 sites of the KLLA0C09218g product, bacterial proteins (green) and the S. cerevisiae glyoxalase I gene (YML004C/SACE0M03014p) and its K. lactis homolog (KLLA0F06226p). Same legend as (B). (G) Phylogenetic tree reconstructed from sequence alignment of 85 sites of the KLTH0E10032g product and bacterial proteins (dark green, cyanobacteria in light green). Same legend as (B). 
(H) Phylogenetic tree reconstructed from sequence alignment of 72 sites of the SAKL0H22528g product, bacterial and moss proteins (green) and Pezizomycotina proteins (light red). Same legend as (B).
Pairwise sequence identity between single HGT candidate products, best bacterial hits and yeast/fungal similarly annotated proteins.
| Query gene | Génolevures/Uniprot annotation | Species | Reference | Length (aa) | Identity (%) |
|---|---|---|---|---|---|
| | hypothetical protein ASA_0453 | | A4SIA4 | 143 | 53 |
| | hypothetical protein BT9727_2262 | | Q6HIN8 | 142 | 49 |
| | hypothetical protein BALH_2200 | | A0RE58 | 142 | 48 |
| | hypothetical protein BcerKBAB4_2135 | | A9VU41 | 143 | 44 |
| | SMI1/KNR4 family protein BCAH1134_3336 | | B5ULS5 | 134 | 42 |
| | SMI1 homolog | | KLTH0C01760p | 516 | 28 |
| | SMI1 homolog | | SAKL0G15466p | 515 | 26 |
| | ADL139W – SMI1 protein | | ERGO0D06072p | 657 | 26 |
| | SMI1 homolog | | ZYRO0B15180p | 619 | 24 |
| | YGR229C – SMI1 protein | | SACE0G11066p | 505 | 24 |
| | | | KLLA0E15775p | 535 | 23 |
| | | | | | |
| | hypothetical protein XAC0586 | | Q8PPU9 | 164 | 44 |
| | putative glyoxalase/bleomycin resistance protein/dioxygenase XCV0638 | | Q3BXZ4 | 131 | 41 |
| | hypothetical protein OA2633_07919 | | A3UH53 | 131 | 39 |
| | glyoxalase/bleomycin resistance protein/dioxygenase Swit_2866 | | A5VAA2 | 127 | 39 |
| | Glyoxalase I homolog | | SAKL0A02244p | 341 | 25 |
| | KLLA0F06226p | | KLLA0F06226p | 338 | 24 |
| | YML004C - Glyoxalase I | | SACE0M03014p | 326 | 25 |
| | Glyoxalase I homolog | | ZYRO0D09064p | 347 | 28 |
| | Glyoxalase I homolog | | KLTH0A05896p | 346 | 26 |
| | | | | | |
| | Hypothetical protein Sce1221 | | A9F1M3 | 292 | 37 |
| | Putative uncharacterized protein M23134_00952 | | A1ZZM9 | 277 | 27 |
| | Putative NADH-ubiquinone oxido-reductase MED297_07641 | | A4BKJ1 | 284 | 29 |
| | NAD-dependent epimerase/dehydratase family protein CYA_1049 | | Q2JVJ5 | 318 | 27 |
| | Putative uncharacterized protein CY0110_10507 | | A3IH58 | 325 | 27 |
| | Putative chaperon-like protein for quinone binding in Photosystem II ycf39 | | B1X3H0 | 320 | 24 |
| | Putative chaperon-like protein for quinone binding in Photosystem II A9601_13281 | | A2BS51 | 320 | 26 |
| | Ycf39 protein | | P74029 | 219 | 22 |
| | | | | | |
| | Predicted protein PHYPADRAFT_163804 | | A9SAE7 | 248 | 38 |
| | MoeA domain protein Bcenmc03_4042 | | B1K6R0 | 674 | 38 |
| | MoeA domain protein Bcen2424_3480 | | A0AXU3 | 693 | 38 |
| | Putative uncharacterized protein M23134_06735 | | A1ZXR5 | 194 | 32 |
| | Protein involved in beta-1-3-glucan synthesis Cyan7425DRAFT_0799 | | B4C8K2 | 180 | 31 |
| | Putative uncharacterized protein AM1_4879 | | B0C3L6 | 165 | 31 |
| | 1,3-beta-glucan biosynthesis protein, putative ACLA_009040 | | A1C9R6 | 517 | 34 |
| | Catalytic activity: UDP-glucose+((1)) An17g02120 | | A2R9P2 | 525 | 33 |
| | Putative uncharacterized protein ATEG_09455 | | Q0CA29 | 531 | 32 |
| | 1,3-beta-glucan biosynthesis protein, putative AFUB_053320 | | B0Y387 | 515 | 32 |
| | Glucan synthesis regulatory protein gs-1 | | P38678 | 532 | 30 |
References are taken from Uniprot database (http://www.uniprot.org/) for bacterial proteins and from Génolevures (http://www.genolevures.org/) for yeast proteins
Sequence identity is measured on total query sequence length.
KLLA0E15862g corresponds to the new Génolevures annotation KLLA0E15775g.
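As an illustration of identity "measured on total query sequence length", the following Python sketch counts identical aligned positions and divides by the number of query residues. Reading the footnote this way is an assumption, as are the function name and the optional `query_length` argument, which lets the full query length be supplied when the alignment covers only part of the query.

```python
def percent_identity(query_aln, subject_aln, query_length=None):
    """Percent identity of a pairwise alignment relative to the query length.

    query_aln, subject_aln: aligned sequences of equal length, with "-" gaps.
    query_length:           full length of the query protein; defaults to the
                            number of query residues present in the alignment.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    if query_length is None:
        query_length = sum(1 for residue in query_aln if residue != "-")
    if query_length <= 0:
        raise ValueError("query length must be positive")
    # Count columns where both sequences carry the same residue (gaps excluded).
    matches = sum(1 for q, s in zip(query_aln, subject_aln)
                  if q != "-" and q == s)
    return 100.0 * matches / query_length


# Four identical positions over five query residues.
print(percent_identity("MKT-LV", "MRTALV"))  # 80.0
```

With this denominator, a short bacterial hit aligned to a much longer yeast protein cannot reach a high identity value, so percentages are only comparable between rows that share the same query.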
Figure 4. Duplicated HGT candidates in K. thermotolerans.
(A) Part of the conserved synteny block surrounding KLTH0C07700g and KLTH0C07722g tandem genes. Same legend as Figure 3A. (B) Part of the conserved synteny block surrounding KLTH0F12276g gene. Same legend as Figure 3A. (C) Sequence alignment of proteins from K. thermotolerans and K. waltii. (D) Phylogenetic tree reconstructed from sequence alignment of 100 sites of K. thermotolerans protein family members, K. waltii gene, bacterial (green) and amoebal (blue) proteins. Bootstrap values are indicated next to the nodes and branch length scale is shown at bottom left.
Pairwise sequence identity between K. thermotolerans protein family, K. waltii protein and best bacterial and amoebal hits.
| Génolevures/Uniprot annotation | Species | Reference | Length (aa) | Identity (%) to KLTH0C07700g | Identity (%) to KLTH0C07722g | Identity (%) to KLTH0F12276g | Identity (%) to KLTH0H12914g | Identity (%) to KLWA_20732 |
|---|---|---|---|---|---|---|---|---|
| KLTH0C07700p | | - | 259 | - | - | - | - | - |
| KLTH0C07722p | | - | 267 | 64 | - | - | - | - |
| KLTH0F12276p | | - | 259 | 52 | 53 | - | - | - |
| KLTH0H12914p | | - | 264 | 48 | 50 | 44 | - | - |
| KLWA_20732 | | - | 259 | 52 | 51 | 43 | 48 | - |
| YwqG | | P96719 | 261 | 49 | 48 | 41 | 47 | 52 |
| YwqG | | A7Z9J7 | 262 | 44 | 46 | 43 | 47 | 48 |
| YwqG | | B4AJW0 | 271 | 46 | 46 | 41 | 44 | 45 |
| Putative uncharacterized protein BT9727_2468 | | Q6HI32 | 271 | 47 | 46 | 37 | 41 | 45 |
| Putative uncharacterized protein BAS2507 | | Q81PV2 | 271 | 48 | 45 | 37 | 41 | 44 |
| Putative uncharacterized protein BcerKBAB4_2451 | | A9VGU8 | 271 | 44 | 45 | 36 | 40 | 42 |
| YwqG | | B5UWG9 | 271 | 44 | 42 | 35 | 38 | 43 |
| Putative uncharacterized protein EDI_035210 | | B0EN38 | 284 | 35 | 35 | 36 | 34 | 32 |
| Putative uncharacterized protein EHI_001040 | | B1N3C1 | 267 | 30 | 31 | 30 | 31 | 29 |
References are taken from Uniprot database (URL: http://www.uniprot.org/) for bacterial and Entamoeba proteins and from Génolevures (URL: http://www.genolevures.org/) for yeast proteins.
Sequence identity is measured on total query sequence length.
K. waltii gene KLWA_20732 sequence was taken from annotation published in Kellis et al. (2004).
Figure 5. Duplicated HGT candidates in S. kluyveri.
(A) Chromosomal region surrounding SAKL0H06314g and SAKL0H06600g and conserved syntenic regions of the other protoploid genomes including K. waltii. Same legend as Figure 3A. (B) Sequence alignment of the six putative horizontally transferred serine recombinases of S. kluyveri. Red rectangles indicate the conserved motifs (A, B and C) within the catalytic domain according to Grindley et al. [33]. (C) Phylogenetic tree built from protein alignment of the six S. kluyveri proteins and of the closest serine recombinase from archaea (Q9UXS2, Q97YI6 and Q5JF72) and bacteria especially Firmicutes (Q3CGJ5, Q8RBS0, A4XJ35 and Q331V0) and proteobacteria (Q9RMU7 and Q6A4H9). After horizontal gene transfer, the first putative serine recombinase in S. kluyveri has been successively duplicated in the genome. The presence of identical genes (SAKL0H06314g and SAKL0H06600g) suggests a recent duplication event or the result of a conversion event. Bootstrap values are indicated next to the nodes.
Pairwise sequence identity between S. kluyveri protein family and best bacterial and archaeal hits.
| Génolevures/Uniprot annotation | Species | Reference | Length (aa) | Identity (%) to SAKL0B01782p | Identity (%) to SAKL0B05940p | Identity (%) to SAKL0H03674p | Identity (%) to SAKL0H06314p | Identity (%) to SAKL0H06600p | Identity (%) to SAKL0G04686p |
|---|---|---|---|---|---|---|---|---|---|
| SAKL0B01782p | | - | 237 | - | - | - | - | - | - |
| SAKL0B05940p | | - | 253 | 59 | - | - | - | - | - |
| SAKL0H03674p | | - | 249 | 57 | 56 | - | - | - | - |
| SAKL0H06314p | | - | 248 | 56 | 57 | 61 | - | - | - |
| SAKL0H06600p | | - | 248 | 56 | 57 | 61 | 100 | - | - |
| SAKL0G04686p | | - | 254 | 56 | 62 | 57 | 62 | 62 | - |
| Regulatory protein, MerR:Resolvase | | Q3CGJ5 | 197 | 29 | 28 | 27 | 27 | 27 | 29 |
| Predicted site-specific integrase-resolvase | | Q8RBS0 | 196 | 29 | 28 | 27 | 27 | 27 | 28 |
| Resolvase, N-terminal domain | | A4XJ35 | 201 | 29 | 28 | 28 | 28 | 28 | 29 |
| Putative resolvase | | Q9UXS2 | 194 | 25 | 24 | 24 | 24 | 24 | 24 |
| First ORF in transposon ISC1904 | | Q97YI6 | 192 | 25 | 23 | 23 | 24 | 24 | 24 |
| Putative transposase OrfA | | Q9RMU7 | 217 | 26 | 24 | 24 | 25 | 25 | 24 |
| Putative uncharacterized protein OrfA | | Q6A4H9 | 212 | 26 | 23 | 22 | 23 | 23 | 23 |
| Predicted site-specific integrase/resolvase | | Q5JF72 | 209 | 27 | 23 | 24 | 22 | 22 | 25 |
| Putative IS transposase (OrfA) | | Q331V0 | 219 | 24 | 25 | 25 | 23 | 23 | 26 |
References are taken from Uniprot database (http://www.uniprot.org/) for bacteria and archaea and from Génolevures (http://www.genolevures.org/) for yeast proteins.
Sequence identity is measured on total query sequence length.