Véronique Leh Louis, Laurence Despons, Anne Friedrich, Tiphaine Martin, Pascal Durrens, Serge Casarégola, Cécile Neuvéglise, Cécile Fairhead, Christian Marck, José A Cruz, Marie-Laure Straub, Valérie Kugler, Christine Sacerdot, Zlatyo Uzunov, Agnes Thierry, Stéphanie Weiss, Claudine Bleykasten, Jacky De Montigny, Noemie Jacques, Paul Jung, Marc Lemaire, Sandrine Mallet, Guillaume Morel, Guy-Franck Richard, Anasua Sarkar, Guilhem Savel, Joseph Schacherer, Marie-Line Seret, Emmanuel Talla, Gaelle Samson, Claire Jubin, Julie Poulain, Benoît Vacherie, Valérie Barbe, Eric Pelletier, David J Sherman, Eric Westhof, Jean Weissenbach, Philippe V Baret, Patrick Wincker, Claude Gaillardin, Bernard Dujon, Jean-Luc Souciet.
Abstract
Polyploidization is an important process in the evolution of eukaryotic genomes, but ensuing molecular mechanisms remain to be clarified. Autopolyploidization or whole-genome duplication events frequently are resolved in resulting lineages by the loss of single genes from most duplicated pairs, causing transient gene dosage imbalance and accelerating speciation through meiotic infertility. Allopolyploidization or formation of interspecies hybrids raises the problem of genetic incompatibility (Bateson-Dobzhansky-Muller effect) and may be resolved by the accumulation of mutational changes in resulting lineages. In this article, we show that an osmotolerant yeast species, Pichia sorbitophila, recently isolated in a concentrated sorbitol solution in industry, illustrates this last situation. Its genome is a mosaic of homologous and homeologous chromosomes, or parts thereof, that corresponds to a recently formed hybrid in the process of evolution. The respective parental contributions to this genome were characterized using existing variations in GC content. The genomic changes that occurred during the short period since hybrid formation were identified (e.g., loss of heterozygosity, unilateral loss of rDNA, reciprocal exchange) and distinguished from those undergone by the two parental genomes after separation from their common ancestor (i.e., NUMT (NUclear sequences of MiTochondrial origin) insertions, gene acquisitions, gene location movements, reciprocal translocation). We found that the physiological characteristics of this new yeast species are determined by specific but unequal contributions of its two parents, one of which could be identified as very closely related to an extant Pichia farinosa strain.
Keywords: allopolyploidy; genome evolution; hybridization; loss of heterozygosity; osmotolerant yeast P. sorbitophila
Year: 2012 PMID: 22384408 PMCID: PMC3284337 DOI: 10.1534/g3.111.000745
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1 The hybrid nuclear genome of P. sorbitophila. The 14 chromosomes are represented by pairs according to their synteny. Chromosomes A/B and C/D are partly heterozygous, partly homozygous. Chromosomes E/F/I/J and M/N are heterozygous, and G/H and K/L are homozygous. The translocation breakpoint observed between chromosomes E, F, I, and J is represented by crossed chromosomes. Red and blue colors correspond to the proposed Pγ and Pε parental subgenomes, respectively, and hatched boxes to synteny losses between the homeologous chromosomes. Percentages of nucleotide identity between heterozygous regions are also indicated. Positions of predicted centromeres (Figure S4) and of the 15 clusters of NUMT loci comprising a total of 24 NUMTs (Table S3) are shown by white dots and pins, respectively. Triangles indicate the positions of selected sequences used in P. farinosa CBS2001 strain (Table S4). The nucleotide identity level observed between P. sorbitophila and P. farinosa for the selected sequences is represented as follows: black triangles for 100% identity, gray for 99%, and white for less than 96%. The GC content variation (as shown in Materials and Methods and Figure S5), calculated for genes present at two allelic copies in chromosome pairs, is represented below each chromosome pair (or at right for the crossed chromosomes). The colors of the curves correspond to subgenomes as for chromosomes.
Figure 3 Phylogenetic positions of the two subgenome sequences identified in the P. sorbitophila hybrid. Yarrowia lipolytica is used as outgroup to root the CTG tree. The tree was built from the alignment of 233 protein families (87,181 amino acids per species) having a single member in each analyzed species. Amino acid sequences for each family were aligned with MAFFT (Katoh ) and cleaned with Gblocks (Talavera and Castresana 2007). The tree was built from the resulting alignment with the maximum likelihood method using PHYML with a JTT substitution model corrected for heterogeneity among sites by a Γ-law distribution using four different categories of evolution rates (Guindon and Gascuel 2003). The proportion of invariable sites and the α-parameter of the Γ-law distribution were optimized according to the data. Bootstrap values were calculated from 100 replicates and are indicated before each node; the scale for branch length is shown at the bottom of the figure.
Table 1 Bias in codon usage between Pγ and Pε subgenomes
| Amino Acid | Codon | No. Codons, Pγ | No. Codons, Pε | Usage %, Pγ | Usage %, Pε | Pγ-Pε |
|---|---|---|---|---|---|---|
| Phe | TTT | 35,134 | 37,411 | 50.56 | 53.55 | 2.99 |
| | TTC | 34,354 | 32,448 | 49.44 | 46.45 | |
| Val | GTT | 29,941 | 30,325 | 59.98 | 61.56 | 1.58 |
| | GTC | 19,977 | 18,937 | 40.02 | 38.44 | |
| Ser | TCT | 34,498 | 36,153 | 64.45 | 66.26 | 1.80 |
| | TCC | 19,030 | 18,412 | 35.55 | 33.74 | |
| Pro | CCT | 24,501 | 25,360 | 66.37 | 67.88 | 1.51 |
| | CCC | 12,414 | 11,999 | 33.63 | 32.12 | |
| Thr | ACT | 25,044 | 26,245 | 58.24 | 60.78 | 2.55 |
| | ACC | 17,961 | 16,932 | 41.76 | 39.22 | |
| Ala | GCT | 30,768 | 31,479 | 61.14 | 63.54 | 2.40 |
| | GCC | 19,557 | 18,060 | 38.86 | 36.46 | |
| His | CAT | 20,343 | 20,787 | 58.51 | 60.07 | 1.56 |
| | CAC | 14,426 | 13,818 | 41.49 | 39.93 | |
| Asn | AAT | 48,995 | 51,063 | 53.57 | 55.28 | 1.71 |
| | AAC | 42,467 | 41,316 | 46.43 | 44.72 | |
| Asp | GAT | 55,556 | 56,476 | 57.36 | 58.56 | 1.21 |
| | GAC | 41,307 | 39,959 | 42.64 | 41.44 | |
| Cys | TGT | 10,196 | 10,161 | 55.71 | 55.19 | 0.52 |
| | TGC | 8,105 | 8,250 | 44.29 | 44.81 | |
| Ser | AGT | 21,219 | 21,365 | 50.48 | 50.81 | 0.33 |
| | AGC | 20,817 | 20,682 | 49.52 | 49.19 | |
| Gly | GGT | 30,801 | 29,869 | 64.52 | 63.82 | 0.70 |
| | GGC | 16,937 | 16,931 | 35.48 | 36.18 | |
| Avg. | | | | | | 1.57 |
| Mean | | | | | | 1.56 |
| Q1-Q3 | | | | | | [1.08-1.95] |
For tRNA species that pair with two codons, the usage % of each codon was determined as follows: (number of one codon / number of both codons) × 100. The values were calculated for all chromosomal regions defined as belonging to Pγ or Pε.
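The formula above can be illustrated with a minimal Python sketch that uses the Phe counts from the codon usage table. The function name `codon_usage_percent` is hypothetical, and taking the Pγ-Pε value as an absolute difference is an assumption inferred from the table, where every listed difference is positive.

```python
def codon_usage_percent(n_codon: int, n_partner: int) -> float:
    """Usage % of one codon among the two codons read by the same tRNA."""
    total = n_codon + n_partner
    if total == 0:
        raise ValueError("no codons counted for this pair")
    return 100.0 * n_codon / total


# Phe counts taken from the codon usage table, as (TTT, TTC).
phe_gamma = (35_134, 34_354)    # Pγ subgenome
phe_epsilon = (37_411, 32_448)  # Pε subgenome

ttt_gamma = codon_usage_percent(*phe_gamma)
ttt_epsilon = codon_usage_percent(*phe_epsilon)

# Assumption: the Pγ-Pε column reports the absolute difference.
bias = abs(ttt_gamma - ttt_epsilon)

print(f"{ttt_gamma:.2f} {ttt_epsilon:.2f} {bias:.2f}")  # 50.56 53.55 2.99
```

The printed values match the Phe row of the table (50.56, 53.55, and 2.99).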
Figure 2 Analysis of the GC trend curve exchange between chr. C and D. (A) Position of the GC exchange determined by the global GC content analysis along chromosomes using a sliding window of 10 kb and a step of 1 kb. (B) Distribution of the codon usage percentages calculated from tRNA species that pair with two codons and showing more than 1.5 percentage points of variation between both codons (extracted from Table 1, column Pγ-Pε). Values for each tRNA were calculated for the left and the right regions of the C/D chromosomal exchange, respectively. (C) Multiple alignments of P. sorbitophila and P. farinosa CBS 2001 sequences around the GC trend curve exchange. This exchange area is characterized by a 100% identical region (1934 nt long) between chr. C and D. Positions of SNPs between P. sorbitophila chr. C, chr. D, and P. farinosa CBS 2001 are indicated in red. As shown, the P. farinosa chromosomal sequence is first identical to the P. sorbitophila chr. D sequence and then to chr. C (after the 1934-nt-long sequence), confirming that the GC exchange results from a reciprocal translocation event.
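The sliding-window GC analysis mentioned in panel A can be sketched as follows. This is an illustrative Python version only, not the authors' pipeline: the function names are hypothetical, only full windows are reported, and ambiguous bases such as N are assumed to count toward the window length.

```python
from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_sliding_window(
    seq: str, window: int = 10_000, step: int = 1_000
) -> Iterator[Tuple[int, float]]:
    """Yield (start, GC %) for each full window along seq.

    Defaults follow the caption: a 10 kb window moved in 1 kb steps.
    """
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_content(seq[start:start + window])


# Toy sequence: a GC-only block followed by an AT-only block.
toy = "GGCC" * 5 + "ATAT" * 5
profile = list(gc_sliding_window(toy, window=20, step=10))
print(profile)  # [(0, 100.0), (10, 50.0), (20, 0.0)]
```

Plotting such a profile for two homeologous chromosomes on the same axes is one way to see where their GC trend curves cross.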
P. sorbitophila genomic features in Pγ and Pε subgenomes
| Parental Contribution | Chromosomal Region | Protein-coding: CDS Without Intron | Protein-coding: CDS With Introns | Protein-coding: Pseudogene | Protein-coding: Total Genes | ncRNA: tRNA | ncRNA: snoRNA | ncRNA: snRNA | ncRNA: Pol III ncRNA | Ribosomal DNA | Other elements: NUMT Loci |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Pγ | Solo | 95 | 2 | 7 | 104 | 0 | 0 | 0 | 0 | 73 | 1 |
| | Heterozygous | 2973 | 219 | 13 | 3205 | 88 | 21 | 3 | 5 | 0 | 0 |
| | Homozygous | 3834 | 246 | 10 | 4090 | 92 | 26 | 2 | 0 | 0 | 2 |
| | Total | 6902 | 467 | 30 | 7399 | 180 | 47 | 5 | 5 | 73 | 3 |
| Pε | Solo | 106 | 4 | 6 | 116 | 0 | 0 | 0 | 0 | 0 | 8 |
| | Heterozygous | 2950 | 218 | 37 | 3205 | 88 | 21 | 3 | 5 | 0 | 0 |
| | Homozygous | 482 | 46 | 4 | 532 | 20 | 6 | 0 | 0 | 0 | 4 |
| | Total | 3538 | 268 | 47 | 3853 | 108 | 27 | 3 | 5 | 0 | 12 |
| Total genome | | 10,440 | 735 | 77 | 11,252 | 288 | 74 | 8 | 10 | 73 | 15 |
| Haplotype equivalent | | 5,220 | 367.5 | 38.5 | 5626 | 144 | 37 | 4 | 5 | 36.5 | 7.5 |
CDS, CoDing Sequence; Pol III, polymerase III; NUMTs, NUclear sequences of MiTochondrial origin.
Tandemly repeated units.
Loci (3205 pairs) containing both parental genes (a total of 6410 genes). They correspond to CDS/CDS pairs for 3161 pairs, to CDS/pseudogene pairs for 38 pairs, and to pseudogene/pseudogene pairs for 6 pairs (supporting information).
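As a consistency check, the classification of heterozygous loci in this footnote can be reconciled with the Heterozygous rows of the genomic features table. The short Python sketch below only restates the published counts; the variable names are illustrative.

```python
# Classification of heterozygous locus pairs, from the footnote.
cds_cds, cds_pseudo, pseudo_pseudo = 3161, 38, 6

pairs = cds_cds + cds_pseudo + pseudo_pseudo
genes = 2 * pairs

# Each CDS/pseudogene pair contributes one pseudogene and one CDS;
# each pseudogene/pseudogene pair contributes two pseudogenes.
pseudogenes = cds_pseudo + 2 * pseudo_pseudo
cds = 2 * cds_cds + cds_pseudo

# Heterozygous rows of the table (Pγ + Pε).
table_pseudogenes = 13 + 37
table_cds = (2973 + 219) + (2950 + 218)

assert pairs == 3205 and genes == 6410
assert pseudogenes == table_pseudogenes == 50
assert cds == table_cds == 6360
print(pairs, genes, pseudogenes, cds)  # 3205 6410 50 6360
```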
Figure 4 Location and organization of the ribosomal DNA clusters. rDNA sequences were identified by comparison to other yeast genomes (Table S5). Three loci were identified: one at the left border of chr. E (“a” in A and B), a second at the right border of chr. E (“b” in A and B), and a third at the left border of chr. F (“c” in A and B), the last one containing approximately 73 tandem repeats. (A) Position and coordinates of each locus on the chromosomes. (B) Organization of the loci; incomplete elements are indicated by stars. These three organizations were also checked by polymerase chain reaction amplification, end sequencing, and PFGE hybridization (Figure S2).
Distribution of single-allele genes between Pγ and Pε for sugar degradation and other transports
| Putative Function | Gene Name | Locus in Pγ | Locus in Pε |
|---|---|---|---|
| Sugar metabolism | |||
| Maltose permease | MALX1 | PISO0M16930g | |
| PISO0J21547g | |||
| PISO0M00166g | |||
| Maltase | MALX2 | PISO0M00188g | |
| PISO0M16886g | |||
| PISO0J03551g | |||
| PISO0J03441g | |||
| MAL activator | MALX3 | PISO0E02028g | |
| PISO0E02050g | |||
| PISO0M16974g | |||
| PISO0J35007g | |||
| Invertase | SUC2 | PISO0J03639g | |
| PISO0J03573g | |||
| Sorbitol dehydrogenase | SOR1 | PISO0N22123g | PISO0M21880g |
| PISO0E00180g | |||
| PISO0K00604g/ PISO0L00605 | |||
| Glutathione metabolism | |||
| 5-oxoprolinase | OXP1 | PISO0I02648g | PISO0J04431g |
| PISO0C10078g/PISO0D10145g | |||
| PISO0E04690g | |||
| PISO0J03595g | |||
| Allantoate transport | |||
| Allantoate permease | DAL5 | PISO0A12958g/ PISO0B13025g | |
| PISO0M24916g | |||
| PISO0M17480g | |||
| PISO0K23022g/ PISO0L23023g | |||
| PISO0J21503g | |||
| PISO0I08192g | PISO0J10019g | ||
| PISO0E04646g | |||
| Nicotinic acid transport | NTA1 | PISO0I02626g | PISO0J04409g |
| PISO0J03617g | |||
| PISO0J03485g | |||
| PISO0K00318g/ PISO0L00319g | |||
| PISO0K00406g/ PISO0L11407g | |||
| PISO0N15105g | PISO0M14708g | ||
Pseudogene.
Identical alleles located in homozygous regions.
Figure 5 Genomic reshapings identified since the separation of Pγ and Pε parents from their common ancestor to the current P. sorbitophila genome. The position of rearrangements within each part of the flowchart (before or after hybridization) does not reflect their chronology. (*) The 11,252 genes in P. sorbitophila comprise 3205 gene pairs coming from both parents, 2045 pairs coming from only the Pγ parent, 266 pairs coming from only the Pε parent, and 104 and 116 single-copy genes derived from Pγ and Pε, respectively.
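The gene tally in the (*) note can be verified with simple arithmetic. The Python sketch below uses only the counts quoted in the caption and in the genomic features table; the variable names are illustrative.

```python
# Gene pairs quoted in the caption.
pairs_both_parents = 3205
pairs_gamma_only = 2045
pairs_epsilon_only = 266
single_copy_genes = 116 + 104  # the two classes of single-copy genes

total_genes = 2 * (
    pairs_both_parents + pairs_gamma_only + pairs_epsilon_only
) + single_copy_genes

# The single-parent pairs match the homozygous gene totals in the
# genomic features table (4090 for Pγ, 532 for Pε).
assert 2 * pairs_gamma_only == 4090
assert 2 * pairs_epsilon_only == 532

# Whole-genome total and its haplotype equivalent from the same table.
assert total_genes == 11_252
assert total_genes // 2 == 5626
print(total_genes)  # 11252
```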