Wen Hu, Fang Suo, Li-Lin Du.
Abstract
Although the fission yeast Schizosaccharomyces pombe is a well-established model organism, studies of natural trait variations in this species remain limited. To assess the feasibility of segregant-pool-based mapping of phenotype-causing genes in natural strains of fission yeast, we investigated the cause of a maltose utilization defect (Mal(-)) of the S. pombe strain CBS5557 (originally known as Schizosaccharomyces malidevorans). Analyzing the genome sequence of CBS5557 revealed 955 nonconservative missense substitutions, and 61 potential loss-of-function variants including 47 frameshift indels, 13 early stop codons, and 1 splice site mutation. As a side benefit, our analysis confirmed 146 sequence errors in the reference genome and improved annotations of 27 genes. We applied bulk segregant analysis to map the causal locus of the Mal(-) phenotype. Through sequencing the segregant pools derived from a cross between CBS5557 and the laboratory strain, we located the locus to within a 2.23-Mb chromosome I inversion found in most S. pombe isolates including CBS5557. To map genes within the inversion region that occupies 18% of the genome, we created a laboratory strain containing the same inversion. Analyzing segregants from a cross between CBS5557 and the inversion-containing laboratory strain narrowed down the locus to a 200-kb interval and led us to identify agl1, which suffers a 5-bp deletion in CBS5557, as the causal gene. Interestingly, loss of agl1 through a 34-kb deletion underlies the Mal(-) phenotype of another S. pombe strain CGMCC2.1628. This work adapts and validates the bulk segregant analysis method for uncovering trait-gene relationship in natural fission yeast strains.
Keywords: Schizosaccharomyces pombe; maltose utilization; natural variation
Year: 2015 PMID: 26615217 PMCID: PMC4700965 DOI: 10.1093/gbe/evv238
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure. Sequence variants in the CBS5557 genome. (A) The two data analysis pipelines used for identifying the sequence variants. (B) Distribution of SNPs and indels in the CBS5557 genome. The numbers of SNPs (blue, left y axis) and indels (red, right y axis) in 5-kb windows are depicted as vertical bars. (C) The numbers of CBS5557 SNPs in 5-kb windows show a bimodal distribution, with about half of the windows containing more than ten SNPs and the other half containing far fewer SNPs. (D) The functional impact of CBS5557 sequence variants assessed using CooVar. (E) The LoF variants are enriched in the last 5% of the CDS.
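The windowed counts described in panels (B) and (C) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `count_per_window` and `fraction_dense` are hypothetical, and the input is assumed to be a plain list of 1-based variant coordinates on one chromosome.

```python
from collections import Counter

WINDOW = 5_000  # 5-kb windows, as in panel (B)


def count_per_window(positions, window=WINDOW):
    """Count variants per fixed-size window.

    positions: iterable of 1-based coordinates on one chromosome.
    Returns a dict mapping 0-based window index to the variant count.
    Windows without any variant are simply absent from the dict.
    """
    return dict(Counter((pos - 1) // window for pos in positions))


def fraction_dense(counts, n_windows, threshold=10):
    """Fraction of all windows holding more than `threshold` variants.

    n_windows is the total number of windows on the chromosome
    (an input the caller must supply; it is not derived here).
    """
    dense = sum(1 for c in counts.values() if c > threshold)
    return dense / n_windows
```

Applied to a real SNP list, `fraction_dense` would give the share of windows above ten SNPs, which panel (C) reports as about half.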
Figure. Constructing an artificial inversion on chromosome I of the laboratory strain. A schematic depicting the procedure of strain construction. See the Materials and Methods section for details. kanMX* denotes the promoter-less kanMX ORF that only became expressed after being placed immediately downstream of the Padh1 promoter by the inversion.
Table. The 27 Genes Whose Gene Structure Annotations Are Revised as a Result of Reference Sequence Indel Error Correction (see supplementary file S1, Supplementary Material online, for further details)
| Systematic ID | Gene Name | Chr. | Indel Position | Ref. Seq. | True Seq. | Annotation Change |
|---|---|---|---|---|---|---|
| SPAC1F8.07c | | I | 101871 | A | AG | A 2-bp intron becomes part of an exon |
| SPAC22F3.11c | snu23 | I | 682993 | TC | T | An intron becomes part of an exon |
| SPAC3A12.04c | rpp1 | I | 1424708 | CA | C | CDS is extended at 3’-end |
| SPAP27G11.10c | nup184 | I | 1625092 | T | TC | CDS is extended at 3’-end |
| SPAC17G8.01c | trl1 | I | 2343703 | G | GA | An intron becomes part of an exon |
| SPAC823.04 | rrp36 | I | 2588021 | C | CA | A 2-bp intron becomes part of an exon |
| | | | 2588066 | C | CA | An intron becomes part of an exon |
| SPAC688.08 | srb8 | I | 3125118 | A | AT | An intron becomes part of an exon |
| SPAC1486.05 | nup189 | I | 3197528 | A | AG | An intron becomes part of an exon |
| SPAC3A11.09 | sod22 | I | 3450130 | GT | G | CDS is extended at 3’-end |
| SPAC3A11.06 | mvp1 | I | 3460318 | T | TC | One boundary of an intron is moved |
| SPAC1071.01c | pta1 | I | 3855790 | GT | G | CDS is extended at 3’-end |
| SPAC29E6.03c | uso1 | I | 4407494 | T | TG | An intron becomes part of an exon |
| SPAC29E6.04 | nnf1 | I | 4410191 | CG | C | A 1-bp intron no longer exists |
| SPAC29A4.03c | | I | 5142627 | A | AG | A 2-bp intron becomes part of an exon |
| SPAC4D7.09 | tif223 | I | 5368262 | TC | T | Three amino acids are altered |
| | | | 5368273 | G | GT | |
| SPBC16E9.16c | lsd90 | II | 1948953 | GA | G | A 1-bp intron no longer exists |
| | | | 1950050 | A | AG | A 2-bp intron becomes part of an exon |
| SPBC1E8.03c | | II | 1960392 | A | AG | CDS is extended at 3’-end |
| SPBC1A4.06c | tam41 | II | 1987101 | CG | C | A 2-bp intron becomes part of an exon |
| | | | 1987117 | TG | T | |
| SPBC29A3.06 | utp18 | II | 2049891 | AT | A | A 1-bp intron no longer exists |
| SPBC29A3.08 | pof4 | II | 2053516 | G | GC | A 1-bp intron becomes part of an exon |
| SPBC23G7.06c | | II | 2108180 | T | TA | A 2-bp intron becomes part of an exon |
| SPBC14C8.09c | dbl3 | II | 2219928 | A | AT | A 2-bp intron becomes part of an exon |
| SPBC4F6.10 | vps901 | II | 2709414 | G | GC | A 2-bp intron becomes part of an exon |
| SPBC32F12.08c | duo1 | II | 2798040 | CT | C | CDS is extended at 3’-end |
| SPBC13E7.01 | cwf22 | II | 3040332 | C | CG | A 2-bp intron becomes part of an exon |
| SPBC16D10.10 | tad2 | II | 3619003 | A | AG | Both boundaries of an intron are moved |
| SPCC1442.04c | | III | 1774235 | T | TGATC | A 2-bp intron becomes part of an exon |
a. Frameshifts have been reported by Matsuyama et al. (2006).
b. Amino acid sequence of the gene product is unchanged by the proposed gene structure revision.
c. Frameshift has been reported by Hayashi et al. (2006).
d. Frameshifts have been reported by Yokoyama et al. (2008).
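The "Ref. Seq." and "True Seq." columns appear to follow the common anchored-indel convention, in which both strings share their first base and the length difference gives the indel size. Under that assumption, a small Python sketch can classify each row. The function name `classify_indel` is hypothetical and the reading of the columns is an interpretation, not something the table states.

```python
def classify_indel(ref_seq, true_seq):
    """Classify a reference/true sequence pair from the table.

    Assumes both strings are anchored on a shared first base, so the
    length difference equals the number of bases missing from
    ("insertion") or extra in ("deletion") the reference sequence.
    Returns a (kind, length_in_bp) tuple.
    """
    delta = len(true_seq) - len(ref_seq)
    if delta > 0:
        return ("insertion", delta)   # true sequence is longer than the reference
    if delta < 0:
        return ("deletion", -delta)   # true sequence is shorter than the reference
    return ("substitution", 0)        # same length: not an indel


def shifts_frame(ref_seq, true_seq):
    """True if correcting the indel changes the reading frame."""
    return (len(true_seq) - len(ref_seq)) % 3 != 0
```

For example, the SPAC1F8.07c row (A to AG) would be read as a 1-bp insertion relative to the reference, and the SPAC22F3.11c row (TC to T) as a 1-bp deletion.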
Figure. BSA on the cross between DY5945 and LD775. Scatter plots depict the differences of reference allele frequencies between the Mal+ pool and the Mal- pool at SNP positions. The allele frequency differences are expected to be around 0 in most regions of the genome and reach 1 at the Mal- trait locus. Local regression lines are displayed to better visualize the trend. Two dashed vertical lines mark the boundaries of the 2.23-Mb inversion. Black triangles mark the positions of centromeres.
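The quantity plotted here, the difference in reference allele frequency between the two pools, can be sketched in Python as follows. This is an illustration under assumptions, not the authors' code: the function names are hypothetical, and each pool is assumed to be summarized at a SNP by a `(ref_reads, alt_reads)` pair of read counts.

```python
def ref_allele_freq(ref_reads, alt_reads):
    """Reference allele frequency at one SNP, or None without coverage."""
    total = ref_reads + alt_reads
    if total == 0:
        return None
    return ref_reads / total


def freq_difference(mal_plus, mal_minus):
    """Reference allele frequency in the Mal+ pool minus the Mal- pool.

    mal_plus, mal_minus: (ref_reads, alt_reads) tuples at one SNP.
    Returns None if either pool has no reads at that position.
    """
    f_plus = ref_allele_freq(*mal_plus)
    f_minus = ref_allele_freq(*mal_minus)
    if f_plus is None or f_minus is None:
        return None
    return f_plus - f_minus
```

At an unlinked SNP both pools should carry the reference allele in about half of the reads, giving a difference near 0. At the trait locus the Mal+ pool should carry only the reference allele and the Mal- pool none, giving a difference of 1, as the caption describes.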
Figure. BSA on the cross between DY5945 and DY8531. (A) Scatter plots depicting the allele frequency differences are drawn as in figure 2. (B) Scatter plot of chromosome I is redrawn, so that the genomic coordinates of data points within the inversion region are adjusted to match the inversion. (C) Close-up views of the linkage peak summit region (coordinates 2684001–3164000 of chromosome I). The scatter plot was drawn as in (A). The green line marks the position of the agl1 gene. The histogram depicts average allele frequency differences in 40-kb bins. The five bins with average allele frequency differences >0.8 are highlighted as light blue bars.
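Two steps in this caption lend themselves to a short Python sketch: the coordinate adjustment in (B) and the 40-kb binning in (C). Both functions below are hypothetical illustrations. The adjustment is assumed to be a simple mirroring of positions between the two inversion breakpoints, whose coordinates are not given here and are therefore left as parameters.

```python
def mirror_within_inversion(pos, left, right):
    """Map a reference coordinate onto the inverted arrangement.

    left, right: inclusive breakpoint coordinates (caller-supplied;
    the actual breakpoints are not stated in the caption).
    Positions outside the inversion are returned unchanged.
    """
    if left <= pos <= right:
        return left + right - pos
    return pos


def bin_means(points, start, end, bin_size=40_000):
    """Average values in consecutive fixed-size bins over [start, end].

    points: iterable of (position, value) pairs.
    Returns a list of (bin_start, bin_end, mean) tuples; mean is None
    for bins that contain no data point.
    """
    n_bins = (end - start + bin_size) // bin_size  # ceiling division
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for pos, value in points:
        if start <= pos <= end:
            i = (pos - start) // bin_size
            sums[i] += value
            counts[i] += 1
    result = []
    for i in range(n_bins):
        b_start = start + i * bin_size
        b_end = min(b_start + bin_size - 1, end)
        mean = sums[i] / counts[i] if counts[i] else None
        result.append((b_start, b_end, mean))
    return result
```

With `start=2684001` and `end=3164000`, the region splits into twelve 40-kb bins. Five bins above the 0.8 cutoff span 200 kb, which is consistent with the 200-kb interval reported in the abstract.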
Figure. agl1 is the causal gene of the Mal- phenotype of CBS5557 and CGMCC2.1628. (A) Illumina sequencing reveals a 5-bp deletion in the CDS of agl1 in the CBS5557 genome. (B) PCR and Sanger sequencing confirm the 5-bp deletion in the CDS of agl1 in the CBS5557 genome. The sequences are in a reverse complement orientation relative to that of the reference genome. (C) The Mal- phenotype of CBS5557 can be rescued by a plasmid expressing the laboratory strain version of agl1. The parental strain and transformant were streaked on agar plates containing glucose or maltose as the carbon source. To avoid the rescue of Mal- phenotype by the diffusion of extracellular glucose generated by nearby Mal+ colonies, gaps were created between sectors streaked with different strains by cutting out agar slices. (D) Two strains deposited in the culture collection under the name Schizosaccharomyces malidevorans, CGMCC2.1621 and CGMCC2.1628, are Mal- Schizosaccharomyces pombe strains, and their Mal- phenotype can be rescued by a plasmid expressing the laboratory strain version of agl1. The phenotype analysis was performed as in (C). (E) A 34-kb chromosome I region containing agl1 is deleted in CGMCC2.1628. The deletion breakpoints are denoted by vertical dashed lines. (F) A schematic depicting a possible scenario of how the 34-kb deletion in CGMCC2.1628 may have formed.