Minho Lee, Shin-Jung Choi, Sangjo Han, Miyoung Nam, Dongsup Kim, Dong-Uk Kim, Kwang-Lae Hoe.
Abstract
Incorporation of unique barcodes into fission yeast gene deletion collections has enabled the identification of gene functions by growth fitness analysis. For fine tuning, it is important to examine barcode sequences, because mutations arise during strain construction. Out of 8,708 barcodes (4,354 strains) covering 88.5% of all 4,919 open reading frames, 7,734 barcodes (88.8%) were validated by Sanger sequencing as high-fidelity barcodes inserted at the correct positions. Sequence examination of the 7,734 high-fidelity barcodes revealed that 1,039 barcodes (13.4%) deviated from the original design. In total, 1,284 mutations (mutation rate of 16.6%) exist within the 1,039 mutated barcodes, which is comparable to budding yeast (18%). By type of mutation, substitutions accounted for 845 mutations (10.9%), deletions for 319 mutations (4.1%), and insertions for 121 mutations (1.6%). Notably, the frequency of substitutions (67.6%) was higher than in budding yeast (∼28%) and well above the predicted error of Sanger sequencing (∼2%); these substitutions might have arisen during the solid-phase oligonucleotide synthesis and PCR amplification of the barcodes during strain construction. When the mutation rate was analyzed by position within 20-mer barcodes using the 1,284 mutations from the 7,734 sequenced barcodes, there was no significant difference between up-tags and down-tags at a given position. The mutation frequency was similar at most positions, ranging from 0.4% (32/7,734) to 1.1% (82/7,734), except at position 1, where it was highest (3.1%), as in budding yeast. Together, well-defined barcode sequences, combined with the next-generation sequencing platform, promise to make the fission yeast gene deletion library a powerful tool for understanding gene function.
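The headline percentages in the abstract follow directly from the quoted counts. The short Python sketch below recomputes them; the variable names and the `pct` helper are illustrative choices, not part of the original study.

```python
# Counts quoted in the abstract.
total_orfs = 4919          # open reading frames in fission yeast
strains_sequenced = 4354   # strains whose barcodes were sequenced
barcodes_sequenced = 8708  # two barcodes per strain
barcodes_confirmed = 7734  # high-fidelity barcodes
barcodes_mutated = 1039    # barcodes deviating from the design
mutations_total = 1284     # individual mutations in those barcodes


def pct(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


coverage = pct(strains_sequenced, total_orfs)             # ORF coverage
confirmed = pct(barcodes_confirmed, barcodes_sequenced)   # validated barcodes
mutated = pct(barcodes_mutated, barcodes_confirmed)       # mutated barcodes
mutation_rate = pct(mutations_total, barcodes_confirmed)  # mutations per 100 barcodes

print(coverage, confirmed, mutated, mutation_rate)
```

Note that the mutation rate (16.6%) uses the number of mutations as the numerator, while the share of mutated barcodes (13.4%) uses the number of barcodes, which is why the two differ.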
Keywords: barcode; fission yeast; gene deletion; growth fitness; mutation
Year: 2018 PMID: 30304922 PMCID: PMC6187811 DOI: 10.5808/GI.2018.16.2.22
Source DB: PubMed Journal: Genomics Inform ISSN: 1598-866X
Fig. 1. Construction overview of fission yeast gene deletion strains. (A) Structure of the deletion cassette consisting of the KanMX module and its flanking RHG (Region of Homology to the Gene of interest). The KanMX4 module is flanked by unique barcodes and RHG on both sides. The deletion cassette replaces the open reading frame of interest by homologous recombination. The length of the RHG ranges from 80 bp to 250‒450 bp, depending on the construction method. (B) Serial PCR method. For the serial PCR method, four rounds of PCR were performed overall. The first PCR prepares the KanMX module. Through another three rounds of PCR, the 80-mer RHG was prepared for homologous recombination. (C) Block PCR method. For the block PCR method, four rounds of PCR were performed overall, as with the serial PCR method. The first PCR is the same as in the serial PCR method and prepares the KanMX module. Two separate PCRs using 51-mer and 25-mer primers and chromosomal DNA as templates were then performed to prepare a pair of RHG blocks flanking the KanMX module. Finally, the three separate blocks were combined by block PCR. (D) Strategy of colony PCR. For Sanger sequencing, barcode regions were PCR-amplified as indicated. Note that the cp5 and cp3 primers are gene-specific primers located about 500 bp from the homologous recombination site of each gene deletion cassette. For the sequences of the primers used, refer to “Methods.”
Table 1. Validation of sequenced barcodes
| No. of strains/barcodes (%) | Block-PCR | Serial-PCR | Gene-synthesis | Total |
|---|---|---|---|---|
| Entire ORF in fission yeast | | | | 4,919/9,838 (100.0) |
| Under strain construction | - | - | 565/1,300 (11.5) | 565/1,300 (11.5) |
| Used for barcode sequencing | 2,886/5,772 (58.7) | 1,468/2,936 (29.8) | - | 4,354/8,708 (88.5) |
| Used for barcode sequencing | 2,886/5,772 (100.0) | 1,468/2,936 (100.0) | - | 4,354/8,708 (100.0) |
| Confirmed on both sides | 2,531/5,062 (87.7) | 1,336/2,672 (91.0) | - | 3,867/7,734 (88.8) |
| Not confirmed on both sides | 355/710 (12.3) | 132/264 (9.0) | - | 487/974 (11.2) |
ORF, open reading frame.
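Because each deletion strain carries one up-tag and one down-tag, every "strains/barcodes" pair in this table should satisfy barcodes = 2 × strains. The sketch below checks the "Total" column under that assumption; the dictionary layout is an illustrative choice, and the check only reports rows that do not fit without deciding which number is the intended one.

```python
# "Total" column of the validation table as (strains, barcodes) pairs.
rows = {
    "Entire ORF in fission yeast": (4919, 9838),
    "Under strain construction": (565, 1300),
    "Used for barcode sequencing": (4354, 8708),
    "Confirmed on both sides": (3867, 7734),
    "Not confirmed on both sides": (487, 974),
}

# Assumption: two barcodes (up-tag and down-tag) per strain.
mismatches = {
    label: (strains, barcodes, 2 * strains)
    for label, (strains, barcodes) in rows.items()
    if barcodes != 2 * strains
}

for label, (strains, barcodes, expected) in mismatches.items():
    print(f"{label}: {strains} strains, {barcodes} barcodes listed, {expected} expected")
```

With the values as printed, only the "Under strain construction" row is flagged (565 strains would correspond to 1,130 barcodes rather than 1,300).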
Table 2. Overview of defects in barcodes and mutations
| No. of defects (%) | Block-PCR | Serial-PCR | Total |
|---|---|---|---|
| Barcodes confirmed by sequencing | 5,062 (100.0) | 2,672 (100.0) | 7,734 (100.0) |
| Barcodes having mutation(s) | 756 (14.9) | 283 (10.6) | 1,039 (13.4) |
| Barcodes without mutation(s) | 4,306 (85.1) | 2,389 (89.4) | 6,695 (86.6) |
| No mutation on both barcodes | 3,696 (73.0) | 2,144 (80.2) | 5,840 (75.5) |
| No mutation on either barcode | 610 (12.1) | 245 (9.2) | 855 (11.1) |
| Barcodes having mutation(s) | 756 (14.9) | 283 (10.6) | 1,039 (13.4) |
| Mutation(s) on either barcode | 610 (12.1) | 245 (9.2) | 855 (11.0) |
| Mutation(s) on both barcodes | 146 (2.8) | 38 (1.4) | 184 (2.4) |
| Mutations within barcode tags | 931 (18.4) | 353 (13.2) | 1,284 (16.6) |
| Mutation(s) in UT | 475 (9.4) | 199 (7.4) | 674 (8.7) |
| Mutation(s) in DT | 456 (9.0) | 154 (5.8) | 610 (7.9) |
| Type of mutations within barcode tags | | | |
| Substitution | 654 (12.9) | 191 (7.1) | 845 (10.9) |
| Deletion | 187 (3.7) | 132 (4.9) | 319 (4.1) |
| Insertion | 90 (1.8) | 31 (1.2) | 121 (1.6) |
UT, up-tag; DT, down-tag.
The number of mutations within the barcodes (1,284) is more than the number of defective barcodes (1,039) because of multiple mutations in a single barcode.
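The relation between mutated barcodes and mutations can be made concrete with the "Total" column of the overview table. The sketch below is a plain arithmetic check with illustrative variable names; it computes the mean number of mutations per mutated barcode and compares the subtotals against the stated total of 1,284.

```python
# "Total" column of the overview table.
barcodes_mutated = 1039
mutations_total = 1284
mutations_ut = 674   # mutations in up-tags
mutations_dt = 610   # mutations in down-tags
by_type = {"substitution": 845, "deletion": 319, "insertion": 121}

# Mean number of mutations carried by a mutated barcode.
mean_per_mutated = mutations_total / barcodes_mutated

# Subtotals compared with the stated total.
ut_dt_sum = mutations_ut + mutations_dt
type_sum = sum(by_type.values())
type_gap = type_sum - mutations_total

print(round(mean_per_mutated, 2), ut_dt_sum, type_sum, type_gap)
```

A mean of about 1.24 mutations per mutated barcode is what makes the mutation count exceed the barcode count. The up-tag and down-tag counts add up to 1,284 exactly, whereas the three type counts as printed add up to 1,285, one more than the stated total.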
Table 3. Types of defects in 1,039 barcodes having mutation(s)
No. of barcodes (%). Column groups: Block-PCR (n = 756, 72.8%) with either tag (n = 610, 58.7%) or both tags (n = 146, 14.1%); Serial-PCR (n = 283, 27.2%) with either tag (n = 245, 23.6%) or both tags (n = 38, 3.7%); Total (n = 1,039) with either tag (n = 855, 82.3%) or both tags (n = 184, 17.7%).

| Type of defect | Block, either, UT (n = 316) | Block, either, DT (n = 294) | Block, both, UT (n = 73) | Block, both, DT (n = 73) | Serial, either, UT (n = 140) | Serial, either, DT (n = 105) | Serial, both, UT (n = 19) | Serial, both, DT (n = 19) | Total, either, UT (n = 456) | Total, either, DT (n = 399) | Total, both, UT (n = 92) | Total, both, DT (n = 92) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Substitution (n = 702, 67.6%) | 233 | 198 | 55 | 55 | 81 | 59 | 10 | 11 | 314 | 257 | 65 | 66 |
| 1 bp | 215 | 176 | 50 | 51 | 75 | 55 | 9 | 9 | 290 | 231 | 59 | 60 |
| 2 bp (in a row) | 3 | 2 | 0 | 1 | 1 | 0 | 0 | 0 | 4 | 2 | 0 | 1 |
| 2 bp (1 + 1) | 11 | 18 | 3 | 3 | 4 | 4 | 1 | 2 | 15 | 22 | 4 | 5 |
| 3 bp (in a row) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 bp (1 + 1 + 1) | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| 3 bp (1 + 2) | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 4 bp (in a row) | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| 4 bp (1 + 1 + 1 + 1) | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 4 bp (1 + 2 + 1) | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 |
| 4 bp (2 + 2) | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| Deletion (n = 177, 17.0%) | 36 | 48 | 5 | 11 | 35 | 30 | 7 | 5 | 71 | 78 | 12 | 16 |
| 1 bp | 31 | 41 | 5 | 9 | 29 | 22 | 5 | 4 | 60 | 63 | 10 | 13 |
| 2 bp (in a row) | 3 | 4 | 0 | 1 | 0 | 3 | 1 | 0 | 3 | 7 | 1 | 1 |
| 2 bp (1 + 1) | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 3 bp (in a row) | 1 | 1 | 0 | 0 | 5 | 3 | 1 | 1 | 6 | 4 | 1 | 1 |
| 3 bp (1 + 1 + 1) | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 4 bp (in a row) | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 3 | 0 | 0 |
| 5 bp (in a row) | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 6 bp (in a row) | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| Insertion (n = 79, 7.6%) | 19 | 28 | 6 | 4 | 9 | 10 | 0 | 3 | 28 | 38 | 6 | 7 |
| 1 bp | 18 | 27 | 6 | 4 | 9 | 10 | 0 | 3 | 27 | 37 | 6 | 7 |
| 2 bp (in a row) | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| Complex (n = 55, 5.3%) | 17 | 14 | 6 | 2 | 11 | 3 | 2 | 0 | 28 | 17 | 8 | 2 |
| S1 + I1 | 6 | 4 | 2 | 1 | 2 | 2 | 0 | 0 | 8 | 6 | 2 | 1 |
| S1 + D1 | 6 | 7 | 0 | 0 | 4 | 1 | 0 | 0 | 10 | 8 | 0 | 0 |
| I1 + D1 | 5 | 3 | 4 | 1 | 5 | 0 | 2 | 0 | 10 | 3 | 6 | 1 |
| Unclassified (n = 26, 2.5%) | 11 | 6 | 1 | 1 | 4 | 3 | 0 | 0 | 15 | 9 | 1 | 1 |
UT, up-tag; DT, down-tag; S1, 1-bp substitution; I1, 1-bp insertion; D1, 1-bp deletion.
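The defect categories in this table (substitution, deletion, insertion, complex) can be illustrated by comparing a designed 20-mer with its sequenced counterpart. The sketch below is not the authors' procedure, which is not described in this excerpt: it uses Python's generic `difflib.SequenceMatcher` as a stand-in for a proper sequence aligner, and the function name `classify_barcode` and the example sequences are hypothetical.

```python
from difflib import SequenceMatcher


def classify_barcode(designed, observed):
    """Label the difference between a designed and an observed barcode.

    Returns (label, counts), where counts holds the number of substituted,
    deleted, and inserted bases. Uses a generic diff, not a true alignment.
    """
    counts = {"substitution": 0, "deletion": 0, "insertion": 0}
    matcher = SequenceMatcher(None, designed, observed, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            # Overlapping length counts as substitutions, the rest as indels.
            shared = min(i2 - i1, j2 - j1)
            counts["substitution"] += shared
            counts["deletion"] += (i2 - i1) - shared
            counts["insertion"] += (j2 - j1) - shared
        elif tag == "delete":
            counts["deletion"] += i2 - i1
        elif tag == "insert":
            counts["insertion"] += j2 - j1
    kinds = [kind for kind, n in counts.items() if n > 0]
    if not kinds:
        label = "none"
    elif len(kinds) == 1:
        label = kinds[0]
    else:
        label = "complex"  # more than one kind of mutation in one barcode
    return label, counts


designed = "ATGCCGTAAGCTTGACCTGA"  # hypothetical 20-mer design
print(classify_barcode(designed, "ATGCTGTAAGCTTGACCTGA"))   # one base changed
print(classify_barcode(designed, "ATGCCGTAACTTGACCTGA"))    # one base missing
print(classify_barcode(designed, "ATGTCCGTAAGCTTGACCTGA"))  # one base added
```

For real data, a dedicated pairwise aligner would be a safer choice, since a generic diff can split a single event differently in repetitive sequence.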
Table 4. Statistics of 1,284 mutations by position within 20-mer barcodes
Columns 1 to 20 are base positions within the barcode.

| PCR type | Mutation | Tag | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | Sum |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Block | Substitution (n = 654, 50.9%) | UT | 112 | 11 | 15 | 11 | 22 | 12 | 12 | 25 | 14 | 13 | 10 | 9 | 5 | 10 | 17 | 13 | 10 | 5 | 8 | 17 | 351 |
| | | DT | 82 | 14 | 17 | 16 | 9 | 10 | 11 | 14 | 13 | 8 | 13 | 11 | 10 | 13 | 12 | 11 | 7 | 6 | 9 | 17 | 303 |
| | Deletion (n = 187, 14.6%) | UT | 6 | 8 | 11 | 2 | 2 | 4 | 5 | 3 | 4 | 5 | 3 | 3 | 2 | 3 | 3 | 1 | 3 | 0 | 2 | 8 | 78 |
| | | DT | 12 | 16 | 12 | 4 | 4 | 5 | 6 | 9 | 5 | 4 | 7 | 3 | 6 | 3 | 5 | 3 | 1 | 0 | 1 | 3 | 109 |
| | Insertion (n = 90, 7.0%) | UT | 5 | 2 | 5 | 2 | 3 | 1 | 2 | 0 | 3 | 1 | 4 | 2 | 1 | 3 | 4 | 1 | 5 | 0 | 1 | 1 | 46 |
| | | DT | 0 | 2 | 2 | 2 | 4 | 4 | 2 | 1 | 4 | 1 | 2 | 3 | 4 | 1 | 3 | 2 | 2 | 3 | 2 | 0 | 44 |
| | Sum | | 217 | 53 | 62 | 37 | 44 | 36 | 38 | 52 | 43 | 32 | 39 | 31 | 28 | 33 | 44 | 31 | 28 | 14 | 23 | 46 | 931 |
| Serial | Substitution (n = 191, 14.9%) | UT | 7 | 6 | 8 | 8 | 4 | 7 | 4 | 5 | 5 | 9 | 7 | 5 | 8 | 4 | 4 | 1 | 4 | 6 | 3 | 6 | 111 |
| | | DT | 4 | 2 | 4 | 5 | 6 | 5 | 5 | 4 | 1 | 0 | 3 | 1 | 4 | 6 | 3 | 4 | 5 | 9 | 3 | 6 | 80 |
| | Deletion (n = 132, 10.3%) | UT | 7 | 3 | 5 | 8 | 3 | 7 | 3 | 5 | 5 | 3 | 2 | 8 | 0 | 2 | 1 | 2 | 1 | 1 | 2 | 4 | 72 |
| | | DT | 4 | 1 | 2 | 3 | 6 | 4 | 3 | 4 | 7 | 3 | 5 | 5 | 1 | 1 | 3 | 2 | 2 | 2 | 1 | 0 | 59 |
| | Insertion (n = 31, 2.4%) | UT | 1 | 5 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 2 | 1 | 2 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 16 |
| | | DT | 1 | 1 | 1 | 0 | 0 | 2 | 0 | 1 | 0 | 0 | 2 | 3 | 0 | 1 | 3 | 0 | 0 | 0 | 0 | 0 | 15 |
| | Sum | | 24 | 18 | 20 | 24 | 19 | 25 | 16 | 19 | 19 | 16 | 21 | 23 | 15 | 14 | 15 | 9 | 12 | 18 | 10 | 16 | 353 |
| Total | | | 241 | 71 | 82 | 61 | 63 | 61 | 54 | 71 | 62 | 48 | 60 | 54 | 43 | 47 | 59 | 40 | 40 | 32 | 33 | 62 | 1,284 |
UT, up-tag; DT, down-tag.
The number of mutations within the barcodes (1,284) is more than the number of defective barcodes (1,039) because of multiple mutations in a single barcode.
Fig. 2. Occurrence of sequence variation by position within tags. The results summarize the 1,284 nucleotide sequence mutations in 7,734 Sanger-sequenced up-tags (dark grey) and down-tags (light grey) at each position. The mutation frequency in the 20-mer barcode is similar at most positions, ranging from 0.4% (32/7,734) to 1.1% (82/7,734), except at position 1, where it is highest (3.1%). The frequency tends to increase from position 20 toward position 1. For detailed statistics, refer to Table 4.
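The per-position frequencies quoted in this caption can be recomputed from the per-position totals of the position table. The sketch below copies those totals (positions 1 to 20) and divides each by the 7,734 sequenced barcodes, as the caption does; the variable names are illustrative.

```python
# Per-position totals (positions 1..20) from the position table.
totals = [241, 71, 82, 61, 63, 61, 54, 71, 62, 48,
          60, 54, 43, 47, 59, 40, 40, 32, 33, 62]
barcodes_confirmed = 7734

# Mutation frequency per position, in percent, one decimal place.
freq = [round(100.0 * n / barcodes_confirmed, 1) for n in totals]

position1 = freq[0]            # position 1 stands out
lowest_other = min(freq[1:])   # lowest among positions 2..20
highest_other = max(freq[1:])  # highest among positions 2..20

print(sum(totals), position1, lowest_other, highest_other)
```

The totals add up to 1,284, position 1 comes out at 3.1%, and the remaining positions span 0.4% (position 18, 32 mutations) to 1.1% (position 3, 82 mutations).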