Yul-Kyun Ahn1, Abinaya Manivannan2, Sandeep Karna2, Tae-Hwan Jun3, Eun-Young Yang2, Sena Choi2, Jin-Hee Kim2, Do-Sun Kim2, Eun-Su Lee2.
Abstract
This study reports the genome-wide identification of single-nucleotide polymorphism (SNP) markers related to powdery mildew (PM) resistance in two pepper varieties. Capsicum baccatum (PRH1, a PM-resistant line) and Capsicum annuum (Saengryeg, a PM-susceptible line) were resequenced to develop SNP markers. A total of 6,213,009 and 6,804,889 SNPs were discovered for PRH1 and Saengryeg, respectively. The majority of the SNPs were classified as homozygous, particularly in the resistant line. Moreover, the SNPs were differentially distributed among the chromosomes in both the resistant and susceptible lines. In total, 4,887,031 polymorphic SNP loci were identified between the two lines, and 306,871 high-resolution melting (HRM) marker primer sets were designed. To identify SNPs associated with genes involved in disease resistance and stress-associated processes, a chromosome-wise gene ontology analysis was performed. The results showed that SNPs related to disease resistance genes were predominantly located on chromosome 4. In addition, 6,281 SNPs associated with 46 resistance genes were identified. Of the two lines, PRH1 had the larger number of polymorphic SNPs related to NBS-LRR genes. The SNP markers were validated using an HRM assay in 45 F4 populations and correlated with the phenotypic disease index.
Year: 2018 PMID: 29581444 PMCID: PMC5980001 DOI: 10.1038/s41598-018-23279-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Summary of sequencing, sequence pre-processing and alignment of reads to the reference genome.
| Sample | Read parameters | PRH1 | Saengryeg |
|---|---|---|---|
| Raw read data | No. of reads | 130,370,103 | 118,588,231 |
| | | 130,370,103 | 118,588,231 |
| | Avg. length (bp) | 151 | 151 |
| | | 151 | 151 |
| | Total length (Gb) | 19.69 | 17.91 |
| | | 19.69 | 17.91 |
| | Genome coverage# | ≈11.31X | ≈10.29X |
| Cleaned data | No. of reads | 97,261,537 | 88,964,871 |
| | | 97,261,537 | 88,964,871 |
| | Avg. length (bp) | 120 | 121 |
| | | 81 | 81 |
| | Total length (Gb) | 11.69 | 10.79 |
| | | 7.83 | 7.21 |
| | Trimmed/raw* (%) | 59.41 | 60.24 |
| | | 39.78 | 40.25 |
| | Genome coverage# | ≈5.61X | ≈5.71X |
| Read mapping | No. of total reads | 194,523,074 | 177,929,742 |
| | No. of mapped reads (%) | 88,448,386 (45.47) | 1,080,500,795 (39.24) |
| | Mapped region** (%) | 1,080,500,765 (39.24) | 2,514,912,154 (91.34) |
*Trimmed/raw: total length of trimmed read / total length of raw read.
#Genome coverage: Total length of all reads divided by reference genome size (3.48 Gb).
**Mapped region: Coverage of read mapping relative to the reference genome.
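The two footnote definitions can be reproduced with a few lines of arithmetic. The sketch below is an illustration only, not the authors' pipeline. It assumes that the two rows listed for each cleaned-data parameter are the two mates of the paired-end reads, which the table does not state explicitly, and the function names are hypothetical.

```python
REFERENCE_GENOME_GB = 3.48  # reference genome size given in the table footnote


def trimmed_over_raw(trimmed_gb: float, raw_gb: float) -> float:
    """Trimmed/raw: total trimmed read length as a percentage of raw read length."""
    return 100.0 * trimmed_gb / raw_gb


def genome_coverage(total_read_length_gb: float,
                    genome_gb: float = REFERENCE_GENOME_GB) -> float:
    """Genome coverage: total length of all reads divided by the genome size."""
    return total_read_length_gb / genome_gb


# PRH1 values copied from the table (Gb).
prh1_raw_gb = 19.69
# Assumption: the two cleaned-data rows are the two mates of each read pair.
prh1_clean_gb = (11.69, 7.83)

prh1_ratio = trimmed_over_raw(prh1_clean_gb[0], prh1_raw_gb)
prh1_coverage = genome_coverage(sum(prh1_clean_gb))

print(f"PRH1 trimmed/raw (first row): {prh1_ratio:.2f} %")
print(f"PRH1 cleaned-data coverage:   {prh1_coverage:.2f}X")
```

Because the table reports lengths rounded to two decimals in Gb, the recomputed ratio can differ from the tabulated value in the second decimal place.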
Figure 1. SNP distribution observed per 1 Mb of chromosome. Distribution of SNPs detected by resequencing of the pepper varieties along the 12 chromosomes. The x-axis denotes chromosome length (Mb) and the y-axis represents the number of SNPs.
Distribution of SNPs in the chromosomes of PRH1 and Saengryeg.
| Chromosome No. | PRH1 homozygous | PRH1 heterozygous | Saengryeg homozygous | Saengryeg heterozygous |
|---|---|---|---|---|
| 1 | 601,032 | 23,932 | 692,326 | 9,977 |
| 2 | 400,513 | 15,849 | 388,770 | 8,428 |
| 3 | 557,185 | 18,211 | 466,439 | 11,441 |
| 4 | 420,713 | 20,654 | 296,183 | 9,345 |
| 5 | 405,744 | 21,116 | 577,304 | 12,761 |
| 6 | 448,886 | 20,823 | 516,710 | 10,563 |
| 7 | 552,120 | 17,025 | 620,306 | 14,005 |
| 8 | 304,395 | 11,915 | 153,518 | 7,229 |
| 9 | 384,009 | 16,739 | 890,135 | 9,787 |
| 10 | 460,912 | 23,517 | 1,096,754 | 10,236 |
| 11 | 499,176 | 14,391 | 467,193 | 10,278 |
| 12 | 469,523 | 22,456 | 301,556 | 15,942 |
| Total | 5,504,208 | 226,628 | 6,467,194 | 129,992 |
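As a quick consistency check, the per-chromosome counts can be summed and compared with the Total row, and the heterozygous share of each line can be derived from those totals. This is a minimal sketch using the values from the table above; the helper names are hypothetical and the share is computed over homozygous plus heterozygous SNPs only.

```python
# Per-chromosome SNP counts copied from the table above:
# chromosome -> (PRH1 homozygous, PRH1 heterozygous,
#                Saengryeg homozygous, Saengryeg heterozygous)
SNP_COUNTS = {
    1: (601_032, 23_932, 692_326, 9_977),
    2: (400_513, 15_849, 388_770, 8_428),
    3: (557_185, 18_211, 466_439, 11_441),
    4: (420_713, 20_654, 296_183, 9_345),
    5: (405_744, 21_116, 577_304, 12_761),
    6: (448_886, 20_823, 516_710, 10_563),
    7: (552_120, 17_025, 620_306, 14_005),
    8: (304_395, 11_915, 153_518, 7_229),
    9: (384_009, 16_739, 890_135, 9_787),
    10: (460_912, 23_517, 1_096_754, 10_236),
    11: (499_176, 14_391, 467_193, 10_278),
    12: (469_523, 22_456, 301_556, 15_942),
}


def column_totals(counts):
    """Sum each of the four columns over all chromosomes."""
    return tuple(sum(column) for column in zip(*counts.values()))


def heterozygous_share(homozygous, heterozygous):
    """Heterozygous SNPs as a percentage of homozygous + heterozygous SNPs."""
    return 100.0 * heterozygous / (homozygous + heterozygous)


totals = column_totals(SNP_COUNTS)
prh1_share = heterozygous_share(totals[0], totals[1])
saengryeg_share = heterozygous_share(totals[2], totals[3])

print("Column totals:", totals)
print(f"Heterozygous share: PRH1 {prh1_share:.2f} %, Saengryeg {saengryeg_share:.2f} %")
```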
Summary of SNP classification by genome structure.
| Sample | Total no. of SNPs | Region | Total | Homozygous | Heterozygous | Other |
|---|---|---|---|---|---|---|
| PRH1 | 6,213,009 | Introns | 280,076 | 258,904 | 7,824 | 13,348 |
| | | CDS | 150,932 | 133,436 | 7,025 | 10,471 |
| | | Genic region | 431,058 | 392,383 | 14,851 | 23,824 |
| | | Intergenic region | 5,781,951 | 5,111,825 | 211,777 | 458,349 |
| Saengryeg | 6,804,889 | Introns | 69,542 | 64,994 | 1,938 | 2,610 |
| | | CDS | 39,955 | 34,212 | 2,449 | 3,294 |
| | | Genic region | 109,504 | 99,210 | 4,388 | 5,906 |
| | | Intergenic region | 6,695,385 | 6,367,984 | 125,604 | 201,797 |
Figure 2. Genomic distribution of polymorphic SNP markers (PRH1 vs. Saengryeg).
Figure 3. Chromosome-wise annotation of polymorphic genic SNPs associated with important functions in PRH1 and Saengryeg.
Figure 4. Chromosome-wise occurrence of SNPs associated with NBS-LRR genes in PRH1 and Saengryeg in comparison with the reference genome.
List of HRM primers designed for genotyping polymorphic genic SNPs from each chromosome.
| No. | Gene ID | Description | Forward primer | Reverse primer | |
|---|---|---|---|---|---|
| 1 | CA01g00370 | Serine/threonine protein kinase, putative | CGGCCAATGTATCAAGACTCG | AACGAATTCAACAACCGCGT | Positive |
| 2 | CA01g02310 | Xpa-binding protein, putative | TCCCTTCTGCGGTTTTCCTC | TGTTGCAAACTTCTCCTTGTAGG | Positive |
| 3 | CA01g04020 | Kinesin heavy chain, putative | CCCACTGGTGAAAGCAGTGT | TGGAGAGAAGGCCTCAATGG | Positive |
| 4 | CA02g00020 | DNA-repair protein UVH3, putative | TGGTCAGGTAATGGTGGTTCT | CTCTCCCTCATCTGGCAAACA | Negative |
| 5 | CA02g00720 | Pentatricopeptide repeat-containing protein, putative | AGAGCACTAACCTCTTTAGCA | GACTGCAAAGACCCCACAGA | Positive |
| 6 | CA02g02750 | MYBR domain class transcription factor | ACAGTCATACTAGATGAAGGCGG | TGATGCAATGTGGTCAGATGA | Negative |
| 7 | CA03g00110 | Beta-galactosidase | AGTAACTGATGGAATTTCGGAA | TGGATGCGTTTTAGCCTGACT | Positive |
| 8 | CA03g00740 | Small subunit processome component-like protein | TCCCAGCATACTCGTCCAAC | CCTCAACCTAGGCATGCCAA | Negative |
| 9 | CA03g15330 | PREDICTED: Golgi to ER traffic protein 4 homolog | TGGTTAGTCTTTCCTAATCCGGT | CTATTTCTTTTTCCATTCCATTGC | Positive |
| 10 | CA04g00830 | Phosphatidylinositol 4-kinase, putative | GGGGGCTAGTCTTCTCTTCT | GGCAACAAGGTGGAAAGACG | Negative |
| 11 | CA04g00250 | PREDICTED: transmembrane emp24 domain-containing protein p24beta3-like | CGGATCATCCCGGCATTGAT | TCACCTCCGATTCACAACTCA | Negative |
| 12 | CA04g00360 | Protein transport protein sec23, putative | GCACGCCCATACCTTGTCAA | ATCAATGCCAAGCCCATCCA | Positive |
| 13 | CA05g00010 | RNA polymerase II transcription mediators isoform 1 | CAACGAGGCTGACCGAAAGA | CTCCACTCGCCCATCTTCTC | Positive |
| 14 | CA05g00320 | Folylpolyglutamate synthase | GGTGGGGGCTTTTGTCTTCT | ACTACATCTTCTGAGGTAACACC | Negative |
| 15 | CA05g15050 | PREDICTED: mediator of RNA polymerase II transcription subunit 33A | CCACCGTTTCAATCCCTTGC | ACGTGTCAGGATTCATAAGCT | Positive |
| 16 | CA06g00010 | Kinesin heavy chain, putative | TGAAGCCGCCTCGAATTTCT | AATGAGACTTCGAGGGGCAC | Negative |
| 17 | CA06g01280 | Myosin XI, putative | ATAGACCCCGGCTCAGGAAT | GCAAAGGTAGCTCCACCACT | Positive |
| 18 | CA06g01570 | PREDICTED: TBC1 domain family member | GGCAGGAAGATACAATAAATGTAC | AGCAGTATCGTGATTTCATTTGGT | Negative |
| 19 | CA07g03700 | PREDICTED: synaptotagmin-5-like | AGTAAGGTCAAATGTGGAGCCA | AGAACGTTAATACTGGCCATCG | Negative |
| 20 | CA07g04200 | Transducin family protein | TGCGAACTTAAGGAAAAAGAAGCA | GTAATGCTTGTCGGGAGCCT | Positive |
| 21 | CA07g12460 | Formin | GGGATAACGCTCTTCCATATGGA | CATGTCTGACAGAGGGTGCA | Negative |
| 22 | CA08g00950 | Transcription cofactor, putative | ACACTGAGATGCATGCACCA | TACCTGGTTTTGGCTGTGTT | Positive |
| 23 | CA08g08740 | DNA-directed RNA polymerase | ACAACAGGGACATGATTTCATCA | ACACTAAACCCTTCTGTGCACA | Positive |
| 24 | CA08g09730 | PREDICTED: protein ZINC INDUCED FACILITATOR-LIKE 1-like isoform X3 | TGTGTGTCGAAGCAATTGAT | CTGTTGGAAGATTTGTCAATATCA | Positive |
| 25 | CA09g00140 | O-linked N-acetylglucosamine transferase | CTGCACATAGAATTCTTGCCCA | TGGGATTGTTTCGTGCTTTT | Negative |
| 26 | CA09g01180 | Vacuolar protein sorting-associated protein | TTGTCCTCCTCCTCAGATGA | ACCACCAGCAAGAACGTCAA | Negative |
| 27 | CA09g14940 | Beta-amyrin synthase | TGGCACCATTTTTAAACAACA | ACAGTCAGAAGCACACTGTGA | Positive |
| 28 | CA10g01250 | PREDICTED: heterogeneous nuclear ribonucleoprotein | TGATGAGCTCGGAGGAGTCA | AAGTGGCTGGGATTCAAGGG | Negative |
| 29 | CA10g01280 | Protein binding protein, putative | GGGTGAGTTTCCTAAGAGGTCC | CAAATCACATGGCCAAACGC | Positive |
| 30 | CA10g07870 | Amidase, putative | GCTGCAGCAATGTAATTGGA | CCTCTGACCATCATCGCTGA | Negative |
| 31 | CA11g11870 | Xanthine dehydrogenase | ACCTTGACTGGTACACTTTTTCA | AGTGATGACGGACAATTGTGT | Positive |
| 32 | CA11g15420 | Tubulin family protein | GGCCTCATAACACCGTGGAA | TTACCAGCAGCATTGATCGA | Negative |
| 33 | CA11g15430 | ABA aldehyde oxidase | TTAATGGAGGCTTCAGAGAGA | GCTTGGGACTCTTGAAAGAAGC | Positive |
| 34 | CA12g01070 | Isoleucyl-tRNA synthetase, putative | ACAACACCCATCGACTTCCC | TGCAGAGCCAGATTTCAGGT | Positive |
| 35 | CA12g02370 | N-like protein | TGGTGTTTTTCCATTTGCCT | TCTCTAGAACGTAAGGGTATTCA | Negative |
| 36 | CA12g22510 | PREDICTED: pleiotropic drug resistance protein | ACCGAGTCGAAAGAGGAAGC | AAGGGCAGAGTCGAGCTTTC | Negative |
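Basic properties of the listed primers, such as length, GC content and a rough melting temperature, can be tabulated with a short script. The sketch below is illustrative and is not the method used by the authors. It assumes that the two sequence columns are the forward and reverse primers, and it uses the Wallace rule (Tm = 2(A+T) + 4(G+C)), which is only a coarse approximation for oligonucleotides of this length. The function names are hypothetical.

```python
def gc_content(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule: 2(A+T) + 4(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Two primer pairs copied from the table above (gene ID -> (forward, reverse)).
# Assumption: the first sequence column is the forward primer, the second the reverse.
PRIMER_PAIRS = {
    "CA01g00370": ("CGGCCAATGTATCAAGACTCG", "AACGAATTCAACAACCGCGT"),
    "CA01g02310": ("TCCCTTCTGCGGTTTTCCTC", "TGTTGCAAACTTCTCCTTGTAGG"),
}

for gene_id, (forward, reverse) in PRIMER_PAIRS.items():
    for label, primer in (("forward", forward), ("reverse", reverse)):
        print(
            f"{gene_id} {label}: {len(primer)} nt, "
            f"GC {gc_content(primer):.1f} %, Tm about {wallace_tm(primer)} C"
        )
```

For actual assay design, a nearest-neighbor Tm model with salt correction would be a more appropriate choice than the Wallace rule.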
Figure 5. HRM melt curves and temperature peaks obtained from candidate SNPs between C. baccatum (AR1) and C. annuum (TF68), illustrating the G/A and C/A SNP variation.