Morad M Mokhtar¹, Sami S Adawy¹, Salah El-Din S El-Assal², Ebtissam H A Hussein¹,².
Abstract
The present investigation used bioinformatics tools to identify and characterize simple sequence repeats (SSRs) within the third version of the date palm genome and to develop a new SSR primer database. In addition, single nucleotide polymorphisms (SNPs) located within the SSR flanking regions were identified. Moreover, the pathways, biological functions and gene interactions of the sequences targeted by the SSR primers were determined. A total of 172,075 SSR motifs were identified in the date palm genome sequence, a frequency of 450.97 SSRs per Mb. Of these, 130,014 SSRs (75.6%) were located within intergenic regions, a frequency of 499 SSRs per Mb, while only 42,061 SSRs (24.4%) were located within genic regions, a frequency of 347.5 SSRs per Mb. A total of 111,403 SSR primer pairs were designed, representing 291.9 SSR primers per Mb. Of these 111,403 primer pairs, only 31,380 were in genic regions, while 80,023 were in intergenic regions. A total of 250,507 SNPs were identified in 84,172 SSR flanking regions, representing 75.55% of the total SSR flanking regions. Out of 12,274 genes, only 463 genes comprising 896 SSR primers were mapped onto 111 pathways using the KEGG database. The most abundant enzymes were identified in the pathway related to the biosynthesis of antibiotics. We tested 1031 SSR primers using both publicly available date palm genome sequences as templates in in silico PCR reactions. For in vitro validation, 31 of the SSR primers used in the in silico PCR were synthesized and tested for their ability to detect polymorphism among six Egyptian date palm cultivars. All tested primers successfully amplified products, but only 18 primers detected polymorphic amplicons among the studied date palm cultivars.
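The counts, percentages and densities quoted in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the figures stated above; the variable names are illustrative, and the "implied length" values are derived quantities, not numbers reported by the authors.

```python
# Figures quoted in the abstract.
total_ssrs = 172_075
genic_ssrs = 42_061
intergenic_ssrs = 130_014
ssrs_per_mb = 450.97          # genome-wide density
genic_per_mb = 347.5          # density within genic regions
intergenic_per_mb = 499.0     # density within intergenic regions
primer_pairs = 111_403
flanks_with_snps = 84_172


def percent(part, whole):
    """Percentage of `whole` made up by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# The two region counts should add up to the genome-wide total.
assert genic_ssrs + intergenic_ssrs == total_ssrs

genic_share = percent(genic_ssrs, total_ssrs)
intergenic_share = percent(intergenic_ssrs, total_ssrs)
# The quoted 75.55% corresponds to 84,172 out of the 111,403 primer pairs.
snp_flank_share = percent(flanks_with_snps, primer_pairs)

# Sequence length (in Mb) implied by each count/density pair.
genome_mb = total_ssrs / ssrs_per_mb
genic_mb = genic_ssrs / genic_per_mb
intergenic_mb = intergenic_ssrs / intergenic_per_mb

print(f"genic share: {genic_share}%, intergenic share: {intergenic_share}%")
print(f"flanking regions with SNPs: {snp_flank_share}%")
print(f"implied lengths (Mb): genome {genome_mb:.2f}, "
      f"genic {genic_mb:.2f}, intergenic {intergenic_mb:.2f}")
```

The implied genic and intergenic lengths add up to roughly the implied genome length, so the three densities are mutually consistent to within rounding.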
Year: 2016 PMID: 27434138 PMCID: PMC4951042 DOI: 10.1371/journal.pone.0159268
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of the identified SSR motifs on the date palm PDK30 genome sequence.
| Searching item | Results |
|---|---|
| | 57277 |
| Total size of examined sequences (bp) | 381563256 |
| Total number of identified SSRs | 172075 |
| | 28394 |
| | 17237 |
| | 22432 |
Comparison between non-triplet repeat types according to the number of repeats in both genic and intergenic regions.
| No. of Repeat | Mono- (Genic) | Mono- (Intergenic) | Di- (Genic) | Di- (Intergenic) | Tetra- (Genic) | Tetra- (Intergenic) | Penta- (Genic) | Penta- (Intergenic) |
|---|---|---|---|---|---|---|---|---|
| | - | - | 8563 | 22859 | 726 | 2234 | 75 | 315 |
| | 22944 | 68365 | 2408 | 5941 | 2 | 15 | - | 1 |
| | 1810 | 6247 | 647 | 1584 | 1 | 3 | - | - |
| | 433 | 1387 | 186 | 460 | 2 | 5 | - | 1 |
| | 106 | 433 | 37 | 97 | - | 3 | - | - |
| | 43 | 152 | 10 | 31 | - | 2 | - | - |
| | 48 | 155 | 5 | 30 | 1 | - | 1 | - |
| | 16 | 41 | 9 | 11 | - | 4 | - | - |
| | 6 | 24 | 5 | 18 | - | 3 | - | 1 |
| | - | 9 | 12 | 33 | 2 | 3 | - | - |
| | 1 | 2 | 21 | 44 | 1 | 3 | - | - |
| | - | 1 | 10 | 26 | - | - | - | - |
| | 1 | 1 | 16 | 13 | - | - | - | - |
| Total | 25408 | 76817 | 11929 | 31147 | 735 | 2275 | 76 | 318 |
Comparison between triplet repeat types according to the number of repeats in both genic and intergenic regions.
| No. of Repeat | Tri- (Genic) | Tri- (Intergenic) | Hexa- (Genic) | Hexa- (Intergenic) |
|---|---|---|---|---|
| | 3745 | 7259 | 49 | 115 |
| | 75 | 152 | - | 1 |
| | 12 | 21 | - | - |
| | 4 | 10 | - | 1 |
| | - | 4 | - | - |
| | - | 2 | - | - |
| | 1 | - | - | - |
| | 2 | 3 | - | - |
| | 3 | 5 | 1 | - |
| | 4 | 8 | 1 | 1 |
| | 16 | 6 | - | - |
| Total | 3862 | 7470 | 51 | 118 |
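The last row of each of the two tables above holds the column totals, which can be added up and compared with the genic and intergenic SSR counts quoted in the abstract. This is a minimal sketch; the dictionary layout and names are illustrative only.

```python
# Column totals copied from the last row of each table.
# Keys: repeat type -> (genic, intergenic)
non_triplet_totals = {
    "mono": (25_408, 76_817),
    "di": (11_929, 31_147),
    "tetra": (735, 2_275),
    "penta": (76, 318),
}
triplet_totals = {
    "tri": (3_862, 7_470),
    "hexa": (51, 118),
}

all_totals = {**non_triplet_totals, **triplet_totals}
genic = sum(g for g, _ in all_totals.values())
intergenic = sum(i for _, i in all_totals.values())

# Counts quoted in the abstract.
abstract_genic = 42_061
abstract_intergenic = 130_014

print("genic:", genic, "difference from abstract:", abstract_genic - genic)
print("intergenic:", intergenic,
      "difference from abstract:", abstract_intergenic - intergenic)
```

The genic columns reproduce the 42,061 genic SSRs exactly. The intergenic columns sum to 118,145, which is lower than the 130,014 quoted in the abstract; the record as extracted here does not say which SSRs account for the difference.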
Fig 2. Comparison between mono-, di- and tri- repeats in both genic and intergenic regions.
The numbers in boxes (1, 2 and 3) and (4, 5 and 6) represent the repeat types in the genic and intergenic regions, respectively. Similar repeats in the genic and intergenic regions are labeled with the same color. Numbers in pink give the number of repeats in the genic regions, while those in blue refer to the intergenic regions. These data are also shown as histograms in the same colors.
Possible single-base substitutions that may convert tri- repeats in date palm exons to stop codons.
| Amino acid | Repeat Sequence | Stop Codon | Amino acid | Repeat Sequence | Stop Codon |
|---|---|---|---|---|---|
| Arg | CGA | TGA | Cys | TGT | TGA |
| Arg | AGA | TGA | Cys | TGC | TGA |
| Gly | GGA | TGA | Trp | TGG | TGA |
| Glu | GAG | TAG | Gln | CAG | TAG |
| Leu | TTA | TGA | Gln | CAA | TAA |
| Leu | TTG | TAG | Lys | AAA | TAA |
| Ser | TCA | TGA | Lys | AAG | TAG |
| Ser | TCG | TAG | Tyr | TAC | TAA |
| Glu | GAA | TAA | Tyr | TAT | TAA |
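The codons in the table above are those that a single base substitution can turn into a stop codon under the standard genetic code. The following minimal sketch enumerates them; it is an illustration of the idea, not the authors' procedure, and the function and variable names are hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
BASES = "ACGT"


def stop_neighbors(codon):
    """Return the stop codons reachable from `codon` by one base substitution."""
    hits = set()
    for pos, base in product(range(3), BASES):
        if base == codon[pos]:
            continue  # not a substitution
        mutated = codon[:pos] + base + codon[pos + 1:]
        if mutated in STOP_CODONS:
            hits.add(mutated)
    return sorted(hits)


# Map every sense codon that is one substitution away from a stop codon
# to the stop codon(s) it can become.
pre_stop = {}
for codon in ("".join(c) for c in product(BASES, repeat=3)):
    if codon in STOP_CODONS:
        continue
    hits = stop_neighbors(codon)
    if hits:
        pre_stop[codon] = hits

for codon, stops in sorted(pre_stop.items()):
    print(codon, "->", ", ".join(stops))
```

The enumeration yields the same 18 codons as the table. Several of them (for example TTA, TCA, TGG, TAC and TAT) can reach two different stop codons, whereas the table lists one stop codon per repeat.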
SSR primers designed within disease resistance related genes.
| Gene product | contig ID | Gene Start | Gene End | Primer name |
|---|---|---|---|---|
| | PDK_30s1053751 | 12041 | 18193 | Pd_GSSR1733 |
| | PDK_30s1160101 | 6656 | 7974 | Pd_GSSR4967 |
| | PDK_30s779571 | 1328 | 9575 | Pd_GSSR14611 |
| | PDK_30s1078571 | 2782 | 7044 | Pd_GSSR24273 |
| | PDK_30s790641 | 783 | 4086 | Pd_GSSR28574 |
| | PDK_30s1108901 | 7116 | 10724 | Pd_GSSR3477 |
| | PDK_30s1130951 | 4900 | 9698 | Pd_GSSR4096 |
| | PDK_30s1159381 | 18412 | 23365 | Pd_GSSR4958, Pd_GSSR4959 |
| | PDK_30s676021 | 11551 | 15125 | Pd_GSSR8887 |
| | PDK_30s1065091 | 52197 | 62496 | Pd_GSSR2000, Pd_GSSR2002 |
| | PDK_30s998451 | 515 | 10639 | Pd_GSSR31369 |
| | PDK_30s665521 | 33248 | 35690 | Pd_GSSR26227, Pd_GSSR26229 |
| | PDK_30s853031 | 786 | 7042 | Pd_GSSR17867, Pd_GSSR17868 |
| | PDK_30s742701 | 1713 | 6858 | Pd_GSSR12867 |
| | PDK_30s665281 | 87243 | 102914 | Pd_GSSR8157, Pd_GSSR8161, Pd_GSSR26210, Pd_GSSR26212, Pd_GSSR26213 |
| | PDK_30s909521 | 479 | 7523 | Pd_GSSR20333, Pd_GSSR20334 |
| | PDK_30s866171 | 2129 | 11697 | Pd_GSSR18479, Pd_GSSR18481, Pd_GSSR29704 |
| | PDK_30s981471 | 9730 | 21118 | Pd_GSSR22786, Pd_GSSR22787, Pd_GSSR22790, Pd_GSSR22791 |
| | PDK_30s1036661 | 1243 | 5906 | Pd_GSSR1055, Pd_GSSR1058, Pd_GSSR1061 |
| | PDK_30s866791 | 3182 | 7027 | Pd_GSSR29709 |
| | PDK_30s767921 | 186922 | 194251 | Pd_GSSR14032, Pd_GSSR14033, Pd_GSSR14034 |
| | PDK_30s827291 | 2600 | 6751 | Pd_GSSR16802 |
| | PDK_30s1076091 | 57384 | 60783 | Pd_GSSR2418 |
| | PDK_30s716241 | 11276 | 14302 | Pd_GSSR27256 |
Gene product: product of the disease resistance gene.
Contig ID: contig ID in the date palm genome.
Gene Start and Gene End: gene start and end positions in the date palm genome.
Primer name: primer name in our database.
Summary statistics of the in silico validated detected SSR in date palm genome sequences.
| Items | PDK30 sequence | ATBV01 sequence |
|---|---|---|
| | 1031 | 1031 |
| | 1031 | 903 |
| | 2450 | 2774 |
| | 613364 | 696699 |
PDK30 sequence: Al-Dous et al. [12].
ATBV01 sequence: Al-Mssallem et al. [13].
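The in silico PCR step mentioned in the abstract amounts to locating a primer pair on a template sequence and reporting the size of the product between them. The sketch below shows that idea in its simplest form, assuming exact primer matches and searching a single strand only; the source does not name the tool or parameters the authors used, and the function names, the `max_len` cutoff and the toy sequences here are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def in_silico_pcr(template, forward, reverse, max_len=1000):
    """Return sorted product sizes for exact primer matches on `template`.

    The forward primer must match the template as written, and the reverse
    primer must match as its reverse complement downstream of it. Only the
    given strand is searched, and mismatches are not tolerated.
    """
    rc_reverse = reverse_complement(reverse)
    sizes = set()
    start = template.find(forward)
    while start != -1:
        end = template.find(rc_reverse, start + len(forward))
        while end != -1:
            size = end + len(rc_reverse) - start
            if size <= max_len:
                sizes.add(size)
            end = template.find(rc_reverse, end + 1)
        start = template.find(forward, start + 1)
    return sorted(sizes)


# Toy example (made-up sequences, not date palm): two 8-base primer sites
# flanking a (GA)10 repeat.
forward = "ACGTTGCA"
reverse = "TTGACCGA"
template = "CCC" + forward + "GA" * 10 + reverse_complement(reverse) + "GGG"

print(in_silico_pcr(template, forward, reverse))
```

Running the same primer pair against two assemblies and comparing the product sizes gives the kind of per-sequence allele lengths reported for the two Khalas sequences.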
The comparison between the summarized results of the in silico and in vitro PCR.
| Primer name | Forward & Reverse primer sequence | No. of alleles (six cultivars) | Zaghloul allele length (bp) | Hayany allele length (bp) | Samany allele length (bp) | Barhee allele length (bp) | Sewi allele length (bp) | Bartamoda allele length (bp) | No. of alleles (Khalas sequences) | Khalas (PDK30) allele length (bp) | Khalas (ATBV01) allele length (bp) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | CGCACTCTGGAATCAACTCACATCTGCCGACAGAGTCAAA | 1 | 143 | 143 | 143 | 143 | 143 | 143 | 2 | 143 | 142 |
| | AGGCAGAGAGATGTCCGTGTAAAAGTCTCCATGTCCGTGC | 1 | 301 | 301 | 301 | 301 | 301 | 301 | 1 | 301 | 301 |
| | CATGCCTGCTATGATCGGTACAGGAAGATCCCAAATCGAA | 2 | 200 | 200/222 | 200/222 | 200 | 200/222 | 200/222 | 2 | 191 | 244 |
| | GCCTGATGATAGTCTCGGCTGCGAACCATACGAAAGCAGT | 2 | 173 | 150 | 173 | ND | 173 | 150 | 2 | 160/173 | 160/173 |
| | CCATTTCGATGTCACCTCCTGCGAACCATACGAAAGCAGT | 1 | 229 | 229 | 229 | 229 | 229 | 229 | 1 | 229 | 229 |
| | AGTCGTGGAGAGATTGCGTTGCCCATCTTGTAGGGAGACA | 3 | 328/354 | 328 | 328 | 328/354 | 315 | 328 | 2 | 333 | 274 |
| | CGGAAGATGCTGAAGTCTCCGCCATAGGAGTTTCAGTCGG | 2 | 220 | 220 | 220 | 220 | 220 | 220 | 1 | 220 | 220 |
| | TTTGGGATGTTGAATGGGTTTCTTTGGTGGAAAAAGCCAG | 1 | 340 | 340 | 340 | 340 | 340 | 340 | 2 | 340 | 340 |
| | AAGTTCATGGGATGGTCGAGGGCTTCAACAATATGCGACA | 8 | 231/562/674 | 239/674 | 247/562/674 | 239/562 | 247/562/674 | 243/674 | 2 | 225/227 | 227 |
| | AGGTGGAGGCCTTCATAGGTTGAATTTGTGCTAGCGATGC | 1 | 257 | 257 | 257 | 257 | 257 | 257 | 1 | 257 | 257 |
| | TACGTGGTCTTGCACGGTAATTAAGCTCGCACTCCTCGAT | 2 | 213 | 213 | 213 | 213 | 213 | 213 | 1 | 213 | 213 |
| | ACTCCCATGTAAACCTCCCCATGTGGGTTGGGTTTGTTGT | 1 | 237 | 237 | 237 | 237 | 237 | 237 | 3 | 237 | 236/613 |
| | TTCCAATGAAAGCCTTTTGGACCCGGAACAGGTTACTGAG | 4 | 165 | 165 | 171 | 171 | 178 | 178 | 2 | 155 | 152 |
| | TTCCTCCTGTTTTTCCCCTTCTCACCGGCTCTACCAGAAG | 1 | 278 | 278 | 278 | 278 | 278 | 278 | 2 | 278 | 277 |
| | CGCAATCATTAAGCTCAGTCAGTTGGGAATGGGTAAGGTCAA | 1 | 315 | 315 | 315 | 315 | 315 | 315 | 2 | 329 | 328 |
| | TTGATGAGCCTCCTCTTTGGGATGGTGAGAGTTGGGGAGA | 3 | 333 | 333 | 333 | 333 | 333 | 333 | 1 | 333 | 333 |
| | TGGCCATCGAGTGCTACATAAGGCTTCGTTCCTCCAACTT | 2 | 200/586 | 200/586 | 200/586 | 200/586 | 200/586 | 200/586 | 1 | 189 | 189 |
| | CCCCAGAAAATGCCTTAACAAAGAGCGTTGACTGCTACCAA | 1 | 125 | 125 | 125 | 125 | 125 | 125 | 1 | 125 | 125 |
| | GATGCCAAGCACTGTGATGTTATCCTGCATGCACCAATGT | 1 | 348 | 348 | 348 | 348 | 348 | 348 | 3 | 229/348 | 229/348 |
| | CAGCTCTCGGGAAATCTTTGTGCCACTGTTTTTGGATCAG | 2 | 165 | 165 | 165 | 165 | 165 | 165 | 2 | 165 | 165 |
| | TTGCTAGAACCCTAACCCCCCCCAACCCGTTTAAGGAAAT | 7 | 497/539 | 488/533 | 505 | 484 | 484 | 474 | 5 | 284/388/420/714/769 | ND |
| | GAAACGGGCCCCTAGAATTATCACTGTCTCCACCACCATC | 14 | 198/282/329 | 198/282 | 188/352 | 179/264/307 | 171/220/252/331 | 163/248 | 3 | 287 | 214/291 |
| | TGGATGTTTCTGGTTACTGTTGTTAGGAACCCCCTTATCCCA | 4 | 256 | 278 | 256 | 278/253 | 256 | 267 | 2 | 356 | 253 |
| | GGCACTCCATGACCTTTTGTAAACAAGCCGAAACCAACAG | 2 | 225 | 225 | 240 | 225 | 225 | 229/240 | 3 | 152/240 | 152/225 |
| | TGCAACAAAGAGATCTGCCAAGACAAAGGCTTCCCCAAAT | 6 | 304 | 282/304 | 304 | 274/316 | 295 | 282/312 | 3 | 195/231/304 | 195/231 |
| | CGCGGTCACTGAAGTCAATAGGAAACCCATGGGAACATAA | 6 | 224/295 | 224/300 | 290 | 250/309 | 290 | 309 | 1 | 326 | ND |
| | GCGGCATCCTCTTGAACTTATTTCCAATCCAACCTAGCAGTT | 7 | 266 | 266/319 | 261 | 254/280/314 | 254 | 254/319/343 | 2 | 271/333 | 271 |
| | TGGTTCAGGAGAAGCATGTGGAAGAAATTGGGAGAATTAGGG | 2 | 278/343 | 278/343 | 278/343 | 278/343 | 278/343 | 278/343 | 2 | 258/357 | 258 |
| | CTGCATGACTTGGCACCTTAAAGGCCTTAGCCCAAAGAAG | 6 | 210/240 | 210/265 | 210/236/260 | 210/240 | 210/240 | 210/265 | 1 | 240 | ND |
| | CCTCTTATCCTTCTCTTCGGGAACTTTCTTCTGCATTGCCA | 2 | 354/428 | 354/428 | 354/428 | 354/428 | 354/428 | 354/428 | 1 | 488 | ND |
| | TAGATCCTCCCCTTTACCCGGTATACACACACACGCACGC | 8 | 150 | 150 | 157/180 | 159 | 168/198 | 168/190 | 2 | 150 | 150 |
a The bands consistent with Khalas (PDK30 sequence) [12].
b The bands consistent with Khalas (ATBV01 sequence) [13]. ND: not detected.
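One simple way to read the in vitro columns of the table above is to compare the band pattern of each primer across the six cultivars. The sketch below applies an assumed criterion (a primer is called polymorphic when the scored cultivars do not all show the same set of bands, ignoring ND cells); the source does not state how the authors scored polymorphism, so this need not reproduce their count. The three example rows are copied from the table and labeled by row position because the primer names are missing.

```python
def band_pattern(cell):
    """Parse a cell such as '200/222' into a frozenset of band sizes (None for 'ND')."""
    if cell == "ND":
        return None
    return frozenset(int(size) for size in cell.split("/"))


def is_polymorphic(cells):
    """True if the scored cultivars do not all share one band pattern.

    Assumed criterion for illustration; ND (not detected) cells are ignored.
    """
    patterns = {band_pattern(cell) for cell in cells}
    patterns.discard(None)
    return len(patterns) > 1


# Cultivar order: Zaghloul, Hayany, Samany, Barhee, Sewi, Bartamoda.
rows = {
    "row 1": ["143", "143", "143", "143", "143", "143"],
    "row 3": ["200", "200/222", "200/222", "200", "200/222", "200/222"],
    "row 4": ["173", "150", "173", "ND", "173", "150"],
}

for label, cells in rows.items():
    print(label, "polymorphic" if is_polymorphic(cells) else "monomorphic")
```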