Jingmiao Li, Siqiao Li, Lijuan Kong, Lihua Wang, Anzhi Wei, Yulin Liu.
Abstract
Zanthoxylum bungeanum, a spice and medicinal plant, is cultivated in many parts of China and some countries in Southeast Asia; however, data on its genome are lacking. In the present study, we performed a whole-genome survey and developed novel genomic-SSR markers of Z. bungeanum. Clean data (∼197.16 Gb) were obtained and assembled into 11185221 scaffolds with an N50 of 183 bp. K-mer analysis revealed that Z. bungeanum has an estimated genome size of 3971.92 Mb, and the GC content, heterozygous rate, and repeat sequence rate are 37.21%, 1.73%, and 86.04%, respectively. These results indicate that the genome of Z. bungeanum is complex. Furthermore, 27153 simple sequence repeat (SSR) loci were identified from 57288 scaffolds with a minimum length > 1 kb. Mononucleotide repeats (19706) were the most abundant type, followed by dinucleotide repeats (5154). The most common motifs were A/T, followed by AT/AT; these SSRs accounted for 71.42% and 11.84% of all repeats, respectively. A total of 21243 non-repeating primer pairs were designed, and 100 were randomly selected and validated by PCR analysis using DNA from 10 Z. bungeanum individuals and 5 Zanthoxylum armatum individuals. Finally, 36 polymorphic SSR markers were developed with polymorphism information content (PIC) values ranging from 0.16 to 0.75. Cluster analysis revealed that Z. bungeanum and Z. armatum could be divided into two major clusters, suggesting that these newly developed SSR markers are useful for genetic diversity and germplasm resource identification in Z. bungeanum and Z. armatum.
Keywords: Genetic diversity; Genome size; Genome survey; SSR markers; Zanthoxylum bungeanum
Year: 2020 PMID: 32558907 PMCID: PMC7322109 DOI: 10.1042/BSR20201101
Source DB: PubMed Journal: Biosci Rep ISSN: 0144-8463 Impact factor: 3.840
Figure 1. The adult tree, fruits and dried pericarps of ZB
(A) An adult tree covered with ripe fruits. (B) The appearance of the fruits. (C) Characteristics of the dried pericarps.
Origin regions of 15 Zanthoxylum individuals
| Species | ID | Individual name | Origin region |
|---|---|---|---|
| Z. armatum | ZA01 | Chongqingjiuyeqing | Jiangjin, Chongqing, China |
| Z. armatum | ZA02 | Pengxiqinghuajiao | Suining, Sichuan, China |
| Z. armatum | ZA03 | Rongchangwuci | Rongchang, Chongqing, China |
| Z. armatum | ZA04 | Hanyuanputaoqingjiao | Ya’an, Sichuan, China |
| Z. armatum | ZA05 | Goujiao | Ya’an, Sichuan, China |
| Z. bungeanum | ZB01 | Youhuajiao | Liupanshui, Guizhou, China |
| Z. bungeanum | ZB02 | Suhuajiao | Liupanshui, Guizhou, China |
| Z. bungeanum | ZB03 | Hanyuandahongpao | Ya’an, Sichuan, China |
| Z. bungeanum | ZB04 | Shandongdahongpao | Linyi, Shandong, China |
| Z. bungeanum | ZB05 | Shanxidahongpao | Yongji, Shanxi, China |
| Z. bungeanum | ZB06 | Fengxiandahongpao | Baoji, Shaanxi, China |
| Z. bungeanum | ZB07 | Fuguhuajiao | Yulin, Shaanxi, China |
| Z. bungeanum | ZB08 | Shizitou | Hancheng, Shaanxi, China |
| Z. bungeanum | ZB09 | Hanchengdahongpao | Hancheng, Shaanxi, China |
| Z. bungeanum | ZB10 | Dangcunwuci | Hancheng, Shaanxi, China |
Basic statistics for the genome survey sequencing data of ZB
| Library | DES00802 | DES00803 | DES00804 | DES00805 |
|---|---|---|---|---|
| Insert size(bp) | 250 | 250 | 350 | 350 |
| Raw reads | 200690359 | 210996601 | 192712146 | 190585631 |
| Raw base(bp) | 60207107700 | 63298980300 | 57813643800 | 57175689300 |
| Effective rate (%) | 76.70 | 76.24 | 89.43 | 89.43 |
| Clean base(bp) | 46124187900 | 48205998000 | 51133569600 | 51701329800 |
| Error rate(%) | 0.02 | 0.02 | 0.02 | 0.02 |
| Q20(%) | 98.10 | 98.07 | 97.28 | 97.40 |
| Q30(%) | 95.66 | 95.42 | 93.88 | 94.11 |
| GC content(%) | 38.66 | 38.68 | 38.38 | 38.39 |
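The per-library figures can be cross-checked with a few lines of arithmetic. The sketch below, written in Python as an illustrative choice, sums the clean bases and divides raw bases by raw reads; the variable names are hypothetical and the values are copied from the table above.

```python
# Values copied from the sequencing statistics table (one entry per library).
raw_reads = {
    "DES00802": 200690359,
    "DES00803": 210996601,
    "DES00804": 192712146,
    "DES00805": 190585631,
}
raw_bases = {
    "DES00802": 60207107700,
    "DES00803": 63298980300,
    "DES00804": 57813643800,
    "DES00805": 57175689300,
}
clean_bases = {
    "DES00802": 46124187900,
    "DES00803": 48205998000,
    "DES00804": 51133569600,
    "DES00805": 51701329800,
}

# Total clean data across the four libraries, in bases and in gigabases.
total_clean = sum(clean_bases.values())
total_clean_gb = total_clean / 1e9

# Bases contributed per counted raw read in each library.
bases_per_read = {lib: raw_bases[lib] / raw_reads[lib] for lib in raw_reads}

print(f"Total clean data: {total_clean_gb:.2f} Gb")
print(bases_per_read)
```

The total comes to about 197.17 Gb, in line with the ∼197.16 Gb quoted in the abstract. Every library yields exactly 300 bases per counted read, which would be consistent with paired 150 bp reads, although the read length is an interpretation and is not stated in this record.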
Estimation statistics and analysis based on K-mer of ZB
| K-mer | K-mer number | K-mer depth | Genome size (Mb) | Revised genome size (Mb) | Heterozygous ratio (%) | Repeat (%) |
|---|---|---|---|---|---|---|
| 17 | 176134142868 | 44 | 4003.05 | 3971.92 | 1.73 | 86.04 |
Figure 2. Distribution of K-mer = 17 depth and GC content and depth correlation analysis
(A) The genome size of ZB was estimated with the formula: genome size = K-mer number / K-mer depth. The x-axis is the depth; the y-axis is the frequency at a particular depth divided by the total frequency over all depths. (B) The x-axis represents the GC content and the y-axis represents the sequencing depth. The right side shows the sequencing depth distribution and the top side shows the GC content distribution. The red area marks the densest part of the scatter plot.
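The formula in the caption can be applied directly to the values in the K-mer table. The following minimal Python sketch uses the reported K-mer count (176134142868) and depth (44); the variable names are illustrative only.

```python
# Values reported in the K-mer statistics table (K = 17).
kmer_number = 176_134_142_868  # total number of 17-mers
kmer_depth = 44                # depth value reported in the table

# genome size = K-mer number / K-mer depth, converted from bp to Mb.
genome_size_mb = kmer_number / kmer_depth / 1e6

print(f"Estimated genome size: {genome_size_mb:.2f} Mb")
```

This reproduces the unrevised estimate of 4003.05 Mb. The revised value of 3971.92 Mb depends on a correction step that this record does not describe, so it is not recomputed here.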
Statistics of the assembled genome sequences in ZB
| | Total length (bp) | Total number | Total number (>2 kb) | Max length (bp) | N50 length (bp) | N90 length (bp) |
|---|---|---|---|---|---|---|
| Contig | 2069338941 | 11185221 | 7712 | 13151 | 183 | 110 |
| Scaffold | 2072641802 | 11086925 | 7921 | 13628 | 186 | 111 |
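For readers unfamiliar with the N50 and N90 columns, the sketch below shows the usual way such statistics are computed: sort sequences from longest to shortest and report the length at which the running total first reaches 50% (or 90%) of the assembly. The function name `nx_length` and the toy lengths are hypothetical; the actual contig lengths are not part of this record.

```python
def nx_length(lengths, fraction=0.5):
    """Return the Nx length: the length of the sequence at which the
    cumulative sum (longest first) reaches `fraction` of the total."""
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * fraction
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    raise ValueError("lengths must not be empty")


# Toy example with made-up sequence lengths (bp), for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
n50 = nx_length(toy_lengths, 0.5)
n90 = nx_length(toy_lengths, 0.9)
print(n50, n90)
```

An N50 below 200 bp, as reported in the table, means that half of the assembled bases sit in sequences shorter than roughly one read pair, which fits the description of the genome as complex.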
Distribution pattern of G-SSR motifs in 17593 scaffolds in ZB
| Repeat motif | Number of repeats | Total | |||||||
|---|---|---|---|---|---|---|---|---|---|
| 5 | 6 | 7 | 8 | 9 | 10 | 10-20 | >20 | ||
| Mono-nucleotide (19706) | |||||||||
| A/T | 8006 | 10600 | 787 | 19393 | |||||
| C/G | 32 | 204 | 77 | 313 | |||||
| Di-nucleotide (5154) | |||||||||
| AT/AT | 897 | 540 | 444 | 326 | 285 | 720 | 2 | 3214 | |
| AG/CT | 498 | 220 | 136 | 72 | 37 | 64 | 1 | 1028 | |
| AC/GT | 416 | 186 | 107 | 75 | 37 | 81 | 3 | 905 | |
| CG/CG | 4 | 3 | 7 | ||||||
| Tri-nucleotide (1860) | |||||||||
| AAT/ATT | 451 | 206 | 124 | 68 | 49 | 43 | 79 | 1 | 1021 |
| AAG/CTT | 201 | 82 | 36 | 13 | 19 | 7 | 17 | 1 | 376 |
| ATC/GAT | 81 | 32 | 8 | 8 | 2 | 4 | 8 | 143 | |
| AAC/GTT | 42 | 23 | 9 | 5 | 1 | 3 | 83 | ||
| ACC/GGT | 52 | 13 | 7 | 3 | 1 | 76 | |||
| AGG/CCT | 33 | 18 | 4 | 4 | 3 | 1 | 63 | ||
| AGC/GCT | 36 | 5 | 3 | 2 | 46 | ||||
| CCG/CGG | 9 | 5 | 2 | 1 | 1 | 18 | |||
| ACT/AGT | 11 | 2 | 2 | 3 | 18 | ||||
| ACG/CGT | 9 | 2 | 3 | 2 | 16 | ||||
| Tetra-nucleotide (309) | |||||||||
| AAAT/ATTT | 130 | 21 | 6 | 2 | 159 | ||||
| ACAT/ATGT | 22 | 7 | 3 | 5 | 6 | 43 | |||
| AAAG/CTTT | 26 | 3 | 2 | 1 | 32 | ||||
| AATT/AATT | 28 | 2 | 2 | 32 | |||||
| AAAC/GTTT | 20 | 2 | 1 | 23 | |||||
| Others | 13 | 4 | 2 | 1 | 20 | ||||
| Penta-nucleotide (60) | |||||||||
| AAAAT/ATTTT | 21 | 6 | 27 | ||||||
| AAAAC/GTTTT | 8 | 8 | |||||||
| AAATT/AATTT | 3 | 1 | 4 | ||||||
| AATCG/ATTCG | 4 | 4 | |||||||
| Others | 14 | 2 | 1 | 17 | |||||
| Hexa-nucleotide (64) | |||||||||
| AAAAAT/ATTTTT | 9 | 1 | 10 | ||||||
| AAAAAC/GTTTTT | 3 | 3 | 6 | ||||||
| AAATAT/ATATTT | 2 | 2 | 4 | ||||||
| AAAAAG/CTTTTT | 2 | 1 | 3 | ||||||
| Others | 32 | 8 | 1 | 41 | |||||
| Total | 1262 | 2266 | 1164 | 801 | 556 | 8453 | 11779 | 872 | 27153 |
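The class totals and the percentages quoted in the abstract can be checked against the per-motif row totals. The Python sketch below copies the "Total" value of each motif row from the table; the dictionary layout is an illustrative choice. Note that the ten tri-nucleotide rows sum to 1860, which is the value needed for the grand total of 27153.

```python
# Row totals copied from the G-SSR motif table, grouped by repeat class.
ssr_counts = {
    "Mono-nucleotide": {"A/T": 19393, "C/G": 313},
    "Di-nucleotide": {"AT/AT": 3214, "AG/CT": 1028, "AC/GT": 905, "CG/CG": 7},
    "Tri-nucleotide": {
        "AAT/ATT": 1021, "AAG/CTT": 376, "ATC/GAT": 143, "AAC/GTT": 83,
        "ACC/GGT": 76, "AGG/CCT": 63, "AGC/GCT": 46, "CCG/CGG": 18,
        "ACT/AGT": 18, "ACG/CGT": 16,
    },
    "Tetra-nucleotide": {
        "AAAT/ATTT": 159, "ACAT/ATGT": 43, "AAAG/CTTT": 32,
        "AATT/AATT": 32, "AAAC/GTTT": 23, "Others": 20,
    },
    "Penta-nucleotide": {
        "AAAAT/ATTTT": 27, "AAAAC/GTTTT": 8, "AAATT/AATTT": 4,
        "AATCG/ATTCG": 4, "Others": 17,
    },
    "Hexa-nucleotide": {
        "AAAAAT/ATTTTT": 10, "AAAAAC/GTTTTT": 6, "AAATAT/ATATTT": 4,
        "AAAAAG/CTTTTT": 3, "Others": 41,
    },
}

# Sum each class, then sum the classes to get the grand total.
class_totals = {name: sum(rows.values()) for name, rows in ssr_counts.items()}
grand_total = sum(class_totals.values())


def share(count):
    """Percentage of all SSR loci represented by `count`."""
    return 100 * count / grand_total


for name, total in class_totals.items():
    print(f"{name}: {total} ({share(total):.2f}%)")
print(f"A/T share: {share(ssr_counts['Mono-nucleotide']['A/T']):.2f}%")
print(f"AT/AT share: {share(ssr_counts['Di-nucleotide']['AT/AT']):.2f}%")
```

The A/T and AT/AT shares come out at 71.42% and 11.84%, matching the abstract.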
Characterization of 36 polymorphic G-SSR primer pairs
| Primer name | Primer sequence (5′-3′) | Repeat motif | Expected size (bp) | All individuals | | | | ZB | | | | ZA | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | Na | PIC | Ho | He | Na | PIC | Ho | He | Na | PIC | Ho | He |
| ZBg01 | F: TGTCTTCGCCTTCCATTCTC | (AG)10 | 272 | 3 | 0.48 | 0.13 | 0.57 | 2 | 0.16 | 0.20 | 0.19 | 2 | 0.27 | 0.00 | 0.36 |
| | R: CGAGCACCAACCCTAACAAT | | | | | | | | | | | | | | |
| ZBg02 | F: GGGTGAGACTGGCGTTATGT | (TA)8 | 268 | 3 | 0.58 | 0.40 | 0.68 | 2 | 0.36 | 0.60 | 0.51 | 1 | 0.00 | 0.00 | 0.00 |
| | R: CGAACCAAGTCATTGAGGCT | | | | | | | | | | | | | | |
| ZBg03 | F: GCTTCGTCAGGCAGAAACTC | (AG)8 | 238 | 4 | 0.59 | 0.64 | 0.68 | 3 | 0.51 | 0.70 | 0.62 | 1 | 0.00 | 0.00 | 0.00 |
| | R: CAAAATCGGTCTTCGCTTTC | | | | | | | | | | | | | | |
| ZBg04 | F: GATGCGACCCTCACTCTAGC | (AGA)9 | 180 | 5 | 0.69 | 0.36 | 0.77 | 4 | 0.65 | 0.17 | 0.77 | 5 | 0.72 | 0.60 | 0.84 |
| | R: GTCGTCCGAATTGGAGGTAA | | | | | | | | | | | | | | |
| ZBg07 | F: TCACTCCTATGCCTCCTTGG | (TA)10 | 219 | 5 | 0.58 | 0.40 | 0.64 | 3 | 0.30 | 0.10 | 0.35 | 5 | 0.70 | 1.00 | 0.82 |
| | R: TGATCTTGGTGCCACAGGTA | | | | | | | | | | | | | | |
| ZBg11 | F: CATTGGCACACCAAGTGTTT | (TA)8 | 248 | 3 | 0.47 | 0.11 | 0.57 | 3 | 0.50 | 0.13 | 0.61 | 1 | 0.00 | 0.00 | 0.00 |
| | R: TCGAATTTTAGCACTGCTCG | | | | | | | | | | | | | | |
| ZBg17 | F: GGCAATCTTCTCACCATTCC | (AG)13 | 275 | 2 | 0.27 | 0.40 | 0.34 | 2 | 0.27 | 0.40 | 0.34 | – | – | – | – |
| | R: TGAGGTGGATGCACATAAGG | | | | | | | | | | | | | | |
| ZBg18 | F: GCCCAGTTGTCAGTTTTGGT | (GT)10 | 246 | 5 | 0.66 | 0.00 | 0.73 | 3 | 0.47 | 0.00 | 0.57 | 3 | 0.55 | 0.00 | 0.71 |
| | R: ATGGGCATGAGATGGCTTAG | | | | | | | | | | | | | | |
| ZBg19 | F: TGGATTCACTCATTCACATGC | (TAT)8 | 206 | 2 | 0.33 | 0.47 | 0.43 | 2 | 0.22 | 0.30 | 0.27 | 2 | 0.36 | 0.80 | 0.53 |
| | R: TTTGGAGTTCAAACCTCGCT | | | | | | | | | | | | | | |
| ZBg21 | F: GGCCGCCTGAAGAATACAT | (ATT)8 | 142 | 3 | 0.30 | 0.38 | 0.34 | 2 | 0.19 | 0.25 | 0.23 | 2 | 0.33 | 0.60 | 0.47 |
| | R: TTCGGCTAACCAAACAAACC | | | | | | | | | | | | | | |
| ZBg24 | F: AACGCGCCATTTCATATTTC | (AT)8 | 189 | 5 | 0.73 | 0.54 | 0.80 | 4 | 0.66 | 0.50 | 0.76 | 2 | 0.33 | 0.60 | 0.47 |
| | R: AGAGCATTGAGCCTCGTTGT | | | | | | | | | | | | | | |
| ZBg30 | F: CCCATGAGAGTTGACTGCAA | (TTTA)5 | 230 | 4 | 0.60 | 0.86 | 0.68 | 3 | 0.54 | 1.00 | 0.65 | 2 | 0.33 | 0.60 | 0.47 |
| | R: CAGTGCCTGACAGAGTCGAG | | | | | | | | | | | | | | |
| ZBg31 | F: GTACAAGCGATGCGACAGAA | (TA)14 | 256 | 2 | 0.36 | 0.00 | 0.49 | 1 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
| | R: AGTGCGTGACTCGAACAGTG | | | | | | | | | | | | | | |
| ZBg34 | F: CCAACATCAAAGAAACGCAA | (AAT)6 | 163 | 2 | 0.37 | 0.00 | 0.51 | 2 | 0.33 | 0.00 | 0.44 | 1 | 0.00 | 0.00 | 0.00 |
| | R: CATAATTCCTAGGTTGGCCG | | | | | | | | | | | | | | |
| ZBg35 | F: GCTGTGAACATGAAATCGGA | (TA)13 | 189 | 3 | 0.40 | 0.33 | 0.46 | 3 | 0.40 | 0.33 | 0.46 | – | – | – | – |
| | R: TCGCGTGAAATAGAATGTCG | | | | | | | | | | | | | | |
| ZBg36 | F: TGGCATGTTTGGTTCTCTTG | (AG)10 | 271 | 2 | 0.16 | 0.20 | 0.19 | 2 | 0.22 | 0.30 | 0.27 | 1 | 0.00 | 0.00 | 0.00 |
| | R: AGAAGACCTGGGTGTGGTTG | | | | | | | | | | | | | | |
| ZBg40 | F: GTCGTCAAATGAACCGTGTG | (TATATG)5 | 204 | 4 | 0.50 | 0.40 | 0.61 | 4 | 0.44 | 0.60 | 0.50 | 1 | 0.00 | 0.00 | 0.00 |
| | R: AATCGATTCGGTGTGTGGAT | | | | | | | | | | | | | | |
| ZBg45 | F: ACGTGATTGGTAGGAGACGG | (AT)12 | 272 | 3 | 0.47 | 0.56 | 0.57 | 3 | 0.47 | 0.56 | 0.57 | – | – | – | – |
| | R: ATGGGTCCACGGGTACATAA | | | | | | | | | | | | | | |
| ZBg46 | F: AATCCTTCCCCATCTCAAGC | (TAT)7 | 173 | 3 | 0.44 | 0.00 | 0.51 | 1 | 0.00 | 0.00 | 0.00 | 2 | 0.36 | 0.00 | 0.53 |
| | R: CCCGATATTTTCCCAATGTG | | | | | | | | | | | | | | |
| ZBg71 | F: GAATATGGGGAAGGAAACCAA | (TATG)5 | 204 | 5 | 0.45 | 0.33 | 0.49 | 4 | 0.39 | 0.13 | 0.44 | 3 | 0.47 | 0.75 | 0.61 |
| | R: TATTATGAATGGCGTGGGGT | | | | | | | | | | | | | | |
| ZBg73 | F: GGATGCCAATCCTTCACACT | (ATT)9 | 263 | 3 | 0.59 | 0.23 | 0.69 | 2 | 0.36 | 0.00 | 0.50 | 2 | 0.33 | 0.60 | 0.47 |
| | R: TGAATAGTACTTGGGGGCCA | | | | | | | | | | | | | | |
| ZBg74 | F: TCCACGTCAACTCCAAACAA | (AT)9 | 259 | 3 | 0.49 | 0.07 | 0.60 | 2 | 0.24 | 0.11 | 0.29 | 1 | 0.00 | 0.00 | 0.00 |
| | R: GACTCAACTGTCGGTGCTCA | | | | | | | | | | | | | | |
| ZBg76 | F: ACATCTCCGGTCGATCTTGT | (TA)12 | 270 | 3 | 0.47 | 0.00 | 0.57 | 2 | 0.21 | 0.00 | 0.26 | 1 | 0.00 | 0.00 | 0.00 |
| | R: ATTGGAGATCGAGGAACACG | | | | | | | | | | | | | | |
| ZBg77 | F: CCATCATCTTCCATGATTGCT | (TTC)6 | 277 | 4 | 0.57 | 0.00 | 0.64 | 2 | 0.27 | 0.00 | 0.34 | 2 | 0.27 | 0.00 | 0.36 |
| | R: GGTCTTCCAAATTCGAACCA | | | | | | | | | | | | | | |
| ZBg83 | F: GGGTTCTACCCTAGCCGAAC | (AT)12 | 266 | 4 | 0.46 | 0.23 | 0.53 | 1 | 0.00 | 0.00 | 0.00 | 4 | 0.54 | 0.60 | 0.64 |
| | R: GGTTCCGATTTCAGTTCCAA | | | | | | | | | | | | | | |
| ZBg84 | F: ACGATATGAAACGGAAACGG | (AAG)11 | 225 | 5 | 0.67 | 0.00 | 0.74 | 3 | 0.47 | 0.00 | 0.57 | 2 | 0.35 | 0.00 | 0.53 |
| | R: GATTCCAAGAAATGCCTCCA | | | | | | | | | | | | | | |
| ZBg86 | F: AGTTGGAATGAGAACATGGACA | (TAT)6 | 209 | 2 | 0.27 | 0.27 | 0.33 | 1 | 0.00 | 0.00 | 0.00 | 2 | 0.36 | 0.80 | 0.53 |
| | R: TGCGACGCTATCACAAACTT | | | | | | | | | | | | | | |
| ZBg89 | F: GAGCCTAGAACAGCGTCGTC | (TATG)8 | 229 | 3 | 0.35 | 0.33 | 0.40 | 2 | 0.27 | 0.20 | 0.34 | 2 | 0.33 | 0.60 | 0.47 |
| | R: AAACCTGAAAGGCAGCTTGA | | | | | | | | | | | | | | |
| ZBg90 | F: CATTTTGTGCGATAGGCAGA | (TAT)11 | 242 | 2 | 0.34 | 0.09 | 0.45 | 2 | 0.26 | 0.13 | 0.33 | 2 | 0.35 | 0.00 | 0.53 |
| | R: CTAGGAGACAGCCCAGCAAC | | | | | | | | | | | | | | |
| ZBg91 | F: CCATGCAACAGCGATTCTAA | (TG)9 | 262 | 5 | 0.47 | 0.43 | 0.52 | 3 | 0.19 | 0.22 | 0.22 | 3 | 0.59 | 0.80 | 0.73 |
| | R: TCCACACACATGTCAAACACA | | | | | | | | | | | | | | |
| ZBg92 | F: CGCTGCCATTATTTGCTGTA | (ATA)12 | 249 | 7 | 0.75 | 0.73 | 0.81 | 4 | 0.58 | 0.60 | 0.68 | 4 | 0.60 | 1.00 | 0.73 |
| | R: TGGTGGCACTTAGCAGTGAG | | | | | | | | | | | | | | |
| ZBg94 | F: TAATACTCGGCCATGAACCC | (AAAAT)5 | 202 | 3 | 0.40 | 0.15 | 0.48 | 3 | 0.34 | 0.22 | 0.39 | 2 | 0.38 | 0.00 | 0.57 |
| | R: CGAATGACGTGGTGAAGAAG | | | | | | | | | | | | | | |
| ZBg95 | F: CAGGATCGACCTCCACAGTT | (TTA)11 | 279 | 3 | 0.55 | 0.67 | 0.65 | 3 | 0.59 | 0.78 | 0.70 | 2 | 0.24 | 0.33 | 0.33 |
| | R: AATGTCGCCAAAGTAGCGTC | | | | | | | | | | | | | | |
| ZBg96 | F: AATATTGTTTGGGGGCCATT | (GAA)7 | 279 | 5 | 0.60 | 0.00 | 0.68 | 3 | 0.49 | 0.00 | 0.61 | 3 | 0.50 | 0.00 | 0.62 |
| | R: TTTATGGATGCCAAGCCTTC | | | | | | | | | | | | | | |
| ZBg97 | F: CATAGCACAAGCAATGTGGG | (TA)10 | 162 | 3 | 0.42 | 0.07 | 0.48 | 1 | 0.00 | 0.00 | 0.00 | 3 | 0.49 | 0.20 | 0.64 |
| | R: ACACCTCCAGACCAGTCCAC | | | | | | | | | | | | | | |
| ZBg98 | F: TGGAATGAGGTCTTCCAAGG | (TTC)6 | 190 | 3 | 0.35 | 0.20 | 0.40 | 2 | 0.16 | 0.00 | 0.19 | 3 | 0.55 | 0.60 | 0.69 |
| | R: ATGACAAGCTTTCGGCAGTT | | | | | | | | | | | | | | |
Abbreviations: He, expected heterozygosity; Ho, observed heterozygosity; Na, observed number of alleles; PIC, polymorphism information content.
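To make the He and PIC columns concrete, the sketch below computes both from a list of allele frequencies. It assumes the standard PIC formula of Botstein et al. and a small-sample (unbiased) correction for He on diploid genotypes. This record does not say which estimators or software the authors used, and the allele frequencies are hypothetical because none are reported here.

```python
from itertools import combinations


def expected_heterozygosity(freqs, n_individuals=None):
    """He = 1 - sum(p_i^2). If n_individuals is given, apply the
    small-sample correction 2n / (2n - 1), assuming diploid genotypes."""
    he = 1.0 - sum(p * p for p in freqs)
    if n_individuals is not None:
        n_alleles = 2 * n_individuals
        he *= n_alleles / (n_alleles - 1)
    return he


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical two-allele locus scored in five individuals.
freqs = [0.6, 0.4]
print(f"PIC = {pic(freqs):.2f}")
print(f"He  = {expected_heterozygosity(freqs, n_individuals=5):.2f}")
```

With frequencies of 0.6 and 0.4 in five individuals this gives PIC = 0.36 and He = 0.53, which happens to agree with the rounded values in the last column group for ZBg19 and ZBg86. That agreement is suggestive only, since the underlying genotypes are not given.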
Figure 3. Polymorphisms revealed by ZBg46 in 15 individuals of Zanthoxylum
M denotes the pBR322 DNA/MspI marker. The amplified bands, from left to right, are ZA01 to ZA05 (under the green stripe) and ZB01 to ZB10 (under the red stripe).
Figure 4. Cluster diagram for 15 individuals of Zanthoxylum by UPGMA method