Yi Wang1,2,3, Defeng Chen1, Xiaofeng He1,4, Jiangxian Shen1,4, Min Xiong1, Xian Wang1, Di Zhou5, Zunzheng Wei6.
Abstract
Although amaryllis (Hippeastrum hybridum) plants are commonly used in physiological and ecological research, their genomic and genetic resources remain limited. The development of molecular markers is therefore of great importance for accelerating genetic improvement in Hippeastrum species. In this study, a total of 269 unique genes were identified that might regulate flower spathe development in amaryllis. In addition, 2000 simple sequence repeats (SSRs) were detected based on 171,462 de novo assembled unigenes from transcriptome data, and 664,091 single nucleotide polymorphisms (SNPs) were also detected as putative molecular markers. Twenty-one SSR markers were screened to evaluate the genetic diversity and population structure of 104 amaryllis accessions. A total of 98 SSR loci were amplified across all accessions. The results reveal that Nei's gene diversity (H) values of these markers ranged between 0.055 and 0.394, whereas the average values of Shannon's Information index (I) ranged between 0.172 and 0.567. Genetic tree analysis further demonstrates that all accessions can be grouped into three main clusters, each of which can be further divided into two subgroups. STRUCTURE-based analysis revealed that the highest ΔK values were observed when K = 5, K = 6, K = 7 and K = 8. The results of this study enable large-scale transcriptomics and classification of Hippeastrum genetic polymorphisms and will be useful in the future for resource conservation and production.
Year: 2018 PMID: 30006536 PMCID: PMC6045658 DOI: 10.1038/s41598-018-28809-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Summary of transcriptome statistics and functional annotations for amaryllis cv. Blossom Peacock.
| Data type | Number |
|---|---|
| Total number of raw reads | 28,974,793 |
| Total number of clean reads | 27,273,776 (94.13%) |
| Total number of transcript isoforms | 380,731 |
| Number of non-redundant unigenes | 171,462 |
| Number of unigenes between 200 bp and 500 bp in length | 128,058 (74.69%) |
| Number of unigenes between 500 bp and 1,000 bp in length | 25,138 (14.66%) |
| Number of unigenes greater than 1,000 bp in length | 18,266 (10.65%) |
| N50 length (bp) | 610 |
| Total length (bp) | 84,359,807 |
| Maximum length (bp) | 8,272 |
| Minimum length (bp) | 201 |
| Average length (bp) | 492 |
| Number of unigenes in the NR database | 47,359 (27.62%) |
| Number of unigenes in the Swiss-Prot database | 27,089 (15.80%) |
| Number of unigenes in the COG database | 14,151 (8.25%) |
| Number of unigenes in the GO database | 35,153 (20.50%) |
| Number of unigenes in the KEGG database | 18,048 (10.53%) |
| Number of unigenes annotated in at least one database | 47,359 (27.62%) |
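The N50 and average length reported in the table above are standard assembly summary statistics. As a minimal sketch of how they are usually defined (the function name `n50` and the toy lengths are illustrative assumptions, not the authors' pipeline), N50 is the length of the contig at which the running sum of lengths, sorted from longest to shortest, first reaches half of the total assembly length:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running sum to half
        # of the total length defines the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Toy lengths (hypothetical, for illustration only).
toy_lengths = [200, 300, 400, 500, 600]
toy_n50 = n50(toy_lengths)

# Consistency check using only figures from the table:
# total length divided by the number of unigenes.
mean_length = round(84_359_807 / 171_462)

print(toy_n50, mean_length)
```

The second calculation uses only values from the table and reproduces the reported average length of 492 bp.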
Figure 1. Features of amaryllis cv. Blossom Peacock unigene homology searches versus non-redundant protein databases (Nr) (E-value = 1E-10−5). (A) E-value distribution of BLASTx hits for each assembled unigene. (B) Similarity distribution of BLASTx hits for each assembled unigene. (C) Species-based distribution of the top BLASTx hits for each assembled unigene.
Figure 2. Classification of amaryllis cv. Blossom Peacock unigenes. (A) COG classification. A total of 14,151 assembled unigenes were annotated and assigned to 25 functional categories in this case. The x-axis denotes subgroups within the COG classification, whereas the y-axis indicates the number of genes in each main category. (B) GO classification of assembled unigenes at level 1. In this case, a total of 35,153 unigenes were grouped into three main GO categories: ‘biological processes’, ‘cellular components’, and ‘molecular function’. The x-axis denotes the subgroups within the GO annotation, whereas the y-axis indicates the percentage of specific gene categories. (C) The top 38 KEGG metabolic pathways for assembled unigenes. In this case, the x-axis denotes the number of genes in each metabolic pathway, whereas the y-axis indicates subgroups.
Summary of SNPs identified from unigenes of amaryllis cv. Blossom Peacock.
| Transitions | Number | Transversions | Number |
|---|---|---|---|
| C/T | 212,902 | A/T | 76,217 |
| A/G | 210,405 | A/C | 60,232 |
| | | T/G | 57,440 |
| | | C/G | 46,895 |
| Total | 423,307 | Total | 240,784 |
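The SNP counts in this table can be cross-checked with simple arithmetic. The sketch below uses only the numbers from the table; the variable names are illustrative. It sums each class, confirms the two totals, and derives the transition/transversion (Ts/Tv) ratio, which the table does not state explicitly:

```python
# SNP counts copied from the table above.
transitions = {"C/T": 212_902, "A/G": 210_405}
transversions = {"A/T": 76_217, "A/C": 60_232, "T/G": 57_440, "C/G": 46_895}

ts_total = sum(transitions.values())
tv_total = sum(transversions.values())
snp_total = ts_total + tv_total

# Ts/Tv ratio derived from the two class totals.
ts_tv_ratio = ts_total / tv_total

print(ts_total, tv_total, snp_total, round(ts_tv_ratio, 2))
```

The class totals match the table (423,307 and 240,784), the overall total is 664,091 SNPs, and the derived Ts/Tv ratio is approximately 1.76.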
Figure 3. Summary of SSRs identified in amaryllis cv. Blossom Peacock unigenes.
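To illustrate what SSR detection in unigene sequences involves, the following is a minimal regular-expression sketch. It is not the tool or the parameter set used in the study, neither of which is given in this record. The function `find_ssrs`, the minimum repeat thresholds in `MIN_REPEATS`, and the test sequence are all assumptions chosen for illustration:

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem repeats.
MIN_REPEATS = {2: 6, 3: 5, 4: 5}


def is_primitive(unit):
    """True if the motif is not itself a repeat of a shorter motif."""
    n = len(unit)
    return all(unit != unit[:d] * (n // d) for d in range(1, n) if n % d == 0)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect tandem repeats."""
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # One copy of the motif followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if not is_primitive(unit):
                # Skip e.g. "TATA", which is reported as a "TA" repeat instead.
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits


# A made-up sequence carrying a (TGT)7 repeat.
example = "GG" + "TGT" * 7 + "CC"
print(find_ssrs(example))
```

This sketch only finds perfect repeats of di-, tri- and tetranucleotide motifs; compound SSRs and other motif lengths would need additional handling.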
Genetic diversity estimates based on 21 SSR primers from 104 amaryllis accessions.
| Locus | Motif | Primer sequences (5′-3′) | Length (bp) | N | NP | H | I |
|---|---|---|---|---|---|---|---|
| FP027 | (TGT)7 | F: CCAAATGATCCCAAGGAAGA; R: TCATGACACCTTCGGAGACA | 184–196 | 4 | 4 | 0.380 | 0.561 |
| FP033 | (GAG)7 | F: GAGTCGAGGTGGGTGTTGAT; R: ATCTCCCCAAACCCCATAAG | 127–140 | 3 | 3 | 0.380 | 0.567 |
| FP047 | (AGA)7 | F: TGAATGTGTTTGAGGCTTGC; R: TTGATCCTCTTTTCTTGTGGC | 164–220 | 7 | 6 | 0.163 | 0.269 |
| FP077 | (CCCT)5 | F: GGCATTATCACGCCTAAGGA; R: CACCACAAGAAACCGAACAA | 146–212 | 4 | 3 | 0.265 | 0.401 |
| FP083 | (TTTA)5 | F: CCAACTGTAAGAAACCCCCA; R: CCCAAAGGCCTAAATTCACA | 183–195 | 3 | 3 | 0.055 | 0.121 |
| FP089 | (GGAG)5 | F: GAGGATGCACTCTTTGAGCC; R: CGTCAACTCCTCTTCCTTCG | 192–300 | 8 | 8 | 0.293 | 0.438 |
| FP105 | (GAA)7 | F: CGGTGGGAGAAGAAGAGATG; R: GAGACGATGAAGCTCCGAAT | 242–269 | 4 | 1 | 0.124 | 0.172 |
| FP115 | (TA)10 | F: CGGGTCAATGTTAAGCCAGT; R: CAGGTGATGAGCATTGGATG | 147–181 | 5 | 5 | 0.193 | 0.323 |
| FP116 | (AG)7-(AG)7 | F: TCGGGGCAGACATCTTTAAC; R: GCTTTGGGAGGTATTTTTGTGA | 145–167 | 4 | 4 | 0.195 | 0.332 |
| FP131 | (AG)10 | F: TCGAGGTGCTGTTTGTTTTG; R: AGACCAACGCAAGTCAGTCC | 125–133 | 3 | 3 | 0.316 | 0.481 |
| FP136 | (GAG)8 | F: GAGCTTGACCTGACGGACTC; R: GCAGAGCATGGCTTCTATCC | 175–190 | 5 | 5 | 0.271 | 0.416 |
| FP213 | (TTTA)5 | F: CCCCTTTTGTAGATGCCTCA; R: AATTGAGACAGGCGTTTTGC | 116–132 | 5 | 5 | 0.256 | 0.405 |
| FP215 | (ACC)7 | F: TCGCTTCTCCAATCTCGACT; R: GTCGATCGCAACCATTCTTT | 98–113 | 6 | 6 | 0.258 | 0.410 |
| FP220 | (GC)6-(TG)6 | F: TGCCATTTAAGATCAATGGAAG; R: AAGTGGGCAGCTGAAAAAGA | 140–148 | 3 | 3 | 0.287 | 0.449 |
| FP249 | (CTT)7 | F: TGTGGGTTCTATGCTTTCCC; R: CCCCTGCTTCATCTCCAATA | 131–146 | 5 | 5 | 0.394 | 0.568 |
| FP255 | (ATTT)5 | F: AGGAAATCATTGGAGACCGA; R: AATATAGCCCCTCTCACCCC | 145–154 | 3 | 3 | 0.344 | 0.499 |
| FP257 | (CT)10 | F: CAGCGCTCTTGCTCAGTAGA; R: ACTCAGGGTCATGAAAACGG | 137–167 | 8 | 8 | 0.214 | 0.354 |
| FP259 | (CT)8-(CA)7 | F: TCTCCAAAACCTTCTTCTCACA; R: CTCGAGGAGGAGAGATGGGT | 116–126 | 4 | 4 | 0.392 | 0.577 |
| FP280 | (AATA)5 | F: TGAACAGTGAAACTCGGCAG; R: TGTGGTGGAAATTTTCTTCATT | 166–194 | 6 | 6 | 0.379 | 0.558 |
| FP292 | (AATA)5 | F: GATGCAAGAAGGGTTCCAAA; R: TTGCATTTTAACAGCGCAAG | 110–122 | 4 | 3 | 0.126 | 0.229 |
| FP305 | (CAA)7 | F: CACCATGCCAACCTTCTTCT; R: CCTGCTGAGATTTTGCCTTC | 165–175 | 4 | 4 | 0.265 | 0.421 |
| Total | | | | 98 | 92 | 5.550 | 8.550 |
Note: N: Number of loci; NP: Number of polymorphic loci; H: Nei’s gene diversity; I: Shannon’s Information index.
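For readers unfamiliar with the two diversity statistics named in the note, the sketch below gives their textbook definitions for a single locus with allele (or band-state) frequencies p_i: Nei's gene diversity H = 1 − Σ p_i² and Shannon's Information index I = −Σ p_i ln p_i. How the per-marker values in the table were computed (software, band scoring, averaging across loci) is not stated in this record, so the function names and example frequencies here are assumptions for illustration only:

```python
import math


def nei_gene_diversity(freqs):
    """Nei's gene diversity: H = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def shannon_index(freqs):
    """Shannon's Information index: I = -sum(p_i * ln p_i)."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Hypothetical two-state frequencies (e.g. band present / band absent).
even = [0.5, 0.5]
skewed = [0.9, 0.1]

h_even, i_even = nei_gene_diversity(even), shannon_index(even)
h_skewed, i_skewed = nei_gene_diversity(skewed), shannon_index(skewed)

print(round(h_even, 3), round(i_even, 3))
print(round(h_skewed, 3), round(i_skewed, 3))
```

With two states, H cannot exceed 0.5 and I cannot exceed ln 2 (about 0.693). Every H and I value in the table falls below these bounds, which would be consistent with two-state scoring of each locus, although the record does not confirm this.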
Figure 4. N-J phylogenetic tree aligned with a structural analysis of 104 amaryllis accessions from the Netherlands and South Africa. (a) N-J tree for 21 SSR markers based on H values depicting three major clusters, including I (35 accessions), II (31 accessions) and III (38 accessions). Each of these clusters was further separated into two sub-clusters. (b) Multilocus cluster analysis using STRUCTURE software, demonstrating the four best fitting models (i.e., K = 5, K = 6, K = 7, and K = 8) on the basis of Evanno’s ΔK. Each individual accession on this figure is represented by a vertical line divided into K coloured bars.
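The ΔK statistic of Evanno used to rank K values in panel (b) can be sketched as follows. This is a minimal illustration, not the authors' analysis: the log-likelihood values are invented, the function name `delta_k` is hypothetical, and the sketch assumes the common variant that takes the absolute second-order difference of the mean log-likelihoods L(K) across replicate runs, divided by the standard deviation of L(K):

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Map each interior K to |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)).

    lnp_by_k maps K to a list of log-likelihoods from replicate runs.
    L(.) denotes the mean over replicates. K values without both
    neighbours (K-1 and K+1) are skipped. Replicates with zero
    spread at a given K would cause a division by zero.
    """
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
        )
        result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# Invented replicate log-likelihoods, for illustration only.
hypothetical_runs = {
    1: [-5000, -5010, -4990],
    2: [-4500, -4510, -4490],
    3: [-4400, -4405, -4395],
    4: [-4350, -4360, -4340],
}

dk = delta_k(hypothetical_runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because ΔK needs both neighbours, it is undefined for the smallest and largest K tested. When several K values show similarly high ΔK, as reported here for K = 5 to 8, the statistic alone does not single out one model.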