Literature DB >> 32595657

Characterization of Hemerocallis citrina Transcriptome and Development of EST-SSR Markers for Evaluation of Genetic Diversity and Population Structure of Hemerocallis Collection.

Sen Li1,2, Fangfang Ji1,2, Feifan Hou1,2, Huliang Cui1,2, Qingqing Shi1,2, Guoming Xing1,2, Yiqun Weng3, Xiuping Kang1,2.   

Abstract

Hemerocallis spp. commonly known as daylilies and night lilies, are among the most popular ornamental crops worldwide. In Eastern Asia, H. citrina is also widely cultivated as both a vegetable crop and for medicinal use. However, limited genetic and genomic resources are available in Hemerocallis. Knowledge on the genetic diversity and population structure of this species-rich genus is very limited. In this study, we reported transcriptome sequencing of H. citrina cv. 'Datonghuanghua' which is a popular, high-yielding variety in China. We mined the transcriptome data, identified and characterized the microsatellite or simple sequence repeat (SSR) sequences in the expressed genome. From ∼14.15 Gbp clean reads, we assembled 92,107 unigenes, of which 41,796 were annotated for possible functions. From 41,796 unigenes, we identified and characterized 3,430 SSRs with varying motifs. Forty-three SSRs were used to fingerprint 155 Hemerocallis accessions. Clustering and population structure analyses with the genotypic data among the 155 accessions reveal broader genetic variation of daylilies than the night lily accessions which form a subgroup in the phylogenetic tree. The night lily group included accessions from H. citrina, H. lilioasphodelus, and H. minor, the majority of which bloom in the evening/night, whereas the ∼100 daylily accessions bloomed in the early morning suggesting flowering time may be a major force in the selection of night lily. The utility of these SSRs was further exemplified in association analysis of blooming time among these accessions. Twelve SSRs were found to have significant associations with this horticulturally important trait.
Copyright © 2020 Li, Ji, Hou, Cui, Shi, Xing, Weng and Kang.

Entities:  

Keywords:  Hemerocallis; SSR; daylily; genetic diversity; night lily; population structure; transcriptome

Year:  2020        PMID: 32595657      PMCID: PMC7300269          DOI: 10.3389/fpls.2020.00686

Source DB:  PubMed          Journal:  Front Plant Sci        ISSN: 1664-462X            Impact factor:   5.753


Introduction

Hemerocallis species are among the most popular ornamental crops worldwide because of the large, conspicuous flowers and their adaptation to a wide range of soils and climates. In Eastern Asian countries, some species, especially H. citrina (night lily, or long yellow daylily) are important vegetables or medicinal plants with a long history of cultivation. For example, the first reference of H. fulva (daylily) was found in a writing in the Chou Dynasty of China (ca. 112–255 BCE) (Hu, 1968; Kitchingnan, 1985). The unopened flower buds (fresh or dried) of the night lily are consumed as a special vegetable (also known as “golden needle vegetable”). The medicinal value of Hemerocallis species has received more attention in recent years. The flower buds seem to be enriched with antioxidants, such as stelladerol and caffeoylquinic acid derivatives (Cichewicz and Nair, 2002; Hsu et al., 2011; Lin et al., 2011; Jiao et al., 2016). These secondary metabolites are used to treat anxiety and swelling (Uezu, 1998; Cichewicz et al., 2002; Gu et al., 2012; Yi et al., 2012; Du et al., 2014; Liu et al., 2014), and for other applications in modern medicine and biology (Venugopalan and Srivastava, 2015). Thus, Hemerocallis flowers have considerable potentials as “nutraceutical” or “functional” foods (Rop et al., 2011). In addition to their economic and medical values, some features of Hemerocallis species are of particular biological significance. For example, the day and night lilies provide an excellent model for understanding the genetic and molecular mechanisms of flower opening and flower senescing. All known species in this genus show strict circadian rhythm of flowering with the rapid opening and withering that lasts only a few hours indicting precise regulation of floral death probably by a programmed cell death system (Hasegawa et al., 2006; Nitta et al., 2010; Hirota et al., 2012, 2013). All Hemerocallis species exhibit self-incompatibility. As such, Hemerocallis was proposed as a future model system to study these biologically interesting phenomena (Rodriguez-Enriquez and Grant-Downton, 2012). As an ornamental plant, extensive breeding efforts during the last century have resulted in varieties with different flower size, color, shape, scent and blooming time. In the US, over 83,000 daylily varieties have been registered in the American Hemerocallis Society online database[1]. Some morphological characteristics of daylily and night lily plants and flowers are exemplified in Figure 1. Despite the economic and biological importance as well as extensive breeding efforts on Hemerocallis species, not much has been accomplished on genetic studies in this genus. As demonstrated in numerous other horticulture crops, molecular markers are important tools for genetic research and breeding in Hemerocallis with many potential applications such as evaluation of genetic diversity and population structure of germplasm collection, development of linkage maps, gene and QTL (quantitative trait loci) mapping or cloning, as well as marker-assisted selection of horticulturally important traits in breeding. Nevertheless, limited work has been done in the development of molecular markers for Hemerocallis species. The natural range of Hemerocallis encompasses temperate and sub-tropical Asia, with the main center of diversity of the genus in China, Korea and Japan (Flora of China, 2000). However, reports on the genetic diversity and population genetic analysis of Hemerocallis especially for the collections in China are sporadic (e.g., Tomkins et al., 2001; Podwyszynska et al., 2009; Vendrame et al., 2009; Miyake and Yahara, 2010). In these early reports, molecular markers employed included random amplified polymorphic DNA (RAPD) and amplified fragment length polymorphism (AFLP), each of which has limitations in practical uses (Varshney et al., 2007). Despite the increased use of single nucleotide polymorphism (SNPs), SSR markers remain popular in plant genetic and breeding studies because of their many desirable attributes, including hyper variability, a multi allelic nature, co-dominant inheritance, reproducibility, relative abundance, extensive genome coverage (including organellar genomes), relatively low cost, and amenability to high throughput genotyping (Parida et al., 2009). However, very few SSRs have been reported for Hemerocallis (e.g., Zhu et al., 2009). While there are several methods to develop SSR markers, transcriptome sequencing (RNA-Seq) has been an efficient and affordable way for large-scale, and genome-wide SSR discovery and marker development for various marker-based studies in non-model plants (e.g., Hudson, 2010; Strickler et al., 2012; Yuan et al., 2013; Zhang et al., 2013; Guo et al., 2016; Smithunna et al., 2016). Thus, the objective of the present study is to develop EST (expressed sequence tag)-SSR markers through transcriptome sequencing and examine their utility in evaluation of genetic diversity and population structure of our Hemerocallis collection. We also conducted association analysis to identify potential association of molecular markers with horticulturally important traits.
FIGURE 1

Phenotypic diversity in flowers, petals, sepals, and buds of two typical Hemerocallis species: H. citrina (night lily; A), and H. fulva (daylily, B). (C) Flower buds and opened flowers from different accessions. C1–C5 are opened flowers. C1: Hemerocallis cv. ‘Stella de Oro 2’; C2: Hemerocallis cv. ‘Frans Hals’; C3: H. citrina cv. ‘Datonghuanghua’ C4: Hemerocallis cv. ‘Blazing sun’; C5: Hemerocallis cv. ‘H400’; C6-C9 are young flower buds. C6: Hemerocallis cv. ‘Beijing 7’; C7: Hemerocallis cv. ‘Blue Sheen 1’; C8: H. citrina cv. ‘Zaohuanghua’; C9: Hemerocallis cv. ‘Little Grapette’.

Phenotypic diversity in flowers, petals, sepals, and buds of two typical Hemerocallis species: H. citrina (night lily; A), and H. fulva (daylily, B). (C) Flower buds and opened flowers from different accessions. C1–C5 are opened flowers. C1: Hemerocallis cv. ‘Stella de Oro 2’; C2: Hemerocallis cv. ‘Frans Hals’; C3: H. citrina cv. ‘Datonghuanghua’ C4: Hemerocallis cv. ‘Blazing sun’; C5: Hemerocallis cv. ‘H400’; C6-C9 are young flower buds. C6: Hemerocallis cv. ‘Beijing 7’; C7: Hemerocallis cv. ‘Blue Sheen 1’; C8: H. citrina cv. ‘Zaohuanghua’; C9: Hemerocallis cv. ‘Little Grapette’.

Materials and Methods

Plant Materials

One hundred and fifty-five Hemerocallis accessions were employed for genetic diversity analysis with 43 EST-SSRs. These accessions belong to at least 13 species from different geographic regions around the world including commercial varieties, landraces, and collections from the wild. The details of these 155 accessions are presented in Supplementary Table S1, and their geographic distributions are illustrated in Figure 2. The taxonomic status of more than half of the 155 accessions was labeled uncertain during collection (Hemerocallis spp. in Supplementary Table S1). All accessions were grown in the Hemerocallis Germplasm Resource Nursery located on the campus of Shanxi Agricultural University (Taigu, Shanxi Province, China). All plants used for sample collection have been grown for 3–5 years in the nursery. At full flowering stage, the fresh roots, flower buds, and leaves of the H. citrina cv ‘Datonghuanghua’ were collected, flash frozen in liquid nitrogen, and used for total RNA extraction and RNA-Seq.
FIGURE 2

Geographic distribution 155 Hemerocallis accessions used in this study.

Geographic distribution 155 Hemerocallis accessions used in this study.

RNA-Seq, Transcriptome Assembly, and Annotation

For RNA-Seq, total RNA was isolated from each sample with three plants pooled with a modified CTAB protocol (Tel-Zur et al., 1999). RNA quantity and quality were evaluated using the Agilent Bioanalyzer (Agilent Technologies, CA, United States). The integrity of the total RNA was also assessed through agarose gel electrophoresis. The cDNA library for RNA-Seq was prepared using Illumina TruseqTM RNA sample prep kit (Illumina, CA, United States) following manufacturer’s protocols. The concentration and insert size of the library were assessed using Qubit2.0 and Agilent 2100. Pair-end (125 PE) Illumina high-throughput sequencing was performed with a Hi-Seq 2500 sequencing machine. All samples were sequenced in the same instrument (HWI-7001455), same run (316), same follow cell (HH7VHADXX), and same lane (2). After resequencing, the adaptors and low quality reads were filtered from the raw reads. The clean reads were de novo assembled into contigs with an optimized k-mer length = 25 and group pairs distance = 300 (Grabherr et al., 2011) using the Trinity program[2]. Unigene sequences were aligned against databases of the Clusters of Orthologous Groups (COG) (Tatusov et al., 2000), Gene Ontology (GO) (Michael et al., 2000), Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa et al., 2004), euKaryotic Orthologous Groups (KOG) (Eugene et al., 2004). The functions of unigenes were predicted by BLAST against the NCBI non-redundant protein (Nr), and Swiss-Prot databases at an E-value of 10–5. The resulting datasets were further aligned to the Protein family (PFAM) database with HMMER (E-value 10–10). For further quantitative assessment of the completeness of the assembly and annotation, the H. citrina cv ‘Datonghuanghua’ transcriptome was subjected to BUSCO (Benchmarking Universal Single-Copy Orthologs) analysis (Simão et al., 2015) against the Viridiplantae_odb10, embryophyta_odb10, and liliopsida_odb10 databases.

Microsatellite Sequence Identification and EST-SSR Genotyping

SSR identification among the unigene sequences (>1 kb) was performed with the MIcroSAtellite (MISA) program[3]. All microsatellites containing di-, tri-, tetra-, penta-, hexa-nucleotide, and compound motifs with more than five repeats were included. The Primer3 program[4] was used to design primers for the identified SSRs. Each designed primer pair was first evaluated with in silico PCR procedure[5] using unigene sequences as the template. Primers with multiple amplicons were filtered out. Forty-three EST-SSR markers were selected for genotyping 155 Hemerocallis accessions. Genomic DNA was extracted from fresh leaves of each accession (pooled from five plants) using the CTAB method. A touch-down PCR procedure (Azhaguvel et al., 2009) was employed and the amplicons were separated using 9% non-denaturing polyacrylamide gel electrophoresis (PAGE) with 0.5 × TBE buffer for 1.5 h in 250V, and then visualized with silver staining.

Cluster and Population Structure Analyses

The SSR genotypic data were organized in a matrix in which 0 and 1 representing absent and present alleles, respectively (9 = missing data). The genetic distances were calculated through the SM coefficient by the Sim Qual procedure in NTSYSpc 2.10 (Rohlf and Jensen, 1989). The dendrogram was constructed using the unweighted pair-group method with arithmetic mean (UPGMA) clustering and drawn by NTSYSpc 2.10. We also applied Neighbor Joining (NJ) clustering on the dataset, which was implemented in the MEGA5.05 software package. A model-based Bayesian clustering was applied to infer genetic structure and define the number of clusters (gene pools) in the dataset using STUCTURE v.2.3.4 (Pritchard et al., 2000). No prior information was used to define the clusters. Independent runs were done by setting the number of clusters (K) from 1 to 15. Each run was comprised of a burn-in length of 10,000 followed by 100,000 MCMC (Monte Carlo Markov Chain); each replicate at a particular K-value was repeated 20 times. An ad hoc statistic ΔK based on the rate of changes in the log probability of data among successive K-values (Evanno et al., 2005) was calculated through Structure Harvester v.0.9.93 and used to estimate the most likely number of clusters (K). ΔK was calculated using ΔK = m[| L(K+1)-2L(K)+L(K-1)|]/s| L(K)|, where (L(K) is the logarithm of K, s is the standard deviation, and m is the mean.

Association Analysis for Blooming Time in 155 Hemerocallis Accessions

We performed association analysis of blooming time with SSR markers among the 155 accessions of the Hemerocallis collection. During flowering season, the time of flower opening was continuously monitored real-time using a digital video camera (360 Smart Camera, D606, 360 China). The flowering time of each plant from each accession was recorded. The blooming time of at least five flowers per accession was collected. The blooming time of each accession is provided in Supplementary Table S1. In association analysis, the blooming time was treated a qualitative trait. Accessions with blooming time from 4:00 to 10:00 and 16:00 to 24:00 were assigned “0” and “1,” respectively. Pairwise linkage disequilibrium (LD) and association analyses were performed using TASSEL4.3.6 (Bradbury et al., 2007; Dent et al., 2012), and calculation of LD and P-values. Association analysis was performed using both the general linear model (GLM, Q model) and the mixed linear model (MLM, Q+K model) in TASSEL. The comparison wise significance was computed using 1,000 permutations as implemented in GLM. The kinship matrix was generated by NTSYSpc 2.10 with option P3D for variance component estimation in MLM (Kang et al., 2008). Significance of marker-trait associations was determined at P ≤ 0.05 (Jaiswal et al., 2012). False discovery rate (FDR) was also used to detect true associations.

Gene Expression in Different Tissues of H. citrina cv. ‘Datonghuanghua’

The expression level of blooming time-related gene c33464.graph_c0 in different tissues (fresh roots, flower buds and leaves) of H. citrina cv. ‘Datonghuanghua’ (H0006) was carried out with quantitative real-time PCR (RT-qPCR) on an ABI prism 7500 Fast Real-time PCR system 147 machine (Thermo Fisher Scientific Inc., Waltham, MA, United States). Sequences of the primers for c33464.graph_c0 were GGCGAATTAGTCTGGAAAGAACTAGG (forward primer) and TGTTATGTTCCTCGTCCGTCCAC (reverse primer). The H. citrina actin gene, HcACT, was used as the reference. The primers sequences for this gene were GAGCAAGGAAATCACGGCACT (forward primer) and GGAACCTCCAATCCAAACACTGTAC (reverse primer). RT-PCR procedure followed Hou et al. (2017). Each sample had three biological and three technical replications.

Results

Transcriptome Sequencing, de novo Assembly, and Unigenes Annotation

After filtering of the raw sequencing data, nearly 56 million clean reads (14.15 Gb) were obtained from root, bud and leaf transcriptomes. Main RNA-Seq statistics for the three tissues are presented in Supplementary Table S2. In all three transcriptomes, >90% reads had Q30 or higher quality scores. Using the Trinity de novo assembly program, the ∼14.15 Gb high-quality reads were assembled into contigs, transcripts and unigenes. Main statistics of the assemblies are shown in Table 1. From 6,390,477 contigs (>25 bp in length), 164,723 transcripts were assembled (mean length 838 bp) representing 92,107 unigenes. The length distribution of these unigenes is illustrated in Figure 3. The transcripts with >500 bp in length accounted for 47.06% of all transcripts. There were 13,724,749 (76.9%), 15,731,521 (79.2%), and 14,841,142 (80.4%) reads from the roots, buds and leaves mapped to the assembly transcripts and unigenes, respectively.
TABLE 1

Transcriptome assembly statistics of H. citrina cv. ‘Datonghuanghua’.

Length range (bp)Contigs (%)Transcripts (%)Unigenes
200–3006,332,326 (99.09%)*53,717 (32.61%)43,822 (47.58%)
300–50028,381 (0.44%)33,471 (20.32%)22,209 (24.11%)
500–100015,997 (0.25%)31,124 (18.89%)12,983 (14.10%)
1000–20009,538 (0.15%)30,756 (18.67%)8,806 (9.56%)
>20004,235 (0.07%)15,655 (9.5%)4,287 (4.65%)
Total number6,390,477164,72392,107
Total length328,357,742138,156,55652,976,189
N50 length481,467908
Mean length51.38838.72575.16
FIGURE 3

Length distribution of unigenes from leaf, root and flower bud transcriptomes of H. citrina cv. ‘Datonghuahua’.

Transcriptome assembly statistics of H. citrina cv. ‘Datonghuanghua’. Length distribution of unigenes from leaf, root and flower bud transcriptomes of H. citrina cv. ‘Datonghuahua’. We annotated the 92,107 unigenes by searches against the COG, GO, KEGG, KOG, Nr, and Swiss-Prot databases. Of them, 41,796 (45.4%) had hits in all 7 databases. The unigene hits in each database are shown in Table 2 with the number of annotated unigenes being the highest in the NR (41,053) and the lowest in KEGG (9,375). Further BLAST searches against other databases showed that 12,537, 23,716, 23,393, 28,711, and 24,125 unigenes had at least one match in the COG, GO, KOG, Swiss-Prot and PFAM databases, respectively (Table 2). In BLASTx homology searches (at cutoff E-value of 10–5), the ten top species with hits in the Nr database were: Elaeis guineensis (8,779; 21.38%), Phoenix dactylifera (7,194; 17.52%), Musa acuminate (2,912; 7.09%), Nelumbo nucifera (2,273; 5.54%), Vitis vinifera (2,204; 5.37%), Populus euphratica (761; 1.85%), Oryza sativa (650; 1.58%), Citrus sinensis (582; 1.42%), Populus trichocarpa (554; 1.35%), and Hordeum vulgare (474; 1.15%).
TABLE 2

Number of unigenes with annotations in six databases of H. citrina cv. ‘Datonghuanghua’.

COGGOKEGGKOGNrSwiss-ProtPfamTotal
# Unigenes12,53723,7169,37523,39341,05328,71124,12541,796
≥300 nt9,51716,3096,58016,43428,54821,03819,08028,834
≥1000 nt5,3187,4373,0987,69412,0909,62910,34612,111
Number of unigenes with annotations in six databases of H. citrina cv. ‘Datonghuanghua’. To further evaluate the functions of unigenes, 12,537 of the assembled 41,796 unigenes were classified into 25 clusters based on the Cluster of Orthologous Groups (COG) analysis (Figure 4). Among the 25 COG categories, the cluster related to general function prediction (2,801 or 22.3%) was the largest, followed by the clusters of replication, recombination and repair (1,630 or 13.0%), transcription (1,397 or 11.1%), and secondary metabolites biosynthesis, transport and catabolism (519 or 4.1%). The clusters represented by the least number of unigenes were cell motility, nuclear structure, and extracellular structures (1, 0.008%).
FIGURE 4

Functional classification of the assembled unigenes based on Clusters of Orthologous Groups (COG) categories.

Functional classification of the assembled unigenes based on Clusters of Orthologous Groups (COG) categories. In addition, 23,716 unigenes matched with the GO database were categorized into 51 functional sub-groups of the three main GO groups: cellular component, molecular function, and biological process (Figure 5). The majority of the unigenes was assigned to biological processes (63,089, 43.7%), followed by cellular components (35,677, 37.2%), and molecular functions (27,719, 19.2%). Under the category of biological processes, metabolic process (15,976, 25.3%) and cellular process (13,749, 21.8%) were predominant. In the cellular component category, cell part (12,837, 23.9%) and cell (12,746, 23.8%) were the most abundant classes. As for the molecular function, catalytic activities (12,157, 43.9%) and binding (11,677, 42.1%) were the top two categories in numbers.
FIGURE 5

Functional classification of the assembled unigenes based on Gene Ontology (GO).

Functional classification of the assembled unigenes based on Gene Ontology (GO). In KEGG pathway analysis of the 41,796 unigenes, 9,375 (22.4%) could be assigned to 117 pathways that belong to five main categories including metabolism (6,783), genetic information processing (3,013), cellular processes (495), environmental information processing (321), and organismal systems (305) (see details in Figure 6). The colchicine content is a very important trait for commercial production of night lily. Among annotated unigenes, 20 were involved in the isoquinoline alkaloids metabolic pathway that is related with colchicine biosynthesis. The category with the largest number of unigenes was metabolism, in which the most represented five pathways were oxidative phosphorylation (ko00190) (336, 4.9%), glycolysis/Gluconeogenesis (ko00010) (325, 4.8%), carbon fixation in photosynthetic organisms (ko00710) (306, 4.5%), purine metabolism (ko00230) (242, 3.6%), and starch and sucrose metabolism (ko00500) (228, 3.4%).
FIGURE 6

Functional classification of the assembled unigenes based on Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. 9,375 unigenes are annotated to five categories: metabolism, genetic information processing, cellular processes, environmental information processing and organic systems. There are a large number of pathways of metabolism and genetic information processing (in total 106 pathways) but the picture is limited, so the first five pathways with the largest number of each category are selected for plotting. The value to the right of each bar indicates the number of unigenes annotated in the pathway.

Functional classification of the assembled unigenes based on Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. 9,375 unigenes are annotated to five categories: metabolism, genetic information processing, cellular processes, environmental information processing and organic systems. There are a large number of pathways of metabolism and genetic information processing (in total 106 pathways) but the picture is limited, so the first five pathways with the largest number of each category are selected for plotting. The value to the right of each bar indicates the number of unigenes annotated in the pathway. We conducted BUSCO analysis to evaluate completeness of the transcriptome using the near-universal orthologous gene groups. The details are presented in Supplementary Table S3 and illustrated in Supplementary Figure S1. For example, among the 430 BUSCOs in the viridiplantae_odb10 database, H. citrina cv. ‘Datonghuanghua’ transcriptome had 388 (90.2%) complete single or duplicated copies, 27 were fragmented, and 15 were missing, which indicated that this transcriptome assembly had adequate coverage and quality to represent the H. citrina transcriptome for various downstream analyses.

Identification and Characterization of EST-SSRs in Datonghuahua Transcriptomes

Using MISA, we mined microsatellite-containing sequences from 13,019 unigenes with >1,000 bp in length. As a result, 3,430 potential SSRs were identified from 2,977 unigenes, 453 of which contained more than one SSR (detailed in Supplementary Table S4). Details of the 2,977 unigenes, associated SSRs, and motif statistics are provided in Supplementary Tables S5, S6, respectively. The sequences of these unigenes in fasta format are provided in Supplementary Table S7. Among those SSRs, those with the tri-nucleotide repeat motifs were the most abundant (54.29%, 1,862), followed by SSRs with di-nucleotide (31.75%, 1,089), and tetra-nucleotide (3.24%, 111) motifs. Other types were rare. Among the di- and tri-nucleotide repeats, (GAA/TTC), and GA/TC) or (AG/CT) were the most common repeat motifs, respectively (Supplementary Table S6). From the 2,977 SSR-containing unigene sequences, primers were designed for 3,577 SSRs. The primer sequences for each SSR and associated information are provided in Supplementary Table S8. From the list, we tested 192 experimentally in six accessions. Based on the preliminary data, we finally selected 43 SSRs that had clear, and unique amplicons with expected size. Information for the 43 EST-SSRs is provided in Table 3.
TABLE 3

Information of 43 SSRs used for assessment of genetic diversity among 155 Hemerocallis accessions.

No.SSR nameSSR MotifForward primer sequence (5′→3′)Tm/°CReverse primer sequence (5′→3′)Tm/°CAmplicon size (bp)
1SAU00003(GTC)6…(AG)8…/(AG)6F:GTGTTGTCGGAGAAGGAGGA60.2R:CCAATTCGTTCACATCTCCC60.3226
2SAU00006(TAG)5…(GAG)5…(AGG)6F:CCAAGTAAGAGGGTGCATCA58.7R:ATGTCATCTCCATCAAGGGC59.9241
3SAU00008(GTCC)5 (CTCC)5F:TCTTAATCGGACCATCGAGC60.2R:TTGAAATCGAGAGATTGGGG60.0137
4SAU00009(TCA)5…(GA)10F:GTGGTTGATTGGAGGCTGAT59.9R:AGGGGTTGAGCACGTATTTG60.0261
5SAU00010(CA)9…(AC)7F:ATACAAGCCATTGGCAAACC59.8R:GAAGATTCCCCCGTTGATCT60.3220
6SAU00014(TC)7…(CT)10F:CGGAACAACATCGCCTAAAT60.0R:ATAAGGTTGGGGGTGAGGAG60.2254
7SAU00017(A)12…(TCT)5F:TCTCCTATATAAAAACAAAAACGAAAA58.1R:TCTCACTTTGAGCTTGCGAA59.9248
8SAU00029(AT)7F:TCACTGCAACCAAATGAAGG59.7R:GGGGAGGTTATGGACGAGTT60.2243
9SAU00032(CT)6F:AATGTCCCGACTTCAGATGG59.9R:GCGAGGAAGGGAAGAAAAAG60.3260
10SAU00042(AAC)6F:GGGATGAAAGAAACGCAATC59.5R:TGCACAAAACAATTGCATCA59.7183
11SAU00045(AAG)5F:GGTGGAGGTGAGGTTCTCAA60.1R:CCAGCTTTCAAATGGACGAT60.1175
12SAU00047(GCC)5F:AGAAGGTTGTGGATTGCTGG60.1R:CCTTCTCTCTCGTCTCCCCT59.9183
13SAU00048(AGT)5F:CTCTTCGGCAGGAGTAGTGG60.0R:CCAGTCCATCAAAACGTCCT60.0151
14SAU00052(CTC)5F:TGGTCTAATCATTGTCCCCC59.6R:TCAATCATTCAGTTGGGCTG59.6107
15SAU00055(CTC)6F:AAATCCCCCAACCATCTTTC60.0R:GAGTGACGGTGGGTTTGAGT60.0207
16SAU00059(CCA)5F:CCTCTTCTCTTCCCCTCACC60.2R:ATGAGCTTCACCAGGGTTTG60.1189
17SAU00063(AGA)6F:TTGTATACGGTTTCCTCGCC60.0R:GTCAAATTGAGCGGTTGGAT59.9228
18SAU00064(ATCG)5F:GCTTCTTCTTGTCAGCACCC60.0R:ACCGAAACCCTCGGTAAGTT59.9192
19SAU00069(GAAT)5F:GGCAGAGGAAGAATTGTCCA60.2R:ATGTTCTTTCCTTCGCCTCA59.8218
20SAU00080(AGAT)5F:TTCTCCGCCTTGGTGTACTT59.7R:CTCCTCAGCTTGCTTCGTCT59.9181
21SAU00095(CCGCCA)5F:TCTCTCTCATCGCTTCCCAT59.9R:GGCACGGTGAGCAACTCTAT60.3201
22SAU00096(AAAGCC)5F:ACCACATGTCCTCCTATCGC60.0R:TCCCGAATAGCGTGTCCATA61.4206
23SAU00097(T)10 (TG)8*F:ATCGCATATAGCGGACCTTG60.1R:CCGAGACACAAAACTCACGA59.9112
24SAU00102(TGAAAC)5F:CTCCAGTGACAGTCAGCCAA60.0R:TCTGTGATGCTGCTTGGTTC60.0130
25SAU00104(CTGTG)5F:TGAATCAAACTCCACCACCA59.9R:GTTTGGAGATTGGAGGGGTT60.2221
26SAU00109(ATCG)5F:GCTTCTTCTTGTCAGCACCC60.0R:ACCGAAACCCTCGGTAAGTT59.9192
27SAU00120(TATT)5F:CCTTCCTCAATCCAACCTCA60.0R:GTCGGGGTACAATTCGAAGA59.9150
28SAU00121(GTCA)5F:TTTGCATGCCTAGTCCTGTG59.9R:AAGTCGCAGACCAATTCGAT59.7206
29SAU00124(GAAG)5F:CAACAACAAATCCCCAATCC60.0R:CCATGGATCAAATACCACCC59.9204
30SAU00132(GAG)6F:ACGGAGGGTCGAGTTGTATG60.0R:AGTTGACGAATCTCCCGATG60.1184
31SAU00135(ACC)5F:GGAAATTTTCGTCCGAGTCA60.1R:TTTGGTCACACGAAAGGACA60.1206
32SAU00137(GAA)6F:TGAATCCGAGAGACCTTTGG60.2R:ACCCTCGTCCTCAAACCCT60.9145
33SAU00138(TTG)5F:AAGCAGCATTTGAGGTTTGG60.2R:CCCTGAAGCTCCTCATCTTG59.9176
34SAU00141(TGG)5F:TCGAGTTTAGGTGGAGGGTG60.1R:ACAGACAATGGCAACAACCA60.0170
35SAU00143(CCA)5F:CAGAGCTGTCCTTCCTCCAC60.0R:AAATCGATGGGCTACAGGTG60.0214
36SAU00144(CAT)5F:TAAACACCTCGACAGCCTCC60.3R:ACGAGGAAAGGAGGAGTGGT60.1200
37SAU00150(CGC)6F:TGCTTGTCATCCTCCTTTCC60.2R:TATAGATCGAAGCTCGGCGT60.0278
38SAU00160(AT)6F:AATAGGCATCCACCAGGAGA59.5R:CATGAACTCGGCACTCTGAA60.0177
39SAU00164(GA)6F:GATGAGGTCGAGAGGATGGA60.2R:AAAACCGATACCCAGAACCC60.1262
40SAU00171(TC)11…(CT)8F:TAGTCATTACCTTCGCCGCT59.9R:AATGAGCGGTTTTCCATGAC59.9230
41SAU00172(GTG)6…(GTG)5F:CATTTTTGGTGTGCCTGTTG60.0R:TCCACAAAATACAAATTGCAAAA59.4185
42SAU00176(GGT)5…(GGA)6F:AGGATCTGGTGAGGGGAGAG60.6R:TAAACCAACCCCTAGGCCC61.0232
43SAU00180(CAT)5…(AGG)6F:CTTTTGCTCCCAAATTGAGC59.8R:TAATGATCTTTTCGGGCACC59.9279
Information of 43 SSRs used for assessment of genetic diversity among 155 Hemerocallis accessions.

Genetic Diversity of Hemeracallis Germplasm Collection

The forty-three EST-SSRs were used to fingerprint 155 Hemeracallis accessions. A total of 396 alleles were detected at the 43 marker loci. The number of alleles detected at each locus varied from 2 to 25 with an average of 9.21 per locus. The polymorphism information content (PIC) ranged from 0.907 to 0.259 with an average of 0.606 (Table 4). Mean Shannon’s information index (I) was 1.294 (0.477–2.653). Mean observed heterozygosity (Ho) and mean expected heterozygosity (He) was 0.1757 (0.0000–0.5175) and 0.5935 (0.2104–0.9109). These indicated that the 43 SSRs provided rich genetic information (Table 4).
TABLE 4

Estimation of genetic diversity based on 43 SSRs among 155 Hemerocallis accessions.

No.SSR nameAllele numberPICIHoHeAmplicon range (bp)
1SAU0000350.3390.6230.06540.2902226–260
2SAU0000660.7531.4650.26800.7556210–241
3SAU00008160.8262.1350.02630.8305130–160
4SAU0000960.6031.1220.12900.6156250–261
5SAU00010100.6921.4490.09680.6792220–245
6SAU00014100.7521.6690.29730.7247175–190
7SAU0001720.3670.5160.11810.3351290–300
8SAU00029140.7841.8910.22300.7986250–270
9SAU0003280.7131.4660.20000.7094240–255
10SAU00042250.7562.1010.29140.7855180–200
11SAU00045100.6351.2910.14570.6242150–170
12SAU00047130.6651.6170.26490.7014183–205
13SAU00048100.7631.7170.05920.7491150–175
14SAU00052220.9072.6530.07040.9109107–130
15SAU0005580.5001.0620.10670.4802185–207
16SAU0005980.3570.7710.05670.3358200–215
17SAU0006350.5470.9450.13820.5069228–240
18SAU0006460.5981.0470.27520.5481200–215
19SAU00069100.6621.4670.10260.6443218–240
20SAU0008050.2610.5960.00000.2615160–175
21SAU0009560.6141.1270.33330.5940200–210
22SAU00096160.7842.0110.51750.8091210–235
23SAU00097120.8201.9310.24650.8026112–130
24SAU0010280.7481.5570.04030.7466125–140
25SAU0010470.6311.2480.20170.6407221–230
26SAU00109110.4180.9700.03360.4067170–185
27SAU0012040.5480.9150.35170.5558140–160
28SAU0012190.5281.0180.15220.4796190–206
29SAU0012460.5951.0150.20390.5761200–210
30SAU0013250.5190.8030.27630.5031180–190
31SAU0013540.2590.4770.08510.2104195–206
32SAU0013750.4500.7160.33110.3937145–160
33SAU0013850.5400.8970.30070.4848170–176
34SAU00141100.6421.3090.32260.6217170–190
35SAU0014380.3760.7310.12340.3285214–230
36SAU0014470.5180.9320.04200.5164200–220
37SAU00150160.8322.1730.03550.8264265–285
38SAU0016060.5481.0360.11410.5271170–185
39SAU0016460.5031.0140.00000.5045262–275
40SAU00171170.7431.9410.05190.7442220–250
41SAU0017220.4530.6110.30430.4218185–190
42SAU00176190.8342.2560.44900.8450205–232
43SAU0018080.6941.3320.10340.6934260–279
Mean9.210.6061.2940.17570.5935
Estimation of genetic diversity based on 43 SSRs among 155 Hemerocallis accessions. Pairwise genetic similarity coefficients were calculated among all samples which are provided in Supplementary Table S9. The mean genetic similarity coefficient among these accessions was 0.8642 (range: 0.7828–1.000). The least genetic similarity coefficient was 0.7928 between “Suqian 1-C” and US4 suggesting they are genetically the most distant among these accessions. Meanwhile, the genetic similarity coefficient between ‘Datonghuanghua’ (H0006) and ‘Qiaotouhuanghua’ (H0010) was 1.000 indicating they may actually belong to the same accession. A UPGMA dendrogram of the 155 accessions based on 43 EST-SSR markers was constructed (Supplementary Figure S2). A cophenetic correlation test (r = 0.8687) showed that the clustering results were credible. Among the 155 accessions, 55 from three species (H. citrina, H. lilioasphodelus, and H. minor) were clustered in one group (branches highlighted in red in Supplementary Figure S2). Fifty of the 55 accession in this group are known to be night lily accessions, which are represented by ‘Datonghuanghua’ (H0006) and ‘Panlonghua’ (H0004). Thus, this group could be called the “Night Lily Group.” Of the rest 100 accessions, 95 bloomed in the morning, and five [‘Guanglinghuanghua’ (H0011), ‘Stella de Oro 2’ (H0077), ‘Stella de Oro 3’ (H0139), ‘Xue Qiu Hong’ (H0092), and ‘Beijing 6’ (H0101)] open flowers at night (see flowering time data in Supplementary Table S1). These 100 accessions included multiple species such as H. fulva [var. sempervirens (H0050), and var. kwanso var. reasata (H0052)], H. aurantiaca (H0040), H. middendorffii (H0041), H. thunbergii (H0042), H. multiflora (H0045), H. altissima (H0046), and H. dumortieri (H0048) (Supplementary Table S1). We also constructed a Neighbor-Joining (NJ) tree for the 155 accessions, which is presented in Figure 7. The groupings were largely the same as the UPGMA dendrogram (Supplementary Figure S2) with all night lily accession in one clade. These data suggested that night lily has narrower genetic base than daylily, and the night lily is likely the result of human selection from daylilies for human consumption during crop evolution.
FIGURE 7

Neighbor-Joining tree of 155 Hemerocallis accessions based on 43 EST-SSR constructed using MEGA. Group I (branches in red) includes 55 accessions belonging mostly to the night lily group. Group II (branches in green) consisted of 100 accessions forming the daylily group.

Neighbor-Joining tree of 155 Hemerocallis accessions based on 43 EST-SSR constructed using MEGA. Group I (branches in red) includes 55 accessions belonging mostly to the night lily group. Group II (branches in green) consisted of 100 accessions forming the daylily group.

Population Structure of 155 Hemeracallis Accessions

A model-based Bayesian clustering approach was used to analyze 155 accessions by STRUCTURE. The logarithm of the likelihood [Ln P(D)] on average continued to increase with increasing numbers of assumed subpopulations (K) from 2 to 20 (Figure 8A). Difference between Ln P(D) values at two successive K-values became non-significant after K = 5. As the ad hoc statistic ΔK preferentially detects the uppermost level of structure and gave the highest value at K = 2 (Figure 8B; Lia et al., 2009; Emanuelli et al., 2013). Thus, K = 2 was considered as the most probable prediction for the number of subpopulations. Indeed, in the first round of structure analysis, the 155 accessions were split into two major clusters. Group I contained 53 accessions with an average Q value of 0.9867 (86.79% of them with Q > 0.98), and Group II contained 102 accessions with an average Q value of 0.9213 (78.43% of them with Q > 0.90) (Figure 8C and Supplementary Table S1).
FIGURE 8

STRUCTURE analysis of 155 Hemerocallis accessions. (A) The relationship between K-value and LnP(D) value. (B) The relationship between K-value and ΔK value. (C) Clustering of 155 Hemerocallis accessions. The red is Groups II consisting of 102 accessions (daylily group), the green is Group I including 53 accessions (night lily group).

STRUCTURE analysis of 155 Hemerocallis accessions. (A) The relationship between K-value and LnP(D) value. (B) The relationship between K-value and ΔK value. (C) Clustering of 155 Hemerocallis accessions. The red is Groups II consisting of 102 accessions (daylily group), the green is Group I including 53 accessions (night lily group). Group I consisted of 53 accessions, all of which were blooming at night (night lily group). Groups II consisted of 102 accessions, 97 accessions of them were blooming in the morning (daylily group) (Supplementary Table S1). Some accessions were admixtures of the two groups.

Association Analysis of Blooming Time Among 155 Accessions

Blooming time is a very important horticultural trait for both daylily and night lily varieties. We recorded blooming time of all 155 accessions, and the data were presented in Supplementary Table S1. Among them, 96 bloomed (almost all daylily) in the morning (7–10 a.m.), and the rest (almost night lily) in the later afternoon or evening (4–11:30 p.m.). To confirm the utility of the EST-SSR markers developed in the present study, we conducted association analysis of blooming time using these markers. We first examined linkage disequilibrium (LD) in the Hemerocallis population. Of the 903 pairwise combinations generated for 43 the EST-SSR loci, 786 (87.0%) showed LD (Supplementary Table S10 and Supplementary Figure S3). At P ≤ 0.001, 74 pairs (8.2%) had mean r2 of 0.4266 indicating strong LD. At P ≤ 0.001, the standardized disequilibrium coefficients (D’) was 0.8435 which was close to 1.0 suggesting low chance of recombination between pairs of loci, which may be caused by the long-term asexual reproduction among the Hemerocallis collection. Association analysis was performed using both the general linear model (GLM, Q model) and the mixed linear model (MLM, Q+K model) in TASSEL. The results are presented in Table 5, and illustrated in Figure 9. In the GLM model, 12 markers showed significant association with blooming time, 8 of which had extremely high significance (P ≤ 0.01; mean r2 = 0.0459) (Points above the blue line in Figure 9c). Similarly, under the MLM model, 10 markers were significantly associated with blooming time and 5 had extremely significant association (P ≤ 0.01; mean r2 = 0.0560) (Points above the blue line in Figure 9d). The 10 markers with significant associations with the blooming time in both GLM and MLM were SAU00008, SAU00045, SAU00047, SAU00045, SAU00063, SAU00064, SAU00109, SAU00121, SAU00143, and SAU00150. Among them, SAU00109 had the strongest association (P << 0.001 in MLM model) that could explain 8.8% phenotypic variance (8.0% in GLM) (Table 5). The predicted functions of the 12 unigenes associated with these SSRs are provided in Table 5. Both SAU00064 and SAU00109 were associated with the same unigenes c22731.graph_c0, which seems to encode a eukaryotic translation initiation factor 2D. Another unigene, c33464.graph_c0, associated with marker SAU00063 was annotated to encode a LHY-like protein, which is known to be regulated by circadian clock. The expression of c33464.graph_c0 in flower buds was significantly higher than that of roots and leaves in both RNA-Seq and RT-qPCR experiments (Figure 10). The unigene c47752.graph_c0 associated with SAU00008 was predicted to encode phosphatase 2C 35/65, and its expression level in roots was significantly different from that in flower buds and leaves.
TABLE 5

Twelve SSR markers with significant association with blooming time in 155 Hemerocallis accessions*.

Markers with Significant associationR2
Unigenes Associated with markersPredicted functions of unigenes
GLMMLM
SAU000080.05538**0.06179**c47752.graph_c0Protein phosphatase 2C 35
SAU000140.04321*c50474.graph_c029 kDa ribonucleoprotein B
SAU000450.03733**0.04143**c17410.graph_c0F-box protein PP2-A15
SAU000470.04347**0.04721**c18505.graph_c0Pre-mRNA-processing protein 40A isoform X1
SAU000550.02924*0.03254*c24436.graph_c0Uncharacterized protein LOC105044077
SAU000630.04051**0.04161**c33464.graph_c0Protein LHY-like
SAU000640.02803*0.03182*c22731.graph_c0Eukaryotic translation initiation factor 2D isoform X1
SAU001090.08038**0.08781**c22731.graph_c0Eukaryotic translation initiation factor 2D isoform X1
SAU001210.0373**0.04022*c52364.graph_c0bZIP transcription factor 60
SAU001320.0194*c20852.graph_c0protein-tyrosine-phosphatase MKP1-like
SAU001430.02814**0.03023*c19709.graph_c0Basic proline-rich protein-like isoform X2
SAU001500.04503**0.04799*c28194.graph_c0Myosin IB heavy chain-like isoform X1
FIGURE 9

Association analysis of SSR markers with flower timing among 155 Hemerocallis accessions. In (a,b), QQ plots showing the P value of associated markers deviated from the expected P-value in GLM Model (a) and MLM Model (b). Blue dots and red lines indicate the P-value of the associated markers, and the expected P-value, respectively. In (c,d), Manhattan plots showed markers associated with flowering time in GLM Model (c) and MLM Model (d). Threshold: the red line is P = 0.05 and the blue line is P = 0.01.

FIGURE 10

The expression of c33464.graph_c0 based on quantitative real-time PCR and RNA-Seq in bud, leaf and root of H. citrina cv. ‘Datonghuanghua’. The results show the mean ± SD (error bars) and were generated from three biological replicates. ** mean significantly difference, P ≤ 0.01. HcACT was used as an internal control for RT-qPCR.

Association analysis of SSR markers with flower timing among 155 Hemerocallis accessions. In (a,b), QQ plots showing the P value of associated markers deviated from the expected P-value in GLM Model (a) and MLM Model (b). Blue dots and red lines indicate the P-value of the associated markers, and the expected P-value, respectively. In (c,d), Manhattan plots showed markers associated with flowering time in GLM Model (c) and MLM Model (d). Threshold: the red line is P = 0.05 and the blue line is P = 0.01. Twelve SSR markers with significant association with blooming time in 155 Hemerocallis accessions*. The expression of c33464.graph_c0 based on quantitative real-time PCR and RNA-Seq in bud, leaf and root of H. citrina cv. ‘Datonghuanghua’. The results show the mean ± SD (error bars) and were generated from three biological replicates. ** mean significantly difference, P ≤ 0.01. HcACT was used as an internal control for RT-qPCR.

Discussion

In this study, 92,107 Hemerocallis transcriptomic unigenes were assembled using the Illumina Hi-Seq 2500 platform (Table 1). The N50 length of these unigenes was 908 bp, and the average length was 575 bp. These results were comparable to what obtained in other recent studies in species of the Liliaceae family, such as Lilium regale (N50 = 920 bp, average length = 682 bp) (Cui et al., 2018). The number of unigenes and contained SSRs obtained from the present study (92,107 unigenes with 3,430 SSRs) was also close the number identified in other three Liliaceae species (L. formolong, L. longiflorum, and L. longiflorum) (average unigenes = 72,256) (Biswas et al., 2018). These results suggested that closely related species share similar gene contents and distribution of SSR sequences in their genomes. Fingerprinting is among the most popular uses of molecular markers. In Hemerocallis, hundreds of varieties have been released. Many varieties have the same names but may be morphologically different, or have the same appearances but with different names. Marker-based fingerprinting can help resolve these issues. In the study, the two landraces ‘Malinhuanghua (H0029)’ and ‘Malinhuanghua2 (H0125)’ had the same name (in Chinese), but have different geographic origins. Clustering analysis indicated that they were located in different clades (Figure 7 and Supplementary Figure S2). Therefore, these two accessions do not seem to share any common parent, but may have the same name by chance. The three accessions, ‘Taiguxuancao (H0038),’ ‘Stella de Oro 2 (H0077),’ and ‘Stella de Oro 3 (H0139)’ may have a similar situation. In addition, three accessions, H. minor Mill. (H0007) and H. minor Mill.2 (H2802) were all introduced from Qingyang, Gansu Province, China, but clustering analysis did not support their close relatedness. These apparent mis-identifications were probably because the original names of these foreign varieties were changed right after their introduction. This is also likely true for many accessions that were introduced from one place to another place inside China. Work in the present study presents a good example to show the power of molecular markers in variety identification and protection. Both clustering analysis and STRUCTURE analysis place ∼50 accessions into one group (Figure 7 and Supplementary Figure S2) that bloom in the evening, while the rest of ∼100 accession all bloom in the morning (Supplementary Table S1). Marker-based phylogenetic analysis revealed that the night lily accessions were more genetically related, which seems to be consistent with their geographical distribution. Accessions in the daylily group were collected from more geographically diverse locations (Supplementary Table S1). Based on our multiple observations (Ji et al., 2018), accessions in the daylily group are also morphologically more diverse than those in the night lily group. For example, the colors of petals and sepals of daylilies may vary from yellow, yellow-orange, orange, orange-red, red, purple-red to purple. For flower size, the width variation coefficient was high in petals and sepals. Among them, the orange petal width variation coefficient was the highest of 42.15% and the average width was 35.90 mm (13.58–75.56 mm). Similarly, the variation coefficient of the orange sepal width was the highest at 38.74%, and the average width was 23.40 mm (13.20∼48.34 mm). The difference between stalk height and leaf width were highly significant. The inflorescences were rich in morphological variation ranging from mini inflorescence, extremely short branches, to large branched inflorescence, with varying amounts of blossoms. On the other hand, the 55 accessions in the night lily group (branches in red in Figure 7 and Supplementary Figure S2 were collected from nine provinces of China, all of which were diploids and are edible. For these accessions, both petals and sepals were yellow while the width of petals and sepals were very small, with the mean values 17.46 and 13.77 mm, respectively. As compared with accessions in the daylily group, the stalk height (main stem) of night lily group was in general higher (129.08 cm), the leaf was narrower (3.16 cm), and the average number of blossoms was 22.10. Fifty-three accessions were blooming at night, except for ‘Panlonghua’ (H0004) and ‘Panlonghua 2’ (H0111) that bloomed in the morning. The results from both marker analysis (Figure 7 and Supplementary Figure S2) and morphological observations indicate that the night lily group may have narrower genetic base than the daylily group accessions in this collection. This seems reasonable because the daylily group accessions had a more diverse geographic origin (Supplementary Table S1). Also, many of the night lily accessions may share common parents during their breeding and selection. From the crop evolution perspective, night lily was probably selected from the genetically more diverse daylily gene pool for vegetable use. During this process, blooming at night may be a main target of selection which is critical for vegetable use to collect the tender unopened flower bud during day time. A few accessions were placed in different clades when different programs were used in clustering analysis, for example, ‘Panlonghua’ and ‘Panlonghua 2’ were clustered into night lily group by NTSYS and MEGA, which were in daylily group by STRUCTURE (Supplementary Table S1, Figures 7, 8, and Supplementary Figure S2). ‘Panlonghua’ and ‘Panlonghua 2’ were edible landraces, which were probably the hybrids between the night lily and the daylily. They displayed characteristics of both parents including morning blooming and the narrow perianth traits. ‘Stella de Oro 2’ (H0077), ‘Xue Qiu Hong’ (H0092), ‘Beijing 6’ (H0101), and ‘Little Bumble Bee’ (H0141) were daylily varieties that blossom at night, but they were clustered together with other daylily accessions that bloomed during the day. These observations suggest gene flow between the two cultivated groups during the long-term breeding and selection. Hermerocallis varieties are often recognized as either daylily or night lily based on their blooming time. Probably only a few genes with strong phenotypic effects are underlying the blooming time (Masahiro et al., 2006). In this study, we identified 12 SSRs with strong association with blooming time (Table 5). The SSR marker SAU00063 with the strongest association is in the unigene (c33464.graph_c0) that was annotated to encode a LHY-like protein (Table 5). The LHY (Late Elongated Hypocotyl) gene has been shown to regulate flower timing in a number of plants (e.g., Ramos-Sanchez et al., 2019; Zhang et al., 2019). Our work in the present study provides a foundation for gene discovery of flower timing in Hermerocallis. However, additional work is needed to validate this result.

Conclusion

This is the first report on the Hemerocallis transcriptome, and a large number of EST-SSR markers have been developed from this transcriptome, which will provide an excellent resource for researchers and breeders focusing on Hemerocallis. More importantly, our results showed that this strategy using high-throughput sequencing technology is feasible and extremely convenient and efficient in the genus Hemerocallis. The genes associated with circadian rhythm provide germplasm resources and genes basis for further study on flowering rhythm of this genus.

Data Availability Statement

The datasets presented in this study can be found in NCBI under accession numbers SRR11610941, SRR11610942, and SRR11610943 (Submission ID: SUB7327030; BioProject ID: PRJNA628147).

Author Contributions

SL designed and conducted the whole research. FJ collected the phenotype data and conducted the GWAS analysis. FH collected some accessions and developed the EST-SSR. HC analyzed the transcriptome data. QS ran the PAGE gel. GX and XK kept the Hemerocallis germplasm. YW advised the project and revised the manuscript.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
  34 in total

Review 1.  Endophytes as in vitro production platforms of high value plant secondary metabolites.

Authors:  Aarthi Venugopalan; Smita Srivastava
Journal:  Biotechnol Adv       Date:  2015-07-28       Impact factor: 14.227

2.  Efficient control of population structure in model organism association mapping.

Authors:  Hyun Min Kang; Noah A Zaitlen; Claire M Wade; Andrew Kirby; David Heckerman; Mark J Daly; Eleazar Eskin
Journal:  Genetics       Date:  2008-03       Impact factor: 4.562

3.  Sequencing breakthroughs for genomic ecology and evolutionary biology.

Authors:  Matthew E Hudson
Journal:  Mol Ecol Resour       Date:  2008-01       Impact factor: 7.090

4.  Nitrogen and nitric oxide regulate Arabidopsis flowering differently.

Authors:  Zhong-Wei Zhang; Yu-Fan Fu; Yang-Hong Zhou; Chang-Quan Wang; Ting Lan; Guang-Deng Chen; Jian Zeng; Yang-Er Chen; Ming Yuan; Shu Yuan; Jin-Yao Hu
Journal:  Plant Sci       Date:  2019-04-18       Impact factor: 4.729

5.  Effects of Hemerocallis on sleep in mice.

Authors:  E Uezu
Journal:  Psychiatry Clin Neurosci       Date:  1998-04       Impact factor: 5.188

6.  Relative role of flower color and scent on pollinator attraction: experimental tests using F1 and F2 hybrids of daylily and nightlily.

Authors:  Shun K Hirota; Kozue Nitta; Yuni Kim; Aya Kato; Nobumitsu Kawakubo; Akiko A Yasumoto; Tetsukazu Yahara
Journal:  PLoS One       Date:  2012-06-15       Impact factor: 3.240

7.  TransRate: reference-free quality assessment of de novo transcriptome assemblies.

Authors:  Richard Smith-Unna; Chris Boursnell; Rob Patro; Julian M Hibberd; Steven Kelly
Journal:  Genome Res       Date:  2016-06-01       Impact factor: 9.043

8.  Genetic diversity and population structure assessed by SSR and SNP markers in a large germplasm collection of grape.

Authors:  Francesco Emanuelli; Silvia Lorenzi; Lukasz Grzeskowiak; Valentina Catalano; Marco Stefanini; Michela Troggio; Sean Myles; José M Martinez-Zapater; Eva Zyprian; Flavia M Moreira; M Stella Grando
Journal:  BMC Plant Biol       Date:  2013-03-07       Impact factor: 4.215

9.  A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes.

Authors:  Eugene V Koonin; Natalie D Fedorova; John D Jackson; Aviva R Jacobs; Dmitri M Krylov; Kira S Makarova; Raja Mazumder; Sergei L Mekhedov; Anastasia N Nikolskaya; B Sridhar Rao; Igor B Rogozin; Sergei Smirnov; Alexander V Sorokin; Alexander V Sverdlov; Sona Vasudevan; Yuri I Wolf; Jodie J Yin; Darren A Natale
Journal:  Genome Biol       Date:  2004-01-15       Impact factor: 13.583

10.  Antidepressant-like effects of the hydroalcoholic extracts of Hemerocallis citrina and its potential active components.

Authors:  Bingjian Du; Xiaoshuang Tang; Fei Liu; Chunyue Zhang; Guanghua Zhao; Fazheng Ren; Xiaojing Leng
Journal:  BMC Complement Altern Med       Date:  2014-09-01       Impact factor: 3.659

View more
  2 in total

1.  Tissue-specific transcriptome for Rheum tanguticum reveals candidate genes related to the anthraquinones biosynthesis.

Authors:  Tao Zhou; Tianyi Zhang; Jiangyan Sun; Honghong Zhu; Miao Zhang; Xumei Wang
Journal:  Physiol Mol Biol Plants       Date:  2021-11-17

2.  De novo transcriptome assembly, gene annotation, and EST-SSR marker development of an important medicinal and edible crop, Amomum tsaoko (Zingiberaceae).

Authors:  Mengli Ma; Hengling Meng; En Lei; Tiantao Wang; Wei Zhang; Bingyue Lu
Journal:  BMC Plant Biol       Date:  2022-09-29       Impact factor: 5.260

  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.