
Transcriptome analysis and annotation: SNPs identified from single copy annotated unigenes of three polyploid blueberry crops.

Yunsheng Wang1, Muhammad Qasim Shahid2,3, Fozia Ghouri2,3, Sezai Ercişli4, Faheem Shehzad Baloch5, Fei Nie6.   

Abstract

Blueberry is an increasingly popular perennial fruit valued for its health benefits. Developing new blueberry varieties for different climatic zones is essential to meet worldwide demand. Marker-assisted breeding is considered an ideal way to develop new varieties because its breeding cycle is shorter than that of conventional breeding. Simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers are widely used molecular tools for marker-assisted breeding and can be detected at large scale by transcriptome sequencing. Here, we sequenced the leaf transcriptomes of 19 rabbiteye (Vaccinium ashei Reade), 13 southern highbush (Vaccinium corymbosum L. × native southern Vaccinium spp.) and 22 northern highbush (Vaccinium corymbosum L.) blueberry cultivars using next-generation sequencing. A total of 80.825 Gb of clean data, with an average of about 12.525 million reads per cultivar, was obtained. We assembled 58,968, 55,973 and 53,887 unigenes from the clean data of the rabbiteye, southern highbush and northern highbush blueberry cultivars, respectively. Among these unigenes, 3599, 3495 and 3513 were detected as candidate resistance genes in the three blueberry crops. Moreover, we identified 8756, 9020 and 9198 SSR markers from these unigenes, and 7665, 4861 and 13,063 SNPs from the annotated single copy unigenes, respectively. The results will be helpful for molecular genetics and association analysis of blueberry, provide basic molecular information on pest and disease resistance, and offer a large number of molecular tools for marker-assisted breeding of blueberry cultivars with different adaptive characteristics.

Year:  2019        PMID: 31034501      PMCID: PMC6488077          DOI: 10.1371/journal.pone.0216299

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Blueberries are perennial flowering shrubs or small trees comprising about twenty members of section Cyanococcus, genus Vaccinium, family Ericaceae [1]. The fruit is delicious and famous worldwide for its high anthocyanin content, and it is listed among the top five healthful non-citrus fruits in North America [2,3]. Previous studies showed that blueberry anthocyanins have multiple health benefits, including retarding age-related diseases such as Alzheimer’s and enhancing memory [4], reducing eye strain, preventing macular degeneration, exhibiting anti-cancer activity [5,6], and reducing the risk of heart disease [7]. Blueberry fruit is also a good raw material for sauces, juices and wine [8,9], and is used as a dye because of its high pigment content [10]. In the recent decade, world blueberry production has increased significantly, especially in newly emerging producer countries in Asia, Oceania and South America [11-13]. World production of highbush blueberry, a major blueberry crop, passed 1 billion pounds in 2012 [14]. However, the cultivars planted worldwide still come mainly from North America [15], and the new producing countries have climatic and soil conditions that differ from those of the native blueberry-producing area [16]. To cope with these varied ecological and climatic conditions, more new, widely adaptive cultivars are required for the development and growth of the blueberry industry. However, blueberry is a perennial fruit crop with a long juvenile period and a genome of complex ploidy [17-19]. Conventional breeding therefore requires a long time to overcome these unfavorable factors and to select key traits [20-22], and it also consumes a lot of manpower and resources [23].
Modern marker-assisted breeding and genetic engineering techniques can overcome these problems and accelerate the breeding process [19]. With the advent of high-throughput sequencing technology and the development of bioinformatics analysis, genomics has become a common approach in biological laboratories. Blueberry research has entered the genomic era with the availability of large amounts of genomic data [24]. For example, the molecular mechanism of cold adaptation in blueberry has been studied extensively using functional genomics methods, especially RNA-seq and gene expression analysis under cold conditions [25-31]. Genes related to the metabolism of blueberry antioxidant substances were explored by transcriptome analysis [32]. Changes in the gene expression profiles of blueberry after infection with Bacillus anthracis were studied by RNA-seq [33]. Metabolite profiling showed transcriptional regulation of abscisic acid and flavonoid metabolism during blueberry fruit development [34], and candidate genes involved in fruit ripening were identified [35]. An EST sequence database of cultivated blueberry (Vaccinium corymbosum) was established in 2007 [36]. Meanwhile, a reference genome of blueberry (Vaccinium corymbosum, diploid) has been published, and researchers can access it through the Integrated Genome Browser (IGB) 8.5.2 software (http://bioviz.org/igb/). However, the above studies were each limited to an individual blueberry cultivar, and genome information on different blueberry cultivars or populations has not yet been reported. Moreover, few studies have addressed the development of SSR or SNP markers or haplotype-phased genome assembly in blueberry by genotyping by sequencing (GBS) and whole genome sequencing [37-39]. Molecular markers are indispensable tools for marker-assisted breeding.
SSR and SNP markers are two attractive and widely used marker types because of their many merits, including co-dominance, reproducibility, locus specificity, and random genome-wide distribution in many organisms [40,41]. In this genomic era, developing SSR and SNP markers on high-throughput next-generation sequencing platforms has become common practice, and marker-assisted breeding has likewise entered the genomics era [42-45]. In the present study, we sequenced the leaf transcriptomes of 19 rabbiteye, 13 southern highbush and 22 northern highbush blueberry cultivars using next-generation sequencing. Our aims were (1) to collect functional genome information about different blueberry cultivars; (2) to gain preliminary insight into the molecular mechanism of blueberry adaptation by mining resistance genes; and (3) to develop SSR and SNP markers to assist breeding and related studies of blueberry.

Materials and methods

Ethics statement

No specific permissions were required for these locations or activities because all samples were collected from the blueberry germplasm nursery of the Majiang Blueberry Industry Engineering Technology Center, Guizhou, China. We collected leaves from blueberry cultivars for research and confirm that the field studies did not involve any endangered or protected species.

Plant material and RNA extraction

We extracted total RNA from the young leaves of 2–3-year-old seedlings of 54 blueberry cultivars planted at the blueberry germplasm nursery of the Majiang Blueberry Industry Engineering Technology Center (Wuyangma village, Xuanwei town, Majiang county, Guizhou province, China), comprising 19 rabbiteye, 22 northern highbush and 13 southern highbush blueberry cultivars (S1 Table). Total RNA was extracted with the Spectrum plant total RNA kit (Sigma-Aldrich-STRN250 MSDS, USA), strictly following the manufacturer's guidelines. High quality RNA with an RNA integrity number (RIN) above 7.0 was used for RNA sequencing.

Library construction and sequencing

High quality total RNA (A260/A230 above 2.0, A260/A280 between 1.8 and 2.0, clear electrophoretic bands, concentration above 50 ng/μL) was used to construct paired-end sequencing libraries, and sequencing was done according to the sequencer provider’s instructions as follows. First, the total RNA was treated with DNase, and poly-A-containing mRNA was separated from the total RNA using poly-T-oligo-attached magnetic beads. Second, the purified mRNA was fragmented into pieces of approximately 300–500 bases; these fragments served as templates to synthesize the first strand of cDNA, which in turn served as the template to synthesize the second strand. Third, the synthesized double-stranded cDNA was purified and quantified after end repair, A-tailing and adapter ligation. The purified cDNA was then enriched by a 15-cycle PCR reaction to complete the sequencing library. Finally, paired-end sequencing was conducted on the Illumina HiSeq 4000 platform. Raw reads in FASTQ format have been deposited at NCBI and are available under accession PRJNA511922.

Raw data filtering

We obtained clean reads for assembly by filtering the raw reads according to the following rules: 1) remove reads containing adapters; 2) remove reads containing more than 10% unknown nucleotides (N); 3) remove reads in which more than 50% of the bases are of low quality (Q-value ≤ 20).
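The three rules can be sketched as a per-read predicate. This is only an illustration, not the authors' pipeline: the function names, the Phred+33 quality encoding, the substring-based adapter check and the example adapter sequence are all assumptions, and paired-end handling is not shown.

```python
def phred33(qual_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(c) - 33 for c in qual_string]


def keep_read(seq, quals, adapters=(), max_n_frac=0.10,
              max_lowq_frac=0.50, lowq_cutoff=20):
    """Return True if a read passes the three filtering rules.

    seq      -- read sequence
    quals    -- per-base Phred scores (same length as seq)
    adapters -- adapter sequences to screen for (hypothetical input;
                the actual adapter list is not given in the text)
    """
    if not seq or len(seq) != len(quals):
        return False
    seq = seq.upper()
    # Rule 1: discard reads containing an adapter sequence.
    if any(a and a.upper() in seq for a in adapters):
        return False
    # Rule 2: discard reads with more than 10% unknown bases (N).
    if seq.count("N") / len(seq) > max_n_frac:
        return False
    # Rule 3: discard reads in which more than 50% of bases have Q <= 20.
    low = sum(1 for q in quals if q <= lowq_cutoff)
    if low / len(quals) > max_lowq_frac:
        return False
    return True
```

For example, `keep_read("ACGTNNACGT", [30] * 10)` rejects the read because 20% of its bases are N, whereas a read with exactly 10% N is kept, since the rule only removes reads with more than 10%.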

De novo assembly

Although the genome of a diploid highbush blueberry is available (http://bioviz.org/igb/), the sequencing coverage and integrity of this reference genome are very low. We therefore assembled the unigenes of the three blueberry crops independently using Trinity, a software package designed specifically for assembling short reads without a reference genome [46]. Unigenes longer than 201 bp were included in the statistics and used for further analysis.

Annotation of unigenes

We carried out basic annotations, including protein functional annotation, pathway annotation, COG/KOG functional annotation and Gene Ontology (GO) enrichment analysis, to predict the molecular functions of the assembled unigenes. First, we used the BLASTx program [47] with an E-value threshold of 1e-5 to search the NCBI non-redundant protein database (http://www.ncbi.nlm.nih.gov), the Swiss-Prot protein database (http://www.expasy.ch/sprot), the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [48], and the COG/KOG database [49]. We obtained the protein functional annotation of each unigene from its best alignment. We then performed GO functional annotation of the unigenes using the Blast2GO software [50], and functional classification of the unigenes using the WEGO software [51].
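As an illustration of taking the best alignment per unigene under the 1e-5 threshold, the sketch below parses BLAST tabular output. It assumes the standard 12-column tab-separated format (query ID, subject ID, ..., E-value in column 11, bit score in column 12); the function name and the tie-breaking by bit score are assumptions, not details from the study.

```python
def best_hits(lines, max_evalue=1e-5):
    """Pick the best hit per query from BLAST tabular (12-column) output.

    Hits with an E-value above max_evalue are ignored. The best hit is the
    one with the lowest E-value; ties are broken by the higher bit score.
    Returns a dict: query ID -> (subject ID, E-value, bit score).
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        current = best.get(query)
        if current is None or (evalue, -bitscore) < (current[1], -current[2]):
            best[query] = (subject, evalue, bitscore)
    return best
```

Unigenes that do not appear in the returned dictionary had no hit at or below the threshold and would count as unannotated for that database.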

Identification of resistance genes

We used all assembled unigenes to query the plant resistance genes database (PRGdb; http://prgdb.org) with an E-value threshold of 1e-5.

Detection of SSR markers and primer designing

We used the program MISA (http://pgrc.ipk-gatersleben.de/misa/) to identify SSR markers with the following parameters: (1) motifs ranged from 2 to 6 nucleotides; (2) the minimum number of repeat units was six for 2-nucleotide motifs, five for 3-nucleotide motifs, and four for 4–6-nucleotide motifs; (3) the maximum interruption length between two SSR markers was set to 100 bp. The program Primer3 (http://primer3.ut.ee/) was used to design the corresponding primers with the following criteria: the GC content of primer sequences ranged from 40% to 60%, and the expected PCR product size ranged from 100 to 250 bp.
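The repeat thresholds in (1) and (2) can be illustrated with a small regular-expression scan. This is a simplified sketch, not MISA itself: it does not reproduce MISA's handling of compound or interrupted repeats (parameter 3), it may miss a repeat that directly abuts a homopolymer run, and the function names are hypothetical.

```python
import re

# Minimum number of repeat units per motif length, as stated in the text.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit.

    'AA' and 'ATAT' are rejected so that homopolymers and dinucleotide
    repeats are not reported again under a longer motif length.
    """
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif
                   for k in range(1, n))


def find_ssrs(seq):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A motif of length k followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

Positions are 0-based. A run of five AG units is not reported because the dinucleotide threshold is six.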

SNP calling

We used the program TopHat v2.0.14, which is built on the Bowtie software package (http://bowtie-bio.sourceforge.net/index.shtml), with default parameters to call the original SNP dataset. To avoid false positive loci as far as possible, we filtered the original SNP dataset by the following criteria: the sequencing quality of the base at the SNP locus reached Q30, the read depth of the alternative base at the SNP locus reached five, the minor allele frequency of the SNP locus was greater than 15%, and the SNP was located in an annotated single copy unigene. To identify single copy unigenes, we first performed all-against-all pairwise alignment of the unigenes from the different species using blastp, and unigene pairs with an E-value lower than 1e-7 were regarded as homologous genes. We then clustered unigenes that were homologous to each other into gene families using the program OrthoMCL (http://orthomcl.org/orthomcl/). If a gene family included only one unigene in each species, it was regarded as a single copy unigene.
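The filtering criteria and the single copy rule can be expressed compactly as follows. This is a minimal sketch under stated assumptions: the function names and input layout are hypothetical, "read depth of the opposite base" is interpreted as the depth of the alternative allele, and the gene families are assumed to be already available as lists of (species, unigene) pairs, as a clustering tool such as OrthoMCL would provide after parsing.

```python
def passes_snp_filter(base_quality, alt_depth, maf, in_single_copy_unigene,
                      min_quality=30, min_alt_depth=5, min_maf=0.15):
    """Apply the four SNP criteria described in the text.

    base_quality -- Phred quality of the base at the SNP locus (>= Q30)
    alt_depth    -- read depth of the alternative base (>= 5; interpretation)
    maf          -- minor allele frequency (must be greater than 0.15)
    in_single_copy_unigene -- True if the SNP lies in a single copy unigene
    """
    return (base_quality >= min_quality
            and alt_depth >= min_alt_depth
            and maf > min_maf
            and bool(in_single_copy_unigene))


def single_copy_families(families, species):
    """Return IDs of families with exactly one unigene in each species.

    families -- dict: family ID -> list of (species, unigene ID) pairs
    species  -- list of species names that must each occur exactly once
    """
    selected = []
    for family_id, members in families.items():
        counts = {name: 0 for name in species}
        unexpected = False
        for name, _unigene in members:
            if name in counts:
                counts[name] += 1
            else:
                unexpected = True  # member from a species outside the set
        if not unexpected and all(c == 1 for c in counts.values()):
            selected.append(family_id)
    return selected
```

With three species, a family holding two rabbiteye unigenes, or one lacking a northern highbush member, is excluded; only families with a one-to-one-to-one membership remain.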

Results

Data statistics and Unigenes assembly

We obtained about 248.26, 139.28 and 288.81 million raw reads from the leaf transcriptomes of 19 rabbiteye, 13 southern highbush and 22 northern highbush blueberry cultivars, respectively, using the HiSeq 4000 platform. After filtering the reads containing adapters, more than 10% of unknown nucleotides and low quality bases (highbush and northern highbush blueberry cultivars, and the average length of three unigenes clusters were 857 bp, 873 bp and 896 bp, respectively (S2 Table).

Annotation of Unigenes

Of the 45,535, 42,914 and 43,630 unigenes, a total of 28,091, 28,115 and 27,256 were functionally annotated by one or more databases (Nr, Swiss-Prot, KOG and KEGG), accounting for 61.69%, 65.51% and 62.47% of the total unigenes, respectively (Table 1). Among the unigenes annotated by the Nr database in the three crops, the top 15 species, which were hit by about 60% of annotated unigenes, were Vitis vinifera, Theobroma cacao, Sesamum indicum, Nelumbo nucifera, Jatropha curcas, Prunus mume, Nicotiana tomentosiformis, Gossypium arboreum, Nicotiana sylvestris, Populus euphratica, Brassica napus, Citrus sinensis, Medicago truncatula, Solanum tuberosum, and Gossypium raimondii (S3 Table). Among the unigenes annotated by the Swiss-Prot database, the numbers that fell within the E-value ranges 0~1E-150, 1E-150~1E-125, 1E-125~1E-100, 1E-100~1E-75, 1E-75~1E-50, 1E-50~1E-25 and 1E-25~1E-5, based on the degree of match, were 3866, 929, 949, 1176, 1480, 2155, 3670, 6285 for rabbiteye blueberry; 3828, 958, 912, 1220, 1529, 2192, 3815, 6299 for southern highbush blueberry; and 3952, 966, 951, 1171, 1498, 2100, 3512, 5934 for northern highbush blueberry, respectively (S4 Table). Annotation by the KOG database showed that the largest group of unigenes in the three blueberries fell into “General function prediction only”, with 6204 (36.36%), 6154 (35.75%) and 5990 (36.15%) unigenes, followed by “signal transduction mechanisms” with 3451 (20.23%), 3375 (19.61%) and 3311 (19.98%) unigenes and “posttranslational modification, protein turnover, chaperones” with 3238 (18.98%), 3313 (19.25%) and 3227 (19.47%) unigenes, respectively (Table 2). According to the KEGG annotation, the unigenes of the three blueberries were associated with 129 metabolism pathways.
The top five metabolism pathways were “Plant-pathogen interaction”, “Carbon metabolism”, “Ribosome”, “Protein processing in endoplasmic reticulum” and “Biosynthesis of amino acids” (S5 Table). GO enrichment analysis was used for functional annotation of the unigenes: 17,751, 18,237 and 17,503 unigenes hit 94,620, 97,611 and 94,168 GO terms, an average of 5.33, 5.35 and 5.38 hits per unigene (S6 Table). “Metabolic process” was the main term in the “biological process” category, “cell” and “cell part” were enriched in the “cellular component” category, and “catalytic activity” and “binding” were significantly enriched in the “molecular function” category (S6 Table).
Table 1

Overview of unigenes annotation in transcriptome of three blueberry crops. Values are the number (percentage) of total annotated unigenes.

| Database | Rabbiteye blueberry | Southern highbush blueberry | Northern highbush blueberry |
| --- | --- | --- | --- |
| Nr | 28028 (61.55%) | 28029 (65.31%) | 27189 (62.32%) |
| Swiss-Prot | 20510 (45.04%) | 20753 (48.36%) | 20084 (46.03%) |
| COG | 17061 (37.47%) | 17213 (40.11%) | 16570 (37.98%) |
| KEGG | 10431 (22.91%) | 10755 (25.06%) | 10265 (23.51%) |
| Annotated by one or more above databases | 28091 (61.69%) | 28115 (65.51%) | 27256 (62.47%) |
| None of the above four databases | 17444 (38.31%) | 14799 (34.49%) | 16374 (37.53%) |

Table 2

KOG (COG) annotation of unigenes in transcriptome of three blueberry crops. Values are the number of unigenes annotated by KOG.

| Classification of molecular function | Rabbiteye blueberry | Southern highbush blueberry | Northern highbush blueberry |
| --- | --- | --- | --- |
| RNA processing and modification | 1768 | 1729 | 1676 |
| Chromatin structure and dynamics | 469 | 502 | 464 |
| Energy production and conversion | 1060 | 1042 | 1026 |
| Cell cycle control, cell division, chromosome partitioning | 703 | 724 | 725 |
| Amino acid transport and metabolism | 754 | 792 | 776 |
| Nucleotide transport and metabolism | 222 | 232 | 228 |
| Carbohydrate transport and metabolism | 1054 | 1045 | 1007 |
| Coenzyme transport and metabolism | 188 | 202 | 194 |
| Lipid transport and metabolism | 897 | 910 | 868 |
| Translation, ribosomal structure and biogenesis | 1232 | 1269 | 1205 |
| Transcription | 1551 | 1615 | 1574 |
| Replication, recombination and repair | 889 | 885 | 894 |
| Cell wall/membrane/envelope biogenesis | 338 | 324 | 321 |
| Cell motility | 6 | 14 | 10 |
| Posttranslational modification, protein turnover, chaperones | 3238 | 3313 | 3227 |
| Inorganic ion transport and metabolism | 542 | 543 | 557 |
| Secondary metabolites biosynthesis, transport and catabolism | 840 | 857 | 798 |
| General function prediction only | 6204 | 6154 | 5990 |
| Function unknown | 1184 | 1206 | 1172 |
| Signal transduction mechanisms | 3451 | 3375 | 3311 |
| Intracellular trafficking, secretion, and vesicular transport | 1366 | 1444 | 1385 |
| Defense mechanisms | 195 | 208 | 190 |
| Extracellular structures | 73 | 68 | 80 |
| Nuclear structure | 105 | 108 | 94 |
| Cytoskeleton | 511 | 539 | 644 |

Detection and statistics of R-Genes

We identified 3599, 3495 and 3513 candidate R-gene unigenes, which belong to more than 15 families, in rabbiteye, southern highbush and northern highbush blueberries, respectively. The numbers of candidate unigenes per R-gene family followed almost the same trend in the three blueberries. The RLP family was by far the largest, with 996, 1055 and 1016 candidate unigenes, accounting for 27.67%, 30.19% and 28.92% of the total candidate R-gene unigenes. It was followed by NL with 549 (15.25%), 509 (14.56%) and 518 (14.75%); N with 504 (14.00%), 475 (13.59%) and 473 (13.46%); TNL with 417 (11.59%), 382 (10.93%) and 406 (11.56%); and CNL with 433 (12.03%), 374 (10.70%) and 397 (11.30%) in the three blueberry crops, respectively (Table 3).
Table 3

Candidate R-gene identified from unigenes in the transcriptomes of three blueberry crops. Values are the number (percentage) of putative R-genes.

| R-gene families | Rabbiteye blueberry | Southern highbush blueberry | Northern highbush blueberry |
| --- | --- | --- | --- |
| RLP | 996 (27.67%) | 1055 (30.19%) | 1016 (28.92%) |
| NL | 549 (15.25%) | 509 (14.56%) | 518 (14.75%) |
| N | 504 (14.00%) | 475 (13.59%) | 473 (13.46%) |
| TNL | 417 (11.59%) | 382 (10.93%) | 406 (11.56%) |
| CNL | 433 (12.03%) | 374 (10.70%) | 397 (11.30%) |
| RLK | 252 (7.00%) | 260 (7.44%) | 270 (7.69%) |
| RLK-GNK2 | 141 (3.92%) | 156 (4.46%) | 138 (3.93%) |
| T | 70 (1.94%) | 67 (1.92%) | 71 (2.02%) |
| CN | 66 (1.83%) | 63 (1.80%) | 75 (2.13%) |
| Pto-like | 44 (1.22%) | 34 (0.97%) | 34 (0.97%) |
| Mlo-like | 22 (0.61%) | 23 (0.66%) | 18 (0.51%) |
| L | 15 (0.42%) | 15 (0.43%) | 16 (0.46%) |
| RPW8-NL | 6 (0.17%) | 6 (0.17%) | 6 (0.17%) |
| Other | 84 (2.33%) | 76 (2.17%) | 75 (2.13%) |
| Total | 3599 | 3495 | 3513 |

Detection of SSR markers

We identified 8756, 9020 and 9198 SSR markers from 7251, 7282 and 7518 unigenes of the rabbiteye, southern highbush and northern highbush blueberry cultivars, respectively. The numbers of SSRs with different core motifs showed similar distribution patterns in the three blueberry crops. Dinucleotide repeat motifs were the most numerous, with 5829, 6177 and 6230 SSRs, accounting for 66.57%, 68.48% and 67.73% of the total SSR markers, followed by tri-, tetra-, hexa- and pentanucleotide repeat motifs (Table 4). Among all motifs, “AG/CT” had the highest proportion, accounting for 61.96%, 63.80% and 62.85% of the total SSR markers in the three blueberry crops, followed by “AAG/CTT”, which accounted for 8.44%, 8.58% and 8.76%, while each of the other motifs accounted for about 5% or less of the total SSR markers. Most of the SSR markers had flanking sequence suitable for primer design (S7 Table).
Table 4

SSR markers identified from unigenes in transcriptome of three blueberry crops. Values are the number (percentage) of SSR markers.

| SSR motif | T1-Rabbiteye blueberry (8756) | T2-Southern highbush blueberry (9020) | T3-Northern highbush blueberry (9198) |
| --- | --- | --- | --- |
| AC/GT | 313 (3.57%) | 324 (3.59%) | 324 (3.52%) |
| AG/CT | 5425 (61.96%) | 5755 (63.80%) | 5781 (62.85%) |
| AT/AT | 81 (0.94%) | 89 (0.97%) | 115 (1.25%) |
| AAC/GTT | 113 (1.29%) | 103 (1.14%) | 99 (1.08%) |
| AAG/CTT | 739 (8.44%) | 774 (8.58%) | 806 (8.76%) |
| AAT/ATT | 27 (0.31%) | 28 (0.31%) | 32 (0.35%) |
| ACC/GGT | 435 (4.97%) | 382 (4.24%) | 406 (4.41%) |
| ACG/CGT | 114 (1.30%) | 97 (1.08%) | 97 (1.05%) |
| ACT/AGT | 25 (0.29%) | 25 (0.28%) | 24 (0.26%) |
| AGC/CTG | 294 (3.36%) | 268 (2.97%) | 286 (3.11%) |
| AGG/CCT | 452 (5.16%) | 414 (4.59%) | 428 (4.65%) |
| ATC/ATG | 172 (1.96%) | 157 (1.74%) | 163 (1.77%) |
| CCG/CGG | 129 (1.47%) | 154 (1.71%) | 156 (1.70%) |
| AAAG/CTTT | 29 (0.33%) | 28 (0.31%) | 28 (0.30%) |
| AAAT/ATTT | 32 (0.37%) | 28 (0.31%) | 34 (0.37%) |
| others | 376 (4.29%) | 394 (4.37%) | 419 (4.55%) |

Identification of SNPs

After strict filtering, we identified 7665, 4861 and 13,063 SNPs in the leaf transcriptomes of rabbiteye, northern highbush and southern highbush blueberry cultivars, respectively (S8 Table). Among these SNPs, transitions were 1.90, 1.93 and 1.93 times as frequent as transversions. The G/A and C/T substitution patterns were much more frequent than the others, with 2647, 1770 and 4580 G/A SNPs and 1560, 980 and 2413 C/T SNPs, accounting for 34.53%, 36.41%, 35.06% and 20.35%, 20.16%, 18.47% of the total SNPs in rabbiteye, northern highbush and southern highbush blueberry cultivars, respectively (Fig 1). The minor allele frequency of these SNPs ranged from 0.15 to 0.50. When these values were divided into seven intervals of 0.05, most SNPs fell into the 0.35–0.40 interval in rabbiteye and northern highbush blueberry and into the 0.20–0.25 interval in southern highbush blueberry cultivars (Fig 2). The heterozygosity ratio of all SNPs ranged from 0.00 to 0.80 in the three blueberry crops. When these values were divided into 9 intervals of 0.10, most SNPs fell into the 0.4–0.5 interval in rabbiteye and northern highbush blueberry and into the 0.3–0.4 interval in southern highbush blueberry (Fig 3).
Fig 1

Statistics of SNP distribution pattern in transcriptome of three blueberry crops.

Fig 2

Minor allele frequency distribution of SNPs in transcriptome of three blueberry crops.

Fig 3

Heterozygosity distribution of SNPs in transcriptome of three blueberry crops.
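The transition-to-transversion ratios reported for the SNPs can be computed from a list of allele pairs as sketched below. The function names and the input layout (a list of two-base tuples) are assumptions made for illustration; the study's own scripts are not described.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ti_tv_ratio(snps):
    """Return transitions / transversions for a list of (ref, alt) pairs."""
    transitions = sum(1 for ref, alt in snps if is_transition(ref, alt))
    transversions = len(snps) - transitions
    if transversions == 0:
        return float("inf")
    return transitions / transversions
```

G/A and C/T substitutions are both transitions under this classification, which is consistent with their dominance among the SNP types described above.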

Discussion

In the last decade, genomics research on fruit crops based on high-throughput sequencing has made dramatic progress: reference genomes of more than ten fruit crops and a large amount of RNA data have been published. These investigations have greatly promoted studies of the molecular biology, evolutionary genetics and breeding of fruit crops [52-55]. Recently, the genome of blueberry (Vaccinium corymbosum, diploid) was published, and researchers can access it through the Integrated Genome Browser (IGB) 8.5.2 software (http://bioviz.org/igb/). However, the assembly integrity and sequencing coverage of this reference genome are very low [36]. Therefore, in this study we assembled the transcriptome without a reference genome to obtain more information about gene functions. Although many genome and transcriptome sequences have been deposited in GenBank [24-36], reports on large-scale SSR or SNP marker development are limited. In this study, we developed more than 8000 SSR and 4000 high-quality SNP markers in each of the three blueberry crops from transcriptome data, which should greatly help blueberry studies in molecular genetics, molecular breeding and association analysis that rely mainly on such molecular tools. Plants have evolved a wide range of defense mechanisms to protect themselves against pathogens. The major mechanism is disease resistance, which is commonly mediated by semi-dominant or dominant R genes that encode receptors and detect pathogen infection either directly, by recognizing pathogen effector molecules, or indirectly, by recognizing effector-modified host targets [56,57]. Crops are the plant groups that provide the basic sources of energy and nutrition for human survival.
A number of factors reduce global crop yield, such as large numbers of plants grown together and inadequate supply of fertilizer and water, and crop plants are susceptible to a large number of pathogens and pests, including bacteria, insects, oomycetes, and nematodes [58,59]. Developing disease-resistant varieties by different methods, such as genetic transformation with plant resistance genes, is therefore believed to be a good way to protect crops from diseases, insects and pests. Identifying plant resistance genes and R-gene loci is the basic premise for combining various resistance sources effectively and for engineering new strategies for disease resistance in agriculture [60,61]. In this study, we identified thousands of unigenes homologous to R-genes belonging to more than 13 families, which offers molecular information for understanding the ecological adaptation of blueberry. At the same time, these unigenes provide basic molecular tools for resistance breeding. In the past decade, single nucleotide polymorphisms (SNPs) identified by high-throughput sequencing have become a popular and conventional choice of genetic marker, especially for diploid species [62-65]. However, identifying SNPs in polyploids is more challenging because complex genome duplication events give rise to homologous SNPs (polymorphic positions occurring across subgenomes within and among individuals). SNP markers have been produced at large scale on next-generation sequencing platforms in a few polyploid species by using different methods to filter false positives [66]. For example, to filter out as many false positives as possible, SNPs from uniquely mapping reads or with a read depth of more than three have been used in transcriptome data of B. napus [67,68], and SNPs obtained with these strategies have been used successfully for genome-wide association studies [69].
Another SNP filtering strategy, combining read depth, quality and SNP density of transcriptome sequences, was used successfully in potato [70]. In addition, the Universal Network-Enabled Analysis Kit (UNEAK) pipeline implemented in the TASSEL-GBS software (https://bitbucket.org/tasseladmin/tassel-5-source/wiki/Tassel5GBSv2Pipeline) has been developed and proven effective for identifying SNPs in complex species such as switchgrass [71]. These successful cases show that high-quality SNPs can be identified even in the most difficult polyploid species. Most cultivated blueberries are polyploid: for example, lowbush blueberry is tetraploid [68]; northern highbush blueberry can be diploid (2x), tetraploid (4x) or hexaploid (6x), with 3x and 5x plants produced by hybridization [72,73]; and rabbiteye blueberry is hexaploid [74]. The southern highbush and inter-highbush blueberries were also generated by crossing northern highbush with other species, and both are polyploid [75]. To overcome the adverse effects of complex genome duplication events, we further filtered the original SNP dataset, which was generated using TopHat v2.0.14 with default parameters. We systematically considered the sequencing quality, read depth, minor allele frequency and annotation status of each SNP locus, and retained only SNPs from annotated single copy unigenes (a gene family that includes only one unigene in a species). We believe that the final SNP datasets are reliable molecular tools for association studies and marker-assisted breeding of blueberry.

Supporting information

Raw reads statistics of transcriptome in three blueberry crops. (XLSX)

Unigenes assembled information of transcriptome in three blueberry crops. (XLSX)

Species distribution of Nr annotation of unigenes in transcriptome of three blueberry crops. (XLSX)

E-value distribution of Swiss-Prot annotation of unigenes in transcriptome of three blueberry crops. (XLSX)

KEGG annotation of unigenes in three blueberry crops. (XLS)

GO enrichment analysis of unigenes identified in three blueberry crops. (XLS)

SSR loci identified and their corresponding primers designed in the unigenes of three blueberry crops. (XLSX)

SNPs loci identified from leaf transcriptome of three blueberry crops. (XLSX)

1.  Determination of ploidy level and nuclear DNA content in blueberry by flow cytometry.

Authors:  D E Costich; R Ortiz; T R Meagher; L P Bruederle; N Vorsa
Journal:  Theor Appl Genet       Date:  1993-09       Impact factor: 5.699

2.  De novo sequencing and comparative analysis of the blueberry transcriptome to discover putative genes related to antioxidants.

Authors:  Xiaoyan Li; Haiyue Sun; Jiabo Pei; Yuanyuan Dong; Fawei Wang; Huan Chen; Yepeng Sun; Nan Wang; Haiyan Li; Yadong Li
Journal:  Gene       Date:  2012-09-17       Impact factor: 3.688

3.  Major differences observed in transcript profiles of blueberry during cold acclimation under field and cold room conditions.

Authors:  Anik L Dhanaraj; Nadim W Alkharouf; Hunter S Beard; Imed B Chouikha; Benjamin F Matthews; Hui Wei; Rajeev Arora; Lisa J Rowland
Journal:  Planta       Date:  2006-09-05       Impact factor: 4.116

4.  Lipophilic and hydrophilic antioxidant capacities of common foods in the United States.

Authors:  Xianli Wu; Gary R Beecher; Joanne M Holden; David B Haytowitz; Susan E Gebhardt; Ronald L Prior
Journal:  J Agric Food Chem       Date:  2004-06-16       Impact factor: 5.279

5.  WEGO: a web tool for plotting GO annotations.

Authors:  Jia Ye; Lin Fang; Hongkun Zheng; Yong Zhang; Jie Chen; Zengjin Zhang; Jing Wang; Shengting Li; Ruiqiang Li; Lars Bolund; Jun Wang
Journal:  Nucleic Acids Res       Date:  2006-07-01       Impact factor: 16.971

6.  RNA-Seq analysis and annotation of a draft blueberry genome assembly identifies candidate genes involved in fruit ripening, biosynthesis of bioactive compounds, and stage-specific alternative splicing.

Authors:  Vikas Gupta; April D Estrada; Ivory Blakley; Rob Reid; Ketan Patel; Mason D Meyer; Stig Uggerhøj Andersen; Allan F Brown; Mary Ann Lila; Ann E Loraine
Journal:  Gigascience       Date:  2015-02-13       Impact factor: 6.524

7.  A next-generation sequencing method for genotyping-by-sequencing of highly heterozygous autotetraploid potato.

Authors:  Jan G A M L Uitdewilligen; Anne-Marie A Wolters; Bjorn B D'hoop; Theo J A Borm; Richard G F Visser; Herman J van Eck
Journal:  PLoS One       Date:  2013-05-08       Impact factor: 3.240

Review 8.  Application of genomics-assisted breeding for generation of climate resilient crops: progress and prospects.

Authors:  Chittaranjan Kole; Mehanathan Muthamilarasan; Robert Henry; David Edwards; Rishu Sharma; Michael Abberton; Jacqueline Batley; Alison Bentley; Michael Blakeney; John Bryant; Hongwei Cai; Mehmet Cakir; Leland J Cseke; James Cockram; Antonio Costa de Oliveira; Ciro De Pace; Hannes Dempewolf; Shelby Ellison; Paul Gepts; Andy Greenland; Anthony Hall; Kiyosumi Hori; Stephen Hughes; Mike W Humphreys; Massimo Iorizzo; Abdelbagi M Ismail; Athole Marshall; Sean Mayes; Henry T Nguyen; Francis C Ogbonnaya; Rodomiro Ortiz; Andrew H Paterson; Philipp W Simon; Joe Tohme; Roberto Tuberosa; Babu Valliyodan; Rajeev K Varshney; Stan D Wullschleger; Masahiro Yano; Manoj Prasad
Journal:  Front Plant Sci       Date:  2015-08-11       Impact factor: 5.753

Review 9.  Genomics-assisted breeding in fruit trees.

Authors:  Hiroyoshi Iwata; Mai F Minamikawa; Hiromi Kajiya-Kanegae; Motoyuki Ishimori; Takeshi Hayashi
Journal:  Breed Sci       Date:  2016-01-01       Impact factor: 2.086

10.  PRGdb 3.0: a comprehensive platform for prediction and analysis of plant disease resistance genes.

Authors:  Cristina M Osuna-Cruz; Andreu Paytuvi-Gallart; Antimo Di Donato; Vicky Sundesha; Giuseppe Andolfo; Riccardo Aiese Cigliano; Walter Sanseverino; Maria R Ercolano
Journal:  Nucleic Acids Res       Date:  2018-01-04       Impact factor: 16.971

