
Chromosome-scale genome assembly of the high royal jelly-producing honeybees.

Lianfei Cao, Xiaomeng Zhao, Yanping Chen, Cheng Sun.

Abstract

A high royal jelly-producing strain of honeybees (HRJHB) has been obtained by successive artificial selection of Italian honeybees (Apis mellifera ligustica) in China. The HRJHB can produce dozens of times more royal jelly than the stock it was derived from, which has helped make China the largest producer of royal jelly in the world. In this study, we generated a chromosome-scale genome assembly for the HRJHB using PacBio long reads and the Hi-C technique. The genome consists of 16 pseudo-chromosomes that contain 222 Mb of sequence, with a scaffold N50 of 13.6 Mb. BUSCO analysis yielded a completeness score of 99.3%. The genome has 12,288 predicted protein-coding genes, and repetitive sequences make up 8.11% of it. One chromosome inversion was identified between the HRJHB and the closely related Italian honeybees through whole-genome alignment analysis. The HRJHB genome sequence will be an important resource for understanding the genetic basis of high levels of royal jelly production, and it may also shed light on the evolution of domesticated insects.
© 2021. The Author(s).

Year:  2021        PMID: 34824304      PMCID: PMC8617152          DOI: 10.1038/s41597-021-01091-7

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   6.444


Background & Summary

Royal jelly (RJ) is a proteinaceous secretion synthesized by the hypopharyngeal and mandibular glands of nurse worker bees and is fed to the queen and larvae[1]. It also plays a critical role in caste determination in honeybees[2]. Today, RJ is widely used in medical products, health foods and cosmetics in many countries owing to its numerous known biological activities, including anti-bacterial, anti-oxidative, anti-inflammatory, immunomodulatory, anti-tumoral, and anti-aging activities[3,4]. China is now the largest producer and exporter of RJ in the world and supplies nearly all of the global demand[5]. Since the 1980s, the yearly production of RJ in China has increased from 200 to around 3000 tons[5]. This rapid increase has been attributed mainly to the successful breeding of the high royal jelly-producing honeybees (HRJHB) (Fig. 1) and to the effective use of the corresponding production tools and techniques[6].
Fig. 1

High royal jelly-producing honeybees (HRJHB) in China. (a) Queen and workers in one colony. (b) Royal jelly in the queen’s cells.

HRJHB was derived from an Italian honeybee subspecies (Apis mellifera ligustica), which was chiefly introduced into China in the 1910s–1930s[7]. In the 1960s, beekeepers in the southeast of China began selecting high RJ-producing bee stocks to meet the high demand for RJ[8]. In each apiary, the colony that displayed a high rate of RJ production was selected for raising daughter queens and drones[8]. Sometimes queens were also raised from larvae of high RJ-producing colonies in different apiaries[8]. The queens then open-mated with local drones[8]. Under this semi-controlled breeding scheme, the annual RJ production per colony increased from 0.2–0.3 kg in the 1960s to 2–3 kg in the late 1980s and reached 6–8 kg in the 2000s[8]. This was perceived as a miracle, and the HRJHB was rapidly introduced to other regions of China from the 1980s onwards, and later to other countries. At present, the annual production per HRJHB colony exceeds 10 kg, which is dozens of times greater than that of common Italian honeybees (A. m. ligustica)[5]. RJ production has become a major income source for many beekeepers in China, and the HRJHB has been certified as a new honeybee genetic resource by the Chinese government[7]. Previous studies of isoenzymes, microsatellites and mitochondrial DNA have shown significant genetic differentiation between the HRJHB and the other common A. m. ligustica populations in China[9-11]. It has been suggested that morphological markers, behavioural and physiological changes, and differentially expressed proteins and genes correlate with the high royal jelly-producing trait[12-16]. However, this research has so far failed to provide a clear picture of what causes the complex royal jelly-producing trait.
In recent years, honeybee selection programs for high RJ production have also been implemented in Brazil and France[17,18]. Additionally, further breeding of HRJHB to improve general resistance to disease is being carried out in China. In this study, we generated a chromosome-scale genome assembly of the HRJHB using PacBio long reads, Illumina short reads, and the Hi-C chromosome conformation capture technique (Table 1; Fig. 2a). The resulting genome has a total length of 222 Mb in 16 chromosomes, with a scaffold N50 of 13.6 Mb (Table 1). One chromosome inversion was identified between HRJHB and the closely related Italian honeybees via whole-genome alignment analysis (Fig. 2b). Moreover, through a combination of ab initio gene predictions, transcript evidence and homologous protein evidence, 12,288 protein-coding genes were identified in this genome, of which 6,615 were assigned a GO term and 8,614 were assigned a protein domain (Table 2). Repetitive elements make up 8.11% of the HRJHB genome sequence, but transposable elements (TEs) occupy only 2.15% (Table 2). Among those TEs, DNA transposons are the most abundant class and make up the majority of the total TE content (1.68% out of 2.15%). Furthermore, Tc1-mariner (TcMar) is the most abundant TE superfamily in the genome. The genome sequence provides a valuable resource for exploring the molecular basis of the high royal jelly-producing trait in honeybees and will facilitate further genetic improvement. The HRJHB may even represent a novel animal model for studying the effects of artificial selection on insects.
Table 1

Sequencing data generated for the HRJHB genome assembly.

Genome sequencing
Data type             Read number    Mean read length (bp)   Total read length (Gb)
PacBio long reads     2,154,163      15,489                  33.37
Illumina sequencing   71,786,450     150                     10.77
RNA-seq               130,258,000    150                     18.05
Hi-C sequencing       218,592,996    150                     32.79

Genome assembly
Genome assembly size  222 Mb
Number of scaffolds   16
Scaffold N50          13.6 Mb
BUSCO completeness    99.30%
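
As a rough illustration of what the yields in Table 1 mean in practice, the sketch below converts total read length into mean sequencing depth. It assumes the 222 Mb assembly size is a fair stand-in for the true genome size; the function name and the dictionary are illustrative and not part of the original analysis.

```python
# Total read lengths (Gb) copied from Table 1.
# Assumption: the 222 Mb assembly size approximates the genome size.
GENOME_SIZE_GB = 0.222

total_yield_gb = {
    "PacBio long reads": 33.37,
    "Illumina sequencing": 10.77,
    "Hi-C sequencing": 32.79,
}


def depth(yield_gb, genome_size_gb=GENOME_SIZE_GB):
    """Return the mean sequencing depth (fold coverage) for a given yield."""
    return yield_gb / genome_size_gb


for name, gb in total_yield_gb.items():
    print(f"{name}: about {depth(gb):.1f}x")
```

Under this assumption, the long-read data alone correspond to roughly 150-fold coverage of the genome.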
Fig. 2

Chromosome-scale assembly for HRJHB genome. (a) The HRJHB’s genome contig contact matrix using Hi-C data. (b) The HRJHB’s genome sequence was aligned with a closely related honeybee genome (NCBI assembly: Amel_HAv3). The red arrow indicates the chromosome inversion between the two genomes on LG7.

Table 2

Annotation of protein-coding genes and repetitive sequences.

Protein-coding genes
Total gene number                        12,288
BUSCO completeness                       97%
Number of genes with a GO term           6,615
Number of genes with a protein domain    8,614

Repetitive sequences
TE class                   TE superfamily   Length occupied (bp)   Percent of genome
DNA transposons            TcMar            3,557,056              1.67
                           hAT              14,338                 0.01
non-LTR retrotransposons   CR1              952,652                0.45
                           R2               49,376                 0.02
LTR retrotransposons       Copia            6,014                  0.00
                           Gypsy            383,280                0.18
Total TEs                                   4,962,716              2.15
Other repeats                               13,761,592             5.96
Total repeats                               18,724,308             8.11
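
The per-superfamily lengths in Table 2 can be summed by TE class to see how the TE content is distributed. The sketch below does only that arithmetic on the values from the table; the dictionary layout and function name are illustrative choices, not the authors' scripts.

```python
# Lengths (bp) copied from Table 2, keyed by (TE class, TE superfamily).
te_lengths_bp = {
    ("DNA transposons", "TcMar"): 3_557_056,
    ("DNA transposons", "hAT"): 14_338,
    ("non-LTR retrotransposons", "CR1"): 952_652,
    ("non-LTR retrotransposons", "R2"): 49_376,
    ("LTR retrotransposons", "Copia"): 6_014,
    ("LTR retrotransposons", "Gypsy"): 383_280,
}


def class_totals(lengths):
    """Sum superfamily lengths into one total per TE class."""
    totals = {}
    for (te_class, _superfamily), bp in lengths.items():
        totals[te_class] = totals.get(te_class, 0) + bp
    return totals


totals = class_totals(te_lengths_bp)
total_te_bp = sum(totals.values())  # should equal the "Total TEs" row

for te_class, bp in totals.items():
    share = 100 * bp / total_te_bp
    print(f"{te_class}: {bp:,} bp ({share:.1f}% of TE sequence)")
```

The class totals add up to the 4,962,716 bp reported in the "Total TEs" row, with DNA transposons accounting for the largest share.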

Methods

Sample collection and genome sequencing

Samples of the HRJHB for genome and transcriptome sequencing were collected in 2019 from Zhejiang Province, China, where the HRJHB originated and is primarily distributed (Fig. 3).
Fig. 3

Original area of HRJHB (red arrowhead).

Newly emerged drone bees (n = 20), all descendants of the same queen, were collected from a single colony (Fig. 1a). Their thoraxes were pooled for PacBio single molecule real-time (SMRT) sequencing and Illumina HiSeq sequencing. Genomic DNA was extracted using the Gentra Puregene Tissue Kit (Qiagen) and sequenced in accordance with the standard protocols. Newly emerged worker bees (n = 20) were collected from the same colony and their thoraxes were pooled for Hi-C sequencing. Hi-C library preparation was performed by Frasergen (http://www.frasergen.com/), largely following a previously described protocol[19]. The resulting Hi-C libraries were sequenced on the Illumina HiSeq X Ten platform. Worker bees that were secreting royal jelly (n = 20) were collected from the same colony, and their heads, thoraxes and abdomens (excluding the mid-gut tissues) were pooled for RNA-seq on the Illumina HiSeq X Ten platform.

De novo genome assembly for HRJHB

A total of 33.37 Gb of long reads were generated on the PacBio Sequel platform (Table 1). These reads were self-corrected and assembled into contigs using Canu v2.1[20] with default parameters. The resulting contigs were processed with Purge Haplotigs v1.1.1[21] to remove the redundancy caused by heterozygosity in the pooled honeybee samples. The remaining non-redundant contigs were then polished three times with Illumina HiSeq reads (Table 1) using Pilon v1.23[22]. Finally, the Juicer tool[23] was used to map Hi-C reads (Table 1) against the polished HRJHB contigs with the BWA algorithm[24], and the 3D-DNA pipeline[25] was used to scaffold the contigs into a chromosome-scale genome assembly.
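
Assembly contiguity is summarized in this paper by the scaffold N50. As a reminder of what that statistic measures, here is a minimal sketch of the standard N50 calculation; the toy lengths are invented for illustration and are not the HRJHB scaffold lengths.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
toy_scaffolds = [10, 8, 6, 4, 2]
print(n50(toy_scaffolds))
```

In the toy example the two longest scaffolds already cover more than half of the 30 total units, so the N50 is the length of the second one.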

Annotation of repeat sequences

TEs were identified de novo with RepeatModeler2[26] using default parameters. Using the resulting repeat library, each honeybee genome assembly was analyzed with RepeatMasker (http://www.repeatmasker.org) to yield a comprehensive summary of the TE landscape in each assembly. The annotation files produced by RepeatMasker were processed with in-house scripts to eliminate redundancies, and the refined annotation files were used to determine TE diversity and abundance within each assembly. Tandem repeats were identified with Tandem Repeats Finder[27], as implemented in RepeatMasker.
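
The in-house scripts used to remove redundant annotations are not described here. One common way to avoid counting the same bases twice, shown below purely as an assumed sketch, is to merge overlapping annotation intervals on each sequence before summing their lengths. The function names and the half-open coordinate convention are illustrative choices.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent half-open (start, end) intervals
    that lie on the same sequence."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def covered_bp(intervals):
    """Return the number of bases covered by at least one interval."""
    return sum(end - start for start, end in merge_intervals(intervals))


# Hypothetical repeat annotations on one scaffold, for illustration only.
hits = [(0, 100), (50, 150), (200, 250)]
print(merge_intervals(hits))
print(covered_bp(hits))
```

Summing raw hit lengths in this example would give 250 bp, whereas the merged intervals cover 200 bp, which is the kind of double counting that redundancy removal is meant to prevent.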

Prediction and functional annotation of protein-coding genes

Protein-coding genes were annotated from ab initio gene predictions, transcript evidence, and homologous protein evidence, all of which were integrated in the MAKER computational pipeline[28]. RNA-seq reads obtained in this study were assembled using Trinity[29]. The assembled RNA-seq transcripts, along with proteins from bees (superfamily Apoidea) available in the National Center for Biotechnology Information (NCBI) GenBank (last accessed on 01/28/2020), were imported into the MAKER pipeline to generate gene models. To obtain functional clues for the predicted gene models, their encoded protein sequences were searched against the UniProt/Swiss-Prot protein database (last accessed on 01/28/2020) using the BLASTp algorithm implemented in BLAST suite v2.28[30]. In addition, protein domains and GO terms associated with the gene models were identified with InterProScan 5[31].
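
The text does not say how BLASTp hits were turned into functional clues. A typical approach, sketched here under stated assumptions, is to keep the best-scoring hit per predicted protein. The sketch assumes the 12-column tab-separated tabular BLAST output (query ID, subject ID, ..., e-value, bit score); the e-value cutoff of 1e-5 and all identifiers in the example are hypothetical.

```python
def best_hits(lines, max_evalue=1e-5):
    """Return {query_id: subject_id}, keeping the highest-bitscore hit
    per query among hits that pass the e-value cutoff."""
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit too weak to transfer an annotation
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _score) in best.items()}


# Hypothetical rows: gene and protein identifiers are invented.
example = [
    "\t".join(["gene_0001", "sp|X00001|PROT_A", "85.0", "200", "30", "0",
               "1", "200", "1", "200", "1e-90", "350"]),
    "\t".join(["gene_0001", "sp|X00002|PROT_B", "92.0", "200", "16", "0",
               "1", "200", "1", "200", "1e-110", "410"]),
    "\t".join(["gene_0002", "sp|X00003|PROT_C", "30.0", "50", "35", "0",
               "1", "50", "1", "50", "0.5", "30"]),
]
print(best_hits(example))
```

Here gene_0001 is assigned its higher-scoring hit, and gene_0002 receives no annotation because its only hit fails the assumed cutoff.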

Data Records

The raw data was submitted to the National Center for Biotechnology Information (NCBI) SRA database (Experiments for SRP300170) under BioProject accession number PRJNA689474[32]. The assembled genome has been deposited at DDBJ/ENA/GenBank under the accession GCA_019321825.1[33]. Moreover, the genome annotation results have been deposited at the Figshare database[34].

Technical Validation

Evaluation of the genome assembly

The completeness of the genome assembly was evaluated against a set of 4,415 hymenopteran benchmarking universal single-copy orthologs (BUSCOs) using BUSCO v3[35]. The results indicated that 99.3% of these BUSCOs were present in the genome assembly (Table 1), suggesting a highly complete assembly of the HRJHB genome. Furthermore, the chromosome-level structural accuracy was assessed by performing a whole-genome alignment between the HRJHB genome and a closely related honeybee genome (GenBank assembly: Amel_HAv3) using D-GENIES[36]. The alignment revealed a highly conserved chromosome structure between the two genomes, indicating accurate scaffolding of contigs in the HRJHB genome. Nevertheless, we did find one inversion on LG7 (Fig. 2b). The Hi-C heatmap revealed a well-organized contact pattern along the diagonal within and around the inverted region in HRJHB (Fig. 4), which rules out the possibility that the structural variation was derived from unreliable Hi-C signals in the HRJHB assembly. In addition, as chromosome inversions have been found to be associated with honeybee adaptations[37], the inversion identified in the HRJHB genome warrants further analysis to investigate its association with high royal jelly production.
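
To put the 99.3% figure in absolute terms, the sketch below converts between BUSCO counts and percentages for the 4,415-gene set. The underlying counts are not reported in the text, so the count derived here is only a back-calculation from the rounded percentage; the function names are illustrative.

```python
N_BUSCOS = 4415  # size of the hymenopteran BUSCO set stated in the text


def busco_percent(n_found, n_total=N_BUSCOS):
    """Return the percentage of BUSCOs found in the assembly."""
    return 100.0 * n_found / n_total


def count_from_percent(percent, n_total=N_BUSCOS):
    """Back-calculate an approximate BUSCO count from a rounded percentage."""
    return round(percent / 100.0 * n_total)


# Approximate values only: derived from the rounded 99.3%, not reported counts.
approx_found = count_from_percent(99.3)
approx_missing = N_BUSCOS - approx_found
print(approx_found, approx_missing)
print(f"{busco_percent(approx_found):.1f}%")
```

By this estimate, only a few dozen of the 4,415 orthologs were not recovered in the assembly.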
Fig. 4

Hi-C heatmap around the identified chromosome inversion region in the HRJHB.

Measurement(s): genome • DNA • transcriptome • sequence_assembly
Technology Type(s): DNA sequencing • RNA sequencing • sequence assembly process
Sample Characteristic - Organism: Apis mellifera