Seong-Ryul Kim, Woori Kwak, Hyaekang Kim, Kelsey Caetano-Anolles, Kee-Young Kim, Su-Bae Kim, Kwang-Ho Choi, Seong-Wan Kim, Jae-Sam Hwang, Minjee Kim, Iksoo Kim, Tae-Won Goo, Seung-Won Park.
Abstract
Background: Antheraea yamamai, also known as the Japanese oak silk moth, is a wild species of silk moth. Silk produced by A. yamamai, referred to as tensan silk, differs from the common silk produced by the domesticated silkworm, Bombyx mori, in characteristics such as thickness, compressive elasticity, and chemical resistance. These characteristics have led to its use in many research fields, including biotechnology and medical science, and the scientific and economic importance of the wild silk moth continues to increase. However, no genomic information for any wild silk moth, including A. yamamai, is currently available.
Findings: To construct the A. yamamai genome, a total of 147 Gb of sequence was generated using Illumina and PacBio sequencing platforms, providing 210-fold coverage based on the 700-Mb estimated genome size of A. yamamai. The assembled genome of A. yamamai was 656 Mb (>2 kb) with 3675 scaffolds, and the N50 length of the assembly was 739 Kb with a 34.07% GC ratio. Identified repeat elements covered 37.33% of the total genome, and the completeness of the constructed genome assembly was estimated to be 96.7% by Benchmarking Universal Single-Copy Orthologs v2 analysis. A total of 15 481 genes were identified using Evidence Modeler based on the gene prediction results obtained from 3 different methods (ab initio, RNA-seq-based, known-gene-based) and manual curation.
Conclusions: Here we present the genome sequence of A. yamamai, the first genome sequence of a wild silk moth. These results provide valuable genomic information that will help enrich our understanding of the molecular mechanisms underlying not only specific phenotypes, such as wild silk itself, but also the genomic evolution of Saturniidae.
Keywords: Antheraea yamamai; Japanese oak silk moth; Japanese silk moth; genome assembly; wild silkworm
Year: 2018 PMID: 29186418 PMCID: PMC5774507 DOI: 10.1093/gigascience/gix113
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: Photograph of Antheraea yamamai. From left: larva, cocoon, and adult A. yamamai, respectively. The green color is one of the representative characteristics of tensan silk.
Summary statistics of whole-genome shotgun sequencing data generated using the Illumina NextSeq 500
| Library name | Library type | Insert size | Platform | Read length | No. of reads | Total base, bp | Reads retained after trimming |
|---|---|---|---|---|---|---|---|
| 350 bp | Paired-end | 350 bp | Nextseq500 | 151 | 293 176 268 | 44 269 616 468 | 291 070 362 |
| 700 bp | Paired-end | 700 bp | Nextseq500 | 151 | 246 945 900 | 37 288 830 900 | 244 698 580 |
| 3 Kbp | Mate-pair | 3 Kbp | Nextseq500 | 76 | 284 204 762 | 21 599 561 912 | 195 095 164 |
| 6 Kbp | Mate-pair | 6 Kbp | Nextseq500 | 76 | 246 238 370 | 18 714 116 120 | 152 496 372 |
| 9 Kbp | Mate-pair | 9 Kbp | Nextseq500 | 76 | 239 919 538 | 18 233 884 888 | 148 612 724 |
| Total | | | | | 1 310 484 838 | 140 106 010 288 | 1 031 973 202 |
Summary statistics of long-read data generated using the PacBio RS II system
| Statistic | Value |
|---|---|
| No. of reads | 1 005 571 |
| Total bases | 5 836 969 225 |
| Length of longest (shortest) read | 50 132 (50) |
| Average read length | 5804.63 |
Summary statistics of generated transcriptome data obtained from 6 organ tissues using Illumina platform
| Tissue | Sample name | Read length | Read count | Total base, bp |
|---|---|---|---|---|
| Hemocyte | Hemocyte_1 | 76 | 20 815 674 | 1 581 991 224 |
| Hemocyte | Hemocyte_2 | 76 | 26 704 666 | 2 029 554 616 |
| Hemocyte | Hemocyte_3 | 76 | 53 068 562 | 4 033 210 712 |
| Malpighian tube | Malpighi_1 | 76 | 22 635 428 | 1 720 292 528 |
| Malpighian tube | Malpighi_2 | 76 | 24 893 788 | 1 891 927 888 |
| Malpighian tube | Malpighi_3 | 76 | 45 213 164 | 3 436 200 464 |
| Midgut | Midgut_1 | 76 | 23 350 138 | 1 774 610 488 |
| Midgut | Midgut_2 | 76 | 24 597 972 | 1 869 445 872 |
| Midgut | Midgut_3 | 76 | 50 949 986 | 3 872 198 936 |
| Head | Head_1 | 76 | 26 526 276 | 2 015 996 976 |
| Head | Head_2 | 76 | 26 581 124 | 2 020 165 424 |
| Head | Head_3 | 76 | 40 900 456 | 3 108 434 656 |
| Integument | Skin_1 | 76 | 24 592 846 | 1 869 056 296 |
| Integument | Skin_2 | 76 | 42 775 430 | 3 250 932 680 |
| Integument | Skin_3 | 76 | 35 043 570 | 2 663 311 320 |
| Fat body | Fat Body_1 | 76 | 24 637 810 | 1 872 473 560 |
| Fat body | Fat Body_2 | 76 | 24 037 494 | 1 826 849 544 |
| Fat body | Fat Body_3 | 76 | 40 817 582 | 3 102 136 232 |
| Anterior-middle/silk gland | AM/Silk Gland_1 | 76 | 21 399 638 | 1 626 372 488 |
| Anterior-middle/silk gland | AM/Silk Gland_2 | 76 | 24 292 386 | 1 846 221 336 |
| Anterior-middle/silk gland | AM/Silk Gland_3 | 76 | 37 331 530 | 2 837 196 280 |
| Posterior/silk gland | P/Silk Gland_1 | 76 | 27 359 580 | 2 079 328 080 |
| Posterior/silk gland | P/Silk Gland_2 | 76 | 23 300 962 | 1 770 873 112 |
| Posterior/silk gland | P/Silk Gland_3 | 76 | 39 421 430 | 2 996 028 680 |
| Testis | Testis_1 | 76 | 40 890 404 | 3 107 670 704 |
| Testis | Testis_2 | 76 | 45 733 846 | 3 475 772 296 |
| Testis | Testis_3 | 76 | 44 985 224 | 3 418 877 024 |
| Ovary | Ovary_1 | 76 | 40 797 628 | 3 100 619 728 |
| Ovary | Ovary_2 | 76 | 40 409 752 | 3 071 141 152 |
| Ovary | Ovary_3 | 76 | 42 417 892 | 3 223 759 792 |
Figure 2: 19-mer distribution of the A. yamamai genome, computed with Jellyfish from the 350-bp paired-end whole-genome sequencing data.
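A k-mer depth histogram like the one in Figure 2 is commonly used to estimate genome size by dividing the total number of counted k-mers by the depth at the main peak. The following is a minimal sketch of that general technique, not the authors' procedure: the function name `estimate_genome_size`, the `min_depth` cutoff for discarding likely error k-mers, and the toy histogram values are all assumptions made for illustration.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size from a k-mer depth histogram.

    histogram maps k-mer depth -> number of distinct k-mers seen at that depth.
    Depths below min_depth are treated as sequencing-error k-mers and ignored
    (the cutoff is an illustrative assumption, not a value from the paper).
    Returns (estimated_size, peak_depth).
    """
    usable = {depth: count for depth, count in histogram.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no histogram bins at or above min_depth")

    # Depth with the most distinct k-mers: the main (homozygous) coverage peak.
    peak_depth = max(usable, key=usable.get)

    # Total k-mer occurrences contributed by the retained bins.
    total_kmers = sum(depth * count for depth, count in usable.items())

    return total_kmers / peak_depth, peak_depth


# Invented toy histogram, only to show the calculation.
toy_histogram = {1: 1000, 2: 300, 10: 50, 20: 400, 30: 60}
size, peak = estimate_genome_size(toy_histogram)
print(f"peak depth: {peak}, estimated size: {size:.0f} bases")
```

Real histograms need more care than this, for example handling heterozygous and repeat peaks, so the result should be read as a rough first estimate.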
Summary statistics of the A. yamamai genome (>2 kb)
| Assembled genome | |
|---|---|
| Size, 1n | 656 Mb |
| GC level, % | 34.07 |
| No. of scaffolds | 3675 |
| N50 of scaffolds, bp | 739 388 |
| No. of bases in scaffolds, % | 19 257 439 (2.93) |
| Longest (shortest) scaffolds, bp | 3 156 949 (2003) |
| Average scaffold length, bp | 178 657.53 |
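The scaffold N50 reported above is the length at which scaffolds of that size or longer contain at least half of the assembled bases. A minimal sketch of the calculation follows, assuming only a list of scaffold lengths; the function name `n50` and the toy lengths are illustrative and do not come from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and accumulate them; the N50 is the
    length of the sequence at which the running total first reaches half of
    the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")

    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2

    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (not real data): total is 300, so half is 150.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 80 + 70 reaches 150, so this prints 70
```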
Summary of identified repeat elements in the A. yamamai genome
| Repeat element | No. of elements | Length, bp (% of genome) |
|---|---|---|
| SINE | 59 968 | 8 615 338 (1.30) |
| LINE | 426 522 | 101 251 176 (15.31) |
| LTR element | 53 977 | 4 552 386 (0.69) |
| DNA element | 512 760 | 69 071 227 (10.44) |
| Small RNA | 43 645 | 6 691 619 (1.01) |
| Simple repeat | 135 989 | 6 256 839 (0.95) |
| Low complexity | 19 937 | 932 829 (0.14) |
| Unclassified | 294 190 | 54 552 009 (8.25) |
Figure 3: Amount and proportion of identified repeat elements in 8 species, including A. yamamai. A) Absolute amount of repeat elements classified into 8 different categories. B) Proportion of each repeat element category among all identified repeat elements.
Summary statistics of ab initio, RNA-seq-based, and homology-based gene prediction results
| Evidence type | Programs | Element | Total count | Exon/gene | Total length, bp | Mean length, bp |
|---|---|---|---|---|---|---|
| Ab initio | Augustus | Gene | 14 576 | 4.85 | 142 415 318 | 9770.53 |
| Ab initio | Augustus | Exon | 70 733 | | 14 736 668 | 208.34 |
| Ab initio | Geneid | Gene | 10 946 | 2.25 | 46 119 402 | 4213.35 |
| Ab initio | Geneid | Exon | 24 686 | | 3 925 563 | 159.01 |
| Ab initio | GeneMark-ET | Gene | 27 754 | 5.50 | 273 745 951 | 9863.29 |
| Ab initio | GeneMark-ET | Exon | 152 660 | | 30 847 503 | 202.06 |
| RNA-seq | Cufflinks, TransDecoder | Gene | 36 213 | 7.03 | 840 429 061 | 23 207.94 |
| RNA-seq | Cufflinks, TransDecoder | Exon | 254 770 | | 201 721 675 | 791.77 |
| Known gene (NCBI Lepidoptera) | PASA (gmap) | | 44 561 | | 22 484 151 | 504.57 |
Summary statistics for the consensus gene set of the A. yamamai genome
| Element | No. of elements | Exon/gene | Avg. length | Total length | Genome coverage, % |
|---|---|---|---|---|---|
| Gene | 15 481 | 5.64 | 11 016.34 | 170 543 958 | 25.78 |
| Exon | 87 346 | | 1346.23 | 20 840 925 | 3.31 |
Figure 4: Constructed phylogenetic tree and comparative gene family analysis. Node values indicate the Bayesian posterior probability, the bootstrap value, and the gene expansion and contraction values. Orange and blue colors indicate expansion and contraction, respectively. The bar chart indicates the number of genes categorized into 4 groups (Specific, 1:Multi, Multi:Multi, and 1:1) using OrthoMCL.
Figure 5: Expansion of chorion genes in the A. yamamai genome. A, B) Gene trees of chorion A and B, respectively, in the rapidly expanded gene family cluster. The color of each terminal node indicates the taxon identified in the gene family cluster.
Figure 6: Karyotype of A. yamamai using a gamete of testis in metaphase.
Summary statistics of generated Illumina synthetic long read (Moleculo) library
| | 500–1499 bp | ≥1500 bp |
|---|---|---|
| No. of assembled reads | 302 132 | 342 738 |
| No. of bases in assembled read | 268 853 717 | 1 205 349 082 |
| N50 length of assembled read | 960 | 4031 |
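The 210-fold coverage quoted in the abstract can be cross-checked against the per-platform totals in the tables of this record. The sketch below assumes that the 147 Gb figure is the sum of the Illumina, PacBio, and Moleculo totals; the record does not state this explicitly, but the numbers are consistent with it. The function name `fold_coverage` is illustrative.

```python
# Total bases per data source, copied from the summary tables in this record.
ILLUMINA_BASES = 140_106_010_288               # paired-end + mate-pair total
PACBIO_BASES = 5_836_969_225                   # PacBio RS II long reads
MOLECULO_BASES = 268_853_717 + 1_205_349_082   # synthetic long reads, both length bins

ESTIMATED_GENOME_SIZE = 700_000_000            # 700 Mb, from the abstract


def fold_coverage(total_bases, genome_size):
    """Return sequencing depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Assumption: the abstract's 147 Gb combines all three data sources.
total_bases = ILLUMINA_BASES + PACBIO_BASES + MOLECULO_BASES
print(f"total bases: {total_bases:,}")
print(f"coverage: {fold_coverage(total_bases, ESTIMATED_GENOME_SIZE):.1f}x")
```

Under that assumption the total comes to about 147.4 Gb, or roughly 210.6-fold coverage of a 700-Mb genome, in line with the abstract.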