Kazuo Araki, Jun-Ya Aoki, Junya Kawase, Kazuhisa Hamada, Akiyuki Ozaki, Hiroshi Fujimoto, Ikki Yamamoto, Hironori Usuki.
Abstract
Greater amberjack (Seriola dumerili) is distributed in tropical and temperate waters worldwide and is an important aquaculture fish. We carried out de novo sequencing of the greater amberjack genome to construct a reference genome sequence, both to identify single nucleotide polymorphisms (SNPs) for breeding amberjack by marker-assisted or gene-assisted selection and to identify functional genes for biological traits. We obtained 200-fold coverage and constructed a high-quality genome assembly using next-generation sequencing technology. The assembled sequences were aligned onto a yellowtail (Seriola quinqueradiata) radiation hybrid (RH) physical map by sequence homology. A total of 215 of the longest amberjack sequences, with a total length of 622.8 Mbp (92% of the total length of the genome scaffolds), were lined up on the yellowtail RH map. We resequenced the whole genomes of 20 greater amberjacks and mapped the resulting sequences onto the reference genome sequence. About 186,000 nonredundant SNPs were successfully ordered on the reference genome. Further, we found differences in genome structural variations between two greater amberjack populations using BreakDancer. We also analyzed the greater amberjack transcriptome and mapped the annotated sequences onto the reference genome sequence.
Year: 2018 PMID: 29785397 PMCID: PMC5896239 DOI: 10.1155/2018/7984292
Source DB: PubMed Journal: Int J Genomics ISSN: 2314-436X Impact factor: 2.326
Summary statistics of the whole-genome sequence assembly of greater amberjack.
| | HiSeq 2500 | HiSeq + PacBio |
|---|---|---|
| Total bases | 662,587,481 bp | 677,669,644 bp |
| Number of scaffolds | 34,824 | 11,655 |
| Number of gaps | 32,742 | 9742 |
| Mean scaffold length | 19,026 bp | 19,554 bp |
| Longest scaffold | 22,167,742 bp | 24,919,768 bp |
| N50 | 4,989,656 bp | 5,812,906 bp |
| Number of scaffolds (>2 kb) | 724 | 707 |
| Total bases (>2 kb) | 655,539,910 bp | 670,698,073 bp |
The HiSeq 2500 sequence assembly was compared with the HiSeq 2500 sequence data mapped onto the PacBio RSII sequence data.
Figure 1. Greater amberjack scaffolds aligned onto two linkage groups of the yellowtail radiation hybrid physical map. A representative part of the yellowtail radiation hybrid (RH) physical map is shown with the greater amberjack scaffolds aligned. Numbers on the left indicate distance (cR) from the top of the RH map. Black lines indicate chromosomes. Red lines on the left indicate scaffold lengths. Seq numbers indicate the mapped sequence number. Scaffold numbers identify the aligned scaffolds.
Number and total length of greater amberjack scaffolds mapped to each of the 24 linkage groups of the yellowtail RH physical map.
| LGNo | Number of mapped scaffolds | Total length of mapped scaffolds on each LG (bp) |
|---|---|---|
| 1 | 10 | 32,041,590 |
| 2 | 13 | 28,988,571 |
| 3 | 9 | 30,680,262 |
| 4 | 9 | 31,420,922 |
| 5 | 3 | 22,220,271 |
| 6 | 4 | 28,673,657 |
| 7 | 2 | 23,029,943 |
| 8 | 9 | 27,651,179 |
| 9 | 6 | 35,439,400 |
| 10 | 9 | 26,409,569 |
| 11 | 6 | 10,977,986 |
| 12 | 12 | 25,517,532 |
| 13 | 9 | 31,109,094 |
| 14 | 14 | 21,191,548 |
| 15 | 9 | 26,971,016 |
| 16 | 12 | 25,150,760 |
| 17 | 20 | 19,289,833 |
| 18 | 6 | 26,805,652 |
| 19 | 6 | 26,928,810 |
| 20 | 4 | 28,110,268 |
| 21 | 12 | 23,144,198 |
| 22 | 15 | 15,779,806 |
| 23 | 2 | 25,211,550 |
| 24 | 14 | 30,054,902 |
| Total | 215 | 622,798,319 |
LGNo indicates the linkage group number; number of mapped scaffolds indicates the number of scaffolds mapped onto each linkage group; and total length of mapped scaffolds on each LG indicates the total length (bp) of the scaffold sequences mapped onto each linkage group.
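The 92% figure quoted in the abstract can be checked against the two tables above. The sketch below is a minimal arithmetic check, not part of the authors' pipeline. It assumes the denominator is the total base count of the HiSeq + PacBio assembly; the source does not state which total was used, so the >2 kb total is computed alongside for comparison. All variable and function names are illustrative.

```python
# Per-linkage-group mapped lengths (bp), copied from the table above (LG 1..24).
mapped_bp_per_lg = [
    32_041_590, 28_988_571, 30_680_262, 31_420_922, 22_220_271, 28_673_657,
    23_029_943, 27_651_179, 35_439_400, 26_409_569, 10_977_986, 25_517_532,
    31_109_094, 21_191_548, 26_971_016, 25_150_760, 19_289_833, 26_805_652,
    26_928_810, 28_110_268, 23_144_198, 15_779_806, 25_211_550, 30_054_902,
]

# Assembly totals from the summary statistics table (HiSeq + PacBio column).
assembly_bp = 677_669_644           # all scaffolds (assumed denominator)
assembly_over_2kb_bp = 670_698_073  # scaffolds longer than 2 kb only


def mapped_fraction(mapped: int, total: int) -> float:
    """Return the share of `total` bases covered by `mapped` bases."""
    return mapped / total


mapped_bp = sum(mapped_bp_per_lg)
frac_all = mapped_fraction(mapped_bp, assembly_bp)
frac_over_2kb = mapped_fraction(mapped_bp, assembly_over_2kb_bp)

print(f"mapped bases: {mapped_bp:,}")
print(f"share of all scaffolds:   {frac_all:.1%}")
print(f"share of >2 kb scaffolds: {frac_over_2kb:.1%}")
```

The per-group lengths sum to the 622,798,319 bp reported in the Total row, and dividing by the full assembly length gives a value that rounds to the 92% stated in the abstract.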
Summary of nonredundant SNPs mapped to the 24 linkage groups by resequencing 20 greater amberjack genomes.
| LGNo | The number of mutations in each LG | The number of mapped SNPs onto each LG |
|---|---|---|
| 1 | 373,859 | 7831 |
| 2 | 375,132 | 7968 |
| 3 | 360,266 | 7933 |
| 4 | 334,990 | 7725 |
| 5 | 269,110 | 7413 |
| 6 | 311,326 | 7640 |
| 7 | 262,037 | 7763 |
| 8 | 309,899 | 7802 |
| 9 | 383,874 | 7980 |
| 10 | 310,587 | 7791 |
| 11 | 132,260 | 7172 |
| 12 | 285,525 | 7602 |
| 13 | 323,954 | 7722 |
| 14 | 227,417 | 7583 |
| 15 | 259,349 | 7763 |
| 16 | 267,369 | 7882 |
| 17 | 253,519 | 7833 |
| 18 | 280,457 | 7785 |
| 19 | 300,795 | 7878 |
| 20 | 312,122 | 7870 |
| 21 | 273,678 | 7831 |
| 22 | 220,279 | 7863 |
| 23 | 300,003 | 7800 |
| 24 | 331,264 | 7829 |
| Total | 7,059,071 | 186,259 |
LGNo indicates the linkage group number; the number of mutations in each LG indicates the number of mutations found in each linkage group; and the number of mapped SNPs onto each LG indicates the number of SNPs ordered onto each linkage group.
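To make the relationship between the two columns concrete, the sketch below recomputes the column totals and the share of detected mutations that ended up as ordered nonredundant SNPs. It is a descriptive calculation only: the source does not describe here how mutations were filtered into nonredundant SNPs, so the ratio should not be read as a filtering rate defined by the authors. Variable names are illustrative.

```python
# Per-linkage-group counts copied from the table above (LG 1..24).
mutations = [
    373_859, 375_132, 360_266, 334_990, 269_110, 311_326, 262_037, 309_899,
    383_874, 310_587, 132_260, 285_525, 323_954, 227_417, 259_349, 267_369,
    253_519, 280_457, 300_795, 312_122, 273_678, 220_279, 300_003, 331_264,
]
mapped_snps = [
    7831, 7968, 7933, 7725, 7413, 7640, 7763, 7802,
    7980, 7791, 7172, 7602, 7722, 7583, 7763, 7882,
    7833, 7785, 7878, 7870, 7831, 7863, 7800, 7829,
]

total_mutations = sum(mutations)
total_snps = sum(mapped_snps)

# Share of all detected mutations that were ordered as nonredundant SNPs.
overall_ratio = total_snps / total_mutations

# Same ratio per linkage group, keyed by LG number (1-based).
per_lg_ratio = {
    lg: snps / muts
    for lg, (muts, snps) in enumerate(zip(mutations, mapped_snps), start=1)
}

# Linkage group with the fewest detected mutations.
lg_fewest_mutations = mutations.index(min(mutations)) + 1

print(f"total mutations: {total_mutations:,}")
print(f"total mapped SNPs: {total_snps:,}")
print(f"overall SNP/mutation ratio: {overall_ratio:.2%}")
print(f"LG with fewest mutations: {lg_fewest_mutations}")
```

The recomputed totals agree with the Total row (7,059,071 mutations and 186,259 SNPs), and the mapped SNP counts are far more uniform across linkage groups than the mutation counts.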
Figure 2. Comparison of structural variations in two greater amberjack populations. Greater amberjack samples 1–8 were captured in the sea near Kochi Prefecture, Japan. Greater amberjack samples 9–20 were captured in the sea near the Chinese Hainan Islands. (a) Number of insertions (INS) and (b) number of intra- (ITX) and inter- (CTX) chromosomal translocations detected in the 20 genomes. The vertical axis shows the number of structural variations.
Figure 3. Genome-wide landscape of structural variations of greater amberjack. We linked the BreakDancer data to Circos plots to visualize regions of the genome that contained structural variations. We combined the resequenced data of (a) eight greater amberjacks captured off the Kochi coast and (b) 12 greater amberjacks captured off the coast of the Chinese Hainan Islands and analyzed these data with BreakDancer. Gene density of each contig is visualized by dark lines. The outermost circle shows inversions, the next circle shows insertions, and the third circle shows deletions. Orange lines show intrachromosomal translocations, and blue lines show interchromosomal translocations.
Structural variations detected in the genomes of 20 greater amberjacks.
| Sample number | DEL | INS | INV | ITX | CTX |
|---|---|---|---|---|---|
| 1(Ig4894) | 6921 | 21 | 737 | 5042 | 8293 |
| 2(Ig4895) | 7447 | 22 | 730 | 4954 | 8346 |
| 3(Ig4896) | 6458 | 15 | 760 | 5147 | 8745 |
| 4(Ig4897) | 7690 | 13 | 731 | 5213 | 8504 |
| 5(Ig4898) | 6821 | 12 | 716 | 5262 | 8852 |
| 6(Ig4899) | 7146 | 14 | 742 | 5243 | 8946 |
| 7(Ig4900) | 7383 | 11 | 771 | 5118 | 8672 |
| 8(Ig4901) | 6789 | 14 | 751 | 5159 | 8863 |
| 9(Ig7999) | 5948 | 75 | 653 | 4107 | 6347 |
| 10(Ig8000) | 6157 | 80 | 693 | 4236 | 6699 |
| 11(Ig8001) | 6085 | 58 | 636 | 4133 | 6456 |
| 12(Ig8002) | 5989 | 62 | 665 | 4061 | 6343 |
| 13(Ig8003) | 6015 | 91 | 657 | 4027 | 6403 |
| 14(Ig8004) | 6112 | 89 | 701 | 4056 | 6503 |
| 15(Ig8005) | 5879 | 43 | 654 | 4025 | 6370 |
| 16(Ig8006) | 6269 | 54 | 680 | 4290 | 6759 |
| 17(Ig8007) | 6034 | 35 | 661 | 3990 | 6108 |
| 18(Ig8008) | 6180 | 61 | 677 | 4261 | 6822 |
| 19(Ig8009) | 5974 | 72 | 686 | 4230 | 6351 |
| 20(Ig8010) | 6300 | 51 | 650 | 4201 | 6068 |
Structural variations were detected by BreakDancer using paired-end resequenced data for 20 greater amberjack genomes. Sample numbers 1–8 represent greater amberjacks captured near the Kochi coast, and sample numbers 9–20 represent greater amberjacks captured near the Chinese Hainan Islands. The Ig numbers are the resequenced data analysis numbers. DEL: deletion; INS: insertion; INV: inversion; ITX: intrachromosomal translocation; and CTX: interchromosomal translocation.
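The population difference mentioned in the abstract can be seen directly in the counts above. The sketch below summarizes the INS, ITX, and CTX columns (the types plotted in Figure 2) by capture location. It is a descriptive summary under simple assumptions, not a significance test and not the authors' analysis; the dictionary layout and names are illustrative.

```python
from statistics import mean

# Counts copied from the table above.
# "kochi" = samples 1-8, "hainan" = samples 9-20.
sv_counts = {
    "INS": {
        "kochi": [21, 22, 15, 13, 12, 14, 11, 14],
        "hainan": [75, 80, 58, 62, 91, 89, 43, 54, 35, 61, 72, 51],
    },
    "ITX": {
        "kochi": [5042, 4954, 5147, 5213, 5262, 5243, 5118, 5159],
        "hainan": [4107, 4236, 4133, 4061, 4027, 4056,
                   4025, 4290, 3990, 4261, 4230, 4201],
    },
    "CTX": {
        "kochi": [8293, 8346, 8745, 8504, 8852, 8946, 8672, 8863],
        "hainan": [6347, 6699, 6456, 6343, 6403, 6503,
                   6370, 6759, 6108, 6822, 6351, 6068],
    },
}


def summarize(groups):
    """Return per-population mean, minimum, and maximum counts."""
    return {
        name: {"mean": mean(vals), "min": min(vals), "max": max(vals)}
        for name, vals in groups.items()
    }


summary = {sv_type: summarize(groups) for sv_type, groups in sv_counts.items()}

for sv_type, stats in summary.items():
    for population, s in stats.items():
        print(f"{sv_type:>3} {population:<6} "
              f"mean={s['mean']:.2f} range={s['min']}-{s['max']}")
```

For all three types the per-sample ranges of the two groups do not overlap: the Hainan samples carry more insertions, while the Kochi samples carry more intra- and interchromosomal translocation calls.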
Figure 4. Gene ontology terms assigned to the assembled cDNA transcripts of greater amberjack. The gene ontology annotations are shown under the three main categories: (a) molecular function, (b) biological process, and (c) cellular component.