Seunghyun Kang1, Do-Hwan Ahn1, Jun Hyuck Lee1,2, Sung Gu Lee1,2, Seung Chul Shin1, Jungeun Lee1,2, Gi-Sik Min3, Hyoungseok Lee1,2, Hyun-Woo Kim4, Sanghee Kim5, Hyun Park1,2.
Abstract
Background: The Antarctic intertidal zone is continuously subjected to extremely fluctuating biotic and abiotic stressors. The West Antarctic Peninsula is the most rapidly warming region on Earth. Organisms living in Antarctic intertidal pools are therefore of interest for research into evolutionary adaptation to extreme environments and the effects of climate change.

Findings: We report the whole genome sequence of the Antarctic-endemic harpacticoid copepod Tigriopus kingsejongensis. A total of 37 Gb of raw DNA sequence was generated using the Illumina MiSeq platform. Libraries were prepared with 65-fold coverage and a total length of 295 Mb. The final assembly consists of 48 368 contigs with an N50 contig length of 17.5 kb, and 27 823 scaffolds with an N50 scaffold length of 159.2 kb. A total of 12 772 coding genes were inferred using the MAKER annotation pipeline. Comparative genome analysis revealed that T. kingsejongensis-specific genes are enriched in transport and metabolism processes. Furthermore, rapidly evolving genes related to energy metabolism showed positive selection signatures.

Conclusions: The T. kingsejongensis genome provides an interesting example of an evolutionary strategy for Antarctic cold adaptation, and offers new genetic insights into Antarctic intertidal biota.
Keywords: Adaptation; Antarctic; Copepoda; Genome; Tigriopus
Year: 2017 PMID: 28369352 PMCID: PMC5467011 DOI: 10.1093/gigascience/giw010
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1. Photograph of an adult Tigriopus kingsejongensis specimen (scale bar = 200 μm)
DNA library statistics
| Library | Reads (n) | Average length (bp) | Sequences (n) | Reads (trimmed) (n) | Average length (trimmed) (bp) | Sequences (trimmed) (n) |
|---|---|---|---|---|---|---|
| Paired-end (sum) | 99 710 266 | | 29 271 916 613 | 65 644 374 | | 14 668 956 871 |
| 350S1 | 6 661 392 | 300 | 2 005 078 992 | 4 446 394 | 233 | 1 034 231 244 |
| 350S2 | 4 933 058 | 265 | 1 311 700 122 | 4 618 711 | 211 | 975 471 763 |
| 400S1 | 65 668 598 | 300 | 19 766 247 998 | 36 863 154 | 228 | 8 397 426 481 |
| 450S1 | 3 418 988 | 300 | 1 029 115 388 | 2 812 455 | 230 | 646 302 159 |
| 450S2 | 8 009 162 | 245 | 1 968 652 020 | 7 660 814 | 199 | 1 527 566 312 |
| 500S1 | 11 019 068 | 289 | 3 191 122 093 | 9 242 846 | 226 | 2 087 958 911 |
| Mate-paired (sum) | 103 373 998 | | 7 753 049 850 | 73 515 391 | | 5 169 006 268 |
| 3KS1 | 8 374 238 | 75 | 628 067 850 | 6 745 546 | 73 | 493 099 413 |
| 3KS2 | 9 250 994 | 75 | 693 824 550 | 5 281 513 | 65 | 344 618 723 |
| 3KS3 | 51 349 594 | 75 | 3 851 219 550 | 39 147 167 | 72 | 2 816 638 666 |
| 3KS4 | 3 063 232 | 75 | 229 742 400 | 1 740 986 | 65 | 112 554 745 |
| 8KS1 | 9 847 636 | 75 | 738 572 700 | 7 887 612 | 73 | 572 246 251 |
| 8KS2 | 16 322 038 | 75 | 1 224 152 850 | 9 653 293 | 65 | 630 842 698 |
| 8KS3 | 5 166 266 | 75 | 387 469 950 | 3 059 274 | 65 | 199 005 774 |
| Total | 203 084 264 | | 37 024 966 463 | 139 159 765 | | 19 837 963 139 |
| Coverage (fold) | | | 120.7 | | | 64.7 |
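The coverage row can be sanity-checked with simple arithmetic, since fold coverage is the total number of sequenced bases divided by the genome size. The following is a minimal sketch and not the authors' procedure: the function names are illustrative, and the genome sizes it prints are only the values implied by the table's own totals and coverage figures, because the denominator the authors used is not stated here.

```python
def fold_coverage(total_bases: int, genome_size_bp: float) -> float:
    """Fold coverage = sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp


def implied_genome_size(total_bases: int, coverage: float) -> float:
    """Invert the coverage formula to recover the genome size it assumes."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return total_bases / coverage


# Totals and coverage values copied from the table above.
RAW_BASES = 37_024_966_463
TRIMMED_BASES = 19_837_963_139

raw_size = implied_genome_size(RAW_BASES, 120.7)
trimmed_size = implied_genome_size(TRIMMED_BASES, 64.7)
print(f"raw: {raw_size / 1e6:.1f} Mb, trimmed: {trimmed_size / 1e6:.1f} Mb")
```

Both totals imply a genome size between 306 and 307 Mb, so the raw and trimmed coverage figures are consistent with each other.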
Transcriptome sequencing and assembly analysis for Tigriopus japonicus
| Parameter | Value |
|---|---|
| Total reads (n) | 37 956 160 |
| Total bases (n) | 7 714 415 316 |
| Trimmed reads (n) | 35 577 636 |
| Trimmed bases (n) | 5 989 188 343 |
| Contigs (n) | 40 172 |
| Total contig length (bases) | 28 850 726 |
| N50 contig length (bases) | 1093 |
| Max scaffold length (bases) | 23 942 |
| With BLAST results | 20 392 |
| Without BLAST hits | 7090 |
| With mapping results | 8172 |
| Annotated sequences | 4518 |
RNA-seq statistics analysis for Tigriopus kingsejongensis
| Temperature | 4 °C | 15 °C |
|---|---|---|
| Total reads (n) | 15 786 118 | 16 417 072 |
| Total bases (n) | 3 567 662 668 | 3 763 295 032 |
| Trimmed reads (n) | 14 845 103 | 15 388 513 |
| Trimmed bases (n) | 2 761 189 158 | 2 833 805 442 |
Figure 2. Estimation of the Tigriopus kingsejongensis genome size based on 33-mer analysis. The x-axis represents the depth (peak at 39×) and the y-axis represents the proportion. Genome size was estimated to be 298 Mb (total k-mer number/volume peak).
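The k-mer estimate in this caption reduces to one division: total k-mer count over the depth at the histogram peak. A minimal sketch of that arithmetic follows. The function names are illustrative, and the total k-mer count used in the example is an assumption, back-calculated from the reported 298 Mb and 39× peak; the section does not report the actual count.

```python
def kmers_per_read(read_length: int, k: int = 33) -> int:
    """Number of k-mers contributed by one read of the given length."""
    return max(read_length - k + 1, 0)


def estimate_genome_size(total_kmers: int, peak_depth: float) -> float:
    """Genome size estimate = total k-mer number / peak k-mer depth."""
    if peak_depth <= 0:
        raise ValueError("peak depth must be positive")
    return total_kmers / peak_depth


# Hypothetical total, chosen so the result reproduces the reported 298 Mb.
ASSUMED_TOTAL_KMERS = 11_622_000_000
PEAK_DEPTH = 39  # peak depth stated in the Figure 2 caption

size_bp = estimate_genome_size(ASSUMED_TOTAL_KMERS, PEAK_DEPTH)
print(f"estimated genome size: {size_bp / 1e6:.0f} Mb")
```

In practice the total k-mer count comes from a k-mer counting tool run on the trimmed reads, and the peak depth is read from the resulting histogram.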
Genome assembly statistics
| Type | Parameter | Assembly size according to Celera Assembler |
|---|---|---|
| Scaffold | Total scaffold length (bases) | 295 233 602 |
| | Gap size (bases) | 10 474 460 |
| | Scaffolds (n) | 11 558 |
| | N50 scaffold length (bases) | 159 218 |
| | Max scaffold length (bases) | 3 401 446 |
| Contig | Total contig length (bases) | 305 712 242 |
| | Contigs (n) | 48 368 |
| | N50 contig length (bases) | 17 566 |
| | Max contig length (bases) | 349 507 |
Figure 3. Scaffold and contig size distributions of Tigriopus kingsejongensis. The percentage of the assembly included (y-axis) in contigs or scaffolds of a minimum size (x-axis, log scale) is shown for the contig (red) and scaffold (blue) assemblies.
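The N50 values in the assembly table and the cumulative curve in Figure 3 are both derived from the list of sequence lengths. A minimal sketch of the two calculations is given below, using the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that covers at least half of the assembly). The function names and the toy lengths are illustrative and are not taken from the assembly.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def fraction_in_sequences_at_least(lengths: Sequence[int], min_size: int) -> float:
    """Share of the assembly held in sequences of at least min_size bases."""
    total = sum(lengths)
    return sum(length for length in lengths if length >= min_size) / total


# Toy contig lengths, for illustration only.
toy = [80, 70, 50, 40, 30, 20]
print(n50(toy))
print(round(fraction_in_sequences_at_least(toy, 50), 3))
```

Evaluating the second function over a range of minimum sizes gives one point per size on a curve of the kind plotted in Figure 3.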
Tigriopus kingsejongensis genes: general statistics
| Parameter | Value |
|---|---|
| Genes (n) | 12 772 |
| Gene length sum (bp) | 82 293 116 |
| Exons per gene (n) | 4.6 |
| mRNA length sum (bp) | 43 306 342 |
| Average mRNA length (bp) | 1090 |
| Number of tRNA | 1393 |
| Number of rRNA | 215 |
Tigriopus kingsejongensis genome completeness reports with the other arthropod genomes
| Species | Tigriopus kingsejongensis | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| Assembly | This study | GCA_000187875.1 | GCA_000208615.1 | GCA_000484575.1 | Smar1.22 | GCA_000239435.1 | Dmel_r5.55 | AaegL3 |
| Sample type | genome | genome | genome | genome | genome | genome | genome | genome |
| CEGMA (a) | 83/77.8 | 99.2/98.8 | 79.8/41.9 | 57.3/24.2 | 95.1 | 98.0/95.2 | 100/100 | 99.2/83.5 |
| BUSCO (b) | 61.1 [10.5], 10.7, 28.1 | 83 [3.9], 11, 5.1 | 68.9 [2.4], 21.0, 10.1 | 34.4 [4.0], 23.0, 42.7 | 84 [5.9], 12, 3.2 | 68.8 [5.8], 9.9, 21.3 | 98 [6.4], 0.6, 0.3 | 86 [ |
| BUSCO (c) | 70.9 [13.6], 6.0, 23.0 | | | | | | | |
| BUSCO (d) | 67.1 [16.8], 5.1, 27.7 | | | | | | | |

a: 248 CEGMA genes found/complete
b: BUSCO Arthropods complete [duplicated], fragmented, missing
c: BUSCO Metazoa complete [duplicated], fragmented, missing
d: BUSCO Eukaryotes complete [duplicated], fragmented, missing
e: [38]
f: [39]
g: [47]
Summary of orthologous gene clusters in 11 representative species
| Species | Source of data | No. of coding genes | No. of gene families | No. of genes in gene families | No. of orphan genes | No. of unique gene families | Average No. of genes in gene families |
|---|---|---|---|---|---|---|---|
| | Ensembl genome 25 | 15 797 | 7958 | 12 792 | 7839 | 854 | 1.61 |
| | Ensembl gene 78 | 20 447 | 6536 | 13 737 | 13 911 | 1528 | 2.10 |
| | Ensembl gene 78 | 16 671 | 7017 | 9058 | 9654 | 503 | 1.29 |
| | Ensembl genome 25 | 30 590 | 6710 | 8362 | 7208 | 368 | 1.25 |
| | Ensembl gene 78 | 13 918 | 9673 | 21 917 | 20 917 | 2408 | 2.27 |
| | Ensembl gene 78 | 20 300 | 8696 | 17 186 | 11 604 | 1065 | 1.98 |
| | Ensembl genome 25 | 20 486 | 8097 | 11 277 | 12 389 | 873 | 1.39 |
| | | 32 016 | 8389 | 19 961 | 23 627 | 2276 | 2.38 |
| | Ensembl genome 25 | 14 992 | 7727 | 11 012 | 7265 | 583 | 1.43 |
| | Ensembl genome 25 | 18 224 | 6602 | 11 788 | 11 622 | 939 | 1.79 |
| Tigriopus kingsejongensis | this study | 12 772 | 6205 | 8813 | 6567 | 649 | 1.42 |
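The last column of this table can be recomputed from two of the others: the average number of genes per gene family is the number of genes in gene families divided by the number of gene families. The short check below is a sketch of that relationship, not part of the authors' pipeline. The tuple layout and function name are illustrative, and rows are listed in table order without species names.

```python
# (gene families, genes in gene families, reported average), in table order.
ROWS = [
    (7958, 12792, 1.61),
    (6536, 13737, 2.10),
    (7017, 9058, 1.29),
    (6710, 8362, 1.25),
    (9673, 21917, 2.27),
    (8696, 17186, 1.98),
    (8097, 11277, 1.39),
    (8389, 19961, 2.38),
    (7727, 11012, 1.43),
    (6602, 11788, 1.79),
    (6205, 8813, 1.42),  # this study
]


def average_family_size(families: int, genes_in_families: int) -> float:
    """Average number of genes per gene family, rounded to two decimals."""
    return round(genes_in_families / families, 2)


mismatches = [
    row for row in ROWS if average_family_size(row[0], row[1]) != row[2]
]
print(f"{len(ROWS) - len(mismatches)} of {len(ROWS)} rows reproduce the reported average")
```

A check of this kind is a quick way to confirm that the numeric columns survived extraction intact.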
Figure 4. Comparative genome analyses of the T. kingsejongensis genome. A. Venn diagram of orthologous gene clusters between four arthropod lineages. B. Gene family gain-and-loss analysis. The numbers of gained gene families (red), lost gene families (blue) and orphan gene families (black) are indicated for each species. Timelines indicate divergence times between the lineages.
Figure 5. Tigriopus kingsejongensis-specific adaptive evolution. A. Distribution of the global mean ω (ratio of nonsynonymous (dN) to synonymous (dS) substitutions) by GO category for T. kingsejongensis and T. japonicus. GO categories showing putatively accelerated nonsynonymous divergence (binomial test, test statistic <0.05) in T. kingsejongensis and T. japonicus are colored red and blue, respectively. B. Seven enzyme-coding genes identified as positively selected genes (PSGs) are involved in four metabolic pathways (oval frames) of T. kingsejongensis: the energy (purple), nucleotide (red), lipid (green), and carbohydrate (blue) metabolic pathways. The three genes belonging to the oxidative phosphorylation pathway (KEGG pathway map00190; rectangular frame) are presented below the enzymes involved. Solid lines indicate direct processes and dashed lines indicate that more than one step is involved in a process.