Yujung Lee1, Bongsang Kim1,2, Jaehoon Jung1,2, Bomin Koh1, So Yun Jhang1,3, Chaeyoung Ban1, Won-Jae Chi4, Soonok Kim4, Jaewoong Yu1.
Abstract
BACKGROUND: Plazaster borealis has a unique morphology, displaying multiple arms with a clear distinction between disk and arms, rather than displaying pentaradial symmetry, a remarkable characteristic of echinoderms. Herein we report the first chromosome-level reference genome of P. borealis, an essential tool to further investigate the basis of the divergent morphology.
Keywords: Hi-C; Nanopore; Plazaster borealis; genome assembly
Year: 2022 PMID: 35809048 PMCID: PMC9270726 DOI: 10.1093/gigascience/giac063
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 7.658
Figure 1: (A) Adult Plazaster borealis. Photograph by National Institute of Biological Resources [77]. (B) Sampling spot of P. borealis studied in this research.
Plazaster borealis assembly statistics
| Assembly statistics | Value |
|---|---|
| Genome size (bp) | 561,050,340 |
| Number of scaffolds | 801 |
| Number of chromosome-scale scaffolds | 22 |
| N50 of scaffolds (bp) | 24,975,817 |
| L50 of scaffolds | 10 |
| Chromosome-scale scaffolds (bp) | 518,884,334 |
| GC content of the genome (%) | 38.89 |
| QV score | 36.3457 |
| Error rate | 0.00023 |
| BUSCO analysis | |
| Library | Metazoan_odb10 |
| Complete | 935 (98.0%) |
| Complete and single copy | 925 (97.0%) |
| Complete and duplicated | 10 (1.0%) |
| Fragmented | 11 (1.2%) |
| Missing | 8 (0.8%) |
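The QV score and error rate in the table above are two views of the same quantity. As a minimal sketch, assuming the QV is Phred-scaled (the usual convention for assembly consensus quality; the source does not name the tool that produced it), the conversion and the share of the assembly placed in chromosome-scale scaffolds can be checked as follows. The function names are illustrative, not taken from the paper.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Inverse conversion: per-base error probability to Phred-scaled QV."""
    return -10 * math.log10(error_rate)


# Values copied from the assembly statistics table.
QV = 36.3457
GENOME_SIZE_BP = 561_050_340
CHROMOSOME_SCALE_BP = 518_884_334

error_rate = qv_to_error_rate(QV)
chrom_fraction = CHROMOSOME_SCALE_BP / GENOME_SIZE_BP

print(f"per-base error rate: {error_rate:.5f}")
print(f"fraction in chromosome-scale scaffolds: {chrom_fraction:.2%}")
```

Under that assumption, the computed error rate rounds to the 0.00023 reported in the table.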
Plazaster borealis repetitive DNA elements
| Type | Number of elements | Length occupied (bp) | Percentage of sequence (%) |
|---|---|---|---|
| DNA | 10,734 | 3,597,965 | 0.64 |
| LINE | 42,851 | 3,472,043 | 0.62 |
| SINE | 60,394 | 13,931,402 | 2.48 |
| LTR | 8,277 | 5,145,127 | 0.92 |
| Satellite | 9 | 2,752 | 0 |
| Small RNA | 20,889 | 1,464,546 | 0.26 |
| Simple repeat | 162,149 | 8,016,020 | 1.43 |
| Unclassified | 1,294,477 | 249,314,223 | 44.44 |
| Low complexity | 25,170 | 1,365,485 | 0.24 |
| Total | | | 51.05 |
Plazaster borealis genome annotation statistics
| Statistic | Value |
|---|---|
| Number of predicted genes | 26,836 |
| Number of predicted protein-coding genes | 25,224 |
| Average gene length (bp) | 8,948.89 |
| Number of transcripts | 26,737 |
| Average transcript length (bp) | 1,502.90 |
| Number of exons | 192,343 |
| Average exon length (bp) | 213.57 |
| Average exons per transcript | 7.19 |
| Number of introns | 165,606 |
| Average intron length (bp) | 1,261.88 |
| Number of genes annotated to Swiss-Prot | 18,451 |
| Number of genes annotated to PFAM | 18,541 |
| Number of genes annotated to NR | 24,229 |
| BUSCO analysis | |
| Complete (%) | 884 (92.6%) |
| Complete and single copy (%) | 859 (90.0%) |
| Complete and duplicated (%) | 25 (2.6%) |
| Fragmented (%) | 44 (4.6%) |
| Missing (%) | 26 (2.8%) |
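The annotation counts above can be cross-checked with simple arithmetic. A transcript with n exons has n - 1 introns, so if exons and introns are counted per transcript (an assumption; the source does not state how they were counted), the intron total should equal exons minus transcripts. A short sketch:

```python
# Values copied from the genome annotation statistics table.
NUM_TRANSCRIPTS = 26_737
NUM_EXONS = 192_343
NUM_INTRONS = 165_606

# Each transcript contributes (exons - 1) introns, so summing over all
# transcripts gives total_exons - total_transcripts.
expected_introns = NUM_EXONS - NUM_TRANSCRIPTS
exons_per_transcript = NUM_EXONS / NUM_TRANSCRIPTS

print(f"expected introns: {expected_introns:,}")
print(f"reported introns: {NUM_INTRONS:,}")
print(f"exons per transcript: {exons_per_transcript:.2f}")
```

Both derived values agree with the table: the intron count matches exactly, and the exons-per-transcript ratio rounds to 7.19.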
Figure 2: A phylogenetic tree of P. borealis and 6 other species. The tree was constructed using protein sequences of the 7 species and shows gene family expansion and contraction. The numbers below the branches give the number of gene families with either expansion (blue) or contraction (red). The ratio of expanded to contracted gene families is shown in the pie chart above each branch. The numbers at the nodes indicate bootstrap values. The species used in the tree are P. borealis, A. rubens, A. planci, P. miniata, L. variegatus, P. parvimensis, and S. purpuratus.
Figure 3: Syntenic relationship between P. borealis and species of the order Forcipulatida. (A) Synteny between A. rubens and P. borealis. The syntenic blocks were calculated with MCscan. (B–D) Syntenic relationship between P. borealis and A. rubens (B), Pisaster ochraceus (C), and Marthasterias glacialis (D). Genomic sequences were compared with Chromeister based on inexact k-mer matching.
Figure 4: GO enrichment analysis of expanded gene families of P. borealis.
Genes with accelerated evolution in P. borealis
| Gene | H0_lnl | H1_lnl | Likelihood ratio | FDR | No. of positively selected sites* |
|---|---|---|---|---|---|
| GPR161 | −8,827.28 | −8,798.95 | 56.66761 | 2.06E-13 | 5 |
| RPL5 | −3,991.54 | −3,968.12 | 46.84587 | 2.3E-11 | 1 |
| RSL24D1 | −2,215.1 | −2,192.93 | 44.35075 | 6.59E-11 | 14 |
| PHB2 | −4,815.8 | −4,805.98 | 19.631658 | 1.61E-05 | 4 |
| NAA10 | −4,703.42 | −4,694.3 | 18.237898 | 2.92E-05 | 4 |
| IQCA1 | −9,112.13 | −9,103.79 | 16.684644 | 5.88E-05 | 2 |
| SLC30A5 | −10,574.5 | −10,566.6 | 15.766218 | 8.6E-05 | 3 |
| BMP10 | −8,017.18 | −8,010.17 | 14.034764 | 0.000196 | 4 |
| STOML2 | −5,414.16 | −5,408.06 | 12.206464 | 0.000476 | 1 |
| ACYP1 | −1,855.62 | −1,849.54 | 12.153438 | 0.000452 | 3 |
| NIPSNAP3A | −4,951.12 | −4,946.47 | 9.296206 | 0.001968 | 1 |
H0_lnl: log likelihood given H0 (ω does not vary across the branches); H1_lnl: log likelihood given H1. *Number of positively selected sites with a BEB (Bayes Empirical Bayes) posterior probability > 0.95.
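The likelihood ratio column follows from the two log-likelihood columns as 2 × (H1_lnl − H0_lnl). The sketch below reproduces that statistic for a few rows and converts it to an unadjusted p-value, assuming a chi-square reference distribution with 1 degree of freedom. That choice is an assumption made for illustration; the source does not state the degrees of freedom used. The FDR column in the table is corrected for multiple testing, so these raw p-values are not expected to equal it.

```python
import math


def likelihood_ratio(h0_lnl: float, h1_lnl: float) -> float:
    """Likelihood ratio statistic: twice the gain in log likelihood under H1."""
    return 2.0 * (h1_lnl - h0_lnl)


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For df = 1 this has the closed form erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# (gene, H0_lnl, H1_lnl) copied from the table above.
rows = [
    ("GPR161", -8827.28, -8798.95),
    ("RPL5", -3991.54, -3968.12),
    ("NIPSNAP3A", -4951.12, -4946.47),
]

for gene, h0, h1 in rows:
    lr = likelihood_ratio(h0, h1)
    print(f"{gene}: LR = {lr:.2f}, unadjusted p (df = 1) = {chi2_sf_1df(lr):.3g}")
```

Because the tabulated log likelihoods are rounded to two decimals, the recomputed statistics match the table's likelihood ratio values only approximately.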