Euna Jo, Yll Hwan Cho, Seung Jae Lee, Eunkyung Choi, Jinmu Kim, Jeong-Hoon Kim, Young Min Chi, Hyun Park.
Abstract
The genus Pogonophryne is a speciose group that includes 28 species inhabiting the coastal or deep waters of the Antarctic Southern Ocean. The genus has been divided into five species groups, among which the P. albipinna group is the deepest-living and is characterized by a lack of spots on the top of the head. Here, we carried out genome survey sequencing of P. albipinna using the Illumina HiSeq platform to estimate the genomic characteristics and identify genome-wide microsatellite motifs. The genome size was predicted to be ∼883.8 Mb by K-mer analysis (K = 25), and the heterozygosity and repeat ratio were 0.289% and 39.03%, respectively. The genome sequences were assembled into 571624 contigs, covering a total length of ∼819.3 Mb with an N50 of 2867 bp. A total of 2217422 simple sequence repeat (SSR) motifs were identified from the assembly data, and the number of SSRs decreased as the motif length and the number of repeats increased. These data will provide a useful foundation for the development of new molecular markers for the P. albipinna group as well as for further whole-genome sequencing of P. albipinna.
Keywords: GC content; Pogonophryne albipinna; genome assembly; genome size; microsatellite
Year: 2021 PMID: 34223611 PMCID: PMC8292760 DOI: 10.1042/BSR20210824
Source DB: PubMed Journal: Biosci Rep ISSN: 0144-8463 Impact factor: 3.840
Statistics of the genome survey sequencing data of P. albipinna
| Raw data (bp) | Total reads | Q20 (%) | Q30 (%) | GC content (%) |
|---|---|---|---|---|
| 57104280342 | 378174042 | 96.6 | 91.8 | 41.7 |
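As a quick consistency check on the table above, the mean read length can be derived by dividing the raw bases by the total reads. This is only a sketch of the arithmetic; the variable names are illustrative, and the record does not state whether the reads were trimmed.

```python
# Values copied from the sequencing statistics table above.
raw_bases = 57_104_280_342
total_reads = 378_174_042

# Mean read length in bp (assumes every read contributes to the raw base count).
mean_read_length = raw_bases / total_reads
print(f"Mean read length: {mean_read_length:.1f} bp")
```

The result is an average of 151 bp per read.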
Genome estimation based on K-mer analysis of P. albipinna
| K-mer | Genome size (bp) | Heterozygosity (%) | Duplication ratio (%) |
|---|---|---|---|
| 17 | 829857227 | 0.275 | 0.795 |
| 19 | 843219952 | 0.294 | 0.758 |
| 25 | 883779230 | 0.289 | 0.751 |
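The genome size estimates above can be combined with the raw data volume to approximate the sequencing depth. The sketch below assumes depth is simply raw bases divided by estimated genome size, without any correction for filtered or low-quality reads; the names are illustrative.

```python
# Raw sequencing output (bp), from the sequencing statistics table.
RAW_BASES = 57_104_280_342

# Estimated genome size (bp) for each K-mer length, from the table above.
genome_size_by_k = {17: 829_857_227, 19: 843_219_952, 25: 883_779_230}

# Approximate depth = raw bases / estimated genome size.
depth_by_k = {k: RAW_BASES / size for k, size in genome_size_by_k.items()}

for k, depth in depth_by_k.items():
    print(f"K = {k}: about {depth:.1f}x")
```

With the K = 25 estimate, this gives a depth of roughly 64.6x.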
Figure 1. K-mer (K = 25) distribution of the P. albipinna genome
Blue bars represent the observed K-mer distribution; black line represents the modeled distribution without the error K-mers (indicated by the red line), up to a maximum K-mer coverage specified in the model (indicated by the yellow line). Len, estimated total genome length; Uniq, unique portion of the genome (not repetitive); Het, heterozygosity rate; Kcov, mean K-mer coverage for heterozygous bases; Err, error rate; Dup, duplication rate.
Statistics of the assembled genome sequences of P. albipinna
| Total length (bp) | Total number | Max length (bp) | N50 length (bp) | GC content (%) |
|---|---|---|---|---|
| 819289238 | 571624 | 51460 | 2867 | 41.02 |
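For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of one common definition: the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the total assembly length. The toy contig lengths are hypothetical, since the individual contig lengths are not given here. The block also derives two summary figures from the table values.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")

# Hypothetical toy contigs (not from the study): total 20, half 10.
toy_n50 = n50([2, 3, 4, 5, 6])

# Values copied from the assembly table and the K = 25 genome estimate.
assembly_length = 819_289_238
contig_count = 571_624
estimated_genome_size = 883_779_230

mean_contig_length = assembly_length / contig_count
assembled_fraction = assembly_length / estimated_genome_size
print(toy_n50, round(mean_contig_length), round(assembled_fraction, 3))
```

The assembly spans about 92.7% of the K = 25 genome size estimate, with a mean contig length of roughly 1.4 kb.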
Statistics of SSR for P. albipinna
| Statistics | Di- | Tri- | Tetra- | Penta- | Hexa- | Total |
|---|---|---|---|---|---|---|
| SSR number | 1926231 | 249028 | 36955 | 3372 | 1836 | 2217422 |
| Percentage | 86.87 | 11.23 | 1.67 | 0.15 | 0.08 | - |
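The percentage row above can be reproduced directly from the SSR counts. This short sketch recomputes the total and each class's share; the dictionary keys mirror the table headers.

```python
# SSR counts per motif class, copied from the table above.
ssr_counts = {
    "Di-": 1_926_231,
    "Tri-": 249_028,
    "Tetra-": 36_955,
    "Penta-": 3_372,
    "Hexa-": 1_836,
}

total_ssrs = sum(ssr_counts.values())

# Share of each motif class, in percent, rounded to two decimals as in the table.
percentages = {
    name: round(100 * count / total_ssrs, 2) for name, count in ssr_counts.items()
}

print(total_ssrs)
print(percentages)
```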
Figure 2. Type and frequency of microsatellite motifs in the P. albipinna genome
(A) Frequency of different microsatellite motif types. (B) Frequency of different dinucleotide microsatellite motifs. (C) Frequency of different trinucleotide microsatellite motifs. (D) Frequency of different tetranucleotide microsatellite motifs. (E) Frequency of different pentanucleotide microsatellite motifs. (F) Frequency of different hexanucleotide microsatellite motifs.
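To illustrate how di- to hexanucleotide motifs like those summarized above can be located in a sequence, here is a minimal regex-based sketch. It is not the tool used in the study. The minimum repeat thresholds in `MIN_REPEATS`, the function names, and the example sequence are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length (not taken from the study).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

def is_periodic(motif):
    # True if the motif is itself built from a shorter repeated unit,
    # e.g. "AA" or "ACAC"; such hits belong to a shorter motif class.
    return motif in (motif + motif)[1:-1]

def find_ssrs(sequence):
    """Return (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_periodic(motif):
                continue
            repeats = len(match.group(0)) // unit_len
            hits.append((match.start(), motif, repeats))
    return hits

# Hypothetical example sequence: (AC)6 and (ATG)5 separated by short flanks.
example = "TTT" + "AC" * 6 + "TTT" + "ATG" * 5 + "C"
ssrs = find_ssrs(example)
class_counts = Counter(len(motif) for _, motif, _ in ssrs)
print(ssrs)
print(class_counts)
```

Counting hits by motif length, as `class_counts` does, yields the kind of per-class tally shown in panel (A). This sketch only detects perfect repeats on one strand and does not merge motifs that are rotations or reverse complements of each other.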