Mingyi Cai, Yu Zou, Shijun Xiao, Wanbo Li, Zhaofang Han, Fang Han, Junzhu Xiao, Fujiang Liu, Zhiyong Wang.
Abstract
Collichthys lucidus (C. lucidus) is a commercially important marine fish species distributed in the coastal regions of East Asia, with an X1X1X2X2/X1X2Y multiple sex chromosome system. The karyotype is 2n = 48 in females and 2n = 47 in males. C. lucidus is therefore an excellent model for investigating teleost sex determination and sex chromosome evolution. We report the first chromosome-level genome assembly of C. lucidus, built with Illumina short-read sequencing, PacBio long-read sequencing and Hi-C technology. An 877 Mb genome was obtained, with contig and scaffold N50 values of 1.1 Mb and 35.9 Mb, respectively. More than 97% of BUSCO genes were identified in the C. lucidus genome, and 28,602 genes were annotated. We identified potential sex-determination genes along the chromosomes and found that chromosome 1 might be involved in the formation of the Y-specific metacentric chromosome. This first chromosome-level reference genome of C. lucidus lays a solid foundation for subsequent population genetics studies, functional gene mapping of economically important traits, and studies of sex determination and sex chromosome evolution in Sciaenidae and other teleosts.
Year: 2019 PMID: 31341172 PMCID: PMC6656731 DOI: 10.1038/s41597-019-0139-x
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1 A picture of Collichthys lucidus used for the genome sequencing.
Sequencing data used for the C. lucidus genome assembly.
| Type | Method | Library size (bp) | Clean data (Gb) | Read length (bp) | Coverage (×) |
|---|---|---|---|---|---|
| Genome | Illumina | 300–350 | 52.0 | 150 | 62.6 |
| Genome | PacBio | 20,000 | 90.5 | 14,002 | 109.0 |
| Genome | Hi-C | — | 193.1 | 150 | 232.7 |
| Transcriptome | Illumina | 250–300 | 9.8 | 150 | — |
The coverage was calculated using an estimated genome size of 830 Mb.
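The coverage column can be reproduced with a short calculation. The sketch below assumes that coverage is simply the clean data volume divided by the estimated genome size of 830 Mb; the function and variable names are illustrative and do not come from the original work.

```python
# Clean data volumes (Gb) taken from the sequencing table above.
CLEAN_DATA_GB = {"Illumina": 52.0, "PacBio": 90.5, "Hi-C": 193.1}

# Estimated genome size stated in the table footnote (Mb).
GENOME_SIZE_MB = 830


def coverage(clean_data_gb: float, genome_size_mb: float = GENOME_SIZE_MB) -> float:
    """Return sequencing depth (x) as total sequenced bases / genome size."""
    return (clean_data_gb * 1e9) / (genome_size_mb * 1e6)


if __name__ == "__main__":
    for method, gigabases in CLEAN_DATA_GB.items():
        print(f"{method}: {coverage(gigabases):.1f}x")
```

The results agree with the table to within about 0.1×; small differences are expected because the clean data volumes are themselves rounded.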
Fig. 2 K-mer frequency distribution of C. lucidus. Note that the first, second and third peaks are composed of homozygous, heterozygous and repeated k-mers, respectively.
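As a rough illustration of how a k-mer frequency spectrum like the one in Fig. 2 is built, the sketch below counts every k-mer in a sequence and then tallies how many distinct k-mers occur once, twice, and so on. It is a toy version under stated assumptions: the function name and the example sequence are hypothetical, and real analyses run dedicated k-mer counters over the sequencing reads (usually on canonical k-mers) rather than over a single string.

```python
from collections import Counter


def kmer_spectrum(sequence: str, k: int) -> Counter:
    """Map each occurrence count to the number of distinct k-mers with that count."""
    if k <= 0 or k > len(sequence):
        raise ValueError("k must be between 1 and the sequence length")
    # Count every overlapping k-mer in the sequence.
    kmer_counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Histogram of those counts: frequency -> number of distinct k-mers.
    return Counter(kmer_counts.values())


if __name__ == "__main__":
    spectrum = kmer_spectrum("ACGTACGTAC", k=4)
    for frequency in sorted(spectrum):
        print(frequency, spectrum[frequency])
```

Plotting frequency against the number of distinct k-mers gives the kind of curve whose peaks are interpreted in the figure caption.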
Assembly statistics of C. lucidus.
| Sample ID | Contig Length (bp) | Contig number |
|---|---|---|
| Total | 877,428,965 | 2,912 |
| Max | 9,855,977 | — |
| Number >=2000bp | — | 2,853 |
| N50 | 1,098,566 | 210 |
| N60 | 794,488 | 305 |
| N70 | 545,261 | 437 |
| N80 | 319,460 | 646 |
| N90 | 152,174 | 1,044 |
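The N50 to N90 rows follow the usual definition of Nx: sort the contigs from longest to shortest and report the length of the contig at which the running total first reaches x% of the assembly. The "Contig number" column is read here as the number of contigs needed to reach that point, which is an assumption about the table. A minimal sketch, with a hypothetical function name and toy lengths:

```python
from typing import Iterable, Tuple


def nx(lengths: Iterable[int], fraction: float = 0.5) -> Tuple[int, int]:
    """Return (Nx length, number of contigs needed) for the given fraction.

    Contigs are sorted longest first; the answer is the contig at which the
    cumulative length first reaches `fraction` of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running_total = 0
    for count, length in enumerate(ordered, start=1):
        running_total += length
        if running_total >= threshold:
            return length, count
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    toy_contigs = [10, 8, 6, 4, 2]  # toy data, not from the assembly
    print(nx(toy_contigs, 0.5))
    print(nx(toy_contigs, 0.9))
```

Applied to the real contig lengths, `nx(lengths, 0.5)` would be expected to give the N50 row and `nx(lengths, 0.9)` the N90 row.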
General statistics of predicted protein-coding genes.
| Gene set | Number | Average transcript length (bp) | Average CDS length (bp) | Average exons per gene | Average exon length (bp) | Average intron length (bp) |
|---|---|---|---|---|---|---|
| | 32,502 | 11,378.88 | 1,494.29 | 8.52 | 175.44 | 1,314.88 |
| | 40,805 | 15,596.28 | 1,560.39 | 8.56 | 182.21 | 1,855.72 |
| | 52,244 | 9,049.21 | 1,076.27 | 5.56 | 193.69 | 1,749.76 |
| | 48,861 | 7,508.49 | 1,028.16 | 5.79 | 177.46 | 1,351.80 |
| | 45,957 | 7,811.18 | 1,035.02 | 6.04 | 171.27 | 1,447.46 |
| | 44,650 | 8,137.02 | 1,036.88 | 5.91 | 175.59 | 1,405.38 |
| | 43,159 | 8,366.10 | 1,046.02 | 6.21 | 168.48 | 1,401.06 |
| | | 11,694.21 | 1,095.81 | 7.62 | 317.99 | 1,401.06 |
| | | 13,241.72 | 1,673.58 | 9.74 | 207.05 | 1,284.21 |
General statistics of gene function annotation.
| Type | Number | Percent (%) |
|---|---|---|
| | 28,602 | 100 |
| | 24,918 | 87.12 |
| | 18,942 | 66.23 |
| | 17,806 | 62.25 |
| | 26,038 | 91.04 |
| | 27,883 | 97.49 |
| | 27,996 | 97.88 |
| | 28,032 | 98.01 |
| | 570 | 1.99 |
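The percentages in this table appear to be each count divided by the 28,602 annotated genes reported in the abstract. The sketch below checks that arithmetic; the names are illustrative, and the reading of 28,602 as the denominator is an assumption based on the row listed at 100%.

```python
# Total gene count: the row listed at 100% (also given in the abstract).
TOTAL_GENES = 28_602

# (count, reported percent) pairs copied from the remaining table rows.
ROWS = [
    (24_918, 87.12),
    (18_942, 66.23),
    (17_806, 62.25),
    (26_038, 91.04),
    (27_883, 97.49),
    (27_996, 97.88),
    (28_032, 98.01),
    (570, 1.99),
]


def percent(count: int, total: int = TOTAL_GENES) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


if __name__ == "__main__":
    for count, reported in ROWS:
        print(f"{count:>6}  computed {percent(count):6.2f}  reported {reported:6.2f}")
```

The last two counts (28,032 and 570) sum to 28,602, so their percentages add up to 100.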
Fig. 3 Repetitive element distribution and potential sex-determination gene identification in the chromosomes of C. lucidus. The color bar represents the density of repetitive elements (number per 100 kb) along the genome. The 21 key genes involved in teleost sex determination that were reported in previous studies are identified and labeled on the chromosomes.
Fig. 4 Chromosome comparison of C. lucidus with L. crocea based on protein-coding gene synteny. The chromosome IDs of C. lucidus are sorted by sequence length.
| Design Type(s) | sequence assembly objective • sequence annotation objective • transcription profiling design |
| Measurement Type(s) | whole genome sequencing assay • transcript expression assay |
| Technology Type(s) | DNA sequencing • RNA sequencing |
| Factor Type(s) | organism part |
| Sample Characteristic(s) | Collichthys lucidus |