Zhixiang Pan, Jianfeng Jin, Cong Xu, Daoyuan Yu.
Abstract
The family Tomoceridae is among the earliest derived collembolan lineages and is therefore of key importance for understanding the evolution of Collembola. Here, we assembled a chromosome-level genome of one tomocerid species, Tomocerus qinae, by combining Nanopore long reads and Hi-C data. The final genome size was 334.44 Mb, with scaffold and contig N50 lengths of 71.85 Mb and 13.94 Mb, respectively. BUSCO assessment indicated that 96.80% of the arthropod universal single-copy orthologs (n = 1,013) were present and complete in the assembly. Repeat elements accounted for 26.11% (87.26 Mb) of the genome, and 494 noncoding RNAs were identified. A total of 20,451 protein-coding genes were predicted, which captured 96.0% (973) of the BUSCO genes. Gene family evolution analyses identified 4,825 expanded gene families in T. qinae; 47 of these experienced significant expansions and are mainly involved in proliferation and growth. This study provides an important genomic resource for future evolutionary and comparative genomic analyses of Collembola.
Keywords: Hi-C; Nanopore; Tomocerinae; comparative genomics; gene family evolution
Year: 2022 PMID: 35298623 PMCID: PMC8995043 DOI: 10.1093/gbe/evac039
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Genome assembly and annotation statistics of three Collembola species
| Statistic | Tomocerus qinae | | |
|---|---|---|---|
| Genome assembly | | | |
| Assembly size (Mb) | 334.44 | 381.46 | 221.70 |
| Number of scaffolds/contigs | 115/272 | 599/599 | 162/228 |
| Longest scaffold/contig (Mb) | 140.45/25.68 | 12.99/12.99 | 28.53/20.23 |
| N50 scaffold/contig length (Mb) | 71.85/13.94 | 3.28/3.28 | 6.52/4.89 |
| GC (%) | 34.42 | 37.51 | 37.52 |
| Gaps (%) | 0.05 | 0 | 0.11 |
| BUSCO completeness (%) | 96.8 | 95.3 | 97.0 |
| Gene annotation | | | |
| Protein-coding genes | 20,451 | 23,943 | 28,734 |
| Mean protein length (aa) | 518.16 | 524.60 | 461.54 |
| Mean gene length (bp) | 6,083.26 | 4,040.85 | 4,615.44 |
| Exons per gene | 7.23 | 5.59 | 6.89 |
| Exon (%) | 17.13 | 15.69 | 31.90 |
| Mean exon length (bp) | 387.54 | 446.95 | 357.02 |
| Intron (%) | 20.07 | 9.67 | 28.00 |
| Mean intron length (bp) | 556.85 | 364.58 | 414.36 |
| BUSCO completeness (%) | 96.0 | 90.8 | 77.6 |
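
The table reports scaffold and contig N50 lengths. As a minimal sketch of how that statistic is conventionally defined (the length of the shortest sequence among the longest sequences that together cover at least half of the total assembly), the Python below works on a made-up list of lengths. The function name `n50` and the toy values are illustrative assumptions and are not taken from the paper; this record does not say which tool produced the reported values.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in Mb, for illustration only.
toy_contigs = [25, 14, 10, 8, 5, 3]
print(n50(toy_contigs))
```

With real data, the lengths would come from the assembly FASTA (one value per scaffold or per contig), which is why a single assembly can have two different N50 values, as in the table.
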
Fig. 1. (a) Phylogenetic and gene family evolution analyses of T. qinae and nine other arthropod species. Node values indicate the number of gene families showing expansion, contraction, and rapid evolution. “1:1:1” denotes shared single-copy genes, “N:N:N” multicopy genes shared by all species, “Collembola” shared orthologs unique to Collembola, “Others” unclassified orthologs, and “Unassigned” orthologs that cannot be assigned to any gene family (orthogroup). (b) The bars show the top 20 significantly expanded families.