Fuyun Liu, Yuli Li, Hongwei Yu, Lingling Zhang, Jingjie Hu, Zhenmin Bao, Shi Wang.
Abstract
Mollusca represents the second largest animal phylum but remains poorly explored from a genomic perspective. While the recent increase in genomic resources holds great promise for a deep understanding of molluscan biology and evolution, access and utilization of these resources still pose a challenge. Here, we present the first comprehensive molluscan genomics database, MolluscDB (http://mgbase.qnlm.ac), which compiles and integrates current molluscan genomic/transcriptomic resources and provides convenient tools for multi-level integrative and comparative genomic analyses. MolluscDB enables a systematic view of genomic information from various aspects, such as genome assembly statistics, genome phylogenies, fossil records, gene information, expression profiles, gene families, transcription factors, transposable elements and mitogenome organization information. Moreover, MolluscDB offers valuable customized datasets or resources, such as gene coexpression networks across various developmental stages and adult tissues/organs, core gene repertoires inferred for major molluscan lineages, and macrosynteny analysis for chromosomal evolution. MolluscDB presents an integrative and comprehensive genomics platform that will allow the molluscan community to cope with ever-growing genomic resources and will expedite new scientific discoveries for understanding molluscan biology and evolution.
Year: 2021 PMID: 33219670 PMCID: PMC7779068 DOI: 10.1093/nar/gkaa918
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of MolluscDB database structure and web interface features.
Summary of MolluscDB data composition
| Data | Statistics |
|---|---|
| Class/order/species | 3/46/123 |
| Protein-coding genes | 563 593 |
| Transcriptomic data/expression profiles | 538 |
| Mitogenomic data | 409 |
| Taxonomic categories with paleobiological records | 241 |
| Types of functional annotation database | 6 |
| Swiss-Prot/Nr/GO/KEGG/Pfam/PANTHER annotation | 347 623/508 505/277 773/165 238/411 647/455 626 |
| Transposable elements/associated genes | 72 640 596/522 372 |
| Gene families/associated genes | 29 151/513 684 |
| Groups of Pan-geneset | 38 |
| Core gene families | 122 434 |
| Dispensable gene families | 169 392 |
| Core genes | 513 684 |
| Unclustered genes | 49 909 |
| Transcription factors/TF families | 26 441/71 |
| Co-expressed gene networks | 18 |
| Synteny gene pairs | 363 152 |
Summary of 20 high-quality molluscan genome assemblies
| Taxonomy | Species | Genome size (Mb) | Number of protein-coding genes | Contig N50 (kb) | Scaffold N50 (kb) | GC content (%) | Repeat rate (%) | References/Resources |
|---|---|---|---|---|---|---|---|---|
| Bivalvia | | 988 | 24 738 | 38 | 804 | 36.52 | 27.85 | ( |
| Bivalvia | | 780 | 28 602 | 22 | 602 | 35.49 | 27.73 | ( |
| Bivalvia | | 725 | 26 256 | 80 | 1 020 | 35.40 | 32.04 | ( |
| Bivalvia | | 559 | 28 072 | 19 | 401 | 33.44 | 34.71 | ( |
| Bivalvia | | 685 | 34 596 | 1 971 | 75 944 | 34.83 | 39.69 | ( |
| Bivalvia | | 788 | 29 738 | 40 | 804 | 33.31 | 45.39 | ( |
| Bivalvia | | 1 024 | 31 477 | 21 | 167 | 35.03 | 43.35 | ( |
| Bivalvia | | 991 | 30 815 | 21 | 324 | 35.32 | 48.01 | ( |
| Bivalvia | | 1 660 | 33 584 | 13 | 343 | 34.17 | 47.25 | ( |
| Bivalvia | | 2 630 | 36 549 | 20 | 100 | 33.96 | 59.66 | ( |
| Bivalvia | | 885 | 24 045 | 1 798 | 4 500 | 33.70 | 46.41 | ( |
| Bivalvia | | 1 332 | 26 273 | 679 | 57 990 | 35.45 | 36.65 | ( |
| Cephalopoda | | 2 372 | 33 609 | 5 | 470 | 36.04 | 50.43 | ( |
| Cephalopoda | | 5 090 | 30 010 | 197 | 3 020 | 36.34 | 75.62 | ( |
| Gastropoda | | 360 | 23 818 | 96 | 1 870 | 33.28 | 23.73 | ( |
| Gastropoda | | 1 865 | 29 449 | 14 | 211 | 40.51 | 36.07 | ( |
| Gastropoda | | 558 | 24 980 | 29 | 422 | 37.65 | 29.25 | ( |
| Gastropoda | | 916 | 25 550 | 19 | 48 | 35.99 | 43.79 | ( |
| Gastropoda | | 927 | 19 944 | 10 | 917 | 40.35 | 39.70 | NCBI Genome (AplCal3.0) |
| Gastropoda | | 440 | 21 533 | 1 073 | 31 530 | 40.62 | 20.72 | ( |
Figure 2. Screenshots for (A) overview of species information, (B) summary of genome assembly, (C) summary of transcriptomic data, (D) overview of mitogenomic information and (E) summary of paleobiological records.
Figure 3. Screenshots for (A) gene annotation, (B) transposable elements, (C) transcription factors, (D) GBrowse, (E) gene search and (F) gene family.
Figure 4. Screenshots for (A) expression visualization and (B) gene coexpression network.
Figure 5. Screenshots for specially customized modules for (A) Pan-geneset analysis and (B) macrosynteny analysis.