Paul Davis1, Magdalena Zarowiecki1, Valerio Arnaboldi2, Andrés Becerra1, Scott Cain3, Juancarlos Chan2, Wen J Chen2, Jaehyoung Cho2, Eduardo da Veiga Beltrame2, Stavros Diamantakis1, Sibyl Gao3, Dionysis Grigoriadis1, Christian A Grove2, Todd W Harris3, Ranjana Kishore2, Tuan Le1, Raymond Y N Lee2, Manuel Luypaert1, Hans-Michael Müller2, Cecilia Nakamura2, Paulo Nuin3, Michael Paulini1, Mark Quinton-Tulloch1, Daniela Raciti2, Faye H Rodgers4, Matthew Russell1, Gary Schindelman2, Archana Singh4, Tim Stickland4, Kimberly Van Auken2, Qinghua Wang2, Gary Williams1, Adam J Wright3, Karen Yook2, Matt Berriman4, Kevin L Howe1, Tim Schedl5, Lincoln Stein3, Paul W Sternberg2.
Abstract
WormBase (www.wormbase.org) is the central repository for the genetics and genomics of the nematode Caenorhabditis elegans. We provide the research community with data and tools to facilitate the use of C. elegans and related nematodes as model organisms for studying human health, development, and many aspects of fundamental biology. Throughout our 22-year history, we have continued to evolve to reflect progress and innovation in the science and technologies involved in the study of C. elegans. We strive to incorporate new data types and richer data sets, and to provide integrated displays and services that avail the knowledge generated by the published nematode genetics literature. Here, we provide a broad overview of the current state of WormBase in terms of data type, curation workflows, analysis, and tools, including exciting new advances for analysis of single-cell data, text mining and visualization, and the new community collaboration forum. Concurrently, we continue the integration and harmonization of infrastructure, processes, and tools with the Alliance of Genome Resources, of which WormBase is a founding member.
Keywords: Caenorhabditis elegans; annotation; caenorhabditis; community; curation; data; database; gene; health; human; literature; mining; model; nematode; platform; research; resource; software; tools
Year: 2022 PMID: 35134929 PMCID: PMC8982018 DOI: 10.1093/genetics/iyac003
Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.402
C. elegans gene counts by type in the WS282 data release.
| Gene type | Count |
|---|---|
| Total genes | 49,187 |
| Coding | 19,985 |
| Noncoding | 25,537 |
| piRNA | 15,363 |
| ncRNA | 7,764 |
| circRNA | 724 |
| tRNA | 634 |
| snoRNA_gene | 346 |
| miRNA | 261 |
| lincRNA_gene | 193 |
| snRNA_gene | 129 |
| antisense_lncRNA_gene | 100 |
| rRNA | 22 |
| scRNA | 1 |
| Pseudogene | 2,129 |
| Uncloned | 1,536 |
Number of genomic features in the WS282 data release.
| Feature type | Count |
|---|---|
| Total | 869,687 |
| SL1 predicted from RNASeq | 72,602 |
| SL2 predicted from RNASeq | 18,597 |
| SL1 | 91,449 |
| SL2 | 15,083 |
| polyA_signal_sequence | 2,454 |
| polyA_site | 87,271 |
| TF_binding_site | 533 |
| TF_binding_site_region | 327,166 |
| binding_site | 1,604 |
| binding_site_region | 683 |
| histone_binding_site_region | 5,164 |
| DNaseI_hypersensitive_site | 49,832 |
| Promoter | 849 |
| regulatory_region | 163 |
| Enhancer | 2,449 |
| TSS_region | 73,499 |
| transcription_end_site | 92,672 |
| three_prime_UTR | 21,345 |
| Genome_sequence_error | 1,235 |
| Corrected_genome_sequence_error | 1,553 |
| segmental_duplication | 3,484 |
Fig. 1. Examples of RNA-Seq expression data availability in WormBase through custom interfaces. a) FPKM expression over conditions, as determined by different studies; the left panel is a study selector, and the right panel shows data for the selected study; b) a customized plot for the highly accessed modENCODE RNA-Seq data corpus.
Summary of the number of transcriptomic samples available per species.
| Species | No. studies | No. transcriptomic samples |
|---|---|---|
| | 50 | 977 |
| | 4 | 36 |
| | 5 | 43 |
| | 6 | 133 |
| | 2 | 12 |
| | 2 | 21 |
| | 2 | 8 |
| | 5 | 80 |
| | 2 | 16 |
| | 1 | 22 |
| | 1 | 1 |
Fig. 2. For single-cell data, WormBase developed two web apps: scdefg, for differential gene expression analysis, and wormcells-viz, for visualizing gene expression in the annotated cell types.
Fig. 3. Wormicloud creates word clouds from the scientific literature based on the words you input (in this case, the input was “WormBase”).
Fig. 4. The Vennter tool allows researchers to visualize available interactions, select the interaction type(s) of interest, and retrieve a list of genes that interact with a focus gene.