Linhuan Wu, Qinglan Sun, Philippe Desmeth, Hideaki Sugawara, Zhenghong Xu, Kevin McCluskey, David Smith, Vasilenko Alexander, Nelson Lima, Moriya Ohkuma, Vincent Robert, Yuguang Zhou, Jianhui Li, Guomei Fan, Supawadee Ingsriswang, Svetlana Ozerskaya, Juncai Ma.
Abstract
The World Data Centre for Microorganisms (WDCM) was established 50 years ago as the data center of the World Federation for Culture Collections (WFCC)-Microbial Resource Center (MIRCEN). WDCM aims to provide integrated information services using big data technology for microbial resource centers and microbiologists all over the world. Here, we provide an overview of WDCM including all of its integrated services. Culture Collections Information Worldwide (CCINFO) provides metadata on 708 culture collections from 72 countries and regions. Global Catalogue of Microorganisms (GCM) gathers strain catalogue information and provides a data retrieval, analysis, and visualization system for microbial resources. Currently, GCM includes >368 000 strains from 103 culture collections in 43 countries and regions. Analyzer of Bioresource Citation (ABC) is a data mining tool that extracts strain-related publications, patents, nucleotide sequences and genome information from public data sources to form a knowledge base. Reference Strain Catalogue (RSC) maintains a database of strains listed in International Organization for Standardization (ISO) standards and other international or regional standards. RSC allocates a unique identifier to strains recommended for use in diagnosis and quality control, and hence serves as a valuable cross-platform reference. WDCM provides free access to all these services at www.wdcm.org.
Year: 2016 PMID: 28053166 PMCID: PMC5210620 DOI: 10.1093/nar/gkw903
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. A system-level overview of the WDCM databases.
Figure 2. Web interface of the CCINFO database.
Figure 3. ABC data mining workflow.
Statistical summary of the results from the ABC tool
| Data type | Resources identified | Strains included |
|---|---|---|
| Papers | 137 983 | 73 817 |
| Patents | 36 729 | 38 457 |
| Genomes | 2473 | 1920 |
| Nucleotide sequences | 348 047 | 73 617 |
Figure 4. Workflow of RSC processing.
Figure 5. Browse page of strain information.
Statistical summary of culture collection distribution
| Continents | Number of countries | Number of collections | Number of holdings | Average holdings/collection |
|---|---|---|---|---|
| Africa | 7 | 11 | 15 935 | 1448 |
| America | 11 | 178 | 497 894 | 2797 |
| Asia | 17 | 246 | 1 025 865 | 4170 |
| Europe | 33 | 232 | 889 837 | 3835 |
| Oceania | 4 | 41 | 105 379 | 2570 |
| Total | 72 | 708 | 2 534 910 | 3580 |
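
The "Average holdings/collection" column is the number of holdings divided by the number of collections. The following is a minimal sketch in Python that recomputes the averages and the totals row, assuming the table values above are transcribed by hand into a dictionary. The names `CONTINENTS`, `average_holdings` and `totals` are illustrative only and are not part of any WDCM service.

```python
# Values transcribed from the continental distribution table above:
# continent -> (countries, collections, holdings).
# All names here are hypothetical; none belong to a WDCM API.
CONTINENTS = {
    "Africa": (7, 11, 15_935),
    "America": (11, 178, 497_894),
    "Asia": (17, 246, 1_025_865),
    "Europe": (33, 232, 889_837),
    "Oceania": (4, 41, 105_379),
}


def average_holdings(collections: int, holdings: int) -> float:
    """Return the number of holdings per collection as a float."""
    return holdings / collections


def totals(rows: dict) -> tuple:
    """Sum countries, collections and holdings over all continents."""
    countries = sum(r[0] for r in rows.values())
    collections = sum(r[1] for r in rows.values())
    holdings = sum(r[2] for r in rows.values())
    return countries, collections, holdings


if __name__ == "__main__":
    # Per-continent averages, printed with one decimal place.
    for name, (_, n_coll, n_hold) in CONTINENTS.items():
        print(f"{name}: {average_holdings(n_coll, n_hold):.1f}")

    # Totals row and the overall average.
    n_countries, n_coll, n_hold = totals(CONTINENTS)
    print(f"Total: {n_countries} countries, {n_coll} collections, "
          f"{n_hold} holdings, {average_holdings(n_coll, n_hold):.1f} per collection")
```

The integers printed in this table match the quotients with the fractional part dropped, so comparing against `int(average_holdings(...))` reproduces the column.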
Figure 6. Average holdings per collection by continent.
Top 10 countries with the largest number of holdings
| Rank | Countries and regions | Total holdings | Number of collections | Average holdings/collection |
|---|---|---|---|---|
| 1 | USA | 261 637 | 29 | 9022 |
| 2 | Japan | 254 830 | 26 | 9801 |
| 3 | India | 194 174 | 30 | 6472 |
| 4 | China | 182 235 | 19 | 9591 |
| 5 | Republic of Korea | 167 090 | 23 | 7264 |
| 6 | Brazil | 114 494 | 77 | 1483 |
| 7 | Denmark | 102 066 | 3 | 34 022 |
| 8 | Thailand | 99 323 | 63 | 1577 |
| 9 | Germany | 95 593 | 13 | 7353 |
| 10 | Belgium | 93 421 | 7 | 13 346 |
Top 10 most-cited genera in the ABC database
| Genus | Paper counts |
|---|---|
| Rickettsia | 173 189 |
| Staphylococcus | 95 829 |
| Saccharomyces | 71 001 |
| Pseudomonas | 64 904 |
| Mycobacterium | 63 261 |
| Streptococcus | 55 547 |
| Sclerotinia | 41 428 |
| Salmonella | 36 043 |
| Helicobacter | 29 060 |
| Clostridium | 25 759 |