Tingting Zhang, Jiaojiao Miao, Na Han, Yujun Qiang, Wen Zhang.
Abstract
Advances in high-throughput sequencing have led to unprecedented growth in the amount of available genome sequencing data, especially for bacterial genomes, and this growth has made the storage and management of such huge datasets a challenge. To facilitate bacterial research and related studies, we have developed the Mypathogen database (MPD), which allows users to search, download, store and share bacterial genomics data. The MPD represents the first pathogenic database for microbial genomes and metagenomes, and currently covers pathogenic microbial genomes (6604 genera, 11 071 species, 41 906 strains) and metagenomic data from host, air, water and other sources (28 816 samples). The MPD also functions as a management system for statistical and storage data that can be used by different organizations, thereby facilitating data sharing among organizations and research groups. A user-friendly local client tool is provided to maintain the steady transmission of large sequencing datasets. The MPD is a useful tool for analysis and management in genomic research, especially for clinical Centers for Disease Control and epidemiological studies, and is expected to contribute to advancing knowledge on pathogenic bacterial genomes and metagenomes.
Database URL: http://data.mypathogen.org
Year: 2018 PMID: 29917040 PMCID: PMC6007212 DOI: 10.1093/database/bay055
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Workflow of the data integration process. Data are imported from two main sources: public resources and users. For public data, background information is collected first, and records that lack background information are discarded. Data from the different databases are then filtered by our self-designed program. Finally, the non-redundant data are uploaded to the integrated genome and metagenome database. For user data, users enter the complete background information and then upload the related files in the corresponding formats (FASTQ, FASTA and TXT).
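The public-data branch of this workflow can be illustrated with a minimal sketch. It is not the authors' self-designed program, which the source does not describe. The required background fields (`strain`, `species`, `source`) and the redundancy key (case-insensitive species plus strain) are assumptions chosen for illustration; only the three upload formats come from the caption.

```python
from pathlib import Path

# Hypothetical field names: the source does not list the required background fields.
REQUIRED_FIELDS = ("strain", "species", "source")
# Upload formats named in the Figure 1 caption.
ALLOWED_SUFFIXES = {".fastq", ".fasta", ".txt"}


def has_background(record: dict) -> bool:
    """Return True when every required background field is present and non-empty."""
    return all(str(record.get(field) or "").strip() for field in REQUIRED_FIELDS)


def integrate_public(records: list[dict]) -> list[dict]:
    """Discard records lacking background information, then drop redundant ones."""
    seen = set()
    kept = []
    for record in records:
        if not has_background(record):
            continue
        # Assumed redundancy key: case-insensitive (species, strain) pair.
        key = (
            str(record["species"]).strip().lower(),
            str(record["strain"]).strip().lower(),
        )
        if key in seen:
            continue
        seen.add(key)
        kept.append(record)
    return kept


def is_allowed_upload(filename: str) -> bool:
    """Check a user upload against the accepted formats by file extension."""
    return Path(filename).suffix.lower() in ALLOWED_SUFFIXES
```

With this key, the first record seen for a given species and strain is kept and later duplicates from other sources are dropped; a real integration step would likely also compare accession numbers or sequence checksums.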
Figure 2. (A) The text at the top right gives summary statistics for the genome database. The three rings, from inside to outside, show the number of records by kingdom, phylum and class in the genome database and correspond to the list at the bottom right. The display is dynamic: dragging the mouse over the rings or the list updates the other accordingly. (B) The text at the top left gives summary statistics for the metagenome database. The three rings, from inside to outside, show the number of records by sample classification, sample source and sample site in the metagenome database. As in (A), the display is dynamic and corresponds to the list at the bottom left.
Figure 3. (A) Sample search page. The search box at the top supports keyword-based search for quick, focused queries; clicking "advanced search" allows precise, customized queries. On the left is the classification list, organized by species relationship on the genome database page and by sample source on the metagenome database page. The middle of the page displays the top 10 hot words, each of which links directly to the corresponding data. (B) Sample search result list. From left to right, the genome result list shows the MPD ID number, strain, species, Latin name, authority status, data source and file list. (The metagenome result list shows the MPD ID number, project information, sample source, sequencing method, authority status, data source and file list.) (C) Sample of the online dynamic visual display for microbial genomes. It relies on the genome feature annotation file and the genomic data file, and lets the user browse gene location, length, etc., by simply clicking and dragging the mouse. (D) The local client for data transfer. The local client is provided as a zip file that can be downloaded from the tools page on the website. After unzipping the file, the user can log in to the client by double-clicking the ‘exe’ file contained in the folder.