Georges P Schmartz1, Anna Hartung1, Pascal Hirsch1,2, Fabian Kern1, Tobias Fehlmann1, Rolf Müller2,3, Andreas Keller1,2.
Abstract
Plasmids are known to contain genes encoding virulence factors and antibiotic resistance mechanisms. Their relevance in metagenomic data processing is steadily growing. However, with the increasing popularity and scale of metagenomics experiments, the number of reported plasmids is rapidly growing as well, amassing a considerable number of false positives due to undetected misassemblies. Here, our previously published database PLSDB provides a reliable resource for researchers to quickly compare their sequences against selected and annotated previous findings. Within two years, the size of this resource has more than doubled from the initial 13,789 to now 34,513 entries over the course of eight regular data updates. For this update, we aggregated community feedback for major changes to the database featuring new analysis functionality as well as performance, quality, and accessibility improvements. New filtering steps, annotations, and preprocessing of existing records improve the quality of the provided data. Additionally, new features implemented in the web server ease user interaction and allow for a deeper understanding of custom uploaded sequences by visualizing similarity information. Lastly, an application programming interface was implemented along with a Python library to allow remote database queries in automated workflows. The latest release of PLSDB is freely accessible under https://www.ccb.uni-saarland.de/plsdb.
Year: 2022 PMID: 34850116 PMCID: PMC8728149 DOI: 10.1093/nar/gkab1111
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. New data of PLSDB: (A) Growth of the PLSDB data collection over time. (B) Taxonomic tree capturing the main composition of PLSDB in terms of quantity across several taxonomic ranks. Node size indicates frequency in the current database. Color fade represents relative growth compared to the first release of PLSDB. (C) Yearly growth of annotation data per source collection. Fold change is always computed with respect to the first release. (D) Manual validation of automatic preprocessing results. For each information type, annotations are compared before and after preprocessing. To generate the heatmap, all unique lowercase representatives of descriptions were extracted from the current version of PLSDB. Entries were then manually evaluated. For the preprocessed description, the comparison was drawn to the respective ontologies.
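Two computations named in the caption, fold change relative to the first release and the extraction of unique lowercase representatives, can be illustrated with a minimal sketch. This is not the authors' code: the function names and the example strings are hypothetical, trimming surrounding whitespace is an added assumption, and the fold change is applied to the total entry counts from the abstract rather than to the per-source annotation counts shown in panel C.

```python
def fold_change(current, first_release):
    """Fold change of a count relative to its value in the first release."""
    return current / first_release


def unique_lowercase(descriptions):
    """Return the sorted set of lowercase forms of free-text descriptions.

    Trimming whitespace and skipping empty strings are assumptions made
    for this sketch; the caption only mentions lowercasing.
    """
    return sorted({d.strip().lower() for d in descriptions if d and d.strip()})


# Total entry counts reported in the abstract (first release vs. this update).
growth = fold_change(34513, 13789)
print(f"fold change in entries: {growth:.2f}")

# Hypothetical annotation strings, not taken from PLSDB.
examples = ["Homo sapiens", "homo sapiens", "HOMO SAPIENS ", "Soil", "soil"]
print(unique_lowercase(examples))
```

Collapsing case variants first keeps the number of entries that need manual evaluation down to one per distinct description.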