Daniyar Karabayev, Ulykbek Kairov, Askhat Molkenov, Kaiyrgali Yerulanuly, Ilyas Kabimoldayev, Asset Daniyarov, Aigul Sharip, Ainur Seisenova, Zhaxybay Zhumadilov.
Abstract
BACKGROUND: High-throughput sequencing platforms generate massive, high-dimensional genomic datasets that are available for analysis. Modern, user-friendly bioinformatics tools for the analysis and interpretation of genomics data have become essential for working with sequencing data. Different standard data types and file formats have been developed to store and analyze sequence and genomics data. Variant Call Format (VCF) is the most widespread genomics file type and standard format containing genomic information and variants of sequenced samples.
Keywords: Bioinformatics; Genome analysis; Genomics; Genomics data mining; NGS data analysis; VCF; Variant call format
Year: 2021 PMID: 33987016 PMCID: PMC8101456 DOI: 10.7717/peerj.11333
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Main window of re-Searcher GUI.
Figure 2. Data processing workflow.
Figure 3. Genotype conversion example.
(A) The numeric genotype of biallelic and multiallelic variants before conversion, and (B) letter genotype of the same variants after conversion.
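
The conversion shown in Figure 3 can be sketched in a few lines of Python. This is only an illustration, assuming the standard VCF convention that allele index 0 refers to REF and indices 1 and above refer to the comma-separated ALT alleles. The function name `numeric_to_letter_gt` is hypothetical and is not taken from re-Searcher's source code.

```python
def numeric_to_letter_gt(ref, alt, gt):
    """Convert a numeric VCF genotype such as '0/1' to letters such as 'A/G'.

    Hypothetical helper for illustration; not re-Searcher's implementation.
    """
    # Allele index 0 is REF; indices 1..n are the comma-separated ALT alleles.
    alleles = [ref] + alt.split(",")
    # Keep the original separator: '/' (unphased) or '|' (phased).
    sep = "|" if "|" in gt else "/"
    letters = []
    for idx in gt.split(sep):
        # '.' marks a missing allele call and is passed through unchanged.
        letters.append("." if idx == "." else alleles[int(idx)])
    return sep.join(letters)


# Biallelic and multiallelic examples (values chosen for illustration only).
print(numeric_to_letter_gt("A", "G", "0/1"))
print(numeric_to_letter_gt("A", "G,T", "1|2"))
```

Building the allele list once per variant means biallelic and multiallelic sites go through the same lookup, so multiallelic sites need no special handling.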
Comparison of re-Searcher's features with similar tools.
| Categories | Features | re-Searcher | VIVA | VCFtools | GEMINI | BrowseVCF | VCF.Filter | VCF-Miner |
|---|---|---|---|---|---|---|---|---|
| Technical Aspects | Operating system compatibility | Windows, MacOS, Linux | Windows, MacOS, Linux | Windows, MacOS, Linux | Windows, MacOS, Linux | Windows, MacOS, Linux | Windows, MacOS, Linux | Windows |
| | Language | Python | Julia | C++, Perl | Python | Python, JavaScript, CSS, HTML5 | Java | Java |
| | Interface | GUI, Web Browser, CLI | CLI, Jupyter Notebook | CLI | Web Browser, CLI | GUI, CLI | GUI | GUI |
| | Works offline | | | | | | | |
| | Portable launcher | | | | | | | |
| Functionality | Search by keyword | | | | | | | |
| | Sample selection | | | | | | | |
| | Genotype format conversion | | | | | | | |
| | Visualization | | | | | | | |
| | Export filtered VCF file | | | | | | | |
Figure 4. Web interface of re-Searcher, available via browser at https://nla-lbsb.nu.edu.kz.
re-Searcher multi-platform run time comparison.
| OS | VCF file size (Gb) | Execution time (sec): Extract header | Execution time (sec): Keyword extraction | Execution time (sec): Sample ID extraction | Execution time (sec): GT |
|---|---|---|---|---|---|
| Linux | 0.081 | 2.227 | 9.995 | 15.243 | 38.375 |
| | 0.814 | 2.497 | 38.106 | 21.334 | 117.186 |
| | 1.320 | 2.462 | 67.260 | 61.159 | 206.868 |
| | 1.980 | 2.115 | 75.331 | 104.167 | 330.527 |
| | 7.950 | 6.145 | 366.137 | 200.347 | 1117.168 |
| Windows | 0.081 | 14.865 | 22.482 | 4.785 | 49.642 |
| | 0.814 | 18.898 | 48.820 | 21.398 | 139.641 |
| | 1.320 | 21.054 | 44.669 | 59.329 | 238.192 |
| | 1.980 | 10.919 | 53.958 | 90.446 | 339.275 |
| | 7.950 | 16.308 | 502.996 | 309.177 | 1320.197 |
| MacOS | 0.081 | 9.627 | 9.297 | 10.423 | 16.298 |
| | 0.814 | 5.262 | 20.916 | 19.705 | 116.82 |
| | 1.320 | 6.286 | 35.433 | 53.247 | 186.181 |
| | 1.980 | 3.231 | 37.923 | 119.136 | 286.544 |
| | 7.950 | 5.457 | 148.612 | 254.128 | 1130.225 |
Notes.
OS, operating system; GT, genotype; sec, seconds; Gb, gigabyte.