M Kösters (1), J Leufken (1,2), S Schulze (1), K Sugimoto (1), J Klein (3), R P Zahedi (4,5), M Hippler (1), S A Leidel (2), C Fufezan (1,6).
1. Institute of Plant Biology and Biotechnology, WWU Münster, Münster, Germany.
2. Max Planck Institute for Molecular Biomedicine, Münster, Germany.
3. Bioinformatics Program, Boston University, One Silber Way, Boston, MA, USA.
4. Gerald Bronfman Department of Oncology, Jewish General Hospital, McGill University, 5100 de Maisonneuve Boulevard West, Suite 720, Montreal, Quebec, Canada.
5. Segal Cancer Proteomics Centre, Lady Davis Institute, Jewish General Hospital, McGill University, 3755 Côte-Sainte-Catherine Road, Montreal, Quebec, Canada.
6. Cellzome A GSK Company, Heidelberg, Germany.
Abstract
Motivation: In the new release of pymzML (v2.0), we have optimized the speed of this established tool for mass spectrometry data analysis to keep pace with the increasing amounts of data produced in mass spectrometry. To this end, we integrated faster libraries for numerical calculations, improved the data retrieval algorithms and optimized the source code. Importantly, to adapt to rapidly growing file sizes, we developed a generalizable compression scheme for very fast random access and applied this concept to mzML files to retrieve spectral data.
Results: pymzML performs on par with established C programs in terms of processing time, while offering the versatility of a scripting language and adding unprecedentedly fast random access to compressed files. Additionally, we designed our compression scheme to be general enough that it can be applied in any field where fast random access to large data blocks in compressed files is desired.
Availability and implementation: pymzML is freely available at https://github.com/pymzML/pymzML under the GPL license. pymzML requires Python 3.4+ and, optionally, numpy. Documentation is available at http://pymzml.readthedocs.io.
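The general idea behind random access to compressed data can be illustrated with a minimal sketch. It does not reproduce pymzML's actual file format or API, which the abstract does not describe: it simply assumes that each data block (for example, one spectrum) is compressed independently and that an index records where each compressed block starts and how long it is. The names `write_indexed` and `read_block`, and the placeholder spectrum strings, are hypothetical and chosen only for illustration.

```python
import io
import zlib


def write_indexed(blocks, stream):
    """Compress every block on its own and record its position.

    Returns a dict mapping each key to (offset, compressed_length).
    Hypothetical helper; not part of pymzML.
    """
    index = {}
    for key, payload in blocks.items():
        compressed = zlib.compress(payload)
        index[key] = (stream.tell(), len(compressed))
        stream.write(compressed)
    return index


def read_block(stream, index, key):
    """Seek straight to one block and decompress only that block."""
    offset, length = index[key]
    stream.seek(offset)
    return zlib.decompress(stream.read(length))


# Placeholder payloads standing in for spectrum elements (assumed data).
spectra = {
    1: b"<spectrum id='1'>first</spectrum>",
    2: b"<spectrum id='2'>second</spectrum>",
    3: b"<spectrum id='3'>third</spectrum>",
}

buffer = io.BytesIO()  # stands in for a file on disk
index = write_indexed(spectra, buffer)

# Retrieve the third block without decompressing the first two.
assert read_block(buffer, index, 3) == spectra[3]
```

Because every block is a self-contained compressed stream, a lookup costs one seek plus the decompression of a single block, regardless of the file's total size. The trade-off in this sketch is a somewhat lower compression ratio than compressing the whole file as one stream.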