Literature DB >> 32212641

AutoTuner: High Fidelity and Robust Parameter Selection for Metabolomics Data Processing.

Craig McLean^1,2, Elizabeth B Kujawinski¹.

Abstract

Untargeted metabolomics experiments provide a snapshot of cellular metabolism but remain challenging to interpret due to the computational complexity involved in data processing and analysis. Prior to any interpretation, raw data must be processed to remove noise and to align mass-spectral peaks across samples. This step requires selection of dataset-specific parameters, as erroneous parameters can result in noise inflation. While several algorithms exist to automate parameter selection, each depends on gradient descent optimization functions. In contrast, our new parameter optimization algorithm, AutoTuner, obtains parameter estimates from raw data in a single step as opposed to many iterations. Here, we tested the accuracy and the run-time of AutoTuner in comparison to isotopologue parameter optimization (IPO), the most commonly used parameter selection tool, and compared the resulting parameters' influence on the properties of feature tables after processing. We performed a Monte Carlo experiment to test the robustness of AutoTuner parameter selection and found that AutoTuner generated similar parameter estimates from random subsets of samples. We conclude that AutoTuner is a desirable alternative to existing tools, because it is scalable, highly robust, and very fast (∼100-1000× speed improvement from other algorithms going from days to minutes). AutoTuner is freely available as an R package through BioConductor.

Entities: Chemical Disease Gene Species

Mesh：

Year: 2020 PMID： 32212641 PMCID： PMC7310949 DOI： 10.1021/acs.analchem.9b04804

Source DB: PubMed Journal: Anal Chem ISSN： 0003-2700 Impact factor: 6.986

Metabolomics is the study of all the compounds present within a cell, organism, or tissue. Such investigations provide a holistic snapshot of the activity within a biological matrix and have led to a myriad of discoveries ranging from the elucidation of novel biochemical pathways to the recognition of molecular adaptive responses to stress and the clarification of mechanisms driving cell–cell interactions.[1−3] Advances in mass spectrometry fostered these discoveries, specifically improvements in instrument sensitivity, accuracy, and data collection capacity.[1,4,5] Parallel advances in computational tools have historically followed to fulfill the potential of analytical improvements.[6] Prior to data analysis, raw data from untargeted metabolomics experiments must be processed to generate a features table. Features are defined as peaks with unique mass to charge (m/z) and retention time values, with relative abundances determined by their height or area. Processing is critical to extract chemical signals from electrical noise and to correct for retention time drift across samples.[7] A variety of untargeted data processing methods exist,[8−11] including two commonly used tools: MZmine2[12] and XCMS.[13] Although these tools reliably extract true features from complex data, their performance depends on the selection of algorithmic parameters that capture the structure of the data, such as matrix effects and differences in analytical platforms.[14−16] No universal set of parameters exists for all datasets; hence, parameter optimization must occur prior to analysis to avoid noise inflation within the feature table.[17−19] Tuning parameters manually is prohibitively time-consuming due to the high number of possible numerical combinations. To overcome this challenge, several methods exist to identify optimal dataset-specific parameters.[20−22] These methods each use distinct optimization functions based on maximizing or minimizing a numerical value. Each approach iteratively runs XCMS peak-picking and retention time correction algorithms until they identify a set of parameters that optimizes a desired criterion. For example, isotopologue parameter optimization (IPO), the most commonly used parameter selection tool, scores groups of parameters by the number of features detected after XCMS that contain a naturally occurring 13C isotopologue. Many separate XCMS runs are required to find ideal parameters, sometimes taking weeks to complete.[20−22] Currently, these parameter selection algorithms depend on high performance computing resources. As users continue to adopt ultrahigh pressure liquid chromatography systems and rapid scanning mass spectrometers, the size and abundance of data from these platforms will preclude the use of unscalable parameter selection algorithms to users without access to high performance computing resources.[23,24] We designed a novel parameter optimization algorithm, AutoTuner, to ameliorate these challenges. The method performs statistical inference on raw data in a single step in order to make parameter estimates as opposed to iteratively checking estimates. Further, it complements recent tools focused on generating higher-confidence feature annotations.[25−28] AutoTuner is capable of selecting parameters for seven continuously valued parameters required for centWave peak selection algorithm used by both MZmine2 and XCMS, and it determines a key parameter for grouping in XCMS. AutoTuner is freely distributed through BioConductor as an R package.

Theory and Design of AutoTuner

Algorithm Overview

AutoTuner makes estimates for the following mass spectrometry peak-picking and grouping algorithms parameters: , , , , , , and . We chose to optimize these parameters because they have the greatest influence on the number and quality of features postprocessing and have the greatest number of possible values.[21,29] We chose to optimize centWave peak-picking parameters over other peak-picking methods, as centWave is the recommended method for processing high-resolution untargeted data, which is increasingly becoming the standard for untargeted metabolomics.[4] See Table for a description of the parameters and their matching arguments in XCMS. AutoTuner makes estimates in three steps ( Figure ). (1) Total ion chromatogram (TIC) peak detection: A user identifies peaks within each sample’s total ion chromatogram (TIC), which is the plot of the integrated ion intensities within the mass spectrometer over time. (2) Parameter estimation within extracted ion chromatograms (EICs) of each TIC peak: AutoTuner isolates predicted EICs within each identified TIC peak. An EIC is a plot of one or more selected m/z values in a series of mass spectra. AutoTuner applies statistical inference on all EIC peaks to estimate parameters in an unsupervised manner. (3) Dataset-wide parameter estimates: AutoTuner integrates all peak-specific estimates into a dataset-wide set.

Table 1

Parameters Estimated through the AutoTuner Algorithma

parameter	description	XCMS parameter name	functionality	application
Group difference	expected retention time deviation of an mz/rt feature between samples	bw	grouping	XCMS
ppm	parts per million (ppm) error threshold used to bin consecutive mass intensities across adjacent scans into a single peak	ppm	centWave (peak-picking)	XCMS and MZmine2
S/N Threshold	the minimum ratio between peak and average noise intensity required to retain a feature	snthresh	centWave (peak-picking)	XCMS and MZmine2
Scan count	minimum number of scans required to retain a peak	prefilter scan	centWave (peak-picking)	XCMS and MZmine2
Noise	numerical threshold used to filter out noise from true masses	noise	centWave (peak-picking)	XCMS and MZmine2
Prefilter intensity	minimum integrated intensity required retain a peak	prefilter intensity	centWave (peak-picking)	XCMS and MZmine2
Peak-width	the width of a chromatographically resolved peak	min/max peak-width	centWave (peak-picking)	XCMS and MZmine2

We chose to optimize these parameters due to their influence on the number and quality of features returned following XCMS data processing.[20,28]Table S4 gives more information on these parameters.

Figure 1

Schematic of the three stages of the AutoTuner algorithm. (A) Total ion chromatogram (TIC) peak detection requires user input and is focused on identifying peaks within each sample’s TIC. The user directly adjusts a signal processing sliding window analysis to identify peaks within the TIC. (B) Parameter estimation within extracted ion chromatograms (EICs) of each TIC peak iteratively looks at each peak to make parameter estimates from EICs. (C) Data set-wide parameter estimates aggregate results from the second stage to provide an ideal set of parameters for the entire dataset. Parameters estimated are in bold. The R package vignette at BioConductor provides an example on how to use the algorithm.

Total Ion Chromatogram (TIC) Peak Detection

To identify TIC peaks, AutoTuner first applies a sliding window analysis, which detects peaks by testing if an upcoming scan’s intensity is greater than an intensity threshold determined by the average and standard deviation of a fixed number of prior scans. To ensure the correct peak bounds are retained, AutoTuner generates a linear model from the first three and last three points bounding each TIC peak. If the model fails to calculate an R2 value greater than or equal to 0.8 or to reach a local R2 maximum, AutoTuner expands the ending bound by one scan and reruns the model until the model meets either criterion. The time difference of a TIC peak’s final bounds represents its width. AutoTuner groups all TIC peaks originating from distinct samples whose maxima co-occur within each other’s retention-time bounds. It then determines the time differences between intensity maxima of all pairs of grouped peaks. AutoTuner returns the largest time difference as the estimate for the parameter that is used in the grouping step following peak-picking. Because highly complex datasets may contain distinct sample-specific peaks occurring at similar retention times, AutoTuner may overestimate this parameter. Prioritizing the inclusion of experimental replicates within AutoTuner would limit this issue. The overestimation of does not affect downstream parameter estimation, as future estimates do not involve comparisons across samples and instead focus on properties of individual EICs. At this point, AutoTuner has only collected data to estimate the parameter.

Parameter Estimation within EICs of Each TIC Peak

AutoTuner estimates the remaining parameters (, , , , , and ) from raw data contained within each individual TIC peak. A central assumption in AutoTuner is that TIC peaks represent chromatographic regions enriched in chemical ions relative to electrical noise.[7]

Error ()

First, AutoTuner sorts all m/z values detected in mass spectra contained within the bounds of a TIC peak. AutoTuner bins m/z values if the difference in mass of two adjacent m/z values is below a user-provided threshold. AutoTuner stores unbinned m/z values as noise peaks. Because peaks of true features are made up of m/z values within adjacent scans (scan continuity criterion), AutoTuner sorts each bin by scan number to check that this criterion holds for the binned m/z values. In the case where two or more m/z values are retained from a single scan, only the m/z value with the lowest difference in mass to the previous scan’s mass is retained. If multiple m/z values occur within the first scan of the bin, the difference in mass is calculated for the next adjacent scan’s m/z value instead. AutoTuner stores m/z values within bins that fail the scan continuity criterion as noise peaks, similar to the noise removal step earlier. AutoTuner estimates the parts per million () error parameter from the remaining bins by distinguishing between bins formed by random associations of noise peaks and those of hypothesized true features. To do this, AutoTuner first calculates the of all m/z values within bins. AutoTuner then builds an empirical distribution of values using a Gaussian kernel density estimator (KDE) defined bywhereand x is the set of all observations, x is the ith observation, n is the number of observations, and h is a measure of smoothness for the empirical distribution. The function between and absolute error is not surjective, meaning two identical absolute mass error values can have distinct values. Thus, we hypothesize that the value of noise peaks should be scattered widely, while values of real features should be within a narrow range.[14] Hence, we expect that by using a user-provided mass difference threshold larger than an instrument-defined threshold, the KDE will have a long-tail and a high narrow peak representing the instrument-dependent of real features and a shorter smaller downstream peak representing the from erroneously binned m/z values (Figure S1). AutoTuner subsamples the empirical distribution of all values to speed downstream calculations when calculated values are abundant (>500). To do this, AutoTuner checks the similarity between the original distribution comprising all values and seven distributions comprising one-half of all values randomly sampled from the total. Seven was chosen arbitrarily. The distance between the original distribution and each subsampled distribution is calculated using the Kullback–Leibler divergence (KLD), a function that calculates the information theoretic gain required to describe both distributions. A KLD value of 0.5 represents an increase of one-half bit of information required to store the two distributions. If a KLD value of 0.5 or greater is not calculated across any comparison, AutoTuner replaces the original distribution with one consisting of half as many values subsampled randomly from the original and repeats the subsampling. AutoTuner then calculates an outlier score for each value to distinguish between error values derived from real features and those derived from random associations of noise using the following outlier score function:[30]where C represents the largest cluster of error values and x is the ith observation similar to (1). To identify this cluster, AutoTuner uses k-means clustering, a data partitioning technique used to separate a set of observations into k-many groups. Either the gap statistic or a user-provided variance-explained threshold is used to determine the appropriate number of clusters.[31] Using C ensures that the density of each calculated value is normalized by the density of the true error values (Figure S1). The estimate is calculated by the following:where x is any calculated value with outlier scores above 1 and sd(x) is the standard deviation of all x values. An outlier score value above 1 indicates that the density of that particular x is at least as great as the expected value of the density of all elements within C. The statistical properties of probability distributions inspired this heuristic, as the sum of a probability distribution’s mean and three times its standard deviation provides an upper bound containing 99.7% of the total distribution area.[30]

Signal-to-Noise Threshold

We calculate the maximum intensity of each bin as well as the mean and standard deviation of the intensity of all noise features occurring within two peak widths from the original bin to estimate the signal-to-noise (S/N) threshold, similar to Myers et al.[14] First, AutoTuner subtracts the maximum intensity of each bin (xbin,) from the mean intensity of adjacent noise (μnoise). AutoTuner retains the bin if this difference is greater than three times the standard deviation (σnoise) of the adjacent noise intensity values:AutoTuner calculates the from the smallest observed value of bin and noise intensity difference divided by the standard deviation of noise intensity across all bins passing the above threshold:

Remaining Parameters

AutoTuner sets the estimate as the minimum number of scans across all bins. AutoTuner estimates and parameters by first determining the minimum integrated bin and single scan intensities. Then, it returns 90% of the magnitude of these values as the estimate to ensure that no AutoTuner-detected bin is removed during actual peak-picking. The represents the lowest number of scans within any bin multiplied by the duty cycle of the instrument. To estimate the , AutoTuner expands bins at the boundaries of the TIC peak. The expansion continues until an adjacent scan does not contain a m/z value whose error against the mean m/z of the bin is below the estimated threshold. A correlation checks to ensure that adjacent m/z values are not coming from noise after a bin has been expanded by 3 scans. For this, AutoTuner requires an absolute Spearman correlation coefficient of 0.9 between scans and intensity values for expansion to continue. AutoTuner returns the across bins.

Data Set-Wide Parameter Estimates

AutoTuner uses the average of all and values weighed by the number of bins within each TIC peak to return dataset-wide estimates for these parameters. For dataset-wide values of , , , and , AutoTuner returns the minimum values from all bins detected. The maximum calculated parameter represents the dataset-wide parameter estimate. The average of each sample’s maximal peak-width represents the estimate.

Experimental Demonstration

Materials

We chose a suite of 85 metabolites that represent compounds expected in metabolomic experiments, including cofactors, amino acids, and secondary metabolites. Of these, 41 ionized exclusively in negative mode; 28 ionized exclusively in positive mode, and 16 ionized in both modes. See Table S1 for a complete list of standards. We prepared stock solutions of each metabolite standard in water or a 1:1 mix of methanol and water at 1000 mg mL–1, unless constrained by solubility. Some standards required the addition of ammonium hydroxide or formic acid for dissolution. We stored stock solutions in the dark at −20 °C. We created a standard metabolite mix (10 mg mL–1) from the stock solutions and diluted it with Milli-Q water to obtain four solutions with standard concentrations of 500 ng mL–1. We obtained standards at the highest grade available through Sigma-Aldrich (MO, USA) with the exception of dimethylsulfoniopropionate (DMSP), purchased from 21 Research Plus Inc. (NJ, USA).

Mass Spectrometry

We analyzed four replicates of the standard mixes with ultrahigh-performance liquid chromatography (UPLC; Accela Open Autosampler and Accela 1250 Pump (Thermo Scientific)), coupled via heated electrospray ionization (H-ESI) to an ultrahigh resolution tribrid mass spectrometer (Orbitrap Fusion Lumos (Thermo Scientific)). We performed chromatographic separation with a Waters Acquity HSS T3 column (2.1 × 100 mm, 1.8 μm) equipped with a Vanguard precolumn, both maintained at 40 °C. We used mobile phases of (A) 0.1% formic acid in water and (B) 0.1% formic acid in acetonitrile at a flow rate of 0.5 mL min–1 to elute the column. The gradient started at 1% B for 1 min, ramped to 15% B from 1 to 3 min, ramped to 50% from 3 to 6 min, ramped to 95% B from 6 to 9 min, held until 10 min, ramped to 1% from 10 to 10.2 min, and finally held at 1% B (total gradient time of 12 min). We made separate positive and negative ion mode autosampler injections of 5 μL. We set electrospray voltage to 3600 V (positive) and 2600 V (negative) and source gases to 55 (sheath) and 20 (auxiliary). We set the heated capillary temperature to 375 °C and the vaporizer temperature to 400 °C. We acquired full scan MS data in the Orbitrap analyzer (mass resolution of 120 000 fwhm at m/z 200), with an automatic gain control (AGC) target of 4 × 105, a maximum injection time of 50 s, and a scan range of 100–1000 m/z. We set the AGC target value for fragmentation spectra at 5 × 104, and the intensity threshold at 2 × 104. We collected all data in profile mode.

Validation Data

We used two published datasets to validate the AutoTuner’s performance on experimental data: (1) a bacterial culture experiment,[32] MetaboLights[33] identifier MTBLS157, and (2) a rat fecal microbiome, by direct contact with the authors (Table ).[34]

Table 2

Information on the Datasets Used to Test the AutoTuner’s Performancea

dataset	reference	access	mass spectrometer	liquid chromatography	sample number	ionization mode
standards	current project	current project	Orbitrap Fusion Lumos (Thermo)	ultrahigh performance liquid chromatography (Accela 2015 Pump, Thermo)	4	pos/neg
culture	(31)	MetaboLights MTBLS157	Hybrid Linear Ion Trap 7T Fourier Transform Ion Cyclotron Resonance (Thermo)	high performance liquid chromatography (Surveyor MS Pump Plus, Thermo)	45	pos/neg
community	(33)	contributing author	Time-of-Flight Tandem Mass Spectrometer (Xevo-G2, Waters)	ultrahigh performance liquid chromatography (Acquity, Waters)	90	pos/neg

The mass spectrometers and liquid chromatography systems herein are some of the most commonly used analytical platforms for untargeted metabolomics[4].

Data Processing and Quality Control

We converted all raw data files from their proprietary formats to mzML files using msconvert.[35] All computing of mzML files took place within an Ubuntu Xenial 16.04 Google Cloud instance with 8 CPUs and 10Gb of memory. During time comparisons, we used 8 and 1 CPU(s) to obtain IPO and AutoTuner data processing parameters, respectively. We used an m/z mass error threshold of 0.005 Da for AutoTuner, because this absolute error was sufficiently large enough to return a broad range of error values (in ppm) greater than those of the mass analyzers used to generate the validation data (See Table ). We used XCMS and centWave to generate feature tables for each dataset[13,36] and CAMERA for isotopologue and adduct detection.[37]Table S2 contains parameters used for processing. For the standards, we searched for the most abundant parent ion within the EICs (see Table S1). We confirmed the presence of a metabolite standard within feature tables if a feature had an intensity above 1 × 104, was within an exact mass error of 5 ppm of the parent ion, and had a retention time error of 5 s from the EIC peak. We identified 12C13C1 and 12C13C2 isotopologue peaks as features with an exact mass error of 5 ppm of the parent ion isotopologue masses (1.0033 for 13C1 and 2.0066 for 13C2). Additionally, we required that peaks matching m/z values of isotopologues also had a retention time error less than 5 s from the 12C13C0 peak. Prior to any identification, we confirmed that standards contained isotopologue peaks by visually inspecting raw data. For the culture experiment, the data was subjected to quality control as described previously.[38] Briefly, we removed features detected in process blanks, features detected within only one replicate, and features representing isotopologues and adducts. Additionally, we removed features with coefficient of variation values above 0.4 across six pooled samples. We defined overlapping features in AutoTuner- and IPO-parametrized feature tables to be those with ppm error below 5 and retention time differences less than 20 s. The culture experiment allowed a higher retention time correction because it relied on data collected with HPLC compared to the standards, which were analyzed with a UPLC system.

Statistical Analyses

We applied several distinct statistical methods to summarize the various pieces of data used to validate AutoTuner. We used R programming language to perform all analyses (CRAN R-Project). We used a Kolmogorov–Smirnov Test (KS-test) to compare empirical cumulative distribution functions. We used the hypergeometric test to compare MS2 enrichment of IPO- and AutoTuner-specific features against features observed in the intersect of the two datasets. In order to estimate the robustness of AutoTuner parameter estimation, we performed a Monte Carlo experiment by running AutoTuner on distinct subsets of the data. We first randomly selected 7 subsets of 11 samples to compare the variability across parameters. We used the coefficient of variation from estimates within each group as a measure of variability. We also performed linear regressions on these values to find trends between estimate variability and sample numbers used for estimates. We randomly selected 3 to 9 samples from each of these subsets 55 times. In total, there were 385 estimates for each group of 3–9 samples, resulting in a total of 2695 separate runs of AutoTuner per dataset. We performed a sensitivity analysis to determine the downstream data processing effect that different values on mzDiff, the only continuous valued centWave parameter not optimized by AutoTuner, had on the returned feature table. To accomplish this, we counted the number of unique features between pairs of feature tables generated with mzDiff parameters varying by a value of 0.001.

Results

AutoTuner Accuracy and Comparison to IPO

At this time, the only open source method for automated selection of peak-picking parameters for XCMS is isotopologue parameter optimization (IPO).[22] IPO uses a gradient descent algorithm that requires users to iteratively run centWave with different combinations of parameters until the set that maximizes a scoring function is identified. We used 5 distinct metrics to compare the accuracy, speed, and downstream data structure of IPO- and AutoTuner-derived parameters. The metrics include the accuracy, number of features, the peak areas, and shapes of EIC peaks only detected using parameters from one of the two methods and MS2 count. We searched for 85 known chemical standards (a total of 101 possible ions) within feature tables generated with IPO- and AutoTuner-derived parameters to test the influence of each parameter selection method on data processing accuracy (Figure , Tables and S1). We detected 82 and 100 standards within the feature table generated with IPO- and AutoTuner-derived parameters, respectively. Figure S2 provides an example of compounds that were only detected with AutoTuner and were absent when the IPO-derived parameters were used. These results were robust to the choice of intensity thresholds (Figure S3).

Figure 2

AutoTuner and IPO accuracy comparison. Percentages were determined from the number of detected standard peaks relative to the total possible set. M denotes 12C13C0 isotopologue; [M + 1] denotes 12C13C1 isotopologue; [M + 2] denotes 12C13C2 isotopologue. We normalized percentages by the total number of possible detectable peaks based on the detection of 12C standards (12C13C0). Additionally, we enumerated all features matching 12C13C1 and 12C13C2 isotopologues of standards to determine the influence of parameter values on the detection of lower intensity features (Figure ). We only considered these isotopologues if the 12C13C0 ion was present within the feature tables derived with method-specific parameters. We detected 80 out of 81 and 38 out of 64 possible 12C13C1 and 12C13C2 peaks, respectively, within the AutoTuner-derived feature table. We detected 46 out of 68 and 8 out of 59 possible 12C13C1 and 12C13C2 peaks, respectively, within the IPO-derived feature table. We first compared the number of features from culture data generated with parameters from each algorithm to understand the influence of parameter selection on downstream data properties (Figures A and S5A). Each feature table contained a distinct number of total features following processing and quality control (Table S3). In positive ion mode, AutoTuner-derived parameters detected fewer unique features (203) compared to 2606 unique features detected with IPO-derived parameters (Figure A), while sharing 1022 features between them. A similar situation was observed in negative ion mode where AutoTuner detected 540 unique features compared to 3420 unique features found with IPO-derived parameters, while sharing 904 features (Figure S5A).

Figure 3

Comparing the differences between positive ion mode data generated by AutoTuner and IPO on the culture dataset. (A) The overlap in the number of m/z-rt features generated by both methods. Features with an error of 5 ppm and retention time error of 20 s are placed in the intersect. (B) Comparison of the differences in structural properties for the maximum continuous wavelet transform coefficient (CWT) between peaks detected only within AutoTuner (orange) and IPO (green). Both curves are empirical cumulative distribution functions (CDF) of the calculated metric. (C) Three randomly selected EIC peaks that fall on distinct regions of the maximum CWT empirical cumulative distribution function to demonstrate how this metric influence peak shapes. The EIC shape reflects the maximal CWT rather than the parametrization method. The CDF curves were significantly different (KS-test, p < 10–4, n = 203). Results for positive mode data area CDF and negative data were similar to this data and are found in Figures S4 and S5, respectively. We then compared the structural differences between features exclusively detected using IPO- and AutoTuner-derived parameters. We created an empirical cumulative distribution function (CDF) to compare the distribution of peak areas (Figure S5A,B) and maximum observed continuous wavelet transform (CWT) coefficients (Figures B and S5C) of all EIC peaks belonging to features outside of the intersect. The maximum observed CWT coefficient increases with peak steepness and may provide a measure of a peak’s chromatographic resolution (Figure C). The CDF of each metric was significantly different in positive (KS-Test; area: p < 10–6; CWT: p < 10–4; n = 203) and negative (KS-Test; area: p < 10–14; CWT: p < 10–8; n = 540) ionization mode data. The application of the same test on unbalanced comparisons (e.g., negative ion mode: 3420 IPO- vs 540 AutoTuner-unique features) was more highly significant than using equivalent numbers of features obtained through subsampling. Next, we compared the abundance of features with MS/MS spectra within each unique feature table because features with MS/MS spectra can be compared to spectral databases and authenticated standards, thus enabling potential identification. In total, we observed more features with MS/MS spectra within the feature table generated with IPO-derived parameters compared to that generated with AutoTuner-derived parameters (positive: 448 vs 115; negative: 686 vs 121; both for IPO vs AutoTuner, respectively). However, this is due primarily to the greater number of features in the IPO-derived table. Indeed, relative to total features, IPO-derived features were less likely to have associated MS/MS spectra than features within the intersect of both datasets (hypergeometric test; negative ion mode: p < 10–10; positive ion mode: p < 10–10). A similar comparison revealed that MS/MS enrichment was not significantly different between AutoTuner-derived features and those within the intersect (Table S4). Finally, using all three of the test datasets, we compared the time required to run AutoTuner and IPO (Table ). After accounting for number of CPUs, AutoTuner ran hundreds to thousands of times faster.

Table 3

Run-Times for AutoTuner and IPO Required to Run 6 Common Samples Collected in Positive (+) and Negative (−) Ionization Modesa

algorithm	culture (−)	culture (+)	standards (−)	standards (+)	community (−)	community (+)
AutoTuner	2 min	9 min	2 min	3 min	25 min	26 min
IPO	7 h 23 min	28 h 40 min	31 h 56 min	28 h 5 min	38 h 4 min	21 h 27 min
ratio (Auto/IPO)	1479	1518	6970	4238	715	396
samples used	6	6	4	4	6	6

All system time measurements were done on an 8 CPUs and 10 Gb of memory Ubuntu Xenial 16.04 Google Cloud instance. IPO ran on 8 CPUs, while AutoTuner ran on 1 CPU. The ratio accounts for the total computing power used to run both algorithms.

Testing the Robustness of the AutoTuner Estimation

Figure shows the coefficient of variation and estimate values for each 11 sample subset for the parameter obtained from Monte Carlo analysis on the culture and community datasets. Figures S6–S13 show the complete set of results from the Monte Carlo analysis for all parameters. For all parameter estimates, the variability of estimation decreased linearly with the number of samples used under both ionization modes (Figures A and S6–S13). The rendered parameter estimates were consistent with expectations based on the mass analyzer used to generate the data (Table ). With the exception of the parameter estimate in the community dataset and the estimate in the negative culture data, all parameters had a coefficient of variation (CV) of less than or equal to 0.1 when using 9 samples to obtain the estimates (Figures S6, S8, S10, and S12).

Figure 4

Results from the Monte Carlo experiment for parameter in positive ion mode data. (A) The distribution of the coefficient of variation from parameters within the 11 sample group, while (B) shows the distribution of all estimates for ppm. Blue bars describe data collected by the qTOF instrument (community data) while yellow bars describe FT-ICR-MS data (culture data). See Figures S5–S12 for results on other parameter estimates in each ionization mode and dataset.

Discussion

AutoTuner is a robust, rapid, and high-fidelity estimator of untargeted mass spectrometry data processing parameters. Its unique design improves upon previous methods by providing a scalable framework to handle large datasets, reducing run-time, and generating high-accuracy parameters that retain known features. AutoTuner’s ease of use make it an ideal candidate to include within existing data processing pipelines.[39−42] AutoTuner’s high accuracy indicates that its parameter selection is based on true data features. One possible explanation for the lower accuracy of IPO is that the peak-width of the missing standards was below the parameter selected by IPO (Table S2 and Figure S2). AutoTuner parameter estimates were robust across all datasets and ionization modes. Some parameters like , , , , and reflect systematic properties inherent to the platform chromatography, mass analyzer, and/or sample matrix.[43] Other parameters like are more specific to each sample; hence, the increase in the total number of samples used to estimate parameters always strengthened their robustness. The low CV values for parameter estimates suggests that using a subset of samples to generate estimates returns a set representative for all samples. On the basis of our results, we recommend the use of 9 and 12 samples to generate estimates in culture and community datasets, respectively. For most of the parameters estimated here, 9 samples proved sufficient to obtain estimates with CV values less than 0.1. The 12-parameter recommendation originated from extrapolating the linear fits of these data to obtain 0.1 CV values for remaining parameters that failed this criterion (Figures S6, S8, S10, and S12). We were unable to check the robustness of the parameter estimate, as this parameter is estimated through a nonautomatable cross sample comparison during the TIC peak detection step of the algorithm. Although other algorithms return more parameter estimates than AutoTuner, the parameters calculated by AutoTuner represent continuous valued ones with the greatest possible number of choices. Performing a parameter sweeping optimization like previous approaches to estimate the remaining parameters after fixing the AutoTuner derived ones reduces the total combinations of available centWave parameters from a space of at least 24 × 58 possible choices of parameters to one of 40. This is because the centWave algorithm, used by both XCMS and MZmine2 data processing tools, requires tuning of 11 distinct parameters. Of these, 8 are continuously valued, meaning that they can be any real number. The remainder either are boolean values or can be one of a few discrete choices (Table S5). The reduction of the total number of combinations is achieved by optimizing 7 of the 8 continuous valued parameters. In regards to the last continuous parameter not optimized by AutoTuner, mzDiff, we performed a sensitivity analysis to show that distinct values had a minimal effect on the returned feature table (Table S6). Future contributions towards AutoTuner’s design can help the estimation of additional parameters not covered within its current design. AutoTuner’s low run-time indicates that the algorithm is scalable (Table ). As more and more data is generated due to increases in analytical throughput and dataset size, AutoTuner will remain a tractable option to generate estimates of metabolomics data processing parameters.[4,44] Because AutoTuner estimates parameters much faster than IPO and IPO was shown to perform at a faster rate than software preceding it, we surmise that AutoTuner is the fastest parameter selection algorithm available.[22] The evaluation of quality between culture dataset feature tables generated with IPO- and AutoTuner-derived parameters is impossible without performing a complete validation of all possible features. Such analyses are time-consuming, labor intensive, and beyond the scope of this manuscript. However, the measured properties of these datasets may provide some expectation for practitioners of metabolomics of how the data generated using each method may differ. When considering the unique features identified by each algorithm in the culture dataset, AutoTuner found fewer features in each case (Figures and S4). This may be due to the selection of different parameters between AutoTuner and IPO; the thresholds selected by IPO were higher under each ionization mode than those selected by AutoTuner (Table S2). AutoTuner’s lower estimates do not appear to be too stringent, as they are between 4 and 6 times greater than the instrument-recommended error threshold of 0.5 ppm and they are consistent with recommendations by the “centWave” developers.[36] The size of the processed data using AutoTuner-derived parameters was in line with previous work validating the metabolome of Escherichia coli after performing stringent isotope labeling experiments and quality control filtering.[26] AutoTuner feature selection does not appear to be biased toward higher intensity features, because the standard dataset processed with AutoTuner-derived parameters contained a high percentage of possible 13C isotopologues. The paucity of size-validated metabolome datasets precludes further evaluation of the feature number comparison. Within the AutoTuner-derived feature tables, those features unique to the AutoTuner-derived parameters were enriched in MS/MS relative to the unique IPO-derived features. We stress that features with MS/MS spectra cannot be assumed to be more or less important within a metabolomics dataset; however, the presence of these spectra enhances downstream identification efforts and may be desirable to some investigators. AutoTuner has several avenues for possible improvement. First, AutoTuner could be parallelized to reduce computation time by a factor of the total number of CPUs used. Second, additional algorithms may be implemented to optimize parameters not covered here. One drawback from the speed gained in the computation through its “divide-and-conquer” approach comes at a loss of comparing EIC peaks across samples to estimate retention time correction algorithms. This challenge leaves room for the implementation of additional algorithms. Third, the replacement of the sliding window analysis with a more sophisticated and sensitive peak detection approach may eliminate the need for user input during the first portion of AutoTuner. However, this automation comes at the cost of manual inspection of the raw data. We support manual inspection of the raw data, because it provides a quality control check for the data generation steps leading up to the analysis. AutoTuner provides several built-in plotting functions to facilitate this evaluation step. Despite these minor caveats, AutoTuner is a viable and time-saving option to determine proper data processing parameters for untargeted metabolomics data.

34 in total

1. XCMS Online: a web-based platform to process untargeted metabolomic data.

Authors: Ralf Tautenhahn; Gary J Patti; Duane Rinehart; Gary Siuzdak
Journal: Anal Chem Date: 2012-05-10 Impact factor: 6.986

2. Evaluation of peak picking quality in LC-MS metabolomics data.

Authors: Leonid Brodsky; Arieh Moussaieff; Nir Shahaf; Asaph Aharoni; Ilana Rogachev
Journal: Anal Chem Date: 2010-10-26 Impact factor: 6.986

3. MetaboLights: An Open-Access Database Repository for Metabolomics Data.

Authors: Namrata S Kale; Kenneth Haug; Pablo Conesa; Kalaivani Jayseelan; Pablo Moreno; Philippe Rocca-Serra; Venkata Chandrasekhar Nainala; Rachel A Spicer; Mark Williams; Xuefei Li; Reza M Salek; Julian L Griffin; Christoph Steinbeck
Journal: Curr Protoc Bioinformatics Date: 2016-03-24

4. OpenMS: a flexible open-source software platform for mass spectrometry data analysis.

Authors: Hannes L Röst; Timo Sachsenberg; Stephan Aiche; Chris Bielow; Hendrik Weisser; Fabian Aicheler; Sandro Andreotti; Hans-Christian Ehrlich; Petra Gutenbrunner; Erhan Kenar; Xiao Liang; Sven Nahnsen; Lars Nilse; Julianus Pfeuffer; George Rosenberger; Marc Rurik; Uwe Schmitt; Johannes Veit; Mathias Walzer; David Wojnar; Witold E Wolski; Oliver Schilling; Jyoti S Choudhary; Lars Malmström; Ruedi Aebersold; Knut Reinert; Oliver Kohlbacher
Journal: Nat Methods Date: 2016-08-30 Impact factor: 28.547

5. Systems-Level Annotation of a Metabolomics Data Set Reduces 25 000 Features to Fewer than 1000 Unique Metabolites.

Authors: Nathaniel G Mahieu; Gary J Patti
Journal: Anal Chem Date: 2017-09-15 Impact factor: 6.986

6. CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets.

Authors: Carsten Kuhl; Ralf Tautenhahn; Christoph Böttcher; Tony R Larson; Steffen Neumann
Journal: Anal Chem Date: 2011-12-12 Impact factor: 6.986

Review 7. Annotation: A Computational Solution for Streamlining Metabolomics Analysis.

Authors: Xavier Domingo-Almenara; J Rafael Montenegro-Burke; H Paul Benton; Gary Siuzdak
Journal: Anal Chem Date: 2017-11-03 Impact factor: 6.986

8. Autonomous METLIN-Guided In-source Fragment Annotation for Untargeted Metabolomics.

Authors: Xavier Domingo-Almenara; J Rafael Montenegro-Burke; Carlos Guijas; Erica L-W Majumder; H Paul Benton; Gary Siuzdak
Journal: Anal Chem Date: 2019-02-11 Impact factor: 6.986

9. Workflow4Metabolomics: a collaborative research infrastructure for computational metabolomics.

Authors: Franck Giacomoni; Gildas Le Corguillé; Misharl Monsoor; Marion Landi; Pierre Pericard; Mélanie Pétéra; Christophe Duperier; Marie Tremblay-Franco; Jean-François Martin; Daniel Jacob; Sophie Goulitquer; Etienne A Thévenot; Christophe Caron
Journal: Bioinformatics Date: 2014-12-19 Impact factor: 6.937

10. Highly sensitive feature detection for high resolution LC/MS.

Authors: Ralf Tautenhahn; Christoph Böttcher; Steffen Neumann
Journal: BMC Bioinformatics Date: 2008-11-28 Impact factor: 3.169

10 in total

Review 1. Using MetaboAnalyst 5.0 for LC-HRMS spectra processing, multi-omics integration and covariate adjustment of global metabolomics data.

Authors: Zhiqiang Pang; Guangyan Zhou; Jessica Ewald; Le Chang; Orcun Hacariz; Niladri Basu; Jianguo Xia
Journal: Nat Protoc Date: 2022-06-17 Impact factor: 17.021

2. Power of mzRAPP-Based Performance Assessments in MS1-Based Nontargeted Feature Detection.

Authors: Yasin El Abiead; Maximilian Milford; Harald Schoeny; Mate Rusz; Reza M Salek; Gunda Koellensperger
Journal: Anal Chem Date: 2022-06-07 Impact factor: 8.008

Review 3. New software tools, databases, and resources in metabolomics: updates from 2020.

Authors: Biswapriya B Misra
Journal: Metabolomics Date: 2021-05-11 Impact factor: 4.290

Review 4. Optimization of metabolomic data processing using NOREVA.

Authors: Jianbo Fu; Ying Zhang; Yunxia Wang; Hongning Zhang; Jin Liu; Jing Tang; Qingxia Yang; Huaicheng Sun; Wenqi Qiu; Yinghui Ma; Zhaorong Li; Mingyue Zheng; Feng Zhu
Journal: Nat Protoc Date: 2021-12-24 Impact factor: 13.491

Review 5. Innovative Application of Metabolomics on Bioactive Ingredients of Foods.

Authors: Sumei Hu; Caiyu Liu; Xinqi Liu
Journal: Foods Date: 2022-09-23

6. MetaboAnalystR 3.0: Toward an Optimized Workflow for Global Metabolomics.

Authors: Zhiqiang Pang; Jasmine Chong; Shuzhao Li; Jianguo Xia
Journal: Metabolites Date: 2020-05-07

7. Recurrent Topics in Mass Spectrometry-Based Metabolomics and Lipidomics-Standardization, Coverage, and Throughput.

Authors: Evelyn Rampler; Yasin El Abiead; Harald Schoeny; Mate Rusz; Felina Hildebrand; Veronika Fitz; Gunda Koellensperger
Journal: Anal Chem Date: 2020-11-28 Impact factor: 6.986

8. Feature-based molecular networking in the GNPS analysis environment.

Authors: Louis-Félix Nothias; Daniel Petras; Robin Schmid; Kai Dührkop; Johannes Rainer; Abinesh Sarvepalli; Ivan Protsyuk; Madeleine Ernst; Hiroshi Tsugawa; Markus Fleischauer; Fabian Aicheler; Alexander A Aksenov; Oliver Alka; Pierre-Marie Allard; Aiko Barsch; Xavier Cachet; Andres Mauricio Caraballo-Rodriguez; Ricardo R Da Silva; Tam Dang; Neha Garg; Julia M Gauglitz; Alexey Gurevich; Giorgis Isaac; Alan K Jarmusch; Zdeněk Kameník; Kyo Bin Kang; Nikolas Kessler; Irina Koester; Ansgar Korf; Audrey Le Gouellec; Marcus Ludwig; Christian Martin H; Laura-Isobel McCall; Jonathan McSayles; Sven W Meyer; Hosein Mohimani; Mustafa Morsy; Oriane Moyne; Steffen Neumann; Heiko Neuweger; Ngoc Hung Nguyen; Melissa Nothias-Esposito; Julien Paolini; Vanessa V Phelan; Tomáš Pluskal; Robert A Quinn; Simon Rogers; Bindesh Shrestha; Anupriya Tripathi; Justin J J van der Hooft; Fernando Vargas; Kelly C Weldon; Michael Witting; Heejung Yang; Zheng Zhang; Florian Zubeil; Oliver Kohlbacher; Sebastian Böcker; Theodore Alexandrov; Nuno Bandeira; Mingxun Wang; Pieter C Dorrestein
Journal: Nat Methods Date: 2020-08-24 Impact factor: 28.547

9. Adduct annotation in liquid chromatography/high-resolution mass spectrometry to enhance compound identification.

Authors: Thomas Stricker; Ron Bonner; Frédérique Lisacek; Gérard Hopfgartner
Journal: Anal Bioanal Chem Date: 2020-10-29 Impact factor: 4.142

10. Data Processing Thresholds for Abundance and Sparsity and Missed Biological Insights in an Untargeted Chemical Analysis of Blood Specimens for Exposomics.

Authors: Dinesh Kumar Barupal; Sadjad Fakouri Baygi; Robert O Wright; Manish Arora
Journal: Front Public Health Date: 2021-06-10

10 in total