
Joint Bounding of Peaks Across Samples Improves Differential Analysis in Mass Spectrometry-Based Metabolomics.

Leslie Myint¹, Andre Kleensang², Liang Zhao², Thomas Hartung²,³, Kasper D Hansen¹,⁴.

Abstract

As mass spectrometry-based metabolomics becomes more widely used in biomedical research, it is important to revisit existing data analysis paradigms. Existing data preprocessing efforts have largely focused on methods which start by extracting features separately from each sample, followed by a subsequent attempt to group features across samples to facilitate comparisons. We show that this preprocessing approach leads to unnecessary variability in peak quantifications that adversely impacts downstream analysis. We present a new method, bakedpi, for the preprocessing of both centroid and profile mode metabolomics data that relies on an intensity-weighted bivariate kernel density estimation on a pooling of all samples to detect peaks. This new method reduces this unnecessary quantification variability and increases power in downstream differential analysis.

Substances：
Androgens
Resveratrol

Year: 2017 PMID: 28221771 PMCID: PMC5362739 DOI: 10.1021/acs.analchem.6b04719

Source DB: PubMed Journal: Anal Chem ISSN: 0003-2700 Impact factor: 6.986

As mass spectrometry-based metabolomics becomes a more mature and popular means of scientific investigation,[1−3] it is important to revisit existing data analysis paradigms. Existing approaches to preprocessing metabolomics data focus on a two-step approach which starts by extracting features (peaks) separately from each sample, followed by a subsequent attempt to group features across samples to facilitate comparisons.[4] In particular, there has been considerable attention in the literature on individual stages of preprocessing, including peak detection[5−14] and alignment.[15−18] Additional work has been done on specific issues with downstream differential analysis such as missing information or dependence structures.[19−21] Single-sample processing methods tend to focus on reducing bias. The bias-variance trade-off[22] shows that the overall performance of a method also depends on its noise, and experience from gene expression studies suggests that noise can be removed by processing samples jointly.

In this work, we investigate the consequences of traditional sample-specific preprocessing on the quality of differential analysis. We show that the retention time (RT) bounds that arise from preprocessing samples individually cause unnecessary variability in peak quantifications (based on integrated peak area), which leads to under-powered differential analysis. We propose a relative quantification method, called bakedpi, which addresses this shortcoming by jointly detecting and bounding peaks in the two-dimensional m/z-RT space, across all samples simultaneously. The backbone of our method is an intensity-weighted bivariate kernel density estimation that is computed on a pooling of all samples. We show that this approach reduces unnecessary quantification variability and increases power in downstream differential analysis. Our method is open source and freely available as part of the yamss package through the Bioconductor project under Artistic License 2.0.

Results and Discussion

Excess Variability with Sample-Specific Processing

To demonstrate issues with sample-specific detection and bounding of peaks, we consider the widely used software packages XCMS[23] and MZmine2.[24] Output for one peak from a QTOF data set with two sample groups is shown in Figure 1 (additional examples from other data sets in Supplementary Figures S1 and S2). The shape, width, and location of this peak do not appear to vary across samples. Despite this, the XCMS and MZmine2 RT bounds for this peak, indicated by blue and purple rectangles respectively, are highly heterogeneous between samples (Figure 1c). To a first approximation, the retention time (RT) bounds can be grouped into narrow and wide bounds; this grouping is not associated with the two sample groups (light and dark rectangles). As a consequence, the integrated peak area is completely determined by whether the RT bounds are narrow or wide (Figure 1d,e), and this leads to high variability in the peak quantifications (Figure 1f). If, instead, we use the same RT bound across all samples (Figure 1c, orange rectangle), we substantially reduce the between-sample variability in the peak quantifications (Figure 1f). Excess variability results in loss of power in a differential analysis.
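The effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the peak is an idealized Gaussian, and the peak heights, widths, and the narrow and wide RT bounds are hypothetical values chosen for illustration, not taken from any of the data sets.

```python
import math
import statistics

def eic(t, height, center=100.0, width=3.0):
    """Idealized (Gaussian) extracted ion chromatogram intensity at RT t."""
    return height * math.exp(-0.5 * ((t - center) / width) ** 2)

def peak_area(height, lo, hi, step=0.1):
    """Trapezoidal integration of the EIC between RT bounds lo and hi."""
    n = int(round((hi - lo) / step))
    ts = [lo + i * step for i in range(n + 1)]
    ys = [eic(t, height) for t in ts]
    return sum((ys[i] + ys[i + 1]) / 2.0 * step for i in range(n))

# Hypothetical samples: nearly identical peaks, but heterogeneous RT bounds.
heights = [1000, 1050, 980, 1020, 990, 1010, 1040, 970]
narrow, wide = (98.0, 102.0), (91.0, 109.0)
sample_bounds = [narrow, wide, narrow, wide, wide, narrow, wide, narrow]

# Log-abundances with sample-specific bounds versus one shared bound.
specific = [math.log2(peak_area(h, lo, hi))
            for h, (lo, hi) in zip(heights, sample_bounds)]
shared = [math.log2(peak_area(h, *wide)) for h in heights]

sd_specific = statistics.stdev(specific)
sd_shared = statistics.stdev(shared)
print(f"SD with sample-specific bounds: {sd_specific:.3f}")
print(f"SD with shared bounds:          {sd_shared:.3f}")
```

In this toy setting the spread of the log-abundances is dominated by the choice of bounds rather than by the peaks themselves, which is the pattern the figure panels describe.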

Figure 1

Problems with sample-specific processing in XCMS and MZmine2. Peak detection and bounding for a single peak in the MTBLS2_rep1 data set. (a) The m/z-RT space surrounding this peak for a single sample; color is used to depict intensity (red is high). (b) Overlaid extracted ion chromatograms from all 8 samples in the experiment. Different colors denote different samples. (c) The peak bounds for all samples for XCMS (blue), MZmine2 (purple), and bakedpi (orange; all samples have the same bounds). This experiment compares two groups of samples indicated with different color shades. (d) XCMS peak quantification vs peak width. (e) Like part d but for MZmine2. (f) Distribution of peak quantifications, based on the peak bounds in part c. Substantial heterogeneity in the sample-specific bounds leads to excess variability in the quantifications; this is addressed by using the same RT bound for all samples.

Joint Sample Processing with bakedpi

To address the problem of excess variability, we propose a method which jointly detects and bounds peaks across all samples in an experiment (see Methods); an important feature of our method is the use of homogeneous RT bounds across all samples. We pool the data from all samples into a single metasample, on which we detect and bound peaks (Figure 2a,b). To do this, we use intensity-weighted bivariate kernel density estimation in the two-dimensional m/z-RT space. By using the intensities as weights, we differentiate between groups of detected m/z values (data points) with high and low intensities. The output is a smooth density in the m/z-RT space, where peaks in the density correspond to clusters of high-intensity points (Figure 2c). To detect and bound peaks, we slice the density using a single global threshold and form a set of contiguous regions based on the density slices. By performing this procedure on a single metasample, we ensure the same peak bounds across all samples. Like XCMS and MZmine2, we quantify the peaks by integrating the extracted ion chromatogram (EIC) for each sample across the peak’s RT bounds. We can optionally perform RT alignment prior to density estimation. Our method has 3 parameters: 2 of these parameters control the bandwidth in the m/z and RT domains and are easy to set based on the resolution of the instrument. The last parameter, the only significant tuning parameter, is the global density threshold. We call our method bakedpi for bivariate approximate kernel density estimation for peak identification.
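As a rough illustration of the pooling and weighting idea, the sketch below evaluates an intensity-weighted bivariate Gaussian kernel density at two locations of a tiny pooled metasample. It is not the yamss implementation: the function names, the toy data points, and the choice to normalize the weights so they sum to one are assumptions made for this example.

```python
import math

def gauss(u):
    """Univariate standard Gaussian density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def weighted_density(m, t, points, h_mz, h_rt):
    """Intensity-weighted bivariate kernel density estimate at (m, t).

    points: iterable of (mz, scan, intensity) pooled over all samples.
    Uses a product of two Gaussian kernels; weights are normalized to
    sum to 1 (an assumption of this sketch).
    """
    total = sum(inten for _, _, inten in points)
    acc = 0.0
    for mz, scan, inten in points:
        acc += inten * gauss((m - mz) / h_mz) * gauss((t - scan) / h_rt)
    return acc / (total * h_mz * h_rt)

# Two hypothetical samples; the metasample is simply their concatenation.
sample_a = [(200.001, 50, 900.0), (200.002, 51, 1200.0), (350.000, 10, 15.0)]
sample_b = [(200.000, 50, 1000.0), (200.003, 52, 800.0), (120.500, 80, 20.0)]
metasample = sample_a + sample_b

# A cluster of high-intensity points versus an isolated low-intensity point.
on_peak = weighted_density(200.0015, 51, metasample, h_mz=0.005, h_rt=10)
off_peak = weighted_density(350.0, 10, metasample, h_mz=0.005, h_rt=10)
print(on_peak, off_peak)
```

Because the intensities act as weights, the cluster near m/z 200 produces a far larger density value than the isolated low-intensity point, even though the density is evaluated exactly at that point's location.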

Figure 2

Weighted bivariate kernel density estimation. We depict a selected rectangle in m/z-RT space for (a) one sample and (b) the pooled metasample. m/z values with higher intensity are shown in red, lower with blue. (c) The weighted bivariate density estimate.

Joint Sample Processing Reduces Excess Variability

We applied bakedpi to 10 different data sets from 7 different experiments. Features of these data sets are summarized in Table 1. All data sets were subset (if necessary) to only contain two sample groups, to keep the experimental design simple and constant. For the Orbitrap data set (MTBLS126) we expect little to no differences between the sample groups, based on the design of the experiment.[25] We ran XCMS, MZmine2, and bakedpi on the 10 data sets. XCMS parameters were optimized using the IPO package available on Bioconductor[26] using recommended starting values for most data sets (Methods). MZmine2 parameters were set based on optimized XCMS parameters where possible (Methods). When running bakedpi, we use the higher of a fixed quantile cutoff and a data-driven cutoff to set the global tuning parameter (Methods).

Table 1

Characteristics of Evaluation Datasets (a)

name (source)	MS instrument	column	no. samples (group 1, 2)
ASD_hirisk (C)	QTOF	HPLC-HILIC	20, 20
timecourse_4 h (C)	QTOF	HPLC-HILIC	6, 6
timecourse_24 h (C)	QTOF	HPLC-HILIC	6, 6
MTBLS2_rep1 (M)	QTOF	UPLC-reverse phase	4, 4
MTBLS2_rep2 (M)	QTOF	UPLC-reverse phase	4, 4
CAMERA_pos (M)	QTOF	UPLC-reverse phase	3, 3
CAMERA_neg (M)	QTOF	UPLC-reverse phase	3, 3
MTBLS103 (M)	QTOF	UPLC-HILIC	14, 12
MTBLS213 (M)	QTOF	UPLC-reverse phase	6, 6
MTBLS126 (M)	Orbitrap	HPLC-HILIC	3, 3

(a) C = CAAT, M = MetaboLights.

To compare the quantification variability between bakedpi and XCMS and between bakedpi and MZmine2, we first identified peaks which overlapped between bakedpi and XCMS and between bakedpi and MZmine2. We will call these shared peaks. The number of peaks detected by both methods as well as the percentage of peaks that are common to both methods are shown in Supplementary Figure S3; for many data sets the overlap is around 60–80% of the peaks. On these overlapping peaks, we computed the residual standard deviation of the log-abundances to assess their variability. We used residual standard deviation to avoid being influenced by changes in the log-abundances between the two sample groups in the different experiments. Figure 3 shows the distribution of differences in residual standard deviation (XCMS or MZmine2 minus bakedpi) for each data set. Values greater than zero indicate that bakedpi has smaller variability than the other method. For all data sets examined, more than half of the peaks detected by both methods had lower variability when quantified by bakedpi; for some data sets the proportion was substantially higher.
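One way to read the residual standard deviation used here is as the pooled within-group standard deviation, that is, the spread left after each group's mean has been removed. The sketch below follows that reading; the function name and the log-abundance values are hypothetical, and the exact model used for the residuals in the paper may differ.

```python
import math

def residual_sd(log_abund, groups):
    """Pooled within-group SD: spread remaining after removing group means."""
    labels = sorted(set(groups))
    ss = 0.0
    for g in labels:
        vals = [x for x, lab in zip(log_abund, groups) if lab == g]
        mean = sum(vals) / len(vals)
        ss += sum((x - mean) ** 2 for x in vals)
    df = len(log_abund) - len(labels)  # residual degrees of freedom
    return math.sqrt(ss / df)

groups = ["a"] * 4 + ["b"] * 4

# Hypothetical log-abundances for one shared peak under two methods.
# Both show the same group difference (about 2), but method_x is noisier.
method_x = [10.0, 11.0, 10.1, 11.1, 12.0, 13.0, 12.1, 13.1]
method_y = [10.9, 11.0, 11.1, 11.0, 13.0, 12.9, 13.1, 13.0]

# Positive difference: method_y has the smaller residual variability.
diff = residual_sd(method_x, groups) - residual_sd(method_y, groups)
print(round(diff, 3))
```

Removing the group means first is what keeps a genuine difference between the two sample groups from being counted as variability.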

Figure 3

Variability comparison of peak quantifications. (a) For peaks that are detected both by bakedpi and XCMS, the distribution of the differences in residual standard deviation for all data sets are shown as violin plots. Each violin is a mirrored density plot; the median is indicated by a horizontal red line. (b) Like part a but for MZmine2. For all data sets, the majority of peaks detected by both methods have quantifications that are less variable when quantified with bakedpi.

Joint Processing Improves Power in a Differential Analysis

We next sought to determine if the decrease in residual standard deviation of the peak quantifications leads to increased power in a differential analysis. We used the limma[27] differential analysis pipeline as it has been shown to provide robust and powerful inference for proteomics data.[28] This method was originally developed to analyze microarray expression studies and uses empirical Bayes techniques to shrink feature (adduct)-wise variances toward a common underlying value to provide more stable inference. The resulting p-value distributions for the shared peaks in the timecourse_4 h data set are shown in Figure 4a (additional data sets in Supplementary Figure S4). For the majority of the data sets, bakedpi has a p-value distribution that is more peaked around zero than XCMS and MZmine2, indicating that bakedpi detects more significant peaks among the overlapping peaks. When comparing with XCMS, the timecourse_24 h data set is the only one in which XCMS has a taller peak around zero. When comparing with MZmine2, only for the CAMERA_pos data set does MZmine2 have a taller peak around zero.
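The variance shrinkage mentioned above can be sketched as a weighted average of each feature's own variance and a prior variance, with the degrees of freedom as weights. In limma the prior variance and prior degrees of freedom are estimated from all features; in this sketch they are simply supplied as hypothetical numbers, and the function name is an illustrative choice.

```python
def moderated_variance(s2, df, prior_s2, prior_df):
    """Shrink a feature-wise variance s2 (with df residual degrees of
    freedom) toward a prior variance, weighting by degrees of freedom."""
    return (prior_df * prior_s2 + df * s2) / (prior_df + df)

# Hypothetical feature-wise variances from a small experiment (df = 6).
feature_variances = [0.01, 0.25, 4.0]
shrunk = [moderated_variance(s2, df=6, prior_s2=0.25, prior_df=4)
          for s2 in feature_variances]
print(shrunk)
```

Very small variances are pulled up and very large ones are pulled down, which stabilizes the test statistics when only a few samples per group are available.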

Figure 4

Comparison of differential analysis quality and type I error control in the timecourse_4 h data set. (a) Distribution of p-values for peaks detected by both bakedpi and XCMS. (b) Like part a but for MZmine2. (c) Median error rate over null permutations as a function of the nominal error rate.

Higher detection rates alone do not necessarily indicate an increase in power. To assess power, we also evaluated the type I error control of the methods. We performed a permutation experiment in which we shuffled the sample group labels so that each of the new comparison groups was composed half of cases and half of controls. For example, in an experiment with eight cases and eight controls, the new permuted “case” group would include four true cases and four true controls, as would the new permuted “control” group. In this way, we created null data sets in which no abundance differences are expected. With data sets containing a sufficient number of samples, we performed 1000 permutations. Otherwise we enumerated all permutations satisfying the balancing characteristic just described. We again used limma to perform differential testing. Results of the permutation experiment for the timecourse_4 h data set are shown in Figure 4c (additional data sets in Supplementary Figure S5). For a range of nominal type I error rates, we computed the median observed error rate over all permutations. For all 10 data sets, all methods are quite conservative, showing a markedly lower error rate than the nominal value for the entire range. For most of the data sets, bakedpi is the most conservative of the three methods. The combination of more conservative type I error control and a higher detection rate indicates that bakedpi has higher power to detect differences than the sample-by-sample processing procedures of XCMS and MZmine2.
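The balanced relabeling scheme can be enumerated directly for small designs. The sketch below assumes even group sizes so that "half" is well defined (how odd group sizes were split is not stated in the text), and the sample names are hypothetical.

```python
from itertools import combinations

def balanced_relabelings(cases, controls):
    """Yield (new_case, new_control) groups in which each new group is made
    up of half true cases and half true controls (even group sizes assumed)."""
    k_case = len(cases) // 2
    k_ctrl = len(controls) // 2
    everyone = set(cases) | set(controls)
    for from_cases in combinations(cases, k_case):
        for from_ctrls in combinations(controls, k_ctrl):
            new_case = set(from_cases) | set(from_ctrls)
            new_control = everyone - new_case
            yield sorted(new_case), sorted(new_control)

cases = ["case1", "case2", "case3", "case4"]
controls = ["ctrl1", "ctrl2", "ctrl3", "ctrl4"]
relabelings = list(balanced_relabelings(cases, controls))
print(len(relabelings))  # C(4,2) * C(4,2) labelings
print(relabelings[0])
```

Each split appears twice in this enumeration, once with the "case" and "control" names swapped; that only flips the sign of a two-group test statistic, so two-sided p-values are unaffected.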

Retention Time Alignment

It is well established that RT deviations between experimental runs can complicate the matching of peaks across samples. We investigated the impact of correcting RT drift on the variability improvements of our method using multiple strategies. First, we used the RT warping function computed by XCMS to align the raw data before computing the density estimate. Second, we computed local sample-specific RT shifts that maximized the correlation of the chromatograms between samples and used these shifts to align the raw data. Third, we used correlation-optimal shifts to align peaks already detected from the density estimate before quantification. None of these RT alignment strategies had a large impact on the variability of detected features. The proportion of peaks detected by both bakedpi and XCMS or MZmine2 that had lower variability with bakedpi did not change appreciably with these RT corrections (Supplementary Figure S6).
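The correlation-based strategies can be sketched as a search over integer scan shifts that maximizes the Pearson correlation between a sample's chromatogram and a reference chromatogram. This is a simplified illustration: the helper names, the search window of 10 scans, and the Gaussian toy chromatograms are assumptions. The reference is chosen as the chromatogram with the largest area, following the description in Methods.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def best_shift(eic, reference, max_shift=10):
    """Integer shift s (in scans) such that eic[i] best matches
    reference[i + s]; positive s means eic elutes earlier than reference."""
    best_s, best_r = 0, -2.0
    for s in range(-max_shift, max_shift + 1):
        if s >= 0:
            a, b = eic[:len(eic) - s], reference[s:]
        else:
            a, b = eic[-s:], reference[:len(reference) + s]
        r = pearson(a, b)
        if r > best_r:
            best_s, best_r = s, r
    return best_s

def toy_eic(center, height, n_scans=100, width=4.0):
    """Hypothetical Gaussian chromatogram sampled at integer scans."""
    return [height * math.exp(-0.5 * ((i - center) / width) ** 2)
            for i in range(n_scans)]

eics = [toy_eic(50.0, 1000.0), toy_eic(47.0, 800.0)]
reference = max(eics, key=sum)  # sample with the largest area under the EIC
shifts = [best_shift(e, reference) for e in eics]
print(shifts)
```

In practice the shifts would be computed locally, per m/z region or per detected peak, rather than once per sample as in this toy example.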

Parameter Choices

Because the detection of peaks and their bounds depends on the cutoff applied to the density estimate, it is important to investigate the sensitivity of method performance to this cutoff. We performed a sensitivity analysis by varying the density cutoff and examining the p-value distribution resulting from the detected peaks (Supplementary Figures S7 and S8). Raising the cutoff to be more stringent or lowering the cutoff to be more inclusive generally does not have a substantial impact on the global pattern of inference as assessed by p-value distributions.

Method-Specific Peaks

There are a number of peaks that are detected only by one method (Supplementary Figure S3). As comprehensive gold standard information on the true peaks corresponding to compounds was not available, we examined the characteristics of these method-specific peaks to assess their quality (Supplementary Figures S9–S12). For more than half of the data sets, XCMS-specific peaks tend to have more extreme t-statistics and lower p-values. For half the data sets, MZmine2-specific peaks have higher p-values than bakedpi-specific peaks. For nearly all data sets, bakedpi-specific peaks have greater peak heights than XCMS- and MZmine2-specific peaks with comparable peak widths. Peaks specific to bakedpi are also more likely to be supported by all samples in the experiment. The last two observations are sensible given that bakedpi relies on an intensity-weighted density estimation; a peak is more likely to be detected when a large number of high-intensity points are close together. Based on observations about t-statistics and p-values, it is not clear that any one set of method-specific peaks is best. If peaks with greater heights or greater numbers of samples supporting them are more likely to be of scientific interest, then bakedpi-specific peaks seem to be of higher quality than XCMS- or MZmine2-specific peaks. Given the lack of gold standard data on peak identities, evaluation of method-specific peaks is less clear than evaluation of peaks common to multiple methods. On peaks common to both bakedpi and MZmine2 or XCMS, bakedpi shows a clear reduction in quantification variability and an increase in statistical power.

Conclusions

We have proposed a method for the joint processing of metabolomics data across samples, which reduces variability in peak quantification across samples, leading to increased power in a differential analysis. We take the position that the most important task in metabolomics is the identification of differentially abundant peaks, in contrast to, e.g., identifying all peaks in a sample. Our method compares favorably to XCMS and MZmine2 across 10 data sets and will be useful for drawing better and more substantiated inferences from untargeted metabolomics studies. We do note that the commercial software Progenesis CoMet also uses the idea of pooling samples into a metasample for processing. However, details on the CoMet method are not available, making it impossible to comment further on differences between the two approaches.

A limitation of our approach is that peaks that are only truly present in a small fraction of the samples are unlikely to be detected. Such metabolites may be of interest but are by definition less well supported by the observed data. In developing bakedpi, we have chosen to focus on peaks with sufficient information across all samples and on obtaining for those peaks the best quality quantifications for the purposes of differential analysis.

It is important to note that the benefit of our method is dependent on using peak areas for quantification rather than peak height. As we show, the variability in quantification of a particular peak across samples is driven entirely by the variability in peak width. If peak height is used instead of peak area, our method will show the same quantification as XCMS and MZmine2, provided the sample-specific RT bounds contain the mode of the peak; this is true for two of our three examples.

In our evaluation of bakedpi, we have used both centroid-mode and profile-mode data sets with fairly stable chromatography. The RT drift we observe in these data sets is not so large that corresponding peaks from different samples do not overlap. However, stable chromatography is not required for bakedpi to work because we do implement RT alignment procedures. Our evaluation data sets also come from mass spectrometers with a range of mass accuracies from 5 ppm on Q-TOF instruments to less than 1 ppm on the Orbitrap, so bakedpi is able to handle data from a representative range of instruments. We expect lower mass accuracy to make peak merging more likely and to cause peak m/z bounds to be wider than necessary, but this is mostly a feature of low mass accuracy in general. Currently, our method is implemented as the standalone yamss package as part of the Bioconductor project.

Methods

Data

Also see Table 1.

ASD_hirisk: Prenatal serum samples from 40 mothers participating in the EARLI study whose infants had the highest (n = 20) and lowest (n = 20) Autism Observation Scale for Infants (AOSI) scores at the time of the experiment.[29]

timecourse_4 h, timecourse_24 h: Six MCF-7 cell line samples exposed to estradiol (E2) and six control samples unexposed to E2 for up to 72 h.[30]

MTBLS2: Four wild-type and four cyp79b2 cyp79b3 knockout Arabidopsis thaliana leaves exposed to silver nitrate.[31,32]

CAMERA: Spike-ins of 39 known compounds at varying concentrations on methanolic extracts of Arabidopsis thaliana leaves.[33] Three samples with a spike-in concentration of 20 μM were compared to three samples with a spike-in concentration of 5 μM in both positive and negative ion modes.

MTBLS103: Serum profiling of 12 adolescent girls with hyperinsulinaemic androgen excess and 14 healthy controls matched on age, weight, and ethnicity.[34]

MTBLS213: Human retinal pigment epithelium cell line (ARPE-19) batches grown in labeled and unlabeled glucose media.[35]

MTBLS126: Liver concentrations of resveratrol (RESV) metabolites after application of a mixture of RESV in hydrophilic ointment to mouse skin (3 samples) compared to those after application of hydrophilic ointment without RESV to mouse skin (3 samples).[25]

Processing with XCMS and MZmine2

XCMS parameters were optimized using the IPO package available on Bioconductor[26] using recommended starting values for most data sets. Because optimization for the MTBLS2 and MTBLS213 data sets required significant computational time (we terminated the optimization after 11 days), we either fixed parameters that could be reasonably inferred beforehand (such as ppm) or set a smaller range of values over which to optimize. MZmine2 parameters were set based on optimized XCMS parameters where possible. In particular, the “prefilter”, “mzdiff”, minimum and maximum peakwidth, and ppm parameters from XCMS had near equivalents in MZmine2 parameters. For XCMS, we used the “centWave” algorithm[9] for the nine centroid-mode data sets and the “matchedFilter” algorithm[23] for the profile-mode MTBLS126 data set. We used the density method for peak grouping, the obiwarp method for retention time alignment, and the fillPeaks method to fill in information for peaks missing from certain samples. For MZmine2, we used the GridMass module for peak detection,[36] the join aligner for retention time alignment, and the same-range gap filler module. Details on optimization and parameter settings for XCMS and MZmine2 are provided in the Supporting Information.

Processing Workflow

Our processing procedure consists of three steps. First is background correction, which increases the signal-to-noise ratio of true peaks. Second is RT alignment, which aligns the raw data to correct for drifts in compound elution times between samples; this step is optional. Third is density estimation to detect peaks.

Background Correction

Background correction is performed on each sample separately. We divide the m/z-RT space into bins and estimate background separately for each bin; this is arbitrarily done for bins of width 10 m/z units and 40 scans in the RT domain. We observe that each grid region exhibits a multimodal intensity distribution with 2 or more modes (Supplementary Figure S13) and reason that the lowest mode is background. We estimate the location of the mode with the first peak of the kernel density estimate of the intensity distribution and subtract this value from all observations in the grid region.

We investigated two RT alignment procedures that could be applied to the raw data before peak detection and one procedure that could be applied after peak detection. The first pre-peak-detection approach was to use the sample-specific corrected RTs reported by XCMS to define an RT warping function that could be applied to the raw data to yield aligned RTs. In the second approach, we found tentative m/z regions containing peaks using univariate kernel density estimation and computed EICs in these regions for all samples. For each region and sample, we then found the shift that would maximize the correlation between the EICs in each sample and a reference sample (the sample with the largest area beneath the EIC). These local and sample-specific shifts were applied to the raw data to yield aligned RTs.

We also investigated a correction procedure that could be applied to peaks that had already been detected. For each detected peak, we computed the sample-specific shifts that would maximize the correlation between the EICs in each sample and a reference sample (the sample with the largest area beneath the EIC). We then recomputed the peak quantifications using the original RT bounds and shifted EICs.
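The per-bin background estimate described at the start of this section (the lowest mode of the intensity distribution within a grid region) can be sketched as follows. The kernel bandwidth, the evaluation grid step, and the toy intensities are assumptions for illustration; the text does not say whether negative values after subtraction are clipped, so this sketch leaves them as they are.

```python
import math

def kde(x, data, bw):
    """Univariate Gaussian kernel density estimate at x."""
    norm = len(data) * bw * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - d) / bw) ** 2) for d in data) / norm

def first_mode(data, bw, step):
    """Location of the first (lowest-intensity) local maximum of the KDE,
    searched on a regular grid from min(data) to max(data)."""
    lo, hi = min(data), max(data)
    n = int((hi - lo) / step) + 1
    grid = [lo + k * step for k in range(n)]
    dens = [kde(g, data, bw) for g in grid]
    for i in range(n):
        left = dens[i - 1] if i > 0 else float("-inf")
        right = dens[i + 1] if i < n - 1 else float("-inf")
        if dens[i] > left and dens[i] >= right:
            return grid[i]
    return grid[0]

# Hypothetical intensities in one m/z-RT bin: many low background values
# centered near 100 plus a few high-intensity signal values.
background_values = [100.0 + 5.0 * k for k in range(-6, 7)]
signal_values = [4900.0, 5000.0, 5100.0]
intensities = background_values + signal_values

background = first_mode(intensities, bw=50.0, step=10.0)
corrected = [x - background for x in intensities]
print(background)
```

Repeating this for every bin of every sample gives a background estimate that can vary across the m/z-RT space instead of a single global offset.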

Bivariate Density Estimation

To detect peaks, we pool all samples into a single metasample by concatenating the spectral information from all of the samples. For example, the spectral information for the first scan of the metasample is formed by concatenating the first scan’s spectral information from the individual samples. We use this metasample to estimate a two-dimensional density in the m/z-RT space. We represent the input data as a set of data points (M_j, T_j, I_j), where M_j is the mass over charge (m/z) of the jth data point (all samples are pooled), T_j is the scan number (RT in seconds multiplied by the number of scans per second), and I_j is the intensity. Per sample, T typically has up to a few thousand unique values depending on the scan rate of the mass spectrometer and the duration of the experiment, and M has on the order of one hundred observations per scan in centroid-mode data and several hundred in profile-mode data. Thus, the data consist of tens of thousands of such triples for each sample. The bivariate intensity-weighted density estimator using a Gaussian kernel at a point (m, t) in m/z-RT space is given, up to a normalizing constant, by

\[ \hat{f}(m, t) \propto \sum_{j=1}^{n} I_j \, \phi_2\!\left( \frac{m - M_j}{h_M}, \frac{t - T_j}{h_T} \right) \]

where j = 1, ..., n indexes the n data points, h_M and h_T are the bandwidths in m/z and RT space, respectively, and ϕ_2 is a bivariate Gaussian density. The density estimate is not highly sensitive to the RT bandwidth, and a default bandwidth of 10 scans is recommended. The m/z bandwidth should be set based on the type of mass spectrometer used and is recommended to be 0.005 for TOF and 0.002 for Orbitrap instruments. Because the density estimate involves a sum over all n data points at each value of (m, t), we use various approaches to make this computationally tractable.
First, we use a diagonal covariance matrix for the bivariate kernel; this implies the factorization

\[ \phi_2\!\left( \frac{m - M_j}{h_M}, \frac{t - T_j}{h_T} \right) = \phi\!\left( \frac{m - M_j}{h_M} \right) \phi\!\left( \frac{t - T_j}{h_T} \right) \]

where ϕ is the univariate standard Gaussian density. We do this because our focus is on identifying regions of interest rather than on highly exact estimation of the density.[37] Second, we use a simple binning strategy[38] where the m/z-RT space is binned and a single representative value for each bin is chosen. In the RT domain, the default bin width is 1 scan, and in the m/z domain the default bin width is set to be equal to the bandwidth (0.005 for TOF and 0.002 for Orbitrap). Third, we use a Gaussian kernel truncated at ±3, effectively only including points close to (m, t) in the summation.[38] Fourth, in our implementation, we make use of sparse linear algebra as well as efficient data structures for selecting points close to (m, t) as implemented in the data.table package.[39]

After obtaining the density estimate, we select a cutoff using information from the strongest (most intense) features in the data. The m/z domain is divided into bins of a default width of 2 m/z. Within each bin, the most intense data point is selected. We assume that this data point belongs to a true feature and use local univariate density estimation in the m/z and RT domains to define an m/z and RT window for this feature. We compute quantiles of the density estimate values in these regions and compute the mode of this quantile distribution for various quantile values. For example, we compute the 99th percentile for each of the approximately 500 strong feature regions and compute the mode of this distribution. We repeat this for a wide range of percentiles. We then order these modes and select the first mode substantially different from zero as a cutoff. To ensure reasonable peak bounds, we enforce that this cutoff should be greater than or equal to the 99th percentile of nonzero density values. Applying the cutoff to the density estimate matrix yields a binary matrix that denotes peak and nonpeak regions. In order to obtain m/z and RT bounds for these peak regions, we use a connected components labeling algorithm.[40]
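The final thresholding and bounding step can be sketched with a breadth-first connected components search over a small density matrix. The cutoff is taken as given here (its data-driven selection is not reproduced), and two details are assumptions of this sketch rather than statements about yamss: cells are kept when the density is at least the cutoff, and regions are formed with 4-connectivity. Rows stand for m/z bins and columns for scans.

```python
from collections import deque

def peak_regions(density, cutoff):
    """Return (row_min, row_max, col_min, col_max) for every connected
    region of cells with density >= cutoff (4-connectivity assumed)."""
    n_rows, n_cols = len(density), len(density[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    regions = []
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c] or density[r][c] < cutoff:
                continue
            # Breadth-first search over one region, tracking its bounds.
            queue = deque([(r, c)])
            seen[r][c] = True
            r_lo = r_hi = r
            c_lo = c_hi = c
            while queue:
                i, j = queue.popleft()
                r_lo, r_hi = min(r_lo, i), max(r_hi, i)
                c_lo, c_hi = min(c_lo, j), max(c_hi, j)
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    a, b = i + di, j + dj
                    if (0 <= a < n_rows and 0 <= b < n_cols
                            and not seen[a][b] and density[a][b] >= cutoff):
                        seen[a][b] = True
                        queue.append((a, b))
            regions.append((r_lo, r_hi, c_lo, c_hi))
    return regions

# Toy density estimate: rows are m/z bins, columns are scans.
density = [
    [0.0, 0.1, 0.2, 0.1, 0.0, 0.0],
    [0.0, 0.6, 0.9, 0.7, 0.0, 0.0],
    [0.0, 0.1, 0.8, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.7],
    [0.0, 0.0, 0.0, 0.0, 0.6, 0.9],
]
regions = peak_regions(density, cutoff=0.5)
print(regions)
```

Because the regions are found once on the pooled density, the resulting m/z and RT bounds are the same for every sample, which is the property the method relies on for quantification.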

Software Availability

Our method is implemented in the yamss package, available from the Bioconductor project at https://www.bioconductor.org/packages/yamss.

33 in total

1. Bayesian approach for peak detection in two-dimensional chromatography.

Authors: Gabriel Vivó-Truyols
Journal: Anal Chem Date: 2012-03-02 Impact factor: 6.986

2. XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification.

Authors: Colin A Smith; Elizabeth J Want; Grace O'Maille; Ruben Abagyan; Gary Siuzdak
Journal: Anal Chem Date: 2006-02-01 Impact factor: 6.986

3. Retention time alignment algorithms for LC/MS data must consider non-linear shifts.

Authors: Katharina Podwojski; Arno Fritsch; Daniel C Chamrad; Wolfgang Paul; Barbara Sitek; Kai Stühler; Petra Mutzel; Christian Stephan; Helmut E Meyer; Wolfgang Urfer; Katja Ickstadt; Jörg Rahnenführer
Journal: Bioinformatics Date: 2009-01-28 Impact factor: 6.937

4. Multivariate two-part statistics for analysis of correlated mass spectrometry data from multiple biological specimens.

Authors: Sandra L Taylor; L Renee Ruhaak; Robert H Weiss; Karen Kelly; Kyoungmi Kim
Journal: Bioinformatics Date: 2016-09-04 Impact factor: 6.937

5. CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets.

Authors: Carsten Kuhl; Ralf Tautenhahn; Christoph Böttcher; Tony R Larson; Steffen Neumann
Journal: Anal Chem Date: 2011-12-12 Impact factor: 6.986

6. IPO: a tool for automated optimization of XCMS parameters.

Authors: Gunnar Libiseller; Michaela Dvorzak; Ulrike Kleb; Edgar Gander; Tobias Eisenberg; Frank Madeo; Steffen Neumann; Gert Trausinger; Frank Sinner; Thomas Pieber; Christoph Magnes
Journal: BMC Bioinformatics Date: 2015-04-16 Impact factor: 3.169

7. Metabolism of skin-absorbed resveratrol into its glucuronized form in mouse skin.

Authors: Itsuo Murakami; Romanas Chaleckis; Tomáš Pluskal; Ken Ito; Kousuke Hori; Masahiro Ebe; Mitsuhiro Yanagida; Hiroshi Kondoh
Journal: PLoS One Date: 2014-12-15 Impact factor: 3.240

8. Highly sensitive feature detection for high resolution LC/MS.

Authors: Ralf Tautenhahn; Christoph Böttcher; Steffen Neumann
Journal: BMC Bioinformatics Date: 2008-11-28 Impact factor: 3.169

9. Genetic variability in a frozen batch of MCF-7 cells invisible in routine authentication affecting cell function.

Authors: Andre Kleensang; Marguerite M Vantangoli; Shelly Odwin-DaCosta; Melvin E Andersen; Kim Boekelheide; Mounir Bouhifd; Albert J Fornace; Heng-Hong Li; Carolina B Livi; Samantha Madnick; Alexandra Maertens; Michael Rosenberg; James D Yager; Liang Zhao; Thomas Hartung
Journal: Sci Rep Date: 2016-07-26 Impact factor: 4.379

10. Quality assurance of metabolomics.

Authors: Mounir Bouhifd; Richard Beger; Thomas Flynn; Lining Guo; Georgina Harris; Helena Hogberg; Rima Kaddurah-Daouk; Hennicke Kamp; Andre Kleensang; Alexandra Maertens; Shelly Odwin-DaCosta; David Pamies; Donald Robertson; Lena Smirnova; Jinchun Sun; Liang Zhao; Thomas Hartung
Journal: ALTEX Date: 2015 Impact factor: 6.043
