Literature DB >> 32991165

Numerical Representations of Metabolic Systems.

Abstract

Metabolomics is becoming a mature part of analytical chemistry as evidenced by the growing number of publications and attendees of international conferences dedicated to this topic. Yet, a systematic treatment of the fundamental structure and properties of metabolomics data is lagging behind. We want to fill this gap by introducing two fundamental theories concerning metabolomics data: data theory and measurement theory. Our approach is to ask simple questions, the answers of which require applying these theories to metabolomics. We show that we can distinguish at least four different levels of metabolomics data with different properties and warn against confusing data with numbers. This treatment provides a theoretical underpinning for preprocessing and postprocessing methods in metabolomics and also argues for a proper match between type of metabolomics data and the biological question to be answered. The approach can be extended to other omics measurements such as proteomics and is thus of relevance for a large analytical chemistry community.

Entities: Chemical Disease Gene Species

Mesh：

Year: 2020 PMID： 32991165 PMCID： PMC7581014 DOI： 10.1021/acs.analchem.9b05613

Source DB: PubMed Journal: Anal Chem ISSN： 0003-2700 Impact factor: 6.986

Metabolomics concerns the measurement of small biochemical compounds (metabolites) in samples obtained from biological systems or, in a broader context, from samples that contain such metabolites (extracts from natural foods, environmental samples, etc.). Such measurements are subsequently used to infer relevant information about the associated (biological) system related to a certain research question. Nowadays, there is a whole variety of metabolomics measurements available which can be categorized either by the type of instruments used (mostly liquid chromatography–mass spectrometry (LC–MS), gas chromatography–mass spectrometry (GC–MS), capillary electrophoresis–mass spectrometry (CE–MS), and NMR) or by the type of measurement performed. The latter pertains to whether the measurement is targeted to a certain number of (known) metabolites or to an untargeted analysis in which also (many) unknown metabolites are being measured. There are also methods which are a combination of both. A typical pipeline for a metabolomics study runs through different steps: formulating a biological question, experimental design, sampling, measuring, preprocessing the data, analyzing the preprocessed data, visualization of results, and answering the biological question.[1] An often neglected part of the above-mentioned pipeline is the difference between numbers and data. This is a very fundamental issue at the heart of any measurement. We will explain this issue starting from asking a few very simple questions about whether numbers in a metabolomics experiment can be meaningfully compared to each other. To answer these questions, we have to introduce two theories, namely, data theory and measurement theory. After that, we will give (partly) answers to the questions and try to come to a synthesis. As a running example throughout this paper, we will consider measuring lipids in blood using LC–MS. The goals of this paper are (1) provide a theoretical underpinning of preprocessing methods; (2) give guidelines for a proper use of data analysis methods and propose alternatives; (3) warn against conclusions which are not supported by the (properties of) the data; (4) argue for a proper match between data properties, biological question, and data analysis method; and (5) creating awareness that numbers (from an instrument) is not yet data. In short, we provide a theoretical framework for thinking about and dealing with metabolomics data. In a broader context, we would like to create awareness that numbers are not data, which is highly relevant in this era of Big Data. We will not discuss the specifics of the different preprocessing and data analysis methods nor of related topics such as missing data handling and measurement error. There are many papers already discussing this. We invite the metabolomics practitioners to apply our framework on their way of analyzing metabolomics data.

Simple Questions

We start by visualizing the numbers obtained from an LC–MS experiment on lipids in blood (see Figure ). The raw data can be arranged in intensities obtained at a certain m/z value at a certain retention time (rt), and the combined index rt.mz indicates a column in the matrix containing the samples in its rows. Looking at Figure , we can ask simple questions to what extent the numbers are comparable, specifically:

Figure 1

Schematic of raw measurements of lipids in blood. Legend: i is a row in the matrix; j is a column; A, B, and C are specific numbers in the matrix; rt.mz is the retention time-mass spectrometry index.

Schematic of raw measurements of lipids in blood. Legend: i is a row in the matrix; j is a column; A, B, and C are specific numbers in the matrix; rt.mz is the retention time-mass spectrometry index. (1) If A > C; does that have a meaning? (2) If A > B; does that have a meaning? (3) Does A – C have a meaning? (4) Does A – B have a meaning? (5) Does A/C have a meaning? (6) Does A/B have a meaning? which should be taken as examples, e.g., when A < C, then question one has to change accordingly. Note that moving from question one to three puts a higher demand on the numbers, e.g., if A/C is meaningful then necessarily A > C must have a meaning (but not vice versa!). This notion will be formalized later. The questions asked above are relevant for a subsequent data analysis. Take the example of PCA (For a short explanation of the methods, see the Supporting Information.), the workhorse of metabolomics data analysis. The score-plots of a PCA are usually interpreted in terms of distances between dots representing the samples, where samples far apart are regarded as very dissimilar and vice versa. However, scores are linear combinations of the original variables (i.e., numbers), and this assumes that for distances in scores plots to be meaningful, at least also the original numbers should be comparable (at least at the level of A–C). Similar reasonings hold for loading plots and for the results of OPLS-DA and other often used techniques. Hence, it makes sense to answer the above posed questions. Actually, there is even a more basic question to ask before considering the simple questions: is it even meaningful to start comparing the values A, B, and C? This question is key in the field known as data theory. The next questions regarding at which level comparisons are possible is the subject of measurement theory. Therefore, both will be explained briefly in the sequel. This paper will be mainly concerned with mass-spectrometry based measurements; the case for NMR is a little different and will be touched upon in the end.

Data Theory

A set of notions regarding comparability is called data theory and was pioneered by Coombs[2] and explained for multiway analysis.[3] The first important notion in data theory is conditionality, where we can distinguish column-, row-, and matrix-conditionality. When considering numbers arranged in a matrix (such as in Figure ), then different types of comparisons can be made: between numbers across rows in the same column (Figure , between A and B) and between numbers across columns in the same row (Figure , between A and C). When such data can be compared meaningfully, the data are called column-conditional and row-conditional, respectively. When data can be meaningfully compared across rows and columns, then these data are called matrix-conditional. The prototypical example of row-conditional data are metabolomics measurements of urine, e.g., using NMR. Depending on the different urine histories of the subjects, the urine can be more or less concentrated. This makes the values within one column of a data matrix incomparable since the (unknown) dilution factor of the subjects destroys the comparability. The typical solution of this problem is found in normalizing the different samples thereby attempting to achieve matrix-conditionality. Whether this completely solves the problem is a matter of debate and it also depends on the research question. Actually, different types of metabolites are differently excreted by the kidney: some are only excreted by filtration, some are (partly) readsorbed, and readsorption is achieved by different transporters, for example, one for acidic amino acids, one for dibasic amino acids, one for neutral amino acids.[4] This could justify a normalization per certain metabolite classes rather than normalizing all metabolites in the same manner. Moreover, the type of normalization may also depend on the type of sample, e.g, whether it originates from urine, serum, or tissue. Discussing these issues further is beyond the scope of this paper. A more serious problem regarding comparability as discussed in data theory is lack-of-invariance: the numbers in a single column do not have the same meaning. This problem is more fundamental than conditionality. Whereas in conditionality, numbers cannot be compared since there are (unknown) arbitrary differences, in lack-of-invariance the meaning of the variables changes within a column. The prototypical example is unsynchronized time series data (see Figure , panel a). The time series of three subjects are collected for multiple metabolites; in this figure, only one metabolite is shown. The series are not synchronized, therefore the measurements at, e.g., physical time point t4 cannot be compared across subjects because they pertain to different states of the biological process measured with the metabolite. Hence, the meaning of the measurement at time point t4 changes and is not invariant.

Figure 2

Lack-of-invariance illustrated: (a) unsynchronized times series of several subjects and (b) the naive arrangement of the numbers.

Lack-of-invariance illustrated: (a) unsynchronized times series of several subjects and (b) the naive arrangement of the numbers. A naive arrangement of the numbers is shown in Figure , panel b). This is called naive since the lack-of-invariance is not taken into account. A more accurate arrangement of the numbers is shown in Figure because now it is clear that each subject has its own unique time points (see the subscript i on the variables indicating time points). Obviously, the numbers as shown in Figure cannot be used as such. Remedies of this problem are found in alignment procedures (e.g., using warping approaches[5]). After such an alignment of all metabolites, assuming that this has solved the lack-of-invariance problem, the numbers can be arranged in a three-way array and analyzed with proper three-way methods such as PARAFAC.[6]

Figure 3

Proper arrangement of the numbers whereby each individual i receives its own time points.

Proper arrangement of the numbers whereby each individual i receives its own time points. It is important to realize that such lack-of-invariance problems can also be solved by using data analysis methods that do not require the numbers to be synchronized. One such an alternative for the case of Figure is to concatenate the data sets per subject (time versus metabolites) in such a way that all subject-matrices are stacked on top of each other with the metabolites as the common mode. Then methods like simultaneous component analysis (SCA)[7] can be used. This is one of the ways to solve the synchronization problems in batch statistical process monitoring where the batches are also not synchronized.[8] Hence, the properties of the data have repercussions on the methods that can be applied and the type of biological questions that can be solved.

Measurement Theory

After having established that a comparison between numbers is meaningful, the next question is at what level this can be done. This was pioneered by Stevens[9] and later taken up and further developed by several authors.[10−13] A nice introduction is given by Hand[14] and a summary is given in Table which is explained briefly. In the Supporting Information, we give a more formal treatment with an illustrative example.

Table 1

Formal Treatment of Types of Data Scalesa

scale-type	example	permissible transformations	permissible statistics
nominal	categories	one-to-one	number of cases
ordinal	survey data	monotonic	median, IQR
interval	degree Celsius	positive linear transformation	mean, standard deviation
	calender time	x′ = αx + β (α > 0)
ratio	length mass	similarity transformation x′ = αx (α > 0)	coefficient of variation
absolute	counts	x′ = x	all previous

For explanation, see the text.

For explanation, see the text. The basic notion in measurement theory is that we want to represent (properties of) a system by numbers, i.e., we want to give a numerical representation of a system. The lowest measurement level is nominal data which are merely (exclusive) categories. Examples are different types of cars, different countries, etc. The data are only used as class labels, and these can be changed as long as each class receives a unique other label. Hence, the permissible transformations, the transformations between numerical representations that keep the relationships in the corresponding system intact, are one-to-one transformations. The type of statistics to be used for this type of data are number of cases, frequencies, χ2-tests, etc. The next level of measurement scale are ordinal data. The prototypical example is survey data in which respondents can score on certain issues using the answers strongly disagree, disagree, neutral, agree, strongly agree. Obviously, there is an order in these answers; and these answers can be labeled from 1 to 5. The difference between 2 and 1 on the one hand and between 3 and 2 on the other hand, although exactly equal, does not have a meaning. The system can also be represented using a different set of numbers, e.g, 2, 4, 7, 8, 9, but the transformation between the two numerical representations needs to be monotonic. The type of statistics to be employed are the ones for the lower-scaled measurement (i.e., nominal data) and in addition median, interquartile range (IQR), etc. Interval-scale data is the next level. An example is degree Celsius where the numbers zero and hundred are arbitarily chosen. Stated otherwise, this scale does not have a natural zero and unit. This means that another scale (x′) can be used with x′ = αx + β (α > 0), and this scale has the same meaning for the system; an example is Fahrenheit where α = 9/5 and β = 32. Nevertheless, the ratio of differences between values of this scale has meaning in terms of the system, e.g., in using calendar times can be interpreted in a meaningful way as the first period being four times as long as the second one. However, the ratio does not have a meaning; 1980 is not twice as old as 990, hence, the name interval-scale. In addition to statistics at the lower measurement levels, means and standard deviations can be used meaningfully for interval-scaled data. The next level is ratio-scaled data with examples length and weight. A ratio-scaled variable has no natural unit. Length can be expressed in meters or centimeters, but it has a natural zero. Hence, the permissible transformation is x′ = αx (α > 0). In addition to the lower measurement levels, also coefficients of variation can be used meaningfully for ratio-scaled data. The highest degree of measurability is absolute scale data, e.g., count data. Such data has a natural zero and a natural unit, and the only permissible transformation is the identity. Apart from the measurement levels mentioned above, there are still other types of more exotic scales.[10] When considering the simple questions, it is clear that metabolomics measurements can have different measurement scales. It is certainly not always the case that metabolomics measurements are measured on a ratio-scale. If simple questions 1 and 2 are answered affirmative, then the data is at least ordinal-scaled. If simple questions 3 and 4 are answered affirmative, then the data is at least interval-scaled; and if simple questions 5 and 6 are answered affirmative, then the data is ratio-scaled. This will be explained in the next section.

Levels of Metabolomics Measurements

Level 0 Measurements: Raw Numbers

Given the knowledge explained above regarding different aspects of comparability, we now turn to the simple questions. The most basic measurement readouts of an LC–MS measurement of blood-lipids are shown in Figure a. This is simply a list of raw intensities measured per sample in an LC–MS run arranged in an rt.mz format and will be called level 0 numbers.

Figure 4

Level 0 of LC–MS measurements: (a) rt.mz is a specific combination of retention time and m/z ratio, i is an index for sample, j is an index for column; (b) Met1 means metabolite 1, Frag1 is fragment 1, and Frag2 is fragment 2 of the same metabolite 1. We can now start by answering the first simple question. Suppose that A0> C0: does that have a meaning? There are two cases to consider. Case a, where the numbers pertain to fragments of different metabolites (we do not consider trivial cases of spurious signals due to noise). For this case, the answer is that A0 > C0 has no meaning since the response factors of both metabolites are different and at this point unknown (see Supporting Information, Calibration Models). Hence, these numbers do not reflect (relative) concentrations within the system. Case b is shown in Figure b and pertains to intensities of different fragments (this does not hold for adducts; their ratios can depend on the concentrations). of the same metabolite (and, thus, at the same rt). In that case, the ratio A0/C0 may have meaning since it refers to the same metabolite. In fact, such a ratio should also hold for the same fragments of that metabolite in other rows, thus A0/C0 = B0/D0 (assuming alignment of rt.mz values). Comparing A0 with B0 in case a, we run into a lack-of-invariance problem since the rt.mz values are not aligned. Even when alignment would not be a problem, we still have only row-conditional numbers since there may be batch and sample workup differences between the samples.

Level 1 Measurements: Alignment, QC, and IS-Corrected

One of the first steps being done after acquiring the raw data is alignment of the chromatograms, global-IS correction and QC correction of the data (see the Supporting Information, Internal Standards). Alignment is needed to combat the lack-of-invariance problem by assuring that the same feature is now represented in a single column so that each column represents the same compound. Global IS correction is used to reduce sample workup and injection volume errors. The QC correction step is needed to reduce the within and between measurement batch drift of the instruments.[15] After this data cleaning, we arrive at Level 1 measurements (Figure ). The columns now represent features and have the same meaning across each column. A feature can represent one individual lipid molecule but can be also due to a combination of two or more lipid molecules which are isomers but cannot be differentiated with the mass spectrometric method. An example is phosphatidylcholine PC (22:1/18:O) where without MS/MS, the position of the unsaturated fatty acid cannot be determined, and the position of the double bond requires even further advanced methods.

Figure 5

Level 1 of LC–MS measurements after Global-IS and QC correction.

Level 1 of LC–MS measurements after Global-IS and QC correction. For comparing A1 and C1, the same argument goes as for the Level 0 measurements. Comparing A1 with B1 is now meaningful since they pertain to the same feature and the numbers are column-conditional due to the IS and QC steps. Still, A1 and B1 are measured intensities and not directly interpretable as concentrations. In general, a calibration model has four regions: (i) a below limit of detection region, (ii) a linear region, (iii) a concave region (flattening), and (iv) a saturation region (see the Supporting Information, Calibration Models). If A1 and B1 are both in region ii, then their ratio can be interpreted as a ratio of concentrations. Hence, the numbers are ratio-scaled. If they are both in the concave region, then a ratio is not meaningful anymore but if A1 > B1, then it can still be concluded that the concentration at measurement A1 is larger than the concentration at measurement B1. Hence, the numbers are ordinal-scaled. (actually, a bit more than ordinal-scaled since the calibration model has a specific shape.) If one or both of A and B are in the saturated region iv, then a comparison is meaningless. Summarizing, the conclusion about comparability in this case depends crucially on the shape of the calibration model which is unknown at this point. Until now, we have been discussing numbers. By using instrumental analysis theory into the transition from level 0 to level 1, we have arrived at data because the numbers in level 0 have become a certain meaning. In short: data = numbers + meaning.

Level 2 Measurements: Group IS-Corrected

It is also possible to have internal standards for a group of lipids, e.g., separate standards for triglycerides and certain phospholipid classes such as phosphoethanolamines, phosphoethanolserines, and cholesterols. Compared to level 1, the signal of the feature representing one individual lipid or a combination of isomeric lipids is better quantified as a more appropriate IS is used; the IS should be chosen such that the IS is matching the structure of the lipid such that effects such as ion suppression is compensated. Hence, at this level, the metabolite or lipid class of a feature has to be identified. The results are called level 2 measurements and shown in Figure .

Figure 6

Level 2 of LC–MS measurements after group-IS and QC correction. (Panel a) A2 and C2 are in the same lipid class (indicated by shades of blue) and (panel b) A2 and C2 are in different lipid classes.

Level 2 of LC–MS measurements after group-IS and QC correction. (Panel a) A2 and C2 are in the same lipid class (indicated by shades of blue) and (panel b) A2 and C2 are in different lipid classes. For comparing A2 and B2, the conclusions are the same as for the level 1 measurements. In comparing A2 and C2, we now have to distinguish between A2 and C2 in the same lipid-class or not. When A2 and C2 are in the same lipid-class (Figure , panel a) and if we can expect similar response factors, then these numbers are comparable. Whether the numbers A2 and C2 are ratio- or ordinal-scaled depends again on the region of the calibration models in which A2 and C2 are. If the numbers A2 and C2 are in different lipid classes (Figure , panel b), then we have in principle again level 1 measurements. Note that in the transition from numbers to data, we have not only used instrumental analysis theory but also chemical theory, in particular, theory regarding the behavior during analysis (ionization) and chemical similarity between lipids.

Level 3 Measurements: Concentrations

The highest level of measurements is obtained after having built calibration models for all individual lipids. Obviously, for each rt.mz feature, the structure of the lipid has to be known, which is not a trivial task (but outside of the scope of this paper). At this level, concentrations of a lipid are determined rather than a relative concentration, i.e., a ratio versus an internal standard. This requires that an authentic standard is available, and that thus the lipid is fully identified. An example is the quantification of prostaglandin E2, a bioactive lipid, where absolute concentrations measured in a patient can then be compared to reference values. For this, the most ideal IS is an isotopically labeled prostaglandin E2 eluting at the same retention time and experiencing the same ion suppression. For practical reasons, often calibration models within a class of lipids are constructed using only a limited number of standards. This is possible if the response factors are proven to be the same or a (preferably on theory based) model for the response factor of each lipid is applied[16] (see the Supporting Information, Internal Standards and Calibration Models). These allow for a transformation from intensities to concentrations for all numbers in the matrix of Figure . This results in matrix-conditional data, and the data are ratio-scaled.

Figure 7

Level 3 of LC–MS measurements after using calibration models.

Level 4 Measurements: Biological Activities

Up until now, the focus has been on concentrations of the lipids. Suppose that the interest is in the lipids as ligands in a biological activity study. It is known that ligands have an affinity for a receptor which can be modeled by a (nonlinear) sigmoidal dose–response function which is usually specific for each lipid. An example of such a bioactive lipid is again prostaglandin E2. At level 4 measurements, the bioactive effect, e.g., the pro-inflammatory effect on the vasculature in an in-vitro model, is measured rather than the concentration. From this perspective, the data in level 3 are now suddenly column-conditional since the values A3 and C3 have become incomparable due to the differences in dose–response functions. Moreover, because of the sigmoidal relationship, the values A3 and B3 are not ratio-scaled anymore but ordinal-scaled (actually, a bit more than ordinal-scaled since the dose–response curves are sigmoidal.) When all dose–response curves are known, then the data could be transformed to biological activities again and become ratio-scaled matrix conditional. This could be called level 4 data, which are tailor-made for a specific purpose.

Synthesis

From the previous presentation, it is clear that there is an interplay between numbers, data, theory, and type of biological question. An attempt to synthesize this is shown in Figure .

Figure 8

Synthesis of the foregoing. Legend: arrow A represents the transition from numbers to data using chemical- and instrumental analysis theory; arrow B represents the translation from a biological question to a model and modeling objective; and, finally, arrow C asks whether A and B are properly matched. The blue ellipsoid marked “Numbers” represent the raw numbers coming from an instrument. They do not represent data yet, as explained above. Instrumental analysis and chemical/physical theory should be used to turn these numbers into data (arrow A). These data have then certain properties, conditionality, measurement scale, depending on the original numbers and the theory that is used to turn them into data. This is exemplified above in the different levels of metabolomics measurement with increasing efforts to change the properties of the data, e.g, by using internal standards going from level 0 to level 1 and using calibration models going from level 2 to level 3. The biological questions pertain to certain biological systems, and these questions need to be formalized in a model to be able to confront the question with the data. The term model should be taken in a broad context, e.g., even simple correlations can be considered models. The modeling objective is then formulated in terms of which parameters have to be estimated, which loss-functions to use, which algorithms to use, etc. As an example, if the blood-lipids are measured for a group of controls and patients and if the data are (at least) interval-scaled, then OPLS-DA can be used to find biomarkers. The crucial part of Figure is arrow C. There should be a match between modeling objectives and properties of the data. Citing the example above, if time-series data of the lipids are available for different subjects and these are not synchronized (or cannot be synchronized), then it does not make sense to use three-way models. If the data are only ordinal scaled, then we cannot fit quantitative systems biology models to the data. If there are discrepancies in arrow C, then there are two routes to take: change the properties of the data or change the modeling objective. For the example of unsynchronized time-series data, we have to switch to simultaneous component analysis (see the section Data Theory) models (thereby possibly also rephrasing the biological question). To fit systems biology models, we have to make calibration models for all lipids and make all data in the concentration form. Obviously, there are many examples of how to solve such discrepancies.

Broader Context: Considerations for the Field

Repercussion for Metabolomics Data Analysis

The above presented theory has repercussions for metabolomics data analysis. In the subsection Correlations, we will show what it means for correlations as a simple example but similar types of considerations hold for more complicated methods such as PCA and OPLS-DA since these methods use correlations. In the subsection Overview, we will subsequently give an overview.

Correlations

To show how correlations are affected by different comparability properties, we present a small example. Suppose that intensities of three lipids are measured at level 1 (global-IS and QC corrected and aligned). The data for five samples are presented in eq :This data is column-conditional and, depending on whether the lipids are measured within the linear range of the calibration models, ordinal or ratio-scaled within a column. The Pearson correlation matrix of this data isassuming that the numbers are ratio-scaled. Suppose now that we have made calibration models for all three lipids and the concentrations are as follows:then this matrix has exactly the same (Pearson) correlations as the one of eq (all intensities were in the linear range of the calibration models). Hence, the column-conditionality of the data does not hamper the use of correlations, and when using correlations, there is no need for calibration models. The reason is that going from intensities to concentrations (assuming that the numbers are in the linear range of the calibration models) are simple linear transformations and correlations are invariant under such linear transformations. If the original intensities were on an ordinal-scale, then similarly Spearman correlations could be used. Following our example, suppose that we are interested in biological activities and have measured these activities corresponding to the above-mentioned concentrations of prostaglandin E2 and related lipids, and these activities arewhere lipid two is in the saturation phase of the dose–response curve; lipid one also shows nonlinear behavior, and lipid three is in the linear range. The correlation matrix of these activities iswhich is clearly different from eq because of the nonlinearity of the dose–response curves.

Overview

This section discusses the repercussions of the foregoing discussion for metabolomics data analysis. It should not be read as a cookbook about what (not) to do but merely as some remarks about things to consider when performing metabolomics data analysis. Table summarizes the remarks.

Table 2

Different Levels of Metabolomics Measurements and Their Propertiesa

level	characteristics	data properties	statistics
level 0	raw numbers	incomparable	some within-row comparisons
level 1	QC-corrected/aligned	column-conditional	within-column
level 1	global-IS-corrected	ordinal or ratio	nonmetric or metric
level 2	QC-corrected/aligned	column-conditional	within-column
level 2	group-IS-corrected	within-group matrix-conditional ordinal or ratio	within-group submatrix nonmetric or metric
level 3	concentrations	matrix conditional	within matrix comparisons
level 3	concentrations	ratio	metric
level 4	tailor made	case specific	case specific

For explanation, see the text.

For explanation, see the text. The notions of conditionality and measurement scales were explained in the previous sections and summarized in Table . As also explained in the foregoing, the data obtained from level 1 and level 2 can be ordinal or ratio- scaled depending on the form of the calibration model and the specific measurement. When we are in the ordinal-scaled regime, then nonmetric methods can be applied such as the Mann–Whitney two-sample tests and nonmetric multidimensional scaling.[17] Also optimal scaling for multivariate analysis is then an option.[18] When we can assume ratio-scaled data (and at least level 2) then the whole (metric) machinery of PCA, PLS, and OPLS-DA is at our disposal. When the data is in the ordinal-scaled regime and still methods such as PCA and OPLS(-DA) are applied, it is unclear at this point whether the results from such an analysis are (in)valid: as mentioned earlier the data is also a bit more than ordinal-scaled.

FAIR Data

Recently, the life sciences and especially the omics field starts to agree that data should be FAIR (findable, accesible, interoperable, reusable). This allows to reuse data or to combine data from different sources. However, so far often not much information is provided about the quality and theory of the data: how secure is the identification of a metabolite or lipid? How quantitative are the data: is the data for the metabolites ratio-scaled or ordinal-scaled? If FAIR data is not provided with the proper measurement information and theory (i.e., meta-data), they are actually more numbers than data (see Figure ).

Data Fusion

A field of growing interest is data fusion and, specifically, fusion of metabolomics data with other types of omics data. The issues of scale-type and comparability (in general, data characteristics) also play a dominant role in this field but until to now have received little attention. An obvious question to ask is whether two data sets can be compared, likewise as comparing two columns in a matrix of measurements as explained above. When different types of omics measurements are performed on the same set of samples, then such questions arise when the two data sets are going to be fused. Differences in scale-type between two omics data sets also often occur, e.g., when fusing metabolomics with mutation data which are intrinsically binary. Several methods exists for fusing such types of data,[19−22] but comparability issues as explained above have received little attention.

NMR

For NMR, the situation is different than for MS-based metabolomics and we will briefly explain the levels 0–4 for NMR. At level 0, the raw NMR data is considered. These are row-conditional since values in the same column cannot be compared without preprocessing for two reasons. First, the NMR-spectra may not be aligned so that there is lack-of-invariance and, second, even if the spectra are aligned there may still be dilution effects (e.g., in urine spectra) hampering between sample comparisons. Within a row (that is, within the same spectrum) the numbers are comparable because they all pertain to counts of hydrogen atoms. If all preprocessing has been done (aligning, calibration (e.g., ERETIC signal), and normalization) then we arrive at levels 1–2. The data are now row- and column conditional, hence, matrix conditional. In essence, the data pertains to counts of hydrogen atoms and are not concentrations yet. To arrive at concentrations, the peaks have to be identified, quantified, and calibrated thus thereby arriving at level 3 which can be done by current software such as mNOVA and Chenomx. These concentrations are ratio-scaled and matrix conditional. Also in this case, if interest shifts to biological activities, then the same conclusions (for level 4) hold as for the case of MS-based measurements.

Other Omics Measurements

We do not give a full treatment here, but much of the theory explained above also holds for other types of omics measurements. MS-based proteomics is a clear example, but similar simple questions treated in this paper can also be asked about, e.g., RNAseq data as collected in gene-expression measurements or in microbiome research. We invite researchers in those areas to consider these simple questions too!

8 in total

Review 1. Sodium-coupled amino acid transport in renal tubule.

Authors: I Zelikovic; R W Chesney
Journal: Kidney Int Date: 1989-09 Impact factor: 10.612

2. Measurement scales on the continuum.

Authors: R D Luce; L Narens
Journal: Science Date: 1987-06-19 Impact factor: 47.728

3. On the Theory of Scales of Measurement.

Authors: S S Stevens
Journal: Science Date: 1946-06-07 Impact factor: 47.728

4. Time alignment algorithms based on selected mass traces for complex LC-MS data.

Authors: Christin Christin; Huub C J Hoefsloot; Age K Smilde; Frank Suits; Rainer Bischoff; Peter L Horvatovich
Journal: J Proteome Res Date: 2010-03-05 Impact factor: 4.466

5. Analytical error reduction using single point calibration for accurate and precise metabolomic phenotyping.

Authors: Frans M van der Kloet; Ivana Bobeldijk; Elwin R Verheij; Renger H Jellema
Journal: J Proteome Res Date: 2009-11 Impact factor: 4.466

6. Pattern discovery and cancer gene identification in integrated cancer genomic data.

Authors: Qianxing Mo; Sijian Wang; Venkatraman E Seshan; Adam B Olshen; Nikolaus Schultz; Chris Sander; R Scott Powers; Marc Ladanyi; Ronglai Shen
Journal: Proc Natl Acad Sci U S A Date: 2013-02-21 Impact factor: 11.205

Review 7. Selection of internal standards for accurate quantification of complex lipid species in biological extracts by electrospray ionization mass spectrometry-What, how and why?

Authors: Miao Wang; Chunyan Wang; Xianlin Han
Journal: Mass Spectrom Rev Date: 2016-01-15 Impact factor: 10.946

8. Quantitative metabolomics based on gas chromatography mass spectrometry: status and perspectives.

Authors: Maud M Koek; Renger H Jellema; Jan van der Greef; Albert C Tas; Thomas Hankemeier
Journal: Metabolomics Date: 2010-11-16 Impact factor: 4.290

8 in total