
Generative Algorithm for Molecular Graphs Uncovers Products of Oil Oxidation.

Yuliia Orlova1, Alessa A Gambardella2, Ivan Kryven3,4, Katrien Keune2, Piet D Iedema1.   

Abstract

The autoxidation of triglyceride (or triacylglycerol, TAG) is a poorly understood complex system. It is known from mass spectrometry measurements that, although initiated by a single molecule, this system involves an abundance of intermediate species and a complex network of reactions. For this reason, the attribution of the mass peaks to exact molecular structures is difficult without additional information about the system. We provide such information using a graph theory-based algorithm. Our algorithm performs an automatic discovery of the chemical reaction network that is responsible for the complexity of the mass spectra in drying oils. This knowledge is then applied to match experimentally measured mass spectra with computationally predicted molecular graphs. We demonstrate this methodology on the autoxidation of triolein as measured by electrospray ionization-mass spectrometry (ESI-MS). Our protocol can be readily applied to investigate other oils and their mixtures.


Year:  2021        PMID: 33615781      PMCID: PMC7988456          DOI: 10.1021/acs.jcim.0c01163

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


Introduction

Modern tools in the field of analytical chemistry are progressively increasing their resolution and are able to generate a manifold of physical and chemical data for the characterization of functional materials.[1,2] The interpretation of such data is becoming a challenge, wherein computational modeling plays a role of growing importance. Hence, the models are becoming of considerable complexity themselves, which challenges their implementation and provides an opportunity for their automated construction. Consequently, the emergent models exploit the benefits of experiments coupled with computational models to understand complex chemical systems. One significant application in this context is the autoxidation of triglyceride, which is relevant in food chemistry,[3,4] cosmetics,[5] biofuels,[6] and cultural heritage.[7−10] Triglycerides are formed from one glycerol molecule and three fatty acids (palmitic, oleic, linoleic, linolenic) in various ratios. Fatty acid chains may differ by the number of unsaturations, which are responsible for a chain of autoxidation reactions that lead to the formation of irregular macromolecular structures. The resulting macromolecules are composed of a wide range of monomer units bearing multiple types and numbers of cross-links. Meanwhile, the esters between the glycerol and fatty acids undergo hydrolysis releasing free fatty acids into the system. Hydrolysis, along with oxidative scission and peroxide decomposition, is responsible for the eventual breakdown of macromolecules. Several experimental studies of oil autoxidation analyze the changes of relative amounts of functional groups in oil samples. Muizebelt et al.[11] reported the evolution of various functional groups, such as bis-allylic carbons, conjugated double bonds, epoxides, acids, and cross-links, which characterize the drying process of ethyl linoleate. 
Work by Oakley et al.[12] and Fjällström et al.[13] reported volatile aldehydes emission during the autoxidation of linoleic acid. Drier influence on autoxidation of oils was studied by Oyman et al.,[14,15] van Gorkum et al.,[16] Bouwman et al.,[17] Mallégol et al.,[18] and Spier et al.[19] On the theoretical side, the experimental studies are supported by the detailed exploration of reaction mechanisms utilizing quantum mechanics calculations.[12,20] In addition, studies by Oakley et al. and Iedema et al.[21−23] present detailed mathematical models, which describe the concentration profiles of functional groups characteristic to the drying of ethyl linoleate. Both interpretation of experimental measurements and computational modeling techniques will benefit from knowledge of the explicit reaction mechanism describing all intermediate and product molecular species. Mass spectrometry (MS) provides more detailed information about a measured sample than spectroscopic techniques, such as Fourier-transform infrared spectroscopy (FTIR). Generating a wealth of information about the molecular species present in samples, direct MS measurements used in autoxidation of oils are often difficult to interpret due to: (1) fragmentation of the molecules that occurs during the measurement and (2) the possibility for multiple species of equal or very similar molecular masses to contribute to the same peak.[24,25] These two points increase the complexity of peak assignments, although the fragmentation of the molecules may be mitigated in some instances with soft ionization [e.g., electrospray ionization-mass spectrometry (ESI-MS)].[26,27] Although the peak assignment task has been assisted with tandem MS[26,27] or application of the Kendrick mass defect technique,[28,29] high time consumption, methodological challenges, and unanswered questions still remain. Therefore, direct MS interpretation can benefit from the assistance of computational modeling. 
Grimme et al.[30,31] describe how to handle fragmentation of a single small molecule by simulating the conditions inside the instrument using quantum-chemical calculations. However, this approach is not yet applicable in the case of large or complex molecules, as the spectrum of triglyceride simultaneously features peaks from both intermediate and final products (e.g., of the autoxidation reaction mechanism). Understanding the aging of oil, both computationally and experimentally, requires choosing an appropriate model system that is simpler than linseed oil, while it also allows for the complexities to be added later. We choose triolein (i.e., glycerol trioleate), a triacylglycerol (TAG) that is formed from three symmetrical oleic esters and a glycerol molecule. The shape of the triolein molecule resembles the triglyceride structure of linseed oil, while monounsaturated fatty esters make triolein an ideal model system of olive oil. The reaction scheme of triolein has a large number of intermediate species but only a small number of reaction types. Holding significance for food lipid oxidation studies, triolein thus will serve as a practical benchmark for the model development. The reaction mechanism therefore still produces a significant number of possible extractable products that can be measured with direct MS. With such systems, a global picture is obtained of the reaction processes and changes that happen in the measured sample. To identify a relevant set of possible molecular structures corresponding to peaks in mass spectra following triolein autoxidation, we turn to the automated reaction network generation (ARNG).[32−34] The advent of the ARNG inspired numerous graph theory-based approaches to study reaction pathways of chemical mechanisms[35−38] and retrosynthesis.[39] The ARNG method recursively applies all chemically relevant transformations to all molecular species, which are represented as molecular graphs. 
However, using this algorithmic discovery of reaction steps on large molecules becomes prohibitive because the underlying problem is NP-hard.[40] To overcome this problem, we formulated ARNG for random graphs in Orlova et al.,[34] and, in this paper, we extend this method further to many cross-link types. In our approach, a large irregular macromolecule is viewed as an ensemble of randomly interconnected monomers, which have precise structures and are represented as molecular graphs. In the case of triolein, the macromolecule is broken up into smaller pieces: glycerol with adjacent esters and three oleic chains. Therefore, the computational complexity of the ARNG method is considerably reduced to a large but tractable size. For the purpose of this paper, the result of this ARNG application consists of all chemically distinguishable structures of glycerol and derivatives of oleic esters that result from hydrolysis, oxidation, and β scission reactions. As triolein is not a drying oil, this study focuses solely on oxidative reactions, which are predominant in triolein. Thus, in this paper, the cross-linking reactions are not part of the ARNG implementation. On the experimental side, masses following the gently accelerated drying of triolein in the presence of inert titanium dioxide (rutile) are measured with the use of electrospray ionization-mass spectrometry (ESI-MS). In contrast to previous reports on the MS of oils, where peak assignments were performed manually,[41] this work matches the products generated by the adapted ARNG to the complex mass spectra. This paper is based on Chapter 4 of a PhD thesis.[66]

Methods

Experimental Setup

Paint Model Preparation and Aging

Paint models were prepared using inorganically coated titanium dioxide (rutile form) pigment (Tronox CR-826), which was chosen for its chemical inertness toward linseed oil binding medium compared to other pigments.[42] On an automatic muller comprising two glass plates, triolein (Sigma-Aldrich, glyceryl trioleate ≥97.0% (TLC), CAS No. 122-32-7) was added to the pigment (37.6% w/w in pigment), and both were gently mixed together with a palette knife. The paint was then mixed between the two glass plates by rotating the top plate 25 turns (under a weight of 10 kg) and, at completion, consolidated at the center of the bottom plate using the palette knife; this step was then repeated once. Mixed paint was deposited on a Melinex polyester wrap (thickness, 250 μm, #M026 from ref (43)) and drawn into a thin strip with a drawdown bar (height, 25 μm). The paint film was set to dry flat overnight under lab conditions for ca. 24 h. A snipping was removed and placed in the dark in an acid-free archive box (type C, blue gray, Verenigde Bedrijven, Jansen-Wijsmuller & Beuns B.V.) and left under lab conditions until analysis. The remainder of the film was heated in an oven (preheated) at 70 °C without light for 5 days and, at removal, also added to an acid-free archive box until analysis. In all instances, air was free to circulate and light was only present briefly (< 5 min per sample) during sample handling. All analyses were performed less than a week after completion of heating.

Electrospray Ionization-Mass Spectrometry (ESI-MS)

Direct-injection ESI-MS measurements were performed in both positive and negative modes on solvent extractions from the paint films using a Micromass Q-tof-2 equipped with a nanoprobe and ESI source. For all solutions and cleaning, ethanol (Fisher Scientific, CHROMASOLV, absolute, for HPLC) was used. Extractions were prepared using 0.1–0.5 mg scrapings from the paint film (through the entire thickness) dissolved in ethanol (50 μL) for 1 h, mixed via a vortex for 20 s, and centrifuged (8 G) for 7 min. Following centrifugation to precipitate pigment particles, the supernatant liquid was removed and mixed 1:1 with 20 mM ammonium acetate in ethanol.[41,44] All solutions were handled using a glass syringe and rinsed 10 x with ethanol between samples. The samples were delivered to the Micromass instrument via a Micromass CapLC system using 10 mM ammonium acetate in ethanol as an eluent after calibrating with 0.1% phosphoric acid solution (50:50, DI water/acetonitrile).

Data Processing, Peak Picking, and Matching

Data were collected and processed with MassLynx 4.0 software (Waters). Using standard scientific packages in Python (3.6.5),[45−48] experimental peak positions were then determined applying a maximum filter to the mass spectra (smoothed via a Gaussian filter). Experimental peak positions were compared to exact masses for the most abundant isotope of each computationally determined structure adjusted by the exact mass of each possible ion. The calculations were performed using the exact mass calculator of the MS Online Tools of Scientific Instrument Services (https://www.sisweb.com/referenc/tools/exactmass.htm)[49] with addition of ammonium, hydrogen, or sodium for positive mode and addition of acetate or subtraction of hydrogen for negative mode. A match between experimental peak positions and computational masses was accepted for pairs of peaks within a tolerance of 0.15 m/z, smaller than half the distance between two isotopic peaks.

ARNG

ARNG is an algorithm that automatically discovers reaction pathways. The algorithm starts with an initial set of molecules defined in terms of molecular graphs and a predefined list of reaction templates. The first step is to find the reactive functional group in the initial molecule and apply a corresponding transformation. According to the reaction templates, the functional group of the product comes in the place of the reactive functional group. In this way, a new configuration of each molecule is created. Such new molecules may undergo further transformations. The algorithm stops when no new structures can be produced. The main steps are illustrated in Figure 1.
Figure 1

Block diagram representing the main steps of the ARNG.
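The loop summarized in the block diagram can be sketched as a worklist procedure. The example below is a toy illustration under strong assumptions: a species is a tuple of functional-group labels rather than a molecular graph, the template table `TEMPLATES` is hypothetical and far smaller than the 22 templates used in the paper, and novelty is checked by tuple equality instead of a graph isomorphism search.

```python
from collections import deque

# Hypothetical, heavily simplified templates: each maps one functional-group
# label to the label(s) that may take its place in a product.
TEMPLATES = {
    "allylic_CH": ["alkyl_radical"],             # hydrogen abstraction
    "alkyl_radical": ["peroxy_radical"],         # oxygen addition
    "peroxy_radical": ["hydroperoxide"],         # hydrogen abstraction
    "hydroperoxide": ["alkoxy_radical"],         # hydroperoxide decomposition
    "alkoxy_radical": ["alcohol", "aldehyde"],   # H abstraction or beta scission
}


def generate_network(initial_species, templates):
    """Apply every matching template to every species until nothing new appears."""
    seen = set(initial_species)
    queue = deque(initial_species)
    reactions = []
    while queue:
        species = queue.popleft()
        for i, group in enumerate(species):
            for product_group in templates.get(group, []):
                # The product group replaces the reactive group in place.
                product = species[:i] + (product_group,) + species[i + 1:]
                reactions.append((species, product))
                if product not in seen:
                    seen.add(product)
                    queue.append(product)
    return seen, reactions


species, reactions = generate_network([("allylic_CH",)], TEMPLATES)
print(len(species), "species,", len(reactions), "reactions")
```

The procedure terminates because each species is queued at most once; with real molecular graphs the `product not in seen` test is where the isomorphism check would go.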


Results and Discussion

Representation of a Macromolecule

Since molecular structures derived from triolein can bear several functional groups, manually writing down all of the reaction steps and intermediate products is infeasible due to the large number of combinations in which multifunctional molecules may interact. To set up an automated exploration process yielding such reaction steps,[34] we will first formalize the structures of irregular macromolecules so that they become amenable to a computer algorithm. As illustrated in Figure 2a, we distinguish two hierarchical levels: the macromolecule level (Level 1) and the fragment level (Level 2). When applied to the triolein system, the meaning of this description is illustrated in Figure 2b; both levels are defined in detail after the Figure 2 caption. These two levels are the key concepts for adapting the ARNG to handle large molecules, such as triolein. The algorithm acts separately on molecular graphs of the molecular substructure formed by glycerol with adjacent esters and the substructure formed by oleic chains (Level 2). It reconstructs all possible transformations of these two substructures as they follow hydrolysis and autoxidation reactions, respectively. Then, all possible triolein-derived molecules (Level 1) are reconstructed by joining atoms corresponding to “heads” and “tails” between all configurations of the glycerol part of the molecule after hydrolysis and all states of the three oleic chains after autoxidation reactions. Cross-linking reactions between oleic chains are not considered in this work; including them would further increase the complexity of the resulting macromolecules.
Figure 2

(a) Fragmented representation of the triolein molecule in two levels: Level 1, a large molecule (triolein) is represented as a random graph with edges being ester bonds; Level 2, molecular graphs of glycerol with adjacent esters and three oleic chains, edges are covalent bonds. (b) Functional groups necessary to define the reaction mechanism of triolein autoxidation. (c) Example of a triolein-derivative molecule composed of the following fragments: glycerol with adjacent esters and three carbon chains (previously oleic chains), having carboxyl, hydroxy, and hydroperoxy functional groups. Functional groups present in the molecule are highlighted in the same colors as in (b).

Level 1: The triolein molecule is represented as a random graph[50−52] with different types of nodes (three oleic chains and glycerol unit) and ester bonds as edges (connections between oleic chains and glycerol unit). The random graph representation enables the analysis of the properties of large molecules by studying their fragments, which, in turn, obey the connectivity statistics. At this level, “head” and “tail” atoms are introduced, indicating that the oleic chains and glycerol unit are studied separately, even though they are connected via the head–tail pair.

Level 2: This level describes molecular graphs of the fatty acid chains and the glycerol unit with adjacent esters. On this level, each node of the molecular graph represents an atom. Precise knowledge of the molecular graphs is necessary for the ARNG to decide which chemical compounds can react and what the products of such reactions could be.
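The reconstruction of Level 1 molecules from Level 2 fragments amounts to enumerating every pairing of a glycerol configuration with chain states at the positions whose ester bond is still intact. The sketch below illustrates this under assumptions: the three chain-state labels are hypothetical placeholders for the many oleic-chain derivatives the ARNG produces, and released free fatty acids are not tracked.

```python
from itertools import product


def assemble(glycerol_states, chain_states):
    """Join glycerol configurations ("heads") with chain states ("tails").

    glycerol_states: tuples of three booleans, True where the ester bond
    is intact and False where it has been hydrolysed.
    Returns one tuple per molecule; None marks a hydrolysed position.
    """
    molecules = []
    for esters in glycerol_states:
        slots = [chain_states if intact else [None] for intact in esters]
        for chains in product(*slots):
            molecules.append(chains)
    return molecules


# All hydrolysis patterns of the three ester positions (TAG, DAGs, MAGs, glycerol).
glycerol_states = list(product([True, False], repeat=3))
# Hypothetical chain states: unreacted, hydroxy-bearing, hydroperoxy-bearing.
chain_states = ["oleic", "OH", "OOH"]

molecules = assemble(glycerol_states, chain_states)
print(len(molecules))  # (1 + 3)**3 = 64 positional configurations
```

Positions are kept distinct here, so positional isomers appear as separate entries, which mirrors the way the full set is later reduced by grouping equivalent structures.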

Automatic Discovery of Reactions for Triolein

The organic chemistry of the complete triolein molecule is manually intractable, but at the level of functional groups, the chemistry allows easy manual treatment. We singled out 5 small molecules (oxygen, hydroxide, water, initiator, and initiator radical) and 18 functional groups that are the minimum requirement to describe the autoxidation of the whole system (see Figure 2b). We further describe the interactions between the functional groups and small molecules by 22 reaction templates, as discussed below. A reaction template is a transformation that maps a functional group (subgraph) of a reactant molecule to a functional group of a product molecule. The functional groups and the reaction templates (see Supporting Information Figures S1 and S2 for the complete list) are carefully formulated using known reaction steps from oil autoxidation literature[53−55] (see Figure 3), which are summarized in the paragraph below. The validity of implemented reaction steps is assessed by the ability of the algorithm to form all experimentally reported functional groups, which are typical for the oil oxidation. The reaction templates are formulated using functional groups from Figure 2b and several auxiliary molecular substructures (see Supporting Information Figure S3) that were defined to efficiently represent some reaction products in the ARNG setup. In the current work, the reaction template formulation is done manually; however, there exist methods for the automatic extraction of reaction templates from chemical databases in the context of synthesis planning.[56−58] This can be considered as a promising enrichment of the ARNG methodology in the future.
Figure 3

Reaction pathways of unsaturated double bonds leading to the formation of a wide range of monomers. (Left) Oxidation pathway and (right) β scission pathway leading to four different products.

The diversity of the intermediate and product species derived from triolein is caused by three main reaction pathways: hydrolysis, oxidation of the unsaturated double bond, and β scission of the hydrocarbon chains.[59,60] Under the influence of water, hydrolysis of ester bonds of triglyceride (triacylglycerol, TAG) results in the formation of free fatty acids, diglycerides (diacylglycerol, DAGs), and monoglycerides (monoacylglycerol, MAGs) (see Supporting Information Figure S4). The autoxidation process is initiated at the double bonds of the unsaturated carbon chains. A hydrogen is abstracted from an allylic carbon on the oleic acid tail, forming an alkyl radical that very rapidly reacts with oxygen to form a peroxy radical. These radicals abstract other allylic hydrogens, thus forming hydroperoxides. Decomposition of hydroperoxides results in alkoxy and hydroxy radicals. This reaction happens under the influence of light and chemicals containing transition metals, like cobalt, that are used as drying agents for drying oils such as linseed oil. Two reaction pathways are then possible from the alkoxy radical state: (1) another allylic hydrogen abstraction by the alkoxy radical to form an alcohol or (2) β scission of the C–C bond next to the alkoxy radical. If the scission occurs on the side closer to the glycerol ester, the disconnected fragment remains in the system. However, if the β scission happens on the side closer to the end of the hydrocarbon chain, low-molecular-weight components (aldehydes, ketones) volatilize.[61] Aldehydes that remain in the system undergo hydrogen abstraction, forming an alkyl radical, which again undergoes the oxidation pathway. This pathway leads to the formation of carboxylic acids, which are stable degradation products. 
Furthermore, all of the radicals that appear during the autoxidation mechanism may terminate via recombination reactions and form oligomers. A special type of termination, Russell termination, consumes peroxy radicals and produces ketones, alcohols, and oxygen. As the present work focuses on matching masses of monomers, termination via radical recombination leading to oligomers is not included in the set of reaction templates. With respect to our fragmentation scheme, the input for the ARNG consists of molecular graphs representing the initial species at Level 2 and the above-described reaction mechanism represented as reaction templates. The algorithm then discovers all derivative products by recursively applying the reaction templates. Two ways of visualizing the reaction network for the triolein oxidation mechanism are illustrated in Figure 4a,b. Figure 4a is a visualization of the full reaction network. It is a bipartite network that includes both species and reactions. Species are connected to each other through reactions. An edge coming from a reactant species points toward a reaction node. An edge from a reaction node points toward a product species. Species nodes correspond to big red circles, while the reaction nodes are smaller circles of different colors. Reactions are categorized into initiation, hydrogen abstraction, oxidation, β scission, hydroperoxide decomposition, and Russell termination. Figure 4b shows the bipartite reaction network from Figure 4a projected onto a monopartite one, where nodes correspond to the species and the edges correspond to the transformation from a reactant to a product. Colors correspond to the distance from the initial state of the oleic chain. The essential features of such transformations may also be understood by drawing a parallel with “phylogenetic” trees (see Supporting Information Figure S5). These trees specify all derivatives that could be obtained from a single input species by following the shortest route of transformations. 
Since these species are only the transformed fragments, the transformed complete triglycerides are obtained by connecting all derivatives of heads and tails to obtain molecular graphs of the intermediate and product species generated by the reaction scheme of triolein implemented in ARNG.
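The projection of the bipartite species-reaction network onto a species-only network, and the distance-from-root coloring, can be sketched as follows. The node names in the example are hypothetical and the helper functions are illustrative; they are not taken from the authors' implementation.

```python
from collections import deque


def project(bipartite_edges, reaction_nodes):
    """Collapse species -> reaction -> species paths into species -> species edges."""
    reactants, products = {}, {}
    for src, dst in bipartite_edges:
        if dst in reaction_nodes:      # edge from a reactant into a reaction
            reactants.setdefault(dst, []).append(src)
        elif src in reaction_nodes:    # edge from a reaction to a product
            products.setdefault(src, []).append(dst)
    projected = set()
    for reaction in reaction_nodes:
        for a in reactants.get(reaction, []):
            for b in products.get(reaction, []):
                projected.add((a, b))
    return projected


def distances_from(root, edges):
    """Breadth-first search: fewest transformations from the root to each species."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Hypothetical three-step excerpt of the oxidation pathway.
edges = [
    ("oleic", "r1"), ("r1", "alkyl"),
    ("alkyl", "r2"), ("r2", "peroxy"),
    ("peroxy", "r3"), ("r3", "hydroperoxide"),
]
species_edges = project(edges, {"r1", "r2", "r3"})
print(distances_from("oleic", species_edges))
```

The breadth-first distances correspond to the shortest route of transformations from the initial oleic chain, which is the quantity encoded by color intensity in the projected network.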
Figure 4

(a) Bipartite reaction network of transformations of oleic chain with two types of nodes: species and reactions. Species correspond to large red nodes, and reactions correspond to smaller nodes of different colors. (b) Projected monopartite reaction network of transformations of oleic chain with one type of nodes that corresponds to species. The edges point from a reactant to a product. The root node in the center (circled) is the initial state of the oleic tail. Color intensity indicates the distance from the root. Each node represents a distinct derivative of oleic chain formed via predefined oxidation scheme. See Supporting Information Figure S5 for an extended version of this figure with all molecular graphs.

The complete reconstruction generates 14 045 unique molecular graphs. This number includes numerous repetitions of molecular structures, which are all distinguishable by graph isomorphism search.[62] The ARNG performs this search every time a potentially new molecular graph is generated. The isomorphism search distinguishes all differences in the structure of a molecular graph representing the isomers, while these isomers are identical in the ESI-MS analysis. This is illustrated by the following example. The molecule of triolein is symmetric in terms of its fatty ester composition. Moreover, the reactive site of each fatty acid tail, the double bond, is also symmetric. When triolein undergoes a reaction and is transformed into a product, a functional group may be located on the allylic position on either side of the double bond. Although this leads to the formation of two chemically different molecules, their chemical reactivity is the same according to our assumptions concerning chemical reactivity. Furthermore, as this type of isomerism may happen to each of the three identical fatty esters, six very similar structural isomers result, with identical mass and chemical functionality. 
The ARNG methodology generates these six configurations of molecular graphs as different graphs. This example illustrates that the majority of the 14 045 molecular graphs may be structural isomers, which differ only by the location of a functional group with respect to the double bond and the fatty acid ester group. To reduce this set, the molecules that have the same mass, the same number and type of functional groups, and the same length of fatty esters are grouped together. From each group, a representative molecule (the first molecule generated by the ARNG within each group) is added to the reduced set of molecular graphs. Thus, the set of molecular graphs generated by the ARNG is reduced to 1483 molecules, which are further used for the identification of ESI-MS measured peaks. Importantly, along with this procedure, all masses of the reconstructed molecules may be computed for easy comparison to ESI-MS spectral products of triolein autoxidation, hydrolysis, and β scission. This calculated set of masses is the direct link to the interpretation of the measured MS spectrum.
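The grouping step described above can be sketched as a dictionary keyed on mass, functional-group counts, and chain lengths, keeping the first molecule generated for each key. The record layout (`id`, `mass`, `groups`, `chain_lengths`) is an assumption made for this illustration, and the example entries are invented for demonstration only.

```python
def reduce_isomers(molecules):
    """Keep the first molecule seen for each (mass, groups, chain lengths) key."""
    representatives = {}
    for mol in molecules:
        key = (
            round(mol["mass"], 4),                  # monoisotopic mass
            tuple(sorted(mol["groups"].items())),   # functional-group counts
            tuple(sorted(mol["chain_lengths"])),    # fatty ester lengths
        )
        # setdefault stores a molecule only if the key is new, so the first
        # molecule generated within each group becomes its representative.
        representatives.setdefault(key, mol)
    return list(representatives.values())


# Illustrative records: two positional isomers and one distinct molecule.
molecules = [
    {"id": "A", "mass": 916.7731, "groups": {"OOH": 1}, "chain_lengths": [18, 18, 18]},
    {"id": "B", "mass": 916.7731, "groups": {"OOH": 1}, "chain_lengths": [18, 18, 18]},
    {"id": "C", "mass": 900.7782, "groups": {"OH": 1}, "chain_lengths": [18, 18, 18]},
]
print([m["id"] for m in reduce_isomers(molecules)])  # ['A', 'C']
```

Because the key ignores where a group sits relative to the double bond, positional isomers collapse into one entry, which is the intended effect for comparison against mass spectra.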

Matching Masses of ESI-MS and ARNG

The results from ESI-MS provide information about the masses and relative abundance of various molecules present in the measured sample, including those arising from variations in isotopic compositions of a given molecule. Although direct MS formally measures mass-to-charge ratio, it is assumed, under the experimental conditions applied, that the absolute charge is equal to 1 (see ref (25)). Mass spectra from both positive and negative modes were considered for matching, as each mode has been previously reported to show different sensitivities to oil-based products.[63] For this first proof of concept, consideration of reaction products was limited to those that are most isotopically abundant. Thus, exact masses were calculated for each computationally derived structure (i.e., generated by the ARNG algorithm) assuming the mass of only the most abundant isotope for each element (see the Methods section). To match the peaks in each mass spectrum, the calculated masses were further corrected to match their corresponding ions (with the most abundant isotopes) formed in the ESI-MS: adding the mass of H+, Na+, or NH4+ for positive mode and subtracting the mass of H– or adding the mass of CH3COO– for negative mode. Then, the experimentally measured masses were matched to the calculated ones, allowing a difference of 0.15 m/z between them (see Supporting Information Figure S6). Molecular structures that are generated by the ARNG algorithm and matched to mass spectrum peaks are given in Supporting Information Table S2. See also Supplementary data for raw files.
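The matching rule above can be sketched as a comparison of each experimental peak with every calculated mass shifted by each candidate ion. The ion mass shifts below are standard monoisotopic values supplied for this illustration (the paper obtained its values from an online exact mass calculator), and the function and dictionary names are assumptions. The triolein mass is the monoisotopic mass of C57H104O6.

```python
# Monoisotopic mass shifts of the ions named in the text (assumed values,
# corrected for the electron mass).
ADDUCTS = {
    "positive": {"[M+H]+": 1.007276, "[M+Na]+": 22.989221, "[M+NH4]+": 18.033826},
    "negative": {"[M-H]-": -1.007276, "[M+CH3COO]-": 59.013853},
}


def match_peaks(peaks, structures, mode, tol=0.15):
    """Pair each experimental peak with every structure/ion within tol (m/z)."""
    matches = []
    for peak in peaks:
        for name, mass in structures.items():
            for ion, shift in ADDUCTS[mode].items():
                if abs(peak - (mass + shift)) <= tol:
                    matches.append((peak, name, ion))
    return matches


structures = {"triolein": 884.783291}       # C57H104O6, monoisotopic
peaks = [902.82, 907.70, 600.00]            # invented peak positions
for match in match_peaks(peaks, structures, "positive"):
    print(match)
# (902.82, 'triolein', '[M+NH4]+')
# (907.7, 'triolein', '[M+Na]+')
```

A single peak may match several structure/ion pairs within the tolerance, so the function returns all candidates rather than one assignment per peak.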

Matching Masses of ESI-MS and ARNG Including Oxidation Pathway

We will now illustrate the peak matching procedure aided by the ARNG model. Figure 5 shows the MS measurement on the sample at the early stage of the drying process of triolein measured in positive mode together with the products matched by the ARNG model. Matching is done with the molecular structures generated by the algorithm accounting only for the oxidation pathway that occurs after the allylic hydrogen abstraction (see Figure 3). These peaks can be easily assigned in a manual manner and serve as a proof of concept for our methodology. The beginning of the drying process of triolein is characterized by the oxidation pathway, where the expected reaction products are hydroxides and hydroperoxides, which are known to be stable products under MS measurement conditions.[64] By including only the oxidation pathway in the ARNG methodology, the reaction scheme consists of triolein and all its derivatives containing different number and combinations of [OH] and [OOH] groups on their oleic chains.
Figure 5

Spectrum of extracts from triolein with titanium dioxide at early drying stage measured in positive mode showing products of oxidation pathways. Bold dark blue lines highlight the peaks that were matched with oxidation products for a given ionizing ion.

Further, the methodology is applied to the artificially aged triolein sample that demonstrates a wide variety of measured MS peaks. We will proceed with smaller, cutdown parts of the model that describe only a limited number of reactions and/or species and therefore only match part of the measured MS spectrum. Then, we will gradually increase the number of species and reactions taken into account in the model and observe how the matching with the measured MS spectrum is improved. The result of this model-supported matching exercise is depicted in Figure 6 (positive mode) and Supporting Information Figure S7 (negative mode). The model is first cut down to only the oxidation reactions of MAG, where no more than four products are detectable under the ionization conditions in both modes: pure MAG and MAG containing hydroperoxide, hydroxyl, or carbonyl group. In the spectrum measured in the positive mode, see the upper part of spectrum in Figure 6a, there are seven peaks that correspond to four molecular species with the aforementioned functional groups accounting for two different ions, H– and CH3COO–. In the negative mode, the intensity of these peaks is low relative to other peaks, indicating that MAGs are minor products in the detected sample. More pronounced intensity in the region of the oxidation of MAG is seen in positive mode, indicative of the variable sensitivities of the two modes to different products.[63] Subsequently, we increase the number of species by including DAGs and TAGs as well. In the upper parts of the spectra in Figure 6b,c, we highlight the peaks corresponding to the oxidation of DAGs and TAGs. 
We conclude that the intensity of DAG oxidation products is also rather small, while the oxidation products of TAGs are more pronounced. The abundance of TAGs also implies that this species did not undergo substantial hydrolysis. A similar trend concerning oxidation versus hydrolysis products is observed when regarding the measured and matched peaks in the negative mode (see Supporting Information Figure S7).
Figure 6

Spectra of extracts from triolein aged with titanium dioxide measured in positive mode highlighting products of hydrolysis, oxidation, and β scission. Products of oxidation and β scission pathways of MAG (a), DAG (b), and TAG (c). The dark vertical lines in the upper part of each spectrum highlight the peaks that were matched with oxidation products, and the dark vertical lines on the lower mirrored spectrum highlight the peaks that were matched to the products formed after β scission reaction. The light gray areas in the upper part of each spectrum correspond to the approximate regions for oxidation products, and the light gray areas in the lower part of each spectrum correspond to the approximate regions for β scission products.


Matching Masses of ESI-MS and ARNG Including β Scission Pathway

Next, we extend the reaction scheme of the ARNG model by including the β scission reaction in the set of reaction templates, and the procedure described above is repeated by successively matching peaks for TAG, DAG, and MAG molecules. Peaks that could be matched after introducing β scission are shown in the lower parts of the spectra in Figure a–c. One observes that this reaction gives rise to a large variety of masses of intermediate and product molecular species. Comparing the matched parts of the spectra for oxidation only with the spectra of matched peaks after β scission in Figure indicates that the oxidation products of MAG overlap with the β scission products of DAG. In addition, we see that the oxidation products of DAG strongly overlap with the β scission products of TAG in the spectra. This matching analysis thus reveals the boundaries that apply when attributing products to the various possible reaction pathways. The boundaries for oxidation and β scission products of MAG, DAG, and TAG are summarized in Table .
Figure 7

Regions of mass spectra (both negative and positive modes) highlighting the anticipated products of oxidation and β scission on triolein (TAG) and its products after hydrolysis (MAG and DAG). Peaks matched to the molecular structures generated by the ARNG (dark blue) in the measured mass spectra (light green) of extracts from triolein aged with titanium dioxide. Heights for matched peaks are set to the height of the corresponding measured peaks.

Table 1

Regions of Spectra in Positive and Negative Modes Containing the Anticipated Products of Oxidation and β Scission on Triolein-Derived Species and Their Hydrolysis Products (MAG and DAG)

Species          Oxidation (m/z)      β scission (m/z)

Positive mode
MAG              355.3–404.3          217.2–320.3
DAG              620.6–701.6          344.3–617.6
TAG              899.8–998.8          417.6–914.8

Negative mode
MAG              353.2–445.2          229.1–345.0
DAG              618.4–742.5          342.1–624.4
TAG              911.6–1039.7         489.3–955.6
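
The overlaps between these regions can be checked directly from the tabulated intervals. The short Python sketch below is an illustration only and is not part of the ARNG; it uses the positive-mode values of Table 1, and the names `REGIONS_POSITIVE`, `overlap`, and `overlapping_pairs` are hypothetical.

```python
# Positive-mode regions from Table 1, stored as (low, high) intervals in m/z.
REGIONS_POSITIVE = {
    ("MAG", "oxidation"): (355.3, 404.3),
    ("MAG", "beta_scission"): (217.2, 320.3),
    ("DAG", "oxidation"): (620.6, 701.6),
    ("DAG", "beta_scission"): (344.3, 617.6),
    ("TAG", "oxidation"): (899.8, 998.8),
    ("TAG", "beta_scission"): (417.6, 914.8),
}


def overlap(a, b):
    """Return the interval shared by a and b, or None if they are disjoint."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


def overlapping_pairs(regions):
    """Return {(key1, key2): shared_interval} for every overlapping pair."""
    keys = sorted(regions)
    shared = {}
    for i, first in enumerate(keys):
        for second in keys[i + 1:]:
            interval = overlap(regions[first], regions[second])
            if interval is not None:
                shared[(first, second)] = interval
    return shared


if __name__ == "__main__":
    for pair, interval in overlapping_pairs(REGIONS_POSITIVE).items():
        print(pair, interval)
```

With these intervals, the MAG oxidation region lies entirely inside the DAG β scission region, and the DAG oxidation region lies entirely inside the TAG β scission region, in line with the overlaps described in the text.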

Matching Masses of ESI-MS and ARNG Including All Reaction Pathways

The result of the mass matching for the triolein molecule is shown in Figure (for mass matching per ion in positive and negative modes, see Supporting Information Figure S8). For the sake of clarity, peak matching is shown in the range between 200 and 1100 m/z only. Although this range does not include very small products, the plot nevertheless demonstrates the variety of molecular species present in the material after aging (with the highest possible mass of the monomer of 998.8 m/z in positive mode and 1039.7 m/z in negative mode). The measured peaks that were matched to the calculated molecular structures are highlighted in dark blue, while the remaining measured peaks are shown in light green. Note that the height of each peak corresponds to the measured intensity. The methodology was able to match 56 out of 151 high-intensity peaks (higher than 10% of the intensity of the highest peak in the measured mode) in negative mode and 35 out of 67 in positive mode. Out of 1483 computationally generated molecular graphs, 665 were found in the negative-mode spectrum and 1330 were found in the positive-mode spectrum of the artificially aged sample. High-resolution spectra with matched peaks can be seen in Supporting Information Figures S11 and S12. Although numerous peaks in the measured spectra are matched, there are peaks in positive as well as in negative modes, namely, in the regions of ca. 700–800 m/z and beyond 1000 m/z, that do not match any molecular structure generated by ARNG. The unidentified regions of the spectrum might correspond to dimers formed from relatively low-molecular-mass species or to other reaction pathways that are not included in this study. We did not analyze this in further detail as this work focuses on identifying monomeric reaction products only.
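
The matching step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tolerance value, and the toy peak list are hypothetical, and only the 10% intensity threshold is taken from the text.

```python
from bisect import bisect_left, bisect_right


def high_intensity_peaks(peaks, fraction=0.10):
    """Keep (m/z, intensity) peaks above `fraction` of the tallest peak."""
    if not peaks:
        return []
    threshold = fraction * max(intensity for _, intensity in peaks)
    return [(mz, intensity) for mz, intensity in peaks if intensity > threshold]


def match_peaks(peaks, candidate_mz, tolerance):
    """Map each measured m/z to the candidate ion masses within +/- tolerance."""
    candidates = sorted(candidate_mz)
    matches = {}
    for mz, _ in peaks:
        first = bisect_left(candidates, mz - tolerance)
        last = bisect_right(candidates, mz + tolerance)
        if first < last:
            matches[mz] = candidates[first:last]
    return matches


if __name__ == "__main__":
    # Toy data: (m/z, intensity) pairs and computed ion masses, all invented.
    measured = [(637.57, 100.0), (700.10, 5.0), (899.80, 40.0)]
    computed = [637.47, 637.54, 899.79, 950.00]
    strong = high_intensity_peaks(measured)
    print(match_peaks(strong, computed, tolerance=0.05))  # tolerance is assumed
```

Sorting the candidate masses once and using binary search keeps the lookup cheap even when a peak has to be compared against well over a thousand generated molecular graphs per ionizing ion.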

Automated Identification of Functional Groups

The reconstructed molecular graph of each molecular species contains one or more functional groups from Figure b. An example of a triolein-derivative molecule having carboxy, hydroxy, and hydroperoxy functional groups is shown in Figure c. Functional groups present in the molecule are highlighted in the same colors as in Figure b. This allows grouping of molecules present in the mass spectrum according to their functional groups to assess the overall state of the measured sample. The overall distribution of these groups is given in Figure with three pie charts constructed from matching with ESI-MS results from both negative and positive modes. The three charts represent the relative amounts of (1) chemical classes: acids, aldehydes, alcohols, and hydroperoxides; (2) hydrolysis products: triglycerides, diglycerides, and monoglycerides; and (3) β scission products, namely, the relative amounts of carbon chains (lengths of 7, 8, 10, and 11 carbons) that remain connected to the glycerol. This overview illustrates the different outcomes of the analysis with negative and positive modes, complementing earlier observations[41,63] and offering a detailed quantification of the complex variety of oil-based products. In the same manner, functional groups may be analyzed in spectra; see Supporting Information Figure S9, where the highlighted peaks correspond to the molecular structures containing alcohols and aldehydes. This approach additionally provides the possibility to extract information about any functional group present in the system, to compare samples changing in time, or to explore the behavior of the material in the presence of various additives, serving as only one of the many possible applications of the ARNG–MS combination.
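
How such relative amounts can be tallied once each matched molecule carries a list of functional groups is sketched below in Python. This is an assumption-laden illustration: the name `group_fractions`, the input layout, and the choice to count each group once per molecule are hypothetical, and the counting rule actually used for the pie charts is not specified here.

```python
from collections import Counter


def group_fractions(matched_molecules):
    """Return the relative share of each functional group.

    `matched_molecules` is an iterable of group-name lists, one per matched
    molecule. Each group is counted once per molecule (an assumption).
    """
    counts = Counter()
    for groups in matched_molecules:
        counts.update(set(groups))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {group: number / total for group, number in counts.items()}


if __name__ == "__main__":
    # Invented example: three matched molecules and their functional groups.
    molecules = [["hydroperoxide"], ["alcohol", "aldehyde"], ["alcohol"]]
    print(group_fractions(molecules))
```

The same tally applies unchanged to the other two charts if the labels are replaced by the glyceride class (TAG, DAG, MAG) or by the remaining chain length after β scission.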
Figure 8

Pie charts illustrating relative ratios of various products present in the sample for two different measurement modes: positive and negative accounting for all ionizing ions. Pie charts on the left demonstrate relative ratios of oxidation products contained in the sample: hydroperoxides, alcohols, carboxylic acids, and aldehydes. Pie charts in the middle demonstrate relative ratios of hydrolysis products: triglycerides (triacylglycerol, TAG), diglycerides (DAG), and monoglycerides (MAG). Pie charts on the right illustrate relative ratios of β scission products: β 7, β 8, β 10, and β 11 corresponding to the number of carbons left on the oleic chain (counting from the ester bond).


Identification of Species Contributing to the Same Measured Mass

One final feature of the ARNG-based matching to be mentioned is the ability to reveal various species contributing to the same measured mass. Experimentally, this would require additional analytical techniques and method development, such as chromatographic separation or tandem MS. Figure c,d demonstrates the complexity of the measured sample, showing significant overlaps between the products of different reaction pathways. This implies that some peak intensities may receive contributions from more than one possible structure. Histograms in Supporting Information Figure S13 show the number of identified peaks that are matched to the same number of molecular graphs. One can see that the majority of the peaks are matched to one or two molecular graphs, while the highest number of molecular graphs per peak is 13 in negative mode and 27 in positive mode. As structural isomers have the same molecular formula, their masses are exactly the same. However, the reaction pathways leading to such structural isomers can vary significantly. For example, in the case of triolein, the addition of one hydroperoxy group, or [OOH], may occur on either side of a double bond. In a slightly more complex case, triolein following the addition of one hydroperoxy group ([OOH]) is a structural isomer of triolein following the addition of two hydroxy groups (2[OH]); as each group requires a hydrogen abstraction, the products have the same mass yet are reached through slightly different pathways. Structural isomers may also be formed following different β scission pathways. For example, triolein following β scission on two chains may result in a pair of chains of length 7 and 11 carbons or length 8 and 10 carbons, both of which produce the same mass (see Supporting Information Figure S10). However, species of very similar mass but more distinct structures can also occur. 
In Figure , two states of triolein are shown, which both may contribute to the same mass of 637.57 m/z measured in positive mode, as their exact masses fall in the tolerance interval for peak assignment in our methodology and, more importantly, are not resolvable in the mass spectra (see Supporting Information Figure S6). These molecules have a distinctly different structure: the molecule on the left has one fatty acid hydrolyzed and an added hydroxy group on one of the remaining acids, while the molecule on the right is a result of two β scission reactions occurring on two of its fatty acids resulting in two chains of eight carbons ending in aldehydes (i.e., “β 8” in Figure ). These differences further illustrate how the ARNG–MS approach helps to identify not only relevant products but also the chemical pathways occurring in the oxidation of triolein.
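
The isomer argument above can be verified with elementary bookkeeping of molecular formulas. The following Python sketch is illustrative only: it assumes the formula C57H104O6 for triolein and standard monoisotopic masses, and the helper names `substitute`, `mass`, and `mz_protonated` are hypothetical.

```python
# Monoisotopic masses in u (standard values) and the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647

# Atoms brought in by each functional group.
GROUPS = {"OOH": {"O": 2, "H": 1}, "OH": {"O": 1, "H": 1}}

TRIOLEIN = {"C": 57, "H": 104, "O": 6}  # assumed formula of triolein


def substitute(formula, group, count=1):
    """Replace `count` abstracted hydrogens with `count` copies of `group`."""
    new = dict(formula)
    new["H"] -= count  # one hydrogen abstraction per added group
    for element, number in GROUPS[group].items():
        new[element] = new.get(element, 0) + number * count
    return new


def mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * n for element, n in formula.items())


def mz_protonated(formula):
    """m/z of the singly protonated species [M+H]+."""
    return mass(formula) + PROTON


if __name__ == "__main__":
    one_hydroperoxy = substitute(TRIOLEIN, "OOH")
    two_hydroxy = substitute(TRIOLEIN, "OH", count=2)
    print(one_hydroperoxy == two_hydroxy)  # identical formulas, hence isomers
    print(round(mz_protonated(one_hydroperoxy), 4))
```

Both routes yield the same molecular formula, so no mass resolution can separate them; distinguishing them requires knowledge of the reaction pathway, which is what the generated reaction network provides.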
Figure 9

Example of two different molecules both contributing to the mass of 637.57 m/z measured in positive mode. Calculated masses account for ion H+. The molecule on the left is a diglyceride with hydroxyl on one of its oleic chains. The molecule on the right is a triglyceride after β scission occurring on two oleic chains. For the complete list of matched molecular structures, see Supporting Information Tables S1 and S2.

As it currently stands, the algorithm may identify several molecular structures contributing to the same peak; however, it does not assign probabilities to these molecular structures. This problem can be partially alleviated using kinetic modeling, which provides information about the concentrations of various species at different points in time. Another possibility is to rank the molecular structures corresponding to a single peak by estimating their energies of formation with quantum-chemical calculations. These ideas are worth investigating in the future and are beyond the scope of this paper.

Discussion and Conclusions

This paper addresses the interpretation of complex data from MS using automated reaction discovery. Combining ARNG with the results from ESI-MS allows one to depart from manual peak assignment and enrich the output of this experimental technique by matching molecular structures to the measured masses. This approach was applied to study the aged sample of triolein with monounsaturated oleic chains. Although triolein is considered to be one of the simpler cases of triglyceride autoxidation, it leads to an intertwined scheme of reactions that involves a large number of intermediate and product species, as is illustrated by the wide range of masses measured by ESI-MS. The reaction products from ARNG show good coverage of the area of the identified spectrum. The methodology matched 56 out of 151 abundant species in negative mode and 35 out of 67 in positive mode. Unidentified peaks from ca. 700–800 m/z and beyond 1000 m/z that can be seen in Figure might correspond to dimers (formed by lower-molecular-mass species) or products of other autoxidative reactions, for which we did not account in this work. The methodology can be extended to model polymers by introducing an additional hierarchical level in the representation of a molecule. This implies that one should be able to infer information about the connectivity of the whole polymer network from the fragments given and thus reconstruct dimers, trimers, higher oligomers, etc. In the future, we will exploit this property of our algorithm to compute macroscopic polymer properties, for instance, the average size or the size distribution.

The method can identify products of particular reaction pathways by including and excluding various reactions from the set of reaction templates defined in the ARNG, as was demonstrated with the hydrolysis, oxidation, and β scission reactions. We identified the regions of the mass spectrum corresponding to the anticipated products of particular reaction pathways. Reconstructing explicit molecular graphs using the ARNG provides access to a detailed description of the functional groups attributed to the MS peaks. This information is presented in the form of pie charts that indicate relative amounts of characteristic functional groups present in the matched molecules. Identification of reaction pathways and the pie-chart representation of various functional groups provide global information about a sample as a whole. The model is furthermore able to distinguish between different species with the same or similar molecular mass, thereby assisting in the interpretation of peaks in the mass spectrum with contributions from more than one molecule.

This work is a pilot study, which demonstrates a possible tandem between automated reaction network generation methodology and mass spectrometry. On the experimental side, the availability of ultrahigh-resolution devices (e.g., orbitrap, FT-ICR) would help to validate our peak matching methodology and improve its precision. The coupling of experiments and computational methodology demonstrated in this paper has the potential to complement tandem MS analyses and to be used in instances when high-resolution measurements are not feasible.

The results of this paper are qualitative in nature and can be considered a first step toward large-scale studies of chemical systems with complexity similar to that of triglycerides. The ARNG specifies all possible species in the course of reactions but does not quantify their concentrations. Such a task would require translating the output of the algorithm into a set of ordinary differential equations, a kinetic model having quantitative predictive power. The relative ratios of the concentrations of the individual molecular masses can then be related to the relative ratios of the intensities of the peaks of the corresponding masses. Such information can be used to deduce the (relative) speed of various reactions and ultimately estimate the kinetic rate constants. However, the estimation of the kinetic parameters for the numerous reactions is a formidable task. It has to rely on such concepts as kinetic similarity within “families of reactions”[22,23,55,65] and will further require developing model reduction techniques, which will enable one to carry out the ultimate quantitative validation step with experimental data from MS and other sources.

Finally, the presented two-level hierarchy approach for disassembling a molecule into smaller substructures reduces the amount of computational time needed for the ARNG to reconstruct all possible intermediate and product species. This approach may be used to study various configurations of more complex triglycerides, in which case the algorithm may be applied separately to resolve reaction schemes evolving from different fatty acids: oleic, linoleic, and linolenic. Subsequently, making use of a similar hierarchical fragmentation procedure may allow for the study and reconstruction of the complete set of triglycerides. Thus, many larger systems involving various configurations of fatty acids are within the reach of the current modeling–experiment paradigm. This will ultimately allow one to study the effects of various additives on the aging of oil (e.g., metal-containing pigments in oil-based paints).
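
The translation of a reaction network into ordinary differential equations mentioned above can be sketched for the simplest case of first-order reactions. The Python example below is a toy illustration under strong assumptions: the species labels, the two-step reaction chain, and the rate constants are invented, real autoxidation kinetics are not first order throughout, and explicit Euler integration is used only to keep the sketch dependency-free.

```python
def rates(conc, reactions):
    """Time derivatives for first-order reactions (reactant, product, k)."""
    derivative = {species: 0.0 for species in conc}
    for reactant, product, k in reactions:
        flux = k * conc[reactant]
        derivative[reactant] -= flux
        derivative[product] += flux
    return derivative


def integrate(conc, reactions, dt, steps):
    """Explicit Euler integration; returns the concentrations after `steps`."""
    conc = dict(conc)
    for _ in range(steps):
        derivative = rates(conc, reactions)
        for species in conc:
            conc[species] += dt * derivative[species]
    return conc


# Hypothetical two-step chain with invented rate constants (arbitrary units).
REACTIONS = [("TAG", "TAG-OOH", 0.1), ("TAG-OOH", "TAG-OH", 0.05)]
INITIAL = {"TAG": 1.0, "TAG-OOH": 0.0, "TAG-OH": 0.0}

if __name__ == "__main__":
    final = integrate(INITIAL, REACTIONS, dt=0.01, steps=1000)
    total = sum(final.values())
    # Relative concentrations, to be compared with relative peak intensities.
    print({species: value / total for species, value in final.items()})
```

In a quantitative study, the reaction list would be generated from the network rather than written by hand, and the rate constants would be fitted so that the predicted relative concentrations reproduce the measured relative peak intensities.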
