
LipidSig: a web-based tool for lipidomic data analysis.

Wen-Jen Lin1, Pei-Chun Shen2, Hsiu-Cheng Liu2, Yi-Chun Cho2, Min-Kung Hsu2, I-Chen Lin1, Fang-Hsin Chen3,4,5, Juan-Cheng Yang6, Wen-Lung Ma1, Wei-Chung Cheng1,2,7.   

Abstract

With the continuing rise of lipidomic studies, there is an urgent need for a useful and comprehensive tool to facilitate lipidomic data analysis. The most important features making lipids different from general metabolites are their various characteristics, including their lipid classes, double bonds, chain lengths, etc. Based on these characteristics, lipid species can be classified into different categories and, more interestingly, exert specific biological functions in a group. In an effort to simplify lipidomic analysis workflows and enhance the exploration of lipid characteristics, we have developed a highly flexible and user-friendly web server called LipidSig. It consists of five sections, namely, Profiling, Differential Expression, Correlation, Network and Machine Learning, and evaluates lipid effects on cellular or disease phenotypes. One of the specialties of LipidSig is the conversion between lipid species and characteristics according to a user-defined characteristics table. This function allows for efficient data mining for both individual lipids and subgroups of characteristics. To expand the server's practical utility, we also provide analyses focusing on fatty acid properties and multiple characteristics. In summary, LipidSig is expected to help users identify significant lipid-related features and to advance the field of lipid biology. The LipidSig webserver is freely available at http://chenglab.cmu.edu.tw/lipidsig.
© The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2021        PMID: 34048582      PMCID: PMC8262718          DOI: 10.1093/nar/gkab419

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Cells are composed, in part, of a diverse set of functional lipids with different backbones, head groups, fatty acid linkages and carbon chain compositions (1). Beyond their roles as building blocks and energy sources, lipids can mediate signal transduction (2), protein modification (3) and cell death (4), or constitute dynamic assemblies such as lipid rafts (5), lipoproteins (6) and lipid droplets (7). Additionally, disturbances in lipid homeostasis have been linked to many disorders, including cardiovascular disease (8), obesity (9), diabetes (10) and cancer (11). Given their central role in physiological and pathological conditions, understanding the diversity and composition of lipids helps us explore the potential biological functions underlying those conditions. With the advances in analytical chemistry and shotgun liquid chromatography–tandem mass spectrometry (LC–MS/MS), researchers can now detect hundreds to thousands of lipid species and assess their phenotype associations, which has given rise to a new field called lipidomics (12,13). Although lipidomics is one of the newer omics technologies, it has been widely applied in biomedical research, agriculture, and food-related industries (14). An exponential increase in lipidomics-related studies has also been observed among PubMed publications (15). Like genomics and metabolomics, lipidomic profiling yields high-dimensional data, but the relative scarcity of analysis tools remains an obstacle to its further development. Moreover, in contrast with other omics, each lipid species in lipidomic data can be grouped into distinct categories based on a diverse set of characteristics including its lipid class, shape, chain length, degree of unsaturation, hydroxyl groups, and fatty acid composition (1). These characteristics have been shown to dramatically affect cellular function. 
For example, cell membranes with a higher degree of unsaturation (more double bonds) promote Akt clustering and activation to potentiate osteogenic differentiation (16), while LPCAT1 enhances the expression of saturated phosphatidylcholine (PC), which is required for EGFR signalling and glioblastoma survival (17). Another example is that human cytomegalovirus has been found to manipulate target cells to synthesize very-long-chain fatty acids (VLCFAs) for its envelope by inducing the production of elongase enzyme 7 (ELOVL7) (18). Recently, polyunsaturated phospholipids with hydroperoxyl or hydroxyl groups have also drawn considerable research attention due to their causal relationship with ferroptosis (19,20). Hence, analyses of lipidomics data need to consider not only lipid species but also lipid characteristics. A range of web servers and software packages have been proposed to deal with lipidomic data (21). Of these, Lipid Data Analyzer (22), LipidBlast (23), LipidHunter (24) and LipidMatch (25) are designed to identify and quantify lipids from mass spectrometry (MS) data. Only three tools, lipidr (21), ALEX (26), and Lipid Mini-On (27), are able to perform lipid characteristics analysis. However, some of these have strict formatting requirements for lipid names, and thus cannot process data from all MS platforms. The others provide profiling functions to visualize changes in different characteristics but lack statistical analysis tools. More importantly, they make it difficult to undertake characteristics analyses beyond lipid class, chain length, and double bonds. There is, therefore, an urgent need to develop a robust tool that can easily analyze lipidomic data based on lipid-specific characteristics, and interrogate their associations with underlying cellular mechanisms. Here, we present LipidSig, the first web-based platform that provides an integrated, comprehensive analysis for streamlined data mining of lipidomic datasets. 
LipidSig gives users more flexibility in input data format, and allows them to define a variety of characteristics for exploration. We provide five main functions, namely Profiling, Differential Expression, Correlation, Network, and Machine Learning, to evaluate the importance of lipid species or characteristics in different experimental groups. To the best of our knowledge, LipidSig is the first tool to fill the gaps in the lipid-specific analysis essential for lipid biology. Furthermore, interactive plots with downloadable images and corresponding tables are created to support interpreting from multiple perspectives. Collectively, LipidSig enables users to perform intensive lipid analysis easily and efficiently, and will promote the development of lipidomics.

MATERIALS AND METHODS

Workflow

The LipidSig workflow is illustrated in Figure 1 and consists of four parts: (i) Data upload, (ii) Lipid characteristics transformation, (iii) Data processing, and (iv) Functionality and Visualization. Users can explore the platform through demo datasets, or submit their own files to specify lipid expression, lipid characteristics, group assignment, or confounding variables for multivariate regression according to each analysis section. After the required matrices are uploaded, LipidSig provides two analysis pipelines, one for lipid species and one for lipid characteristics, which can be freely interconverted by a transformation function. This built-in function classifies lipid species from ‘Lipid expression data’ into the categories defined in the ‘Lipid characteristics table’ and sums their expression within each category. Taking Figure 1 as an example, the expression levels of Lipid1 and Lipid2 in sample1 are 20 and 10, respectively. Both lipid species belong to the PC class, so the expression of PC in sample1 is 30. Users are encouraged to employ different data processing strategies in response to data quality and desired methods, making the data suitable for downstream analysis. Missing values due to detection limits can be addressed by missing-value exclusion and by using the imputation options. LipidSig also supports lipid percentage transformation to reduce sample variance. Log transformation and data scaling are offered to improve the performance of subsequent statistical and machine learning analyses. For lipidomic data mining, five functions allow users to profile lipid expression, identify significant lipid features, or construct an informative network. These are useful for analysing lipidomic changes at different resolution levels, and for creating concise visualizations for effective interpretation. Also, LipidSig enables users to select computation methods and determine statistical significance through an easy-to-use interface. A short description of the methods and the examples adopted is given below.
Figure 1.

Workflow of the LipidSig web server. The LipidSig workflow is composed of four steps: (i) data upload, (ii) lipid characteristics transformation, (iii) data processing and (iv) functionality and visualization. Up to four tables can be uploaded, depending on the analysis section. LipidSig provides two analysis pipelines focusing on lipid species or lipid characteristics. The transformation processes between species and characteristics are labelled using different colors. Four data processing strategies enable users to transform the data and make them suitable for subsequent analysis. To identify significant lipid features and explore their relationships, five functions are offered to comprehensively analyse lipidomic changes, and to create effective visualizations.
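
The species-to-characteristics transformation and the lipid percentage transformation described in the Workflow section can be sketched as follows. This is a minimal illustration in plain Python, not the LipidSig implementation: the function names and the nested-dictionary layout are assumptions made for the example, and only the Lipid1/Lipid2 values are taken from the Figure 1 description.

```python
from collections import defaultdict


def sum_by_characteristic(expression, characteristic):
    """Sum species-level expression into characteristic-level expression.

    expression:     {species: {sample: value}}   (hypothetical layout)
    characteristic: {species: category}, e.g. the lipid class of each species
    """
    totals = defaultdict(lambda: defaultdict(float))
    for species, per_sample in expression.items():
        category = characteristic.get(species)
        if category is None:
            # Species without an annotation are skipped in this sketch.
            continue
        for sample, value in per_sample.items():
            totals[category][sample] += value
    return {category: dict(samples) for category, samples in totals.items()}


def to_percentage(expression):
    """Rescale every sample so that its values sum to 100 (lipid percent)."""
    sample_totals = defaultdict(float)
    for per_sample in expression.values():
        for sample, value in per_sample.items():
            sample_totals[sample] += value
    return {
        species: {
            s: (100.0 * v / sample_totals[s] if sample_totals[s] else 0.0)
            for s, v in per_sample.items()
        }
        for species, per_sample in expression.items()
    }


# Values from the Figure 1 description: Lipid1 = 20 and Lipid2 = 10 in sample1.
expression = {"Lipid1": {"sample1": 20.0}, "Lipid2": {"sample1": 10.0}}
lipid_class = {"Lipid1": "PC", "Lipid2": "PC"}

by_class = sum_by_characteristic(expression, lipid_class)
percent = to_percentage(expression)
print(by_class)  # PC in sample1 is the sum of both species
```

Because both species are annotated as PC, the class-level table holds a single PC row whose sample1 value is 30, matching the example in the text.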

Input

Users can use our demo datasets or upload two to four tables for different analysis sections, including ‘Lipid expression data’, ‘Group information’/‘Condition table’, ‘Lipid characteristics table’ and ‘Multivariate adjusted table’. ‘Lipid expression data’ encompasses the expression levels for all lipid species in all samples. The data can derive from absolute quantification (picomole) or relative quantification (lipid percent). ‘Group information’ or ‘Condition table’ assigns the samples into different groups while ‘Lipid characteristics table’ records the descriptions of the characteristics for each lipid species. ‘Multivariate adjusted table’ is used to adjust the confounding effects and only available in the ‘Correlation’ analysis section. LipidSig accepts comma-separated and tab-separated values (CSV, TSV) formats, and each table should meet basic requirements for minimum sample size and lipid count, column specification, variable type, and cell content. Format is checked as soon as users press the ‘Upload’ button, and the results will be shown immediately, as the analyses cannot be conducted if files do not meet the standard. Users are encouraged to replicate the formats from the example datasets available on the LipidSig web server.

Lipid characteristics analysis

The massive degree of structural diversity contributes to lipids’ functional and characteristic variety. Variations can be subtle (chain length and double bond number) or major (head group and backbone). Lipid Characteristics Analysis evaluates the alterations of a group of lipids categorized by one or more lipid characteristics. LipidSig provides an automatic transformation function to convert the expression of lipid species into lipid characteristics. Firstly, lipid species can be summarized into specific characteristics based on the ‘Lipid characteristics table’ (Figure 2). For instance, four lipid species in Figure 2A are classified into three categories, lipids with 0, 1 or 6 double bonds. Using this information, the original ‘Lipid expression data’ (Figure 2B) is then transformed into a new characteristic expression table (Figure 2C). According to user-selected characteristics, the corresponding tables are produced and applied in all analysis sections except ‘Network’. Two further functions are derived from the Lipid Characteristics Analysis: Fatty Acid Analysis and Multi-Characteristics Analysis. Fatty Acid Analysis is a special transformation method because it calculates characteristic expression on the basis of fatty acids instead of on whole lipid species. For example, phosphatidylethanolamine (PE) 16:0:0/22:6:1 is a lipid species belonging to PE class with two fatty acids separated by a slash (Figure 2A). The three numbers denote, respectively, the numbers of carbon atoms, double bonds, and hydroxyl groups. Both the total chain length of 38 and the fatty acid chain lengths of 16 and 22 can be supplied via the ‘Lipid characteristics table’. These two descriptive conventions offer an overall or partial perspective on lipid characteristic change. When declared in fatty acid format, the summations follow a prior decomposition of the lipid species into fatty acid parts. 
In Figure 2D, fatty acid chain length in PE has three categories (fatty acids with 16, 20, or 22 carbons) and their expressions in ctrl2 are 0.6 (0.3 + 0.3), 0.3 and 0.3, respectively.
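
The decomposition step behind Fatty Acid Analysis can be sketched as below. This is a minimal illustration that assumes the ‘carbons:double bonds:hydroxyl groups’ notation described above; the function names are hypothetical, and the second species and both expression values are invented so that the sums reproduce the ctrl2 figures quoted in the text (0.6, 0.3 and 0.3). Each fatty acid is credited with the full expression of its parent species, which is the reading suggested by the ‘0.3 + 0.3’ example.

```python
from collections import defaultdict


def parse_fatty_acids(species_name):
    """Split a name such as 'PE 16:0:0/22:6:1' into class and fatty acids.

    Every fatty acid is written as carbons:double_bonds:hydroxyl_groups.
    """
    lipid_class, chains = species_name.split(" ", 1)
    fatty_acids = []
    for chain in chains.split("/"):
        carbons, double_bonds, hydroxyls = (int(x) for x in chain.split(":"))
        fatty_acids.append({"length": carbons, "db": double_bonds, "oh": hydroxyls})
    return lipid_class, fatty_acids


def fatty_acid_expression(expression, key="length"):
    """Sum the expression of one sample per fatty-acid category."""
    totals = defaultdict(float)
    for species_name, value in expression.items():
        _, fatty_acids = parse_fatty_acids(species_name)
        for fatty_acid in fatty_acids:
            totals[fatty_acid[key]] += value
    return dict(totals)


# Hypothetical ctrl2 values for two PE species (not taken from Figure 2).
ctrl2 = {"PE 16:0:0/22:6:1": 0.3, "PE 16:0:0/20:4:0": 0.3}

by_chain_length = fatty_acid_expression(ctrl2, key="length")
lipid_class, parts = parse_fatty_acids("PE 16:0:0/22:6:1")
total_chain_length = sum(fa["length"] for fa in parts)
print(by_chain_length, total_chain_length)
```

The same parsed record also yields the whole-species view: summing the two chain lengths of PE 16:0:0/22:6:1 gives the total chain length of 38 mentioned above.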
Figure 2.

The characteristics transformation function in Lipid Characteristics Analysis. (A) A diagram showing how to categorize lipid species into different lipid characteristics. (B) Original ‘Lipid expression data’ uploaded by users. (C) A new characteristics expression table for total double bond (Totaldb). (D) An expression table combining two characteristics, specific to ‘Differential Expression’. Font colors in (B) to (D) can be used to track the transformation processes between species and characteristics. Cer, ceramide; PE, phosphatidylethanolamine.

On the other hand, Multi-Characteristics Analysis can be undertaken to explore interactions among numerous characteristics. In ‘Differential Expression’, users may split the data using one particular characteristic before performing computations based on another characteristic. For example, fatty acid chain length can be analyzed specifically within the ceramide (Cer) or PE class (Figure 2D). It is also possible to combine more than two characteristics to build a prediction model in ‘Machine Learning’. Multiple characteristics tables are formed and used as predictor variables to provide additional structural or functional information about the lipidomic data. The important characteristics are then selected and ranked in the resulting model.

MAIN FUNCTIONS OF LipidSig

LipidSig offers five main functions, namely ‘Profiling’, ‘Differential Expression’, ‘Correlation’, ‘Network’ and ‘Machine Learning’ for assessing lipid effects on biological mechanisms. In ‘Profiling’, an overview of comprehensive analyses allows researchers to efficiently examine data quality, clustering of samples, correlation between lipid species, and composition of lipid characteristics. ‘Differential Expression’ integrates many useful lipid-focused analyses, assisting users to identify significant lipid species or lipid characteristics. ‘Correlation’ analysis is designed to illustrate and compare the relationships between different clinical phenotypes and lipid features. The ‘Network’ function constructs (i) lipid metabolism pathways based on the Reactome database and (ii) a pathway enrichment network queried with lipid-related genes. ‘Machine learning’ provides a broad variety of feature selection methods and classifiers to build binary classification models. Subsequent analyses then help users to evaluate the learning algorithm's performance and to explore important lipid-related variables. In general, except ‘Network’, which is generated based on the human database, all analysis sections make use of uploaded lipid expression and lipid characteristics data to compute the results. Hence, there is no limitation to the source of lipidomic data in most analyses. Detailed documentation for each section is given in the Supplementary Methods.

EXAMPLES

Discovery of novel driver lipids for ferroptosis in cancer cells

Ferroptosis is an iron-dependent, non-apoptotic, oxidative form of cell death that is implicated in various diseases including neurodegeneration, ischemic injury in many organs, and cancer resistance (28,29). Ferroptosis is closely related to polyunsaturated phospholipid peroxidation, but the underlying mechanism, for example whether peroxidation affects lipids globally or only specific lipids, is poorly understood. Here, we used the dataset of Zou et al. as an example (30). The expression table for this dataset contains six OVCAR-8 cell samples and 202 lipid species. The species are categorized into different lipid classes in the lipid characteristics table. Through genome-wide CRISPR–Cas9 screens, the authors found that alkylglycerone phosphate synthase (AGPS) was a critical mediator in driving ferroptosis susceptibility. Thus, we separated the samples into control and AGPS knockout groups. In ‘Profiling’, OVCAR-8 cells (n = 3) expressing control sgRNAs (sgNC) or sgRNAs targeting AGPS (sgAGPS) can be perfectly distinguished using PCA on lipidomic data (Figure 3A). In addition, our characteristics profiling reveals an obvious change in lipid class composition, especially the reduction of ether-linked PC (PC O–) and PE (PE O–) in the AGPS knockout group (Supplementary Figure S1A). Significant lipid species were then further explored using differential expression analysis. In Figure 3B, the volcano plot shows statistical significance versus magnitude of fold change for each lipid species. LipidSig also provides further detail: we obtained detailed statistical information as well as characteristic distributions that emphasize the importance of ether lipids (Supplementary Figure S1B and S1C). Likewise, hierarchical clustering (Figure 3C) and enrichment by over-representation analysis (Figure 3D) clearly revealed that significantly down-regulated lipid species were enriched in the PC O– and PE O– categories. 
The expression of different lipid classes can be compared through Lipid Characteristics Analysis, and used to corroborate the results from the preceding evaluations (Supplementary Figure S1D). In addition, ether lipid metabolism was ranked as the top hit in KEGG pathway enrichment analysis, queried with PC O– and PE O– related genes (Figure 3E). These data indicated that depletion of AGPS remodeled the intracellular lipidome, especially the biosynthesis of ether lipids, which might be critical to the pro-ferroptotic status. Taken together, LipidSig can not only identify differentially expressed lipid species but also offer more practical functions to associate them with specific lipid characteristics and potential metabolic pathways.
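
To make the over-representation analysis by lipid class more concrete, the sketch below computes an upper-tail hypergeometric probability, which is one common way to score such enrichment. The text does not state which test LipidSig applies, so the choice of test, the function name and all counts other than the 202 species are assumptions made purely for illustration.

```python
from math import comb


def ora_p_value(n_total, n_in_class, n_significant, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_total:       all lipid species in the dataset
    n_in_class:    species annotated to the class of interest
    n_significant: species called significant
    n_overlap:     significant species that belong to the class
    """
    max_overlap = min(n_in_class, n_significant)
    denominator = comb(n_total, n_significant)
    numerator = sum(
        comb(n_in_class, k) * comb(n_total - n_in_class, n_significant - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return numerator / denominator


# 202 species as in the ferroptosis dataset; the other counts are hypothetical.
p_enrichment = ora_p_value(n_total=202, n_in_class=20, n_significant=30, n_overlap=12)
print(p_enrichment)
```

A small probability indicates that the class contains more significant species than expected by chance; a bar chart such as Figure 3D would then plot –log10 of this value per class.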
Figure 3.

An example of LipidSig being used to identify critical lipids driving ferroptosis in OVCAR-8 cells. (A) The PCA plot of the lipidome in OVCAR-8 cells expressing control sgRNAs (sgNC) and sgRNAs targeting AGPS (sgAGPS). (B) An interactive volcano plot showing the differentially expressed lipid species of OVCAR-8 cells expressing sgNC or sgAGPS. n = 3 biological replicates. Two-tailed Student's t-tests with the Benjamini–Hochberg correction method were used to calculate the p-values. (C) Heatmap of hierarchical clustering for significant lipid species (P < 0.05) with sample group labels at the top. (D) Enrichment analysis of significant lipid species was performed using over-representation analysis based on lipid class. Bars indicate –log10(P-value). (E) Enrichment network built from KEGG pathway analysis presents the significantly altered pathways (P < 0.05) associated with PC O– and PE O– related genes. Nodes are filled according to –log10(P-value) and their sizes represent the number of lipid-related genes involved in the pathway. Line width indicates the value of gene similarity between the pathways. PC O–, ether-linked phosphatidylcholine; PE O–, ether-linked phosphatidylethanolamine.
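
The Benjamini–Hochberg correction named in the caption can be sketched as follows. This is the standard step-up procedure written in plain Python for illustration; the function name and the example p-values are hypothetical and are not taken from the dataset.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four lipid species.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [q < 0.05 for q in adjusted]
print(adjusted, significant)
```

In a volcano plot, the adjusted value for each lipid species would be shown as –log10 on the vertical axis against its fold change on the horizontal axis.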

Connecting lipid characteristics with biological function

Lipid Characteristics Analysis is one of the important features in LipidSig. It can be carried out using our transformation function to build a new characteristics expression table for further analysis (Figure 2). In ‘Differential Expression’, users can focus on one specific characteristic, or a combination of two characteristics, to assess cellular phenotypes or disease associations. We collected two examples to demonstrate this application in lipid unsaturation (double bond) and chain-length analysis. The first case studies mammalian membrane homeostasis in response to dietary lipid perturbations (31). The expression table for the dataset includes 390 lipid species in six mice fed with fish oil or corn oil. All lipid species are categorized in the lipid characteristics table according to the total number of double bonds, the number of fatty acid double bonds, and the fatty acid chain length. Using LipidSig, we obtained a lipid unsaturation profile almost identical to the one reported in the paper. The fish oil (FO) diet, enriched with eicosapentaenoic acid (EPA; 20:5) and docosahexaenoic acid (DHA; 22:6), led to a striking increase of glycerophospholipids (GPLs) with six double bonds in mouse cardiac tissue compared to the corn oil (CO) diet (Figure 4A). This response can be compensated by the induction of saturated GPLs and the reduction of other polyunsaturated GPLs (2–5 double bonds). Using the Fatty Acid Analysis, we found an expected increase in fatty acids with 22 carbons and six double bonds, corresponding to DHA (22:6) (Supplementary Figure S2A and B). Owing to LipidSig's flexible handling of user-defined lipid characteristics, a significant incorporation of DHA into the membrane can also be seen in the FO-fed group (Supplementary Figure S2C).
Figure 4.

Applications of lipid characteristics analysis. (A) Lipid unsaturation (double bond) profile in membrane glycerophospholipids (GPLs) isolated from mouse cardiac tissue treated with corn oil (CO) diet versus fish oil (FO) diet. n = 3 biological replicates. (B) Lipid chain trend plot in ceramide (Cer) from murine pancreatic tissue after feeding with fenofibrate. n = 4 biological replicates. (C) The concentration-weighted average chain length of Cer is significantly up-regulated upon fenofibrate treatment. (D) Receiver operating characteristic (ROC) curve and area under curve (AUC) for machine learning models with different number of features. (E) Feature importance of the top 10 contributing features under SHAP (SHapley Additive exPlanations) analysis in the best ten-feature model. (F) Pearson correlation network of the lipid predictors in the best ten-feature model. Nodes are filled according to SHAP feature importance and their colors indicate direction of impact. Line width indicates the correlation coefficients, with purple for negative correlation and orange for positive correlation. Two-tailed Student's t-tests were used to calculate the p-values in (A) to (C). *P < 0.05, **P < 0.01, ***P < 0.001. totaldb, total double bond.

The second example examines the effects of fenofibrate, a potential drug for type-1 diabetes, on the mouse pancreatic lipidome (32). One hundred ninety-six lipid species with lipid class and chain length assignments are used to analyze the lipidomic alterations in the pancreas from four control and four fenofibrate-treated mice. Our Multi-Characteristics Analysis provides a summary table that clearly highlights significant chain length categories in the Cer class (Supplementary Figure S2D). On the trend plot, a chain length shift from C16 to C24 Cer is uncovered in the fenofibrate-treated group, as previously described (Figure 4B). 
C16 Cer is associated with apoptosis and insulin resistance (33,34), which might explain fenofibrate's anti-diabetic effect. In addition, we calculate the concentration-weighted average index to reflect an overall change in chain length (31). In brief, ‘Lipid expression data’ is transformed into a new expression table for chain length. Then, each chain length is multiplied by its proportion and the products are summed to obtain the index. In LipidSig, an increased concentration-weighted average chain-length index in Cer can also be detected after feeding with fenofibrate (Figure 4C). Overall, our Lipid Characteristics Analysis in ‘Differential Expression’ unravels the impact of one or two characteristics and links their changes to different biological functions.
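
The concentration-weighted average chain-length index described above (each chain length multiplied by its proportion, then summed) can be sketched as follows. The function name is hypothetical and the expression values are invented to illustrate a shift from C16 towards C24; they are not taken from the fenofibrate dataset.

```python
def weighted_average_chain_length(chain_length_expression):
    """Concentration-weighted average chain length for one sample.

    chain_length_expression: {chain_length: summed expression} for one sample,
    i.e. one column of the chain-length characteristics table.
    """
    total = sum(chain_length_expression.values())
    if total == 0:
        raise ValueError("total expression is zero; the index is undefined")
    # Each chain length is weighted by its proportion of the total expression.
    return sum(
        length * value / total
        for length, value in chain_length_expression.items()
    )


# Hypothetical Cer chain-length tables for one control and one treated sample.
control = {16: 6.0, 24: 4.0}
treated = {16: 2.0, 24: 8.0}

index_control = weighted_average_chain_length(control)
index_treated = weighted_average_chain_length(treated)
print(index_control, index_treated)
```

With these made-up values the index rises from 19.2 to 22.4, the direction of change one would expect when expression moves from C16 to C24 species.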

Lipid-related biomarker exploration

Recently, distinct dysregulation of lipid metabolism has been discovered in many diseases (8–11). Lipidome profiling can provide new insights into specific signatures, making them compelling biomarkers or candidates for investigating causal effects. Here, we introduce two datasets, a lipidomic study for chronic obstructive pulmonary disease (COPD) (35) and an association analysis linking cancer metabolomic alterations to genetic features (36), to illustrate our ‘Correlation’ and ‘Machine Learning’ functions as biomarker explorers. In the COPD dataset, a multivariate linear regression model is applied to associate sphingolipids with clinical subphenotypes of COPD. The dataset detects 69 distinct plasma sphingolipid species in 129 current and former smokers. Three COPD subphenotypes including exacerbations, emphysema, and FEV1/FVC (ratio of the forced expiratory volume during the first second to the forced vital capacity) are recorded in the condition table. Additionally, confounding factors such as age, sex, smoking status, body mass index, and FEV1 are provided in the multivariate-adjusted table. We then build a heatmap to easily compare the lipidomic fingerprints identified in different groups (Supplementary Figure S3A). The most significant negative association is between emphysema and sphingomyelins, consistent with the paper's result. In the other cancer metabolomic research, cell lines with high polyunsaturated lipids have been found more resistant to the knockout of stearoyl-CoA Desaturase (SCD), a key enzyme in monounsaturated fatty acid formation. Therefore, we now aim at predicting cancer cell lines’ sensitivity to SCD knockout using the provided lipidomic data, and corroborating the key unsaturation feature by machine-learning modeling. We extract 89 lipid species from the original metabolite data and build the lipid characteristics table containing the information about lipid class, double bond number, and chain length. 
A total of 228 cancer cell lines in the first and last quarters according to sensitivity to SCD knockout are selected for further analysis. Lipid species and multiple characteristics (including lipid class, chain length, and double bond) are introduced to separate two groups: sensitive (labelled as 0) and resistant (labelled as 1) cell lines. We apportion the data into training and testing sets, with a ratio of 2:1. A random forest model is adopted to perform feature selection, model training, and performance evaluation using 10 rounds of Monte Carlo cross-validation. The receiver operating characteristic (ROC) and precision-recall (PR) curves reveal that model performance reaches a plateau at ten features, which is supported by the elbow point in the accuracy curve (Figure 4D, Supplementary Figure S3B and S3C). A further feature importance analysis by SHAP (SHapley Additive exPlanations) points out that unsaturation-related variables (lipids with 1, 2, 4, 5 and 6 double bonds, totaldb) account for half of the features in the ten-feature model (Figure 4E). The directions of feature values and Shapley values in the SHAP summary plot show that lipids with 4, 5 and 6 double bonds contributed to SCD knockout resistance, while lipids with one and two double bonds increased the sensitivity. Furthermore, the correlation network of the predictors displays two positive correlation clusters containing polyunsaturated and less-unsaturated (double bond) lipid features, respectively (Figure 4F). In conclusion, ‘Machine Learning’ allows users to introduce multiple lipid characteristics as feature variables, and we highlight the effect of lipid unsaturation on cell lines’ sensitivity to SCD knockout. To summarize, these analyses illustrate the feasibility of using the ‘Correlation’ and ‘Machine Learning’ functions to discover novel lipid biomarkers, made up of species, characteristics, or both.
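
The resampling scheme described above (a 2:1 split into training and testing sets, repeated ten times) can be sketched as follows. This shows only how Monte Carlo cross-validation draws its partitions; the function name, the fixed seed and the absence of stratification by group are assumptions, and the random forest training itself is not shown.

```python
import random


def monte_carlo_splits(n_samples, n_repeats=10, test_fraction=1 / 3, seed=0):
    """Draw repeated random train/test partitions (Monte Carlo cross-validation).

    Returns a list of (train_indices, test_indices) pairs. Every repeat
    reshuffles all samples, so test sets from different repeats may overlap.
    """
    rng = random.Random(seed)
    n_test = round(n_samples * test_fraction)
    indices = list(range(n_samples))
    splits = []
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        test = sorted(shuffled[:n_test])
        train = sorted(shuffled[n_test:])
        splits.append((train, test))
    return splits


# 228 cell lines split 2:1, repeated ten times as described in the text.
splits = monte_carlo_splits(228)
train_indices, test_indices = splits[0]
print(len(splits), len(train_indices), len(test_indices))
```

A model would be fitted on each training partition and scored on the matching test partition, and the ten scores averaged to produce performance curves such as those in Figure 4D.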

DISCUSSION

LipidSig is the first web server dedicated to intensive lipid-focused analyses that give detailed insights into the changes of lipid species and characteristics. It supports robust data processing, data mining, and visualization to functionally interpret complex lipidomic data. We provide a comprehensive analysis workflow to identify significant lipid features in different cellular or clinical phenotypes (Figure 3 and Supplementary Figure S1). Our web server also places great importance on Lipid Characteristics Analysis, designed to investigate the changes of diverse characteristics in different experimental groups. Most existing tools with similar functions require users to follow specific naming rules, for example ‘PC 36:2’ meaning a lipid species belonging to the PC class with a chain length of 36 and two double bonds. They then extract characteristics information from this format. By contrast, LipidSig allows users to upload a characteristics table in which they define the characteristics of each lipid species themselves. The customary format limitation is thus mitigated, allowing the handling of more lipid characteristics, such as class, shape (cone, cylinder), hydroxyl group, chain length, double bond, and fatty acid composition. These characteristics are unique to lipids and usually correspond to specific biological functions. Our transformation function can convert expression data from lipid species to lipid characteristics, and can be applied in many analysis sections. In the ferroptosis case study, Lipid Characteristics Analysis in ‘Profiling’ and ‘Differential Expression’ revealed a clear decrease of the PC O– and PE O– classes in the AGPS knockout group (Supplementary Figure S1A and S1D). The change in lipid class is considered a good indicator of altered lipid metabolism pathways. The enrichment result in ‘Network’ supported this point, indicating that ether lipid metabolism might be affected (Figure 3E). 
Moreover, Lipid Characteristics Analysis also examines the changes of double bond, chain length, and hydroxyl group that constitute fatty acid diversity. In Figure 4B and C, we demonstrated the shift of double bond and chain length between the control and the treatment group. Double bond change represents membrane unsaturation degree or fluidity, and has been connected to signal transduction (16,17). Lipid chain length has been thought to be a measure of fatty acid elongation or oxidation (18,37), but recent findings have revealed that changes in chain length (C16 up and C24 down in ceramide) could lead to insulin resistance in obesity, unravelling a new role for specific chain lengths (Figure 4C) (32,38). Further, owing to the flexibility of characteristics definition in LipidSig, users are able to observe changes in a specific lipid group according to their own experimental requirements. In Supplemental Figure S2C, whether lipids contain DHA was recorded in the ‘Lipid characteristics table’ and calculated. In addition, the accumulation of lipid hydroperoxides on membranes is recognized as the hallmark of ferroptosis (29). A redox-lipidomics analysis can also be performed to evaluate alterations of hydroxyl and hydroperoxyl lipids as ferroptotic signals (Supplemental Figure S2E) (19). Note that Lipid Characteristics Analysis uses a direct add-up function for selected characteristics regardless of data source. Although this method has been widely used in many studies (16–18,31,32), users should consider whether their data are suitable for this process when analyzing certain lipid characteristics. Two unique functions derived from Lipid Characteristics Analysis, Fatty Acid Analysis and Multi-Characteristics Analysis, extend the applications of LipidSig. The first of these provides an analysis focusing on fatty acids instead of lipid species in ‘Differential Expression’.
This allows users to examine different types of polyunsaturated fatty acids (PUFAs) that are stored in lipids and that exert special functions. For instance, compared with a calculation of total double bonds in lipid species, an increase in fatty acids with 6 double bonds was a more reasonable readout after DHA (22:6) supplementation (Figure 4A and Supplemental Figure S2A). PUFA search through Fatty Acid Analysis can also be used to explore the substrate specificity of acyltransferases (39). One example is to study arachidonic acid (AA; 20:4) incorporation by LPCAT3 in intestine and liver cells, which promotes lipoprotein assembly (40). Multi-Characteristics Analysis, in turn, is available in ‘Differential Expression’ and ‘Machine Learning’. In the fenofibrate-treated case, we observed a major change in chain length in the ceramide class (Figure 4B) (32). Another study showed that membrane saturation remodeling by LPCAT1, a lysophosphatidylcholine acyltransferase, happened in the PC class (17). More recently, a similar approach has also been implemented to explore the unique pattern of chain length and desaturation in different lipid classes by probing the mitochondrial lipidome of adipose tissues (41). Thus, a two-characteristics analysis is suited to studies that concentrate on specific reactions or pathways. For working with more than two characteristics, users are encouraged to build different machine learning models to evaluate the importance of lipid-related variables. In Figure 4D–F, we demonstrated how the addition of various lipid characteristics promotes structural or functional resolution of the lipidome and links to specific phenotypes. This method is useful for processing multidimensional lipidomic data, and for discovering compelling biomarkers that can be used in clinical research.
For example, three large population cohort studies have combined machine learning and lipidomics to obtain new insights into complex disease processes including coronary artery disease (8), obesity (9), and type 2 diabetes (10). Moreover, LipidSig users can also access the Human Metabolome Database (HMDB) to connect important lipid predictors with related diseases (42). Additionally, we developed two practical functions, ‘Correlation’ and ‘Network’, to support further analysis. ‘Correlation’ offers a clear heatmap to compare significant lipid features for diverse clinical phenotypes (Supplemental Figure S3A). Lipid Characteristics Analysis can also be performed here to assess the associations of a specific group of lipids. Similar methods have been used to interrogate causal effects of metabolites in nonalcoholic fatty liver disease (NAFLD) (43) and acute-on-chronic liver failure (ACLF) (44). The ‘Network’ page is divided into two parts. The first of these is built from the Reactome database; its purpose is to find connections between different lipids and their regulatory genes. Recently, a web-based tool called BioPAN has also been proposed to explore lipidome metabolic pathways (45). Unlike our Reactome network, BioPAN requires users to upload lipid expression data and introduces a statistical model to identify activated or suppressed lipid pathways based on the ratio of product over reactant. LipidSig provides an overview of the metabolic network and guides users to explore the critical reaction steps. BioPAN, on the other hand, puts more emphasis on the concept of lipid flux and also lists predicted genes that could be involved in the reactions. The other part, the pathway enrichment network, presents significantly altered pathways based on related genes of selected lipids.
This multi-omics approach helped us identify ether lipid metabolism in Figure 3E and was recently adopted to identify a metabolic signature in high-grade bladder cancer (46). In summary, we propose LipidSig as a highly flexible web server that can conduct comprehensive and deep lipidomic analysis. A strong emphasis has been placed on the functionality of Lipid Characteristics Analysis, exemplified by many published works using a wide range of datasets. Our main hope is that LipidSig will bridge fundamental aspects of lipid biology in order to contribute to the broader development of lipidomics. An increasing number of interesting lipidomic datasets focusing on specific diseases (8–10) or subcellular organelles (41,47,48) are now available. These datasets can help users grasp the essence of the lipidomic analysis workflow in LipidSig. We will include them in the demo datasets in our next update.
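The distinction drawn in the Discussion between counting total double bonds per lipid species and counting double bonds per fatty acid can be made concrete with a small sketch. It assumes that each species carries a chain-composition string such as `16:0_22:6` and that every chain is credited with the full abundance of its species; the field format, the example values, and both function names are illustrative assumptions rather than the documented LipidSig behaviour.

```python
from collections import defaultdict


def fatty_acids(fa_field):
    """Split a string like '16:0_22:6' into (chain length, double bonds) pairs."""
    chains = []
    for chain in fa_field.split("_"):
        length, double_bonds = chain.split(":")
        chains.append((int(length), int(double_bonds)))
    return chains


# Hypothetical input: species -> (chain composition, abundance in one sample).
species = {
    "PC 16:0_22:6": ("16:0_22:6", 8.0),
    "PE 18:0_22:6": ("18:0_22:6", 3.0),
    "PC 16:0_18:1": ("16:0_18:1", 5.0),
}


def sum_by_fa_double_bonds(species):
    """Sum abundance per fatty-acid double-bond count.

    Assumption: each chain receives the full abundance of its species.
    """
    totals = defaultdict(float)
    for fa_field, value in species.values():
        for _, double_bonds in fatty_acids(fa_field):
            totals[double_bonds] += value
    return dict(totals)


fa_totals = sum_by_fa_double_bonds(species)
print(fa_totals)
```

At the fatty-acid level, the two DHA-containing species both contribute to the 6-double-bond group, whereas a species-level total would only record the sum of double bonds over both chains.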

DATA AVAILABILITY

LipidSig is freely and publicly available at http://chenglab.cmu.edu.tw/lipidsig. No login is required and detailed documentation is provided on the LipidSig website.
  48 in total

1.  LipidHunter Identifies Phospholipids by High-Throughput Processing of LC-MS and Shotgun Lipidomics Datasets.

Authors:  Zhixu Ni; Georgia Angelidou; Mike Lange; Ralf Hoffmann; Maria Fedorova
Journal:  Anal Chem       Date:  2017-08-08       Impact factor: 6.986

2.  Aging and β3-adrenergic stimulation alter mitochondrial lipidome of adipose tissue.

Authors:  Sona Rajakumari; Simran Srivastava
Journal:  Biochim Biophys Acta Mol Cell Biol Lipids       Date:  2021-03-11       Impact factor: 4.698

3.  Multi-omics Integration Analysis Robustly Predicts High-Grade Patient Survival and Identifies CPT1B Effect on Fatty Acid Metabolism in Bladder Cancer.

Authors:  Venkatrao Vantaku; Jianrong Dong; Chandrashekar R Ambati; Dimuthu Perera; Sri Ramya Donepudi; Chandra Sekhar Amara; Vasanta Putluri; Shiva Shankar Ravi; Matthew J Robertson; Danthasinghe Waduge Badrajee Piyarathna; Mariana Villanueva; Friedrich-Carl von Rundstedt; Balasubramanyam Karanam; Leomar Y Ballester; Martha K Terris; Roni J Bollag; Seth P Lerner; Andrea B Apolo; Hugo Villanueva; MinJae Lee; Andrew G Sikora; Yair Lotan; Arun Sreekumar; Cristian Coarfa; Nagireddy Putluri
Journal:  Clin Cancer Res       Date:  2019-03-07       Impact factor: 12.531

4.  CerS2 haploinsufficiency inhibits β-oxidation and confers susceptibility to diet-induced steatohepatitis and insulin resistance.

Authors:  Suryaprakash Raichur; Siew Tein Wang; Puck Wee Chan; Ying Li; Jianhong Ching; Bhagirath Chaurasia; Shaillay Dogra; Miina K Öhman; Kosuke Takeda; Shigeki Sugii; Yael Pewzner-Jung; Anthony H Futerman; Scott A Summers
Journal:  Cell Metab       Date:  2014-10-07       Impact factor: 27.287

5.  Obesity-induced CerS6-dependent C16:0 ceramide production promotes weight gain and glucose intolerance.

Authors:  Sarah M Turpin; Hayley T Nicholls; Diana M Willmes; Arnaud Mourier; Susanne Brodesser; Claudia M Wunderlich; Jan Mauer; Elaine Xu; Philipp Hammerschmidt; Hella S Brönneke; Aleksandra Trifunovic; Giuseppe LoSasso; F Thomas Wunderlich; Jan-Wilhelm Kornfeld; Matthias Blüher; Martin Krönke; Jens C Brüning
Journal:  Cell Metab       Date:  2014-10-07       Impact factor: 27.287

6.  Plasma Lipidome and Prediction of Type 2 Diabetes in the Population-Based Malmö Diet and Cancer Cohort.

Authors:  Céline Fernandez; Michal A Surma; Christian Klose; Mathias J Gerl; Filip Ottosson; Ulrika Ericson; Nikolay Oskolkov; Marju Ohro-Melander; Kai Simons; Olle Melander
Journal:  Diabetes Care       Date:  2019-12-08       Impact factor: 19.112

Review 7.  Novel advances in shotgun lipidomics for biology and medicine.

Authors:  Miao Wang; Chunyan Wang; Rowland H Han; Xianlin Han
Journal:  Prog Lipid Res       Date:  2015-12-15       Impact factor: 16.195

8.  Fatty acid elongase 7 catalyzes lipidome remodeling essential for human cytomegalovirus replication.

Authors:  John G Purdy; Thomas Shenk; Joshua D Rabinowitz
Journal:  Cell Rep       Date:  2015-02-26       Impact factor: 9.423

9.  HMDB 4.0: the human metabolome database for 2018.

Authors:  David S Wishart; Yannick Djoumbou Feunang; Ana Marcu; An Chi Guo; Kevin Liang; Rosa Vázquez-Fresno; Tanvir Sajed; Daniel Johnson; Carin Li; Naama Karu; Zinat Sayeeda; Elvis Lo; Nazanin Assempour; Mark Berjanskii; Sandeep Singhal; David Arndt; Yonjie Liang; Hasan Badran; Jason Grant; Arnau Serra-Cayuela; Yifeng Liu; Rupa Mandal; Vanessa Neveu; Allison Pon; Craig Knox; Michael Wilson; Claudine Manach; Augustin Scalbert
Journal:  Nucleic Acids Res       Date:  2018-01-04       Impact factor: 16.971

10.  LipidBlast in silico tandem mass spectrometry database for lipid identification.

Authors:  Tobias Kind; Kwang-Hyeon Liu; Do Yup Lee; Brian DeFelice; John K Meissen; Oliver Fiehn
Journal:  Nat Methods       Date:  2013-06-30       Impact factor: 28.547

