| Literature DB >> 36211379 |
Zhen Li, Qijun Yu, Qingyuan Zhu, Xiaojing Yang, Zhaobin Li, Jie Fu.
Abstract
Evaluation of tumor-host interaction and intratumoral heterogeneity in the tumor microenvironment (TME) is gaining increasing attention in modern cancer therapies because it can reveal unique information about tumor status. As tumor-associated macrophages (TAMs) are the major immune cells infiltrating the TME, a better understanding of TAMs could help elucidate the cellular and molecular mechanisms responsible for cancer development. However, high-dimensional and heterogeneous biological data limit extensive integrative analyses in cancer research. Machine learning algorithms are particularly suitable for oncology data analysis because of their flexibility and scalability across diverse data types and their computational power to learn underlying patterns from massive data sets. Applying machine learning to the analysis of the TME, and especially of the traceable status of TAMs, could deepen our understanding of the role of TAMs in tumor biology. Furthermore, we envision that the promotion of machine learning in this field could revolutionize tumor diagnosis, treatment stratification, and survival prediction in cancer research. In this article, we describe key terms and concepts of machine learning, review the applications of common methods to TAMs, and highlight the challenges and future directions for machine learning in TAM research.
Keywords: artificial intelligence; deep learning; machine learning; tumor microenvironment; tumor-associated macrophages (TAMs)
Year: 2022 PMID: 36211379 PMCID: PMC9538115 DOI: 10.3389/fimmu.2022.985863
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 8.786
M1 and M2 macrophage markers.
| Characteristics | M1 (classical) | M2 (alternative) |
|---|---|---|
| Stimuli | LPS/IFNG/CSF2 | IL4/IL13/CSF1 |
| CDs and MHC | CD68, CD80, CD86, MHC-II | CD68, CD204, CD163, CD206 |
| Cytokines and chemokines | IL1B, IL6, IL12, TNF, IFNG | IL10, VEGFA/C, TGFB1 |
| Non-coding RNAs | miR-125b-2, miR-16, miR-9, lncRNA-PVT1, lncRNA-MEG8, lncRNA-GAS5, miR-155, miR-142-3p, miR-146a, lncRNA-MM2P | miR-375, miR-34a, miR-301a, miR-934, miR-940, let-7b, let-7c, let-7d-5p, miR-19b-3p |
| Others | NOS2, ROS, HMGB1 | PD-1/PD-L1, MMP1/2/9, Arg1 |
Figure 1. Roles of TAMs in tumor progression. TAMs can derive from BMDMs and TRMs. They provide a niche for tumor initiation and development, participate in angiogenesis, promote tumor metastasis, and enhance resistance to chemotherapy, radiotherapy, and immunotherapy. (Created with BioRender.com).
Common terminology and explanations in ML.
| Term | Explanation |
|---|---|
| Artificial intelligence | Artificial intelligence is the capability of a computer to perform tasks that are generally completed by humans because they require human intelligence and cognition. |
| Features | Features are the observable quantities and characteristics across all samples in a data set, either raw or mathematically transformed. |
| Feature selection | Feature selection is the process of selecting the most relevant features for a predictive model; it reduces the computational cost of modeling and can improve model performance. |
| Data augmentation | Data augmentation refers to techniques that increase the diversity of a training set by applying random (but realistic) transformations, such as image rotation, flipping, and scaling. |
| Overfitting | Overfitting refers to a model that performs well on the training data but fails to generalize and perform well on unseen data. |
| Underfitting | Underfitting refers to a model that performs poorly on the training data and also performs poorly on new data. |
| Dimensionality reduction | Dimensionality reduction refers to techniques that reduce the number of variables in a data set to a smaller set of principal components or representative features. |
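Several of these terms can be made concrete in a few lines of code. The following Python sketch (a minimal example using scikit-learn on synthetic data; the sample counts, feature counts, and value of k are illustrative choices, not from the article) demonstrates features, feature selection, and overfitting together: with far more features than samples, a classifier nearly memorizes the training set but generalizes worse, and keeping only the most relevant features narrows that gap.

```python
# Minimal sketch (scikit-learn, synthetic data) of "features",
# "feature selection", and "overfitting" from the table above.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data set: 200 samples x 500 features, only 10 informative
# (mimicking a small cohort profiled on a high-dimensional panel).
X, y = make_classification(n_samples=200, n_features=500,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# With all 500 features the model can memorize noise (overfitting):
# training accuracy is near-perfect while test accuracy lags behind.
full = LogisticRegression(max_iter=5000).fit(X_train, y_train)
print("all features - train:", full.score(X_train, y_train),
      "test:", full.score(X_test, y_test))

# Feature selection keeps the k features most associated with the class,
# reducing computational cost and narrowing the train/test gap.
sel = SelectKBest(f_classif, k=10).fit(X_train, y_train)
slim = LogisticRegression(max_iter=5000).fit(sel.transform(X_train), y_train)
print("10 features  - train:", slim.score(sel.transform(X_train), y_train),
      "test:", slim.score(sel.transform(X_test), y_test))
```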
Common algorithms categorized as supervised or unsupervised learning.
| Supervised Learning | Unsupervised Learning |
|---|---|
| Ordinary Least Square Regression | K-Means |
| Logistic Regression | Principal Component Analysis |
| Least Absolute Shrinkage Selection Operator Regression | Information Maximizing Component |
| Linear Discriminant Analysis | Self-organizing Maps |
| Ridge Regression | Topological Data Analysis |
| Elastic Net Regression | |
| Support Vector Machines | |
| Bayesian Networks | |
| Naïve Bayes Classifiers | |
| Random Forests |
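The distinction between the two columns can be seen in a short sketch (scikit-learn on synthetic data; all parameters are illustrative) that fits one algorithm from each: logistic regression learns from labeled examples, whereas K-Means receives only the unlabeled inputs and discovers groups on its own.

```python
# Minimal sketch contrasting supervised and unsupervised learning.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Two well-separated groups of points with known labels y.
X, y = make_blobs(n_samples=300, centers=2, random_state=0)

# Supervised: the model is fit on labeled pairs (X, y).
clf = LogisticRegression().fit(X, y)
print("supervised accuracy:", clf.score(X, y))

# Unsupervised: K-Means sees only X and groups the points itself;
# its cluster IDs are arbitrary and need not match the labels in y.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("first ten cluster assignments:", km.labels_[:10])
```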
Pros and cons of common machine learning algorithms.
| Algorithm | Pros | Cons |
|---|---|---|
| Support vector machines | Good performance with high-dimensional data; good performance when classes are separable | Slow; cannot deal with overlapping classes; selecting appropriate hyperparameters is essential; the kernel must be chosen carefully |
| Principal component analysis | Reduces overfitting; improves visualization; improves model performance | Independent variables become less interpretable; data standardization is necessary; loses information |
| Naïve Bayes classifiers | Fast prediction; insensitive to irrelevant features; can be used for multi-class prediction; performs well with high-dimensional data; less dependent on data size | The assumed independence of features rarely holds; relatively low prediction accuracy; zero-frequency problem |
| Logistic regression | Simple to implement and interpret; feature scaling is unnecessary; performs well on linearly separable data sets; hyperparameter tuning is unnecessary; fast at classifying unknown records | Assumes linearity between the dependent and independent variables; requires little or no multicollinearity between independent variables; relies heavily on proper presentation of the data |
| Random forests | Reduced error with high accuracy (multiple trees balance the bias-variance trade-off); good performance on imbalanced data sets; handles linear and non-linear relationships well; little impact of outliers; not prone to overfitting; useful for feature selection | Features need to have some predictive power; the trees' predictions need to be uncorrelated; not easily interpretable; computationally intensive for large data sets; black-box nature |
| Decision trees | Normalization or data scaling is unnecessary; can handle huge amounts of data; easy to explain and visualize; automatic feature selection; missing values do not prevent building the tree | Prone to overfitting; a small change in the data can cause a large change in the tree structure; long training time; inadequate for regression and predicting continuous values |
| K-nearest neighbors | Simple to understand and implement; no assumptions about the data; the model evolves as data accumulate; can handle multi-class problems; only one hyperparameter (k) | Slow; poor performance on data sets with many features; scaling is necessary; imbalanced data cause problems; sensitive to outliers; cannot deal with missing values |
| Neural networks | High efficiency; high accuracy; multi-tasking; able to deal with incomplete information; fault tolerant | Hardware dependence; black-box nature; more complex than traditional machine learning algorithms; need large data sets |
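One of the "pros" above, the usefulness of random forests for feature selection, follows from their built-in importance scores. The sketch below (scikit-learn, synthetic data; feature and tree counts are arbitrary choices) ranks features by impurity-based importance.

```python
# Minimal sketch of random forest feature ranking.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# 20 features, of which only 5 carry signal.
X, y = make_classification(n_samples=300, n_features=20,
                           n_informative=5, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances rank features with no extra tooling,
# which is why random forests are often used for feature selection.
top = np.argsort(rf.feature_importances_)[::-1][:5]
print("top feature indices:", top)
print("their importances:", rf.feature_importances_[top])
```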
ML algorithms and their applications in TAMs.
| Authors and Years | Cancer Types | Sample Size | ML Algorithms | Research Purposes | ML Applications |
|---|---|---|---|---|---|
| Chang et al. | Ovarian cancer | 1566 | Cox, LASSO | To construct a macrophage-related prognostic model for ovarian cancer | Identify multiple features related to survival (uni- and multivariate Cox) and construct the macrophage-related prognostic model (LASSO) |
| Rostam et al. | / | / | Orange Data Mining Toolbox | To identify different macrophage functional phenotypes | Auto-identification of phenotypes based on cell size and morphology (Orange) |
| Zhu et al. | Rectal cancer | 46 | SVM | To investigate the role of tumor-infiltrating leukocyte cell composition in the prognosis of radiotherapy for rectal cancer | Classify responsive and non-responsive patients (SVM) |
| Zhang et al. | Glioma | 2405 | NN, SVM, ER, PCA | To investigate the predictive value of monocytes in the immune microenvironment | Validate clustering results (NN, SVM) and calculate the risk scores of patients (ER, PCA) |
| Zhang et al. | Glioma | 2365 | Pamr, NN, SVM, ER, PCA | To build a prognostic model based on the molecular features of TAMs for gliomas | Validate the clustering results (Pamr, SVM, and NN), construct risk scores (ER, PCA), and further validate the clustering results (SVM, NN) |
| Zhang et al. | Prostate cancer | 487 | LASSO, PCA | To build a model predicting the risk of prostate cancer based on immune-related gene-based novel subtypes | Determine the properties of the subtypes (PCA) and build the risk-prediction model (LASSO) |
| Yin et al. | Cervical squamous cell carcinoma | 78 | Cox, LASSO, LR, GMM | To investigate the roles of TAMs in the development of cervical squamous cell carcinoma | Select immune-related genes (univariate Cox and LASSO), construct the risk-score model (multivariate Cox), build a diagnostic signature (LR), and then select the best models (GMM) |
| Yan et al. | Ovarian cancer | 365 | Cox, LASSO, SVM, SVM-RFE | To explore prognostic genes associated with immune infiltration in ovarian cancer | Identify the most valuable genes related to immune infiltration (LASSO, Cox), distinguish two different standards of immune infiltration (SVM), and work out the most valuable variables of immune infiltration (SVM-RFE) |
| Wu et al. | Non-small cell lung cancer | 681 | RF | To develop a macrophage-based immune-related risk-score model for relapse prediction in stage I-III non-small cell lung cancer | Screen robust prognostic markers and construct a risk score to predict disease-free survival (RF) |
| Wei et al. | Gastric cancer | 407 | SVM, LASSO, SVM-RFE | To investigate the effect of various components of the gastric cancer TME and identify mechanisms exhibiting potential therapeutic targets | Minimize the redundancy of features (LASSO) and rank the features (SVM, SVM-RFE) |
| Wang et al. | Lung cancer | 507 | Mask R-CNN | To develop a prognostic model for the prediction of high- and low-risk lung adenocarcinoma | Segment the nuclei of tumor, stroma, lymphocyte, macrophage, karyorrhexis, and red blood cells (Mask R-CNN) |
| Vayrynen et al. | Colorectal cancer | 931 | inForm | To investigate the prognostic role of macrophage polarization in the colorectal cancer microenvironment | Identify macrophages in tumor intraepithelial and stromal regions (inForm) |
| Ugai et al. | Colorectal cancer | 3092 | inForm | To investigate whether the relationship between smoking and colorectal cancer incidence varies depending on macrophage infiltration | Perform tissue category segmentation, cell segmentation, and cell type classification (inForm) |
| Starosolski et al. | Transgenic mouse models of neuroblastoma | 16 | Non-parametric neighborhood component analysis | To investigate whether nanoradiomics can differentiate tumors based on TAM burden | Select radiomic features (non-parametric neighborhood component analysis) |
| Shen et al. | Brain tumor | 3810 | Self-developed deep learning algorithm based on contrastive learning | To stratify brain tumors for better clinical decision-making and prognosis prediction | Distill expression signatures of the transcriptome (DL) |
| Nakamura et al. | Ovarian carcinoma | 1656 | SVM, RF, NN, LDA | To identify relationships between the expression of immune and inflammatory mediators and patient outcomes | Classify ovarian cancer and normal tissue (SVM, RF, and NN) and map high-dimensional input data into a two-dimensional space (LDA) |
| Liang et al. | Various cancers | 9881 | CART, LR, LDA, K-Neighbors Classifier, Gaussian Naive Bayes, SVM | To investigate inflammasome signaling status and clarify its clinical and therapeutic significance | Classify samples and validate gene set enrichment (all six ML methods) |
| Li et al. | Bone-related malignancies | 1675 | RF | To investigate whether a distinct immune infiltrative microenvironment exists in malignant bone-associated tumors and build a model for tumor diagnosis and prognosis | Develop a bone-related tumor differential-diagnosis model (RF) |
| Li et al. | Gliomas | 652 | NN, LSTM, Cox, LASSO, RF | To predict survival and tumor-infiltrating macrophages in gliomas | Extract significant radiomic features to construct a prediction model (NN, LSTM, Cox, LASSO, RF) |
| Kuang et al. | Hodgkin lymphoma | 130 | LASSO, Cox, RF | To investigate potential markers for the diagnosis and prediction of classic Hodgkin lymphoma prognosis | Identify prognostic genes and build a model for prognosis (LASSO, Cox, RF) |
| Hagos et al. | Follicular lymphoma | 32 | ConCORDe-Net | To identify cell phenotypes and the spatial distribution of immune cell subsets in the inter-follicular area of the follicular lymphoma TME | Detect different immune cells within and outside neoplastic follicles (ConCORDe-Net) |
| Guo et al. | Pulmonary sarcomatoid carcinoma | 97 | Cox, RF | To build an immune-based risk-stratification system for prognosis in pulmonary sarcomatoid carcinoma | Construct a predictive model and rank the predictive ability of each variable (Cox, RF) |
| Lange et al. | Uveal melanoma | 64 | HCA, PCA | To study the immune environment and explore whether absolute T-cell quantification and expression profiles can dissect disparate immune components | Reveal cell-specific expression patterns in gene selection (HCA, PCA) |
| Lin et al. | Adamantinomatous craniopharyngioma | 57 | RF, LASSO | To study the molecular immune microenvironment of adamantinomatous craniopharyngioma | Screen diagnostic markers (RF, LASSO) |
Cox, Cox Proportional-hazards Regression; LASSO, Least Absolute Shrinkage and Selection Operator; PCA, Principal Component Analysis; ER, Elastic Regression; SVM, Support Vector Machine; Pamr, Prediction Analysis for Microarrays; LR, Logistic Regression; GMM, Gaussian Mixture Model; LDA, Linear Discriminant Analysis; SVM-RFE, Support Vector Machine Recursive Feature Elimination; NN, Neural Network; DL, Deep Learning; RF, Random Forest; CART, Classification and Regression Trees; ConCORDe-Net, Cell Count Regularized Convolutional Neural Networks; HCA, Hierarchical Cluster Analysis; LSTM, Long Short-Term Memory; MLP, Multi-Layer Perceptron; Weka, Waikato Environment for Knowledge Analysis; ROF, Rudin-Osher-Fatemi.
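Many of the studies above follow the same recipe: screen candidate genes with univariate Cox regression, then fit an L1-penalized (LASSO) Cox model whose nonzero coefficients define a risk score. The sketch below reproduces that recipe on synthetic data using the lifelines package; the gene names, penalty strength, and significance cutoff are illustrative assumptions, not values from any cited study.

```python
# Minimal sketch of the univariate Cox screen + LASSO-Cox risk model
# that recurs in the table above (lifelines, synthetic data).
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n, p = 200, 20
df = pd.DataFrame(rng.normal(size=(n, p)),
                  columns=[f"gene_{i}" for i in range(p)])
# Survival times driven by gene_0; events are crude ~50% censoring.
df["time"] = rng.exponential(scale=np.exp(-0.5 * df["gene_0"]))
df["event"] = rng.integers(0, 2, size=n)

# Step 1: univariate Cox screen - keep genes with p < 0.05.
keep = []
for g in [c for c in df.columns if c.startswith("gene_")]:
    m = CoxPHFitter().fit(df[[g, "time", "event"]], "time", "event")
    if m.summary.loc[g, "p"] < 0.05:
        keep.append(g)
keep = keep or ["gene_0"]  # fallback so the sketch always runs

# Step 2: LASSO-penalized Cox (l1_ratio=1.0 gives a pure L1 penalty);
# the surviving nonzero coefficients define the risk score.
cph = CoxPHFitter(penalizer=0.1, l1_ratio=1.0)
cph.fit(df[keep + ["time", "event"]], duration_col="time", event_col="event")
print(cph.params_)
```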
Figure 2. Basic principles of standard ML algorithms. (A) PCA reduces the dimensionality of a data set consisting of many interrelated variables. The panel illustrates a series of data points that, viewed from another angle, have approximately the same value along one dimension, so the distinction between the points can be represented by a single principal component. (B) Regression analysis determines the relationship between factors and disease outcomes or identifies relevant prognostic factors for diseases. The panel illustrates regression estimating a mathematical formula that relates input variables to the output variable. (C) SVM generates a hyperplane in a higher-dimensional feature space and maximizes the margin between classes to select the best hyperplane, which serves as the decision boundary for classification. (D) An RF model ensembles a large number of small decision trees, each capable of making an individual prediction. (E) Neural networks resemble the connections of neurons and synapses in the human brain. The input data are assigned initial weights and passed through to the output layer for classification; hidden layers tune the weights to minimize the network's prediction error.
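The PCA idea in panel (A) can be reproduced in a few lines: two strongly correlated variables collapse onto a single principal component that captures nearly all of the variance (scikit-learn, synthetic data; the slope and noise level are arbitrary choices).

```python
# Minimal sketch of panel (A): correlated 2-D data reduce to one component.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
x = rng.normal(size=300)
X = np.column_stack([x, 2 * x + 0.1 * rng.normal(size=300)])  # two correlated variables

pca = PCA(n_components=2).fit(X)
# The first component explains nearly all variance, so one dimension
# is enough to represent the distinction between the data points.
print("explained variance ratio:", pca.explained_variance_ratio_)
```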