Autoregressive parametric modeling combined ANOVA approach for label-free-based cancerous and normal cells discrimination.

Abstract

Label-free methods have received great interest in the field of biological cell characterization because they neither damage the cells nor change their composition. This work takes a close look at the discrimination of cancerous cells from normal cells using a parametric modeling approach. The autoregressive (AR) modeling technique is used to fit the measured optical transmittance profiles of both cancer and normal cells. The intensity of the light transmitted through the cells is affected by their intracellular composition and membrane properties. In this study, four types of cells (lung cancerous and normal, liver cancerous and normal) were suspended in their corresponding media, and their transmission characteristics were collected and processed. The AR coefficients of each cell type were analyzed with the statistical technique called analysis of variance (ANOVA), which identified the significant coefficients. The poles extracted from the significant coefficients resulted in an improved demarcation between normal and cancer cells. These outcomes can be further utilized for cell classification using statistical tools.

Keywords: ANOVA; Autoregressive; Cancer; Cells; Detection; Discrimination; Poles

Year: 2021 PMID: 34036199 PMCID: PMC8134980 DOI: 10.1016/j.heliyon.2021.e07027

Source DB: PubMed Journal: Heliyon ISSN: 2405-8440

Introduction

In the past two decades, extensive research has been carried out to develop various means that aid in the efficient classification of normal and cancer cells [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. Systematic reviews of cancer, its mortality statistics, the current therapeutic strategies, and the pressing need to find novel strategies can be found in the literature [11, 12]. The classification of normal and cancer cells is very important because it aids in the early detection of cancer [13] and thus helps cancer patients to receive better treatment and improve their lifestyles [14]. Conventional cancer screening techniques based on clinical studies are mostly invasive [15]. Furthermore, these techniques require large amounts of samples, biomarkers, antigens, and antibodies [16]. Biomarkers are molecules present in blood, urine, stool, tissues, or other bodily fluids that indicate normal or abnormal processes in the body [17]. Cancer biomarkers include substances produced by the cancer cells or by other cells reacting to the cancer cells in the body [18]. These biomarkers are helpful in detecting and diagnosing cancer cells [19]. However, an important downside of these techniques is that repeated biopsies are required if the false positive rate is high [20, 21]. In addition, precise manual identification of cancer cells based on microscopic biopsy images is a challenging task for pathologists and medical practitioners. As a result, the emphasis on the development of reliable, label-free methods is growing within the research community. Kumar et al. discussed the common features that histopathologists look for when classifying cancer cells and developed a software tool for the automated discrimination of cells. Their work shows how medicine combined with engineering enhances the classification of cells [22].
Label-free methods based on electrical [23, 24, 25], mechanical [26, 27], optical [28, 29, 30], and biochemical [31] cancer cell detection techniques have been reported in the research literature. The analysis of the electrical, mechanical, and optical responses of cells combined with numerical methods is gaining popularity due to its improved efficiency in distinguishing between normal and cancer cells. Several signal processing algorithms such as Prony [32, 33], the matrix pencil method [34], and empirical mode decomposition [35], as well as modeling techniques such as autoregressive (AR) [36], autoregressive moving average (ARMA) [37], and autoregressive integrated moving average (ARIMA) [38], are used to classify normal and cancer cells. A noninvasive method for discriminating cancer cells from normal cells in a corresponding adherent culture has been introduced that utilizes their cell morphology [39]. The analysis of the corresponding intracellular phase-shift distribution data was used to develop a cancer index as an indicator for distinguishing cancer cells from normal cells. In another approach, cancer and normal cells were discriminated based on their biomechanical characteristics [40]. The cellular properties were probed at the single-cell level using atomic force microscopy (AFM). The corresponding elastic properties were measured with properly designed indentation experiments. The extracted Young's modulus, which represents the cellular deformability, can be used quantitatively to distinguish normal from cancer cells. Gas chromatography-mass spectrometry has been used for the metabolomic profiling of cells. With the help of multivariate analysis, a panel of metabolites was revealed that made it possible to discriminate cancerous from normal cells [41]. Electrically, a set of parameters extracted from the measured capacitance-voltage profiles has been used to distinguish cancer from normal cells [42]. 
Normal cells show a greater dielectric constant than their cancerous counterparts from the same tissue. In another study, electrical impedance spectroscopy was employed to classify normal and cancerous mammalian cells. A set of features allowed the samples to be classified as normal or cancerous with 4.5% false positives and no false negatives [43]. The interactions between cell composition and light have been used for the classification and identification of many types of normal cells and cells with abnormalities. Experimental results have shown that cancer cells transmit more light than normal cells taken from the same tissue [44]. Artificial intelligence and deep learning techniques have shown outstanding performance in resolving recognition and classification problems. A deep learning approach combined with an expert-trained data system was a very powerful tool for identifying cancer from gene expression data [45]. Moreover, it can also contribute to understanding the complex nature of cancer based on large public data sets. Analysis of variance (ANOVA) is an efficient statistical tool that helps to account for the effect of a subpopulation on the variability of the total population. This work combines optical methods with an AR modeling technique for enhanced classification of normal and cancer cells. The poles extracted from the AR coefficients for different cells characterize the composition and intrinsic properties of the cells. ANOVA is used to identify the significant AR coefficients and hence to optimize the model order. The distribution of poles in the complex z-plane serves as an efficient tool for enhancing the classification of normal and cancer cells.

Materials and methods

The four types of cells used in this work are listed in Table 1. The cell lines were processed according to the standards established by the American Type Culture Collection (ATCC). The normal cells and their cancerous counterparts from the lung and liver cell lines were cultured in their corresponding culture media. The culture conditions and the experimental setup used to measure the optical transmittance of the cells are described below.

Table 1

List of cell types.

Cell line	Tissue	Cell type
BEAS 2B	Lung	Normal
CC-827	Lung	Cancer
THLE2	Liver	Normal
HEPG2	Liver	Cancer

Cell preparation

Cell suspension for culturing the cells

The suspensions used for cell culturing can be homogeneous or non-homogeneous. The suspensions used in this work were homogeneous, as each contained cells of a single cell line. The homogeneity of the suspension was cross-checked with a confocal fluorescence microscope. The number of cells in each suspension was counted under the microscope using a hemocytometer, and the cell population was adjusted to 10⁷ cells per mL for each type of cell, with a mean error of 5%. The in vitro nourishment that cells require for survival and proliferation differs between cell types. The following sections describe the culture medium for each of the four categories of cells. The cells were maintained in a humidified atmosphere with 5% carbon dioxide (CO2) at 98.6°F (37°C).

Lung cell line

Normal cells (BEAS 2B): According to ATCC guidelines, the culture dish is first coated with a layer of pre-coating mixture. The mixture contains 0.01 mg/mL each of fibronectin and bovine serum albumin and 0.03 mg/mL of bovine collagen, diluted in bronchial epithelial basal medium (BEBM). All chemicals were purchased from Sigma-Aldrich. The BEGM bullet kit (Lonza™ Clonetics™), which contains the vital additives (the gentamycin/amphotericin was discarded), was used for culturing at the first stage. Penicillin (100 units/mL) and streptomycin (100 mg/mL) served as the medium supplements. Trypsinization was done using a solution of ethylenediaminetetraacetic acid (EDTA) (0.53 mM) plus 0.5% polyvinylpyrrolidone (PVP) solution. Cancerous cells (CC-827): For these cells, the ATCC-recommended RPMI 1640 medium (Hyclone™, US) was used. It was supplemented with 10% heat-inactivated fetal bovine serum (FBS), and trypsinization was performed with a solution containing 0.25% trypsin in 0.53 mM EDTA.

Liver cell line

Normal cells (THLE2): The pre-coating mixture applied to the culture plates for THLE2 cells contained 2.9 mg/mL of collagen I and 1 mg/mL each of fibronectin and bovine serum albumin in BEBM (Sigma-Aldrich). The Lonza™ Clonetics™ BEGM bullet kit, with the gentamycin/amphotericin and epinephrine removed and with epidermal growth factor (EGF) (5 ng/mL), phosphoethanolamine (70 ng/mL), and additives, was used as the growth medium for the cells. The same amount of FBS as used for CC-827, along with 1% penicillin-streptomycin (Gibco), served as the media supplements. Trypsinization was carried out with 0.5% trypsin. Cancerous cells (HEPG2): Dulbecco's modified Eagle's medium (DMEM, Hyclone™) was used for the HEPG2 liver cancer cells, which were grown in culture plates with 10% FBS and 1% penicillin-streptomycin as the medium supplements. Trypsinization for these cells was the same as for their normal counterpart.

Optical measurements setup

The setup used in this work to measure the transmitted light intensity is shown in Figure 1. A C11708 MA mini-spectrometer optical sensor (Hamamatsu/Japan, [46]) is placed under the sample holder. A high-precision quartz Suprasil cell (1 mm light path) (Hellma Analytics/Germany, [47]) was used as the sample holder. Each sample suspension was loaded into the cell individually. The output terminals of the sensor board (C11351-02) are connected to the evaluation board (C113451-01) to transfer data to a laptop. The light data are collected and processed with the Hamamatsu mini-spectrometer “HMS Evaluation” software installed on the laptop. A xenon light source (Xenoncorp/USA, [48]) was used to direct light towards the sample. The sensor converts the light that passes through the convex lens and the sample into an electrical signal.

Figure 1

Schematic diagram of the setup: (a) Light source (Xe), (b) Convex lens holder, (c) Sample holder, (d) Image sensor (C11708 MA) + sensor board (C113451-02), (e) Evaluation board (C113451-01) and (f) PC with HMS Evaluation software.

Current approach

In this section, the principles of the AR model are discussed. A set of M discrete data samples can be expressed in the AR(p) model as given in (1) [49]:

$$\sum_{k=0}^{p} a(k)\,x(m-k) = e(m), \qquad a(0) = 1, \quad m = 1, 2, 3, \ldots, M \tag{1}$$

where x(m) is the present output, p represents the order of the model, and a(k) represents the AR coefficients. e(m) represents the random shock or random noise and is assumed to be white Gaussian noise, WN(0, σ²). The all-pole model can be represented in the z domain as follows [50]:

$$H(z) = \frac{X(z)}{E(z)} = \frac{1}{\sum_{k=0}^{p} a(k)\,z^{-k}}$$

A key feature of AR modeling is that it does not have any domain constraints [51]. Any discrete signal, whether in the time or frequency domain, can be modeled using an AR model. In addition, the signal may or may not have transients. Metrics such as prediction accuracy, mean square error (MSE), and final prediction error (FPE) are helpful in evaluating the performance of the model. The discrete dataset was preprocessed to remove noise and trend in the data. This was carried out by data smoothing followed by data detrending.

The FPE estimates the model-fitting error expected when the developed model is used to predict new outputs. The selected model should minimize the FPE, which represents a balance between the number of parameters and the variations. The FPE is thus a measure of the fit of the model to the estimation data, i.e. the model quality. It is defined by the following equation [52]:

$$\mathrm{FPE} = \frac{\mathrm{SSE}}{N}\cdot\frac{1 + d/N}{1 - d/N}$$

where SSE is the sum of squared errors, N is the number of values in the estimation data set, and d is the number of estimated parameters. The mean square error (MSE) measures how closely the predictor tracks the actual data. The MSE is frequently used in the analysis of variance, and is calculated as follows [53]:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2$$

where y_i is the actual data point and ȳ is the average of the actual data set. Squaring the deviations emphasizes the influence of outliers.
The SSE can then be expressed as [52]:

$$\mathrm{SSE} = N \cdot \mathrm{MSE}$$

ANOVA quantifies the contribution of each source to the total variation. In other words, ANOVA allows us to determine the contributions of different factors to the variability in the total dataset. In general, a quantitative response variable is connected to one or more explanatory variables. In such a scenario, where it is necessary to quantify the response variable, ANOVA is the most fitting statistical technique. The F-statistic in ANOVA, which determines the significance of the coefficients, can be expressed mathematically as follows [53]:

$$F = \frac{\mathrm{MSA}}{\mathrm{MSE}}$$

where MSA and MSE are the variances between and within treatments, respectively. In this context they are expressed as follows:

$$\mathrm{MSA} = \frac{\mathrm{SSA}}{k - 1}, \qquad \mathrm{MSE} = \frac{\mathrm{SSE}}{n_T - k}$$

where k is the total number of samples (treatments) being considered and n_T is the total number of observations made for all treatments. SSA is the sum of squared deviations of all treatment means from the grand mean. SSE is the sum of squared deviations of all observations from their respective sample means.
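The fitting step described in this section can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes an ordinary least-squares fit of the AR(p) model under the sign convention x(m) + Σ a(k) x(m−k) = e(m), takes the MSE to be the mean squared one-step prediction residual, and scales it by (1 + d/N)/(1 − d/N) to obtain the FPE. The function names `fit_ar` and `fit_metrics` and the synthetic signal are hypothetical.

```python
import numpy as np


def fit_ar(x, p):
    """Least-squares AR(p) fit under x(m) + sum_k a(k) x(m-k) = e(m).

    Returns the coefficient vector [1, a(1), ..., a(p)] and the residuals e(m).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Column k-1 holds the lagged samples x(m-k) for m = p, ..., n-1.
    lagged = np.column_stack([x[p - k:n - k] for k in range(1, p + 1)])
    target = x[p:]
    # Solve target ≈ -lagged @ a in the least-squares sense.
    a, *_ = np.linalg.lstsq(-lagged, target, rcond=None)
    residuals = target + lagged @ a
    return np.concatenate(([1.0], a)), residuals


def fit_metrics(residuals, d):
    """Return (MSE, FPE) for d estimated parameters (assumed definitions)."""
    n = len(residuals)
    mse = float(np.sum(residuals ** 2)) / n
    fpe = mse * (1 + d / n) / (1 - d / n)
    return mse, fpe


# Hypothetical stand-in for a smoothed, detrended transmittance profile:
# an AR(2) process with a(1) = -1.2 and a(2) = 0.5 driven by white noise.
rng = np.random.default_rng(0)
signal = np.zeros(400)
noise = 1e-3 * rng.standard_normal(400)
for m in range(2, 400):
    signal[m] = 1.2 * signal[m - 1] - 0.5 * signal[m - 2] + noise[m]

coeffs, res = fit_ar(signal, p=2)
mse, fpe = fit_metrics(res, d=2)
print(coeffs, mse, fpe)
```

Under these assumptions the FPE always exceeds the MSE by a factor that grows with the number of parameters d, which is why minimizing it penalizes unnecessarily high model orders.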

Results and discussion

The normal and cancer cells of the two tissues (normal lung, cancerous lung, normal liver, and cancerous liver) were cultured separately. The ATCC guidelines were followed for the sub-culturing and trypsinization of the cells. The cell suspension preparation and culturing were done separately for each cell type. The suspension for each cell type contained 10⁷ cells per mL. The cell count in the suspension of each type of cell was conducted using a hemocytometer with a 5% mean error. After loading the cells in the experimental setup, their transmittance profile was recorded. Fluorescence microscopy images of the normal liver (THLE2) and liver cancer (HEPG2) cells are shown in Figure 2(a) and (c), and the corresponding hematoxylin and eosin (H&E) stained images are shown in Figure 2(b) and (d), respectively.

Figure 2

Fluorescence microscopy images of (a) THLE2 (liver normal) (b) THLE2 stained with H&E (c) HEPG2 (liver cancer) and (d) HEPG2 stained with H&E.

The measured transmission profiles of the four cell types are shown in Figure 3(a)–(d). Table 2 summarizes the prediction accuracy, MSE, and FPE of the order-6 AR model fitted to each type of cell. The model complexity increases in proportion to the model order. The order-6 AR model coefficients obtained for the four types of cells are shown in Table 3.

Figure 3

Measured transmittance response of (a) cancerous liver, (b) normal liver, (c) cancerous lung, and (d) normal lung cells. The response was sampled at a uniform step size of Ws = 2.3 nm.

Table 2

Performance measure of a fitted AR model for the transmittance response of different types of cells for order 6.

Type of cell	Prediction accuracy	MSE	FPE
Normal lung	91.61%	5.72e-08	6.00e-08
Cancerous lung	90.75%	1.13e-07	1.18e-07
Normal liver	93.48%	6.53e-08	6.84e-08
Cancerous liver	99.76%	6.59e-08	6.91e-08

Table 3

Set of extracted AR coefficients for different types of cells.

AR coefficients	Normal lung	Cancerous lung	Normal liver	Cancerous liver
a₁	−0.16	+0.06	−0.13	−0.68
a₂	−0.90	−0.94	−0.96	−0.95
a₃	+0.03	−0.23	−0.08	+0.43
a₄	+0.42	+0.36	+0.46	+0.36
a₅	+0.05	+0.12	+0.13	−0.09
a₆	+0.01	+0.02	+0.06	−0.01

The coefficients take different values for different types of cells, which reflects differences in the composition and intrinsic properties of the cell types. Moreover, cancer cells have coefficient values that differ from those of normal cells from the same tissue, implying variation in their composition, morphology, and intrinsic properties. Table 4 shows the poles extracted for the four types of cells used in this work. The poles can be real-valued or occur as complex conjugate pairs. For instance, the poles P1 and P2 of the liver cancer cells are real and distinct, while their remaining poles (P3–P6) occur as complex conjugate pairs. All the extracted poles of the lung (normal and cancer) cells occur as complex conjugate pairs.

Table 4

Set of poles extracted for different types of cells.

Cell Type	P₁, P₂	P₃, P₄	P₅, P₆
Normal lung	−0.687 ± 0.305i	−0.05 ± 0.16i	0.820 ± 0.32i
Cancerous lung	−0.688 ± 0.309i	−0.15 ± 0.21i	0.800 ± 0.24i
Normal liver	−0.71 ± 0.33i	−0.10 ± 0.32i	0.88 ± 0.32i
Cancerous liver	−0.1 + 0i, 0.35 + 0i	−0.65 ± 0.23i	0.87 ± 0.11i

The scattering of the poles of the normal and cancer cells in the z-plane is shown in Figure 4(a) for the lung and in Figure 4(b) for the liver. The poles of the normal cells are shown as blue dots, whereas the red dots represent the poles of the cancer cells. As shown in Figure 4, the poles of the different cells have different distributions in the z-plane. In addition, the pole distributions of the normal and cancer cells of the same tissue also differ. Any deviation from the pole values of the normal cells indicates the presence of abnormalities.
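The pole extraction can be sketched as follows, assuming the all-pole convention A(z) = 1 + a₁z⁻¹ + … + a₆z⁻⁶, so that the poles are the roots of z⁶ + a₁z⁵ + … + a₆. The coefficient values are those of Table 3; the helper name `ar_poles` is hypothetical. Because the tabulated coefficients are rounded to two decimals, the computed poles should be expected to agree with Table 4 only approximately.

```python
import numpy as np

# Order-6 AR coefficients from Table 3, with a(0) = 1 prepended.
ar_coefficients = {
    "Normal lung":     [1.0, -0.16, -0.90, 0.03, 0.42, 0.05, 0.01],
    "Cancerous lung":  [1.0, 0.06, -0.94, -0.23, 0.36, 0.12, 0.02],
    "Normal liver":    [1.0, -0.13, -0.96, -0.08, 0.46, 0.13, 0.06],
    "Cancerous liver": [1.0, -0.68, -0.95, 0.43, 0.36, -0.09, -0.01],
}


def ar_poles(a):
    """Poles of 1/A(z): roots of z^p + a(1) z^(p-1) + ... + a(p)."""
    return np.roots(a)


poles = {cell: ar_poles(a) for cell, a in ar_coefficients.items()}
for cell, p in poles.items():
    print(cell, np.round(p, 3))
```

Since the coefficients are real, any complex poles returned by this sketch occur in conjugate pairs, which is the structure reported in Table 4.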

Figure 4

Z-plot showing the scattering of the poles of the (a) normal (blue) and cancer (red) lung cells and the (b) normal (blue) and cancer (red) liver cells.

The differences between the transmittance profiles of normal and cancerous cells are attributed to structural changes and to alterations in physiological and biochemical properties, which affect the optical properties of the cells and enable them to be distinguished from each other. These differences in turn affect the AR coefficient values and the locations of the poles in the z-plane.

It is worth mentioning that the measurements were conducted on cells suspended in their corresponding media, and the optical measurement conditions did not harm the cells. The applied light did not affect the intracellular temperature of the cells. Oxygen (O2) dissolved in the media helps the cells to survive. The pH, measured before and after exposure to light, was maintained and was not affected by the light. The temperature of the suspension, measured before and after the optical measurements, was found to be almost the same. The measurements were conducted at room temperature. The cells were suspended in nutrient-rich media to keep them alive and were exposed to light for less than 5 min, which is too short to cause cell death, particularly as the cells remained suspended in the media during the measurements. A cell viability test using trypan blue staining, the most common such test, was used to check the suspension before and after the optical measurements: the percentage of living cells was above 90% before exposure to light and above 85% after the optical measurements.

To reduce redundancy and arrive at a concise AR model, statistical tools such as the N-way ANOVA technique were applied. The ANOVA revealed the significance of the AR coefficients.
The ANOVA technique was applied to the AR coefficients rather than the poles, since the poles were extracted from the coefficients. The coefficients that give the highest mean-square values are the significant AR coefficients. Three coefficients (a1, a3, and a5) out of the six coefficients shown in Table 3 were found to be significant. Hence the order of the AR model is reduced by one degree, which reduces the complexity of the system. The distribution of the new set of reduced poles in the z-plane is plotted in Figure 5.
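The mean-square comparison behind this ANOVA step can be sketched with a one-way F-statistic. This is a simplified stand-in for the N-way ANOVA used in the work, not the authors' procedure. The function name `one_way_anova_f` is hypothetical, and the replicate values below are invented for illustration; only their central values (−0.16 and +0.06 for a₁) come from Table 3.

```python
def one_way_anova_f(groups):
    """Return (MSA, MSE, F) for a list of groups of observations."""
    k = len(groups)                           # number of treatments
    n_total = sum(len(g) for g in groups)     # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-treatment sum of squares (treatment means vs grand mean).
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-treatment sum of squares (observations vs their own mean).
    sse = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    msa = ssa / (k - 1)
    mse = sse / (n_total - k)
    f_stat = msa / mse if mse > 0 else float("inf")
    return msa, mse, f_stat


# Hypothetical replicate estimates of a(1) for two cell types.
a1_normal_lung = [-0.17, -0.16, -0.15]
a1_cancer_lung = [0.05, 0.06, 0.07]
msa, mse, f_stat = one_way_anova_f([a1_normal_lung, a1_cancer_lung])
print(f"MSA={msa:.4f}, MSE={mse:.6f}, F={f_stat:.1f}")
```

A coefficient whose between-treatment mean square is large relative to its within-treatment mean square separates the cell types well, which is the sense in which a coefficient is called significant here.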

Figure 5

Z-plane showing the distribution of the reduced poles of the (a) normal (blue) and cancer (red) lung cells and the (b) normal (blue) and cancer (red) liver cells.

The poles are complex and are defined as σ ± jω. Here, σ is the damping coefficient (real part of the pole) and ω is the resonant pulsation (imaginary part of the pole). The damping and resonance of a pole can be used to determine its quality factor [54]. The pole quality factor (Q) can then be used to discriminate between cancer and normal cells. It can be considered a figure of merit (FOM), suggested here for the first time, that correlates the real part with the imaginary part to develop a discrimination procedure. The locations of the poles are strongly affected by the variations in cell composition between normal and cancer cells. As previously indicated, the changes in the optical response between normal and cancerous cells are primarily due to structural changes and to alterations in physiological and biochemical properties. These changes are reflected in the optical properties of the cells, which enables the cells to be distinguished, and the pole quality factors are affected accordingly. Based on the experimental outcomes, the normal cells have a lower transmittance intensity than the cancerous cells from the same tissue type. The computed pole quality factors for the four kinds of cells under study are shown in Figure 6. Figure 6(a) and (b) show the distributions of the Q-factor for the lung cells (normal in blue, cancer in red) and the liver cells (normal in blue, cancer in red), respectively. Figure 6 reveals that the magnitude of the pole quality factor is higher for the cancer cells than for the normal cells.
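The source cites [54] for the exact expression of the pole quality factor. As a hedged illustration only, the sketch below assumes the common second-order definition Q = |p| / (2|σ|) for a pole p = σ + jω, which may differ from the expression used by the authors. The function name `pole_quality_factor` is hypothetical, and the printed values are illustrative; they are not the Q-factors plotted in Figure 6, which were obtained from the reduced model.

```python
import math


def pole_quality_factor(pole):
    """Q of a pole sigma + j*omega, assuming Q = |pole| / (2 * |sigma|)."""
    sigma = pole.real
    if sigma == 0:
        return math.inf  # undamped pole under this assumed definition
    return abs(pole) / (2 * abs(sigma))


# P5 values from Table 4 (one pole of each complex conjugate pair).
p5 = {
    "Normal lung": complex(0.820, 0.32),
    "Cancerous lung": complex(0.800, 0.24),
    "Normal liver": complex(0.88, 0.32),
    "Cancerous liver": complex(0.87, 0.11),
}
for cell, pole in p5.items():
    print(cell, round(pole_quality_factor(pole), 3))
```

Because Q depends only on the ratio of the pole magnitude to its real part, both members of a conjugate pair yield the same value under this definition.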

Figure 6

Z-plane showing the distribution of the Q-factor of the (a) normal (blue) and cancer (red) lung cells and the (b) normal (blue) and cancer (red) liver cells.

Thus, the proposed approach offers a straightforward and clear discrimination strategy: applied to the optical profiles of various in vitro normal and cancer cell line models, the AR-based data processing procedure demonstrated in this work achieves label-free discrimination between cancerous and normal cells of the same tissue type. Furthermore, the proposed approach can be coupled or integrated with existing techniques and methods to enhance the discrimination between cells from the same tissue. The current method relies on statistical methods that are less expensive than machine-learning-based methods.

Conclusions

This work utilized optical techniques combined with an AR modeling method and statistical techniques for the classification of normal and cancer cells. The approach used in the present work was applied to normal and cancer cells from lung and liver tissues. The AR coefficients of each type of cell were analyzed with the statistical technique ANOVA, which provided the significant coefficients. The poles extracted from the significant coefficients provided an improved demarcation for normal and cancer cells. These outcomes can be further utilized for cell classification using statistical tools.

Declarations

Author contribution statement

Aysha F. AbdulGani: Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper. Mahmoud Al Ahmad: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.

Funding statement

This work was supported by the (31R129).

Data availability statement

Data included in article/supp. material/referenced in article.

Declaration of interests statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.

29 in total

1. Proteinchip(R) surface enhanced laser desorption/ionization (SELDI) mass spectrometry: a novel protein biochip technology for detection of prostate cancer biomarkers in complex protein mixtures.

Authors: G L Wright; L H Cazares; S-M Leung; S Nasim; B-L Adam; T-T Yip; P F Schellhammer; L Gong; A Vlahou
Journal: Prostate Cancer Prostatic Dis Date: 1999-12 Impact factor: 5.554

2. Gold nanoparticles: interesting optical properties and recent applications in cancer diagnostics and therapy. (Review)

3. Unique dielectric properties distinguish stem cells and their differentiated progeny.

4. Classification of cell types using a microfluidic device for mechanical and electrical measurement on single cells.

5. Assay based on electrical impedance spectroscopy to discriminate between normal and cancerous mammalian cells.

6. Optical properties of normal and cancerous human skin in the visible and near-infrared spectral range.

7. Cancer survivor identity and quality of life. (Review)

8. Comparison of needle core biopsy and fine-needle aspiration for diagnostic accuracy in musculoskeletal lesions.

9. Cell stiffness is a biomarker of the metastatic potential of ovarian cancer cells.

10. Discrimination between the human prostate normal and cancer cell exometabolome by GC-MS.