
A novel cognitive interpretation of breast cancer thermography with complementary learning fuzzy neural memory structure.

Abstract

Early detection of breast cancer is the key to improving the survival rate. The thermogram is a promising front-line screening tool, as it is able to warn women of breast cancer up to 10 years in advance. However, the analysis and interpretation of thermograms depend heavily on the analysts, which may make them inconsistent and error-prone. In order to boost the accuracy of preliminary screening using thermograms without incurring additional financial burden, a Complementary Learning Fuzzy Neural Network (CLFNN), FALCON-AART, is proposed as the Computer-Assisted Intervention (CAI) tool for thermogram analysis. CLFNN is a neuroscience-inspired technique that provides intuitive fuzzy rules, human-like reasoning, and good classification performance. The confluence of thermogram and CLFNN offers a promising tool for fighting breast cancer.


Keywords: Breast cancer diagnosis; Complementary learning; FALCON-AART; Fuzzy adaptive learning control network fuzzy neural network; Thermogram

Year: 2006 PMID: 32288331 PMCID: PMC7126614 DOI: 10.1016/j.eswa.2006.06.012

Source DB: PubMed Journal: Expert Syst Appl ISSN: 0957-4174 Impact factor: 6.954

Introduction

Breast cancer is the second most deadly cancer among women. According to 2005 estimates, 211,240 women are diagnosed with breast cancer and 40,870 die of it in a year (American Cancer Society, 2005). In the United States alone, it was estimated that 1 million women had undetected breast cancer; the number of women affected has since surged to 1.8 million, and 45,000 women die per year (Diakides & Diakides, 2003). This high death rate has stimulated extensive research in breast cancer detection and treatment. Recent studies have determined that the key to breast cancer survival is the earliest possible detection. If the cancer is discovered in its earliest stage, cure rates of 95% are possible (Gautherie, 1999, Pacific Chiropractic and Research Center). On the other hand, 70 to 90% of the excisional biopsies performed are reported to be benign (Lay, Crump, Frykberg, Goedde, & Copeland, 1990). Owing to this high false-positive rate, much effort has been put into improving the early detection of breast cancer. Breast imaging is a noninvasive and inexpensive cancer detection technology. Among the imaging methods, mammography is accepted as the most reliable and cost-effective modality. However, its false-negative rate is high (up to 30%) (Elmore et al., 1994, Rajentheran et al., 2001). In addition, the danger of ionizing radiation and of tissue density, which has been associated with increased cancer risk (Boyd, Byng, & Jong, 1995), is linked with patients who undergo mammography screening. Mammography is also uncomfortable, because the breast has to be compressed between flat surfaces to improve image quality. Furthermore, obtaining adequate images from radiologically dense breasts (with little fat) or from women with breast implants is difficult (Foster, 1998), and it is difficult to detect breast cancer in young women (Gohagan, Rodes, Blackwell, & Darby, 2004).
Despite these limitations, the mammogram remains the gold standard for screening (Gohagan et al., 2004, Moore, 2001). Since early detection is important, new technologies such as Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), Computed Tomography-Single Photon Emission Computed Tomography (CT-SPECT) (Del Guerra, Di Domenico, Fantini, & Gambaccini, 2003), and ultrasound have been applied as complements to the mammogram (Ng & Fok, 2003). Fig. 1 and Table 1 show the modalities currently available for breast cancer detection and their reported accuracy, respectively. Note that the reported accuracy is only an estimate, because these modalities perform differently on different types of breast cancer and on different age groups, and because most of the tests were done on small populations.

Fig. 1

Noninvasive breast cancer detection modalities (adapted from Fok et al., 2002).

Table 1

Accuracy of breast cancer diagnosis modalities

Technique	Sensitivity (%)	Specificity (%)	References
Clinical examination	48.3–59.8	90.2–96.9	Barton, 2002, McDonald et al., 2004

Biopsy
Surgical/Open biopsy	≈100	≈100	Imaginis, Breast Cancer Diagnosis (2004)
Vacuum-assisted biopsy (Mammotome)	95	98	Simmon et al. (2000)
Large core biopsy	74–97	91–100	Delle Chiaie and Terinde, 2004, Meyer et al., 1999, Puglisi et al., 2003
FNA (biopsy)	85–88	55.6–90.5	Pisano et al. (2001)
Core needle biopsy	91–99	73–100	Brenner et al., 2001, Pisano et al., 2001
Breast cyst aspiration	79	94	Lucas and Cone (2003)

Imaging
FNA (cytology)	65–99	64–100	Fajardo et al., 1990, Reinikainen, 2003
Mammography	13–95	14–90	Fajardo et al., 1990, Fletcher et al., 1993, Singhal and Thomson, 2004
Full-Field Digital Mammography (FFDM)	64.3	88.2	Irwig et al., 2004, Lewin et al., 2002
Thermography	90	90	Amalu (2003)
Ultrasound/Sonography	13–98.4	67.8–94	Houssami et al., 2003, Singhal and Thomson, 2004, Stavros et al., 1995
MRI	86–100	21–97	Cecil et al., 2001, Orel, 2000, Singhal and Thomson, 2004, Yeung et al., 2001
Proton Magnetic Resonance Spectroscopy (MRS)	83–100	73–87	Cecil et al., 2001, Reinikainen, 2003, Yeung et al., 2001
Scintigraphy (CT)	55–95	62–94	Brem et al., 2003, Singhal and Thomson, 2004
PET	96	100	Singhal and Thomson (2004)
Positron Emission Mammography (PEM)	80–86	91–100	Levine et al., 2003, Murthy et al., 2000
Electrical Impedance Scanning (EIS)	62–93	52–69	Glickman et al., 2002, Malich et al., 2003

Gene screening
Serum protein expression profiling	90	93	Vlahou et al. (2003)
Gene Profiling	83–91	72.7–81.8	van’t Veer et al. (2002)
Gene Testing	63–85	Not mentioned	Berry et al. (2002)

As shown in Table 1, none of the methods possesses both high sensitivity (correctly identifying women with breast cancer) and high specificity (correctly ruling out women without breast cancer), although much effort has been invested (Foster, 1998, Moore, 2001). Each has its limitations. For example, clinical examination is insensitive and examiner-dependent (McDonald, Saslow, & Alciati, 2004); biopsy is invasive, causes complications, leaves scars, and requires a long recovery time (Imaginis, Breast Cancer Diagnosis, 2004, Simmon et al., 2000); MRI is inconsistent, costly, and low-resolution (Cardillo, Starita, Caramella, & Cilotti, 2001); PET and CT-SPECT are expensive and scarce; ultrasound images are of poor resolution (Kotre, 1993, Moore, 2001) and operator-dependent (Chen, Chang, & Huang, 2000); microwave imaging requires accurate modeling of the frequency dependency of the various tissues, and its sensitivity is affected by many factors (Bond et al., 2003, Fear et al., 2002, Kosmas et al., 2004); PEM (Thompson, Murthy, Picard, Weinberg, & Mako, 1995) is expensive and insensitive (Moses, 2004); FNA is operator-dependent (Pisano, Fajardo, Caudry, & Sneige, 2001) and incurs complications (Lucas & Cone, 2003); gene expression analysis on the genes BRCA1 and BRCA2, whose mutations are associated with breast cancer, is difficult because the genes are highly complex, and the costly blood storage worsens the matter (Spengler, 2003); MRS is technically demanding and only of confirmatory value to MRI (Cecil et al., 2001, He and Shkarin, 1999); EIS requires localization of the lesion beforehand (Glickman et al., 2002) and is insensitive and observer-dependent (Malich et al., 2003).
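The two accuracy measures used in Table 1 can be illustrated with a minimal sketch. The function names and the patient counts below are hypothetical and chosen only for illustration; they are not taken from any of the cited studies.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of women with breast cancer who are correctly identified."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of women without breast cancer who are correctly ruled out."""
    return true_neg / (true_neg + false_pos)


# Hypothetical screening outcome: 100 women with cancer, 200 without.
print(f"sensitivity = {sensitivity(90, 10):.0%}")    # 90 of 100 cancers found
print(f"specificity = {specificity(180, 20):.0%}")   # 180 of 200 healthy cleared
```

A modality is a good front-line screen only when both values are high at the same time, which is the property the paragraph above finds lacking.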
These methods are often too cumbersome, costly, inaccessible, or invasive to be used as first-line detection modalities alongside clinical examination and mammography (Keyserlingk et al., 2000, Qi and Diakides, 2003). Thus, the thermogram appears to be one of the most promising and suitable alternatives for preliminary screening (Amalu, 2003). The thermogram monitors breast health based on heat pattern variations that correlate with the patient’s medical condition (Gautherie, 1999, Head et al., 2000). It is cheap, noninvasive, simple, painless, highly accurate if done right, safe (no known side effects), and practical, and it requires no contact, compression, radiation, or venous access (Aksenov et al., 2003, Bamberg, 2002, Gautherie, 1989, Head et al., 2000, Keyserlingk et al., 2000). Infrared breast thermography can increase sensitivity at the critical early detection phase by providing an early warning of an abnormality that is not evident by other approaches (Keyserlingk et al., 2000). It is able to warn women up to 10 years before a cancer is found (Amalu, 2003, Pacific Chiropractic and Research Center). Furthermore, thermography is the only physical method that conveys significant information on breast physiology (Gautherie, 1989). In contrast to other techniques, its result is independent of nodal status and unrelated to age, tumor location (right or left breast), and estrogen or progesterone receptor status (Head et al., 2000). Hence, the thermogram plays a pivotal role in breast cancer, be it in risk assessment (Amalu, 2003), detection, diagnosis, or prognosis (Gautherie, 1989, Head et al., 2000).
Unfortunately, despite the strengths reported, thermography has several limitations: it is environment-dependent and operator-dependent (Fok et al., 2002, Ng and Fok, 2003), not descriptive (Aksenov et al., 2003, Bamberg, 2002), difficult to interpret (Amalu, 2003), nonspecific (Jones, 1998), and inconsistent (Frize et al., 2002, Head et al., 2003), and it lacks a standard analysis procedure (Ohashi and Uchida, 2000, Kaczmarek and Nowakowski, 2003), as pointed out in the Breast Cancer Detection Demonstration Projects (BCDDP). As a result, breast thermography is not yet widely used and is not recommended by the National Breast Cancer Centre (National Breast Cancer Centre Position Statement, 2004). On the face of it, the thermogram performs no better than other modalities. Nevertheless, if thermography is done right, it offers a very powerful tool for fighting breast cancer. Thus, by providing decision aids based on intelligent systems (Ng and Fok, 2003, Ng et al., 2002), good and consistent diagnostic performance can be maintained using breast thermography. At the same time, these intelligent tools can lighten the pressure on physicians and ease the burden of examining large numbers of images (e.g., 1 million pairs of X-ray images need to be reviewed per year; Kotre, 1993). A summary of the use of intelligent tools to complement breast cancer detection modalities is given in Table 2.

Table 2

Reported accuracy on computer-aided diagnosis

Methods & applications	Accuracy (%)	Training/Sample size
Prognostic factors & BP for breast cancer prognosis (Burke et al., 1994)	85	NM
Prognostic factors & SOM for nodal metastasis detection (Naguib et al., 1996)	55–84	50/81
Prognostic factors & (a) Logistic regression (b) MLP (c) Decision tree for breast cancer survivability prediction (Delen et al., 2005)	(a) 89.2 (b) 91.2 (c) 93.6	10-fold cross validation, 202,932
Patient physiological and history data & (a) ANN (b) Data Envelopment Analysis (c) LDA for breast cancer diagnosis (Pendharkar et al., 1999)	(a) 81.5 (b) 66.5 (c) 66.1	227/227
Clinical pathological data & MLP & single threshold system for breast cancer prognosis (Gómez-Ruiz et al., 2004)	96	828/1035
FNA & (a) Fuzzy k-NN (b) logic regression (c) MLP for breast cancer prognosis (Seker et al., 2001)	(a) 88 (b) 82 (c) 87	Leave-one-out, 100
FNA & (a) Rank NN (Bagui et al., 2003) (b) Evolving ANN (Land & Albertelli, 1998) (c) Memetic pareto ANN and (d) BP (Abass, 2002) (e) CLFNN (Tan, 2005) (f) DA and (g) MARS and (h) BP and (i) MARS and BP (Chou et al., 2004) (j) Evolving ANN and (k) ANN ensembles (four MLP) (Yao & Liu, 1999) (l) Hybrid fuzzy genetic (Andrés et al., 1999) (m) SVM (Liu et al., 2003) for breast cancer diagnosis	(a) 97 (b) 97.1 (c) 98.1 (d) 97.5 (e) 97.81 (f) 95.91 (g) 97.66 (h) 98.25 (i) 98.25 (j) 96–99.8 (k) 99.99 (l) 97.5 (m) 97.07	(a), (b), (k): 524/699 (d): 400/683 (e), (j), (k), (l): 349/699 (f), (g), (h), (i): 398/569 (m) 547/699
Wavelet features of mammogram & MLP for breast cancer diagnosis (Kocur et al., 1996)	88	NM
Breast cancer tissue image & fuzzy co-occurrence matrix & MLP for breast cancer diagnosis (Cheng et al., 1995)	100	60/90
Biopsy image & (a) RBF (Schnorrenberg et al., 1997) (b) receptive field function and (c) ANN (Schnorrenberg et al., 2000) (d) singular value decomposition & MLP with Levenberg–Marquardt algorithm (Tsapatsoulis et al., 1997) for breast cancer nuclei detection	(a) 83.7–84.6 (b) 76.4–78.1 (c) 79.3–80.7 (d) 76.8	Sensitivity
Mammogram & (a) Evolving ANN (Fogel et al., 1998) (b) DA (Leichter et al., 1996) (c) SOM & MLP (Santos-André & da Silva, 1999) (d) Bayesian belief network (Wang et al., 1999) for breast cancer diagnosis	(a) 84.64–91.96 (b) 93.8 (c) 60 (d) 89	(a) Leave-one-out, 216 (b) NM (c) 247/272 (d) 335/419
(a) Mammogram & LDA for parenchymal patterns identification. (b) Mammogram & one-step rule-based & ANN for breast cancer diagnosis (Huo et al., 1998)	(a) 91 (b) 94	NM
(a) Mammogram & patient history data & ANN for breast cancer diagnosis, (b) and for mammographic invasion prediction (Lo & Floyd, 1999)	(a) 82–86 (b) 77.96	Leave-one-out
Mammogram & patient history data & (a) evolutionary programming & Adaboosting (Land et al., 2000) (b) Constraint satisfaction ANN [Tourassi01] for breast cancer diagnosis	(a) 86.1–87.6 (b) 84	(a) 400/500 (b) 250/500
Mammogram & RBF for (a) abnormalities detection (b) breast cancer diagnosis (Christoyianni et al., 2002)	(a) 88.23 (b) 79.31	(a) 119/238 (b) 119/119
Ipsilateral mammogram & ANN (BP and Kalman filter) for breast cancer diagnosis (Sun et al., 2004)	About 65	60/100
MRI & BP for breast cancer diagnosis (Cardillo et al., 2001)	Improved accuracy	NM
(a) Spectrum of radio frequency echo signals in ultrasound (b) B-mode ultrasound & DA for axillary lymph node classification (Tateishi et al., 1998)	(a) 92.5 (b) 80	NM
Sonography & (a) SOM (Chen et al., 2000) (b) ANN (Lo & Floyd, 1999) for breast cancer diagnosis (Chen et al., 2000)	(a) 85.6 (b) 96	(a) 10-fold cross validation (b) NM
Thermogram & (a) Image histogram & Co-occurrence matrix (Jakubowska et al., 2003) (b) Microwave radiation & Karhunen–Loeve transformation (Varga & De Muynck, 1992) (c) CLFNN (Tan et al., 2004) (d) BP for breast cancer diagnosis	(a) Almost 100 (b) Compared well to physician (c) 74–94 (d) 53–64	(a), (b): NM (c) 39/78 (d) 65/78
Gene expression & k-means clustering & principal component analysis & Bayesian classification tree for (a) lymph-node metastasis and (b) relapse (Huang et al., 2003)	(a) 90 (b) 90	Leave-one-out, (a) 37, (b) 52

Abbreviations: BP: Backpropagation, ANN: Artificial Neural Network, SVM: Support Vector Machine, DA: Discriminant Analysis, LDA: Linear DA, MLP: Multilayer Perceptron, SOM: Self-Organizing Map, NN: Nearest Neighbor, MARS: Multivariate Adaptive Regression Splines, RBF: Radial Basis Function, NM: Not Mentioned.

As shown in Table 2, intelligent tools contribute significantly to improving breast cancer detection and prognosis. This is consistent with a recent review finding that computer-aided diagnosis shows an incremental improvement in sensitivity (Irwig, Houssami, & van Vliet, 2004). MLP or BP is the favorite algorithm to complement various modalities, in spite of limitations such as slow learning and a tendency to be trapped in local minima. SOM is another common adjunct for imaging modalities, despite its poor classification performance and high memory requirement. Statistical methods like LDA, Bayesian networks, and logistic regression are often applied in assisting diagnosis and prognosis. However, statistical methods are difficult to develop, and they often work under the assumption that the underlying data is normally distributed. RBF has heavy computation and memory requirements, whereas the decision tree is limited in its representational power by its use of crisp rules. Evolving ANN, although able to achieve optimal performance, is time-consuming to develop, since it may take a few hundred to thousands of runs before it finds appropriate parameters. Furthermore, due to the stochastic nature of the algorithm, it may generate an inconsistent knowledge base. Most of all, these methods (except the decision tree) do not provide any explanation for their computations and reasoning. As a result, physicians have no way to validate the system’s operation, and hence they find it difficult to trust the system.
A Complementary Learning Fuzzy Neural Network (CLFNN) is therefore proposed as a Computer-Assisted Intervention (CAI) tool for breast thermography. CLFNN is a neuroscience-inspired, evolving, and autonomous fuzzy neural network that is based on positive and negative learning. CLFNN not only provides good classification performance but also learns quickly. Most importantly, CLFNN offers human-like reasoning as well as intuitive fuzzy rules to explain its computations. Since human observers’ image interpretation often lacks thoroughness and consistency (Bick, 2000), the capacity of CLFNN to provide a cognitive interpretation of a given thermogram is of great importance for aiding image analysis. Psychophysical evidence demonstrates that even imperfect prompts can enhance human ability in pattern detection (Kotre, 1993). Therefore, CLFNN is believed to enhance the overall accuracy of breast thermography. By contrast, most disease detection work in CAI has adopted physics-based or physically inspired models (Ellis & Peters, 2004), statistical methods such as Bayesian theory and nearest neighbor (Sajda, Spence, & Parra, 2003), or Artificial Neural Networks (ANN) (Frigyesi, 2003, Joo et al., 2004). These methods, however, have shortcomings: statistical methods and ANN neither justify nor explain their computations. As a result, the output is difficult to trust because it comes without a reason. Model-based systems are difficult to develop and often require assumptions to be made. This applies to statistical methods as well, since many of them assume that the data is normally distributed. Conversely, in addition to superior accuracy, CLFNN provides positive and negative fuzzy rules to justify its decisions, and this reasoning is closely akin to a diagnostician’s decision-making process. These rules can be used not only to countercheck a physician’s diagnosis but potentially also to guide junior physicians.
Besides, CLFNN can also be adopted to confirm or investigate hypotheses associated with breast cancer, for example that women with a family history of breast cancer belong to a high-risk group (Cancer Research UK, 2002), or that a temperature difference between the left and right breasts suggests a possible case of cancer (Gautherie, 1989).

FALCON-AART

FALCON-AART is a CLFNN that forms its fuzzy partitions based on visual cortical plasticity, and adjusts its parameters based on a psychological theory of learning (Tan, Quek, & Ng, 2004) (for details, see Tan et al., 2004). It generates fuzzy rules autonomously in the form described by Eq. (1):

$$\text{IF } x_1 \text{ is } A \text{ AND } x_2 \text{ is } B, \text{ THEN } y_1 \text{ is } C \text{ AND } y_2 \text{ is } D \tag{1}$$

The fuzzy rule in Eq. (1) is an example of a system with two inputs and two outputs. It consists of five elements:

(a) Input linguistic variables (x_1, x_2).
(b) Input linguistic terms (A, B). These represent fuzzy entities such as tall, short, thin, fat, and so on. FALCON-AART represents input linguistic terms using trapezoidal membership functions.
(c) If–Then rule: links the antecedent part (i.e., input linguistic variables and terms) with the consequent part (i.e., output linguistic variables and terms).
(d) Output linguistic variables (y_1, y_2).
(e) Output linguistic terms (C, D).

FALCON-AART has five layers, and each layer is mapped onto one of the elements of the fuzzy rule (Fig. 2). Before training commences, FALCON-AART consists of input and output layers only. As training progresses, FALCON-AART evolves and automatically constructs its hidden layers using a modified Fuzzy ART algorithm (Tan et al., 2004). This algorithm is based on the complementary learning paradigm, which comprises positive learning (learning from positive patterns) and negative learning (learning from negative patterns). The modified Fuzzy ART algorithm (known as Another ART) improves on Fuzzy ART (Baraldi & Bonda, 1999) by functionally modeling and incorporating human visual cortical plasticity. With this, FALCON-AART structural learning becomes a function of time (age), which enables FALCON-AART to alleviate the stability–plasticity dilemma as well as to avoid the problem of generating bad clusters, from which most competitive learning algorithms suffer. It dynamically partitions the input and output spaces into trapezoidal fuzzy clusters, and these clusters are subsequently fine-tuned using a modified adaptive back-propagation algorithm (Tan et al., 2004).
The tuning is applied simultaneously to the slope and the location of the fuzzy sets. When new training patterns are presented, a stored cluster will resonate if the new patterns are sufficiently similar to it. The resonant cluster will then expand to incorporate these patterns using the Another ART algorithm. Training terminates when the mean square errors of two consecutive epochs are sufficiently close.
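The trapezoidal membership functions and the If–Then rule form described above can be sketched as follows. This is an illustrative sketch only: the corner-point parameterisation (a, b, c, d), the use of the minimum as the AND operator, and all numeric values are assumptions made here, and do not reproduce the Another ART clustering or the tuning algorithm of Tan et al. (2004).

```python
def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Trapezoidal membership: 1 on [b, c], 0 outside (a, d), linear slopes between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising slope on (a, b)
    return (d - x) / (d - c)       # falling slope on (c, d)


def firing_strength(inputs, antecedents) -> float:
    """Degree to which 'x_1 is A AND x_2 is B ...' holds.

    AND is taken as the minimum here, which is an assumption of this sketch.
    """
    return min(trapezoid(x, *params) for x, params in zip(inputs, antecedents))


# Hypothetical rule antecedent: x_1 is A, x_2 is B.
term_A = (0.0, 1.0, 2.0, 3.0)
term_B = (10.0, 20.0, 30.0, 40.0)
print(firing_strength((0.5, 25.0), (term_A, term_B)))  # 0.5
```

In this parameterisation, moving b and c shifts the location of a fuzzy set, while the distances b - a and d - c control its slopes, the two quantities that the text says are tuned simultaneously.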

Fig. 2

Architecture of FALCON-AART.

The neural memory structure between Layers 2 and 3 is the construct of complementary learning. Complementary learning refers to positive and negative learning, which is believed to be a mechanism underlying human recognition. When a positive pattern is presented, positive rules are excited and negative rules are simultaneously inhibited, and vice versa. Complementary learning is often practiced in daily life: a child will learn to recognize an apple more efficiently if he or she is presented with an apple (positive pattern) and with other fruits (negative patterns). Likewise, a radiologist must have seen and learned both abnormal medical images (positive learning) and normal medical images (negative learning) in order to recognize or analyse images effectively. Evidence for complementary learning can be drawn from various neuroscience studies. For instance, the hippocampus possesses both positive and negative reinforcement signals, and both excitatory (positive) and inhibitory (negative) neurotransmitter systems exist inside the human brain. As shown in Fig. 3, different objects are registered in different brain areas, lending further support to the complementary learning conjecture. Hence, whenever a car is presented (positive), only the areas registered for cars (positive rules) are activated, while the areas registered for other objects (negative rules) are simultaneously inhibited.

Fig. 3

Slices of fusiform gyrus of car and bird experts in face, car, and bird recognition. The rectangular boxes show the activated areas of the brain for different recognition tasks (adapted from Gauthier et al., 2000).

Thus, FALCON-AART functionally models biological complementary learning, which is formalized as Eqs. (2) and (3). Given a positive sample {x⁺ = (x_1, x_2, …), d = 1}, with x ∈ U and d ∈ V: [Eqs. (2) and (3) are missing from the source text]. Hence, whenever a positive sample is presented to the system, [expression missing from the source text], which leads to a correct decision, i.e., d = 1.
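The excite-and-inhibit behaviour of positive and negative rules described in this section can be illustrated with a small sketch. It is not a reconstruction of Eqs. (2) and (3): the single input feature, the winner-takes-all comparison, the tie-breaking choice, and all rule parameters are hypothetical.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], 0 outside (a, d), linear in between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)


def complementary_decision(x, positive_rules, negative_rules):
    """Return d = 1 when the strongest positive rule beats the strongest negative rule.

    The losing rule base is treated as inhibited. Ties fall to d = 0,
    an arbitrary choice made for this sketch.
    """
    excite_pos = max(trapezoid(x, *rule) for rule in positive_rules)
    excite_neg = max(trapezoid(x, *rule) for rule in negative_rules)
    return 1 if excite_pos > excite_neg else 0


# Hypothetical one-feature rule bases (e.g., a left/right temperature difference in °C).
positive_rules = [(0.3, 0.5, 1.5, 2.0)]    # learned from positive patterns
negative_rules = [(-0.1, 0.0, 0.2, 0.4)]   # learned from negative patterns

print(complementary_decision(0.8, positive_rules, negative_rules))  # 1
print(complementary_decision(0.1, positive_rules, negative_rules))  # 0
```

Because both rule bases are consulted for every input, a pattern that matches no positive rule is rejected on the evidence of the negative rules, not by default.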

Experiment and results

Dataset

The thermograms are obtained from voluntary patients at the Singapore General Hospital (SGH) (Ng et al., 2001, Ng et al., 2002). The thermograms are captured using the AVIO thermal camera TVS-2000 MkIIST system. The thermography process is shown in Fig. 4.

Fig. 4

Thermography process.

The patient’s thermal image is captured using a thermal camera. The imager component of the thermal system converts the infrared radiation emitted by the object under observation into electrical signals. Subsequently, the processor component collects these signals, stores them in frame memory, and then displays them on an LCD display as real-time sixteen-bit color or monochrome thermographic images. The thermogram is stored, and feature extraction is done using the AVIO software to compute the temperatures of the left and right breasts. An example of a thermogram is given in Fig. 5.

Fig. 5

Thermogram of (a) a healthy patient, with symmetrical temperature, and (b) an unhealthy patient, with asymmetrical temperature.

The volunteers are between the ages of 27 and 90. Screening was carried out from 9.00 am to 11.30 pm, as this is the most stable period of the day (Gautherie, 1989). All volunteers were briefed on the methodology and process of thermography in advance, in order to relieve them of any possible emotional stress as well as to obtain their consent. They were advised not to apply any powder, ointment, perfume, or other wipes that would affect conduction through the skin around the regions to be examined. Before the examination was carried out, volunteers were required to rest for 15–20 min upon arrival at the examination room, to acclimatize to room temperature. This is important for keeping patients at their basal metabolic rate, which results in minimal surface temperature changes and hence satisfactory thermograms. Since standardized ambient conditions are necessary to minimize variations in thermography, the ambient temperature was carefully observed during the examination. The examination environment was a controlled, air-conditioned room maintained at an ambient temperature of 20–22 °C (maximum variation ±0.1 °C), with humidity between 55% and 65%. Direct draughts were avoided in the areas where the patient was positioned. Volunteers wore loose gowns that neither restricted airflow nor constricted the skin surface during this equilibration period. It was ensured that patients were within the period of the 5th–12th and 21st day after the onset of the menstrual cycle, as this is the most suitable period for imaging. This is because women’s body temperature is known to be stable in this period (Gautherie, 1989), and vascularisation is at its basal level, with the least engorgement of blood vessels (Ng et al., 2001). Three thermograms were taken for each patient: one front view and two lateral views.
There are a total of 78 patients: 28 healthy patients, 43 benign tumor patients, and 7 cancer patients. The mean, median, mode, standard deviation, and skewness of each breast temperature are extracted from the front-view thermograms using histograms of the temperature distribution, and calculated using the Statistical Package for the Social Sciences (SPSS). The patient population is summarized in Table 3.

Table 3

Average mean and modal temperatures of healthy and unhealthy breasts

	Healthy patients	Benign patients	Carcinoma patients
Average mean temperature of normal breast (left and/or right) (°C)	32.66	32.81	33.43
Average mean temperature of abnormal breast (left or right) (°C)	Not available	33.00	33.51
Average modal temperature of normal breast (left and/or right) (°C)	32.67	33.05	33.40
Average modal temperature of abnormal breast (left or right) (°C)	Not available	33.00	33.51

Table 3 shows that carcinoma patients generally have a higher breast temperature than healthy patients. This temperature difference arises because the cancerous breast has a higher metabolism. The blood vessels in the vicinity of the tumor are engorged with blood, and the cancerous breast therefore emits more heat (Ng et al., 2002).
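The five per-breast statistics named in this section, and a left/right comparison of them, can be sketched with the Python standard library. This is not the SPSS or AVIO procedure used in the study: the 0.1 °C binning for the mode, the population form of the skewness, the use of absolute differences, and the temperature readings are all assumptions made for illustration.

```python
import statistics


def breast_features(temps):
    """Mean, median, mode, standard deviation and skewness of temperature readings (°C)."""
    mean = statistics.fmean(temps)
    spread = statistics.pstdev(temps)
    # Mode taken over readings binned to 0.1 °C (assumed bin width).
    mode = statistics.mode([round(t, 1) for t in temps])
    # Population skewness: average cubed z-score (assumed formula).
    skew = sum(((t - mean) / spread) ** 3 for t in temps) / len(temps)
    return {
        "mean": mean,
        "median": statistics.median(temps),
        "mode": mode,
        "std": statistics.stdev(temps),
        "skewness": skew,
    }


def difference_features(left, right):
    """Absolute left/right difference of each statistic (sign convention assumed)."""
    f_left, f_right = breast_features(left), breast_features(right)
    return {name: abs(f_left[name] - f_right[name]) for name in f_left}


# Hypothetical pixel temperatures for the two breasts of one patient.
left = [32.0, 32.5, 33.0, 33.0, 33.5, 34.0]
right = [32.5, 33.0, 33.5, 33.5, 34.0, 34.5]
print(breast_features(left))
print(difference_features(left, right))
```

A uniformly warmer breast, as in this hypothetical pair, changes the mean, median, and mode but leaves the spread and skewness unchanged, which is one reason to keep all five statistics as separate features.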

Experiment

The experiment is to diagnose whether a patient is normal, benign, or malignant based on breast temperatures extracted from the thermogram. Five types of file are created, and three training/testing sets are created for each type of file for cross-validation purposes. Each of the stratified training sets contains a randomly selected 50% of the samples in the dataset, and the remaining unseen samples make up the testing sets. The files are as follows:

File FH: contains patient age, family history, hormone replacement therapy, age of menarche, presence of palpable lump, previous surgery/biopsy, presence of nipple discharge, breast pain, menopause at age above 50 years, and first child at age above 30 years.
File T: contains mean, median, modal, standard deviation and skewness of temperature for left and right breasts.
File TH: combination of FH and T.
File TD: contains temperature difference of mean, median, modal, standard deviation and skewness for left and right breasts.
File TDH: combination of TD and FH.

The averaged performance of FALCON-AART is benchmarked against linear discriminant analysis (LDA) (Hanm & Kamber, 2001), k-nearest neighbor (kNN) (Hanm & Kamber, 2001), Naïve Bayesian (Hanm & Kamber, 2001), logistic regression (LR) (Hanm & Kamber, 2001), Self-Organizing Map (Chen et al., 2000), Radial Basis Function (RBF) (Hanm & Kamber, 2001), Support Vector Machine (SVM) (Hanm & Kamber, 2001), C4.5 (Hanm & Kamber, 2001), and Multilayer Perceptron (MLP) (Hanm & Kamber, 2001). In addition, a comparison is made with the ancestors of FALCON-AART: FALCON-ART (Lin & Lin, 1997) and FALCON-MART (Tung & Quek, 2001). The results are listed in Table 4. Recall refers to the classification accuracy on the training set, whereas predict refers to the classification accuracy on the testing set.
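The stratified 50% split and the recall/predict measures described above can be sketched as follows. The helper names, the fixed random seed, and the rounding of odd-sized classes are assumptions of this sketch, so the resulting set sizes need not match those used in the study.

```python
import random
from collections import defaultdict


def stratified_half_split(labels, seed=0):
    """Put a random half of each class in the training set, the rest in the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        half = len(indices) // 2          # odd classes give the extra sample to testing
        train.extend(indices[:half])
        test.extend(indices[half:])
    return sorted(train), sorted(test)


def accuracy(predicted, actual):
    """Classification accuracy in percent ('recall' on training data, 'predict' on test data)."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Class sizes as reported for the dataset: 28 healthy, 43 benign, 7 cancer patients.
labels = ["normal"] * 28 + ["benign"] * 43 + ["malignant"] * 7
train, test = stratified_half_split(labels)
print(len(train), len(test))
```

Repeating the split with three different seeds and averaging the two accuracies would give one plausible reading of the three training/testing sets per file.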

Table 4

Breast cancer diagnosis result (desired values are in bold)

Method		FH	T	TH	TD	TDH
Linear discriminant analysis	Recall (%)	65.79	62.86	88.57	37.14	71.43
	Predict (%)	34.21	47.37	28.95	28.95	36.84
	No. of epoch	1	1	1	1	1
	No. of rules	Not applicable

Multilayer perceptron	Recall (%)	82.86	77.14	97.14	65.71	88.57
	Predict (%)	42.11	55.26	57.89	47.37	42.11
	No. of epoch	100	100	100	100	100
	No. of rules	Not applicable

Naïve Bayesian classifier	Recall (%)	57.14	57.14	55.88	57.14	54.29
	Predict (%)	54.29	54.29	57.14	54.29	22.86
	No. of epoch	1	1	1	1	1
	No. of rules	Not applicable

k-Nearest neighbor	Recall (%)	25.71	88.57	97.06	74.29	91.43
	Predict (%)	40	45.71	48.57	42.86	45.71
	No. of epoch	1	1	1	1	1
	No. of rules	Not applicable

Support vector machine	Recall (%)	62.86	62.86	77.14	54.29	68.57
	Predict (%)	42.11	52.63	57.89	52.63	42.11
	No. of epoch	1	1	1	1	1
	No. of rules	Not applicable

C4.5	Recall (%)	74.29	74.29	91.43	68.57	88.57
	Predict (%)	42.11	57.89	55.26	44.74	50
	No. of epoch	1	1	1	1	1
	No. of rules	7	4	9	4	9

Logistic regression	Recall (%)	80	77.14	100.0	60.0	82.86
	Predict (%)	36.84	57.89	50	44.74	42.11
	No. of epoch	1	1	1	1	1
	No. of rules	Not applicable

Self organizing map	Recall (%)	77.14	82.86	80	77.14	77.14
	Predict (%)	34.21	50.0	21.05	36.84	34.21
	No. of epoch	2000	2000	2000	2000	2000
	No. of rules	Not applicable

Radial basis function	Recall (%)	60.0	62.86	57.14	57.14	62.86
	Predict (%)	50.0	55.26	52.63	44.74	50.0
	No. of epoch	1	1	1	1	1
	No. of rules	Not applicable

FALCON-ART	Recall (%)	85.71	71.43	65.71	70.37	88.57
	Predict (%)	50.0	52.63	52.63	51.28	55.26
	No. of epoch	50	50	50	50	50
	No. of rules	137	189	148	134	180

FALCON-MART	Recall (%)	37.14	97.14	100.0	70.37	97.14
	Predict (%)	34.21	65.79	52.63	51.28	52.63
	No. of epoch	6	17	10	6	8
	No. of rules	34	23	78	54	31

FALCON-AART	Recall (%)	77.14	94.29	97.14	100.0	100.0
	Predict (%)	65.79	65.79	68.42	55.26	63.16
	No. of epoch	4	4	4	4	10
	No. of rules	4	22	30	38	31

Breast cancer diagnosis result (desired values are in bold) It is shown that FALCON-AART outperforms the common methods in medical image analysis and its ancestors in all the training/testing sets. While having good recall and relatively superior generalization capability, the average training time of FALCON-AART is significantly shorter than MLP, SOM, and LR. Though statistical algorithms require only one pass of training dataset, it does not necessarily means they are faster than FALCON-AART as this depends on the computational complexity of the algorithm. In this particular case, FALCON-AART is as fast as kNN, LDA, SVM, and Naı¨ve Bayesian classifier in learning (≈245 ms). In contrast to statistical methods, FALCON-AART did not make assumption on the data distribution, and this may give superior classification performance even for non-normally distributed data. Note that this result is not comparable to the one in Table 1, Table 2 as this is a different classification task. This classification task involves normal, benign, and malignant whereas the task in Table 1, Table 2 involves only benign and malignant. In other words, from the experimental result shown in Table 4, complementary learning displays superior capacity in multi-class classification than conventional methods. One significant advantage FALCON-AART offers is the ability to explain its computed output. In contrast to conventional methods, FALCON-AART constructs intuitive positive and negative fuzzy rules dynamically to depict its reasoning process; these rules can be scrutinized by the physicians and decide upon whether to adopt the system suggestion. In addition, accurate rules identified may be used as a guideline for inexperience physicians in diagnosis. As shown in Table 4, rule generation capability of FALCON-AART is better than its ancestor, in which lesser rules are generated but greater accuracy are attained. 
Some authors have proposed criteria for measuring system interpretability: compactness (fewer rules in the rule base), coverage (every value in the universe of discourse should be covered by one of the rules), normality (every rule has at least one pattern that matches it fully), and so on (Casillas, Cordón, Herrera, & Magdalena, 2003). FALCON-AART learning is data-centered and therefore fulfills the coverage and normality criteria. This experiment shows that FALCON-AART generates a smaller rule base than its ancestors. In this respect, FALCON-AART offers a more interpretable system than its ancestors. Examples of the generated rules are given in Table 5.
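The three criteria can be made concrete with a small check. The sketch below is illustrative only: it assumes each rule is a single trapezoidal fuzzy set over one feature, and the helper names (`trapezoid`, `compactness`, `coverage`, `normality`) and all numeric breakpoints are hypothetical rather than taken from FALCON-AART.

```python
def trapezoid(a, b, c, d):
    """Return a trapezoidal membership function with support [a, d] and core [b, c]."""
    def mu(x):
        if x < a or x > d:
            return 0.0
        if b <= x <= c:
            return 1.0
        if x < b:
            return (x - a) / (b - a)
        return (d - x) / (d - c)
    return mu


def compactness(rules):
    """Compactness is read here simply as the rule count (smaller is better)."""
    return len(rules)


def coverage(rules, samples):
    """True if every sample has non-zero membership in at least one rule."""
    return all(any(mu(x) > 0.0 for mu in rules) for x in samples)


def normality(rules, samples):
    """True if every rule is fully matched (membership 1.0) by at least one sample."""
    return all(any(mu(x) == 1.0 for x in samples) for mu in rules)


# Hypothetical one-feature rule base and training samples.
rules = [trapezoid(0.0, 0.0, 0.2, 0.4), trapezoid(0.3, 0.5, 1.0, 1.0)]
samples = [0.05, 0.1, 0.35, 0.6, 0.9]

print(compactness(rules), coverage(rules, samples), normality(rules, samples))
```

A rule base built directly from the training data, as in data-centered learning, tends to pass the coverage and normality checks on that data by construction; compactness is the criterion that actually varies between systems.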

Table 5

Fuzzy rules generated by FALCON-AART

Fuzzy rules (FALCON-AART)	Crisp rules (C4.5)	Diagnostic rules (Varga and De Muynck, 1992)
IF mean difference is small, AND median difference is rather big, AND modal difference is medium, AND standard deviation is big, AND skewness difference is very small, THEN Normal	IF modal difference < 0.1 AND modal difference < 0.03 AND mean difference < 0.25 AND median difference < 0.24 AND skewness difference < 0.05 THEN Normal	IF temperature is generally 0.3–1.5 °C higher than the surrounding normal tissues, THEN Tumor
IF mean difference is medium, AND median difference is very small, AND modal difference is very big, AND standard deviation is medium, AND skewness difference is small, THEN Benign	IF modal difference ⩾ 0.1 AND median difference < 0.07 THEN Benign	IF temperature drops linearly with time THEN normal tissue
IF mean difference is very small, AND median difference is small, AND modal difference is very big, AND standard deviation is big, AND skewness difference is medium, THEN Malignant	IF modal difference < 0.1 AND modal difference < 0.03 AND mean difference ⩾ 0.25 AND modal difference < 0.02 AND THEN Malignant	IF temperature remains high but sometimes drops a little THEN Tumor

As shown in Table 5, the fuzzy rules generated by FALCON-AART closely resemble the diagnostic rules practiced by diagnosticians. Aside from handling uncertainty (by allowing vagueness in linguistic terms), a FALCON-AART rule is more expressive than a decision-tree rule: it hides unnecessary detail behind linguistic terms and allows linguistic hedges such as “very” and “rather”. Moreover, rules generated by FALCON-AART do not contain the confusing repeated antecedent terms found in decision-tree rules. Furthermore, because FALCON-AART adopts complementary learning, both positive and negative rules are generated. Aside from giving better classification performance, this models the problem space more closely than positive-only or negative-only learning systems (systems with only a positive or only a negative rule base), because no assumptions are made about the space left uncovered by the rule base. Fig. 6 depicts the FALCON-AART reasoning process.
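How a hedged linguistic term and the matching degree of a rule might be evaluated can be sketched as follows. This is not the FALCON-AART implementation: the triangular membership functions, the Zadeh-style hedges (squaring for “very”, square root for “rather”), the use of min for AND, and every numeric breakpoint are assumptions made purely for illustration.

```python
import math


def triangle(a, b, c):
    """Return a triangular membership function peaking at b with support (a, c)."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu


def very(mu):
    """Concentration hedge (assumed): sharpens a term by squaring its membership."""
    return lambda x: mu(x) ** 2


def rather(mu):
    """Dilation hedge (assumed): softens a term by taking the square root."""
    return lambda x: math.sqrt(mu(x))


# Hypothetical linguistic terms on a normalized feature axis.
small = triangle(0.0, 0.2, 0.4)
big = triangle(0.3, 0.6, 0.9)

# A shortened rule in the style of Table 5 (antecedents only).
rule = {
    "mean difference": small,
    "median difference": rather(big),
    "skewness difference": very(small),
}


def firing_strength(rule, sample):
    """Combine antecedent memberships with min, one common choice for fuzzy AND."""
    return min(mu(sample[name]) for name, mu in rule.items())


sample = {"mean difference": 0.2, "median difference": 0.45, "skewness difference": 0.1}
strength = firing_strength(rule, sample)
print(round(strength, 3))
```

Under these assumptions, “very small” is harder to satisfy than “small”, and “rather big” is easier to satisfy than “big”, which is the extra expressiveness a crisp threshold cannot offer.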

Fig. 6

Reasoning process of FALCON-AART.

As shown in Fig. 6, the reasoning process of FALCON-AART closely mirrors how a diagnosis is made: a diagnostician first observes (a sample is presented), generates a set of hypotheses (a set of rules), evaluates each hypothesis (computes the matching degree of each rule), and then derives the conclusion. This human-like reasoning, together with the generated fuzzy rules, which provide insight into and interpretation of the thermograms, is useful in aiding the diagnostician. Table 6 shows the similarity between the reasoning processes of FALCON-AART and of a thermogram analyst. There is a one-to-one mapping between the steps, suggesting that the two reasoning processes are close. This is paramount because it lets physicians analyze or validate the system in familiar terms and through a familiar thought process.

Table 6

FALCON-AART and analyst reasoning

Step	FALCON-AART	Analyst
1	Take in the features extracted from the thermogram	Examine the thermogram. Look for abnormal heat patterns, temperature variations, etc.
2	Compare the feature values with its own positive and negative diagnostic rules (knowledge/experience). Compute their matching degree (firing strength/similarity)	Compare the examined thermogram with previous benign (negative) and malignant (positive) thermograms. Judge their similarity based on own diagnostic rules and experience
3	Select the rule with the maximum matching degree and inhibit the others	Select the knowledge that best describes the current situation. Eliminate hypotheses that are not relevant
4	Determine the consequent linked to the winning rule	Determine the conclusion derived from the knowledge applied
5	Perform defuzzification and output the conclusion	Give the diagnostic conclusion and decision
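
The five steps in Table 6 can be sketched as a short winner-take-all inference routine. This is an illustrative reading of the table, not the authors' code: the Gaussian similarity measure, the rule prototypes, the `width` parameter, and the function names `matching_degree` and `diagnose` are all hypothetical.

```python
import math

# Hypothetical rule base: each rule is (prototype feature vector, consequent).
POSITIVE_RULES = [((0.8, 0.7), "malignant")]
NEGATIVE_RULES = [((0.1, 0.2), "benign"), ((0.3, 0.1), "benign")]


def matching_degree(features, prototype, width=0.25):
    """Similarity in [0, 1]; using a Gaussian of the distance is an assumption."""
    dist_sq = sum((f - p) ** 2 for f, p in zip(features, prototype))
    return math.exp(-dist_sq / (2.0 * width ** 2))


def diagnose(features):
    # Step 1: take in the extracted features (the function argument).
    # Step 2: compare them with every positive and negative rule.
    scored = [(matching_degree(features, proto), label)
              for proto, label in POSITIVE_RULES + NEGATIVE_RULES]
    # Step 3: keep the rule with the maximum matching degree, drop the rest.
    strength, label = max(scored, key=lambda pair: pair[0])
    # Steps 4 and 5: read off the winner's consequent and output it as a
    # crisp label, together with the winning matching degree.
    return label, strength


label, strength = diagnose((0.75, 0.70))
print(label, round(strength, 3))
```

Returning the winning matching degree alongside the label gives the physician a rough sense of how closely the case resembles the rule that fired.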

FALCON-AART can also be used to assess or affirm certain medical hypotheses. For example, Table 4 shows that classification accuracy on the training/testing sets is lower when using breast temperatures alone than when using breast temperatures together with family history. This supports the view that family history is an important risk factor for breast cancer, and lends support to the hypothesis that women with a family history of breast cancer belong to the high-risk group. As another example, the performance of FALCON-AART trained on files TH (FATH) and TDF (FATDF) seems inconsistent with the belief that temperature asymmetry between the left and right breasts suggests a possible case of cancer. This happens because the task is to separate three classes rather than to single out the cancerous cases. In fact, temperature asymmetry between the left and right breasts may be more useful for determining the stage of cancer than for detecting it (Usuki, Maeta, Maeba, & Wakabayashi, 2000). Nevertheless, the result of detecting cancerous cases is given in Task 1 of Table 7.

Table 7

Performance of FALCON-AART on breast thermography

Task	File	Recall (%)	Predict (%)	Sensitivity (%)	Specificity (%)	No. of epoch	No. of rules
1. Cancer detection	TH	100.0	94.74	100.0	60.0	4	14
1. Cancer detection	TDF	100.0	94.74	100.0	60.0	7	21

2. Breast tumor detection	TH	100.0	84.0	33.33	90.91	11	10
2. Breast tumor detection	TDF	100.0	71.05	76.0	61.54	11	17

3. Breast tumor classification	TH	100.0	88.0	33.33	95.45	4	8
3. Breast tumor classification	TDF	100.0	84.0	33.33	90.91	11	10
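
For reference, sensitivity, specificity, and accuracy follow the usual confusion-matrix definitions, sketched below. The counts are hypothetical: the confusion matrices are not reported here, and these values were chosen only because they reproduce the figures in the Task 1 rows, assuming the "Predict" column is test-set accuracy.

```python
def sensitivity(tp, fn):
    """Fraction of actual positives that are detected (true positive rate)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of actual negatives that are correctly cleared (true negative rate)."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for a 38-case test set (not taken from the study).
tp, fn, tn, fp = 33, 0, 3, 2

print(f"sensitivity = {100 * sensitivity(tp, fn):.2f}%")       # 100.00%
print(f"specificity = {100 * specificity(tn, fp):.2f}%")       # 60.00%
print(f"accuracy    = {100 * accuracy(tp, tn, fp, fn):.2f}%")  # 94.74%
```

The example also shows why accuracy alone can mislead on an imbalanced test set: with few negative cases, two false positives barely move the accuracy but pull specificity down to 60%.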

Though FATH and FATDF attain the same accuracy, FATDF is better when assessed using the Receiver Operating Characteristic (ROC) plot, which suggests that it is relatively easier to classify using the temperature asymmetry of the left and right breasts. The 45° line signifies random guessing. As shown in Fig. 7, FALCON-AART trained on either file deviates far from the 45° line, achieving good performance for breast cancer detection. The Area Under the Curve index (A) is often used in ROC analysis: A = 0.5 corresponds to random guessing, and the closer A is to 1.0, the better the classifier. A is 0.867 for FATH and 0.93 for FATDF, which supports the view that temperature asymmetry between the left and right breasts is a warning sign of breast cancer and indicates that FALCON-AART is a competent classifier.
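
The area under the ROC curve can be computed directly from classifier scores as the probability that a randomly chosen positive case is scored higher than a randomly chosen negative one, with ties counting one half. The sketch below uses this rank-based formulation; the scores and labels are made-up illustrative values, not data from the study, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Illustrative classifier outputs (higher means more suspicious) and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)
print(round(auc, 3))
```

A classifier that scores every case identically obtains 0.5 under this definition, matching the 45° line, while one that ranks every positive above every negative obtains 1.0.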

Fig. 7

ROC plot of FALCON-AART trained on files TH and TDF.

Thermograms are often employed to detect the presence of a breast tumor. Hence, an experiment to classify patients with breast tumors was conducted using FALCON-AART; the result is summarized in Task 2 of Table 7. The experimental result indicates that FALCON-AART can detect patients with breast tumors accurately, and could therefore assist physicians in identifying suspected cases where follow-up is needed. With overall performance close to 90%, FALCON-AART exhibits good recall and generalization capability.
Sometimes it is desirable to distinguish benign from malignant breast tumors. Misdiagnosing a benign breast tumor as malignant causes unnecessary physical and emotional agony, because the only way to remove a breast tumor is surgical biopsy. On the other hand, misdiagnosing a malignant breast tumor as benign can have fatal consequences. Thus, breast tumors must be diagnosed as accurately as possible. The experimental result listed in Task 3 of Table 7 demonstrates that FALCON-AART is able to assist in this diagnostic task as well. With an overall accuracy of about 93%, FALCON-AART demonstrates its competency in tumor classification. This shows that the complementary learning paradigm is a promising recognition approach.
The results presented in Tables 1, 2, and 7 show that complementary learning is a promising tool for aiding breast cancer diagnosis. Applying FALCON-AART to thermograms improves performance in cancer detection as well as breast tumor classification. This confluence of thermography and CLFNN mitigates the problem of high variability in the accuracy of breast thermogram analysis. Besides, the sensitivity and specificity obtained are as high as, or higher than, those reported for breast thermography alone and for other modalities. However, CLFNN is not meant to replace breast thermography; rather, it complements it and assists physicians in breast cancer diagnosis.
By making the accuracy of breast cancer diagnosis more consistent, the combination of CLFNN and breast thermography is believed to bring about better patient outcomes. Comparing the results of Tables 2 and 7, the confluence of CLFNN and breast thermography shows superior performance in breast cancer detection over various conventional methods in medical diagnosis and medical image analysis. The combination of CLFNN and breast thermography gives results as accurate as, if not better than, other combinations of ANN and breast imaging modalities in tumor classification and detection. In general, CLFNN has relatively good generalization capability, in that it can classify well using only a small fraction of the data. Together, these findings support the application of CLFNN to breast thermography and suggest that their confluence is a promising system for fighting breast cancer.

Discussion and conclusion

This study shows that CLFNN complements breast thermography in various ways. The combination of breast thermography and CLFNN gives better or more consistent results than breast thermography alone. Whether the task is cancer detection, tumor classification, or breast cancer diagnosis (a multi-class problem), CLFNN outperforms conventional methods, showing the strength of complementary learning in recognition tasks. FALCON-AART assists physicians in different diagnostic tasks by providing relatively accurate decision support, and hence could potentially enhance patient outcomes. FALCON-AART not only gives better results than conventional methods, but also offers intuitive positive and negative fuzzy rules to explain its reasoning process. FALCON-AART satisfies the criteria of an interpretable system (normality, compactness, and coverage) and is therefore a more interpretable system. The generated rules are useful because they give insight into the problem space, provide a simple cognitive interpretation of the medical image, and could potentially serve as guidelines for physicians or as arguments supporting the system's decision. Apart from assisting physicians in diagnosis, FALCON-AART can also be used to investigate or support hypotheses associated with the problem domain, i.e., concept validation (Qi & Diakides, 2003). In this study, only two hypotheses were analyzed. In the future, more hypotheses can be assessed using CLFNN with a proper experimental setup. Examples are the thermal challenge test (Eccles, 2003), cold stress (Usuki et al., 2000) or cooling-rewarming tests (Gautherie, 1999) (external cooling of the breast will increase the temperature contrast if the breast is cancerous), injection of vasoactive substances (Gautherie, 1999), microwave or ultrasonic irradiation (Gautherie, 1999), and so on.
Likewise, FALCON-AART can be applied with advanced technologies that provide more information in thermography: dynamic thermography (Ohashi & Uchida, 2000), 3-dimensional thermography (Aksenov et al., 2003), thermal texture maps (Hassan, Hattery, & Gandjbakhche, 2003), or Dynamic Area Telethermometry (DAT) (Anbar et al., 2000). Similarly, FALCON-AART can complement thermograms in other application areas such as injury monitoring (Bamberg, 2002), neurology, vascular disorders (e.g., diabetes), rheumatic diseases, tissue viability, oncology (especially breast cancer), dermatological disorders, neonatal care, ophthalmology, surgery (Jones, 1998), as well as Severe Acute Respiratory Syndrome (SARS) (Diakides & Diakides, 2003). Alternatively, CLFNN can be used to complement other medical imaging modalities such as MRI, MRS, PET, etc., and to serve as a concept validation tool for techniques such as nipple fluid bFGF (Liu, Wang, Chang, Barsky, & Nguyen, 2000), Electrical Impedance Tomography (EIT) (Cherepenin et al., 2001), etc. In the current study, FALCON-AART does not perform feature analysis. This is an important area that may improve system performance and deserves study, because recognition requires decisions to be based on a few “important features”. Moreover, feature analysis can reduce the number of antecedents in each rule and hence improve the interpretability of the system. This will be investigated in the future.
