
Minimalist module analysis for fault detection and localization.

Zhijiang Lou1, Youqing Wang2, Shan Lu3, Pei Sun1.   

Abstract

Traditional multivariate statistical-based process monitoring (MSPM) methods are effective data-driven approaches for monitoring large-scale industrial processes, but have a shortcoming in handling the redundant correlations between process variables. To address this shortcoming, this study proposes a new MSPM method called minimalist module analysis (MMA). MMA divides process data into several different minimalist modules and one more independent module. All variables in the minimalist module are strongly correlated, and no redundant variables exist; therefore, the extracted feature components in one minimalist module will not be disturbed by noise from the other modules. This study also proposes new monitoring indices and a fault localization strategy for MMA, and simulation tests demonstrate that MMA achieves superior performance in fault detection and localization.
© 2021. The Author(s).


Year:  2021        PMID: 34876575      PMCID: PMC8651725          DOI: 10.1038/s41598-021-02676-3

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

Multivariate statistical-based process monitoring (MSPM) methods[1-4], e.g., principal component analysis (PCA)[5, 6], partial least squares (PLS)[7, 8], and canonical correlation analysis (CCA)[9, 10], are effective data-driven approaches for monitoring large-scale industrial processes. The main idea of MSPM is to analyze the correlations between process variables and to extract feature components for the construction of statistical indices. MSPM has been a research hotspot for many years, and a large number of relevant studies are published each year. In recent years, studies have focused on improving the existing methods to handle process characteristics such as nonlinear, non-Gaussian, and dynamic features. For example, Ge et al.[11] combined the multivariate linear Gaussian state-space model with MSPM to handle dynamic process features; Du et al.[12] proposed a Gaussian distribution transformation (GDT)-based monitoring method to handle non-Gaussian features; and Lou et al.[13] combined artificial neural networks with PCA and proposed a new neural component analysis for handling nonlinear features. Meanwhile, Zhou et al.[14] proposed a nonlinear key performance indicator (KPI) strategy for the PLS algorithm. Because MSPM can compress high-dimensional data into two or three statistical indices, it is a convenient tool for detecting abnormal conditions across the whole process. To address the fault localization problem, the contribution plot method[15, 16] was proposed for MSPM; it calculates the contribution of each variable in the original data set and picks the variables with high contributions as fault sources. Most studies on MSPM use the contribution plot as a basic algorithmic tool[17, 18], and a few studies have proposed improved versions for MSPM methods that cannot use the traditional contribution plot directly (examples include kernel PCA[19] and robust PCA[20]).
However, according to actual simulation test results, MSPM is insensitive to certain faults, and the contribution plot method may mistakenly diagnose normal variables as fault sources. The reason for this phenomenon is that traditional MSPM methods are based on the correlations between all process variables, and some correlations can be deduced from others, which means that these correlations are redundant. As such, the feature components extracted by traditional MSPM methods contain information from many process variables and hence are also disturbed by noise from these variables; therefore, traditional MSPM methods are insensitive to specific faults. In addition, the redundant correlations may mislead the contribution plot method, which results in incorrect localization of faults. To handle these problems, multiblock MSPM methods, such as consensus PCA (CPCA)[21], multiblock PLS (MBPLS)[18], and hierarchical PLS (HPLS)[22], have been proposed to reduce the number of variables and improve the interpretability of multivariate models. The main idea of multiblock MSPM methods is to divide the process variables into several blocks and combine the monitoring results of each block. However, block division is still an open problem in academic and engineering fields. Though Slama gave a general guideline, "blocks should correspond as closely as possible to distinct units of the process where all the variables within a block or process unit may be highly coupled, but where there is minimal coupling among variables in different blocks"[18], this rule is inappropriate for large-scale industrial processes, because (a) in large-scale industrial processes, variables in different process units are still highly coupled; and (b) variables in the same unit may be unrelated. In addition, for multiblock MSPM methods, one variable belongs to only one block; as such, the remaining blocks may lose key input variables, which causes large model error. For example, for the model in Fig. 1, it is hard to divide the process variables into two or more blocks: when a shared variable is allocated to block 2, block 1 loses that variable's information. Besides, it is difficult to divide the blocks with traditional data-driven methods, and hence many multiblock MSPM methods demand process prior knowledge for block division[23].
Figure 1

The traditional multivariate statistical-based process monitoring (MSPM) methods, multiblock MSPM, and minimalist module analysis.

To eliminate the influence of the redundant correlations among process data, this paper proposes a novel MSPM method called minimalist module analysis (MMA). All variables in a minimalist module are strongly correlated, and no redundant variables exist. As shown in Fig. 1, MMA analyzes only the correlations between variables in the same module, and hence the extracted feature components are not disturbed by noise from the other modules. In addition, the modularization analysis results can provide more useful information for fault localization. The differences between MMA and the multiblock MSPM methods are as follows: first, for MMA, each variable may belong to more than one module (in Fig. 1, one variable belongs to two modules), so each module represents one complete correlation without information loss; second, for MMA, module division is based on statistical analysis rather than process prior knowledge, which is consistent with the data-driven feature of MSPM; third, each module contains only one correlation in MMA, whereas each block in the multiblock MSPM methods may contain more than one correlation. The main innovations of this study are as follows. First, we propose a modularization method based on singular value decomposition (SVD)[24] and particle swarm optimization (PSO)[25], which can divide the process variables into different minimalist modules and an independent module. Then, we propose new monitoring indices for each module. In addition, we propose a new fault localization strategy for MMA. According to a survey paper[1], PCA is the most commonly used MSPM method. As such, this paper focuses on the comparison of MMA and PCA; our conclusion is also applicable to other algorithms, such as PLS and CCA.
The simulation tests in a mathematical model and the Tennessee Eastman (TE) process[26] show that MMA can successfully obtain the minimalist modules; moreover, it achieves much better performance than the traditional MSPM methods in fault detection and fault localization. The remainder of this paper is organized as follows. In “Methods” section, we briefly review some concepts of classical PCA and the contribution plot method, and assess the defects of these methods. “Minimalist module analysis (MMA)” section then proposes MMA for process monitoring, and introduces some details. “Simulation study of MMA” section analyzes the characteristics of MMA, and compares this method with PCA by conducting tests on a mathematical model. “Fault detection in the Tennessee Eastman process” section compares MMA with other improved MSPM methods in the TE process. Lastly, “Conclusions” section summarizes the contributions of this paper, and discusses some directions for future studies.

Methods

Principal component analysis (PCA)

PCA decomposes the data matrix X ∈ R^(n×s) (where n is the number of samples and s is the number of variables) into a k-dimensional subspace as follows:

X = T P^T + E,

where T ∈ R^(n×k) is the score matrix, whose columns are orthogonal; P ∈ R^(s×k) is the loading matrix, which is orthonormal; and E is the residual matrix. To obtain the loading matrix P, one first calculates the covariance matrix

S = X^T X / (n − 1).

Then, S can be presented by singular value decomposition (SVD) as

S = P0 Λ P0^T,

where Λ = diag(λ1, λ2, …, λs) (λ1 ≥ λ2 ≥ … ≥ λs) is a diagonal matrix. The matrix P consists of the columns of P0 associated with the k largest eigenvalues, and k is determined by the cumulative percent variance (CPV)[27] criterion:

CPV(k) = (Σ_{i=1}^{k} λi / Σ_{i=1}^{s} λi) × 100% ≥ η,

where η is a parameter usually set to 85%. When CPV is larger than η, we take k as the number of principal components (PCs). Then, two statistics are constructed to monitor a new process data sample x:

T² = x^T P Λk^{-1} P^T x,  SPE = ‖(I − P P^T) x‖²,

where Λk = diag(λ1, …, λk) and x̂ = P P^T x is the reconstructed sample. The thresholds for the two indices can be found in reference[28].
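As a concrete illustration of the decomposition and the two statistics above, the following NumPy sketch fits a PCA monitoring model with the CPV rule and evaluates T² and SPE for a new sample. The function names are illustrative, and control-limit computation is omitted.

```python
import numpy as np

def fit_pca_monitor(X, cpv_threshold=0.85):
    """Fit a PCA monitoring model on normal operating data X (n x s).
    Returns the loading matrix P (s x k), the k leading eigenvalues,
    and the normalization parameters."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    Xn = (X - mean) / std                        # zero mean, unit variance
    S = Xn.T @ Xn / (Xn.shape[0] - 1)            # covariance matrix
    eigvals, P0 = np.linalg.eigh(S)              # eigendecomposition of symmetric S
    order = np.argsort(eigvals)[::-1]            # sort eigenvalues descending
    eigvals, P0 = eigvals[order], P0[:, order]
    cpv = np.cumsum(eigvals) / eigvals.sum()     # cumulative percent variance
    k = int(np.searchsorted(cpv, cpv_threshold)) + 1
    return P0[:, :k], eigvals[:k], mean, std

def t2_spe(x, P, eigvals, mean, std):
    """T^2 and SPE statistics for one new sample x."""
    xn = (x - mean) / std
    t = P.T @ xn                                 # score vector
    T2 = float(t @ (t / eigvals))                # t' diag(lambda_k)^{-1} t
    resid = xn - P @ t                           # (I - P P') x
    return T2, float(resid @ resid)
```

On normal data both statistics stay small; a fault pushes at least one of them above its control limit.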

Contribution plot

The contributions to SPE are calculated as

contr_j^SPE = (x_j − x̂_j)²,

where x_j and x̂_j are the jth elements of x and x̂, respectively. The contributions to T² are calculated as

contr_j^T² = Σ_{i=1}^{k} (t_i / λ_i) p_{j,i} x_j,

where p_i is the ith column of P and p_{j,i} is the element in the jth row and ith column of P. The role of the contribution plots in fault isolation is to indicate which variables are related to the fault rather than to reveal its actual cause. In general, variables with a higher contribution have a closer relationship with the fault source. The thresholds of the contribution indices can be obtained by kernel density estimation[29].
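In the same notation, the per-variable contributions can be sketched as below. The exact T² decomposition is not fully preserved in this extract, so the T² contribution shown is one commonly used variant (summing t_i/λ_i-weighted loadings over the k PCs); the SPE contribution is the squared reconstruction error of each variable.

```python
import numpy as np

def spe_contributions(x, P):
    """Contribution of each variable to SPE: (x_j - xhat_j)^2."""
    xhat = P @ (P.T @ x)                          # reconstructed sample
    return (x - xhat) ** 2

def t2_contributions(x, P, eigvals):
    """Contribution of each variable to T^2 (one common variant):
    contr_j = x_j * sum_i p_{j,i} * t_i / lambda_i, which sums to T^2."""
    t = P.T @ x                                   # score vector
    weights = (P * (t / eigvals)).sum(axis=1)     # sum_i p_{j,i} t_i / lambda_i
    return x * weights
```

By construction the SPE contributions sum to SPE and the T² contributions sum to T², so large entries point at the variables most associated with the alarm.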

Drawback of PCA and contribution plot method

Theorem

The redundant variables introduce extra noise into the principal components (PCs).

Proof. Assume X1 contains the variables belonging to a minimalist module, which can be full-rank decomposed as X1 = T0 P0, where rank(T0) = rank(P0) = rank(X1). Matrix X2 contains the redundant variables, which can be presented as a linear combination of X1:

X2 = X1 Φ + W,

where Φ is the linear transformation matrix and W is the noise belonging to X2. In this paper, we assume that each measurement variable contains independent sensor noise, and hence rank(W) = s′, the number of redundant variables. Taking X = [X1, X2], one obtains

X = [T0 P0, T0 P0 Φ + W] = T0 [P0, P0 Φ] + [0, W].

The part [P0, P0 Φ] can be full-rank singular value decomposed as T1 P1, where rank(T1) = rank(T0) and P1 is orthonormal. Hence, one obtains X = T0 T1 P1 + [0, W]. Taking T2 = T0 T1,

X = T2 P1 + [0, W].

Because T2 is non-orthogonal in most situations, we introduce another orthonormal matrix Q that orthogonalizes it; it should be noted that when T2 is already orthogonal, then Q = I. PCA picks the k largest components of T2 as PCs, and we denote them as T, with the corresponding k columns of the orthonormal matrices forming the associated loadings. Because these loadings are parts of orthonormal matrices, the projection of the noise part [0, W] onto the retained components is nonzero, except in the exceptionally rare situation that all columns of Q are orthogonal to the noise directions. As such, T is influenced by W, and the redundant variables X2 introduce extra noise W into the principal components (PCs). This finishes the proof.

Based on the Theorem, one finds that PCA is not good at handling process data with redundant variables. As for the contribution plot method, according to Eqs. (6) and (7), it is based on the difference between x and x̂. As shown in Fig. 2, when a fault occurs in a specific variable: (a) according to the score equation t = P^T x, the relevant principal components become faulty; (b) according to the reconstruction equation x̂ = P t, most reconstructed variables become faulty. As such, in practical engineering applications, it is hard to locate the fault source with the contribution plot method, because the contribution indices of too many variables alarm the fault.
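The theorem's claim can be checked numerically. In the sketch below (all sizes, noise levels, and seeds are illustrative assumptions), redundant variables X2 = X1 Φ + W are appended to a module X1; the leading PCA loadings then place clearly nonzero weight on the X2 columns, so the extracted components inherit the extra sensor noise W.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
factors = rng.normal(size=(n, 2))                  # two latent factors
A = rng.normal(size=(2, 4))
X1 = factors @ A + 0.01 * rng.normal(size=(n, 4))  # module variables, small sensor noise
Phi = rng.normal(size=(4, 3))                      # linear transformation matrix
W = rng.normal(size=(n, 3))                        # independent noise of the redundant sensors
X2 = X1 @ Phi + W                                  # redundant variables
X = np.hstack([X1, X2])

def top_loadings(X, k=2):
    """Loading vectors of the k largest principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T

V = top_loadings(X)
weight_on_redundant = np.linalg.norm(V[4:, :])     # loading weight on the X2 columns
print(weight_on_redundant)                         # clearly nonzero: the PCs pick up W
```

Because the scores are t = X v, any nonzero loading weight on the X2 columns injects the noise W directly into the principal components, which is exactly the disturbance the theorem describes.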
Figure 2

Fault propagation from original data to reconstructed data.


Section summary

In sum, to eliminate the noise disturbance in the redundant variables, and to improve the fault localization ability, we develop a new monitoring algorithm based on the minimalist module and propose a corresponding fault localization strategy in “Minimalist module analysis (MMA)” section.

Minimalist module analysis (MMA)

The content of this section is listed in Fig. 3 below.
Figure 3

Content of this “Minimalist module analysis (MMA)” section.


Minimalist module division

Traditional PCA approaches focus on the k largest eigenvalues of the covariance matrix, and the important information contained in the residual part is not used. When an eigenvalue λi is very small (e.g., 0.05), one obtains p_i^T x ≈ 0 for the corresponding loading vector p_i. Taking P_res as the columns of P0 associated with the s − k smallest eigenvalues, one obtains P_res^T x ≈ 0; each column of P_res thus describes one near-exact linear correlation among the process variables. Through a linear transformation of these columns, some elements can be driven to 0, and the resulting relation then describes the correlation between a subset of the variables (e.g., x2 and x3) without considering the others (e.g., x1). Such a variable subset is a minimalist module.

The flow of minimalist module division is as follows: find a transformation matrix that maximizes the number of 0 elements in the transformed residual loading matrix. This paper addresses this optimization problem by using the particle swarm optimization (PSO)[30] algorithm, as described below.

Step 1. Set num = 1.
Step 2. Take the num-th column of P_res as the target column and the remaining s − k − 1 columns as the rest. Solve, by PSO, the optimization problem of maximizing the number of elements of the combined column that fall in the interval (−ε, ε), where ε is close to 0 (such as 0.01).
Step 3. If num = s − k, go to Step 4; else, set num = num + 1 and go to Step 2.
Step 4. Collect the optimized columns, adjust each column to unit variance, and set all elements in the interval (−ε, ε) to 0. Take the variables corresponding to the non-zero elements in the ith (i = 1, …, s − k) column as the ith minimalist module (MMi).

The form of the minimalist module is not unique: a different transformation can yield another valid minimalist module, and hence the result of PSO may differ between runs.
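The division procedure above can be sketched as follows. The residual loadings come from the smallest eigenvalues, and each column is then combined with the others so that as many entries as possible fall into (−ε, ε). The paper optimizes this count with PSO; the sketch below substitutes a plain random-search hill climber, and all function names and parameter values are illustrative.

```python
import numpy as np

def residual_loadings(X, cpv_threshold=0.95):
    """Columns of P0 associated with the s - k smallest eigenvalues."""
    Xn = (X - X.mean(axis=0)) / X.std(axis=0)
    S = Xn.T @ Xn / (len(Xn) - 1)
    vals, P0 = np.linalg.eigh(S)                  # eigenvalues ascending
    cpv = np.cumsum(vals[::-1]) / vals.sum()      # CPV over descending eigenvalues
    k = int(np.searchsorted(cpv, cpv_threshold)) + 1
    return P0[:, : X.shape[1] - k]                # residual subspace, s x (s - k)

def sparsify_column(p, P_rest, eps=0.01, n_iter=20000, seed=0):
    """Find coefficients c so that p + P_rest @ c has as many elements as
    possible in (-eps, eps); a random-search stand-in for the PSO step."""
    rng = np.random.default_rng(seed)
    best_c = np.zeros(P_rest.shape[1])
    best_score = int((np.abs(p) < eps).sum())
    for _ in range(n_iter):
        c = best_c + 0.3 * rng.normal(size=P_rest.shape[1])
        score = int((np.abs(p + P_rest @ c) < eps).sum())
        if score > best_score:                    # greedy hill climbing
            best_c, best_score = c, score
    v = p + P_rest @ best_c
    v[np.abs(v) < eps] = 0.0                      # zero the near-zero elements
    if v.std() > 0:
        v = v / v.std()                           # unit variance, as in Step 4
    return v                                      # nonzeros mark one minimalist module
```

The variables corresponding to the nonzero entries of the returned column form one minimalist module; repeating the search over all residual columns yields the full module set.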

Independent module

Each variable in a minimalist module is strongly correlated with the other variables in that module. Some variables, such as x8 and x9 in Fig. 3, are therefore not included in any minimalist module; these variables belong to the independent module.

Monitoring indices construction

Each minimalist module can be monitored by the PCA algorithm independently. Assume X_Mi is the data belonging to MMi. Because each minimalist module represents exactly one independent correlation, the number of PCs for each minimalist module is fixed. The monitoring indices of each module, T²_Mi and SPE_Mi, are calculated in the same way as the classical T² and SPE statistics of the module's own PCA model, with corresponding thresholds for MMi. Different from the traditional SPE index, SPE_Mi is normalized by its threshold to eliminate the impact of the module's scale on SPE. The indices for the whole process, T²_M and SPE_M, combine the threshold-normalized module indices; as such, when some minimalist module detects a fault, these two indices become much larger than their normal values, and both combined indices share a common threshold. As for the variables in the independent module, they can be monitored by the T² index, which is denoted as T²_I.
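A per-module monitor and a combined index can be sketched as follows. The retained PC count per module and the max-based combination rule are assumptions standing in for details lost from this extract; thresholds would in practice come from the normal training data.

```python
import numpy as np

class ModuleMonitor:
    """Independent PCA monitor for one minimalist module (a sketch; the
    retained PC count s_i - 1 is an assumption)."""
    def __init__(self, X_mod):
        self.mean, self.std = X_mod.mean(axis=0), X_mod.std(axis=0)
        Xn = (X_mod - self.mean) / self.std
        S = Xn.T @ Xn / (len(Xn) - 1)
        vals, P0 = np.linalg.eigh(S)
        order = np.argsort(vals)[::-1]            # eigenvalues descending
        vals, P0 = vals[order], P0[:, order]
        k = X_mod.shape[1] - 1                    # one residual correlation per module
        self.P, self.lam = P0[:, :k], vals[:k]

    def indices(self, x):
        """T^2 and SPE of one sample for this module."""
        xn = (x - self.mean) / self.std
        t = self.P.T @ xn
        resid = xn - self.P @ t
        return float(t @ (t / self.lam)), float(resid @ resid)

def combined_index(stats, thresholds):
    """Whole-process index: the largest threshold-normalized module
    statistic, so a value above 1 means some module alarms."""
    return max(s / j for s, j in zip(stats, thresholds))
```

One monitor is fitted per minimalist module on the module's own variables, and the combined indices aggregate the per-module statistics.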

Fault localization

For MMA, the fault localization rules differ for the T²_M, SPE_M, and T²_I indices. For the T²_M index: when T²_Mi is normal, all variables related to MMi are normal. For example, in the mathematical model in Fig. 3, when T²_M1 and T²_M2 are faulty while T²_M3 and T²_M4 are normal, one concludes that: (a) the variables related to MM1 and MM2 may be faulty; (b) all variables related to MM3 and MM4 are normal; (c) the variable shared by MM1 and MM2 must be faulty because it is their only common variable, and x2 may also be faulty because we have no more information for judging it. For the SPE_M index: when SPE_Mi is faulty, the correlation between the variables in MMi may have changed. For example, in the mathematical model in Fig. 3, when the correlation involving x3 changes, the corresponding SPE_Mi alarms the fault. When a fault occurs in variables not belonging to any minimalist module, such as x8, it can only be handled with the detection result of the independent module, i.e., the T²_I index.
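The localization rules above reduce to simple set operations on the module alarm flags. The sketch below is illustrative (module names and the return structure are assumptions): variables of any quiet module are cleared, variables of alarmed modules remain candidates, and a variable shared by every alarmed module is the prime suspect.

```python
def localize(modules, t2_alarms, independent_alarms=()):
    """Apply the MMA fault localization rules to module alarm flags.

    modules           : dict, module name -> set of variable names.
    t2_alarms         : dict, module name -> bool (per-module T^2 alarm).
    independent_alarms: iterable of alarmed independent-module variables.
    Returns (candidate faulty variables, prime suspects)."""
    alarmed = [m for m, on in t2_alarms.items() if on]
    quiet = [m for m, on in t2_alarms.items() if not on]
    # Rule: every variable of a non-alarmed module is normal.
    normal = set().union(*(modules[m] for m in quiet)) if quiet else set()
    # Rule: variables of alarmed modules may be faulty, unless cleared above.
    candidates = (set().union(*(modules[m] for m in alarmed)) - normal
                  if alarmed else set())
    # Rule: a variable shared by every alarmed module is the prime suspect.
    prime = (set.intersection(*(modules[m] for m in alarmed)) - normal
             if alarmed else set())
    return candidates | set(independent_alarms), prime
```

For the Fig. 3-style example (MM1 and MM2 alarmed, MM3 quiet, x1 shared), the prime suspect reduces to the shared variable, matching rule (c) above.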

Simulation study of MMA

This section studies the performance of MMA through simulation tests and compares it with PCA and mutual information–multiblock PCA (MI-MBPCA)[31]. MI-MBPCA employs mutual information to divide the blocks automatically, and hence it does not need process prior knowledge for block division. In the test model, the driving random variables follow the standard Gaussian distribution, and each measured variable contains process noise. Approximately 10,000 normal observations are produced for offline modeling. After data normalization, the training data are adjusted to zero mean and unit variance, and the normalized data are then processed by MMA. MMA successfully obtains four minimalist modules and the independent module. MI-MBPCA divides the process variables into 5 blocks, which is not consistent with the process model, because one variable is correlated with variables in two different blocks but can belong to only one of them. To compare the monitoring performance of MMA, PCA, and MI-MBPCA, five test data sets are generated. Each data set contains 960 samples, and the fault occurs at the 160th sample point. The five faults are of the following types: Fault 1, a step change with an amplitude of 5 in x1; Fault 2, a change in the noise term N2 in the expression of x2; Fault 3, a step change with an amplitude of 0.2 in x3; Fault 4, a change in one term of the expression of one variable; Fault 5, a step change with an amplitude of 5 in one variable of the independent module. The detection results are listed in Table 1. The false alarm rate is calculated as the percentage of normal samples that trigger an alarm, and the detection rate is calculated as the percentage of faulty samples that trigger an alarm. In this study, all control limits are based on a probability of 99%, and the best result is marked in bold.
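The two evaluation metrics used in Table 1 can be computed as below, assuming a boolean alarm sequence and a known fault start index (160 here; the sequence shown is hypothetical).

```python
import numpy as np

def false_alarm_rate(alarms, fault_start):
    """Percentage of pre-fault samples that (falsely) trigger an alarm."""
    return 100.0 * np.asarray(alarms[:fault_start], dtype=float).mean()

def detection_rate(alarms, fault_start):
    """Percentage of post-fault samples that trigger an alarm."""
    return 100.0 * np.asarray(alarms[fault_start:], dtype=float).mean()

# Hypothetical alarm sequence: 960 samples, fault introduced at sample 160.
alarms = [False] * 160 + [True] * 800
print(false_alarm_rate(alarms, 160), detection_rate(alarms, 160))  # 0.0 100.0
```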
Table 1

False alarm rates (%) and detection rates (%) of the principal component analysis (PCA) method, the mutual information–multiblock PCA (MI-MBPCA), and the minimalist module analysis (MMA) method.

Method             PCA              MI-MBPCA    MMA
Index              T²      SPE      DR          T²_I    T²_M    SPE_M
False alarm rate   1.9     3.1      0.6         1.9     1.9     0.0
Detection rate
  Fault 1          95.8    5.3      89.3        0.8     99.0    0.4
  Fault 2          29.8    1.3      12.5        0.8     38.4    0.4
  Fault 3          1.3     0.8      0.1         0.8     1.9     93.8
  Fault 4          1.1     4.7      5.2         51.4    1.5     90.3
  Fault 5          33.9    94.6     95.8        97.8    1.0     0.4
As shown in Table 1, the performance of MMA is better than that of PCA and MI-MBPCA for all five faults. Because MMA divides the whole process data into several minimalist modules and an independent module, the noise in each variable does not disturb the unrelated modules, and MMA is therefore more robust to process noise than PCA. For MI-MBPCA, because each variable belongs to only one block, the remaining blocks may lose key information, and the block models may be biased. One interesting finding in Table 1 is that MMA successfully detects faults 3 and 4 while PCA fails. The reason for this phenomenon is that PCA monitors the complex correlations between all variables together, whereas MMA monitors each strong correlation (one minimalist module) independently; therefore, MMA is very sensitive to changes in specific correlations. The fault localization results of the algorithms for faults 3 and 5 are shown in Figs. 4 and 5, respectively. In Fig. 4, for PCA, several contribution indices alarm the fault, and we cannot locate the fault source. For MI-MBPCA, because one variable is influenced by another in the same block, both variables alarm the fault and we cannot locate the fault source. For MMA, all T²_I and T²_M indices are normal, which means that all variables in the independent module are normal and all variables in the minimalist modules fluctuate within the normal range; because the SPE_M index signals a fault alarm, one finds that the correlations involving x3 have changed.
Figure 4

Fault localization for fault 3.

Figure 5

Fault localization for fault 5.

In Fig. 5, although the fault occurs in a single variable, most indices in PCA signal a fault alarm, and we cannot locate the fault source. MI-MBPCA could in principle locate this fault source; however, because MI-MBPCA fails to detect fault 5, the fault localization step is skipped, and hence MI-MBPCA also fails to locate the fault source. For MMA, all T²_M and SPE_M indices are normal, and hence one finds that the fault is not in the minimalist modules; only T²_I signals a fault alarm, and hence MMA successfully locates the faulty variable in the independent module.

Fault detection in the Tennessee Eastman process

The Tennessee Eastman (TE) process[32] simulation is the most widely used simulation model for testing MSPM methods; it is outlined in Fig. 6. The TE process uses 12 manipulated variables, 22 continuous process measurements, and 19 composition measurements sampled less frequently to simulate a classical chemical process. Because the 19 composition measurements are difficult to measure in real time and one manipulated variable, i.e., the agitation speed, is kept constant, this study only monitors the other 22 measurements and 11 manipulated variables, as listed in Table 2. The twenty-one programmed faults introduced in the TE process are listed in Table 3. In this study, 960 normal samples are adopted as training data to construct the monitoring models. Each testing data set contains 960 samples, and the fault occurs at the 161st sample.
Figure 6

Schematic of the Tennessee Eastman process[33].

Table 2

Monitored variables in the Tennessee Eastman process[33].

Variable
1 A feed (stream 1)18 Stripper temperature
2 D feed (stream 2)19 Stripper steam flow
3 E feed (stream 3)20 Compressor work
4 Total feed (stream 4)21 Reactor cooling water outlet temperature
5 Recycle flow (stream 8)22 Separator cooling water outlet temperature
6 Reactor feed rate (stream 6)23 D feed flow valve (stream 2)
7 Reactor pressure24 E feed flow valve (stream 3)
8 Reactor level25 A feed flow valve (stream 1)
9 Reactor temperature26 Total feed flow valve (stream 4)
10 Purge rate (stream 9)27 Compressor recycle valve
11 Product separator temperature28 Purge valve (stream 9)
12 Product separator level29 Separator pot liquid flow valve (stream 10)
13 Product separator pressure30 Stripper liquid product flow valve (stream 11)
14 Product separator under flow (stream 10)31 Stripper steam valve
15 Stripper level32 Reactor cooling water flow
16 Stripper pressure33 Condenser cooling water flow
17 Stripper underflow (stream 11)
Table 3

Descriptions of faults in the Tennessee Eastman process[33].

No.DescriptionType
1Feed ratio of A/C, composition constant of B (stream 4)Step
2Composition of B, ratio constant of A/C (stream 4)Step
3Feed temperature of D (stream 2)Step
4Inlet temperature of reactor cooling waterStep
5Inlet temperature of condenser cooling waterStep
6Feed loss of A (stream 1)Step
7Header pressure loss of C—reduced availability (stream 4)Step
8Feed composite of A, B, and C (stream 4)Random variation
9Feed temperature of D (stream 2)Random variation
10Feed temperature of C (stream 4)Random variation
11Inlet temperature of reactor cooling waterRandom variation
12Inlet temperature of condenser cooling waterRandom variation
13Reaction kineticsSlow drift
14Valve of reactor cooling waterSticking
15Valve of condenser cooling waterSticking
16–20UnknownUnknown
21The valve for stream 4 was fixed at the steady-state positionConstant position
In this section, we compare MMA with PCA, MI-MBPCA, deep principal component analysis (DePCA)[34], and kernel dynamic PCA (KDPCA)[35]; the latter two methods are improved versions of PCA. The detection results of the five methods are listed in Table 4. The false alarm rate is calculated as the percentage of normal samples that trigger an alarm, and the detection rate is calculated as the percentage of faulty samples that trigger an alarm. In this study, all control limits are based on a probability of 99%, and the best result is marked in bold.
Table 4

False alarm rates (%) and detection rates (%) of the four fault detection methods.

Method             PCA             DePCA           KDPCA           MI-MBPCA   MMA
Index              T²      SPE     ET²     ESPE    T²      SPE     DR         T²_I    T²_M    SPE_M
False alarm rate   0.5     1.4     6.1     11.5    11.2    14.0    51.2       50.8    1.3     0.2
Detection rate
  Fault 1          99.1    99.9    99.1    100.0   99.0    99.6    99.9       44.6    100.0   0.0
  Fault 2          98.4    95.8    98.5    98.0    98.3    96.6    98.0       74.4    98.6    0.0
  Fault 3          0.9     2.6     17.6    17.4    0.9     3.1     0.8        1.9     6.6     1.8
  Fault 4          20.9    100.0   78.3    100.0   20.2    99.9    100.0      0.3     100.0   0.9
  Fault 5          24.3    20.9    38.8    45.0    24.0    24.8    23.5       14.0    33.0    100.0
  Fault 6          99.1    100.0   99.4    100.0   98.9    99.9    100.0      93.9    100.0   100.0
  Fault 7          100.0   100.0   100.0   100.0   99.9    99.9    100.0      100.0   45.6    3.6
  Fault 8          96.9    83.6    97.5    98.3    96.8    93.0    97.8       60.8    98.4    3.0
  Fault 9          1.8     1.8     16.9    14.0    1.5     3.1     2.5        1.5     6.0     1.3
  Fault 10         29.9    25.8    57.1    58.1    29.5    27.6    41.8       6.8     88.5    0.1
  Fault 11         40.6    74.9    86.3    85.0    40.5    74.9    82.5       0.9     89.4    1.1
  Fault 12         98.4    89.5    99.6    99.3    98.3    93.4    99.0       66.1    99.6    52.0
  Fault 13         93.6    95.3    94.4    95.1    93.5    95.0    95.4       66.5    95.6    22.9
  Fault 14         99.3    100.0   100.0   100.0   99.1    99.9    99.9       0.3     100.0   0.0
  Fault 15         1.4     3.0     17.8    19.6    1.3     3.4     2.5        1.6     11.6    2.0
  Fault 16         13.5    27.4    43.5    57.4    13.7    27.8    27.1       3.6     91.9    74.1
  Fault 17         76.4    95.4    91.6    94.4    76.5    94.8    93.5       1.0     97.1    0.1
  Fault 18         89.3    90.1    92.1    92.0    89.3    90.3    89.6       88.1    91.0    83.9
  Fault 19         11.0    12.5    68.8    68.9    8.7     21.0    13.8       1.6     90.4    48.3
  Fault 20         31.8    49.8    63.5    61.8    31.2    50.8    57.4       2.4     83.9    81.0
  Fault 21         39.3    47.3    54.6    61.8    35.3    50.1    47.4       39.8    66.3    0.6
As shown in Table 4, MMA, MI-MBPCA, and PCA achieve similar false alarm rates, and their values are much lower than those of the two improved PCA methods (over 10%). For fault detection rates, MMA achieves the best results in 17 of the 21 faults; for the remaining 4 faults, MMA's detection rates are not as high as those of DePCA only because DePCA sacrifices the false alarm rate. An eye-catching result is obtained for fault 5: the detection rates of the compared methods are generally below 50%, whereas MMA achieves a 100.0% detection rate, which indicates the superiority of MMA. In addition, the performance of MMA on faults 10, 16, 19, and 20 is much better than that of the other four methods. As the papers that proposed DePCA and KDPCA did not describe the construction of their contribution plots, we only compare the fault localization ability of PCA, MI-MBPCA, and MMA. The module-division matrix of MMA is shown in Table 5.
Table 5

Matrix for the Tennessee Eastman process.

Variable   MM1    MM2    MM3    MM4    MM5    MM6    MM7    MM8    MM9    MM10   MM11   MM12   MM13   MM14
1          -0.1   0.0    0.3    -0.3   0.1    0.0    0.0    0.3    0.0    -0.2   -0.2   0.3    0.0    0.0
2          0.1    0.0    0.1    0.1    0.3    0.0    0.0    0.0    -0.2   0.0    0.0    0.2    -0.1   0.1
3          0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
4          0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
5          0.0    0.0    0.1    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.1
6          0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
7          -0.3   -0.6   0.2    -0.3   0.0    0.4    0.0    0.1    -0.1   0.0    0.0    0.0    0.0    0.4
8          0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
9          0.5    0.0    0.0    0.0    0.0    0.3    0.0    -0.3   0.0    0.0    0.0    0.0    0.4    0.0
10         -0.1   0.0    -0.2   0.0    0.1    0.0    0.0    0.2    0.1    -0.2   0.0    0.0    -0.2   0.1
11         0.0    0.1    0.2    0.0    0.1    0.0    0.4    0.2    0.0    -0.4   0.1    0.3    0.0    0.0
12         0.0    0.0    0.0    0.1    0.0    0.0    0.0    0.2    0.0    0.0    0.0    0.0    0.0    0.0
13         0.0    0.5    0.3    0.0    0.3    0.0    0.0    -0.3   -0.2   0.0    -0.1   -0.5   0.4    -0.1
14         0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
15         -0.1   0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.1    0.0    0.0    0.4
16         0.5    0.5    -0.1   -0.1   -0.4   0.1    -0.3   0.4    0.5    0.4    0.0    0.2    -0.2   0.3
17         0.0    0.0    0.0    0.0    0.0    0.0    -0.4   0.0    -0.5   0.0    0.0    0.0    0.0    0.0
18         0.0    0.2    0.0    -0.2   0.2    0.6    -0.4   -0.1   0.0    -0.1   0.1    0.0    0.1    0.0
19         0.3    0.0    0.6    0.0    -0.4   -0.1   -0.1   0.0    0.0    0.4    0.5    0.0    -0.2   0.5
20         0.0    0.0    -0.1   0.6    0.1    -0.5   0.3    0.0    0.0    -0.6   0.0    0.1    0.4    -0.4
21         0.3    0.1    0.2    0.2    0.5    0.0    0.0    0.0    -0.3   0.0    -0.2   0.5    0.0    0.0
22         0.1    0.0    0.0    0.1    0.1    0.0    -0.4   -0.1   0.0    0.3    -0.1   0.0    -0.1   0.0
23         0.0    0.0    0.0    0.2    0.3    -0.1   0.0    0.0    -0.1   0.0    -0.1   0.2    -0.1   -0.1
24         0.0    0.0    0.0    0.1    0.0    0.0    0.0    0.0    0.0    -0.1   0.1    0.0    0.0    0.0
25         0.0    -0.2   -0.4   0.4    0.0    -0.1   0.0    -0.4   -0.1   0.0    0.4    -0.2   -0.1   0.0
26         0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
27         0.0    -0.2   -0.2   -0.2   0.0    0.2    0.2    0.0    0.0    0.0    0.2    0.3    -0.5   0.0
28         0.0    0.0    0.1    0.0    -0.2   -0.1   0.0    -0.1   0.0    0.1    0.0    0.0    0.2    -0.1
29         0.0    0.0    0.0    -0.1   0.0    0.0    0.0    -0.2   0.0    0.0    0.0    0.0    0.0    0.0
30         0.1    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    -0.1   0.0    0.0    -0.4
31         0.0    0.0    -0.2   -0.2   0.1    -0.1   0.0    0.2    0.0    0.2    -0.6   0.0    0.0    0.0
32         -0.5   0.0    0.0    0.0    0.0    -0.3   0.0    0.3    0.0    0.0    0.0    0.0    -0.4   0.0
33         0.0    0.0    0.0    0.0    0.0    0.0    -0.4   0.0    -0.5   0.0    0.0    0.0    0.0    0.0

Significant values are in bold.

Figure 7 shows the fault localization results for fault 4. According to Table 3, fault 4 is a step change in the inlet temperature of the reactor cooling water. As depicted in Fig. 6, the reactor temperature (variable 9 in Table 2) changes, and hence the reactor cooling water flow (variable 32 in Table 2) also changes to compensate for the temperature change. For PCA, the contribution indices of several variables signal a fault alarm; for MI-MBPCA, about 14 variables alarm the fault, so it fails to locate the fault source; for MMA, the alarmed module indices, interpreted with the fault localization rules presented in the "Fault localization" section, show that variables 9 and 32 are faulty. Both PCA and MMA can locate this fault. Different from the contribution plot method of PCA, all SPE_M indices of MMA are normal, which tells the engineers that the correlations between variables have not changed, and hence the fault source is a change in the amplitude of some variables. Thus, compared with PCA, MMA can provide more useful information for fault localization.
Figure 7

Fault localization for fault 4.

Fault localization for fault 4.

Conclusions

In this study, a new MSPM method called MMA was proposed to overcome the shortcoming of traditional MSPM methods in handling the redundant correlations among process variables. The superiority of MMA was verified by several simulation tests. It achieved much better detection performance for five different types of faults in a mathematical model test, two of which could not be detected by PCA or MI-MBPCA. MMA also performed better than other improved MSPM algorithms for 17 of the 21 faults in the Tennessee Eastman process. MMA is a completely new method, and hence much work can be done based on it. First, we can combine it with traditional nonlinear, dynamic, and robust strategies to improve its fault detection ability. We can also combine it with the traditional contribution plot method to improve its fault localization ability. Moreover, we can combine it with the key performance indicator[14] monitoring strategy. All of these investigations will be part of our future work.
