A Weighted Error Distance Metrics (WEDM) for Performance Evaluation on Multiple Change-Point (MCP) Detection in Synthetic Time Series.

Jin Peng Qi¹, Fang Pu², Ying Zhu³, Ping Zhang⁴.

Abstract

Change-point detection (CPD) aims to find abrupt changes in time-series data. Various computational algorithms have been developed for CPD applications, and many performance metrics have been introduced to compare them. Each existing evaluation method, however, measures a different aspect of an algorithm. Building on the existing weighted error distance (WED) method for single change-point (CP) detection, a novel WED metric (WEDM) is proposed to evaluate the overall performance of a CPD model across not only repetitive tests on single CP detection but also successive tests on multiple change-point (MCP) detection in synthetic time series under the random slide window (RSW) and fixed slide window (FSW) frameworks. The proposed WEDM method introduces a normalized error distance that allows comparison of the distance between the estimated change-point (eCP) position and the target change point (tCP) in the synthetic time series. In successive MCPs detection, the WEDM method first divides the original time-series sample into a series of data segments according to the assigned tCP set and then calculates a normalized error distance (NED) value for each segment. Next, it presents the frequency and WED distribution of the resultant eCPs from all data segments over the normalized positive-error distance (NPED) and normalized negative-error distance (NNED) intervals in the same coordinates. Last, the mean WED (MWED) and the mean weighted true distance (MWTD = 1 − MWED) are obtained and used as the main performance evaluation indexes. On synthetic datasets generated in the Matlab platform, repetitive tests on single CP detection were executed with different CPD models, including ternary search tree (TST), binary search tree (BST), Kolmogorov-Smirnov (KS) test, t-test (T), and singular spectrum analysis (SSA) algorithms. Meanwhile, successive tests on MCPs detection were implemented under the FSW and RSW frameworks. These CPD models were evaluated in terms of the WED metrics, together with supplementary indexes for evaluating the convergence of the models, including the hit, miss, and error rates and the computing time. The experimental results showed the value of this WEDM method.

Year: 2022 PMID: 35371237 PMCID: PMC8970941 DOI: 10.1155/2022/6187110

Source DB: PubMed Journal: Comput Intell Neurosci

1. Introduction

Change-point (CP) detection comprises techniques for detecting abrupt changes in the properties of time-series data. It has been widely studied in many real-world problems, such as atmospheric and financial analyses [1], fault detection in engineering systems [2, 3], detection of changes in the variance of oceanographic time series [4], genetic time-series analyses [5], and online detection of steady-state operation [6]. It may also be beneficial for detecting abnormal patterns in ECG and EEG signals [4, 7–15], where it would allow appropriate staff to be alerted to abrupt changes in a patient's medical situation and to provide timely treatment [16, 17]. In addition, CPD models can be tightly combined with nonlinear modeling approaches and their applications, such as classification of human hand movements [18], degradation signals for prognostic improvement [19], real-life hand prosthetic control [20], and single-channel surface electromyography (sEMG)-based control [21]. CPD models utilize algorithms that span data mining, statistics, and computer science, including parametric and nonparametric methods [8, 22–27]. Each CPD algorithm can be assessed in terms of detection accuracy, computational cost, or suitability for real-time detection. Many performance metrics have been introduced to evaluate CPD algorithms based on the type of decisions they make [28]. Aminikhanghahi and Cook [29] reviewed the performance evaluation methods commonly used for CPD models. The evaluation can be based on a yes/no decision as to whether the resultant change point was detected within a certain distance of the actual change point. In this case, the CPD model can be treated as a binary classification model and evaluated with the usual measures, such as accuracy, sensitivity, specificity, or the ROC curve [30, 31]. 
For real applications, for example, clinical decision-making, cut-offs applied to the model outcomes can be adjusted to achieve different sensitivity and specificity [32]. However, when the difference in time between the resultant estimated change point (eCP) and the actual target change point (tCP) is the measure of CPD performance, the evaluation of these algorithms is not as straightforward as for binary classification, because there is no single label against which the performance of the algorithm can be measured. A few useful metrics use the distance between the eCP and the tCP to measure CPD performance: mean absolute error (MAE), mean squared error (MSE), mean signed difference (MSD), root mean squared error (RMSE), and normalized root mean squared error (NRMSE). Of these, only NRMSE normalizes the unit size of the predicted value and so facilitates a more direct comparison of error between different datasets; the others measure only the absolute distance between the eCP and the tCP. However, even NRMSE does not distinguish between an eCP that falls before the actual tCP and one that falls after it. It also fails to consider the relative position of the tCP within the total length of the time-series sample. In our previous study [33], a preliminary WED method was proposed for evaluating a CPD model on single change-point detection. That method introduces the weighted error distance (WED), which is based on a normalized error distance between each pair of resultant eCP and actual tCP, and ranks the performance of different CPD models by the averaged WED [33]. In this study, a novel WEDM method is proposed to compare the overall performance of CPD models for MCPs detection on multiple data segments in a time series with different data features. 
Based on the previous WED measure, this WEDM method introduces a normalized error distance that allows comparison of the distance between the estimated change-point (eCP) position and the target change point (tCP). During successive MCPs detection, the proposed WEDM method first divides the original sample into a series of data segments according to the assigned tCPs and then computes a normalized error distance (NED) value for each segment. Then, it presents the frequency and WED distribution of the resultant eCPs from all data segments over the normalized positive-error distance (NPED) and normalized negative-error distance (NNED) intervals in the same coordinates. Last, the mean WED (MWED) and the mean weighted true distance (MWTD = 1 − MWED) are calculated and used as the main performance indexes. On synthetic datasets generated in the Matlab platform, both repetitive tests on single CP detection and successive tests on MCPs detection were executed with different CPD models, including ternary search tree (TST) [8, 34], binary search tree (BST) [15, 24], Kolmogorov–Smirnov (KS) test [22, 25], t-test (T) [23, 35], and singular spectrum analysis (SSA) algorithms [36] described in our previous studies [22, 37]. These CPD models were evaluated under the random slide window (RSW) [8, 38, 39] and fixed slide window (FSW) frameworks [40–44] in terms of our WEDM and supplementary indexes, including the hit, miss, and error rates and the computing time. The experimental results showed the value of this WEDM method.

2. Methods

In this part, the proposed WEDM is described in the following steps. First, the diagnosed sample is divided into a series of data segments according to the assigned target MCPs. Second, a normalized error distance (NED) is calculated by comparing the distance between the resultant eCP position and the actual tCP within each data segment. Third, the frequency and WED distribution of the resultant eCPs detected from all segments are presented across the normalized positive-error distance (NPED) and the normalized negative-error distance (NNED) intervals in the same coordinates. Last, the mean WED (MWED) and mean WTD (MWTD) metrics are given to evaluate a CPD model for MCPs detection on a series of data fluctuations in a single time series.

2.1. Data Segmentation

Suppose a time-series signal $X = \{X_1, \ldots, X_t, \ldots, X_n\}$ can be observed as a trajectory of a multiple data distribution process, in which each observation $X_t$ is defined by the following equation:

$$X_t = f_i(t) + \varepsilon_t,$$

where $t \in \{t_{i-1}+1, \ldots, t_i\}$, $0 < i \le M$, and $f \in \{f_1, \ldots, f_M\}$ is a deterministic, piecewise function of a one-dimensional signal with change points (satisfying $f_i \ne f_{i+1}$ for $i = 1, \ldots, M-1$, which ensures that abrupt changes occur). $M \in \{1, 2, \ldots, n\}$ is the number of data segment regimes, so $M-1$ is the number of abrupt changes, with $0 = t_0 < t_1 < \cdots < t_i < \cdots < t_M = n$. The number $M-1$ and the locations $\eta_1, \ldots, \eta_{M-1}$ of the change points in the process are supposed to be unknown. The sequence $(\varepsilon_t)_{t \in \mathbb{N}}$ is assumed to be random white noise such that $E(\varepsilon_t)$ is exactly or approximately zero. In the simplest case, $(\varepsilon_t)_{t \in \mathbb{N}}$ is modeled as i.i.d., but it can also follow more complex time-series distributions. Given the observed time-series signal $X$ with $M-1$ change points described above, a sub-series $X' = \{X_s, \ldots, X_j, \ldots, X_e\}$ of size $N'$ is selected from $X$, with $1 \le s < j < e \le N$ and $1 < N' \le N$. Suppose a set of target MCPs, tMCP set $= \{tCP_1, \ldots, tCP_n\}$, is contained within $X'$, with $1 \le n \le M-1$. In the proposed WEDM method, the diagnosed data sample $X'$ is first divided into a series of data segments according to the target CP positions in the tMCP set. The process of data segmentation is described below (Figure 1): Here, $N'$ is the total length of $X'$, and $N_{Seg_i}$ refers to the size of $Seg_i$.

Figure 1

The scheme of WEDM evaluation on the target MCPs detection in the diagnosed X′.

For each $tCP_i$ to be diagnosed in the tMCP set, the data segment $Seg_i$ can be denoted as follows: where $1 < i < n$ and $1 < n < N'$, and the two mCP endpoints of $Seg_i$ are formulated as follows: In particular, the first segment $Seg_1$ and the last segment $Seg_n$ can be presented according to $tCP_1$ and $tCP_n$ as follows: where $X_s$ and $X_e$ are the two endpoints of $X'$, respectively. Then, the time series $X' = \{X_s, \ldots, X_j, \ldots, X_e\}$ can be divided into a set of data segments SEG set $= \{Seg_1, \ldots, Seg_n\}$. That is, $X' = \{Seg_1, \ldots, Seg_n\}$, and the following equation holds
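The segmentation step can be sketched as follows. The sketch assumes that the interior segment boundaries (the mCP endpoints) are the integer midpoints between adjacent tCPs, which is one plausible reading rather than a definition taken from the text. The function name `segment_bounds` and its index arguments are hypothetical.

```python
from typing import List, Tuple


def segment_bounds(s: int, e: int, tcps: List[int]) -> List[Tuple[int, int, int]]:
    """Return one (start, tCP, end) index triple per target change point.

    Assumption: interior boundaries are the integer midpoints between
    adjacent tCPs; the first segment starts at s and the last ends at e.
    """
    if not tcps:
        raise ValueError("at least one tCP is required")
    if any(b - a < 2 for a, b in zip(tcps, tcps[1:])):
        raise ValueError("tCPs must be increasing and at least 2 apart")
    if not (s < tcps[0] and tcps[-1] < e):
        raise ValueError("every tCP must lie strictly inside (s, e)")

    # Midpoints between neighbouring tCPs act as shared segment boundaries.
    mids = [(a + b) // 2 for a, b in zip(tcps, tcps[1:])]
    starts = [s] + mids
    ends = mids + [e]
    return list(zip(starts, tcps, ends))


# Two target change points inside a series indexed 0..100.
bounds = segment_bounds(0, 100, [20, 60])
```

Each triple keeps the tCP strictly inside its segment, so both sides of the tCP have non-zero length when the error distance is normalized later.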

2.2. NED Evaluation on Single CP Detection

In the scheme of error distance (ED) measurement on single CP detection (Figure 2), each segment $Seg_i$ in the time series $X' = \{Seg_1, \ldots, Seg_n\}$, $1 \le i \le n$, is divided into a former (left) part and a latter (right) part by the data point at which the actual $tCP_i$ is located.

Figure 2

The scheme of error distance (ED) measurement on single CP detection in the data segment Seg. In the positive area, X represents the start point of Seg, and X is the position of resultant eCP within the positive area before the actual tCP. On the other hand, X represents the endpoint of Seg, and X stands for the eCP located within the negative area after the tCP.

From a statistical point of view, we refer to the former (left) part as the positive area and the latter (right) part as the negative area. When a CPD model is applied to detect the actual tCP in the data segment Seg, the resultant eCP may be estimated in either the positive or the negative area. Four quantities are introduced to measure CPD model performance: true-positive distance (tPD), positive-error distance (pED), true-negative distance (tND), and negative-error distance (nED). If the resultant eCP is detected on the left side of the tCP (positive area), pED is the distance from the eCP to the tCP and tPD is the distance from the eCP to the start point of Seg; nED and tND are not applicable. Conversely, when the eCP is estimated on the right side of the tCP (negative area), nED is the distance from the eCP to the tCP and tND is the distance from the eCP to the end of Seg; pED and tPD do not exist (Figure 2). These definitions are represented in formulas (6)–(9), which are written in terms of the start point and endpoint of the segment Seg, the position of the actual tCP in Seg, and the position of the resultant eCP on the left or right side of the tCP. In the scheme of NED evaluation on single CP detection (Figure 3), the distance from the start point to the tCP and the distance from the tCP to the end of each segment are both normalized to 1, so that the normalized tCP positions of all segments coincide at the same point. In formulas (10)–(13), tPDR, pEDR, tNDR, and nEDR can be interpreted as the normalized true-positive distance (NtPD), normalized positive-error distance (NpED), normalized true-negative distance (NtND), and normalized negative-error distance (NnED), respectively.
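The relations below are a sketch of how formulas (6)–(13) read when the prose definitions above are written out. The position symbols $p_s$, $p_e$, $p_t$, and $p_c$ (segment start, segment end, actual tCP, and resultant eCP) are notation assumed here for illustration, and the ratio denominators are an assumption that follows from normalizing each side of the tCP to length 1.

```latex
% eCP on the left of the tCP (positive area), p_s \le p_c < p_t
\begin{align*}
pED  &= p_t - p_c,              & tPD  &= p_c - p_s,\\
pEDR &= \frac{pED}{pED + tPD},  & tPDR &= \frac{tPD}{pED + tPD}.
\end{align*}
% eCP on the right of the tCP (negative area), p_t < p_c \le p_e
\begin{align*}
nED  &= p_c - p_t,              & tND  &= p_e - p_c,\\
nEDR &= \frac{nED}{nED + tND},  & tNDR &= \frac{tND}{nED + tND}.
\end{align*}
```

Under these assumptions pEDR + tPDR = 1 and nEDR + tNDR = 1, so an eCP at the start or end of the segment has a normalized error distance of 1, and an eCP that coincides with the tCP has a normalized error distance of 0.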

Figure 3

The scheme of NED evaluation on single CP detection in the data segment Seg. Here, “−1” and “1” represent the start point and endpoint of Seg, and “0” marks the position of the actual tCP on the x-axis.

Thereafter, the normalized error distance NED in formula (14) is given by a piecewise function of NpED and NnED, depending on whether the resultant eCP is located in the positive or the negative area.
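A minimal sketch of the piecewise NED follows, assuming integer positions and that each side of the tCP is normalized to length 1. The function name and the sign convention (negative for eCPs in the positive area, so that values run from −1 to 1 as on the x-axis of Figure 3) are assumptions made here for illustration.

```python
def normalized_error_distance(start: int, tcp: int, end: int, ecp: int) -> float:
    """NED of one eCP within a segment [start, end] whose target is tcp.

    Returns a value in [-1, 0) when the eCP lies before the tCP (positive
    area), 0.0 on an exact match, and a value in (0, 1] when it lies after
    the tCP (negative area).
    """
    if not (start < tcp < end):
        raise ValueError("tcp must lie strictly inside the segment")
    if not (start <= ecp <= end):
        raise ValueError("ecp must lie inside the segment")
    if ecp < tcp:
        p_ed = tcp - ecp      # positive-error distance: eCP to tCP
        t_pd = ecp - start    # true-positive distance: eCP to segment start
        return -p_ed / (p_ed + t_pd)
    if ecp > tcp:
        n_ed = ecp - tcp      # negative-error distance: eCP to tCP
        t_nd = end - ecp      # true-negative distance: eCP to segment end
        return n_ed / (n_ed + t_nd)
    return 0.0


# The tCP sits at 20 in a segment spanning 0..100, and the eCP is reported at 60.
ned = normalized_error_distance(0, 20, 100, 60)
```

Because each side is scaled by its own length, an eCP 40 points after a tCP at position 20 scores 0.5 here, whereas an eCP 40 points before the tCP would be impossible in this segment; this is how the relative position of the tCP enters the measure.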

2.2.1. WED Evaluation on MCPs Detection

Given the series of data segments SEG set $= \{Seg_1, \ldots, Seg_n\}$ in the diagnosed time series $X'$ above, we can assemble all the resultant eCPs in a single coordinate system and present their NED values, ranging from the positive area [−1, 0] to the negative area [0, 1] on the x-axis (Figure 4). Then, the frequency of each NED value among all the resultant eCPs can be defined as follows:

$$Freq(NED_i) = \frac{Num(NED_i)}{N_t}$$

Figure 4

The scheme of NED evaluation for MCPs detection on tMCP set $= \{tCP_1, \ldots, tCP_n\}$ in a time series $X'$. For each resultant eCP, the value of NED equals NpED or NnED depending on whether the eCP is located in the positive or the negative area, ranging from −1 to 1 on the x-axis.

Here, $Num(NED_i)$ is the number of resultant eCPs whose NED value equals $NED_i$, and $N_t$ is the total number of resultant eCPs, $1 \le i \le N_t$. Then, the weighted error distance WED is introduced according to the NED and Freq(NED) of the resultant eCPs (Figure 5). For each eCP in the scattered distribution of resultant eCPs, its WED is equal to WpED or WnED depending on whether its NED falls in the positive (NpED) or negative (NnED) area of the x-axis, which ranges from −1 to 1. The definitions of WpED, WnED, and WED are formulated as follows:

Figure 5

The scheme of WED metrics for MCPs detection on a set of target MCPs, tMCP set $= \{tCP_1, \ldots, tCP_n\}$, in a time series $X'$. For each eCP in the scattered distribution of resultant eCPs, the value of WED refers to WpED or WnED according to whether the NED is located in the positive (NpED) or negative (NnED) area, ranging from −1 to 1 on the x-axis.

Thereafter, the mean weighted error distance (MWED) is defined as follows: where l and r are the numbers of eCPs located before and after the actual tCPs (in the positive NpED area and the negative NnED area), respectively. In most CPD models, if the search algorithm reaches the start or end of the time series without finding a change point, the resultant eCP can be set to either the start or the end. Therefore, the sum of l and r equals N (the total number of actual tCPs to be diagnosed in the time series X′), and formula (17) can be simplified as follows: Furthermore, 1 − MWED is referred to as the mean weighted true distance (MWTD) and used as a measure of the overall performance of a CPD model for MCPs detection on a time series with a series of data fluctuations.
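The frequency and mean steps can be sketched as follows. The frequency is computed as the count of each NED value over the total number of eCPs. For MWED, the sketch assumes that the WED of each eCP equals |NED| and that MWED is the plain average over all l + r eCPs; the exact weighting in the WED formulas may differ. The function names and the rounding step are illustrative choices.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def ned_frequencies(neds: Iterable[float], ndigits: int = 6) -> Dict[float, float]:
    """Map each distinct NED value to Num(NED) / Nt.

    Values are rounded to `ndigits` decimals before counting so that
    floating-point noise does not split equal NEDs (an implementation
    choice made here, not part of the method description).
    """
    rounded = [round(v, ndigits) for v in neds]
    if not rounded:
        raise ValueError("at least one NED value is required")
    counts = Counter(rounded)
    return {value: count / len(rounded) for value, count in counts.items()}


def mwed_mwtd(neds: Iterable[float]) -> Tuple[float, float]:
    """Return (MWED, MWTD) for a collection of per-eCP NED values.

    Assumption: the WED of an eCP is taken as |NED|, so MWED is the mean of
    |NED| over all eCPs, and MWTD = 1 - MWED.
    """
    errors = [abs(v) for v in neds]
    if not errors:
        raise ValueError("at least one NED value is required")
    if any(e > 1.0 for e in errors):
        raise ValueError("NED values must lie in [-1, 1]")
    mwed = sum(errors) / len(errors)
    return mwed, 1.0 - mwed


# One exact match, one early eCP, one late eCP, and one eCP at the segment end.
freqs = ned_frequencies([0.0, -0.5, 0.5, 1.0])
mwed, mwtd = mwed_mwtd([0.0, -0.5, 0.5, 1.0])
```

Under this assumption an eCP placed at a segment boundary contributes the maximum error of 1, and a model whose eCPs all coincide with the tCPs reaches MWTD = 1.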

3. Results and Discussion

To evaluate the different CPD models accurately, other related indexes were introduced in addition to our WEDM. In the synthetic experiments, time-series datasets were generated and assembled with the Gaussian distribution function in the Matlab platform, and repetitive tests on single CP detection were then executed with the TST, BST, KS, T, and SSA models. Meanwhile, the performance of the CPD models was evaluated with successive tests on MCPs detection implemented under the RSW and FSW frameworks, respectively.

3.1. Related Evaluation Indexes

In the synthetic tests, some other indexes are used for evaluating the convergence of different CPD models, including the hit, miss, and error rates and the computing time. Given a data segment Seg in the time series X′ mentioned above, the related definitions are introduced in terms of the error distance between the resultant eCPs and the actual tCP as follows (Figure 6): where ST refers to the computing time spent on the Seg, and N is the total number of data segments. Then, the normalized time is defined as follows:

Figure 6

The scheme of single CP detection on the data segment Seg within a sliding window Wi. The definitions of hit, error, miss, and redundant are introduced according to the distance between tCP and eCP, respectively.

Error distance: Given an actual tCP assigned in the current data segment Seg, the error distance ED between the estimated eCP and the tCP is defined by ED = |eCP − tCP|.

Hit area: For the actual tCP, the hit area HA is given by HA = [tCP − hd, tCP + hd], where hd is the threshold value of the error distance between tCP and eCP.

Hit: If 0 ≤ ED ≤ hd holds, the tCP is hit by the eCP, recorded as Hit(tCP) = 1. In this case, the value of WED defined in formula (18) equals 0.

Error: If ED > hd holds, the eCP is treated as an error result, labeled Error(eCP) = 1. In this case, the value of WED is within the range (0, 1).

Miss: If no change point is detected in the Seg, the target tCP is missed, identified by Miss(tCP) = 1. Accordingly, the value of WED is set to 1.

Thereafter, the hit rate, miss rate, and error rate are formulated as follows: Here, NHit = ∑Hit(tCP) is the number of actual tCPs hit by the resultant eCPs, NMiss = ∑Miss(tCP) is the number of actual tCPs that are missed, and NError = ∑Error(eCP) is the number of resultant MCPs for which ED > hd holds. The total number of resultant MCPs is usually larger than the number of actual tCPs within the time series X′. Generally, hit rate + miss rate + error rate = 1 holds for all the resultant eCPs.

Computing time: For a given CPD model k, the computing time is mainly spent on detecting tCPs in the multiple data segments of X′, and it can be denoted as follows: Here, ST stands for the computing time of model k, and n is the total number of models to be compared. NST represents the ratio of the computing time of model k to that of all methods, and thus reflects its search efficiency relative to the others. Generally, both the TST and BST models in our previous studies have a time complexity of nearly O(log N) [8, 10, 13]; therefore, they should be faster and more efficient than traditional algorithms with a time complexity of about O(N^2), such as the KS, CUSUM, t-test, or SSA methods.
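The hit, miss, and error labels and the normalized computing time can be sketched as follows. The sketch assumes that each rate is the label count divided by the total number of outcomes, so that the three rates sum to 1, and that the normalized time of a model is its computing time divided by the summed time of all compared models, which matches the Time rows of the result tables summing to approximately 1. The function names and the use of `None` for "no change point detected" are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def classify(tcp: int, ecp: Optional[int], hd: int) -> str:
    """Label one detection as 'hit', 'error', or 'miss' using threshold hd."""
    if ecp is None:  # no change point was reported for this segment
        return "miss"
    return "hit" if abs(ecp - tcp) <= hd else "error"


def rates(pairs: List[Tuple[int, Optional[int]]], hd: int) -> Dict[str, float]:
    """Hit, miss, and error rates over all (tCP, eCP) outcomes.

    Assumption: each rate is the label count divided by the total number
    of outcomes, so the three rates sum to 1.
    """
    if not pairs:
        raise ValueError("at least one outcome is required")
    labels = [classify(t, e, hd) for t, e in pairs]
    return {k: labels.count(k) / len(labels) for k in ("hit", "miss", "error")}


def normalized_times(times: Dict[str, float]) -> Dict[str, float]:
    """NST of each model: its computing time over the total of all models."""
    total = sum(times.values())
    if total <= 0:
        raise ValueError("total computing time must be positive")
    return {model: t / total for model, t in times.items()}


# Four segments with threshold hd = 5: two hits, one error, one miss.
outcome = rates([(100, 102), (200, 250), (300, None), (400, 400)], hd=5)
nst = normalized_times({"model_a": 1.0, "model_b": 3.0})
```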

3.2. Repetitive Tests on Single CP Detection

In the first experiment, repetitive tests on single CP detection were executed on a synthetic dataset, Dataset1, generated by the Gaussian function in the Matlab R2016 platform. Each time series X with a single target CP is composed of a positive area before and a negative area after the assigned target tCP. The former and latter parts were generated from the normal distributions N (μ = 0, σ = 1) of size m (m time points in the positive area) and N (μ = V, σ = 1) of size N − m (N − m time points in the negative area), respectively, where V is a constant mean value and N is the total length of X. Here, we first present the results from Dataset1, which was composed of 20 data groups with different length N, variance V, and tCP; each group contains 100 time-series samples. Dataset1 therefore included 2000 time series in total, and this experiment, named Exp1, was performed with the TST, BST, KS, T, and SSA models. In our simulations, the time-series samples in each group were generated by selecting random values of the sample length N from 2^10 to 2^15, the variance V from 1.0 to 3.7, and the position of the actual tCP from 1 to N. The results of the repetitive tests on the 20 groups of Exp1 are shown in Figure 7. For the total of 2000 time-series samples in Dataset1, the frequency and WED distribution of the resultant MCPs are illustrated from the positive NpED range [−1, 0] to the negative NnED range [0, 1] on the x-axis. These results show that the closer a resultant eCP is to the central axis x = 0, the smaller its WED value generally is, tending to 0, and vice versa. 
Among the five models, TST and KS obtain eCPs that are mostly located near the central field x = 0, and thus have narrower WED distributions and smaller WED values than the other models, except that TST has a few eCPs that fall into the positive NpED field. For the BST, T, and SSA models, the eCPs are scattered over a wide range from the NpED to the NnED areas; their WED distributions are therefore wider and their WED values larger, especially for T and SSA.
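A single-CP sample of the kind described above can be sketched as follows, using only the Python standard library rather than Matlab. V is treated as the post-change mean, following the N (μ = V, σ = 1) description; the function name, the seed argument, and the 0-based indexing are assumptions made here for illustration.

```python
import random
from typing import List, Tuple


def make_single_cp_series(n: int, m: int, v: float, seed: int = 0) -> Tuple[List[float], int]:
    """Generate one series of length n with a mean shift after m points.

    The first m points are drawn from N(0, 1) and the remaining n - m
    points from N(v, 1). Returns the series and the 0-based index of the
    first post-change point, used here as the target change point.
    """
    if not (0 < m < n):
        raise ValueError("m must satisfy 0 < m < n")
    rng = random.Random(seed)  # local generator keeps the sample reproducible
    before = [rng.gauss(0.0, 1.0) for _ in range(m)]
    after = [rng.gauss(v, 1.0) for _ in range(n - m)]
    return before + after, m


# A short sample: 2000 points with the mean shifting from 0 to 3 at index 1200.
series, tcp = make_single_cp_series(2000, 1200, 3.0, seed=1)
```

Repeating this call with random n, m, and v in the stated ranges would give one group of samples for a repetitive test.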

Figure 7

The frequency and WED distribution of resultant MCPs from the 20 groups in Dataset1. For the different models of (a) TST, (b) BST, (c) KS, (d) T, and (e) SSA, the frequency and WED distribution of the resultant MCPs are demonstrated from the NpED range of [−1, 0] to the NnED range of [0, 1] in the x-axis, respectively.

These simulation results also show that TST and KS converge better than the other models; in particular, TST has the highest hit rate and the shortest computing time of the five. Among the remaining models, BST performs better than T and SSA, and T converges worst, with the lowest hit rate, the largest error rate, and the longest computing time of all five models. The mean analyses (Table 1) confirm that TST has the shortest computing time, the highest hit rate, the smallest MWED, and the largest MWTD of the five models. For the T and SSA models, many eCPs are scattered over the whole field from NPED to NNED; T in particular has the largest error rate and MWED and needs the longest time of all five models.

Table 1

The mean analyses of single CP detection in Exp1 by using TST, BST, KS, T, and SSA models.

Items	TST	BST	KS	T	SSA
MWTD	0.9972	0.9633	0.9947	0.7030	0.8349
Hit rate	0.4040	0.1540	0.0430	0.0340	0.0585
Miss rate	0.0005	0.0035	0.0000	0.0005	0.0005
Error rate	0.0012	0.0038	0.0002	0.1202	0.0601
MWED	0.0028	0.0367	0.0053	0.2970	0.1651
Time	0.0032	0.0039	0.3126	0.5239	0.1566

In addition, the efficiencies of the five models are evaluated using random parameter values in a total of 20 tests. The dynamic tracks of hit rate, miss rate, error rate, and MWED are plotted against the test number from 1 to 20 (Figure 8). The mean analyses of hit rate, miss rate, error rate, and MWED are also presented in histograms, in which “1,” “2,” “3,” “4,” and “5” on the x-axis refer to the TST, BST, KS, T, and SSA models, respectively. Over the whole course of the simulation tests, the TST model has a relatively high hit rate, with some fluctuations, and keeps more stable and lower levels of miss rate, error rate, and MWED than the others. Although KS has a smaller hit rate than TST and BST, it keeps lower tracks of miss and error rates than BST, T, and SSA. BST has a larger hit rate and lower error rate and MWED than T and SSA, but it appears unstable because of the drastic oscillations in its hit-rate and miss-rate tracks. T and SSA both have smaller hit rates and show dramatic fluctuations in their error-rate and MWED tracks, despite a lower miss rate than BST.

Figure 8

The results of the 20 tests on single CP detection using the 2000 synthetic time series in Dataset1 of Exp1, with random parameters of sample size (N) from 2^10 to 2^15, actual tCP from the start to the end of the sample length (N), and variance (V) from 1.0 to 3.7. For the TST, BST, KS, T, and SSA models, the dynamic tracks of (a) hit rate, (b) miss rate, (c) error rate, and (d) MWED are shown versus the simulation test number from 1 to 20. In addition, the mean analyses of (e) hit rate, (f) miss rate, (g) error rate, and (h) MWED are shown, in which “1,” “2,” “3,” “4,” and “5” on the x-axis refer to the TST, BST, KS, T, and SSA models, respectively.

Furthermore, taking one representative test as an example, the simulations of single CP detection are executed repeatedly on 100 time-series samples with the randomly chosen parameter values N = 2^14, tCP = 12267, and V = 1.9. For the TST, BST, KS, T, and SSA models, the resultant eCPs are illustrated by their locations, distributions, frequency, and WED, plotted against the test number, time-series positions, NPED, and NNED on the x-axis, respectively (Figure 9). For both the TST and KS models, most of the eCPs are located within a small range near the actual tCP = 12267, and the distribution, frequency, and WED analyses of the resultant eCPs show similar results. By contrast, for the BST, T, and SSA models, many of the eCPs are randomly scattered across the fields from NPED to NNED, and only a small share are gathered near the actual tCP.

Figure 9

The repetitive simulations of single CP detection on 100 time series from one of the 20 tests in Exp1, with random parameter values of sample size N = 2^14, actual tCP = 12267, and variance (V) = 1.9. For the TST, BST, KS, T, and SSA models, the simulation results, including (a) locations, (b) distributions, (c) frequency, and (d) WED of the resultant eCPs, are plotted against the test number, time-series positions, NPED, and NNED on the x-axis, respectively.

Then, the mean analyses for this representative test are summarized in terms of MWTD, hit rate, miss rate, error rate, MWED, and time (Table 2). The results show that the TST model has the smallest MWED and the largest hit rate and MWTD, and that no other model has a lower miss rate, error rate, or computing time. Despite a longer computing time and a smaller hit rate than TST, KS kept its MWTD, miss rate, and error rate at similar levels. As for BST, T, and SSA, although the three models performed similarly, BST had the largest miss rate, and T had the smallest MWTD and hit rate and the largest time, error rate, and MWED.

Table 2

The analyses of one representative test on repetitive single CP detection by different CPD models.

Items	TST	BST	KS	T	SSA
MWTD	0.9998	0.7315	0.9980	0.5708	0.6567
Hit rate	0.3400	0.0600	0.0400	0.0010	0.0020
Miss rate	0.0000	0.0001	0.0000	0.0000	0.0000
Error rate	0.0001	0.0259	0.0001	0.2496	0.1536
MWED	0.0002	0.2685	0.0020	0.4292	0.3433
Time	0.0012	0.0012	0.3276	0.5861	0.0840

3.2.1. Successive MCPs Detection under the RSW Framework

In the second experiment, successive tests on MCPs detection were implemented on another synthetic dataset, Dataset2, composed of W time-series samples, where each sample X was assembled from n data segments with different features and distributions. For a given tMCP set, each tCP is assigned between two adjacent segments $Seg_i$ and $Seg_{i+1}$, 1 ≤ i ≤ n − 1. The sample X is thus the concatenation of its segments with the tCPs at the segment boundaries, where Nsj is the size of segment $Seg_j$ in X. In the successive tests on MCPs detection, two experiments, Exp2 and Exp3, were implemented on Dataset2 under the RSW and FSW frameworks, respectively. For each experiment, a series of tests on MCPs detection was executed with the TST, BST, KS, T, and SSA models. In Exp2, the number of segments n within each sample X was chosen at random from 15 to 30, and each data segment was randomly generated from the Gaussian distribution N(U, V) with length Nsj from 2^12 to 2^15, mean U from 1.0 to 0.1 × N, and variance V from 1 to 2.0 × N. Here, we present the results of successive tests on MCPs detection under the RSW framework. First, the frequency and WED distribution of the resultant MCPs (Figure 10) are displayed over the whole range from the positive NPED field to the negative NNED field on the x-axis. Generally, for a given CPD model, the closer the resultant MCPs are to the central axis x = 0, the smaller its MWED is; conversely, the larger the MWTD, the better the efficiency. Among the five models, the results (Figure 10) and the mean analyses (Table 3) show that most of the resultant MCPs detected by TST are located near the central axis x = 0, and that TST has the largest hit rate and the smallest error rate and MWED, and therefore the highest MWTD. 
For the BST model, although many of the resultant MCPs are scattered away from the central axis x = 0, it has a smaller error rate and MWED, as well as a larger hit rate and MWTD, than KS, T, and SSA. For KS, T, and SSA, the common feature is that most of the resultant MCPs are spread over the whole field ranging from −1 to 1 on the x-axis. KS has a larger MWTD than the other two, T has the smallest MWTD, and SSA has the largest error rate and computing time of all five models.
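A multi-segment sample of the kind used in the successive tests can be sketched as follows, again with the Python standard library rather than Matlab. Each segment gets its own random length, mean, and spread, and the tCPs are recorded at the segment boundaries. The function name and the mean and spread ranges are illustrative placeholders, not the ranges of Exp2.

```python
import random
from typing import List, Tuple


def make_mcp_series(
    n_segments: int,
    min_len: int,
    max_len: int,
    seed: int = 0,
) -> Tuple[List[float], List[int]]:
    """Concatenate Gaussian segments and return (series, tCP indices).

    Each tCP is the 0-based index at which the next segment starts, so
    n_segments segments yield n_segments - 1 target change points.
    """
    if n_segments < 2:
        raise ValueError("at least two segments are needed for a change point")
    if not (0 < min_len <= max_len):
        raise ValueError("require 0 < min_len <= max_len")
    rng = random.Random(seed)
    series: List[float] = []
    tcps: List[int] = []
    for k in range(n_segments):
        length = rng.randint(min_len, max_len)
        mean = rng.uniform(1.0, 10.0)   # placeholder range for the segment mean
        sigma = rng.uniform(1.0, 2.0)   # placeholder range for the segment spread
        series.extend(rng.gauss(mean, sigma) for _ in range(length))
        if k < n_segments - 1:
            tcps.append(len(series))    # boundary between segment k and k + 1
    return series, tcps


# Five short segments; Exp2-sized samples would use far longer segments.
series, tcps = make_mcp_series(5, 50, 100, seed=3)
```

Because the means are drawn independently, two adjacent segments can occasionally be almost indistinguishable; a stricter generator would redraw until neighbouring segments differ.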

Figure 10

The analyses on the frequency and WED distribution of resultant MCPs in the total 10 tests of Exp2. For the different MCP models of (a) TST, (b) BST, (c) KS, (d) T, and (e) SSA under the RSW framework, the frequency and WED distribution of resultant MCPs are illustrated within the NPED ranging from −1 to 0, and the NNED ranging from 0 to 1 in the x-axis.

Table 3

The performance analyses on MCPs detection by five CPD models in Exp2 under the RSW framework.

Items	TST	BST	KS	T	SSA
MWTD	0.9264	0.8856	0.7850	0.5141	0.5299
Hit rate	0.8629	0.6255	0.2256	0.2029	0.0820
Miss rate	0.0398	0.0585	0.0430	0.0471	0.0018
Error rate	0.1006	0.3421	0.7682	0.7661	0.8851
MWED	0.0736	0.1144	0.2150	0.4859	0.4701
Time	0.0004	0.0003	0.1483	0.1134	0.7376

Meanwhile, these simulations show that TST converges best, because it has the highest hit rate, the lowest error rate, and one of the two shortest computing times (0.0004, against 0.0003 for BST) of the five models. Among the others, BST converges much better, with a higher hit rate, lower error rate, and shorter time than KS, T, and SSA. SSA appears to be the worst of the five models, with the lowest hit rate, the largest error rate, and the longest computing time. Second, the performance of the five CPD models is demonstrated by a series of 10 tests, in which the sample size N, the number of MCPs N, the mean μ, and the variance δ are randomly taken from 2^12–2^15, 15–30, 1–0.1 × N, and 1–2 × N, respectively. The dynamic tracks and mean analyses (Figure 11) indicate that TST still performs best, with a higher and more stable hit rate and lower error rate and MWED than the other four models. Although BST looks more efficient than KS, T, and SSA, its dynamic tracks fluctuate more strongly in all four items, especially the miss rate, which probably means that BST performs unstably during MCPs detection. The remaining models all have similar tracks, with lower hit rates and higher error rates. KS is unstable, as shown by the fluctuating tracks of its miss rate and MWED, and so is the T model, as shown by its fluctuating miss rate over the 10 random tests. The models' performance can also be evaluated and distinguished intuitively from the mean analyses in the histograms (Figure 11(e)–11(h)).

Figure 11

The simulations of MCPs detection on the total of 10 tests in Exp2 under the RSW framework, with random parameters of sample size Nsj from 2^12 to 2^15, the number of tCPs N from 15 to 30, mean U from 1 to 0.1 × N, and variance V from 1 to 2 × N, respectively. For the different TST, BST, KS, T, and SSA models, the performance analyses are denoted in (a) hit rate, (b) miss rate, (c) error rate, and (d) MWED, respectively. Furthermore, the mean analyses are illustrated in histograms of (e) hit rate, (f) miss rate, (g) error rate, and (h) MWED, in which, “1,” “2,” “3,” “4,” and “5” in x-axis refer to TST, BST, KS, T, and SSA, respectively.

Last, one representative test is selected from Exp2 above, and the simulations of MCPs detection are demonstrated on a time series with nMCPs = 25 (Figure 12). For the diagnosed data sample (Figure 12(f)), the distributions of resultant MCPs are illustrated for the TST, BST, KS, T, and SSA models, respectively (Figure 12(a)–12(e)). The frequency and WED distribution of resultant MCPs (Figure 13) and the mean analyses (Table 4) reveal that TST is the best of the five models, because most of its resultant MCPs hit the target MCP positions and few of them fall in the miss or error states. The BST model takes second place, with a smaller hit rate and a bigger error rate than TST. The remaining models, KS, T, and SSA, perform progressively worse because more of their resultant MCPs are in the error state; as a result, the hit rate decreases and MWED increases.

Figure 12

The simulations of MCPs detection on the representative test with N = 25. For one selected sample in (f), the resultant MCPs are illustrated by using different models of (a) TST, (b) BST, (c) KS, (d) T, and (e) SSA, respectively.

Figure 13

The results of WED evaluation on the 100 samples with N = 25 in Exp2. For the different MCP models of (a) TST, (b) BST, (c) KS, (d) T, and (e) SSA, the frequency and WED distribution of resultant MCPs are illustrated within the NPED ranging from −1 to 0, and the NNED ranging from 0 to 1 in the x-axis, respectively.

Table 4

The mean analyses on five CPD models in one representative MCPs detection test with nMCPs = 25.

Items	TST	BST	KS	T	SSA
MWTD	0.9655	0.9319	0.8836	0.6042	0.5779
Hit rate	0.9310	0.5667	0.3030	0.2222	0.0727
Miss rate	0.0345	0.0333	0.0303	0.0635	0.0000
Error rate	0.0345	0.4333	0.6970	0.7619	0.8909
MWED	0.0345	0.0681	0.1164	0.3958	0.4221
Time	0.0004	0.0003	0.1889	0.1209	0.6896

Successive MCPs detection under the FSW framework. In Exp3 under the FSW framework, a total of 30 data segments was arranged within each sample X, and each data segment Seg = { x1,…, X} was randomly generated from the Gaussian distribution N(U, V) of length N from 2^12 to 2^15, with mean U from 1.0 to 0.1 × 30 and variance V from 1 to 2.0 × 30, and with the size of the fixed slide window N_fsw ranging from 2^6 to 2^15. In our simulations, we execute a total of 10 successive tests on MCPs detection under the FSW framework. First, the frequency and WED distribution of resultant MCPs (Figure 14) are displayed from the negative-NPED field to the positive-NNED field in the x-axis. Generally, for a given CPD model, the closer the resultant MCPs lie to the central axis x = 0, the smaller their WED values are. The results (Figure 14 and Table 5) indicate that for the TST model, most of the resultant MCPs are located near the central axis x = 0; it has the biggest hit rate and the smallest error rate, MWED, and computing time, and therefore the highest MWTD of the five CPD models. For the BST, KS, T, and SSA models, the common feature is that most of the resultant MCPs are scattered over the whole field from −1 to 1 in the x-axis. KS has a smaller MWED and a bigger MWTD than BST, T, and SSA, and a smaller miss rate than BST and T. Although BST has a bigger hit rate and a shorter time than KS, it has a bigger MWED and a smaller MWTD than TST and KS. T and SSA have much bigger MWED and error rates and smaller MWTD; SSA in particular has the smallest MWTD and the biggest error rate and time of the five models.
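The data setup described above can be sketched as follows. This is a minimal illustration, not the authors' Matlab generator: it assumes that the boundaries between neighboring Gaussian segments play the role of target change points and that segment lengths are drawn at random, neither of which is stated in the text. The function name `make_sample` and its parameters are hypothetical.

```python
import math
import random


def make_sample(n_points, n_segments, mean_range, var_range, seed=0):
    """Build one piecewise-Gaussian sample and return it with its boundaries.

    Assumption: the interior boundaries between segments act as the target
    change points (tCPs); the paper's own construction may differ.
    """
    rng = random.Random(seed)
    # Draw distinct interior boundaries, then sort them into segment edges.
    tcps = sorted(rng.sample(range(1, n_points), n_segments - 1))
    bounds = [0] + tcps + [n_points]
    series = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        mu = rng.uniform(*mean_range)               # segment mean U
        sigma = math.sqrt(rng.uniform(*var_range))  # std from variance V
        series.extend(rng.gauss(mu, sigma) for _ in range(end - start))
    return series, tcps


# One sample of length 2^12 with 30 segments, U in [1, 0.1 x 30], V in [1, 2 x 30].
series, tcps = make_sample(2**12, 30, (1.0, 0.1 * 30), (1.0, 2.0 * 30))
print(len(series), len(tcps))
```

A fixed seed keeps the sample reproducible, which matters when the same sample is passed to several CPD models for comparison.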

Figure 14

The analyses on the frequency and WED distribution of resultant MCPs in the total 10 tests of Exp3. For the different MCP models of (a) TST, (b) BST, (c) KS, (d) T, and (e) SSA under the FSW framework, the frequency and WED distribution of resultant MCPs are illustrated within the NPED field ranging from −1 to 0 and the NNED field ranging from 0 to 1 in the x-axis.

Table 5

The mean analyses on MCPs detection in Exp3 by five CPD models under the FSW framework.

Items	TST	BST	KS	T	SSA
MWTD	0.9875	0.7758	0.8268	0.5063	0.5009
Hit rate	0.7867	0.5167	0.3106	0.1862	0.0525
Miss rate	0.1900	0.0930	0.0894	0.0977	0.0633
Error rate	0.0200	0.4186	0.6194	0.7271	0.9004
MWED	0.0125	0.2242	0.1732	0.4937	0.4991
Time	0.0006	0.0008	0.1419	0.0766	0.7802
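To make the per-index comparison in the table above explicit, the sketch below picks the best model for each row. The values are copied from the table; treating MWTD and hit rate as "higher is better" and all other rows as "lower is better" is an assumption of this sketch, and `best_model` is a hypothetical helper name.

```python
# Values copied from the Exp3 (FSW) table above.
models = ["TST", "BST", "KS", "T", "SSA"]
table5 = {
    "MWTD":       [0.9875, 0.7758, 0.8268, 0.5063, 0.5009],
    "Hit rate":   [0.7867, 0.5167, 0.3106, 0.1862, 0.0525],
    "Miss rate":  [0.1900, 0.0930, 0.0894, 0.0977, 0.0633],
    "Error rate": [0.0200, 0.4186, 0.6194, 0.7271, 0.9004],
    "MWED":       [0.0125, 0.2242, 0.1732, 0.4937, 0.4991],
    "Time":       [0.0006, 0.0008, 0.1419, 0.0766, 0.7802],
}
# Assumption: these two indexes should be maximized, the rest minimized.
higher_is_better = {"MWTD", "Hit rate"}


def best_model(metric):
    """Return the model with the best value for the given table row."""
    values = table5[metric]
    pick = max if metric in higher_is_better else min
    return models[values.index(pick(values))]


for metric in table5:
    print(metric, best_model(metric))
```

Under these assumptions TST leads on every index except the miss rate, where SSA has the lowest value and TST the highest.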

Meanwhile, these simulations illustrate that TST has the best convergence of the five models, with the highest hit rate, the lowest error rate, and the shortest time. Among the other four models, BST is much better than the rest, because it has a relatively higher hit rate, a lower error rate, and a much shorter time. SSA has the worst convergence of the five models, with the lowest hit rate, the biggest error rate, and the longest computing time. Second, the performance of the five CPD models is evaluated by a series of successive MCPs detection tests in Exp3. Generally, the dynamic tracks and histogram analyses (Figure 15) show that all five CPD models respond unstably to the size of the fixed slide window, N_fsw, ranging from 2^6 to 2^15, especially the TST, BST, and KS models. Although the TST model has the biggest miss rate, with drastic fluctuations, it remains the most efficient because it has the highest hit rate and the lowest error rate and MWED of the five models. Of the rest, BST seems better than KS, T, and SSA because of its higher hit rate and slightly decreasing error rate. Although KS shows a decreasing hit rate and an increasing error rate with large fluctuations, it seems better than T and SSA on account of its lower miss rate and MWED. Both T and SSA are inefficient and insensitive to the increasing N_fsw, especially SSA, which has the lowest hit rate and the highest error rate and MWED of all models.

Figure 15

The simulations of MCPs detection on the total of 10 tests in Exp3 under the FSW framework, with random parameters of sample size N from 2^12 to 2^15, the fixed number of tCPs N = 30, mean U from 1 to 0.1 × 30, and variance V from 1 to 2 × 30, respectively. For the different CPD models of TST, BST, KS, T, and SSA, the performance analyses are denoted in (a) hit rate, (b) miss rate, (c) error rate, and (d) MWED, respectively. Furthermore, the mean analyses are illustrated in (e) hit rate, (f) miss rate, (g) error rate, and (h) MWED, in which “1,” “2,” “3,” “4,” and “5” in the x-axis refer to TST, BST, KS, T, and SSA, respectively.

Last, taking the TST model as an example, five representative simulations are selected from the total of 10 tests in the FSW framework of Exp3 (Figure 16(a)–16(e)), and the performance evaluation is listed for N_fsw = 2^6, 2^8, 2^12, 2^14, and 2^15, respectively (Table 6). Given one data sample with N = 30 (Figure 16(f)), the results of MCPs detection show that the TST model performs best at N_fsw = 2^12, with the biggest hit rate and MWTD and the smallest miss rate and MWED of the five tests; its error rate (0.0333) is not the lowest of the five, but it remains small. However, the efficiency of TST tends to worsen when N_fsw is too large or too small. Therefore, the size of the fixed slide window is a key factor for the FSW framework during MCPs detection.

Figure 16

The simulations of MCPs detection by the TST model under different sizes of fixed slide window N_fsw in the FSW framework. Given the diagnosed data sample with N = 30 in (f), the resultant MCPs detection is illustrated under different N_fsw values of (a) 2^6, (b) 2^8, (c) 2^12, (d) 2^14, and (e) 2^15, respectively.

Table 6

The performance evaluations on the TST model with different N_fsw under the FSW framework in Exp3.

N_fsw	Hit rate	Miss rate	Error rate	MWED	MWTD
2^6	0.4667	0.5333	0.0000	0.5333	0.4667
2^8	0.7000	0.3000	0.0000	0.3000	0.7000
2^12	0.9667	0.0303	0.0333	0.0586	0.9414
2^14	0.7667	0.2000	0.0333	0.2002	0.7998
2^15	0.6000	0.3333	0.0000	0.3333	0.6667
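The window-size comparison in the table above can be reproduced with a short lookup. The sketch below copies the table values and selects the N_fsw with the smallest MWED; the dictionary layout and variable names are illustrative assumptions.

```python
# Values copied from the table above (TST model, FSW framework, Exp3).
table6 = {
    2**6:  {"hit": 0.4667, "miss": 0.5333, "error": 0.0000, "mwed": 0.5333, "mwtd": 0.4667},
    2**8:  {"hit": 0.7000, "miss": 0.3000, "error": 0.0000, "mwed": 0.3000, "mwtd": 0.7000},
    2**12: {"hit": 0.9667, "miss": 0.0303, "error": 0.0333, "mwed": 0.0586, "mwtd": 0.9414},
    2**14: {"hit": 0.7667, "miss": 0.2000, "error": 0.0333, "mwed": 0.2002, "mwtd": 0.7998},
    2**15: {"hit": 0.6000, "miss": 0.3333, "error": 0.0000, "mwed": 0.3333, "mwtd": 0.6667},
}

# Window size with the smallest mean weighted error distance.
best_window = min(table6, key=lambda n: table6[n]["mwed"])

# Window sizes whose error rate is exactly zero in the table.
zero_error_windows = sorted(n for n, row in table6.items() if row["error"] == 0.0)

print(best_window)
print(zero_error_windows)
```

In the rows with zero error rate, MWED equals the miss rate, so the small and large windows lose accuracy through missed change points rather than through misplaced ones.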

In all, the results of the two experiments above suggest that the proposed WED method can visually present the distribution of resultant eCPs in the error state, together with their normalized distance from the target position at zero in the x-axis. The simulation results also suggest that MWED summarizes the mean error distance over all tests and thereby measures the efficiency of a given model in successive MCPs detection. In this way, the performances of different CPD models can be evaluated, and the better ones can be discerned from the others.

4. Conclusions and Discussion

In this study, a novel WEDM method is proposed for evaluating the overall performance of a CPD model across not only repetitive tests on single CP detection but also successive tests on multiple change-point (MCP) detection on synthetic time series under the RSW and FSW frameworks. In this method, a concept of normalized error distance is introduced that allows comparison of the distance between the estimated change-point (eCP) position and the target change-point (tCP) position in the synthetic time series. In particular, both positive- and negative-error distances between resultant eCPs and actual tCPs are weighted or normalized to create the WED metrics. As opposed to previous methods, our WEDM allows comparison when CPD is applied across multiple time-series samples with different lengths and variances, and especially across multiple data segments within a single time series that differ in data distribution, segment size, and the number and positions of tCPs. In successive MCPs detection, the WEDM method first divides the original sample into a series of data segments according to the assigned target change points and then calculates a normalized error distance (NED) value for each segment. Next, WEDM presents the frequency and WED distribution of the resultant eCPs from all data segments in the normalized positive-error distance (NPED) and normalized negative-error distance (NNED) intervals in the same coordinates. Last, the mean WED (MWED) and MWTD (1-MWED) are obtained and used as important performance indexes. In our simulations, a series of MCPs detection tests was executed on synthetic time-series datasets in the Matlab platform, and the proposed method was applied to evaluate CPD with the TST, BST, KS, T, and SSA models under repetitive single CP detection in Exp1, successive MCPs detection under the RSW framework in Exp2, and successive MCPs detection under the FSW framework in Exp3.
The results showed that WEDM can compare the outputs of CPD models over a series of synthetic tests on multiple time-series samples. The WED metrics offer a new way of evaluating CPD performance. They allow better visualization of the distribution of the resultant eCPs when the CPD models work on multiple time series with different data features, as well as on multiple data segments of one time-series sample with different data patterns. Meanwhile, the convergence of different CPD models was analyzed in terms of the dynamic tracks and mean analyses of the WED value, as well as other measurements, including the hit, error, and miss rates and the computational cost. Our WEDM method not only offers a visualizable, overall measure but also gives users better advice on which CPD model to use for a given application.
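The segment-wise procedure summarized above can be sketched in a few lines. This is only an illustration under stated assumptions, since the exact weighting formula is not given in this section: the signed distance from tCP to eCP is divided by the distance from the tCP to the segment edge on the same side, a segment with no estimate is counted as a miss with distance 1, and MWED is the mean absolute NED. The function names, the sign convention, and the example segments are hypothetical.

```python
def normalized_error_distance(ecp, tcp, seg_start, seg_end):
    """Signed eCP-to-tCP distance scaled into [-1, 1] (assumed formula).

    Returns None when the segment has no estimated change point.
    """
    if ecp is None:
        return None
    # Normalize by the room available on the side where the estimate fell.
    span = (seg_end - tcp) if ecp >= tcp else (tcp - seg_start)
    return (ecp - tcp) / span if span else 0.0


def mwed_and_mwtd(neds):
    """Mean absolute NED and its complement; misses count as distance 1."""
    distances = [1.0 if d is None else abs(d) for d in neds]
    mwed = sum(distances) / len(distances)
    return mwed, 1.0 - mwed


# Hypothetical segments given as (start, tCP, end), one eCP (or None) each.
segments = [(0, 50, 100), (100, 150, 200), (200, 250, 300), (300, 350, 400)]
ecps = [50, 175, 240, None]

neds = [normalized_error_distance(e, t, s, end)
        for e, (s, t, end) in zip(ecps, segments)]
mwed, mwtd = mwed_and_mwtd(neds)
print(neds, mwed, mwtd)
```

Counting a miss as distance 1 is consistent with the rows of Table 6 that have zero error rate, where MWED equals the miss rate, but the authors' own treatment may differ in detail.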
