Detecting multiple generalized change-points by isolating single ones.

Andreas Anastasiou1, Piotr Fryzlewicz2.   

Abstract

We introduce a new approach, called Isolate-Detect (ID), for the consistent estimation of the number and location of multiple generalized change-points in noisy data sequences. Examples of signal changes that ID can deal with are changes in the mean of a piecewise-constant signal and changes, continuous or not, in the linear trend. The number of change-points can increase with the sample size. Our method is based on an isolation technique, which prevents the consideration of intervals that contain more than one change-point. This isolation enhances ID's accuracy as it allows for detection in the presence of frequent changes of possibly small magnitudes. In ID, model selection is carried out via thresholding, or an information criterion, or SDLL, or a hybrid involving the former two. The hybrid model selection leads to a general method with very good practical performance and minimal parameter choice. In the scenarios tested, ID is at least as accurate as the state-of-the-art methods; most of the time it outperforms them. ID is implemented in the R packages IDetect and breakfast, available from CRAN. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00184-021-00821-6.
© The Author(s) 2021, corrected publication 2021.

Keywords:  SDLL; Schwarz information criterion; Segmentation; Symmetric interval expansion; Threshold criterion

Year:  2021        PMID: 34054146      PMCID: PMC8142888          DOI: 10.1007/s00184-021-00821-6

Source DB:  PubMed          Journal:  Metrika        ISSN: 0026-1335            Impact factor:   1.057


Introduction

Change-point detection is an active area of statistical research that has attracted a lot of interest in recent years. Our work’s focus is on a posteriori change-point detection, where the aim is to estimate the number and locations of certain changes in the behaviour of the data. We work in the model

$$X_t = f_t + \sigma\epsilon_t, \quad t = 1, 2, \ldots, T, \qquad (1)$$

where $X_t$ are the observed data and $f_t$ is a one-dimensional, deterministic signal with structural changes at certain points. Two examples are: change-points in the level when $f_t$ is seen as piecewise-constant, and change-points in the first derivative when $f_t$ is piecewise-linear. We highlight, however, that our methodology and analysis apply to more general scenarios, for instance the detection of knots in a piecewise polynomial signal of order k, where k is not necessarily equal to zero (piecewise-constant mean) or one (piecewise-linear mean). The number N of change-points as well as their locations are unknown and our aim is to estimate them. In addition, N can grow with T. The random variables $\epsilon_t$ in (1) have mean zero and variance one; further assumptions will be given in Sect. 3.2.

When $f_t$ is assumed to be piecewise-constant, the existing change-point detection techniques are mainly split into two categories based on whether the change-points are detected all at once or one at a time. The former category mainly includes optimization-based methods, in which the estimated signal is chosen based on its least squares or log-likelihood fit to the data, penalized by a complexity rule in order to avoid overfitting. The most common example of a penalty function is the Schwarz Information Criterion (SIC); see Yao (1988) for details. To solve the implied penalization problem, dynamic programming approaches, such as the Segment Neighborhood (SN) and Optimal Partitioning (OP) methods of Auger and Lawrence (1989) and Jackson et al. (2005), have been developed. In an attempt to improve on OP’s computational cost, Killick et al. 
(2012) introduce the PELT method, based on a pruning step applied to OP’s dynamic programming approach. A non-parametric adaptation of PELT is given in Haynes et al. (2017). Rigaill (2015) introduces an improvement over classical SN algorithms, through a pruning approach called PDPa, while Maidstone et al. (2017) give two algorithms by combining ideas from PELT and PDPa. Frick et al. (2014) propose the simultaneous multiscale change-point estimator (SMUCE) for the change-point problem in the case of exponential family regression; solving an optimization problem is also required. The FDRSeg method of Li et al. (2016) is a combination of False Discovery Rate (FDR) control and global segmentation methods in a multiscale way; the change-points are again detected all at once. In the latter category, in which change-points are detected one at a time, a popular method is binary segmentation, which performs an iterative binary splitting of the data on intervals determined by the previously obtained splits. Vostrikova (1981) introduces and proves the validity of binary segmentation in the setting of change-point detection for piecewise-constant signals. The main advantages of binary segmentation are its conceptual simplicity and low computational cost. However, at each step of the algorithm, binary segmentation looks for a single change-point, which leads to its suboptimality in terms of accuracy, especially for signals with frequent change-points. Some variants of binary segmentation that work towards solving this issue are the Circular Binary Segmentation (CBS) of Olshen et al. (2004), the Wild Binary Segmentation (WBS) of Fryzlewicz (2014) as well as its second version (WBS2) of Fryzlewicz (2020), the Narrowest-Over-Threshold (NOT) method of Baranowski et al. (2019), and the Seeded Binary Segmentation (SeedBS) of Kovács et al. (2020). CBS searches for at most two change-points at each step. 
Instead of initially calculating the contrast value for the whole data sequence, WBS and NOT are based on a random draw of subintervals of the domain of the data, on which an appropriate statistic is tested against a threshold. The draw of all the subintervals takes place at the beginning of the algorithm. In contrast, WBS2 first draws only a small number of data subsamples. It then uses the first change-point candidate to split the data into two parts, and again recursively draws the same number of subsamples to the left and to the right of this change-point candidate, and so on. A major difference between WBS and WBS2 is that the latter adaptively decides where to recursively draw the next subsamples, based on the change-point candidates detected so far; this adds to the detection power of the method. SeedBS is an approach, similar to WBS and NOT, that relies instead on a deterministic construction of background intervals in which single change-points are searched. Apart from binary-segmentation-related approaches, the category in which the change-points are detected one at a time also includes methods that control the False Discovery Rate. For instance, the “pseudo-sequential” (PS) procedure of Venkatraman (1992), as well as the CPM method of Ross (2015), are based on an adaptation of online detection algorithms to a posteriori situations and work by bounding the Type I error rate of falsely detecting change-points. Some methods do not fall in either category. For example, the tail-greedy algorithm in Fryzlewicz (2018) achieves a multiscale decomposition of the data using Unbalanced Haar wavelets in an agglomerative way. In addition, Eichinger and Kirch (2018) use moving sum (MOSUM) statistics in order to detect multiple change-points. For a more thorough review of the literature on the detection of multiple change-points in the mean of univariate data sequences, see Cho and Kirch (2020) and Yu (2020). Truong et al. 
(2020) also present a survey of various a posteriori change-point detection algorithms; the focus is, however, on multivariate time series. Beyond the piecewise-constant signal model, existing methods mainly minimize the residual sum of squares taking into account a penalty, with the most common being the SIC. This is used in Bai and Perron (1998), in the trend filtering (TF) approach (Kim et al. 2009; Tibshirani 2014), and in the dynamic programming algorithm CPOP (Maidstone et al. 2019). Friedman (1991) introduces the Multivariate Adaptive Regression Splines (MARS) method for regression analysis based on splines with the number and the location of the knots being determined by the data. Spiriti et al. (2013) propose two methods for optimizing knot locations in spline smoothing, where either the number of knots is fixed or an upper bound for it needs to be given. The NOT approach (Baranowski et al. 2019) detects change-points one at a time in various scenarios including piecewise-linear mean signals. In general, change-point detection becomes easier in situations where there is at most one change-point to be detected in a given interval; in such cases the detection power of the contrast function (more details are in Sect. 3.2) is maximised. Therefore, it makes sense to decouple the multiple change-point detection problem into many single change-point detections. To achieve this, we propose a generic technique, Isolate-Detect (ID), for generalized change-point detection in various structures, such as piecewise-constant or piecewise-linear signals. The concept behind ID is simple and is split into two stages: firstly, the isolation of each of the true change-points within subintervals of the domain of the data, and secondly their detection. From now on, the terms subinterval and interval will be used interchangeably. Although a detailed explanation of our methodology is provided in Sect. 
3.1, the basic idea is that for an observed data sequence of length T and with $\lambda_T$ a positive constant, ID first creates two ordered sets of $K = \lceil T/\lambda_T \rceil$ right- and left-expanding intervals as follows. The jth right-expanding interval is $R_j = [1, \min\{j\lambda_T, T\}]$, while the jth left-expanding interval is $L_j = [\max\{1, T - j\lambda_T + 1\}, T]$. We collect these intervals in the ordered set $S_{RL} = \{R_1, L_1, R_2, L_2, \ldots, R_K, L_K\}$. For a suitably chosen contrast function (more details are in Sect. 3.2), ID identifies the point with the maximum contrast value in $R_1$. If its value exceeds a threshold, denoted by $\zeta_T$, then it is taken as a change-point. If not, then the next interval in $S_{RL}$ is tested. Upon detection, ID makes a new start from the end-point (or start-point) of the right- (or left-) expanding interval where the detection occurred. Upon correct choice of $\zeta_T$, ID ensures that we work on intervals with at most one change-point, which was our aim. We would like to highlight the importance of the change-point isolation aspect present in our method as explained in the previous paragraph. There are various advantages. First, it enables detection in higher-order polynomial signals. Second, it is carried out in a fixed and systematic way, which eliminates any randomness in the selection of the intervals and, by extension, in the final results. Third, the way the isolation is carried out in ID makes it quicker than other localisation-focused algorithms, such as NOT, due to the fact that it needs to work on fewer intervals; more details on this advantage of our proposed methodology are in Sect. 4.1. We note here that, even though the default methodology described in Fryzlewicz (2014) and Baranowski et al. (2019) is based on the construction of random intervals, the same approaches can be applied to a fixed grid of intervals. However, as noted in Kovács et al. (2020), the latter implementation can be quite slow. Fourth, the pseudo-sequential nature of the attempted isolation makes our proposed methodology suitable for online change-point detection. 
This is one of the various different ways that ID is different from existing techniques in the literature which also attempt change-point isolation; a more thorough comparison with seemingly similar, but still different, methods is given in the next section. The paper is organized as follows. Section 2 is a motivating illustration of our proposed method through examples. Section 3 gives a formal explanation of the ID methodology along with two different scenarios of use and the associated theory. In Sect. 4, we first discuss the computational aspects of ID and the choice of parameter values. ID variants which lead to improved practical performance are also explained. In Sect. 5, we provide a thorough simulation study to compare ID with state-of-the-art methods. Real-life data examples are provided in Sect. 6. The paper is concluded with reflections on the proposed method. The theoretical, as well as practical, merits and weaknesses of ID when compared against state-of-the-art methods are discussed throughout the paper. However, for the sake of clarity these are also brought together in Sect. 7. ID is implemented in the R packages IDetect and breakfast, available from CRAN.

Motivating illustration of Isolate-Detect

The fact that each change-point is sequentially detected using an interval that contains no other change-points leads to high detection power, especially in difficult structures, such as limited spacings between consecutive change-points and/or higher-order piecewise-polynomial signals. Two examples follow in order to make clear the importance of the isolation step and to illustrate the power of ID compared to other change-point detection methods (some of which also attempt localisation) in capturing even small movements in the data that are close to each other. Table 1 provides results on 100 replications of the continuous piecewise-linear signal (S1) and the piecewise-constant signal (S2).
Table 1

Distribution of $\hat{N} - N$ over 100 simulated data sequences from (S1) and (S2). The average MSE is also given.

Signal | Method | $\le -15$ | $(-15, -5]$ | $[-4, 4]$ | $[5, 15)$ | $\ge 15$ | MSE
(S1)   | ID     | 0         | 0           | 100       | 0         | 0        | $13 \times 10^{-5}$
(S1)   | NOT    | 5         | 86          | 9         | 0         | 0        | $141 \times 10^{-5}$
(S1)   | MARS   | 100       | 0           | 0         | 0         | 0        | $284 \times 10^{-5}$
(S2)   | ID     | 0         | 1           | 97        | 2         | 0        | $94 \times 10^{-5}$
(S2)   | NOT    | 100       | 0           | 0         | 0         | 0        | $485 \times 10^{-5}$
(S2)   | PELT   | 78        | 22          | 0         | 0         | 0        | $437 \times 10^{-5}$
(S2)   | WBS    | 27        | 71          | 2         | 0         | 0        | $413 \times 10^{-5}$

Signal (S1) is continuous and piecewise-linear, with 21 change-points in the slope; signal (S2) is piecewise-constant, with 21 change-points in the mean. As a measure of the accuracy of the estimated number of change-points we give $\hat{N} - N$, while as a measure of the accuracy of the detected locations, we give Monte-Carlo estimates of the mean squared error (MSE) of the estimated signal. The methods compared are ID, NOT, and MARS for (S1) and ID, WBS, NOT, and PELT for (S2). For the ID-related results in Table 1, we used the hybrid version of ID explained in Sect. 4.4. The choice of the parameters is described in Sect. 4.2. As already mentioned, WBS and NOT also work on subintervals of the data, chosen, though, in a completely different manner than in ID. More comparative simulation and real-life studies will be given in Sects. 5 and 6, respectively. We notice from Table 1 that ID offers an important increase in the change-point detection power, especially under limited spacings between consecutive change-points. Figures 1 and 2 give a graphical representation of the results for the first out of the 100 repetitions for signals (S1) and (S2), respectively. For better presentation of the results, Fig. 1 shows only an initial portion of (S1): beyond the last change-point there are no further changes, and in all methods the estimated signal continues linearly beyond that point. For the same reason, Fig. 2, which is related to (S2), also shows only an initial portion of the data.
The NOT and WBS methods also operate on sub-intervals of the data. However, the nature of the fixed, certain (we can expand one data point at each time), localization in ID means that it is of an order of magnitude faster than the aforementioned methods, which have high computational cost that increases linearly with the number of the randomly drawn intervals. This is an issue of fundamental importance, especially in signals with a large number of change-points, in which NOT and WBS need to increase the number M of intervals drawn. However, doing this also increases the computational cost. More specifically, one could try and draw all possible combinations of start- and end-points of the intervals; however, the computational complexity turns out to be cubic in T. In contrast, due to the explained interval expansion approach, in ID no choice of M is required, which leads to better practical performance with more predictable execution times, while at the same time ID examines all possible change-point locations. We recall that unlike ID and NOT, the principle of WBS does not extend to models other than piecewise-constant. To be more precise, this generality of Isolate-Detect with respect to its applicability in many different signal structures is a main distinction between our method and recently published competing methods which, with the exception of NOT, have been developed to cover only the detection of level-changes.
Fig. 1

Results on estimated signals obtained by different change-point detection methods, shown for an initial portion of the data. Top row: the true signal (S1) and the data sequence, and the estimated signal using ID. Bottom row: the estimated signals from NOT and MARS

Fig. 2

Results on estimated signals obtained by different change-point detection methods, shown for an initial portion of the data. Top row: the true signal (S2), the data sequence, and the estimated signal using ID. Bottom row: the estimated signals from WBS, NOT, and PELT

Methodology and theory

Methodology

The model is given in (1) and the unknown number, N, of change-points $r_1 < r_2 < \cdots < r_N$ can possibly grow with T. Let $r_0 = 0$ and $r_{N+1} = T$, and let $\delta_T = \min_{j=1,\ldots,N+1} |r_j - r_{j-1}|$ be the minimum distance between consecutive change-points. For clarity of exposition, we start with a simple example before providing a more thorough explanation of how ID works. Figure 3 covers a specific case of two change-points, $r_1$ and $r_2$. We will be referring to Phases 1 and 2 involving six and four intervals, respectively. These are clearly indicated in the figure and they are only related to this specific example, as for cases with more change-points we would have more such phases. At the beginning, $s = 1$, $e = T$, and we fix the expansion parameter $\lambda_T$ (how to choose $\lambda_T$ will be described in Sect. 4.2). Suppose the threshold $\zeta_T$ has been chosen well enough (more details in Sect. 4.2) so that $r_2$ gets detected in $\{X_{s^*}, X_{s^*+1}, \ldots, X_e\}$, where $s^*$ is the start-point of the sixth interval of Phase 1, which is left-expanding. After the detection, e is updated as the start-point of the interval where the detection occurred; therefore, $e = s^*$. In Phase 2 indicated in the figure, ID is applied in $[s, e] = [1, s^*]$. Intervals 1, 3 and 5 of Phase 1 will not be re-examined in Phase 2 and $r_1$ gets, upon a good choice of $\zeta_T$, detected in $\{X_s, X_{s+1}, \ldots, X_{e^*}\}$, where $e^*$ is the end-point of the right-expanding interval in which the detection occurs. After the detection, s is updated as the end-point of the interval where the detection occurred; therefore, $s = e^*$. Our method is then applied in $[s, e] = [e^*, s^*]$; supposing there is no interval on which the contrast function value exceeds $\zeta_T$, the process will terminate.
Fig. 3

An example with two change-points, $r_1$ and $r_2$. The dashed line is the interval in which the detection took place in each phase
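The interval bookkeeping of the two-change-point example can be replayed in a few lines of Python. This is only an illustrative sketch: the function name `expanding_intervals`, the values T = 100 and an expansion parameter of 10, and the assumption that the first detection happens in the sixth interval of Phase 1 are all hypothetical choices made here, not values taken from the text. It also assumes that the candidate end-points and start-points sit on a fixed grid over the whole data sequence.

```python
from itertools import zip_longest

def expanding_intervals(s, e, T, lam):
    """Interleaved right- and left-expanding intervals inside [s, e].

    Indices are 1-based and inclusive. Assumption of this sketch: right
    end-points are the multiples of lam, left start-points are
    T - j*lam + 1, and only grid points strictly inside (s, e) are used.
    """
    right_ends = [c for c in range(lam, T, lam) if s < c < e] + [e]
    left_starts = [c for c in range(T - lam + 1, 1, -lam) if s < c < e] + [s]
    out = []
    for r, l in zip_longest(right_ends, left_starts):
        if r is not None:
            out.append((s, r))  # right-expanding: start-point fixed at s
        if l is not None:
            out.append((l, e))  # left-expanding: end-point fixed at e
    return out

T, lam = 100, 10  # hypothetical sample size and expansion parameter

# Phase 1: the first six intervals examined on [1, T].
phase1 = expanding_intervals(1, T, T, lam)[:6]

# Suppose the sixth interval, (71, 100), triggers a detection, so e becomes 71.
# Intervals already examined in Phase 1 are skipped in Phase 2.
seen = set(phase1)
phase2 = [iv for iv in expanding_intervals(1, 71, T, lam) if iv not in seen][:4]

print(phase1)
print(phase2)
```

Under these assumed values, Phase 1 alternates between intervals anchored at 1 and intervals anchored at T, and Phase 2 skips the three right-expanding intervals that were already checked, which mirrors the "six and four intervals" structure of the example.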

We now describe ID more generically. For each change-point, $r_j$, ID works in two stages. Firstly, we isolate $r_j$ in an interval that contains no other change-point. To ensure this, the expansion parameter $\lambda_T$ can be taken to be as small as 1. If $\lambda_T > 1$, then isolation is guaranteed with high probability. Theoretically, for large T, the chosen value for $\lambda_T$ (this typically will be small; see Sect. 4.2 for more details) is guaranteed to be smaller than the minimum distance $\delta_T$ (which has to grow with T) between two consecutive change-points and isolation will be guaranteed. For an explanation on the rate of $\delta_T$ with respect to the sample size T, see the discussion that follows Theorem 1. (Of course when asymptotics is put aside, in finite samples anything can happen, and in some configurations no method can be guaranteed to detect change-points if they are arbitrarily close.) The second stage is to detect $r_j$ through the use of an appropriate contrast function. This function is, from now on, denoted by $C^b_{s,e}(X)$, and it is defined for any integer triple (s, e, b), with $1 \le s \le b < e \le T$. Heuristically, the value of $C^b_{s,e}(X)$ is small if b is not a change-point and large otherwise. In piecewise-constant signals, the contrast function reduces to the absolute value of the CUSUM statistic defined in (4), while for continuous, piecewise-linear signals, the contrast function is given in Sect. 3.2.

For a better understanding of the method, we provide a simple step-by-step outline through pseudocode, followed by a succinct narrative of the purpose of each step. The threshold to be used, in order to decide if a change has occurred at a specific data point, is denoted by $\zeta_T$. Practical choices for $\zeta_T$ and $\lambda_T$ are given in Sect. 4.2. For $K = \lceil T/\lambda_T \rceil$, let $c^r_j = j\lambda_T$ and $c^l_j = T - j\lambda_T + 1$ for $j = 1, \ldots, K-1$, while $c^r_K = T$ and $c^l_K = 1$. For a generic interval [s, e], define the sequences $R_{s,e}$ and $L_{s,e}$ of, respectively, the end-points of the right-expanding intervals and the start-points of the left-expanding intervals within [s, e]: $R_{s,e}$ consists of those $c^r_j$ that lie strictly between s and e, in increasing order, followed by e, while $L_{s,e}$ consists of those $c^l_j$ that lie strictly between s and e, in decreasing order, followed by s.
Denoting by |A| the cardinality of any sequence A, and by A(j) its jth element, the main function is ISOLATEDETECT($s, e, \lambda_T, \zeta_T$). A brief explanation of the procedure follows. With K already defined above, the intervals $[s_j, e_j]$, $j = 1, \ldots, 2K$, are those used for the isolation step. Notice that in the odd intervals $[s_1, e_1], [s_3, e_3], \ldots$ the start-point is fixed, unchanged, and equal to s, meaning that $s_1 = s_3 = \cdots = s$. In the even intervals $[s_2, e_2], [s_4, e_4], \ldots$, it is the end-point that is kept fixed and equal to e, meaning that $e_2 = e_4 = \cdots = e$. The process continues for as long as there are intervals to check. The term “expanding intervals” that is used throughout the paper is due to this one-sided expansion (of magnitude $\lambda_T$) of the intervals. The pseudocode also makes it clear that ID looks for change-points alternately in right- and left-expanding intervals which, with high probability, contain at most one change-point. The Isolate-Detect procedure is launched by the call ISOLATEDETECT($1, T, \lambda_T, \zeta_T$). The idea of a posteriori change-point detection, in which change-points are detected sequentially, has appeared previously in the literature. The PS method of Venkatraman (1992) studies the multiple change-point detection problem for the case of piecewise-constant mean signals, as well as for changes in the rate of an exponential process. The CPM method of Ross (2015) treats change-point detection in the mean or variance of a sequence of random variables when their distribution is known. In addition, CPM can be used for distributional changes. Fang and Siegmund (2020), in a work completed after the first version of the current paper appeared on arXiv, search for significant change-points in settings such as piecewise-linear, and one of their algorithms, labelled Seq, bears some resemblance to ID; we note, however, that in addition to some algorithmic differences our aim is different, as we focus on consistent estimation while Fang and Siegmund (2020) focus on testing. 
ID is conceptually and in practice different from these methods in a number of ways related to the threshold choice, the construction of the estimated change-point locations as well as the way PS, CPM, and Seq restart upon detection. Furthermore, ID’s isolation technique does not appear in CPM. By contrast, we use this isolation property of ID as a device enabling its use in piecewise-(higher-order-) polynomial models. Indeed, as shown in Baranowski et al. (2019), fast segmentation of signals of the latter type is difficult to achieve unless any change-point present can be isolated away from neighbouring change-points before detection is performed, which is exactly what ID sets out to do. In particular, this paper demonstrates the use of ID in continuous piecewise-linear models. A comparison between the performance of ID and that of state-of-the-art methods is given in Sect. 5.
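The isolation and detection steps described in this section can be put together in a compact Python sketch for the piecewise-constant case. It is a simplified illustration, not the implementation in the IDetect or breakfast packages: the function names, the fixed grid of candidate end-points, the plain thresholding rule, and the noise-free toy signal are assumptions made here for the sake of a short, checkable example.

```python
import numpy as np
from itertools import zip_longest

def abs_cusum(x, s, e):
    """|CUSUM| of x over [s, e] (1-based, inclusive) for each b = s, ..., e-1."""
    seg = np.asarray(x[s - 1:e], dtype=float)
    n = seg.size
    k = np.arange(1, n)              # k = b - s + 1 points to the left of the split
    left = np.cumsum(seg)[:-1]       # X_s + ... + X_b
    right = seg.sum() - left         # X_{b+1} + ... + X_e
    return np.abs(np.sqrt((n - k) / (n * k)) * left
                  - np.sqrt(k / (n * (n - k))) * right)

def expanding_intervals(s, e, T, lam):
    """Interleaved right- and left-expanding intervals inside [s, e] (assumed grid)."""
    right_ends = [c for c in range(lam, T, lam) if s < c < e] + [e]
    left_starts = [c for c in range(T - lam + 1, 1, -lam) if s < c < e] + [s]
    out = []
    for r, l in zip_longest(right_ends, left_starts):
        if r is not None:
            out.append((s, r, "right"))
        if l is not None:
            out.append((l, e, "left"))
    return out

def isolate_detect(x, lam, thr):
    """Return estimated change-points (last index before each change, 1-based)."""
    T = len(x)
    s, e = 1, T
    found = []
    while e - s >= 1:
        for lo, hi, kind in expanding_intervals(s, e, T, lam):
            stats = abs_cusum(x, lo, hi)
            k = int(np.argmax(stats))
            if stats[k] > thr:
                found.append(lo + k)
                # Restart from the end-point (right) or start-point (left)
                # of the interval in which the detection occurred.
                if kind == "right":
                    s = hi
                else:
                    e = lo
                break
        else:
            break  # no interval exceeded the threshold: stop
    return sorted(found)

# Noise-free toy signal with level changes after t = 30 and t = 70.
x = np.concatenate([np.zeros(30), 5.0 * np.ones(40), np.zeros(30)])
estimates = isolate_detect(x, lam=10, thr=1.0)
print(estimates)
```

On real, noisy data the threshold `thr` would have to be calibrated to the noise level, and the model selection variants mentioned in the abstract go beyond this plain thresholding rule.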

Theoretical behavior of ID

The assumption of the random sequence $\{\epsilon_t\}_{t=1,\ldots,T}$ being independent and identically distributed (i.i.d.) from the Gaussian distribution is widely used in the literature. In this paper, the Gaussianity assumption is only made for technical convenience with respect to the proofs of Theorems 1 and 2. Relaxing both the Gaussianity and the independence assumptions in order to have time-dependent errors is a more complicated issue in terms of theory development. Recently, Dette et al. (2018) have attempted to treat this issue, specifically for the SMUCE approach of Frick et al. (2014), using a reliable estimate for the long run variance of the error distribution, which is not necessarily Gaussian. Apart from the well-studied i.i.d. Gaussian noise structure, Isolate-Detect is explored under a variety of settings including i.i.d. non-Gaussian (see Sect. 4.5) and auto-correlated noise structures; see Fearnhead and Rigaill (2020), who conclude that “IDetect has very strong performance for many scenarios when either we have auto-correlated or heavy-tailed noise”. If the standard deviation, $\sigma$, of the noise in (1) is unknown, then we need to estimate it. In the cases of independent errors with the signal being piecewise-constant or piecewise-linear, $\sigma$ can be estimated via the Median Absolute Deviation (MAD) method proposed in Hampel (1974). With a suitable scaling constant, the proposed estimator, denoted by $\hat{\sigma}$, has been shown to be a consistent estimator of the population standard deviation in the case of Gaussian data (Rousseeuw 1993). It is very robust, as evidenced by its bounded influence function and its 50% breakdown point. For simplicity, let $\sigma = 1$, and (1) becomes

$$X_t = f_t + \epsilon_t, \quad t = 1, 2, \ldots, T. \qquad (3)$$

With $r_0 = 0$ and $r_{N+1} = T$, and for $j = 1, \ldots, N+1$, we examine the theoretical behaviour of ID in the following two illustration cases:

Piecewise-constant signals: $f_t = \mu_j$ for $t = r_{j-1}+1, \ldots, r_j$, and $f_{r_j} \ne f_{r_j+1}$.

Continuous, piecewise-linear signals: $f_t = \mu_{j,1} + \mu_{j,2}t$, for $t = r_{j-1}+1, \ldots, r_j$, with the additional constraint of $\mu_{k,1} + \mu_{k,2}r_k = \mu_{k+1,1} + \mu_{k+1,2}r_k$ for $k = 1, \ldots, N$. The change-points, $r_k$, satisfy $f_{r_k-1} + f_{r_k+1} \ne 2f_{r_k}$.

The above scenarios are only examples of settings in which the ID methodology can be applied. 
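For the piecewise-constant case, the MAD-based estimate of the noise level mentioned above is often computed from first differences of the data, so that the level shifts affect only a handful of terms. The sketch below assumes this difference-based form with the usual Gaussian scaling constant; the function name `mad_sigma` and the simulated data are hypothetical, and the text does not confirm that this exact variant is the one used.

```python
import numpy as np
from statistics import NormalDist

def mad_sigma(x):
    """Difference-based MAD estimate of the noise standard deviation.

    Assumed form: median(|X_{t+1} - X_t|) / (Phi^{-1}(3/4) * sqrt(2)),
    which is consistent for sigma under i.i.d. Gaussian noise when the
    signal is piecewise-constant with few change-points.
    """
    d = np.abs(np.diff(np.asarray(x, dtype=float)))
    scale = NormalDist().inv_cdf(0.75) * np.sqrt(2.0)
    return float(np.median(d) / scale)

# Simulated example: piecewise-constant signal plus Gaussian noise with sigma = 2.
rng = np.random.default_rng(0)
signal = np.repeat([0.0, 3.0, -1.0, 2.0], 1250)
x = signal + 2.0 * rng.standard_normal(signal.size)
sigma_hat = mad_sigma(x)
print(sigma_hat)
```

Dividing the data by such an estimate brings the problem back to the unit-variance setting of model (3).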
The isolation aspect of the method allows its application to various different cases, such as the estimation of the number and the position of knots in piecewise polynomial signals (with or without the continuity constraint).

Piecewise-constant signals. Under piecewise-constancy, the contrast function used is the absolute value of the CUSUM statistic, the latter being

$$\tilde{X}^b_{s,e} = \sqrt{\frac{e-b}{n(b-s+1)}}\sum_{t=s}^{b} X_t - \sqrt{\frac{b-s+1}{n(e-b)}}\sum_{t=b+1}^{e} X_t, \qquad (4)$$

where $s \le b < e$ and $n = e - s + 1$. Under the i.i.d. Gaussian framework used for the theoretical results presented in this paper, it can be shown that $|\tilde{X}^b_{s,e}|$ is maximized over b at the same point as $R^b_{s,e}(X)$, where $R^b_{s,e}(X)$ is the generalized log-likelihood ratio statistic for all potential single change-points within [s, e]. For the main result of Theorem 1, we also make the following assumption.

(A1) The minimum distance, $\delta_T$, between two change-points and the minimum magnitude of jumps, $f_T$, are connected by $\sqrt{\delta_T}\, f_T \ge \underline{C}\sqrt{\log T}$, for a large enough constant $\underline{C}$.

The number of change-points, N, is assumed to be neither known nor fixed. It can grow with T and the only indirect assumption on N is due to the minimum distance, $\delta_T$, between two change-points, in the sense that $N + 1 \le T/\delta_T$. Below, we give the theoretical result for the consistency of the number and location of the estimated change-points. The proof is in Section 8 of the supplementary material.
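As a small worked example of the CUSUM contrast for piecewise-constant signals, the sketch below evaluates the standard CUSUM statistic at every candidate split of a short noise-free sequence. The function name `cusum` and the toy data are illustrative choices made here; indices are 1-based to match the notation of the text.

```python
import math

def cusum(x, s, e, b):
    """CUSUM statistic of x over [s, e] at split b (1-based, s <= b < e)."""
    n = e - s + 1
    left = sum(x[s - 1:b])    # X_s + ... + X_b
    right = sum(x[b:e])       # X_{b+1} + ... + X_e
    return (math.sqrt((e - b) / (n * (b - s + 1))) * left
            - math.sqrt((b - s + 1) / (n * (e - b))) * right)

# Toy sequence with a single jump of size 4 after position 4.
x = [0, 0, 0, 0, 4, 4, 4, 4]
contrast = {b: abs(cusum(x, 1, 8, b)) for b in range(1, 8)}
best_b = max(contrast, key=contrast.get)
print(best_b, contrast[best_b])
```

The absolute CUSUM peaks at the true change-point, where it equals $16\sqrt{1/8} = 4\sqrt{2}$, and it is exactly zero on any interval where the sequence is constant.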

Theorem 1

Let $\{X_t\}_{t=1,\ldots,T}$ follow model (3), with $f_t$ being a piecewise-constant signal, and assume that the random sequence $\{\epsilon_t\}_{t=1,\ldots,T}$ is independent and identically distributed (i.i.d.) from the normal distribution with mean zero and variance one and also that (A1) holds. Let N and $r_j$, $j = 1, \ldots, N$, be the number and locations of the change-points, while $\hat{N}$ and $\hat{r}_j$, $j = 1, \ldots, \hat{N}$, are their estimates sorted in increasing order. In addition, $\Delta^f_j = |f_{r_j+1} - f_{r_j}|$, $j = 1, \ldots, N$, and $f_T = \min_j \Delta^f_j$. Then, there exist positive constants $C_1, C_2, C_3, C_4$, which do not depend on T, such that for $C_1\sqrt{\log T} \le \zeta_T < C_2\sqrt{\delta_T}\, f_T$ and for a sufficiently large T, we obtain

$$P\left(\hat{N} = N,\ \max_{j=1,\ldots,N}\left(|\hat{r}_j - r_j|\,(\Delta^f_j)^2\right) \le C_3\log T\right) \ge 1 - \frac{C_4}{T}. \qquad (5)$$

The isolation aspect of Isolate-Detect helps us to prove consistency under the conditions used in Theorem 1 (and later in Theorem 2). From (5), we notice that in order to be able to match the estimated change-point locations with the true ones, $\delta_T$ should be larger than $\max_j |\hat{r}_j - r_j|$, meaning that $\delta_T$ must be at least $O(\log T)$. For this order of $\delta_T$, Chan and Walther (2013) argue that the smallest possible $\delta_T f_T^2$ that allows change-point detection is $O(\log T - \log\log T)$. In our case, assumption (A1) ensures that the rate $O(\log T)$ for $\delta_T f_T^2$ is attained, which is nearly optimal (up to the double logarithmic term). This provides evidence that ID allows for detection in complex scenarios, such as limited spacings between change-points. We mention that if $\delta_T$ is of higher order than $O(\log T)$, then Assumption (A1) implies that $f_T$ could decrease with T. The quantity on the right-hand side of (5) is $1 - O(1/T)$; the same order as in WBS and NOT. However, ID gives a provably lower constant for the bound. To understand this consistency advantage of our method over, for example, NOT, see our proof in Section 8 of the supplement and compare (17) with the result in Equation (19), p.28 in the online supplementary material of Baranowski et al. (2019). The rate of the lower bound for the threshold $\zeta_T$ is $O(\sqrt{\log T})$ and this is what will be used in practice as the default rate: we use

$$\zeta_T = C\sqrt{2\log T}, \qquad (6)$$

and the choice of the constant C will be explained in Sect. 4. Furthermore, (5) indicates that $\lambda_T$ does not affect the rate of convergence of the estimated change-point locations; these only depend on $\Delta^f_j$. 
Continuous, piecewise-linear signals. Under Gaussianity and with being the generalized log-likelihood ratio for all possible single change-points within [s, e), the idea is to find a contrast function , which is maximized at the same point as . The contrast function is constructed by taking inner products of the data with a contrast vector. In the case of continuous piecewise-linear signals, Baranowski et al. (2019) show that the contrast vector to be used is , where where , and . The contrast function is .

To explain the reasoning behind the choice of the triangular function , we define, for the interval [s, e], the linear vector , (and 0 otherwise) as well as the constant vector , (and 0 otherwise). On the vector , (and 0 otherwise), which is linear with a kink at , we apply the Gram-Schmidt orthogonalization with respect to and . Normalizing the obtained vector such that returns the contrast vector defined in (7). The best approximation, in terms of the Euclidean distance, of in [s, e] is a linear combination of , , and , which are mutually orthonormal (Baranowski et al. 2019). This orthonormality leads to .

For the consistency of ID in continuous piecewise-linear signals, we make the following assumption.

(A2) The minimum distance, , between two change-points and the minimum magnitude of jumps, , are connected by the requirement , for a large enough constant .

The term characterizes the difficulty level of the detection problem and is analogous to in the scenario of piecewise-constant signals. Theorem 2 gives the consistency result for the case of continuous piecewise-linear signals. The proof is in Section 8 of the supplement.
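The Gram-Schmidt construction of the contrast vector described above can be sketched numerically. This is an illustration under stated assumptions rather than the closed-form expression of (7): the kink vector is taken to be max(t - b, 0) on t = s, ..., e, which is one natural choice of a vector that is linear with a kink at b, and the helper names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalise(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def kink_contrast_vector(s, e, b):
    """Unit vector on t = s..e, orthogonal to constants and lines, kinked at b.

    Requires s < b < e; otherwise the kink vector is itself linear (or zero)
    on [s, e] and nothing is left after orthogonalization.
    """
    ts = range(s, e + 1)
    const = normalise([1.0 for _ in ts])
    # Gram-Schmidt step 1: orthogonalize the linear vector against the constant.
    lin_raw = [float(t) for t in ts]
    c = dot(lin_raw, const)
    lin = normalise([l - c * g for l, g in zip(lin_raw, const)])
    # Assumed kink vector: zero up to b, then increasing with unit slope.
    kink = [float(max(t - b, 0)) for t in ts]
    # Gram-Schmidt step 2: remove the constant and linear components.
    ck, cl = dot(kink, const), dot(kink, lin)
    resid = [k - ck * g - cl * l for k, g, l in zip(kink, const, lin)]
    return normalise(resid)


def linear_contrast(x, s, e, b):
    """Absolute inner product of the data on [s, e] with the contrast vector."""
    return abs(dot(x[s - 1:e], kink_contrast_vector(s, e, b)))


phi = kink_contrast_vector(1, 21, 11)
straight_line = [2.0 * t + 1.0 for t in range(1, 22)]
kinked = [float(max(t - 11, 0)) for t in range(1, 22)]
```

Because the resulting vector is orthogonal to every constant and linear vector on [s, e], the contrast of a signal with no kink in the interval is zero, while for a noiseless kinked signal the contrast is largest at the true kink location.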

Theorem 2

Let follow model (3) with being a continuous, piecewise-linear signal and assume that the random sequence is independent and identically distributed (i.i.d.) from the normal distribution with mean zero and variance one and that (A2) holds. We denote by N and the number and locations of the change-points, while and are their estimates sorted in increasing order. Also, we denote . Then, there exist positive constants , which do not depend on T, such that for and for sufficiently large T,

The quantity on the right-hand side of (8) is . In addition, in the case of , ID’s change-point detection accuracy is , as can be seen from (8). This differs from the rate derived in Raimondo (1998) only by the logarithmic factor. The lower bound of the threshold is . Therefore, where is a constant and we will comment on its choice in Sect. 4.2.

ID is flexible because it does not depend on the structure of the signal; what changes is the choice of an appropriate contrast function. Adopting an approach similar to the one for continuous piecewise-linear signals, one can construct contrast functions for the detection of other types of features.

Information criterion approach

Misspecification of the threshold in the ID algorithm can lead to the misestimation of the number of change-points. To remedy this, we develop an approach which starts by possibly overestimating the number of change-points and then creates a solution path, with the estimates ordered according to a certain predefined criterion. The best fit is then chosen, based on the optimization of a model selection criterion. The solution path algorithm: The estimated number of change-points depends on and this allows us to denote . For given data, we employ ID using first and then , where . Let and be the -associated constants in (6) and (9), respectively. With , we estimate , which are sorted in increasing order in . Our aim is to prune the estimates through an iterative procedure, where at each iteration the estimation most likely to be spurious is removed. The algorithm is split into four parts, with their descriptions being fairly technical. We note however that the different parts are very similar and are based on the idea of removing change-points according to their contrast function values as well as their distance to neighbouring estimates. Even though the full explanation of each part is in Section 1 of the supplement, we now provide a brief summary for the framework of the solution path algorithm. With and , we first collect triplets , and we calculate with being the relevant contrast function. For we check whether , for ; in the proofs of Theorems 3 and 4, but smaller values could be sufficient; see for example Corollary 1. If , we remove from , reduce J by 1, relabel the remaining estimates (in increasing order) in , and repeat this estimate removal process, which is carried out in a way such that once the set contains N estimates, then for , each is within a distance of from the true change-point . We keep removing estimates until . 
At the end of this change-point removal approach, we collect the estimates in where is the estimate that was removed first, is the one that was removed second, and so on. From now on, the vector is called the solution path and is used to give a range of different fits. We define the collection where and . For , let be the sorted elements of . Among the collection of models , we propose to select the one that minimizes the strengthened Schwarz Information Criterion (Liu et al. 1997; Fryzlewicz 2014), defined as where and for each collection , and are the maximum likelihood estimators of the segment parameters for the model (3) with change-point locations .

The quantity is the total number of estimated parameters related to . For example, if we do not consider the change-point locations as free parameters, then in the scenario of piecewise-constant mean (the constant values for each of the segments), while in the scenario of continuous and piecewise-linear signals (the starting intercept and slope and the j changes in the slope). We mention that if the continuity constraint is to be removed, then would be equal to (the constant and slope values for the segments). If we now consider the change-point locations to be free parameters, then we just need to add j to the above values for in the different scenarios.

In the algorithm we have referred to three parameters: , and . Although we do not give a recipe for the choice of and , Sect. 3 describes how to circumvent their choice. With respect to , taking its value to be equal to 1 in (11) gives the standard SIC penalty, but our theory requires . In practice we use in order to remain close to SIC. Theorems 3 and 4 below give the consistency results for the piecewise-constant and continuous piecewise-linear models, based on the sSIC approach. The proof of Theorem 3 is in the supplementary material and the same approach can be followed to prove Theorem 4.
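Model selection along the solution path can be sketched for the piecewise-constant case as follows. The sketch rests on several assumptions that should be checked against (10) and (11): the criterion is taken in the form (T/2) log(residual variance) + n_j (log T)^alpha used in Fryzlewicz (2014); n_j is taken to be j + 1 (one mean per segment, locations not counted); the candidate model with j change-points is taken to consist of the j estimates removed last; and the default alpha = 1.01 is only a placeholder for a value slightly above 1. All function names are illustrative.

```python
import math


def rss_piecewise_constant(x, cpts):
    """Residual sum of squares of the least-squares piecewise-constant fit.

    Each change-point is the (1-indexed) last position of its segment.
    """
    bounds = [0] + sorted(cpts) + [len(x)]
    rss = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        segment = x[lo:hi]
        mean = sum(segment) / len(segment)
        rss += sum((v - mean) ** 2 for v in segment)
    return rss


def ssic(x, cpts, alpha=1.01):
    """Assumed sSIC form; requires a strictly positive residual variance."""
    T = len(x)
    n_params = len(cpts) + 1  # assumption: one mean per segment
    sigma2 = rss_piecewise_constant(x, cpts) / T
    return 0.5 * T * math.log(sigma2) + n_params * math.log(T) ** alpha


def select_model(x, solution_path, alpha=1.01):
    """Pick the nested candidate model that minimises the criterion.

    solution_path is ordered from the estimate removed first to the one
    removed last, so model j uses the last j entries (assumption).
    """
    J = len(solution_path)
    models = [sorted(solution_path[J - j:]) for j in range(J + 1)]
    return min(models, key=lambda m: ssic(x, m, alpha))


# Step of height 5 after position 50, with a small deterministic perturbation.
data = [(0.0 if t < 50 else 5.0) + 0.1 * (-1) ** t for t in range(100)]
chosen = select_model(data, [20, 80, 50])
```

Here the estimates at 20 and 80 are spurious and were removed first, so the criterion settles on the single change-point at 50: adding either spurious point leaves the residual sum of squares unchanged while increasing the penalty.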

Theorem 3

Let follow model (3) under piecewise-constancy and let the assumptions of Theorem 1 hold. Let N and be the number and locations of the change-points. Let , where J can also grow with T. In addition, let be such that is satisfied, where and are defined in (A1). With being the set of candidate models obtained by the solution path algorithm, we define . Then, there exist positive constants , which do not depend on T, such that for ,

Theorem 4

Let follow model (3) under continuous piecewise-linearity and let the assumptions of Theorem 2 hold. Let N and be the number and locations of the change-points. Let , where J can also grow with T. In addition, let be such that is satisfied, where and are defined in (A2). With being the set of candidate models obtained by the solution path algorithm, we define . Then, there exist positive constants , which do not depend on T, such that for ,

We note that our solution path algorithm, explained in detail in Section 1 of the supplementary material, allows J, the number of detections from the overestimation process explained above, to grow with T. The quantities on the right-hand sides of (12) and (13) are ; the same order as those in (5) and (8). The lowest admissible and in Theorems 3 and 4, respectively, are slightly larger than the same quantities in the thresholding approach. Our empirical experience suggests that SIC-based approaches tend to exhibit better practical behaviour for signals that have a moderate number of change-points and/or large spacings between them. A hybrid that combines the advantages of the thresholding and the SIC-based approach is introduced in Sect. 4.4.

Computational complexity and practicalities

Computational cost

With being the minimum distance between two change-points, and the interval-expansion parameter, we use . We note that while is unknown, choosing small enough guarantees with high probability that this requirement holds; see Sect. 4.2 for how to choose in order to obtain good accuracy performance and at the same time low computational cost. Now, since and the total number, , of intervals required to scan the data is no more than 2K (K intervals from each expanding direction), in the worst case scenario we have . As a comparison, in WBS and NOT one needs to draw at least M intervals where . The lower bound for M in WBS and NOT is up to a logarithmic factor, whereas the lower bound for is . This results in great speed gains of ID over WBS and NOT. The reason behind this significant difference in the computational complexity of the methods is that in WBS and NOT both the start- and end-points of the randomly drawn intervals have to be chosen, whereas in ID, depending on the expanding direction, we keep the start- or the end-point fixed.
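The bound of 2K intervals can be illustrated by enumerating the expanding intervals scanned when no detection occurs. The sketch assumes that K is the ceiling of T divided by the expansion parameter, that right-expanding intervals keep the start-point fixed at 1, and that left-expanding intervals keep the end-point fixed at T; `expanding_intervals` is an illustrative name and not a function of the IDetect package.

```python
import math


def expanding_intervals(T, lam):
    """Right- and left-expanding intervals (1-indexed, inclusive) for length T.

    lam is the expansion parameter; each direction contributes K intervals,
    where K = ceil(T / lam) (assumption).
    """
    K = math.ceil(T / lam)
    right = [(1, min(j * lam, T)) for j in range(1, K + 1)]
    left = [(max(T - j * lam + 1, 1), T) for j in range(1, K + 1)]
    return right, left


right, left = expanding_intervals(10, 3)
total = len(right) + len(left)
```

Only one end-point moves in each direction, which is why the number of intervals grows linearly in T for a fixed expansion parameter.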

Parameter choice

Choice of the threshold constant. We start with an upper bound on the constant C, as defined in (6), for the case of piecewise-constant signals when the error terms are i.i.d. from the Gaussian distribution. We note that this result is of independent interest. Our model is as in (1) for stationary . For any vector , we define where and . It can be shown that if , are serially independent and their distribution is symmetric about zero (for example i.i.d. standard Gaussian random variables), then the sequence satisfies

The following corollary indicates that as , we have that , meaning that the threshold can be taken to be at most . This value of is smaller than the constant used in the solution path algorithm of Sect. 3.3 (), which can however be used to give explicit upper bounds on the consistency results as explained in Theorems 1 and 3; in contrast, Corollary 1 does not give an explicit upper bound for the probability related to the consistency result as expressed in (16). We highlight that the aforementioned bound on the constant and its proof are simpler than the results presented in Fang et al. (2020), which involve the manipulation of complex distributions. The proof is in the supplementary material.

Corollary 1

Let be i.i.d. . For any ,

For the practical choice of the values of C and , in (6) and (9), respectively, we ran a large-scale simulation study involving a wide range of signals. The number of change-points, N, was generated from the Poisson distribution with rate parameter . For , we uniformly distributed the change-points in . Then, for piecewise-constant (or continuous piecewise-linear) signals, at each change-point location we introduced a jump (or a slope change) which followed the normal distribution with mean zero and variance . Standard Gaussian noise was then added onto the simulated signal. For each value of , and T we generated 1000 replicates and estimated the number of change-points using ID with threshold as in (6) and (9) for a variety of constant values C and . The best behaviour occurred when, approximately, and . These values will be referred to as the default constants and they hold true for all signals that satisfy the assumption of the error terms being i.i.d. Gaussian. We note that the value of does not violate Corollary 1 because the result expressed in the latter is only for piecewise-constant signals, while the constant applies to the scenario of continuous, piecewise-linear signals.

Because the contrast function used is based on local averaging, the CLT can be used to show that, for sufficiently large sample size T, ID is robust when the normality assumption is not satisfied; this has also been explored in Fearnhead and Rigaill (2020). Also, pre-averaging is a practical approach that we employed in Sect. 4.5 for such cases with error departures from Gaussianity. In the SIC-based approach of Sect. 3.3, we started by detecting change-points using threshold . In practice, we take the constants related to , namely and as defined in Sect. 3.3, to be 0.9 and 1.25, respectively.

Choice of the expansion parameter.
We start by highlighting that our numerical experience suggests that ID is robust to small changes in the value of ; for a small-scale simulation study when the value of changes significantly (), see Section 6 of the supplementary material. Theoretically, for a given signal, the change-point detection results obtained from ID are the same for any value of that is less than the minimum spacing between two successive change-points. The computational cost of running ID is inversely proportional to the size of the expansion parameter; the smaller the , the more intervals we need to work on. However, the low computational complexity of our algorithm allows us to take to be as small as three, leading to very good accuracy even for signals with frequent change-points.

We now give example execution times for two models, (T1) and (T2) defined below, on a 3.60 GHz CPU with 16 GB of RAM. We employed the ID-variant for long signals explained in Sect. 3.

(T1) Length , with change-points at and values between them . The standard deviation is . Execution times: 0.31s (), 2.25s (), 26.41s ().

(T2) Length , with no change-points. We use . Execution times: 0.64s (), 3.01s (), 30.35s ().

Variants

Here, we describe three different ways to further improve ID’s practical performance.

Long signals: If T is large, we split the given data sequence uniformly into smaller parts (windows), to which ID is then applied. In practical implementations, the length of the window is 3000 and we apply this structure only when , because for smaller values of T there are no significant differences in the execution times of ID and its window-based variant. The computational improvement that this structure offers is explained in Section 3 of the supplement.

Restarting after detection: In practice, instead of starting from the end-point (or start-point ) of the right-expanding (or left-expanding) interval where a detection occurred, we could start from the estimated change-point, . This alternative, labelled , leads to accuracy improvement without affecting the speed of the method.

Faster solution path algorithm: In practice, we use only Part 4 of the solution path algorithm described in Section 1 of the supplement because it is quicker and conceptually simpler; it requires only the choice of and tends not to affect ID’s accuracy.

Alternative model selection criteria

A hybrid between thresholding and SIC stopping rules: For signals with a large number of regularly occurring change-points, the threshold-based ID tends to behave better than the SIC-based procedure. As explained after Theorems 3 and 4, this is unsurprising because SIC-based approaches typically perform better on signals with a moderate number of change-points separated by larger spacings. This difference in ID’s behaviour between the threshold- and SIC-based versions is what motivates us to introduce a hybrid of these two stopping rules with minimal parameter choice, which works as follows. Firstly, we estimate the change-points using the threshold approach with . If the estimated change-points are more than a constant , then the result is accepted and we stop. Otherwise, the hybrid method proceeds to detect the change-points using the SIC-based approach with , since the already-applied thresholding rule has not suggested a signal with many change-points. In the simulations, we use , . Steepest Drop to Low Levels (SDLL): We also combine ID with the SDLL model selection method introduced in Fryzlewicz (2020).
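The control flow of the hybrid stopping rule can be written down in a few lines. In this sketch the two detectors are passed in as callables, since their internals are described elsewhere; `hybrid_id`, `detect_threshold`, `detect_sic` and `max_cpts` are hypothetical names, and the constant against which the number of estimated change-points is compared is left as an argument rather than fixed.

```python
def hybrid_id(x, detect_threshold, detect_sic, max_cpts):
    """Hybrid of the thresholding and SIC-based stopping rules.

    detect_threshold and detect_sic are callables mapping data to a list of
    estimated change-points (hypothetical interfaces). If thresholding finds
    more than max_cpts change-points its result is accepted; otherwise the
    SIC-based detector is used.
    """
    cpts = detect_threshold(x)
    if len(cpts) > max_cpts:
        return cpts
    return detect_sic(x)


# Stub detectors standing in for the two versions of ID.
threshold_stub = lambda x: [10, 20, 30]
sic_stub = lambda x: [15]
many = hybrid_id([0.0] * 40, threshold_stub, sic_stub, max_cpts=2)
few = hybrid_id([0.0] * 40, threshold_stub, sic_stub, max_cpts=5)
```

The design reflects the observation above: thresholding is trusted when it signals many change-points, and the SIC-based fit is preferred when it does not.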

Extension to different noise structures

This section describes how to use ID when the noise is not Gaussian. We pre-process the data in order to obtain a noise structure that is closer to Gaussianity. For a given scale number s and data , let and , for , while . We apply ID on to obtain the estimated change-points, namely , in increasing order. To estimate the original locations of the change-points we define . The larger the value of s, the closer the distribution of the noise to normal, but the more the amount of pre-processing. In simulations presented in Sect. 5, we use for the case of Student- distributed noise, while if the tails are heavier (Student-), we set . The hybrid version of ID will be employed on and in order to be consistent with the choice of the expansion parameter, we take . In practice, for unknown noise, our recommendation is to set .
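The pre-averaging step can be sketched as below. The block averaging follows the description above, with the final block averaged over its own (possibly shorter) length. The mapping back to the original time scale is an assumption made for illustration only: a change-point detected after averaged observation r is placed at the end of block r, capped at T, which may differ from the rule used in the paper. Function names are illustrative.

```python
import math


def pre_average(x, s):
    """Average consecutive blocks of length s; the last block may be shorter."""
    Q = math.ceil(len(x) / s)
    averaged = []
    for q in range(Q):
        block = x[q * s:(q + 1) * s]
        averaged.append(sum(block) / len(block))
    return averaged


def map_back(cpts_avg, s, T):
    """Map change-points on the averaged scale back to the original scale.

    Assumption: estimate r (1-indexed block) maps to the end of that block.
    """
    return [min(r * s, T) for r in cpts_avg]


raw = [0.0] * 50 + [4.0] * 50
smoothed = pre_average(raw, 5)
original_scale = map_back([10], 5, len(raw))
```

Larger values of s average more observations per block, which brings the noise closer to Gaussian at the cost of coarser localisation of the change-points.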

Simulations

This section compares the performance of ID with competitors. The main change-point detection functions in the competing packages were called using their default input arguments, which does not always allow direct like-for-like comparisons of the methods. Whenever needed (difficult signal structures), and in order to help the competitors achieve their best possible performance, the input values were adjusted accordingly. The code used for the simulation study is available from GitHub at https://github.com/Anastasiou-Andreas/IDetect/blob/master/R/Simulations_used.R. Table 2 shows the competitors used. CPOP is employed based on code found in http://www.research.lancs.ac.uk/portal/en/datasets/cpop(56c07868-3fe9-4016-ad99-54439ec03b6c).html and TF in https://stanford.edu/~boyd/l1_tf.

For WBS, we give results based on both the information criterion and the thresholding (for ) stopping rules. The notation is WBSIC and WBSC1, respectively. With respect to WBS2, its performance is investigated based on the SDLL model selection criterion introduced in Fryzlewicz (2020). In the cpm package, the threshold is decided through the average run length (ARL) until a false positive occurs. In our simulations, we give results for (the default value) and if the signal length, , is greater than 500, results are also given for . The notation is CPM.l.A, with A the value of ARL. For FKS, when the number of knots is unknown (the scenario we work in), we need to specify the maximum allowed number of knots. We take this to be 2N, with N the true number of change-points. Also, the change-points estimated by FKS are positive real numbers; we take the closest integer as the estimate.

The proposed ID version is the hybrid described in Sect. 4.4. However, we also present the results for two more variants: SDLL and thresholding with constant $C = \sqrt{3/2}$ (see (6)), which is the upper bound proven in Corollary 1. The notation for these variants is ID.SDLL and $\mathrm{ID}_{\sqrt{3/2}}$, respectively.
Table 2

The competing methods used in the simulation study

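Type of signal | Method notation | Reference | R package
Piecewise-constant | PELT | Killick et al. (2012) | changepoint
Piecewise-constant | NP.PELT | Haynes et al. (2017) | changepoint.np
Piecewise-constant | S3IB | Rigaill (2015) | Segmentor3IsBack
Piecewise-constant | CumSeg | Muggeo and Adelfio (2011) | cumSeg
Piecewise-constant | CPM | Ross (2015) | cpm
Piecewise-constant | WBS | Fryzlewicz (2014) | wbs
Piecewise-constant | WBS2 | Fryzlewicz (2020) | breakfast
Piecewise-constant | NOT | Baranowski et al. (2019) | not
Piecewise-constant | FDR | Li et al. (2016) | FDRSeg
Piecewise-constant | TGUH | Fryzlewicz (2018) | breakfast
Continuous piecewise-linear | NOT | Baranowski et al. (2019) | not
Continuous piecewise-linear | TF | Kim et al. (2009) | -
Continuous piecewise-linear | CPOP | Maidstone et al. (2019) | -
Continuous piecewise-linear | MARS | Friedman (1991) | earth
Continuous piecewise-linear | FKS | Spiriti et al. (2013) | freeknotsplines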
A seemingly difficult structure for ID: Signals that present the most difficulty to ID are ones in which change-points are concentrated in the middle part of the data and offset each other, as in Fig. 4. The reason is that, due to the left- and right-expanding feature of ID, where one of the two end-points of the interval is kept fixed, the change-points need to be detectable based on relatively “unbalanced” (explanation follows directly below) tests, which typically tend to offer poor power. For example, referring again to Fig. 4, the change-point at 490 will need to be isolated and detected by comparing the means of the data over the long interval [1, 490] and a short interval of the form , where is the end-point of a right-expanding interval . To be more precise, if the expansion parameter , then and therefore our procedure will have seven opportunities to detect the change-point 490 while it is still isolated in intervals that do not contain any other change-points. Even though ID would be expected to struggle in detecting the change-points in such unbalanced intervals, our numerical experience suggests that its performance on such challenging signals is in fact very good and matches or surpasses that of the best competitors; see for example the results in Table 3 for the model (M4), which follows this structure.

All the signals are fully specified in Section 2 of the supplementary material. Figure 5 shows examples of the data generated by models (M1) blocks, (M2) teeth, (M4) middle-points, and (W1) wave 1. Tables 3, 4, 5, 6 and 7 summarize the results in the case of i.i.d. Gaussian noise. Table 8 presents the behaviour of ID under the setting of i.i.d. scaled Student- noise, where . More examples are in the supplement.
Fig. 4

Example of a signal of length 1000 with change-points at 490 and 510 offsetting each other

Table 3

Distribution of $\hat{N} - N$ over 100 simulated data sequences of the piecewise-constant signals (M1)–(M4)

Columns: Method; Model; $\hat{N} - N$ in the bins $\le -3$, $-2$, $-1$, 0, 1, 2, $\ge 3$; MSE; $d_H$; Time (ms)
PELT63250120003.230.143
NP.PELT02274915522.820.10211.8
S3IB0738541002.490.08343.2
CumSeg39213820006.370.2062.3
CPM.l.500000334904.450.442.3
CPM.l.300000841261963.030.193.3
WBSC1(M1)0011322719112.790.2599.3
WBSIC0337537002.590.0899.3
WBS20354318222.640.09623.3
NOT0351433002.610.1080.7
FDR00335412102.510.09
TGUH0537497113.300.08127.4
ID0330625002.660.0823.9
ID.SDLL1259285322.800.1020
$\mathrm{ID}_{\sqrt{3/2}}$ 0962281002.750.0922.3
PELT85609000181 $\times 10^{-3}$ 6.621.1
NP.PELT841231000165 $\times 10^{-3}$ 4.263.1
S3IB4115143000117 $\times 10^{-3}$ 3.7315.2
CumSeg100000000251 $\times 10^{-3}$ 3.9
CPM.l.500784153000145 $\times 10^{-3}$ 2.960.4
WBSC1(M2)12772126053 $\times 10^{-3}$ 0.3338.2
WBSIC78168133064 $\times 10^{-3}$ 1.0038.2
WBS233471104558 $\times 10^{-3}$ 0.36330.5
NOT9747361065 $\times 10^{-3}$ 0.9743.4
FDR1411115572071 $\times 10^{-3}$ 0.80
TGUH41836870064 $\times 10^{-3}$ 0.4722.8
ID77174110060 $\times 10^{-3}$ 0.878.8
ID.SDLL5566384962 $\times 10^{-3}$ 0.433.7
$\mathrm{ID}_{\sqrt{3/2}}$ 281394730084 $\times 10^{-3}$ 0.905.3
PELT0279010023 $\times 10^{-3}$ 0.151.1
NP.PELT100000000781 $\times 10^{-3}$ 1.784.2
S3IB98110000213 $\times 10^{-3}$ 0.9120.2
CumSeg03167290065 $\times 10^{-3}$ 0.325.2
CPM.l.5001687600051 $\times 10^{-3}$ 0.850.2
WBSC1(M3)00066267124 $\times 10^{-3}$ 0.1937.3
WBSIC00064279024 $\times 10^{-3}$ 0.1837.3
WBS20018782225 $\times 10^{-3}$ 0.1734.7
NOT0009370021 $\times 10^{-3}$ 0.13118.3
FDR00277155123 $\times 10^{-3}$ 0.17
TGUH0019162025 $\times 10^{-3}$ 0.1525.2
ID0009181022 $\times 10^{-3}$ 0.139.8
ID.SDLL0019710124 $\times 10^{-3}$ 0.146.8
$\mathrm{ID}_{\sqrt{3/2}}$ 0029431023 $\times 10^{-3}$ 0.154.4
PELT5304700014 $\times 10^{-3}$ 0.546.7
NP.PELT00213344214 $\times 10^{-3}$ 0.47395.2
S3IB120871007 $\times 10^{-3}$ 0.12292.1
CumSeg1000000023 $\times 10^{-3}$ 84.6
CPM.l.500000069431 $\times 10^{-3}$ 0.7614
CPM.l.2000003511223213 $\times 10^{-3}$ 0.3920.4
WBSC1(M4)002320174013 $\times 10^{-3}$ 0.50120.8
WBSIC40961005 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\times 10^{-3}$$\end{document}×10-30.04119.2
WBS2018310425 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\times 10^{-3}$$\end{document}×10-30.09666.4
NOT80920006 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\times 10^{-3}$$\end{document}×10-30.0861.8
FDR0197010109 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\times 10^{-3}$$\end{document}×10-30.07
TGUH0514072023 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\times 10^{-3}$$\end{document}×10-30.28169.2
ID70930006 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\times 10^{-3}$$\end{document}×10-30.0742.3
ID.SDLL008141057 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\times 10^{-3}$$\end{document}×10-30.1028.7
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathbf{ID }_{\sqrt{\mathbf{3/2 }}}$$\end{document}ID3/210981005 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\times 10^{-3}$$\end{document}×10-30.0566.4

The average MSE, and computational time are also given

Fig. 5

Examples of data series used in simulations. The true signal is in red

Table 4

Distribution of $\hat{N} - N$ over 100 simulated data sequences from the piecewise-constant signal (M5)

Columns: Method; $\hat{N} - N$ in the bins $\le -500$, $(-500,-50]$, $(-50,-10)$, $[-10,10]$, $> 10$; MSE; $d_H$; Time (s)
PELT10000001.97114.920.033
NP.PELT10000002.25551.898.976
S3IB9910002.231979.95332.841
CumSeg10000002.2519990.551
CPM.l.50004554100.199.000.002
CPM.l.2000010000002.2319991.245
WBSC110000001.5135.2612.272
WBSIC10000002.25199912.272
WBS200010000.140.545.796
NOT10000002.2519990.484
FDR0005950.140.51
TGUH00010000.160.840.794
ID00010000.140.990.785
ID.SDLL00010000.140.71120.601
$\mathrm{ID}_{\sqrt{3/2}}$ 08218000.222.481.363

The average MSE, and computational time are also given

Table 5

Distribution of $\hat{N} - N$ over 100 simulated data sequences from (NC)

Columns: Method; $\hat{N} - N$ equal to 0, 1, 2, $\ge 3$; MSE; Time (s)
PELT10000039 $\times 10^{-5}$ 0.004
NP.PELT812368999 $\times 10^{-5}$ 1.077
S3IB10000039 $\times 10^{-5}$ 0.715
CumSeg10000039 $\times 10^{-5}$ 0.115
CPM.l.5000001002957 $\times 10^{-5}$ 0.011
CPM.l.30002863927628 $\times 10^{-5}$ 0.031
WBSC115182047653 $\times 10^{-5}$ 0.149
WBSIC9910044 $\times 10^{-5}$ 0.149
WBS28954282 $\times 10^{-5}$ 0.958
NOT9910044 $\times 10^{-5}$ 0.089
FDR9640047 $\times 10^{-5}$
TGUH10000039 $\times 10^{-5}$ 0.217
ID10000039 $\times 10^{-5}$ 0.172
ID.SDLL90406182 $\times 10^{-5}$ 0.069
$\mathrm{ID}_{\sqrt{3/2}}$ 9901041 $\times 10^{-5}$ 0.259

Also the average MSE and computational times for each method are given

Table 6

Distribution of $\hat{N} - N$ over 100 simulated data sequences from the continuous piecewise-linear signals (W1), (W3), and (W4)

Columns: Method; Model; $\hat{N} - N$ in the bins $\le -3$, $-2$, $-1$, 0, 1, 2, $\ge 3$; MSE; $d_H$; Time (s)
NOT000991000.0160.0630.343
TF0000001000.0290.4511.125
CPOP000991000.0130.05523.190
MARS(W1)0029423980.0340.2000.011
FKS0007222600.0150.109270.385
ID000919000.0300.1040.036
ID.SDLL000980110.0330.0980.030
NOT00270618490.0350.5710.163
TF000000100606.5230.4320.117
CPOP000906220.0100.0970.078
MARS(W3)910720003.9912.2580.008
FKS000909100.0100.09767.582
ID000991000.0130.1010.017
ID.SDLL000934120.0220.1300.010
NOT0114201620290.1090.9980.958
TF000000100660.3990.4651.349
CPOP(W4)000928000.0150.0841.627
MARS10000000022.0581.6090.019
ID000928000.0380.1230.045
ID.SDLL000924130.0620.1200.025

The average MSE, and computational time for each method are also given

Table 7

Distribution of $\hat{N} - N$ over 100 simulated data sequences of the continuous piecewise-linear signal (W2)

Columns: Method; $\hat{N} - N$ in the bins $\le -90$, $(-90,-1)$, $-1$, 0, 1, $(1, 60]$, $> 60$; MSE; $d_H$; Time (s)
NOT1000000004.731990.869
TF000000100212.5470.3870.863
CPOP000973000.1620.1891.161
MARS1000000004.70398.5230.009
ID000982000.2010.2420.589
ID.SDLL000982000.2560.2870.097

The average MSE, and computational time for each method are also given

Table 8

ID results for the distribution of $\hat{N} - N$ for the models (M2)–(M4) and (W1), over 100 simulations where the distribution of the noise is Student-$t_d$, for $d = 3, 5$

Columns: $d$; Model; $\hat{N} - N$ in the bins $\le -3$, $-2$, $-1$, 0, 1, 2, $\ge 3$; MSE; $d_H$; Time (ms)
5 (M2) 62274952 $60 \times 10^{-3}$ 0.869.7
(M3) 000751654 $21 \times 10^{-3}$ 0.169.2
(W1) 000861220 $31 \times 10^{-3}$ 0.2332.8
3 (M2) 712522189 $71 \times 10^{-3}$ 1.188.7
(M3) 0105920137 $26 \times 10^{-3}$ 0.229.8
(W1) 000622846 $32 \times 10^{-3}$ 0.2522.6

The average MSE, and computational time are also given

We highlight that the NOT, WBSIC, and S3IB methods require the specification of the maximum number of change-points allowed to be detected. If the default values in these methods are lower than the true number of change-points in the simulated examples, then we set this maximum based on the minimum distance between two change-points. We ran 100 replications for each signal and the frequency distribution of $\hat{N} - N$ for each method is presented. The methods with the highest empirical frequency of $\hat{N} - N = 0$ (or in a neighbourhood of zero, depending on the example) and those within 10% of the highest are given in bold. As a measure of the accuracy of the detected locations, we provide Monte-Carlo estimates of the mean squared error (MSE) of the fitted signal, where the fitted signal is the ordinary least squares approximation of the true signal between two successive change-points. In continuous piecewise-linear signals, the fitted signal is the splines fit obtained using the splines package in R. The scaled Hausdorff distance, $d_H$, which is scaled by the length of the largest segment, is also given in all examples apart from the signal (NC) in Table 5, which is a constant-mean signal with no change-points.

The average computational time for all methods, apart from FDR, is also provided. FDR is excluded because its execution speed is not uniform across signals: if a newly obtained signal is longer than the previously treated signals, then FDR estimates the threshold by 5000 Monte-Carlo simulations, which makes it slow. In some cases the average computational time for FKS is not given. We have already explained that we need to pre-specify the maximum allowed number of knots in order for FKS to work. The method is somewhat slow and we exclude the results for FKS when there are more than 10 true change-points, as in such cases it would take a significant amount of time to finish all 100 simulations.
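The scaled Hausdorff distance used as an accuracy measure can be sketched in a few lines of Python. The sketch assumes the usual two-sided Hausdorff distance between the set of true change-points and the set of estimated change-points, divided by the length of the largest segment of the true signal; the exact formula is not legible in the text above, so this definition is an assumption. The function names and the convention that change-points are integer positions in a sequence of length `n` are illustrative choices and are not taken from the IDetect or breakfast packages.

```python
def largest_segment_length(true_cpts, n):
    """Length of the longest segment between consecutive true change-points.

    Assumes change-points are integer positions strictly inside (0, n).
    """
    bounds = [0] + sorted(true_cpts) + [n]
    return max(b - a for a, b in zip(bounds, bounds[1:]))


def scaled_hausdorff(true_cpts, est_cpts, n):
    """Two-sided Hausdorff distance between the two sets, divided by the
    length of the largest true segment (assumed definition, see lead-in)."""
    if not true_cpts or not est_cpts:
        raise ValueError("both change-point sets must be non-empty")
    # Worst case over true change-points of the distance to the nearest estimate.
    missed = max(min(abs(r - e) for e in est_cpts) for r in true_cpts)
    # Worst case over estimates of the distance to the nearest true change-point.
    spurious = max(min(abs(e - r) for r in true_cpts) for e in est_cpts)
    return max(missed, spurious) / largest_segment_length(true_cpts, n)


if __name__ == "__main__":
    # Two true change-points in a sequence of length 300, both slightly misplaced.
    print(scaled_hausdorff([100, 200], [98, 205], 300))
    # One true change-point is missed entirely, so the distance is large.
    print(scaled_hausdorff([100, 200], [100], 300))
```

Under this definition a value near zero means that every true change-point has a nearby estimate and vice versa, while a missed or spurious change-point pushes the value up in proportion to how far it is from its nearest counterpart.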
With regards to piecewise-constancy, ID is always in the top 10% of the best methods when considering accuracy in any aspect (estimation of N, MSE, $d_H$); in most cases it is the best method overall. ID.SDLL is also, in most cases, in the top 10% of the best performing methods; this provides evidence that the Isolate-Detect algorithm can be combined with various model selection criteria (thresholding, SIC, SDLL) and maintain good practical behaviour. When the threshold constant, C, is equal to $\sqrt{3/2}$, the behaviour of ID remains good for signals that have a moderate number of change-points that are not near each other. As we can see from Table 4, $\mathrm{ID}_{\sqrt{3/2}}$ seems to struggle in scenarios with a large number of frequently occurring change-points. In continuous piecewise-linear signals, CPOP, ID, and ID.SDLL are in all cases in the top 10% of the best methods in terms of the accurate estimation of N. In terms of the MSE and $d_H$, CPOP is by a narrow margin the overall best method, with ID and ID.SDLL coming second and third, respectively. We can deduce that our method detects change-points with uniformly high accuracy across different signal structures, a characteristic which is at least partly absent from the majority of its competitors. Furthermore, ID's behaviour is particularly impressive in extremely long signals with a large number of frequently occurring change-points; see Tables 4 and 7. Compared to other well-behaved methods, such as NOT for piecewise-constancy and CPOP for continuous piecewise-linear signals, our methodology has by far the lowest computational cost. To conclude, the simulation study provides evidence that Isolate-Detect is an accurate, reliable, and quick method for generalized change-point detection.

The results of Table 8 are very good for $d = 5$ and not too different from those under Gaussian noise. For $d = 3$, there is a slight overestimation of the number of change-points. When the tails of the distribution of the noise are significantly heavier than those of the normal distribution, one can obtain better results by increasing the threshold constant. For example, the results in Table 8 for $d = 3$ were improved when the threshold constant was slightly increased. We highlight that more thorough simulations can be done using our R packages IDetect and breakfast and the code available from https://github.com/Anastasiou-Andreas/IDetect/blob/master/R/Simulations_used.R.

Real data examples

UK House Price Index

We investigate the performance of ID on monthly percentage changes in the UK House Price Index from January 1995 to December 2020 in two London boroughs: Tower Hamlets and Hackney. The data are available from http://landregistry.data.gov.uk/app/ukhpi and were accessed in March 2021. Figure 6 shows the fits of ID, ID.SDLL, NOT, and TGUH. In both data sets, ID behaves similarly to NOT, whereas ID.SDLL's performance is closer to that of TGUH, with both detecting more change-points. This difference between the examined methods is, in our opinion, due to the fact that ID (in this example) and NOT detect change-points based on the Schwarz Information Criterion, so fewer estimated change-points can be expected. The detection of two change-points near March 2008 and September 2009 for both boroughs may be related to the financial crisis during that time, which led to a decrease in house prices. As explained in Sect. 3.3, our methodology returns the solution path defined in (10), which can be used to obtain different fits; see Section 7 in the supplement for more details and for a real-data example where this is useful.
Fig. 6

Top row: The time series and the fitted piecewise-constant mean signals obtained by ID and ID.SDLL for both Tower Hamlets and Hackney. Bottom row: NOT (solid) and TGUH (dashed) estimates for Tower Hamlets and Hackney

Residual diagnostics have indicated that the behaviour of the raw residuals in relation to normality and independence is good for all methods.

The COVID-19 outbreak in the UK

The performance of ID is investigated on data from the recent COVID-19 pandemic; we employ a continuous piecewise-linear model on the daily number of lab-confirmed cases in England, as well as on the daily additional COVID-19 associated UK deaths. The data cover the period from the beginning of March 2020 until the end of February 2021 and are available from https://coronavirus.data.gov.uk; they were accessed on 8 March 2021. Before applying the various methods to the data, we bring the distribution closer to Gaussian with constant variance. To achieve this we apply the Anscombe transform, $a(x) = 2\sqrt{x + 3/8}$, as described in Anscombe (1948). Figure 7 presents the results of ID, ID.SDLL, CPOP, and NOT for the transformed numbers of COVID-19 cases and of COVID-19 associated deaths. We observe that ID, ID.SDLL, and NOT behave similarly, while CPOP gives a higher estimated number of change-points. In an attempt to date the change-points detected by ID, we provide a possible explanation of their location with respect to the outbreak of the pandemic in the UK; this discussion is given in Section 4 of the supplementary material.
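The variance-stabilising step can be illustrated with a short Python sketch. It assumes the standard form of the Anscombe transform for count data, $a(x) = 2\sqrt{x + 3/8}$, which maps approximately Poisson counts to values with roughly unit variance; the function names and the example counts are hypothetical and are not taken from the authors' code or from the COVID-19 data set.

```python
import math


def anscombe(count):
    """Standard Anscombe transform for a non-negative count (assumed form)."""
    if count < 0:
        raise ValueError("counts must be non-negative")
    return 2.0 * math.sqrt(count + 3.0 / 8.0)


def inverse_anscombe(value):
    """Plain algebraic inverse; it ignores the bias correction that a careful
    back-transformation of fitted values would need."""
    return (value / 2.0) ** 2 - 3.0 / 8.0


if __name__ == "__main__":
    daily_counts = [0, 12, 250, 4800]  # made-up daily case counts
    transformed = [anscombe(c) for c in daily_counts]
    print(transformed)
    # The algebraic inverse recovers the original counts up to rounding error.
    print([round(inverse_anscombe(v)) for v in transformed])
```

A change-point method that assumes Gaussian noise with constant variance would then be applied to the transformed sequence rather than to the raw counts.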
Fig. 7

Top row: The transformed data sequence and the fitted continuous and piecewise-linear mean signals obtained by ID and ID.SDLL for both the daily number of cases and the daily number of deaths. Bottom row: NOT (solid) and CPOP (dashed) estimates for the daily number of cases and the daily number of deaths

For another example related to the continuous, piecewise-linear case, see Section 7 of the supplement, where we explore the behaviour of Isolate-Detect and two competitors, CPOP and NOT, on the daily closing stock prices of Samsung Electronics Co. from July 2012 until June 2020.

Concluding reflections on ID

In this paper, we have proposed Isolate-Detect, a new, generic technique for multiple generalized change-point detection in noisy data sequences. The method is based on a change-point isolation approach, which seems to provide an advantage in detection power, especially in complex structures, such as those with limited spacings between change-points, where most state-of-the-art competitors seem to suffer (see the simulations in Sect. 5). In addition, the isolation aspect allows the extension of our method to the detection of knots in higher-order polynomial signals. As already mentioned in Sect. 1, NOT, WBS, and WBS2 also work on sub-intervals of the data, but the way the isolation is carried out in ID, where one of the end-points of the subintervals is kept fixed, provides predictable execution times for the analysis of a given data sequence, and these are faster than those of the aforementioned competitors; see Sects. 4.1 and 5. Another advantage of our method over NOT, WBS and WBS2 is that, due to its pseudo-sequential interval expansion character, it can easily be applied to online change-point detection. In Sect. 4.4, a variant of ID was introduced that combines the threshold- and SIC-based versions of our proposed method, with the aim of enhancing its accuracy (both in terms of the estimated number and the estimated change-point locations) for signals of different structures with respect to the true number of change-points and the distance between them. In addition, due to the way that the hybrid approach has been developed in Sect. 4.4, we manage to offer, for ease of execution, minimal parameter choice. Apart from thresholding and SIC, we have also combined ID with the SDLL model selection criterion. In the practical applications of Sects. 5 and 6, compared to the state-of-the-art competitors, ID lies in the top 10% of the best methods in terms of the accurate estimation of the number and the location of the change-points. Furthermore, it exhibits a notable advantage over other techniques in long signals with many change-points that occur frequently. In addition, ID's pseudo-sequential character assists in attaining a low computational time; our method can accurately analyse signals whose length is in the tens of thousands, with thousands of change-points, in less than a second; see for example Table 4. In cases where the normality assumption for the error terms is violated, Sect. 4.5 provides a practical solution where pre-processing allows us to use ID without altering the proposed parameter values. The results of simulations from a Student-t distribution with two options for the degrees of freedom are in Table 8.

Since no method has a uniformly best behaviour, it is natural to also highlight the weaknesses of our method in terms of its practical behaviour. To start with, ID can be slow in long and constant signals in which change-points do not occur. This is because of the expanding intervals attribute, which in the case of no change-points will push the method to keep testing for change-points in growing, overlapping intervals, and this inevitably leads to high computational costs. We tried to eliminate this weakness by introducing a window-based variant, as explained in Section 3. Another drawback of the method is that, due to its left- and right-expanding feature, the change-points need to be detectable based on relatively unbalanced intervals. This could lead to accuracy issues in signals where the change-points are in the middle of the data sequence and offset each other. In practice, we have not encountered this type of behaviour in ID; in particular, it accurately detects the change-points for the model (M4) in Table 3, which is an example of the aforementioned structure with two nearby change-points in the middle of the data sequence.

Below is the link to the electronic supplementary material. Supplementary material 1 (pdf 569 KB)
  6 in total

1.  Circular binary segmentation for the analysis of array-based DNA copy number data.

Authors:  Adam B Olshen; E S Venkatraman; Robert Lucito; Michael Wigler
Journal:  Biostatistics       Date:  2004-10       Impact factor: 5.899

2.  Efficient change point detection for genomic sequences of continuous measurements.

Authors:  Vito M R Muggeo; Giada Adelfio
Journal:  Bioinformatics       Date:  2010-11-18       Impact factor: 6.937

3.  Algorithms for the optimal identification of segment neighborhoods.

Authors:  I E Auger; C E Lawrence
Journal:  Bull Math Biol       Date:  1989       Impact factor: 1.758

4.  Detecting possibly frequent change-points: Wild Binary Segmentation 2 and steepest-drop model selection-rejoinder.

Authors:  Piotr Fryzlewicz
Journal:  J Korean Stat Soc       Date:  2020-09-16       Impact factor: 0.805

5.  On optimal multiple changepoint algorithms for large data.

Authors:  Robert Maidstone; Toby Hocking; Guillem Rigaill; Paul Fearnhead
Journal:  Stat Comput       Date:  2016-02-15       Impact factor: 2.559

6.  A computationally efficient nonparametric approach for changepoint detection.

Authors:  Kaylea Haynes; Paul Fearnhead; Idris A Eckley
Journal:  Stat Comput       Date:  2016-07-28       Impact factor: 2.559
