Detecting pattern transitions in psychological time series - A validation study on the Pattern Transition Detection Algorithm (PTDA).

Kathrin Viol¹, Helmut Schöller¹, Andreas Kaiser², Clemens Fartacek¹, Wolfgang Aichhorn¹, Günter Schiepek¹,³.

Abstract

With the increasing use of real-time monitoring procedures in clinical practice, psychological time series become available to researchers and practitioners. An important interest concerns the identification of pattern transitions which are characteristic features of psychotherapeutic change. Change Point Analysis (CPA) is an established method to identify the point where the mean and/or variance of a time series change, but changes of other and more complex features cannot be detected by this method. In this study, an extension of the CPA, the Pattern Transition Detection Algorithm (PTDA), is optimized and validated for psychological time series with complex pattern transitions. The algorithm uses the convergent information of the CPA and other methods like Recurrence Plots, Time Frequency Distributions, and Dynamic Complexity. These second level approaches capture different aspects of the primary time series. The data set for testing the PTDA (300 time series) is created by an instantaneous control parameter shift of a simulation model of psychotherapeutic change during the simulation runs. By comparing the dispersion of random change points with the real change points, the PTDA determines if the transition point is significant. The PTDA reduces the rate of false negative and false positive results of the CPA below 5% and generalizes its application to different types of pattern transitions. RQA quantifiers also can be used for the identification of nonstationary transitions in time series which was illustrated by using Determinism and Entropy. The PTDA can be easily used with Matlab and is freely available at Matlab File Exchange (https://www.mathworks.com/matlabcentral/fileexchange/80380-pattern-transition-detection-algorithm-ptda).

Year: 2022 PMID: 35275971 PMCID: PMC8916631 DOI: 10.1371/journal.pone.0265335

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

With the increasing use of real-time monitoring devices like the Synergetic Navigation System [1], high frequency sampled longitudinal data of psychological processes become available. Fields of research include, for example, psychotherapy processes [2, 3], developmental psychology [4, 5], unconscious processes [6], treatment outcome [7, 8], psychotraumatology [9], sudden gains and losses in psychotherapy [10-12], treatment of multi-problem families [13], psychological aspects of immune diseases [14] or epilepsy [15, 16], and suicidal crises [17, 18]. Real-time monitoring produces time series with daily or even within-day measurements of psychological variables relevant to psychotherapeutic changes like emotions, motivation, insights, and symptom severity. The availability of such time series opens the possibility to go beyond the assessment of pre-post changes of psychotherapy and to shed light on the mechanisms of change that are still largely unknown [19, 20]. The temporal sequence of psychological variables in high frequency sampled time series allows causality to be derived [21] by investigating what has happened in the hours or days before a change occurs. One crucial prerequisite to derive mechanisms of change from empirical time series is to identify the point of change in a patient’s individual process in a reliable and valid way. This is non-trivial due to the noisy, discontinuous, and complex patterns of the time series [22, 23]. Besides investigating mechanisms of change, data from real-time monitoring can be used to identify early warning signals that can, once validated, inform the therapist about the current state of a patient. This information can be used, for example, to set specific interventions [24, 25], or, in the most extreme case, to prevent suicide attempts [18]. In order to identify early warning signals, one has to determine the point of change in the time series first.
Those retrospective assessments might lead to more reliable predictors of change than are currently available. It should be noted that long-term predictions are impossible in complex nonlinear systems, but with regard to short-term predictions, several indicators of qualitative changes of a system have been identified. Therefore, researchers and therapists will not be successful in long-term forecasting of specific trajectories of change, but should be vigilant to the possibility of transition points of the system [26]. Such transitions from one pattern to another are called phase transitions [24]. In general, a phase transition has occurred when the system has shifted to a qualitatively different behavior, i.e., when a change from one phase to another has occurred, where a phase refers to a specific stable pattern. The existence of such tipping points (“criticality”), where transitions to a different state occur, characterizes nonlinear dynamic systems irrespective of the system under investigation. Theories of self-organization like Synergetics [24, 27] describe how such phase transitions are governed by underlying control parameters, i.e., variables that change on a slower time scale and influence the functional relation between the observed microscopic or macroscopic variables. A variety of measures to determine the point at which a system changes its qualitative behavior exist in various disciplines like climate research, physics, ecology, neuroscience, and physiology (e.g., [28-30], see [26] for an overview). A commonly used method in psychology but also in other research areas is the Change Point Analysis [31], which is available in software such as R and Matlab. The Change Point Analysis (CPA) is able to identify one or more points in a time series where pre-defined statistical properties such as the mean or the variance change. One of the aims of this study is to test its applicability to short time series of psychotherapy processes with changing complex patterns.
As stated above, however, a phase transition is defined as a qualitative change of the system. This is not restricted to a change of the mean of a time series, but can concern any pattern. Nonlinear time series are often analyzed by two classical methods, the identification of the (local) Largest Lyapunov Exponent [32-34] or the (pointwise) Correlation Dimension [32, 35, 36]. Unfortunately, these measures are not suitable for the relatively short time series which usually are created by real-time monitoring or ecological momentary assessment of psychotherapeutic processes because they require considerably longer time series (> 1000 sampling points) than are usually available in daily ratings in psychotherapy (~ 100 sampling points). Several alternative indicators of change have been reported in [28] and [30]. Among those are again a multitude of measures that require long time series, but also methods that can be applied to psychological data, such as changes of the frequency, variance, distribution, and autocorrelation. Applying these quantities to detect changes in a time series requires extending the methods in a way that captures the dynamics. This can be done by calculating these measures for small segments of a time series (sliding window approach). For the frequency, this exists in terms of Time Frequency Distributions [37], and for the combined dynamics of fluctuation and distribution in terms of the Dynamic Complexity measure [24, 38]. Temporal changes of autocorrelations can be identified in a more sophisticated way that extends to repeating patterns by Recurrence Plots.
An important spectrum of advanced and sophisticated methods is available by Recurrence Quantification Analysis (RQA) [39, 40], which allows for the identification of dynamic features (e.g., determinism or entropy) in time series [41], of patterns in multidimensional time series data [42-44], phase synchronization in networks [45], chimera states [46] or coupling in bivariate and multivariate dynamics [47]. The quantifiers can be used to detect pattern transitions in dynamic systems (e.g., [41]). We will apply CPA to Recurrence Plots as a contributor to PTDA and also use known quantifiers (determinism and entropy) for reasons of cross-validation (see Results). The results of these measures are a new time series (Dynamic Complexity), a matrix with time on the horizontal axis and frequency on the vertical axis (Time Frequency Distributions), and again a matrix with time on both axes (Recurrence Plots). We will call these “secondary measures” in the following, in contrast to the original “primary” time series. These secondary time series or matrices capture different aspects of the primary time series that have been shown to be relevant for detecting phase transitions. Change Point Analysis can then be applied not only to detect changes of the primary time series, but also to detect changes of the distribution, frequency, and recurrent patterns. Using information from various secondary measures increases the validity of the location of change [26]. The assessments described above have been implemented in an easy-to-use algorithm called Pattern Transition Detection Algorithm (PTDA), where the user simply enters a time series. The program will then either inform the user that no change point was found or report the location in time of the pattern transition.
The aim of this study is (1) to assess the performance of the CPA on time series with changes of complex patterns, (2) to extend the CPA to identify changes of the second order features like distribution, fluctuation, frequency, and autocorrelation, and (3) to validate the new PTDA in terms of precision, rate of false negative results, and rate of false positive results. Furthermore, we will investigate the performance with increasing levels of noise, with the transition points placed at different positions along the time series, and with prolonged (i.e., not sudden) periods of change. Finally, the PTDA will be applied to empirical data. Compared with the first description of the algorithm [26], we illustrate in the present paper not only the feasibility of applying the PTDA to some simulated and empirical data, but conduct a systematic validation of the algorithm by using 300 simulated time series. Furthermore, we optimized the algorithm. Some methods, namely Instantaneous Frequency and Permutation Entropy, proved not to be sensitive enough for the identification of transitions and were consequently eliminated. Synchronization Pattern Analysis refers to the identification of changing inter-item correlations during the process and is a promising indicator of critical periods before transitions occur. However, the hypothesis of increased synchronization as a precursor of transitions needs further investigation in empirical system dynamics, and for this reason it was not included in this step of PTDA development. Here, we included fewer converging methods and also applied the Change Point Analysis (CPA) to the Time Frequency Distributions included in the PTDA. The aim of this study thus goes far beyond that of the Schiepek et al. paper.

Methods

The idea of using different methods to gain a convergent validation for a pattern transition was first presented and explained in detail in [26]. It is based on the CPA introduced by Killick et al. [31]. We will present how these methods are combined in a single algorithm, the PTDA. In detail, we will explain the steps of the algorithm that improve the precision and reduce false positive and false negative results. The paper will also present the results of the evaluation of the PTDA using simulated time series and its applicability to empirical data. In the following, we will refer to a change point (CP) as a point of the time series that results from applying the CPA to a single method, and to a transition point (TP) as a point that was identified by the Pattern Transition Detection Algorithm (PTDA), which incorporates the results of all change points found for the different methods.

Data for validation

For the validation of the PTDA, 300 time series were simulated with a mathematical model of psychotherapeutic processes. In the sense of an illustrative pilot application, the PTDA was also tested on four empirical time series.

The model

The algorithm was developed and validated on simulated time series produced by a theoretical computational model of psychotherapeutic change. The details can be found in [48-51]. In short, the model includes five variables (order parameters), which are connected by nonlinear functions, and four control parameters, which modulate the shape of the functions, i.e., they determine the strength and the shape of the impact of the variables onto each other. The five variables represent common client-related factors of the psychotherapeutic process: problem severity (P), emotions (E), insight (I), therapeutic success (S), and motivation for change (M), which change on a relatively fast time scale (hours to days). The four parameters, on the other hand, represent relatively stable trait characteristics of a client: alliance with the therapist/ability to bond or attach (a), cognitive competencies (c), motivation to cope with problems/hopefulness (m), and resources/behavioral skills (r). The parameters can be seen as dispositions or traits that change on a slower time scale (weeks to months) and are intended to be changed during psychotherapy (personality development). In other words, psychotherapy works in this model by increasing the trait parameters, which influence the dynamics of the state variables [50]. The computational model includes 9 coupled nonlinear difference equations which are shown in Supplement 2 in S1 File. The numerical values of the resulting time series are created by iterative application of the map onto the respective last discrete value (x_(t-1), with t = discrete time steps).

Simulating pattern transitions

In terms of Nonlinear Dynamic Systems Theory and Synergetics, the state variables represent the order parameters of the system, and the trait parameters the control parameters [49]. In nonlinear dynamic systems, the control parameters are the relevant entities that need to be changed if a phase transition is to occur. The phase transition can be observed on the level of the order parameters by a change of their dynamic patterns. In the computational model, pattern transitions were simulated by increasing or decreasing the control parameters of the system. Examples are depicted in Fig 1. All time series can be found in the Supplement 1 in S1 File.

Fig 1

Exemplary time series with pattern transitions that were produced by a model of psychotherapeutic change.

The simulated time series were used to optimize and validate the PTDA. Some transitions are characterized by a change of the mean, some by a change of the variance, and others by a change of different features in the domain of frequency or complexity. X-axis: 100 iterations of the simulation runs, corresponding to the segment between iteration 100 and 200 as shown in Fig 2A, lower part. Y-axis: Intensities of the simulated dynamics. The range of the variables produced by the theoretical computational model is [-1, +1] for E, P, S and [0, +1] for I and M.

Fig 2

Generation of the time series.

Generation of the time series

Simulation runs of T = 400 iterations were generated with random initial values between 0 and 1 for the variables and the parameters. At T = 250, a pattern transition was induced by instantaneously increasing all control parameters by a random value between 0 and 1 with upper and lower limits of [0, 1] for the final parameters and a minimum change of +/- 0.05. Each simulation produces 5 time series (one for each of the five variables of the model). The first 100 data points of each time series were discarded to account for saturation effects (transient period); the TP was thus located at T = 150 (Fig 2). 100 out of the resulting 300 data points were used depending on where on the time series the TP was evaluated. For example, the interval [100, 200] was used to simulate a transition in the middle of the time series (T = 50) (Fig 2A), the interval [70, 170] to simulate a transition at the end of the time series (T = 80). To assess the rate of false positive results, the interval [100, 200] was used, i.e., an interval without a transition.
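The index arithmetic behind the two transition positions named above can be checked with a few lines of Python. This is only an illustrative sketch of the windowing; the names `tp_adjusted` and `tp_in_window` are hypothetical and are not part of the PTDA or of the simulation model.

```python
# Illustrative index arithmetic only; no simulation model is involved.
# All names are hypothetical and not taken from the PTDA code.
TRANSIENT = 100      # data points discarded at the start of each run
TP_ORIGINAL = 250    # iteration at which the control parameters are shifted
WINDOW = 100         # number of data points passed to the algorithm


def tp_adjusted(tp_original=TP_ORIGINAL, transient=TRANSIENT):
    """Position of the transition after discarding the transient period."""
    return tp_original - transient


def tp_in_window(window_start, window=WINDOW):
    """Position of the transition relative to the start of the window.

    Returns None if the transition lies outside the window.
    """
    position = tp_adjusted() - window_start
    return position if 0 <= position <= window else None


print(tp_adjusted())      # transition in the adjusted time series
print(tp_in_window(100))  # window [100, 200]: transition in the middle
print(tp_in_window(70))   # window [70, 170]: transition towards the end
```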

Generation of the time series.

A) Simulations with 400 iterations (data points) were generated. The first 100 data points were discarded to allow for saturation effects (transient period). A transition point (dashed line) was launched at T = 250 in the original time series, corresponding to T = 150 in the adjusted time series. Depending on where the transition point should be, the corresponding interval of 100 data points (brackets) was chosen and assessed with the algorithm. The upper bracket shows the interval where the transition point is located at the middle of the time series, the bracket in the middle the interval where the transition is at the end, and the lower bracket the interval without a transition point (to assess the rate of false positive results). B) The effects of prolonged periods of changes were investigated by linearly increasing the control parameters over several time points (brackets).

As is common in nonlinear dynamic systems like the computational model of psychotherapy, the dynamics produced by the system can be a fixed point (i.e., the value of the variable stays the same all the time), periodic behavior of different orders (e.g., in a 4-cycle, the same value reoccurs every 4th data point; a pendulum would correspond to a 2-cycle), and chaotic behavior, where the same value never occurs again. To mimic empirical time series most appropriately, we only included simulation runs that had sufficiently complex character, i.e., where the resulting time series were either of high periodicity (minimum: 16-cycle) or chaotic. Each simulation run of the theoretical model created 5 time series, which implies that the 60 simulation runs produced a sample of 300 time series.

Empirical data

In order to compare the application of the PTDA to simulated time series with its application to empirical processes, 4 empirical time series of different psychological variables were assessed with the PTDA. One time series (see below, Fig 6A) represents the depression subscale of the SCL 90 [52] from the publicly available dataset of a case study [53]. For these time series data, a change point had already been identified [54], which we try to replicate with the PTDA. Three other time series of psychotherapeutic processes were generated by the Synergetic Navigation System (SNS) [1], an online-based real-time monitoring system, which is routinely used by patients during inpatient and outpatient therapy. In routine practice, daily self-assessments are realized by answering a validated and standardized process questionnaire like the Therapy Process Questionnaire (TPQ, [55]) or a personalized questionnaire which usually is developed together with the patient. One of the empirical time series (Fig 6B) was taken from a case report on a patient with dissociative identity disorder and shows one item of an individualized questionnaire which represents stress-related experiences corresponding to the “child-state” of the patient [3]. Another time series (Fig 6C) shows the daily ratings of the item “experienced therapeutic progress” of a patient (diagnosis: Major Depressive Disorder), assessed by the TPQ. The fourth time series (Fig 6D) represents the dynamics of the factor “motivation for change” of a patient with the diagnosis of Major Depressive Disorder (also assessed by the TPQ). The empirical time series can be found in Supplement 3 in S1 File.

Fig 6

PTDA applied to four different psychological time series.

A) Time series of the depression subscore of the SCL-90 from an open access dataset (see Methods). B) Time series of the “child-state” of a patient with dissociative identity disorder during the psychotherapy process (Schiepek et al., 2016). C and D) Time series of the psychotherapeutic process. The change point is marked by the vertical black line. The bar below gives an impression of the convergence of the CPs over all methods. X-axis: Measurement points (daily assessments). Y-axis: Intensities of the values (A, B, C); z-transformed values of the factor “motivation for change” (D).

The algorithm

The original idea of using the combined information of several methods to identify transition points was first presented in [26], which also describes additional methods that might be used. The process of optimizing the algorithm led to the exclusion of two methods: permutation entropy [56] showed unreliable results and depended considerably on the parameters used (word length and window width), and the instantaneous frequency [57] was not sufficiently able to reduce the matrices of the Time Frequency Distribution to a vector. While permutation entropy was excluded without replacement, the Time Frequency Distributions are now assessed in the same way as the matrices of the Recurrence Plots, i.e., by assessing each line separately with the CPA (see below). An example of two time series is shown in Fig 3.

Fig 3

Methods of the PTDA.

From top to bottom: original time series, Dynamic Complexity, Recurrence Plot, Time Frequency Distribution. The left column shows an example of a time series where all methods find a transition at a converging point (T = 52). The blue circles show points where a change of the mean was detected, the green circles where a change of the variance was detected, and the red circles where a change of the linear trend was detected by the CPA (in this example, the points are identical; note that the vertical placement of the points is irrelevant and adjusted for visual recognition only). The right column shows an example where no change point was found by the CPA (i.e., applied to the original time series), but where the other methods convergently found a transition point at T = 53. X-axis: Time series length (see Fig 2A for explanation). Y-axis: From top to bottom: Value range of the simulated time series; values of the Dynamic Complexity measure; rows of the Recurrence Plot (time x time); frequency spectrum of the Time Frequency Distribution.

Description of the methods used in the PTDA algorithm

Change Point Analysis (CPA) on original time series and on secondary analysis methods

The CPA [31] as implemented in the function findchangepts in Matlab (R2018b) was applied to the raw time series, and in addition to the results of the secondary analysis methods. The method is sensitive to changes of specific statistical properties of a time series. A time series x contains a change point if it can be split into two segments x1 and x2 such that C(x1) + C(x2) < C(x), where C is a cost function. In the following example, C(x) = N∙var(x), where N is the number of time points in the segment.

Consider the time series {2,2,2,4,4,4,4,4,4,4}, where the mean changes from 2 to 4 between t = 3 and t = 4. The change point analysis algorithm first splits the time series into two segments, x1 from t = 1 to t = 2, and x2 from t = 3 to t = 10. For both segments, the cost function C is calculated: the first part includes N = 2 time points, the second part N = 8 time points with var(x1) = 0 and var(x2) = 0.5, hence C(x1) = 2∙0 = 0 and C(x2) = 8∙0.5 = 4. The sum of C(x1) and C(x2), 4, is then compared to the cost function of the whole time series, C(x) = 10∙0.933 = 9.33. Segmenting the time series after t = 2 reduces the cost, but does not yet minimize it. The algorithm therefore proceeds by splitting the time series between t = 3 and t = 4 and repeats the calculation for these segments. Now, the variances of both x1 and x2 are zero, hence C(x1) + C(x2) = 0 < C(x). No other split yields a lower cost, so the change point is located between t = 3 and t = 4.

The function is able to detect changes of the mean, of the variance, and of the linear trend of a time series. In the PTDA, these three possibilities are used by default for the original time series. For the application on the secondary methods, only changes with respect to the mean are assessed. In principle, the CPA can detect multiple change points in a time series. The PTDA was likewise designed to handle several transition points, but was restricted to a maximum of 1 in the validation study because the simulated time series were designed to have exactly one transition point.
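The arithmetic of this example can be reproduced with a short Python sketch. It is not the Matlab function findchangepts; it assumes the cost C(x) = N∙var(x) with the sample variance (which reproduces the values 0.5 and 0.933 quoted above) and assumes that the change point is the split with the smallest summed cost. The function names are hypothetical.

```python
from statistics import variance


def cost(segment):
    """Cost N * var(x) using the sample variance, as in the worked example.

    A segment with a single point is assigned a cost of 0 (assumption).
    """
    n = len(segment)
    return float(n * variance(segment)) if n > 1 else 0.0


def best_split(x):
    """Return (k, total_cost) for the split after the k-th point (1-based)
    that gives the smallest summed cost of the two segments."""
    candidates = [(cost(x[:k]) + cost(x[k:]), k) for k in range(1, len(x))]
    total, k = min(candidates)
    return k, total


x = [2, 2, 2, 4, 4, 4, 4, 4, 4, 4]
print("C(x)               :", round(cost(x), 2))
print("split after t = 2  :", cost(x[:2]) + cost(x[2:]))
print("best split (k, sum):", best_split(x))
```

Searching all splits exhaustively is sufficient for a single change point in a short series; detecting several change points requires a penalized search, which this sketch does not attempt.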

Dynamic Complexity (DC)

The Dynamic Complexity measure [24, 38] was developed to identify critical instabilities in short and coarse-grained real-world time series, without further mathematical or parametric assumptions. DC mirrors the increased complexity and sensitivity to noise and perturbations of system dynamics before phase transitions, but also the fact that regimes or attractors of human dynamics realize different degrees of complexity. DC is the product of a fluctuation measure and a distribution measure applied to discrete time series data with given data ranges and constant discrete time intervals between the data points. The fluctuation measure (F) is sensitive to the amplitudes and frequencies of a time signal, and the distribution measure (D) scans the scattering of values or system states occurring within the range of possible values or system states. In order to identify non-stationarity, DC is calculated within a data window moving over the time series (sliding window). Because the empirical time series were collected by daily ratings, we apply a window width of 7 measurement points (corresponding to one week). The window size can be adjusted by the user in the Matlab version of the PTDA, but the effect of different window sizes on the performance of the algorithm was negligible. For the PTDA, the Matlab Toolbox to calculate the DC by Viol [58] was used. The DC is also implemented in the SNS. A detailed example of how to calculate Dynamic Complexity is given in Supplement 4 in S1 File.

Recurrence Plots (RP)

Recurrence Plots provide a visualization and quantification of recurrent, i.e. dynamically similar states within a time series [40, 59]. One or more time series are projected into a multidimensional state space by embedding procedures with a specific embedding dimension m and a time-delay parameter τ. The method of time-delayed embedding allows the reconstruction of phase-space profiles from a single one-dimensional time series, following the logic of Takens' theorem [60], but also from multi-dimensional time series [42–44, 61]. Dynamic similarity is measured in terms of metric distances defined in the system’s reconstructed state space. Usually Recurrence Plots are produced by binary matrices where an entry is 1 if d_ij ≤ ε, otherwise it is 0 (where d_ij is the distance between the state vectors belonging to the elements x_i and x_j of the time series and ε some threshold). This procedure depends on three parameters: the embedding dimension m, the time delay τ for taking elements of the time series x_i, and the threshold ε which defines the occurrence of recurrent (close) or distant (non-recurrent) state vectors in the m-dimensional embedding space. In the PTDA we selected a small embedding dimension of m = 3 and a small time delay (τ = 1), which preserves the largest number of state vectors out of a given time series. In psychotherapy we are frequently challenged by the availability of only short time series, e.g. 60 or fewer session-by-session measures or daily measures in a hospital stay. Instead of defining a specific threshold distance ε we used the Euclidean distance between all vector points in the time delay embedding space. In consequence the recurrence matrix R is equivalent to the distance matrix d = (d_ij).
In the time x time recurrence matrix, the color of each entry reflects the dynamic similarity between all vector points, with dark blue representing identity or very short Euclidean distances between the vector points and red representing the longest Euclidean distance between any two vector points (rainbow color spectrum). In the PTDA, the recurrence plots are calculated by a Matlab Toolbox [62, 63] with the commonly used parameters m = 3 and τ = 1. Note that during the optimization of the algorithm, the results were largely unaffected by the choice of these parameters. Of course, they can still be changed by the user in the Matlab version of the PTDA.
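To make the construction of the unthresholded recurrence matrix concrete, the following Python sketch embeds a time series with m = 3 and τ = 1 and returns the matrix of Euclidean distances between all state vectors. It is a minimal illustration under these assumptions, not the Matlab Toolbox used in the PTDA; the function names `embed` and `distance_matrix` are hypothetical.

```python
import math


def embed(x, m=3, tau=1):
    """Time-delay embedding: one state vector
    (x[i], x[i + tau], ..., x[i + (m - 1) * tau]) per admissible index i."""
    n_vectors = len(x) - (m - 1) * tau
    return [[x[i + j * tau] for j in range(m)] for i in range(n_vectors)]


def distance_matrix(x, m=3, tau=1):
    """Unthresholded recurrence matrix: Euclidean distance between every
    pair of embedded state vectors (time x time)."""
    vectors = embed(x, m, tau)
    return [[math.dist(a, b) for b in vectors] for a in vectors]


series = [0, 1, 0, 1, 0, 1, 0, 1]   # a 2-cycle as a toy example
d = distance_matrix(series)
print(len(d), "state vectors")
print("distance between vectors 0 and 2:", d[0][2])  # identical states
print("distance between vectors 0 and 1:", d[0][1])
```

With m = 3 and τ = 1, a series of length N yields N - 2 state vectors, which is why small values of m and τ are attractive for short time series.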

Time frequency distribution (TFD)

Time frequency distribution (TFD) is a method to calculate and visualize the frequency of a signal (time series) as it changes with time [64, 65]. In order to identify frequency changes, a moving window approach is implemented. Mathematically, both time t and frequency ω are variables of a distribution P(t,ω) which describes the amplitude (energy) of the signal at each given t and ω. Here, we use the so-called Stockwell transform (S-transform) which is a combination of two common TFD-methods, the Short Time Fourier Transform and the Continuous Wavelet Transform [37]. It preserves the phase information available from the former method but uses the variable (i.e., not fixed) window length of the continuous wavelet method. For visualization, time and frequency are plotted on a plane (x: time, y: frequency) and color coding is used for the representation of the amplitude (energy) of the frequencies. In the PTDA, a Matlab Toolbox by Sundar [66] is used.

Steps of the algorithm

For each time series, the following steps are performed by the algorithm (Fig 4).

Outlier deletion

First, outliers are deleted with the Matlab function isoutlier with option ‘gesd’. The function applies an iterative algorithm to determine outliers based on the Generalized Extreme Studentized Deviate test (MathWorks, 2020).

Fig 4

Flow chart of the Pattern Transition Detection Algorithm (PTDA).

CP: change point; CPA: change point analysis.

Application of secondary methods

Second, the secondary methods (Dynamic Complexity, Recurrence Plot, Time Frequency Distribution) that were described in detail above are calculated. This results in a total of two time series (the original one plus the one of the Dynamic Complexity) and two matrices (Recurrence Plot and Time Frequency Distribution).

CPA

For the original time series and the DC, the CPA can be directly applied. This results in a maximum of four change points: one each for changes of the mean, the variance, and the linear trend of the original time series, and one for a change of the mean of the Dynamic Complexity. Note that it is possible that fewer change points are found if one or more methods do not find a change point. For the matrices (Recurrence Plots and Time Frequency Distributions), the CPA with respect to the changes of the mean is applied to every line. Then, a histogram of the change points of all lines is built, and the peak of the histogram determines the point of change for the whole matrix. In detail, the bin width of the histogram is fixed to one fifth of the length of the time series and the “peak” of the histogram is the (rounded) middle of the bin containing the most values.
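The reduction of the line-wise change points of a matrix to a single change point can be sketched in Python as follows. The sketch assumes that the bins start at 0, that ties are resolved in favor of the earliest bin, and that the change points of the individual lines are already available; none of these details are specified above, and the function name is hypothetical.

```python
def matrix_change_point(row_cps, series_length, n_bins=5):
    """Peak of the histogram of line-wise change points.

    The bin width is series_length / n_bins (one fifth by default) and the
    result is the rounded middle of the fullest bin. Bins starting at 0 and
    the tie-breaking rule (earliest bin wins) are assumptions of this sketch.
    """
    if not row_cps:
        return None                      # no line produced a change point
    width = series_length / n_bins
    counts = [0] * n_bins
    for cp in row_cps:
        index = min(int(cp // width), n_bins - 1)  # last bin includes the end
        counts[index] += 1
    fullest = counts.index(max(counts))
    return round((fullest + 0.5) * width)


# Toy input: most lines place the change between 40 and 60.
print(matrix_change_point([48, 50, 52, 55, 10, 90, 51], series_length=100))
```

The coarse bin width makes the estimate robust against scattered line-wise results, at the price of a resolution that is limited to the bin centers.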

Deletion of extreme CPs

An observation from inspecting false-positive results from the CPA of numerous time series was that these were nearly always located at the very extremes of the time series. One important step in reducing the rate of false-positive results was therefore the exclusion of CPs that were found within the first or last 10% of the length of the time series. For applications in psychotherapy research, this also makes sense from both a psychological and a dynamical systems theory point of view. It is well known [24, 67] that the beginning and the end of psychotherapy are specific periods with increased fluctuations due to changes of the setting. These changes are, however, not what we want to detect in psychotherapy research. This is backed up by dynamic systems theory, which describes how changes in the boundary conditions (the setting) lead to a transient period until settling at a stable attractor [24].
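As a minimal Python sketch, this exclusion step amounts to a simple filter. The sketch assumes 1-based positions and treats the boundaries as follows: for a series of length 100, positions 1 to 10 and 91 to 100 are discarded. How the boundary points are handled in the Matlab implementation is not stated here, and the function name is hypothetical.

```python
def drop_extreme_points(points, series_length, percent=10):
    """Keep only points outside the first and last `percent` of the series.

    Positions are assumed to be 1-based; the exact boundary handling is an
    assumption of this sketch.
    """
    margin = series_length * percent // 100
    return [p for p in points if margin < p <= series_length - margin]


print(drop_extreme_points([5, 50, 95, 10, 90, 11, 91], series_length=100))
```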

Derivation of the overall TP

For each matrix as well as for the different methods, several change points will be found. In the optimal case, all change points will be at the same position on the time series and result in a clear overall transition point (TP). This, however, is a very rare case and hardly ever happens in time series with complex patterns. The main part of the PTDA will therefore assess the different change points in order to derive the most valid and reliable point of transition. The steps to achieve this are described in the following.

Significance testing

As described in [26], a valid TP should be characterized by a clustering of CPs around a certain point (the real TP) of the time series. The dispersion of these CPs along the time series should differ from the dispersion of points that are placed randomly on the time series. Inspired by bootstrapping, random values were drawn from a discrete uniform distribution over the length of the respective time series with the unidrnd function implemented in Matlab. The number of random values drawn is equal to the number of change points found by the different methods for the respective time series. This is repeated 100 times. The spread of points from non-normal distributions can be assessed by the interquartile range (IQR). The IQR is the range spanned by the second and third quartile of the data, i.e., by the inner 50% around the median, and is a measure of the dispersion of the data comparable to the variance of normally distributed data. After calculating the IQR of each of the 100 sets of random CPs, the mean and the 95% confidence interval of these IQRs are calculated. If the IQR of the real data lies below the lower bound of the confidence interval of the uniformly distributed random data, the TP is considered significant. Importantly, the algorithm does not test whether the mean (or any higher order moment) is significantly different before and after the change point. If required by the research question, this can easily be calculated afterwards once the transition point is known.
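The test can be sketched in Python as follows. This is an illustration under stated assumptions rather than the PTDA code: `random.randint` stands in for Matlab's unidrnd, the quartiles use the "inclusive" interpolation of the standard library, and the 95% confidence interval is taken as a normal approximation around the mean of the 100 random IQRs (the text does not specify how the interval is computed). The names `iqr` and `tp_is_significant` are hypothetical.

```python
import random
import statistics

def iqr(values):
    """Interquartile range: distance between the first and third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

def tp_is_significant(cps, n, n_sets=100, seed=None):
    """Compare the IQR of the found CPs with the IQRs of random point sets.

    cps: change points found by the different methods (at least two).
    n: length of the time series (time points numbered 1..n).
    Returns (significant, lower_bound).
    """
    rng = random.Random(seed)
    # 100 sets of uniformly drawn points, each as large as the set of real CPs.
    random_iqrs = [
        iqr([rng.randint(1, n) for _ in cps]) for _ in range(n_sets)
    ]
    mean_iqr = statistics.mean(random_iqrs)
    # Normal-approximation 95% CI of the mean IQR (assumption, see lead-in).
    sem = statistics.stdev(random_iqrs) / n_sets ** 0.5
    lower_bound = mean_iqr - 1.96 * sem
    return iqr(cps) < lower_bound, lower_bound

significant, bound = tp_is_significant([49, 50, 51, 52, 50, 53], n=100, seed=1)
print(significant, round(bound, 1))
```

Tightly clustered change points give a small IQR that falls below the bound, whereas change points scattered over the whole series give an IQR similar to that of random points and are not considered significant.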

Construction of the probability distribution

The last step of the algorithm consists of generating probability distributions in order to visualize the overall result. For each change point found by the different methods, a normal distribution is constructed where the mean is the position of the change point and the variance is fixed at 5, which is an arbitrary but very restrictive threshold. The height of the distribution is normalized to one. Then, the sum of all normal distributions is calculated and visualized as a color band where high values are shown in yellow and low values in blue. This kind of visualization provides an intuitive impression of how prominent the transition point is in terms of convergence among change points: points with a high convergence of change points will show as narrow yellow areas in the probability band, while less “clear” transition points will result in broader greenish areas in the band. In addition, this visualization provides important information about possible other transition points, i.e., whether there might be more than one transition in the time series.
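The construction of the band can be illustrated with a short Python sketch. The function name `probability_band` and the 1-based time axis are assumptions for this example; the color mapping of the Matlab implementation is not reproduced.

```python
import math

def probability_band(cps, n, variance=5.0):
    """Sum of Gaussian bumps, one per change point, evaluated at t = 1..n.

    Each bump is centred on a change point, has a variance of 5, and is
    scaled so that its peak height equals one.
    """
    return [
        sum(math.exp(-((t - cp) ** 2) / (2 * variance)) for cp in cps)
        for t in range(1, n + 1)
    ]

band = probability_band([50, 50, 51, 20], n=100)
peak_t = band.index(max(band)) + 1  # convert list index to time point
print(peak_t, round(max(band), 2))
```

Because every bump has height one, the value of the band at a given time point roughly counts how many change points agree on that position, which is what the narrow yellow areas express.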

Investigation of the whole system

The PTDA allows as input either single time series, or the time series of different variables of the system together (multiple time series). If more than one time series is entered, the PTDA also provides an overall TP and calculates the significance test and the overall probability function by using the CPs of all methods and time series. Here, the 5 time series generated by each simulation run were assessed as single time series as well as in combination.

Assessment of the performance

Five measures are calculated to quantify the different aspects of performance of the PTDA and the CPA: mean, standard deviation (SD), precision, rate of false negatives, and rate of false positives.

Mean and SD

The mean and SD of the transition points (TPs) are calculated over all time series. Ideally, the mean should equal the point of the real transition (e.g., 50 for a transition in the middle of a time series with 100 time points), and the standard deviation should be as small as possible.

Precision

To assess the precision of the algorithm, we calculated the rate of transition points (TPs) that were within +/- 5 time points around the real TP. Ideally, all TPs should lie within this interval.

False negatives

If the algorithm does not find a TP although there is one, this is counted as a false negative result. The rate of false negative results should be below 5%.

False positives

To assess the rate of false positive results, the first 100 time points of each simulated time series, where no transition occurred, were used (see Fig 2A). This guarantees that this measure is not influenced by a specific pattern, since these time series had the same pattern as the ones used to assess the transitions. The rate of false positive results should be below 5%.
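The five measures can be computed along the following lines. This Python sketch is illustrative only: the function names are hypothetical, a missing TP is encoded as `None`, and the precision is expressed relative to all time series (including those without a detected TP), which is one reading of the definition given above.

```python
import statistics

def performance(detected_tps, real_tp, tolerance=5):
    """Mean, SD, precision, and false-negative rate for series with a transition.

    detected_tps: one entry per time series, None if no TP was found.
    """
    found = [tp for tp in detected_tps if tp is not None]
    n_series = len(detected_tps)
    return {
        "mean": statistics.mean(found),
        "sd": statistics.stdev(found),
        # Share of series whose TP lies within +/- 5 time points of the real TP.
        "precision": sum(abs(tp - real_tp) <= tolerance for tp in found) / n_series,
        # Share of series where no TP was found although there was one.
        "false_negative_rate": (n_series - len(found)) / n_series,
    }

def false_positive_rate(detected_tps):
    """Share of transition-free series in which a TP was nevertheless reported."""
    return sum(tp is not None for tp in detected_tps) / len(detected_tps)

print(performance([50, 52, 49, 58, None, 51, 47, 53, 50, 50], real_tp=50))
print(false_positive_rate([None, None, 40, None]))
```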

Modeling real-world conditions

Furthermore, the performance was tested under conditions comparable to empirical time series: noise, shifted transition points, and prolonged transitions. Shifted TPs were assessed by analyzing different parts of the time series (Fig 2A). TPs at T = [20, 30, 40, 50, 60, 70, 80] were investigated. Prolonged periods of transition were simulated by linearly increasing the control parameter (Fig 2B) for intervals of 10, 20, and 30 time points. Examples of resulting time series are shown in Fig 5.
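Two of these manipulations can be sketched in Python. Both functions are hypothetical helpers written for this illustration: the noise is assumed to be Gaussian with a variance given as a fraction of the (population) variance of the original series, and the ramp is assumed to start at the nominal transition point. The simulation model itself is not reproduced here.

```python
import random
import statistics

def add_noise(series, level, seed=None):
    """Add zero-mean Gaussian noise whose variance is `level` times the
    variance of the original series (e.g. level=0.5 for 50%)."""
    rng = random.Random(seed)
    sigma = (level * statistics.pvariance(series)) ** 0.5
    return [x + rng.gauss(0.0, sigma) for x in series]

def control_parameter(n, tp, start, end, ramp=0):
    """Control parameter over n iterations: constant at `start` before `tp`,
    then either an instantaneous jump (ramp=0) or a linear increase over
    `ramp` iterations, and constant at `end` afterwards."""
    values = []
    for t in range(n):
        if t < tp:
            values.append(start)
        elif t >= tp + ramp:
            values.append(end)
        else:
            # Intermediate steps of the linear increase.
            values.append(start + (end - start) * (t - tp + 1) / (ramp + 1))
    return values

print(control_parameter(10, tp=4, start=0.0, end=1.0, ramp=3))
```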

Fig 5

A) Different positions of the simulated transition point (TP) at T = 20 (left), T = 50 (middle) and T = 80 (right). B) Different levels of noise were added to the original time series in order to investigate the effect on the PTDA. Left: Original time series without noise (deterministic). Middle: original time series with noise with a variance of +50% of the variance of the original time series. Right: original time series with noise with a variance of +90% of the variance of the original time series. C) Prolonged transitions. The upper row shows the time series resulting from the parameter increases (lower row). Left: original time series with a sudden pattern transition induced by an instantaneous increase of the control parameter from one iteration to the next. Middle and right: Time series and linear parameter increases in intervals of 10, 20, and 30 iterations (time points). X-axis: time series length. Y-axis: Range of the time series values. C lower row: drift of parameter values.

Results

Performance of CPA and PTDA

The aim of our study was to assess the performance of the PTDA, an expansion of the CPA, in short time series with complex pattern changes as they are found in psychotherapy processes. Five criteria of performance were investigated: first, the mean TP found over all 300 time series used for the validation study; second, the standard deviation of the mean TP; third, the precision, defined as the percentage of time series where the TP found by the algorithm was within ± 5 time points around the real TP; fourth, the rate of false negative results, calculated as the percentage of time series where no TP was found although there was one; and fifth, the rate of false-positive results, calculated as the percentage of time series where a TP was found although there was none. As can be seen from Table 1, both algorithms are, on average, able to find the position of the change point correctly: the real change was induced at T = 50 in the simulated time series, and the change point was found at 51.4 by the PTDA when all 5 time series generated by the system were taken into account simultaneously, and at 51.7 when the time series generated by the system were investigated independently of each other. Note that it takes some time (some iterations) before a shift of the parameters “arrives” in the system. The percentage of time series with a TP within a window of +/- 5 time points around the real transition point was higher for the PTDA than for the common CPA, and reached 100% when the time series of the system were assessed together.

Table 1

Comparison of the PTDA when (A) the five time series per simulation are assessed simultaneously, i.e., the whole system is taken into consideration, (B) the time series are assessed independently, and (C) the common CPA algorithm (with respect to the change of the mean) is applied to the single time series.

	Algorithm	Mean	SD	Precision +/- 5	False negatives	False positives
A)	PTDA (whole system)	51.4	1.4	100%	0%	4%
B)	PTDA	51.7	3.3	92%	1%	1%
C)	CPA	52.1	2.1	83%	16%	19%

Based on the 300 simulated time series from 60 simulation runs with an induced change at T = 50, the table shows the mean change point of the algorithms, the standard deviation (SD), the precision, the false negative rate, and the false positive rate. Both algorithms are able to correctly detect the TP in most cases, but the percentages of false negative and false positive findings are much lower for the PTDA. The main improvement of the PTDA lies in the rates of false positive and false negative results. The common CPA indicated no change when in fact there was one in 16% of the cases (false negatives), and indicated a change when in fact there was none in 19% of the cases (false positives). In contrast, the rates of false positive and false negative results were below 5% for the PTDA.

Performance under real-world conditions

Shifted transition point

In empirical time series, the transition will often not be in the middle of the time series but shifted towards the beginning or the end. We therefore investigated how a change point at a different position affects the results. The results shown in Table 2 indicate that the algorithm is well able to identify transition points also at other points, and the rate of false negative results is hardly affected.

Table 2

Results of the performance of the PTDA under three real-world conditions.

Noise: different levels of noise were added to the original time series. The numbers in the second column denote the strength of noise, e.g., 50 refers to added noise with a variance of 50% of the original time series. Position: The transition point was shifted along the time series. The numbers in the second column denote where the change was induced, e.g., 50 refers to a transition point at 50% (the middle) of the original time series. Increase: The transition was not induced abruptly but over a longer period of time. The numbers in the second column denote the range of the time series where the change occurred, e.g., 20 denotes that the change stretched over 20% of the time series (see also Fig 5).

		all time series of the system				single time series of the system
		M	SD	Precision (%)	False neg. (%)	M	SD	Precision (%)	False neg. (%)
Noise	0	51	1.4	100	0	52	3.3	92	1
	10	51	1.4	100	0	52	3.2	92	1
	20	51	1.4	100	0	52	3.4	90	1
	30	51	3.0	98	0	52	4.0	90	1
	40	51	2.9	98	0	51	4.5	85	1
	50	51	5.0	95	0	51	6.0	82	2
	60	51	4.9	95	0	51	7.4	77	2
	70	51	5.1	95	0	50	8.5	70	4
	80	51	5.5	93	0	49	9.8	65	6
	90	50	6.4	85	0	48	10.0	63	7
Position	20	22	2.1	98	0	22	5.9	90	2
	30	31	1.2	100	0	32	2.8	91	3
	40	41	1.3	100	0	42	3.5	91	1
	50	51	1.4	100	0	52	3.3	92	1
	60	61	1.2	100	0	62	3.1	94	1
	70	71	1.2	100	0	71	2.5	93	2
	80	81	1.1	100	0	81	1.8	96	3
Increase	0	51	1.4	100	0	52	3.3	92	1
	10	51	1.7	98	0	51	4.3	91	8
	20	51	2.4	95	0	51	4.7	90	10
	30	51	4.5	87	0	51	5.0	88	12

False neg: Number of false negative results in %; Precision: the detected transition point was within +/- 5% of the real change point.

Noise

Although we already used complex time series for validation, a systematic evaluation of how noise affects the performance of the PTDA can shed further light on its applicability. As expected, decreasing signal-to-noise ratios reduced the precision (Table 2). When all time series of the system were taken into consideration, the precision remained above 80% even for high levels of noise, and no false negatives occurred. For single time series, the rate of false negatives did not rise above 7% even for very high levels of noise, indicating a good performance in the presence of noise. The precision decreased slightly but dropped below 80% only at noise levels with a variance of more than 50% of the variance of the original time series.

Prolonged transition

In practice, the traits of a client (control parameters of the model system, [49]) will usually not change abruptly. Skills and resources, cognitive competencies, hopefulness/trait motivation, and alliance might build up gradually during psychotherapy. In the simulation, this is reflected by a continuous shift of the parameters over an extended period of time. The performance of the algorithm was tested for linear increases over 10, 20, and 30 time points. In contrast to classical physical synergetic systems, the linear increase of the control parameter did not lead to an abrupt change in the pattern, but rather to a smooth transition period. In consequence, the performance of the PTDA dropped with increasing length of the interval.

Minimum length of time series

For practical applications, it is important to know how long a time series must be in order to apply a method like the PTDA. The algorithm was tested on shorter sections of the time series with the TP placed in the middle. The results (Table 3) of the performance indicators are given for time series between 20 and 100 time points. Interestingly, the correct identification is not affected by the length of the time series, as indicated by a mean very close to the real TP, a constantly small standard deviation, and a constantly high rate of precision. What is affected, however, is the rate of false positive results. When applying the criterion of allowing maximally 5% of such false results, the minimum length of the time series has to be 60 points. In cases where 6% are acceptable, the length could be reduced to 50 points. Below that, the application of the PTDA is not recommended.

Table 3

Results of the PTDA for time series (TS) of 20 to 100 time points.

The mean (M) and the standard deviation (SD) of the transition points (TP) of the 300 simulated time series are given along with further indicators of the performance: the precision (Prec.), which gives the rate of identifying a TP within +/- 5 time points around the real TP, and the rates of false negative (FN) and false positive (FP) results.

Length of TS	real TP	M	SD	Prec.	FN	FP
100	50	52	3.3	92%	1%	0%
90	45	48	3.5	91%	1%	0%
80	40	43	4.0	91%	1%	0%
70	35	38	3.2	90%	3%	0%
60	30	33	3.2	89%	3%	0%
50	25	27	2.6	90%	3%	6%
40	20	22	2.5	87%	5%	13%
30	15	17	2.4	89%	6%	19%
20	10	11	1.9	88%	11%	44%

Empirical examples

Finally, the PTDA was tested on several empirical time series (see Methods). The output of the PTDA for these time series is shown in Fig 6. The examples reveal that the TPs are found at positions where one would expect them by visual inspection of the time series. For Fig 6A, the transition is less clear than in the other examples, as can be seen by the broader yellow area in the probability band below.

Fig 6

PTDA applied to four different psychological time series.

For comparison, the common CPA with the criteria mean (M), standard deviation (SD), and linear trend (LT) was applied to the four empirical time series shown in Fig 6. In (A): (M) 18, (SD) no result, (LT) 18, (PTDA) 18. In (B): (M) 49, (SD) no result, (LT) 44, (PTDA) 46. In (C): (M) 39, (SD) no result, (LT) 39, (PTDA) 39. In (D): (M) no result, (SD) no result, (LT) 12, (PTDA) 20. Whereas the results from the common CPA and the PTDA are identical in (A) and (C) or similar in (B), the methods identified different TPs in one of the empirical time series (D), where the CPA result is based on CPA(LT) only. In no case could the CPA for standard deviations identify a TP.

Comparing PTDA with qRQA

A rapidly growing state-of-the-art branch of the Recurrence Plot method, the Recurrence Quantification Analysis (RQA), represents a powerful toolbox for processing nonstationary and noisy time series. Recurrence-based techniques were developed for capturing the dynamics of complex and non-stationary time series data and of time series exhibiting qualitatively different patterns along their temporal evolution [39, 40]. RQA introduces various metrics to quantify data complexity from different angles and at different time scales. RQA quantifiers allow the efficient detection of transitions, e.g., in chaotic model systems [61] and electrophysiological data [41]. To illustrate the method, we applied two different RQA quantifiers to a simulated time series where the point of parameter change was placed at 50 and the transition point as identified by the PTDA was at 53 (comp. Figs 4 and 5A, middle). Following [41], the RQA quantifiers we used are the Determinism (DET), which is defined as the ratio of recurrence points that form diagonal lines to all considered recurrence points, and the Recurrence Time Entropy (RTE), which is an entropy measure based on recurrent vertical lines t_w in the sliding window:

$$\mathrm{DET} = \frac{\sum_{l=l_{min}}^{w} l\,P(l)}{\sum_{l=1}^{w} l\,P(l)}$$

with l_min = 2, where w is the window width of the sliding window and P(l) is the histogram of lines parallel to the diagonal in the sliding window, and

$$\mathrm{RTE} = -\frac{1}{\ln T}\sum_{t=1}^{T} p(t)\,\ln p(t)$$

where t denotes the recurrence points in the vertical lines of the sliding window, T equals the number of vertical lines in the window, and p(t) is the estimated probability of recurrence points in the vertical lines, corresponding to the recurrence rate. The parameters we used for constructing the Recurrence Plots are m = 3 embedding dimensions, delay τ = 1, and threshold ε = 0.1. The calculation was done with the crqa package in R [39]. Fig 7 shows the results.
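To make the Determinism quantifier concrete, the following Python sketch computes DET for a single window. It is not the crqa implementation used in the study: the maximum norm for distances, the absence of any rescaling of the data, and the exclusion of the main diagonal are assumptions of this example, and the sliding window is omitted. The names `embed` and `determinism` are hypothetical.

```python
def embed(series, m=3, delay=1):
    """Time-delay embedding: vectors (x_i, x_{i+delay}, ..., x_{i+(m-1)*delay})."""
    n_vec = len(series) - (m - 1) * delay
    return [[series[i + k * delay] for k in range(m)] for i in range(n_vec)]

def determinism(series, m=3, delay=1, eps=0.1, lmin=2):
    """Share of recurrence points that lie on diagonal lines of length >= lmin."""
    vecs = embed(series, m, delay)
    n = len(vecs)
    # Two states recur if their maximum-norm distance is at most eps.
    rec = [
        [max(abs(a - b) for a, b in zip(vecs[i], vecs[j])) <= eps for j in range(n)]
        for i in range(n)
    ]
    total = 0      # all recurrence points above the main diagonal
    in_lines = 0   # recurrence points that belong to lines of length >= lmin
    for offset in range(1, n):           # the plot is symmetric, so the upper
        run = 0                          # triangle is sufficient
        for i in range(n - offset + 1):  # one extra step closes the last run
            if i < n - offset and rec[i][i + offset]:
                run += 1
            else:
                total += run
                if run >= lmin:
                    in_lines += run
                run = 0
    return in_lines / total if total else float("nan")

print(determinism([0, 1] * 10))  # strictly periodic signal
```

A strictly periodic signal places all of its recurrence points on long diagonal lines, so its DET is 1, whereas isolated recurrence points lower the value.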

Fig 7

Evolution of DET (Determinism) and RTE (Recurrence Time Entropy) applied to a simulated time series from our validation set.

(A) The time series. (B) Evolution of DET depending on the width of the sliding window (left: 25, middle: 45, right: 65). (C) Evolution of the RTE depending on the width of the sliding window (left: 25, middle: 45, right: 65). Depending on the window width, the DET quantifier increases when the signal gets more regular and rhythmic, whereas the RTE decreases. This corresponds to the anti-correlated behavior of both measures as shown in [41] for movement-related EEG data. During the pattern transition (shortly before or after it), DET decreases and RTE increases, which may mirror the increased complexity of the time series during the transition. Evidently, the exact position of the changes in DET and RTE depends on the window width of the submatrix of the Recurrence Plot that was used to calculate the quantifiers. The broader the window, the smaller the portion of the complete time series that can be exploited for the calculation, and the fuzzier the result. Moreover, the evolution of the RTE seems to be inverted (increasing), as the broadest sliding window only captures the increasing segment of the dynamic measure (see Fig 7C, left and middle). Based on this illustrative case, it became evident that RQA quantifiers are able to identify nonstationary transitions in complex systems. The precision will depend on the time series length, with impressively good results for longer time series, as in movement-related EEG signals [41] or in simulated time series moving into or out of chaotic regimes [61]. In some fields of application, like psychotherapy, the PTDA could be a useful complement to RQA. A more systematic cross-validation of the PTDA and RQA quantifiers is necessary and will be prepared by the authors.

Discussion

An expansion of the CPA algorithm, the Pattern Transition Detection Algorithm (PTDA), was evaluated with 300 simulated time series with known transition points. The PTDA combines the information from three nonlinear methods: Recurrence Plots show changes of repeating patterns and in this sense are linked to autocorrelation, Time Frequency Distributions depict changing frequencies over time, and Dynamic Complexity captures the dynamics of fluctuation, amplitude, and distribution of a time series. By assessing all these aspects in one algorithm, the PTDA provides an easy-to-use tool for research and practice to define the transition points in any psychological time series that exceeds 50 time points. In contrast to applying the commonly used CPA only, the PTDA is able to detect not only changes of the mean, but also changes of other important qualities of time series and pattern transitions in general. This makes it possible to reduce the high rate of false negative and false positive results of the CPA to below 5%, while keeping the high precision of the CPA. The small standard deviation further confirms the high precision of the PTDA. The algorithm has numerous applications in research: whenever one is interested in discontinuous changes within a time series, the PTDA will objectively specify the point of change. Several examples of studies in various fields of psychology have been mentioned in the Introduction.

Recurrence Quantification of Transitions (RQT)

Another important contribution of this study is the validation of a method to determine the point of change in Recurrence Plots (Recurrence Quantification of Transitions, RQT), which was introduced in [26]. In our study, the calculation of change points of each row in the RPs provided reliable and valid results. Sophisticated methods exist to characterize Recurrence Plots as a whole (RQA). In the fast-growing field of RQA, new quantifiers have been developed in order to identify nonstationary developments and pattern transitions in complex systems. In consequence, PTDA and RQA algorithms could be combined and used for cross-validation, just as different algorithms for complexity or entropy quantification are applied in combination in complexity science.

Diversity of pattern transitions

In practice it is important to see that pattern transitions not only concern changes of the mean level of psychological measures. Usually, the General Linear Model of psychological statistics focuses the attention of psychotherapists on increasing or decreasing levels of symptom intensity, emotions, or experienced progress. In psychotherapy research this is mirrored by the investigation of sudden gains or sudden losses (e.g., [11]). However, there are many examples of changes which concern the frequency patterns and the dynamic complexity of symptoms and emotions, e.g. low vs. rapid cycling of emotions in bipolar disorders, intensity of emotional fluctuations in Borderline Personality Disorder, switching between ego-states which are enslaving different emotions and cognitions in dissociative identity disorders vs. dissolving pathological over-synchronization [3], to be stuck in reduced ways of mental functioning (e.g. craving, numbness) vs. flexibility and adaptivity [68]. If mental disorders can be seen as “dynamic diseases” [69, 70] then we should open our judgment on therapeutic effects on changing dynamics. This will also concern the discussion on the appropriateness of outcome measures in psychotherapy.

Limitations and future developments

One limitation of our study is that the simulated time series were all produced by one system model. Although the model has been shown to reproduce the dynamics found in empirical time series of the psychotherapy process [49, 50], we cannot exclude that some features of empirical time series are not incorporated. However, the algorithm was shown to work also for the well-known Hénon map [26], one of the best investigated models in dynamic systems research, and worked well with different empirical time series. In principle, the PTDA can be applied to time series of any kind from any discipline, but the performance outside psychological time series was not tested here. Note that the running time increases considerably for very long time series: while the result is available after 1.3 seconds for a time series of N = 100 data points, it takes 82.2 seconds for N = 500 data points and over 7 minutes for N = 1000 data points. The running time is mainly affected by calling the function findchangepts more than 2*N times in the evaluation of the RP and TFD matrices. The evaluation of speed was done on a notebook with an Intel Core i7 processor and 16 GB RAM. Last but not least, the PTDA has been tested and validated for simulated data with one change point. Since the algorithm offers the possibility to find several TPs, the performance on such time series should be evaluated in the future. The main challenge here will be to avoid having to pre-define the maximum number of changes, which is a non-trivial problem in cluster analysis algorithms.

Supplement. (PDF)

Empirical data. (CSV)

Simulation data. (CSV)

