
Comparison and optimization of various moving patient-based real-time quality control procedures for serum sodium.

Yuanyuan Li1,2,3, Qian Yu1,2,3, Xiaoyan Zhang1,2,3, Xiaoling Chen1,2,3.   

Abstract

BACKGROUND: Patient-based real-time quality control (PBRTQC) is a valuable tool for monitoring the performance of testing processes. We aimed to compare and optimize various PBRTQC procedures for serum sodium.
METHODS: In a computer simulation, artificial errors were added to 680,000 real patients' results. The error detection characteristics of various algorithms (moving average, moving median, moving SD and moving proportion of normal results), including different control limits (CLs), were assessed on their ability to detect critical errors early.
RESULTS: The moving average and moving median were sensitive to system error, and the moving SD tended to detect random error. P3SD (moving proportion of normal results, CLs based on mean and SD of proportion of normal results) demonstrated excellent performance for both system error and random error. Increasing the block size (N) delayed error detection and reduced false rejection, except for QC procedures with minimum and maximum as CLs. CLs calculated with a "0.1% false alarm rate" performed better than CLs that set the false alarm rate to zero (minimum and maximum as CLs). The impact of truncation on QC performance depended on truncation limits, algorithms and the types of error. A significant improvement in QC performance due to truncation was found only for moving SD.
CONCLUSION: "P3SD, N = 50, without truncation" and "moving SD, N = 25, set 0.1% false alarm as CLs and set 1% outliers exclusion as truncation limits" were recommended as the optimized procedures for serum sodium to monitor system error and random error, respectively.
© 2021 The Authors. Journal of Clinical Laboratory Analysis published by Wiley Periodicals LLC.


Keywords:  moving average; moving median; moving proportion of normal results; moving standard deviation; patient-based real-time quality control


Year:  2021        PMID: 34520584      PMCID: PMC8529142          DOI: 10.1002/jcla.23985

Source DB:  PubMed          Journal:  J Clin Lab Anal        ISSN: 0887-8013            Impact factor:   2.352


INTRODUCTION

Patient‐based real‐time quality control (PBRTQC) is a useful tool for monitoring analytic performance in clinical laboratories. It is also an important application of “big data” in laboratory quality management. Compared with traditional methods of internal quality control (QC), PBRTQC has several advantages, e.g., low cost, no matrix effect, continuous monitoring and pre‐analytical monitoring. The concept of moving average QC was first published by Hoffmann and Waid in 1965. Since then, benefiting from the development of laboratory information systems, improved statistical methodology and increased awareness of the limitations of current QC, PBRTQC has attracted substantial attention and developed quickly. Recently, novel algorithms have been described, such as moving median, moving standard deviation (SD), moving average of delta and moving sum of outliers. Various parameters and charts have been developed to quantify their ability to detect error. However, implementing PBRTQC remains a challenge for the majority of routine clinical laboratories because of the complexity of obtaining optimal PBRTQC settings. It is well known that the probability of “error” detection for traditional individualized QC should not be less than 90%. The “error” involved here refers to “critical error”: the minimum error that a QC procedure should detect, because an undetected error of that size will affect clinical practice. Similarly, optimized PBRTQC procedures should also have the best ability to detect critical error. Nevertheless, few articles have examined the relationship between critical error and the characteristics of error detection. Furthermore, novel algorithms, such as moving median and moving standard deviation, have been shown to be superior for error detection under certain conditions. However, most of these algorithms have been studied independently by different research groups, and scarcely any articles have compared them using the same database.

In practice, it seems more feasible for PBRTQC to start with several typical tests and then be extended to all tests. Serum sodium is probably the most suitable chemistry test for PBRTQC because of its small biological variation and high requirement for analytical performance. Therefore, we aimed to investigate and compare the characteristics of error detection of various algorithms, including their different definitions of control limits, for serum sodium. Both system and random error were examined, and the relationships between critical error and characteristics of error detection were described in detail to optimize PBRTQC settings.

MATERIALS AND METHODS

Patients’ data collection and errors simulation

A total of 680,000 results of serum sodium were anonymized and exported from the laboratory information system of the First Affiliated Hospital, College of Medicine, Zhejiang University, including inpatient, outpatient and physical examination populations. All results were sorted by detection time and divided into 400 virtual days with 1,700 measurements each. The last 200 days served as the training dataset and the first 200 days as the testing dataset. All optimization of the procedures was conducted on the training dataset, and all verifications of procedure performance were conducted on the testing dataset. The robust normalized spread (RNS) was calculated on all unaltered patient measurements: RNS = interquartile range/median. RNS represents the degree of dispersion of the original data distribution.

Westgard JO et al. assessed the average of normal (AON) patient data algorithms to maximize run lengths for automatic process control. We used a similar method to simulate errors. The CVa (analytical CV), which represents analytical inherent precision, was defined as 1/3 of the allowable total error (TEa); 1/3TEa is the minimum requirement for analytical imprecision. The TEa of sodium, which was obtained from the specification in the Analytical Quality Specification for Routine Analytics in Clinical Chemistry (WS/T 403–2012), was defined as 4%. The system error (SE) was simulated as multiples of CVa (CVa = 1/3TEa) by changing the mean of patients’ data. The SE ranged from 0 to 4.0 CVa (0 ~ 4/3TEa), and both positive and negative errors were added. The random error (RE) was simulated as multiples of CVa by changing the SD of the patients’ data from 1.0 to 5.0 CVa. The artificial error for each day was introduced from the 201st result onwards and sustained for the remaining results.
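The error simulation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Excel implementation: the function names are hypothetical, the systematic error is assumed to be applied proportionally to each result, and the random error is assumed to be extra Gaussian noise that widens the analytical SD from 1 CVa to k CVa. The text does not specify either detail.

```python
import random
import statistics

TEA_PCT = 4.0             # allowable total error for sodium, in percent (from the text)
CVA_PCT = TEA_PCT / 3.0   # analytical CV, defined in the text as 1/3 TEa
ERROR_START = 200         # errors begin at the 201st result of each virtual day


def robust_normalized_spread(values):
    """RNS = interquartile range / median."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / statistics.median(values)


def add_systematic_error(day, multiple_of_cva):
    """Shift the mean: scale results from the 201st onwards by k * CVa.

    Assumption: the shift is proportional to each result.
    Use a negative multiple for a negative SE.
    """
    factor = 1.0 + multiple_of_cva * CVA_PCT / 100.0
    return [x * factor if i >= ERROR_START else x for i, x in enumerate(day)]


def add_random_error(day, multiple_of_cva, rng):
    """Assumed model: widen the analytical SD from 1 CVa to k CVa with Gaussian noise."""
    extra_cv = CVA_PCT / 100.0 * max(multiple_of_cva ** 2 - 1.0, 0.0) ** 0.5
    return [
        x * (1.0 + rng.gauss(0.0, extra_cv)) if i >= ERROR_START else x
        for i, x in enumerate(day)
    ]
```

With k = 1.0 the random-error function leaves the data unchanged, which matches the text's range starting at 1.0 CVa (the in-control imprecision).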

Parameters of QC procedures

A whole QC procedure consists of four parts: algorithms, quality control limits (CLs), truncation limits (TLs) and block size (N). Table 1 lists the QC procedures investigated in this article.
TABLE 1

Quality control (QC) procedures investigated in this article

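QC procedure | Algorithm | QC limits (CLs) | CL abbreviation
Amm,N = i, Tn | Moving average (A) | Minimum and maximum values observed after running a calculation algorithm on the dataset without extra error | mm
Mmm,N = i, Tn | Moving median (M) | Minimum and maximum values observed after running a calculation algorithm on the dataset without extra error | mm
Smm,N = i, Tn | Moving SD (S) | Minimum and maximum values observed after running a calculation algorithm on the dataset without extra error | mm
Pmm,N = i, Tn | Moving proportion of normal results (P) | Minimum and maximum values observed after running a calculation algorithm on the dataset without extra error | mm
A0.1%,N = i, Tn | Moving average (A) | The 99.95th percentile of QC data without extra error was defined as the upper CL and the 0.05th percentile as the lower CL | 0.1%
M0.1%,N = i, Tn | Moving median (M) | The 99.95th percentile of QC data without extra error was defined as the upper CL and the 0.05th percentile as the lower CL | 0.1%
S0.1%,N = i, Tn | Moving SD (S) | The 99.95th percentile of QC data without extra error was defined as the upper CL and the 0.05th percentile as the lower CL | 0.1%
P0.1%,N = i, Tn | Moving proportion of normal results (P) | The 99.95th percentile of QC data without extra error was defined as the upper CL and the 0.05th percentile as the lower CL | 0.1%
ARCV,N = i, Tn | Moving average (A) | CLs = mean ± RCV, with $\mathrm{RCV} = \sqrt{2} \times Z \times \sqrt{CV_a^2 + CV_i^2}$ and Z = 1.96. The CVa represents analytical precision. CVi denotes the biological variation within subjects | RCV
A3.09,N = i, Tn | Moving average (A) | $\mathrm{CLs} = \mathrm{mean} \pm 3.09 \times SD_p/\sqrt{N}$. SDp is the SD of the patients’ data | 3.09
SC4,N = i, Tn | Moving SD (S) | $\mathrm{CLs} = \overline{SD} \pm 3\,\overline{SD}\,\sqrt{1 - C_4^2}/C_4$. $\overline{SD}$ is the mean of the moving SD for an in‐control period. C4 is an unbiased constant related to block size | C4
P3SD,N = i, Tn | Moving proportion of normal results (P) | CLs = meanproportion ± 3 × SDproportion. The meanproportion is the average proportion of normal results. SDproportion is the square root of the variation for the proportion of normal results | 3SD

Block size (N): every N consecutive patients’ results were included to calculate one QC value. N = 25, 50, 75, 100, 125 and 150.

Truncation (Tn): T0, T1% and T5% were set to exclude the outer 0, 1 and 5% of all results, respectively.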

A whole quality control procedure (first column) consists of four parts: algorithms, quality control limits (CLs), block size and truncation limits. The italics and boldface in parentheses in the middle columns represent abbreviations.

For algorithms, the moving average (A), moving median (M), moving SD (S) and the moving proportion of normal results (P) were calculated as the QC data for different error conditions and block sizes.

Several methods of defining CLs were also investigated. Two of them were universal. The first used the minimum and maximum values (mm) observed after running a calculation algorithm on the dataset without extra SE or RE, so false alarms were set to zero for these procedures. The QC procedures for this method were expressed as Amm, Mmm, Smm and Pmm for moving average, moving median, moving SD and the moving proportion of normal results, respectively. For the second method, the 99.95th percentile of QC data without extra error was defined as the upper CL and the 0.05th percentile as the lower CL, so false alarms were set to 0.1%. These procedures were expressed as A0.1%, M0.1%, S0.1% and P0.1%.

Two further methods of defining CLs were investigated for the moving average (A). One was determined by calculating the reference change value (RCV) using the formula $\mathrm{RCV} = \sqrt{2} \times Z \times \sqrt{CV_a^2 + CV_i^2}$ and CLs = mean ± RCV, where Z = 1.96 for the 2SD change in a 2‐tailed distribution; CVa (analytical CV) represents analytical inherent precision; CVi (intraindividual CV) denotes the biological variation within subjects. This was expressed as ARCV. The other, which was related to block size (N), was $\mathrm{CLs} = \mathrm{mean} \pm 3.09 \times SD_p/\sqrt{N}$, where SDp was the SD of patients’ data. This was expressed as A3.09.

The CLs for moving SD (S) were also defined with the following formula: $\mathrm{CLs} = \overline{SD} \pm 3\,\overline{SD}\,\sqrt{1 - C_4^2}/C_4$. Here, $\overline{SD}$ is the mean of the moving SD for an in‐control period, and C4 is an unbiased constant related to block size, which can be obtained from the “GAMMA” function in Microsoft Excel. This was expressed as SC4. The CLs for the moving proportion of normal results (P) were also defined with the following formula: CLs = meanproportion ± 3 × SDproportion. Here, the meanproportion is the average proportion of normal results, and the SDproportion is the square root of the variation for the proportion of normal results. This was expressed as P3SD.

To minimize the influence of outlying values in the data, truncation is usually implemented in PBRTQC protocols. Following the conclusion of Bietenbeck et al., we selected the “Winsorization” method, which replaces each outlying value with the lower or upper truncation limit that it exceeded. For example, if the truncation limits were 134 ~ 148 mmol/L, results greater than 148 mmol/L were replaced with 148 mmol/L instead of being eliminated. Three types of truncation limits (TLs), T0, T1% and T5%, were investigated. T0 meant that all data were included in the QC procedures, with no TLs for serum sodium. The TLs of T1% and T5% were based on the mean and SD of patients’ data (SDp): for T1%, TLs = mean ± 3 × SDp, and for T5%, TLs = mean ± 2 × SDp. T0, T1% and T5% were set to exclude the outer 0, 1 and 5% of all measurements, respectively.

We investigated QC procedures using block sizes of 25, 50, 75, 100, 125 and 150 consecutive test results.
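As a rough illustration of how the four algorithms, Winsorization and the two universal control-limit definitions fit together, the following Python sketch may help. All names are hypothetical. It assumes a window that slides by one result at a time, which the text does not state explicitly, and it uses the 137 ~ 147 mmol/L reference range quoted in the Results to define "normal" results. The percentile limits use a simple nearest-rank rule, which may differ from the percentile function used by the authors.

```python
import statistics

REFERENCE_RANGE = (137.0, 147.0)  # mmol/L, reference range quoted in the Results


def winsorize(values, lower, upper):
    """Replace values beyond the truncation limits with the limit that was exceeded."""
    return [min(max(x, lower), upper) for x in values]


def block_statistic(block, algorithm):
    """QC value of one block: A = mean, M = median, S = SD, P = proportion normal."""
    if algorithm == "A":
        return statistics.mean(block)
    if algorithm == "M":
        return statistics.median(block)
    if algorithm == "S":
        return statistics.stdev(block)
    if algorithm == "P":
        low, high = REFERENCE_RANGE
        return sum(low <= x <= high for x in block) / len(block)
    raise ValueError(f"unknown algorithm: {algorithm}")


def moving_qc(values, n, algorithm):
    """One QC value per window of n consecutive results (assumed step of one result)."""
    return [block_statistic(values[i - n:i], algorithm) for i in range(n, len(values) + 1)]


def min_max_limits(in_control_qc):
    """'mm' limits: extremes of the in-control QC data, so no false alarms in training."""
    return min(in_control_qc), max(in_control_qc)


def percentile_limits(in_control_qc, false_alarm=0.001):
    """'0.1%' limits: 0.05th and 99.95th percentiles (nearest-rank approximation)."""
    ordered = sorted(in_control_qc)
    last = len(ordered) - 1
    lower = ordered[round(last * false_alarm / 2)]
    upper = ordered[round(last * (1 - false_alarm / 2))]
    return lower, upper
```

A typical use would be `percentile_limits(moving_qc(winsorize(day, 134.0, 148.0), 25, "S"))` on error-free training data, where `day` stands for one virtual day of results.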

Performance of QC procedures

The number of patient samples necessary for error detection was counted after introducing extra error (NPed). The median number of patient results affected before error detection (MNPed) was then calculated for an increased analytical imprecision or bias. The MNPed reflects the median number of patient samples processed from the inception of an out‐of‐control error condition until it was detected. In addition, the median number of patient samples between QC rejections when the process was in control (MNPfr), i.e., the median number of patient samples between two false rejections, was calculated. An ideal QC procedure is expected to detect error quickly and to produce few false rejections. Thus, MNPed should be as small as possible, while MNPfr should be as large as possible. Infinite MNPeds (when the error was not detected) and MNPfrs (when false alarms were set to zero) were imputed with 1,650 (110% of the maximum value).
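The NPed and MNPed definitions above can be sketched as follows. This is a minimal illustration with hypothetical function names. It assumes a window sliding by one result, counts erroneous results from the 201st result until the first QC value outside the control limits, and represents an undetected error as `None` before imputation.

```python
import statistics

IMPUTED = 1650  # value the text assigns to undetected errors (110% of 1,500)


def n_patients_to_detection(values, n, lower, upper, error_start=200,
                            stat=statistics.mean):
    """NPed: results processed from error onset until the QC value leaves the CLs."""
    for end in range(max(n, error_start + 1), len(values) + 1):
        qc_value = stat(values[end - n:end])
        if qc_value < lower or qc_value > upper:
            return end - error_start
    return None  # error not detected within this virtual day


def median_nped(per_day_counts):
    """MNPed across virtual days, imputing undetected days with IMPUTED."""
    filled = [IMPUTED if count is None else count for count in per_day_counts]
    return statistics.median(filled)
```

The `stat` argument can be replaced by any block statistic, for example `statistics.median` or `statistics.stdev`, to evaluate the other algorithms in the same way.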

Optimization of QC procedures

Power function graphs were generated to compare the QC procedures by plotting errors (SE and RE in the form of multiples of CVa) on the x‐axis and MNPed on the y‐axis. To optimize the QC procedure, procedures with MNPfr <1,500, which would lead to a high false rejection rate and increase QC cost, were excluded. If each false rejection is counted as a defective incident, MNPfr = 1,500 corresponds to about 667 defective incidents per million, or 4.75 Sigma. The ability of a QC procedure to detect critical SE and RE received particular attention. The critical system error (SEc) was calculated as follows: SEc = (TEa − bias)/CVa − 1.65. In this formula, 1.65 is a z‐value that sets the maximum defect rate at 5% (i.e., when the mean of patient test results has shifted by an amount that causes 5% of individual patient test results to have errors exceeding the total error requirement, the run will be considered unstable). The critical random error (REc) was calculated as follows: REc = (TEa − bias)/(1.65 × CVa). As all the results were exported from a stable system, bias was deemed to be zero. A cumulative MNPed (∑MNPed), the sum of MNPeds for errors greater than or equal to the critical error, was calculated to evaluate the overall performance of each QC procedure. The QC procedure with the minimum ∑MNPed was selected as the optimal strategy. All data simulations and statistics were performed with Microsoft®Office Excel 2019 and its extended functions. The general flowchart for data simulation and QC procedure optimization is shown in Figure 1.
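The critical-error formulas and the selection rule can be checked with a short Python sketch. The function names and the dictionary layout are hypothetical and only illustrate the calculation. With TEa = 4%, CVa = TEa/3 and zero bias, the formulas give SEc = 3 − 1.65 = 1.35 CVa and REc = 3/1.65 ≈ 1.82 CVa, the values reported in the Results.

```python
def critical_systematic_error(tea, cva, bias=0.0, z=1.65):
    """SEc = (TEa - bias) / CVa - 1.65, in multiples of CVa."""
    return (tea - bias) / cva - z


def critical_random_error(tea, cva, bias=0.0, z=1.65):
    """REc = (TEa - bias) / (1.65 * CVa), in multiples of CVa."""
    return (tea - bias) / (z * cva)


def false_rejections_per_million(mnpfr):
    """Defective incidents per million when one false rejection occurs per MNPfr results."""
    return 1_000_000 / mnpfr


def cumulative_mnped(mnped_by_error, critical_error):
    """Sum of MNPed over all simulated errors at or above the critical error."""
    return sum(m for err, m in mnped_by_error.items() if err >= critical_error)


def select_optimal(procedures, critical_error, min_mnpfr=1500):
    """Pick the procedure with the smallest cumulative MNPed among those with MNPfr >= 1,500.

    `procedures` maps a name to {"mnpfr": ..., "mnped": {error: MNPed}} (assumed layout).
    """
    eligible = {k: v for k, v in procedures.items() if v["mnpfr"] >= min_mnpfr}
    return min(eligible, key=lambda k: cumulative_mnped(eligible[k]["mnped"], critical_error))


tea = 4.0
cva = tea / 3.0
print(round(critical_systematic_error(tea, cva), 2))   # 1.35
print(round(critical_random_error(tea, cva), 2))       # 1.82
print(round(false_rejections_per_million(1500)))       # 667
```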
FIGURE 1

Flowchart for data simulation and QC procedure optimization. Serum sodium results of patients were exported from the laboratory information system and sorted by time. All the results were divided into 400 virtual days with 1,700 measurements each. The last 200 days served as training dataset to optimize QC procedures and the first 200 days as testing dataset to verify conclusions. Systematic error was simulated by changing mean of patients’ data, and random error was simulated by changing SD. Various QC procedures which consist of algorithms, truncation limits, control limits and block size were assessed with two basic parameters (MNPfr and MNPed) and one advanced parameter (∑MNPed). The ability of a QC procedure to detect critical errors got particular attention. The QC procedure with minimum ∑MNPed and MNPfr≥1,500 was the optimized QC procedure

This study was approved by the Ethics Committee of the First Affiliated Hospital, College of Medicine, Zhejiang University.

RESULTS

Data distribution

As listed in Table 2, serum sodium was nearly normally distributed with low skewness (−0.84 for training set and −0.65 for testing set) and had low RNSs (0.021 for training and 0.014 for testing set). The change of mean from training to testing set, which was the difference of means between training and testing set in relation to the mean of the training set, was −0.26%.
TABLE 2

Statistical properties for serum sodium in the training and testing datasets

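Dataset | Mean | SD | Median | Max | Min | Skew | Kurtosis | IQR | RNS
Training | 141.29 | 2.27 | 141 | 178 | 110 | −0.84 | 6.36 | 3 | 0.021
Testing | 140.92 | 2.25 | 141 | 185 | 108 | −0.65 | 6.36 | 2 | 0.014
All | 141.10 | 2.27 | 141 | 185 | 108 | −0.74 | 6.24 | 3 | 0.021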

Unit: mmol/L.

Abbreviations: IQR, interquartile range; RNS, robust normalized spread; SD, standard deviation.

A3.09 and SC4 were excluded because their false rejection rates were too high (MNPfr <1,500). In addition, P3SD,N = 25 with T0 and T1% (both MNPfr = 1,137) were ruled out for the same reason. The remaining procedures were then assessed further. The ability to detect SE and RE, quantified with MNPed for QC procedures under various block sizes and truncations, is shown in Tables S1 and S2, respectively. The moving average and moving median were sensitive to SE, and the moving SD tended to detect RE. Unexpectedly, P3SD demonstrated excellent performance for both SE and RE. Figure 2 shows the ability to detect SE, quantified with MNPed for QC procedures under the same block size and optimized TLs. The procedures were more capable of detecting negative SE than positive SE, which was due to the slight negative skewness of the original sodium distribution. P3SD was clearly superior to the others in SE detection. On the whole, A0.1%, M0.1% and P0.1% detected SE better than Amm, Mmm and Pmm, but the differences in QC performance among these three procedures (A0.1%, M0.1% and P0.1%) were small and inconsistent. Figure 3 shows the ability to detect RE with the same block size and optimized TLs. On the whole, the ability to detect RE decreased in the order P3SD, S0.1%, Smm, P0.1% and Pmm.
FIGURE 2

Median number of patients affected until error was detected (MNPed) as a function of induced systematic error magnitude. The MNPed of quality control procedures for systematic error (SE) are shown in Figure 2A–F. The first capital letter is the quality control algorithm and the subscripts denote quality control limits (CLs). A0.1% is the moving average with 0.1% false rejection rate as CLs. Amm is the moving average with CLs based on minimum and maximum control data without extra error. ARCV is the moving average with CLs based on reference change values. M0.1% is the moving median with 0.1% false rejection rate as CLs. Mmm is the moving median with CLs based on minimum and maximum control data without extra error. P0.1% is the moving proportion of normal results with 0.1% false rejection rate as CLs. Pmm is the moving proportion of normal results with CLs based on minimum and maximum control data without extra error. P3SD is the moving proportion of normal results with CLs = meanproportion ± 3 × SDproportion. T0, T1% and T5% were the truncation limits which were set to exclude the outer 0, 1 and 5% of all results, respectively. Procedures that had the same performance for different truncations were marked with dotted lines. MNPed: the median number of patient samples processed from the start of an out‐of‐control error condition until it was detected. CVa (the analytical CV) represents analytical inherent precision. SE: system error. The critical SE was 1.35 CVa. The MNPed for critical SE was marked with red arrows (↑). Figure 2 shows that the procedures were more capable of detecting negative SE than positive SE. P3SD was clearly superior to the others in SE detection. On the whole, A0.1%, M0.1% and P0.1% detected SE better than Amm, Mmm and Pmm, but the differences among them were small and inconsistent.

FIGURE 3

Median number of patients affected until error was detected (MNPed) as a function of induced random error magnitude. The MNPed of quality control procedures for random error (RE) are shown in Figure 3A–F. The first capital letter is the quality control algorithm and the subscripts denote quality control limits (CLs). P3SD is the moving proportion of normal results with CLs = meanproportion ± 3 × SDproportion. P0.1% is the moving proportion of normal results with 0.1% false rejection rate as CLs. Pmm is the moving proportion of normal results with CLs based on minimum and maximum control data without extra error. S0.1% is the moving SD with 0.1% false rejection rate as CLs. Smm is the moving SD with CLs based on minimum and maximum control data without extra error. T0, T1% and T5% were the truncation limits which were set to exclude the outer 0, 1 and 5% of all results, respectively. Procedures that had the same performance for different truncations were marked with dotted lines. MNPed: the median number of patient samples processed from the start of an out‐of‐control error condition until it was detected. CVa (the analytical CV) represents analytical inherent precision. RE: random error. The critical RE was 1.82 CVa. The MNPed for critical RE was marked with red arrows (↑). Figure 3 shows that the ability to detect RE decreased in the order P3SD, S0.1%, Smm, P0.1% and Pmm.

The influence of block size on QC performance was somewhat complex (Figure 4). The main trend was that both MNPed and MNPfr decreased with smaller block sizes: a reduction in block size led to quicker error detection, but it also led to a higher rate of false rejection. The typical cases were A0.1%, ARCV, P0.1% and P3SD for SE, and S0.1% and P3SD for RE. As a result, the performance curves of A0.1% for SE and S0.1% for RE descended with decreasing block size in Figure 4. However, there were exceptions for procedures with minimum and maximum as CLs. Some QC procedures with a small block size, such as Amm (N = 25) and Pmm (N = 25), were not sensitive to errors <2.0 CVa, but their performance improved rapidly as the error increased. Compared with QC procedures with larger block sizes (N ≥ 50), procedures with N = 25 detected a small error (<2.0 CVa) more slowly, but detected a large error (≥2.0 CVa) more quickly. That is why the performance curves of Amm, N = 25 for SE and Smm, N = 25 for RE intersected with those of N = 75 and 125 in Figure 4.
FIGURE 4

Influence of block sizes on QC performance. A0.1% is the moving average with 0.1% false rejection rate as CLs. Amm is the moving average with CLs based on minimum and maximum control data without extra error. S0.1% is the moving SD with 0.1% false rejection rate as CLs. Smm is the moving SD with CLs based on minimum and maximum control data without extra error. MNPed: the median number of patient samples processed from the start of an out‐of‐control error condition until it was detected. CVa (the analytical CV) represents analytical inherent precision. A0.1% and S0.1% were marked with dotted lines. Amm and Smm were marked with solid lines. (A) shows the performance of A0.1% and Amm (without truncation) for system error (SE). (B) shows the performance of S0.1% and Smm (set 5% outliers’ exclusion as truncation limits) for random error (RE). The main trend was that both MNPed and MNPfr decreased with smaller block sizes. As a result, the performance curves of A0.1% and S0.1% descended with decreasing block size. However, there were exceptions for procedures with minimum and maximum as CLs. Compared with N = 75 and 125, procedures with N = 25 detected a small error (<2.0 CVa) more slowly, but detected a large error (≥2.0 CVa) more quickly. That is why the performance curves of Amm, N = 25 and Smm, N = 25 intersected with those of N = 75 and 125.

The impact of truncation on QC performance depended on TLs, QC algorithms and the types of error (Figure 5). Truncation did not improve the QC performance of moving average and moving median, which were sensitive to SE, but resulted in a slight increase of MNPed. Figure 2, which lists the optimized truncation limits for various procedures, shows that T0 was the optimal TL for most procedures of these two algorithms. In contrast, proper TLs can significantly improve the QC performance of moving SD. Smm and S0.1% with T1% and T5% performed much better than without truncation (Figure 5), and T1% was slightly superior to T5%. Figure 3, which lists the optimized TLs for each procedure, also shows that T1% was the optimal TL for most moving SD procedures. The effect of truncation on the moving proportion of normal results depended on the TLs. If the TLs were wider than the reference range, truncation had no impact on QC performance. If the TLs were within the reference range, the impact was severe. For example, the TLs of T1% were 134.49 ~ 148.07 mmol/L, and the reference range was 137 ~ 147 mmol/L. There was no difference in MNPeds between moving proportion of normal results with T0 and T1% (Figures 2 and 3). Conversely, the upper truncation limit of T5% was 145.72 mmol/L, which was lower than the upper limit of the reference range (147 mmol/L). Compared with no truncation, the ability of Pmm, P0.1% and P3SD with T5% to detect RE and positive SE decreased sharply or was lost entirely, as for P3SD, N = 25,T5% in Figure 2A.
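The interaction between Winsorization and the proportion of normal results can be made concrete with a tiny Python example. The five results below are hypothetical; the truncation limits and reference range are the ones quoted above. Only the upper T5% limit is given in the text, so the lower side is left open here.

```python
def winsorize(values, lower, upper):
    """Replace values beyond the truncation limits with the limit that was exceeded."""
    return [min(max(x, lower), upper) for x in values]


def proportion_normal(values, ref_low=137.0, ref_high=147.0):
    """Fraction of results inside the reference range."""
    return sum(ref_low <= x <= ref_high for x in values) / len(values)


block = [139.0, 141.0, 143.0, 148.5, 149.0]            # hypothetical block, two high results
t1 = winsorize(block, 134.49, 148.07)                  # T1% limits, wider than the reference range
t5_upper_only = winsorize(block, float("-inf"), 145.72)  # T5% upper limit, inside the reference range

print(proportion_normal(block))          # 0.6
print(proportion_normal(t1))             # 0.6
print(proportion_normal(t5_upper_only))  # 1.0
```

With T1%, high results are pulled back to 148.07 mmol/L and still count as abnormal, so the proportion is unchanged. With the T5% upper limit, every high result is replaced by 145.72 mmol/L and counted as normal, so the proportion can no longer respond to a positive shift.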
FIGURE 5

Influence of truncation limits on QC performance. S0.1% is the moving SD with 0.1% false rejection rate as CLs. Smm is the moving SD with CLs based on minimum and maximum control data without extra error. MNPed: the median number of patient samples processed from the start of an out‐of‐control error condition until it was detected. CVa (the analytical CV) represents analytical inherent precision. T0, T1% and T5% were set to exclude the outer 0, 1 and 5% of all results, respectively. The procedures without truncation (T0) were marked with solid lines, and those with truncations (T1% and T5%) were marked with dotted lines. Proper truncation limits can significantly improve the QC performance of moving SD. Smm and S0.1% with T1% and T5% performed much better than without truncation. T1% was slightly superior to T5%.


Optimized QC procedures

The critical system error (SEc) was 1.35 CVa for serum sodium. P3SD,N = 50 with T0 or T1% detected SEc the fastest (MNPed = 258 tests for positive SEc and 33.5 tests for negative SEc). As T0 and T1% of P3SD,N = 50 had the same performance, T0, which was more convenient, was selected. The selection among T0, T1% and T5% for the other procedures followed the same rule. The best 10 QC procedures for SE based on ∑MNPed were as follows: P3SD,N = 50,T0, ∑MNPed = 716; P3SD,N = 75,T0, ∑MNPed = 924; P3SD,N = 100,T0, ∑MNPed = 1,315.5; P3SD,N = 125,T0, ∑MNPed = 1,519; P3SD,N = 150,T0, ∑MNPed = 1,586; A0.1%,N = 25,T0, ∑MNPed = 2,412; A0.1%,N = 25,T1%, ∑MNPed = 2,460.5; M0.1%,N = 50,T0, ∑MNPed = 2,707.5; A0.1%,N = 25,T5%, ∑MNPed = 2,767; A0.1%,N = 50,T0, ∑MNPed = 3,861. Similarly, the critical random error (REc) was 1.82 CVa. S0.1%,N = 25,T1% detected REc the fastest (MNPed = 24 tests). The best 10 QC procedures for RE were as follows: S0.1%,N = 25,T1%, ∑MNPed = 107.5; P3SD,N = 50,T0, ∑MNPed = 114.5; S0.1%,N = 25,T5%, ∑MNPed = 116.5; P3SD,N = 25,T5%, ∑MNPed = 132; P3SD,N = 75,T0, ∑MNPed = 152.5; Smm,N = 25,T1%, ∑MNPed = 169; S0.1%,N = 50,T1%, ∑MNPed = 171; S0.1%,N = 50,T5%, ∑MNPed = 191; P3SD,N = 100,T0, ∑MNPed = 191; P3SD,N = 50,T5%, ∑MNPed = 194.5; P3SD,N = 125,T0, ∑MNPed = 229.5. In all, P3SD,N = 50,T0 and S0.1%,N = 25,T1% were the optimized QC procedures for serum sodium, and their detailed parameters and performance are listed in Table 3.
TABLE 3

Parameters of the optimized procedures for serum sodium

Procedures | P3SD,N = 50,T0 | S0.1%,N = 25,T1%
Main function | Monitor system error | Monitor random error
Truncation limits | None | 134.49 ~ 148.07 mmol/L
Algorithms | Moving proportion of normal results | Moving standard deviation
Block size | 50 tests | 25 tests
Control limits | 88.92% ~ 100% | 0.9601 ~ 3.4546 mmol/L
MNPfr | 1,650 tests | 1,650 tests
MNPed for critical error | 33.5 tests for negative SEc; 258 tests for positive SEc | 24 tests for REc
∑MNPed | 716 tests | 107.5 tests

MNPfr: the median number of patient samples between two false rejections. MNPed: the median number of patient samples processed from the start of an out‐of‐control error condition until it was detected. ∑MNPed: the sum of MNPeds for errors greater than or equal to the critical error. SEc: critical system error. REc: critical random error.
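The parameters in Table 3 can be turned into a minimal monitoring sketch. The class names are hypothetical, and two points are assumptions rather than details confirmed by the paper: truncation is taken to mean that out-of-limit results are skipped, and the moving SD is computed as the sample standard deviation.

```python
from collections import deque
from statistics import stdev

REF_RANGE = (137.0, 147.0)  # reference range quoted in the text, mmol/L


class MovingProportionQC:
    """P3SD, N = 50, T0: moving proportion of in-range results, no truncation."""

    def __init__(self, block_size=50, control_limits=(0.8892, 1.0)):
        self.window = deque(maxlen=block_size)
        self.control_limits = control_limits

    def update(self, result):
        """Add one result; return (proportion, alarm) once the block is full."""
        self.window.append(REF_RANGE[0] <= result <= REF_RANGE[1])
        if len(self.window) < self.window.maxlen:
            return None
        proportion = sum(self.window) / len(self.window)
        low, high = self.control_limits
        return proportion, not (low <= proportion <= high)


class MovingSDQC:
    """S0.1%, N = 25, T1%: moving SD of results inside the truncation limits."""

    def __init__(self, block_size=25, trunc_limits=(134.49, 148.07),
                 control_limits=(0.9601, 3.4546)):
        self.window = deque(maxlen=block_size)
        self.trunc_limits = trunc_limits
        self.control_limits = control_limits

    def update(self, result):
        """Add one result; return (sd, alarm), or None if excluded or warming up."""
        tl_low, tl_high = self.trunc_limits
        if not (tl_low <= result <= tl_high):
            return None  # assumption: truncated results are skipped entirely
        self.window.append(result)
        if len(self.window) < self.window.maxlen:
            return None
        sd = stdev(self.window)  # assumption: sample SD
        low, high = self.control_limits
        return sd, not (low <= sd <= high)


# Proportion monitor: count abnormal results needed to trigger an alarm.
prop_qc = MovingProportionQC()
for _ in range(50):
    prop_qc.update(140.0)  # warm-up block, all in range
abnormal_needed, alarm = 0, False
while not alarm:
    abnormal_needed += 1
    _, alarm = prop_qc.update(150.0)

# SD monitor: a stable block, one truncated result, then a noisy block.
sd_qc = MovingSDQC()
for i in range(25):
    last_stable = sd_qc.update(139.0 if i % 2 == 0 else 141.0)
ignored = sd_qc.update(120.0)
for i in range(25):
    last_noisy = sd_qc.update(135.0 if i % 2 == 0 else 147.0)
print(abnormal_needed, last_stable, ignored, last_noisy)
```

With a lower control limit of 88.92% and N = 50, the proportion monitor in this sketch alarms once six of the last 50 results are outside the reference range (44/50 = 88%).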


The stability of the QC performance

To evaluate the stability of the PBRTQC performance over time, MNPeds from the training and test datasets were compared for the same procedure and error. Table 4 shows the SE detection performance of the QC procedures in the training and test datasets. MNPeds were broadly similar, except for SE = 1.35CVa. Table 5 shows that these candidate QC procedures had highly consistent RE detection performance in the training and test datasets.
TABLE 4

Difference of system error detection between the training and testing datasets

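Procedures | −4 | −3.5 | −3 | −2.5 | −2 | −1.35 | 1.35 | 2 | 2.5 | 3 | 3.5 | 4
P3SD,N = 50 | 6 | 6 | 7 | 9 | 15 | 34 | 258 | 46 | 20 | 11 | 7 | 6
P3SD,N = 50 | 5 | 6 | 6 | 8 | 13.5 | 27 | 598 | 64 | 22 | 11 | 7 | 6
P3SD,N = 75 | 7 | 7 | 9 | 12 | 20 | 42 | 334 | 64 | 24 | 14 | 9 | 8
P3SD,N = 75 | 7 | 7 | 8 | 11 | 17 | 34 | 791 | 77 | 30 | 15 | 10 | 8
P3SD,N = 100 | 9 | 9 | 11 | 15 | 24 | 54 | 496 | 80 | 30 | 17 | 11 | 10
P3SD,N = 100 | 8 | 9 | 10 | 13 | 20 | 43 | 975 | 96 | 37 | 18 | 11 | 9
P3SD,N = 125 | 10 | 11 | 13 | 18 | 29 | 64 | 566 | 96 | 37 | 21 | 14 | 12
P3SD,N = 125 | 10 | 11 | 12 | 16 | 25 | 51 | 1189 | 114 | 42 | 22 | 14 | 11
P3SD,N = 150 | 12 | 13 | 15 | 21 | 34 | 72 | 571 | 112 | 42 | 25 | 16 | 13
P3SD,N = 150 | 11 | 12 | 14 | 18 | 30 | 59 | 1650 | 136 | 49 | 25 | 17 | 13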
A0.1%,N = 5026293441507315275441353026
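A0.1%,N = 50 | 25 | 28 | 32 | 39 | 48 | 369 | 1650 | 84 | 44 | 36 | 31 | 27
M0.1%,N = 50 | 24 | 25 | 27 | 32 | 43 | 436 | 749 | 71 | 36 | 29 | 27 | 26
M0.1%,N = 50 | 24 | 24 | 26 | 30 | 41 | 149 | 1650 | 121 | 39 | 31 | 28 | 26
Amm,N = 50 | 31 | 35 | 41 | 48 | 436 | 1650 | 1650 | 273 | 48 | 40 | 34 | 30
Amm,N = 50 | 30 | 34 | 39 | 46 | 234 | 1650 | 1650 | 640 | 50 | 42 | 35 | 31
Mmm,N = 50 | 26 | 27 | 32 | 44 | 436 | 1650 | 1650 | 762 | 74 | 37 | 30 | 28
Mmm,N = 50 | 26 | 27 | 31 | 41 | 151 | 1650 | 1650 | 1650 | 121 | 39 | 31 | 28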

A0.1% is the moving average with 0.1% false rejection rate as CLs. Amm is the moving average with CLs based on minimum and maximum control data without extra error. M0.1% is the moving median with 0.1% false rejection rate as CLs. Mmm is the moving median with CLs based on minimum and maximum control data without extra error. P3SD is the moving proportion of normal results with CLs = mean_proportion ± 3 × SD_proportion. All the procedures in the table had no truncation and had the same MNPfr (1,650 tests). The MNPeds for various system errors (from 1.35CVa to 4.0CVa) are listed in the table. The rows marked gray were results from the training dataset, and the others were from the testing dataset. MNPeds were broadly similar, except for SE = 1.35CVa.

TABLE 5

Difference of random error detection between the training and testing datasets

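Procedures | MNPfr | 1.82 | 2 | 2.5 | 3 | 3.5 | 4 | 4.5 | 5
S0.1%,N = 25,T5% | 1650 | 22 | 19 | 16 | 14 | 12 | 12 | 11 | 11
S0.1%,N = 25,T5% | 1650 | 22 | 19 | 17 | 14 | 13 | 12 | 11 | 11
S0.1%,N = 25,T1% | 1650 | 24 | 20 | 15 | 12 | 10 | 10 | 9 | 8
S0.1%,N = 25,T1% | 1650 | 25 | 21 | 16 | 12 | 11 | 9 | 9 | 8
P3SD,N = 50,T1% | 1650 | 27 | 23 | 16 | 12 | 10 | 10 | 9 | 8
P3SD,N = 50,T1% | 1650 | 27 | 22 | 16 | 13 | 10 | 9 | 9 | 8
P3SD,N = 50,T0 | 1650 | 27 | 23 | 16 | 12 | 10 | 10 | 9 | 8
P3SD,N = 50,T0 | 1650 | 27 | 22 | 16 | 13 | 10 | 9 | 9 | 8
P3SD,N = 75,T0 | 1650 | 36 | 31 | 21 | 17 | 14 | 13 | 11 | 10
P3SD,N = 75,T0 | 1650 | 36 | 29 | 21 | 16 | 13 | 12 | 11 | 10
S0.1%,N = 50,T1% | 1650 | 38 | 33 | 24 | 19 | 16 | 15 | 14 | 13
S0.1%,N = 50,T1% | 1650 | 39 | 33 | 24 | 19 | 16 | 15 | 14 | 13
P3SD,N = 100,T0 | 1650 | 46 | 39 | 26 | 21 | 17 | 16 | 14 | 13
P3SD,N = 100,T0 | 1650 | 46 | 37 | 25 | 20 | 16 | 15 | 14 | 12
P3SD,N = 125,T0 | 1650 | 56 | 47 | 32 | 25 | 20 | 18 | 16 | 16
P3SD,N = 125,T0 | 1650 | 52 | 45 | 30 | 23 | 19 | 18 | 16 | 15
Smm,N = 25,T1% | 1650 | 50 | 32 | 21 | 17 | 13 | 13 | 12 | 11
Smm,N = 25,T1% | 1650 | 53 | 33 | 21 | 17 | 14 | 13 | 12 | 11
Smm,N = 50,T1% | 1650 | 55 | 44 | 32 | 25 | 22 | 19 | 18 | 17
Smm,N = 50,T1% | 1650 | 55 | 43 | 33 | 25 | 22 | 20 | 18 | 17
P3SD,N = 150,T0 | 1650 | 64 | 55 | 36 | 28 | 24 | 21 | 19 | 18
P3SD,N = 150,T0 | 1650 | 61 | 53 | 36 | 27 | 23 | 21 | 19 | 18
Smm,N = 25,T5% | 1650 | 140 | 76 | 29 | 23 | 20 | 19 | 18 | 17
Smm,N = 25,T5% | 1650 | 133 | 69 | 30 | 23 | 21 | 19 | 18 | 17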

P3SD is the moving proportion of normal results with CLs = mean_proportion ± 3 × SD_proportion. S0.1% is the moving SD with 0.1% false rejection rate as CLs. Smm is the moving SD with CLs based on minimum and maximum control data without extra error. T0, T1% and T5% were set to exclude the outer 0, 1 and 5% of all results, respectively. All the procedures in the table had the same MNPfr (1650 tests). The MNPeds for various random errors (from 1.82 CVa to 5.0 CVa) are listed in the table. The rows marked gray were results from the training dataset, and the others were from the testing dataset. These candidate QC procedures had highly consistent random error detection performance in the training and test datasets.


DISCUSSION

The characteristics of the error detection of various algorithms were analyzed and compared. The moving average and moving median were sensitive to SE, and the moving SD tended to detect RE. P3SD demonstrated excellent performance for both SE and RE. Overall, A0.1%, M0.1% and P0.1% detected SE better than Amm, Mmm and Pmm, but the differences among them were small and variable. In general, CLs calculated with a “0.1% false alarm rate” performed better than CLs that set the false alarm rate to zero (minimum and maximum as CLs). The ability to detect RE decreased in the order P3SD, S0.1%, Smm, P0.1% and Pmm. For serum sodium, P3SD,N = 50,T0 and S0.1%,N = 25,T1% were the optimized QC procedures for SE and RE, respectively.

A3.09 and SC4 were excluded for high false rejection in our research. In previous reports, they showed acceptable, even satisfactory, performance. That is because those reports assessed QC performance with simulated patients’ data instead of real patients’ data. Simulated patients’ data usually have a perfect Gaussian distribution, whereas real measurements probably do not follow it.

Both MNPed and MNPfr increased as block size increased, except for procedures with minimum and maximum as CLs. This is understandable. Suppose a sample with 110 mmol/L sodium is incorporated into the QC calculation. Owing to this outlier, the average of 25 tests decreases noticeably, while the average of 150 tests changes only slightly. Only after more such outliers are incorporated does the QC data for N = 150 begin to decrease appreciably. Thus, the main trend was that both MNPed and MNPfr increased with larger block sizes. Additionally, the smaller the block size, the larger the fluctuations in the QC data. For procedures with small block sizes (such as N = 25), the CLs derived from minimum and maximum values were therefore wide and not sensitive to small errors.
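The 110 mmol/L example can be worked through numerically. The sketch below assumes a hypothetical baseline in which every other result in the block equals 140 mmol/L; that baseline is chosen for illustration only and is not taken from the study data.

```python
def mean_with_outliers(block_size, n_outliers, baseline=140.0, outlier=110.0):
    """Block mean when n_outliers results equal `outlier` and the rest equal `baseline`."""
    total = baseline * (block_size - n_outliers) + outlier * n_outliers
    return total / block_size


small_block = mean_with_outliers(25, 1)    # one outlier in 25 tests
large_block = mean_with_outliers(150, 1)   # one outlier in 150 tests
large_block_6 = mean_with_outliers(150, 6) # six outliers in 150 tests

print(round(small_block, 2), round(large_block, 2), round(large_block_6, 2))
```

Under this assumption a single outlier lowers the N = 25 average by 1.2 mmol/L but the N = 150 average by only 0.2 mmol/L, and six such outliers are needed before the N = 150 average moves as far as the N = 25 average does with one.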
The impact of truncation on QC performance depended on TLs, QC algorithms and the types of error. A significant improvement in QC performance due to truncation was found only for moving SD, so setting truncation limits is recommended only for moving SD.

There are well‐known significant differences between traditional QC and patient‐based QC, and the limitations of traditional commercial QC have been increasingly recognized. The 4th edition of the Clinical and Laboratory Standards Institute (CLSI) C24 document recommends that laboratories introduce additional QC performance metrics that are more directly related to patient risk. In other words, the traditional performance metrics (probabilities for error detection and false rejection) are not suitable for risk management. Even more important, the C24 document has proposed that the frequency of QC events and their relationship to patient risk should be the focus of QC practices. PBRTQC may be an effective way to solve these problems because it monitors in real time and focuses directly on patients’ results.

Additionally, the TEa based on desirable biological variation is usually demanding for serum calcium, chloride, sodium and albumin. The biological variation of these tests is small, so their TEa is strict. As a result, the sigma metrics are so low that even multiple rules cannot achieve satisfactory performance. In contrast, the smaller the biological variation, the more powerful the error detection ability of PBRTQC. Tests with smaller biological variation are therefore more suitable for PBRTQC, and they are the tests that most need PBRTQC to make up for the inability of traditional QC to detect error. That is also why we selected sodium for our research. Nevertheless, PBRTQC is certainly complex and unpredictable compared with traditional QC.
In all, these different characteristics of traditional QC and PBRTQC offer an opportunity to strengthen QC plans by combining them, rather than using one method in place of the other. Specifically, Figure 6 shows the proposed flowchart for PBRTQC of serum sodium in routine clinical chemistry.

The laboratory should first set the parameters of the QC procedures. The traditional individualized QC is designed based on the sigma metrics of analytical performance. To set the PBRTQC parameters, enough patient results should be collected from a stable analytical system, over at least one month. Outliers are then excluded according to the TLs, if needed, and suitable CLs are defined according to the optimized procedures.

Once the parameters are set, the whole protocol is performed in routine work. Traditional QC, which is usually performed at the initial phase of analysis, is applied as a confirmatory tool. If the traditional QC is in control, the analytical system starts to measure patients’ samples. As measurement results are produced, PBRTQC is initiated. PBRTQC serves as an alarm tool for monitoring performance in real time and, in part, also decides when traditional QC should be performed again. The combination of P3SD,N = 50,T0 and S0.1%,N = 25,T1% is recommended as the optimized PBRTQC for serum sodium; their detailed parameters are listed in Table 3.

In practice, each new patient result yields a new QC data point, so a large block size does not delay the startup of PBRTQC. Taking N = 150 as an example, the first result of today can be combined with the last 149 results of yesterday to calculate a new QC data point. If PBRTQC is out of control, further measures are needed to confirm the analytical status, such as additional commercial QC or retesting of retained samples.
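The carry-over of yesterday's results into today's first block can be sketched as follows. This is an illustrative moving average only, with a hypothetical function name and made-up values, rather than the recommended procedures themselves.

```python
from collections import deque


def moving_averages(previous_results, todays_results, block_size):
    """Return one moving average per new result once the block is full."""
    # Seed the window with the last (block_size - 1) results of the previous day.
    carry = previous_results[-(block_size - 1):] if block_size > 1 else []
    window = deque(carry, maxlen=block_size)
    averages = []
    for value in todays_results:
        window.append(value)  # the oldest result drops out automatically
        if len(window) == block_size:
            averages.append(sum(window) / block_size)
    return averages


yesterday = [140.0] * 149   # hypothetical results carried over
today = [141.5, 139.0]      # hypothetical first two results of the day

with_carry = moving_averages(yesterday, today, block_size=150)
cold_start = moving_averages([], today, block_size=150)
print(with_carry, cold_start)
```

With the carry-over, the very first result of the day already produces a QC data point; without it, the monitor would stay silent until 150 new results had accumulated.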
FIGURE 6

Proposed flowchart for the serum sodium of patient‐based real‐time quality control (PBRTQC) charts. The two PBRTQC procedures shown are the moving proportion of normal results with CLs = mean_proportion ± 3 × SD_proportion, without truncation (P3SD with T0), and the moving SD with 0.1% false rejection rate as CLs and 1% outliers exclusion as truncation limits (S0.1% with T1%). SE: system error. RE: random error. CLs: control limits. The traditional individualized QC was designed based on sigma metrics of analytical performance. Enough patient results were collected from a stable analytical system to set PBRTQC parameters. Then outliers were excluded according to truncation limits, if needed. After that, proper CLs were defined according to the optimized procedures. Finally, the combination of traditional QC and PBRTQC would be performed in the laboratory. Traditional QC, which was performed at the initial phase of analysis, was applied as a confirmatory tool. If the traditional QC was in control, the analytical system started to measure patients’ samples. As measurement results were produced, PBRTQC was initiated. PBRTQC was considered as an alarm tool for monitoring performance in real time. If PBRTQC was out of control, further measures were needed to confirm the analytical status, such as additional commercial QC or retesting retained samples.

First, we investigated and compared the error detection characteristics of various algorithms (i.e., moving average, moving median, moving SD and moving proportion of normal results), including a variety of methods for defining CLs, simultaneously. For routine laboratories, the QC procedures investigated in this paper are common and easy to implement. In addition, both SE and RE were investigated in this study. Second, the optimized QC procedure was based on the critical error instead of a percentage of TEa. The critical error, which is determined by the TEa and analytical performance, is closely related to the sigma metric of the analytical system (SEc = sigma metric − 1.65).
The critical error was initially applied in the design of traditional QC procedures, and it deserves the same attention in PBRTQC. Thus, it is scientifically reasonable to optimize QC procedures according to their capacity to detect the critical error. Third, ∑MNPed was used as the summary parameter to evaluate overall QC performance and to select the optimized QC procedures in this article. MNPed replaced ANPed as the basic parameter of QC performance. ANPed uses the average number of patient results affected before error detection, whereas MNPed uses the median number. As the number of results necessary for error detection is not normally distributed, MNPed is more suitable. The ability of a QC procedure to detect the critical error deserves attention, but ideally any error greater than the critical error should be detected too. MNPeds for errors greater than the critical error should therefore also be taken into account, which makes ∑MNPed more informative than a single MNPed.

This study has several specific limitations. First, only serum sodium was investigated in our study, and more analytes with significantly different characteristics should be investigated in the future. Nevertheless, both our research and previous research have demonstrated that serum sodium is probably the most suitable chemistry test for PBRTQC because of its small biological variation. PBRTQC is not suitable for every test, particularly tests with low production numbers (e.g., iron), tests with an extreme variation in results (e.g., C‐reactive protein and urea) or a combination of both (i.e., lipase and amylase). In practice, it seems more feasible for PBRTQC to start with several typical tests and then be extended to most tests. Second, the errors were introduced using a step‐shift strategy rather than a gradual degradation in the data simulation. In fact, a gradual degradation error may be closer to reality and more difficult to detect.
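The metrics discussed above can be illustrated with a short sketch. The helper names and the run lengths are hypothetical; the MNPed values are those printed in the first row for S0.1%,N = 25,T1% in Table 5. Summing them gives 108, close to the reported ∑MNPed of 107.5; the small gap may reflect rounding in the printed table, which is an assumption.

```python
from statistics import mean, median


def critical_system_error(sigma):
    """SEc from the relation quoted in the text: SEc = sigma metric - 1.65."""
    return sigma - 1.65


def sum_mnped(mnped_by_error, critical_error):
    """Sum MNPeds over all error sizes at or above the critical error."""
    return sum(v for err, v in mnped_by_error.items() if err >= critical_error)


# Hypothetical run lengths before detection: a few long runs skew the distribution.
run_lengths = [12, 15, 18, 20, 22, 25, 30, 400, 1650]
anped = mean(run_lengths)    # average, pulled upward by the long runs
mnped = median(run_lengths)  # median, robust to the long runs

# MNPeds for random error (in multiples of CVa) as printed in Table 5.
s01_n25_t1 = {1.82: 24, 2.0: 20, 2.5: 15, 3.0: 12, 3.5: 10, 4.0: 10, 4.5: 9, 5.0: 8}
total = sum_mnped(s01_n25_t1, critical_error=1.82)

print(mnped, round(anped, 1), total, round(critical_system_error(3.0), 2))
```

In this toy example the median stays at 22 tests while the average is pulled above 240 by two long runs, which is the motivation for preferring MNPed over ANPed. Under the quoted relation, a sigma metric of 3.0 corresponds to the SEc of 1.35 reported for serum sodium.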
In conclusion, the combination of P3SD,N = 50,T0 and S0.1%,N = 25,T1%, which were the quickest to detect critical errors of either type, is recommended as the optimized QC procedure for serum sodium.

CONFLICT OF INTEREST

The authors declare that there is no conflict of interest or financial disclosure related to this publication.

AUTHOR CONTRIBUTIONS

Yuanyuan Li took part in conceptualization, formal analysis, funding acquisition, writing the original draft, and reviewing and editing. Qian Yu carried out investigation and data curation. Xiaoyan Zhang was involved in methodology and data curation. Xiaoling Chen contributed to writing, reviewing and editing.