Yufang Liang1, Zhe Wang2, Dawei Huang1,3, Wei Wang4, Xiang Feng2, Zewen Han2, Biao Song2, Qingtao Wang1,5, Rui Zhou1,5.
Abstract
Background: In the big data era, patient-based real-time quality control (PBRTQC), an emerging quality control (QC) method, is expanding within the clinical laboratory industry. However, the main issue with current PBRTQC methodology is data stability. Our study aimed to explore a novel protocol for data stability that combines delta data with a machine learning (ML) technique to improve the capacity of QC event detection.
Keywords: Data processing; Delta data; Machine learning; Patient-based real-time quality control; Random forest
Year: 2022 PMID: 35965972 PMCID: PMC9363967 DOI: 10.1016/j.heliyon.2022.e09935
Source DB: PubMed Journal: Heliyon ISSN: 2405-8440
Figure 1. Integrated experimental process diagram.
Data analysis of the 3 data types for the seven test items.
| Test item | Algorithm | Mean | SD | Min | 25th | 50th | 75th | Max |
|---|---|---|---|---|---|---|---|---|
| LYMPH# | Single | 2.0722 | 0.9855 | 0.2400 | 1.4400 | 1.8600 | 2.4500 | 9.4700 |
| | Delta | -0.0138 | 0.7819 | -3.6900 | -0.3600 | -0.0200 | 0.2700 | 11.0600 |
| | IF | -0.0076 | 0.3249 | -0.6800 | -0.2600 | 0.0000 | 0.2500 | 0.6000 |
| LYMPH% | Single | 31.3645 | 11.7726 | 2.9000 | 23.3000 | 30.2000 | 37.3000 | 77.8000 |
| | Delta | -1.8032 | 8.8492 | -40.9000 | -6.5000 | -1.5000 | 3.3000 | 34.2000 |
| | IF | -0.2161 | 5.2249 | -10.6600 | -4.4000 | -0.1600 | 4.0000 | 9.5600 |
| HGB | Single | 129.6815 | 19.5626 | 56.0000 | 118.0000 | 131.0000 | 143.0000 | 184.0000 |
| | Delta | 2.0162 | 11.2371 | -64.0000 | -4.0000 | 2.0000 | 7.0000 | 68.0000 |
| | IF | 0.8813 | 5.3897 | -10.4000 | -4.0000 | 1.0000 | 5.0000 | 11.6000 |
| MCH | Single | 29.5675 | 2.7095 | 17.9000 | 28.3000 | 29.8000 | 31.2000 | 39.8000 |
| | Delta | 0.0941 | 0.9510 | -7.3000 | -0.3000 | 0.1000 | 0.5000 | 7.3000 |
| | IF | -0.0140 | 0.4154 | -0.8600 | -0.3000 | 0.0000 | 0.3000 | 0.8000 |
| MCHC | Single | 325.8178 | 13.1967 | 268.0000 | 318.0000 | 326.0000 | 335.0000 | 364.0000 |
| | Delta | 0.8596 | 12.1156 | -48.0000 | -7.0000 | 1.0000 | 7.0000 | 75.0000 |
| | IF | 1.0480 | 4.3566 | -9.5000 | -3.0000 | 1.0000 | 5.0000 | 9.2000 |
| RCV | Single | 13.5693 | 2.1297 | 10.9200 | 12.3000 | 13.0400 | 14.0700 | 30.4000 |
| | Delta | -0.0211 | 1.4696 | -10.9000 | -0.4900 | 0.0000 | 0.4000 | 18.1600 |
| | IF | 0.0243 | 0.2560 | -0.4000 | -0.2000 | 0.0000 | 0.2000 | 0.5300 |
| PLT | Single | 235.4359 | 82.7727 | 30.0000 | 182.0000 | 224.0000 | 275.0000 | 742.0000 |
| | Delta | 2.8489 | 57.2888 | -243.0000 | -21.0000 | 4.0000 | 27.0000 | 456.0000 |
| | IF | 1.4151 | 23.3346 | -42.0000 | -17.0000 | 1.0000 | 20.0000 | 48.0000 |
Single - single-type data; Delta - delta-type data pre-processed with different truncation limits (statistical method); IF - delta-type data pre-processed with IF (ML method); Mean - average value; SD - standard deviation; Min - minimum value; 25th - 25th percentile; 50th - 50th percentile (median); 75th - 75th percentile; Max - maximum value.
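The difference between single-type and delta-type data, and the effect of a truncation limit, can be illustrated with a short Python sketch. It assumes that a delta is the difference between a patient's current and previous result and that a truncation limit of TL% removes the lowest and highest TL% of values; neither definition is spelled out here, and the function names `delta_values` and `truncate` are hypothetical.

```python
def delta_values(records):
    """Return current-minus-previous differences for repeat patients.

    `records` is an iterable of (patient_id, result) pairs in
    chronological order. Assumption: a delta is the difference between
    consecutive results of the same patient.
    """
    last_seen = {}
    deltas = []
    for patient_id, result in records:
        if patient_id in last_seen:
            deltas.append(result - last_seen[patient_id])
        last_seen[patient_id] = result
    return deltas


def truncate(values, limit_percent):
    """Drop values in the lowest and highest `limit_percent` percent.

    Assumption: the truncation limit is applied symmetrically to both
    tails. The original order of the surviving values is preserved.
    """
    ordered = sorted(values)
    cut = int(len(ordered) * limit_percent / 100)
    if cut == 0:
        return list(values)
    low, high = ordered[cut], ordered[-cut - 1]
    return [v for v in values if low <= v <= high]


# Illustrative HGB-like results for two patients (made-up numbers).
records = [("p1", 130), ("p2", 118), ("p1", 134), ("p2", 111), ("p1", 131)]
deltas = delta_values(records)
trimmed = truncate(list(range(20)), 10)
```

In this toy input, the first result of each patient produces no delta, which is why delta-type data sets are smaller than the single-type data they are derived from.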
Figure 2. Data separability between critically biased and unbiased data for the three data types by PCA. The three rows, from top to bottom, represent LYMPH#, HGB and PLT of the complete blood count; the three columns, from left to right, represent single-type data, delta-type data, and delta-type data processed by IF. Each point in each diagram represents an ML sample with the same block size, consisting of 10 raw patient results.
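The caption describes each ML sample as a block of 10 raw patient results. A minimal sketch of that grouping is given below, assuming consecutive results are windowed in their original order. Whether the windows overlap is not stated, so the `step` parameter and the name `make_blocks` are hypothetical.

```python
def make_blocks(values, block_size=10, step=1):
    """Group consecutive results into fixed-size blocks.

    Each block is one candidate ML sample (a feature vector of
    `block_size` results). `step` controls the overlap between blocks:
    step == block_size gives non-overlapping blocks. Both the windowing
    scheme and the default step are assumptions for illustration.
    """
    last_start = len(values) - block_size
    return [
        list(values[start:start + block_size])
        for start in range(0, last_start + 1, step)
    ]


# 25 placeholder results, windowed in blocks of 10 with a step of 5.
results = list(range(25))
blocks = make_blocks(results, block_size=10, step=5)
```

A series shorter than the block size yields no blocks, so the first sample only exists once 10 results have accumulated.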
AUC for different block sizes and the results of RF tuning for R–CV at the critical bias.
| Block size | AUC | N-trees | Max Depth | Training accuracy | Testing accuracy |
|---|---|---|---|---|---|
| 5 | 0.9122 | 100 | 150 | 0.90 | 0.89 |
| 6 | 0.9352 | 200 | 100 | 0.91 | 0.92 |
| 7 | 0.9449 | 200 | 300 | 0.92 | 0.93 |
| 8 | 0.9556 | 300 | 100 | 0.96 | 0.94 |
| 9 | 0.9768 | 300 | 300 | 0.93 | 0.93 |
| 10 | 0.9862 | 400 | 100 | 0.94 | 0.93 |
| 11 | 0.9841 | 400 | 300 | 0.94 | 0.93 |
| 12 | 0.9832 | 500 | 100 | 0.94 | 0.93 |
| 13 | 0.9865 | 500 | 300 | 0.95 | 0.92 |
N-trees - number of trees; Max Depth - max depth of trees.
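For readers less familiar with the AUC column, the sketch below computes AUC as the probability that a randomly chosen biased block receives a higher classifier score than a randomly chosen unbiased block (ties count half). This is the standard rank-based definition, not code from the study; the scores are invented for illustration and the function name `auc` is hypothetical.

```python
def auc(positive_scores, negative_scores):
    """Rank-based AUC: P(score of a positive > score of a negative).

    Ties contribute 0.5. This pairwise form is O(n * m) and is meant
    for clarity, not for large data sets.
    """
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Made-up classifier scores for biased (positive) and unbiased blocks.
biased_scores = [0.9, 0.8, 0.4]
unbiased_scores = [0.1, 0.5, 0.3]
example_auc = auc(biased_scores, unbiased_scores)  # 8 of 9 pairs ranked correctly
```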
Figure 3. Data distribution of the training and test sets, and performance parameters of the five experiments at the critical bias for LYMPH#, HGB and PLT. A-C show principal component analysis (PCA) plots of the training set and internal validation set for LYMPH#, HGB and PLT, ordered from left to right. D shows the TPR, TNR, FPR, FNR and ACC of the five algorithms (TPR - true positive rate; TNR - true negative rate; FPR - false positive rate; FNR - false negative rate; ACC - accuracy). E shows their ANPed, MNPed and 95NPed (ANPed - average of Nped; MNPed - median of Nped; 95NPed - 95th percentile of Nped).
Test results of 5 algorithms at the critical level in leucocyte lineage.
| Test item | Algorithm | TL (%) | Transformation | BS | CL | CL_l | CL_U | TPR | TNR | FPR | FNR | ACC | ANPed | MNPed | 95NPed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LYMPH# | Single-MA | 1 | BC | 130 | daily extremes | 1.6912 | 1.9715 | 0.0899 | 1.0000 | 0.0000 | 1.0000 | 0.5335 | 736.6693 | 1100 | 1100 |
| | Single-MovSD | 5 | BC | 50 | daily extremes | 0.3741 | 0.6351 | 0.3876 | 0.9955 | 0.0045 | 0.9955 | 0.6741 | 599.3228 | 573 | 1100 |
| | Delta-MA | 10 | - | 110 | daily extremes | -0.4356 | 0.3840 | 0.7572 | 0.9945 | 0.0055 | 0.9945 | 0.8704 | 145.3333 | 98 | 451 |
| | Delta-MovSD | 15 | - | 90 | 2.5CV | 0.8371 | 1.7768 | 0.9750 | 0.9610 | 0.0390 | 0.9610 | 0.9685 | 77.5542 | 72 | 166 |
| | Delta-ML | Processing | | 10 | RF-model | - | - | 0.9950 | 1.0000 | 0.0000 | 1.0000 | 0.9957 | 5.0000 | 5 | 6 |
| LYMPH% | Single-MA | 10 | neat | 130 | daily extremes | 28.4640 | 32.6476 | 0.5614 | 0.9989 | 0.0011 | 0.9989 | 0.7676 | 591.4567 | 494 | 1100 |
| | Single-SD | 10 | neat | 110 | daily extremes | 4.7516 | 9.4391 | 0.3686 | 0.9989 | 0.0011 | 0.9989 | 0.6691 | 635.5118 | 657 | 1100 |
| | Delta-MA | 5 | - | 90 | 3CV | -6.3065 | 5.2503 | 0.8771 | 1.0000 | 0.0000 | 1.0000 | 0.9343 | 330.3810 | 237 | 951 |
| | Delta-SD | 1 | - | 130 | daily extremes | 9.7270 | 22.4594 | 0.9590 | 0.9966 | 0.0034 | 0.9966 | 0.9765 | 63.9565 | 57 | 140 |
| | Delta-ML | Processing | | 10 | RF-model | - | - | 0.9678 | 0.9847 | 0.0153 | 0.9847 | 0.9700 | 4.0000 | 4 | 5 |
TL - truncation limit; BC - Box–Cox transformation; BS - block size; CL_l - lower control limit; CL_U - upper control limit; TPR - true positive rate; TNR - true negative rate; FPR - false positive rate; FNR - false negative rate; ACC - accuracy; ANPed - average of Nped; MNPed - median of Nped; 95NPed - 95th percentile of Nped.
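The rate columns can be related to one another with a short sketch based on the standard confusion-matrix definitions. These definitions are an assumption, since the formulas used for the table are not given here; under them FNR equals 1 - TPR and FPR equals 1 - TNR, so tabulated values that do not satisfy these identities may follow a different convention. The counts and the function name `detection_rates` are illustrative only.

```python
def detection_rates(tp, fn, tn, fp):
    """Compute TPR, TNR, FPR, FNR and ACC from confusion-matrix counts.

    tp: biased blocks flagged as biased (true positives)
    fn: biased blocks missed (false negatives)
    tn: unbiased blocks passed (true negatives)
    fp: unbiased blocks flagged as biased (false positives)
    """
    positives = tp + fn
    negatives = tn + fp
    return {
        "TPR": tp / positives,
        "TNR": tn / negatives,
        "FPR": fp / negatives,
        "FNR": fn / positives,
        "ACC": (tp + tn) / (positives + negatives),
    }


# Made-up counts: 100 biased and 100 unbiased blocks.
example_rates = detection_rates(tp=90, fn=10, tn=95, fp=5)
```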
Test results of 5 algorithms at the critical level in erythrocyte lineage.
| Test item | Algorithm | TL (%) | Transformation | BS | CL | CL_l | CL_U | TPR | TNR | FPR | FNR | ACC | ANPed | MNPed | 95NPed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HGB | Single-MA | 10 | BC | 130 | daily extremes | 125.9305 | 130.8100 | 0.1459 | 1.0000 | 0.0000 | 1.0000 | 0.5622 | 842.0866 | 1100 | 1100 |
| | Single-MovSD | 5 | neat | 50 | daily extremes | 2.4035 | 14.5398 | 0.0709 | 1.0000 | 0.0000 | 1.0000 | 0.5286 | 602.7619 | 551 | 1100 |
| | Delta-MA | 15 | - | 110 | daily extremes | -3.8918 | 5.1418 | 0.6943 | 0.9596 | 0.0404 | 0.9596 | 0.8193 | 570.3228 | 469 | 1100 |
| | Delta-MovSD | 5 | - | 30 | daily extremes | 11.4405 | 26.4159 | 0.9191 | 1.0000 | 0.0000 | 1.0000 | 0.9568 | 103.1467 | 97 | 236 |
| | Delta-ML | Processing | | 10 | RF-model | - | - | 0.9912 | 0.9997 | 0.0003 | 0.9997 | 0.9923 | 8.8500 | 9 | 11 |
| MCH | Single-MA | 15 | BC | 90 | 3CV | 29.2380 | 30.2543 | 0.4386 | 0.9722 | 0.0278 | 0.9722 | 0.7015 | 643.4252 | 643 | 1100 |
| | Single-MovSD | 15 | Neat | 30 | daily extremes | 1.2504 | 2.1272 | 0.4785 | 0.9558 | 0.0442 | 0.9558 | 0.7136 | 547.8189 | 454 | 1100 |
| | Delta-MA | 5 | - | 30 | 2.5CV | -1.1558 | 1.2430 | 0.1658 | 0.9939 | 0.0061 | 0.9939 | 0.5759 | 568.5952 | 426 | 1100 |
| | Delta-MovSD | 1 | - | 20 | 3CV | 1.4054 | 4.1035 | 0.4086 | 0.9989 | 0.0011 | 0.9989 | 0.6900 | 196.8989 | 136 | 494 |
| | Delta-ML | Processing | | 10 | RF-model | - | - | 0.9909 | 0.9923 | 0.0077 | 0.9923 | 0.9911 | 8.3500 | 8 | 12 |
| MCHC | Single-MA | 20 | BC | 130 | daily extremes | 329.3763 | 334.2094 | 0.5485 | 0.9990 | 0.0010 | 0.9990 | 0.7704 | 630.4646 | 559 | 1100 |
| | Single-MovSD | 0 | Neat | 30 | 3CV | 6.7953 | 11.7294 | 0.5884 | 0.9990 | 0.0010 | 0.9990 | 0.7907 | 567.5827 | 429 | 1100 |
| | Delta-MA | 1 | - | 30 | 3CV | -9.1536 | 9.6639 | 0.4066 | 0.9989 | 0.0011 | 0.9989 | 0.6890 | 268.1905 | 200 | 669 |
| | Delta-MovSD | 5 | - | 90 | daily extremes | 5.6724 | 10.6232 | 0.9570 | 0.9977 | 0.0023 | 0.9977 | 0.9760 | 83.3125 | 83 | 120 |
| | Delta-ML | Processing | | 10 | RF-model | - | - | 0.9913 | 0.9930 | 0.0070 | 0.9930 | 0.9915 | 8.7500 | 9 | 11 |
| R–CV | Single-MA | 10 | BC | 90 | daily extremes | 12.5823 | 13.1440 | 0.3866 | 0.9674 | 0.0326 | 0.9674 | 0.6697 | 476.0472 | 293 | 1100 |
| | Single-MovSD | 5 | BC | 50 | 3CV | 0.1928 | 1.1664 | 0.4026 | 0.9568 | 0.0432 | 0.9568 | 0.6756 | 187.3571 | 94 | 921 |
| | Delta-MA | 20 | - | 30 | daily extremes | -0.4530 | 0.4332 | 0.2248 | 0.9630 | 0.0370 | 0.9630 | 0.5884 | 473.9370 | 283 | 1100 |
| | Delta-MovSD | 1 | - | 30 | 2.5CV | 0.8907 | 3.1528 | 0.7922 | 0.9978 | 0.0022 | 0.9978 | 0.8902 | 83.4404 | 76 | 185 |
| | Delta-ML | Processing | | 10 | RF-model | - | - | 0.9897 | 0.9843 | 0.0157 | 0.9843 | 0.9890 | 10.3000 | 10 | 12 |
TL - truncation limit; BC - Box–Cox transformation; BS - block size; CL_l - lower control limit; CL_U - upper control limit; TPR - true positive rate; TNR - true negative rate; FPR - false positive rate; FNR - false negative rate; ACC - accuracy; ANPed - average of Nped; MNPed - median of Nped; 95NPed - 95th percentile of Nped.
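The MA rows pair a block size (BS) with a lower and an upper control limit. The sketch below shows how such a rule could raise an alarm and how Nped could then be counted. It assumes that Nped is the number of patient results between the onset of a bias and the first alarm, which is not defined in this section; the function names and the toy series are hypothetical.

```python
from collections import deque


def first_alarm_index(values, block_size, lower, upper):
    """Index of the first result whose moving average leaves [lower, upper].

    The moving average is taken over the most recent `block_size`
    results. Returns None if the limits are never exceeded.
    """
    window = deque(maxlen=block_size)
    total = 0.0
    for index, value in enumerate(values):
        if len(window) == block_size:
            total -= window[0]  # this value is about to drop out of the window
        window.append(value)
        total += value
        if len(window) == block_size:
            average = total / block_size
            if not lower <= average <= upper:
                return index
    return None


def nped(values, bias_start, block_size, lower, upper):
    """Results affected before detection (assumed meaning of Nped).

    Returns None when no alarm occurs, or when the first alarm precedes
    the bias (a false alarm, which this simplified sketch does not handle).
    """
    alarm = first_alarm_index(values, block_size, lower, upper)
    if alarm is None or alarm < bias_start:
        return None
    return alarm - bias_start + 1


# Toy series: five in-control results, then a +3 shift from index 5 onward.
series = [10.0] * 5 + [13.0] * 5
example_nped = nped(series, bias_start=5, block_size=3, lower=9.0, upper=11.0)
```

A larger block size smooths the moving average and lowers the false alarm rate, but it also delays detection, which is the trade-off the BS and Nped columns reflect.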
Test results of 5 algorithms at the critical level in platelet lineage.
| Test item | Algorithm | TL (%) | Transformation | BS | CL | CL_l | CL_U | TPR | TNR | FPR | FNR | ACC | ANPed | MNPed | 95NPed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PLT | Single-MA | 15 | neat | 110 | 3CV | 212.2105 | 242.0452 | 0.0759 | 1.0000 | 0.0000 | 1.0000 | 0.5312 | 525.9528 | 440 | 1100 |
| | Single-MovSD | 15 | BC | 30 | daily extremes | 4.7925 | 19.7785 | 0.0779 | 1.0000 | 0.0000 | 1.0000 | 0.5322 | 498.0866 | 372 | 1100 |
| | Delta-MA | 0 | - | 30 | 2.5CV | -48.8773 | 46.9443 | 0.0669 | 1.0000 | 0.0000 | 1.0000 | 0.5266 | 482.6190 | 392 | 1100 |
| | Delta-MovSD | 1 | - | 30 | 3CV | 64.3340 | 178.1932 | 0.8202 | 0.9966 | 0.0034 | 0.9966 | 0.9033 | 126.3592 | 103 | 303 |
| | Delta-ML | Processing | | 10 | RF-model | - | - | 0.9894 | 0.9937 | 0.0063 | 0.9937 | 0.9900 | 7.4500 | 7 | 11 |
TL - truncation limit; BC - Box–Cox transformation; BS - block size; CL_l - lower control limit; CL_U - upper control limit; TPR - true positive rate; TNR - true negative rate; FPR - false positive rate; FNR - false negative rate; ACC - accuracy; ANPed - average of Nped; MNPed - median of Nped; 95NPed - 95th percentile of Nped.
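The three Nped summary columns can be derived from a list of per-simulation Nped values, as in the sketch below. The percentile uses the nearest-rank method, which is an assumption because the interpolation rule behind the tables is not stated; the input values and the function name `summarize_nped` are illustrative.

```python
from statistics import mean, median


def summarize_nped(nped_values):
    """Return ANPed (mean), MNPed (median) and 95NPed (95th percentile).

    The 95th percentile uses the nearest-rank method: the value at rank
    ceil(0.95 * n) in the sorted list. Integer arithmetic avoids
    floating-point rounding in the rank computation.
    """
    ordered = sorted(nped_values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return {
        "ANPed": mean(ordered),
        "MNPed": median(ordered),
        "95NPed": ordered[rank - 1],
    }


# Made-up Nped values from 20 simulated bias events.
nped_summary = summarize_nped(list(range(1, 21)))
```

Reporting the median and the 95th percentile alongside the mean matters because Nped distributions are typically right-skewed: a few late detections can inflate the average.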
Figure 4. Performance comparison curves for the five experiments. A, B and C correspond to LYMPH#, HGB and PLT, respectively. Colored lines represent MNPed for each bias, and colored areas represent the associated 95NPed. Parameters are displayed in the top corner (BS: block size; T: truncation limit; BC: with Box–Cox transformation).