Bruno Levecke, Luc E Coffeng, Christopher Hanna, Rachel L Pullan, Katherine M Gass.
Abstract
Recently, the World Health Organization established the Diagnostic Technical Advisory Group to identify and prioritize diagnostic needs for neglected tropical diseases, and ultimately to describe the minimal and ideal characteristics for new diagnostic tests (the so-called target product profiles (TPPs)). We developed two generic frameworks: one to explore and determine the required sensitivity (probability of correctly detecting diseased persons) and specificity (probability of correctly detecting persons free of disease), and another to determine the corresponding sample sizes and decision rules based on a multi-category lot quality assurance sampling (MC-LQAS) approach that accounts for imperfect tests. We applied both frameworks to the monitoring and evaluation of soil-transmitted helminthiasis control programs. Our study indicates that specificity rather than sensitivity will become more important when the program approaches the endgame of elimination, and that the requirements for both parameters are inversely correlated, resulting in multiple combinations of sensitivity and specificity that allow for reliable decision making. The MC-LQAS framework highlighted that improving diagnostic performance results in a smaller sample size for the same level of program decision making. In other words, the additional cost per diagnostic test with improved diagnostic performance may be compensated by lower operational costs in the field. Based on our results, we proposed the required minimal and ideal diagnostic sensitivity and specificity for diagnostic tests applied in monitoring and evaluation of soil-transmitted helminthiasis control programs.
Year: 2021 PMID: 34520474 PMCID: PMC8480900 DOI: 10.1371/journal.pntd.0009740
Source DB: PubMed Journal: PLoS Negl Trop Dis ISSN: 1935-2727
Fig 2. The process to determine the decision cut-off c in an LQAS framework.
The different panels in this figure illustrate the process to determine the decision cut-off c when 500 subjects (N) are randomly recruited, for both a perfect test (sensitivity (Se) = specificity (Sp) = 100%; Panels A–C) and an imperfect test (Se = Sp = 80%; Panels D–F). Panels A and D represent the cumulative error of prematurely reducing the preventive chemotherapy (PC) (ε) when the true underlying prevalence was arbitrarily set at 55% (Prev). The horizontal dashed line represents an ε of 5%; the red dashed line represents the allowed possible decision cut-off c resulting in an ε≤5%. The red area under the curve highlights all possible values for c resulting in an ε≤5%. Panels B and E represent the cumulative error of selecting a PC frequency that is higher than needed (ε) when the true underlying prevalence was arbitrarily set at 45% (Prev). The horizontal dashed line represents an ε of 25%; the blue dashed line represents the lowest possible decision cut-off c resulting in an ε≤25%. The blue area under the curve highlights all possible values for c resulting in an ε≤25%. Panels C and F represent the probability (in %) of the number of positive test results (N+) in a random sample of N subjects being at least c over a wide range of true underlying prevalence (Prev), based on the two extreme decision cut-offs (red line: lowest possible value; blue line: highest possible value). The vertical straight line represents the program decision threshold T of 50%. The horizontal black dashed lines represent an ε equal to 25% and an ε equal to 5% (= 100% - 95%). The grey zone indicates the range of Prev for which decision making is inadequate (ε>25% (blue dashed line) and ε>5% (red dashed line)). In this example, the grey zone ranges from 45% to 55% by design.
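The search for admissible cut-offs described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that N+ follows a binomial distribution whose success probability is the apparent prevalence Se·Prev + (1 − Sp)·(1 − Prev), and the function names (`apparent_prevalence`, `binom_pmf`, `cutoff_range`) are hypothetical.

```python
from math import comb

def apparent_prevalence(prev, se, sp):
    """Probability that a randomly sampled subject tests positive
    (assumed relation: true positives plus false positives)."""
    return se * prev + (1.0 - sp) * (1.0 - prev)

def binom_pmf(n, p):
    """List of P(X = k) for k = 0..n, X ~ Binomial(n, p).
    Plain floats are used, so this is only suitable for n up to about 1000."""
    return [comb(n, k) * p**k * (1.0 - p)**(n - k) for k in range(n + 1)]

def cutoff_range(n, se, sp, prev_low, prev_high, eps_over, eps_under):
    """Return (lowest, highest) cut-off c such that
    P(N+ >= c | prev_low)  <= eps_over   (PC frequency higher than needed) and
    P(N+ <  c | prev_high) <= eps_under  (PC reduced prematurely).
    Returns None when no cut-off satisfies both conditions."""
    pmf_low = binom_pmf(n, apparent_prevalence(prev_low, se, sp))
    pmf_high = binom_pmf(n, apparent_prevalence(prev_high, se, sp))
    valid = []
    below_low = 0.0   # running P(N+ < c | prev_low)
    below_high = 0.0  # running P(N+ < c | prev_high)
    for c in range(n + 1):
        if 1.0 - below_low <= eps_over and below_high <= eps_under:
            valid.append(c)
        below_low += pmf_low[c]
        below_high += pmf_high[c]
    return (valid[0], valid[-1]) if valid else None

# Settings taken from the caption: N = 500, grey zone 45% to 55%.
perfect = cutoff_range(500, 1.0, 1.0, 0.45, 0.55, 0.25, 0.05)
imperfect = cutoff_range(500, 0.8, 0.8, 0.45, 0.55, 0.25, 0.05)
print("perfect test:", perfect)
print("imperfect test:", imperfect)
```

Under these assumptions, the imperfect test leaves a much narrower range of admissible cut-offs for the same N, which is one way to see why poorer diagnostic performance calls for a larger sample or a wider grey zone.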
Fig 3. The build-up of multi-category LQAS for STH control program decision making using an imperfect test.
The different panels illustrate the build-up of a multi-category LQAS around 4 program decision thresholds T (2%, 10%, 20% and 50%) when applying an imperfect test (sensitivity (Se) = 76% and specificity (Sp) = 99%) on 500 randomly selected subjects (N). Panel A provides the probability (in %) of the number of positive test results (N+) in a random sample of N subjects (= 500) being at least c separately for each of the 4 thresholds, their corresponding decision cut-offs (c2% = 13, c10% = 41, c20% = 84, c50% = 182) and true underlying prevalence Prev (Prev: 0.0%, Prev: 4.0%; Prev: 7.5%, Prev: 12.5%; Prev: 15.0%, Prev: 25.0%; Prev: 45.0%, Prev: 55.0%). Note that these Prev-values define the borders of the grey zone around the program thresholds, and that for these Prev-values ε≤25% and ε≤5%. The vertical straight line represents the program decision threshold T (orange: 2%, red: 10%, green: 20% and blue: 50%). The horizontal black dashed lines represent an ε equal to 25% and an ε equal to 5% (= 100% - 95%). The grey zone indicates the range of Prev for which decision making is inadequate (ε>25% and ε>5%). Panel B provides the same information as Panel A, but highlights the error of falsely scaling up the PC frequency (solid surfaces). Panels C and D represent the probability of correct program decision making across a wide range of Prev, where Panel C provides an overview of the relative contribution of ε (colored areas) in the program decision making.
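Once the four cut-offs are fixed, applying the multi-category rule to a survey result is a simple lookup. The sketch below uses the thresholds and cut-offs quoted in this caption (N = 500, Se = 76%, Sp = 99%). The function name `prevalence_category` and the category labels are illustrative assumptions; the mapping from each category to a PC frequency is not given in this excerpt and is therefore left out.

```python
from bisect import bisect_right

THRESHOLDS = [2, 10, 20, 50]   # program decision thresholds T (in %)
CUTOFFS = [13, 41, 84, 182]    # decision cut-offs c quoted in the caption

def prevalence_category(n_positive):
    """Classify a survey with n_positive positives out of 500 subjects.
    A threshold T counts as reached when N+ is at least its cut-off c."""
    reached = bisect_right(CUTOFFS, n_positive)  # number of cut-offs <= N+
    if reached == 0:
        return "<2%"
    if reached == len(THRESHOLDS):
        return ">=50%"
    return f"{THRESHOLDS[reached - 1]}% to <{THRESHOLDS[reached]}%"

for n_pos in (5, 13, 60, 100, 200):
    print(n_pos, "->", prevalence_category(n_pos))
```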
Fig 5. The impact of program decision errors and diagnostic performance on the grey zone.
The red line in each panel represents the probability (in %) of the number of positive test results (N+) in a random sample of N subjects (= 500) being at least T′ (see ) (Panels A and B: T = 50%; Panels C and D: T = 2%) using two distinct theoretical imperfect diagnostic tests, D1 and D2 (Se = 100% and Sp = 60% (Panels A, B and C); Se = 60% and Sp = 100% (Panel D)). The grey area represents the range of true underlying prevalence for which program decision making is inadequate (ε>25% and ε>5%; Panels A, C and D) or not ideal (ε>10% and ε>5%; Panel B).
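To see why the two theoretical tests behave so differently at the 2% threshold, it can help to compute the share of subjects expected to test positive. The sketch below assumes the standard relation between true and apparent prevalence (true positives plus false positives) with Se and Sp independent of prevalence; the function name `apparent_prevalence` is hypothetical and the paper's own derivation of T′ is not reproduced here.

```python
def apparent_prevalence(prev, se, sp):
    """Expected fraction of positive test results (assumed relation):
    Se * Prev + (1 - Sp) * (1 - Prev)."""
    return se * prev + (1.0 - sp) * (1.0 - prev)

# D1 and D2 as defined in the caption.
tests = {"D1": (1.0, 0.6), "D2": (0.6, 1.0)}
for name, (se, sp) in tests.items():
    for prev in (0.02, 0.50):
        p = apparent_prevalence(prev, se, sp)
        print(f"{name}: true prevalence {prev:.0%} -> apparent prevalence {p:.1%}")
```

At a true prevalence of 2%, almost all positives produced by the low-specificity test D1 are false positives, whereas the low-sensitivity test D2 only scales the signal down. This is consistent with the abstract's point that specificity becomes more important near elimination.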
The 61 diagnostic tests that allow for ideal decision making.
The table represents the width of the grey zone around the six program decision thresholds T (1%, 2%, 5%, 10%, 20% and 50%) that allowed for sufficient decision making (ε≤10% and ε≤5%) for each of the 61 pairs of sensitivity (Se) and specificity (Sp). For simplicity, we have classified the width of the grey zone into three levels (1–3) for each threshold separately. This classification into 3 levels was based on the 25th and 75th percentile of the width of the grey zones (level 1: width of grey zone < 25th percentile; level 2: 75th percentile > width of grey zone ≥ 25th percentile; level 3: width of grey zone ≥ 75th percentile (see )) across all potential diagnostic methods that allowed for adequate program decision making. In other words, each of these diagnostic methods allowed for adequate decision making (ε is set at 25%) at a true underlying prevalence of zero. Diagnostic tests were considered ‘optimal’ (blue) when they resulted in a level 1 grey zone around at least 3 out of the 6 thresholds and did not result in a level 3 grey zone around any of the 6 program thresholds. In all other cases, the diagnostic test was considered ‘minimal’ (white).
| Specificity (%) | Sensitivity (%) | T = 50% | T = 20% | T = 10% | T = 5% | T = 2% | T = 1% | Type of test |
|---|---|---|---|---|---|---|---|---|
| 100 | 96–100 | 1 | 1 | 1 | 1 | 1 | 1 | Optimal |
| | 81–95 | 2 | 1 | 1 | 1 | 1 | 1 | Optimal |
| | 70–80 | 2 | 2 | 1 | 1 | 1 | 1 | Optimal |
| | 61–69 | 2 | 2 | 1 | 1 | 1 | 2 | Optimal |
| | 60 | 3 | 2 | 1 | 1 | 1 | 2 | Minimal |
| 99 | 97–100 | 1 | 1 | 1 | 1 | 1 | 2 | Optimal |
| | 85–96 | 2 | 1 | 1 | 1 | 1 | 2 | Optimal |
| | 81–84 | 2 | 2 | 1 | 1 | 1 | 2 | Optimal |
The 207 diagnostic tests that allow for adequate decision making.
The table represents the width of the grey zone around the six program decision thresholds T (1%, 2%, 5%, 10%, 20% and 50%) that allowed for sufficient decision making (ε≤25% and ε≤5%) for each of the 207 pairs of sensitivity (Se) and specificity (Sp). For simplicity, we have classified the width of the grey zone into three levels (1–3) for each threshold and ε separately. This classification into 3 levels was based on the 25th and 75th percentile of the width of the grey zones (level 1: width of grey zone < 25th percentile; level 2: 75th percentile > width of grey zone ≥ 25th percentile; level 3: width of grey zone ≥ 75th percentile (see )) across all potential diagnostic methods that allowed for adequate program decision making. In other words, each of these diagnostic methods allowed for adequate decision making (ε is set at 25%) at a true underlying prevalence of zero. Diagnostic tests were considered ‘optimal’ (blue) when they resulted in a level 1 grey zone around at least 3 out of the 6 thresholds and did not result in a level 3 grey zone around any of the 6 program thresholds. In all other cases, the diagnostic test was considered ‘minimal’ (white).
| Specificity (%) | Sensitivity (%) | T = 50% | T = 20% | T = 10% | T = 5% | T = 2% | T = 1% | Type of test |
|---|---|---|---|---|---|---|---|---|
| 100 | 74–100 | 1 | 1 | 1 | 1 | 1 | 1 | Optimal |
| | 63–73 | 2 | 1 | 1 | 1 | 1 | 1 | Optimal |
| | 60–62 | 2 | 1 | 1 | 1 | 1 | 2 | Optimal |
| 99 | 75–100 | 1 | 1 | 1 | 1 | 1 | 2 | Optimal |
| | 60–74 | 2 | 1 | 1 | 1 | 1 | 2 | Optimal |
| 98 | 76–100 | 1 | 1 | 1 | 1 | 1 | 2 | Optimal |
| | 69–75 | 2 | 1 | 1 | 1 | 1 | 2 | Optimal |
| | 67–68 | 2 | 1 | 1 | 1 | 1 | 3 | Minimal |
| | 66 | 2 | 1 | 1 | 1 | 2 | 3 | Minimal |
| | 64–65 | 2 | 1 | 1 | 1 | 1 | 3 | Minimal |
| | 62–63 | 2 | 2 | 1 | 1 | 1 | 3 | Minimal |
| 97 | 77–100 | 1 | 1 | 1 | 1 | 1 | 2 | Optimal |
| | 72–76 | 2 | 1 | 1 | 1 | 1 | 3 | Minimal |
| | 68–71 | 2 | 1 | 1 | 1 | 2 | 3 | Minimal |
| | 63–67 | 2 | 2 | 1 | 1 | 2 | 3 | Minimal |
| 96 | 92–100 | 1 | 1 | 1 | 1 | 1 | 2 | Optimal |
| | 84–91 | 1 | 1 | 1 | 1 | 1 | 3 | Minimal |
| 95 | 98–100 | 1 | 1 | 1 | 1 | 1 | 2 | Optimal |
| | 93–97 | 1 | 1 | 1 | 1 | 1 | 3 | Minimal |
| | 87–92 | 1 | 1 | 1 | 1 | 2 | 3 | Minimal |
| | 85–86 | 1 | 1 | 1 | 1 | 1 | 3 | Minimal |
| 94 | 96–100 | 1 | 1 | 1 | 1 | 1 | 3 | Minimal |
| | 86–95 | 1 | 1 | 1 | 1 | 2 | 3 | Minimal |
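The labelling rule used in these tables can be written down compactly. The following is a minimal sketch of the rule as stated in the captions; the function names `grey_zone_level` and `classify_test` are hypothetical, and the 25th and 75th percentile values must be supplied by the caller because they are not listed in this excerpt.

```python
def grey_zone_level(width, p25, p75):
    """Level 1: width < 25th percentile; level 2: between the 25th
    (inclusive) and 75th percentile; level 3: width >= 75th percentile."""
    if width < p25:
        return 1
    if width < p75:
        return 2
    return 3

def classify_test(levels):
    """levels: grey-zone levels around the six program thresholds
    (50%, 20%, 10%, 5%, 2%, 1%). A test is 'Optimal' when at least 3 of
    the 6 levels are 1 and none is 3; otherwise it is 'Minimal'."""
    if len(levels) != 6:
        raise ValueError("expected one level per program threshold (6 in total)")
    level_one_count = sum(1 for level in levels if level == 1)
    if level_one_count >= 3 and 3 not in levels:
        return "Optimal"
    return "Minimal"

# Rows taken from the tables above.
print(classify_test([2, 2, 1, 1, 1, 2]))  # Sp 99%, Se 81–84% (ideal table)
print(classify_test([2, 1, 1, 1, 1, 3]))  # Sp 98%, Se 67–68% (adequate table)
```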
The diagnostic performance of minimal and optimal diagnostic tests for adequate and ideal decision making.
Diagnostic tests were considered ‘optimal’ when they resulted in a level 1 grey zone around at least 3 out of the 6 thresholds and did not result in a level 3 grey zone around any of the 6 program thresholds. In all other cases, the diagnostic test was considered ‘minimal’. For simplicity, we have classified the width of the grey zone into three levels (1–3) for each threshold and ε separately. The classification into these 3 levels was based on the 25th and 75th percentile of the width of the grey zones (level 1: width of grey zone < 25th percentile; level 2: 75th percentile > width of grey zone ≥ 25th percentile; level 3: width of grey zone ≥ 75th percentile (see )). For adequate decision making, the error of selecting a PC frequency that is higher than needed (ε) is ≤25%, whereas for ideal decision making this ε is ≤10%. For both levels of decision making, the error of prematurely reducing the PC (ε) is ≤5%.
| Type of test | Adequate decision making: Specificity (%) | Adequate decision making: Sensitivity (%) | Ideal decision making: Specificity (%) | Ideal decision making: Sensitivity (%) |
|---|---|---|---|---|
| Minimal | 98 | 62–68 | 100 | 60 |
| | 97 | 63–76 | | |
| | 96 | 84–91 | | |
| | 95 | 85–97 | | |
| | 94 | 86–100 | | |
| Optimal | 100 | ≥ 60 | 100 | ≥ 61 |
| | 99 | ≥ 60 | 99 | ≥ 81 |
| | 98 | ≥ 69 | | |
| | 97 | ≥ 77 | | |
| | 96 | ≥ 92 | | |
| | 95 | ≥ 98 | | |