David Hong, Rounak Dey, Xihong Lin, Brian Cleary, Edgar Dobriban.
Abstract
Large-scale screening is a critical tool in the life sciences, but is often limited by reagents, samples, or cost. An important recent example is the challenge of achieving widespread COVID-19 testing in the face of substantial resource constraints. To tackle this challenge, screening methods must efficiently use testing resources. However, given the global nature of the pandemic, they must also be simple (to aid implementation) and flexible (to be tailored for each setting). Here we propose HYPER, a group testing method based on hypergraph factorization. We provide theoretical characterizations under a general statistical model, and carefully evaluate HYPER with alternatives proposed for COVID-19 under realistic simulations of epidemic spread and viral kinetics. We find that HYPER matches or outperforms the alternatives across a broad range of testing-constrained environments, while also being simpler and more flexible. We provide an online tool to aid lab implementation: http://hyper.covid19-analysis.org
Year: 2022 PMID: 35383149 PMCID: PMC8983763 DOI: 10.1038/s41467-022-29389-z
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Illustration of HYPER.
Stage 1 tests pools that are formed by cycling through a sequence of pool assignments generated via hypergraph factorization. Putative positives are individuals that are not in any negative pools (decoding). Stage 2 tests the putative positives individually. In this example, n = 12 individuals (2 of whom are actual positives) are each split into q = 2 of m = 6 pools; three are decoded as putative positives and both positives are successfully identified in stage 2.
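The two-stage procedure in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it covers only q = 2 splits with an even number of pools m, it uses the standard round-robin ("circle method") factorization of the complete graph on the pools as a stand-in for the hypergraph factorization, and it assumes noise-free tests (a pool is positive exactly when it contains a positive individual). All function names (`one_factorization`, `assign_pools`, `stage1_results`, `decode`) and the choice of positives are hypothetical.

```python
from itertools import cycle, islice


def one_factorization(m):
    """Split all pairs of m pools (m even) into m - 1 perfect matchings
    using the round-robin "circle method"."""
    if m < 2 or m % 2 != 0:
        raise ValueError("this sketch needs an even number of pools")
    factors = []
    for r in range(m - 1):
        factor = [(r, m - 1)]  # pool m - 1 stays fixed, the others rotate
        for k in range(1, m // 2):
            a, b = (r + k) % (m - 1), (r - k) % (m - 1)
            factor.append((min(a, b), max(a, b)))
        factors.append(factor)
    return factors


def assign_pools(n, m):
    """Give each of n individuals a pair of pools by cycling through the factors."""
    pairs = [pair for factor in one_factorization(m) for pair in factor]
    return list(islice(cycle(pairs), n))


def stage1_results(assignment, m, positives):
    """Idealized stage 1: a pool is positive iff it contains a positive individual."""
    results = [False] * m
    for i in positives:
        for pool in assignment[i]:
            results[pool] = True
    return results


def decode(assignment, results):
    """Putative positives are the individuals that are in no negative pool."""
    return [i for i, pools in enumerate(assignment)
            if all(results[p] for p in pools)]


# Same sizes as the caption: n = 12 individuals, m = 6 pools, q = 2 splits.
assignment = assign_pools(n=12, m=6)
results = stage1_results(assignment, m=6, positives={0, 7})
putative = decode(assignment, results)
print("stage 2 would individually test:", putative)
```

Because each matching touches every pool exactly once, taking whole matchings in turn keeps the pool sizes equal (here nq/m = 4 individuals per pool). The number of putative positives depends on which individuals are positive and on the ordering of the factors, so it need not match the three shown in the figure.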
Table 1 Comparison of various features of HYPER with existing methods.
| | HYPER | Plate-based array[9] (8 × 12) | Plate-based array[9] (16 × 24) | P-BEST[8] | Random assignment | Double-pooling |
|---|---|---|---|---|---|---|
| # individuals per batch (n) | Any | 96 | 384 | 384 | Any | Multiple of |
| # pools (m) | Variable | 20 | 40 | 48 | Any | Even |
| # splits (q) | ≤ 3 | 2 | 2 | 6 | Any | 2 |
| # stages | Two | Two | Two | One | Two | Two |
| Max. balanced pools | × | × w.h.p.ᵃ | ||||
| Max. balanced combinations | × w.h.p.ᵃ | × w.h.p.ᵃ | ||||
| Simple to implement by hand | × | |||||
| Flexible/easily adapted | × | × | ||||
| Simple to decode by hand | × | |||||
| Corrects false positive | ||||||
| Corrects false negatives | Optional | × | Optional | × | ||
ᵃ With high probability (w.h.p.), i.e., probability of failure ≫ 0.
In contrast to the existing methods, HYPER is simple to implement, flexible to adapt, and maximally balanced.
Fig. 2 Efficiency and sensitivity of pooled testing during a simulated epidemic.
Average values of efficiency (relative to individual testing) and sensitivity for a variety of pooling designs are shown for each day, with results averaged across 200,000 random trials. For sensitivity, raw averages are shown as dots with degree-8 polynomial fits overlaid as curves; the efficiency curves depict raw averages. During days 40–90 (highlighted), the prevalence grows exponentially from 0.03% to 2.46%. a, b Comparison of HYPER with alternative methods that use n = 96 individuals per batch (a) or n = 384 individuals per batch (b). HYPER designs with q = 2 splits were chosen to have the same maximum pool sizes (nq/m = 12 for H_{96,16,2}; nq/m = 24 for H_{384,32,2}) as the array designs. Dorfman designs (i.e., HYPER designs with q = 1) with matching pool sizes are also included. Sensitivity (bottom panels) depends heavily on pool size, due to dilution of viral loads. c, d HYPER evaluated with varying numbers of pools (c, m = 32, 16, 12) and numbers of splits (d, q = 1, 2, 3). The designs are affected to varying degrees by the increasing prevalence over time. As prevalence increases, efficiency decreases (as more stage 2 tests become necessary), while sensitivity increases (as samples with larger viral loads begin to rescue samples with small viral loads that would otherwise have been missed). More efficient designs tend to be less sensitive, creating a trade-off.
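The arithmetic behind this caption can be checked with a short Python sketch. It rests on assumptions that go beyond the caption: efficiency is taken to mean individuals screened per test used (one reading of "relative to individual testing"), the Dorfman designs with matching pool sizes are taken to be n = 96 with m = 8 and n = 384 with m = 16, and exponential growth is taken to have a constant rate over the 50-day window. The function names are hypothetical.

```python
import math


def max_pool_size(n, m, q):
    """Largest pool when n individuals are each split into q of m pools."""
    return math.ceil(n * q / m)


def efficiency(n, m, stage2_tests):
    """Individuals screened per test used (assumed definition)."""
    return n / (m + stage2_tests)


def doubling_time(p_start, p_end, days):
    """Doubling time implied by constant-rate exponential growth."""
    rate = math.log(p_end / p_start) / days
    return math.log(2) / rate


# (n, m, q) for the two HYPER designs in the caption and two assumed Dorfman matches.
for n, m, q in [(96, 16, 2), (384, 32, 2), (96, 8, 1), (384, 16, 1)]:
    print(f"n={n}, m={m}, q={q}: max pool size {max_pool_size(n, m, q)}")

# Best case for a design with n = 96, m = 16: every pool negative, no stage 2 tests.
print("upper bound on efficiency:", efficiency(96, 16, stage2_tests=0))

# Prevalence rising from 0.03% to 2.46% between day 40 and day 90.
print("implied doubling time (days):", doubling_time(0.03, 2.46, 50))
```

Every additional stage 2 test enlarges the denominator in `efficiency`, which is the mechanism behind the falling efficiency as prevalence rises.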
Table 2 List of HYPER designs H considered.
| n | m | n | m | n | m | RS-KS | n | m | RS-KS | n | m | RS-KS | n | m | RS-KS | n | m | RS-KS | n | m | RS-KS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 1 | 384 | 1 | 12 | 6 | * | 96 | 64 | * | 384 | 42 | | 12 | 6 | | 96 | 48 | * | 400 | 48 | * |
| 8 | 1 | 384 | 2 | 12 | 12 | | 144 | 30 | | 384 | 44 | | 12 | 12 | * | 96 | 60 | | 400 | 60 | |
| 12 | 6 | 384 | 3 | 20 | 12 | | 192 | 6 | * | 384 | 46 | * | 12 | 18 | | 96 | 90 | | 400 | 90 | |
| 20 | 10 | 384 | 4 | 20 | 20 | | 192 | 12 | | 384 | 48 | | 20 | 12 | * | 144 | 30 | | 400 | 192 | * |
| 40 | 10 | 384 | 6 | 24 | 6 | * | 192 | 20 | | 384 | 64 | * | 20 | 18 | | 180 | 90 | | 400 | 282 | |
| 40 | 20 | 384 | 8 | 24 | 12 | | 192 | 24 | | 384 | 96 | | 20 | 30 | | 192 | 6 | | 400 | 384 | * |
| 44 | 11 | 384 | 12 | 40 | 12 | | 192 | 40 | | 384 | 192 | | 24 | 6 | | 192 | 12 | * | 400 | 390 | |
| 44 | 22 | 384 | 16 | 40 | 20 | | 192 | 44 | | 384 | 288 | | 24 | 12 | * | 192 | 18 | | 720 | 90 | |
| 48 | 12 | 384 | 24 | 40 | 40 | | 192 | 48 | | 400 | 12 | | 40 | 12 | * | 192 | 24 | * | 768 | 12 | * |
| 48 | 16 | 384 | 32 | 44 | 12 | | 192 | 64 | * | 400 | 20 | | 40 | 18 | | 192 | 30 | | 768 | 24 | * |
| 48 | 24 | 384 | 48 | 44 | 20 | | 192 | 192 | | 400 | 40 | | 40 | 30 | | 192 | 42 | | 768 | 48 | * |
| 64 | 8 | 384 | 64 | 44 | 40 | | 192 | 96 | | 400 | 44 | | 40 | 42 | | 192 | 48 | * | 1440 | 90 | |
| 64 | 16 | 384 | 96 | 44 | 44 | | 288 | 12 | | 400 | 48 | | 40 | 48 | * | 192 | 60 | | 1536 | 24 | * |
| 64 | 32 | 384 | 128 | 48 | 6 | * | 288 | 20 | | 400 | 64 | * | 40 | 60 | | 192 | 90 | | 1536 | 48 | * |
| 80 | 20 | 384 | 192 | 48 | 12 | | 288 | 30 | | 400 | 96 | | 44 | 12 | * | 192 | 192 | * | 2880 | 90 | |
| 96 | 1 | 400 | 10 | 48 | 20 | | 288 | 40 | | 400 | 192 | | 44 | 18 | | 288 | 12 | * | 3072 | 48 | * |
| 96 | 2 | 400 | 20 | 48 | 24 | | 288 | 44 | | 400 | 288 | | 44 | 30 | | 288 | 18 | | 5760 | 90 | |
| 96 | 3 | 400 | 40 | 48 | 40 | | 288 | 48 | | 400 | 384 | | 44 | 42 | | 288 | 30 | | | | |
| 96 | 4 | 400 | 50 | 48 | 44 | | 288 | 64 | * | 768 | 12 | | 44 | 48 | * | 288 | 42 | | | | |
| 96 | 6 | 400 | 80 | 48 | 48 | | 288 | 96 | | 768 | 24 | | 44 | 60 | | 288 | 48 | * | | | |
| 96 | 8 | 400 | 100 | 64 | 12 | | 288 | 192 | | 768 | 48 | | 48 | 6 | | 288 | 60 | | | | |
| 96 | 12 | 400 | 200 | 64 | 20 | | 384 | 4 | * | 768 | 96 | | 48 | 12 | * | 288 | 90 | | | | |
| 96 | 16 | | | 64 | 40 | | 384 | 6 | * | 1536 | 24 | | 48 | 18 | | 288 | 192 | * | | | |
| 96 | 24 | | | 64 | 44 | | 384 | 8 | * | 1536 | 48 | | 48 | 24 | * | 288 | 282 | | | | |
| 96 | 32 | | | 64 | 48 | | 384 | 10 | * | 1536 | 96 | | 48 | 30 | | 360 | 90 | | | | |
| 96 | 48 | | | 64 | 64 | * | 384 | 12 | | 3072 | 48 | | 48 | 42 | | 384 | 6 | | | | |
| 100 | 20 | | | 96 | 4 | * | 384 | 14 | * | 3072 | 96 | | 48 | 48 | * | 384 | 12 | * | | | |
| 192 | 12 | | | 96 | 6 | * | 384 | 16 | * | 6144 | 96 | | 48 | 60 | | 384 | 18 | | | | |
| 192 | 16 | | | 96 | 8 | * | 384 | 18 | * | | | | 64 | 12 | * | 384 | 24 | * | | | |
| 192 | 32 | | | 96 | 10 | * | 384 | 20 | | | | | 64 | 18 | | 384 | 30 | | | | |
| 192 | 48 | | | 96 | 12 | | 384 | 22 | * | | | | 64 | 30 | | 384 | 42 | | | | |
| 192 | 64 | | | 96 | 14 | * | 384 | 24 | | | | | 64 | 42 | | 384 | 48 | * | | | |
| 192 | 96 | | | 96 | 16 | * | 384 | 26 | * | | | | 64 | 48 | * | 384 | 60 | | | | |
| 288 | 12 | | | 96 | 18 | * | 384 | 28 | | | | | 64 | 60 | | 384 | 90 | | | | |
| 288 | 18 | | | 96 | 20 | | 384 | 30 | | | | | 96 | 6 | | 384 | 192 | * | | | |
| 288 | 36 | | | 96 | 22 | * | 384 | 32 | * | | | | 96 | 12 | * | 384 | 282 | | | | |
| 288 | 48 | | | 96 | 24 | | 384 | 34 | * | | | | 96 | 18 | | 400 | 12 | * | | | |
| 288 | 96 | | | 96 | 40 | | 384 | 36 | | | | | 96 | 24 | * | 400 | 18 | | | | |
| 288 | 144 | | | 96 | 44 | | 384 | 38 | * | | | | 96 | 30 | | 400 | 30 | | | | |
| 360 | 40 | | | 96 | 48 | | 384 | 40 | | | | | 96 | 42 | | 400 | 42 | | | | |
HYPER was optimized across the following set of design parameters. Asterisks (*) denote the restricted set of parameters that are available for Reed-Solomon Kautz-Singleton (RS-KS) code-based designs (see the Supplementary Material).
Fig. 3 Comparison of pooling methods under resource constraints.
HYPER designs (Table 2) were evaluated together with individual testing, plate-based arrays[9], and P-BEST[8], across a range of sample collection and testing budgets. The basis for comparison was the effective screening capacity across days 40–90 of the simulation (Fig. 2), during which the prevalence increases exponentially from 0.03% to 2.46%. Bar plots on the left depict the effective screening capacities (bar height) in a testing-scarce setting (a), followed by increasingly testing-rich settings (b–d) and settings well-suited for the plate-based arrays and P-BEST (e, f). When multiple designs for a given method were available within the constraints (i.e., various choices of HYPER designs, or a choice between the 8 × 12 and 16 × 24 arrays), we use the most effective configuration and indicate it in white text within the appropriate bar. The average number of batches run per day is noted at the bottom of each bar. g Expanded comparison to a grid of sampling and testing budgets. Each cell is colored by the best method (where we separately identify HYPER designs with q = 1, 2, 3 splits in shades of orange/red), and shows the corresponding effective screening capacities (in black text). The best design configuration is written in white text. For HYPER, we write the number of individuals per batch n and the number of pools m for the best configuration; cell color already indicates the number of splits q. Note that n and m often do not match the daily sampling and testing budgets, respectively, since multiple batches can be run per day. The cases from (a–f) are outlined in black. See Supplementary Fig. 10 for additional details.