Peter Bacchetti, Ronald J Bosch, Eileen P Scully, Xutao Deng, Michael P Busch, Steven G Deeks, Sharon R Lewin.
Abstract
Observational and interventional studies for HIV cure research often use single-copy assays to quantify rare entities in blood or tissue samples. Statistical analysis of such measurements presents challenges due to tissue sampling variability and frequent findings of 0 copies in the sample analysed. We examined four approaches to analysing such studies, reflecting different ways of handling observations of 0 copies: (A) replace observations of 0 copies with 1 copy; (B) add 1 to all observed numbers of copies; (C) treat observations of 0 copies as left-censored at 1 copy; and (D) leave the data unaltered and apply a method for count data, negative binomial regression. Because research seeks to estimate general patterns rather than individuals' values, we argue that unaltered use of 0 copies is suitable for research purposes and that altering those observations can introduce bias. When applied to a simulated study comparing preintervention to postintervention measurements within 12 participants, methods A-C showed more attenuation than method D in the estimated intervention effect, less chance of finding P < 0.05 for the intervention effect and a lower chance of including the true intervention effect within the 95% confidence interval. Application of the methods to actual data from a study comparing multiply-spliced HIV RNA among men and women estimated smaller differences by methods A-C than by method D. We recommend that negative binomial regression, which is readily available in many statistical software packages, be considered for analysis of studies of rare entities that are measured by single-copy assays.
Keywords: HIV; latent reservoir; rare entities; statistical bias
Year: 2019 PMID: 31700665 PMCID: PMC6816121
Source DB: PubMed Journal: J Virus Erad ISSN: 2055-6640
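The ways of handling observations of 0 copies described in the abstract are easier to compare on a concrete example. The sketch below uses a small, entirely hypothetical set of paired counts (not data from the study) and computes the fold reduction under methods A and B alongside the ratio of unaltered totals. The ratio of totals is only a simple stand-in for a count-data analysis: it is what a Poisson log-linear model with equal input would estimate, and it is not the negative binomial regression used for method D. All variable and function names are illustrative.

```python
import math

# Hypothetical paired copy counts for 12 participants (illustrative only).
pre = [8, 3, 12, 5, 1, 20, 4, 7, 2, 9, 6, 15]
post = [2, 0, 3, 1, 0, 5, 1, 2, 0, 2, 1, 4]


def paired_fold_reduction(before, after):
    """Geometric mean of before/after ratios, i.e. 10**mean(log10 differences)."""
    diffs = [math.log10(b) - math.log10(a) for b, a in zip(before, after)]
    return 10 ** (sum(diffs) / len(diffs))


# Method A: replace observations of 0 copies with 1 copy.
fold_a = paired_fold_reduction([max(c, 1) for c in pre], [max(c, 1) for c in post])

# Method B: add 1 to every observed count.
fold_b = paired_fold_reduction([c + 1 for c in pre], [c + 1 for c in post])

# Unaltered counts: ratio of total copies (stand-in for a count-data model).
fold_unaltered = sum(pre) / sum(post)

print(f"A: {fold_a:.2f}  B: {fold_b:.2f}  unaltered: {fold_unaltered:.2f}")
```

In this made-up example both altered analyses give a smaller fold reduction than the unaltered totals, which is the direction of attenuation the abstract describes for methods A and B.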
Table 1. Summary of analysis results for 1000 simulated studies of before–after differences in copies per million cells, using four different analysis methods (see Box 2 and text)
| Method | Median estimated fold reduction | Median absolute error (log10 scale) | 95% CI coverage (%)* | Power (%) |
|---|---|---|---|---|
| Input fixed at 1,000,000 cells. Copies counted exactly. | ||||
| A. 0 copies reset to 1, paired | 2.03 | 0.29 | 27.1 | 81.9 |
| B. Add 1 to copies, paired | 2.08 | 0.28 | 22.3 | 86.4 |
| C. 0 copies treated as left-censored | 2.66 | 0.18 | 77.1 | 87.5 |
| D. Negative binomial regression | 3.20 | 0.15 | 88.1 | 89.5 |
| Input fixed at 1,000,000 cells. Copies measured with error. | ||||
| A. 0 copies reset to 1, paired | 2.03 | 0.29 | 29.2 | 77.0 |
| B. Add 1 to copies, paired | 2.11 | 0.28 | 25.6 | 83.4 |
| C. 0 copies treated as left-censored | 2.78 | 0.17 | 80.6 | 83.6 |
| D. Negative binomial regression | 3.30 | 0.15 | 87.4 | 85.8 |
| Input varies from 500,000 to 1,500,000 cells. Copies counted exactly. | ||||
| A. 0 copies reset to 1, paired | 2.01 | 0.30 | 24.5 | 75.7 |
| B. Add 1 to copies, paired | 2.08 | 0.28 | 23.2 | 82.9 |
| C. 0 copies treated as left-censored | 2.71 | 0.17 | 76.6 | 85.7 |
| D. Negative binomial regression | 3.23 | 0.14 | 87.5 | 87.3 |
| Input varies from 500,000 to 1,500,000 cells. Copies measured with error. | ||||
| A. 0 copies reset to 1, paired | 2.00 | 0.30 | 26.0 | 72.3 |
| B. Add 1 to copies, paired | 2.09 | 0.28 | 28.2 | 79.5 |
| C. 0 copies treated as left-censored | 2.77 | 0.16 | 81.6 | 81.7 |
| D. Negative binomial regression | 3.23 | 0.15 | 87.4 | 85.0 |
CI: confidence interval.
Each study includes N = 12 persons, and the true mean reduction in log(copies per million cells) corresponds to a fourfold reduction in copies per million cells.
*For CIs, the ideal value for coverage is 95%.
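One simulated study of the kind summarised above can be sketched as follows. Only the sample size (12), the fourfold mean reduction on the log scale, and the two input settings (fixed at 1,000,000 cells, or varying from 500,000 to 1,500,000) come from the text. The baseline distribution, the between-person spread in response, and the use of plain Poisson sampling for the copies are assumptions chosen for illustration, and the measurement-error scenarios are not modelled. The function names are hypothetical.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson count by Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_study(rng, n=12, fold_reduction=4.0, vary_input=False):
    """Return, per participant, ((cells, copies) before, (cells, copies) after)."""
    rows = []
    for _ in range(n):
        # ASSUMPTION: true baseline copies per million cells is log-normal
        # with median 10; this distribution is not taken from the source.
        baseline = 10.0 * math.exp(rng.gauss(0.0, 1.0))
        # ASSUMPTION: individual responses scatter around the fourfold mean
        # reduction on the log scale with SD 0.5.
        after_level = baseline / fold_reduction * math.exp(rng.gauss(0.0, 0.5))
        visit = []
        for level in (baseline, after_level):
            cells = rng.randint(500_000, 1_500_000) if vary_input else 1_000_000
            copies = poisson_draw(rng, level * cells / 1_000_000)
            visit.append((cells, copies))
        rows.append(tuple(visit))
    return rows


rng = random.Random(2019)  # fixed seed so the sketch is repeatable
study = simulate_study(rng)
zeros = sum(copies == 0 for visit in study for _, copies in visit)
print(f"{len(study)} participants, {zeros} observations of 0 copies")
```

Repeating `simulate_study` 1000 times and applying each analysis method to every replicate would give the kind of summary reported in the table, under these assumed settings.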
Table 2. Summary of analysis results for 1000 simulated studies similar to those summarised in Table 1, but with the intervention having no effect
| Method | Per cent with P < 0.05* |
|---|---|
| Input fixed at 1,000,000 cells. Copies counted exactly. | |
| A. 0 copies reset to 1, paired | 5.2 |
| B. Add 1 to copies, paired | 5.5 |
| C. 0 copies treated as left-censored | 5.9 |
| D. Negative binomial regression | 3.9 |
| Input fixed at 1,000,000 cells. Copies measured with error. | |
| A. 0 copies reset to 1, paired | 5.4 |
| B. Add 1 to copies, paired | 4.9 |
| C. 0 copies treated as left-censored | 5.8 |
| D. Negative binomial regression | 8.0 |
| Input varies from 500,000 to 1,500,000 cells. Copies counted exactly. | |
| A. 0 copies reset to 1, paired | 5.5 |
| B. Add 1 to copies, paired | 5.0 |
| C. 0 copies treated as left-censored | 6.4 |
| D. Negative binomial regression | 4.8 |
| Input varies from 500,000 to 1,500,000 cells. Copies measured with error. | |
| A. 0 copies reset to 1, paired | 5.0 |
| B. Add 1 to copies, paired | 5.9 |
| C. 0 copies treated as left-censored | 6.4 |
| D. Negative binomial regression | 7.3 |
*The ideal value for this is 5%.
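With 1000 simulated studies, each percentage in the table above carries Monte Carlo error of its own. As a rough guide, and assuming a normal approximation to the binomial (an assumption made here, not something stated in the source), the sketch below computes the range expected for the percentage when the true rate is exactly 5%, and lists which table entries fall outside it.

```python
import math
from statistics import NormalDist

n_sims = 1000
ideal = 0.05

# Standard error of a proportion estimated from n_sims independent studies.
se = math.sqrt(ideal * (1 - ideal) / n_sims)
z = NormalDist().inv_cdf(0.975)
lower, upper = 100 * (ideal - z * se), 100 * (ideal + z * se)

# Percentages copied from the table above, by scenario, then methods A to D.
observed = {
    "fixed input, exact": [5.2, 5.5, 5.9, 3.9],
    "fixed input, error": [5.4, 4.9, 5.8, 8.0],
    "varying input, exact": [5.5, 5.0, 6.4, 4.8],
    "varying input, error": [5.0, 5.9, 6.4, 7.3],
}
outside = [
    (scenario, method, pct)
    for scenario, values in observed.items()
    for method, pct in zip("ABCD", values)
    if not lower <= pct <= upper
]
print(f"expected range: {lower:.2f}% to {upper:.2f}%")
for item in outside:
    print(item)
```

Entries outside this range are harder to attribute to simulation noise alone, although the approximation is rough and makes no allowance for inspecting 16 entries at once.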
Table 3. Estimated differences in multiply-spliced HIV RNA per million resting CD4 cells, from a study comparing 26 women with 26 men, all with effectively treated HIV [31]
| Method | Estimated male : female ratio | 95% confidence interval | P value |
|---|---|---|---|
| A. 0 copies reset to 1, unpaired | 2.38 | 1.07–5.27 | 0.034 |
| B. Add 1 to copies, unpaired | 2.11 | 0.98–4.53 | 0.055 |
| C. 0 copies treated as left-censored | 2.73 | 0.87–8.56 | 0.084 |
| D. Negative binomial regression | 6.17 | 1.95–19.6 | 0.002 |
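If each method estimates the ratio on a log scale (as a log transformation or a log link would imply), the reported intervals should be roughly symmetric around the estimate after taking logs, so the geometric mean of the two limits should land close to the point estimate. The short check below applies this to the rounded values in the table above. It is an arithmetic consistency check added here, not part of the original analysis.

```python
import math

# (estimate, lower, upper) for the male : female ratio, copied from the table above.
results = {
    "A": (2.38, 1.07, 5.27),
    "B": (2.11, 0.98, 4.53),
    "C": (2.73, 0.87, 8.56),
    "D": (6.17, 1.95, 19.6),
}

summary = {}
for method, (estimate, lower, upper) in results.items():
    midpoint = math.sqrt(lower * upper)  # centre of the interval on the log scale
    width = upper / lower                # interval width expressed as a fold range
    summary[method] = (midpoint, width)
    print(f"{method}: estimate {estimate:.2f}, log-scale midpoint {midpoint:.2f}, "
          f"upper/lower {width:.1f}")
```

The upper/lower ratio gives a quick sense of precision in fold terms: a larger value means a wider interval on the log scale, independent of where the estimate itself sits.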