Zhenke Wu, Maria Deloria-Knoll, Laura L Hammitt, Scott L Zeger.
Abstract
In population studies on the aetiology of disease, one goal is the estimation of the fraction of cases that are attributable to each of several causes. For example, pneumonia is a clinical diagnosis of lung infection that may be caused by viral, bacterial, fungal or other pathogens. The study of pneumonia aetiology is challenging because directly sampling from the lung to identify the aetiologic pathogen is not standard clinical practice in most settings. Instead, measurements from multiple peripheral specimens are made. The paper introduces the statistical methodology designed for estimating the population aetiology distribution and the individual aetiology probabilities in the Pneumonia Etiology Research for Child Health study of 9500 children for seven sites around the world. We formulate the scientific problem in statistical terms as estimating the mixing weights and latent class indicators under a partially latent class model (PLCM) that combines heterogeneous measurements with different error rates obtained from a case-control study. We introduce the PLCM as an extension of the latent class model. We also introduce graphical displays of the population data and inferred latent class frequencies. The methods are tested with simulated data, and then applied to Pneumonia Etiology Research for Child Health data. The paper closes with a brief description of extensions of the PLCM to the regression setting and to the case where conditional independence between the measures is relaxed.
Keywords: Aetiology; Bayesian method; Case–control; Latent class; Measurement error; Pneumonia
Year: 2015 PMID: 32327815 PMCID: PMC7169268 DOI: 10.1111/rssc.12101
Source DB: PubMed Journal: J R Stat Soc Ser C Appl Stat ISSN: 0035-9254 Impact factor: 1.864
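The following is a minimal sketch of how the partially latent class idea in the abstract could be written down for binary measurements on cases and controls. It is an illustration under stated assumptions, not the authors' implementation: each case is assumed to have exactly one cause, measurements are assumed conditionally independent given the cause, and pathogens other than the cause are assumed to test positive at the same rate as in controls. The names pi (mixing weights), theta (sensitivities) and psi (false-positive rates), and all numerical values, are hypothetical.

```python
from itertools import product


def bernoulli(m, p):
    """Probability of binary outcome m (0 or 1) when P(m = 1) = p."""
    return p if m == 1 else 1.0 - p


def control_likelihood(m, psi):
    """Controls carry no lung infection here, so every positive result
    is treated as a false positive with rate psi[j]."""
    out = 1.0
    for mj, pj in zip(m, psi):
        out *= bernoulli(mj, pj)
    return out


def joint_terms(m, pi, theta, psi):
    """P(cause = j, measurements = m) for each candidate cause j.
    The causal pathogen is detected with sensitivity theta[j];
    every other pathogen behaves as it does in controls."""
    terms = []
    for j, weight in enumerate(pi):
        term = weight
        for l, ml in enumerate(m):
            term *= bernoulli(ml, theta[l] if l == j else psi[l])
        terms.append(term)
    return terms


def case_likelihood(m, pi, theta, psi):
    """Mixture likelihood for a case whose cause is unobserved."""
    return sum(joint_terms(m, pi, theta, psi))


def individual_aetiology(m, pi, theta, psi):
    """Bayes' rule: P(cause = j | measurements m) for one case."""
    terms = joint_terms(m, pi, theta, psi)
    norm = sum(terms)
    return [t / norm for t in terms]


# Hypothetical parameter values for three candidate pathogens.
pi = [0.5, 0.3, 0.2]       # population aetiology fractions (mixing weights)
theta = [0.9, 0.9, 0.9]    # sensitivities
psi = [0.1, 0.2, 0.3]      # false-positive rates, learned from controls

# Individual aetiology probabilities for every measurement pattern.
for pattern in product([0, 1], repeat=3):
    probs = individual_aetiology(pattern, pi, theta, psi)
    print(pattern, [round(p, 3) for p in probs])
```

In a full analysis the parameters would be given priors and estimated jointly from case and control data; here they are fixed only to show how control data inform the false-positive rates that enter the case mixture.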
Figure 1. Directed acyclic graph illustrating relationships between lung infection state, imperfect laboratory measurements on the presence or absence of each of a list of pathogens at each site, disease outcome Y and covariates X
Figure 2. (a), (b) Population and (c), (d) individual aetiology estimates for a single sample with 500 cases and 500 controls and either (a), (c) 1% (N = 5) or (b), (d) 10% GS data on cases. In (c) and (d), the 8 (= 2^3) BS measurement patterns and the predictions for individual children are shown with the measurement patterns attached; the numbers at the vertices show empirical frequencies of GS measurements. The plots mark the true population aetiology distribution (⊕); the 95% credible regions for analyses using BS data only, BS plus GS data and GS data only, respectively; the corresponding posterior means; and the 95% highest posterior density region of the uniform prior distribution
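As a rough illustration of the "GS data only" analysis mentioned in the Figure 2 caption, the sketch below updates a uniform prior on three aetiology fractions with GS counts. It assumes that a GS measurement identifies the cause without error, so the counts are multinomial and the posterior is Dirichlet; that assumption, the function name gs_only_posterior and the example counts are hypothetical and are not taken from the paper.

```python
def gs_only_posterior(gs_counts, prior_alpha=None):
    """Dirichlet posterior for aetiology fractions from error-free counts.

    gs_counts[j] is the number of cases whose GS measurement points to
    cause j. With a Dirichlet(prior_alpha) prior and multinomial counts,
    the posterior is Dirichlet(prior_alpha + gs_counts).
    """
    k = len(gs_counts)
    if prior_alpha is None:
        prior_alpha = [1.0] * k  # uniform prior over the simplex
    post_alpha = [a + n for a, n in zip(prior_alpha, gs_counts)]
    total = sum(post_alpha)
    post_mean = [a / total for a in post_alpha]
    return post_alpha, post_mean


# Hypothetical GS results for N = 5 cases (1% of 500 cases).
counts = [3, 1, 1]
alpha_post, mean_post = gs_only_posterior(counts)
print(alpha_post, mean_post)
```

With so few GS observations the posterior mean stays close to the prior mean, which is consistent with the caption's comparison of GS-only regions at 1% and 10% GS data.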
Figure 3. (a) Observed BS rates (with 95% confidence intervals) for cases and controls, (b) SS data and (c) (see the text for further explanation)
Pathogen names and their abbreviations
| Bacteria | |
| HINF | |
| PNEU | |
| SASP | |
| SAUR | |
| Viruses | |
| ADENO | Adenovirus |
| COR_43 | Coronavirus OC43 |
| FLU_C | Influenza virus type C |
| HMPV_A_B | Human metapneumovirus type A or B |
| PARA1 | Parainfluenza type 1 virus |
| RHINO | Rhinovirus |
| RSV_A_B | Respiratory syncytial virus type A or B |
Figure 4. Summary of posterior distribution of pneumonia aetiology estimates (with 95% credible regions within the bacterial or viral groups): (a) posterior distribution of viral aetiology; (b) posterior aetiology distribution for the top two causes given a bacterial infection; (c) posterior aetiology distribution for the top two causes given a viral infection