Carly E Milliren (1), Clare R Evans (2), Tracy K Richmond (3), Erin C Dunn (4)
1. Center for Applied Pediatric Quality Analytics, Boston Children's Hospital, 300 Longwood Avenue, Boston, MA 02115, USA; Division of Adolescent/Young Adult Medicine, Boston Children's Hospital, 300 Longwood Avenue, Boston, MA 02115, USA. Electronic address: carly.milliren@childrens.harvard.edu.
2. Department of Sociology, University of Oregon, 736 PLC 1291, Eugene, OR 97403, USA. Electronic address: cevans@uoregon.edu.
3. Division of Adolescent/Young Adult Medicine, Boston Children's Hospital, 300 Longwood Avenue, Boston, MA 02115, USA; Department of Pediatrics, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA. Electronic address: tracy.richmond@childrens.harvard.edu.
4. Psychiatric and Neurodevelopmental Genetics Unit, Center for Human Genetic Research, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114, USA; Department of Psychiatry, Harvard Medical School, 401 Park Drive, Boston, MA 02215, USA; Stanley Center for Psychiatric Research, The Broad Institute of Harvard and MIT, 75 Ames Street, Cambridge, MA 02142, USA. Electronic address: edunn2@mgh.harvard.edu.
Abstract
BACKGROUND: Recent advances in multilevel modeling allow for modeling non-hierarchical levels (e.g., youth in non-nested schools and neighborhoods) using cross-classified multilevel models (CCMMs). Current practice is to draw a clustered sample from one context (e.g., schools) and use the observations however they happen to be distributed across the second context (e.g., neighborhoods). However, it is unknown whether an uneven distribution of sample size across these contexts leads to incorrect estimates of random effects in CCMMs.
METHODS: Using the school and neighborhood data structure in Add Health, we examined the effect of neighborhood sample size imbalance on the estimation of variance parameters in models predicting BMI. We differentially assigned students from a given school to neighborhoods within that school's catchment area using three scenarios of (im)balance. We simulated 1000 random datasets for each of five combinations of school- and neighborhood-level variance under each of the three imbalance scenarios, for a total of 15,000 simulated datasets. For each simulation, we calculated 95% CIs for the variance parameters to determine whether the true simulated variance fell within the interval.
RESULTS: Across all simulations, the 95% CIs captured the "true" school and neighborhood variance parameters 93-96% of the time. Only 5% of models failed to capture neighborhood variance; 6% failed to capture school variance.
CONCLUSIONS: These results suggest that there is no systematic bias in the ability of CCMMs to capture the true variance parameters regardless of the distribution of students across neighborhoods. Ongoing efforts to use CCMMs are warranted and can proceed without concern for sample imbalance across contexts.
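The simulation design in METHODS can be illustrated with a minimal sketch. It is not the authors' code. The scenario names, assignment weights, sample sizes, variance values, and the overlapping-catchment layout are all hypothetical choices made for illustration. The sketch only generates cross-classified data and tallies CI coverage. The CCMM fitting step that would produce the 95% CIs is not shown, and `coverage` assumes those intervals come from whatever estimation routine is used.

```python
import random
from collections import Counter

# Design size stated in the abstract: replicates x variance combinations x scenarios.
N_DATASETS = 1000 * 5 * 3

# Hypothetical imbalance scenarios: relative weight of the k-th neighborhood
# within a school's catchment area (illustrative, not the paper's values).
SCENARIO_WEIGHTS = {
    "balanced": lambda k: 1.0,
    "moderate": lambda k: 1.0 / (k + 1),
    "extreme": lambda k: 1.0 / (k + 1) ** 3,
}


def simulate_dataset(scenario, var_school, var_nbhd, var_resid=1.0,
                     n_schools=20, students_per_school=50,
                     catchment_size=5, step=3, mean_bmi=22.0, seed=0):
    """Return (school, neighborhood, bmi) rows with crossed random effects.

    Catchments are overlapping windows of neighborhoods (step < catchment_size),
    so neighborhoods are not nested within schools.
    """
    rng = random.Random(seed)
    weight = SCENARIO_WEIGHTS[scenario]
    n_nbhds = (n_schools - 1) * step + catchment_size
    school_fx = [rng.gauss(0.0, var_school ** 0.5) for _ in range(n_schools)]
    nbhd_fx = [rng.gauss(0.0, var_nbhd ** 0.5) for _ in range(n_nbhds)]
    rows = []
    for s in range(n_schools):
        catchment = range(s * step, s * step + catchment_size)
        weights = [weight(k) for k in range(catchment_size)]
        for n in rng.choices(catchment, weights=weights, k=students_per_school):
            resid = rng.gauss(0.0, var_resid ** 0.5)
            rows.append((s, n, mean_bmi + school_fx[s] + nbhd_fx[n] + resid))
    return rows


def mean_top_share(rows):
    """Average, over schools, of the share of students in the school's largest neighborhood."""
    by_school = {}
    for s, n, _ in rows:
        by_school.setdefault(s, Counter())[n] += 1
    shares = [max(c.values()) / sum(c.values()) for c in by_school.values()]
    return sum(shares) / len(shares)


def coverage(intervals, true_value):
    """Fraction of (lower, upper) intervals that contain the true variance."""
    hits = sum(1 for lo, hi in intervals if lo <= true_value <= hi)
    return hits / len(intervals)
```

In a full replication, each simulated dataset would be fit with a CCMM, and `coverage` would be applied separately to the school-level and neighborhood-level variance intervals for each scenario. Coverage near the nominal 95% would indicate no systematic bias.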