Two-stage method to remove population- and individual-level outliers from longitudinal data in a primary care database.

C Welch, I Petersen, K Walters, R W Morris, I Nazareth, E Kalaitzaki, I R White, L Marston, J Carpenter.

Abstract

PURPOSE: In the UK, primary care databases include repeated measurements of health indicators at the individual level. As these databases encompass a large population, some individuals have extreme values, but some values may also be recorded incorrectly. The challenge for researchers is to distinguish between records that are due to incorrect recording and those which represent true but extreme values. This study evaluated different methods to identify outliers.
METHODS: Ten percent of practices were selected at random to evaluate the recording of 513,367 height measurements. Population-level outliers were identified using boundaries derived from Health Survey for England data. Individual-level outliers were identified by fitting a random-effects model with subject-specific slopes for height measurements, adjusted for age and sex. Any height measurement with a patient-level standardised residual more extreme than ±10 was identified as an outlier and excluded. The model was then refitted twice, with outliers removed at each stage. This method was compared with existing methods of removing outliers.
RESULTS: Most outliers (1550 of 1643) were identified at the population level using the boundaries derived from Health Survey for England data. Once these were removed from the database, fitting the random-effects model to the remaining data identified only 75 further outliers. This method was more efficient at identifying true outliers than existing methods.
CONCLUSIONS: We propose a new, two-stage approach to identifying outliers in longitudinal data and show that it can successfully identify outliers at both the population and the individual level.
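
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Stage 1 uses hypothetical plausibility bounds (the study's boundaries came from Health Survey for England data, which are not reproduced here). Stage 2 replaces the random-effects model with a much simpler stand-in: each patient's median height serves as the expected value, and residuals are standardised by a pooled robust scale (1.4826 × the median absolute residual). Age and sex adjustment is omitted. The function names, the record layout `(patient_id, height_cm)` and the example values are all assumptions made for illustration; only the ±10 threshold and the limit of three passes (one fit plus two refits) follow the abstract.

```python
from collections import defaultdict
from statistics import median

# Hypothetical plausibility bounds in cm (assumption, not the study's values).
POP_BOUNDS_CM = (100.0, 220.0)


def split_population_outliers(records, bounds=POP_BOUNDS_CM):
    """Stage 1: separate records that fall outside fixed population bounds."""
    low, high = bounds
    kept = [r for r in records if low <= r[1] <= high]
    flagged = [r for r in records if not (low <= r[1] <= high)]
    return kept, flagged


def split_individual_outliers(records, threshold=10.0, max_passes=3):
    """Stage 2 (simplified stand-in for the random-effects model).

    Each pass recomputes every patient's median height, standardises the
    residuals by a pooled robust scale, and removes measurements whose
    standardised residual exceeds the threshold in absolute value.
    """
    kept, flagged = list(records), []
    for _ in range(max_passes):
        if not kept:
            break
        by_patient = defaultdict(list)
        for patient_id, height in kept:
            by_patient[patient_id].append(height)
        centre = {pid: median(hs) for pid, hs in by_patient.items()}
        residuals = [height - centre[pid] for pid, height in kept]
        scale = 1.4826 * median(abs(r) for r in residuals)
        if scale == 0:
            break  # no within-patient variation to standardise against
        still_kept, newly_flagged = [], []
        for record, residual in zip(kept, residuals):
            if abs(residual) / scale > threshold:
                newly_flagged.append(record)
            else:
                still_kept.append(record)
        if not newly_flagged:
            break  # nothing removed, so another pass would change nothing
        kept = still_kept
        flagged.extend(newly_flagged)
    return kept, flagged


# Made-up example data: (patient_id, height_cm).
records = [
    ("A", 170.0), ("A", 170.5), ("A", 171.0), ("A", 170.5), ("A", 150.0),
    ("B", 160.0), ("B", 160.5), ("B", 161.0),
    ("C", 180.0), ("C", 181.0), ("C", 180.5), ("C", 17.9),
]

after_stage1, population_outliers = split_population_outliers(records)
cleaned, individual_outliers = split_individual_outliers(after_stage1)
```

In this made-up data, the value 17.9 falls outside the population bounds and is removed in stage 1, while 150.0 is plausible for the population but far from patient A's other measurements, so only stage 2 catches it. Running the population filter first keeps gross recording errors from distorting the patient-level scale estimate.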

Year: 2011 PMID: 22052713 DOI: 10.1002/pds.2270

Source DB: PubMed Journal: Pharmacoepidemiol Drug Saf ISSN: 1053-8569 Impact factor: 2.890

Cited

10 in total

1.

Authors: Eric I Benchimol; Liam Smeeth; Astrid Guttmann; Katie Harron; David Moher; Irene Petersen; Henrik T Sørensen; Jean-Marie Januel; Erik von Elm; Sinéad M Langan
Journal: CMAJ Date: 2019-02-25 Impact factor: 8.262

2. Comparison of cohort characteristics in Central Africa International Epidemiology Databases to Evaluate AIDS and Demographic Health Surveys: Rwanda and Burundi.

Authors: Anna Mageras; Ellen Brazier; Théodore Niyongabo; Gad Murenzi; Jean D'Amour Sinayobye; Adebola A Adedimeji; Christella Twizere; Elizabeth A Kelvin; Kathryn Anastos; Denis Nash; Heidi E Jones
Journal: Int J STD AIDS Date: 2021-02-03 Impact factor: 1.359

3. Detection of Outliers Due to Participants' Non-Adherence to Protocol in a Longitudinal Study of Cognitive Decline.

Authors: Aline Dugravot; Severine Sabia; Martin J Shipley; Catherine Welch; Mika Kivimaki; Archana Singh-Manoux
Journal: PLoS One Date: 2015-07-10 Impact factor: 3.240

4. Longitudinal multiple imputation approaches for body mass index or other variables with very low individual-level variability: the mibmi command in Stata.

Authors: Evangelos Kontopantelis; Rosa Parisi; David A Springate; David Reeves
Journal: BMC Res Notes Date: 2017-01-13

5. [The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement].

Authors: Eric I Benchimol; Liam Smeeth; Astrid Guttmann; Katie Harron; Lars G Hemkens; David Moher; Irene Petersen; Henrik T Sørensen; Erik von Elm; Sinéad M Langan
Journal: Z Evid Fortbild Qual Gesundhwes Date: 2016-09-28

6. The value of aspartate aminotransferase and alanine aminotransferase in cardiovascular disease risk assessment.

Authors: Stephen F Weng; Joe Kai; Indra Neil Guha; Nadeem Qureshi
Journal: Open Heart Date: 2015-08-21

7. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement.

Authors: Eric I Benchimol; Liam Smeeth; Astrid Guttmann; Katie Harron; David Moher; Irene Petersen; Henrik T Sørensen; Erik von Elm; Sinéad M Langan
Journal: PLoS Med Date: 2015-10-06 Impact factor: 11.069

8. Artificial Intelligence-Based Multimodal Risk Assessment Model for Surgical Site Infection (AMRAMS): Development and Validation Study.

Authors: Weijia Chen; Zhijun Lu; Lijue You; Lingling Zhou; Jie Xu; Ken Chen
Journal: JMIR Med Inform Date: 2020-06-15

9. Is it time to stop sweeping data cleaning under the carpet? A novel algorithm for outlier management in growth data.

Authors: Charlotte S C Woolley; Ian G Handel; B Mark Bronsvoort; Jeffrey J Schoenebeck; Dylan N Clements
Journal: PLoS One Date: 2020-01-24 Impact factor: 3.240

10. Methods to estimate baseline creatinine and define acute kidney injury in lean Ugandan children with severe malaria: a prospective cohort study.

Authors: Anthony Batte; Michelle C Starr; Andrew L Schwaderer; Robert O Opoka; Ruth Namazzi; Erika S Phelps Nishiguchi; John M Ssenkusu; Chandy C John; Andrea L Conroy
Journal: BMC Nephrol Date: 2020-09-29 Impact factor: 2.388
