
Outlying observations and missing values: how should they be handled?

John Ludbrook.

Abstract

1. The problems posed by outlying observations and missing values, and the best solutions for them, depend strongly on the sizes of the experimental groups. In original articles published in Clinical and Experimental Pharmacology and Physiology during 2006-2007, group sizes ranged from three to 44 ('small groups'). In surveys, epidemiological studies and clinical trials, group sizes range from hundreds to thousands ('large groups').

2. How can one detect outlying (extreme) observations? The best methods are graphical, for instance: (i) a scatterplot, often with mean ± 2 s (s being the sample standard deviation); and (ii) a box-and-whisker plot. Even with these, it is a matter of judgement whether observations are truly outlying.

3. It is permissible to delete or replace outlying observations if an independent explanation for them can be found. This may be, for instance, failure of a piece of measuring equipment or human error in operating it. If the observation is deleted, it can then be treated as a missing value. Rarely, the appropriate portion of the study can be repeated.

4. It is decidedly not permissible to delete unexplained extreme values. Some of the acceptable strategies for handling them are: (i) transform the data and proceed with conventional statistical analyses; (ii) use the mean for location, but use permutation (randomization) tests for comparing means; and (iii) use robust methods for describing location (e.g. median, geometric mean, trimmed mean), for indicating dispersion (range, percentiles), for comparing locations and for regression analysis.

5. What can be done about missing values? Some strategies are: (i) ignore them; (ii) replace them by hand if the data set is small; and (iii) use computerized imputation techniques to replace them if the data set is large (e.g. regression or EM (conditional expectation, maximum likelihood estimation) methods).

6. If the missing values are ignored, or even if they are replaced, it is essential to test whether the individuals with missing values are otherwise indistinguishable from the remainder of the group. If the missing values have not occurred at random, but are associated with some property of the individuals being studied, the subsequent analysis may be biased.
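
As a rough illustration of points 2 and 4, the following Python sketch flags candidate outliers using the mean ± 2 s rule and box-and-whisker fences, then compares two group means with a permutation (randomization) test. It is not taken from the article: the function names, the 1.5 × IQR fence multiplier, the number of permutations and the example data are all assumptions chosen for illustration, and flagged values remain a matter of judgement, as the abstract stresses.

```python
import random
import statistics


def flag_mean_2s(values):
    """Return values lying outside mean ± 2 s (s = sample standard deviation)."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return [v for v in values if abs(v - m) > 2 * s]


def flag_boxplot(values, k=1.5):
    """Return values beyond the fences Q1 - k*IQR and Q3 + k*IQR.

    k = 1.5 is a common box-and-whisker convention, assumed here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]


def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Group labels are reshuffled n_perm times; the P value is the share of
    reshuffles whose absolute mean difference is at least the observed one.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(a)])
                   - statistics.mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated P value above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical small groups; 9.8 is an unexplained extreme value.
group_a = [4.1, 4.4, 3.9, 4.3, 4.0, 4.2, 4.5, 9.8]
group_b = [5.0, 5.3, 4.9, 5.4, 5.1, 5.2, 5.5, 5.6]

print(flag_mean_2s(group_a))   # [9.8]
print(flag_boxplot(group_a))   # [9.8]
print(statistics.median(group_a))  # robust location, little affected by 9.8
print(permutation_test(group_a, group_b))
```

The extreme value is kept in the analysis rather than deleted, in line with point 4. The same `permutation_test` could, under the same assumptions, be applied to point 6 by comparing a fully observed variable between individuals with and without missing values.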


Year:  2008        PMID: 18215187     DOI: 10.1111/j.1440-1681.2007.04860.x

Source DB:  PubMed          Journal:  Clin Exp Pharmacol Physiol        ISSN: 0305-1870            Impact factor:   2.557


  7 in total

1.  Effects of repeated morphine on ultrasonic vocalizations in adult rats: increased 50-kHz call rate and altered subtype profile.

Authors:  Laura M Best; Leah L Zhao; Tina Scardochio; Paul B S Clarke
Journal:  Psychopharmacology (Berl)       Date:  2016-10-11       Impact factor: 4.530

2.  Predicting outcomes in radiation oncology--multifactorial decision support systems. (Review)

Authors:  Philippe Lambin; Ruud G P M van Stiphout; Maud H W Starmans; Emmanuel Rios-Velazquez; Georgi Nalbantov; Hugo J W L Aerts; Erik Roelofs; Wouter van Elmpt; Paul C Boutros; Pierluigi Granone; Vincenzo Valentini; Adrian C Begg; Dirk De Ruysscher; Andre Dekker
Journal:  Nat Rev Clin Oncol       Date:  2012-11-20       Impact factor: 66.675

3.  Assessment of a 1H high-resolution magic angle spinning NMR spectroscopy procedure for free sugars quantification in intact plant tissue.

Authors:  Teresa Delgado-Goñi; Sonia Campo; Juana Martín-Sitjar; Miquel E Cabañas; Blanca San Segundo; Carles Arús
Journal:  Planta       Date:  2013-07-04       Impact factor: 4.116

4.  Maternal blood manganese levels and infant birth weight.

Authors:  Ami R Zota; Adrienne S Ettinger; Maryse Bouchard; Chitra J Amarasiriwardena; Joel Schwartz; Howard Hu; Robert O Wright
Journal:  Epidemiology       Date:  2009-05       Impact factor: 4.822

5.  An instrument assessing patient satisfaction with day care in hospitals.

Authors:  S M Kleefstra; R B Kool; L C Zandbelt; J C J M de Haes
Journal:  BMC Health Serv Res       Date:  2012-05-24       Impact factor: 2.655

6.  The development of self-regulated learning during the pre-clinical stage of medical school: a comparison between a lecture-based and a problem-based curriculum.

Authors:  Susanna M Lucieer; Jos N van der Geest; Silvana M Elói-Santos; Rosa M Delbone de Faria; Laura Jonker; Chris Visscher; Remy M J P Rikers; Axel P N Themmen
Journal:  Adv Health Sci Educ Theory Pract       Date:  2015-05-29       Impact factor: 3.853

7.  Carotid intima-media thickness studies: study design and data analysis. (Review)

Authors:  Sanne A E Peters; Michiel L Bots
Journal:  J Stroke       Date:  2013-01-31       Impact factor: 6.967

