Missing data estimation in morphometrics: how much is too much?

Julien Clavel, Gildas Merceron, Gilles Escarguel.

Abstract

Fossil-based estimates of diversity and evolutionary dynamics rely mainly on the study of morphological variation. Unfortunately, the remains of organisms are often altered by post-mortem taphonomic processes such as weathering or distortion. This loss of information often prevents quantitative multivariate description and statistically controlled comparisons of extinct species based on morphometric data. A common way to deal with missing data is to use imputation methods, which directly fill the missing cases with model estimates. In recent years, several empirically determined thresholds for the maximum acceptable proportion of missing values have been proposed in the literature, whereas other studies have shown that this limit depends on various properties of the study data set and of the selected imputation method, and is in no way generalizable. We evaluate the relative performance of seven multiple imputation (MI) techniques through a simulation-based analysis under three distinct patterns of missing data distribution. Overall, the Fully Conditional Specification and Expectation-Maximization algorithms provide the best compromises between imputation accuracy and coverage probability. MI techniques appear remarkably robust to the violation of basic assumptions, such as the occurrence of taxonomically or anatomically biased patterns of missing data distribution: differences in simulation results between the three patterns of missing data distribution are much smaller than differences between the individual MI techniques. Based on these results, rather than proposing a new (set of) threshold value(s), we develop an approach that combines MI with Procrustes superimposition of principal component analysis results, in order to directly visualize the effect of individual missing data imputation on an ordinated space. We provide an R function for users to implement the proposed procedure.
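The general idea of combining repeated imputation with Procrustes superimposition of ordinations can be illustrated with a minimal sketch. This is not the authors' R function: it is written in Python with NumPy only, and every name in it (`stochastic_impute`, `pca_scores`, `procrustes_rotate`, `imputation_spread`) is hypothetical. As a simplified stand-in for a real MI technique, missing values are drawn from a multivariate normal model fitted to the complete specimens, which ignores uncertainty in the model parameters. Using the ordination of the averaged imputed data set as the Procrustes reference is also an assumption made here for illustration.

```python
import numpy as np

def stochastic_impute(X, rng):
    """Return one imputed copy of X (rows = specimens, NaN = missing).
    Simplified stand-in for MI: missing cells are drawn from the conditional
    normal distribution implied by the mean/covariance of complete rows.
    Assumes every row has at least one observed value and that there are
    more complete rows than variables."""
    X = np.asarray(X, dtype=float)
    complete = ~np.isnan(X).any(axis=1)
    mu = X[complete].mean(axis=0)
    S = np.cov(X[complete], rowvar=False)
    out = X.copy()
    for i in np.flatnonzero(~complete):
        m = np.isnan(X[i])          # missing variables for this specimen
        o = ~m                      # observed variables
        B = np.linalg.solve(S[np.ix_(o, o)], S[np.ix_(o, m)])
        cond_mean = mu[m] + (X[i, o] - mu[o]) @ B
        cond_cov = S[np.ix_(m, m)] - S[np.ix_(m, o)] @ B
        cond_cov = (cond_cov + cond_cov.T) / 2   # enforce symmetry
        out[i, m] = rng.multivariate_normal(cond_mean, cond_cov)
    return out

def pca_scores(X, n_axes):
    """Principal component scores on the first n_axes axes (via SVD)."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_axes] * s[:n_axes]

def procrustes_rotate(A, reference):
    """Rotate/reflect A (no scaling) to best match reference in the
    least-squares sense (orthogonal Procrustes solution)."""
    U, _, Vt = np.linalg.svd(A.T @ reference)
    return A @ (U @ Vt)

def imputation_spread(X, n_imputations=20, n_axes=2, seed=0):
    """Impute X several times, ordinate each copy, and superimpose every
    ordination onto that of the averaged imputed data set."""
    rng = np.random.default_rng(seed)
    imputed = [stochastic_impute(X, rng) for _ in range(n_imputations)]
    reference = pca_scores(np.mean(imputed, axis=0), n_axes)
    aligned = np.stack([procrustes_rotate(pca_scores(Xi, n_axes), reference)
                        for Xi in imputed])
    return reference, aligned

# Usage on synthetic data (values are made up for illustration only).
demo_rng = np.random.default_rng(42)
data = demo_rng.normal(size=(40, 5)) @ demo_rng.normal(size=(5, 5))
data[:6, 1] = np.nan                 # six specimens lack variable 1
reference, aligned = imputation_spread(data)
# Per-specimen scatter of the aligned scores across imputations.
spread = aligned.std(axis=0).sum(axis=1)
print(spread.round(3))
```

Plotting the rows of `aligned` for each specimen around its position in `reference` gives a cloud per specimen; a wide cloud suggests that the specimen's position in the ordinated space is sensitive to how its missing values were filled in.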

Keywords:  Missing data; Procrustes superimposition; R function; morphometrics; multiple imputation; ordination; simulation

Year: 2013   PMID: 24335428   DOI: 10.1093/sysbio/syt100

Source DB: PubMed   Journal: Syst Biol   ISSN: 1063-5157   Impact factor: 15.683


