Ordered quantile normalization: a semiparametric transformation built for the cross-validation era.

Ryan A Peterson, Joseph E Cavanaugh.

Abstract

Normalization transformations have recently experienced a resurgence in popularity in the era of machine learning, particularly in data preprocessing. However, the classical methods that can be adapted to cross-validation are not always effective. We introduce Ordered Quantile (ORQ) normalization, a one-to-one transformation that is designed to consistently and effectively transform a vector of arbitrary distribution into a vector that follows a normal (Gaussian) distribution. In the absence of ties, ORQ normalization is guaranteed to produce normally distributed transformed data. Once trained, an ORQ transformation can be readily and effectively applied to new data. We compare the effectiveness of the ORQ technique with other popular normalization methods in a simulation study where the true data generating distributions are known. We find that ORQ normalization is the only method that works consistently and effectively, regardless of the underlying distribution. We also explore the use of repeated cross-validation to identify the best normalizing transformation when the true underlying distribution is unknown. We apply our technique and other normalization methods via the bestNormalize R package on a car pricing data set. We built bestNormalize to evaluate the normalization efficacy of many candidate transformations; the package is freely available via the Comprehensive R Archive Network.
© 2019 Informa UK Limited, trading as Taylor & Francis Group.
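
The abstract describes ORQ normalization as a rank-based mapping that is trained on one vector and then applied to new data. The following Python sketch illustrates that idea; it is not the bestNormalize implementation. It assumes the commonly cited form g(x) = Φ⁻¹((rank(x) − 1/2) / n), gives tied values their average rank, and linearly interpolates between observed training values when transforming new points. The function names `fit_orq` and `apply_orq` are hypothetical. Clamping values outside the training range to the nearest training score is a simplifying assumption made here, and the package may handle such values differently.

```python
from bisect import bisect_left
from statistics import NormalDist


def fit_orq(train):
    """Learn an ORQ-style mapping from a training vector (illustrative sketch).

    Returns the sorted unique training values and their normal scores,
    computed as inv_cdf((average_rank - 0.5) / n).
    """
    n = len(train)
    if n == 0:
        raise ValueError("training vector must not be empty")
    xs = sorted(train)
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    uniq, scores = [], []
    i = 0
    while i < n:
        # Find the run of values tied with xs[i].
        j = i
        while j + 1 < n and xs[j + 1] == xs[i]:
            j += 1
        # Average of the 1-based ranks (i + 1) .. (j + 1) for the tied run.
        avg_rank = (i + j + 2) / 2
        uniq.append(xs[i])
        scores.append(std_normal.inv_cdf((avg_rank - 0.5) / n))
        i = j + 1
    return uniq, scores


def apply_orq(model, x):
    """Transform one value with a fitted mapping.

    Values between two training points are linearly interpolated.
    Values outside the training range are clamped (an assumption of
    this sketch, not necessarily what bestNormalize does).
    """
    uniq, scores = model
    if x <= uniq[0]:
        return scores[0]
    if x >= uniq[-1]:
        return scores[-1]
    k = bisect_left(uniq, x)  # first index with uniq[k] >= x
    if uniq[k] == x:
        return scores[k]
    x0, x1 = uniq[k - 1], uniq[k]
    w = (x - x0) / (x1 - x0)
    return scores[k - 1] + w * (scores[k] - scores[k - 1])


# Usage: train on one sample, then transform training and unseen values.
model = fit_orq([3.0, 1.0, 2.0, 10.0])
z_train = [apply_orq(model, v) for v in [3.0, 1.0, 2.0, 10.0]]
z_new = apply_orq(model, 2.5)  # lies between the scores for 2.0 and 3.0
```

Because the mapping depends only on ranks, the extreme value 10.0 receives the same score it would if it were 4.0. This is why a rank-based transformation is insensitive to the shape of the original distribution.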

Keywords:  High-dimensional data analysis; machine learning; normalizing transformation; predictive modeling; preprocessing

Year:  2019        PMID: 35707424      PMCID: PMC9042069          DOI: 10.1080/02664763.2019.1630372

Source DB:  PubMed          Journal:  J Appl Stat        ISSN: 0266-4763            Impact factor:   1.416
