
Modeling household online shopping demand in the U.S.: a machine learning approach and comparative investigation between 2009 and 2017.

Limon Barua, Bo Zou, Yan Zhou, Yulin Liu.

Abstract

Despite the rapid growth of online shopping and research interest in the relationship between online and in-store shopping, national-level modeling of online shopping demand with a prediction focus remains limited in the literature. This paper addresses that gap by using two recent releases of the U.S. National Household Travel Survey (NHTS), for 2009 and 2017, to develop machine learning (ML) models, specifically gradient boosting machines (GBM), that predict household-level online shopping purchases. The NHTS data support investigation that is both nationwide and at the household level, which is more appropriate than the individual level given the connected consumption and shopping needs of household members. We follow a systematic model development procedure that includes the Recursive Feature Elimination algorithm for selecting input variables (features), in order to reduce the risk of overfitting and increase model explainability. Among several ML models, GBM is found to yield the best prediction accuracy. Extensive post-modeling investigation compares 2009 and 2017, quantifying the importance of each input variable in predicting online shopping demand and characterizing value-dependent relationships between demand and the input variables. To this end, two recent advances in ML techniques, Shapley value-based feature importance and Accumulated Local Effects plots, are adopted to overcome inherent drawbacks of the techniques popular in current ML modeling. The models developed and insights gained can be used for online shopping-related freight demand generation and may also be considered for evaluating the potential impact of relevant policies on online shopping demand.
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021.
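
As a rough illustration of the feature selection and modeling steps named in the abstract, the sketch below wraps a gradient boosting model in Recursive Feature Elimination and then refits on the retained features. It is not the authors' pipeline: it assumes scikit-learn, uses synthetic data in place of the NHTS household records, and the feature counts, tree counts, and variable names are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split

# Synthetic stand-in for household features and online purchase counts
# (assumption: the real study uses NHTS 2009/2017 variables instead).
X, y = make_regression(
    n_samples=500, n_features=12, n_informative=4, noise=5.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Recursive Feature Elimination: repeatedly fit the estimator and drop the
# feature with the lowest importance until the requested number remains.
selector = RFE(
    estimator=GradientBoostingRegressor(n_estimators=50, random_state=0),
    n_features_to_select=4,  # hypothetical target, chosen for illustration
    step=1,                  # remove one feature per iteration
)
selector.fit(X_train, y_train)
selected = np.flatnonzero(selector.support_)  # column indices that survived

# Refit a gradient boosting model on the retained features only.
gbm = GradientBoostingRegressor(n_estimators=200, random_state=0)
gbm.fit(X_train[:, selected], y_train)
r2 = gbm.score(X_test[:, selected], y_test)  # R^2 on held-out data

print("selected feature indices:", selected.tolist())
print("held-out R^2:", round(r2, 3))
```

In practice the number of features to keep would be tuned, for example by cross-validation, rather than fixed in advance as it is here.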


Keywords:  Accumulated local effects; Gradient boosting machine; National Household Travel Survey; Online shopping demand; Prediction; Shapley value-based feature importance

Year:  2021        PMID: 34873350      PMCID: PMC8637526          DOI: 10.1007/s11116-021-10250-z

Source DB:  PubMed          Journal:  Transportation (Amst)        ISSN: 0049-4488            Impact factor:   4.814


