A new strategy to prevent over-fitting in partial least squares models based on model population analysis.

Bai-Chuan Deng, Yong-Huan Yun, Yi-Zeng Liang, Dong-Sheng Cao, Qing-Song Xu, Lun-Zhao Yi, Xin Huang.

Abstract

Partial least squares (PLS) is one of the most widely used methods for chemical modeling. However, like many other methods with tunable parameters, it has a strong tendency to over-fit. Thus, a crucial step in PLS model building is selecting the optimal number of latent variables (nLVs). Cross-validation (CV) is the most popular method for PLS model selection because it selects a model from the perspective of prediction ability. However, CV may not yield a clear minimum of the prediction error, which makes model selection difficult. To solve this problem, we propose a new strategy for PLS model selection that combines the cross-validated coefficient of determination (Q²cv) and model stability (S). S is defined as the stability of the PLS regression vectors, which is obtained using model population analysis (MPA). The results show that, when a clear maximum of Q²cv is not obtained, S provides additional information about over-fitting and helps in finding the optimal nLVs. Compared with other regression-vector-based indicators such as the Euclidean 2-norm (B2), the Durbin-Watson statistic (DW) and the jaggedness (J), S is more sensitive to over-fitting. The model selected by our method has both good prediction ability and stability.
Copyright © 2015 Elsevier B.V. All rights reserved.
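
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the formula for S, so the summary used here (norm of the mean regression vector divided by the norm of its standard deviation across Monte Carlo sub-models) is an assumption, as are the function names `pls1_coefs`, `q2_cv` and `stability` and all default parameters. Q²cv is computed in the usual way as 1 - PRESS/SS_total.

```python
import numpy as np

def pls1_coefs(X, y, max_lv):
    """NIPALS PLS1 on mean-centered X, y.
    Row k-1 of the result is the regression vector using k latent variables."""
    Xk = X.copy()
    P, R = [], []                     # loadings and rotated weights (t_k = X @ r_k)
    b = np.zeros(X.shape[1])
    out = np.zeros((max_lv, X.shape[1]))
    for k in range(max_lv):
        w = Xk.T @ y
        w /= np.linalg.norm(w)        # unit-length weight vector
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        q = (y @ t) / tt              # y loading
        # express the scores in terms of the original (undeflated) X
        r = w - sum((pj @ w) * rj for pj, rj in zip(P, R))
        b = b + q * r
        out[k] = b
        Xk = Xk - np.outer(t, p)      # deflate X
        P.append(p)
        R.append(r)
    return out

def q2_cv(X, y, max_lv, n_folds=5, seed=0):
    """Cross-validated coefficient of determination for 1..max_lv LVs."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    press = np.zeros(max_lv)
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        xm, ym = X[train].mean(0), y[train].mean()
        B = pls1_coefs(X[train] - xm, y[train] - ym, max_lv)
        pred = (X[test] - xm) @ B.T + ym          # shape (n_test, max_lv)
        press += ((y[test][:, None] - pred) ** 2).sum(0)
    return 1.0 - press / ((y - y.mean()) ** 2).sum()

def stability(X, y, max_lv, n_models=200, frac=0.8, seed=0):
    """ASSUMED stability summary from a population of sub-sampled PLS models."""
    rng = np.random.default_rng(seed)
    n, m = len(y), int(frac * len(y))
    coefs = np.empty((n_models, max_lv, X.shape[1]))
    for i in range(n_models):
        idx = rng.choice(n, size=m, replace=False)   # Monte Carlo sub-sample
        Xs, ys = X[idx], y[idx]
        coefs[i] = pls1_coefs(Xs - Xs.mean(0), ys - ys.mean(), max_lv)
    mean_b = coefs.mean(0)
    std_b = coefs.std(0, ddof=1)
    # assumption: larger values mean the regression vector varies less
    return np.linalg.norm(mean_b, axis=1) / np.linalg.norm(std_b, axis=1)
```

Under these assumptions, one would plot `q2_cv(X, y, max_lv)` and `stability(X, y, max_lv)` against the number of latent variables. Where Q²cv flattens without a clear maximum, a marked drop in the stability curve would be read as a sign of over-fitting.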

Keywords:  Cross-validation; Model population analysis; Model selection; Model stability; Over-fitting; Partial least squares

Year:  2015        PMID: 26092335     DOI: 10.1016/j.aca.2015.04.045

Source DB:  PubMed          Journal:  Anal Chim Acta        ISSN: 0003-2670            Impact factor:   6.558


  5 in total

1.  Artificial Intelligence in Adult Spinal Deformity.

Authors:  Pramod N Kamalapathy; Aditya V Karhade; Daniel Tobert; Joseph H Schwab
Journal:  Acta Neurochir Suppl       Date:  2022

2.  Toxicity Prediction Method Based on Multi-Channel Convolutional Neural Network.

Authors:  Qing Yuan; Zhiqiang Wei; Xu Guan; Mingjian Jiang; Shuang Wang; Shugang Zhang; Zhen Li
Journal:  Molecules       Date:  2019-09-17       Impact factor: 4.411

3.  Development and Validation of an In-Line API Quantification Method Using AQbD Principles Based on UV-Vis Spectroscopy to Monitor and Optimise Continuous Hot Melt Extrusion Process.

Authors:  Juan Almeida; Mariana Bezerra; Daniel Markl; Andreas Berghaus; Phil Borman; Walkiria Schlindwein
Journal:  Pharmaceutics       Date:  2020-02-12       Impact factor: 6.321

4.  Effect of Tumor Microenvironment and Angiogenesis on Clinical Outcomes of Primary Central Nervous System Lymphoma.

Authors:  Hui-Ching Wang; Hui-Hua Hsiao; Jeng-Shiun Du; Shih-Feng Cho; Tsung-Jang Yeh; Yuh-Ching Gau; Yi-Chang Liu; Sin-Hua Moi
Journal:  Biomed Res Int       Date:  2021-09-30       Impact factor: 3.411

5.  Prediction of mosquito species and population age structure using mid-infrared spectroscopy and supervised machine learning.

Authors:  Mario González Jiménez; Simon A Babayan; Pegah Khazaeli; Margaret Doyle; Finlay Walton; Elliott Reedy; Thomas Glew; Mafalda Viana; Lisa Ranford-Cartwright; Abdoulaye Niang; Doreen J Siria; Fredros O Okumu; Abdoulaye Diabaté; Heather M Ferguson; Francesco Baldini; Klaas Wynne
Journal:  Wellcome Open Res       Date:  2019-09-16