### Abstract

Original language | English |
---|---|
Journal | Journal of Chemometrics |
Volume | 24 |
Issue number | 3-4 |
Pages (from-to) | 168-187 |
Number of pages | 20 |
ISSN | 0886-9383 |
DOIs | https://doi.org/10.1002/cem.1310 |
Publication status | Published - Mar 2010 |
Event | Conferentia Chemometrica 2009 - Siófok, Hungary. Duration: 27 Sep 2009 → 30 Sep 2009 |

### Conference

Conference | Conferentia Chemometrica 2009 |
---|---|
Country | Hungary |
City | Siófok |
Period | 27/09/2009 → 30/09/2009 |

### Cite this

Esbensen, K., & Geladi, P. (2010). Principles of Proper Validation: use and abuse of re-sampling for validation. *Journal of Chemometrics*, *24*(3-4), 168-187. https://doi.org/10.1002/cem.1310

**Principles of Proper Validation: use and abuse of re-sampling for validation.** / Esbensen, Kim; Geladi, Paul.

Research output: Contribution to journal › Conference article in Journal › Research › peer-review

TY - GEN

T1 - Principles of Proper Validation

T2 - use and abuse of re-sampling for validation

AU - Esbensen, Kim

AU - Geladi, Paul

PY - 2010/3

Y1 - 2010/3

N2 - Validation in chemometrics is presented using the exemplar context of multivariate calibration/prediction. A phenomenological analysis of common validation practices in data analysis and chemometrics leads to formulation of a set of generic Principles of Proper Validation (PPV), which is based on a set of characterizing distinctions: (i) Validation cannot be understood by focusing on the methods of validation only; validation must be based on full knowledge of the underlying definitions, objectives, methods, effects and consequences, which are all outlined and discussed here. (ii) Analysis of proper validation objectives implies that there is one valid paradigm only: test set validation. (iii) Contrary to much contemporary chemometric practice (and validation myths), cross-validation is shown to be unjustified in the form of monolithic application of a one-for-all procedure (segmented cross-validation) on all data sets. Within its own design and scope, cross-validation is in reality a sub-optimal simulation of test set validation, crippled by a critical sampling variance omission, as it manifestly is based on one data set only (training data set). Other re-sampling validation methods are shown to suffer from the same deficiencies. The PPV are universal and can be applied to all situations in which the assessment of performance is desired: prediction, classification, time series forecasting and modeling validation. The key element of PPV is the Theory of Sampling (TOS), which allows insight into all variance-generating factors, especially the so-called incorrect sampling errors, which, if not properly eliminated, are responsible for a fatal inconstant sampling bias, for which no statistical correction is possible. In the light of TOS it is shown how a second data set (test set, validation set) is critically necessary for the inclusion of the sampling errors incurred in all 'future' situations in which the validated model must perform.
Logically, therefore, all one data set re-sampling approaches for validation, especially cross-validation and leverage-corrected validation, should be terminated, or at the very least used only with full scientific understanding and disclosure of their detrimental variance omissions and consequences. Regarding PLS-regression, an emphatic call is made for stringent commitment to test set validation based on graphical inspection of pertinent t-u plots for optimal understanding of the X-Y interrelationships and for validation guidance. QSAR/QSAP forms a partial exemption from the present test set imperative with no generalization potential.

U2 - 10.1002/cem.1310

DO - 10.1002/cem.1310

M3 - Conference article in Journal

VL - 24

SP - 168

EP - 187

JO - Journal of Chemometrics

JF - Journal of Chemometrics

SN - 0886-9383

IS - 3-4

ER -