A novel credit model risk measure: Do more data lead to lower model risk?
Published in: | The Quarterly review of economics and finance 2025-03, Vol.100, p.101960, Article 101960 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Large databases and Machine Learning enhance our capacity to develop models with many observations and explanatory variables. While the literature has primarily focused on optimizing classifications, little attention has been given to model risk, especially originating from inadequate use. To address this gap, we introduce a new metric for assessing model risk in credit applications. We test the metric using cross-section LASSO default models, each incorporating 200 thousand loan observations from several banks and more than 100 explanatory variables. The results indicate that models that use loans from a single bank have lower model risk than models using loans from the entire financial system. Therefore, adding loans from different banks to increase the number of observations in a model is suboptimal, challenging the widely accepted assumption that more data leads to better predictions.
• We propose a measure to assess the model risk for default estimation models.
• We apply this metric in the context of Big Data and Machine Learning.
• We use the plugin LASSO regressions on a large loan-level dataset.
• We contribute by adapting the relative model risk measure of Barrieu & Scandolo (2015).
• We compare the model risk of different credit scoring models. |
---|---|
ISSN: | 1062-9769 |
DOI: | 10.1016/j.qref.2025.101960 |