Deriving adequate sample sizes for ANN-based modelling of real estate valuation tasks by complexity analysis

Bibliographic details
Published in: Land use policy 2021-08, Vol.107, p.105475, Article 105475
Main authors: Horvath, Sabine; Soot, Matthias; Zaddach, Sebastian; Neuner, Hans; Weitkamp, Alexandra
Format: Article
Language: English
Online access: Full text
Description
Summary: Property valuation based on linear regression fails in areas with few transactions because the number of purchasing cases is insufficient. One approach to enlarging the available data set is to evaluate these purchasing cases together with a neighbouring submarket. However, this leads to non-linearities. Consequently, non-linear models for cross-submarket real estate valuation are required to obtain reasonable results. In this contribution we focus on non-linear modelling on the basis of artificial neural networks (ANN). A prerequisite for these procedures is an adequate sample size. We present a new approach based on aggregating submarkets in addition to the markets with few transactions, at the expense of increasing the complexity of the model required. The cross-submarket ANN estimation aims, as a first step, to reach accuracies comparable to local property valuation procedures and, subsequently, to enable a reasonable estimation in areas with few transactions. We introduce an extended Kalman filter (EKF) estimation procedure for the ANN parameters and compare it to the standard Levenberg-Marquardt (LM) optimisation procedure as well as to multiple linear regression. To this end, German spatial and functional submarkets are aggregated. For the spatially aggregated data set, the ANN estimation leads to improved results. The ANN estimation of the functionally aggregated data appears deceptively simple because the samples are too small to represent the sampling density. The question arises as to what sample sizes are adequate given the complexity of the unknown relationship. We propose a model complexity analysis procedure based on resampling and structural risk minimisation theory and derive a minimum sample size for the spatially aggregated data. This minimum sample size is reached only for the EKF computations, owing to the lower variance of the ANN estimations. In general, the EKF computation yields better ANN performance than LM. Finally, the spatial cross-submarket ANN estimation reaches the accuracies of local property valuation procedures.
• Improve understanding of the applicability of ANN in real estate valuation (especially concerning sample size and complexity).
• Data aggregation strategies for real estate submarkets to generate a sufficient database for ANN modelling.
• Spatially aggregated valuation data is less complex than functionally aggregated valuation data.
• Complexity analysis of down-sampled valuation data allows the specification o
ISSN: 0264-8377, 1873-5754
DOI: 10.1016/j.landusepol.2021.105475