A Machine Learning Approach on Outlier Removal for Decision Tree Regression Method
Published in: | Ingénierie des systèmes d'Information 2024-08, Vol.29 (4), p.1397-1403 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng ; fre |
Subjects: | |
Online access: | Full text |
Summary: | Outliers can occur in application areas and adversely affect the performance of prediction methods. They can be removed with robust statistical algorithms, but statistical methods have limitations in capturing outliers in high-dimensional data. Machine Learning (ML) approaches are offered as an alternative, as they are developing rapidly thanks to their excellent interpretability and strong generalization capabilities. ML is therefore popular for detecting or eliminating outliers to increase the accuracy of forecasting methods. One example is Isolation Forest (IF), an unsupervised outlier detection strategy that uses a collective approach to calculate an isolation score for every data point. The objective of this research is to improve the prediction accuracy of the Decision Tree Regression (DTR) method by proposing IF as an ML-based outlier removal method. The proposed method was tested on two Air Quality Index (AQI) datasets that contained outliers, with Mean Absolute Error (MAE), R-Square, and Root Mean Square Error (RMSE) as the accuracy measurements. The results showed that the proposed method outperforms previous studies. |
---|---|
ISSN: | 1633-1311 2116-7125 |
DOI: | 10.18280/isi.290414 |
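
The summary names MAE, R-Square, and RMSE as its accuracy measurements. For readers unfamiliar with them, the following is a minimal sketch of their standard textbook definitions in plain Python. The function names and the sample values are hypothetical and chosen only for illustration; they are not taken from the article, and the article's own implementation may differ.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r_squared(y_true, y_pred):
    """R-Square: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only (not AQI data from the article).
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

print("MAE: ", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("R^2: ", r_squared(actual, predicted))
```

Lower MAE and RMSE indicate smaller prediction errors, while an R-Square closer to 1 indicates that the predictions explain more of the variance in the observed values.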