Assessment of importance-based machine learning feature selection methods for aggregate size distribution measurement in a 3D binocular vision system
Published in: | Construction & building materials 2021-11, Vol.306, p.124894, Article 124894 |
---|---|
Main authors: | , , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | • Creating a dataset with 27 2D/3D features for aggregate sieve-size classification.
• Proposing an importance-based feature selection method.
• Developing a new strategy for selecting the best-performing model.
• Comprehensive comparison of the average accuracy scores of models trained by 17 machine learning classifiers on different sub-datasets.
Aggregate size is usually measured by manual sampling and sieving. Machine vision techniques can provide fast, non-invasive measurement. However, the traditional imaging method, which uses a single size descriptor to discriminate between sieve-size classes of coarse aggregates, might not yield high-precision classification results. To determine the optimum supervised machine learning model for coarse aggregate sieve-size measurement, 17 methods were evaluated and compared. To train the models, a new dataset named MFCA27 (Multiple Features of Coarse Aggregate 27) was introduced, which contains 27 features of aggregates based on the three-dimensional (3D) top-surface object of each aggregate. In addition, a feature selection approach was developed to investigate how accuracy varies across datasets built from different feature sets; features were selected according to the impurity-based feature importance score measured using an extremely randomized trees model. Experiments demonstrated that the Gaussian process classifier (GPC) was the best-performing method on the datasets with two- or three-dimensional (2D/3D) feature sets in terms of accuracy and robustness. The results also showed that, in contrast to the traditional aggregate sieve-size measurement method, which is based on a single size descriptor, GPC can achieve an accuracy of 95.06% on the test dataset of MFCA27 in the aggregate sieve-size class measurement task. |
---|---|
ISSN: | 0950-0618 1879-0526 |
DOI: | 10.1016/j.conbuildmat.2021.124894 |
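
The feature selection idea described in the abstract (rank features by the impurity-based importance of an extremely randomized trees model, then train a Gaussian process classifier on the top-ranked subsets) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the MFCA27 dataset is not reproduced in this record, so a synthetic 27-feature dataset stands in for it, and the number of classes, the subset sizes, the hyperparameters, and the helper name `accuracy_with_top_k` are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for MFCA27 (assumption): 27 features and
# 4 hypothetical sieve-size classes.
X, y = make_classification(
    n_samples=300, n_features=27, n_informative=8, n_redundant=4,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Rank features by impurity-based importance from extremely randomized trees.
forest = ExtraTreesClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
importances = forest.feature_importances_
ranking = np.argsort(importances)[::-1]  # most important feature first


def accuracy_with_top_k(k):
    """Train a GPC on the k highest-ranked features; return test accuracy."""
    cols = ranking[:k]
    model = make_pipeline(
        StandardScaler(), GaussianProcessClassifier(random_state=0)
    )
    model.fit(X_train[:, cols], y_train)
    return model.score(X_test[:, cols], y_test)


# Compare accuracy across sub-datasets built from the top-k features.
subset_sizes = [1, 5, 10, 27]  # illustrative choices
scores = {k: accuracy_with_top_k(k) for k in subset_sizes}
for k, acc in scores.items():
    print(f"top {k:2d} features: accuracy = {acc:.3f}")
```

With `k = 1` the classifier sees only a single descriptor, which loosely mirrors the single-size-descriptor baseline mentioned in the abstract; the accuracies printed here come from synthetic data and say nothing about the published 95.06% result.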