Inheritance metrics feats in unsupervised learning to classify unlabeled datasets and clusters in fault prediction

Bibliographic details
Published in:	PeerJ. Computer science 2021-10, Vol.7, p.e722-e722, Article 722
Main authors:	Aziz, Syed Rashid; Khan, Tamim Ahmed; Nadeem, Aamer
Format:	Article
Language:	English
Subjects:	Algorithms; Artificial Intelligence; Classification; Cluster analysis; Clustering; Computer Science; Computer Science, Artificial Intelligence; Computer Science, Information Systems; Computer Science, Theory & Methods; Data mining; Data Mining and Machine Learning; Datasets; Discriminant analysis; Electronic data processing; Fault diagnosis; Fault location (Engineering); Inheritances; Labeling; Literature reviews; Machine learning; Methods; Neural networks; Program errors; Quality assurance; Science & Technology; Software; Software Engineering; Software fault prediction; Software inheritance metrics; Software metrics; Software quality; Supervised learning; Technology; Unsupervised learning; Vector quantization
Online access:	Full text

Description
Abstract:	Fault prediction is necessary for delivering high-quality software. The absence of training data and of a mechanism for labeling a cluster as faulty or fault-free is a concern in software fault prediction (SFP). Inheritance is an important feature of object-oriented development, and its metrics measure the complexity, depth, and breadth of software. In this paper, we aim to validate experimentally how helpful inheritance metrics are for classifying unlabeled datasets, and we propose a novel mechanism for labeling a cluster as faulty or fault-free. We collected ten public datasets that contain inheritance and C&K metrics. Each base dataset is then split into two datasets for evaluation: one with C&K plus inheritance metrics and one with C&K metrics only. K-means clustering is applied, using the Euclidean formula to compute distances, and clusters are then labeled through the average mechanism. Finally, TPR, recall, precision, F1 measure, and ROC are computed to measure performance, which showed an adequate impact of inheritance metrics in SFP, specifically in classifying unlabeled datasets and in the correct classification of instances. The experiment also reveals that the average mechanism is suitable for labeling clusters in SFP. Quality assurance practitioners can benefit from using inheritance-related metrics for labeling datasets and clusters.
ISSN:	2376-5992
DOI:	10.7717/peerj-cs.722
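
The pipeline summarized in the abstract (K-means with Euclidean distance, cluster labeling by an average, then precision, recall, and F1) can be illustrated with a minimal sketch. The record does not define the "average mechanism", so the rule used here is an assumption: a cluster is labeled faulty when the mean of its centroid's metric values exceeds the mean over the whole dataset. The function names, the deterministic seeding, and the toy metric values (columns standing in for WMC, DIT, and NOC) are hypothetical and are not taken from the paper.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two metric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def kmeans(rows, k=2, iterations=100):
    """Plain K-means (k >= 2). Seeding is an assumption made for
    reproducibility: seeds are spread over rows sorted by metric total."""
    ordered = sorted(rows, key=sum)
    centroids = [list(ordered[i * (len(ordered) - 1) // (k - 1)]) for i in range(k)]
    assignment = []
    for _ in range(iterations):
        # Assign every row to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: euclidean(row, centroids[c]))
            for row in rows
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [r for r, a in zip(rows, assignment) if a == c]
            if members:
                centroids[c] = mean_vector(members)
    return assignment, centroids

def label_clusters(rows, centroids):
    """Assumed 'average' rule: a cluster is faulty (True) when its centroid's
    mean metric value is above the dataset-wide mean metric value."""
    overall = sum(mean_vector(rows)) / len(rows[0])
    return [sum(c) / len(c) > overall for c in centroids]

def precision_recall_f1(predicted, actual):
    """Precision, recall (TPR), and F1 for the 'faulty' class."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical toy data: one row per class, columns [WMC, DIT, NOC].
modules = [[3, 1, 0], [4, 1, 1], [5, 2, 0], [20, 5, 4], [22, 6, 3], [25, 5, 5]]
actual = [False, False, False, True, True, True]  # ground truth, used only for scoring

assignment, centroids = kmeans(modules, k=2)
cluster_is_faulty = label_clusters(modules, centroids)
predicted = [cluster_is_faulty[a] for a in assignment]
scores = precision_recall_f1(predicted, actual)
```

Real metric suites mix very different scales, so normalizing each column before clustering and before averaging would likely be needed in practice; the sketch skips that step to stay short.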