Systems and methods for classification of software defect reports

Bibliographic details
Main author: Patil, Sangameshwar Suryakant
Format: Patent
Language: English
Description
Summary: Existing software defect text categorization approaches rely on supervised or semi-supervised machine learning techniques, which may require a significant amount of labeled training data for each class to train the classifier model. Producing that data demands substantial human effort, which makes the process expensive. Embodiments of the present disclosure provide systems and methods that circumvent the dependency on labeled training data and on features derived from source code by performing concept-based classification of software defect reports. In the present disclosure, the semantic similarity between the defect category/type labels and the software defect report(s) is computed and represented in a concept space spanned by a corpus of documents obtained from one or more knowledge bases, and a distribution of similarity values is obtained. These similarity values are compared with a dynamically generated threshold, and based on the comparison, the software defect reports are classified into software defect categories.
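
The pipeline in the summary can be illustrated with a minimal Python sketch. It is not the patented method: the record does not say how the concept space is built or how the threshold is generated, so everything specific here is an assumption. The tiny `CONCEPTS` knowledge base, the TF-IDF weighting, the cosine similarity, and the threshold rule (mean plus half a standard deviation of each label's similarity distribution) are all hypothetical choices made for illustration. The overall shape resembles explicit semantic analysis, although the record does not name that technique.

```python
import math
from collections import Counter
from statistics import mean, pstdev

# Hypothetical miniature knowledge base: concept title -> article text.
CONCEPTS = {
    "Memory leak": "memory leak heap allocation free pointer buffer overflow",
    "User interface": "button screen layout window display click font",
    "Database": "query table index transaction sql database connection",
}

def tokenize(text):
    return text.lower().split()

def build_index(concepts):
    """Map each word to {concept: TF-IDF weight} (assumed weighting scheme)."""
    n = len(concepts)
    tf = {c: Counter(tokenize(t)) for c, t in concepts.items()}
    df = Counter(w for counts in tf.values() for w in counts)
    index = {}
    for c, counts in tf.items():
        for w, k in counts.items():
            index.setdefault(w, {})[c] = k * math.log(1 + n / df[w])
    return index

def concept_vector(text, index):
    """Project a text into the concept space by summing its words' weights."""
    vec = Counter()
    for w in tokenize(text):
        for c, weight in index.get(w, {}).items():
            vec[c] += weight
    return vec

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(reports, labels, index):
    """Assign each report every label whose similarity exceeds that label's
    threshold, computed from the similarity distribution (assumed rule)."""
    report_vecs = [concept_vector(r, index) for r in reports]
    result = [[] for _ in reports]
    for label, description in labels.items():
        label_vec = concept_vector(description, index)
        sims = [cosine(label_vec, rv) for rv in report_vecs]
        threshold = mean(sims) + 0.5 * pstdev(sims)  # assumption, not from the patent
        for i, s in enumerate(sims):
            if s > threshold:
                result[i].append(label)
    return result

# Hypothetical defect reports and category labels; no labeled training data is used.
reports = [
    "app crashes after heap allocation pointer free",
    "button overlaps window on small screen",
    "sql query on table times out",
]
labels = {
    "memory": "memory leak buffer overflow",
    "ui": "screen layout display",
    "data": "database transaction index",
}
predicted = classify(reports, labels, build_index(CONCEPTS))
print(predicted)
```

Because the threshold is recomputed from each label's own similarity distribution, no fixed cut-off has to be tuned by hand. A real system would draw the concept documents from a large knowledge base rather than from three hand-written entries.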