Identifying Non-functional Requirements from Unconstrained Documents using Natural Language Processing and Machine Learning Approaches
Requirements engineering is the first phase in software development life cycle and it also plays one of the most important and critical roles. Requirement document mainly contains both functional requirements and non-functional requirements. Non-functional requirements are significant to describe th...
Gespeichert in:
Veröffentlicht in: | IEEE access 2024, p.1-1 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Requirements engineering is the first phase in software development life cycle and it also plays one of the most important and critical roles. Requirement document mainly contains both functional requirements and non-functional requirements. Non-functional requirements are significant to describe the properties and constraints of the system. Early identification of Non-functional requirement has direct impact on the system architecture and initial design decision. Practically, non-functional requirements are extracted manually from the document. This makes it tedious, time-consuming task and prone to various errors. In this paper, we propose an automatic approach to identify and classify non-functional requirements using semantic and syntactic analysis with machine learning approaches from unconstrained documents. We used A dataset of public requirements documents (PURE) that consists of 79 unconstrained requirements documents in different forms. In our approach, features were extracted from the requirement sentences using four different natural language processing methods including statistical and state-of-the-art semantic analysis presented by Google word2vec and bidirectional encoder representations from transformers models. The adopted approach can efficiently classify non-functional requirements with an accuracy between 84% and 87% using statistical vectorization method and 88% to 92% using word embedding semantic methods. Furthermore, by fusing different models trained on different features, the accuracy improves by 2.4% compared with the best individual classifier. |
---|---|
ISSN: | 2169-3536 2169-3536 |
DOI: | 10.1109/ACCESS.2021.3052921 |