BiDETS: Binary Differential Evolutionary based Text Summarization


Bibliographic details
Published in:	International journal of advanced computer science & applications 2021, Vol.12 (1)
Main authors:	Aljahdali, Hani Moetque; Hamza, Ahmed; Abuobieda, Albaraa
Format:	Article
Language:	English
Subjects:	Clustering; Documents; Evolutionary algorithms; Evolutionary computation; Feature extraction; Genetic algorithms; Human performance; Machine learning; Particle swarm optimization; Weighting
Online access:	Full text

Description
Summary:	In extraction-based automatic text summarization (ATS) applications, feature scoring is the cornerstone of the summarization process, since it is used to select the candidate summary sentences. Treating all features equally leads to poor-quality summaries. Feature Weighting (FW) is an important approach that weights the feature scores according to each feature's importance in the current context. Some ATS researchers have therefore proposed evolutionary-based machine learning methods, such as Particle Swarm Optimization (PSO) and Genetic Algorithm (GA), to extract superior weights for their assigned features. The extracted weights are then used to tune the scored features in order to generate a high-quality summary. In this paper, the Differential Evolution (DE) algorithm is proposed as a feature weighting machine learning method for extraction-based ATS problems. To enable the DE to represent and control the assigned features in a binary search space, it was adapted to a binary-coded format. Features that require only simple mathematical calculation were selected from the literature and employed in this study. The sentences in the documents are first clustered according to a multi-objective clustering concept. The DE approach simultaneously optimizes two objective functions, the compactness and the separation of the sentence clusters, in order to automatically detect the number of sentence clusters contained in a document. Representative sentences from the various clusters are then chosen using certain sentence scoring features to produce the summary. The method was trained and tested on the DUC2002 dataset to learn the weight of each feature. To produce comparative and competitive findings, the proposed DE method was compared with two evolutionary methods, PSO and GA. The DE was also compared against the best and worst benchmark systems in DUC 2002.
The BiDETS model scored 49% in ROUGE-1, close to human performance (52%); 26% in ROUGE-2, above human performance (23%); and 45% in ROUGE-L, close to human performance (48%). These results showed that the proposed method outperformed all other methods in terms of F-measure using the ROUGE evaluation tool.
ISSN:	2158-107X 2156-5570
DOI:	10.14569/IJACSA.2021.0120132
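
The binary-coded DE idea described in the summary can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: the DE/rand/1/bin variant, the sigmoid threshold used to turn real-valued vectors into bits, the bound on vector values, the parameter defaults, and the names `binarize`, `binary_de`, `toy_fitness` and `TARGET`. In the setting the summary describes, the fitness would presumably reflect summary quality (for example a ROUGE score on training documents); here a toy function stands in for it.

```python
import math
import random

BOUND = 4.0  # assumed clamp on vector values, keeps the sigmoid numerically safe


def binarize(vector):
    """Turn a real-valued vector into a 0/1 feature mask (assumed sigmoid threshold)."""
    return [1 if 1.0 / (1.0 + math.exp(-x)) >= 0.5 else 0 for x in vector]


def binary_de(fitness, n_features, pop_size=20, generations=50, f=0.5, cr=0.9, seed=0):
    """DE/rand/1/bin over real vectors, evaluated through their binary masks."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)] for _ in range(pop_size)]
    scores = [fitness(binarize(ind)) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # guarantees at least one mutated gene
            trial = []
            for j in range(n_features):
                if rng.random() < cr or j == j_rand:
                    value = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    trial.append(max(-BOUND, min(BOUND, value)))
                else:
                    trial.append(pop[i][j])
            trial_score = fitness(binarize(trial))
            # Greedy selection: keep the trial only if it is at least as good.
            if trial_score >= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = max(range(pop_size), key=scores.__getitem__)
    return binarize(pop[best]), scores[best]


# Hypothetical stand-in for a summary-quality fitness: count the bits that
# agree with an arbitrary "useful features" mask.
TARGET = [1, 0, 1, 1, 0]


def toy_fitness(bits):
    return sum(int(bit == wanted) for bit, wanted in zip(bits, TARGET))


if __name__ == "__main__":
    mask, score = binary_de(toy_fitness, n_features=len(TARGET))
    print(mask, score)
```

Each bit of the returned mask switches one sentence scoring feature on or off. Because selection is greedy, the best score in the population never decreases from one generation to the next.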