An Innovative Approach of Bangla Text Summarization by Introducing Pronoun Replacement and Improved Sentence Ranking
Published in: | Journal of information processing systems 2017, 13(4), 46, pp.752-777 |
---|---|
Main author: | |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | This paper proposes an automatic method to summarize Bangla news documents. In the proposed approach, pronoun replacement is performed for the first time to minimize dangling pronouns in the summary. After pronouns are replaced, sentences are ranked using term frequency, sentence frequency, numerical figures, and title words. If two sentences have at least 60% cosine similarity, the frequency of the larger sentence is increased and the smaller sentence is removed to eliminate redundancy. Moreover, the first sentence is always included in the summary if it contains any title word. In Bangla text, numerical figures can be presented both in words and in digits, in a variety of forms. All these forms are identified to assess the importance of sentences. We have used a rule-based system in this approach, together with a hidden Markov model and a Markov chain model. To explore the rules, we analyzed 3,000 Bangla news documents and studied some Bangla grammar books. A series of experiments was performed on 200 Bangla news documents and 600 summaries (3 summaries for each document). The evaluation results demonstrate the effectiveness of the proposed technique over the four latest methods. KCI Citation Count: 1 |
---|---|
ISSN: | 2092-805X, 1976-913X |
DOI: | 10.3745/JIPS.04.0038 |
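
The redundancy step mentioned in the abstract (pairs of sentences with at least 60% cosine similarity, where the larger sentence is kept and the smaller one removed) can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenization, the use of raw term counts as vectors, the per-sentence `frequency` counter starting at 1 and growing by 1 per merged duplicate, and the function names `cosine_similarity` and `remove_redundant` are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity of two token lists, using raw term counts as vectors."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def remove_redundant(sentences, threshold=0.6):
    """Return (sentence, frequency) pairs after dropping near-duplicates.

    Assumption: each sentence starts with frequency 1; when two sentences
    reach the similarity threshold, the longer one (by token count) is kept
    and its frequency is increased by 1, and the shorter one is removed.
    """
    tokens = [s.split() for s in sentences]  # assumption: whitespace tokens
    frequency = [1] * len(sentences)
    removed = set()
    for i in range(len(sentences)):
        if i in removed:
            continue
        for j in range(i + 1, len(sentences)):
            if j in removed:
                continue
            if cosine_similarity(tokens[i], tokens[j]) >= threshold:
                if len(tokens[i]) >= len(tokens[j]):
                    keep, drop = i, j
                else:
                    keep, drop = j, i
                frequency[keep] += 1
                removed.add(drop)
                if drop == i:
                    break  # sentence i is gone; stop comparing it
    return [(sentences[k], frequency[k])
            for k in range(len(sentences)) if k not in removed]


if __name__ == "__main__":
    docs = [
        "the minister visited dhaka today",
        "the minister visited dhaka",
        "rain is expected tomorrow",
    ]
    for sentence, freq in remove_redundant(docs):
        print(freq, sentence)
```

In this toy input the first two sentences share four of their tokens, so their similarity is well above 0.6 and only the longer one survives, with its counter raised. How the paper itself defines and uses the increased frequency in ranking is not stated in the abstract.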