Extracting abstract and keywords from context for academic articles
Published in: | Social network analysis and mining 2018-12, Vol.8 (1), p.45, Article 45 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Every year, thousands of academic studies are published worldwide. When researchers search for a topic, they scan abstracts and keywords first. In many academic disciplines, authors include keywords and an abstract in their publications. In some disciplines, however, such as the social sciences, publications often contain neither, and older publications in any discipline may also lack an abstract or keywords. Search engines for academic publications usually work by checking keywords, abstracts and titles, so a missing abstract or keyword list makes it difficult to return accurate search results and prevents researchers from reviewing a publication quickly. This study proposes a method for generating keywords and an abstract from the text of an academic publication. Previous studies have generally used k-NN and fuzzy CCG methods to solve this problem, but they did not examine the structure of words or apply semantic analysis. In this study, each publication is divided into sections such as the introduction, the methodology and the references. Each section is weighted differently, so that a word's score depends on the section in which it appears. NLP methods are then used to analyze texts and phrases and to remove prepositions and conjunctions. The processed data is used to generate keywords with TF–IDF and to generate the abstract text with the TextRank method. Thus, more successful, accurate and contextually relevant keywords and abstracts are produced. The proposed method was tested on the Sobiad Academic Database, which is used by 72 universities in Turkey and covers more than 250,000 academic publications. Experimental results were measured with precision and F-measure and were found to be promising compared with previous studies on keyword derivation and abstract generation. |
---|---|
ISSN: | 1869-5450 1869-5469 |
DOI: | 10.1007/s13278-018-0524-z |
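
The pipeline outlined in the abstract (section weighting, removal of prepositions and conjunctions, TF–IDF keywords, TextRank summary) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The section weights, the stop-word list, the smoothed IDF formula and all function names are assumptions made for illustration, since the abstract gives none of these details. The sentence similarity is the word-overlap measure from the original TextRank formulation.

```python
import math
import re
from collections import Counter

# Hypothetical section weights; the abstract does not state the real values.
SECTION_WEIGHTS = {"introduction": 1.5, "methodology": 1.2, "references": 0.2}

# Tiny illustrative stop list (prepositions, conjunctions, articles).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for", "to",
              "and", "or", "but", "with", "is", "are"}


def tokenize(text):
    """Lowercase the text, keep alphabetic words, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOP_WORDS]


def keywords(doc, corpus, top_n=5):
    """Rank the terms of `doc` by section-weighted TF times smoothed IDF.

    `doc` maps section names to text; `corpus` is a list of such
    documents and should include `doc` itself.
    """
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for other in corpus:
        vocab = set()
        for text in other.values():
            vocab.update(tokenize(text))
        df.update(vocab)

    # Term frequency, where each occurrence counts with its section weight.
    tf = Counter()
    for section, text in doc.items():
        weight = SECTION_WEIGHTS.get(section, 1.0)
        for token in tokenize(text):
            tf[token] += weight

    n_docs = len(corpus)
    # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1.
    scores = {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1)
              for t, c in tf.items()}
    # Sort by descending score, then alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_n]


def textrank_summary(text, n_sentences=2, damping=0.85, iterations=50):
    """Pick the highest-ranked sentences using a TextRank-style iteration."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text)
                 if s.strip()]
    tokens = [set(tokenize(s)) for s in sentences]
    n = len(sentences)

    # Similarity: shared words normalised by the log of sentence lengths.
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and len(tokens[i]) > 1 and len(tokens[j]) > 1:
                overlap = len(tokens[i] & tokens[j])
                norm = math.log(len(tokens[i])) + math.log(len(tokens[j]))
                sim[i][j] = overlap / norm

    out_weight = [sum(row) for row in sim]
    scores = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            rank = sum(sim[j][i] / out_weight[j] * scores[j]
                       for j in range(n) if out_weight[j] > 0)
            updated.append((1 - damping) + damping * rank)
        scores = updated

    top = sorted(range(n), key=lambda i: (-scores[i], i))[:n_sentences]
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(top)]
```

In this sketch a term that occurs only in the reference list contributes far less than one in the introduction, which mirrors the idea that each section is graded differently. A production system would replace the stop list with proper part-of-speech filtering and tune the weights on labelled data.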