Neural Model for Content Extraction in Multilingual Web Documents

Bibliographic details
Published in: International Journal of Computer Applications, 2013-01, Vol. 65 (4)
Main authors: Prakash, Kolla Bhanu; Dorai Rangaswamy, M A; Raman, Arun Raja
Format: Article
Language: English
Online access: Full text
Description
Abstract: Neural models for multilingual web documents are gaining prominence in day-to-day life on the Indian subcontinent. While translation and transliteration are gaining importance on web pages, it remains difficult for an ordinary user to understand what a web page is about, especially when the user does not know the regional language. Our effort here is a generic tool that applies neural networks to overcome this problem. The model takes inputs in both English and Telugu, an Indian regional language, in both printed and handwritten formats. Words with common content are chosen, and a neural network is used to normalize the output. A sample page from a physics textbook dealing with magnetism is considered in this paper.
ISSN: 0975-8887
DOI: 10.5120/10909-5837