Impact of Deep Learning on Localizing and Recognizing Handwritten Text in Lecture Videos

Bibliographic details
Published in: International journal of advanced computer science & applications, 2021, Vol. 12 (4)
Main authors: Medida, Lakshmi Haritha; Ramani, Kasarapu
Format: Article
Language: English
Online access: Full text
Description
Summary: Video recording technology has become increasingly powerful and easy to use, so many universities now record their lectures and publish them online to make them accessible to students. These lecture videos contain handwritten text written on paper, on a blackboard, or on a tablet with a stylus. Recording lectures in this way, however, rapidly generates large volumes of multimedia data, which makes handwritten text recognition on lecture video portals an important and demanding task. This paper develops a novel handwritten text detection and recognition approach for a lecture video dataset in four major phases: (a) text localization, (b) segmentation, (c) pre-processing, and (d) recognition. In the first phase, arbitrarily oriented text in the video frames is localized using the Modified Region Growing (MRG) algorithm. The detected text regions are then segmented into words via K-means clustering. The segmented words are subsequently pre-processed to reduce blurring artifacts. Finally, the pre-processed words are recognized using a Deep Convolutional Neural Network (DCNN). The proposed model is evaluated in terms of accuracy, precision, sensitivity, and specificity to demonstrate its text detection and recognition performance on lecture video. Experimental results show that at a learning percentage of 70, the presented work achieves its highest accuracy of 89.3% for a frame count of 500.
ISSN: 2158-107X, 2156-5570
DOI: 10.14569/IJACSA.2021.0120442
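
The summary names four performance measures: accuracy, precision, sensitivity, and specificity. As a minimal sketch, the standard definitions of these measures can be computed from confusion counts as shown below. The abstract does not state how the paper scores a recognized word or how it averages over classes, so the binary labels here (1 = text correctly present, 0 = not) and the names `ConfusionCounts`, `count_outcomes`, and `evaluate` are illustrative assumptions, not the authors' evaluation code.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts of true/false positives and negatives (hypothetical helper)."""
    tp: int
    fp: int
    tn: int
    fn: int


def count_outcomes(y_true, y_pred):
    """Tally confusion counts from paired binary labels (assumed 0/1)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif not truth and pred:
            fp += 1
        elif not truth and not pred:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(c):
    """Standard definitions of the four measures named in the summary."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": _ratio(c.tp, c.tp + c.fp),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # also called recall
        "specificity": _ratio(c.tn, c.tn + c.fp),
    }


if __name__ == "__main__":
    # Made-up labels for illustration only; not data from the paper.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    preds = [1, 1, 0, 0, 0, 1, 0, 1]
    for name, value in evaluate(count_outcomes(truth, preds)).items():
        print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are reported separately because accuracy alone can look high when one class dominates the frames; the zero-denominator guard is a design choice of this sketch, not something the source specifies.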