Classification of heterogeneous Malayalam documents based on structural features using deep learning models

The proposed work gives a comparative study on performance of various pretrained deep learning models for classifying Malayalam documents such as agreement documents, notebook images, and palm leaves. The documents are classified based on their visual and structural features. The dataset was manuall...

Bibliographic details
Published in: International journal of electrical and computer engineering (Malacca, Malacca), 2023-02, Vol.13 (1), p.894
Main authors: Balakrishnan Jayakumari, Bipin Nair ; Thomas Kavana, Amel
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 1
container_start_page 894
container_title International journal of electrical and computer engineering (Malacca, Malacca)
container_volume 13
creator Balakrishnan Jayakumari, Bipin Nair
Thomas Kavana, Amel
description The proposed work presents a comparative study of the performance of several pretrained deep learning models for classifying Malayalam documents such as agreement documents, notebook images, and palm leaves. The documents are classified based on their visual and structural features. The dataset was manually collected from different sources. The method proceeds through preprocessing, feature extraction, and classification. The work evaluates three fine-tuned deep learning models: visual geometry group-16 (VGG-16), a convolutional neural network (CNN), and AlexNet. The models attained accuracies of 99.7%, 96%, and 95%, respectively. Among the three, the fine-tuned VGG-16 model performed best on the dataset. As future work, methods to classify the documents based on content as well as spectral features can be developed.
doi_str_mv 10.11591/ijece.v13i1.pp894-901
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2766677012</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2766677012</sourcerecordid><originalsourceid>FETCH-LOGICAL-c145t-f6f40d1aab82438db4cf977caa55c47acad91b99e86abda04e14efb6ea2c7cae3</originalsourceid><addsrcrecordid>eNotkE1LxDAQhosouKz7FyTguWvSpkl6lMUvWPGi5zBNJmvXtqlJK-y_N1YPw7wzPMzAk2XXjG4Zq2p22x7R4PablS3bjqOqeV5TdpatClkUeVFJdZ4yVSpXkqrLbBNj21DOJadSVKvsc9dBWrnWwNT6gXhHPnDC4A84oJ8jeYEOTql6Yr2ZexymSBqIaEmi4xRmM80BOuIQUsBI5tgOB2IRR9IhhOF36r3FLl5lFw66iJv_vs7eH-7fdk_5_vXxeXe3zw3j1ZQ74Ti1DKBRBS-VbbhxtZQGoKoMl2DA1qypa1QCGguUI-PoGoFQmERhuc5u_u6OwX_NGCd99HMY0ktdSCGElJQViRJ_lAk-xoBOj6HtIZw0o3pxqxe3enGrF7c6uS1_ACnUdA8</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2766677012</pqid></control><display><type>article</type><title>Classification of heterogeneous Malayalam documents based on structural features using deep learning models</title><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Balakrishnan Jayakumari, Bipin Nair ; Thomas Kavana, Amel</creator><creatorcontrib>Balakrishnan Jayakumari, Bipin Nair ; Thomas Kavana, Amel</creatorcontrib><description>The proposed work gives a comparative study on performance of various pretrained deep learning models for classifying Malayalam documents such as agreement documents, notebook images, and palm leaves. The documents are classified based on their visual and structural features. The dataset was manually collected from different sources. The method of research proceeds with preprocessing, feature extraction, and classification. The proposed work deals with three fine-tuned deep learning models such as visual geometry group-16 (VGG-16), convolutional neural network (CNN) and AlexNet. The models attained high accuracies of 99.7%, 96%, and 95%, respectively. 
Among the three models, the fine-tuned VGG-16 model was found to perform better attaining a very high accuracy on the dataset. As a future work, methods to classify the documents based on content as well as spectral features can be developed.</description><identifier>ISSN: 2088-8708</identifier><identifier>EISSN: 2722-2578</identifier><identifier>EISSN: 2088-8708</identifier><identifier>DOI: 10.11591/ijece.v13i1.pp894-901</identifier><language>eng</language><publisher>Yogyakarta: IAES Institute of Advanced Engineering and Science</publisher><subject>Accuracy ; Agreements ; Artificial neural networks ; Classification ; Comparative studies ; Computer science ; Coronaviruses ; COVID-19 ; Datasets ; Deep learning ; Digitization ; Documents ; Feature extraction ; Machine learning ; Methods ; Neural networks ; Tobacco</subject><ispartof>International journal of electrical and computer engineering (Malacca, Malacca), 2023-02, Vol.13 (1), p.894</ispartof><rights>Copyright IAES Institute of Advanced Engineering and Science 2023</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed><orcidid>0000-0003-4592-4947 ; 0000-0003-0220-0527</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Balakrishnan Jayakumari, Bipin Nair</creatorcontrib><creatorcontrib>Thomas Kavana, Amel</creatorcontrib><title>Classification of heterogeneous Malayalam documents based on structural features using deep learning models</title><title>International journal of electrical and computer engineering (Malacca, Malacca)</title><description>The proposed work gives a comparative study on performance of various pretrained deep learning models for classifying Malayalam documents such as agreement documents, notebook images, and palm leaves. 
The documents are classified based on their visual and structural features. The dataset was manually collected from different sources. The method of research proceeds with preprocessing, feature extraction, and classification. The proposed work deals with three fine-tuned deep learning models such as visual geometry group-16 (VGG-16), convolutional neural network (CNN) and AlexNet. The models attained high accuracies of 99.7%, 96%, and 95%, respectively. Among the three models, the fine-tuned VGG-16 model was found to perform better attaining a very high accuracy on the dataset. As a future work, methods to classify the documents based on content as well as spectral features can be developed.</description><subject>Accuracy</subject><subject>Agreements</subject><subject>Artificial neural networks</subject><subject>Classification</subject><subject>Comparative studies</subject><subject>Computer science</subject><subject>Coronaviruses</subject><subject>COVID-19</subject><subject>Datasets</subject><subject>Deep learning</subject><subject>Digitization</subject><subject>Documents</subject><subject>Feature extraction</subject><subject>Machine learning</subject><subject>Methods</subject><subject>Neural networks</subject><subject>Tobacco</subject><issn>2088-8708</issn><issn>2722-2578</issn><issn>2088-8708</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNotkE1LxDAQhosouKz7FyTguWvSpkl6lMUvWPGi5zBNJmvXtqlJK-y_N1YPw7wzPMzAk2XXjG4Zq2p22x7R4PablS3bjqOqeV5TdpatClkUeVFJdZ4yVSpXkqrLbBNj21DOJadSVKvsc9dBWrnWwNT6gXhHPnDC4A84oJ8jeYEOTql6Yr2ZexymSBqIaEmi4xRmM80BOuIQUsBI5tgOB2IRR9IhhOF36r3FLl5lFw66iJv_vs7eH-7fdk_5_vXxeXe3zw3j1ZQ74Ti1DKBRBS-VbbhxtZQGoKoMl2DA1qypa1QCGguUI-PoGoFQmERhuc5u_u6OwX_NGCd99HMY0ktdSCGElJQViRJ_lAk-xoBOj6HtIZw0o3pxqxe3enGrF7c6uS1_ACnUdA8</recordid><startdate>20230201</startdate><enddate>20230201</enddate><creator>Balakrishnan Jayakumari, Bipin 
Nair</creator><creator>Thomas Kavana, Amel</creator><general>IAES Institute of Advanced Engineering and Science</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BVBZV</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>L6V</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><orcidid>https://orcid.org/0000-0003-4592-4947</orcidid><orcidid>https://orcid.org/0000-0003-0220-0527</orcidid></search><sort><creationdate>20230201</creationdate><title>Classification of heterogeneous Malayalam documents based on structural features using deep learning models</title><author>Balakrishnan Jayakumari, Bipin Nair ; Thomas Kavana, Amel</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c145t-f6f40d1aab82438db4cf977caa55c47acad91b99e86abda04e14efb6ea2c7cae3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Accuracy</topic><topic>Agreements</topic><topic>Artificial neural networks</topic><topic>Classification</topic><topic>Comparative studies</topic><topic>Computer science</topic><topic>Coronaviruses</topic><topic>COVID-19</topic><topic>Datasets</topic><topic>Deep learning</topic><topic>Digitization</topic><topic>Documents</topic><topic>Feature extraction</topic><topic>Machine learning</topic><topic>Methods</topic><topic>Neural networks</topic><topic>Tobacco</topic><toplevel>online_resources</toplevel><creatorcontrib>Balakrishnan Jayakumari, Bipin Nair</creatorcontrib><creatorcontrib>Thomas Kavana, Amel</creatorcontrib><collection>CrossRef</collection><collection>ProQuest SciTech 
Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>East &amp; South Asia Database</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><jtitle>International journal of electrical and computer engineering (Malacca, Malacca)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Balakrishnan Jayakumari, Bipin Nair</au><au>Thomas Kavana, Amel</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Classification of heterogeneous Malayalam documents based on structural features using deep learning models</atitle><jtitle>International journal of electrical and computer engineering (Malacca, 
Malacca)</jtitle><date>2023-02-01</date><risdate>2023</risdate><volume>13</volume><issue>1</issue><spage>894</spage><pages>894-</pages><issn>2088-8708</issn><eissn>2722-2578</eissn><eissn>2088-8708</eissn><abstract>The proposed work gives a comparative study on performance of various pretrained deep learning models for classifying Malayalam documents such as agreement documents, notebook images, and palm leaves. The documents are classified based on their visual and structural features. The dataset was manually collected from different sources. The method of research proceeds with preprocessing, feature extraction, and classification. The proposed work deals with three fine-tuned deep learning models such as visual geometry group-16 (VGG-16), convolutional neural network (CNN) and AlexNet. The models attained high accuracies of 99.7%, 96%, and 95%, respectively. Among the three models, the fine-tuned VGG-16 model was found to perform better attaining a very high accuracy on the dataset. As a future work, methods to classify the documents based on content as well as spectral features can be developed.</abstract><cop>Yogyakarta</cop><pub>IAES Institute of Advanced Engineering and Science</pub><doi>10.11591/ijece.v13i1.pp894-901</doi><orcidid>https://orcid.org/0000-0003-4592-4947</orcidid><orcidid>https://orcid.org/0000-0003-0220-0527</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 2088-8708
ispartof International journal of electrical and computer engineering (Malacca, Malacca), 2023-02, Vol.13 (1), p.894
issn 2088-8708
2722-2578
language eng
recordid cdi_proquest_journals_2766677012
source Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects Accuracy
Agreements
Artificial neural networks
Classification
Comparative studies
Computer science
Coronaviruses
COVID-19
Datasets
Deep learning
Digitization
Documents
Feature extraction
Machine learning
Methods
Neural networks
Tobacco
title Classification of heterogeneous Malayalam documents based on structural features using deep learning models
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T03%3A39%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Classification%20of%20heterogeneous%20Malayalam%20documents%20based%20on%20structural%20features%20using%20deep%20learning%20models&rft.jtitle=International%20journal%20of%20electrical%20and%20computer%20engineering%20(Malacca,%20Malacca)&rft.au=Balakrishnan%20Jayakumari,%20Bipin%20Nair&rft.date=2023-02-01&rft.volume=13&rft.issue=1&rft.spage=894&rft.pages=894-&rft.issn=2088-8708&rft.eissn=2722-2578&rft_id=info:doi/10.11591/ijece.v13i1.pp894-901&rft_dat=%3Cproquest_cross%3E2766677012%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2766677012&rft_id=info:pmid/&rfr_iscdi=true