Datasets classification using deep learning and machine learning classification algorithms

New datasets were built in response to the urgent need for datasets specialized in educational lectures, and such datasets require an accurate classification algorithm. The benefit of classifying them automatically is to minimize the wor...

Bibliographic details
Main authors: Abdulameer, Maysaa H., Abdullah, Mahmood Z.
Format: Conference proceeding
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 1
container_start_page
container_title
container_volume 2591
creator Abdulameer, Maysaa H.
Abdullah, Mahmood Z.
description New datasets were built in response to the urgent need for datasets specialized in educational lectures, and such datasets require an accurate classification algorithm. The benefit of classifying them automatically is to minimize the workload of classifying each file manually and individually. In this paper, the authors conduct an empirical deep learning study, specifically with a convolutional neural network, on three new datasets of educational lectures (PDF, Word, and PowerPoint). The three datasets use real educational lecture resources collected from document projects at different universities and institutions. The architecture is applied to text classification in the document domain, with the document datasets obtained from a variety of projects on actual document cases. The first aim of the study is to test performance on each dataset (PDF, Word, and PowerPoint) using four machine learning classification algorithms: Bayes Net, Random Forest, Random Committee, and OneR. The second aim is to evaluate the efficiency of the deep learning approach on the classification tasks and then compare it with that of the traditional machine learning classification methods. Two classification techniques are thus used. The deep learning algorithm classifies files with an accuracy between 95% and 96% on the three new datasets. Among the standard machine learning algorithms (OneR, Random Forest, Bayes Net, and Random Committee), accuracy is 91% on the PDF dataset using Random Forest and Random Committee, 46% on the Word dataset using Random Committee, and 77% on the PowerPoint dataset using Random Forest. The authors therefore choose the deep learning algorithm because it gives higher accuracy than the machine learning algorithms.
doi_str_mv 10.1063/5.0120454
format Conference Proceeding
fullrecord <record><control><sourceid>proquest_scita</sourceid><recordid>TN_cdi_proquest_journals_2792137551</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2792137551</sourcerecordid><originalsourceid>FETCH-LOGICAL-p2034-ad19f3123cdf110ffd2d5e49dd46d736beb33b3b801ac04011c9fd9fbf555c7d3</originalsourceid><addsrcrecordid>eNp9kE1Lw0AQhhdRsFYP_oOANyF1Zj-y3aPUTyh4URAvy2Y_2i1pErOp4L83tYWCB0_DDM87MzyEXCJMEAp2IyaAFLjgR2SEQmAuCyyOyQhA8Zxy9n5KzlJaAVAl5XREPu5Mb5LvU2Yrk1IM0Zo-NnW2SbFeZM77Nqu86eptZ2qXrY1dxtofhn9yplo0XeyX63ROToKpkr_Y1zF5e7h_nT3l85fH59ntPG8pMJ4bhyowpMy6gAghOOqE58o5XjjJitKXjJWsnAIaCxwQrQpOhTIIIax0bEyudnvbrvnc-NTrVbPp6uGkplJRZHLwMFDXOyrZ2P--qtsurk33rb-aTgu996ZbF_6DEfRW9CHAfgCj8nE9</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype><pqid>2792137551</pqid></control><display><type>conference_proceeding</type><title>Datasets classification using deep learning and machine learning classification algorithms</title><source>AIP Journals Complete</source><creator>Abdulameer, Maysaa H. ; Abdullah, Mahmood Z.</creator><contributor>Agarwal, Parul ; Obaid, Ahmed J. ; Albermany, Salah A. ; Banerjee, Jyoti Sekhar</contributor><creatorcontrib>Abdulameer, Maysaa H. ; Abdullah, Mahmood Z. ; Agarwal, Parul ; Obaid, Ahmed J. ; Albermany, Salah A. ; Banerjee, Jyoti Sekhar</creatorcontrib><description>The process of building new dataset and the existence of such a data followed the urgent need for the existence of datasets that are specialized in educational lectures, so this will need an accurate classification algorithm to classify it, the benefit of classify such dataset is to minimize the workload of classifying each file manually and individually. 
In the present paper, authors perform experimentations for conducting an empirical deep learning study, especially, convolutional neural network, for three new datasets of educational lectures which are (PDF, Word and PowerPoint datasets), The three new datasets using real data educational resources lectures collected from various document projects of different universities and institutions. The architecture has been applied to the task of the text classification in the domain of the document with documents data-sets have been obtained from a variety of projects on actual document cases. The aim of the present study is to initially test the performance of each dataset (PDF, Word, and PowerPoint dataset) through using four machine learning classification algorithms which are (Bayes Net, Random Forest, Random Committee, and OneR). Second goal is experimenting the efficiency of the approach of the deep learning in the tasks of classification and after that, comparing the efficiencies with the efficiencies of traditional machine learning classification methods. 
Mainly two classification techniques used to maximize the benefits of the classification process, the first one is to use the deep learning algorithm which shows an accuracy of classifying file between (95 and 96%) for three new dataset files and standard machine learning algorithms (OneR, Random forest, Bayes net, and Random Committee ) these algorithm shows accuracy 91% for PDF Dataset using random forest and random committee algorithms, for Word dataset the accuracy is 46% using random committee, and for the last dataset PowerPoint the accuracy is 77% using random forest, Therefore, we will choose Deep learning algorithm because it gives higher results and accuracy than machine learning algorithms.</description><identifier>ISSN: 0094-243X</identifier><identifier>EISSN: 1551-7616</identifier><identifier>DOI: 10.1063/5.0120454</identifier><identifier>CODEN: APCPCS</identifier><language>eng</language><publisher>Melville: American Institute of Physics</publisher><subject>Accuracy ; Algorithms ; Artificial neural networks ; Classification ; Colleges &amp; universities ; Datasets ; Deep learning ; Education ; Machine learning ; Portable document format ; Public speaking</subject><ispartof>AIP conference proceedings, 2023, Vol.2591 (1)</ispartof><rights>Author(s)</rights><rights>2023 Author(s). 
Published by AIP Publishing.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://pubs.aip.org/acp/article-lookup/doi/10.1063/5.0120454$$EHTML$$P50$$Gscitation$$H</linktohtml><link.rule.ids>309,310,314,780,784,789,790,794,4512,23930,23931,25140,27924,27925,76384</link.rule.ids></links><search><contributor>Agarwal, Parul</contributor><contributor>Obaid, Ahmed J.</contributor><contributor>Albermany, Salah A.</contributor><contributor>Banerjee, Jyoti Sekhar</contributor><creatorcontrib>Abdulameer, Maysaa H.</creatorcontrib><creatorcontrib>Abdullah, Mahmood Z.</creatorcontrib><title>Datasets classification using deep learning and machine learning classification algorithms</title><title>AIP conference proceedings</title><description>The process of building new dataset and the existence of such a data followed the urgent need for the existence of datasets that are specialized in educational lectures, so this will need an accurate classification algorithm to classify it, the benefit of classify such dataset is to minimize the workload of classifying each file manually and individually. In the present paper, authors perform experimentations for conducting an empirical deep learning study, especially, convolutional neural network, for three new datasets of educational lectures which are (PDF, Word and PowerPoint datasets), The three new datasets using real data educational resources lectures collected from various document projects of different universities and institutions. The architecture has been applied to the task of the text classification in the domain of the document with documents data-sets have been obtained from a variety of projects on actual document cases. 
The aim of the present study is to initially test the performance of each dataset (PDF, Word, and PowerPoint dataset) through using four machine learning classification algorithms which are (Bayes Net, Random Forest, Random Committee, and OneR). Second goal is experimenting the efficiency of the approach of the deep learning in the tasks of classification and after that, comparing the efficiencies with the efficiencies of traditional machine learning classification methods. Mainly two classification techniques used to maximize the benefits of the classification process, the first one is to use the deep learning algorithm which shows an accuracy of classifying file between (95 and 96%) for three new dataset files and standard machine learning algorithms (OneR, Random forest, Bayes net, and Random Committee ) these algorithm shows accuracy 91% for PDF Dataset using random forest and random committee algorithms, for Word dataset the accuracy is 46% using random committee, and for the last dataset PowerPoint the accuracy is 77% using random forest, Therefore, we will choose Deep learning algorithm because it gives higher results and accuracy than machine learning algorithms.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Artificial neural networks</subject><subject>Classification</subject><subject>Colleges &amp; universities</subject><subject>Datasets</subject><subject>Deep learning</subject><subject>Education</subject><subject>Machine learning</subject><subject>Portable document format</subject><subject>Public 
speaking</subject><issn>0094-243X</issn><issn>1551-7616</issn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2023</creationdate><recordtype>conference_proceeding</recordtype><recordid>eNp9kE1Lw0AQhhdRsFYP_oOANyF1Zj-y3aPUTyh4URAvy2Y_2i1pErOp4L83tYWCB0_DDM87MzyEXCJMEAp2IyaAFLjgR2SEQmAuCyyOyQhA8Zxy9n5KzlJaAVAl5XREPu5Mb5LvU2Yrk1IM0Zo-NnW2SbFeZM77Nqu86eptZ2qXrY1dxtofhn9yplo0XeyX63ROToKpkr_Y1zF5e7h_nT3l85fH59ntPG8pMJ4bhyowpMy6gAghOOqE58o5XjjJitKXjJWsnAIaCxwQrQpOhTIIIax0bEyudnvbrvnc-NTrVbPp6uGkplJRZHLwMFDXOyrZ2P--qtsurk33rb-aTgu996ZbF_6DEfRW9CHAfgCj8nE9</recordid><startdate>20230329</startdate><enddate>20230329</enddate><creator>Abdulameer, Maysaa H.</creator><creator>Abdullah, Mahmood Z.</creator><general>American Institute of Physics</general><scope>8FD</scope><scope>H8D</scope><scope>L7M</scope></search><sort><creationdate>20230329</creationdate><title>Datasets classification using deep learning and machine learning classification algorithms</title><author>Abdulameer, Maysaa H. 
; Abdullah, Mahmood Z.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p2034-ad19f3123cdf110ffd2d5e49dd46d736beb33b3b801ac04011c9fd9fbf555c7d3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Artificial neural networks</topic><topic>Classification</topic><topic>Colleges &amp; universities</topic><topic>Datasets</topic><topic>Deep learning</topic><topic>Education</topic><topic>Machine learning</topic><topic>Portable document format</topic><topic>Public speaking</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Abdulameer, Maysaa H.</creatorcontrib><creatorcontrib>Abdullah, Mahmood Z.</creatorcontrib><collection>Technology Research Database</collection><collection>Aerospace Database</collection><collection>Advanced Technologies Database with Aerospace</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Abdulameer, Maysaa H.</au><au>Abdullah, Mahmood Z.</au><au>Agarwal, Parul</au><au>Obaid, Ahmed J.</au><au>Albermany, Salah A.</au><au>Banerjee, Jyoti Sekhar</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Datasets classification using deep learning and machine learning classification algorithms</atitle><btitle>AIP conference proceedings</btitle><date>2023-03-29</date><risdate>2023</risdate><volume>2591</volume><issue>1</issue><issn>0094-243X</issn><eissn>1551-7616</eissn><coden>APCPCS</coden><abstract>The process of building new dataset and the existence of such a data followed the urgent need for the existence of datasets that are specialized in educational lectures, so this will need an accurate classification algorithm to classify it, the benefit of classify such dataset is to minimize the workload of classifying each file 
manually and individually. In the present paper, authors perform experimentations for conducting an empirical deep learning study, especially, convolutional neural network, for three new datasets of educational lectures which are (PDF, Word and PowerPoint datasets), The three new datasets using real data educational resources lectures collected from various document projects of different universities and institutions. The architecture has been applied to the task of the text classification in the domain of the document with documents data-sets have been obtained from a variety of projects on actual document cases. The aim of the present study is to initially test the performance of each dataset (PDF, Word, and PowerPoint dataset) through using four machine learning classification algorithms which are (Bayes Net, Random Forest, Random Committee, and OneR). Second goal is experimenting the efficiency of the approach of the deep learning in the tasks of classification and after that, comparing the efficiencies with the efficiencies of traditional machine learning classification methods. Mainly two classification techniques used to maximize the benefits of the classification process, the first one is to use the deep learning algorithm which shows an accuracy of classifying file between (95 and 96%) for three new dataset files and standard machine learning algorithms (OneR, Random forest, Bayes net, and Random Committee ) these algorithm shows accuracy 91% for PDF Dataset using random forest and random committee algorithms, for Word dataset the accuracy is 46% using random committee, and for the last dataset PowerPoint the accuracy is 77% using random forest, Therefore, we will choose Deep learning algorithm because it gives higher results and accuracy than machine learning algorithms.</abstract><cop>Melville</cop><pub>American Institute of Physics</pub><doi>10.1063/5.0120454</doi><tpages>11</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0094-243X
ispartof AIP conference proceedings, 2023, Vol.2591 (1)
issn 0094-243X
1551-7616
language eng
recordid cdi_proquest_journals_2792137551
source AIP Journals Complete
subjects Accuracy
Algorithms
Artificial neural networks
Classification
Colleges & universities
Datasets
Deep learning
Education
Machine learning
Portable document format
Public speaking
title Datasets classification using deep learning and machine learning classification algorithms
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T13%3A21%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Datasets%20classification%20using%20deep%20learning%20and%20machine%20learning%20classification%20algorithms&rft.btitle=AIP%20conference%20proceedings&rft.au=Abdulameer,%20Maysaa%20H.&rft.date=2023-03-29&rft.volume=2591&rft.issue=1&rft.issn=0094-243X&rft.eissn=1551-7616&rft.coden=APCPCS&rft_id=info:doi/10.1063/5.0120454&rft_dat=%3Cproquest_scita%3E2792137551%3C/proquest_scita%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2792137551&rft_id=info:pmid/&rfr_iscdi=true