Datasets classification using deep learning and machine learning classification algorithms
Main authors: | Abdulameer, Maysaa H. ; Abdullah, Mahmood Z. |
---|---|
Format: | Conference proceeding |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | 1 |
container_start_page | |
container_title | |
container_volume | 2591 |
creator | Abdulameer, Maysaa H. ; Abdullah, Mahmood Z. |
description | Building new datasets specialized in educational lectures answers an urgent need for such data, and it calls for an accurate classification algorithm; classifying such datasets automatically reduces the workload of classifying each file manually and individually. In the present paper, the authors conduct an empirical deep learning study, specifically with a convolutional neural network, on three new datasets of educational lectures (PDF, Word, and PowerPoint datasets). The three datasets use real educational lecture resources collected from various document projects of different universities and institutions. The architecture is applied to the task of text classification in the document domain, with document datasets obtained from a variety of projects on actual document cases. The first aim of the study is to test the performance on each dataset (PDF, Word, and PowerPoint) using four machine learning classification algorithms (Bayes Net, Random Forest, Random Committee, and OneR). The second aim is to evaluate the efficiency of the deep learning approach on classification tasks and then compare it with that of traditional machine learning classification methods.
Two classification techniques are used to maximize the benefit of the classification process. The first is the deep learning algorithm, which classifies files with an accuracy between 95% and 96% on the three new datasets. The second is a set of standard machine learning algorithms (OneR, Random Forest, Bayes Net, and Random Committee), which reach an accuracy of 91% on the PDF dataset using Random Forest and Random Committee, 46% on the Word dataset using Random Committee, and 77% on the PowerPoint dataset using Random Forest. The authors therefore choose the deep learning algorithm, because it gives higher accuracy than the machine learning algorithms. |
doi_str_mv | 10.1063/5.0120454 |
format | Conference Proceeding |
fullrecord | <record><control><sourceid>proquest_scita</sourceid><recordid>TN_cdi_proquest_journals_2792137551</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2792137551</sourcerecordid><originalsourceid>FETCH-LOGICAL-p2034-ad19f3123cdf110ffd2d5e49dd46d736beb33b3b801ac04011c9fd9fbf555c7d3</originalsourceid><addsrcrecordid>eNp9kE1Lw0AQhhdRsFYP_oOANyF1Zj-y3aPUTyh4URAvy2Y_2i1pErOp4L83tYWCB0_DDM87MzyEXCJMEAp2IyaAFLjgR2SEQmAuCyyOyQhA8Zxy9n5KzlJaAVAl5XREPu5Mb5LvU2Yrk1IM0Zo-NnW2SbFeZM77Nqu86eptZ2qXrY1dxtofhn9yplo0XeyX63ROToKpkr_Y1zF5e7h_nT3l85fH59ntPG8pMJ4bhyowpMy6gAghOOqE58o5XjjJitKXjJWsnAIaCxwQrQpOhTIIIax0bEyudnvbrvnc-NTrVbPp6uGkplJRZHLwMFDXOyrZ2P--qtsurk33rb-aTgu996ZbF_6DEfRW9CHAfgCj8nE9</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype><pqid>2792137551</pqid></control><display><type>conference_proceeding</type><title>Datasets classification using deep learning and machine learning classification algorithms</title><source>AIP Journals Complete</source><creator>Abdulameer, Maysaa H. ; Abdullah, Mahmood Z.</creator><contributor>Agarwal, Parul ; Obaid, Ahmed J. ; Albermany, Salah A. ; Banerjee, Jyoti Sekhar</contributor><creatorcontrib>Abdulameer, Maysaa H. ; Abdullah, Mahmood Z. ; Agarwal, Parul ; Obaid, Ahmed J. ; Albermany, Salah A. ; Banerjee, Jyoti Sekhar</creatorcontrib><description>The process of building new dataset and the existence of such a data followed the urgent need for the existence of datasets that are specialized in educational lectures, so this will need an accurate classification algorithm to classify it, the benefit of classify such dataset is to minimize the workload of classifying each file manually and individually. 
In the present paper, authors perform experimentations for conducting an empirical deep learning study, especially, convolutional neural network, for three new datasets of educational lectures which are (PDF, Word and PowerPoint datasets), The three new datasets using real data educational resources lectures collected from various document projects of different universities and institutions. The architecture has been applied to the task of the text classification in the domain of the document with documents data-sets have been obtained from a variety of projects on actual document cases. The aim of the present study is to initially test the performance of each dataset (PDF, Word, and PowerPoint dataset) through using four machine learning classification algorithms which are (Bayes Net, Random Forest, Random Committee, and OneR). Second goal is experimenting the efficiency of the approach of the deep learning in the tasks of classification and after that, comparing the efficiencies with the efficiencies of traditional machine learning classification methods. 
Mainly two classification techniques used to maximize the benefits of the classification process, the first one is to use the deep learning algorithm which shows an accuracy of classifying file between (95 and 96%) for three new dataset files and standard machine learning algorithms (OneR, Random forest, Bayes net, and Random Committee ) these algorithm shows accuracy 91% for PDF Dataset using random forest and random committee algorithms, for Word dataset the accuracy is 46% using random committee, and for the last dataset PowerPoint the accuracy is 77% using random forest, Therefore, we will choose Deep learning algorithm because it gives higher results and accuracy than machine learning algorithms.</description><identifier>ISSN: 0094-243X</identifier><identifier>EISSN: 1551-7616</identifier><identifier>DOI: 10.1063/5.0120454</identifier><identifier>CODEN: APCPCS</identifier><language>eng</language><publisher>Melville: American Institute of Physics</publisher><subject>Accuracy ; Algorithms ; Artificial neural networks ; Classification ; Colleges & universities ; Datasets ; Deep learning ; Education ; Machine learning ; Portable document format ; Public speaking</subject><ispartof>AIP conference proceedings, 2023, Vol.2591 (1)</ispartof><rights>Author(s)</rights><rights>2023 Author(s). 
Published by AIP Publishing.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://pubs.aip.org/acp/article-lookup/doi/10.1063/5.0120454$$EHTML$$P50$$Gscitation$$H</linktohtml><link.rule.ids>309,310,314,780,784,789,790,794,4512,23930,23931,25140,27924,27925,76384</link.rule.ids></links><search><contributor>Agarwal, Parul</contributor><contributor>Obaid, Ahmed J.</contributor><contributor>Albermany, Salah A.</contributor><contributor>Banerjee, Jyoti Sekhar</contributor><creatorcontrib>Abdulameer, Maysaa H.</creatorcontrib><creatorcontrib>Abdullah, Mahmood Z.</creatorcontrib><title>Datasets classification using deep learning and machine learning classification algorithms</title><title>AIP conference proceedings</title><description>The process of building new dataset and the existence of such a data followed the urgent need for the existence of datasets that are specialized in educational lectures, so this will need an accurate classification algorithm to classify it, the benefit of classify such dataset is to minimize the workload of classifying each file manually and individually. In the present paper, authors perform experimentations for conducting an empirical deep learning study, especially, convolutional neural network, for three new datasets of educational lectures which are (PDF, Word and PowerPoint datasets), The three new datasets using real data educational resources lectures collected from various document projects of different universities and institutions. The architecture has been applied to the task of the text classification in the domain of the document with documents data-sets have been obtained from a variety of projects on actual document cases. 
The aim of the present study is to initially test the performance of each dataset (PDF, Word, and PowerPoint dataset) through using four machine learning classification algorithms which are (Bayes Net, Random Forest, Random Committee, and OneR). Second goal is experimenting the efficiency of the approach of the deep learning in the tasks of classification and after that, comparing the efficiencies with the efficiencies of traditional machine learning classification methods. Mainly two classification techniques used to maximize the benefits of the classification process, the first one is to use the deep learning algorithm which shows an accuracy of classifying file between (95 and 96%) for three new dataset files and standard machine learning algorithms (OneR, Random forest, Bayes net, and Random Committee ) these algorithm shows accuracy 91% for PDF Dataset using random forest and random committee algorithms, for Word dataset the accuracy is 46% using random committee, and for the last dataset PowerPoint the accuracy is 77% using random forest, Therefore, we will choose Deep learning algorithm because it gives higher results and accuracy than machine learning algorithms.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Artificial neural networks</subject><subject>Classification</subject><subject>Colleges & universities</subject><subject>Datasets</subject><subject>Deep learning</subject><subject>Education</subject><subject>Machine learning</subject><subject>Portable document format</subject><subject>Public 
speaking</subject><issn>0094-243X</issn><issn>1551-7616</issn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2023</creationdate><recordtype>conference_proceeding</recordtype><recordid>eNp9kE1Lw0AQhhdRsFYP_oOANyF1Zj-y3aPUTyh4URAvy2Y_2i1pErOp4L83tYWCB0_DDM87MzyEXCJMEAp2IyaAFLjgR2SEQmAuCyyOyQhA8Zxy9n5KzlJaAVAl5XREPu5Mb5LvU2Yrk1IM0Zo-NnW2SbFeZM77Nqu86eptZ2qXrY1dxtofhn9yplo0XeyX63ROToKpkr_Y1zF5e7h_nT3l85fH59ntPG8pMJ4bhyowpMy6gAghOOqE58o5XjjJitKXjJWsnAIaCxwQrQpOhTIIIax0bEyudnvbrvnc-NTrVbPp6uGkplJRZHLwMFDXOyrZ2P--qtsurk33rb-aTgu996ZbF_6DEfRW9CHAfgCj8nE9</recordid><startdate>20230329</startdate><enddate>20230329</enddate><creator>Abdulameer, Maysaa H.</creator><creator>Abdullah, Mahmood Z.</creator><general>American Institute of Physics</general><scope>8FD</scope><scope>H8D</scope><scope>L7M</scope></search><sort><creationdate>20230329</creationdate><title>Datasets classification using deep learning and machine learning classification algorithms</title><author>Abdulameer, Maysaa H. 
; Abdullah, Mahmood Z.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p2034-ad19f3123cdf110ffd2d5e49dd46d736beb33b3b801ac04011c9fd9fbf555c7d3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Artificial neural networks</topic><topic>Classification</topic><topic>Colleges & universities</topic><topic>Datasets</topic><topic>Deep learning</topic><topic>Education</topic><topic>Machine learning</topic><topic>Portable document format</topic><topic>Public speaking</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Abdulameer, Maysaa H.</creatorcontrib><creatorcontrib>Abdullah, Mahmood Z.</creatorcontrib><collection>Technology Research Database</collection><collection>Aerospace Database</collection><collection>Advanced Technologies Database with Aerospace</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Abdulameer, Maysaa H.</au><au>Abdullah, Mahmood Z.</au><au>Agarwal, Parul</au><au>Obaid, Ahmed J.</au><au>Albermany, Salah A.</au><au>Banerjee, Jyoti Sekhar</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Datasets classification using deep learning and machine learning classification algorithms</atitle><btitle>AIP conference proceedings</btitle><date>2023-03-29</date><risdate>2023</risdate><volume>2591</volume><issue>1</issue><issn>0094-243X</issn><eissn>1551-7616</eissn><coden>APCPCS</coden><abstract>The process of building new dataset and the existence of such a data followed the urgent need for the existence of datasets that are specialized in educational lectures, so this will need an accurate classification algorithm to classify it, the benefit of classify such dataset is to minimize the workload of classifying each file 
manually and individually. In the present paper, authors perform experimentations for conducting an empirical deep learning study, especially, convolutional neural network, for three new datasets of educational lectures which are (PDF, Word and PowerPoint datasets), The three new datasets using real data educational resources lectures collected from various document projects of different universities and institutions. The architecture has been applied to the task of the text classification in the domain of the document with documents data-sets have been obtained from a variety of projects on actual document cases. The aim of the present study is to initially test the performance of each dataset (PDF, Word, and PowerPoint dataset) through using four machine learning classification algorithms which are (Bayes Net, Random Forest, Random Committee, and OneR). Second goal is experimenting the efficiency of the approach of the deep learning in the tasks of classification and after that, comparing the efficiencies with the efficiencies of traditional machine learning classification methods. Mainly two classification techniques used to maximize the benefits of the classification process, the first one is to use the deep learning algorithm which shows an accuracy of classifying file between (95 and 96%) for three new dataset files and standard machine learning algorithms (OneR, Random forest, Bayes net, and Random Committee ) these algorithm shows accuracy 91% for PDF Dataset using random forest and random committee algorithms, for Word dataset the accuracy is 46% using random committee, and for the last dataset PowerPoint the accuracy is 77% using random forest, Therefore, we will choose Deep learning algorithm because it gives higher results and accuracy than machine learning algorithms.</abstract><cop>Melville</cop><pub>American Institute of Physics</pub><doi>10.1063/5.0120454</doi><tpages>11</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0094-243X |
ispartof | AIP conference proceedings, 2023, Vol.2591 (1) |
issn | 0094-243X ; 1551-7616 |
language | eng |
recordid | cdi_proquest_journals_2792137551 |
source | AIP Journals Complete |
subjects | Accuracy ; Algorithms ; Artificial neural networks ; Classification ; Colleges & universities ; Datasets ; Deep learning ; Education ; Machine learning ; Portable document format ; Public speaking |
title | Datasets classification using deep learning and machine learning classification algorithms |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T13%3A21%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Datasets%20classification%20using%20deep%20learning%20and%20machine%20learning%20classification%20algorithms&rft.btitle=AIP%20conference%20proceedings&rft.au=Abdulameer,%20Maysaa%20H.&rft.date=2023-03-29&rft.volume=2591&rft.issue=1&rft.issn=0094-243X&rft.eissn=1551-7616&rft.coden=APCPCS&rft_id=info:doi/10.1063/5.0120454&rft_dat=%3Cproquest_scita%3E2792137551%3C/proquest_scita%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2792137551&rft_id=info:pmid/&rfr_iscdi=true |
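
The accuracy comparison summarized in the description field can be tabulated with a short script. This is only an illustrative sketch, not the authors' code: the figures are the ones quoted in the abstract, the names `REPORTED` and `compare` are hypothetical, and because the abstract gives the deep learning accuracy only as a range of 95 to 96% across the three datasets, the lower bound of 95% is assumed for every dataset.

```python
# Best accuracies (%) quoted in the abstract of this record.
# ASSUMPTION: deep learning is reported only as a 95-96% range over all
# three datasets, so the conservative lower bound (95.0) is used for each.
REPORTED = {
    "PDF": {
        "classical": ("Random Forest / Random Committee", 91.0),
        "deep_learning": 95.0,
    },
    "Word": {
        "classical": ("Random Committee", 46.0),
        "deep_learning": 95.0,
    },
    "PowerPoint": {
        "classical": ("Random Forest", 77.0),
        "deep_learning": 95.0,
    },
}


def compare(reported):
    """Return (dataset, classical algorithm, classical acc, deep acc, margin)."""
    rows = []
    for dataset, result in reported.items():
        algorithm, classical_acc = result["classical"]
        deep_acc = result["deep_learning"]
        # Margin in percentage points by which deep learning leads.
        rows.append((dataset, algorithm, classical_acc, deep_acc,
                     deep_acc - classical_acc))
    return rows


if __name__ == "__main__":
    for dataset, algorithm, classical_acc, deep_acc, margin in compare(REPORTED):
        print(f"{dataset:<10} {algorithm:<34} "
              f"{classical_acc:5.1f} {deep_acc:5.1f} {margin:+6.1f}")
```

Even with the lower bound assumed for deep learning, its margin over the best reported classical result stays positive on every dataset, which matches the conclusion stated in the abstract.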