New book classification based on Dewey Decimal Classification (DDC) law using tf-idf and cosine similarity method

Abstract: Classifying new books helps students and lecturers find books. The scheme used is the Dewey Decimal Classification (DDC). Applying the DDC requires a high level of accuracy and concentration to group books into the appropriate classes...

Bibliographic details
Published in: Journal of physics. Conference series 2019-04, Vol.1211 (1), p.12044
Main authors: Nurdiansyah, Y, Andrianto, A, Kamshal, L
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 1
container_start_page 12044
container_title Journal of physics. Conference series
container_volume 1211
creator Nurdiansyah, Y
Andrianto, A
Kamshal, L
description Abstract: Classifying new books helps students and lecturers find books. The scheme used is the Dewey Decimal Classification (DDC). Applying the DDC requires a high level of accuracy and concentration to group books into the appropriate classes. Errors take the form of books being assigned to the wrong class. Performance can be improved by an information system that helps assign books to classes according to the DDC. A book is assigned a class by looking for the highest similarity between its title and synopsis and each class in the DDC dictionary. Following the class-assignment process at the University of Jember Library, the title, synopsis and DDC dictionary are processed using text mining. Text mining reduces the title, synopsis and DDC dictionary to base words. The number of occurrences of each word is useful for measuring how important a word is in a document. A suitable method for calculating the importance of a word in a document is Term Frequency-Inverse Document Frequency (TF-IDF) weighting. The TF-IDF weights are used to find the highest similarity between the title and synopsis and a class in the DDC dictionary. An appropriate method for calculating the similarity of two documents is cosine similarity. The class whose DDC dictionary entry has the highest cosine similarity to the title and synopsis is given priority in determining the book's class. Applied to 20 books, the system assigned 3 books to DDC class 000, 1 book to class 100, 1 book to class 200, 6 books to class 300, 4 books to class 400, 1 book to class 500, 2 books to class 600 and 2 books to class 700. Testing of the book classification information system yields an accuracy of 35%.
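The pipeline described in the abstract (text mining, TF-IDF weighting, then cosine similarity against each DDC dictionary class) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy `TOY_DDC` dictionary, the sample title, and the choice of idf = log(N / df) are all assumptions made for illustration, and the stemming step that reduces text to base words is omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (no stemming in this sketch)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Map each document name to a {term: tf * idf} dictionary.

    tf is the raw term count in the document; idf = log(N / df) is one
    common variant and is an assumption here, not taken from the paper.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        vectors[name] = {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(book_text, ddc_dictionary):
    """Return (best_class, scores) for a book's title and synopsis text."""
    docs = {cls: tokenize(desc) for cls, desc in ddc_dictionary.items()}
    docs["__book__"] = tokenize(book_text)  # hypothetical key for the query document
    vectors = tfidf_vectors(docs)
    book_vec = vectors.pop("__book__")
    scores = {cls: cosine_similarity(book_vec, vec) for cls, vec in vectors.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical three-class excerpt; a real DDC dictionary would be far larger.
TOY_DDC = {
    "000": "computer science information general works",
    "500": "science mathematics physics chemistry biology",
    "700": "arts music painting recreation",
}

best_class, scores = classify(
    "Introduction to physics and mathematics for science students", TOY_DDC
)
print(best_class, scores)  # the toy input selects class "500"
```

With so few words per class, a single shared term can decide the result, which is one plausible reason a title-and-synopsis approach can score low on real data; the sketch makes no claim about why the reported accuracy is 35%.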
doi_str_mv 10.1088/1742-6596/1211/1/012044
format Article
publisher Bristol: IOP Publishing
addtitle J. Phys.: Conf. Ser
date 2019-04-01
eissn 1742-6596
tpages 9
rights Published under licence by IOP Publishing Ltd. 2019. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
toplevel peer_reviewed; online_resources
oa free_for_read
linktopdf https://iopscience.iop.org/article/10.1088/1742-6596/1211/1/012044/pdf
fulltext fulltext
identifier ISSN: 1742-6588
ispartof Journal of physics. Conference series, 2019-04, Vol.1211 (1), p.12044
issn 1742-6588
1742-6596
language eng
recordid cdi_proquest_journals_2566130884
source Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Institute of Physics Open Access Journal Titles; IOPscience extra; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
subjects Classification
Data mining
Dewey Decimal Classification
Dictionaries
Information systems
Mathematical analysis
Similarity
Weighting
title New book classification based on Dewey Decimal Classification (DDC) law using tf-idf and cosine similarity method
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-13T00%3A51%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_iop_j&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=New%20book%20classification%20based%20on%20Dewey%20Decimal%20Classification%20(DDC)%20law%20using%20tf-idf%20and%20cosine%20similarity%20method&rft.jtitle=Journal%20of%20physics.%20Conference%20series&rft.au=Nurdiansyah,%20Y&rft.date=2019-04-01&rft.volume=1211&rft.issue=1&rft.spage=12044&rft.pages=12044-&rft.issn=1742-6588&rft.eissn=1742-6596&rft_id=info:doi/10.1088/1742-6596/1211/1/012044&rft_dat=%3Cproquest_iop_j%3E2566130884%3C/proquest_iop_j%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2566130884&rft_id=info:pmid/&rfr_iscdi=true