Automatic vs manual categorisation of documents in Spanish
Automatic categorisation can be understood as a learning process during which a program recognises the characteristics that distinguish each category or class from others, i.e. those characteristics which the documents should have in order to belong to that category. As yet few experiments have been...
Published in: | Journal of documentation 2001-11, Vol.57 (6), p.763-773 |
---|---|
Main authors: | Figuerola, Carlos G. ; Zazo Rodríguez, Angel ; Luis Alonso Berrocal, José |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 773 |
---|---|
container_issue | 6 |
container_start_page | 763 |
container_title | Journal of documentation |
container_volume | 57 |
creator | Figuerola, Carlos G. ; Zazo Rodríguez, Angel ; Luis Alonso Berrocal, José |
description | Automatic categorisation can be understood as a learning process during which a program recognises the characteristics that distinguish each category or class from others, i.e. those characteristics which the documents should have in order to belong to that category. As yet few experiments have been carried out with documents in Spanish. Here we show the possibilities of elaborating pattern vectors that include the characteristics of different classes or categories of documents, using techniques based on those applied to the expansion of queries by relevance; likewise, the results of applying these techniques to a collection of documents in Spanish are given. The same collection of documents was categorised manually and the results of both procedures were compared. |
doi_str_mv | 10.1108/EUM0000000007099 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 0022-0418 |
ispartof | Journal of documentation, 2001-11, Vol.57 (6), p.763-773 |
issn | 0022-0418 ; 1758-7379 |
language | eng |
recordid | cdi_proquest_journals_217974826 |
source | Emerald Journals |
subjects | Algorithms ; And ; Automatic Language Processing ; Automation ; Classification ; Collections ; Comparative studies ; Computerized catalogues ; Documents ; Exact sciences and technology ; Experiments ; Feedback ; Information and communication sciences ; Information Needs ; Information Retrieval ; Information retrieval systems ; Information retrieval systems. Information and document management system ; Information science. Documentation ; Information Sources ; Information Systems ; Learning Processes ; Manual catalogues ; Mathematics ; Measurement Techniques ; Relevance (Information Retrieval) ; Sciences and techniques of general use ; Spanish ; Spanish language materials |
title | Automatic vs manual categorisation of documents in Spanish |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-11T06%3A00%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_eric_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automatic%20vs%20manual%20categorisation%20of%20documents%20in%20Spanish&rft.jtitle=Journal%20of%20documentation&rft.au=Figuerola,%20Carlos%20G.&rft.date=2001-11-01&rft.volume=57&rft.issue=6&rft.spage=763&rft.epage=773&rft.pages=763-773&rft.issn=0022-0418&rft.eissn=1758-7379&rft.coden=JDOCAS&rft_id=info:doi/10.1108/EUM0000000007099&rft_dat=%3Cproquest_eric_%3E87400496%3C/proquest_eric_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=217974826&rft_id=info:pmid/&rft_ericid=EJ639506&rfr_iscdi=true |
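
The description above mentions building pattern vectors for each category with techniques borrowed from query expansion by relevance. As a rough illustration only, the sketch below assumes a Rocchio-style formulation: each category's pattern vector is the mean of its own documents' term vectors minus a down-weighted mean of the other documents' vectors, and a new document goes to the category whose pattern it most resembles by cosine similarity. The record does not confirm that the article uses this exact formula. The function names, the `beta` and `gamma` weights, the whitespace tokeniser and the tiny Spanish training set are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real system would stem and drop stopwords.
    return text.lower().split()


def tf_vector(text):
    """L2-normalised term-frequency vector for one document."""
    counts = Counter(tokenize(text))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def pattern_vector(category, training, beta=1.0, gamma=0.25):
    """Rocchio-style pattern vector (assumed formulation, hypothetical weights).

    Adds the mean of in-category vectors, subtracts a down-weighted mean of
    out-of-category vectors, and keeps only terms with a positive weight.
    """
    pos = [tf_vector(text) for text, label in training if label == category]
    neg = [tf_vector(text) for text, label in training if label != category]
    weights = Counter()
    for vec in pos:
        for term, w in vec.items():
            weights[term] += beta * w / len(pos)
    for vec in neg:
        for term, w in vec.items():
            weights[term] -= gamma * w / len(neg)
    return {t: w for t, w in weights.items() if w > 0}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def categorise(text, patterns):
    """Assign the document to the category with the most similar pattern vector."""
    doc = tf_vector(text)
    return max(patterns, key=lambda c: cosine(doc, patterns[c]))


# Hypothetical toy training collection (text, manually assigned category).
TRAINING = [
    ("el equipo ganó el partido de fútbol", "deportes"),
    ("el jugador marcó un gol en el partido", "deportes"),
    ("el gobierno aprobó la ley de presupuestos", "política"),
    ("el parlamento votó la nueva ley", "política"),
]
PATTERNS = {c: pattern_vector(c, TRAINING) for c in {"deportes", "política"}}

if __name__ == "__main__":
    print(categorise("el gol del partido", PATTERNS))
    print(categorise("la ley del gobierno", PATTERNS))
```

Comparing the output of `categorise` with the manually assigned labels on held-out documents would mirror, in miniature, the automatic versus manual comparison that the description reports.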