Automatic vs manual categorisation of documents in Spanish

Automatic categorisation can be understood as a learning process during which a program recognises the characteristics that distinguish each category or class from others, i.e. those characteristics which the documents should have in order to belong to that category. As yet few experiments have been...

Detailed description

Bibliographic details
Published in: Journal of documentation 2001-11, Vol.57 (6), p.763-773
Main authors: Figuerola, Carlos G.; Zazo Rodríguez, Angel; Luis Alonso Berrocal, José
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 773
container_issue 6
container_start_page 763
container_title Journal of documentation
container_volume 57
creator Figuerola, Carlos G.
Zazo Rodríguez, Angel
Luis Alonso Berrocal, José
description Automatic categorisation can be understood as a learning process during which a program recognises the characteristics that distinguish each category or class from others, i.e. those characteristics which the documents should have in order to belong to that category. As yet few experiments have been carried out with documents in Spanish. Here we show the possibilities of elaborating pattern vectors that include the characteristics of different classes or categories of documents, using techniques based on those applied to the expansion of queries by relevance; likewise, the results of applying these techniques to a collection of documents in Spanish are given. The same collection of documents was categorised manually and the results of both procedures were compared.
doi_str_mv 10.1108/EUM0000000007099
format Article
fullrecord <record><control><sourceid>proquest_eric_</sourceid><recordid>TN_cdi_proquest_journals_217974826</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ericid>EJ639506</ericid><sourcerecordid>87400496</sourcerecordid><originalsourceid>FETCH-LOGICAL-c463t-1b92b2b65c0878fa61439e1faecab3c18d0afa994ef2e6c83959074e46505ae03</originalsourceid><addsrcrecordid>eNqFkM1Lw0AQxRdRsFbvHjwEQW-xs5v99FakflHxoD2H6XajKUm27iaC_72RVoVeOpeBeb83jxlCTilcUQp6NJk9wW8pMGaPDKgSOlWZMvtkAMBYCpzqQ3IU4xKA9oIekOtx1_oa29ImnzGpsemwSiy27s2HMvZz3yS-SBbedrVr2piUTfKywqaM78fkoMAqupNNH5LZ7eT15j6dPt893IynqeUya1M6N2zO5lJY0EoXKCnPjKMFOovzzFK9ACzQGO4K5qTVmREGFHdcChDoIBuSy_XeVfAfnYttXpfRuqrCxvku5kJxxRTwnSDTUlIhdA-eb4FL34WmPyJnVBnFNZM9BGvIBh9jcEW-CmWN4SunkP-8PN9-eW-52OzFaLEqAja2jP8-TpkRSvXc2ZpzobR_8uRR9rfDT_JoI9cuYLXYHfwNd82WYA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>217974826</pqid></control><display><type>article</type><title>Automatic vs manual categorisation of documents in Spanish</title><source>Emerald Journals</source><creator>Figuerola, Carlos G. ; Zazo Rodríguez, Angel ; Luis Alonso Berrocal, José</creator><creatorcontrib>Figuerola, Carlos G. ; Zazo Rodríguez, Angel ; Luis Alonso Berrocal, José</creatorcontrib><description>Automatic categorisation can be understood as a learning process during which a program recognises the characteristics that distinguish each category or class from others, i.e. those characteristics which the documents should have in order to belong to that category. As yet few experiments have been carried out with documents in Spanish. Here we show the possibilities of elaborating pattern vectors that include the characteristics of different classes or categories of documents, using techniques based on those applied to the expansion of queries by relevance; likewise, the results of applying these techniques to a collection of documents in Spanish are given. 
The same collection of documents was categorised manually and the results of both procedures were compared.</description><identifier>ISSN: 0022-0418</identifier><identifier>EISSN: 1758-7379</identifier><identifier>DOI: 10.1108/EUM0000000007099</identifier><identifier>CODEN: JDOCAS</identifier><language>eng</language><publisher>Bradford: MCB UP Ltd</publisher><subject>Algorithms ; And ; Automatic Language Processing ; Automation ; Classification ; Collections ; Comparative studies ; Computerized catalogues ; Documents ; Exact sciences and technology ; Experiments ; Feedback ; Information and communication sciences ; Information Needs ; Information Retrieval ; Information retrieval systems ; Information retrieval systems. Information and document management system ; Information science. Documentation ; Information Sources ; Information Systems ; Learning Processes ; Manual catalogues ; Mathematics ; Measurement Techniques ; Relevance (Information Retrieval) ; Sciences and techniques of general use ; Spanish ; Spanish language materials</subject><ispartof>Journal of documentation, 2001-11, Vol.57 (6), p.763-773</ispartof><rights>MCB UP Limited</rights><rights>2002 INIST-CNRS</rights><rights>Copyright ASLIB. 
The Association for Information Management Nov 2001</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c463t-1b92b2b65c0878fa61439e1faecab3c18d0afa994ef2e6c83959074e46505ae03</citedby><cites>FETCH-LOGICAL-c463t-1b92b2b65c0878fa61439e1faecab3c18d0afa994ef2e6c83959074e46505ae03</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.emerald.com/insight/content/doi/10.1108/EUM0000000007099/full/pdf$$EPDF$$P50$$Gemerald$$H</linktopdf><linktohtml>$$Uhttps://www.emerald.com/insight/content/doi/10.1108/EUM0000000007099/full/html$$EHTML$$P50$$Gemerald$$H</linktohtml><link.rule.ids>314,776,780,961,11614,27901,27902,52661,52664</link.rule.ids><backlink>$$Uhttp://eric.ed.gov/ERICWebPortal/detail?accno=EJ639506$$DView record in ERIC$$Hfree_for_read</backlink><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=14129577$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Figuerola, Carlos G.</creatorcontrib><creatorcontrib>Zazo Rodríguez, Angel</creatorcontrib><creatorcontrib>Luis Alonso Berrocal, José</creatorcontrib><title>Automatic vs manual categorisation of documents in Spanish</title><title>Journal of documentation</title><description>Automatic categorisation can be understood as a learning process during which a program recognises the characteristics that distinguish each category or class from others, i.e. those characteristics which the documents should have in order to belong to that category. As yet few experiments have been carried out with documents in Spanish. 
Here we show the possibilities of elaborating pattern vectors that include the characteristics of different classes or categories of documents, using techniques based on those applied to the expansion of queries by relevance; likewise, the results of applying these techniques to a collection of documents in Spanish are given. The same collection of documents was categorised manually and the results of both procedures were compared.</description><subject>Algorithms</subject><subject>And</subject><subject>Automatic Language Processing</subject><subject>Automation</subject><subject>Classification</subject><subject>Collections</subject><subject>Comparative studies</subject><subject>Computerized catalogues</subject><subject>Documents</subject><subject>Exact sciences and technology</subject><subject>Experiments</subject><subject>Feedback</subject><subject>Information and communication sciences</subject><subject>Information Needs</subject><subject>Information Retrieval</subject><subject>Information retrieval systems</subject><subject>Information retrieval systems. Information and document management system</subject><subject>Information science. 
Documentation</subject><subject>Information Sources</subject><subject>Information Systems</subject><subject>Learning Processes</subject><subject>Manual catalogues</subject><subject>Mathematics</subject><subject>Measurement Techniques</subject><subject>Relevance (Information Retrieval)</subject><subject>Sciences and techniques of general use</subject><subject>Spanish</subject><subject>Spanish language materials</subject><issn>0022-0418</issn><issn>1758-7379</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2001</creationdate><recordtype>article</recordtype><sourceid>AVQMV</sourceid><sourceid>BENPR</sourceid><sourceid>K50</sourceid><sourceid>M1D</sourceid><recordid>eNqFkM1Lw0AQxRdRsFbvHjwEQW-xs5v99FakflHxoD2H6XajKUm27iaC_72RVoVeOpeBeb83jxlCTilcUQp6NJk9wW8pMGaPDKgSOlWZMvtkAMBYCpzqQ3IU4xKA9oIekOtx1_oa29ImnzGpsemwSiy27s2HMvZz3yS-SBbedrVr2piUTfKywqaM78fkoMAqupNNH5LZ7eT15j6dPt893IynqeUya1M6N2zO5lJY0EoXKCnPjKMFOovzzFK9ACzQGO4K5qTVmREGFHdcChDoIBuSy_XeVfAfnYttXpfRuqrCxvku5kJxxRTwnSDTUlIhdA-eb4FL34WmPyJnVBnFNZM9BGvIBh9jcEW-CmWN4SunkP-8PN9-eW-52OzFaLEqAja2jP8-TpkRSvXc2ZpzobR_8uRR9rfDT_JoI9cuYLXYHfwNd82WYA</recordid><startdate>20011101</startdate><enddate>20011101</enddate><creator>Figuerola, Carlos G.</creator><creator>Zazo Rodríguez, Angel</creator><creator>Luis Alonso Berrocal, José</creator><general>MCB UP Ltd</general><general>Emerald</general><general>Emerald Group Publishing 
Limited</general><scope>7SW</scope><scope>BJH</scope><scope>BNH</scope><scope>BNI</scope><scope>BNJ</scope><scope>BNO</scope><scope>ERI</scope><scope>PET</scope><scope>REK</scope><scope>WWN</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>0-V</scope><scope>7WY</scope><scope>7WZ</scope><scope>7XB</scope><scope>8AO</scope><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ALSLI</scope><scope>ARAPS</scope><scope>AVQMV</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>CJNVE</scope><scope>CNYFK</scope><scope>DWQXO</scope><scope>E3H</scope><scope>F2A</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K50</scope><scope>K6~</scope><scope>K7-</scope><scope>L.-</scope><scope>M0C</scope><scope>M0P</scope><scope>M1D</scope><scope>M1O</scope><scope>P5Z</scope><scope>P62</scope><scope>PQBIZ</scope><scope>PQEDU</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>7SC</scope><scope>8FD</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20011101</creationdate><title>Automatic vs manual categorisation of documents in Spanish</title><author>Figuerola, Carlos G. 
; Zazo Rodríguez, Angel ; Luis Alonso Berrocal, José</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c463t-1b92b2b65c0878fa61439e1faecab3c18d0afa994ef2e6c83959074e46505ae03</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2001</creationdate><topic>Algorithms</topic><topic>And</topic><topic>Automatic Language Processing</topic><topic>Automation</topic><topic>Classification</topic><topic>Collections</topic><topic>Comparative studies</topic><topic>Computerized catalogues</topic><topic>Documents</topic><topic>Exact sciences and technology</topic><topic>Experiments</topic><topic>Feedback</topic><topic>Information and communication sciences</topic><topic>Information Needs</topic><topic>Information Retrieval</topic><topic>Information retrieval systems</topic><topic>Information retrieval systems. Information and document management system</topic><topic>Information science. Documentation</topic><topic>Information Sources</topic><topic>Information Systems</topic><topic>Learning Processes</topic><topic>Manual catalogues</topic><topic>Mathematics</topic><topic>Measurement Techniques</topic><topic>Relevance (Information Retrieval)</topic><topic>Sciences and techniques of general use</topic><topic>Spanish</topic><topic>Spanish language materials</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Figuerola, Carlos G.</creatorcontrib><creatorcontrib>Zazo Rodríguez, Angel</creatorcontrib><creatorcontrib>Luis Alonso Berrocal, José</creatorcontrib><collection>ERIC</collection><collection>ERIC (Ovid)</collection><collection>ERIC</collection><collection>ERIC</collection><collection>ERIC (Legacy Platform)</collection><collection>ERIC( SilverPlatter )</collection><collection>ERIC</collection><collection>ERIC PlusText (Legacy Platform)</collection><collection>Education Resources Information Center 
(ERIC)</collection><collection>ERIC</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>ProQuest Social Sciences Premium Collection</collection><collection>ABI/INFORM Collection</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ProQuest Pharma Collection</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Social Science Premium Collection</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>Arts Premium Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Business Premium Collection</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>Education Collection</collection><collection>Library &amp; Information Science Collection</collection><collection>ProQuest Central Korea</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; Information Science Abstracts (LISA)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Art, Design &amp; Architecture Collection</collection><collection>ProQuest Business Collection</collection><collection>Computer Science Database</collection><collection>ABI/INFORM Professional Advanced</collection><collection>ABI/INFORM Global</collection><collection>Education Database</collection><collection>Arts &amp; Humanities Database</collection><collection>Library Science 
Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Business</collection><collection>ProQuest One Education</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Journal of documentation</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Figuerola, Carlos G.</au><au>Zazo Rodríguez, Angel</au><au>Luis Alonso Berrocal, José</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><ericid>EJ639506</ericid><atitle>Automatic vs manual categorisation of documents in Spanish</atitle><jtitle>Journal of documentation</jtitle><date>2001-11-01</date><risdate>2001</risdate><volume>57</volume><issue>6</issue><spage>763</spage><epage>773</epage><pages>763-773</pages><issn>0022-0418</issn><eissn>1758-7379</eissn><coden>JDOCAS</coden><abstract>Automatic categorisation can be understood as a learning process during which a program recognises the characteristics that distinguish each category or class from others, i.e. those characteristics which the documents should have in order to belong to that category. As yet few experiments have been carried out with documents in Spanish. 
Here we show the possibilities of elaborating pattern vectors that include the characteristics of different classes or categories of documents, using techniques based on those applied to the expansion of queries by relevance; likewise, the results of applying these techniques to a collection of documents in Spanish are given. The same collection of documents was categorised manually and the results of both procedures were compared.</abstract><cop>Bradford</cop><pub>MCB UP Ltd</pub><doi>10.1108/EUM0000000007099</doi><tpages>11</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0022-0418
ispartof Journal of documentation, 2001-11, Vol.57 (6), p.763-773
issn 0022-0418
1758-7379
language eng
recordid cdi_proquest_journals_217974826
source Emerald Journals
subjects Algorithms
And
Automatic Language Processing
Automation
Classification
Collections
Comparative studies
Computerized catalogues
Documents
Exact sciences and technology
Experiments
Feedback
Information and communication sciences
Information Needs
Information Retrieval
Information retrieval systems
Information retrieval systems. Information and document management system
Information science. Documentation
Information Sources
Information Systems
Learning Processes
Manual catalogues
Mathematics
Measurement Techniques
Relevance (Information Retrieval)
Sciences and techniques of general use
Spanish
Spanish language materials
title Automatic vs manual categorisation of documents in Spanish
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-11T06%3A00%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_eric_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automatic%20vs%20manual%20categorisation%20of%20documents%20in%20Spanish&rft.jtitle=Journal%20of%20documentation&rft.au=Figuerola,%20Carlos%20G.&rft.date=2001-11-01&rft.volume=57&rft.issue=6&rft.spage=763&rft.epage=773&rft.pages=763-773&rft.issn=0022-0418&rft.eissn=1758-7379&rft.coden=JDOCAS&rft_id=info:doi/10.1108/EUM0000000007099&rft_dat=%3Cproquest_eric_%3E87400496%3C/proquest_eric_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=217974826&rft_id=info:pmid/&rft_ericid=EJ639506&rfr_iscdi=true