Matrices, Vector Spaces, and Information Retrieval

The evolution of digital libraries and the Internet has dramatically transformed the processing, storage, and retrieval of information. Efforts to digitize text, images, video, and audio now consume a substantial portion of both academic and industrial activity. Even when there is no shortage of tex...

Bibliographic details
Published in: SIAM review 1999-06, Vol.41 (2), p.335-362
Main authors: Berry, Michael W., Drmač, Zlatko, Jessup, Elizabeth R.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 362
container_issue 2
container_start_page 335
container_title SIAM review
container_volume 41
creator Berry, Michael W.
Drmač, Zlatko
Jessup, Elizabeth R.
description The evolution of digital libraries and the Internet has dramatically transformed the processing, storage, and retrieval of information. Efforts to digitize text, images, video, and audio now consume a substantial portion of both academic and industrial activity. Even when there is no shortage of textual materials on a particular topic, procedures for indexing or extracting the knowledge or conceptual information contained in them can be lacking. Recently developed information retrieval technologies are based on the concept of a vector space. Data are modeled as a matrix, and a user's query of the database is represented as a vector. Relevant documents in the database are then identified via simple vector operations. Orthogonal factorizations of the matrix provide mechanisms for handling uncertainty in the database itself. The purpose of this paper is to show how such fundamental mathematical concepts from linear algebra can be used to manage and index large text collections.
doi_str_mv 10.1137/S0036144598347035
format Article
backlink http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=1846286 (View record in Pascal Francis)
coden SIREAD
collection Pascal-Francis
CrossRef
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Technology Research Database
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
date 1999-06-01
eissn 1095-7200
jstor_id 2653077
linktohtml https://www.jstor.org/stable/2653077
linktopdf https://www.jstor.org/stable/pdf/2653077
oa free_for_read
pqid 194202556
publisher Philadelphia, PA: Society for Industrial and Applied Mathematics
rights Copyright 1999 Society for Industrial and Applied Mathematics
1999 INIST-CNRS
Copyright Society for Industrial and Applied Mathematics Jun 1999
toplevel peer_reviewed
online_resources
tpages 28
fulltext fulltext
identifier ISSN: 0036-1445
ispartof SIAM review, 1999-06, Vol.41 (2), p.335-362
issn 0036-1445
1095-7200
language eng
recordid cdi_proquest_miscellaneous_27013136
source Jstor Complete Legacy; LOCUS - SIAM's Online Journal Archive; Business Source Complete; JSTOR Mathematics & Statistics
subjects Algebra
Applied sciences
Approximation
Baking
Computer science; control theory; systems
Cosine function
Education
Exact sciences and technology
Factorization
Information retrieval
Information retrieval. Graph
Linear and multilinear algebra, matrix theory
Linear programming
Mathematical vectors
Mathematics
Matrices
Matrix
Memory organisation. Data processing
Numerical analysis
Numerical analysis. Scientific computation
Numerical linear algebra
Sciences and techniques of general use
Software
Theoretical computing
Vector space models
Vector spaces
title Matrices, Vector Spaces, and Information Retrieval
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-16T09%3A23%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Matrices,%20Vector%20Spaces,%20and%20Information%20Retrieval&rft.jtitle=SIAM%20review&rft.au=Berry,%20Michael%20W.&rft.date=1999-06-01&rft.volume=41&rft.issue=2&rft.spage=335&rft.epage=362&rft.pages=335-362&rft.issn=0036-1445&rft.eissn=1095-7200&rft.coden=SIREAD&rft_id=info:doi/10.1137/S0036144598347035&rft_dat=%3Cjstor_proqu%3E2653077%3C/jstor_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=194202556&rft_id=info:pmid/&rft_jstor_id=2653077&rfr_iscdi=true
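
The description field of this record summarizes the vector space model: the collection is stored as a matrix, a query is a vector, and relevant documents are found with simple vector operations. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a term-by-document count matrix and cosine similarity as the vector operation (the record lists "Cosine function" among its subjects). The terms, counts, and function names are made up for illustration, and the orthogonal factorizations mentioned in the description are not covered.

```python
import math

# Hypothetical toy collection: rows are terms, columns are documents.
# A[i][j] is an invented count of how often term i occurs in document j.
terms = ["matrix", "vector", "retrieval", "baking"]
A = [
    [2, 0, 1],  # matrix
    [1, 0, 2],  # vector
    [0, 0, 3],  # retrieval
    [0, 4, 0],  # baking
]


def column(matrix, j):
    """Return column j of a row-major matrix, i.e. the vector for document j."""
    return [row[j] for row in matrix]


def cosine(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(matrix, query):
    """Return (document index, cosine score) pairs, best match first."""
    n_docs = len(matrix[0])
    scores = [(j, cosine(column(matrix, j), query)) for j in range(n_docs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# A query vector that contains the terms "vector" and "retrieval".
query = [0, 1, 1, 0]
ranking = rank_documents(A, query)

for doc, score in ranking:
    print(f"document {doc}: cosine = {score:.3f}")
```

In this toy data, document 2 shares both query terms and ranks first, while document 1 shares none and scores zero. A retrieval system would typically return only the documents whose score exceeds some chosen threshold.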