Exploring term dependences in probabilistic information retrieval model

Most previous information retrieval (IR) models assume that terms of queries and documents are statistically independent of one another. However, this conditional independence assumption is widely acknowledged to be wrong, so we present a new method of incorporating term depen...

Bibliographic details
Published in: Information processing & management 2003-07, Vol.39 (4), p.505-519
Main authors: Cho, Bong-Hyun, Lee, Changki, Lee, Gary Geunbae
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 519
container_issue 4
container_start_page 505
container_title Information processing & management
container_volume 39
creator Cho, Bong-Hyun
Lee, Changki
Lee, Gary Geunbae
description Most previous information retrieval (IR) models assume that terms of queries and documents are statistically independent of one another. However, this conditional independence assumption is widely acknowledged to be wrong, so we present a new method of incorporating term dependence into the probabilistic retrieval model by adapting the Bahadur–Lazarsfeld expansion (BLE) to compensate for the weakness of the assumption. In this paper, we describe a theoretical process for applying BLE to general probabilistic models and to the state-of-the-art 2-Poisson model. Through experiments on two standard document collections, HANTEC2.0 in Korean and WT10g in English, we demonstrate that incorporating term dependences using the BLE significantly contributes to improved performance in IR systems for at least two different languages.
doi_str_mv 10.1016/S0306-4573(02)00078-X
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_57577849</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ericid>EJ670083</ericid><els_id>S030645730200078X</els_id><sourcerecordid>432545591</sourcerecordid><originalsourceid>FETCH-LOGICAL-c416t-ff96babfb19cdea3ff78056ebd47bd05f2694fabede9c23eff1dfb83fae17bf43</originalsourceid><addsrcrecordid>eNqFkM1KJDEYRcOgYNv6Bg4UgqKL0qTyV7UaBmn_aJjFKLgLqeSLRKoqPUl1o29v2m4ccOMqkHu-y-Ug9JPgC4KJuPyLKRYl45Ke4eocYyzr8ukHmpBa0pJTSXbQ5BPZQ_spvWSIcVJN0M3sddGF6IfnYoTYFxYWMFgYDKTCD8Uihla3vvNp9CZ_uBB7PfowFBHG6GGlu6IPFroDtOt0l-Bw-07R4_Xs4eq2nP-5ubv6PS8NI2IsnWtELnQtaYwFTZ2TNeYCWstkazF3lWiY0y1YaExFwTliXVtTp4HI1jE6Raeb3rzs3xLSqHqfDHSdHiAsk-KSS1mzJoPHX8CXsIxD3qZIwxrMasIzxDeQiSGlCE4tou91fFMEq7Vb9eFWrcUpXKkPt-op351sy3UyunNRD8an_8dMNlQIkbmjDQfRm894di8kxjXN8a9tnI2tPESVjF-7tz6CGZUN_psh79CGmaA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>194904815</pqid></control><display><type>article</type><title>Exploring term dependences in probabilistic information retrieval model</title><source>Elsevier ScienceDirect Journals Complete</source><creator>Cho, Bong-Hyun ; Lee, Changki ; Lee, Gary Geunbae</creator><creatorcontrib>Cho, Bong-Hyun ; Lee, Changki ; Lee, Gary Geunbae</creatorcontrib><description>Most previous information retrieval (IR) models assume that terms of queries and documents are statistically independent from each another. However, this kind of conditional independence assumption is obviously and openly understood to be wrong, so we present a new method of incorporating term dependence in probabilistic retrieval model by adapting Bahadur–Lazarsfeld expansion (BLE) to compensate the weakness of the assumption. In this paper, we describe a theoretic process to apply BLE to the general probabilistic models and the state-of-the-art 2-Poisson model. 
Through the experiments on two standard document collections, HANTEC2.0 in Korean and WT10g in English, we demonstrate that incorporation of term dependences using the BLE significantly contribute to the improvement of performance in at least two different language IR systems.</description><identifier>ISSN: 0306-4573</identifier><identifier>EISSN: 1873-5371</identifier><identifier>DOI: 10.1016/S0306-4573(02)00078-X</identifier><identifier>CODEN: IPMADK</identifier><language>eng</language><publisher>Oxford: Elsevier Ltd</publisher><subject>2-Poisson model ; Bahadur–Lazarsfeld expansion ; Comparative Analysis ; Computerized information storage and retrieval ; Exact sciences and technology ; Improvement ; Information and communication sciences ; Information Retrieval ; Information retrieval systems ; Information retrieval systems. Information and document management system ; Information science. Documentation ; Mathematical models ; Online Searching ; Poisson Probability Distribution ; Probabilistic model ; Probabilistic Models ; Probabilistic retrieval ; Probability ; Relevance (Information Retrieval) ; Sciences and techniques of general use ; Searching ; Studies ; Subject Index Terms ; Term dependence ; Terms</subject><ispartof>Information processing &amp; management, 2003-07, Vol.39 (4), p.505-519</ispartof><rights>2002 Elsevier Science Ltd</rights><rights>2003 INIST-CNRS</rights><rights>Copyright Pergamon Press Inc. 
Jul 2003</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c416t-ff96babfb19cdea3ff78056ebd47bd05f2694fabede9c23eff1dfb83fae17bf43</citedby><cites>FETCH-LOGICAL-c416t-ff96babfb19cdea3ff78056ebd47bd05f2694fabede9c23eff1dfb83fae17bf43</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/S0306-4573(02)00078-X$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids><backlink>$$Uhttp://eric.ed.gov/ERICWebPortal/detail?accno=EJ670083$$DView record in ERIC$$Hfree_for_read</backlink><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=14793666$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Cho, Bong-Hyun</creatorcontrib><creatorcontrib>Lee, Changki</creatorcontrib><creatorcontrib>Lee, Gary Geunbae</creatorcontrib><title>Exploring term dependences in probabilistic information retrieval model</title><title>Information processing &amp; management</title><description>Most previous information retrieval (IR) models assume that terms of queries and documents are statistically independent from each another. However, this kind of conditional independence assumption is obviously and openly understood to be wrong, so we present a new method of incorporating term dependence in probabilistic retrieval model by adapting Bahadur–Lazarsfeld expansion (BLE) to compensate the weakness of the assumption. In this paper, we describe a theoretic process to apply BLE to the general probabilistic models and the state-of-the-art 2-Poisson model. 
Through the experiments on two standard document collections, HANTEC2.0 in Korean and WT10g in English, we demonstrate that incorporation of term dependences using the BLE significantly contribute to the improvement of performance in at least two different language IR systems.</description><subject>2-Poisson model</subject><subject>Bahadur–Lazarsfeld expansion</subject><subject>Comparative Analysis</subject><subject>Computerized information storage and retrieval</subject><subject>Exact sciences and technology</subject><subject>Improvement</subject><subject>Information and communication sciences</subject><subject>Information Retrieval</subject><subject>Information retrieval systems</subject><subject>Information retrieval systems. Information and document management system</subject><subject>Information science. Documentation</subject><subject>Mathematical models</subject><subject>Online Searching</subject><subject>Poisson Probability Distribution</subject><subject>Probabilistic model</subject><subject>Probabilistic Models</subject><subject>Probabilistic retrieval</subject><subject>Probability</subject><subject>Relevance (Information Retrieval)</subject><subject>Sciences and techniques of general use</subject><subject>Searching</subject><subject>Studies</subject><subject>Subject Index Terms</subject><subject>Term 
dependence</subject><subject>Terms</subject><issn>0306-4573</issn><issn>1873-5371</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2003</creationdate><recordtype>article</recordtype><recordid>eNqFkM1KJDEYRcOgYNv6Bg4UgqKL0qTyV7UaBmn_aJjFKLgLqeSLRKoqPUl1o29v2m4ccOMqkHu-y-Ug9JPgC4KJuPyLKRYl45Ke4eocYyzr8ukHmpBa0pJTSXbQ5BPZQ_spvWSIcVJN0M3sddGF6IfnYoTYFxYWMFgYDKTCD8Uihla3vvNp9CZ_uBB7PfowFBHG6GGlu6IPFroDtOt0l-Bw-07R4_Xs4eq2nP-5ubv6PS8NI2IsnWtELnQtaYwFTZ2TNeYCWstkazF3lWiY0y1YaExFwTliXVtTp4HI1jE6Raeb3rzs3xLSqHqfDHSdHiAsk-KSS1mzJoPHX8CXsIxD3qZIwxrMasIzxDeQiSGlCE4tou91fFMEq7Vb9eFWrcUpXKkPt-op351sy3UyunNRD8an_8dMNlQIkbmjDQfRm894di8kxjXN8a9tnI2tPESVjF-7tz6CGZUN_psh79CGmaA</recordid><startdate>20030701</startdate><enddate>20030701</enddate><creator>Cho, Bong-Hyun</creator><creator>Lee, Changki</creator><creator>Lee, Gary Geunbae</creator><general>Elsevier Ltd</general><general>Elsevier Science</general><general>Elsevier Science Ltd</general><scope>7SW</scope><scope>BJH</scope><scope>BNH</scope><scope>BNI</scope><scope>BNJ</scope><scope>BNO</scope><scope>ERI</scope><scope>PET</scope><scope>REK</scope><scope>WWN</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>E3H</scope><scope>F2A</scope></search><sort><creationdate>20030701</creationdate><title>Exploring term dependences in probabilistic information retrieval model</title><author>Cho, Bong-Hyun ; Lee, Changki ; Lee, Gary Geunbae</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c416t-ff96babfb19cdea3ff78056ebd47bd05f2694fabede9c23eff1dfb83fae17bf43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2003</creationdate><topic>2-Poisson model</topic><topic>Bahadur–Lazarsfeld expansion</topic><topic>Comparative Analysis</topic><topic>Computerized information storage and retrieval</topic><topic>Exact sciences and technology</topic><topic>Improvement</topic><topic>Information and communication 
sciences</topic><topic>Information Retrieval</topic><topic>Information retrieval systems</topic><topic>Information retrieval systems. Information and document management system</topic><topic>Information science. Documentation</topic><topic>Mathematical models</topic><topic>Online Searching</topic><topic>Poisson Probability Distribution</topic><topic>Probabilistic model</topic><topic>Probabilistic Models</topic><topic>Probabilistic retrieval</topic><topic>Probability</topic><topic>Relevance (Information Retrieval)</topic><topic>Sciences and techniques of general use</topic><topic>Searching</topic><topic>Studies</topic><topic>Subject Index Terms</topic><topic>Term dependence</topic><topic>Terms</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Cho, Bong-Hyun</creatorcontrib><creatorcontrib>Lee, Changki</creatorcontrib><creatorcontrib>Lee, Gary Geunbae</creatorcontrib><collection>ERIC</collection><collection>ERIC (Ovid)</collection><collection>ERIC</collection><collection>ERIC</collection><collection>ERIC (Legacy Platform)</collection><collection>ERIC( SilverPlatter )</collection><collection>ERIC</collection><collection>ERIC PlusText (Legacy Platform)</collection><collection>Education Resources Information Center (ERIC)</collection><collection>ERIC</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; Information Science Abstracts (LISA)</collection><jtitle>Information processing &amp; management</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Cho, Bong-Hyun</au><au>Lee, Changki</au><au>Lee, Gary Geunbae</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><ericid>EJ670083</ericid><atitle>Exploring term dependences in probabilistic information retrieval model</atitle><jtitle>Information processing 
&amp; management</jtitle><date>2003-07-01</date><risdate>2003</risdate><volume>39</volume><issue>4</issue><spage>505</spage><epage>519</epage><pages>505-519</pages><issn>0306-4573</issn><eissn>1873-5371</eissn><coden>IPMADK</coden><abstract>Most previous information retrieval (IR) models assume that terms of queries and documents are statistically independent from each another. However, this kind of conditional independence assumption is obviously and openly understood to be wrong, so we present a new method of incorporating term dependence in probabilistic retrieval model by adapting Bahadur–Lazarsfeld expansion (BLE) to compensate the weakness of the assumption. In this paper, we describe a theoretic process to apply BLE to the general probabilistic models and the state-of-the-art 2-Poisson model. Through the experiments on two standard document collections, HANTEC2.0 in Korean and WT10g in English, we demonstrate that incorporation of term dependences using the BLE significantly contribute to the improvement of performance in at least two different language IR systems.</abstract><cop>Oxford</cop><pub>Elsevier Ltd</pub><doi>10.1016/S0306-4573(02)00078-X</doi><tpages>15</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0306-4573
ispartof Information processing & management, 2003-07, Vol.39 (4), p.505-519
issn 0306-4573
1873-5371
language eng
recordid cdi_proquest_miscellaneous_57577849
source Elsevier ScienceDirect Journals Complete
subjects 2-Poisson model
Bahadur–Lazarsfeld expansion
Comparative Analysis
Computerized information storage and retrieval
Exact sciences and technology
Improvement
Information and communication sciences
Information Retrieval
Information retrieval systems
Information retrieval systems. Information and document management system
Information science. Documentation
Mathematical models
Online Searching
Poisson Probability Distribution
Probabilistic model
Probabilistic Models
Probabilistic retrieval
Probability
Relevance (Information Retrieval)
Sciences and techniques of general use
Searching
Studies
Subject Index Terms
Term dependence
Terms
title Exploring term dependences in probabilistic information retrieval model
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T19%3A58%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Exploring%20term%20dependences%20in%20probabilistic%20information%20retrieval%20model&rft.jtitle=Information%20processing%20&%20management&rft.au=Cho,%20Bong-Hyun&rft.date=2003-07-01&rft.volume=39&rft.issue=4&rft.spage=505&rft.epage=519&rft.pages=505-519&rft.issn=0306-4573&rft.eissn=1873-5371&rft.coden=IPMADK&rft_id=info:doi/10.1016/S0306-4573(02)00078-X&rft_dat=%3Cproquest_cross%3E432545591%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=194904815&rft_id=info:pmid/&rft_ericid=EJ670083&rft_els_id=S030645730200078X&rfr_iscdi=true