Exploring term dependences in probabilistic information retrieval model
Most previous information retrieval (IR) models assume that the terms of queries and documents are statistically independent of one another. However, this conditional independence assumption is widely acknowledged to be wrong, so we present a new method of incorporating term depen...
Published in: | Information processing & management 2003-07, Vol.39 (4), p.505-519 |
---|---|
Main authors: | Cho, Bong-Hyun; Lee, Changki; Lee, Gary Geunbae |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 519 |
---|---|
container_issue | 4 |
container_start_page | 505 |
container_title | Information processing & management |
container_volume | 39 |
creator | Cho, Bong-Hyun; Lee, Changki; Lee, Gary Geunbae |
description | Most previous information retrieval (IR) models assume that the terms of queries and documents are statistically independent of one another. However, this conditional independence assumption is widely acknowledged to be wrong, so we present a new method of incorporating term dependence into the probabilistic retrieval model by adapting the Bahadur–Lazarsfeld expansion (BLE) to compensate for the weakness of the assumption. In this paper, we describe a theoretical process for applying the BLE to general probabilistic models and to the state-of-the-art 2-Poisson model. Through experiments on two standard document collections, HANTEC2.0 in Korean and WT10g in English, we demonstrate that incorporating term dependences using the BLE contributes significantly to improved performance in IR systems for at least two different languages. |
doi_str_mv | 10.1016/S0306-4573(02)00078-X |
format | Article |
fullrecord | record type: article; source: Elsevier ScienceDirect Journals Complete; publisher: Oxford: Elsevier Ltd; EISSN: 1873-5371; CODEN: IPMADK; ERIC ID: EJ670083; Elsevier ID: S030645730200078X; rights: 2002 Elsevier Science Ltd, 2003 INIST-CNRS, Copyright Pergamon Press Inc. Jul 2003; peer reviewed; total pages: 15; creation date: 2003-07-01; ERIC record: http://eric.ed.gov/ERICWebPortal/detail?accno=EJ670083; Pascal Francis record: http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=14793666 |
fulltext | fulltext |
identifier | ISSN: 0306-4573 |
ispartof | Information processing & management, 2003-07, Vol.39 (4), p.505-519 |
issn | 0306-4573; 1873-5371 |
language | eng |
recordid | cdi_proquest_miscellaneous_57577849 |
source | Elsevier ScienceDirect Journals Complete |
subjects | 2-Poisson model; Bahadur–Lazarsfeld expansion; Comparative Analysis; Computerized information storage and retrieval; Exact sciences and technology; Improvement; Information and communication sciences; Information Retrieval; Information retrieval systems; Information retrieval systems. Information and document management system; Information science. Documentation; Mathematical models; Online Searching; Poisson Probability Distribution; Probabilistic model; Probabilistic Models; Probabilistic retrieval; Probability; Relevance (Information Retrieval); Sciences and techniques of general use; Searching; Studies; Subject Index Terms; Term dependence; Terms |
title | Exploring term dependences in probabilistic information retrieval model |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T19%3A58%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Exploring%20term%20dependences%20in%20probabilistic%20information%20retrieval%20model&rft.jtitle=Information%20processing%20&%20management&rft.au=Cho,%20Bong-Hyun&rft.date=2003-07-01&rft.volume=39&rft.issue=4&rft.spage=505&rft.epage=519&rft.pages=505-519&rft.issn=0306-4573&rft.eissn=1873-5371&rft.coden=IPMADK&rft_id=info:doi/10.1016/S0306-4573(02)00078-X&rft_dat=%3Cproquest_cross%3E432545591%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=194904815&rft_id=info:pmid/&rft_ericid=EJ670083&rft_els_id=S030645730200078X&rfr_iscdi=true |