Entropy-based clustering for improving document re-ranking

Document re-ranking sits between initial retrieval and query expansion in an information retrieval system. In this paper, we propose an entropy-based clustering approach for document re-ranking. The value of within-cluster entropy determines whether two classes should be merged, and the value of betwee...
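
The abstract says that within-cluster entropy decides whether two classes are merged, but this record does not give the paper's formulas. The following is therefore only a minimal sketch under stated assumptions, not the authors' method: it assumes entropy is the Shannon entropy of a cluster's pooled term distribution, and it replaces the between-cluster entropy criterion for the number of clusters with a simple hypothetical threshold, `max_increase`. The names `term_entropy` and `entropy_merge` are illustrative.

```python
import math
from collections import Counter
from itertools import combinations


def term_entropy(docs):
    """Shannon entropy (in bits) of the pooled term distribution of a cluster.

    Assumption: each document is a list of term strings.
    """
    counts = Counter(term for doc in docs for term in doc)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_merge(docs, max_increase=0.5):
    """Greedy agglomerative clustering driven by within-cluster entropy.

    max_increase is a hypothetical stopping threshold standing in for the
    between-cluster criterion: two clusters are merged only if the entropy of
    the merged cluster exceeds the size-weighted mean of their separate
    entropies by at most this amount.
    """
    clusters = [[doc] for doc in docs]  # start with one cluster per document
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            a, b = clusters[i], clusters[j]
            merged = term_entropy(a + b)
            separate = (len(a) * term_entropy(a) + len(b) * term_entropy(b)) / (len(a) + len(b))
            increase = merged - separate
            if best is None or increase < best[0]:
                best = (increase, i, j)
        increase, i, j = best
        if increase > max_increase:
            break  # every remaining merge would mix dissimilar clusters
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


if __name__ == "__main__":
    sample = [
        ["entropy", "cluster"],
        ["entropy", "cluster"],
        ["helium", "gas"],
        ["helium", "gas"],
    ]
    for cluster in entropy_merge(sample):
        print(cluster)
```

Merging two clusters with the same vocabulary leaves the pooled entropy unchanged, while merging clusters with disjoint vocabularies raises it, so the threshold keeps topically different documents apart. How a cluster is then chosen to build the pseudo-labeled documents is not described in this record and is left out of the sketch.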

Bibliographic details
Main authors: Chong Teng, Yanxiang He, Donghong Ji, Cheng zhou, Yixuan Geng, Shu Chen
Format: Conference proceeding
Language: eng
Subjects:
Online access: Order full text
container_end_page 666
container_issue
container_start_page 662
container_title
container_volume 3
creator Chong Teng
Yanxiang He
Donghong Ji
Cheng zhou
Yixuan Geng
Shu Chen
description Document re-ranking sits between initial retrieval and query expansion in an information retrieval system. In this paper, we propose an entropy-based clustering approach for document re-ranking. The value of within-cluster entropy determines whether two classes should be merged, and the value of between-cluster entropy determines how many clusters are reasonable. The next step is to find a suitable cluster in the clustering result from which to construct pseudo-labeled documents, and then to re-rank documents as in our previous method. We focus on the clustering strategy for documents after initial retrieval. Experiments with NTCIR-5 data show that the approach can improve the performance of initial retrieval and helps improve the quality of document re-ranking.
doi_str_mv 10.1109/ICICISYS.2009.5358089
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5358089</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5358089</ieee_id><sourcerecordid>5358089</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-e319367e367ec5247d6b6f08cf6fd5f0087601e2c6ccd8364c14f236dae5aa2d3</originalsourceid><addsrcrecordid>eNpNT9tKw0AUXJGCWvMFIuQHEs_ed32TUGuh4EP1waey3T0r0ebCJhX696bYB2c4DDMMB4aQewolpWAfVtXEzcemZAC2lFwaMPaCZFYbKpgQQnNDL_97KeiM3JzqFiS1cEWyYfiCCUJyyvg1eVy0Y-r6Y7FzA4bc7w_DiKluP_PYpbxu-tT9nFzo_KHBdswTFsm131N2S2bR7QfMzjon78-Lt-qlWL8uV9XTuqiplmOBnFquNJ7OSyZ0UDsVwfioYpARwGgFFJlX3gfDlfBURMZVcCidY4HPyd3f3xoRt32qG5eO2_N8_gtpZkzg</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Entropy-based clustering for improving document re-ranking</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Chong Teng ; Yanxiang He ; Donghong Ji ; Cheng zhou ; Yixuan Geng ; Shu Chen</creator><creatorcontrib>Chong Teng ; Yanxiang He ; Donghong Ji ; Cheng zhou ; Yixuan Geng ; Shu Chen</creatorcontrib><description>Document re-ranking locates between initial retrieval and query expansion in information retrieval system. In this paper, we propose entropy-based clustering approach for document re-ranking. The value of within-cluster entropy determines whether two classes should be merged, and the value of between-cluster entropy determines how many clusters are reasonable. What to do next is finding a suitable cluster from clustering result to construct pseudo labeled document, and conduct document re-ranking as our previous method. We focus clustering strategy for documents after initial retrieval. 
Experiment with NTCIR-5 data show that the approach can improve the performance of initial retrieval, and it is helpful for improving the quality of document re-ranking.</description><identifier>ISBN: 9781424447541</identifier><identifier>ISBN: 1424447542</identifier><identifier>EISBN: 9781424447381</identifier><identifier>EISBN: 1424447380</identifier><identifier>DOI: 10.1109/ICICISYS.2009.5358089</identifier><identifier>LCCN: 2009905190</identifier><language>eng</language><publisher>IEEE</publisher><subject>between-cluster entropy ; Clustering ; component ; Concrete ; Document re-ranking ; Entropy ; Helium ; Information retrieval ; Large-scale systems ; Mathematics ; Statistics ; Text analysis ; Thesauri ; Vocabulary ; within-cluster entropy</subject><ispartof>2009 IEEE International Conference on Intelligent Computing and Intelligent Systems, 2009, Vol.3, p.662-666</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5358089$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2052,27902,54895</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5358089$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Chong Teng</creatorcontrib><creatorcontrib>Yanxiang He</creatorcontrib><creatorcontrib>Donghong Ji</creatorcontrib><creatorcontrib>Cheng zhou</creatorcontrib><creatorcontrib>Yixuan Geng</creatorcontrib><creatorcontrib>Shu Chen</creatorcontrib><title>Entropy-based clustering for improving document re-ranking</title><title>2009 IEEE International Conference on Intelligent Computing and Intelligent Systems</title><addtitle>ICICISYS</addtitle><description>Document re-ranking locates between initial retrieval and query expansion in information retrieval system. 
In this paper, we propose entropy-based clustering approach for document re-ranking. The value of within-cluster entropy determines whether two classes should be merged, and the value of between-cluster entropy determines how many clusters are reasonable. What to do next is finding a suitable cluster from clustering result to construct pseudo labeled document, and conduct document re-ranking as our previous method. We focus clustering strategy for documents after initial retrieval. Experiment with NTCIR-5 data show that the approach can improve the performance of initial retrieval, and it is helpful for improving the quality of document re-ranking.</description><subject>between-cluster entropy</subject><subject>Clustering</subject><subject>component</subject><subject>Concrete</subject><subject>Document re-ranking</subject><subject>Entropy</subject><subject>Helium</subject><subject>Information retrieval</subject><subject>Large-scale systems</subject><subject>Mathematics</subject><subject>Statistics</subject><subject>Text analysis</subject><subject>Thesauri</subject><subject>Vocabulary</subject><subject>within-cluster entropy</subject><isbn>9781424447541</isbn><isbn>1424447542</isbn><isbn>9781424447381</isbn><isbn>1424447380</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2009</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNpNT9tKw0AUXJGCWvMFIuQHEs_ed32TUGuh4EP1waey3T0r0ebCJhX696bYB2c4DDMMB4aQewolpWAfVtXEzcemZAC2lFwaMPaCZFYbKpgQQnNDL_97KeiM3JzqFiS1cEWyYfiCCUJyyvg1eVy0Y-r6Y7FzA4bc7w_DiKluP_PYpbxu-tT9nFzo_KHBdswTFsm131N2S2bR7QfMzjon78-Lt-qlWL8uV9XTuqiplmOBnFquNJ7OSyZ0UDsVwfioYpARwGgFFJlX3gfDlfBURMZVcCidY4HPyd3f3xoRt32qG5eO2_N8_gtpZkzg</recordid><startdate>200911</startdate><enddate>200911</enddate><creator>Chong Teng</creator><creator>Yanxiang He</creator><creator>Donghong Ji</creator><creator>Cheng zhou</creator><creator>Yixuan Geng</creator><creator>Shu 
Chen</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200911</creationdate><title>Entropy-based clustering for improving document re-ranking</title><author>Chong Teng ; Yanxiang He ; Donghong Ji ; Cheng zhou ; Yixuan Geng ; Shu Chen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-e319367e367ec5247d6b6f08cf6fd5f0087601e2c6ccd8364c14f236dae5aa2d3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2009</creationdate><topic>between-cluster entropy</topic><topic>Clustering</topic><topic>component</topic><topic>Concrete</topic><topic>Document re-ranking</topic><topic>Entropy</topic><topic>Helium</topic><topic>Information retrieval</topic><topic>Large-scale systems</topic><topic>Mathematics</topic><topic>Statistics</topic><topic>Text analysis</topic><topic>Thesauri</topic><topic>Vocabulary</topic><topic>within-cluster entropy</topic><toplevel>online_resources</toplevel><creatorcontrib>Chong Teng</creatorcontrib><creatorcontrib>Yanxiang He</creatorcontrib><creatorcontrib>Donghong Ji</creatorcontrib><creatorcontrib>Cheng zhou</creatorcontrib><creatorcontrib>Yixuan Geng</creatorcontrib><creatorcontrib>Shu Chen</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Chong Teng</au><au>Yanxiang He</au><au>Donghong Ji</au><au>Cheng zhou</au><au>Yixuan Geng</au><au>Shu 
Chen</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Entropy-based clustering for improving document re-ranking</atitle><btitle>2009 IEEE International Conference on Intelligent Computing and Intelligent Systems</btitle><stitle>ICICISYS</stitle><date>2009-11</date><risdate>2009</risdate><volume>3</volume><spage>662</spage><epage>666</epage><pages>662-666</pages><isbn>9781424447541</isbn><isbn>1424447542</isbn><eisbn>9781424447381</eisbn><eisbn>1424447380</eisbn><abstract>Document re-ranking locates between initial retrieval and query expansion in information retrieval system. In this paper, we propose entropy-based clustering approach for document re-ranking. The value of within-cluster entropy determines whether two classes should be merged, and the value of between-cluster entropy determines how many clusters are reasonable. What to do next is finding a suitable cluster from clustering result to construct pseudo labeled document, and conduct document re-ranking as our previous method. We focus clustering strategy for documents after initial retrieval. Experiment with NTCIR-5 data show that the approach can improve the performance of initial retrieval, and it is helpful for improving the quality of document re-ranking.</abstract><pub>IEEE</pub><doi>10.1109/ICICISYS.2009.5358089</doi><tpages>5</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISBN: 9781424447541
ispartof 2009 IEEE International Conference on Intelligent Computing and Intelligent Systems, 2009, Vol.3, p.662-666
issn
language eng
recordid cdi_ieee_primary_5358089
source IEEE Electronic Library (IEL) Conference Proceedings
subjects between-cluster entropy
Clustering
component
Concrete
Document re-ranking
Entropy
Helium
Information retrieval
Large-scale systems
Mathematics
Statistics
Text analysis
Thesauri
Vocabulary
within-cluster entropy
title Entropy-based clustering for improving document re-ranking
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T18%3A36%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Entropy-based%20clustering%20for%20improving%20document%20re-ranking&rft.btitle=2009%20IEEE%20International%20Conference%20on%20Intelligent%20Computing%20and%20Intelligent%20Systems&rft.au=Chong%20Teng&rft.date=2009-11&rft.volume=3&rft.spage=662&rft.epage=666&rft.pages=662-666&rft.isbn=9781424447541&rft.isbn_list=1424447542&rft_id=info:doi/10.1109/ICICISYS.2009.5358089&rft_dat=%3Cieee_6IE%3E5358089%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781424447381&rft.eisbn_list=1424447380&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5358089&rfr_iscdi=true