Information discovery across multiple streams

In this paper we address the issue of continuous keyword queries on multiple textual streams and explore techniques for extracting useful information from them. The paper represents, to our best knowledge, the first approach that performs keyword search on a multiplicity of textual streams. The scen...

Detailed description

Bibliographic details
Published in:	Information sciences 2009-09, Vol.179 (19), p.3268-3285
Main authors:	Hristidis, Vagelis; Valdivia, Oscar; Vlachos, Michail; Yu, Philip S.
Format:	Article
Language:	eng
Subjects:	Continuous queries; Correlation; Keyword search; Real-time search; Streams
Online access:	Full text

container_end_page	3285
container_issue	19
container_start_page	3268
container_title	Information sciences
container_volume	179
creator	Hristidis, Vagelis ; Valdivia, Oscar ; Vlachos, Michail ; Yu, Philip S.
description	In this paper we address the issue of continuous keyword queries on multiple textual streams and explore techniques for extracting useful information from them. The paper represents, to our best knowledge, the first approach that performs keyword search on a multiplicity of textual streams. The scenario that we consider is quite intuitive; let’s assume that a research or financial analyst is searching for information on a topic, continuously polling data from multiple (and possibly heterogeneous) text streams, such as RSS feeds, blogs, etc. The topic of interest can be described with the aid of several keywords. Current filtering approaches would just identify single text streams containing some of the keywords. However, it would be more flexible and powerful to search across multiple streams, which may collectively answer the analyst’s question. We present such model that takes in consideration the continuous flow of text in streams and uses efficient pipelined algorithms such that results are output as soon as they are available. The proposed model is evaluated analytically and experimentally, where the Enron dataset and a variety of blog datasets are used for our experiments.
doi_str_mv	10.1016/j.ins.2009.06.008
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_36498785</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0020025509002515</els_id><sourcerecordid>36498785</sourcerecordid><originalsourceid>FETCH-LOGICAL-c328t-f77057267b62e4013415e9e3373eea29316de48c39b5fd59e9749ffa6131abe83</originalsourceid><addsrcrecordid>eNp9kDtPwzAUhS0EEqXwA9gysSVc2_FLTKjiUakSC8yW61xLrvIodlqp_56UMDPd5XxH536E3FOoKFD5uKtinysGYCqQFYC-IAuqFSslM_SSLAAYlMCEuCY3Oe8AoFZSLki57sOQOjfGoS-amP1wxHQqnE9DzkV3aMe4b7HIY0LX5VtyFVyb8e7vLsnX68vn6r3cfLytV8-b0nOmxzIoBUIxqbaSYQ2U11SgQc4VR3TMcCobrLXnZitCIwwaVZsQnKScui1qviQPc-8-Dd8HzKPtpmnYtq7H4ZAtl7XRSospSOfg796Ewe5T7Fw6WQr2LMbu7CTGnsVYkHYSMzFPM4PTB8eIyWYfsffYxIR-tM0Q_6F_AD95avY</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>36498785</pqid></control><display><type>article</type><title>Information discovery across multiple streams</title><source>Elsevier ScienceDirect Journals Complete</source><creator>Hristidis, Vagelis ; Valdivia, Oscar ; Vlachos, Michail ; Yu, Philip S.</creator><creatorcontrib>Hristidis, Vagelis ; Valdivia, Oscar ; Vlachos, Michail ; Yu, Philip S.</creatorcontrib><description>In this paper we address the issue of continuous keyword queries on multiple textual streams and explore techniques for extracting useful information from them. The paper represents, to our best knowledge, the first approach that performs keyword search on a multiplicity of textual streams. The scenario that we consider is quite intuitive; let’s assume that a research or financial analyst is searching for information on a topic, continuously polling data from multiple (and possibly heterogeneous) text streams, such as RSS feeds, blogs, etc. The topic of interest can be described with the aid of several keywords. Current filtering approaches would just identify single text streams containing some of the keywords. 
However, it would be more flexible and powerful to search across multiple streams, which may collectively answer the analyst’s question. We present such model that takes in consideration the continuous flow of text in streams and uses efficient pipelined algorithms such that results are output as soon as they are available. The proposed model is evaluated analytically and experimentally, where the Enron dataset and a variety of blog datasets are used for our experiments.</description><identifier>ISSN: 0020-0255</identifier><identifier>EISSN: 1872-6291</identifier><identifier>DOI: 10.1016/j.ins.2009.06.008</identifier><language>eng</language><publisher>Elsevier Inc</publisher><subject>Continuous queries ; Correlation ; Keyword search ; Real-time search ; Streams</subject><ispartof>Information sciences, 2009-09, Vol.179 (19), p.3268-3285</ispartof><rights>2009</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c328t-f77057267b62e4013415e9e3373eea29316de48c39b5fd59e9749ffa6131abe83</citedby><cites>FETCH-LOGICAL-c328t-f77057267b62e4013415e9e3373eea29316de48c39b5fd59e9749ffa6131abe83</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.ins.2009.06.008$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids></links><search><creatorcontrib>Hristidis, Vagelis</creatorcontrib><creatorcontrib>Valdivia, Oscar</creatorcontrib><creatorcontrib>Vlachos, Michail</creatorcontrib><creatorcontrib>Yu, Philip S.</creatorcontrib><title>Information discovery across multiple streams</title><title>Information sciences</title><description>In this paper we address the issue of continuous keyword queries on multiple textual streams and explore techniques for extracting useful information from them. 
The paper represents, to our best knowledge, the first approach that performs keyword search on a multiplicity of textual streams. The scenario that we consider is quite intuitive; let’s assume that a research or financial analyst is searching for information on a topic, continuously polling data from multiple (and possibly heterogeneous) text streams, such as RSS feeds, blogs, etc. The topic of interest can be described with the aid of several keywords. Current filtering approaches would just identify single text streams containing some of the keywords. However, it would be more flexible and powerful to search across multiple streams, which may collectively answer the analyst’s question. We present such model that takes in consideration the continuous flow of text in streams and uses efficient pipelined algorithms such that results are output as soon as they are available. The proposed model is evaluated analytically and experimentally, where the Enron dataset and a variety of blog datasets are used for our experiments.</description><subject>Continuous queries</subject><subject>Correlation</subject><subject>Keyword search</subject><subject>Real-time search</subject><subject>Streams</subject><issn>0020-0255</issn><issn>1872-6291</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><recordid>eNp9kDtPwzAUhS0EEqXwA9gysSVc2_FLTKjiUakSC8yW61xLrvIodlqp_56UMDPd5XxH536E3FOoKFD5uKtinysGYCqQFYC-IAuqFSslM_SSLAAYlMCEuCY3Oe8AoFZSLki57sOQOjfGoS-amP1wxHQqnE9DzkV3aMe4b7HIY0LX5VtyFVyb8e7vLsnX68vn6r3cfLytV8-b0nOmxzIoBUIxqbaSYQ2U11SgQc4VR3TMcCobrLXnZitCIwwaVZsQnKScui1qviQPc-8-Dd8HzKPtpmnYtq7H4ZAtl7XRSospSOfg796Ewe5T7Fw6WQr2LMbu7CTGnsVYkHYSMzFPM4PTB8eIyWYfsffYxIR-tM0Q_6F_AD95avY</recordid><startdate>20090909</startdate><enddate>20090909</enddate><creator>Hristidis, Vagelis</creator><creator>Valdivia, Oscar</creator><creator>Vlachos, Michail</creator><creator>Yu, Philip S.</creator><general>Elsevier 
Inc</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20090909</creationdate><title>Information discovery across multiple streams</title><author>Hristidis, Vagelis ; Valdivia, Oscar ; Vlachos, Michail ; Yu, Philip S.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c328t-f77057267b62e4013415e9e3373eea29316de48c39b5fd59e9749ffa6131abe83</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Continuous queries</topic><topic>Correlation</topic><topic>Keyword search</topic><topic>Real-time search</topic><topic>Streams</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hristidis, Vagelis</creatorcontrib><creatorcontrib>Valdivia, Oscar</creatorcontrib><creatorcontrib>Vlachos, Michail</creatorcontrib><creatorcontrib>Yu, Philip S.</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Information sciences</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hristidis, Vagelis</au><au>Valdivia, Oscar</au><au>Vlachos, Michail</au><au>Yu, Philip S.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Information discovery across multiple streams</atitle><jtitle>Information 
sciences</jtitle><date>2009-09-09</date><risdate>2009</risdate><volume>179</volume><issue>19</issue><spage>3268</spage><epage>3285</epage><pages>3268-3285</pages><issn>0020-0255</issn><eissn>1872-6291</eissn><abstract>In this paper we address the issue of continuous keyword queries on multiple textual streams and explore techniques for extracting useful information from them. The paper represents, to our best knowledge, the first approach that performs keyword search on a multiplicity of textual streams. The scenario that we consider is quite intuitive; let’s assume that a research or financial analyst is searching for information on a topic, continuously polling data from multiple (and possibly heterogeneous) text streams, such as RSS feeds, blogs, etc. The topic of interest can be described with the aid of several keywords. Current filtering approaches would just identify single text streams containing some of the keywords. However, it would be more flexible and powerful to search across multiple streams, which may collectively answer the analyst’s question. We present such model that takes in consideration the continuous flow of text in streams and uses efficient pipelined algorithms such that results are output as soon as they are available. The proposed model is evaluated analytically and experimentally, where the Enron dataset and a variety of blog datasets are used for our experiments.</abstract><pub>Elsevier Inc</pub><doi>10.1016/j.ins.2009.06.008</doi><tpages>18</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0020-0255
ispartof	Information sciences, 2009-09, Vol.179 (19), p.3268-3285
issn	0020-0255 1872-6291
language	eng
recordid	cdi_proquest_miscellaneous_36498785
source	Elsevier ScienceDirect Journals Complete
subjects	Continuous queries ; Correlation ; Keyword search ; Real-time search ; Streams
title	Information discovery across multiple streams
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T22%3A54%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Information%20discovery%20across%20multiple%20streams&rft.jtitle=Information%20sciences&rft.au=Hristidis,%20Vagelis&rft.date=2009-09-09&rft.volume=179&rft.issue=19&rft.spage=3268&rft.epage=3285&rft.pages=3268-3285&rft.issn=0020-0255&rft.eissn=1872-6291&rft_id=info:doi/10.1016/j.ins.2009.06.008&rft_dat=%3Cproquest_cross%3E36498785%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=36498785&rft_id=info:pmid/&rft_els_id=S0020025509002515&rfr_iscdi=true