SRT-Rank: Ranking Keyword Query Results in Relational Databases Using the Strongly Related Tree

A top-k keyword query in relational databases returns k trees of tuples — where the tuples containing the query keywords are connected via primary key-foreign key relationships — in the order of relevance to the query. Existing works are classified into two categories: 1) the schema-based approach a...

Bibliographic Details
Published in:	IEICE Transactions on Information and Systems 2014, Vol.E97.D(9), pp.2398-2414
Main Authors:	KIM, In-Joong ; WHANG, Kyu-Young ; KWON, Hyuk-Yoon
Format:	Article
Language:	eng
Subjects:	keyword query ; lossless join ; relational database ; semantic relevancy ; strongly related tree
Online Access:	Full text

container_end_page	2414
container_issue	9
container_start_page	2398
container_title	IEICE Transactions on Information and Systems
container_volume	E97.D
creator	KIM, In-Joong ; WHANG, Kyu-Young ; KWON, Hyuk-Yoon
description	A top-k keyword query in relational databases returns k trees of tuples, in which the tuples containing the query keywords are connected via primary key-foreign key relationships, in the order of relevance to the query. Existing works are classified into two categories: 1) the schema-based approach and 2) the schema-free approach. We focus on the former, which utilizes database schema information for more effective ranking of the query results. Ranking measures used in existing works can be classified into two categories: 1) the size of the tree (i.e., the syntactic score) and 2) ranking measures, such as TF-IDF, borrowed from the information retrieval field. However, these measures do not take into account semantic relevancy among the relations containing the tuples in the query results. In this paper, we propose a new ranking method that ranks the query results by utilizing semantic relevancy among relations containing the tuples at the schema level. First, we propose a structure of semantically strongly related relations, which we call the strongly related tree (SRT). An SRT is a tree that maximally connects relations based on the lossless join property. Next, we propose a new ranking method, SRT-Rank, that ranks the query results by a new scoring function augmenting existing ones with the concept of the SRT. SRT-Rank is the first research effort that applies semantic relevancy among relations to ranking the results of keyword queries. To show the effectiveness of SRT-Rank, we perform experiments on synthetic and real datasets by augmenting the representative existing methods with SRT-Rank. Experimental results show that, compared with existing methods, SRT-Rank improves four quality measures, namely the mean normalized discounted cumulative gain (nDCG), the number of queries whose top-1 result is relevant to the query, the mean reciprocal rank, and the mean average precision, by up to 46.9%, 160.0%, 61.7%, and 63.8%, respectively. In addition, we show that the query performance of SRT-Rank is comparable to or better than that of existing methods.
doi_str_mv	10.1587/transinf.2014EDP7040
format	Article
fullrecord	<record><control><sourceid>jstage_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1587_transinf_2014EDP7040</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>article_transinf_E97_D_9_E97_D_2014EDP7040_article_char_en</sourcerecordid><originalsourceid>FETCH-LOGICAL-c493t-ac044fe5f3e42397b6927d5c8acdf8361f4a2c64b7c903a1d36615f3847e2c383</originalsourceid><addsrcrecordid>eNpNkM1OwzAQhC0EEqXwBhz8Ail27MQJN9SWH1GJ0p-ztXU2bUpwkO0K5e1J1VK47OxhvpFmCLnlbMCTTN0FB9ZXthzEjMvxaKqYZGekx5VMIi5Sfk56LOdplCUiviRX3m8Z41nMkx7R89kimoH9uKf7W9k1fcX2u3EFfd-ha-kM_a4Onla2e2sIVWOhpiMIsAKPni79ngkbpPPgGruu24MPC7pwiNfkooTa481R-2T5OF4Mn6PJ29PL8GESGZmLEIFhUpaYlAJlLHK1SvNYFYnJwBRl1lUoJcQmlStlciaAFyJNeefOpMLYiEz0iTzkGtd477DUX676BNdqzvR-JP07kv43UodND9jWB1jjCQIXKlPjHzTOlR7p_Kj_Ik5WswGn0Yof_MN6Qg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>SRT-Rank: Ranking Keyword Query Results in Relational Databases Using the Strongly Related Tree</title><source>J-STAGE Free</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>KIM, In-Joong ; WHANG, Kyu-Young ; KWON, Hyuk-Yoon</creator><creatorcontrib>KIM, In-Joong ; WHANG, Kyu-Young ; KWON, Hyuk-Yoon</creatorcontrib><description>A top-k keyword query in relational databases returns k trees of tuples — where the tuples containing the query keywords are connected via primary key-foreign key relationships — in the order of relevance to the query. Existing works are classified into two categories: 1) the schema-based approach and 2) the schema-free approach. We focus on the former utilizing database schema information for more effective ranking of the query results. 
Ranking measures used in existing works can be classified into two categories: 1) the size of the tree (i.e., the syntactic score) and 2) ranking measures, such as TF-IDF, borrowed from the information retrieval field. However, these measures do not take into account semantic relevancy among relations containing the tuples in the query results. In this paper, we propose a new ranking method that ranks the query results by utilizing semantic relevancy among relations containing the tuples at the schema level. First, we propose a structure of semantically strongly related relations, which we call the strongly related tree (SRT). An SRT is a tree that maximally connects relations based on the lossless join property. Next, we propose a new ranking method, SRT-Rank, that ranks the query results by a new scoring function augmenting existing ones with the concept of the SRT. SRT-Rank is the first research effort that applies semantic relevancy among relations to ranking the results of keyword queries. To show the effectiveness of SRT-Rank, we perform experiments on synthetic and real datasets by augmenting the representative existing methods with SRT-Rank. Experimental results show that, compared with existing methods, SRT-Rank improves performance in terms of four quality measures — the mean normalized discounted cumulative gain (nDCG), the number of queries whose top-1 result is relevant to the query, the mean reciprocal rank, and the mean average precision — by up to 46.9%, 160.0%, 61.7%, and 63.8%, respectively. 
In addition, we show that the query performance of SRT-Rank is comparable to or better than those of existing methods.</description><identifier>ISSN: 0916-8532</identifier><identifier>EISSN: 1745-1361</identifier><identifier>DOI: 10.1587/transinf.2014EDP7040</identifier><language>eng</language><publisher>The Institute of Electronics, Information and Communication Engineers</publisher><subject>keyword query ; lossless join ; relational database ; semantic relevancy ; strongly related tree</subject><ispartof>IEICE Transactions on Information and Systems, 2014, Vol.E97.D(9), pp.2398-2414</ispartof><rights>2014 The Institute of Electronics, Information and Communication Engineers</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c493t-ac044fe5f3e42397b6927d5c8acdf8361f4a2c64b7c903a1d36615f3847e2c383</citedby><cites>FETCH-LOGICAL-c493t-ac044fe5f3e42397b6927d5c8acdf8361f4a2c64b7c903a1d36615f3847e2c383</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,1882,4023,27922,27923,27924</link.rule.ids></links><search><creatorcontrib>KIM, In-Joong</creatorcontrib><creatorcontrib>WHANG, Kyu-Young</creatorcontrib><creatorcontrib>KWON, Hyuk-Yoon</creatorcontrib><title>SRT-Rank: Ranking Keyword Query Results in Relational Databases Using the Strongly Related Tree</title><title>IEICE Transactions on Information and Systems</title><addtitle>IEICE Trans. Inf. & Syst.</addtitle><description>A top-k keyword query in relational databases returns k trees of tuples — where the tuples containing the query keywords are connected via primary key-foreign key relationships — in the order of relevance to the query. Existing works are classified into two categories: 1) the schema-based approach and 2) the schema-free approach. 
We focus on the former utilizing database schema information for more effective ranking of the query results. Ranking measures used in existing works can be classified into two categories: 1) the size of the tree (i.e., the syntactic score) and 2) ranking measures, such as TF-IDF, borrowed from the information retrieval field. However, these measures do not take into account semantic relevancy among relations containing the tuples in the query results. In this paper, we propose a new ranking method that ranks the query results by utilizing semantic relevancy among relations containing the tuples at the schema level. First, we propose a structure of semantically strongly related relations, which we call the strongly related tree (SRT). An SRT is a tree that maximally connects relations based on the lossless join property. Next, we propose a new ranking method, SRT-Rank, that ranks the query results by a new scoring function augmenting existing ones with the concept of the SRT. SRT-Rank is the first research effort that applies semantic relevancy among relations to ranking the results of keyword queries. To show the effectiveness of SRT-Rank, we perform experiments on synthetic and real datasets by augmenting the representative existing methods with SRT-Rank. Experimental results show that, compared with existing methods, SRT-Rank improves performance in terms of four quality measures — the mean normalized discounted cumulative gain (nDCG), the number of queries whose top-1 result is relevant to the query, the mean reciprocal rank, and the mean average precision — by up to 46.9%, 160.0%, 61.7%, and 63.8%, respectively. 
In addition, we show that the query performance of SRT-Rank is comparable to or better than those of existing methods.</description><subject>keyword query</subject><subject>lossless join</subject><subject>relational database</subject><subject>semantic relevancy</subject><subject>strongly related tree</subject><issn>0916-8532</issn><issn>1745-1361</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2014</creationdate><recordtype>article</recordtype><recordid>eNpNkM1OwzAQhC0EEqXwBhz8Ail27MQJN9SWH1GJ0p-ztXU2bUpwkO0K5e1J1VK47OxhvpFmCLnlbMCTTN0FB9ZXthzEjMvxaKqYZGekx5VMIi5Sfk56LOdplCUiviRX3m8Z41nMkx7R89kimoH9uKf7W9k1fcX2u3EFfd-ha-kM_a4Onla2e2sIVWOhpiMIsAKPni79ngkbpPPgGruu24MPC7pwiNfkooTa481R-2T5OF4Mn6PJ29PL8GESGZmLEIFhUpaYlAJlLHK1SvNYFYnJwBRl1lUoJcQmlStlciaAFyJNeefOpMLYiEz0iTzkGtd477DUX676BNdqzvR-JP07kv43UodND9jWB1jjCQIXKlPjHzTOlR7p_Kj_Ik5WswGn0Yof_MN6Qg</recordid><startdate>2014</startdate><enddate>2014</enddate><creator>KIM, In-Joong</creator><creator>WHANG, Kyu-Young</creator><creator>KWON, Hyuk-Yoon</creator><general>The Institute of Electronics, Information and Communication Engineers</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>2014</creationdate><title>SRT-Rank: Ranking Keyword Query Results in Relational Databases Using the Strongly Related Tree</title><author>KIM, In-Joong ; WHANG, Kyu-Young ; KWON, Hyuk-Yoon</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c493t-ac044fe5f3e42397b6927d5c8acdf8361f4a2c64b7c903a1d36615f3847e2c383</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2014</creationdate><topic>keyword query</topic><topic>lossless join</topic><topic>relational database</topic><topic>semantic relevancy</topic><topic>strongly related tree</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>KIM, In-Joong</creatorcontrib><creatorcontrib>WHANG, 
Kyu-Young</creatorcontrib><creatorcontrib>KWON, Hyuk-Yoon</creatorcontrib><collection>CrossRef</collection><jtitle>IEICE Transactions on Information and Systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>KIM, In-Joong</au><au>WHANG, Kyu-Young</au><au>KWON, Hyuk-Yoon</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SRT-Rank: Ranking Keyword Query Results in Relational Databases Using the Strongly Related Tree</atitle><jtitle>IEICE Transactions on Information and Systems</jtitle><addtitle>IEICE Trans. Inf. & Syst.</addtitle><date>2014</date><risdate>2014</risdate><volume>E97.D</volume><issue>9</issue><spage>2398</spage><epage>2414</epage><pages>2398-2414</pages><issn>0916-8532</issn><eissn>1745-1361</eissn><abstract>A top-k keyword query in relational databases returns k trees of tuples — where the tuples containing the query keywords are connected via primary key-foreign key relationships — in the order of relevance to the query. Existing works are classified into two categories: 1) the schema-based approach and 2) the schema-free approach. We focus on the former utilizing database schema information for more effective ranking of the query results. Ranking measures used in existing works can be classified into two categories: 1) the size of the tree (i.e., the syntactic score) and 2) ranking measures, such as TF-IDF, borrowed from the information retrieval field. However, these measures do not take into account semantic relevancy among relations containing the tuples in the query results. In this paper, we propose a new ranking method that ranks the query results by utilizing semantic relevancy among relations containing the tuples at the schema level. First, we propose a structure of semantically strongly related relations, which we call the strongly related tree (SRT). 
An SRT is a tree that maximally connects relations based on the lossless join property. Next, we propose a new ranking method, SRT-Rank, that ranks the query results by a new scoring function augmenting existing ones with the concept of the SRT. SRT-Rank is the first research effort that applies semantic relevancy among relations to ranking the results of keyword queries. To show the effectiveness of SRT-Rank, we perform experiments on synthetic and real datasets by augmenting the representative existing methods with SRT-Rank. Experimental results show that, compared with existing methods, SRT-Rank improves performance in terms of four quality measures — the mean normalized discounted cumulative gain (nDCG), the number of queries whose top-1 result is relevant to the query, the mean reciprocal rank, and the mean average precision — by up to 46.9%, 160.0%, 61.7%, and 63.8%, respectively. In addition, we show that the query performance of SRT-Rank is comparable to or better than those of existing methods.</abstract><pub>The Institute of Electronics, Information and Communication Engineers</pub><doi>10.1587/transinf.2014EDP7040</doi><tpages>17</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0916-8532
ispartof	IEICE Transactions on Information and Systems, 2014, Vol.E97.D(9), pp.2398-2414
issn	0916-8532 ; 1745-1361
language	eng
recordid	cdi_crossref_primary_10_1587_transinf_2014EDP7040
source	J-STAGE Free; EZB-FREE-00999 freely available EZB journals
subjects	keyword query ; lossless join ; relational database ; semantic relevancy ; strongly related tree
title	SRT-Rank: Ranking Keyword Query Results in Relational Databases Using the Strongly Related Tree
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T13%3A23%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstage_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SRT-Rank:%20Ranking%20Keyword%20Query%20Results%20in%20Relational%20Databases%20Using%20the%20Strongly%20Related%20Tree&rft.jtitle=IEICE%20Transactions%20on%20Information%20and%20Systems&rft.au=KIM,%20In-Joong&rft.date=2014&rft.volume=E97.D&rft.issue=9&rft.spage=2398&rft.epage=2414&rft.pages=2398-2414&rft.issn=0916-8532&rft.eissn=1745-1361&rft_id=info:doi/10.1587/transinf.2014EDP7040&rft_dat=%3Cjstage_cross%3Earticle_transinf_E97_D_9_E97_D_2014EDP7040_article_char_en%3C/jstage_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true