An Experimental Approach to Detect Similar Web Pages Based on 3-Levels of Similarity Clues
It is hard to maintain web applications due to rapid changes and the proliferation of various techniques applied to web applications. Several approaches, such as clustering or refactoring web applications, have been suggested to improve their maintainability. The similarity measure is one of the pri...
Published in: | Journal of Information Science and Engineering 2011-11, Vol.27 (6), p.1787-1822 |
---|---|
Main authors: | 鄭羽盛(Woo-Sung Jung), 李銀珠(Eun-Joo Lee), 禹治水(Chi-Su Wu) |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1822 |
---|---|
container_issue | 6 |
container_start_page | 1787 |
container_title | Journal of Information Science and Engineering |
container_volume | 27 |
creator | 鄭羽盛(Woo-Sung Jung) 李銀珠(Eun-Joo Lee) 禹治水(Chi-Su Wu) |
description | It is hard to maintain web applications due to rapid changes and the proliferation of various techniques applied to web applications. Several approaches, such as clustering or refactoring web applications, have been suggested to improve their maintainability. The similarity measure is one of the principal criteria in these approaches. Existing studies on web similarity focused on semantic or context similarity. Most of the existing clone detection techniques concentrated on general applications, not web applications. In this paper, WSIM has been suggested to measure similarity in web applications, based on the usage degree of clues and two linking directions. The similarity clues include page relations, source and target entities, and parameters. WSIM can be classified in three levels and two directions. Six kinds of WSIMs are defined, and each WSIM has its own purpose. Finally, several experiments were conducted on simulated data and real open sources to validate the proposed WSIM. |
doi_str_mv | 10.6688/JISE.2011.27.6.1 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pasca</sourceid><recordid>TN_cdi_proquest_miscellaneous_1009818806</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><airiti_id>10162364_201111_201303130002_201303130002_1787_1822</airiti_id><sourcerecordid>1009818806</sourcerecordid><originalsourceid>FETCH-LOGICAL-a285t-7b0e8542b6c9b43e58ce76f86d819f21798e6599cb485da6bb8027a6f0c0d3663</originalsourceid><addsrcrecordid>eNpVkD1PwzAQhj2ARPnYGb0gsST4I7GdsZQCrSqBVBASi-U4F3DlJiVOEf33OGoZGE6nkx7d3fsgdElJKoRSN_PZcpoyQmnKZCpSeoRGlFCRMC6yE3QawooQJvIsG6H3cYOnPxvo3Bqa3ng83my61thP3Lf4DnqwPV66tfOmw29Q4mfzAQHfmgAVbhvMkwV8gw-4rf8w1-_wxG8hnKPj2vgAF4d-hl7vpy-Tx2Tx9DCbjBeJYSrvE1kSUHnGSmGLMuOQKwtS1EpUihY1o7JQIPKisGWm8sqIslSESSNqYknFheBn6Hq_Nz7-Fe_2eu2CBe9NA-02aEpIoahSZECvDqgJ1vi6M411QW9ietPtNMukYpLJyM33nHExj9Ordts1MYMeNA4W9WCX0qFxwmNFof8HKpXUVDHGfwEf63PA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1009818806</pqid></control><display><type>article</type><title>An Experimental Approach to Detect Similar Web Pages Based on 3-Levels of Similarity Clues</title><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>鄭羽盛(Woo-Sung Jung) ; 李銀珠(Eun-Joo Lee) ; 禹治水(Chi-Su Wu)</creator><creatorcontrib>鄭羽盛(Woo-Sung Jung) ; 李銀珠(Eun-Joo Lee) ; 禹治水(Chi-Su Wu)</creatorcontrib><description>It is hard to maintain web applications due to rapid changes and the proliferation of various techniques applied to web applications. Several approaches, such as clustering or refactoring web applications, have been suggested to improve their maintainability. The similarity measure is one of the principal criteria in these approaches. Existing studies on web similarity focused on semantic or context similarity. Most of the existing clone detection techniques concentrated on general applications, not web applications. 
In this paper, WSIM has been suggested to measure similarity in web applications, based on the usage degree of clues and two linking directions. The similarity clues include page relations, source and target entities, and parameters. WSIM can be classified in three levels and two directions. Six kinds of WSIMs are defined, and each WSIM has its own purpose. Finally, several experiments were conducted on simulated data and real open sources to validate the proposed WSIM.</description><identifier>ISSN: 1016-2364</identifier><identifier>DOI: 10.6688/JISE.2011.27.6.1</identifier><language>eng</language><publisher>Taipei: 社團法人中華民國計算語言學學會</publisher><subject>Applied sciences ; Clustering ; Computer science; control theory; systems ; Computer systems and distributed systems. User interface ; Criteria ; Exact sciences and technology ; Joining ; Linking ; Programming languages ; Semantics ; Similarity ; Simulation ; Software ; Software engineering ; Websites</subject><ispartof>Journal of Information Science and Engineering, 2011-11, Vol.27 (6), p.1787-1822</ispartof><rights>2015 INIST-CNRS</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=24782727$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>鄭羽盛(Woo-Sung Jung)</creatorcontrib><creatorcontrib>李銀珠(Eun-Joo Lee)</creatorcontrib><creatorcontrib>禹治水(Chi-Su Wu)</creatorcontrib><title>An Experimental Approach to Detect Similar Web Pages Based on 3-Levels of Similarity Clues</title><title>Journal of Information Science and Engineering</title><description>It is hard to maintain web applications due to rapid changes and the proliferation of various techniques applied to web 
applications. Several approaches, such as clustering or refactoring web applications, have been suggested to improve their maintainability. The similarity measure is one of the principal criteria in these approaches. Existing studies on web similarity focused on semantic or context similarity. Most of the existing clone detection techniques concentrated on general applications, not web applications. In this paper, WSIM has been suggested to measure similarity in web applications, based on the usage degree of clues and two linking directions. The similarity clues include page relations, source and target entities, and parameters. WSIM can be classified in three levels and two directions. Six kinds of WSIMs are defined, and each WSIM has its own purpose. Finally, several experiments were conducted on simulated data and real open sources to validate the proposed WSIM.</description><subject>Applied sciences</subject><subject>Clustering</subject><subject>Computer science; control theory; systems</subject><subject>Computer systems and distributed systems. 
User interface</subject><subject>Criteria</subject><subject>Exact sciences and technology</subject><subject>Joining</subject><subject>Linking</subject><subject>Programming languages</subject><subject>Semantics</subject><subject>Similarity</subject><subject>Simulation</subject><subject>Software</subject><subject>Software engineering</subject><subject>Websites</subject><issn>1016-2364</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2011</creationdate><recordtype>article</recordtype><recordid>eNpVkD1PwzAQhj2ARPnYGb0gsST4I7GdsZQCrSqBVBASi-U4F3DlJiVOEf33OGoZGE6nkx7d3fsgdElJKoRSN_PZcpoyQmnKZCpSeoRGlFCRMC6yE3QawooQJvIsG6H3cYOnPxvo3Bqa3ng83my61thP3Lf4DnqwPV66tfOmw29Q4mfzAQHfmgAVbhvMkwV8gw-4rf8w1-_wxG8hnKPj2vgAF4d-hl7vpy-Tx2Tx9DCbjBeJYSrvE1kSUHnGSmGLMuOQKwtS1EpUihY1o7JQIPKisGWm8sqIslSESSNqYknFheBn6Hq_Nz7-Fe_2eu2CBe9NA-02aEpIoahSZECvDqgJ1vi6M411QW9ietPtNMukYpLJyM33nHExj9Ordts1MYMeNA4W9WCX0qFxwmNFof8HKpXUVDHGfwEf63PA</recordid><startdate>20111101</startdate><enddate>20111101</enddate><creator>鄭羽盛(Woo-Sung Jung)</creator><creator>李銀珠(Eun-Joo Lee)</creator><creator>禹治水(Chi-Su Wu)</creator><general>社團法人中華民國計算語言學學會</general><general>Institute of Information Science, Academia sinica</general><scope>188</scope><scope>IQODW</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20111101</creationdate><title>An Experimental Approach to Detect Similar Web Pages Based on 3-Levels of Similarity Clues</title><author>鄭羽盛(Woo-Sung Jung) ; 李銀珠(Eun-Joo Lee) ; 禹治水(Chi-Su Wu)</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a285t-7b0e8542b6c9b43e58ce76f86d819f21798e6599cb485da6bb8027a6f0c0d3663</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Applied sciences</topic><topic>Clustering</topic><topic>Computer science; control theory; systems</topic><topic>Computer systems 
and distributed systems. User interface</topic><topic>Criteria</topic><topic>Exact sciences and technology</topic><topic>Joining</topic><topic>Linking</topic><topic>Programming languages</topic><topic>Semantics</topic><topic>Similarity</topic><topic>Simulation</topic><topic>Software</topic><topic>Software engineering</topic><topic>Websites</topic><toplevel>online_resources</toplevel><creatorcontrib>鄭羽盛(Woo-Sung Jung)</creatorcontrib><creatorcontrib>李銀珠(Eun-Joo Lee)</creatorcontrib><creatorcontrib>禹治水(Chi-Su Wu)</creatorcontrib><collection>Airiti Library</collection><collection>Pascal-Francis</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Journal of Information Science and Engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>鄭羽盛(Woo-Sung Jung)</au><au>李銀珠(Eun-Joo Lee)</au><au>禹治水(Chi-Su Wu)</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An Experimental Approach to Detect Similar Web Pages Based on 3-Levels of Similarity Clues</atitle><jtitle>Journal of Information Science and Engineering</jtitle><date>2011-11-01</date><risdate>2011</risdate><volume>27</volume><issue>6</issue><spage>1787</spage><epage>1822</epage><pages>1787-1822</pages><issn>1016-2364</issn><abstract>It is hard to maintain web applications due to rapid changes and the proliferation of various techniques applied to web applications. Several approaches, such as clustering or refactoring web applications, have been suggested to improve their maintainability. 
The similarity measure is one of the principal criteria in these approaches. Existing studies on web similarity focused on semantic or context similarity. Most of the existing clone detection techniques concentrated on general applications, not web applications. In this paper, WSIM has been suggested to measure similarity in web applications, based on the usage degree of clues and two linking directions. The similarity clues include page relations, source and target entities, and parameters. WSIM can be classified in three levels and two directions. Six kinds of WSIMs are defined, and each WSIM has its own purpose. Finally, several experiments were conducted on simulated data and real open sources to validate the proposed WSIM.</abstract><cop>Taipei</cop><pub>社團法人中華民國計算語言學學會</pub><doi>10.6688/JISE.2011.27.6.1</doi><tpages>36</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1016-2364 |
ispartof | Journal of Information Science and Engineering, 2011-11, Vol.27 (6), p.1787-1822 |
issn | 1016-2364 |
language | eng |
recordid | cdi_proquest_miscellaneous_1009818806 |
source | Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | Applied sciences ; Clustering ; Computer science; control theory; systems ; Computer systems and distributed systems. User interface ; Criteria ; Exact sciences and technology ; Joining ; Linking ; Programming languages ; Semantics ; Similarity ; Simulation ; Software ; Software engineering ; Websites |
title | An Experimental Approach to Detect Similar Web Pages Based on 3-Levels of Similarity Clues |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T10%3A54%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pasca&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20Experimental%20Approach%20to%20Detect%20Similar%20Web%20Pages%20Based%20on%203-Levels%20of%20Similarity%20Clues&rft.jtitle=Journal%20of%20Information%20Science%20and%20Engineering&rft.au=%E9%84%AD%E7%BE%BD%E7%9B%9B(Woo-Sung%20Jung)&rft.date=2011-11-01&rft.volume=27&rft.issue=6&rft.spage=1787&rft.epage=1822&rft.pages=1787-1822&rft.issn=1016-2364&rft_id=info:doi/10.6688/JISE.2011.27.6.1&rft_dat=%3Cproquest_pasca%3E1009818806%3C/proquest_pasca%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1009818806&rft_id=info:pmid/&rft_airiti_id=10162364_201111_201303130002_201303130002_1787_1822&rfr_iscdi=true |
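
The abstract in this record describes WSIM as a similarity measure built from link clues (page relations, source and target entities, and parameters), at three levels and in two linking directions. The record does not give the actual formula, so the following Python sketch is only an illustration of how such a measure might be organized. It is not the authors' definition: the use of Jaccard overlap, the `Link` fields, the reading of the three levels as cumulative clue detail, and all function names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One link between two pages (hypothetical model, not from the paper)."""
    source: str            # page that contains the link
    target: str            # page the link points to
    source_entity: str     # assumed: element that issues the link, e.g. "form"
    target_entity: str     # assumed: entity reached by the link, e.g. "handler"
    params: frozenset      # assumed: names of parameters passed along the link


def clues(links, page, level, direction):
    """Collect the clue set of `page` for one level (1-3) and one direction.

    Assumed levels: 1 = page relation only, 2 = plus source/target entities,
    3 = plus parameters. Directions: "out" = links leaving the page,
    "in" = links arriving at the page.
    """
    result = set()
    for link in links:
        if direction == "out" and link.source == page:
            other = link.target
        elif direction == "in" and link.target == page:
            other = link.source
        else:
            continue
        clue = (other,)
        if level >= 2:
            clue += (link.source_entity, link.target_entity)
        if level >= 3:
            clue += (link.params,)
        result.add(clue)
    return result


def jaccard(a, b):
    """Jaccard overlap of two sets; defined here as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(links, p, q, level, direction):
    """Similarity of pages p and q at one level and direction (sketch only)."""
    return jaccard(clues(links, p, level, direction),
                   clues(links, q, level, direction))


# Two pages that link to the same page through the same kind of entity,
# but pass different parameters.
links = [
    Link("a.php", "c.php", "form", "handler", frozenset({"id"})),
    Link("b.php", "c.php", "form", "handler", frozenset({"id", "page"})),
]

for level in (1, 2, 3):
    print(level, similarity(links, "a.php", "b.php", level, "out"))
```

Three levels times two directions yields six variants, which matches the count of six WSIMs mentioned in the abstract. How the paper actually combines or weights the clues cannot be inferred from this record.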