Connecting the Average and the Non-Average: A Study of the Rates of Fault Detection in Testing WS-BPEL Services

Many existing studies measure the effectiveness of test case prioritization techniques using the average performance on a set of test suites. However, in each regression test session, a real-world developer may only afford to apply one prioritization technique to one test suite to test a service once, even if this application results in an adverse scenario such that the actual performance in this test session is far below the average result achievable by the same technique over the same test suite for the same application. This indicates that assessing the average performance of such a technique cannot provide adequate confidence for developers to apply the technique. The authors ask two questions: To what extent does the effectiveness of prioritization techniques in average scenarios correlate with that in adverse scenarios? Moreover, to what extent may a design factor of this class of techniques affect the effectiveness of prioritization in different types of scenarios? To the best of their knowledge, the authors report in this paper the first controlled experiment to study these two new research questions through more than 300 million APFD and HMFD data points produced from 19 techniques, eight WS-BPEL benchmarks and 1000 test suites prioritized by each technique 1000 times. A main result reveals a strong and linear correlation between the effectiveness in the average scenarios and that in the adverse scenarios. Another interesting result is that many pairs of levels of the same design factors significantly change their relative strengths of being more effective within the same pairs in handling a wide spectrum of prioritized test suites produced by the same techniques over the same test suite in testing the same benchmarks, and the results obtained from the average scenarios are more similar to those of the more effective end than otherwise. This work provides the first piece of strong evidence for the research community to re-assess how they develop and validate their techniques in the average scenarios and beyond.
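
For orientation (this illustration is not taken from the paper itself), the sketch below computes the two metrics named in the abstract for a single prioritized test suite: APFD in its conventional textbook form, and HMFD under the assumption that it denotes the harmonic mean of the first-detection positions (lower is better). The function names and example data are invented for illustration, and the paper may use different variants.

# Minimal Python sketch, assuming the conventional APFD definition and an
# assumed harmonic-mean form for HMFD.

def apfd(first_detect_positions, num_tests):
    # APFD = 1 - sum(TF_i) / (n * m) + 1 / (2n), where TF_i is the 1-based
    # position of the first test case that reveals fault i, n is the suite
    # size, and m is the number of faults detected.
    m = len(first_detect_positions)
    n = num_tests
    return 1.0 - sum(first_detect_positions) / (n * m) + 1.0 / (2 * n)

def hmfd(first_detect_positions):
    # Assumed form: harmonic mean of the positions TF_i; smaller values mean
    # faults are exposed earlier in the prioritized order.
    m = len(first_detect_positions)
    return m / sum(1.0 / tf for tf in first_detect_positions)

# Example: a 10-test suite whose prioritized order first exposes three faults
# at positions 2, 3 and 7.
print(apfd([2, 3, 7], num_tests=10))   # 0.65
print(hmfd([2, 3, 7]))                 # about 3.07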


Bibliographic Details
Published in: International Journal of Web Services Research, 2015-07, Vol. 12 (3), pp. 1-24
Main Authors: Jia, Changjiang; Mei, Lijun; Chan, W.K.; Yu, Yuen Tak; Tse, T.H.
Format: Article
Language: English
Subjects: Benchmarks; Business Process Execution Language; Confidence intervals; Correlation; Data points; Design factors; Developers; Effectiveness; Fault detection; Joining; Questions; Web services
Online Access: Full text
DOI: 10.4018/IJWSR.2015070101
ISSN: 1545-7362
EISSN: 1546-5004
Publisher: IGI Global (Hershey)