Hierarchical Clustering for Software Architecture Recovery
Gaining an architectural level understanding of a software system is important for many reasons. When the description of a system's architecture does not exist, attempts must be made to recover it. In recent years, researchers have explored the use of clustering for recovering a software system...
Published in: | IEEE transactions on software engineering 2007-11, Vol.33 (11), p.759-780 |
---|---|
Main authors: | Maqbool, O.; Babri, H.A. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 780 |
---|---|
container_issue | 11 |
container_start_page | 759 |
container_title | IEEE transactions on software engineering |
container_volume | 33 |
creator | Maqbool, O.; Babri, H.A. |
description | Gaining an architectural level understanding of a software system is important for many reasons. When the description of a system's architecture does not exist, attempts must be made to recover it. In recent years, researchers have explored the use of clustering for recovering a software system's architecture, given only its source code. The main contributions of this paper are given as follows. First, we review hierarchical clustering research in the context of software architecture recovery and modularization. Second, to employ clustering meaningfully, it is necessary to understand the peculiarities of the software domain, as well as the behavior of clustering measures and algorithms in this domain. To this end, we provide a detailed analysis of the behavior of various similarity and distance measures that may be employed for software clustering. Third, we analyze the clustering process of various well-known clustering algorithms by using multiple criteria, and we show how arbitrary decisions taken by these algorithms during clustering affect the quality of their results. Finally, we present an analysis of two recently proposed clustering algorithms, revealing close similarities in their apparently different clustering approaches. Experiments on four legacy software systems provide insight into the behavior of well-known clustering algorithms and their characteristics in the software domain. |
doi_str_mv | 10.1109/TSE.2007.70732 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0098-5589 |
ispartof | IEEE transactions on software engineering, 2007-11, Vol.33 (11), p.759-780 |
issn | 0098-5589; 1939-3520 |
language | eng |
recordid | cdi_proquest_miscellaneous_903621223 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithm design and analysis; Algorithms; and reengineering; arbitrary decisions; Architecture; Architecture (computers); architecture recovery; Cluster analysis; Clustering; Clustering algorithms; Computer architecture; Computer engineering; Computer programs; Decomposition; Digital Object Identifier; Electrical engineering; hierarchical clustering; Legacy systems; Partitioning algorithms; Recovery; Restructuring; Reverse engineering; Similarity; Software; Software algorithms; Software architecture; Software Engineering; Software measurement; Software systems; Studies; Taxonomy |
title | Hierarchical Clustering for Software Architecture Recovery |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T20%3A54%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Hierarchical%20Clustering%20for%20Software%20Architecture%20Recovery&rft.jtitle=IEEE%20transactions%20on%20software%20engineering&rft.au=Maqbool,%20O.&rft.date=2007-11-01&rft.volume=33&rft.issue=11&rft.spage=759&rft.epage=780&rft.pages=759-780&rft.issn=0098-5589&rft.eissn=1939-3520&rft.coden=IESEDJ&rft_id=info:doi/10.1109/TSE.2007.70732&rft_dat=%3Cproquest_RIE%3E1381338101%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=195576325&rft_id=info:pmid/&rft_ieee_id=4339232&rfr_iscdi=true |
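
The abstract in this record refers to hierarchical clustering driven by similarity measures, and to arbitrary decisions that algorithms take during clustering. The following is a minimal illustrative sketch of that general idea, not the procedure from the article: it assumes each source file is described by a set of features, uses Jaccard similarity with complete linkage, and breaks ties by enumeration order. The file names, feature names, and function names are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def complete_linkage(c1, c2, features):
    """Cluster similarity = similarity of the least similar cross pair."""
    return min(jaccard(features[x], features[y]) for x in c1 for y in c2)


def agglomerate(features):
    """Repeatedly merge the most similar pair of clusters.

    Returns the merge history as (members_a, members_b, similarity).
    When several pairs tie for the highest similarity, max() keeps the
    first one enumerated; that choice is arbitrary and can change the
    resulting hierarchy.
    """
    clusters = [frozenset([name]) for name in sorted(features)]
    merges = []
    while len(clusters) > 1:
        best = max(
            combinations(clusters, 2),
            key=lambda pair: complete_linkage(pair[0], pair[1], features),
        )
        sim = complete_linkage(best[0], best[1], features)
        merges.append((sorted(best[0]), sorted(best[1]), sim))
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return merges


# Hypothetical entities: each file is mapped to the names it references.
example_features = {
    "parser.c": {"read_token", "sym_table"},
    "lexer.c": {"read_token", "sym_table", "buffer"},
    "ui.c": {"draw", "event_loop"},
    "menu.c": {"draw", "event_loop"},
}

for merge in agglomerate(example_features):
    print(merge)
```

Other linkage rules or similarity measures could be swapped in by replacing `complete_linkage` or `jaccard`; the sketch only fixes one combination to keep the example short.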