Hierarchical Clustering for Software Architecture Recovery

Gaining an architectural level understanding of a software system is important for many reasons. When the description of a system's architecture does not exist, attempts must be made to recover it. In recent years, researchers have explored the use of clustering for recovering a software system...

Bibliographic details
Published in: IEEE transactions on software engineering 2007-11, Vol.33 (11), p.759-780
Main authors: Maqbool, O., Babri, H.A.
Format: Article
Language: English
container_end_page 780
container_issue 11
container_start_page 759
container_title IEEE transactions on software engineering
container_volume 33
creator Maqbool, O.
Babri, H.A.
description Gaining an architectural level understanding of a software system is important for many reasons. When the description of a system's architecture does not exist, attempts must be made to recover it. In recent years, researchers have explored the use of clustering for recovering a software system's architecture, given only its source code. The main contributions of this paper are given as follows. First, we review hierarchical clustering research in the context of software architecture recovery and modularization. Second, to employ clustering meaningfully, it is necessary to understand the peculiarities of the software domain, as well as the behavior of clustering measures and algorithms in this domain. To this end, we provide a detailed analysis of the behavior of various similarity and distance measures that may be employed for software clustering. Third, we analyze the clustering process of various well-known clustering algorithms by using multiple criteria, and we show how arbitrary decisions taken by these algorithms during clustering affect the quality of their results. Finally, we present an analysis of two recently proposed clustering algorithms, revealing close similarities in their apparently different clustering approaches. Experiments on four legacy software systems provide insight into the behavior of well-known clustering algorithms and their characteristics in the software domain.
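The abstract refers to similarity measures, hierarchical clustering algorithms, and arbitrary decisions taken during clustering. The sketch below illustrates those ideas in general terms only. It is not the authors' method: the Jaccard measure and complete linkage are chosen here as common examples, and the file names and feature sets are hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared features divided by all features present."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def complete_linkage(c1, c2, features):
    """Similarity of two clusters is that of their least similar pair of members."""
    return min(jaccard(features[x], features[y]) for x in c1 for y in c2)


def agglomerate(features):
    """Repeatedly merge the most similar pair of clusters until one remains.

    Returns the merge history as (cluster_a, cluster_b, similarity) tuples.
    When several pairs are equally similar, max() keeps the first one it
    meets, so the outcome depends on iteration order: an arbitrary decision.
    """
    clusters = [frozenset([name]) for name in sorted(features)]
    history = []
    while len(clusters) > 1:
        a, b = max(
            combinations(clusters, 2),
            key=lambda pair: complete_linkage(pair[0], pair[1], features),
        )
        history.append((a, b, complete_linkage(a, b, features)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return history


# Hypothetical entities (source files) described by the identifiers they use.
features = {
    "parser.c": {"token_t", "next_token", "error"},
    "lexer.c": {"token_t", "next_token", "read_char"},
    "log.c": {"error", "write_line"},
    "report.c": {"write_line", "format_row"},
}

history = agglomerate(features)
for a, b, sim in history:
    print(sorted(a), "+", sorted(b), f"similarity={sim:.2f}")
```

Reading the merge history from first to last entry gives the hierarchy bottom-up; cutting it at a chosen similarity threshold yields one candidate decomposition into subsystems.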
doi_str_mv 10.1109/TSE.2007.70732
format Article
coden IESEDJ
eissn 1939-3520
publisher New York: IEEE
rights Copyright IEEE Computer Society Nov 2007
toplevel peer_reviewed
online_resources
tpages 22
ieee_id 4339232
linktohtml https://ieeexplore.ieee.org/document/4339232
pqid 195576325
fulltext fulltext_linktorsrc
identifier ISSN: 0098-5589
ispartof IEEE transactions on software engineering, 2007-11, Vol.33 (11), p.759-780
issn 0098-5589
1939-3520
language eng
recordid cdi_proquest_miscellaneous_903621223
source IEEE Electronic Library (IEL)
subjects Algorithm design and analysis
Algorithms
and reengineering
arbitrary decisions
Architecture
Architecture (computers)
architecture recovery
Cluster analysis
Clustering
Clustering algorithms
Computer architecture
Computer engineering
Computer programs
Decomposition
Digital Object Identifier
Electrical engineering
hierarchical clustering
Legacy systems
Partitioning algorithms
Recovery
Restructuring
Reverse engineering
Similarity
Software
Software algorithms
Software architecture
Software Engineering
Software measurement
Software systems
Studies
Taxonomy
title Hierarchical Clustering for Software Architecture Recovery
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T20%3A54%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Hierarchical%20Clustering%20for%20Software%20Architecture%20Recovery&rft.jtitle=IEEE%20transactions%20on%20software%20engineering&rft.au=Maqbool,%20O.&rft.date=2007-11-01&rft.volume=33&rft.issue=11&rft.spage=759&rft.epage=780&rft.pages=759-780&rft.issn=0098-5589&rft.eissn=1939-3520&rft.coden=IESEDJ&rft_id=info:doi/10.1109/TSE.2007.70732&rft_dat=%3Cproquest_RIE%3E1381338101%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=195576325&rft_id=info:pmid/&rft_ieee_id=4339232&rfr_iscdi=true