Inline Analysis: Beyond Selection Heuristics


Bibliographic Details
Main authors: Chakrabarti, Dhruva R., Liu, Shin-Ming
Format: Conference proceeding
Language: English
Description
Summary: Research on procedure inlining has mainly focused on heuristics that decide whether inlining a particular call-site maximizes application performance. However, other equally important aspects of inline analysis, such as call-site analysis order, indirect effects of inlining, and selection of the most profitable version of a procedure, warrant more attention. This paper evaluates a number of different sequences in which call-sites are examined for inlining and shows that choosing the correct order is crucial to obtaining the best run-time performance. We then present a novel, work-list-based, and updated sequence that achieves the best results. While applying cross-module inline analysis on large applications with thousands of files and millions of lines of code, we separate the analysis from the transformation phase and allow the former to work solely on summary information in order to reduce compile-time and memory consumption. A focus of this paper is to enumerate the summaries that our compiler maintains, present a technique to compute the goodness factor on which the work-list sequence is based, and describe methods to continuously update the summaries as and when a call-site is accepted for inlining. We then introduce inline specialization, a new technique that facilitates inlining into call chains selectively. The power of inline specialization lies in its ability to choose the most profitable version of the called procedure without having to maintain multiple versions at any point in time. We discuss implementation of these techniques in the HP-UX Itanium production compiler and present experimental results showing that a dynamic work-list-based analysis order, comprehensive summary updates, and inline specialization significantly improve performance of applications.
DOI: 10.1109/CGO.2006.17
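
Illustrative sketch (not from the paper): the summary describes a work-list ordering driven by a goodness factor, with summaries updated whenever a call-site is accepted. The Python below is only a minimal illustration of that general idea, not the authors' algorithm. Everything in it is an assumption made for the example: the `CallSite` record, the goodness metric (call frequency divided by callee size), the single size `budget`, and the way frequencies are split between the inlined copy and the remaining out-of-line copy.

```python
import heapq
from collections import defaultdict
from dataclasses import dataclass
from itertools import count


@dataclass(eq=False)  # identity comparison, so list.remove() drops the exact site
class CallSite:
    caller: str
    callee: str
    freq: float  # summarized execution count of this call (hypothetical field)


def goodness(site, size):
    # Assumed metric: frequently executed calls to small callees rank highest.
    return site.freq / size[site.callee]


def inline_analysis(size, sites, budget):
    """Pick call sites to inline using only summaries; returns them in order.

    size   -- dict: procedure name -> summarized code size (updated in place)
    sites  -- list of CallSite summaries
    budget -- largest size any caller is allowed to reach (assumed policy)
    """
    outgoing = defaultdict(list)   # caller -> its current call sites
    entries = defaultdict(float)   # callee -> summed frequency of calls to it
    heap, tie = [], count()

    def push(site):
        # heapq is a min-heap, so store the negated goodness.
        heapq.heappush(heap, (-goodness(site, size), next(tie), site))

    for s in sites:
        outgoing[s.caller].append(s)
        entries[s.callee] += s.freq
        push(s)

    accepted = []
    while heap:
        neg, _, site = heapq.heappop(heap)
        if site.freq <= 0 or site.caller == site.callee:
            continue  # dead or directly recursive: never inlined in this sketch
        if -neg != goodness(site, size):
            push(site)  # summaries changed since this entry was queued
            continue
        if size[site.caller] + size[site.callee] > budget:
            continue  # rejected: caller would grow past the budget

        # Accept, then update every summary the decision affects.
        accepted.append(site)
        outgoing[site.caller].remove(site)
        size[site.caller] += size[site.callee]
        share = min(1.0, site.freq / entries[site.callee])
        entries[site.callee] -= site.freq
        for inner in list(outgoing[site.callee]):
            moved = inner.freq * share
            inner.freq -= moved  # the out-of-line copy now runs less often
            clone = CallSite(site.caller, inner.callee, moved)
            outgoing[site.caller].append(clone)
            push(clone)
    return accepted
```

In this sketch, stale queue entries are re-ranked lazily when they are popped, which keeps the order dynamic without rebuilding the work-list after every accepted call-site. A production compiler would likely use richer summaries and a more careful profitability model than the single ratio assumed here.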