Exploiting Free Execution Slots on EPIC Processors for Efficient and Accurate Runtime Profiling

Dynamic optimization relies on runtime profile information to improve the performance of program execution. Traditional profiling techniques incur significant overhead and are not suitable for dynamic optimization. In this paper, we propose a new profiling technique that incorporates the strength of...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Wu, Youfeng, Lee, Yong-Fong
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Dynamic optimization relies on runtime profile information to improve the performance of program execution. Traditional profiling techniques incur significant overhead and are not suitable for dynamic optimization. In this paper, we propose a new profiling technique that incorporates the strength of both software and hardware to achieve near-zero overhead profiling. The compiler passes profiling requests as a few bits of information in branch instructions to the hardware, and the hardware uses the free execution slots available in a user program to execute profiling operations. We have implemented the compiler instrumentation of this technique using an Itanium research compiler. Our result shows that the accurate block profiling incurs very little overhead to the user program in terms of the program scheduling cycles. For example, the average overhead is 0.6% for the SPECint95 benchmarks. The hardware support required for the new profiling is practical. We believe this will enable many profile-driven dynamic optimizations for EPIC processors such as the Itanium processors.
ISSN:0302-9743
1611-3349
DOI:10.1007/978-3-540-30102-8_19