Finding the optimal unroll-and-jam

Abstract

Reducing traffic between the CPU and main memory is one of the main concerns when optimizing programs for load/store architectures. In an optimizing compiler, the register allocation module keeps this traffic low by carefully mapping program variables to CPU registers. Because register allocation takes place during code generation and operates on the intermediate code produced by the compiler front end, the structure of that code, which closely follows the structure of the source code, strongly affects how effective register allocation can be. Suitable techniques can restructure source programs so that the resulting intermediate code can take advantage of advanced register allocation schemes. In this paper we analyze one such technique, unroll-and-jam. In particular, we find the fractional optimal unroll-and-jam transformation valid for a large class of compute-intensive programs. The paper presents an analytical model of the optimal unroll-and-jam and a method for computing the unrolling parameters numerically.
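
For readers unfamiliar with the transformation, the following is a minimal sketch of unroll-and-jam applied to a matrix-vector product. It is not taken from the paper: the kernel, the function names matvec_plain and matvec_uaj2, and the fixed unroll factor of 2 are all assumptions chosen for illustration, and the paper's analytical model and fractional unroll factors are not reproduced here. The outer loop is unrolled by 2 and the two resulting inner loops are fused ("jammed"), so each x[j] is loaded once and reused for two rows while the two partial sums can stay in registers.

```c
#include <stddef.h>

/* Baseline: y[i] += sum_j a[i][j] * x[j], with a stored row-major (n x m).
 * Every row i walks over the whole vector x again. */
void matvec_plain(size_t n, size_t m, const double *a,
                  const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            y[i] += a[i * m + j] * x[j];
}

/* Illustrative unroll-and-jam with an assumed unroll factor of 2:
 * the i loop is unrolled by 2 and the two copies of the j loop are fused. */
void matvec_uaj2(size_t n, size_t m, const double *a,
                 const double *x, double *y)
{
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        double y0 = y[i];       /* partial sums held in scalars, which a */
        double y1 = y[i + 1];   /* compiler can keep in registers        */
        for (size_t j = 0; j < m; j++) {
            double xj = x[j];   /* one load of x[j] serves two rows */
            y0 += a[i * m + j] * xj;
            y1 += a[(i + 1) * m + j] * xj;
        }
        y[i] = y0;
        y[i + 1] = y1;
    }

    /* Remainder row when n is odd. */
    for (; i < n; i++)
        for (size_t j = 0; j < m; j++)
            y[i] += a[i * m + j] * x[j];
}
```

Both functions accumulate each y[i] in the same order over j, so they should produce the same results. A larger unroll factor reuses each x[j] for more rows but needs more registers for the partial sums, which is the kind of trade-off an optimal unroll factor has to balance.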

Cite

Zingirian, N., & Maresca, M. (1999). Finding the optimal unroll-and-jam. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1593, pp. 633–642). Springer Verlag. https://doi.org/10.1007/bfb0100624
