PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Puppin D., Tullsen D. Maximizing TLP with loop-parallelization on SMT. In: MTEAC-5 (Austin, Texas, 1 December 2001).
 
 
Abstract (English)
This paper describes research in exploiting loop-level parallelism on a simultaneous multithreading processor. We discuss some general and ad-hoc techniques for loop parallelization that proved to be effective with SMT, and how they were tuned for it. These techniques have been tested on the well-known Livermore loops, chosen for their variety of behaviors. The set of optimizations used produced significant improvement overall: we were able to improve average IPC from 2.72 to 3.97, and to gain an average speedup of 1.39 over optimized single-thread code, using up to eight threads. We also describe a simple but effective method for determining the best number of threads to be used for parallel loops on a multithreaded processor. The model uses compile-time information to predict the most efficient point.
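As a rough illustration of the loop-level parallelism the abstract refers to, the sketch below splits the independent iterations of a loop shaped like the first Livermore kernel (the hydro fragment) into contiguous blocks, one per thread. This record does not describe the paper's actual transformations, so the block partitioning is an assumption, and the names `hydro_chunk`, `hydro_parallel`, and `hydro_serial` are hypothetical. In CPython the threads will not run the arithmetic concurrently, so the sketch shows only how the iteration space is divided, not the speedup reported above.

```python
from concurrent.futures import ThreadPoolExecutor


def hydro_chunk(x, y, z, q, r, t, start, stop):
    """Run iterations [start, stop); each iteration writes only x[k]."""
    for k in range(start, stop):
        x[k] = q + y[k] * (r * z[k + 10] + t * z[k + 11])


def hydro_serial(y, z, q, r, t, n):
    """Single-thread reference version of the loop."""
    x = [0.0] * n
    hydro_chunk(x, y, z, q, r, t, 0, n)
    return x


def hydro_parallel(y, z, q, r, t, n, num_threads):
    """Assumed scheme: one contiguous block of iterations per thread."""
    if n == 0:
        return []
    x = [0.0] * n
    block = (n + num_threads - 1) // num_threads  # ceiling division
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [
            pool.submit(hydro_chunk, x, y, z, q, r, t, lo, min(lo + block, n))
            for lo in range(0, n, block)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return x
```

Because no iteration reads a value written by another, the blocks can run in any order and the result matches the serial loop for any thread count from one to eight.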
Subject: Simultaneous multithreading; Loop-parallelization; Compiling; C.1 Processor Architectures





 


For further information, contact: Librarian http://puma.isti.cnr.it
