
Affine Vector Cache for memory bandwidth savings

Caroline Collange 1,2,* , Alexandre Kouyoumdjian 1,2
* Corresponding author
1 ARENAIRE - Computer arithmetic
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
Abstract: Preserving memory locality is a major issue in highly multithreaded architectures such as GPUs. These architectures hide latency by keeping a large number of threads in flight. As each thread needs to maintain a private working set, all threads collectively put tremendous pressure on on-chip memory arrays, at significant cost in area and power. We show that thread-private data in GPU-like implicit SIMD architectures can be compressed by a factor of up to 16 by taking advantage of correlations between values held by different threads. We propose the Affine Vector Cache, a compressed cache design that complements the first-level cache. Evaluation by simulation on the SDK and Rodinia benchmarks shows that a 32KB L1 cache assisted by a 16KB AVC offers 59% more usable capacity on average than a single 48KB L1 cache. This yields an overall performance increase of 5.7% and an energy reduction of 11% at negligible hardware cost.
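The compression the abstract describes exploits the fact that many per-thread values in a warp are affine in the lane index: value[i] = base + i * stride (e.g. per-thread addresses or loop counters). A minimal software sketch of this idea follows; it is an illustration of the affine-vector concept only, not the paper's hardware design, and all names are hypothetical.

```python
def compress_affine(values):
    """If the per-lane values form an affine sequence v[i] = base + i*stride,
    store only (base, stride); otherwise keep the raw vector."""
    base = values[0]
    stride = values[1] - values[0] if len(values) > 1 else 0
    if all(v == base + i * stride for i, v in enumerate(values)):
        return ('affine', base, stride)   # 2 words instead of len(values)
    return ('raw', list(values))

def decompress(entry, n):
    """Reconstruct the full per-lane vector from a compressed entry."""
    if entry[0] == 'affine':
        _, base, stride = entry
        return [base + i * stride for i in range(n)]
    return entry[1]
```

For a 32-lane warp of 32-bit words, an affine vector shrinks from 128 bytes to two words (8 bytes), which is the factor-of-16 bound quoted in the abstract.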
Document type: Research Report

Cited literature: 28 references

Contributor: Caroline Collange
Submitted on: Wednesday, December 7, 2011 - 12:16:44 PM
Last modification on: Friday, February 4, 2022 - 3:08:46 AM
Long-term archiving on: Friday, November 16, 2012 - 2:40:37 PM


Files produced by the author(s)


  • HAL Id : ensl-00649200, version 1



Caroline Collange, Alexandre Kouyoumdjian. Affine Vector Cache for memory bandwidth savings. [Research Report] ENS de Lyon. 2011. ⟨ensl-00649200⟩


