Methods for Optimizing OpenCL Applications on Heterogeneous Multicore Architectures |
|
PP: 2549-2562 |
|
Author(s) |
|
Slo-Li Chu,
Chih-Chieh Hsiao
|
|
Abstract |
|
Heterogeneous multicore architectures that pair a CPU with add-on GPUs or streaming processors are now widely used in
computer systems. These GPUs provide substantially more computational capability and memory bandwidth than traditional
multicore processors. Because they are highly programmable, they also deliver the computational performance needed for realistic graphics
rendering, and general-purpose computations can be mapped onto them as well. This study discusses the architectures of
these highly efficient GPUs and applies a unified programming standard, OpenCL, to fully utilize their capabilities. Despite their
great potential, programming these GPUs is challenging because of their diverse underlying architectural characteristics. In this
study, several optimization techniques are applied to OpenCL-compatible heterogeneous multicore architectures to achieve thread-level
and data-level parallelism. The architectural implications of these techniques are discussed. Finally, optimization principles for these
architectures are proposed. The experimental results reveal average speedups of 24 and 430 for non-optimized and optimized kernels,
respectively. |