Fine-grained Thread Execution Mechanism for GPGPU

While GPUs excel at homogeneously parallel problems such as matrix computation and image processing, there have been few attempts to apply them to heterogeneously parallel problems. This project aims to develop fine-grained nested thread execution mechanisms for GPGPU, so that parallel computations can be written easily by forking threads inside GPU cores. With such mechanisms, a wider variety of programs could be accelerated by GPUs.

As a first step, we propose a thread execution model based on DynaSOAr, a highly efficient parallel object allocator for GPGPU, and investigate its performance, overheads, optimizations, and compilation techniques.
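
To make the idea of forking threads from within running threads more concrete, the following is a minimal sketch and not the project's implementation. It is a sequential, CPU-side C++ simulation that uses neither CUDA nor the DynaSOAr API. The names `FibTask`, `run_task`, and `fib_by_forking` are hypothetical and exist only for this illustration. Each task stands for one fine-grained thread, and the inner loop over live tasks stands for work that a GPU would run in parallel.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical task record: one "fine-grained thread" that computes fib(n).
struct FibTask {
    int n;
};

// Runs a single task. A leaf task adds its value to the running sum;
// any other task "forks" two child tasks instead of recursing.
// Assumption: on a GPU, each call would be run by a separate thread.
inline void run_task(const FibTask& t,
                     std::vector<FibTask>& forked,
                     std::uint64_t& sum) {
    if (t.n < 2) {
        sum += static_cast<std::uint64_t>(t.n);  // fib(0) = 0, fib(1) = 1
    } else {
        forked.push_back(FibTask{t.n - 1});
        forked.push_back(FibTask{t.n - 2});
    }
}

// Processes tasks round by round until no live task remains.
// The sum of all leaf values in the fork tree equals fib(n).
inline std::uint64_t fib_by_forking(int n) {
    assert(n >= 0);
    std::vector<FibTask> live;
    live.push_back(FibTask{n});
    std::uint64_t sum = 0;
    while (!live.empty()) {
        std::vector<FibTask> next;  // tasks forked during this round
        for (const FibTask& t : live) {
            run_task(t, next, sum);
        }
        live.swap(next);
    }
    return sum;
}
```

For example, `fib_by_forking(10)` returns 55. The number of live tasks varies from round to round and depends on the input, which is the kind of irregular workload that the heterogeneously parallel setting refers to. A real GPU version would additionally need parallel task allocation and atomic updates to the shared sum, both of which this sketch leaves out.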

👉 Implementation @ GitHub

Kiuchi’s IPSJ PRO Workshop Talk on Fine-Grained Threading on a GPU
Kiuchi and Sakai presented their master's theses

Programming Research Group

Department of Mathematical and Computing Science, Institute of Science Tokyo (née Tokyo Institute of Technology)
