Optimizing GPU Programs from Java using Babylon and HAT


The Heterogeneous Accelerator Toolkit (HAT) is a parallel programming framework that lets Java developers offload Java code to modern hardware accelerators, such as Graphics Processing Units (GPUs), and run the generated code there. This article gives an overview of the HAT programming model. Using matrix multiplication as an example, we show how Java developers can tune GPU workloads from the Java side to achieve performance close to native cuBLAS, scaling from 7 GFLOP/s on CPUs to 14 TFLOP/s on an NVIDIA A10 GPU.
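To make the GFLOP/s and TFLOP/s figures concrete, the sketch below shows a naive single-threaded matrix multiplication in plain Java and the standard way such a throughput number can be derived: an n x n multiplication performs n^3 multiplications and n^3 additions, so 2·n^3 floating-point operations in total. This is not HAT code and does not use any HAT API; the class name `MatMulBaseline`, the methods `matmul` and `gflops`, and the matrix size are hypothetical choices made for illustration, and the article's own benchmark setup may differ.

```java
import java.util.Arrays;

// Illustrative CPU baseline (hypothetical names, not part of HAT or Babylon).
public final class MatMulBaseline {

    // Naive matrix multiply c = a * b for square n x n matrices
    // stored in row-major order as flat float arrays.
    static void matmul(float[] a, float[] b, float[] c, int n) {
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                float sum = 0.0f;
                for (int k = 0; k < n; k++) {
                    sum += a[i * n + k] * b[k * n + j];
                }
                c[i * n + j] = sum;
            }
        }
    }

    // Throughput in GFLOP/s, counting 2 * n^3 floating-point operations
    // (n^3 multiplies plus n^3 adds) for one n x n matrix multiply.
    static double gflops(int n, double seconds) {
        return (2.0 * n * n * n) / seconds / 1e9;
    }

    public static void main(String[] args) {
        int n = 512; // assumed size, chosen only to keep the run short
        float[] a = new float[n * n];
        float[] b = new float[n * n];
        float[] c = new float[n * n];
        Arrays.fill(a, 1.0f);
        Arrays.fill(b, 2.0f);

        long start = System.nanoTime();
        matmul(a, b, c, n);
        double seconds = (System.nanoTime() - start) / 1e9;

        System.out.printf("n=%d: %.3f s, %.2f GFLOP/s%n", n, seconds, gflops(n, seconds));
    }
}
```

A single untimed-warmup run like this only gives a rough figure, since JIT compilation affects the first iterations; a careful measurement would repeat the kernel several times and discard the early runs.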

More at https://openjdk.org/projects/babylon/articles/hat-matmul/hat-matmul