Parallel and Distributed Systems Group

Computer Science Department of Telecom SudParis

DrGPUM: Guiding Memory Optimization for GPU-Accelerated Applications

Reading group: Mickaël Boichot presented "DrGPUM: Guiding Memory Optimization for GPU-Accelerated Applications" (ASPLOS'23) in 1D19 on 15/9/2023 at 10:00.


GPUs are widely used in today’s computing platforms to accelerate applications in various domains. However, scarce GPU memory resources are often the dominant limiting factor in strengthening the applicability of GPU computing. In this paper, we propose DrGPUM, the first profiler that systematically investigates patterns of memory inefficiencies in GPU-accelerated applications. The strength of DrGPUM, when compared to a large class of existing GPU profilers, is its ability to (1) correlate problematic memory usage with data objects and GPU APIs, (2) identify and categorize object-level and intra-object memory inefficiencies, and (3) provide rich insights to guide memory optimization.

DrGPUM works on fully optimized, unmodified GPU binaries, requires no modification to hardware or OS, and features a user-friendly GUI, which makes it attractive to use in production. Our evaluation with well-known benchmarks and applications shows DrGPUM’s effectiveness in identifying memory inefficiencies with moderate overhead. Eliminating these inefficiencies requires modifying fewer than nine lines of source code and yields significant reductions in peak memory usage (up to 83%) and/or significant performance improvements (up to 2.48×). Our optimization patches have been confirmed by application developers and upstreamed to their repositories.
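
To make the link between object lifetimes and peak memory usage concrete, here is a minimal sketch in Python. It is not taken from the paper and does not reproduce DrGPUM's analysis: it only replays a hypothetical trace of allocation and free events and reports the peak number of live bytes. The object names (`input`, `tmp`, `output`), their sizes, and the `peak_usage` helper are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

# One trace event: (operation, object name, size in bytes).
# The size is ignored for "free" events. All names and sizes are hypothetical.
Event = Tuple[str, str, int]


def peak_usage(trace: Iterable[Event]) -> int:
    """Return the peak number of live bytes over an alloc/free trace."""
    live: Dict[str, int] = {}  # currently allocated objects and their sizes
    current = 0
    peak = 0
    for op, name, size in trace:
        if op == "alloc":
            live[name] = size
            current += size
        elif op == "free":
            current -= live.pop(name)
        else:
            raise ValueError(f"unknown operation: {op}")
        peak = max(peak, current)
    return peak


# "tmp" is only needed before "output" is allocated, but is freed at the end.
late_free = [
    ("alloc", "input", 400),
    ("alloc", "tmp", 300),
    ("alloc", "output", 500),
    ("free", "tmp", 0),
    ("free", "input", 0),
    ("free", "output", 0),
]

# Same program with a single line moved: "tmp" is freed right after its last use.
early_free = [
    ("alloc", "input", 400),
    ("alloc", "tmp", 300),
    ("free", "tmp", 0),
    ("alloc", "output", 500),
    ("free", "input", 0),
    ("free", "output", 0),
]

if __name__ == "__main__":
    print(peak_usage(late_free))   # 1200
    print(peak_usage(early_free))  # 900
```

In this toy trace, moving one free call lowers the peak from 1200 to 900 bytes. It illustrates why a change of a few source lines can reduce peak memory noticeably once a profiler has attributed the problematic usage to a specific data object.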