ACMES team

Samovar lab

Hierarchical Trace Format: A scalable trace format for Exascale computing

Team work: Catherine Guelque presented "Hierarchical Trace Format: A scalable trace format for Exascale computing" in 1C27 on 17/11/2023 at 11:00.

Abstract

High-Performance Computing (HPC) is a rapidly evolving field. The emergence of new paradigms, such as the use of accelerators (GPUs) or task-based computing (StarPU), requires adjustments to existing tools. Furthermore, the advent of exaflop-capable computing facilities, which are clusters capable of performing more than a billion billion floating-point operations per second, may render certain tools obsolete.
 
Traces are one such type of tool: they are records of program executions that allow post-mortem analysis of those executions and are valuable for improving the scalability of applications. However, most existing tracing tools are not suited to these new paradigms and are not scalable enough for exascale computing.
 
We thus propose a new trace format that offers strong scalability, incorporates advanced encoding and compression systems, and delivers performance comparable to other recent tools, such as Pilgrim. This new tool, called Hierarchical Trace Format (HTF), also provides sophisticated and easy-to-use analyses that facilitate exascale scalability and optimization.