Replaying execution traces with a performance model
The Parallel & Distributed Systems group is looking for an intern.
Context
Parallel applications can be analyzed using tracing tools such as EZTrace. These tools collect events (e.g. calls to MPI functions or OpenMP constructs) during the execution of the application, and the resulting execution traces can be analyzed to reveal performance bugs.
As part of a collaborative research project, we are developing EasyTraceAnalyzer, a generic trace analysis tool that processes various kinds of traces and implements several performance analysis techniques (for instance, bottleneck detection).
Goal of this internship
This internship aims to design a new performance analysis method that predicts how a parallel application would behave if the performance of one of the platform components (e.g. the network or the parallel filesystem) were different.
The main tasks of the internship are the following:
- Extracting a performance model from an execution trace by analyzing its events (e.g. MPI events, MPI-IO events, POSIX I/O events, etc.)
- Applying a new performance model to a trace. This requires modifying the execution trace to shorten or lengthen the events related to the modified component, and also propagating these changes to the other threads/processes that communicate with them.
- Evaluating the performance prediction on real applications
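
To illustrate the first two tasks, here is a minimal C++ sketch, not the method to be designed during the internship. It assumes a simple linear cost model (duration = latency + bytes / bandwidth) fitted by ordinary least squares, and a trace in which every activity is an explicit event whose dependencies appear earlier in the list. The Event and LinearModel types and the fit_model and replay functions are hypothetical names introduced for this example; they do not come from EZTrace or EasyTraceAnalyzer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical linear cost model: duration = latency + bytes / bandwidth.
struct LinearModel {
    double latency;    // seconds
    double bandwidth;  // bytes per second
    double predict(double bytes) const { return latency + bytes / bandwidth; }
};

// Hypothetical, simplified trace event (not an EZTrace or EasyTraceAnalyzer type).
struct Event {
    int process;      // rank that executed the event
    double duration;  // measured duration, in seconds
    double bytes;     // payload size (only meaningful for modeled events)
    bool is_comm;     // true if the event uses the modeled component
    int depends_on;   // index of an earlier event that must finish first, or -1
};

// Task 1: fit the model to the modeled events by ordinary least squares.
LinearModel fit_model(const std::vector<Event>& trace) {
    double n = 0, sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (const Event& e : trace) {
        if (!e.is_comm) continue;
        n += 1;
        sx += e.bytes;
        sy += e.duration;
        sxx += e.bytes * e.bytes;
        sxy += e.bytes * e.duration;
    }
    const double denom = n * sxx - sx * sx;
    assert(n >= 2 && denom != 0.0);  // needs at least two distinct payload sizes
    const double slope = (n * sxy - sx * sy) / denom;  // seconds per byte
    return LinearModel{(sy - slope * sx) / n, 1.0 / slope};
}

// Task 2: replay the trace with another model and return the predicted makespan.
// Modeled events get a new duration; the shift propagates through dependencies.
double replay(const std::vector<Event>& trace, const LinearModel& model,
              int num_processes) {
    std::vector<double> proc_clock(num_processes, 0.0);  // current time per process
    std::vector<double> end(trace.size(), 0.0);          // new end time per event
    double makespan = 0.0;
    for (std::size_t i = 0; i < trace.size(); ++i) {
        const Event& e = trace[i];
        double start = proc_clock[e.process];
        if (e.depends_on >= 0) start = std::max(start, end[e.depends_on]);
        const double d = e.is_comm ? model.predict(e.bytes) : e.duration;
        end[i] = start + d;
        proc_clock[e.process] = end[i];
        makespan = std::max(makespan, end[i]);
    }
    return makespan;
}
```

Calling replay with the model returned by fit_model should approximately reproduce the original execution time, while calling it with a modified model (for example, half the latency) gives the predicted execution time on the hypothetical platform. Real traces are likely to need a richer model (contention, collective operations, overlapping of communication and computation), which is part of what the internship would investigate.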
Keywords: HPC, MPI, performance analysis, performance modeling, performance prediction
Work conditions
- Open-source development in C++
- The internship will take place at Télécom SudParis in Palaiseau (in the same building as Télécom Paris) – 19 place Marguerite Perey, 91120 Palaiseau
- Due to the current lockdown, the internship may start remotely
Contact
François Trahay <francois.trahay@telecom-sudparis.eu>, Associate professor
Parallel & Distributed Systems group
Télécom SudParis, Institut Polytechnique de Paris, Samovar lab