Alawneh, Lu'ay (2012) Techniques To Facilitate the Understanding of Inter-process Communication Traces. PhD thesis, Concordia University.
|PDF - Accepted Version|
High Performance Computing (HPC) systems play an important role in today’s heavily digitized world, which is in a constant demand for higher speed of calculation and performance. HPC applications are used in multiple domains such as telecommunication, health, scientific research, and more. With the emergence of multi-core and cloud computing platforms, the HPC paradigm is quickly becoming the design of choice of many service providers.
HPC systems are also known to be complex to debug and analyze due to the large number of processes they involve and the way these processes communicate with each other to perform specific tasks. As a result, software engineers must spend extensive amount of time understanding the complex interactions among a system’s processes. This is usually done through the analysis of execution traces generated from running the system at hand. Traces, however, are very difficult to work with due to the overwhelming size of typical traces. The objective of this research is to present a set of techniques that facilitates the understanding of the behaviour of HPC applications through the analysis of system traces.
The first technique consists of building an exchange format called MTF (MPI Trace Format) for representing and exchanging traces generated from HPC applications based on the MPI (Message Passing Interface) standard, which is a de facto standard for inter-process communication for high performance computing systems. The design of MTF is validated against well-known requirements for a standard exchange format.
The second technique aims to facilitate the understanding of large traces of inter-process communication by automatically extracting communication patterns that characterize their main behaviour. Two algorithms are presented. The first one permits the recognition of repeating patterns in traces of MPI (Message Passing Interaction) applications whereas the second algorithm searches if a given communication pattern occurs in a trace. Both algorithms are based on the n-gram extraction technique used in natural language processing.
Finally, we developed a technique to abstract MPI traces by detecting the different execution phases in a program based on concepts from information theory. Using this approach, software engineers can examine the trace as a sequence of high-level computational phases instead of a mere flow of low-level events.
The techniques presented in this thesis have been tested on traces generated from real HPC programs. The results from several case studies demonstrate the usefulness and effectiveness of our techniques.
|Divisions:||Concordia University > Faculty of Engineering and Computer Science > Electrical and Computer Engineering|
|Item Type:||Thesis (PhD)|
|Degree Name:||Ph. D.|
|Program:||Electrical and Computer Engineering|
|Date:||12 April 2012|
|Thesis Supervisor(s):||Hamou-Lhadj, Abdelwahab|
|Deposited By:||LU'AY ALAWNEH|
|Deposited On:||20 Jun 2012 14:48|
|Last Modified:||20 Jun 2012 14:48|
Repository Staff Only: item control page