Automatic software interference detection in parallel applications
Title | Automatic software interference detection in parallel applications |
Publication Type | Conference Papers |
Year of Publication | 2007 |
Authors | Tabatabaee V, Hollingsworth J |
Conference Name | Proceedings of the 2007 ACM/IEEE conference on Supercomputing |
Date Published | 2007 |
Publisher | ACM |
Conference Location | Reno, Nevada |
ISBN Number | 978-1-59593-764-3 |
Abstract | We present an automated software interference detection methodology for Single Program, Multiple Data (SPMD) parallel applications. Interference comes from the system and from unexpected processes. If not detected and corrected, such interference may result in performance degradation. Our goal is to provide a reliable metric for software interference that can be used in soft-failure protection and recovery systems. A unique feature of our algorithm is that we measure the relative timing of application events (i.e., the time between MPI calls) rather than system-level events such as CPU utilization. This approach lets our system automatically accommodate natural variations in an application's utilization of resources. We use performance irregularities and degradation as signs of software interference. However, instead of relying on temporal changes in performance, our system detects spatial performance degradation across multiple processors. We also include a case study that demonstrates our technique's effectiveness, resilience, and robustness. |
DOI | 10.1145/1362622.1362642 |
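
The spatial comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the record does not specify; it only assumes that each rank of an SPMD program records the durations between consecutive application events (for example, successive MPI calls) and that the k-th interval is comparable across ranks. The function name `flag_interfered_ranks`, the `threshold` parameter, and the majority rule are hypothetical choices made for illustration.

```python
from statistics import median


def flag_interfered_ranks(intervals_by_rank, threshold=1.5):
    """Return ranks whose event intervals are slow relative to their peers.

    intervals_by_rank maps a rank id to a list of durations (in seconds)
    between consecutive application events. All ranks are assumed to run
    the same SPMD code, so interval k on one rank is compared with
    interval k on every other rank.
    """
    ranks = sorted(intervals_by_rank)
    if not ranks:
        return []
    # Only compare intervals that every rank has recorded.
    n_events = min(len(intervals_by_rank[r]) for r in ranks)
    slow_counts = {r: 0 for r in ranks}
    for k in range(n_events):
        # Spatial baseline: the same interval as observed across all ranks.
        baseline = median(intervals_by_rank[r][k] for r in ranks)
        for r in ranks:
            if baseline > 0 and intervals_by_rank[r][k] > threshold * baseline:
                slow_counts[r] += 1
    # Hypothetical decision rule: flag a rank when most of its intervals
    # are slow compared with the cross-rank median.
    return [r for r in ranks if n_events and slow_counts[r] > n_events / 2]


# Made-up timings: rank 3 is consistently slower than its peers.
timings = {
    0: [1.0, 2.0, 1.0],
    1: [1.1, 2.1, 0.9],
    2: [1.0, 1.9, 1.0],
    3: [2.5, 4.0, 2.2],
}
suspects = flag_interfered_ranks(timings)
```

Because each interval is compared against the other ranks rather than against the same rank's earlier history, a phase in which every rank slows down together (a natural variation in the application) raises the baseline as well and is not flagged in this sketch.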