Fingerpointing | Priya Narasimhan

Our work has multiple facets. Our projects often combine log analysis, machine learning, systems development, visualization and fault-tolerance practices.


Failure Diagnosis for Cloud Computing. We are analyzing the behavior of cloud-computing platforms (particularly Hadoop) to understand how to localize the node that is the source of a performance problem. We have developed log-analysis techniques (SALSA), visualization techniques (Mochi) and black-box metric analysis (Ganesha), among others.
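To illustrate the general idea of localizing a problem node from black-box metrics, the following is a minimal sketch, not the Ganesha algorithm itself. It assumes that all nodes run similar work, so a node whose metric deviates strongly from the peer median is a suspect. The function name, the node names, the sample values and the threshold are all hypothetical.

```python
from statistics import median


def localize_outlier_nodes(metrics, threshold=3.0):
    """Return nodes whose metric lies far from the peer median.

    metrics:   dict mapping node name -> one black-box metric sample
               (for example, CPU utilization); hypothetical input format.
    threshold: how many median absolute deviations (MADs) from the
               median a node must be to count as a suspect (assumed value).
    """
    values = list(metrics.values())
    center = median(values)
    # Median absolute deviation: a robust estimate of the peer spread.
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # Most peers are identical, so any differing node is a suspect.
        return sorted(n for n, v in metrics.items() if v != center)
    return sorted(
        n for n, v in metrics.items() if abs(v - center) / mad > threshold
    )


# Hypothetical CPU utilization samples from five peer nodes.
samples = {
    "node1": 0.31,
    "node2": 0.29,
    "node3": 0.30,
    "node4": 0.92,
    "node5": 0.32,
}
print(localize_outlier_nodes(samples))  # prints ['node4']
```

The median and MAD are used here instead of the mean and standard deviation so that a single faulty node does not distort the baseline against which it is compared.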

Failure Diagnosis for HPC Systems. We are focusing on failure diagnosis for high-performance file systems such as PVFS and Lustre. We have developed black-box failure-analysis techniques that examine the OS-level metrics from these systems, as well as techniques that localize problems by analyzing system calls alone.

Failure Diagnosis for Automotive Systems. We are studying embedded automotive systems, with an emphasis on the specific kinds of failures that occur and need to be diagnosed in that domain. We are interested in failure diagnosis for improved runtime safety.

Diagnosis-Driven Online Recovery. We are focusing on online (and not just offline) diagnosis, primarily to trigger more informed fault recovery. We have aimed to scale the diagnosis approaches that we have developed to support rapid recovery, and to demonstrate how recovery driven by diagnosis is superior to naive recovery.

Visualization for Failure Diagnosis. Recognizing that system administrators need tools for large-scale troubleshooting, we have developed visualization techniques and tools that support the rapid manual localization of performance problems by displaying multiple views of a system's execution.

Last updated: March 2008, Priya Narasimhan