Electrical & Computer Engineering     |     Carnegie Mellon

Tuesday, May 27, 12:00-1:00 p.m., HH-1112


Phillip Stanley-Marbell
Carnegie Mellon University

Dynamic Fault-Tolerance Management and Analysis Metrics for Failure-Prone Battery-Powered Systems

Traditional computing systems are typically characterized by metrics such as performance, and are designed to optimize this characteristic. Systems composed of large numbers of failure-prone computing devices and networks must provide a combination of reliability and performance in the face of failures. Dynamic Fault-Tolerance Management (DFTM) is one proposal for a structured approach to providing reliable operation in systems with very high failure rates in both devices and their interconnecting networks.

When the devices in question are resource-constrained, e.g., attached to energy sources such as batteries with non-linear discharge effects, approaches to providing reliable execution, and the metrics employed in analyzing their efficacy, must take into consideration a combination of performance, power consumption, reliability, and battery lifetime.

This talk introduces DFTM and a set of analytical measures for characterizing failure-prone battery-powered systems.

Phillip is a second-year Ph.D. student in the Dept. of ECE. He currently passes his time writing books about obscure operating systems, and learning languages, natural and other. Besides this diversionary avocation, he spends most of his time building simulation frameworks, devising analysis metrics, and proposing programming frameworks for computational fabrics (which have nothing to do with clothing, at least not directly). He lives in Pittsburgh (surprise) with his pet, Avdotya, a stack of paper.