Electrical and Computer Engineering

18-749PP – Fault-Tolerant Distributed Systems

12 units

The course provides an in-depth, hands-on overview of designing and developing fault-tolerant distributed systems. It covers both fundamental and advanced concepts of dependability, including replication, atomic multicast, group communication, consistency, checkpointing, transaction processing and fault injection, along with industrial standards and real-world practices for achieving high availability and fault tolerance. Additional topics include the practical trade-offs and inter-relationships between fault tolerance and other properties, such as real-time behavior and performance. The lecture concepts are complemented by a semester-long hands-on project that involves the design, implementation and empirical evaluation of a fault-tolerant, high-performance distributed system. To introduce students to state-of-the-art technologies, the project emphasizes the use of object-oriented middleware, such as CORBA and EJB.

3 hrs. lec., 9 hrs. lab.

Prerequisites: Experience in programming and senior or graduate standing.

Section P is for Portugal students only.

Last updated on April 11, 2008

ECE classifications

Graduate areas

Software Systems and Computer Networking

This course is currently being offered.

Links

Past semesters

F08

Please note that the course history information is incomplete and/or may reflect different courses offered under the same course number.



5000 Forbes Avenue / Pittsburgh, PA 15213-3890 / Phone: 412-268-7400 / Fax: 412-268-2860