ECE 542
Design of Fault-Tolerant Digital Systems

Displaying course information from Spring 2014.

Section  Type  Times        Days  Location         Instructor
C        LCD   0800 - 0920  T R   241 Everitt Lab  Ravishankar Iyer
Official Description: Advanced concepts in hardware and software fault tolerance: fault models, coding in computer systems, module- and system-level fault detection mechanisms, reconfiguration techniques in multiprocessor systems and VLSI processor arrays, and software fault tolerance techniques such as recovery blocks, N-version programming, checkpointing, and recovery; survey of practical fault-tolerant systems. Course Information: Same as CS 536. Prerequisite: ECE 411.
Subject Area: Reliable and Secure Systems
Course Prerequisites: Credit in ECE 411
Course Directors: Ravishankar K. Iyer
Detailed Description and Outline


  • Introduction to fault-tolerant computing
  • Demonstration of error detection and recovery
  • Evaluation: hardware and software reliability models
  • Experimental evaluation: simulation-based, fault injection, operational
  • Fault-tolerant techniques: coding, checkpointing, and recovery
  • Software fault tolerance
  • Case studies of reliable system design
  • Reliable networked systems
  • Security
  • Class Projects


Text:
D. K. Pradhan, Fault-Tolerant Computer System Design, Prentice-Hall, 1996.

Collateral Reading:
D. Siewiorek and R. Swarz, Reliable Computer Systems: Design and Evaluation, 2nd ed., Digital Press - Butterworth, 1992.
B. W. Johnson, Design and Analysis of Fault-Tolerant Digital Systems, Addison-Wesley, 1989.

Last updated: 2/13/2013