CALCM - Computer Architecture Lab at Carnegie Mellon



A Self-Checkpointing Microprocessor Based on Magnetoelectronic Devices

Tuesday, October 10, 2006
Hamerschlag Hall 1112
4:30 pm

Nick Carter
University of Illinois at Urbana-Champaign

Data volatility is a major issue in many computer systems. The memory elements most commonly used in CMOS integrated circuits (SRAM cells, DRAM cells, and register latches) represent binary values using the charge stored on one or more capacitive structures. If a system's power supply is interrupted, the charge on these capacitors quickly drains off, destroying the data it represented. This volatility has three costs: progress is lost in long-running computations, power-on and boot times are long because a device's operating system must be loaded from non-volatile storage each time it is turned on, and idle power consumption is high due to leakage currents in memory arrays. In addition, CMOS capacitors are vulnerable to radiation-induced "soft" errors, which are becoming increasingly common as feature sizes decrease.

Magnetoelectronic devices that combine ferromagnetic materials with conventional semiconductor structures, such as the hybrid Hall effect device used in this work, can address these limitations of CMOS systems. Because they store data by magnetizing their ferromagnetic elements in a particular direction, rather than placing charge on capacitors, magnetoelectronic devices are inherently non-volatile, retaining their state in the absence of a power supply. Magnetoelectronic devices are also highly resistant to the effects of radiation, and, unlike many other non-volatile memory elements, can tolerate an arbitrary number of read-write cycles without failing or suffering performance degradation.

In this talk, I describe the hybrid Hall effect device and show how it can be integrated with CMOS electronics to implement a self-checkpointing microprocessor that periodically copies program state to three on-chip magnetoelectronic memory structures: a non-volatile register file, a checkpoint buffer, and a dirty data buffer. If program execution is interrupted for any reason, the self-checkpointing microprocessor can use the contents of these buffers to resume the program at the last checkpoint. This limits the amount of data lost during a power supply failure and allows the processor to enter and leave a zero-power idle state almost instantly. Unlike software checkpointing systems, self-checkpointing has very low power and performance costs. In simulations, a self-checkpointing version of the Pentium 4 architecture saw only a 62 mW increase in power consumption and a maximum performance degradation of 0.7% compared to the original architecture. On many programs, the self-checkpointing architecture slightly outperformed the baseline Pentium 4, because our mechanisms aggressively write dirty data back to main memory to free up space in the on-chip non-volatile memories.
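
The checkpoint-and-resume idea can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design presented in the talk: the class name, the one-register instruction set, the checkpoint interval, the buffer capacity, the write-back policy, and the treatment of main memory as durable are all hypothetical choices made for illustration. It shows volatile state being copied into non-volatile structures at intervals, and execution rolling back to the last checkpoint after a simulated power failure.

```python
class SelfCheckpointingCPU:
    """Toy model of checkpoint/rollback. All names and policies are assumptions."""

    def __init__(self, checkpoint_interval=4, dirty_capacity=2):
        self.checkpoint_interval = checkpoint_interval  # hypothetical parameter
        self.dirty_capacity = dirty_capacity            # hypothetical parameter
        # Volatile state: lost when power fails.
        self.pc = 0
        self.regs = {"acc": 0}
        self.pending_stores = {}   # stores made since the last checkpoint
        # Non-volatile state: survives a power failure.
        self.nv_regs = {"acc": 0}  # stands in for the non-volatile register file
        self.checkpoint_pc = 0     # stands in for the checkpoint buffer
        self.dirty_buffer = {}     # stands in for the dirty data buffer
        self.memory = {}           # main memory, assumed durable in this toy

    def checkpoint(self):
        # Copy volatile state into the non-volatile structures.
        self.nv_regs = dict(self.regs)
        self.checkpoint_pc = self.pc
        self.dirty_buffer.update(self.pending_stores)
        self.pending_stores.clear()
        # Assumed policy: flush to main memory once the buffer is over capacity.
        if len(self.dirty_buffer) > self.dirty_capacity:
            self.memory.update(self.dirty_buffer)
            self.dirty_buffer.clear()

    def power_failure(self):
        # Volatile state is discarded; execution resumes from the last checkpoint.
        self.regs = dict(self.nv_regs)
        self.pc = self.checkpoint_pc
        self.pending_stores = {}

    def load(self, addr):
        # The newest copy wins: pending stores, then dirty buffer, then memory.
        for level in (self.pending_stores, self.dirty_buffer, self.memory):
            if addr in level:
                return level[addr]
        return None

    def step(self, program):
        op, arg = program[self.pc]
        if op == "add":
            self.regs["acc"] += arg
        elif op == "store":
            self.pending_stores[arg] = self.regs["acc"]
        self.pc += 1
        if self.pc % self.checkpoint_interval == 0:
            self.checkpoint()

    def run(self, program, fail_at=None):
        """Run to completion; optionally fail once after `fail_at` executed steps.

        Returns the number of instructions executed, including re-executed ones.
        """
        steps, failed = 0, False
        while self.pc < len(program):
            if fail_at is not None and not failed and steps == fail_at:
                self.power_failure()
                failed = True
                continue
            self.step(program)
            steps += 1
        self.checkpoint()  # capture any tail work after the last interval
        return steps


program = [("add", 1), ("store", 0), ("add", 2), ("store", 1),
           ("add", 3), ("store", 2), ("add", 4), ("store", 3)]

clean = SelfCheckpointingCPU()
clean_steps = clean.run(program)

faulty = SelfCheckpointingCPU()
faulty_steps = faulty.run(program, fail_at=6)
```

In this model, the interrupted run ends in the same register and memory state as the uninterrupted one, at the cost of re-executing the instructions issued since the last checkpoint. A shorter checkpoint interval reduces the re-executed work but copies state more often.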

Nicholas Carter has been an Assistant Professor at the University of Illinois at Urbana-Champaign since 1999. Prior to that, he was a graduate student at the Massachusetts Institute of Technology, where he was the memory system architect on the M-Machine project. His research interests focus on reconfigurable computing and computing using non-silicon devices.