The aim of this research is to explore the benefits and challenges of online testing techniques in which the memory controller is responsible for locating and resolving DRAM faults and ensuring error-free operation.
DRAM has long been the primary component of main memory in computer systems. DRAM vendors therefore strive to produce chips that provide more data capacity in the same or a smaller area. However, reducing the size of DRAM cells makes individual cells more susceptible to leakage and other errors. Vendors address these problems through rigorous testing and replacement of faulty cells, but these measures are very costly and can reduce yield.
Hence, it is imperative to find ways to tolerate DRAM errors in the system. This work explores micro-architectural solutions for tolerating such errors. If the system can monitor and locate errors while it is running, DRAM vendors can achieve higher yield by shipping DRAM chips that contain some faults.