===== Reading ===== **Please submit your paper reviews [[http://safari2.ece.cmu.edu/18742-f14/submissions/|here]].** \\ Please create your account as soon as possible so that we can send you the paper for review. If you have any problem with the website, please let me know. (Send me an email at camellyx@gmail.com) \\ Note that the reviews are due at 11:59 PM on the due date. ==== Reading List (in reverse order) ==== === 12/3/2014 === * Christian Jacobi, Timothy J. Slegel, Dan F. Greiner: Transactional Memory Architecture and Implementation for IBM System Z. MICRO 2010. * Jae-Woong Chung, Luke Yen, Stephan Diestelhorst, Martin Pohlack, Michael Hohmuth, David Christie, Dan Grossman: ASF: AMD64 Extension for Lock-Free Data Structures and Transactional Memory. MICRO 2010. * Jack B. Dennis, David Misunas: A Preliminary Architecture for a Basic Data Flow Processor. ISCA 1974. * James E. Smith, G. E. Dermer, B. D. Vanderwarn, S. D. Klinger, C. M. Rozewski, D. L. Fowler, K. R. Scidmore, James Laudon: The ZS-1 Central Processor. ASPLOS 1987. * James E. Smith: Decoupled access/execute computer architectures. ISCA 1982 === 11/18/2014 === Review required for the following two paper, due on Monday, Nov 17. * **Justin Meza et al., "Memory Errors at Scale: What the Trends Across a Billion-User Web Services Company Foretell", under submission. (Sent through email, please do not distribute)** * **[[https://www.usenix.org/legacy/event/sec08/tech/full_papers/halderman/halderman.pdf|J. Alex Halderman et al., "Lest We Remember: Cold Boot Attacks on Encryption Keys", USENIX Security Symposium 2008.]]** === 11/13/2014 === Reviews required for both papers, due on Wednesday, Nov 12. * **[[http://dl.acm.org/citation.cfm?id=2503257|Sridharan et al., "Feng shui of supercomputer memory: positional effects in DRAM and SRAM faults", SC 2013.]]** * [[https://www.cs.cmu.edu/~bianca/fast07.pdf|Bianca Schroeder, Garth A. Gibson, "Disk Failures in the Real World: What Does an MTTF of 1, 000, 000 Hours Mean to You?", FAST 2007.]] * [[http://www.pdl.cmu.edu/PDL-FTP/associated/dsn06.pdf|Bianca Schroeder, Garth A. Gibson, "A large-scale study of failures in high-performance computing systems", DSN 2006.]] * [[http://www.cs.toronto.edu/~hwang/papers/asplos2012.pdf|Andy A. Hwang, et al., "Cosmic rays don't strike twice: understanding the nature of DRAM errors and the implications for system design", ASPLOS 2012.]] * [[http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/35162.pdf|Bianca Schroeder et al., "DRAM Errors in the Wild: A Large-Scale Field Study", SIGMETRICS 2009.]] * **[[https://www.cs.princeton.edu/~appel/papers/memerr.pdf|Sudhakar Govindavajhala, Andrew W. Appel, "Using Memory Errors to Attack a Virtual Machine", SP 2003.]]** * [[https://www.usenix.org/legacy/event/sec08/tech/full_papers/halderman/halderman.pdf|J. Alex Halderman et al., "Lest We Remember: Cold Boot Attacks on Encryption Keys", USENIX Security Symposium 2008.]] === 11/12/2014 === Reviews required for both papers, due on Tuesday, Nov 11. * **[[http://users.ece.utexas.edu/~merez/vecc_asplos_2010.pdf|Doe Hyun Yoon, Mattan Erez, "Virtualized and flexible ECC for main memory", ASPLOS 2010.]]** * [[http://ece.umd.edu/courses/enee759h.S2003/references/ibm_chipkill.pdf|T. J. Dell, “A White Paper on the Benefits of Chipkill-Correct ECC for PC Server Main Memory,” IBM Microelectronics Division, 1997.]] * [[http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/35162.pdf|Bianca Schroeder et al., "DRAM Errors in the Wild: A Large-Scale Field Study", SIGMETRICS 2009.]] * [[http://research.microsoft.com/pubs/144888/eurosys84-nightingale.pdf|Edmund B. Nightingale et al., "Cycles, cells and platters: an empirical analysisof hardware failures on a million consumer PCs", EuroSys 2011.]] * **[[http://passat.crhc.illinois.edu/rakeshk/hpca13.pdf|Xun Jian, Rakesh Kumar, "Adaptive Reliability Chipkill Correct (ARCC)", HPCA 2013.]]** === 11/5/2014 === === 10/31/2014 === Review required for the following two papers, due on Friday, Oct 31. * **[[http://www.cs.utexas.edu/~pingali/CS395T/2009fa/lectures/herlihy93transactional.pdf|Maurice Herlihy and J. Eliot B. Moss, "Transactional Memory: Architectural Support for Lock-Free Data Structures", ISCA 1993.]]** * [[http://web.mit.edu/~mmt/Public/Knight86.pdf|Tom Knight, "An achitecture for mostly functional languages", LFP 1986.]] * [[http://www.cs.rice.edu/~alc/old/comp520/papers/SW91.pdf|Frank Schmuck, Jim Wyllie, "Experience with transactions in QuickSilver", SOSP 1991.]] * [[http://cs.brown.edu/~mph/AspnesH90/p340-aspnes.pdf|James Aspnes, Maurice Herlihy, "Wait-Free Data Structures in the Asynchronous PRAM Model", SPAA 1990.]] * [[http://cs.brown.edu/~mph/Herlihy91/p124-herlihy.pdf|Maurice Herlihy, "Wait-free synchronization", TOPLAS 1991.]] * [[http://www.cs.utexas.edu/~pingali/CS395T/2009fa/lectures/herlihy93transactional.pdf|Maurice Herlihy, J. Eliot B. Moss, "Transactional Memory: Architectural Support for Lock-Free Data Structures", ISCA 1993.]] * [[http://csl.stanford.edu/~christos/publications/2006.bliss.taco.pdf|Ahmad Zmily and Christos Kozyrakis, "Block-Aware Instruction Set Architecture", TACO 2006.]] * [[http://web.stanford.edu/class/cs343/resources/crusoe.pdf|Alexander Klaiber, "The Technology Behind Crusoe™ Processors", 2000.]] * [[http://courses.cs.vt.edu/cs5204/fall11-kafura/Papers/TransactionalMemory/TM-Book-V2.pdf.pdf|J. Larus and R. Rajwar. Transactional Memory. Synthesis Lectures on Computer Architecture (Ch. 1 & 2).]] * [[http://www.cs.binghamton.edu/~dima/cs580a/spec_wake_micro00.pdf|Jared Stark, Mary D. Brown, Yale N. Patt, "On pipelining dynamic instruction scheduling logic", MICRO 2000.]] * [[http://www.christianjacobi.de/publications/jsg12_tx.pdf|Christian Jacobi, et al., "Transactional Memory Architecture and Implementation for IBM System Z", MICRO 2012.]] * [[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=931895|Sanjay J. Patel et al., “rePLay: a hardware framework for dynamic optimization,” IEEE TC, June 2001.]] * **[[http://www.cs.cmu.edu/~tcm/tcm_papers/isca00.pdf|J. Gregory Steffan et al., "A Scalable Approach to Thread-Level Speculation", ISCA 2000.]]** * [[ftp://ftp.cs.wisc.edu/sohi/papers/1992/isca.esw.pdf|Manoj Franklin, Gurindar S. Sohi, "The expandable split window paradigm for exploiting fine-grain parallelsim", ISCA 1992.]] * [[ftp://ftp.cs.wisc.edu/sohi/papers/1995/isca.multiscalar.pdf|Sohi et al., “Multiscalar Processors,” ISCA 1995.]] * [[http://www.cs.cmu.edu/~colohan/papers/tls_isca00.pdf|Steffan et al., “A Scalable Approach to Thread-Level Speculation,” ISCA 2000.]] * [[https://homes.cs.washington.edu/~luisceze/publications/isca06_bulk.pdf|Luis Ceze et al., "Bulk Disambiguation of Speculative Threads in Multiprocessors", ISCA 2006.]] * [[http://www.princeton.edu/~rblee/ELE572Papers/DynamicMultithreadingProc_akkary.pdf?q=tilde/rblee/ELE572Papers/DynamicMultithreadingProc_akkary.pdf|Akkary and Driscoll, “A dynamic multithreading processor,” MICRO 1998.]] Required videos for module 2.5.* in [[http://www.ece.cmu.edu/~ece740/f13/doku.php?id=schedule#schedule|18-740]]: | 9/25 Wed. | 2.5.1 Speculation | [[http://www.ece.cmu.edu/~ece742/f14/files/onur-740-fall13-module2.5-speculation.pdf|pdf]], [[http://www.ece.cmu.edu/~ece742/f14/files/onur-740-fall13-module2.5-speculation.pptx|pptx]], [[https://www.youtube.com/watch?v=g3IF8DTtr8c|YouTube]][[http://cmu.vid.acatar.com/Panopto/Pages/Viewer/Default.aspx?id=dbb3baf9-c85e-4007-8c71-1e3204fe9907|Panopto]] | [[http://www.ece.cmu.edu/~ece740/f13/doku.php?id=readings#module_2-5|readings]] | | 9/27 Fri. | 2.5.2 Speculation | [[https://www.youtube.com/watch?v=FqHk4bxrI8Y|video]] [[http://cmu.vid.acatar.com/Panopto/Pages/Viewer/Default.aspx?id=bf92ac2b-5fb5-4896-9bbd-6cb216a16cdd|Panopto]] | ::: | | 9/30 Mon. | 2.5.3 Speculation | [[https://www.youtube.com/watch?v=uhyNTy8hvDs|video]] [[http://cmu.vid.acatar.com/Panopto/Pages/Viewer/Default.aspx?id=7dbc66b0-b245-45ba-b047-09a84427676a|Panopto]] | ::: | | ::: | 2.5.4 Speculation | [[https://www.youtube.com/watch?v=McMyefc8CCE|video]] [[http://cmu.vid.acatar.com/Panopto/Pages/Viewer/Default.aspx?id=e9eddf7c-c5c0-4ab4-a731-30852d601506|Panopto]] | ::: | Related readings: * [[ftp://ftp.cs.wisc.edu/sohi/papers/1995/isca.multiscalar.pdf|Sohi et al., “Multiscalar Processors,” ISCA 1995.]] * [[http://classes.soe.ucsc.edu/cmpe202/Spring13/papers/12a.pdf|Zhou, “Dual-Core Execution: Building a Highly Scalable Single-Thread Instruction Window,” PACT 2005.]] * [[http://pages.cs.wisc.edu/~rajwar/papers/micro01.pdf|Rajwar and Goodman, “Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution,” MICRO 2001.]] === 10/16/2014 === Consistency II -- Review required (one out of two) due on Wednesday night * **[[http://www.eecg.toronto.edu/~moshovos/research/store-wait-free.pdf|Wenisch et al., "Mechanisms for Store-wait–free Multiprocessors", ISCA 2007]]** * [[http://www.cs.utexas.edu/~pingali/CS395T/2009fa/lectures/herlihy93transactional.pdf|Herlihy et al., "Transactional Memory: Architectural Support for Lock-Free Data Structures", ISCA 1993.]] * [[http://www.cs.cmu.edu/~tcm/tcm_papers/isca00.pdf|Steffan et al., "A Scalable Approach to Thread-Level Speculation", ISCA 2000.]] * **[[https://homes.cs.washington.edu/~luisceze/publications/isca07_bulksc.pdf|Ceze et al., "BulkSC: Bulk Enforcement of Sequential Consistency", ISCA 2007.]]** * [[https://homes.cs.washington.edu/~luisceze/publications/isca06_bulk.pdf|Ceze et al., "Bulk Disambiguation of Speculative Threads in Multiprocessors", ISCA 2006.]] * [[http://www.ece.cmu.edu/~ece740/f13/lib/exe/fetch.php?media=rajwar01.pdf|Rajwar and Goodman, “Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution”, MICRO 2001.]] * [[http://www.ece.cmu.edu/~ece740/f13/lib/exe/fetch.php?media=herlihy93.pdf|Herlihy and Moss, “Transactional Memory: Architectural Support for Lock-Free Data Structures”, ISCA 1993.]] === 10/14/2014 === Background lecture and paper (required reading, no need to review): * [[https://www.youtube.com/watch?v=Mq24MXW4g3U|Consistency & Coherence Lecture]] * [[http://courses.cs.washington.edu/courses/cse548/10wi/Lamport.pdf|Leslie Lamport, "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs", IEEE Trans. Computers 28(9): 690-691 (1979)]] Consistency I -- Required review (one out of two) due on Monday night * **[[http://hpc.cs.tsinghua.edu.cn/research/zwm/reading/prof/2a.pdf|Gharachorloo et al., "Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors", ISCA 1990.]]** * **[[https://courses.engr.illinois.edu/cs533/sp2012/reading_list/gharachorloo91two.pdf|Gharachorloo et al., "Two Techniques to Enhance the Performance of Memory Consistency Models", ICPP 1991.]]** * [[http://www.csd.uoc.gr/~hy460/pdf/kung.pdf|Kung et al., "On Optimistic Methods for Concurrency Control", TODS 1981.]] * [[https://engineering.purdue.edu/~vpai/Publications/ranganathan-spaa97.pdf|Ranganathan et al., "Using Speculative Retirement and Larger Instruction Windows to Narrow the Performance Gap Between Memory Consistency Models", SPAA 1997.]] * [[http://www.cs.arizona.edu/~gniady/papers/isca99_scrc.pdf|Gniady et al., "Is SC + ILP = RC?", ISCA 1999.]] Recommended book * [[http://www.morganclaypool.com/doi/pdfplus/10.2200/S00346ED1V01Y201104CAC016|A Primer on Memory Consistency and Cache Coherence, Chapters 1, 3, 4, 5]] Further reading in a different area (system level consistency) * [[http://sns.cs.princeton.edu/docs/eiger-nsdi13.pdf|Lloyd et al., "Stronger Semantics for Low-Latency Geo-Replicated Storage", NSDI 2013.]] === 10/9/2014 === Amirali's Literature Survey * [[http://www.cs.utah.edu/events/thememoryforum/mike.pdf|Connor et al., "Highlights of the high-bandwidth memory (HBM) standard", Memory Forum 2014.]] * [[https://cs.uwaterloo.ca/~brecht/courses/702/Possible-Readings/multiprocessor/tlb-consistency-computer-1990.pdf|Teller et al., "Translation-Lookaside Buffer Consistency", Computer 1990.]] * [[http://research.cs.wisc.edu/multifacet/papers/isca13_direct_segment.pdf|Basu et al., "Efficient virtual memroy support in big memory servers", ISCA 2013.]] Jiyuan's Paper Discussion: (required 1 out of 3 reviews) * **[[http://mercury.pr.erau.edu/~davisb22/papers/burst_scheduling_hpca13.pdf|Shao et al., "A Burst Scheduling Access Reordering Mechanism", HPCA 2007.]]** * [[http://users.ece.cmu.edu/~omutlu/pub/dram-aware-caches-TR-HPS-2010-002.pdf|Lee et al., "DRAM-Aware Last-Level Cache Writeback: Reducing Write-Caused Interference in Memory Systems", HPS 2010.]] * [[http://lca.ece.utexas.edu/people/kaseridis/papers/ISCA_2010.pdf]] * [["FIRM: Fair and High-Performance Memory Control for Persistent Memory Systems"]] * [[Adaptive History-Based Memory Schedulers]] * [["Self Optimizing Memory Controllers: A Reinforcement Learning Approach"]] * **[[http://www.cs.cmu.edu/~chensm/LBA_reading_group/papers/3Ddram-isca08.pdf|Loh et al., "3D-Stacked Memory Architectures for Multi-Core Processors", ISCA 2008.]]** * [[http://users.ece.gatech.edu/~moin/papers/micro12.pdf|Qureshi et al., "Fundamental Latency Trade-off in Architecting DRAM Caches: Outperforming Impractical SRAM-Tags with a Simple and Practical Design", MICRO 2012.]] * [[http://sampa.cs.washington.edu/papers/micro06_mshr.pdf|Tuck et al., "Scalable Cache Miss Handling for High Memory Level Parallelism", MICRO 2006.]] * [[http://comparch.gatech.edu/hparch/papers/sim_isca13.pdf|Sim et al., "Resilient die-stacked DRAM caches", ISCA 2013.]] * **[[http://www.dongpingzhang.com/wordpress/wp-content/uploads/2013/06/MSPC6-Zhang.pdf|Zhang et al., "A New Perspective on Processing-in-memory Architecture Design", MSPC 2013.]]** * [[https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CB4QFjAA&url=http%3A%2F%2Fisca2010.inria.fr%2Fmedia%2Fslides%2FISCA_Needle_A_0610.pptx&ei=9kE9VMOuMtLCsATIuIGQCQ&usg=AFQjCNGfI_qA9tHBnR8pJo50uRNYvgVEBw&sig2=OTCWpdAXmMmekm9jun8uBg&bvm=bv.77161500,d.cWc&cad=rja|Dally et al., "Moving the needle Computer Architecture Research in Academe and Industry", ISCA keynote 2010.]] === 10/7/2014 === Hui: (required 1 out of 3 reviews) * **[[http://tinker.cc.gatech.edu/pdfs/MICRO44_Jesse_Beu.pdf|Beu et al., "Manager-Client Pairing: A Framework for Implementing Coherence Hierarchies", MICRO 2011.]]** * **[[http://research.cs.wisc.edu/multifacet/papers/hpca14_quick_release.pdf|Hechtman et al., "Quick Release: A Throughput-oriented Approach to Release Consistency on GPUs", HPCA 2014.]]** * **[[http://dl.acm.org/citation.cfm?id=2541982|Voskuilen et al., "High-Performance Fractal Coherence", ASPLOS 2014.]]** === 10/2/2014 === Required review for the Memory Forum paper: ^ Due 9/28/2014 | [[http://www.cs.utah.edu/events/thememoryforum/kang.pdf|Kang et al., "Co-Architecting Controllers and DRAM to Enhance DRAM Process Scaling", Memory Forum 2014.]] | ^ | [[http://users.ece.cmu.edu/~omutlu/pub/salp-dram_isca12.pdf|Kim et al., "A