readings

Differences

This shows you the differences between two versions of the page.

readings [2014/11/05 17:00]
yixinluo
readings [2014/12/03 21:12] (current)
yixinluo
Line 6: Line 6:
  
 ==== Reading List (in reverse order) ==== ==== Reading List (in reverse order) ====
 +=== 12/3/2014 ===
 +  * Christian Jacobi, Timothy J. Slegel, Dan F. Greiner: Transactional Memory Architecture and Implementation for IBM System Z. MICRO 2012.
 +  * Jae-Woong Chung, Luke Yen, Stephan Diestelhorst,​ Martin Pohlack, Michael Hohmuth, David Christie, Dan Grossman: ASF: AMD64 Extension for Lock-Free Data Structures and Transactional Memory. MICRO 2010.
 +  * Jack B. Dennis, David Misunas: A Preliminary Architecture for a Basic Data Flow Processor. ISCA 1974.
 +  * James E. Smith, G. E. Dermer, B. D. Vanderwarn, S. D. Klinger, C. M. Rozewski, D. L. Fowler, K. R. Scidmore, James Laudon: The ZS-1 Central Processor. ASPLOS 1987.
 +  * James E. Smith: Decoupled access/execute computer architectures. ISCA 1982.
 +
 +=== 11/18/2014 ===
 +Reviews required for the following two papers, due on Monday, Nov 17.
 +  * **Justin Meza et al., "​Memory Errors at Scale: What the Trends Across a Billion-User Web Services Company Foretell",​ under submission. (Sent through email, please do not distribute)**
 +  * **[[https://​www.usenix.org/​legacy/​event/​sec08/​tech/​full_papers/​halderman/​halderman.pdf|J. Alex Halderman et al., "Lest We Remember: Cold Boot Attacks on Encryption Keys", USENIX Security Symposium 2008.]]**
 +
 +=== 11/13/2014 ===
 +Reviews required for both papers, due on Wednesday, Nov 12.
 +  * **[[http://​dl.acm.org/​citation.cfm?​id=2503257|Sridharan et al., "Feng shui of supercomputer memory: positional effects in DRAM and SRAM faults",​ SC 2013.]]**
 +    * [[https://www.cs.cmu.edu/~bianca/fast07.pdf|Bianca Schroeder, Garth A. Gibson, "Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You?", FAST 2007.]]
 +    * [[http://​www.pdl.cmu.edu/​PDL-FTP/​associated/​dsn06.pdf|Bianca Schroeder, Garth A. Gibson, "A large-scale study of failures in high-performance computing systems",​ DSN 2006.]]
 +    * [[http://​www.cs.toronto.edu/​~hwang/​papers/​asplos2012.pdf|Andy A. Hwang, et al., "​Cosmic rays don't strike twice: understanding the nature of DRAM errors and the implications for system design",​ ASPLOS 2012.]]
 +    * [[http://​static.googleusercontent.com/​media/​research.google.com/​en/​us/​pubs/​archive/​35162.pdf|Bianca Schroeder et al., "DRAM Errors in the Wild: A Large-Scale Field Study",​ SIGMETRICS 2009.]]
 +  * **[[https://​www.cs.princeton.edu/​~appel/​papers/​memerr.pdf|Sudhakar Govindavajhala,​ Andrew W. Appel, "Using Memory Errors to Attack a Virtual Machine",​ SP 2003.]]**
 +    * [[https://​www.usenix.org/​legacy/​event/​sec08/​tech/​full_papers/​halderman/​halderman.pdf|J. Alex Halderman et al., "Lest We Remember: Cold Boot Attacks on Encryption Keys", USENIX Security Symposium 2008.]]
 +
 +=== 11/12/2014 ===
 +Reviews required for both papers, due on Tuesday, Nov 11.
 +  * **[[http://​users.ece.utexas.edu/​~merez/​vecc_asplos_2010.pdf|Doe Hyun Yoon, Mattan Erez, "​Virtualized and flexible ECC for main memory",​ ASPLOS 2010.]]**
 +    * [[http://​ece.umd.edu/​courses/​enee759h.S2003/​references/​ibm_chipkill.pdf|T. J. Dell, “A White Paper on the Benefits of Chipkill-Correct ECC for PC Server Main Memory,” IBM Microelectronics Division, 1997.]]
 +    * [[http://​static.googleusercontent.com/​media/​research.google.com/​en/​us/​pubs/​archive/​35162.pdf|Bianca Schroeder et al., "DRAM Errors in the Wild: A Large-Scale Field Study",​ SIGMETRICS 2009.]]
 +    * [[http://research.microsoft.com/pubs/144888/eurosys84-nightingale.pdf|Edmund B. Nightingale et al., "Cycles, cells and platters: an empirical analysis of hardware failures on a million consumer PCs", EuroSys 2011.]]
 +  * **[[http://​passat.crhc.illinois.edu/​rakeshk/​hpca13.pdf|Xun Jian, Rakesh Kumar, "​Adaptive Reliability Chipkill Correct (ARCC)",​ HPCA 2013.]]**
 +
 === 11/5/2014 === === 11/5/2014 ===
 === 10/31/2014 === === 10/31/2014 ===
Line 11: Line 41:
   * **[[http://​www.cs.utexas.edu/​~pingali/​CS395T/​2009fa/​lectures/​herlihy93transactional.pdf|Maurice Herlihy and J. Eliot B. Moss, "​Transactional Memory: Architectural Support for Lock-Free Data Structures",​ ISCA 1993.]]**   * **[[http://​www.cs.utexas.edu/​~pingali/​CS395T/​2009fa/​lectures/​herlihy93transactional.pdf|Maurice Herlihy and J. Eliot B. Moss, "​Transactional Memory: Architectural Support for Lock-Free Data Structures",​ ISCA 1993.]]**
     * [[http://web.mit.edu/~mmt/Public/Knight86.pdf|Tom Knight, "An architecture for mostly functional languages", LFP 1986.]]     * [[http://web.mit.edu/~mmt/Public/Knight86.pdf|Tom Knight, "An architecture for mostly functional languages", LFP 1986.]]
 +    * [[http://​www.cs.rice.edu/​~alc/​old/​comp520/​papers/​SW91.pdf|Frank Schmuck, Jim Wyllie, "​Experience with transactions in QuickSilver",​ SOSP 1991.]]
     * [[http://​cs.brown.edu/​~mph/​AspnesH90/​p340-aspnes.pdf|James Aspnes, Maurice Herlihy, "​Wait-Free Data Structures in the Asynchronous PRAM Model",​ SPAA 1990.]]     * [[http://​cs.brown.edu/​~mph/​AspnesH90/​p340-aspnes.pdf|James Aspnes, Maurice Herlihy, "​Wait-Free Data Structures in the Asynchronous PRAM Model",​ SPAA 1990.]]
     * [[http://​cs.brown.edu/​~mph/​Herlihy91/​p124-herlihy.pdf|Maurice Herlihy, "​Wait-free synchronization",​ TOPLAS 1991.]]     * [[http://​cs.brown.edu/​~mph/​Herlihy91/​p124-herlihy.pdf|Maurice Herlihy, "​Wait-free synchronization",​ TOPLAS 1991.]]
 +    * [[http://​www.cs.utexas.edu/​~pingali/​CS395T/​2009fa/​lectures/​herlihy93transactional.pdf|Maurice Herlihy, J. Eliot B. Moss, "​Transactional Memory: Architectural Support for Lock-Free Data Structures",​ ISCA 1993.]]
 +    * [[http://​csl.stanford.edu/​~christos/​publications/​2006.bliss.taco.pdf|Ahmad Zmily and Christos Kozyrakis, "​Block-Aware Instruction Set Architecture",​ TACO 2006.]]
 +    * [[http://​web.stanford.edu/​class/​cs343/​resources/​crusoe.pdf|Alexander Klaiber, "The Technology Behind Crusoe™ Processors",​ 2000.]]
     * [[http://​courses.cs.vt.edu/​cs5204/​fall11-kafura/​Papers/​TransactionalMemory/​TM-Book-V2.pdf.pdf|J. Larus and R. Rajwar. Transactional Memory. Synthesis Lectures on Computer Architecture (Ch. 1 & 2).]]     * [[http://​courses.cs.vt.edu/​cs5204/​fall11-kafura/​Papers/​TransactionalMemory/​TM-Book-V2.pdf.pdf|J. Larus and R. Rajwar. Transactional Memory. Synthesis Lectures on Computer Architecture (Ch. 1 & 2).]]
     * [[http://​www.cs.binghamton.edu/​~dima/​cs580a/​spec_wake_micro00.pdf|Jared Stark, Mary D. Brown, Yale N. Patt, "On pipelining dynamic instruction scheduling logic",​ MICRO 2000.]]     * [[http://​www.cs.binghamton.edu/​~dima/​cs580a/​spec_wake_micro00.pdf|Jared Stark, Mary D. Brown, Yale N. Patt, "On pipelining dynamic instruction scheduling logic",​ MICRO 2000.]]
 +    * [[http://​www.christianjacobi.de/​publications/​jsg12_tx.pdf|Christian Jacobi, et al., "​Transactional Memory Architecture and Implementation for IBM System Z", MICRO 2012.]]
 +    * [[http://​ieeexplore.ieee.org/​xpls/​abs_all.jsp?​arnumber=931895|Sanjay J. Patel et al., “rePLay: a hardware framework for dynamic optimization,​” IEEE TC, June 2001.]]
   * **[[http://​www.cs.cmu.edu/​~tcm/​tcm_papers/​isca00.pdf|J. Gregory Steffan et al., "A Scalable Approach to Thread-Level Speculation",​ ISCA 2000.]]**   * **[[http://​www.cs.cmu.edu/​~tcm/​tcm_papers/​isca00.pdf|J. Gregory Steffan et al., "A Scalable Approach to Thread-Level Speculation",​ ISCA 2000.]]**
     * [[ftp://ftp.cs.wisc.edu/sohi/papers/1992/isca.esw.pdf|Manoj Franklin, Gurindar S. Sohi, "The expandable split window paradigm for exploiting fine-grain parallelism", ISCA 1992.]]     * [[ftp://ftp.cs.wisc.edu/sohi/papers/1992/isca.esw.pdf|Manoj Franklin, Gurindar S. Sohi, "The expandable split window paradigm for exploiting fine-grain parallelism", ISCA 1992.]]
     * [[ftp://​ftp.cs.wisc.edu/​sohi/​papers/​1995/​isca.multiscalar.pdf|Sohi et al., “Multiscalar Processors,​” ISCA 1995.]]     * [[ftp://​ftp.cs.wisc.edu/​sohi/​papers/​1995/​isca.multiscalar.pdf|Sohi et al., “Multiscalar Processors,​” ISCA 1995.]]
     * [[http://​www.cs.cmu.edu/​~colohan/​papers/​tls_isca00.pdf|Steffan et al., “A Scalable Approach to Thread-Level Speculation,​” ISCA 2000.]]     * [[http://​www.cs.cmu.edu/​~colohan/​papers/​tls_isca00.pdf|Steffan et al., “A Scalable Approach to Thread-Level Speculation,​” ISCA 2000.]]
 +    * [[https://​homes.cs.washington.edu/​~luisceze/​publications/​isca06_bulk.pdf|Luis Ceze et al., "Bulk Disambiguation of Speculative Threads in Multiprocessors",​ ISCA 2006.]]
 +    * [[http://​www.princeton.edu/​~rblee/​ELE572Papers/​DynamicMultithreadingProc_akkary.pdf?​q=tilde/​rblee/​ELE572Papers/​DynamicMultithreadingProc_akkary.pdf|Akkary and Driscoll, “A dynamic multithreading processor,​” MICRO 1998.]]
 Required videos for module 2.5.* in [[http://​www.ece.cmu.edu/​~ece740/​f13/​doku.php?​id=schedule#​schedule|18-740]]:​ Required videos for module 2.5.* in [[http://​www.ece.cmu.edu/​~ece740/​f13/​doku.php?​id=schedule#​schedule|18-740]]:​
 | 9/25 Wed. | 2.5.1 Speculation | [[http://​www.ece.cmu.edu/​~ece742/​f14/​files/​onur-740-fall13-module2.5-speculation.pdf|pdf]],​ [[http://​www.ece.cmu.edu/​~ece742/​f14/​files/​onur-740-fall13-module2.5-speculation.pptx|pptx]],​ [[https://​www.youtube.com/​watch?​v=g3IF8DTtr8c|YouTube]][[http://​cmu.vid.acatar.com/​Panopto/​Pages/​Viewer/​Default.aspx?​id=dbb3baf9-c85e-4007-8c71-1e3204fe9907|Panopto]] | [[http://​www.ece.cmu.edu/​~ece740/​f13/​doku.php?​id=readings#​module_2-5|readings]] | | 9/25 Wed. | 2.5.1 Speculation | [[http://​www.ece.cmu.edu/​~ece742/​f14/​files/​onur-740-fall13-module2.5-speculation.pdf|pdf]],​ [[http://​www.ece.cmu.edu/​~ece742/​f14/​files/​onur-740-fall13-module2.5-speculation.pptx|pptx]],​ [[https://​www.youtube.com/​watch?​v=g3IF8DTtr8c|YouTube]][[http://​cmu.vid.acatar.com/​Panopto/​Pages/​Viewer/​Default.aspx?​id=dbb3baf9-c85e-4007-8c71-1e3204fe9907|Panopto]] | [[http://​www.ece.cmu.edu/​~ece740/​f13/​doku.php?​id=readings#​module_2-5|readings]] |
Line 30: Line 68:
  
 === 10/16/2014 === === 10/16/2014 ===
-Review required (one out of two) due on Wednesday night +Consistency II -- Review required (one out of two) due on Wednesday night
   * **[[http://​www.eecg.toronto.edu/​~moshovos/​research/​store-wait-free.pdf|Wenisch et al., "​Mechanisms for Store-wait–free Multiprocessors",​ ISCA 2007]]**   * **[[http://​www.eecg.toronto.edu/​~moshovos/​research/​store-wait-free.pdf|Wenisch et al., "​Mechanisms for Store-wait–free Multiprocessors",​ ISCA 2007]]**
     * [[http://​www.cs.utexas.edu/​~pingali/​CS395T/​2009fa/​lectures/​herlihy93transactional.pdf|Herlihy et al., "​Transactional Memory: Architectural Support for Lock-Free Data Structures",​ ISCA 1993.]]     * [[http://​www.cs.utexas.edu/​~pingali/​CS395T/​2009fa/​lectures/​herlihy93transactional.pdf|Herlihy et al., "​Transactional Memory: Architectural Support for Lock-Free Data Structures",​ ISCA 1993.]]
Line 43: Line 81:
   * [[https://​www.youtube.com/​watch?​v=Mq24MXW4g3U|Consistency & Coherence Lecture]]   * [[https://​www.youtube.com/​watch?​v=Mq24MXW4g3U|Consistency & Coherence Lecture]]
   * [[http://​courses.cs.washington.edu/​courses/​cse548/​10wi/​Lamport.pdf|Leslie Lamport, "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs",​ IEEE Trans. Computers 28(9): 690-691 (1979)]]   * [[http://​courses.cs.washington.edu/​courses/​cse548/​10wi/​Lamport.pdf|Leslie Lamport, "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs",​ IEEE Trans. Computers 28(9): 690-691 (1979)]]
-Required review (one out of two) due on Monday night +Consistency I -- Required review (one out of two) due on Monday night
   * **[[http://​hpc.cs.tsinghua.edu.cn/​research/​zwm/​reading/​prof/​2a.pdf|Gharachorloo et al., "​Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors",​ ISCA 1990.]]**   * **[[http://​hpc.cs.tsinghua.edu.cn/​research/​zwm/​reading/​prof/​2a.pdf|Gharachorloo et al., "​Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors",​ ISCA 1990.]]**
   * **[[https://​courses.engr.illinois.edu/​cs533/​sp2012/​reading_list/​gharachorloo91two.pdf|Gharachorloo et al., "Two Techniques to Enhance the Performance of Memory Consistency Models",​ ICPP 1991.]]**   * **[[https://​courses.engr.illinois.edu/​cs533/​sp2012/​reading_list/​gharachorloo91two.pdf|Gharachorloo et al., "Two Techniques to Enhance the Performance of Memory Consistency Models",​ ICPP 1991.]]**
Line 84: Line 122:
 Required review for the Memory Forum paper: Required review for the Memory Forum paper:
 ^ Due 9/28/2014 | [[http://​www.cs.utah.edu/​events/​thememoryforum/​kang.pdf|Kang et al., "​Co-Architecting Controllers and DRAM to Enhance DRAM Process Scaling",​ Memory Forum 2014.]] | ^ Due 9/28/2014 | [[http://​www.cs.utah.edu/​events/​thememoryforum/​kang.pdf|Kang et al., "​Co-Architecting Controllers and DRAM to Enhance DRAM Process Scaling",​ Memory Forum 2014.]] |
-^ | [[http://​users.ece.cmu.edu/​~omutlu/​pub/​salp-dram_isca12.pdf|Kim et al., "​A ​Case for Exploiting Subarray-Level Parallelism (SALP) in DRAM", ISCA 2012.]] | +^ | [[http://​users.ece.cmu.edu/​~omutlu/​pub/​salp-dram_isca12.pdf|Kim et al., "​A ​
- +
-=== 10/1/2014 === +
-Amirali'​s literature survey: +
- +
-=== 9/30/2014 === +
-Doru's literature survey: +
-  * **[[http://​users.elis.ugent.be/​~leeckhou/​papers/​isca13.pdf|Bois et al., "​Criticality Stacks: Identifying Critical Threads in Parallel Programs using Synchronization Behavior",​ ISCA 2013.]]** +
-  * **[[http://​www.istc-cc.cmu.edu/​publications/​papers/​2013/​joao_isca13_preprint.pdf|Joao et al., "​Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs", ISCA 2013.]]** +
-    * [[http://​users.elis.ugent.be/​~leeckhou/​papers/​isca12-2.pdf|Craeynest et al., "​Scheduling Heterogeneous Multi-Cores through Performance Impact Estimation (PIE)",​ ISCA 2012.]] +
-    * [[http://​webdocs.cs.ualberta.ca/​~amaral/​courses/​605/​papers/​DuesterwaldEtAl.pdf|Evelyn Duesterwald et al., "​Characterizing and Predicting Program Behavior and its Variability",​ PACT 2003.]] +
-  * **[[http://hps.ece.utexas.edu/pub/morphcore_micro2012.pdf|Khubaib et al., "MorphCore: An Energy-Efficient Microarchitecture for High Performance ILP and High Throughput TLP", MICRO 2012.]]** +
-    * [[http://​cccp.eecs.umich.edu/​papers/​lukefahr_micro12.pdf|Andrew Lukefahr et al., "​Composite Cores: Pushing Heterogeneity Into a Core", MICRO 2012.]] +
-    * [[http://​cccp.eecs.umich.edu/​papers/​shrupad_micro13.pdf|Shruti Padmanabha et al., "Trace based phase prediction for tightly-coupled heterogeneous cores",​ MICRO 2013.]] +
-    * [[http://​m3.csl.cornell.edu/​papers/​isca07.pdf|Engin Ipek et al., "Core fusion: accommodating software diversity in chip multiprocessors",​ ISCA 2007.]] +
-    * [[http://​users.ece.cmu.edu/​~omutlu/​pub/​heterogeneous-block-architecture_iccd14.pdf|Chris Fallin et al., "The Heterogeneous Block Architecture",​ ICCD 2014.]] +
-    * [[http://​hps.ece.utexas.edu/​pub/​TR-HPS-2014-001.pdf|Carlos Villavieja et al., "Yoga: A Hybrid Dynamic VLIW/OoO Processor",​ HPS Tech Report 2014.]] +
- +
-Yang's literature survey: +
-  * [[http://​plaza.ufl.edu/​chaol/​File/​Enabling-HPCA-2013.pdf|Chao Li et al., "​Enabling distributed generation powered sustainable high-performance data center",​ HPCA 2013.]] +
-  * [[https://​www.usenix.org/​system/​files/​conference/​icac14/​icac14-paper-li_chao.pdf|Chao Li et al., "​Managing Green Datacenters Powered by Hybrid Renewable Energy Systems",​ ICAC 2014.]] +
-  * [[http://​ieeexplore.ieee.org/​xpls/​abs_all.jsp?​arnumber=6672933&​tag=1|Sen Li et al., "Data center power control for frequency regulation",​ PES 2014.]] +
- +
- +
-=== 9/25/2014 === +
-  * [[http://​www.cs.utah.edu/​~rajeev/​pubs/​micro12.pdf|Chatterjee et al., "​Leveraging Heterogeneity in DRAM Main Memories to Accelerate Critical Word Access",​ MICRO 2012.]] +
-  * [[http://​parsa.epfl.ch/​cloudsuite/​cloudsuite.html|CloudSuite]] +
- +
-=== 9/24/2014 === +
-Kevin'​s literature survey: +
-  * **[[http://​users.ece.cmu.edu/​~omutlu/​pub/​raidr-dram-refresh_isca12.pdf|Liu et al., "​RAIDR:​ Retention-Aware Intelligent DRAM Refresh",​ ISCA 2012.]]** +
-    * [[http://​arch.ece.gatech.edu/​pub/​micro40.pdf|Mrinmoy Ghosh et al., "Smart Refresh: An Enhanced Memory Controller Design for Reducing Energy in Conventional and 3D Die-Stacked DRAMs",​ MICRO 2007.]] +
-    * [[http://​www.cs.utah.edu/​events/​thememoryforum/​kang.pdf|Kang et al., "​Co-Architecting Controllers and DRAM to Enhance DRAM Process Scaling",​ Memory Forum 2014.]] +
-    * [[http://people.engr.ncsu.edu/ericro/publications/conference_HPCA-12.pdf|Ravi K. Venkatesan et al., "Retention-aware placement in DRAM (RAPID): software methods for quasi-non-volatile DRAM", HPCA 2006.]] +
-    * [[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4154211&tag=1|K. Ohyu et al., "Quantitative identification for the physical origin of variable retention time: A vacancy-oxygen complex defect model", IEDM 2006.]] +
-    * [[http://​ieeexplore.ieee.org/​xpls/​abs_all.jsp?​arnumber=5970110|Heesang Kim et al., "​Characterization of the Variable Retention Time in Dynamic Random Access Memory",​ TED 2011.]] +
-  * **[[http://​users.ece.cmu.edu/​~omutlu/​pub/​dram-retention-time-characterization_isca13.pdf|Liu et al., "An Experimental Study of Data Retention Behavior in Modern DRAM Devices: Implications for Retention Time Profiling Mechanisms",​ ISCA 2013.]]** +
-  * **[[http://​www.ece.cmu.edu/​~safari/​pubs/​error-mitigation-for-intermittent-dram-failures_sigmetrics14.pdf|Khan et al., "The Efficacy of Error Mitigation Techniques for DRAM Retention Failures: A Comparative Experimental Study",​ SIGMETRICS 2014.]]** +
- +
-=== 9/23/2014 === +
-Hui's literature survey: +
-  * [[https://​www.ece.ubc.ca/​~aamodt/​papers/​isingh.hpca2013.pdf|Singh et al., "Cache Coherence for GPU Architectures",​ HPCA 2013.]] +
-  * [[http://​research.cs.wisc.edu/​multifacet/​papers/​micro13_hsc.pdf|Power et al., "​Heterogeneous System Coherence for Integrated CPU-GPU Systems",​ MICRO 2013.]] +
-  * [[http://​users.crhc.illinois.edu/​djohns53/​pub/​cohesion-isca2010.pdf|Kelm et al., "​Cohesion:​ A Hybrid Memory Model for Accelerators",​ ISCA 2010.]] +
- +
-Jiyuan'​s literature survey: +
-  * **[[http://​cseweb.ucsd.edu/​~swanson/​papers/​ASPLOS2011Prefetching.pdf|Kamruzzaman et al., "​Inter-core Prefetching for Multicore Processors Using Migrating Helper Threads",​ ASPLOS 2011.]]** +
-    * [[http://​www.cs.ucf.edu/​~zhou/​dce_pact_2005_ieee.pdf|Zhou et al., "​Dual-Core Execution: Building a Highly Scalable Single-Thread Instruction Window",​ PACT 2005.]] +
-    * [[http://​people.engr.ncsu.edu/​ericro/​publications/​conference_ASPLOS-9.pdf|Sundaramoorthy et al., "​Slipstream Processors: Improving both Performance and Fault Tolerance",​ ASPLOS 2000.]] +
-  * **[[http://​www.cs.utah.edu/​wondp/​sqrl.pdf|Kumar et al., "SQRL: Hardware Accelerator for Collecting Software Data Structures",​ PACT 2014.]]** +
-  * **[[http://​www.cse.ust.hk/​catalac/​papers/​scatter_sc07.pdf|He et al., "​Efficient Gather and Scatter Operations on Graphics Processors",​ SC 2007.]]** +
-    * [[http://​www.cs.utah.edu/​~ald/​pubs/​hpca99.pdf|Carter et al., "​Impulse:​ Building a Smarter Memory Controller",​ HPCA 1999.]] +
-    * [[http://​www.cs.utah.edu/​~rajeev/​pubs/​asplos10.pdf|Sudan et al., "​Micro-Pages:​ Increasing DRAM Efficiency with Locality-Aware Data Placement",​ ASPLOS 2010.]] +
- +
-=== 9/18/2014 === +
-Yang: (required 1 out of 3 reviews) +
-  * **[[http://​users.ece.cmu.edu/​~omutlu/​pub/​stfm_micro07.pdf|Mutlu et al., "​Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors",​ MICRO 2007.]]** +
-    * [[http://​users.elis.ugent.be/​~leeckhou/​papers/​isca12-2.pdf|Craeynest et al., "​Scheduling Heterogeneous Multi-Cores through Performance Impact Estimation (PIE)",​ ISCA 2012.]] +
-    * [[http://​mprc.pku.edu.cn/​~liuxianhua/​chn/​corpus/​Notes/​articles/​isca/​ISCA2002/​p47.pdf|Fields et al., "​Slack:​ Maximizing Performance Under Technological Constraints",​ ISCA 2002.]] +
-    * [[http://​cadal3.cse.nsysu.edu.tw/​seminar/​seminar_file/​2002/​10/​Focusing%20processor%20policies%20via%20critical-path%20prediction.pdf|Fields et al., "​Focusing Processor Policies via Critical-Path Prediction",​ ISCA 2001.]] +
-    * [[http://​www.ece.ncsu.edu/​arpers/​Papers/​faircaching.pdf|Kim et al., "Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture",​ PACT 2004.]] +
-  * **[[ftp://ftp-sop.inria.fr/maestro/Sigmetrics-Performance-2012-papers-and-posters/p295.pdf|Xu et al., "Providing Fairness on Shared-Memory Multiprocessors via Process Scheduling", SIGMETRICS 2012.]]** +
-    * [[http://​cseweb.ucsd.edu/​~calder/​papers/​ASPLOS-02-SimPoint.pdf|Sherwood et al., "​Automatically characterizing large scale program behavior",​ ASPLOS 2002.]] +
-    * [[http://​cseweb.ucsd.edu/​~calder/​papers/​ISCA-03-Phase.pdf|Sherwood et al., "Phase Tracking and Prediction",​ ISCA 2003.]] +
-    * [[http://​www.csl.cornell.edu/​~martinez/​doc/​isca13-ghose.pdf|Ghose et al., "​Improving memory scheduling via processor-side load criticality information",​ ISCA 2013.]] +
-    * [[http://​users.eecs.northwestern.edu/​~rjoseph/​eecs453/​papers/​quereshi-micro2006.pdf|Qureshi et al., "​Utility-Based Cache Partitioning:​ A Low-Overhead,​ High-Performance,​ Runtime Mechanism to Partition Shared Caches",​ MICRO 2006.]] +
-    * [[http://​ieeexplore.ieee.org/​xpls/​abs_all.jsp?​arnumber=165388|Stone et al., "​Optimal partitioning of cache memory",​ IEEE Trans. 1992.]] +
-  * **[[http://​users.ece.cmu.edu/​~omutlu/​pub/​fairness-via-throttling_acm_tocs12.pdf|Ebrahimi et al., "​Fairness via Source Throttling: A Configurable and High-Performance Fairness Substrate for Multi-Core Memory Systems",​ TOCS 2012.]]** +
- +
-=== 9/16/2014 === +
-Amirali: (required 1 out of 3 reviews) +
-  * **[[http://​users.ece.cmu.edu/​~omutlu/​pub/​rowclone_micro13.pdf|Seshadri et al., "​RowClone:​ Fast and Efficient In-DRAM Copy and Initialization of Bulk Data", MICRO 2013.]]** +
-    * [[http://​scale.eecs.berkeley.edu/​papers/​mmp-asplos2002.pdf|Witchel et al., "​Mondrian Memory Protection",​ ASPLOS 2002.]] +
-    * [[http://​www.info.uni-karlsruhe.de/​lehre/​2002SS/​uebau2/​papers/​ChilimbiHillLarus-1999.pdf|Chilimbi et al., "​Cache-Conscious Structure Layout",​ PLDI 1999.]] +
-    * [[http://​dl.acm.org/​citation.cfm?​id=301635|Chilimbi et al., "​Cache-conscious structure definition",​ PLDI 1999.]] +
-    * [[http://​www.cs.tufts.edu/​comp/​150CMP/​papers/​chilimbi02prefetching.pdf|Chilimbi et al., "​Dynamic Hot Data Stream Prefetching for General-Purpose Programs",​ PLDI 2002.]] +
-    * [[http://​ieeexplore.ieee.org/​stamp/​stamp.jsp?​arnumber=4115697|Kogge et al., "​EXECUBE - A New Architecture for Scalable MPPs", ICPP 1994.]] +
-    * [[http://​www.ai.mit.edu/​projects/​aries/​course/​notes/​terasys.pdf|Gokhale et al., "​Processing in memory: The Terasys massively parallel PIM array."​ Computer 28.4 1995.]] +
-    * [[http://​www.eecg.toronto.edu/​~dunc/​cram/#​Bibliography|The Computational RAM (C-RAM) project.]] +
-    * [[http://​pages.cs.wisc.edu/​~isca2005/​papers/​04B-03.PDF|Cantin et al., "​Improving Multiprocessor Performance with Coarse-Grain Coherence Tracking",​ ISCA 2005.]] +
-  * **[[http://​users.ece.cmu.edu/​~omutlu/​pub/​bdi-compression_pact12.pdf|Pekhimenko et al., "​Base-Delta-Immediate Compression:​ Practical Data Compression for On-Chip Caches",​ PACT 2012.]]** +
-    * [[http://​ieeexplore.ieee.org/​xpls/​abs_all.jsp?​arnumber=6657054|Chen et al., "Free ECC: An efficient error protection for compressed last-level caches",​ ICCD 2013.]] +
-    * [[http://​taco.cse.tamu.edu/​pdfs/​p53-tian.pdf|Tian et al., "​Last-Level Cache Deduplication",​ ICS 2014.]] +
-    * [[http://​dl.acm.org/​citation.cfm?​id=2370864|Sathish et al., "​Lossless and lossy memory I/O link compression for improving performance of GPGPU workloads",​ PACT 2012.]] +
-  * **[[http://​users.ece.cmu.edu/​~omutlu/​pub/​linearly-compressed-pages_micro13.pdf|Pekhimenko et al., "​Linearly Compressed Pages: A Low-Complexity,​ Low-Latency Main Memory Compression Framework",​ MICRO 2013.]]** +
- +
-=== 9/11/2014 === +
-Doru: (required 2 out of 3 reviews) +
-  * **[[http://users.elis.ugent.be/~seyerman/ISCA10.pdf|Eyerman et al., "Modeling critical sections in Amdahl’s law and its implications for multicore design", ISCA 2010.]]** +
-    * [[http://​research.cs.wisc.edu/​multifacet/​papers/​tr1593_amdahl_multicore.pdf|Hill et al., "​Amdahl’s Law in the Multicore Era", HPCA 2008.]] +
-  * **[[http://​users.ece.cmu.edu/​~omutlu/​pub/​bottleneck-identification-and-scheduling_asplos12.pdf|Joao et al., "​Bottleneck Identification and Scheduling in Multithreaded Applications",​ ASPLOS 2012.]]** +
-    * [[http://​users.ece.cmu.edu/​~omutlu/​pub/​acs_asplos09.pdf|Suleman et al., "​Accelerating Critical Section Execution with Asymmetric Multi-Core Architectures",​ ASPLOS 2009.]] +
-    * [[http://​www.ann.ece.ufl.edu/​courses/​eel6686_14spr/​papers/​MeetingPointsUsingThreadCriticalitytToAdaptToMulticoreHardwareToParallelRegions.pdf|Cai et al., "​Meeting Points: Using Thread Criticality to Adapt Multicore Hardware to Parallel Regions",​ PACT 2008.]] +
-    * [[http://​users.ece.cmu.edu/​~omutlu/​pub/​srinath_hpca07.pdf|Srinath et al., "​Feedback Directed Prefetching:​ Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers",​ HPCA 2007.]] +
-    * [[http://​mrmgroup.cs.princeton.edu/​papers/​abhattac-isca2009.pdf|Bhattacharjee et al., "​Thread Criticality Predictors for Dynamic Performance,​ Power, and Resource Management in Chip Multiprocessors",​ ISCA 2009.]] +
-    * [[http://​users.ece.cmu.edu/​~omutlu/​pub/​dm_isca10.pdf|Suleman et al., "Data Marshaling for Multi-core Systems",​ ISCA 2010.]] +
-  * **[[http://​users.ece.cmu.edu/​~omutlu/​pub/​mise-predictable_memory_performance-hpca13.pdf|Subramanian et al., "MISE: Providing Performance Predictability and Improving Fairness in Shared Main Memory Systems",​ HPCA 2013.]]** +
-    * [[http://hps.ece.utexas.edu/pub/morphcore_micro2012.pdf|Khubaib et al., "MorphCore: An Energy-Efficient Microarchitecture for High Performance ILP and High Throughput TLP", MICRO 2012.]] +
-    * [[http://​m3.csl.cornell.edu/​papers/​isca07.pdf|Ipek et al., "Core Fusion: Accommodating Software Diversity in Chip Multiprocessors",​ ISCA 2007.]] +
-    * [[http://​www.istc-cc.cmu.edu/​publications/​papers/​2013/​joao_isca13_preprint.pdf|Joao et al., "​Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs", ISCA 2013.]] +
-    * [[http://​cseweb.ucsd.edu/​~calder/​papers/​ISCA-03-Phase.pdf|Sherwood et al., "Phase Tracking and Prediction",​ ISCA 2003.]] +
-    * [[http://​www.cs.rochester.edu/​~ipek/​micro08.pdf|Bitirgen et al., "​Coordinated Management of Multiple Interacting Resources in Chip Multiprocessors:​ A Machine Learning Approach",​ MICRO 2008.]] +
- +
-=== 9/9/2014 === +
-Papers discussed in class by Kevin (and their related papers): +
-  * **[[http://​users.ece.cmu.edu/​~omutlu/​pub/​dirty-block-index_isca14.pdf|Seshadri et al., "The Dirty-Block Index",​ ISCA 2014.]]** +
-    * [[http://​users.ece.utexas.edu/​~merez/​vecc_asplos_2010.pdf|Yoon et al., "​Virtualized and Flexible ECC for Main Memory",​ ASPLOS 2010.]] +
-    * [[https://​www.cs.sfu.ca/​~ashriram/​publications/​2012_MICRO_AmoebaCache.pdf|Kumar et al., "​Amoeba-Cache:​ Adaptive Blocks for Eliminating Waste in the Memory Hierarchy",​ MICRO 2012.]] +
-    * [[http://​pages.cs.wisc.edu/​~isca2005/​papers/​04B-03.PDF|Cantin et al., "​Improving Multiprocessor Performance with Coarse-Grain Coherence Tracking",​ ISCA 2005.]] +
-  * **[[http://​users.ece.cmu.edu/​~omutlu/​pub/​staged-memory-scheduling_isca12.pdf|Ausavarungnirun et al., "​Staged Memory Scheduling: Achieving High Performance and Scalability in Heterogeneous Systems",​ ISCA 2012.]]** +
-    * [[https://​www.usenix.org/​legacy/​publications/​library/​proceedings/​osdi/​full_papers/​waldspurger.pdf|Waldspurger et al., "​Lottery Scheduling: Flexible Proportional-Share Resource Management",​ OSDI 1994.]] +
-    * [[http://​www.eecg.toronto.edu/​~moshovos/​ACA05/​read/​complexity.pdf|Palacharla et al., "​Complexity-Effective Superscalar Processors",​ ISCA 1997.]] +
-    * [[http://​users.ece.cmu.edu/​~omutlu/​pub/​parbs_isca08.pdf|Mutlu et al., "​Parallelism-Aware Batch Scheduling: Enabling High-Performance and Fair Memory Controllers",​ ISCA 2008.]] +
-    * [[http://​cfall.in/​pubs/​micro2011_pams.pdf|Fallin et al., "​Parallel Application Memory Scheduling",​ MICRO 2011.]] +
-  * **[[http://​users.ece.cmu.edu/​~omutlu/​pub/​dram-access-refresh-parallelization_hpca14.pdf|Chang et al., "​Improving DRAM Performance by Parallelizing Refreshes with Accesses",​ HPCA 2014.]]** +
- +
-=== 9/3/2014 === +
-| DRAM arch. | [[http://​users.ece.cmu.edu/​~omutlu/​pub/​salp-dram_isca12.pdf|Kim et al., "A Case for Exploiting Subarray-Level Parallelism (SALP) in DRAM", ISCA 2012.]] | +
-| DRAM arch. | [[http://​users.ece.cmu.edu/​~omutlu/​pub/​raidr-dram-refresh_isca12.pdf|Liu et al., "​RAIDR:​ Retention-Aware Intelligent DRAM Refresh",​ ISCA 2012.]] | +
-| Flash | [[http://​users.ece.cmu.edu/​~omutlu/​pub/​flash-error-analysis-and-management_itj13.pdf|Cai et al., "Error Analysis and Retention-Aware Error Management for NAND Flash Memory",​ ITJ Vol. 17-1 2013.]] | +
-| DRAM reliab. | [[http://​users.ece.cmu.edu/​~omutlu/​pub/​error-mitigation-for-intermittent-dram-failures_sigmetrics14.pdf|Khan et al., "The Efficacy of Error Mitigation Techniques for DRAM Retention Failures: A Comparative Experimental Study",​ SIGMETRICS 2014.]] | +
-| Reliability | [[http://​www.crhc.illinois.edu/​ACS/​pub/​branchflip.pdf|Wang et al., "​Y-Branches:​ When You Come to a Fork in the Road, Take It", PACT 2003.]] | +
-| Reliability | [[http://​users.ece.cmu.edu/​~omutlu/​pub/​heterogeneous-reliability-memory-for-data-centers_dsn14.pdf|Luo et al., "​Characterizing Application Memory Error Vulnerability to Optimize Data Center Cost", DSN 2014.]] | +
-| Security | [[https://​www.cs.princeton.edu/​~appel/​papers/​memerr.pdf|Govindavajhala et al., "Using Memory Errors to Attack a Virtual Machine",​ SP 2003.]] | +
-| 3d stacking | [[http://​www.cs.cmu.edu/​~chensm/​LBA_reading_group/​papers/​3Ddram-isca08.pdf|Loh et al., "​3D-Stacked Memory Architectures for Multi-core Processors",​ ISCA 2008.]] | +
-| 3d stacking | [[http://​pdf.aminer.org/​000/​499/​580/​die_stacking_d_microarchitecture.pdf|Black et al., "Die Stacking (3D) Microarchitecture",​ MICRO 2006.]] | +
-| In mem comp. | [[http://​ieeexplore.ieee.org/​stamp/​stamp.jsp?​arnumber=4115697|Kogge et al., "​EXECUBE - A New Architecture for Scalable MPPs", ICPP 1994.]] | +
-| In mem comp. | [[http://​www.ece.umd.edu/​courses/​enee759m.S2002/​papers/​fromm1997-isca24.pdf|Fromm et al., "The Energy Efficiency of IRAM Architectures",​ ISCA 1997.]] | +
-| In mem comp. | [[http://​www.eecs.berkeley.edu/​~yelick/​yelick/​iram-micro97.pdf|Patterson et al., "A Case for Intelligent DRAM: IRAM", IEEE Micro 1997.]] | +
- +
-=== 8/26/2014 === +
-Required reviews, see three due dates below: +
-^ Due 9/2/2014 | {{motmoo-springer-chapter-7-30-2014.pdf|Onur Mutlu, "Main Memory Scaling: Challenges and Solution Directions",​ preprint book Chapter 6, 2014.}} | +
-^ Due 9/6/2014 | Pick 3 papers referenced by the above paper that pique your interest | +
-^ Due 9/6/2014 | [[http://​www.cs.virginia.edu/​~robins/​YouAndYourResearch.html|Hamming,​ "You and Your Research,"​ Bell Communications Research Colloquium Seminar, 7 March 1986.]] | +
-| | [[http://​web.stanford.edu/​class/​cs240/​readings/​lampson-hints.pdf|Butler W. Lampson, "Hints for computer system design",​ SOSP 1983]] | +
-| | [[http://​books.google.com/​books/​about/​Inside_the_AS_400.html?​id=hJtyAAAACAAJ|Frank Soltis, "​Inside the AS/​400",​ 1996]] | +
-| | [[http://​www.cs.utexas.edu/​users/​mckinley/​notes/​reviewing.html|Hill and McKinley, "Notes on Constructive and Positive Reviewing"​.]] | +
-| | [[https://​www.usenix.org/​legacy/​publications/​library/​proceedings/​dsl97/​good_paper.html|Levin and Redell, "How (and how not) to write a good systems paper",​ OSR 1983.]] | +
-| | [[http://​www.ifs.tuwien.ac.at/​~silvia/​research-tips/​smith-advice.pdf|Alan Jay Smith, “The Task of the Referee”, IEEE Computer 1990.]] | +
-| | [[http://​research.microsoft.com/​en-us/​um/​people/​simonpj/​papers/​giving-a-talk/​writing-a-paper-slides.pdf|Jones,​ "How to Write a Great Research Paper"​.]] | +
-| | [[http://​www2.cs.uregina.ca/​~pwlfong/​CS499/​writing-paper.pdf|Philip W. L. Fong, “How to Write a CS Research Paper: A Bibliography”,​ 2004.]] |+
readings.1415206802.txt.gz · Last modified: 2014/11/05 17:00 by yixinluo