List of Eligible Papers for the 2016 Award

View the 2016 call for nominations.

MICRO 1994

Paper Title	Authors
Static Branch Frequency and Program Profile Analysis	Youfeng Wu, James R. Larus
Using Branch Handling Hardware to Support Profile-Driven Optimization	Thomas M. Conte, Burzin A. Patel, J. Stan Cox
Branch Classification: A New Mechanism for Improving Branch Predictor Performance	Po-Yung Chang, Eric Hao, Tse-Yu Yeh, Yale Patt
Techniques for Compressing Program Address Traces	Andrew R. Pleszkun
Height Reduction of Control Recurrences for ILP Processors	Michael Schlansker, Vinod Kathail, Sadun Anik
Theoretical Modeling of Superscalar Processor Performance	Derek B. Noonburg, John P. Shen
Iterative Modulo Scheduling: An Algorithm for Software Pipelining Loops	B. Ramakrishna Rau
Minimum Register Requirements for a Modulo Schedule	Alexandre E. Eichenberger, Edward S. Davidson, Santosh G. Abraham
Minimizing Register Requirements Under Resource-Constrained Rate-Optimal Software Pipelining	R. Govindarajan, Erik R. Altman, Guang R. Gao
Software Pipelining with Register Allocation and Spilling	Jian Wang, Andreas Krall, M. Anton Ertl, Christine Eisenbeis
Reducing Memory Traffic with CRegs	Peter Dahl, Matthew O'Keefe
Dynamic Memory Disambiguation for Array References	David Bernstein, Doron Cohen, Dror E. Maydan
A Study of Pointer Aliasing for Software Pipelining Using Run-Time Disambiguation	Bogong Su, Stanley Habib, Wei Zhao, Jian Wang, Youfeng Wu
Data Relocation and Prefetching for Programs with Large Data Sets	Yoji Yamada, John Gyllenhaal, Grant Haab, Wen-mei Hwu
Cache Designs with Partial Address Matching	Lishing Liu
Minimizing Branch Misprediction Penalties for Superpipelined Processors	Ching-Long Su, Alvin M. Despain
Facilitating Superscalar Processing Via a Combined Static/Dynamic Register Renaming Scheme	Eric Sprangle, Yale Patt
Improving Resource Utilization of the MIPS R8000 Via Post-Scheduling Global Instruction Distribution	Raymond Lo, Sun Chan, Fred Chow, Shin-Ming Liu
A Comparison of Two Pipeline Organizations	Michael Golden, Trevor Mudge
A Fill-Unit Approach to Multiple Instruction Issue	Manoj Franklin, Mark Smotherman
A High-Performance Microarchitecture with Hardware-Programmable Functional Units	Rahul Razdan, Michael D. Smith
The Anatomy of the Register File in a Multiscalar Processor	Scott E. Breach, T. N. Vijaykumar, Gurindar S. Sohi
Register File Port Requirements of Transport Triggered Architectures	Jan Hoogerbrugge, Henk Corporaal
The Effects of Predicated Execution On Branch Prediction	Gary Scott Tyson
Analysis of the Conditional Skip Instructions of the HP Precision Architecture	Jonathan P. Vogel, Bruce K. Holmer
Characterizing the Impact of Predicated Execution On Branch Prediction	Scott A. Mahlke, Richard E. Hank, Roger A. Bringmann, John C. Gyllenhaal, David M. Gallagher, Wen-mei W. Hwu
The Effect of Speculatively Updating Branch History On Branch Prediction Accuracy, Revisited	Eric Hao, Po-Yung Chang, Yale N. Patt

MICRO 1995

Paper Title	Authors
Performance Issues in Correlated Branch Prediction Schemes	Nicolas Gloy, Michael D. Smith, Cliff Young
Dynamic Path-Based Branch Correlation	Ravi Nair
The Predictability of Branches in Libraries	Brad Calder, Dirk Grunwald, Amitabh Srivastava
The Performance Impact of Incomplete Bypassing in Processor Pipelines	Pritpal S. Ahuja, Douglas W. Clark, Anne Rogers
Efficient Instruction Scheduling Using Finite State Automata	Vasanth Bala, Norman Rubin
Critical Path Reduction for Scalar Programs	Michael Schlansker, Vinod Kathail
A Limit Study of Local Memory Requirements Using Value Reuse Profiles	Andrew S. Huang, John P. Shen
Zero-Cycle Loads: Microarchitecture Support for Reducing Load Latency	Todd M. Austin, Gurindar S. Sohi
A Modified Approach to Data Cache Management	Gary Tyson, Matthew Farrens, John Matthews, Andrew R. Pleszkun
Petri Net Versus Modulo Scheduling for Software Pipelining	Vicki H. Allan, U. R. Shah, K. M. Reddy
Modulo Scheduling with Multiple Initiation Intervals	Nancy J. Warter-Perez, Noubar Partamian
Spill-Free Parallel Scheduling of Basic Blocks	B. Natarajan, M. Schlansker
Improving Instruction-Level Parallelism by Loop Unrolling and Dynamic Memory Disambiguation	Jack W. Davidson, Sanjay Jinturkar
Self-Regulation of Workload in the Manchester Data-Flow Computer	John R. Gurd, David F. Snelling
The M-Machine Multicomputer	Marco Fillo, Stephen W. Keckler, William J. Dally, Nicholas P. Carter, Andrew Chang, Yevgeny Gurevich, Whay S. Lee
Region-Based Compilation: An Introduction and Motivation	Richard E. Hank, Wen-Mei W. Hwu, B. Ramakrishna Rau
An Experimental Study of Several Cooperative Register Allocation and Instruction Scheduling Strategies	Cindy Norris, Lori L. Pollock
Register Allocation for Predicated Code	Alexandre E. Eichenberger, Edward S. Davidson
Partial Resolution in Branch Target Buffers	Barry Fagin, Kathryn Russell
A System Level Perspective On Branch Architecture Performance	Brad Calder, Dirk Grunwald, Joel Emer
Dynamic Rescheduling: A Technique for Object Code Compatibility in VLIW Architectures	Thomas M. Conte, Sumedh W. Sathaye
Improving CISC Instruction Decoding Performance Using a Fill Unit	Mark Smotherman, Manoj Franklin
SPAID: Software Prefetching in Pointer- and Call-Intensive Environments	Mikko H. Lipasti, William J. Schmidt, Steven R. Kunkel, Robert R. Roediger
An Effective Programmable Prefetch Engine for On-Chip Caches	Tien-Fu Chen
Cache Miss Heuristics and Preloading Techniques for General-Purpose Programs	Toshihiro Ozawa, Yasunori Kimura, Shin'ichiro Nishizaki
Alternative Implementations of Hybrid Branch Predictors	Po-Yung Chang, Eric Hao, Yale N. Patt
Control Flow Prediction with Tree-Like Subgraphs for Superscalar Processors	Simonjit Dutta, Manoj Franklin
The Role of Adaptivity in Two-Level Adaptive Branch Prediction	Stuart Sechrest, Chih-Chieh Lee, Trevor Mudge
Design of Storage Hierarchy in Multithreaded Architectures	Lucas Roh, Walid A. Najjar
An Investigation of the Performance of Various Instruction-Issue Buffer Topologies	Stéphan Jourdan, Pascal Sainrat, Daniel Litaize
Decoupling Integer Execution in Superscalar Processors	Subbarao Palacharla, J. E. Smith
Exploiting Short-Lived Variables in Superscalar Processors	Luis A. Lozano, Guang R. Gao
Partitioned Register File for TTAs	Johan Janssen, Henk Corporaal
Disjoint Eager Execution: An Optimal Form of Speculative Execution	Augustus K. Uht, Vijay Sindagi, Kelley Hall
Unrolling-Based Optimizations for Modulo Scheduling	Daniel M. Lavery, Wen-Mei W. Hwu
Stage Scheduling: A Technique to Reduce the Register Requirements of a Modulo Schedule	Alexandre E. Eichenberger, Edward S. Davidson
Hypernode Reduction Modulo Scheduling	Josep Llosa, Mateo Valero, Eduard Ayguadé, Antonio González

MICRO 1996

Paper Title	Authors
A Persistent Rescheduled-Page Cache for Low Overhead Object Code Compatibility in VLIW Architectures	Thomas M. Conte, Sumedh W. Sathaye, Sanjeev Banerjia
Integrating a Misprediction Recovery Cache (MRC) Into a Superscalar Pipeline	James O. Bondi, Ashwini K. Nanda, Simonjit Dutta
Accurate and Practical Profile-Driven Compilation Using the Profile Buffer	Thomas M. Conte, Kishore N. Menezes, Mary Ann Hirsch
Efficient Path Profiling	Thomas Ball, James R. Larus
Profile-Driven Instruction Level Parallel Scheduling with Application to Super Blocks	C. Chekuri, R. Johnson, R. Motwani, B. Natarajan, B. R. Rau, M. Schlansker
Speculative Hedge: Regulating Compile-Time Speculation Against Profile Variations	Brian L. Deitrich, Wen-mei W. Hwu
Hot Cold Optimization of Large Windows/NT Applications	Robert Cohn, P. Geoffrey Lowney
Java Bytecode to Native Code Translation: The Caffeine Prototype and Preliminary Results	Cheng-Hsueh A. Hsieh, John C. Gyllenhaal, Wen-mei W. Hwu
Analysis Techniques for Predicated Code	Richard Johnson, Michael Schlansker
Global Predicate Analysis and Its Application to Register Allocation	David M. Gillies, Dz-ching Roy Ju, Richard Johnson, Michael Schlansker
Modulo Scheduling of Loops in Control-Intensive Non-Numeric Programs	Daniel M. Lavery, Wen-mei W. Hwu
Assigning Confidence to Conditional Branch Predictions	Erik Jacobsen, Eric Rotenberg, J. E. Smith
Compiler Synthesized Dynamic Branch Prediction	Scott Mahlke, Balas Natarajan
Wrong-Path Instruction Prefetching	Jim Pierce, Trevor Mudge
Design Decisions Influencing the UltraSPARC's Instruction Fetch Architecture	Robert Yung
Increasing the Instruction Fetch Rate Via Block-Structured Instruction Set Architectures	Eric Hao, Po-Yung Chang, Marius Evers, Yale N. Patt
Instruction Fetch Mechanisms for VLIW Architectures with Compressed Encodings	Thomas M. Conte, Sanjeev Banerjia, Sergei Y. Larin, Kishore N. Menezes, Sumedh W. Sathaye
Tango: A Hardware-Based Data Prefetching Technique for Superscalar Processors	Shlomit S. Pinter, Adi Yoaz
Exceeding the Dataflow Limit Via Value Prediction	Mikko H. Lipasti, John Paul Shen
The Performance Potential of Data Dependence Speculation & Collapsing	Yiannakis Sazeides, Stamatis Vassiliadis, James E. Smith
Heuristics for Register-Constrained Software Pipelining	Josep Llosa, Mateo Valero, Eduard Ayguadé
Software Pipelining Loops with Conditional Branches	Mark G. Stoodley, Corinna G. Lee
Combining Loop Transformations Considering Caches and Scheduling	Michael E. Wolf, Dror E. Maydan, Ding-Kai Chen
Instruction Scheduling and Executable Editing	Eric Schnarr, James R. Larus
Instruction Scheduling for the HP PA-8000	David A. Dunn, Wei-Chung Hsu
Meld Scheduling: Relaxing Scheduling Constraints Across Region Boundaries	Santosh G. Abraham, Vinod Kathail, Brian L. Deitrich
Custom-Fit Processors: Letting Applications Define Architectures	Joseph A. Fisher, Paolo Faraboschi, Giuseppe Desoli
Optimization for a Superscalar Out-of-Order Machine	Anne M. Holler
Optimization of Machine Descriptions for Efficient Use	John C. Gyllenhaal, Wen-mei W. Hwu, B. Ramakrishna Rau

MICRO 1997

Paper Title	Authors
The Bi-Mode Branch Predictor	Chih-Chieh Lee, I-Cheng K. Chen, Trevor N. Mudge
Path-Based Next Trace Prediction	Quinn Jacobson, Eric Rotenberg, James E. Smith
Alternative Fetch and Issue Policies for the Trace Cache Fetch Mechanism	Daniel Holmes Friendly, Sanjay Jeram Patel, Yale N. Patt
Reducing the Performance Impact of Instruction Cache Misses by Writing Instructions Into the Reservation Stations Out-of-Order	Jared Stark, Paul Racunas, Yale N. Patt
On High-Bandwidth Data Cache Design for Multi-Issue Processors	Jude A. Rivers, Gary S. Tyson, Edward S. Davidson, Todd M. Austin
Run-Time Spatial Locality Detection and Optimization	Teresa L. Johnson, Matthew C. Merten, Wen-Mei W. Hwu
A Comparison of Data Prefetching On an Access Decoupled and Superscalar Machine	G. P. Jones, N. P. Topham
The Design and Performance of a Conflict-Avoiding Cache	Nigel Topham, Antonio González, José González
Prediction Caches for Superscalar Processors	James E. Bennett, Michael J. Flynn
A Framework for Balancing Control Flow and Predication	David I. August, Wen-mei W. Hwu, Scott A. Mahlke
Evaluation of Scheduling Techniques On a SPARC-Based VLIW Testbed	Seongbae Park, SangMin Shim, Soo-Mook Moon
Tuning Compiler Optimizations for Simultaneous Multithreading	Jack L. Lo, Susan J. Eggers, Henry M. Levy, Sujay S. Parekh, Dean M. Tullsen
Exploiting Dead Value Information	Milo M. Martin, Amir Roth, Charles N. Fischer
Trace Processors	Eric Rotenberg, Quinn Jacobson, Yiannakis Sazeides, Jim Smith
The Multicluster Architecture: Reducing Cycle Time Through Partitioning	Keith I. Farkas, Paul Chow, Norman P. Jouppi, Zvonko Vranesic
Out-of-Order Vector Architectures	Roger Espasa, Mateo Valero, James E. Smith
Initial Results On the Performance and Cost of Vector Microprocessors	Corinna G. Lee, Derek J. DeVries
The Filter Cache: An Energy Efficient Memory Structure	Johnson Kin, Munish Gupta, William H. Mangione-Smith
Improving Code Density Using Compression Techniques	Charles Lefurgy, Peter Bird, I-Cheng Chen, Trevor Mudge
Procedure Based Program Compression	Darko Kirovski, Johnson Kin, William H. Mangione-Smith
Improving the Accuracy and Performance of Memory Communication Through Renaming	Gary S. Tyson, Todd M. Austin
Microarchitecture Support for Improving the Performance of Load Target Prediction	Chung-Ho Chen, Akida Wu
Streamlining Inter-Operation Memory Communication Via Data Dependence Prediction	Andreas Moshovos, Gurindar S. Sohi
The Predictability of Data Values	Yiannakis Sazeides, James E. Smith
Value Profiling	Brad Calder, Peter Feller, Alan Eustace
Can Program Profiling Support Value Prediction?	Freddy Gabbay, Avi Mendelson
Highly Accurate Data Value Prediction Using Hybrid Predictors	Kai Wang, Manoj Franklin
ProfileMe: Hardware Support for Instruction-Level Profiling On Out-of-Order Processors	Jeffrey Dean, James E. Hicks, Carl A. Waldspurger, William E. Weihl, George Chrysos
Procedure Placement Using Temporal Ordering Information	Nikolas Gloy, Trevor Blackwell, Michael D. Smith, Brad Calder
Predicting Data Cache Misses in Non-Numeric Applications Through Correlation Profiling	Todd C. Mowry, Chi-Keung Luk
Available Parallelism in Video Applications	Heng Liao, Andrew Wolfe
MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems	Chunho Lee, Miodrag Potkonjak, William H. Mangione-Smith
Cache Sensitive Modulo Scheduling	F. Jesús Sánchez, Antonio González
Unroll-and-Jam Using Uniformly Generated Sets	Steve Carr, Yiping Guan
Resource-Sensitive Profile-Directed Data Flow Analysis for Code Optimization	Rajiv Gupta, David A. Berson, Jesse Z. Fang

MICRO 1998

Paper Title	Authors
A Bandwidth-Efficient Architecture for Media Processing	Scott Rixner, William J. Dally, Ujval J. Kapasi, Brucek Khailany, Abelardo López-Lagunas, Peter R. Mattson, John D. Owens
Exploiting Instruction Level Parallelism in Geometry Processing for Three Dimensional Graphics Applications	Chia-Lin Yang, Barton Sano, Alvin R. Lebeck
Simple Vector Microprocessors for Multimedia Applications	Corinna G. Lee, Mark G. Stoodley
Evaluating MMX Technology Using DSP and Multimedia Applications	Ravi Bhargava, Lizy K. John, Brian L. Evans, Ramesh Radhakrishnan
Analyzing the Working Set Characteristics of Branch Execution	Sangwook P. Kim, Gary S. Tyson
Dataflow Analysis of Branch Mispredictions and Its Application to Early Resolution of Branch Outcomes	Alexandre Farcy, Olivier Temam, Roger Espasa, Toni Juan
The YAGS Branch Prediction Scheme	Avinoam N. Eden, Trevor Mudge
Task Selection for a Multiscalar Processor	T. N. Vijaykumar, Gurindar S. Sohi
Split-Path Enhanced Pipeline Scheduling for Loops with Control Flows	SangMin Shim, Soo-Mook Moon
Effective Cluster Assignment for Modulo Scheduling	Erik Nystrom, Alexandre E. Eichenberger
Better Global Scheduling Using Path Profiles	Cliff Young, Michael D. Smith
Predictive Techniques for Aggressive Load Speculation	Glenn Reinman, Brad Calder
Compiler-Directed Early Load-Address Generation	Ben-Chung Cheng, Daniel A. Connors, Wen-mei W. Hwu
Load Latency Tolerance in Dynamically Scheduled Processors	Srikanth T. Srinivasan, Alvin R. Lebeck
Improving I/O Performance with a Conditional Store Buffer	Lambert Schaelicke, Al Davis
Putting the Fill Unit to Work: Dynamic Optimizations for Trace Cache Microprocessors	Daniel Holmes Friendly, Sanjay Jeram Patel, Yale N. Patt
Cooperative Prefetching: Compiler and Hardware Support for Effective Instruction Prefetching in Modern Processors	Chi-Keung Luk, Todd C. Mowry
Code Compression Based on Operand Factorization	Guido Araujo, Paulo Centoducatte, Mario Cortes, Ricardo Pannain
Understanding the Differences Between Value Prediction and Instruction Reuse	Avinash Sodani, Gurindar S. Sohi
A Novel Renaming Scheme to Exploit Value Temporal Locality Through Physical Register Reuse and Unification	Stéphan Jourdan, Ronny Ronen, Michael Bekerman, Bishara Shomar, Adi Yoaz
A Dynamic Multithreading Processor	Haitham Akkary, Michael A. Driscoll
Widening Resources: A Cost-Effective Technique for Aggressive ILP Architectures	David López, Josep Llosa, Mateo Valero, Eduard Ayguadé
The Cascaded Predictor: Economical and Adaptive Branch Target Prediction	Karel Driesen, Urs Hölzle
Improving Prediction for Procedure Returns with Return-Address-Stack Repair Mechanisms	Kevin Skadron, Pritpal S. Ahuja, Margaret Martonosi, Douglas W. Clark
Predicting Indirect Branches via Data Compression	John Kalamatianos, David R. Kaeli
Improving Locality Using Loop and Data Transformations in an Integrated Framework	Mahmut Kandemir, Alok Choudhary, J. Ramanujam, Prithviraj Banerjee
Precise Register Allocation for Irregular Architectures	Timothy Kong, Kent D. Wilken
Unified Assign and Schedule: A New Approach to Scheduling for Clustered Register File Microarchitectures	Emre Özer, Sanjeev Banerjia, Thomas M. Conte

Annual IEEE/ACM International Symposium on Microarchitecture

MICRO Test of Time Award