Architecture Research Term Project

You will carry out a semester-long term project on a topic based on content from this course. Your project must include a design space assessment, an architectural design component, and a quantitative evaluation. The choice of project topic is deliberately open-ended. You are expected to identify an interesting question that does not have an obvious answer, and to answer that question with your project. As a rough guide to scale, your project should be about the size and complexity of a paper that you might see published at a workshop.

In planning your project topic, you are encouraged to consider outside academic interests (ML, HCI, AI, PL, etc.) in the context of parallel, heterogeneous, and emerging computer architectures. Your project could be an implementation of an idea that we read about in class, transplanted to a new context (e.g., deterministic execution for GPU/CPU systems), or something completely novel (e.g., an FPGA-based accelerator for neural network computer vision workloads). Your project should not be only a measurement study (e.g., how scalable is PARSEC on a Haswell CPU?), although a well-crafted measurement study that answers an interesting question might make a good project.

You will have to choose a project topic before we have covered all of the topics for the semester. It will benefit you to read the abstracts and skim any papers in the reading list that seem interesting, to help identify project ideas.


  • Project Team - Form a team of two people.
  • Project Description - Describe your project in a document of no more than 3 pages. Include the key question, design space assessment, architecture design, and evaluation plan, along with the hardware and software you will need.
  • Project Progress Reports - Update me on your progress in executing the project. What is working and what is not working? Are you on track to complete your project? Have your preliminary results matched your intuitions, or have they surprised you?
  • Project Final Report - A full description of your project, including the key question, design space, design choices, and quantitative evaluation methodology and results. This document should be no more than 10 pages in conference format: two-column, 11pt font, with figures.
  • Project Final Presentation - An in-class presentation about your project, covering the important details from your project final report. You will have approximately 10 minutes to present.
  • Project Ideas

    1. Software or architecture support for resource-constrained or intermittent graph processing
    2. Distributed, intermittent Deep Neural Network training system
    3. Energy-harvesting simulation infrastructure: power and performance model
    4. Measurement study of architectural implications of non-volatile technology as storage or logic in a CPU
    5. Relaxed memory consistency for FPGA/CPU SoCs
    6. Performance and correctness impact of approximate synchronization operations on neural network or computer vision applications
    7. Hardware support for data-centric synchronization / per-address memory fences
    8. Heterogeneous memory consistency for CPU+FPGA systems with per-FPGA-state-machine consistency guarantees
    9. Design and evaluation of an intermittent reconfigurable architecture
    10. Approximate, compressive cache, LLC, or main memory
    11. Data-race detection or SC-violation detection in a reconfigurable computing device or heterogeneous FPGA/CPU-based system
    12. Application study: precision vs. performance trade-off in a parallel system with approximate cache coherence
    13. Deterministic parallel computation in an FPGA
    14. Application study: when is it beneficial to execute code on a GPU or FPGA in parallel with execution on a CPU?
    15. Symbolic execution to evaluate candidate power schedules for programs running on intermittently powered devices
    16. 3D-stacked processing-in-memory to accelerate garbage collection or other pointer-chasing analysis
    17. Deterministic transactional execution with weak isolation guarantees
    18. Approximate, parallel scatter/gather or reduction
    19. Performance and power model and assessment of a "perpetual" solar-powered, fully-nonvolatile processor
    20. Using shared memory communication graphs to predict magnitude/importance of shared value updates
    21. Cache architecture and memory hierarchy design for heterogeneous CPU/GPU/accelerator architectures
    22. Feasibility assessment and performance model of porting TensorFlow kernels to an FPGA
    23. Environmental impact assessment and mitigation strategy for current and future cloud machine learning
    24. Hardware concurrency bug detection for FPGA designs


    Benchmarks

    1. PARSEC (parallel applications)
    2. Rodinia (heterogeneous parallel applications)

    Simulators and Tools

    1. Sniper (easy-to-use architecture simulator)
    2. gem5 (very detailed architecture simulator)
    3. MARSSx86 (detailed architecture simulator)
    4. McPAT (architectural power modeling)
    5. CACTI (cache and memory timing, area, and power modeling)
    6. Pin (binary instrumentation)
    7. LLVM (compiler infrastructure)
    8. Z3 (SMT solver)
    9. KLEE (C/C++ symbolic execution engine)