

### Fast Flexible FPGA-Tuned Networks-on-Chip

Michael K. Papamichael, James C. Hoe

papamix@cs.cmu.edu, jhoe@ece.cmu.edu

www.ece.cmu.edu/~mpapamic/connect



Computer Architecture Lab at Carnegie Mellon (CALCM)

Portland, OR, June 2012

This work was funded by NSF. We thank Xilinx for their FPGA and tool donations. We thank Bluespec for their tool donations.

# FPGAs and Networks-on-Chip (NoCs)


- Rapid growth of FPGA capacity and features
  - Extended SoC and full-system prototyping
  - FPGA-based high-performance computing

**Need for flexible NoCs to support communication** 



Map existing ASIC-oriented NoC designs on FPGAs?

- Inefficient use of FPGA resources
- ASIC-driven NoC architecture not optimal for FPGA

# **FPGA-tuned NoC Architecture**

- Embodies FPGA-motivated design principles
- Very lightweight, minimizes resource usage
  - ~50% resource reduction vs. ASIC-oriented NoC
- Publicly released flexible NoC generator (demo)

#### Often goes against ASIC-driven NoC conventional wisdom



- NoC Terminology (single-slide review)
- CONNECT Approach
  - Tailoring NoCs to FPGAs
- Results
- Related Work & Conclusion
- Public Release & Demo!












# **NoC Terminology Overview**





#### Packets

- Basic logical unit of transmission

#### Flits

- Packets are broken into multiple flits
- Basic unit of flow control

#### Virtual Channels

- Multiple logical channels over a single physical link

#### Flow Control

- Management of buffer space in the network
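To make the packet/flit relationship concrete, here is a minimal sketch of splitting a packet into flits and reassembling it. The `Flit` fields and the function names are hypothetical, chosen for illustration only; they are not taken from CONNECT.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flit:
    """One flow-control unit of a packet (field names are illustrative)."""
    packet_id: int
    vc: int          # virtual channel the packet travels on
    is_head: bool    # first flit of the packet
    is_tail: bool    # last flit of the packet
    payload: bytes


def packetize(packet_id: int, vc: int, data: bytes, flit_bytes: int) -> List[Flit]:
    """Split a packet payload into fixed-size flits (the last one may be short)."""
    if flit_bytes <= 0:
        raise ValueError("flit_bytes must be positive")
    # An empty payload still produces one (head and tail) flit.
    chunks = [data[i:i + flit_bytes] for i in range(0, len(data), flit_bytes)] or [b""]
    last = len(chunks) - 1
    return [
        Flit(packet_id, vc, is_head=(i == 0), is_tail=(i == last), payload=chunk)
        for i, chunk in enumerate(chunks)
    ]


def reassemble(flits: List[Flit]) -> bytes:
    """Concatenate flit payloads back into the original packet payload."""
    return b"".join(f.payload for f in flits)
```

All flits of one packet carry the same virtual channel id, so several packets can interleave on one physical link while each stays in order within its own channel.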








# How FPGAs are Different from ASICs



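#### FPGAs are a peculiar HW realization substrate in terms of

- Relative cost of speed vs. logic vs. wires vs. memory
- Unique mapping and operating characteristics

**CONNECT** focuses on 4 FPGA characteristics:

- Abundance of wires
- Storage shortage and peculiarities
- Frequency "challenged"
- Reconfigurable nature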



#### FPGA characteristics uniquely influence key NoC design decisions


# **Abundance of Wires**

#### Densely connected wiring substrate

- (Over)provisioned to handle worst case
- Wires are "free" compared to other resources



#### **NoC Implications**

- Make datapaths and channels as wide as possible
- Adjust packet format
  - E.g. carry control info on the side through dedicated links
- Adapt traditional credit-based flow control
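For reference, the traditional credit-based scheme mentioned in the last bullet can be sketched as follows. This is the textbook mechanism, not CONNECT's adapted variant: the class and method names are hypothetical, and the model ignores the latency of returning credits over the link.

```python
from collections import deque


class CreditLink:
    """Sender-side credit counter paired with a receiver-side flit buffer."""

    def __init__(self, buffer_depth: int):
        self.credits = buffer_depth   # free downstream slots known to the sender
        self.buffer = deque()         # receiver's flit buffer

    def can_send(self) -> bool:
        return self.credits > 0

    def send(self, flit) -> bool:
        """Send one flit if a credit is available; return False when stalled."""
        if not self.can_send():
            return False
        self.credits -= 1
        self.buffer.append(flit)
        return True

    def drain(self):
        """Receiver forwards one flit and returns a credit to the sender."""
        if not self.buffer:
            return None
        flit = self.buffer.popleft()
        self.credits += 1
        return flit
```

The sender never overruns the receiver because it only transmits while it holds a credit. On an FPGA, where wires are cheap, the credit return path can use its own dedicated wires instead of sharing the data channel.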



# **Storage Shortage & Peculiarities**

- Modern FPGAs offer storage in two forms
  - Block RAMs and LUT RAMs (use logic resources)
  - Only come in specific aspect ratios and sizes
- Typically in high demand, especially Block RAMs



#### **NoC Implications**

- Minimize usage and optimize for aspect ratios and sizes
  - Implement multiple logical flit buffers in each physical buffer
- Use LUT RAM for flit buffers
  - Block RAM is much larger than typical NoC flit buffer sizes
  - Allow rest of design to use scarce Block RAM resources
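One way to picture "multiple logical flit buffers in each physical buffer" is a single flat memory divided into per-virtual-channel FIFO regions. The sketch below assumes a static, equal-sized partition per virtual channel; the slides do not say how CONNECT divides the memory, and all names here are hypothetical.

```python
class SharedFlitBuffer:
    """Several per-VC FIFOs carved out of one flat memory array."""

    def __init__(self, num_vcs: int, depth_per_vc: int):
        self.depth = depth_per_vc
        self.mem = [None] * (num_vcs * depth_per_vc)  # models one physical RAM
        self.head = [0] * num_vcs    # read offset within each VC's region
        self.count = [0] * num_vcs   # flits currently stored per VC

    def _addr(self, vc: int, offset: int) -> int:
        # Each VC owns a contiguous region; offsets wrap within that region.
        return vc * self.depth + (offset % self.depth)

    def enqueue(self, vc: int, flit) -> bool:
        if self.count[vc] == self.depth:
            return False  # this VC's region is full
        self.mem[self._addr(vc, self.head[vc] + self.count[vc])] = flit
        self.count[vc] += 1
        return True

    def dequeue(self, vc: int):
        if self.count[vc] == 0:
            return None
        flit = self.mem[self._addr(vc, self.head[vc])]
        self.head[vc] = (self.head[vc] + 1) % self.depth
        self.count[vc] -= 1
        return flit
```

Only the small head and count registers are replicated per virtual channel. The flit storage itself is one memory, which suits the fixed aspect ratios of FPGA RAM primitives better than many tiny separate FIFOs.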



# **Frequency "Challenged"**

- Much lower frequencies compared to ASICs
  - LUTs inherently slower than ASIC standard cells
  - Large wire delays when chaining LUTs
- Rapidly diminishing returns when pipelining
  - Deep pipelining hard due to quantization effects



#### **NoC Implications**

- Design router as single-stage pipeline
  - Also dramatically reduces network latency
- Make up for lower frequency by adjusting network
  - E.g. increase width of datapath and links or change topology
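The trade-off between frequency, datapath width, and pipeline depth can be checked with simple arithmetic. The sketch below uses a standard zero-load latency estimate (per-hop router delay plus packet serialization) and assumes link traversal costs no extra cycles. The function names are hypothetical and the numbers in the usage note are illustrative, not measurements.

```python
import math


def link_bandwidth_gbps(width_bits: int, freq_mhz: float) -> float:
    """Peak link bandwidth when one flit of `width_bits` moves per cycle."""
    return width_bits * freq_mhz / 1000.0


def zero_load_latency_ns(hops: int, router_stages: int, packet_bits: int,
                         width_bits: int, freq_mhz: float) -> float:
    """Head latency through `hops` routers plus serialization of the packet."""
    cycle_ns = 1000.0 / freq_mhz
    serialization_cycles = math.ceil(packet_bits / width_bits)
    return (hops * router_stages + serialization_cycles) * cycle_ns
```

For example, a 128-bit link at 100 MHz has the same peak bandwidth as a 32-bit link at 400 MHz, so quadrupling the width compensates for a clock four times slower. A single-stage router also keeps the per-hop term of the latency estimate at one cycle.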

# **Reconfigurable Nature**

#### Reconfigurable nature of FPGAs

- Sets them apart from ASICs
- Support diverse range of applications



#### **NoC Implications**

- Support extensive application-specific customization
  - Flexible parameterized NoC architecture
  - Automated NoC design generator (demo!)
- Adhere to standard common interface
  - NoC appears as plug-and-play black box from user perspective

# **CONNECT Architecture**

#### Topology-Agnostic Parameterized Architecture

- # in/out ports, # virtual channels, flit width, buffer depths
- Flexible user-specified routing
- Four allocation algorithms and two flow-control mechanisms
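As a rough illustration of what such a parameterization looks like to a user, the sketch below bundles the knobs listed above into one configuration object. Everything here is hypothetical: the class, field names, and placeholder option strings are not CONNECT's actual generator options, and table-based routing is an assumed way to express "user-specified routing".

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RouterConfig:
    """Hypothetical parameter bundle mirroring the knobs listed above."""
    num_in_ports: int
    num_out_ports: int
    num_vcs: int
    flit_width_bits: int
    buffer_depth: int
    allocator: str = "allocator-name"        # placeholder, not a real option name
    flow_control: str = "flow-control-name"  # placeholder, not a real option name
    # User-specified routing: destination endpoint id -> output port index.
    routing_table: Dict[int, int] = field(default_factory=dict)

    def validate(self) -> None:
        """Reject non-positive sizes and routes to ports that do not exist."""
        for name in ("num_in_ports", "num_out_ports", "num_vcs",
                     "flit_width_bits", "buffer_depth"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        for dest, port in self.routing_table.items():
            if not 0 <= port < self.num_out_ports:
                raise ValueError(f"destination {dest} maps to invalid port {port}")

    def route(self, dest: int) -> int:
        """Look up the output port for a destination."""
        return self.routing_table[dest]
```

Because the router only consults its table, the same router description can serve any topology: changing the network shape means changing port counts and table contents, not router logic.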


#### CONNECT Router Architecture








# **CONNECT vs. ASIC-Oriented RTL**

#### 16-node 4x4 Mesh Network-on-Chip (NoC)

- SOTA: state-of-the-art high-quality ASIC-oriented RTL\*
- CONNECT: identically configured **CONNECT** network




Figure: FPGA resource usage (same router/NoC configuration)



\*NoC RTL from http://nocs.stanford.edu/cgi-bin/trac.cgi/wiki/Resources/Router


# **CONNECT Sample Networks**



- Four sample CONNECT networks
  - 16 endpoints, 2/4 virtual channels, 128-bit datapath



All above networks are interchangeable from user perspective













# **Related Work**



#### FPGA-oriented NoC Architectures

- PNoC: lightweight circuit-switched NoC [Hilton '06]
- NoCem: simple router block, no virtual channels [Schelle '08]

#### FPGA-related NoC Studies

- Analytical models for predicting NoC performance on FPGAs [Lee '10]
- Effect of FPGA NoC parameters on multiprocessor system [Lee '09]

#### Modify FPGA configuration circuitry to build NoC

- Metawire: use configuration circuitry as NoC [Shelburne '08]
- Time-division multiplexed wiring to enable new NoC [Francis '08]

#### Commercial Interconnect Approaches

- ARM AMBA, STNoC, CoreConnect PLB/OPB, Altera Qsys, etc.

# **Conclusions**



- Significant gains from tuning for FPGA
  - FPGAs and ASICs have different design "sweet spot"
- CONNECT → flexible, efficient, lightweight NoC
- Compared to ASIC-driven NoC, CONNECT offers
  - Significantly lower network latency and
  - ~50% lower LUT usage **or** 3-4x higher network performance
- Take advantage of reconfigurable nature of FPGA
  - Tailor NoC to specific communication needs of application








# **Public Release**





### http://www.ece.cmu.edu/~mpapamic/connect/

#### NoC Generator with web-based interface

- Supports multiple pre-configured topologies
- Includes graphical editor for custom topologies
- FreeBSD-like license (limited to non-commercial research use)

#### Acknowledgments

- Derek Chiou, Daniel Becker & Stanford CVA group
- NSF, Xilinx, Bluespec

#### Demo!

# **Some Release Stats**

### Released in March 2012

- 1500+ unique visitors
- 150+ network generation requests




### Thanks!



### **Questions?**
