Current Project:

  • Jan. 2007-present        Research on the ILLIAC6 Supercomputer

     

    Controlled and coordinated the development of a communication protocol stack in a reconfigurable device. The main work included clarifying the hardware/software interface and the specification of each layer, analyzing and optimizing the data path for bandwidth, allocating the device's global resources, and building a run-time reconfigurable solution. The protocol stack is almost finished and is expected to be verified on a mezzanine card soon.

     

    Designed the crossbar switches in the protocol stack. The crossbar switch interacts with both the physical channels and the software mapper. The design went through three versions: a static scheduling algorithm, round-robin dynamic scheduling, and a fixed-length-packet time-slot interchanger. The first suits only light load. The second avoids the NP-hard problem of static routing but provides unpredictable bandwidth. The final version can guarantee tight bandwidth for real-time applications. The crossbar switch design has been tested on a development board at 200 MHz and delivered up to 6.4 Gbps per link.
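    The round-robin variant can be illustrated with a small behavioral model. This is a minimal sketch, not the hardware design itself: the class `RoundRobinArbiter`, the function `schedule_cycle`, and the 4x4 port count are all hypothetical names and values chosen for illustration. It assumes one arbiter per output port, each rotating its priority pointer past the last winner.

```python
from typing import Dict, List, Optional


class RoundRobinArbiter:
    """Grants one requesting input per cycle and rotates priority after each grant."""

    def __init__(self, num_inputs: int) -> None:
        self.num_inputs = num_inputs
        self.next_start = 0  # input with the highest priority in the next scan

    def grant(self, requests: List[bool]) -> Optional[int]:
        # Scan the inputs starting at the current priority pointer.
        for offset in range(self.num_inputs):
            idx = (self.next_start + offset) % self.num_inputs
            if requests[idx]:
                # The winner drops to the lowest priority for the next grant.
                self.next_start = (idx + 1) % self.num_inputs
                return idx
        return None  # nobody requested this output


def schedule_cycle(arbiters: List[RoundRobinArbiter],
                   wanted: List[Optional[int]]) -> Dict[int, int]:
    """wanted[i] is the output that input i requests (or None).

    Returns a mapping {output port: granted input port} for one cycle.
    """
    grants: Dict[int, int] = {}
    for out, arbiter in enumerate(arbiters):
        requests = [w == out for w in wanted]
        winner = arbiter.grant(requests)
        if winner is not None:
            grants[out] = winner
    return grants


if __name__ == "__main__":
    arbiters = [RoundRobinArbiter(4) for _ in range(4)]
    wanted = [2, 2, None, 0]  # inputs 0 and 1 contend for output 2
    for cycle in range(3):
        print(cycle, schedule_cycle(arbiters, wanted))
```

    In this model, inputs 0 and 1 take turns on output 2, so the share each input receives depends on how many other inputs are contending. That is the sense in which round-robin bandwidth is unpredictable, and it is what a fixed time-slot assignment avoids.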

 

Past Projects:

  • Mar. 2005-Jul. 2006        Processor Microarchitecture

To increase the throughput of a VLIW (Very Long Instruction Word) microprocessor in multimedia applications, I analyzed its performance under real workloads and identified where it could be optimized. Video compression involves many vector operations and high thread-level parallelism, so I considered SMT (Simultaneous Multithreading) an efficient approach in terms of area and power. I therefore designed an asymmetric multithreading architecture for the VLIW microprocessor, which improves performance (especially the efficiency of the SIMD coprocessor) as well as the power efficiency and area efficiency of the processor. The asymmetric mechanism guarantees the main thread access to all resources and raises throughput by executing low-priority threads simultaneously.
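The asymmetric policy can be pictured as a slot-filling rule. The sketch below is an illustrative model only, assuming that issue slots are interchangeable and that low-priority threads are served in a fixed order. The function `issue_cycle` and its parameters are hypothetical, and functional-unit constraints of a real VLIW are ignored.

```python
from typing import List, Tuple


def issue_cycle(width: int, main_ops: int,
                background_ops: List[int]) -> Tuple[int, List[int]]:
    """Distribute `width` issue slots for one cycle.

    main_ops:       operations the main thread wants to issue this cycle
    background_ops: operations each low-priority thread has ready
    Returns (slots used by the main thread, slots granted per background thread).
    """
    # The main thread is served first and is never throttled by other threads.
    main_issued = min(main_ops, width)
    free = width - main_issued

    granted = []
    for ops in background_ops:
        # Low-priority threads only consume slots the main thread left idle.
        take = min(ops, free)
        granted.append(take)
        free -= take
    return main_issued, granted


if __name__ == "__main__":
    print(issue_cycle(4, 2, [1, 3]))  # main thread leaves two slots free
    print(issue_cycle(4, 4, [2]))     # main thread fills the whole bundle
```

When the main thread fills the bundle, the background threads get nothing, so its latency is unchanged. Any throughput gain comes only from slots that would otherwise sit idle.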

  • Aug. 2004-Mar. 2005      Custom Design of a 16-Port Register File at 500 MHz

The register file is a crucial unit in a microprocessor with a register-to-register ISA (Instruction Set Architecture). A multi-ported, high-frequency register file supports the high performance of a superscalar computer system. This register file can be written through 6 ports and read through 10 ports simultaneously. I chose a reliable memory cell structure that supports 4-port reads without disturbing the stored content. The design started from the transistor-level circuit and was laid out compactly. It followed a digital IC flow for design and functional verification. The tools I used included Spice, CosmosSE, CosmosLE, Nanosim, and StarRCXT. It has been implemented in 0.18 μm CMOS technology and meets the requirements.

  • Aug. 2003-Aug. 2004     Design of a High-Speed DSP & CPU Microprocessor (SuperV)

SuperV is a VLIW microprocessor with SIMD instructions. It includes over one million gates and runs at up to 266 MHz. My tasks in this project were:

1) Implemented the control path of the microprocessor at the RTL level;
2) Completed the custom transistor-level design of a 16-port write-through register file, which supported the 4-issue microprocessor;
3) Verified the function of the CPU core and SIMD coprocessor using both simulation and formal methods.

The tools we used included Modelsim, VCS, Design-Compiler, Formality, etc. During verification, I wrote kernel assembly programs with sub-word parallelism for simulation and built an efficient test bench to shorten the verification period. This project was funded by the National Basic Research Program of China. The chip was fabricated in 0.18 μm CMOS technology, worked on first silicon, and ran an MPEG-2 application as expected.
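The write-through property of the register file in task 2 means that a read of a register being written in the same cycle returns the new value rather than the old one. A minimal behavioral sketch follows. The class `WriteThroughRegisterFile`, its `cycle` method, the 32-register depth, and the 10-read/6-write split used in the example are assumptions for illustration; the split is borrowed from the 16-port design described elsewhere on this page and is not stated for SuperV.

```python
from typing import List, Tuple


class WriteThroughRegisterFile:
    """Cycle-level model of a multi-ported register file with write-through reads."""

    def __init__(self, num_regs: int, read_ports: int, write_ports: int) -> None:
        self.regs = [0] * num_regs
        self.read_ports = read_ports
        self.write_ports = write_ports

    def cycle(self, writes: List[Tuple[int, int]], reads: List[int]) -> List[int]:
        """Apply one cycle of (register, value) writes and register reads."""
        if len(writes) > self.write_ports or len(reads) > self.read_ports:
            raise ValueError("more accesses than available ports")
        targets = [reg for reg, _ in writes]
        if len(set(targets)) != len(targets):
            raise ValueError("two write ports target the same register")

        pending = dict(writes)
        # Write-through: a register written this cycle is read as its new value.
        results = [pending.get(reg, self.regs[reg]) for reg in reads]

        # Commit the writes so later cycles see them in the array.
        for reg, value in writes:
            self.regs[reg] = value
        return results


if __name__ == "__main__":
    rf = WriteThroughRegisterFile(num_regs=32, read_ports=10, write_ports=6)
    print(rf.cycle(writes=[(3, 7)], reads=[3, 4]))  # register 3 is bypassed
    print(rf.cycle(writes=[], reads=[3]))           # value is now stored
```

In a multi-issue machine this behavior lets a dependent instruction read a result in the cycle it is written back, without an extra bypass stage outside the register file.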

  • Oct. 2002-Aug. 2003         Custom Design of a 6-Port Register File at 400 MHz

The register file was designed as a small IP core. It supports 4 read and 2 write operations simultaneously. It was designed in 0.18 μm technology. I designed the transistor-level circuit and completed the layout and optimization. The tools I used were Spice, CosmosSE, Enterprise, Star_sim, StarRC, etc.

  • Mar. 2002-Oct. 2002          Design of a Regular Multiplier Generator

The multiplier generator shortens the design period of parallel multipliers. It can produce RTL code for any data width from 4 to 40 bits. The multiplier consists of many 4-2 counters, so the structure is regular and easy to place and route. I wrote the generator in C and synthesized multipliers of every width in Design_Compiler to obtain timing and area reports.
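The building block mentioned above, the 4-2 counter (also called a 4:2 compressor), can be modeled at the bit level. The sketch below uses the common construction from two chained full adders, which is an assumption here since the cell used by the generator is not described. The function names are hypothetical, and Python is used instead of the generator's C for brevity.

```python
from itertools import product
from typing import Tuple


def full_adder(a: int, b: int, c: int) -> Tuple[int, int]:
    """Return (sum, carry) for three input bits."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def compressor_4_2(x1: int, x2: int, x3: int, x4: int,
                   cin: int) -> Tuple[int, int, int]:
    """Reduce four partial-product bits plus a carry-in.

    Returns (sum, carry, cout). cout depends only on x1..x3, never on cin,
    so carries do not ripple along a row of compressors.
    """
    s1, cout = full_adder(x1, x2, x3)
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout


def check_compressor() -> bool:
    """Exhaustively check x1+x2+x3+x4+cin == sum + 2*(carry + cout)."""
    for bits in product((0, 1), repeat=5):
        total, carry, cout = compressor_4_2(*bits)
        if sum(bits) != total + 2 * (carry + cout):
            return False
    return True


if __name__ == "__main__":
    print(check_compressor())
```

Because every cell has the same five inputs and three outputs, a partial-product array built from these cells has a repetitive layout, which is consistent with the regular place-and-route structure described above.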