Image recognition in embedded systems, such as automotive and surveillance systems, has recently been put to practical use and requires high performance with low power consumption. A many-core architecture is an effective way to meet these requirements. Accordingly, we have developed a many-core SoC that includes two many-core clusters, each with 32 energy-efficient processor cores connected by a low-latency tree-based network-on-chip (NoC). In this talk, I will present a performance evaluation of our many-core SoC using face detection as an example of a real image recognition application, and discuss two parallelized implementations on the many-core clusters. By balancing the workload across the cores, the performance scales up to 64 cores while the SoC consumes only 2.21 W.
Hiroyuki Usui is a Specialist at the Center for Semiconductor Research & Development at Toshiba Corporation, Kawasaki, Japan, where he is engaged in the research and development of mobile SoCs and multi- and many-core processors. He is currently a visiting researcher working with Professor Onur Mutlu at Carnegie Mellon University. He received his B.S. and M.S. degrees in information and computer science from Keio University, Yokohama, Japan, in 2002 and 2004, respectively, and joined Toshiba Corporation in 2004. His research interests include memory scheduling in heterogeneous systems that combine CPUs, GPUs, and hardware accelerators.