Electrical & Computer Engineering     |     Carnegie Mellon

Wednesday, December 8, 12:00-1:00 p.m. HH-1112


Patrick Bourke
Carnegie Mellon University


Speech Recognition in Silicon


Automatic speech recognition has made enormous progress over the last decade. Various implementations now appear in devices ranging from low-end toys to high-end desktops. However, the basic architecture varies dramatically and depends heavily on the acceptable cost, recognition capability, and power consumption of the application. At the high end, complicated software systems can achieve excellent accuracy, but are extremely hardware intensive, fully utilizing high-performance desktop machines. At the low end, small CPU, DSP or ASIC solutions exist, but to achieve low cost and power they make dramatic compromises in recognition quality. They are fundamentally incapable of scaling to robust, large-vocabulary, speaker-independent, continuous and real-time speech recognition. By moving the core of today's most sophisticated recognition algorithms directly into silicon, however, high-quality speech recognition can be made widely available.

An overview of modern speech recognition technology will be presented, including the state-of-the-art Carnegie Mellon Sphinx III software speech recognizer on which we base our work in hardware. Hardware speech recognition will be discussed and a possible hardware implementation described. We show the potential performance gains from moving speech recognition into hardware.


Patrick Bourke received the B.Sc. in Physics and the B.Eng. in Electrical & Electronic Engineering in 2000 and 2001, respectively, from the University of Adelaide, Australia. In 2004 he received the M.S. degree in Electrical and Computer Engineering from Carnegie Mellon University, where he is currently a Ph.D. candidate with advisor Prof. Rob Rutenbar. He has worked for Motorola in the area of integrated circuit design, and his present research interests include speech recognition, computer architecture and computational biology.