Carnegie Mellon University

Richard Stern

Professor, Electrical and Computer Engineering
Language Technologies Institute
Computer Science Department
Biomedical Engineering
Lecturer, Music

Address

5000 Forbes Avenue
Pittsburgh, PA 15213

Bio

Richard M. Stern received the S.B. degree from the Massachusetts Institute of Technology in 1970, the M.S. from the University of California, Berkeley, in 1972, and the Ph.D. from MIT in 1977, all in electrical engineering. He has been on the faculty of Carnegie Mellon University since 1977, where he is currently a Professor in the Department of Electrical and Computer Engineering, the Department of Computer Science, and the Language Technologies Institute, and a Lecturer in the School of Music. Much of Dr. Stern's current research is in spoken language systems, where he is particularly concerned with developing techniques that make automatic speech recognition more robust to changes in environment and acoustical ambience. In addition to his work in speech recognition, Dr. Stern has worked extensively in psychoacoustics, where he is best known for theoretical work in binaural perception. Dr. Stern is a Fellow of the IEEE, the Acoustical Society of America, and the International Speech Communication Association (ISCA). He was the ISCA 2008-2009 Distinguished Lecturer, received the Allen Newell Award for Research Excellence in 1992, and served as General Chair of Interspeech 2006. He is also a member of the Audio Engineering Society.

Education

Ph.D., 1976 
Electrical Engineering and Computer Science 
Massachusetts Institute of Technology

M.S., 1972 
Electrical Engineering and Computer Science 
University of California, Berkeley

B.S., 1970 
Electrical Engineering 
Massachusetts Institute of Technology

Research

Automatic Speech Recognition

Most current speech recognition systems do not yet perform well in difficult acoustical environments, or in environments that differ from those in which they were trained. This research is concerned with improving the robustness of SPHINX, Carnegie Mellon's large-vocabulary continuous-speech recognition system, with respect to acoustical distortion resulting from sources such as background noise, competing talkers, change of microphone, and room reverberation. Several strategies are being used to address these problems, including improved noise cancellation and speech normalization methods, representations of the speech waveform based on the processing of sounds by the human auditory system, and array-processing techniques that improve the signal-to-noise ratio of the speech input to the system.

Signal Processing in the Auditory System

This research includes both psychoacoustical measurements to determine how we hear complex sounds and the development of mathematical models that use optimal communication theory to relate the results of these experiments to the neural coding of sounds by the auditory system. Much of this work has been concerned with the localization of sound and other aspects of binaural perception.

Keywords

  • Robust automatic speech recognition
  • Auditory perception
  • Signal processing
  • Music information retrieval
  • Acoustics