Synchronized Captioning System
Using MPEG-4 and SPHINX
The goal is to implement a system that encodes a television signal with closed captioning information into an MPEG-4 compatible bitstream, with the captions on a separate scalable layer, and to implement a decoder that decodes and displays the bitstream while allowing the user to adjust the appearance of the captions.
In addition, SPHINX voice recognition software will be used to compensate for the time delay inherent in closed captioning for live broadcasts, so as to achieve better synchronization between the speech and the captions.
System Block Diagram
Progress
We are currently in possession of a Diamond Crunch It video capture board. It is installed on Jie's machine in the multimedia lab.
Also, we have received some captured broadcast news data from Informedia in MPEG-1 format.
We have researched the commercially available products suitable for our use and compared prices from websites provided by vendors. The best deal we have found so far is the Text Grabber by SunBelt, which retails for $299.95.
Another option is the Diamond DTV 2000, which retails for $129.95. This is a TV tuner enhancement board that works with some of Diamond's most popular video cards, and it provides closed captioning.
Informedia also kindly provided us with captured closed captioning data in ASCII format.
From www.mp3.com we have obtained an MPEG-2 layer 3 encoder (L3Enc/L3Dec). With this encoder we have done an offline compression of a .wav file. The 3-minute file took about 15 minutes to compress on a 166 MHz Pentium, which is roughly five times slower than real time.
We believe we can obtain source code for this encoder from the Fraunhofer Institute website.
Thanks to Mosur Ravishankar from the speech group, we have obtained a real-time PC version of the SPHINX-II recognition system. We are in the process of looking into the source code.
No work has been done on this component yet.
We have obtained MPEG-4 encoder and decoder source code from Microsoft and compiled it using Visual C++ 5.0. We have encoded a number of .yuv files from the class site.
We have compiled Microsoft's decoder and tested it successfully on the files compressed by the encoder. The output of the decoder is a .yuv file that is in IPBB order. We have written some Java applications that can view these .yuv files:
We successfully decoded the file we encoded above.
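If the IPBB order mentioned above is the coded order, in which each I or P frame is written before the B frames that are displayed ahead of it, a viewer has to reorder frames before showing them. The following is only a sketch under that assumption, not a description of the Microsoft decoder's actual behavior. The function name display_order and the frame-type string are hypothetical, and since a raw .yuv file does not record frame types, the pattern is assumed to be known from the encoder settings.

```python
def display_order(coded_types):
    """Return the indices of coded frames arranged in display order.

    coded_types is a string such as "IPBBPBB" giving the type of each
    frame in the order it appears in the file (assumed coded order).
    """
    order = []
    held = None  # most recent I or P frame that has not been displayed yet
    for index, frame_type in enumerate(coded_types):
        if frame_type in ("I", "P"):
            # A new reference frame means the previous one can now be shown.
            if held is not None:
                order.append(held)
            held = index
        elif frame_type == "B":
            # B frames are shown as soon as they are read.
            order.append(index)
        else:
            raise ValueError("unknown frame type: %r" % frame_type)
    if held is not None:
        order.append(held)  # flush the last reference frame
    return order
```

For example, display_order("IPBBPBB") places the first P frame after the two B frames that follow it in the file, so the frames are shown as I, B, B, P, B, B, P.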
No work has been done on this component yet.
No work has been done on this component yet.
However, we have written a version of the .yuv file viewer in Visual C++. This program requires the Visual C++ library to be installed on the system, and the display to be set to 24-bit color. This viewer is much faster than the Java version and can display a maximum of 20 frames per second for QCIF files.
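To make the viewer's job concrete, the sketch below reads QCIF frames from a raw .yuv stream and converts a pixel to 24-bit RGB. It is not the code of either viewer. It assumes the files are 8-bit planar 4:2:0 (a full Y plane followed by quarter-size U and V planes) at the standard QCIF size of 176x144, and it uses a common full-range BT.601 approximation for the color conversion. All function names are hypothetical.

```python
QCIF_WIDTH, QCIF_HEIGHT = 176, 144  # standard QCIF luma resolution


def frame_size(width=QCIF_WIDTH, height=QCIF_HEIGHT):
    """Bytes per frame for 8-bit planar 4:2:0 (assumed file layout)."""
    return width * height + 2 * (width // 2) * (height // 2)


def read_frames(stream, width=QCIF_WIDTH, height=QCIF_HEIGHT):
    """Yield (y, u, v) byte planes for each complete frame in a binary stream."""
    luma = width * height
    chroma = (width // 2) * (height // 2)
    size = luma + 2 * chroma
    while True:
        data = stream.read(size)
        if len(data) < size:
            return  # end of file, or a trailing partial frame that is ignored
        yield data[:luma], data[luma:luma + chroma], data[luma + chroma:]


def clamp(value):
    """Round to the nearest integer and limit to the 0..255 byte range."""
    return max(0, min(255, int(round(value))))


def pixel_rgb(y, u, v, width, col, row):
    """Convert one pixel to an (R, G, B) tuple; each chroma sample covers 2x2 pixels."""
    chroma_width = width // 2
    chroma_index = (row // 2) * chroma_width + col // 2
    luma_value = y[row * width + col]
    cb = u[chroma_index] - 128
    cr = v[chroma_index] - 128
    return (clamp(luma_value + 1.402 * cr),
            clamp(luma_value - 0.344136 * cb - 0.714136 * cr),
            clamp(luma_value + 1.772 * cb))
```

Under these assumptions one QCIF frame occupies 38,016 bytes, so displaying 20 frames per second means reading and converting about 760 KB of raw data each second.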