18-899 Project: H.323 System

K. Bradley, X. Chen, Y. Yuan

We intend to implement an H.323 videoconferencing system over the local TCP/IP network. This system poses three initial challenges for us: understanding and conforming to the standards; coding the DSP algorithms efficiently; and interacting with multimedia equipment through the operating system.

An H.323 system has five major components. The video system follows the H.261 standard, which we understand fairly well. The audio system is based on G.7xx (we have not yet determined the tradeoffs among the various audio codecs). The bitstreams are merged using the H.225 standard. H.245 is used for negotiation, setup, and termination. The fifth component, whiteboarding and other data transmission, will be ignored for this project.

The project lifecycle is divided into two phases, midterm and final. As the midterm deadline is approaching, the midterm goals and deliverables are discussed in more detail.

Midterm goals: H.261 system

Encoder: generates compliant bitstream from YUV video.
Decoder: generates YUV video from compliant bitstream.
Communications: ignore H.225 (only one stream), send data to remote system through a socket.
Video: use OS features to retrieve a video frame, convert to YUV, send to encoder; receive YUV video, convert to appropriate format, and display on VDU.
Demonstration: show video across the network using two machines. One machine must be equipped with a video camera, the other with a display unit. Frame rate is "as good as possible". Presumably 30 fps encoding and decoding will not be possible in software; experiments will have to be run to determine how quickly encoding and decoding can be done.
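The colour conversion in the video goal above could look roughly like the following. This is a minimal sketch, assuming the capture interface delivers packed 24-bit RGB and that the encoder wants CCIR 601 Y'CbCr planes with 2x2 subsampled chroma. The function names rgb_to_ycbcr and rgb24_to_yuv420 are hypothetical, and the coefficients are the common 8-bit fixed-point approximation rather than anything fixed by this proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one RGB pixel to CCIR 601 Y'CbCr with integer arithmetic only.
   The offsets (16 for luma, 128 for chroma) and the rounding term are
   folded into a single constant, so every sum is non-negative before
   the right shift. */
static void rgb_to_ycbcr(uint8_t r, uint8_t g, uint8_t b,
                         uint8_t *y, uint8_t *cb, uint8_t *cr)
{
    *y  = (uint8_t)(( 66 * r + 129 * g +  25 * b +  4224) >> 8);
    *cb = (uint8_t)((-38 * r -  74 * g + 112 * b + 32896) >> 8);
    *cr = (uint8_t)((112 * r -  94 * g -  18 * b + 32896) >> 8);
}

/* Convert a packed RGB24 frame (R, G, B byte order, an assumption) to
   planar 4:2:0. Each chroma sample is the rounded average of the four
   chroma values in its 2x2 pixel block. Width and height must be even. */
static void rgb24_to_yuv420(const uint8_t *rgb, int width, int height,
                            uint8_t *yp, uint8_t *cbp, uint8_t *crp)
{
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    for (int row = 0; row < height; row += 2) {
        for (int col = 0; col < width; col += 2) {
            int sum_cb = 0, sum_cr = 0;
            for (int dy = 0; dy < 2; dy++) {
                for (int dx = 0; dx < 2; dx++) {
                    size_t i = (size_t)(row + dy) * (size_t)width
                             + (size_t)(col + dx);
                    uint8_t cb, cr;
                    rgb_to_ycbcr(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2],
                                 &yp[i], &cb, &cr);
                    sum_cb += cb;
                    sum_cr += cr;
                }
            }
            size_t c = (size_t)(row / 2) * (size_t)(width / 2)
                     + (size_t)(col / 2);
            cbp[c] = (uint8_t)((sum_cb + 2) >> 2);
            crp[c] = (uint8_t)((sum_cr + 2) >> 2);
        }
    }
}
```

For a QCIF frame (176 x 144) the caller would supply a 176*144-byte luma plane and two 88*72-byte chroma planes. The reverse conversion for display would follow the same fixed-point pattern.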

Final goals: complete system

Audio: G.7xx

Stream management: H.225

Negotiation and signalling: H.245

Demonstration: at least one-way transmission of audio and video data, from host to target. Two-way communication if resources permit (two cameras, for example).

As we have three group members, we are dividing work for the midterm project based on interests and strengths. We have not yet divided work for the final stages of the project.

Initially, Yuan will focus on efficient implementations of the loop filter and motion compensation. Xufeng will focus on efficient implementations of the DCT and quantization. By "efficient" here we mean optimized C algorithms that require as little time as possible, perhaps using integer-only arithmetic. Kevin will construct test environments that allow YUV video to be read or written, serve up "fake" bitstream data for the decoder, and investigate the OS requirements for different system implementations for video.
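As an illustration of the integer-only style we have in mind, the H.261 loop filter is a natural candidate. The sketch below assumes the usual description of that filter: a separable filter with taps 1/4, 1/2, 1/4 applied within an 8x8 block, with taps 0, 1, 0 at the block edges, and a single rounding step at the end (halves rounded up). The function name loop_filter_8x8 is hypothetical, and the exact behaviour should be checked against the standard before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Apply the separable (1, 2, 1)/4 filter to one 8x8 block in raster
   order. Intermediate values keep full precision: the horizontal pass
   leaves results scaled by 4, the vertical pass by 16, and only the
   final value is rounded back to 8 bits. Edge rows and columns pass
   through unfiltered in the direction that crosses the block border. */
static void loop_filter_8x8(const uint8_t in[64], uint8_t out[64])
{
    int tmp[64];

    assert(in != NULL && out != NULL);

    /* Horizontal pass: results scaled by 4. */
    for (int r = 0; r < 8; r++) {
        const uint8_t *p = in + 8 * r;
        tmp[8 * r]     = 4 * p[0];
        tmp[8 * r + 7] = 4 * p[7];
        for (int c = 1; c < 7; c++)
            tmp[8 * r + c] = p[c - 1] + 2 * p[c] + p[c + 1];
    }

    /* Vertical pass: results scaled by 16, then rounded. */
    for (int c = 0; c < 8; c++) {
        for (int r = 0; r < 8; r++) {
            int v;
            if (r == 0 || r == 7)
                v = 4 * tmp[8 * r + c];
            else
                v = tmp[8 * (r - 1) + c] + 2 * tmp[8 * r + c]
                  + tmp[8 * (r + 1) + c];
            out[8 * r + c] = (uint8_t)((v + 8) >> 4);
        }
    }
}
```

The filter needs only additions and shifts, so it involves no floating point at all. A flat block comes out unchanged, which makes a convenient first test case.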

After the algorithms are implemented, the encoder and decoder will be constructed from them in week 2. YUV data will be read from a file and compressed in the encoder, while the decoder will receive "fake" bitstream data to decode. "Fake" data is necessary because real bitstreams will not be generated or parseable until week 3. In addition, a mechanism for transporting a bitstream across the network will be implemented. Although elements of the H.225 standard may be used, most of it will probably be ignored because we are concerned with a single bitstream at this point; however, this can change. Initial investigations of the video systems will also be undertaken.
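One simple way to carry a single bitstream over a TCP socket, without H.225, is to prefix each chunk with its length. The following is a sketch under that assumption only; it is not H.225 framing, and the names send_chunk and recv_chunk, as well as the 4-byte big-endian header, are hypothetical choices. It uses the POSIX send and recv calls and does not retry on interrupted system calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Send exactly len bytes. Returns 0 on success, -1 on error. */
static int send_all(int fd, const uint8_t *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(fd, buf, len, 0);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Receive exactly len bytes. Returns 0 on success, -1 on error or if
   the peer closes the connection early. */
static int recv_all(int fd, uint8_t *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = recv(fd, buf, len, 0);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one chunk: 4-byte big-endian length, then the payload. */
static int send_chunk(int fd, const uint8_t *data, uint32_t len)
{
    uint8_t hdr[4] = {
        (uint8_t)(len >> 24), (uint8_t)(len >> 16),
        (uint8_t)(len >> 8),  (uint8_t)len
    };
    assert(data != NULL || len == 0);
    if (send_all(fd, hdr, sizeof hdr) != 0)
        return -1;
    return send_all(fd, data, len);
}

/* Receive one chunk into data (capacity cap bytes). Returns the payload
   length, or -1 on error or if the chunk does not fit. */
static long recv_chunk(int fd, uint8_t *data, uint32_t cap)
{
    uint8_t hdr[4];
    uint32_t len;
    if (recv_all(fd, hdr, sizeof hdr) != 0)
        return -1;
    len = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16)
        | ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];
    if (len > cap)
        return -1;
    if (recv_all(fd, data, len) != 0)
        return -1;
    return (long)len;
}
```

The encoder side would call send_chunk once per coded picture (or per group of blocks), and the decoder side would loop on recv_chunk. Because TCP is a byte stream, the length prefix is what lets the receiver find chunk boundaries.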

In week 3, bitstreams will be generated from the compressed data, and compliant bitstreams will be read from a file for the decoder. A mechanism for obtaining and displaying YUV video will also be constructed.
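One step in generating a bitstream from compressed data is turning each quantized 8x8 block into (run, level) events in zigzag order, which a later stage maps to variable-length codes. The sketch below shows only that step, assuming the conventional zigzag scan; the RunLevel type and run_level_encode function are hypothetical names, and the sketch ignores the separate handling of the intra DC coefficient and the code tables themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Raster index of the coefficient visited at each zigzag scan position. */
static const uint8_t zigzag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

/* One event: the number of zero coefficients skipped, then a non-zero
   coefficient value. */
typedef struct {
    uint8_t run;
    int16_t level;
} RunLevel;

/* Scan a quantized block (raster order) in zigzag order and write its
   (run, level) events. Returns the number of events; trailing zeros
   produce no event and would be covered by an end-of-block code. */
static int run_level_encode(const int16_t block[64], RunLevel events[64])
{
    int n = 0;
    int run = 0;

    assert(block != NULL && events != NULL);

    for (int i = 0; i < 64; i++) {
        int16_t v = block[zigzag[i]];
        if (v == 0) {
            run++;
        } else {
            events[n].run = (uint8_t)run;
            events[n].level = v;
            n++;
            run = 0;
        }
    }
    return n;
}
```

The decoder side would run the same table in reverse: start from a zeroed block, skip run positions, and store each level at the raster index given by the table.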

Finally, the encoder, decoder, video, and networking subsystems will be combined. Hopefully the integration will go smoothly and a demonstration will be possible at the end of the week.

The exact software design and test procedures are not outlined in this document.