Packet-switching based video conferencing has emerged as one of the most important multimedia applications. Lip synchronization can be disrupted in the packet network as the result of the network properties: packet delay jitters at the capture end, network delay jitters, packet loss, packet arrived out of sequence, local clock mismatch, and video playback overlay with the graphic system. The synchronization problem become more demanding as the real time and multiparty requirement of the video conferencing application. Some of the above mentioned problem can be solved in the more advanced network architecture as ATM having promised. This paper will present some of the solutions to the problems that can be useful at the end station terminals in the massively deployed packet switching network today. The playback scheme in the end station will consist of two units: compression domain buffer management unit and the pixel domain buffer management unit. The pixel domain buffer management unit is responsible for removing the annoying frame shearing effect in the display. The compression domain buffer management unit is responsible for parsing the incoming packets for identifying the complete data blocks in the compressed data stream which can be decoded independently. The compression domain buffer management unit is also responsible for concealing the effects of clock mismatch, lip synchronization, and packet loss, out of sequence, and network jitters. This scheme can also be applied to the multiparty teleconferencing environment. Some of the schemes presented in this paper have been implemented in the Multiparty Multimedia Teleconferencing (MMT) system prototype at the IBM watson research center.