Video object segmentation is an important component of object-based video coding schemes such as MPEG-4. This work proposes a fast and robust video segmentation technique that separates foreground from background efficiently by combining motion and color information. First, the mean shift algorithm, a non-parametric, gradient-based iterative color clustering method, is employed to extract robust dominant color regions according to color similarity. Using the dominant color information from previous frames as the initial guess for the next frame can reduce the computation time to 50%. Next, moving regions are identified by a motion detection method based on the frame intensity difference, which avoids the complexity of motion estimation over the whole frame. Only the moving regions are then merged or split according to a region-based affine motion model. Furthermore, the sizes, colors, and motion information of homogeneous regions are tracked to increase the temporal and spatial consistency of the extracted objects. The proposed system is evaluated on several typical MPEG-4 test sequences and provides consistent and accurate object boundaries throughout each sequence.
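The idea of seeding mean shift with the previous frame's dominant colors can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a flat (uniform) kernel, plain RGB tuples, and a tiny hand-made pixel list, and the names `mean_shift_mode`, `dominant_colors`, and `bandwidth` are hypothetical choices made for this example only.

```python
import math


def mean_shift_mode(points, start, bandwidth, tol=1e-3, max_iter=100):
    """Move `start` to the mean of all points within `bandwidth` of it,
    repeating until the move is smaller than `tol`.
    Returns (mode, number_of_iterations). Flat kernel assumed."""
    center = tuple(start)
    for iteration in range(1, max_iter + 1):
        neighbours = [p for p in points if math.dist(p, center) <= bandwidth]
        if not neighbours:
            return center, iteration
        new_center = tuple(sum(axis) / len(neighbours) for axis in zip(*neighbours))
        shift = math.dist(new_center, center)
        center = new_center
        if shift < tol:
            return center, iteration
    return center, max_iter


def dominant_colors(pixels, seeds, bandwidth):
    """Run mode seeking from every seed and keep modes that are distinct,
    i.e. farther than bandwidth / 2 from every mode already kept."""
    modes = []
    for seed in seeds:
        mode, _ = mean_shift_mode(pixels, seed, bandwidth)
        if all(math.dist(mode, kept) > bandwidth / 2 for kept in modes):
            modes.append(mode)
    return modes


# Toy "frames": a reddish region and a bluish region (illustrative data only).
frame1 = [(200, 30, 30), (204, 30, 30), (196, 30, 30), (200, 34, 30), (200, 26, 30),
          (20, 20, 220), (24, 20, 220), (16, 20, 220)]
# The next frame has slightly different colors, e.g. due to lighting changes.
frame2 = [(202, 30, 30), (206, 30, 30), (198, 30, 30), (202, 34, 30), (202, 26, 30),
          (20, 22, 220), (24, 22, 220), (16, 22, 220)]

# Cold start: every pixel of the first frame is used as a seed.
modes1 = dominant_colors(frame1, seeds=frame1, bandwidth=40)
# Warm start: only the previous frame's dominant colors are used as seeds.
modes2 = dominant_colors(frame2, seeds=modes1, bandwidth=40)
```

In this sketch the warm start runs one mode search per dominant color instead of one per pixel, which is where the saving comes from. How much time is saved in practice depends on the kernel, the bandwidth, and how quickly the scene changes.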