Region-of-interest (ROI) based video coding is widely applied in video communication. In this paper, we propose a multi-level ROI
model, which includes the eye-mouth core region (CR), the face profile region (PR), the edge region (ER) and the
background region (BR), to classify the subjective importance of the regions in a scene. Based on the
proposed model, we first segment the current frame into these four regions through skin color detection and feature location.
Then, we improve the rate control algorithm of the JVT-G012 proposal. We consider two factors, a subjective factor given
by our multi-level ROI model and an objective factor given by the direct difference from the reference frame, to model the complexity
weight of each macroblock (MB). We allocate bit resources at both the frame layer and the basic unit layer, and adjust the quantization parameter (QP) at
the MB layer. Finally, we restrict the QP of each MB with three strategies to maintain spatial and temporal smoothness. The
experimental results show that the PSNR of the ROI (CR plus PR) obtained with the proposed method is on average more than 0.5 dB
higher than that of JM8.6, while the PSNR of the whole frame differs only slightly between the two methods. The subjective
quality obtained with our method is also much better.
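
As a rough illustration of the complexity-weight idea described above, the following Python sketch combines a subjective factor (a per-region weight) with an objective factor (the mean absolute difference of each MB from the co-located reference block) and splits a frame bit budget in proportion to the result. Everything specific here is an assumption made for illustration and is not taken from the paper: the numeric region weights in ROI_WEIGHT, the multiplicative combination of the two factors, the offset eps, the proportional allocation rule, and the function names mb_mad, complexity_weights and allocate_bits. The three QP restriction strategies are not sketched because the abstract does not specify them.

```python
import numpy as np

MB = 16  # macroblock size in pixels

# Hypothetical subjective weights, indexed by region label:
# 0 = CR (eye-mouth core), 1 = PR (face profile), 2 = ER (edge), 3 = BR (background)
ROI_WEIGHT = np.array([4.0, 2.0, 1.5, 1.0])


def mb_mad(cur, ref, mb=MB):
    """Mean absolute difference of each MB against the co-located reference block."""
    h, w = cur.shape
    diff = np.abs(cur.astype(np.float64) - ref.astype(np.float64))
    # Split the frame into (rows, mb, cols, mb) tiles and average inside each tile.
    return diff.reshape(h // mb, mb, w // mb, mb).mean(axis=(1, 3))


def complexity_weights(cur, ref, roi_labels, eps=1.0):
    """Normalized per-MB weight: subjective factor times objective factor (assumed form)."""
    objective = mb_mad(cur, ref) + eps   # eps keeps static MBs from getting zero weight
    subjective = ROI_WEIGHT[roi_labels]  # look up the weight of each MB's region
    weight = subjective * objective
    return weight / weight.sum()


def allocate_bits(frame_bits, weights):
    """Split the frame bit budget across MBs in proportion to their weights."""
    return frame_bits * weights
```

With a luma frame whose height and width are multiples of 16 and an integer label map holding one label per MB, allocate_bits(frame_bits, complexity_weights(cur, ref, roi_labels)) returns a per-MB bit target that favors moving MBs in the CR and PR regions.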