A robust object tracking system that is invariant to object appearance variations and background clutter is proposed. Multiple instance learning with a boosting algorithm is applied to select discriminant texture information between the object and background data. Additionally, depth information, which is important to distinguish the object from a complicated background, is integrated. We propose two depth-based models that can compensate texture information to cope with both appearance variants and background clutter. Moreover, in order to reduce the risk of drifting problem increased for the textureless depth templates, an update mechanism is proposed to select more precise tracking results to avoid incorrect model updates. In the experiments, the robustness of the proposed system is evaluated and quantitative results are provided for performance analysis. Experimental results show that the proposed system can provide the best success rate and has more accurate tracking results than other well-known algorithms.