Developing an effective visual tracking algorithm is challenging because of factors such as pose variation and rotation. To address this problem, we propose combining a discriminative global appearance model with a generative local appearance model. First, we develop a compact global object representation by extracting the low-frequency coefficients of the object's color and texture using the two-dimensional discrete cosine transform. With this global representation, we learn a discriminative metric classifier online to differentiate the target object from its background, which is important for robustly reflecting changes in appearance. Second, we develop a new generative local model that exploits the scale-invariant feature transform and its spatial geometric information. To combine the advantages of the global discriminative model and the generative local model, we incorporate them into a Bayesian inference framework, in which the complementary models help the tracker locate the target more accurately. Furthermore, we use different mechanisms to update the global and local templates to capture appearance changes. Experimental results demonstrate that the proposed approach performs favorably against state-of-the-art methods in terms of accuracy.
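
As a rough illustration of the global representation described above, the following minimal sketch computes an orthonormal two-dimensional DCT of a small single-channel patch and keeps only the top-left (low-frequency) block of coefficients as a feature vector. The function names `dct2` and `low_frequency_features`, the block size `k`, and the use of a single intensity channel are assumptions made for illustration only; the approach summarized here applies the transform to color and texture, and its exact coefficient selection is not specified in this text.

```python
import math


def dct2(patch):
    """Orthonormal 2D DCT-II of a rectangular patch given as a list of rows."""
    rows, cols = len(patch), len(patch[0])

    def alpha(k, n):
        # Normalization factors that make the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0.0
            for x in range(rows):
                for y in range(cols):
                    total += (
                        patch[x][y]
                        * math.cos(math.pi * (2 * x + 1) * u / (2 * rows))
                        * math.cos(math.pi * (2 * y + 1) * v / (2 * cols))
                    )
            out[u][v] = alpha(u, rows) * alpha(v, cols) * total
    return out


def low_frequency_features(patch, k=2):
    """Flatten the top-left k x k block of DCT coefficients (hypothetical choice of k)."""
    coeffs = dct2(patch)
    return [coeffs[u][v] for u in range(k) for v in range(k)]


# Usage: a constant patch has all of its energy in the DC coefficient.
flat_patch = [[3.0] * 4 for _ in range(4)]
features = low_frequency_features(flat_patch, k=2)
```

Keeping only a small low-frequency block gives a compact descriptor that is insensitive to fine detail, which is the property the global model relies on. The direct quadruple loop is chosen for clarity and is only practical for small patches.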