Object tracking is a core problem in computer vision, with both theoretical and practical significance. In this paper, we propose a novel tracking method in which a robust discriminative classifier is built on both object and context information. In this method, we consider local invariant features on and around the object over multiple frames, and construct an object template and a context template. To overcome the limitations of invariant representations, we also design a non-parametric learning algorithm using transitive matching perspective transformation, called LUPT (Learning Using Perspective Transformation). This learning algorithm keeps adding new object appearances to the object template while avoiding improper updates when occlusions occur. We also analyze the asymptotic stability of our method and prove its drift-free capability in long-term tracking. Extensive experiments on challenging, publicly available video sequences that cover most of the critical conditions in tracking demonstrate the enhanced strength and robustness of our method. Moreover, in comparison with several state-of-the-art tracking systems, our method shows superior performance in most cases, especially on long sequences.