We present a visual tracking method with feature fusion via joint sparse representation. The proposed method describes each target candidate by combining different features, and employs joint sparse representation for robust coefficient estimation. We then build a probabilistic observation model based on the reconstruction error between the recovered candidate image and the observed sample. Finally, this observation model is integrated with a stochastic affine motion model to form a particle filter framework for visual tracking. Furthermore, a dynamic and robust template update strategy is applied to adapt to appearance variations of the target and reduce the risk of drifting. Quantitative evaluations on challenging benchmark video sequences demonstrate that the proposed method is effective and performs favorably against several state-of-the-art methods.
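To make the two central ingredients concrete, the following is a minimal sketch (not the paper's actual implementation) of joint sparse coding over multiple feature modalities and of the reconstruction-error-based observation likelihood. It assumes an ISTA-style solver with an ℓ2,1 row-sparsity penalty, so each template (dictionary atom) is selected jointly across all features; the function names, the regularization weight `lam`, and the likelihood scale `alpha` are illustrative choices, not from the source.

```python
import numpy as np

def joint_sparse_code(X, D, lam=0.01, n_iter=200):
    """Joint sparse representation via ISTA with an l2,1 prox.

    X : (d, K) matrix, one column of features per modality.
    D : (d, n_atoms) template dictionary.
    Returns C : (n_atoms, K) coefficients with shared row support,
    minimizing 0.5*||X - D C||_F^2 + lam * sum of row norms of C.
    """
    L = np.linalg.norm(D, 2) ** 2           # Lipschitz constant of the gradient
    t = 1.0 / L                             # step size
    C = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        G = C - t * D.T @ (D @ C - X)       # gradient step on the data term
        norms = np.linalg.norm(G, axis=1, keepdims=True)
        # Row-wise soft-thresholding: shrink whole rows toward zero,
        # enforcing that atoms are used jointly across all modalities.
        C = G * np.maximum(0.0, 1.0 - t * lam / np.maximum(norms, 1e-12))
    return C

def observation_likelihood(X, D, C, alpha=1.0):
    """Candidate likelihood from the joint reconstruction error,
    p(y | x) proportional to exp(-alpha * ||X - D C||^2)."""
    err = np.linalg.norm(X - D @ C) ** 2
    return np.exp(-alpha * err)
```

In a particle filter, each particle's candidate region would be featurized into `X`, coded against the template dictionary `D`, and weighted by `observation_likelihood`; templates in `D` would then be refreshed by the update strategy described above.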