Image sets and videos can be modeled as subspaces, which are points on Grassmann manifolds. Clustering such visual data on Grassmann manifolds is difficult because state-of-the-art clustering methods are designed for vector spaces rather than non-Euclidean geometry. Although some clustering methods for manifolds exist, a satisfactory method for clustering on Grassmann manifolds is still lacking. We propose an algorithm, termed kernel sparse subspace clustering on the Grassmann manifold, which embeds the Grassmann manifold into a reproducing kernel Hilbert space by means of an appropriate Gaussian projection kernel. This kernel is used to obtain kernel sparse representations of data on Grassmann manifolds, utilizing the self-expressive property and exploiting the intrinsic Riemannian geometry of the data. Although the Grassmann manifold is compact, the geodesic distances between Grassmann points are well measured by kernel sparse representations based on linear reconstruction. With these kernel sparse representations, our algorithm outperforms a number of existing algorithms in clustering experiments on three widely used public datasets, and its robustness is demonstrated as well.
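
The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions, since the text does not specify them: the Gaussian projection kernel is taken to be exp(-gamma * ||X X^T - Y Y^T||_F^2) for orthonormal bases X and Y of equal dimension p, the sparse self-expressive codes are obtained with a plain proximal gradient (ISTA) loop, and the names `orthonormal_basis`, `gaussian_projection_kernel`, `kernel_sparse_codes`, `affinity`, `gamma`, and `lam` are hypothetical.

```python
import numpy as np


def orthonormal_basis(data, p):
    """Represent an image set (d x n matrix) by a d x p orthonormal basis,
    i.e. a point on the Grassmann manifold G(p, d)."""
    u, _, _ = np.linalg.svd(data, full_matrices=False)
    return u[:, :p]


def gaussian_projection_kernel(bases, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||X_i X_i^T - X_j X_j^T||_F^2).
    Assumed form of the kernel; all bases must share the same p."""
    n = len(bases)
    p = bases[0].shape[1]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            overlap = np.linalg.norm(bases[i].T @ bases[j], "fro") ** 2
            # ||XX^T - YY^T||_F^2 = 2p - 2||X^T Y||_F^2 for orthonormal X, Y
            dist_sq = max(2.0 * p - 2.0 * overlap, 0.0)
            K[i, j] = K[j, i] = np.exp(-gamma * dist_sq)
    return K


def kernel_sparse_codes(K, lam=0.1, n_iter=500):
    """Minimize 0.5 * tr(K - 2KC + C^T K C) + lam * ||C||_1 with diag(C) = 0
    by proximal gradient descent (an assumed solver, chosen for brevity)."""
    n = K.shape[0]
    step = 1.0 / np.linalg.eigvalsh(K)[-1]  # 1 / Lipschitz constant of the gradient
    C = np.zeros((n, n))
    for _ in range(n_iter):
        Z = C - step * (K @ C - K)                                # gradient step
        C = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)  # soft threshold
        np.fill_diagonal(C, 0.0)                                  # no self-representation
    return C


def affinity(C):
    """Symmetric affinity matrix that a spectral clustering step could consume."""
    return np.abs(C) + np.abs(C).T
```

Given a list of data matrices, one would build the bases with `orthonormal_basis`, form the Gram matrix, compute the codes, and pass `affinity(C)` to any spectral clustering routine. The values of `gamma` and `lam` are illustrative and would need tuning per dataset.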