Effortless and robust recognition of human faces in video sequences is a practically important but technically very challenging problem, especially in the presence of pose and lighting variability. Here we study the statistical structure of one such sequence and observe that images of both facial features and the full head lie on low-dimensional manifolds embedded in very high-dimensional spaces. We apply Independent Manifold Analysis (IMA) to learn these manifolds and use them to track local features to sub-pixel accuracy. We then use sub-pixel resampling, which allows a very smooth estimate of head pose. In the process, we learn a manifold model of the head and use it to partially compensate for pose. Finally, in experiments on the standard FERET database, we find that this pose compensation reduces the equal error rate by more than an order of magnitude.