We show that we can effectively fit a very complex facial animation model to uncalibrated video sequences, without the benefit of targets, structured light, or any other active device. Our approach is based on regularized bundle-adjustment followed by least-squares minimization using a set of progressively finer control triangulations. It takes advantage of three complementary sources of information: stereo data, silhouette edges, and 2-D feature points. In this way, complete head models can be acquired with a cheap and entirely passive sensor, such as an ordinary video camera. These models can then be fed to existing animation software to produce synthetic sequences.
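One schematic way to read the minimization is as a weighted least-squares objective over the vertex positions $S$ of the control triangulation at the current refinement level, combining the three data terms with a regularization term; the weights $\lambda$ and the term names below are illustrative assumptions, not the paper's own notation:
\[
E(S) \;=\; \lambda_{\mathrm{st}}\,E_{\mathrm{stereo}}(S)
\;+\; \lambda_{\mathrm{sil}}\,E_{\mathrm{silh}}(S)
\;+\; \lambda_{\mathrm{2D}}\,E_{\mathrm{feat}}(S)
\;+\; \lambda_{\mathrm{reg}}\,E_{\mathrm{reg}}(S) ,
\]
where each data term penalizes the discrepancy between the model and one information source (stereo correspondences, observed silhouette edges, tracked 2-D features), and the objective is re-minimized as the triangulation is progressively refined.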