Previous work on unsupervised learning has shown that it is possible to learn Gabor-like feature representations,
similar to those employed in the primary visual cortex, from the statistics of natural images. However, such
representations are still not readily suited for object recognition or other high-level visual tasks because they
can change drastically as the image changes due to object motion, variations in viewpoint, lighting, and other
factors. In this paper, we describe how bilinear image models can be used to learn independent representations
of the invariances, and their transformations, in natural image sequences. These models provide the foundation
for learning higher-order feature representations that could serve as models of higher stages of processing in the
cortex, in addition to having practical merit for computer vision tasks.