We have been exploring the hypothesis that vision is an explanatory process, in which causal and functional reasoning about potential motion plays an intimate role in mediating the activity of low-level visual processes. In particular, we have explored two consequences of this view for the construction of purposeful vision systems: causal and design knowledge can be used to (1) drive focus of attention and (2) choose between ambiguous image interpretations. An important result of visual understanding is an explanation of the scene's causal structure: how action is originated, constrained, and prevented, and what will happen in the immediate future. In everyday visual experience, most action takes the form of motion, and most causal analysis takes the form of dynamical analysis. This is true even of static scenes, where much of a scene's interest lies in how possible motions are arrested. This paper describes our progress in developing domain theories and visual processes for understanding various kinds of structured scenes, including simple mechanical devices and structures built out of children's constructive toys.