For efficient generation of realistic images, four kinds of generic models (data, object, role, and process models) are introduced into a system built on the previously reported extensible WELL (Window-based Elaboration Language). The models are organized hierarchically, with interfaces running from data up to processes. To satisfy multiple, mutually interacting intentions, the concept of roles is introduced. Each role is treated as a set of object networks, and each of the user's intentions corresponds to a set of executions of several roles. Object networks, which consist of noun objects and verb objects, express state transitions. At each transition, the user's intention is issued in an event-driven manner and gives rise to concurrent processes. Multiple roles interact with one another through a common platform consisting of windows. Each role is specified by the structure of its object network, which is defined as a graph. Every object has templates that define its data structure. Object data are specified either by constraints relating the attributes of objects or by reference to the user's data-driven actions. Through the concepts of constraints, models, and roles, realistic images are obtained from a minimal amount of data. Examples of human movement are demonstrated.
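As an illustration only, the object-network idea can be sketched in Python as a small graph in which noun objects are state nodes and verb objects are transitions. This is not the WELL implementation: every name here (`NounObject`, `VerbObject`, `ObjectNetwork`, `Role`, `fire`, and the joint-angle attributes) is a hypothetical choice made for the sketch, and the constraint is an arbitrary example. It assumes a template is just a tuple of attribute names and a constraint is a function that derives the target object's attributes from the source object's data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Attributes = Dict[str, float]


@dataclass
class NounObject:
    """A state node; the template lists the attributes the object carries."""
    name: str
    template: Tuple[str, ...]
    data: Attributes = field(default_factory=dict)


@dataclass
class VerbObject:
    """A transition between two noun objects (hypothetical representation)."""
    name: str
    source: str
    target: str
    # Constraint relating attributes: derives target data from source data.
    constraint: Callable[[Attributes], Attributes]


class ObjectNetwork:
    """A graph of noun objects (nodes) connected by verb objects (edges)."""

    def __init__(self) -> None:
        self.nouns: Dict[str, NounObject] = {}
        self.verbs: Dict[str, VerbObject] = {}

    def add_noun(self, noun: NounObject) -> None:
        self.nouns[noun.name] = noun

    def add_verb(self, verb: VerbObject) -> None:
        self.verbs[verb.name] = verb

    def fire(self, verb_name: str) -> NounObject:
        """Handle one event: run the transition and fill in the target's data."""
        verb = self.verbs[verb_name]
        source = self.nouns[verb.source]
        target = self.nouns[verb.target]
        derived = verb.constraint(source.data)
        missing = [attr for attr in target.template if attr not in derived]
        if missing:
            raise ValueError(f"constraint did not supply: {missing}")
        # Keep only the attributes that the target's template declares.
        target.data = {attr: derived[attr] for attr in target.template}
        return target


@dataclass
class Role:
    """A role is modelled here simply as a named set of object networks."""
    name: str
    networks: List[ObjectNetwork]


def crouch_constraint(pose: Attributes) -> Attributes:
    # Arbitrary example constraint: the hip follows the knee at half the angle.
    knee = pose["knee_angle"] + 90.0
    return {"knee_angle": knee, "hip_angle": knee / 2}


joints = ("knee_angle", "hip_angle")
network = ObjectNetwork()
network.add_noun(NounObject("standing", joints, {"knee_angle": 0.0, "hip_angle": 0.0}))
network.add_noun(NounObject("crouching", joints))
network.add_verb(VerbObject("crouch", "standing", "crouching", crouch_constraint))

leg_role = Role("leg_motion", [network])
pose = leg_role.networks[0].fire("crouch")
```

In this sketch only the starting pose is given explicitly; the data of the `crouching` object is derived entirely by the constraint, which is one way to read the claim that constraints reduce the amount of data the user must supply.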