Ambient Intelligence is expected to become one of the key driving factors of the semiconductor industry in this decade. One of the most promising areas in this respect is the advent of embedded smart imaging in a variety of consumer applications, such as mobile communication devices and the automotive domain. The efficient VLSI implementation of these applications requires architectural concepts that enable the extraction of objects and associated information from video sequences in real time. The main architectural challenge is to find an appropriate trade-off between architectural flexibility and scalability, needed to cope with moderate variations of the applied smart imaging algorithms, on the one hand, and cost efficiency of the implementation on the other. This paper describes the algorithmic and architectural requirements for the implementation of smart imaging applications in these fields. The target system is presented; it is based on an embedded RISC processor, embedded memory, and cores that accelerate essential functions such as morphological operations, connected component labeling, and motion extraction. The functional system partitioning relies on hardware acceleration of the core functions that extract low-level information from the images of a video sequence. This information is passed to the embedded RISC processor, where software performs further abstraction and interpretation of the image content. A focal point of this paper is the derivation of efficient architectural concepts for smart imaging coprocessors, which act as a system toolbox for accelerating the required smart imaging core functions.