This paper discusses VLSI architectural support for motion estimation (ME) algorithms within the H.263 and MPEG-4 video coding standards under low-power constraints. A high memory access bandwidth and a large number of memory modules are mainly responsible for the high power consumption of many motion estimation architectures. The aim of the presented VLSI architecture was therefore to achieve high efficiency at low memory bandwidth for the computationally demanding algorithms, and to support several algorithmic features of motion estimation with little additional area overhead. Besides full-search ME with [-16, +15] and [-8, +7] pel search areas, the presented VLSI architecture supports MPEG-4 ME for arbitrarily shaped objects, advanced prediction mode, 2:1 pel subsampling, 4:1 pel subsampling, 4:1 alternate pel subsampling, Three Step Search (TSS), preference of the zero-MV, R/D-optimized ME and half-pel ME. A special data-flow design within the proposed architecture allows up to 16 absolute difference calculations to be performed in parallel, while loading at most 2 bytes per clock cycle from each of the current block memory and the search area memory. The architecture was implemented using a VHDL synthesis approach, resulting in a size of 22.8 kgates (without RAM) and a clock frequency of at least 100 MHz in a 0.25 micrometer commercial CMOS library.
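To make the underlying computation concrete, the following is a minimal software sketch of full-search block matching with a sum-of-absolute-differences (SAD) criterion over a [-8, +7] pel search area. It is an illustration only and not the data flow of the presented architecture: the hardware evaluates up to 16 absolute differences in parallel, whereas this sketch is sequential. The function names `sad` and `full_search`, the `step` parameter (a simple regular pel decimation, assumed here as a stand-in for pel subsampling), the tie-breaking rule and the synthetic test frame are all assumptions made for this example.

```python
def sad(cur, ref, bx, by, dx, dy, n=16, step=1):
    """SAD between the n x n current block at (bx, by) and the reference
    block displaced by (dx, dy). step > 1 skips pels in both directions
    (hypothetical decimation pattern, not the paper's exact scheme)."""
    total = 0
    for y in range(0, n, step):
        for x in range(0, n, step):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n=16, lo=-8, hi=7, step=1):
    """Exhaustively test every displacement in [lo, hi] x [lo, hi] and
    return (cost, dx, dy) of the first candidate with minimum SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(lo, hi + 1):
        for dx in range(lo, hi + 1):
            # Skip candidates that would read outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n, step)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Synthetic 48 x 48 reference frame filled by a small linear congruential
# generator, so the example needs no external data.
W = H = 48
state = 1
ref = []
for _ in range(H):
    row = []
    for _ in range(W):
        state = (state * 1103515245 + 12345) % (2 ** 31)
        row.append((state >> 16) & 0xFF)
    ref.append(row)

# Current frame: the reference frame displaced by (dx, dy) = (3, -2),
# with wrap-around at the frame borders.
true_dx, true_dy = 3, -2
cur = [[ref[(y + true_dy) % H][(x + true_dx) % W] for x in range(W)]
       for y in range(H)]

result = full_search(cur, ref, bx=16, by=16)
print(result)
```

Each full-resolution candidate in this sketch costs 256 absolute differences for a 16 x 16 block, and the [-8, +7] area holds 256 candidates, which indicates why both the arithmetic parallelism and the memory bandwidth per clock cycle matter in a hardware realization.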