This paper proposes a VLSI architecture for the parallel processing of the generalized 2-D convolution. The processor consists of a shift-buffer pipeline, an array of multipliers, and a tree of adders. The image data enter the processor in raster-scan format and are stored and shifted in the pipeline. The multiplier array takes data from the pipeline, performs the multiplications in parallel, and sends the partial products to the adder tree, which completes the computation. The simple architecture and control strategy make the proposed scheme suitable for VLSI implementation.
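
The data flow described above can be illustrated with a small software model. This is only a behavioral sketch under stated assumptions, not the proposed hardware design: it assumes a square K x K window, a pipeline depth of (K-1)*W + K pixels for an image of width W, outputs only where the window lies fully inside the image, and weights applied without flipping the kernel. The names `stream_convolve` and `adder_tree` are hypothetical and do not come from the paper.

```python
from collections import deque


def adder_tree(values):
    """Sum the values pairwise, level by level, as a binary adder tree would."""
    level = list(values)
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # an unpaired value passes to the next level
        level = nxt
    return level[0]


def stream_convolve(image, kernel):
    """Model a shift-buffer pipeline fed by a raster-scanned image.

    Assumptions (not from the paper): square K x K kernel, pipeline depth
    (K-1)*W + K, valid-region outputs only, kernel applied without flipping.
    """
    height, width = len(image), len(image[0])
    k = len(kernel)
    depth = (k - 1) * width + k
    pipeline = deque(maxlen=depth)  # oldest pixel drops out on each shift
    out = [[0] * (width - k + 1) for _ in range(height - k + 1)]

    for r in range(height):          # raster scan: row by row,
        for c in range(width):       # left to right within a row
            pipeline.append(image[r][c])
            if r >= k - 1 and c >= k - 1:
                # Tap i*width + j holds pixel (r-k+1+i, c-k+1+j).
                # In hardware, all K*K products would be formed in parallel.
                products = [kernel[i][j] * pipeline[i * width + j]
                            for i in range(k) for j in range(k)]
                out[r - k + 1][c - k + 1] = adder_tree(products)
    return out
```

In this model the pipeline is full exactly when the first valid window is reached, so every tap read is defined. Each new pixel shifts the buffer by one position, and the same fixed taps then expose the next window to the multipliers.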