Document scanning is now an accepted part of office procedure, allowing the incorporation of digitized images into new documents and the conversion of scanned print into ASCII by optical character recognition (OCR). Document pages often contain more than one form of information: textual, graphical and/or pictorial. Segmentation of document images into these three categories is feasible with the aid of image processing. Projections of the thresholded document images, in conjunction with autocorrelation, are used to check text alignment. The edge-shifting properties of the rank filter are then used to coalesce image regions containing text into solid, near-rectangular blocks. Pyramidal reduction is combined with the filtering to ease the computational burden. Horizontal and vertical projections are used to segment whole pages recursively into homogeneous blocks, whose properties are then analysed. Foreseen applications of the image segmentation include modified facsimile systems, artifact-free OCR, and conversion of document images into files with separate formats for text, graphics and pictures.
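
The recursive use of horizontal and vertical projections can be illustrated with a minimal sketch. It assumes a thresholded page stored as a NumPy array with 1 for ink and 0 for background, and it is not taken from the method described above: the function names `split_runs` and `xy_cut` and the `min_gap` parameter are hypothetical, and the rank filtering, pyramidal reduction and alignment check are omitted.

```python
import numpy as np


def split_runs(profile, min_gap):
    """Return (start, stop) pairs of occupied runs in a projection profile.

    Two occupied bins belong to different runs only when at least
    `min_gap` empty bins lie between them (an assumed heuristic).
    """
    occupied = np.flatnonzero(profile > 0)
    if occupied.size == 0:
        return []
    runs = []
    start = prev = int(occupied[0])
    for idx in occupied[1:]:
        idx = int(idx)
        if idx - prev - 1 >= min_gap:
            runs.append((start, prev + 1))
            start = idx
        prev = idx
    runs.append((start, prev + 1))
    return runs


def xy_cut(binary, min_gap=2, top=0, left=0, horizontal=True, tried_other=False):
    """Recursively split a binary page into blocks.

    Returns a list of (top, left, bottom, right) boxes, with bottom and
    right exclusive. `horizontal=True` means the row profile is used, so
    the page is cut along horizontal white gaps.
    """
    # Summing over axis 1 gives one value per row; axis 0, one per column.
    profile = binary.sum(axis=1 if horizontal else 0)
    runs = split_runs(profile, min_gap)
    blocks = []
    for a, b in runs:
        if horizontal:
            sub, t, l = binary[a:b, :], top + a, left
        else:
            sub, t, l = binary[:, a:b], top, left + a
        if len(runs) == 1 and tried_other:
            # No cut possible in either direction: emit the block.
            blocks.append((t, l, t + sub.shape[0], l + sub.shape[1]))
        else:
            # Switch direction; remember whether this direction failed to cut.
            blocks += xy_cut(sub, min_gap, t, l, not horizontal, len(runs) == 1)
    return blocks


if __name__ == "__main__":
    page = np.zeros((20, 20), dtype=np.uint8)
    page[2:6, 3:15] = 1     # a wide block near the top
    page[10:18, 3:8] = 1    # lower-left block
    page[10:18, 12:17] = 1  # lower-right block
    for box in xy_cut(page):
        print(box)
```

On the toy page in the usage example, the row profile first separates the top block from the lower band, and the column profile then splits the lower band in two. In practice `min_gap` would need tuning to the scanning resolution, since gaps between text lines are much narrower than gaps between blocks.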