Authentication of digital documents is an important concern as digital documents are replacing traditional
paper-based documents for official and legal purposes. This is especially true in the case of documents that are
exchanged over the Internet, which could be accessed and modified by intruders. The most popular methods used
for authentication of digital documents are public key encryption-based authentication and digital watermarking.
Traditional watermarking techniques embed a pre-determined pattern, such as a character string or a company logo, in a
document. We propose a fragile watermarking system, which uses an on-line signature of the author as the
watermark in a document. Embedding a biometric characteristic such as a signature in a document enables
us to verify the identity of the author using a set of reference signatures, in addition to ascertaining the document
integrity. The receiver of the document reconstructs the signature used to watermark the document, which is then
used to verify the author's claimed identity. The paper presents a signature encoding scheme, which facilitates
reconstruction by the receiver, while reducing the chances of collusion attacks.
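
The embed-and-reconstruct idea can be illustrated with a minimal sketch. It is not the encoding scheme of the paper: it assumes the on-line signature is a short list of (x, y) samples quantized to 8 bits each, that the document is a flat list of 8-bit grayscale pixels, and that the watermark is written cyclically into the least significant bit of each pixel. The function names are hypothetical. A scheme this simple only notices changes that disturb the least significant bits, and it does nothing against collusion.

```python
def signature_to_bits(points, bits_per_coord=8):
    """Flatten (x, y) samples into a bit list, most significant bit first."""
    bits = []
    for x, y in points:
        for value in (x, y):
            for shift in range(bits_per_coord - 1, -1, -1):
                bits.append((value >> shift) & 1)
    return bits


def bits_to_signature(bits, bits_per_coord=8):
    """Inverse of signature_to_bits: rebuild the list of (x, y) samples."""
    values = []
    for start in range(0, len(bits), bits_per_coord):
        value = 0
        for bit in bits[start:start + bits_per_coord]:
            value = (value << 1) | bit
        values.append(value)
    return list(zip(values[0::2], values[1::2]))


def embed(pixels, bits):
    """Write the bit string cyclically into the LSB of every pixel."""
    return [(p & ~1) | bits[i % len(bits)] for i, p in enumerate(pixels)]


def extract(pixels, n_bits):
    """Recover each bit by majority vote over its repeated copies.

    Also returns the pixel indices whose LSB disagrees with the vote,
    which this sketch treats as evidence of tampering.
    """
    ones = [0] * n_bits
    copies = [0] * n_bits
    for i, p in enumerate(pixels):
        ones[i % n_bits] += p & 1
        copies[i % n_bits] += 1
    bits = [1 if 2 * o > c else 0 for o, c in zip(ones, copies)]
    suspect = [i for i, p in enumerate(pixels) if (p & 1) != bits[i % n_bits]]
    return bits, suspect
```

The receiver would call `extract` with the agreed signature length, rebuild the samples with `bits_to_signature`, and pass them to a separate signature verifier that compares them against the reference signatures. That verifier is outside this sketch.
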
Many document images contain both text and non-text (images, line drawings, etc.) regions. Automatic segmentation of such an image into text and non-text regions is extremely useful in a variety of applications. Identifying text regions helps in text recognition applications, while classifying an image into text and non-text regions allows the individual regions to be processed differently in applications like page reproduction and printing. One of the main approaches to text detection is based on modeling the text as a texture. We present a method based on a combination of neural networks (texture-based) and connected component analysis to detect text in color documents with busy foreground and background. The proposed method achieves an accuracy of 96% (by area) on a test set of 40 documents.
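
The connected component step and an area-based accuracy can be sketched as follows. This is an illustration under stated assumptions, not the method of the paper. The texture-based neural network is not reproduced: the input is assumed to be a binary mask (1 for candidate text pixels) that such a classifier might produce. The size thresholds in `filter_text_like` are hypothetical. `area_accuracy` is one plausible reading of "accuracy by area", namely the fraction of pixels whose label matches the ground truth.

```python
from collections import deque


def connected_components(mask):
    """Return bounding boxes (top, left, bottom, right) of 4-connected regions."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def filter_text_like(boxes, min_side=2, max_side=50):
    """Keep boxes whose height and width fall in an assumed character-size range."""
    kept = []
    for top, left, bottom, right in boxes:
        height, width = bottom - top + 1, right - left + 1
        if min_side <= height <= max_side and min_side <= width <= max_side:
            kept.append((top, left, bottom, right))
    return kept


def area_accuracy(predicted, truth):
    """Fraction of pixels whose predicted text/non-text label matches the truth."""
    total = sum(len(row) for row in truth)
    agree = sum(p == t
                for p_row, t_row in zip(predicted, truth)
                for p, t in zip(p_row, t_row))
    return agree / total
```

In a fuller pipeline, the boxes that survive filtering would be combined with the texture classifier's output to decide which regions are text. How the two are combined here is left open, since the abstract does not specify it.
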