Detecting documents that bear a particular stamp is an effective and reliable way to retrieve documents associated with a specific source. However, this problem has remained largely unaddressed. In this paper, we present a novel stamp detection framework based on parameter estimation of connected edge features. Using robust basic-shape detectors, the approach is effective for stamps with analytically shaped contours, even when only limited samples are available. For elliptic or circular stamps, it efficiently exploits the orientation information from pairs of edge points to determine the stamp's center position and area, without computing all five parameters of an ellipse. Our approach takes into account the characteristics unique to stamp patterns. Specifically, we introduce effective algorithms to address the problem that stamps often spatially overlay their background contents. These give our approach significant advantages in detection accuracy and computational complexity over the traditional Hough transform method in locating candidate ellipse regions. Experimental results on real degraded documents demonstrate the robustness of this retrieval approach on a large document database consisting of both printed text and handwritten notes.
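To illustrate how orientation information from pairs of edge points can locate an ellipse center without fitting all five ellipse parameters, the following minimal sketch uses a well-known geometric property: if the tangents at two points on an ellipse meet at a point T, the line through T and the midpoint of the two points passes through the ellipse center. Two such lines intersect at the center. This is an assumption about the kind of property such a method could exploit, not the algorithm of the paper itself; all function names (`intersect`, `center_line`, `estimate_center`, `ellipse_sample`) are hypothetical, and the edge points and tangent directions are synthesized rather than taken from a real edge map.

```python
import math


def cross(u, v):
    """2D cross product (z-component) of vectors u and v."""
    return u[0] * v[1] - u[1] * v[0]


def intersect(p, d, q, e, eps=1e-12):
    """Intersect the lines p + s*d and q + u*e; return None if near-parallel."""
    denom = cross(d, e)
    if abs(denom) < eps:
        return None
    r = (q[0] - p[0], q[1] - p[1])
    s = cross(r, e) / denom
    return (p[0] + s * d[0], p[1] + s * d[1])


def center_line(p1, t1, p2, t2):
    """Return (point, direction) of a line through the ellipse center.

    p1, p2 are edge points; t1, t2 are their tangent directions.
    The line joins the midpoint of p1p2 to the intersection of the tangents.
    """
    t = intersect(p1, t1, p2, t2)
    if t is None:  # parallel tangents: this pair gives no usable line here
        return None
    m = ((p1[0] + p2[0]) / 2.0, (p1[1] + p2[1]) / 2.0)
    return m, (t[0] - m[0], t[1] - m[1])


def estimate_center(pair_a, pair_b):
    """Estimate the center from two pairs, each given as (p1, t1, p2, t2)."""
    line_a = center_line(*pair_a)
    line_b = center_line(*pair_b)
    if line_a is None or line_b is None:
        return None
    return intersect(line_a[0], line_a[1], line_b[0], line_b[1])


def ellipse_sample(center, a, b, phi, t):
    """Synthetic edge point and tangent on a rotated ellipse (test data only)."""
    c, s = math.cos(phi), math.sin(phi)
    x, y = a * math.cos(t), b * math.sin(t)
    dx, dy = -a * math.sin(t), b * math.cos(t)
    point = (center[0] + c * x - s * y, center[1] + s * x + c * y)
    tangent = (c * dx - s * dy, s * dx + c * dy)
    return point, tangent


# Four synthetic edge points on an ellipse centered at (3, -2).
samples = [ellipse_sample((3.0, -2.0), 5.0, 2.0, 0.4, t)
           for t in (0.3, 1.4, 2.5, 4.0)]
pair_a = (*samples[0], *samples[1])
pair_b = (*samples[2], *samples[3])
print(estimate_center(pair_a, pair_b))
```

On this noise-free input the estimate should match the true center (3, -2) up to floating-point rounding. On a real edge map, tangent directions would come from image gradients and would be noisy, so a practical detector would more plausibly accumulate votes from many point pairs than rely on two pairs alone.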
Most researchers would agree that the field of document processing can benefit tremendously from a common software library through which institutions can develop and share research-related software and applications across academic, business, and government domains. However, despite several past attempts, the research community still lacks a widely accepted standard software library for document processing. This paper describes a new library called DOCLIB, which tries to overcome the drawbacks of earlier approaches. Many of DOCLIB's features are unique either in themselves or in their combination with others, e.g., the factory concept for supporting different image types, the juxtaposition of image data and metadata, and the add-on mechanism. We hope that DOCLIB will serve the needs of researchers better than previous approaches and will be readily accepted by a larger group of scientists.