Smart Browser is a framework that supports the easy integration of customized background services within a Web browser. The framework uses a set of extensible XML message schemas for communication between the browser and the background services. Based on this framework, a set of background services is integrated into the Firefox browser. These services bring the intelligence of crowds and machines, which usually involves complicated logic, intensive data, and heavy computation, to each end user through the lightweight browser they already use every day. Applications built on this framework can improve users' experience of surfing the Web.
Blind image watermarking technologies allow information to be embedded in common digital images and then recovered from the watermarked images without the originals. However, the embedded information is often damaged by the print-and-scan process because of the random noise, color alteration, and rotation and scaling that the process introduces. In this paper, we present a practical blind image watermarking scheme in the DCT domain that can survive the print-and-scan process. The image is partitioned into blocks, and each block embeds one bit of watermark data. Two uncorrelated pseudo-random sequences are used to spread bits 0 and 1, respectively, in the middle-frequency band of the block-DCT spectrum; this is done by adaptively adding the corresponding pseudo-random sequence to the middle-frequency block-DCT coefficients. The embedded bit is recovered by comparing the correlations of the modified middle-frequency coefficients with each pseudo-random sequence. Experiments show that the bit error ratio of the watermark is 2.26% after the print-and-scan process, which is robust enough for embedding visual objects. The robustness of the embedded data can be further improved by incorporating error-correction coding and repetition-voting techniques. In conclusion, this scheme achieves both good watermark robustness and good watermark transparency under the print-and-scan process.
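The embedding and detection steps described in the abstract can be sketched as follows. The block size, mid-band selection, PRNG seed, and embedding strength `alpha` are illustrative assumptions, and the adaptive strength adjustment mentioned in the abstract is omitted; this is a minimal sketch of the spread-spectrum idea, not the paper's exact implementation.

```python
import numpy as np
from scipy.fftpack import dct, idct

BLOCK = 8  # hypothetical block size; the abstract does not fix one

# Two uncorrelated pseudo-random +/-1 sequences, one per bit value,
# defined over an illustrative mid-frequency band of the 8x8 DCT grid.
MID_BAND = [(i, j) for i in range(BLOCK) for j in range(BLOCK) if 3 <= i + j <= 6]
_rng = np.random.default_rng(42)
PN = {0: _rng.choice([-1.0, 1.0], size=len(MID_BAND)),
      1: _rng.choice([-1.0, 1.0], size=len(MID_BAND))}

def dct2(b):
    return dct(dct(b, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(b):
    return idct(idct(b, axis=0, norm='ortho'), axis=1, norm='ortho')

def embed_bit(block, bit, alpha=5.0):
    """Add the PN sequence for `bit` to the mid-frequency DCT coefficients."""
    c = dct2(block.astype(float))
    for k, (i, j) in enumerate(MID_BAND):
        c[i, j] += alpha * PN[bit][k]
    return idct2(c)

def detect_bit(block):
    """Recover the bit by comparing correlations with each PN sequence."""
    c = dct2(block.astype(float))
    mid = np.array([c[i, j] for (i, j) in MID_BAND])
    return 0 if np.dot(mid, PN[0]) > np.dot(mid, PN[1]) else 1
```

On a smooth block the mid-frequency coefficients are near zero, so the correlation with the embedded sequence dominates and detection is exact; real images and the print-and-scan channel add noise that the correlation detector must overcome.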
Mongolian is one of the major ethnic languages in China, and a large number of printed Mongolian documents need to be digitized for digital libraries and various applications. Traditional Mongolian script has a unique writing style and multi-font-type variations, which bring challenges to Mongolian OCR research. Because traditional Mongolian script has particular characteristics, for example one character may be part of another character, we define the character set for recognition according to the segmented components, and the components are combined into characters by a rule-based post-processing module. For character recognition, a method based on visual directional features and multi-level classifiers is presented. For character segmentation, a scheme is used to find the segmentation points by analyzing the properties of the projection and the connected components. Since Mongolian font types are categorized into two major groups, the segmentation parameters are adjusted for each group, and a font-type classification method for the two groups is introduced. For recognition of Mongolian text mixed with Chinese and English, language identification and the relevant character-recognition kernels are integrated. Experiments show that the presented methods are effective: the text recognition rate is 96.9% on test samples taken from practical documents with multiple font types and mixed scripts.
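The projection-based segmentation step can be illustrated with a minimal sketch, assuming a binarized, vertically written word image in which glyphs are joined by a vertical stem: rows where the horizontal projection falls to roughly the stem width are candidate cut positions. The `stem_width` threshold stands in for the font-group-dependent parameter the abstract adjusts per group; the connected-component analysis is omitted.

```python
import numpy as np

def candidate_cuts(binary_word, stem_width, min_gap=3):
    """Find candidate segmentation rows in a vertically written word image.

    binary_word: 2-D array, 1 = ink. Rows whose horizontal projection is at
    most `stem_width` (only the joining stem is present) are candidate cuts;
    one representative row is kept per contiguous run of such thin rows.
    """
    proj = binary_word.sum(axis=1)                 # ink pixels per row
    thin = np.where((proj > 0) & (proj <= stem_width))[0]
    cuts, prev = [], -min_gap - 1
    for r in thin:
        if r - prev > min_gap:                     # start of a new thin run
            cuts.append(int(r))
        prev = r
    return cuts
```

A real system would refine these candidates with connected-component properties, as the abstract describes, rather than accepting every thin run.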
As a cursive script, Arabic differs greatly from Latin or Chinese. For example, an Arabic character has up to four written forms, and characters that can be joined are always joined on the baseline. The methods used for Arabic document recognition are therefore specialized, and character segmentation is the most critical problem. In this paper, a printed Arabic document recognition system is presented that is composed of text-line segmentation, word segmentation, character segmentation, character recognition, and post-processing stages. First, a hybrid top-down and bottom-up method based on connected-component classification is proposed to segment Arabic text into lines and words. Characters are then segmented by analyzing the word contour: the baseline position of a given word is estimated first; next, a function denoting the distance between the contour and the baseline is analyzed to find all candidate segmentation points; finally, structural rules are applied to merge over-segmented characters. After character segmentation, both statistical and structural features are used for character recognition, and a lexicon is then used to improve the recognition results. Experiments show that the recognition accuracy of the system reaches 97.62%.
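The baseline estimation and contour-to-baseline distance function can be sketched as follows on a binarized word image. The baseline estimator (row of maximum horizontal projection), the lower-contour definition, and the thresholds are common simplifying assumptions, not necessarily the paper's exact choices, and the rule-based merging of over-segmented characters is omitted.

```python
import numpy as np

def baseline_row(word):
    """Estimate the baseline as the row with the maximum horizontal projection."""
    return int(np.argmax(word.sum(axis=1)))

def contour_baseline_distance(word):
    """Per-column distance between the word's lower contour and the baseline."""
    base = baseline_row(word)
    dist = np.full(word.shape[1], np.inf)
    for c in range(word.shape[1]):
        rows = np.where(word[:, c] > 0)[0]
        if rows.size:
            dist[c] = abs(int(rows[-1]) - base)    # lower contour = last ink row
    return base, dist

def candidate_points(word, max_dist=1, max_thickness=2):
    """Columns where the contour sits on the baseline and the stroke is thin."""
    _, dist = contour_baseline_distance(word)
    thick = word.sum(axis=0)
    return [c for c in range(word.shape[1])
            if dist[c] <= max_dist and 0 < thick[c] <= max_thickness]
```

Columns inside character bodies have thick vertical runs and are rejected, while the thin connecting strokes lying on the baseline between joined characters survive as candidates.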
Although about 300 million people worldwide, across several different languages, use Arabic characters for writing, Arabic OCR has not been researched as thoroughly as OCR for other widely used scripts such as Latin or Chinese. In this paper, a new statistical method is developed to recognize machine-printed Arabic characters. First, the entire Arabic character set is pre-classified into 32 subsets according to character form (isolated, final, initial, medial), the zones that characters occupy (divided according to the headline and the baseline of a text line), and component information (with or without secondary parts such as diacritical marks and vowel signs). Then 12 types of directional features are extracted from the character profiles. After dimension reduction by linear discriminant analysis (LDA), the features are passed to a modified quadratic discriminant function (MQDF), which serves as the final classifier. Finally, similar characters are discriminated before the recognition results are output. With properly selected parameters, encouraging experimental results on the test sets demonstrate the validity of the proposed approach.
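The final classification stage can be illustrated with a minimal MQDF sketch. MQDF keeps each class's top-k covariance eigenpairs and replaces the minor eigenvalues with a shared constant delta; the feature extraction and LDA step are assumed to have already happened, and the hyper-parameters `k` and `delta` here are illustrative, not the paper's.

```python
import numpy as np

class MQDF:
    """Modified quadratic discriminant function with k principal axes per class."""

    def __init__(self, k=2, delta=0.1):
        self.k, self.delta = k, delta
        self.params = {}

    def fit(self, X, y):
        for cls in np.unique(y):
            Xc = X[y == cls]
            mu = Xc.mean(axis=0)
            w, V = np.linalg.eigh(np.cov(Xc, rowvar=False))
            order = np.argsort(w)[::-1][:self.k]      # top-k eigenpairs
            self.params[cls] = (mu, w[order], V[:, order])
        return self

    def distance(self, x, cls):
        mu, lam, V = self.params[cls]
        d = x - mu
        proj = V.T @ d                                 # energy along top-k axes
        residual = d @ d - proj @ proj                 # energy outside them
        return (np.sum(proj ** 2 / lam) + residual / self.delta
                + np.sum(np.log(lam)) + (d.size - self.k) * np.log(self.delta))

    def predict(self, x):
        return min(self.params, key=lambda c: self.distance(x, c))
```

The constant `delta` regularizes the poorly estimated minor eigenvalues, which is the "modification" that distinguishes MQDF from a plain quadratic discriminant.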
Document image understanding techniques have been widely used in many application domains; various kinds of documents have been studied, and different methods have been developed for information retrieval. In this paper we present a practical method to extract information items from Chinese business cards. Before information extraction, the business card image is segmented into small text regions, and each text region is recognized. Because the typesetting of business cards is variable, and both English and Chinese characters are used, the segmentation and recognition results contain errors. We focus on building a robust model that can tolerate these errors and extract the syntax pattern of each text line on the card, using both layout information and logical information. With this model, many errors can be identified and adjusted. Finally, the correct property is assigned to each text region on the card, and recognition errors are corrected.
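The idea of assigning a property to each text line by its syntax pattern can be sketched as follows. The patterns below are hypothetical examples for a few common card items; the paper's model also exploits layout position and error tolerance, which are omitted here.

```python
import re

# Hypothetical syntax patterns for common business-card items; a real system
# combines these logical cues with layout information, omitted in this sketch.
PATTERNS = [
    ("email",   re.compile(r"[\w.\-]+@[\w\-]+(\.[\w\-]+)+")),
    ("phone",   re.compile(r"\+?\d[\d\s\-]{6,}\d")),
    ("address", re.compile(r"(路|街|号|区|Road|Street|St\.)")),
]

def label_line(text):
    """Assign a property to one recognized text line by syntax pattern."""
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return "other"
```

Lines that match no pattern fall through to "other", where layout cues (e.g. the largest line near the top is usually the name) would take over.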