Handwriting recognition systems are typically trained using publicly available databases, where data have been
collected under controlled conditions (image resolution, paper background, noise level, etc.). Since this is often
not the case in real-world scenarios, classification performance can be affected when novel data are presented to the
word recognition system. To overcome this problem, we present in this paper a new approach called database
adaptation. It consists of processing one set (training or test) in order to adapt it to the other set (test or training,
respectively). Specifically, two kinds of preprocessing, namely stroke thickness normalization and pixel intensity
normalization, are considered. The advantage of such an approach is that we can re-use the existing recognition
system trained on controlled data. We conduct several experiments with the Rimes 2011 word database and
with a real-world database. We adapt either the test set or the training set. Results show that training set
adaptation achieves better results than test set adaptation, at the cost of a second training stage on the adapted
data. Data set adaptation increases accuracy by 2% to 3% absolute over no adaptation.
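As an illustration of the pixel intensity normalization idea, the following minimal sketch rescales a grayscale image so that its intensity mean and standard deviation match those of a target set. The abstract does not specify the normalization actually used, so the linear mean/variance matching, the function names `intensity_stats` and `adapt_intensity`, and the 0 to 255 intensity range are all assumptions made for this example. Stroke thickness normalization is not covered here.

```python
import numpy as np

def intensity_stats(images):
    """Mean and standard deviation of pixel intensities pooled over a set of images."""
    pixels = np.concatenate(
        [np.asarray(img, dtype=np.float64).ravel() for img in images]
    )
    return pixels.mean(), pixels.std()

def adapt_intensity(image, target_mean, target_std, eps=1e-8):
    """Linearly rescale one grayscale image to the target mean and std.

    Assumption: intensities live in [0, 255]; values outside are clipped.
    """
    img = np.asarray(image, dtype=np.float64)
    standardized = (img - img.mean()) / (img.std() + eps)
    adapted = standardized * target_std + target_mean
    return np.clip(adapted, 0.0, 255.0)
```

To adapt the test set to the training set, the statistics would be computed once on the training images and applied to each test image; swapping the roles gives training set adaptation, which then requires retraining on the adapted data.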
We present in this paper an HMM-based recognizer for the recognition of unconstrained Arabic handwritten words.
The recognizer is a context-dependent HMM which considers variable topology and contextual information for a better modeling of writing units.
We propose an algorithm to adapt the topology of each HMM to the character to be modeled.
To model the contextual units, a state-tying process based on decision tree clustering is introduced, which significantly reduces the number of parameters.
Decision trees are built according to a set of expert-based questions on how characters are written.
Questions are divided into global questions yielding larger clusters and precise questions yielding smaller ones.
We apply this modeling to the recognition of Arabic handwritten words.
Experiments conducted on the OpenHaRT2010 database show that variable-length topology and contextual information significantly improve the recognition rate.
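One step of decision tree state tying can be sketched as follows: given sufficient statistics for a pool of contextual units, pick the question whose yes/no split gives the largest gain in log-likelihood. This is the generic likelihood-gain criterion commonly used for tree-based tying, not necessarily the criterion used in the paper. The one-dimensional Gaussian statistics, the context labels, and the question names are hypothetical and chosen only for illustration.

```python
import math

def pooled_loglik(units):
    """Log-likelihood of the data behind `units` under one shared 1-D Gaussian.

    Each unit is a tuple (count, mean, variance).
    """
    total = sum(n for n, _, _ in units)
    if total == 0:
        return 0.0
    mean = sum(n * m for n, m, _ in units) / total
    second_moment = sum(n * (v + m * m) for n, m, v in units) / total
    var = max(second_moment - mean * mean, 1e-6)  # floor avoids log(0)
    return -0.5 * total * (math.log(2 * math.pi * var) + 1.0)

def best_question(stats, questions):
    """Return (question_name, gain) for the split with the largest likelihood gain.

    stats: {context_label: (count, mean, variance)}
    questions: {question_name: set of context labels that answer 'yes'}
    """
    parent = pooled_loglik(list(stats.values()))
    best = (None, 0.0)
    for name, yes_set in questions.items():
        yes = [s for c, s in stats.items() if c in yes_set]
        no = [s for c, s in stats.items() if c not in yes_set]
        if not yes or not no:
            continue  # a question that does not split the pool is useless
        gain = pooled_loglik(yes) + pooled_loglik(no) - parent
        if gain > best[1]:
            best = (name, gain)
    return best

# Hypothetical contexts and questions, for illustration only.
stats = {"a": (10, 0.0, 1.0), "b": (10, 0.1, 1.0),
         "c": (10, 5.0, 1.0), "d": (10, 5.2, 1.0)}
questions = {"left_has_ascender": {"a", "b"}, "left_is_connected": {"a", "c"}}
name, gain = best_question(stats, questions)
```

Applying this step recursively to each resulting subset, and stopping when the gain falls below a threshold, yields clusters of tied states. In such a scheme, broad questions would tend to be chosen near the root and precise ones near the leaves.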
In this paper we present a system for the off-line recognition of cursive Arabic handwritten words. This system
is an enhanced version of our reference system presented in [El-Hajj et al., 05], which is based on Hidden Markov
Models (HMMs) and uses a sliding window approach. The enhanced version proposed here uses contextual
character models. This approach is motivated by the fact that the set of Arabic characters includes a lot of ascending
and descending strokes which overlap with one or two neighboring characters. Additional character models are
constructed according to characters in their left or right neighborhood. Our experiments on images of the benchmark
IFN/ENIT database of handwritten village and town names show that using contextual character models improves
recognition. For a lexicon of 306 name classes, accuracy is increased by 0.6% in absolute value, which corresponds
to a 7.8% reduction in error rate.
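To make the notion of contextual character models concrete, here is a minimal sketch that expands a character sequence into context-dependent unit labels. The `left-char+right` naming is borrowed from triphone conventions in speech recognition and is an assumption, as are the function name `contextual_labels` and the `#` boundary symbol. The abstract builds additional models from the left or right neighborhood, whereas this sketch simply labels every character with both neighbors.

```python
def contextual_labels(chars, boundary="#"):
    """Expand a character sequence into 'left-char+right' context labels.

    `boundary` marks the word start and end (hypothetical convention).
    """
    padded = [boundary] + list(chars) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]

labels = contextual_labels(["b", "a", "t"])
```

Each distinct label would then correspond to its own character model, which is why the number of models grows quickly with context and why only selected contexts are usually modeled.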
A GMM-based audio-visual speaker verification system is described, and an Active Appearance Model with a linear speaker transformation system is used to evaluate the robustness of the verification. An Active Appearance Model (AAM) is used to automatically locate and track a speaker's face in a video recording. A Gaussian Mixture Model (GMM) based classifier (BECARS) is used for face verification. GMM training and testing are performed on DCT-based features extracted from the detected faces. On the audio side, speech features are extracted and used for speaker verification with the GMM-based classifier. The fusion of the audio and video modalities for audio-visual speaker verification is compared with the face-only and speaker-only verification systems.
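A simple way to picture the fusion of the two modalities is score-level fusion, sketched below as a weighted sum of per-trial audio and video scores followed by a threshold decision. The abstract does not state which fusion rule is used, so the weighted sum, the equal default weight, the zero threshold, and the function names `fuse_scores` and `verify` are assumptions for this example; the scores are taken to be log-likelihood ratios already produced by the two classifiers.

```python
def fuse_scores(audio_scores, video_scores, audio_weight=0.5):
    """Weighted sum of per-trial audio and video verification scores.

    Assumption: both lists hold log-likelihood ratios on comparable scales.
    """
    if len(audio_scores) != len(video_scores):
        raise ValueError("audio and video score lists must have equal length")
    return [
        audio_weight * a + (1.0 - audio_weight) * v
        for a, v in zip(audio_scores, video_scores)
    ]

def verify(fused_scores, threshold=0.0):
    """Accept a claimed identity when the fused score reaches the threshold."""
    return [score >= threshold for score in fused_scores]

fused = fuse_scores([2.0, -1.0], [0.0, -3.0])
decisions = verify(fused)
```

In practice the weight and threshold would be tuned on a development set, and the scores would usually be normalized per modality before fusion.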
To improve the robustness of the multimodal biometric identity verification system, an audio-visual imposture system is envisioned. It consists of an automatic voice transformation technique that an impostor may use to assume the identity of an authorized client. Features of the transformed voice are then combined with the corresponding appearance features and fed into the GMM-based system BECARS for training. An attempt is made to increase the acceptance rate of the impostor and to analyze the robustness of the verification system.
Experiments are being conducted on the BANCA database, with a prospect of experimenting on the PDAtabase, newly developed within the scope of the SecurePhone project.
Conference Committee Involvement (1)
Mobile Multimedia/Image Processing for Military and Security Applications
20 April 2006 | Orlando (Kissimmee), Florida, United States