Many preprocessing techniques have been proposed for isolated word recognition. Recently, however, recognition systems have had to deal with text blocks and the text lines that compose them. In this paper, we propose a new preprocessing approach to efficiently correct baseline skew and fluctuations. Our approach is based on a sliding window within which the vertical position of the baseline is estimated. Segmentation of text lines into subparts is thus avoided. Experiments conducted on a large publicly available database (Rimes), with a BLSTM (bidirectional long short-term memory) recurrent neural network recognition system, show that our baseline correction approach substantially improves recognition performance.
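The abstract above does not give the details of the baseline estimator, so the following is only a minimal sketch of the general idea: slide a window along the line image, estimate the local vertical baseline position inside each window (here, crudely, as the lowest ink row), and shift each pixel column so that the baseline becomes horizontal. The function name, window/step parameters, and the lowest-ink-row estimate are all illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def correct_baseline(img, win=64, step=16):
    # Sketch of sliding-window baseline correction (assumed variant,
    # not the paper's exact estimator).
    # img: binary line image, ink = 1, background = 0.
    h, w = img.shape
    baseline = np.full(w, float(h - 1))
    for x0 in range(0, w, step):
        window = img[:, x0:x0 + win]
        rows, _ = np.nonzero(window)
        if rows.size:
            # Crude local estimate: lowest ink row seen in this window.
            baseline[x0:x0 + step] = rows.max()
    # Align every column's local baseline to the median baseline height.
    target = float(np.median(baseline))
    out = np.zeros_like(img)
    for x in range(w):
        shift = int(round(target - baseline[x]))
        out[:, x] = np.roll(img[:, x], shift)
        # Zero out pixels that wrapped around during the roll.
        if shift > 0:
            out[:shift, x] = 0
        elif shift < 0:
            out[shift:, x] = 0
    return out
```

Because the correction is applied per column from a locally estimated baseline, no explicit segmentation of the text line into subparts is needed, which is the point the abstract emphasizes.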
This paper presents a system for the recognition of unconstrained handwritten mails. The main part of this
system is an HMM recognizer which uses trigraphs to model contextual information. This recognition system
does not require any segmentation into words or characters and works directly at the line level. To take linguistic information into account and enhance performance, a language model is introduced. This language model is based
on bigrams and built from training document transcriptions only. Different experiments with various vocabulary
sizes and language models have been conducted. Word Error Rate and perplexity values are compared to show the benefit of language models specifically fitted to the handwritten mail recognition task.
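The abstract states that the bigram language model is built from the training document transcriptions only and evaluated via perplexity. As an illustration, here is a minimal sketch of such a model with add-alpha smoothing; the function names, the smoothing scheme, and the sentence-boundary tokens are assumptions for the example, not details taken from the paper.

```python
import math
from collections import Counter

def train_bigram_lm(sentences, alpha=1.0):
    # Bigram model with add-alpha smoothing, built from training
    # transcriptions only (smoothing choice is an assumption).
    unigrams, bigrams = Counter(), Counter()
    vocab = set()
    for sent in sentences:
        toks = ["<s>"] + sent.split() + ["</s>"]
        vocab.update(toks)
        for a, b in zip(toks, toks[1:]):
            unigrams[a] += 1
            bigrams[(a, b)] += 1
    V = len(vocab)
    def prob(a, b):
        # Smoothed conditional probability P(b | a).
        return (bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * V)
    return prob

def perplexity(prob, sentences):
    # Per-word perplexity: exp of the average negative log-probability.
    logp, n = 0.0, 0
    for sent in sentences:
        toks = ["<s>"] + sent.split() + ["</s>"]
        for a, b in zip(toks, toks[1:]):
            logp += math.log(prob(a, b))
            n += 1
    return math.exp(-logp / n)
```

A lower perplexity on held-out mail transcriptions indicates a language model better fitted to the task, which is the comparison the experiments rely on alongside Word Error Rate.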