A large collection of reproductions of calligraphy on paper was scanned into images to enable web access for both the academic community and the public. Calligraphic paper digitization technology is mature, but technology for segmentation, character coding, style classification, and identification of calligraphy is lacking. Therefore, computational tools for classification and quantification of calligraphic style are proposed and demonstrated on a statistically characterized corpus. A subset of 259 historical page images is segmented into 8719 individual character images. Calligraphic style is revealed and quantified by visual attributes (i.e., appearance features) of character images sampled from historical works. A style space is defined with the features of five main classical styles as basis vectors. Cross-validated error rates of 10% to 40% are reported on conventional and conservative sampling into training/test sets and on same-work voting with a range of voter participation. Beyond its immediate applicability to education and scholarship, this research lays the foundation for style-based calligraphic forgery detection and for discovery of latent calligraphic groups induced by mentor-student relationships.
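As a rough illustration of a style space with per-style basis vectors and same-work voting, the following minimal sketch may help. It rests on several assumptions that are not taken from the work itself: the five style labels, the use of per-style mean feature vectors as the basis, cosine similarity as the coordinate along each basis vector, and a plain majority vote over the characters of one work. The actual features and classifier may differ.

```python
from collections import Counter
from math import sqrt

# Assumed labels for the five classical styles; the abstract does not name them.
STYLES = ["seal", "clerical", "regular", "running", "cursive"]


def centroid(vectors):
    """Mean feature vector of one style; used here as that style's basis vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def style_coordinates(features, basis):
    """Coordinates of one character image in the style space (one value per style)."""
    return {style: cosine(features, vector) for style, vector in basis.items()}


def classify_character(features, basis):
    """Assign a single character image to the style with the largest coordinate."""
    coords = style_coordinates(features, basis)
    return max(coords, key=coords.get)


def classify_work(character_features, basis):
    """Same-work voting: each character of a work casts one vote for a style."""
    votes = Counter(classify_character(f, basis) for f in character_features)
    return votes.most_common(1)[0][0]
```

Here `basis` is a dictionary mapping each style label to `centroid(training_vectors_for_that_style)`. Passing only a subset of a work's characters to `classify_work` is one way to model a range of voter participation.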
For this research, calligraphic style is defined as the visual attributes of images of calligraphic characters sampled randomly
from a "work" created by a single artist. It is independent of page layout or textual content. An experimental design is
developed to investigate to what extent the source of a single pair, or of a few pairs, of character images can be assigned
either to the same work or to two different works. The experiments are conducted on the 13,571 segmented and labeled
600-dpi character images of the CADAL database. The classifier is not trained on the works tested, only on other works.
Even when only a few samples of same-class pairs are available, the difference-vector of a few simple features extracted
from each image of a pair yields over 80% classification accuracy for a same-work vs. different-work dichotomy. When
many pairs of different classes are available for each pair, the accuracy, using the same features, is almost the same.
These style-verification experiments are part of our larger goal of style identification and forgery detection.
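The same-work vs. different-work test on difference-vectors can be sketched as follows. This is only an illustration under stated assumptions: the feature vectors are placeholders, and a simple threshold on the L1 norm of the difference-vector (set midway between the mean norms of same-work and different-work training pairs) stands in for whatever classifier was actually used. All function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def difference_vector(f1, f2):
    """Element-wise absolute difference of the feature vectors of an image pair."""
    return [abs(a - b) for a, b in zip(f1, f2)]


def make_pairs(samples):
    """Build same-class pairs from (work_id, char_class, features) samples.

    Returns a list of (difference_vector, same_work) tuples, where same_work
    is True when both images come from the same work.
    """
    pairs = []
    for (w1, c1, f1), (w2, c2, f2) in combinations(samples, 2):
        if c1 == c2:  # only compare images of the same character class
            pairs.append((difference_vector(f1, f2), w1 == w2))
    return pairs


def fit_threshold(pairs):
    """Assumed stand-in classifier: midpoint between the mean L1 norms.

    Requires at least one same-work and one different-work pair, and assumes
    same-work pairs tend to have smaller differences.
    """
    same = [sum(d) for d, is_same in pairs if is_same]
    different = [sum(d) for d, is_same in pairs if not is_same]
    return (mean(same) + mean(different)) / 2


def is_same_work(f1, f2, threshold):
    """Decide same-work (True) or different-work (False) for one image pair."""
    return sum(difference_vector(f1, f2)) < threshold
```

To mirror the protocol described above, the threshold would be fitted on pairs drawn from one set of works and then applied only to pairs from works that were held out of training.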