The advent of the internet has opened a host of new and exciting questions in the science and mathematics of information organization and data mining. In particular, a highly ambitious promise of the internet is to bring the bulk of human knowledge to everyone with access to a computer network, providing a democratic medium for sharing and communicating knowledge regardless of the language of the communication. The ability to share and communicate knowledge through the transfer of digital files is the first crucial achievement in this direction. Nonetheless, available solutions to numerous ancillary problems remain far from satisfactory. Among these outstanding problems are the fundamental questions responsible for the emergence and rapid growth of the new field of Knowledge Engineering, namely the classification of forms of data, their effective organization, the extraction of knowledge from massive distributed data sets, and the design of fast, effective search engines. The precision of machine learning algorithms in the classification and recognition of image data (e.g. images scanned from books and other printed documents) is still far from human performance and speed in similar tasks. Discriminating the many forms of ASCII data from each other is less difficult, in view of the emerging universal standards for file formats. Nonetheless, most past and relatively recent human knowledge is yet to be transformed and saved in such machine-readable formats. In particular, an outstanding problem in knowledge engineering is the organization and management, with precision comparable to human performance, of knowledge in the form of document images whose content is broadly text, image, or a blend of both. It has been shown that the effectiveness of OCR is intertwined with the success of language and font recognition.