This paper addresses the development of stochastic models that are consistent with the results of character image recognition in a video stream. Assumptions about the structure and properties of the constructed models are formulated. The model components are described by the Dirichlet distribution and its generalizations. The parameters of these distributions are determined using statistical estimation methods. The Akaike information criterion is used to rank the models. The agreement of the proposed theoretical distributions with the sample data is verified.
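As an illustration of ranking Dirichlet-based models with the Akaike information criterion, AIC = 2k - 2 ln L, the following is a minimal sketch and not the procedure used in the paper. The function names, the sample vectors, and the candidate parameter values are hypothetical; in practice the parameters would be maximum likelihood estimates rather than the hand-picked values used here.

```python
import math

def dirichlet_logpdf(x, alpha):
    """Log-density of Dirichlet(alpha) at a probability vector x (all x_i > 0)."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))

def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a preferred model."""
    return 2.0 * n_params - 2.0 * log_likelihood

def rank_by_aic(samples, candidates):
    """Rank candidate models by AIC, best (lowest) first.

    candidates maps a model name to (alpha, number_of_free_parameters).
    """
    scores = {}
    for name, (alpha, n_params) in candidates.items():
        log_likelihood = sum(dirichlet_logpdf(x, alpha) for x in samples)
        scores[name] = aic(log_likelihood, n_params)
    return sorted(scores.items(), key=lambda item: item[1])

# Hypothetical recognition results: each vector holds class membership
# estimates for one frame and sums to 1.
samples = [(0.8, 0.1, 0.1), (0.7, 0.2, 0.1), (0.85, 0.05, 0.1)]

# Hypothetical candidates (assumed parameters, not fitted):
# a symmetric Dirichlet with one free parameter and a full Dirichlet with three.
candidates = {
    "symmetric": ((1.0, 1.0, 1.0), 1),
    "full": ((8.0, 1.0, 1.0), 3),
}

ranking = rank_by_aic(samples, candidates)
```

The penalty term 2k is what allows a model with more free parameters to be compared fairly against a simpler one: the richer model is preferred only if its gain in log-likelihood outweighs the added parameters.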
This paper states the problem of comparing digitized pages of official documents. The problem arises when two copies of a document, signed at different times by two parties, must be compared in order to find modifications that one of the parties may have introduced. It is of practical significance in the banking sector, where contracts are concluded on paper. A recognition-based comparison method is proposed, in which two bags of words, obtained by recognizing the master page and the test page, are compared. The described experiments were conducted using the Tesseract OCR engine and a Siamese neural network. The advantages of the proposed method are the stable operation of the comparison algorithm and its high precision; one of its disadvantages is the dependence on the chosen OCR engine.
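The idea of comparing two bags of words can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the OCR output is assumed to be already available as plain strings (no OCR engine is called), tokenization is a simple lowercase word split, and the function names are hypothetical.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Build a bag of words (word -> count) from recognized page text."""
    return Counter(re.findall(r"\w+", text.lower()))

def compare_bags(master_text, test_text):
    """Return words missing from the test page and words added to it.

    Counter subtraction keeps only positive counts, so each result lists
    the words that occur more often on one page than on the other.
    """
    master = bag_of_words(master_text)
    test = bag_of_words(test_text)
    missing = master - test  # present in master, absent (or rarer) in test
    added = test - master    # present in test, absent (or rarer) in master
    return missing, added

# Hypothetical recognition results for a master page and a test page.
master_page = "The loan rate is 5 percent per year"
test_page = "The loan rate is 9 percent per year"

missing, added = compare_bags(master_page, test_page)
pages_match = not missing and not added
```

Because a bag of words ignores word order and position, this kind of comparison tolerates layout differences between scans, but any recognition error made by the OCR engine appears as a spurious difference, which is consistent with the stated dependence on the chosen OCR.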