A vascular network is often represented by a Reeb graph, a topological skeleton, and graph theory has been widely applied to analyze the properties of vascular networks. A Reeb graph model of a vascular network is obtained by taking the branch points of the network as the vertices of the graph and the vessels between branch points as the edges. Vascular networks develop by way of angiogenesis, a growth process that involves two biological mechanisms: vessel sprouting (budding) and splitting (intussusception). According to a graph-theoretic model of these two growth mechanisms, all nodes in the Reeb graph must have degree three (cubic) except for two special nodes: the afferent (A) and efferent (E) nodes. We call a vascular network cubic if all of its internal nodes have degree three. We consider six normal adult rat renal glomerular networks and use their Reeb graphs, which have already been constructed and published in the literature. We observe that five of them contain internal vertices of degree higher than three. Branch points in vascular networks may appear to have a higher degree if the imaging resolution cannot differentiate between blood vessels that are in very close proximity. Here, we propose a random graph theory model that edits a non-cubic vascular network into a cubic graph. We observe that the cubic graph edited from a non-cubic vascular network has a size and order similar to those of the one cubic vascular network.
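The cubic condition and one possible editing step can be illustrated with a minimal sketch. The abstract does not describe the random editing model in detail, so the splitting rule below is an assumption made for illustration only: a non-special node of degree greater than three is split by moving two randomly chosen incident edges to a new node and joining the two nodes with a new edge. The function names, the edge-list representation, and the toy network are hypothetical, and the sketch assumes there are no self-loops.

```python
import random
from collections import defaultdict


def degrees(edges):
    """Return the degree of every node in an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def is_cubic(edges, special=("A", "E")):
    """True if every node other than the special A and E nodes has degree 3."""
    return all(d == 3 for n, d in degrees(edges).items() if n not in special)


def split_high_degree(edges, special=("A", "E"), seed=0):
    """Hypothetical editing rule (an assumption, not the published model):
    repeatedly split a node of degree > 3 until none remains."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    fresh = 0
    while True:
        deg = degrees(edges)
        high = [n for n, d in deg.items() if n not in special and d > 3]
        if not high:
            return edges
        node = high[0]
        incident = [i for i, (u, v) in enumerate(edges) if node in (u, v)]
        new_node = f"{node}_split{fresh}"
        fresh += 1
        # Move two randomly chosen incident edges to the new node.
        for i in rng.sample(incident, 2):
            u, v = edges[i]
            edges[i] = (new_node, v) if u == node else (u, new_node)
        # The connecting edge gives the new node degree 3 and lowers
        # the degree of the original node by one.
        edges.append((node, new_node))


# Toy network: node "x" has degree 4, all other internal nodes have degree 3.
toy = [("A", "x"), ("x", "p"), ("x", "q"), ("x", "r"),
       ("p", "q"), ("p", "E"), ("q", "r"), ("r", "E")]
edited = split_high_degree(toy)
```

Under this rule each split adds exactly one vertex and one edge, so the order and size of the edited graph grow by the number of splits performed.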
Handwriting originates from a particular copybook style such as Palmer or Zaner-Bloser that one learns in childhood. Since questioned document examination plays an important investigative and forensic role in many types of crime, it is important to develop a system that helps objectively identify a questioned document’s handwriting style. Here, we propose a computer vision system that can assist a document examiner in the identification of a writer’s handwriting style and therefore the origin or nationality of an unknown writer of a questioned document. We collected 33 Roman alphabet copybook styles from 18 countries. Each character in a questioned document is segmented and matched against all of the 33 handwriting copybook styles. The more characters present in the questioned document, the higher the accuracy observed.
We propose a model for describing and analyzing digital images of paintings that serves as the basis for an interactive image retrieval system. The model defines two types of features: palette and canvas features. Palette features relate to the set of colors in a painting, while canvas features relate to the frequency and spatial distribution of those colors. The image retrieval system differs from previous retrieval systems for paintings in that it does not rely on image or color segmentation. The features specified in the model can be extracted from any image and stored in a database with other control information. Users select a sample image, and the system returns the ten closest images as determined by the Euclidean distance between feature sets. The system was tested with an initial dataset of 100 images (training set) and 90 sample images (testing set). In 81 percent of test cases, the system retrieved at least one painting by the same artist, suggesting that the model is sufficient for the interactive classification of paintings by artist. Future studies aim to expand and refine the model for the classification of artwork according to artist and period style.
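The retrieval step described above, returning the ten images closest to a sample by Euclidean distance, can be sketched as follows. This is only an illustration: the feature vectors, the image names, and the function `closest_images` are hypothetical, and the actual palette and canvas features are not specified in the abstract.

```python
import math


def closest_images(query, database, k=10):
    """Return the names of the k images whose feature vectors are nearest
    to the query vector under Euclidean distance.

    database maps an image name to a feature vector of the same length
    as the query (hypothetical layout, assumed for illustration).
    """
    ranked = sorted(database.items(),
                    key=lambda item: math.dist(query, item[1]))
    return [name for name, _ in ranked[:k]]


# Toy example with made-up three-dimensional feature vectors.
features = {
    "painting_a": (0.10, 0.20, 0.30),
    "painting_b": (0.90, 0.80, 0.70),
    "painting_c": (0.12, 0.22, 0.28),
}
nearest = closest_images((0.10, 0.20, 0.30), features, k=2)
```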
In content-based image indexing and retrieval (IIR), hue component histograms of images are widely used for indexing the images in an image database. The task is to retrieve all color images whose hue distributions are within some threshold distance of the hue distribution of the query image. Edit distance has been successfully used as a similarity measure. Our earlier O(b^2) algorithm for computing the edit distance between two angular histograms, where b is the number of bins in the hue histogram, tends to be too slow for users to wait for the outputs when applied to every image in the database. For this reason, we design two filtration functions that quickly eliminate most color images from consideration as possible outputs, so that exact edit distances are computed only for the remaining images. We are still guaranteed to find all similar hue distributions, and the filtration technique gives significant speed-ups.
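The filter-then-verify idea can be sketched under stated assumptions. The abstract does not give the two filtration functions or the details of the edit distance, so both are replaced here by stand-ins: the exact distance is a circular transport cost between equal-mass histograms (unit cost per bin step, with the first and last bins adjacent), and the filter is half the L1 difference, which never exceeds that cost because every surplus unit must move at least one bin. All function names are hypothetical.

```python
def circular_distance(a, b):
    """Stand-in exact distance (an assumption, not the authors' algorithm):
    minimum cost of moving mass around a ring of bins, for histograms
    with equal totals."""
    assert len(a) == len(b) and sum(a) == sum(b)
    prefix, running = [], 0
    for x, y in zip(a, b):
        running += x - y
        prefix.append(running)
    # Subtracting the median of the prefix sums picks the best ring offset.
    median = sorted(prefix)[len(prefix) // 2]
    return sum(abs(p - median) for p in prefix)


def lower_bound(a, b):
    """Cheap filter: half the L1 difference never exceeds the exact cost."""
    return sum(abs(x - y) for x, y in zip(a, b)) / 2


def range_query(query, database, threshold):
    """Return all histograms within threshold of the query, computing the
    exact distance only for candidates that pass the filter."""
    hits, exact_calls = [], 0
    for name, hist in database.items():
        if lower_bound(query, hist) > threshold:
            continue  # safely discarded: exact distance must be larger
        exact_calls += 1
        d = circular_distance(query, hist)
        if d <= threshold:
            hits.append((name, d))
    return hits, exact_calls


# Toy database of four-bin hue histograms with equal totals.
db = {"same": [2, 0, 0, 0], "near": [1, 1, 0, 0],
      "wrap": [0, 0, 0, 2], "far": [0, 0, 2, 0]}
hits, exact_calls = range_query([2, 0, 0, 0], db, threshold=1)
```

Because the filter is a true lower bound for this stand-in distance, no histogram within the threshold is ever discarded; only the number of exact computations changes.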
A study was undertaken to determine the power of handwriting to distinguish between individuals. Handwriting samples of one thousand five hundred individuals, representative of the US population with respect to gender, age, ethnic groups, etc., were obtained. Differences in handwriting were analyzed by using computer algorithms to extract features from scanned images of handwriting. Attributes characteristic of the handwriting were obtained, e.g., line separation, slant, character shapes, etc. These attributes, which are a subset of the attributes used by expert document examiners, were used to quantitatively establish individuality by using machine learning approaches. Using global attributes of handwriting and very few characters in the writing, the ability to determine the writer with a high degree of confidence was established. The work is a step towards providing scientific support for admitting handwriting evidence in court. The mathematical approach and the resulting software also have the promise of aiding the expert document examiner.
In our previous work on writer identification, a database of handwriting samples (written in English) of over one thousand individuals was created, and two types of computer-generated features of sample handwriting were extracted: macro and micro features. Using these features, writer identification experiments were performed: given that a document is written by one of n writers, the task is to determine the writer. With n = 2, we correctly determined the writer with 99% accuracy using only 10-character micro features in the writing; with n = 1000, the accuracy dropped to 80%. To obtain higher performance, we propose a combination of macro and micro level features. First, macro level features are used in a filtering model: the computer program is presented with multiple handwriting samples from a large number (1000) of writers, and the question posed is: Which of the samples are consistent with a test sample? As a result of using the filtering model, a reduced set of documents (100) is obtained and presented to the final identification model, which uses the micro level features. With the proposed filtering-combination technique, we improved the accuracy of our writer identification system from 80% to 87.5% when n = 1000.
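The two-stage scheme, macro features to shortlist candidates and micro features to make the final decision, can be sketched as follows. This is an illustration under assumptions: the abstract does not state how consistency with the test sample is measured, so Euclidean distance is used as a stand-in for both stages, and the data layout and the function `filter_then_identify` are hypothetical.

```python
import math


def filter_then_identify(test, writers, keep=100):
    """Two-stage writer identification sketch.

    writers maps a writer name to (macro_vector, micro_vector);
    test is a (macro_vector, micro_vector) pair for the questioned sample.
    Stage 1 keeps the `keep` writers closest in macro features.
    Stage 2 picks the shortlisted writer closest in micro features.
    """
    test_macro, test_micro = test
    by_macro = sorted(writers,
                      key=lambda w: math.dist(test_macro, writers[w][0]))
    shortlist = by_macro[:keep]
    return min(shortlist,
               key=lambda w: math.dist(test_micro, writers[w][1]))


# Toy data: w3 matches best on micro features but is far on macro features.
writers = {
    "w1": ((0.0, 0.0), (5.0,)),
    "w2": ((0.1, 0.0), (1.0,)),
    "w3": ((9.0, 9.0), (0.0,)),
}
sample = ((0.0, 0.0), (0.2,))
with_filter = filter_then_identify(sample, writers, keep=2)
without_filter = filter_then_identify(sample, writers, keep=3)
```

In the toy data the macro filter removes a writer whose micro features happen to be closest, which shows how the shortlist can change the final decision.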
This paper describes an off-line handwritten document data collection effort conducted at CEDAR and discusses systems that manage the document image data. We introduce the CEDAR letter, discuss its completeness and then describe the specification of the CEDAR letter image database consisting of writer data and features obtained from a handwriting sample, statistically representative of the U.S. population. We divide the document image and information management system into four systems: (1) acquisition, (2) archiving, (3) indexing and retrieval and (4) display systems. This paper discusses the issues raised by constructing the CEDAR letter database and by its potential usefulness to document image analysis, recognition, and identification fields.
The problem of writer identification based on similarity is formalized by defining a distance between character or word level features and finding either the most similar writings or all writings that are within a certain threshold distance. Among many features, we consider stroke direction and pressure sequence strings of a character as character level image signatures for writer identification. Because the conventional definition of edit distance is not directly applicable to these strings, we present newly defined and modified edit distances that depend on the measurement type of each string. Finally, we present a prototype stroke direction and pressure sequence string extractor used for writer identification. The importance of this study lies in its attempt to define a distance between two characters based on these two types of strings.
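One way an edit distance can be adapted to a stroke direction string is to let the substitution cost depend on the angular difference between two direction codes. The sketch below is an assumption for illustration, not the definition given in the paper: directions are taken to be integer codes 0 to 7 on a circle, a substitution costs the number of steps between the two codes around the circle, and an insertion or deletion costs a fixed hypothetical value of 2.

```python
def direction_edit_distance(s, t, levels=8, indel=2):
    """Edit distance between two sequences of direction codes.

    Substitution cost is the circular difference between the codes
    (for example, 0 and 7 differ by 1 when levels is 8); insertion and
    deletion each cost `indel`. Both cost choices are assumptions.
    """
    def substitution(a, b):
        d = abs(a - b) % levels
        return min(d, levels - d)

    # Standard dynamic program, keeping only the previous row.
    prev = [j * indel for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        cur = [i * indel]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + indel,                   # delete a
                           cur[j - 1] + indel,                # insert b
                           prev[j - 1] + substitution(a, b)))  # substitute
        prev = cur
    return prev[-1]
```

With these costs, codes that point in nearly the same direction are cheap to exchange, whereas the conventional edit distance would charge the same for any mismatch.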