Attention Deficit Hyperactivity Disorder (ADHD) has received considerable attention recently, largely because it is
one of the most common brain disorders among children and its cause remains poorly understood. In this study, we
propose a novel approach for the automatic classification of ADHD subjects and control subjects using functional
Magnetic Resonance Imaging (fMRI) data of resting-state brains.
For this purpose, we compute the correlation between every possible voxel pair within a subject over the
time frame of the experimental protocol. A network of voxels is constructed by representing a high correlation
value between any two voxels as an edge. A Bag-of-Words (BoW) approach is used to represent each subject
as a histogram of network features, such as the degree of each voxel. The classification is done using
a Support Vector Machine (SVM). We also investigate the use of raw intensity values in the time series for
each voxel. Here, every subject is represented as a combined histogram of network and raw intensity features.
Experimental results verified that the classification accuracy improves when the combined histogram is used.
We tested our approach on a highly challenging dataset released by NITRC for ADHD-200 competition
and obtained promising results. The dataset is not only large but also includes subjects from different
demographics and age groups. To the best of our knowledge, this is the first paper to propose a BoW approach for
any functional brain disorder classification, and we believe that this approach will be useful in the analysis of
many brain-related conditions.
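The degree-histogram feature described above can be sketched as follows. This is an illustrative reconstruction rather than the authors' implementation; the correlation threshold and bin count are assumed values.

```python
import numpy as np

def degree_histogram(ts, corr_thresh=0.8, n_bins=10):
    """Build a Bag-of-Words style feature vector for one subject.

    ts : (n_voxels, n_timepoints) array of voxel time series.
    Returns a normalized histogram of voxel degrees in the
    thresholded correlation network.
    """
    corr = np.corrcoef(ts)                   # voxel-by-voxel correlation matrix
    np.fill_diagonal(corr, 0.0)              # ignore self-correlation
    adjacency = np.abs(corr) > corr_thresh   # edge when correlation is high
    degrees = adjacency.sum(axis=1)          # degree of each voxel
    hist, _ = np.histogram(degrees, bins=n_bins, range=(0, ts.shape[0]))
    return hist / hist.sum()                 # normalize to a histogram "document"

# toy example: 50 voxels, 100 time points of synthetic data
rng = np.random.default_rng(0)
feat = degree_histogram(rng.standard_normal((50, 100)))
```

In the BoW analogy, each subject is a "document" and the degree bins are the "words"; the combined feature vector described above would concatenate this histogram with a histogram of raw intensity values.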
The recent deployment of functional networks to analyze fMRI images has been very promising. In this method,
the spatio-temporal fMRI data is converted to a graph-based representation, where the nodes are voxels and edges
indicate the relationship between the nodes, such as the strength of correlation or causality. Graph-theoretic
measures can then be used to compare different fMRI scans.
However, there is a significant computational bottleneck, as the computation of functional networks with
directed links takes several hours on conventional machines with single CPUs. The study in this paper shows
that a GPU can be advantageously used to accelerate the computation, such that the network computation takes
a few minutes. Though GPUs have been used for the purposes of displaying fMRI images, their use in computing
functional networks is novel.
We describe specific techniques such as load balancing, and the use of a large number of threads to achieve the
desired speedup. Our experience in utilizing the GPU for functional network computations should prove useful
to the scientific community investigating fMRI, as GPUs are a low-cost platform for addressing the computational demands of functional network analysis.
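The computation being accelerated is, at its core, an all-pairs correlation over voxel time series. A minimal sketch of why it maps well to GPUs: after standardizing each time series, the whole correlation matrix reduces to one dense matrix product, exactly the kind of kernel GPU BLAS libraries accelerate. The array shapes below are toy values.

```python
import numpy as np

def all_pairs_correlation(ts):
    """All-pairs Pearson correlation of voxel time series.

    ts : (n_voxels, n_timepoints). Standardizing each row reduces the
    n^2 correlation computation to a single matrix product -- the same
    dense linear algebra a GPU BLAS kernel accelerates.
    """
    z = ts - ts.mean(axis=1, keepdims=True)
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    return z @ z.T   # entry (i, j) is corr(voxel_i, voxel_j)

rng = np.random.default_rng(1)
ts = rng.standard_normal((20, 64))
C = all_pairs_correlation(ts)
```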
Functional neuroimaging research is moving from the study of "activations" to the study of "interactions" among
brain regions. Granger causality analysis provides a powerful technique to model spatio-temporal interactions
among brain regions. We apply this technique to full-brain fMRI data without aggregating any voxel data into
regions of interest (ROIs). We circumvent the problem of dimensionality using sparse regression from machine
learning. On a simple finger-tapping experiment we found that (1) a small number of voxels in the brain have
very high prediction power, explaining the future time course of other voxels in the brain; (2) these voxels occur
in small sized clusters (of size 1-4 voxels) distributed throughout the brain; (3) albeit small, these clusters overlap
with most of the clusters identified with the non-temporal General Linear Model (GLM); and (4) the method
identifies clusters which, while not determined by the task and not detectable by GLM, still influence brain activity.
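The sparse-regression step can be illustrated with a minimal lasso solver. The ISTA solver and the toy data below are illustrative assumptions, not the paper's actual model: a voxel's future value is regressed on the past values of all voxels, and the L1 penalty keeps only the few voxels with real predictive power.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam=0.1, n_iter=500):
    """Sparse regression via ISTA: argmin_w 0.5*||y - Xw||^2 + lam*||w||_1."""
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w = soft_threshold(w + X.T @ (y - X @ w) / L, lam / L)
    return w

# toy Granger setup: predict voxel 0 at time t from all voxels at time t-1;
# here voxel 0 truly depends only on voxel 3's past
rng = np.random.default_rng(2)
past = rng.standard_normal((200, 10))
future = 0.9 * past[:, 3] + 0.05 * rng.standard_normal(200)
w = lasso_ista(past, future, lam=1.0)
```

The L1 penalty zeroes out the coefficients of non-predictive voxels, which is how the method circumvents the dimensionality problem of full-brain voxelwise modeling.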
Statistical image reconstruction algorithms potentially offer many advantages to x-ray computed tomography (CT), e.g.
lower radiation dose. However, their adoption in practical CT scanners requires extra computation power, which is traditionally
provided by incorporating additional computing hardware (e.g. CPU clusters, GPUs, FPGAs) into a scanner. An
alternative solution is to access the required computation power over the internet from a cloud computing service, which
is orders-of-magnitude more cost-effective. This is because users only pay a small pay-as-you-go fee for the computation
resources used (i.e. CPU time, storage etc.), and completely avoid purchase, maintenance and upgrade costs. In this
paper, we investigate the benefits and shortcomings of using cloud computing for statistical image reconstruction. We
parallelized the most time-consuming parts of our application, the forward and back projectors, using MapReduce, the
standard parallelization library on clouds. From preliminary investigations, we found that a large speedup is possible at a
very low cost. However, communication overheads inside MapReduce can limit the maximum speedup, and a better MapReduce
implementation might become necessary in the future. All the experiments for this paper, including development and
testing, were completed on the Amazon Elastic Compute Cloud (EC2) for less than $20.
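The MapReduce pattern used for the projectors can be sketched in miniature. The key scheme below (pixel index as key, weighted ray contributions as values) is a hypothetical simplification of a back projector, not the paper's implementation.

```python
from collections import defaultdict

# Sketch of the MapReduce pattern for a back projector: each mapper
# processes one projection ray and emits partial contributions keyed by
# image pixel; the shuffle groups values by key; each reducer sums the
# contributions for its pixel.

def map_ray(ray):
    """Emit (pixel_index, partial_value) pairs for one ray (toy weights)."""
    for pixel, weight in ray["pixels"]:
        yield pixel, weight * ray["measurement"]

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_pixel(values):
    return sum(values)

rays = [
    {"measurement": 2.0, "pixels": [(0, 0.5), (1, 0.5)]},
    {"measurement": 4.0, "pixels": [(1, 1.0)]},
]
pairs = [kv for ray in rays for kv in map_ray(ray)]
image = {pixel: reduce_pixel(vals) for pixel, vals in shuffle(pairs).items()}
```

In a real deployment the shuffle step is where the communication overhead noted above arises: every mapper's output must be routed to the reducer owning that pixel key.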
In this paper we investigate the spatial correlational structure of orientation and color information in natural
images. We compare these with the spatial correlation structure of optical recordings of macaque monkey
primary visual cortex, in response to oriented and color stimuli. We show that the correlation of orientation falls
off rapidly over increasing distance. By using a color metric based on the a-b coordinates in the CIE-Lab color
space, we show that color information, on the other hand, is more highly correlated over larger distances. We
also show that orientation and color information are statistically independent in natural images. We perform
a similar spatial correlation analysis of the cortical responses to orientation and color. We observe a similar
behavior to that of natural images, in that the correlation of orientation-specific responses falls off more rapidly
than the correlation of color-specific responses. Our findings suggest that: (a) orientation and color information
should be processed in separate channels, and (b) the organization of cortical color responses at a lower spatial
frequency compared to orientation is a reflection of the statistical structure of the visual world.
One of the important features of the human visual system is that it is able to recognize objects in a scale- and translation-invariant manner. However, achieving this desirable behavior through biologically realistic networks is a challenge. The synchronization of neuronal firing patterns has been suggested as a possible solution to the binding problem (where a biological mechanism is sought to explain how features that represent an object can be scattered across a network, and yet be unified). This observation has led to neurons being modeled as oscillatory dynamical units. It is possible for a network of these dynamical units to exhibit synchronized oscillations under the right conditions. These network models have been applied to solve signal deconvolution or blind source separation problems. However, the use of the same network to achieve properties that the visual system exhibits, such as scale and translational invariance, has not been fully explored. Some approaches investigated in the literature (Wallis, 1996) involve the use of non-oscillatory elements that are arranged in a hierarchy of layers. The objects presented are allowed to move, and the network utilizes a trace learning rule, where a time-averaged output value is used to perform Hebbian learning with respect to the input value. This is a modification of the standard Hebbian learning rule, which typically uses instantaneous values of the input and output. In this paper we present a network of oscillatory amplitude-phase units connected in two layers. The types of connections include feedforward, feedback and lateral. The network consists of amplitude-phase units that can
exhibit synchronized oscillations. We have previously shown that such a network can segment the components of each input object that most contribute to its classification. Learning is unsupervised and based on a Hebbian update, and the architecture is very simple. We extend the ability of this network to address the problem of translational invariance. We show that by adopting a specific treatment of the phase values of the output layer, the network exhibits translation-invariant object representation. The scheme used in training is as follows. The network is presented with an input, which then moves. During the motion the amplitude and phase of the upper-layer units are not reset, but continue from their values prior to the object's appearance at the new position. Only the input layer is changed
instantaneously to reflect the moving object. The network behavior is such that it categorizes the translated objects with the same label as the stationary object, thus establishing an invariant categorization with respect to translation. This is a promising result as it uses the same framework of oscillatory units that achieves synchrony, and introduces motion to achieve translational invariance.
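The synchronization property that the network relies on can be illustrated with a stripped-down, phase-only sketch (the paper's units also carry amplitudes): two Kuramoto-style oscillators with mismatched intrinsic frequencies phase-lock under sufficient coupling.

```python
import numpy as np

def simulate_coupled_phases(n_steps=2000, dt=0.01, coupling=1.0):
    """Two Kuramoto-style phase units with a small frequency mismatch;
    with sufficient coupling their phase difference locks (synchrony)."""
    omega = np.array([1.0, 1.2])   # intrinsic frequencies
    theta = np.array([0.0, 2.0])   # initial phases (start far apart)
    for _ in range(n_steps):
        d0 = omega[0] + coupling * np.sin(theta[1] - theta[0])
        d1 = omega[1] + coupling * np.sin(theta[0] - theta[1])
        theta = theta + dt * np.array([d0, d1])
    return theta

theta = simulate_coupled_phases()
# at the locked state, sin(diff) = (1.2 - 1.0) / (2 * coupling) = 0.1
phase_diff = np.angle(np.exp(1j * (theta[1] - theta[0])))  # wrapped difference
```

Units whose phases lock in this way can be read out as belonging to the same object, which is the mechanism invoked for binding and, in this paper, for carrying a representation through the object's motion.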
We present a modelling framework for cortical processing aimed at understanding how, maintaining biological plausibility, neural network models can: (a) approximate general inference algorithms like belief propagation, combining bottom-up and top-down information, (b) solve Rosenblatt's classical superposition problem, which we link to the binding problem, and (c) do so based on an unsupervised learning approach. The framework leads to two related models: the first model shows that the use of top-down feedback significantly improves the
network's ability to perform inference on corrupted inputs; the second model, which includes oscillatory behavior in the processing units, shows that the superposition problem can be efficiently solved based on the units' phases.
In this paper we address the problem of understanding the cortical
processing of color information. Unravelling the cortical
representation of color is a difficult task, as the neural pathways for color processing have not been fully mapped, and there are few computational modelling efforts devoted to color. Hence, we first present a conjecture for an ideal target color map based on principles of color opponency, and constraints such as retinotopy and the two dimensional nature of the map. We develop a computational model for the cortical processing of color information that seeks to produce this target color map in a self-organized manner. The input model consists of a luminance channel and opponent color channels, comprising red-green and blue-yellow signals. We use an optional stage consisting of applying an antagonistic center-surround filter to these channels. The input is projected to a restricted portion of the cortical network in a topographic way. The units in the cortical map receive the color opponent input, and compete amongst each other to represent the input. This competition is carried out through the determination of a local winner. By simulating a self-organizing map for color according to this scheme, we are largely able to achieve the desired target color map. According to recent neurophysiological findings, there is evidence for the representation of color mixtures in the cortex, which is consistent with our model. Furthermore, an
orderly traversal of stimulus hues in the CIE chromaticity map
corresponds to an orderly spatial traversal in the primate cortical
area V2. Our experimental results are also consistent with this finding.
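A toy self-organizing map over opponent-color inputs illustrates the competitive scheme described above. The grid size, learning rate, and neighborhood width are assumed values, and the paper's local-winner competition is simplified here to a global argmin over a small map.

```python
import numpy as np

def train_color_som(colors, grid=8, n_iter=3000, lr=0.2, sigma=2.0):
    """Toy self-organizing map over opponent-color inputs.

    colors : (n_samples, 3) inputs (e.g. luminance, R-G, B-Y channels).
    Units on a grid x grid map compete for each input; the winner and
    its neighbors move toward the input, yielding a topographic map.
    """
    rng = np.random.default_rng(4)
    weights = rng.uniform(-1, 1, size=(grid, grid, 3))
    ys, xs = np.mgrid[0:grid, 0:grid]
    for _ in range(n_iter):
        x = colors[rng.integers(len(colors))]
        dist = np.linalg.norm(weights - x, axis=2)
        wy, wx = np.unravel_index(np.argmin(dist), dist.shape)  # winner unit
        g = np.exp(-((ys - wy) ** 2 + (xs - wx) ** 2) / (2 * sigma ** 2))
        weights += lr * g[:, :, None] * (x - weights)           # neighborhood update
    return weights

# toy opponent-color inputs in [-1, 1]
rng = np.random.default_rng(5)
colors = rng.uniform(-1, 1, size=(500, 3))
som = train_color_som(colors)
```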
The amount of documents in electronic formats has dramatically increased because of the ease of information sharing. Hence, it is highly desirable to design an efficient document compression technique. In this paper, a divide-and-conquer technique is proposed to classify a local region into uni-level, bi-level and multi-level classes. As a result, various compression approaches can be applied to suitable areas to increase compression efficiency. The color sigma filtering technique is adopted as a preprocessing stage to facilitate the subsequent segmentation and cluster validation processes. Experimental results demonstrate that this technique successfully dichotomizes a color document into regions with similar characteristics.
Most commercially printed images are halftoned using a screening process. In order to reproduce printed documents containing images, one usually performs descreening or inverse halftoning to avoid possible moire patterns. There exists a variety of grayscale halftone descreening techniques in the literature. However, color halftone descreening is still an ongoing research topic. In this paper, we present two descreening approaches: a suboptimal FIR filter and a two-stage color sigma filter. The suboptimal FIR descreening filter offers an efficient descreening approach for grayscale halftoned images. Meanwhile, the color halftone descreening technique based on the color sigma filter does not assume any a priori knowledge about the halftoning process, making it applicable to any color halftone image. Similar to the anisotropic diffusion algorithm and total variation minimization techniques designed for grayscale images, the color sigma filter is an O(N) algorithm which can smooth out variation within each region and preserve edge information in the RGB color space. When combined with halftone segmentation techniques, a complete document processing algorithm for grayscale and color documents can be created.
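The sigma filter's edge-preserving smoothing can be sketched for a single channel (the paper's version operates on RGB); the window radius and sigma below are assumed values.

```python
import numpy as np

def sigma_filter(img, radius=1, sigma=20.0):
    """Sigma filter: replace each pixel by the mean of the neighbors whose
    value lies within +/- 2*sigma of it. Edges are preserved because
    pixels across a strong edge fall outside the window.
    Single-channel sketch; the color version applies the test in RGB."""
    h, w = img.shape
    out = np.empty((h, w))
    padded = np.pad(img.astype(float), radius, mode="edge")
    for y in range(h):
        for x in range(w):
            window = padded[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            mask = np.abs(window - img[y, x]) <= 2 * sigma  # always keeps center
            out[y, x] = window[mask].mean()
    return out

# step edge plus noise: smoothing flattens each side, keeps the step
rng = np.random.default_rng(6)
img = np.hstack([np.full((8, 8), 50.0), np.full((8, 8), 200.0)])
noisy = img + rng.normal(0, 5, img.shape)
smoothed = sigma_filter(noisy)
```

Each pixel is visited once with a fixed-size window, which is the O(N) behavior noted above.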
The convergence of inexpensive digital cameras and cheap hardware for displaying stereoscopic images has created the right conditions for the proliferation of stereoscopic imaging applications. One application, which is of growing importance to museums and cultural institutions, consists of capturing and displaying 3D images of objects at multiple orientations. In this paper, we present our stereoscopic imaging system and methodology for semi-automatically capturing multiple-orientation stereo views of objects in a studio setting, and demonstrate the superiority of using a high-resolution, high-fidelity digital color camera for stereoscopic object photography. We show the superior performance achieved with the IBM TDI-Pro 3000 digital camera developed at IBM Research. We examine various choices related to the camera parameters and image capture geometry, and suggest a range of optimum values that work well in practice. We also examine the effect of scene composition and background selection on the quality of the stereoscopic image display. We demonstrate our technique with turntable views of objects from the IBM Corporate Archive.
One of the major challenges in scanning and printing documents in a digital library is the preservation of the quality of the documents, and in particular of the images they contain. When photographs are offset-printed, a screening process usually takes place. During screening, a continuous-tone image is converted into a bi-level image by applying a screen to replace each color in the original image. When high-resolution scanning of screened images is performed, it is very common to observe in the digital version of the document the screen patterns used during the original printing. In addition, when printing the digital document, more artifacts tend to appear because printing requires halftoning. In order to automatically suppress these moire patterns, it is necessary to detect the image areas of the document and remove the screen pattern present in those areas. In this paper, we present efficient and robust techniques to segment a grayscale document into halftone image areas, detect the presence and frequency of screen patterns in halftone areas, and suppress the detected screens. We present novel techniques to perform fast segmentation based on alpha-crossings, detection of screen frequencies using a fast accumulator function, and suppression of detected screens by low-pass filtering.
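Screen-frequency detection can be illustrated with a Fourier-domain sketch: the periodic screen shows up as the strongest non-DC peak in the magnitude spectrum. This is a simplified stand-in for the paper's accumulator-function method.

```python
import numpy as np

def dominant_screen_frequency(img):
    """Estimate the dominant periodic-screen frequency of a halftone
    patch from the peak of its magnitude spectrum (DC removed).
    Returns (fy, fx) in cycles per pixel."""
    f = np.fft.fft2(img - img.mean())   # subtracting the mean removes DC
    mag = np.abs(f)
    iy, ix = np.unravel_index(np.argmax(mag), mag.shape)
    fy = np.fft.fftfreq(img.shape[0])[iy]
    fx = np.fft.fftfreq(img.shape[1])[ix]
    return fy, fx

# synthetic screened patch: vertical screen at 0.25 cycles/pixel
x = np.arange(64)
patch = 0.5 + 0.5 * np.cos(2 * np.pi * 0.25 * x)[None, :] * np.ones((64, 1))
fy, fx = dominant_screen_frequency(patch)
```

Once the screen frequency is known, suppression amounts to low-pass filtering below that frequency, as described above.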
Proc. SPIE. 3314, Optical Security and Counterfeit Deterrence Techniques II
KEYWORDS: Human subjects, Visual process modeling, Visual analytics, Visualization, Photography, Digital watermarking, Statistical modeling, Illumination engineering, RGB color model, Digital libraries
Visible image watermarking has become an important and widely used technique to identify ownership and protect copyrights to images. A visible image watermark immediately identifies the owner of an image, and if properly constructed, can deter subsequent unscrupulous use of the image. The insertion of a visible watermark should satisfy two conflicting conditions: the intensity of the watermark should be strong enough to be perceptible, yet it should be light enough to be unobtrusive and not mar the beauty of the original image. Typically such an adjustment is made manually, and human intervention is required to set the intensity of the watermark at the right level. This is fine for a few images, but is unsuitable for a large collection of images. Thus, it is desirable to have a technique to automatically adjust the intensity of the watermark based on some underlying property of each image. This will allow a large number of images to be automatically watermarked, thus increasing the throughput of the watermarking stage. In this paper we show that the measurement of image texture can be successfully used to automate the adjustment of watermark intensity. A linear regression model is used to predict subjective assessments of correct watermark intensity based on image texture measurements.
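The regression step can be sketched with ordinary least squares. The texture feature and the subjective intensity values below are made-up toy data; the real model would be fit to the paper's texture measurements and human assessments.

```python
import numpy as np

def fit_intensity_model(texture_features, intensities):
    """Least-squares fit of a linear model predicting the subjectively
    correct watermark intensity from image texture measurements."""
    X = np.column_stack([np.ones(len(texture_features)), texture_features])
    coef, *_ = np.linalg.lstsq(X, intensities, rcond=None)
    return coef

def predict_intensity(coef, features):
    return coef[0] + np.dot(coef[1:], features)

# toy data: busier textures can carry (and need) a stronger watermark
features = np.array([[0.1], [0.3], [0.5], [0.7], [0.9]])   # e.g. texture energy
intensity = np.array([0.22, 0.26, 0.30, 0.34, 0.38])       # subjective settings
coef = fit_intensity_model(features, intensity)
```

At watermarking time, each new image's texture features are fed to `predict_intensity`, removing the manual adjustment from the loop.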
The specification of image content is a critical issue in image databases. In this paper we explore the problem of specifying an important visual cue, that of image texture. The approach we have taken is to separately categorize texture images and texture words (in the English language), and then explore the relationships between the identified categories of images and words. These relationships are expressed as association matrices, and measure the mapping between the visual texture space and lexical texture space. Based on experiments with human subjects, we determined Pearson's coefficient of contingency (which measures the degree of association) to be 0.63 for the association matrix mapping images to words, and 0.56 for the association matrix mapping words to images. These indicate a strong association between texture words and images. Furthermore, like categories of texture words map onto like categories of texture images, e.g. words dealing with repetition map onto images of repetitive texture.
Low-voltage-operated display devices require thin fluorescent screens of 3 to 4 micrometers, consisting of 0.1 to 2.0 micrometer phosphor particles. Fine-grain phosphors are synthesized by hydrothermal, combustion and sol-gel processes. In the sol-gel process, the pores in dried gels are often extremely small and the components of homogeneous gels are intimately mixed. The surface area of powders produced from sol-gel is very high, leading to a lower processing temperature. Stable sols of Y(OH)3, Al(OH)3 and RE(OH)3 were prepared by passing the respective nitrates through an ion exchange column. Stoichiometric amounts of these sols were mixed to obtain YAG:RE (0.001 to 5.0 m/o) phosphors. DTA/TGA analysis showed weight loss corresponding to the loss of water molecules and oxidation. XRD of the samples fired at 1200 degrees Celsius showed only the YAG phase. SEM studies revealed that the phosphor particles are nearly spherical in shape and uniform in size. Emission spectra from these samples, excited by 200 to 5000 eV, exhibited a number of lines in the blue, green and red regions corresponding to Tm3+, Tb3+ and Eu3+ transitions. The introduction of inhibitors (Li+) and sensitizers (Pr3+) appears promising in improving the morphology as well as the efficiency of the phosphors.
Proc. SPIE. 1907, Machine Vision Applications in Industrial Inspection
KEYWORDS: Defect detection, Manufacturing, Inspection, Scanning electron microscopy, Process control, Machine vision, Integrated circuits, Image classification, Scene classification, Classification systems
While initial detection of defects is the most critical function of inspection, automatic classification of detected defects is becoming increasingly desirable. The key to better process control is reliable process measurement. The classification of defects provides valuable process diagnosis information. The hope is that machines can perform this task more reliably than humans. However, there are many problems in automating defect classification, and many of these are related to the central problems in artificial intelligence, such as knowledge representation, inferencing, and dealing with uncertainty. In this paper we pay special attention to the issues arising in the Automatic Defect Classification (ADC) of integrated circuits. We first discuss technical and system requirements, followed by an outline of the technical challenges to be overcome to develop flexible and powerful ADC tools which can be quickly customized at the user level for diverse applications.
A fundamental issue in texture analysis is that of deciding what textural features are important in texture perception, and how they are used. Experiments on human pre-attentive vision have identified several low-level features (such as the orientation of blobs and the size of line segments) which are used in texture perception. However, the question of what higher-level features of texture are used has not been adequately addressed. We designed an experiment to help identify the relevant higher-order features of texture perceived by humans. We used twenty subjects, who were asked to perform an unsupervised classification of thirty pictures from Brodatz's album on texture. Each subject was asked to group these pictures into as many classes as desired. Both hierarchical cluster analysis and non-metric MDS were applied to the pooled similarity matrix generated from the subjects' groupings. A surprising outcome is that the MDS solutions fit the data very well. The stress in the two-dimensional case is 0.10, and in the three-dimensional case is 0.045. We rendered the original textures in these coordinate systems and interpreted the (rotated) axes. It appears that the axes in the 2D case correspond to periodicity versus irregularity, and directional versus non-directional. In the 3D case, the third dimension represents the structural complexity of the texture. Furthermore, the clusters identified by the hierarchical cluster analysis remain virtually intact in the MDS solution. The results of our experiment indicate that people use three high-level features for texture perception. Future studies are needed to determine the appropriateness of these high-level features for computational texture analysis and classification.
The measurement of surface topography is an important inspection task as it provides useful information for process and quality control. A candidate technique for such an application is confocal imaging. The advantages of confocal imaging are that it is a noncontact measurement, can be operated at high speed (greater than 10 megapixels/sec) and submicron resolution, and provides height information in multilayered semitransparent materials. In this paper, we present a scheme for the fast processing of confocal images. The scheme consists of measuring the response function of the confocal system and deriving a deconvolution filter based on this response. The input signal is deconvolved in order to improve the depth resolution and then processed to identify significant peaks. These peaks represent the position of different surfaces in the object being inspected. For semitransparent materials, our scheme is capable of detecting up to two surfaces at a given location.
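The deconvolve-then-find-peaks pipeline can be sketched in 1-D. The Wiener filter here is a standard stand-in for a filter derived from the measured system response, and the PSF width, noise level, and peak threshold are assumed values.

```python
import numpy as np

def wiener_deconvolve(signal, psf, noise_power=1e-4):
    """Frequency-domain Wiener deconvolution of a 1-D depth response;
    noise_power regularizes frequencies the PSF suppresses."""
    n = len(signal)
    H = np.fft.fft(psf, n)
    G = np.conj(H) / (np.abs(H) ** 2 + noise_power)   # Wiener filter
    return np.real(np.fft.ifft(np.fft.fft(signal) * G))

def find_peaks(x, thresh):
    """Indices of local maxima above thresh -- candidate surface positions."""
    return [i for i in range(1, len(x) - 1)
            if x[i] > thresh and x[i] >= x[i - 1] and x[i] > x[i + 1]]

# synthetic depth scan: two surfaces at z = 40 and z = 80, blurred by a
# Gaussian axial response (centered at index 0 for circular convolution)
n = 128
z = np.arange(n)
true = np.zeros(n)
true[40], true[80] = 1.0, 0.6
d = np.minimum(z, n - z)                 # circular distance from index 0
psf = np.exp(-0.5 * (d / 3.0) ** 2)
psf /= psf.sum()
blurred = np.real(np.fft.ifft(np.fft.fft(true) * np.fft.fft(psf)))
restored = wiener_deconvolve(blurred, psf)
peaks = find_peaks(restored, 0.1)
```

The two detected peaks correspond to the two surfaces of a semitransparent layer, mirroring the two-surface capability noted above.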
The automation of visual inspection in semiconductor wafer processing is a very challenging task. In this paper we address the automatic description and measurement of surface textures in semiconductor wafers. Texture plays a critical role in inspecting surfaces that are produced at various stages in the inspection of semiconductor devices. In this paper we describe a novel scheme to characterize surface textures that arise in semiconductor wafer processing. The emphasis in our scheme is on quantitative measures that allow for accurate characterization of surface texture. The fractal dimension is a quantitative measure of surface roughness, and we have developed an algorithm to automatically measure this. We also present an algorithm to compute the orientation field of a given texture. This algorithm can be used to characterize defects such as 'orange peel'. Furthermore, we have used the qualitative theory of differential equations to devise a symbol set for oriented textures in terms of singularities. An algorithm has been devised to process an image of a defect and extract qualitative descriptions based on this theory. We present the results of applying our algorithms to representative defects that arise in semiconductor wafer processing.
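The box-counting estimate of fractal dimension, one standard way to quantify surface roughness from an image, can be sketched as follows; the binary-image formulation and the scale choices are illustrative assumptions.

```python
import numpy as np

def box_counting_dimension(points, sizes=(1, 2, 4, 8, 16)):
    """Estimate the fractal (box-counting) dimension of a binary image:
    count occupied boxes N(s) at each scale s, then fit the slope of
    log N(s) versus log (1/s)."""
    counts = []
    for s in sizes:
        h, w = points.shape
        # tile the image into s x s boxes and mark each box occupied
        boxes = points[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s)
        counts.append(np.any(boxes, axis=(1, 3)).sum())
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

# sanity check: a filled square is 2-D, a straight line is 1-D
filled = np.ones((64, 64), dtype=bool)
line = np.zeros((64, 64), dtype=bool)
line[32, :] = True
d_filled = box_counting_dimension(filled)
d_line = box_counting_dimension(line)
```

Applied to a thresholded height or intensity map of a wafer surface, values between 2 and 3 (for surfaces) or between 1 and 2 (for contours) indicate increasing roughness.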