Pose variation is an important research topic in face recognition. The change in distance parameters between facial features across poses is challenging. We address this problem using perspective projection for pose-variant face recognition. Our method infers the intrinsic camera parameters of the image, which enable the projection of the image plane into 3D. Face box tracking and eye-centre detection are then performed using our novel technique to verify the virtual face feature measurements. The coordinate system of the perspective projection for face tracking allows the holistic dimensions of the face to be fixed across different orientations. Training on frontal images and the remaining poses of the FERET database determines the distance from the centre of the eyes to the corner of the face box. The recognition system compares the gallery of images against different poses. The system initially uses the positions of both eyes and then focuses principally on the closest eye in order to gather data with greater reliability. Differentiating between the distances and positions of the right and left eyes is a unique feature of our work; our algorithm outperforms other state-of-the-art algorithms, enabling stable measurement under pose variation for each individual.
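As a rough illustration of how intrinsic camera parameters relate image-plane points to 3D, the sketch below uses the standard pinhole model. The focal lengths `fx`, `fy`, the principal point `cx`, `cy`, the known depth, and the function names are assumptions made for illustration; the abstract does not state how the parameters are inferred or which projection model is used.

```python
def project(point, fx, fy, cx, cy):
    """Project a 3D camera-frame point (X, Y, Z) to pixel coordinates (u, v)."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (fx * X / Z + cx, fy * Y / Z + cy)


def back_project(pixel, depth, fx, fy, cx, cy):
    """Lift a pixel (u, v) to a 3D camera-frame point, given an assumed depth Z."""
    u, v = pixel
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)
```

Back-projection needs a depth value because a single pixel only fixes a ray; fixing the face dimensions, as the abstract describes, is one way such an ambiguity could be resolved.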
Iris segmentation is the process of defining the valid part of the eye image used for further processing (feature extraction, matching and decision making). Segmentation of the iris mostly starts with pupil boundary segmentation. Most pupil segmentation techniques are based on the assumption that the pupil is circular. In this paper, we propose a new pupil segmentation technique which combines shape, location and spatial information for accurate and efficient segmentation of the pupil. Initially, the pupil's position and radius are estimated using a statistical approach and the circular Hough transform. To segment the irregular boundary of the pupil, an active contour model is initialized close to the estimated boundary using information from the first step, and segmentation is achieved by energy minimization of the active contour. Pre-processing and post-processing were carried out to remove noise and occlusions respectively. Experimental results on CASIA V1.0 and 4.0 show that the proposed method is highly effective at segmenting irregular pupil boundaries.
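The circular Hough transform step can be sketched in a few lines. This is a minimal voting scheme under stated assumptions: edge points are already available as integer pixel coordinates, the candidate radii are supplied by the caller, and the names `hough_circle` and `circle_points` are hypothetical. It does not reproduce the statistical estimate or the active contour stage described above.

```python
import math
from collections import Counter


def hough_circle(edge_points, radii, n_angles=72):
    """Return the (cx, cy, r) accumulator cell with the most votes."""
    votes = Counter()
    for x, y in edge_points:
        cells = set()  # each edge point votes at most once per cell
        for r in radii:
            for k in range(n_angles):
                t = 2.0 * math.pi * k / n_angles
                # A centre at distance r from the edge point, in direction t.
                cells.add((round(x - r * math.cos(t)),
                           round(y - r * math.sin(t)), r))
        votes.update(cells)
    return votes.most_common(1)[0][0]


def circle_points(cx, cy, r, n=72):
    """Integer pixel coordinates sampled on a circle (synthetic edge data)."""
    return [(round(cx + r * math.cos(2.0 * math.pi * k / n)),
             round(cy + r * math.sin(2.0 * math.pi * k / n)))
            for k in range(n)]
```

For example, `hough_circle(circle_points(50, 40, 12), range(8, 17))` recovers the centre and radius of the synthetic circle; on a real eye image the result would only serve as the initial estimate for a boundary refinement step.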
In this paper, a fast incremental image reduction principal component analysis approach (IIRPCA) is developed for image representation and recognition. As opposed to traditional appearance-based image techniques, IIRPCA computes the principal components of a sequence of image samples directly on the 2D image matrix incrementally without estimating the covariance matrix. IIRPCA therefore overcomes limitations such as computational cost and memory requirements, making it suitable for real-time applications. The feasibility of the proposed approach was tested on a recently published large database consisting of over 2000 face images. IIRPCA is superior in computational time and storage, with comparable recognition accuracy (94.0%), when compared to recent techniques such as 2DPCA (92.0%) and 2D-RPCA (94.5%).
We develop a novel image feature extraction and recognition method, <i>two-dimensional reduction principal component analysis</i> (2D-RPCA). A two-dimensional image matrix contains redundant information between columns and between rows. Conventional PCA removes redundancy by transforming the 2D image matrices into a vector, where dimension reduction is done in one direction (column-wise). Unlike 2DPCA, 2D-RPCA has two image compression stages: first, it eliminates the redundancies between image rows and compresses the information optimally within a few rows; then it eliminates the redundancies between image columns and compresses the information within a few columns. This sequence is selected in such a way that the recognition accuracy is optimized. As a result, the representation is better because the information is more compact in a smaller area. The classification time is reduced significantly (smaller feature matrix), and the computational complexity of the proposed algorithm is also reduced. The result is that 2D-RPCA classifies images faster, requires less memory storage and yields higher recognition accuracy. The ORL database is used as a benchmark. The new algorithm achieves a recognition rate of 95.0% using a 9×5 feature matrix, compared to a recognition rate of 93.0% with a 112×7 feature matrix for the 2DPCA method and 90.5% for PCA (Eigenfaces) using 175 principal components.
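To make the "smaller feature matrix" claim concrete, the short calculation below compares the number of stored values per face for the three feature sizes quoted above. It only restates the reported dimensions; the dictionary keys and the helper name `storage_ratio` are illustrative.

```python
# Number of values stored per face, taken from the feature sizes in the abstract.
feature_values = {
    "2D-RPCA": 9 * 5,     # 9x5 feature matrix
    "2DPCA": 112 * 7,     # 112x7 feature matrix
    "PCA": 175,           # 175 principal components
}


def storage_ratio(method, baseline="2D-RPCA"):
    """How many times more values `method` stores than `baseline`."""
    return feature_values[method] / feature_values[baseline]
```

With these numbers, 2DPCA stores 784 values against 45 for 2D-RPCA, roughly 17 times as many, and PCA stores roughly 4 times as many.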
Automated medical image diagnosis using quantitative measurements is extremely helpful for cancer prognosis to reach a high degree of accuracy and thus make reliable decisions. In this paper, six morphological features based on texture analysis were studied in order to categorize normal and cancerous colon mucosa. They were derived after a series of pre-processing steps to generate a set of different shape measurements. Based on shape and size, six features known as <i>Euler Number, Equivalent Diameter, Solidity, Extent, Elongation</i>, and <i>Shape Factor AR</i> were extracted. Mathematical morphology is used first to remove background noise from segmented images and then to obtain different morphological measures describing the shape, size, and texture of colon glands. The proposed automated system is tested by classifying 102 microscopic samples of colorectal tissue, consisting of 44 normal colon mucosa samples and 58 cancerous samples. The results were first evaluated statistically, using the one-way ANOVA method to examine the significance of each extracted feature. Significant features are then selected in order to classify the dataset into two categories. Finally, important classification factors were estimated using two discrimination methods: a linear method and k-means clustering. In brief, this study demonstrates that abnormalities in low-level power tissue morphology can be distinguished using quantitative image analysis. This investigation shows the potential of an automated vision system in histopathology. Furthermore, it has the advantage of being objective and, more importantly, of being a valuable diagnostic decision support tool.
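Two of the named shape features have widely used definitions that are easy to state in code. The sketch below assumes a binary mask containing a single region and uses the common conventions (equivalent diameter as the diameter of a circle with the same area, extent as area divided by bounding-box area); the abstract does not give the exact formulas used in the study, and `region_features` is a hypothetical name.

```python
import math


def region_features(mask):
    """Area, equivalent diameter and extent of the foreground of a binary mask.

    `mask` is a list of rows; any truthy entry counts as a foreground pixel.
    The whole foreground is treated as one region.
    """
    coords = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    if not coords:
        raise ValueError("mask has no foreground pixels")
    area = len(coords)
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    bbox_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return {
        "area": area,
        # Diameter of the circle whose area equals the region's area.
        "equivalent_diameter": math.sqrt(4.0 * area / math.pi),
        # Fraction of the bounding box covered by the region.
        "extent": area / bbox_area,
    }
```

Solidity, elongation and the Euler number need a convex hull, a fitted axis and a hole count respectively, so they are left out of this sketch.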
Accurate and reliable decision making in cancer prognosis can help in the planning of appropriate surgery and therapy and, in general, optimize patient management through the different stages of the disease. In this paper, we present a novel fractal geometry algorithm as a potential method for classifying colorectal histopathological images. 102 microscopic samples of colon tissue were examined in order to identify abnormalities using a morphological feature approach based on segmenting the image into different classes, derived from fractal dimension. The mean fractal dimension (FD) obtained for normal tissue was 1.797 ± 0.0381 (n = 44), compared with 1.866 ± 0.0262 for malignant samples (n = 58). In brief, this study demonstrates the value of a fractal dimension based morphological approach in the analysis of microscopic colon cancer images. Although the results obtained are strongly significant in separating normal from malignant colorectal images, further analyses are essential before this methodology can be incorporated into routine clinical practice to support pathologists' decisions.
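A fractal dimension of a binary image is often estimated by box counting, which is sketched below. This is a generic estimator, not necessarily the algorithm of the paper: the set of foreground pixel coordinates, the box sizes and the function names are all assumptions.

```python
import math


def box_count(pixels, size):
    """Number of size x size boxes that contain at least one foreground pixel."""
    return len({(r // size, c // size) for r, c in pixels})


def fractal_dimension(pixels, sizes=(1, 2, 4, 8, 16)):
    """Least-squares slope of log N(s) against log(1/s)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(pixels, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

A filled square gives a slope of 2 and a straight line gives 1; values between those, such as the means reported above, indicate structure that fills the plane only partially.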
Proc. SPIE. 5022, Image and Video Communications and Processing 2003
KEYWORDS: Signal to noise ratio, Image compression, Detection and tracking algorithms, Video, Distortion, Computer programming, Video compression, Motion estimation, Standards development, Video coding
The paper presents a novel Orthogonal Logarithmic Search (OLS) method for block-based motion compensation. The performance of the algorithm is evaluated using standard QCIF benchmark video sequences, and the results are compared with the traditional, well-known full search algorithm (FSA) and a sub-optimal method called the Three Step Search (3SS). The evaluation considers three important metrics: time, entropy and PSNR (Peak Signal to Noise Ratio). The paper also shows that the strength of the algorithm lies in its speed of operation, as it is 95% faster than the FSA and over 60% faster than the 3SS.
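Two of the three metrics have standard definitions, sketched here for 8-bit images stored as lists of rows. The function names are illustrative, and the abstract does not say which signal the entropy is measured on, so `entropy_bits` simply takes any sequence of symbols.

```python
import math
from collections import Counter


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    sq_errors = [(a - b) ** 2
                 for row_o, row_r in zip(original, reconstructed)
                 for a, b in zip(row_o, row_r)]
    mse = sum(sq_errors) / len(sq_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


def entropy_bits(symbols):
    """First-order Shannon entropy of a symbol sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```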
Investigation into motion estimation algorithms is an important issue for video coding standards such as ISO MPEG-1/2 and ITU-T H.263. Implementations of these international standards regularly use a conventional FSA to estimate the motion of pixels between pairs of images by block matching. Since a full search requires intensive computation and the distortion function must be evaluated many times for each target block to be matched, the process is very time consuming. The main aim of this investigation has therefore been to alleviate this acute problem of search speed and accuracy.
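The cost gap between an exhaustive search and a reduced search can be seen by counting distortion evaluations. The sketch below implements a full search and a textbook three step search with a sum-of-absolute-differences distortion; it does not implement the OLS method, whose details are not given here. Block size, search range and all function names are assumptions.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def in_bounds(ref, bx, by, dx, dy, n):
    """True if the displaced block lies entirely inside the reference frame."""
    return (0 <= by + dy and by + dy + n <= len(ref)
            and 0 <= bx + dx and bx + dx + n <= len(ref[0]))


def full_search(cur, ref, bx, by, n=8, p=7):
    """Test every displacement in [-p, p]^2; returns (vector, cost, evaluations)."""
    best_mv, best_cost, evaluations = None, None, 0
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            if not in_bounds(ref, bx, by, dx, dy, n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            evaluations += 1
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost, evaluations


def three_step_search(cur, ref, bx, by, n=8, step=4):
    """Test 8 neighbours around the current best, halving the step each round."""
    cx, cy = 0, 0
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    evaluations = 1
    while step >= 1:
        best_here = (cx, cy)
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue
                mx, my = cx + dx, cy + dy
                if not in_bounds(ref, bx, by, mx, my, n):
                    continue
                cost = sad(cur, ref, bx, by, mx, my, n)
                evaluations += 1
                if cost < best_cost:
                    best_cost, best_here = cost, (mx, my)
        cx, cy = best_here
        step //= 2
    return (cx, cy), best_cost, evaluations
```

Away from the frame border and with a range of ±7 pixels, the full search evaluates 225 candidate positions per block while the three step search evaluates 25, at the price of possibly settling on a sub-optimal vector.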
Accurate detection of elementary features such as edges and lines in digital images plays an important role in early vision and is included as a first step in many image processing systems. An efficient and flexible form of feature detection is based on the responses of sets of orientation selective filters. These steerable filters can be designed to possess near-Gabor properties and provide a derivable interpolation function over orientation. However, these filters are rarely designed using non-circular receptive fields and cannot be optimized in terms of angular selectivity. A line of reasoning is developed here which allows orientation selective filters to be constructed using elliptical window functions and used within a conventional steerable filter bank system. Results suggest that an optimum exists between the rejection of unwanted signals and high angular selectivity of the filters. Further results are developed which emphasize phase differences between elliptical and circular filter responses as a measure of locally convex curvature maxima.
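The idea of interpolating a filter over orientation is easiest to see in the simplest steerable case, the first derivative of a circular Gaussian, where two basis kernels suffice. The sketch below shows only that textbook case; it uses a circular window and therefore does not represent the elliptical construction described above. Kernel size, `sigma` and the function names are assumptions.

```python
import math


def gaussian_derivative_kernels(sigma=1.0, radius=3):
    """Return the x- and y-derivative-of-Gaussian basis kernels (unnormalised)."""
    gx, gy = [], []
    for y in range(-radius, radius + 1):
        row_x, row_y = [], []
        for x in range(-radius, radius + 1):
            g = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row_x.append(-x * g / (sigma * sigma))
            row_y.append(-y * g / (sigma * sigma))
        gx.append(row_x)
        gy.append(row_y)
    return gx, gy


def steer(gx, gy, theta):
    """Kernel oriented at angle theta, as a linear combination of the basis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * a + s * b for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]
```

Because filtering is linear, the response at any angle can equally be obtained by combining the two basis responses with the same weights, which is what makes a steerable filter bank efficient.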
In this paper, we examine the interpixel redundancy between neighboring pixels and exploit their intensity similarities in serial scan vectors of the images to develop a simple lossless image compression scheme, and we show that the process is reversible. The algorithm, called Neighboring Zero Coding (NZC), is based on a common characteristic of images: neighboring pixels are highly correlated. The proposed method is a kind of integration in the coding phase and a form of prediction in the decoding phase; it can turn more than 65 percent of the image coefficients into zeros. On average, the NZC method achieves lossless compression to about 1.65 b/pixel from 8 bits on a range of high-resolution digitized computed tomography and magnetic resonance images with comparable signal to noise ratio. In addition, the coding and decoding procedures are extremely fast.
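The reason correlated neighbors produce many zeros can be illustrated with plain previous-pixel differencing on a serial scan. This is a generic stand-in chosen for illustration, not the NZC algorithm itself, whose coding rule is not specified here; the 8-bit modular arithmetic and the function names are assumptions.

```python
def encode(pixels):
    """Replace each 8-bit value in a 1D scan by its difference from the
    previous value (modulo 256). Equal neighbours become zeros."""
    prev = 0
    residuals = []
    for p in pixels:
        residuals.append((p - prev) % 256)
        prev = p
    return residuals


def decode(residuals):
    """Invert `encode` by accumulating the residuals modulo 256."""
    prev = 0
    pixels = []
    for r in residuals:
        prev = (prev + r) % 256
        pixels.append(prev)
    return pixels


def zero_fraction(values):
    """Fraction of entries that are exactly zero."""
    return values.count(0) / len(values)
```

The round trip `decode(encode(scan)) == scan` holds for any 8-bit scan, which is the reversibility property a lossless scheme requires; an entropy coder would then exploit the runs of zeros.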
In this paper, an image sequence coding scheme for very low bit-rate video coding is presented. The new technique utilizes windowed overlapped block matching motion compensation for the temporal coding scheme, and vector quantization (VQ) to reduce the spatial redundancy within the predicted image. There are many advantages associated with VQ, especially its capacity to operate in error-resilient coding systems. Furthermore, VQ does not suffer from blocking effects that are visually disjointed and therefore has a major advantage over DCT-based methods. We examine the performance of various codebooks to remove the spatial redundancy within the difference frame. When the codec is configured to operate at 10.1 kbit/s, average PSNR values in excess of 32.86 dB and 25.6 dB are achieved for the 'Miss America' and 'Carphone' sequences respectively.
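Vector quantization itself reduces to a nearest-codeword lookup, sketched below. The codebook contents, the squared-error distance and the function names are assumptions for illustration; codebook design and the motion compensation stage are not shown.

```python
import math


def nearest(codebook, vector):
    """Index of the codeword with the smallest squared error to `vector`."""
    best_index, best_dist = 0, None
    for i, code in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(code, vector))
        if best_dist is None or dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def vq_encode(codebook, vectors):
    """Map each input vector to the index of its nearest codeword."""
    return [nearest(codebook, v) for v in vectors]


def vq_decode(codebook, indices):
    """Reconstruct vectors by codebook lookup."""
    return [codebook[i] for i in indices]


def bits_per_index(codebook):
    """Fixed-length bits needed to send one index."""
    return max(1, math.ceil(math.log2(len(codebook))))
```

Only the indices are transmitted, so the bit rate per vector depends on the codebook size rather than on the vector dimension.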
In this paper, an image sequence coding scheme for very low bit-rate video coding is presented. We examine the performance of various codebooks to remove the spatial redundancy within the difference frame. When the codec is configured to operate at 10.1 kbit/s, average PSNR values in excess of 32.86 dB and 25.6 dB are achieved for the 'Miss America' and 'Carphone' sequences respectively. We also present a new methodology for adaptive vector quantization (AVQ), where the codebook is updated with new vectors. The new vectors replace less significant ones in the codebook based on a novel scoring criterion that utilizes a forgetting factor and codebook half-life. The proposed method gives rise to an additional performance enhancement of around 1 dB over conventional AVQ techniques. Furthermore, the methods do not suffer from blocking effects, owing to the inherent properties of both the temporal and spatial coding.
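One plausible way to connect a forgetting factor to a half-life is exponential decay of per-codeword usage scores, sketched below. The scoring rule, the replacement policy and every name here are assumptions made for illustration; the paper's actual criterion is not described in the abstract and may differ.

```python
def forgetting_factor(half_life):
    """Per-update decay such that an unused score halves after `half_life` updates."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (1.0 / half_life)


def update_scores(scores, used_index, lam):
    """Decay all scores by `lam` and credit the codeword that was just used."""
    return [s * lam + (1.0 if i == used_index else 0.0)
            for i, s in enumerate(scores)]


def replace_weakest(codebook, scores, new_vector, initial_score=1.0):
    """Return copies in which the lowest-scoring codeword is replaced."""
    weakest = min(range(len(scores)), key=scores.__getitem__)
    new_codebook, new_scores = list(codebook), list(scores)
    new_codebook[weakest] = new_vector
    new_scores[weakest] = initial_score
    return new_codebook, new_scores
```

Under this assumed rule, codewords that have not been selected recently drift towards zero and become candidates for replacement, while frequently used ones are protected.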
A suite of bandwidth-efficient image codecs is presented for use in second-generation wireless systems, such as the American IS-54 and IS-95 systems, the Pan-European GSM system and the Japanese digital cellular system. The proposed codecs are configured to operate at 9.6 kbit/s and are suitable for Quarter Common Intermediate Format (QCIF) videophone sequences scanned at 10 frames per second. The new image codecs employ the orthonormal wavelet transform to decompose the displaced frame difference data for each frame into four frequency subbands. The wavelet coefficients within the frequency subbands are then encoded using vector quantization. Comparison measures are undertaken for the two-stage pairwise nearest neighbor (PNN) algorithm, and the designs are rated on their ability to coherently reconstruct an efficient codebook from a training sequence of vector coefficients. It was found that the two-stage PNN algorithm constitutes a valuable compromise in terms of computational complexity with only negligible performance loss. When the codecs were configured to operate at 9.6 kbit/s, the average peak signal to noise ratios of the two-stage PNN and the adaptive algorithms were in excess of 28 dB and 30 dB respectively.
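A single-level decomposition into four subbands can be sketched with the Haar wavelet, the simplest orthonormal wavelet. The abstract does not name the wavelet used, so Haar is an assumption, as are the subband labels and function names. The last lines work out the bit budget implied by the stated rate, using the standard QCIF luminance size of 176×144.

```python
def haar2d(image):
    """One-level orthonormal 2D Haar transform of an even-sized image.

    Returns four half-size subbands: ll (average), lh (difference between
    rows), hl (difference between columns) and hh (diagonal difference).
    """
    h, w = len(image), len(image[0])
    if h % 2 or w % 2:
        raise ValueError("image dimensions must be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, h, 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, w, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            row_ll.append((a + b + d + e) / 2.0)
            row_lh.append((a + b - d - e) / 2.0)
            row_hl.append((a - b + d - e) / 2.0)
            row_hh.append((a - b - d + e) / 2.0)
        ll.append(row_ll); lh.append(row_lh)
        hl.append(row_hl); hh.append(row_hh)
    return ll, lh, hl, hh


def inverse_haar2d(ll, lh, hl, hh):
    """Rebuild the image from the four subbands produced by `haar2d`."""
    image = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, v, u, x = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            top += [(s + v + u + x) / 2.0, (s + v - u - x) / 2.0]
            bottom += [(s - v + u - x) / 2.0, (s - v - u + x) / 2.0]
        image.append(top)
        image.append(bottom)
    return image


# Bit budget implied by 9.6 kbit/s at 10 frames per second on QCIF frames.
BITS_PER_FRAME = 9600 // 10                    # 960 bits per frame
BITS_PER_PIXEL = BITS_PER_FRAME / (176 * 144)  # about 0.038 bits per pixel
```

The transform is orthonormal, so the total energy of the four subbands equals that of the input, and for a smooth frame difference most of that energy ends up in the `ll` band.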
The research introduces MARTI (man-machine animation real-time interface) for the realization of natural human-machine interfacing. The system uses simple vocal sound-tracks of human speakers to provide lip synchronization of computer graphical facial models. We present novel research in a number of engineering disciplines, which include speech recognition, facial modeling, and computer animation. This interdisciplinary research utilizes the latest hybrid connectionist/hidden Markov model speech recognition system to provide very accurate phone recognition and timing for speaker-independent continuous speech, and expands on knowledge from the animation industry in the development of accurate facial models and automated animation. The research has many real-world applications, which include the provision of a highly accurate and 'natural' man-machine interface to assist user interactions with computer systems and communication with one another using human idiosyncrasies; a complete special effects and animation toolbox providing automatic lip synchronization without the normal constraints of head-sets, joysticks, and skilled animators; compression of video data to well below standard telecommunication channel bandwidth for video communications and multi-media systems; assisting speech training and aids for the handicapped; and facilitating player interaction for 'video gaming' and 'virtual worlds.' MARTI has introduced a previously unseen level of realism to man-machine interfacing and special effect animation.
For many real-time and scientific applications, it is desirable to perform signal and image processing algorithms at very high speed by means of special hardware. With the advent of VLSI technology, large collections of processing elements, which cooperate with each other to achieve high-speed computation, have become economically feasible. In such systems, some level of fault tolerance must be provided to ensure the validity of the results. Fermat number transforms (FNTs) are attractive for the implementation of digital convolution because the computations are carried out in modular arithmetic, which involves no round-off error. In this paper we present a fault-tolerant linear array design for the FNT, adopting the weighted checksum approach. The results show that the approach is ideally suited to the FNT since it offers fault tolerance at very low cost and is free from round-off error and overflow problems.
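The exactness of convolution under a Fermat number transform can be checked with a small direct implementation. The sketch below assumes the modulus F_3 = 257, transform length 16 and root 2 (which has order 16 modulo 257); it is a plain O(N²) software version and does not model the linear array or the weighted checksum scheme.

```python
F = 257     # Fermat number F_3 = 2**8 + 1, which is prime
N = 16      # transform length
ALPHA = 2   # 2**8 = -1 (mod 257), so 2 has multiplicative order 16


def fnt(x, root=ALPHA):
    """Fermat number transform of a length-N sequence of residues mod F."""
    if len(x) != N:
        raise ValueError("input must have length %d" % N)
    return [sum(x[n] * pow(root, n * k, F) for n in range(N)) % F
            for k in range(N)]


def inverse_fnt(X):
    """Inverse transform: use the inverse root and scale by N^-1 mod F."""
    inv_root = pow(ALPHA, F - 2, F)   # modular inverse via Fermat's little theorem
    inv_n = pow(N, F - 2, F)
    return [(inv_n * v) % F for v in fnt(X, inv_root)]


def circular_convolve(a, b):
    """Length-N circular convolution mod F via pointwise products."""
    A, B = fnt(a), fnt(b)
    return inverse_fnt([(p * q) % F for p, q in zip(A, B)])
```

The result equals the ordinary integer convolution whenever every output value is below 257, with no rounding involved; because the root is a power of two, hardware can replace the multiplications by powers of the root with bit shifts.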