Local-information-based uncorrelated feature extraction (1 February 2006)
Optical Engineering, 45(2), 020505 (2006). doi:10.1117/1.2163873
Abstract
In the past few years, the computer vision and pattern recognition community has witnessed the rapid growth of a new family of feature extraction methods, the manifold learning methods, which attempt to project the original data into a lower-dimensional feature space while preserving the local neighborhood structure. Among them, locality preserving projection (LPP) is one of the most promising feature extraction techniques. Based on LPP, we propose a novel feature extraction method, called uncorrelated locality preserving projection (ULPP). We show that the features extracted by ULPP are statistically uncorrelated, which is desirable for many pattern analysis applications. We compare the proposed ULPP approach with LPP and principal component analysis (PCA) on the publicly available FERET and AR data sets. Experimental results suggest that ULPP provides a better representation of the data and achieves much higher recognition accuracies.
Zhao, Sun, and Jing: Local-information-based uncorrelated feature extraction

1.

Introduction

Recently, a new kind of feature extraction method, manifold learning algorithms, based on local geometry analysis, has drawn much attention. Unlike PCA, which aims to preserve the global Euclidean structure of the data space, manifold learning algorithms aim to preserve the inherent manifold structure. Locally linear embedding (LLE), Isomap, Laplacian eigenmap, and locality preserving projection (LPP)1 are the most important manifold learning algorithms. All four attempt to embed the original data into a submanifold by preserving the local neighborhood structure. They are based on the assumption that the manifold's intrinsic geometry can be fully determined by the local metric and the neighborhood information. Different from LLE, Isomap, and Laplacian eigenmap, LPP is a linear algorithm; it is quite simple and easy to implement. LPP was originally derived as a discrete approximation of the Laplace-Beltrami operator on a compact Riemannian manifold. He et al.1 applied LPP to the face recognition problem, and the application has shown the effectiveness of LPP in discovering the intrinsic geometrical structure of the data.

Based on LPP, we propose a novel algorithm, called uncorrelated locality preserving projection (ULPP). It shares the same locality preserving character as LPP, but at the same time it requires the basis vectors to be statistically uncorrelated. Uncorrelated features are highly desirable in pattern analysis because they contain minimum redundancy.2 The basis vectors of LPP, however, are statistically correlated, so the extracted features contain redundant information. Moreover, the overlapping information can distort the distribution of the features and even dramatically degrade the performance of LPP. In this paper, we overcome this problem by imposing an uncorrelatedness constraint on the computation of the basis vectors, which makes the proposed ULPP algorithm much more robust.

2.

Locality Preserving Projection

LPP1 can be explained by spectral graph theory. Assume that the $n$-dimensional data set $\{x_1,x_2,\dots,x_N\}$ is distributed on a low-dimensional submanifold. We wish to find a set of $d$-dimensional $(d<n)$ data points $\{y_1,y_2,\dots,y_N\}$ with the same local neighborhood structure as the data $\{x_1,x_2,\dots,x_N\}$. To do so, we construct a weighted graph $G=(V,E,W)$, where $V$ is the set of all points, $E$ is the set of edges connecting the points, and $W=(w_{ij})$ is a similarity matrix whose weights characterize the likelihood of two points. The criterion function of LPP is as follows:

1

$$\min\sum_{ij}\|y_i-y_j\|^{2}\,w_{ij}.$$

So that the mapping (embedding) is defined not only on the original data set, LPP uses a linear transformation $y_i=\Phi^{T}x_i$, where $\Phi=[\phi_1,\phi_2,\dots,\phi_d]\in\mathbb{R}^{n\times d}$. Following some algebraic steps, we see that

2

$$\frac{1}{2}\sum_{ij}\|y_i-y_j\|^{2}w_{ij}=\frac{1}{2}\sum_{ij}\big(\Phi^{T}x_i-\Phi^{T}x_j\big)^{T}\big(\Phi^{T}x_i-\Phi^{T}x_j\big)w_{ij}=\sum_{k}\phi_k^{T}\Big(\sum_{i}x_iD_{ii}x_i^{T}-\sum_{ij}x_iw_{ij}x_j^{T}\Big)\phi_k=\operatorname{trace}\big(\Phi^{T}X(D-W)X^{T}\Phi\big),$$
where $X=[x_1,x_2,\dots,x_N]$ and $D=\operatorname{diag}(D_{ii})$, whose entries are the row (or column, since $W$ is symmetric) sums of $W$, $D_{ii}=\sum_{j}w_{ij}$; $L=D-W$ is the Laplacian matrix.1
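As an illustration, the graph quantities $W$, $D$, and $L$ can be assembled in a few lines of Python. This is a minimal sketch, not the authors' code: the function name `build_graph` is ours, and the $m$-nearest-neighbor dot-product weights follow step 2 of the ULPP procedure in Sec. 3.

```python
import numpy as np

def build_graph(X, m=3):
    """Similarity matrix W, degree matrix D, and Laplacian L = D - W.
    X is the n x N data matrix with one sample per column, as in the text.
    Weights follow step 2 of Sec. 3: w_ij = x_i^T x_j when x_j is among the
    m nearest neighbors of x_i, and 0 otherwise (kept symmetric)."""
    N = X.shape[1]
    # pairwise Euclidean distances between the N samples
    diff = X[:, :, None] - X[:, None, :]
    dist = np.sqrt((diff ** 2).sum(axis=0))
    W = np.zeros((N, N))
    for i in range(N):
        for j in np.argsort(dist[i])[1:m + 1]:  # m nearest, excluding x_i itself
            W[i, j] = W[j, i] = X[:, i] @ X[:, j]
    D = np.diag(W.sum(axis=1))  # D_ii = sum_j w_ij (row sums of W)
    L = D - W                   # graph Laplacian
    return W, D, L
```

By construction the rows of $L$ sum to zero, the defining property of a graph Laplacian.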

In order to remove the arbitrary scaling factor in the embedding, LPP imposes a constraint as follows:

3

$$YDY^{T}=\Phi^{T}XDX^{T}\Phi=I,$$
where $Y=[y_1,y_2,\dots,y_N]$.
This constraint fixes the scale of the mapping (embedding) and makes the vertices with high similarities be mapped nearer to the origin. Finally, the minimization problem reduces to:

4

$$\arg\min_{\Phi:\;\Phi^{T}XDX^{T}\Phi=I}\;\operatorname{trace}\big(\Phi^{T}XLX^{T}\Phi\big).$$

The transformation matrix Φ that minimizes the objective function can be obtained by solving the generalized eigenvalue problem:

5

$$XLX^{T}\phi=\lambda XDX^{T}\phi.$$
Note that the Laplacian matrix can be viewed as a discrete approximation of the Laplace-Beltrami operator on a compact Riemannian manifold.1
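Equation 5 is a standard symmetric-definite generalized eigenproblem and can be solved with an off-the-shelf routine. The sketch below is our illustration, not the authors' implementation; it assumes $W$ is symmetric and $XDX^{T}$ is nonsingular (the paper guarantees this with a preliminary PCA step), in which case `scipy.linalg.eigh` returns eigenvectors that already satisfy the constraint of Eq. 3.

```python
import numpy as np
from scipy.linalg import eigh

def lpp_basis(X, W, d):
    """Solve X L X^T phi = lambda X D X^T phi (Eq. 5) and return the d
    eigenvectors with the smallest eigenvalues. Assumes X D X^T is
    nonsingular (ensured in the paper by a preliminary PCA projection)."""
    D = np.diag(W.sum(axis=1))
    SL = X @ (D - W) @ X.T   # X L X^T
    SD = X @ D @ X.T         # X D X^T
    # eigh handles the symmetric-definite generalized problem; eigenvalues
    # come back in ascending order and the eigenvectors are SD-normalized,
    # i.e., Phi^T (X D X^T) Phi = I, which is exactly the constraint of Eq. 3.
    _, Phi = eigh(SL, SD)
    return Phi[:, :d]
```

The constraint of Eq. 3 thus comes for free from the solver's normalization convention.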

Based on the matrix $\Phi=[\phi_1,\dots,\phi_d]$, for any instance $u=[u_1,u_2,\dots,u_n]^{T}\in\mathbb{R}^{n}$, we can define the following linear transform from $\mathbb{R}^{n}$ to $\mathbb{R}^{d}$:

6

$$\begin{bmatrix}v_1\\ v_2\\ \vdots\\ v_d\end{bmatrix}=\begin{bmatrix}\phi_1^{T}\\ \phi_2^{T}\\ \vdots\\ \phi_d^{T}\end{bmatrix}u.$$

Equation 6, together with the constraint of Eq. 3, forms the LPP transformation. It is easy to obtain the following theorem.

Theorem 1: Any two features $v_i$ and $v_j$ $(i\neq j)$ in Eq. 6 are, in general, correlated.

Proof: Let the covariance matrix of the data $\{x_i\}_{i=1}^{N}$ be $\Xi$. It is easy to obtain the following equation:

7

$$E[(v_i-Ev_i)(v_j-Ev_j)]=\phi_i^{T}\Xi\phi_j.$$
With the constraint of Eq. 3, we have, in general, the inequality

8

$$\phi_i^{T}\Xi\phi_j\neq 0\quad(i\neq j).$$
Therefore, we cannot obtain the following equation in general:

9

$$E[(v_i-Ev_i)(v_j-Ev_j)]=0,$$
thus completing the proof.

3.

Uncorrelated Locality Preserving Projection

The algorithmic procedure of uncorrelated locality preserving projection (ULPP) is stated below:

  • 1. Perform the PCA projection. We project the data set into a PCA subspace so that the matrix $XDX^{T}$ becomes nonsingular. Let the transformation matrix of PCA be $\Phi_{\mathrm{PCA}}$.

  • 2. Construct the similarity between any two data points. For each sample $x_i$, if $x_j$ is among the $m$-nearest neighbors of $x_i$, set the similarity $w_{ij}=w_{ji}=x_i^{T}x_j$; otherwise, set $w_{ij}=w_{ji}=0$. The similarity reflects the local structure of the data space.

  • 3. Obtain the diagonal matrix D and the Laplacian matrix L .

  • 4. Compute the uncorrelated locality preserving projections. For concision, let $S_L=XLX^{T}$, $S_D=XDX^{T}$, and $I=\operatorname{diag}(1,1,\dots,1)$. Let $\{\phi_1,\phi_2,\dots,\phi_k\}$ be the uncorrelated locality preserving projections; we define

    10

    $$\Phi^{(k-1)}=[\phi_1,\phi_2,\dots,\phi_{k-1}]^{T}.$$

    The uncorrelated locality preserving projections $\{\phi_1,\phi_2,\dots,\phi_k\}$ can be computed iteratively as follows:

    • (a) Compute $\phi_1$ as the eigenvector of $S_D^{-1}S_L$ associated with the smallest eigenvalue.

    • (b) Compute ϕk as the eigenvector associated with the smallest eigenvalue of the eigenfunction

      11

      $$P^{(k)}S_L\phi=\lambda S_D\phi,$$
      where

      12

      $$P^{(k)}=I-\Xi\big(\Phi^{(k-1)}\big)^{T}\big(\Phi^{(k-1)}\Xi S_D^{-1}\Xi(\Phi^{(k-1)})^{T}\big)^{-1}\Phi^{(k-1)}\Xi S_D^{-1}.$$

  • 5. Perform the ULPP transformation. Let $\Phi^{*}=[\phi_1,\phi_2,\dots,\phi_d]$. The embedding is as follows:

    13

    $$v=\Phi_{\mathrm{ULPP}}^{T}u,$$
    where $v$ is a $d$-dimensional representation of $u$ and $\Phi_{\mathrm{ULPP}}=\Phi_{\mathrm{PCA}}\Phi^{*}$.
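The iterative procedure of step 4 can be sketched as follows. This is a minimal illustration under our own assumptions: the data are taken to be already PCA-projected so that $S_D$ is invertible, `np.cov` stands in for the covariance $\Xi$, a generic generalized eigensolver is used for Eq. 11, and each direction is rescaled to satisfy $\phi^{T}S_D\phi=1$.

```python
import numpy as np
from scipy.linalg import eig

def ulpp_directions(X, W, d):
    """Sketch of step 4: iteratively compute d uncorrelated locality
    preserving projections. Illustrative only; X (n x N, one sample per
    column) is assumed already PCA-projected so S_D is nonsingular."""
    Dg = np.diag(W.sum(axis=1))
    SL = X @ (Dg - W) @ X.T          # S_L = X L X^T
    SD = X @ Dg @ X.T                # S_D = X D X^T
    Xi = np.cov(X)                   # covariance matrix of the n variables
    SD_inv = np.linalg.inv(SD)
    n = X.shape[0]
    phis = []
    for k in range(d):
        if k == 0:
            P = np.eye(n)            # phi_1: plain LPP eigenproblem (step a)
        else:
            C = np.column_stack(phis)   # columns phi_1 .. phi_{k-1}
            A = C.T @ Xi @ SD_inv @ Xi @ C
            # P^(k) of Eq. 12, written with column-stacked directions C
            P = np.eye(n) - Xi @ C @ np.linalg.inv(A) @ C.T @ Xi @ SD_inv
        # smallest eigenvalue of P^(k) S_L phi = lambda S_D phi (Eq. 11)
        lam, V = eig(P @ SL, SD)
        phi = np.real(V[:, np.argmin(np.real(lam))])
        phi /= np.sqrt(phi @ SD @ phi)  # keep phi^T S_D phi = 1
        phis.append(phi)
    return np.column_stack(phis)
```

The final embedding of step 5 is then simply `(Phi_pca @ ulpp_directions(X, W, d)).T @ u` for a new instance `u`.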

3.1.

Justification

In this subsection, we provide the justification of our algorithm.

According to the following theorem, we can compute the k-th uncorrelated direction $\phi_k$ as the eigenvector of Eq. 11 associated with the smallest eigenvalue, with $P^{(k)}$ given by Eq. 12.

Theorem 2: The k-th uncorrelated direction $\phi_k$ is the eigenvector corresponding to the smallest eigenvalue of the following eigenfunction:

$$P^{(k)}S_L\phi=\lambda S_D\phi,$$
where
$$P^{(k)}=I-\Xi\big(\Phi^{(k-1)}\big)^{T}\big(\Phi^{(k-1)}\Xi S_D^{-1}\Xi(\Phi^{(k-1)})^{T}\big)^{-1}\Phi^{(k-1)}\Xi S_D^{-1}.$$

Proof: Recall the minimization problem of LPP:

$$\arg\min_{\Phi:\;\Phi^{T}XDX^{T}\Phi=I}\;\operatorname{trace}\big(\Phi^{T}XLX^{T}\Phi\big).$$
In order to compute the k-th uncorrelated direction $\phi_k$, we add the constraints
$$\phi_k^{T}\Xi\phi_i=0\quad(i=1,2,\dots,k-1).$$
We use the method of Lagrange multipliers to incorporate all constraints into the objective. Recalling $S_L=XLX^{T}$ and $S_D=XDX^{T}$, we have

14

$$L(\phi_k)=\phi_k^{T}S_L\phi_k-\lambda\big(\phi_k^{T}S_D\phi_k-1\big)-\sum_{i=1}^{k-1}\gamma_i\,\phi_k^{T}\Xi\phi_i.$$
The minimization is performed by setting the partial derivative of L(ϕk) with respect to ϕk equal to zero:

15

$$2S_L\phi_k-2\lambda S_D\phi_k-\sum_{i=1}^{k-1}\gamma_i\,\Xi\phi_i=0.$$
Multiplying Eq. 15 on the left by $\phi_k^{T}$, we have

16

$$\lambda=\frac{\phi_k^{T}S_L\phi_k}{\phi_k^{T}S_D\phi_k}.$$
Thus, $\lambda$ is exactly the objective value being minimized.

Multiplying Eq. 15 on the left by $\phi_j^{T}\Xi S_D^{-1}$ for $j=1,2,\dots,k-1$ and stacking the resulting equations (the terms $\phi_j^{T}\Xi\phi_k$ vanish by the constraints), we have

17

$$2\begin{bmatrix}\phi_1^{T}\\ \phi_2^{T}\\ \vdots\\ \phi_{k-1}^{T}\end{bmatrix}\Xi S_D^{-1}S_L\phi_k-\begin{bmatrix}\phi_1^{T}\\ \phi_2^{T}\\ \vdots\\ \phi_{k-1}^{T}\end{bmatrix}\Xi S_D^{-1}\Xi\,\big[\phi_1\;\phi_2\;\cdots\;\phi_{k-1}\big]\begin{bmatrix}\gamma_1\\ \gamma_2\\ \vdots\\ \gamma_{k-1}\end{bmatrix}=0.$$
Letting $\Gamma^{(k)}=[\gamma_1,\gamma_2,\dots,\gamma_{k-1}]^{T}$, we obtain

18

$$\Gamma^{(k)}=2\big(\Phi^{(k-1)}\Xi S_D^{-1}\Xi(\Phi^{(k-1)})^{T}\big)^{-1}\Phi^{(k-1)}\Xi S_D^{-1}S_L\phi_k.$$
Since Eq. 15 can be written as

19

$$2S_L\phi_k-2\lambda S_D\phi_k-\Xi\big(\Phi^{(k-1)}\big)^{T}\Gamma^{(k)}=0,$$
substituting Eq. 18 into Eq. 19, we have

20

$$S_L\phi_k-\lambda S_D\phi_k-\Xi\big(\Phi^{(k-1)}\big)^{T}\big(\Phi^{(k-1)}\Xi S_D^{-1}\Xi(\Phi^{(k-1)})^{T}\big)^{-1}\Phi^{(k-1)}\Xi S_D^{-1}S_L\phi_k=0,$$
or in another form

21

$$\Big[I-\Xi\big(\Phi^{(k-1)}\big)^{T}\big(\Phi^{(k-1)}\Xi S_D^{-1}\Xi(\Phi^{(k-1)})^{T}\big)^{-1}\Phi^{(k-1)}\Xi S_D^{-1}\Big]S_L\phi_k=\lambda S_D\phi_k,$$
thus completing the proof.
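The proof hinges on the fact that multiplying $P^{(k)}$ on the left by $\Phi^{(k-1)}\Xi S_D^{-1}$ annihilates it; this is what forces $\Phi^{(k-1)}\Xi\phi_k=0$ whenever $\lambda\neq 0$ in Eq. 11. The identity is easy to verify numerically with random stand-in matrices (our check, not part of the original paper):

```python
import numpy as np

# Numerical check of the deflation identity behind Theorem 2:
# Phi Xi SD^-1 P^(k) = 0, where Phi here holds the k-1 directions as
# columns (so the paper's row-stacked Phi^(k-1) is Phi.T). All matrices
# are random stand-ins; only the algebraic identity is being verified.
rng = np.random.default_rng(0)
n, k = 6, 3
G = rng.normal(size=(n, n))
Xi = G @ G.T                                   # covariance-like (SPD)
H = rng.normal(size=(n, n))
SD = H @ H.T + n * np.eye(n)                   # SPD, like X D X^T
Phi = rng.normal(size=(n, k))                  # stand-in directions (columns)
SD_inv = np.linalg.inv(SD)
A = Phi.T @ Xi @ SD_inv @ Xi @ Phi             # the inverted factor in Eq. 12
P = np.eye(n) - Xi @ Phi @ np.linalg.inv(A) @ Phi.T @ Xi @ SD_inv
residual = Phi.T @ Xi @ SD_inv @ P             # should be the zero matrix
assert np.allclose(residual, 0.0, atol=1e-8)
```

Left-multiplying the eigenfunction $P^{(k)}S_L\phi=\lambda S_D\phi$ by this annihilated factor gives $0=\lambda\,\Phi^{(k-1)}\Xi\phi$, so every eigenvector with a nonzero eigenvalue automatically satisfies the uncorrelatedness constraints.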

4.

Experiment Results

4.1.

Experiments on FERET Data Set

FERET3 is a very popular face database. Face image variations in the FERET database include pose, illumination, facial expression, and aging. In our experiment, we use a subset of the FERET data set containing 72 people with 6 images per individual. All images are aligned by the centers of the eyes and mouth and then scaled to a resolution of 112×92. We randomly select 3 images from each person for training, and the remaining images of each individual are used for testing. The experiments are repeated 20 times, and the average accuracies for rank 1 to rank 3 are recorded in Table 1. The results show that ULPP gives the best performance. The recognition accuracy of ULPP increases from 85.90% at rank 1 to 91.07% at rank 3, which is about 10% higher than the recognition accuracy of LPP.

Table 1

Recognition accuracy (%) comparison on the FERET data set.

          Eigenface   LPP     ULPP
Rank 1    58.61       72.64   85.90
Rank 2    63.70       78.22   89.49
Rank 3    68.22       81.39   91.07

4.2.

Experiments on AR Data Set

AR4 is also a popular face database. Face image variations in the AR database include illumination, facial expression, and occlusion. In our experiment, 952 images (119 individuals, 8 images per person), which involve large variations in facial expression, are selected. We manually cropped the face portion of the images. All the cropped images are aligned at the centers of the eyes and mouth and then normalized to a resolution of 112×92. We randomly select 4 images from each person for training, and the remaining images of each individual are used for testing. The experiments are repeated 30 times, and the average accuracies for rank 1 to rank 3 are recorded in Table 2. From Table 2, it can be seen that ULPP outperforms the other two methods. The recognition accuracy of ULPP increases from 80.32% at rank 1 to 86.51% at rank 3, which is about 5% higher than the recognition accuracy of LPP.

Table 2

Recognition accuracy (%) comparison on the AR data set.

          Eigenface   LPP     ULPP
Rank 1    70.58       72.42   80.32
Rank 2    79.07       77.96   84.44
Rank 3    83.40       80.43   86.51

5.

Conclusion

This paper presents a novel feature extraction method, called uncorrelated locality preserving projection (ULPP). ULPP extracts uncorrelated features, which contain minimum redundancy and are highly desirable for many applications. In this paper, we present a theoretical study of ULPP and provide an implementation of it. The experiments on face image data show the superiority of ULPP over both PCA and LPP.

Acknowledgment

This project is partially supported by the National Natural Science Foundation of China (60502042) and the 2010 Shanghai EXPO Special Project (2004BA908B07, 04DZ05807).

References

1. X. He, S. Yan, Y. Hu, P. Niyogi, and H. Zhang, "Face recognition using Laplacianfaces," IEEE Trans. Pattern Anal. Mach. Intell. 27(3), 328–340 (2005).

2. J. Ye, R. Janardan, Q. Li, and H. Park, "Feature extraction via generalized uncorrelated linear discriminant analysis," in Proc. 21st Int. Conf. on Machine Learning, Banff, Alberta, Canada (2004).

3. P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, "The FERET evaluation methodology for face-recognition algorithms," IEEE Trans. Pattern Anal. Mach. Intell. 22(10), 1090–1104 (2000).

4. A. M. Martinez and R. Benavente, "The AR face database," CVC Technical Report #24 (June 1998).

Haitao Zhao, Shao-yuan Sun, Zhongliang Jing, "Local-information-based uncorrelated feature extraction," Optical Engineering 45(2), 020505 (1 February 2006). https://doi.org/10.1117/1.2163873