Using the kernel-trick idea and the kernels-as-features idea, we can construct two kinds of nonlinear feature spaces,
in which linear feature extraction algorithms can be employed to extract nonlinear features. Thus, we have two approaches
for transforming an existing linear feature extraction algorithm into its nonlinear counterpart. Rigorous theoretical
analysis has proved that the two approaches are equivalent up to a different scaling of each feature. In this paper, we perform experiments
on several benchmark datasets and present a comparative study of the two kernel ideas applied to representative feature
extraction algorithms, namely linear discriminant analysis and principal component analysis. These results provide a better
understanding of the kernel method.
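The equivalence up to a per-feature scaling can be illustrated with a minimal NumPy sketch for principal component analysis. The RBF kernel, the toy Gaussian data, and the omission of centering in both constructions are simplifying assumptions of this sketch, not details taken from the paper: one approach applies the kernel trick (kernel PCA on the kernel matrix), the other treats the kernel values against all training samples as an explicit feature vector (the empirical kernel map) and runs ordinary linear PCA on it.

```python
import numpy as np

# Toy data; the paper's benchmark datasets are not reproduced here.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
n = X.shape[0]

def rbf(A, B, gamma=0.5):
    # Gaussian RBF kernel matrix between row sets A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

K = rbf(X, X)

# Approach 1: kernel trick (kernel PCA, uncentered for simplicity).
# Feature i for all samples: sqrt(lambda_i) * v_i.
w, V = np.linalg.eigh(K)
idx = np.argsort(w)[::-1][:2]          # top-2 eigenpairs
F_trick = V[:, idx] * np.sqrt(w[idx])

# Approach 2: kernels as features (empirical kernel map).
# Sample j becomes the vector (k(x_j, x_1), ..., k(x_j, x_n)),
# i.e. row j of K; then run linear PCA on these vectors.
C = K.T @ K / n
w2, U = np.linalg.eigh(C)
idx2 = np.argsort(w2)[::-1][:2]
F_feat = K @ U[:, idx2]

# The two feature sets agree up to a scaling (and sign) per feature.
for i in range(2):
    a, b = F_trick[:, i], F_feat[:, i]
    ratio = b @ a / (a @ a)            # least-squares scale factor
    assert np.allclose(b, ratio * a, atol=1e-6)
```

Each column of `F_feat` is proportional to the corresponding column of `F_trick`, so any downstream method that is invariant to per-feature rescaling sees the same information from either construction.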