We propose an invariant description method based on Zernike moments to classify hand vein patterns from raw infrared (IR) images. Orthogonal moments provide linearly independent descriptors and are invariant to affine transformations such as translation, rotation, and scaling. A mathematical expression is given to derive a set of moment invariants. The obtained features have all the properties of moment invariants with the additional property of image contrast invariance. For dorsal hand vein pattern acquisition, an IR imaging system is implemented; a public database is also used for a palm vein recognition task. A correct rate classification (CRC) above 99.9% is achieved using a set of rotation, scale, and intensity Zernike moment invariants. Additionally, multilayer perceptron and K-nearest neighbors classifiers are applied with the normalized Zernike moments as input data. A discriminative feature evaluation of the image moments allows the number of descriptors to be reduced while maintaining a high classification rate of 99%. The efficiency of the moment descriptors is evaluated in terms of accuracy and reduced computational cost by (a) avoiding the necessity of a preprocessing stage and (b) reducing the feature vector dimension. Experimental results show that Zernike moment invariants are able to achieve hand vein recognition without image preprocessing or image normalization with respect to changes of size, rotation, and intensity.

## 1.

## Introduction

Biometric technology has been used to accurately determine an individual’s identity based on physical, chemical, or behavioral attributes.^{1} Human identification through hand vein patterns is a technique that appeared in 1990 and has been studied since then by several researchers. Typically, the vein pattern images used for personal recognition cover zones of interest such as the fingers,^{2}^{–}^{4} palmar region,^{5}^{–}^{7} dorsum of the hand,^{8}^{–}^{11} forearm, and wrist.^{12}^{,}^{13} Most hand vein recognition systems^{4}^{,}^{6}^{,}^{8}^{–}^{14} require four steps: (1) image acquisition, (2) preprocessing of the digital images to define the region of interest (RoI), (3) feature extraction of the hand pattern, and (4) classification/matching, as shown in Fig. 1.

To extract vein features, reference points^{15} such as bifurcations and end-points of veins have been computed from a segmented and enhanced image using morphological operators and contrast enhancement techniques, respectively. Under ideal imaging and preprocessing conditions, reference points can easily be extracted from the image skeleton. However, skeletons extracted from vein images are often unstable because the raw vein images suffer from low contrast. Feature extraction methods like the histogram of oriented gradients^{16} and the scale-invariant feature transform^{17} are often used as descriptors of orientation, scale, and intensity for vein patterns. However, they are not robust to noise and are only partially invariant to translation, rotation, scale, and intensity (TRSI). Also, their descriptor vectors are large and of variable size, which complicates classification. Moreover, both techniques require high computational time.^{18} On the other hand, the local binary pattern (LBP)^{14}^{,}^{19} algorithm has been used for vein recognition, but whenever there are spatial and contrast changes during image acquisition, the performance of this description technique decreases.^{19}

Moment invariants also have been implemented for vein pattern description. Xueyan et al.^{2} extract finger-vein pattern features with modified Hu moment invariants, which are computed from reconstructed images by dyadic wavelet transform. Li et al.^{3} use Zernike moments to describe shape features of preprocessed finger-vein images. In these last works, a preprocessing stage is carried out to deal with spatial distortions and contrast changes in the input images. These procedures can be time consuming and require computing resources during the image geometric corrections related to scale, translation, rotation, and the radiometric normalization.

In this paper, we describe hand vein pattern images by a set of features invariant to TRSI transformations. Zernike orthogonal moments defined in polar coordinates^{20} are used for invariant feature extraction from the raw biometric data, following the bottom stream shown in Fig. 1. The Zernike moment technique achieves high accuracy without requiring a preprocessing stage. The main advantage of this approach is that vein features based on Zernike orthogonal moments carry a minimum amount of redundant information, are invariant to spatial and radiometric transformations, and are robust to noise.^{21}

In this work, each raw biometric input image is described by a pattern vector ${\psi}_{1},{\psi}_{2},\dots ,{\psi}_{\chi}$ obtained from the selected TRSI moment invariants. The classification step is done in the feature space, which results in a stable CRC curve. Furthermore, four different classifiers are used: K-nearest neighbors (KNN), multilayer perceptron (MP),^{22} Bayesian (BN),^{22} and naive Bayesian (NB) networks.^{22} These classifiers have been shown to perform well, obtaining a CRC over 99%.

In this paper, sections are organized as follows: Sec. 2 shows a scheme of the implemented infrared (IR) imaging system for vein pattern image acquisition. The public database used is also described. Section 3 defines a set of TRSI invariant descriptors based on Zernike orthogonal moments. Experimental results of four different classifiers that use a discriminative metric to select invariant descriptors are given in Sec. 4. Finally, the conclusions are presented in Sec. 5.

## 2.

## Infrared Imaging System

## 2.1.

### Home Database

The hand vein pattern is an interconnected network of blood vessels located underneath human skin. The vein pattern structure lies approximately 2.5 to 3.0 mm deep, in the subcutaneous layer. In the 700 to 900 nm range, IR light penetrates the skin deeply, reaching the blood vessels located in subcutaneous tissue.^{23}^{,}^{24} Vein detection through near-IR (NIR) light is based on the absorption of IR radiation by principal blood components such as oxyhemoglobin, deoxyhemoglobin, and water.^{24}^{–}^{26} Under IR illumination, we obtain an image in which the veins appear darker than the surrounding tissue.

For vein pattern image acquisition, a JAI progressive-scanning multispectral 2CCD camera was used, which can capture information in the visible and IR channels simultaneously by means of a dichroic prism along the same optical axis.^{27} The visible and NIR sensors’ size is $4.76\ \mathrm{mm}\times 3.57\ \mathrm{mm}$. The spatial resolution of the acquired images is $1024\times 768$ pixels. The wavelengths of the visible channels are approximately 450, 550, and 630 nm, whereas the IR channel is centered around 880 nm. The illumination source has a maximum emission peak at about 880 nm; this IR source contains 60 light-emitting diodes distributed in a concentric circle. Figure 2 shows the implemented image acquisition system.^{28} The field of view of the camera is $\beta =2\,\mathrm{arctan}[{h}_{s}/(2{f}_{l})]=16.38\ \mathrm{deg}$, where ${h}_{o}=20\ \mathrm{cm}$ and ${h}_{s}=3.57\ \mathrm{mm}$ denote the target and sensor sizes, respectively, ${s}_{0}=53\ \mathrm{cm}$ indicates the distance between the target and the camera lens, and ${f}_{l}=25\ \mathrm{mm}$ represents the focal length.

For the home database, volunteers were informed how to put their hands on the base in front of a uniform-colored background so that their knuckles coincided with the edge of the base. During the image capturing, we allowed a certain degree of variations of hand pose. This was done in order to increase intraclass diversity and simulate a real environment application.

The UPT database consists of 576 vascular pattern images from 36 volunteers (19 females and 17 males, aged 20 to 25), from which 8 NIR images of each hand were acquired. Because the vein pattern of the right hand differs from that of the left hand, each hand is treated as a different subject;^{29} therefore, the number of subjects is 72.

## 2.2.

### PolyU Multispectral Palmprint Database

In order to evaluate the feature extraction algorithms, we use the PolyU Multispectral Palmprint Database (PUMPD) from the Biometric Research Center of the Hong Kong Polytechnic University. The database consists of 6000 vascular pattern images from 250 volunteers (55 females and 195 males), from which 24 images of both hands of each volunteer were acquired in four channels (red, green, blue, and NIR).^{30} Again, because the vein patterns of the right and left hands differ, the number of subjects is 500 from 250 volunteers. Some images from the PolyU database are shown in Fig. 3.

## 3.

## TRSI Zernike Moment Invariants

Image representation through characteristic descriptors is the main objective in this section. Moment invariants are widely used in pattern recognition because they can effectively characterize an image in a general way through a small set of moments^{31}^{,}^{32} and are invariant to the most common affine TRS transformations that an image undergoes. Additionally, orthogonal moments are robust to noise presence.^{21} Invariant moments proposed in this work are based on Zernike polynomials.

## 3.1.

### Affine Transformations

Imaging conditions can cause the vein pattern image to change. According to Flusser et al., imaging conditions are commonly imperfect, so the observed image represents a degraded version of the original scene.^{21} Degradations in a digital image can be radiometric and/or geometric. A common geometric spatial transform is the affine transformation, which can be represented by means of the following transformation matrix:^{33}

## Eq. (2)

$$\begin{pmatrix}{x}^{\prime}\\ {y}^{\prime}\end{pmatrix}=\left[\begin{pmatrix}{c}_{x}& 0\\ 0& {c}_{y}\end{pmatrix}\begin{pmatrix}\mathrm{cos}(\alpha )& \mathrm{sin}(\alpha )\\ -\mathrm{sin}(\alpha )& \mathrm{cos}(\alpha )\end{pmatrix}\right]\begin{pmatrix}x\\ y\end{pmatrix}+\begin{pmatrix}{t}_{x}\\ {t}_{y}\end{pmatrix},$$

## 3.2.

### Zernike Orthogonal Moments on a Unitary Disk

Let $f({r}_{i,j},{\theta}_{i,j})$ be an $M\times N$ gray-level image defined in discrete polar coordinates: ${r}_{i,j}=\sqrt{{x}_{j}^{2}+{y}_{i}^{2}}$ and ${\theta}_{i,j}=\mathrm{arctan}({y}_{i}/{x}_{j})$, for ${x}_{j}=a+\frac{j\cdot (b-a)}{N-1}$, ${y}_{i}=b-\frac{i\cdot (b-a)}{M-1}$, $i=0,\dots ,M-1$, and $j=0,\dots ,N-1$. Parameters $a$ and $b$ are real numbers and take values according to a suitable domain inside (or outside) the unit circle $|r|\le 1$.^{34}
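This polar mapping can be sketched in NumPy; a minimal version with our own function name, assuming the symmetric inner-disk domain $a=-1$, $b=1$:

```python
import numpy as np

def polar_grid(M, N, a=-1.0, b=1.0):
    """Map an M x N pixel grid onto discrete polar coordinates (r, theta).

    Follows the paper's convention: x_j = a + j*(b-a)/(N-1) and
    y_i = b - i*(b-a)/(M-1), so rows run top-down from +b to +a.
    """
    x = a + np.arange(N) * (b - a) / (N - 1)   # column coordinates x_j
    y = b - np.arange(M) * (b - a) / (M - 1)   # row coordinates y_i
    X, Y = np.meshgrid(x, y)                   # X[i, j] = x_j, Y[i, j] = y_i
    r = np.hypot(X, Y)
    theta = np.arctan2(Y, X)                   # arctan(y/x) with the correct quadrant
    return r, theta

r, theta = polar_grid(256, 256)  # pixels with r > 1 fall outside the unit disk
```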

The 2-D discrete Zernike moments of radial order $n$ and angular repetition $l$ are as follows:^{20}

## Eq. (3)

$${Z}_{n,l}=\frac{n+1}{\pi}\sum _{i=0}^{M-1}\sum _{j=0}^{N-1}f({r}_{i,j},{\theta}_{i,j})\cdot {R}_{n,l}({r}_{i,j})\cdot {e}^{-\mathrm{i}\,l{\theta}_{i,j}},$$

## Eq. (4)

$${R}_{n,l}(r)=\sum _{s=0}^{\frac{n-|l|}{2}}{(-1)}^{s}\frac{(n-s)!}{s!\left(\frac{n+|l|}{2}-s\right)!\left(\frac{n-|l|}{2}-s\right)!}\,{r}^{n-2s}.$$

The number of Zernike moments can be computed using the following expression given by^{35}
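The expression itself is given in Ref. 35; the count can also be obtained directly by enumerating the admissible index pairs, since ${R}_{n,l}$ exists only when $n-|l|$ is even. A short sketch (the function name is ours), consistent with the $\chi ({n}_{\mathrm{max}})$ column of Table 2:

```python
def num_zernike(n_max):
    """Count Zernike moments Z_{n,l} with 0 <= l <= n <= n_max and n - l even."""
    return sum(1 for n in range(n_max + 1)
                 for l in range(n + 1)
                 if (n - l) % 2 == 0)

# Matches the chi(n_max) column of Table 2
assert [num_zernike(n) for n in (6, 8, 10, 12, 14, 16, 18)] == [16, 25, 36, 49, 64, 81, 100]
```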

## 3.2.1.

#### TRSI invariant descriptors based on Zernike moments

A set of Zernike moment invariants is given as follows:

• For the translation-invariant description, let the origin of a coordinate system be located at the image centroid $({x}_{c}={m}_{\mathrm{1,0}}/{m}_{\mathrm{0,0}},{y}_{c}={m}_{\mathrm{0,1}}/{m}_{\mathrm{0,0}})$. It can be calculated from the zero-order geometric moment ${m}_{\mathrm{0,0}}$ of a binary image and the first-order geometric moments ${m}_{\mathrm{1,0}}$ and ${m}_{\mathrm{0,1}}$.

• If an image object $f(r,\theta )$ is rotated as ${f}^{\prime}(r,\theta -\alpha )$, where $\alpha $ is the rotation angle, its corresponding moments are ${Z}_{n,l}^{R}({f}^{\prime})={Z}_{n,l}(f){e}^{-i\alpha l}$. The magnitude-based method^{36} is used for rotation invariance, for which $|{Z}_{n,l}^{R}({f}^{\prime})|=|{Z}_{n,l}(f)|$.

• If an image object $f(r,\theta )$ is scaled as ${f}^{\prime}(r/c,\theta )$, the scaling factor $c$ can be computed using $c=\sqrt{{m}_{\mathrm{0,0}}({f}^{\prime})/{m}_{\mathrm{0,0}}(f)}$. Letting $n=l+\wp $ in Eq. (3), the invariants to image rotation and scaling are^{37}

## Eq. (6)

$${\psi}_{l+\wp ,l}(f)=\sum _{t=0}^{\wp}\frac{l+\wp +1}{l+t+1}\left(\sum _{\kappa =t}^{\wp}{({\mathrm{\Gamma}}_{{f}^{\prime}})}^{-(l+\kappa +2)}{C}_{\wp ,\kappa}^{l}\cdot {D}_{\kappa ,t}^{l}\right)\cdot |{Z}_{l+t,l}({f}^{\prime})|,$$

where ${\mathrm{\Gamma}}_{{f}^{\prime}}=\sqrt{|{Z}_{\mathrm{0,0}}({f}^{\prime})|}$, ${C}_{\wp ,\kappa}^{l}={(-1)}^{\wp -\kappa}\cdot \frac{(2l+\wp +1+\kappa )!}{\kappa !(\wp -\kappa )!(2l+1+\kappa )!}$, and ${D}_{\kappa ,t}^{l}=\frac{(2l+2t+2)\kappa !(2l+\kappa +1)!}{(\kappa -t)!(2l+\kappa +t+2)!}$, for $0\le t\le \kappa \le \wp $.

• If the intensity distribution of an image $f(r,\theta )$ is changed as $kf(r,\theta )$, the intensity factor $k$ can be obtained using $k=\frac{1}{{c}^{2}}[{Z}_{\mathrm{0,0}}({f}^{\prime})/{Z}_{\mathrm{0,0}}(f)]$, using ${m}_{\mathrm{0,0}}={Z}_{\mathrm{0,0}}/\pi $.^{38}

If $\wp =0$ and $l=1,2,3,\dots $, then the proposed $n=l$ TRSI Zernike moment invariants are given by
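These invariance properties can be checked numerically. The sketch below implements Eqs. (3) and (4) with our own helper names and samples an arbitrary smooth test pattern, a rotated copy, and a contrast-scaled copy on the same grid; the moment magnitudes agree up to discretization error, and the intensity factor $k$ is recovered exactly:

```python
import numpy as np
from math import factorial

def R(n, l, r):
    """Zernike radial polynomial R_{n,l}(r) of Eq. (4), for l >= 0."""
    return sum((-1) ** s * factorial(n - s)
               / (factorial(s) * factorial((n + l) // 2 - s)
                  * factorial((n - l) // 2 - s)) * r ** (n - 2 * s)
               for s in range((n - l) // 2 + 1))

def Z(f, r, theta, n, l):
    """Discrete Zernike moment Z_{n,l} of Eq. (3), summed over the unit disk."""
    m = r <= 1.0
    return (n + 1) / np.pi * np.sum(f[m] * R(n, l, r[m]) * np.exp(-1j * l * theta[m]))

# Sampling grid and an arbitrary smooth pattern that vanishes at the disk edge
x = np.linspace(-1.0, 1.0, 256)
X, Y = np.meshgrid(x, x)
r, theta = np.hypot(X, Y), np.arctan2(Y, X)
g = lambda rr, tt: (1 - np.minimum(rr, 1.0) ** 2) ** 2 * (1 + 0.5 * np.cos(3 * tt))

alpha = np.deg2rad(40.0)
f0 = g(r, theta)           # original image
f1 = g(r, theta - alpha)   # same pattern rotated by alpha
f2 = 1.2 * f0              # uniform intensity change, k = 1.2

z0, z1 = Z(f0, r, theta, 3, 3), Z(f1, r, theta, 3, 3)
# Rotation only multiplies Z_{n,l} by exp(-i*alpha*l): the magnitude is preserved
assert abs(abs(z0) - abs(z1)) < 1e-2 * abs(z0)
# The intensity factor k follows from the ratio of Z_{0,0} moments (here c = 1)
k = (Z(f2, r, theta, 0, 0) / Z(f0, r, theta, 0, 0)).real
assert abs(k - 1.2) < 1e-9
```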

## 3.3.

### Numerical Experiments

In this subsection, the TRSI Zernike moment invariants are tested on a set of artificially distorted images, shown in Fig. 4. The values of the moment invariants were computed for each of these images using Eq. (7), and the logarithm of the values was taken to reduce the dynamic range. The TRSI moment invariants of the $i=1,\dots ,10$ distorted images of Fig. 4 are given in Table 1 and graphed in Fig. 5.

## Table 1

TRSI moment invariants for the i=1,…,10 images of Fig. 4.

| Sample image | Spatial distortions and contrast changes | $\tilde{\psi}_{1,1}$ | $\tilde{\psi}_{2,2}$ | $\tilde{\psi}_{3,3}$ | $\tilde{\psi}_{4,4}$ | $\tilde{\psi}_{5,5}$ | $\tilde{\psi}_{6,6}$ | $\tilde{\psi}_{7,7}$ | $\tilde{\psi}_{8,8}$ | $\tilde{\psi}_{9,9}$ | $\tilde{\psi}_{10,10}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | $\alpha =0$, $c=1$, $k=1$ | 6.85 | 11.99 | 19.38 | 23.95 | 29.87 | 35.39 | 41.15 | 46.75 | 52.83 | 58.49 |
| 2 | $\alpha =40\ \mathrm{deg}$, $c=1$, $k=1$ | 6.85 | 11.98 | 19.37 | 23.95 | 29.86 | 35.39 | 41.14 | 46.75 | 52.83 | 58.50 |
| 3 | $\alpha =240\ \mathrm{deg}$, $c=0.9$, $k=1.2$ | 6.97 | 11.98 | 19.33 | 23.93 | 29.88 | 35.42 | 41.18 | 46.79 | 52.87 | 58.51 |
| 4 | $\alpha =0$, $c=0.7$, $k=1$ | 6.85 | 12.00 | 19.43 | 23.97 | 29.90 | 35.43 | 41.20 | 46.80 | 52.89 | 58.56 |
| 5 | $\alpha =280\ \mathrm{deg}$, $c=0.8$, $k=1$ | 6.85 | 12.00 | 19.41 | 23.96 | 29.89 | 35.41 | 41.18 | 46.78 | 52.88 | 58.54 |
| 6 | $\alpha =200\ \mathrm{deg}$, $c=1$, $k=1$ | 6.85 | 11.99 | 19.38 | 23.95 | 29.87 | 35.39 | 41.15 | 46.75 | 52.83 | 58.49 |
| 7 | $\alpha =120\ \mathrm{deg}$, $c=1.1$, $k=1$ | 6.85 | 11.98 | 19.38 | 23.94 | 29.86 | 35.40 | 41.15 | 46.75 | 52.83 | 58.49 |
| 8 | $\alpha =160\ \mathrm{deg}$, $c=1.2$, $k=0.8$ | 6.91 | 11.94 | 19.17 | 23.89 | 29.81 | 35.32 | 41.09 | 46.64 | 52.75 | 58.32 |
| 9 | $\alpha =0$, $c=1.3$, $k=1$ | 6.85 | 11.98 | 19.36 | 23.94 | 29.87 | 35.39 | 41.15 | 46.75 | 52.82 | 58.47 |
| 10 | $\alpha =0$, $c=1$, $k=1.1$ | 6.83 | 11.99 | 19.47 | 23.95 | 29.88 | 35.41 | 41.14 | 46.76 | 52.82 | 58.51 |
| $\sigma $ |  | 0.0420 | 0.0170 | 0.0797 | 0.0216 | 0.0242 | 0.0299 | 0.0298 | 0.0437 | 0.0395 | 0.0646 |

An image that undergoes uniform contrast variation $k$, like those that are shown in Figs. 6(a)–6(c), can be represented equivalently by scaling of the intensity function.^{38} Figures 6(d)–6(f) exemplify the processes of intensity normalization using the factor $k$.

However, in this work, the $k$ factor is used to normalize the descriptors in intensity, not the raw biometric data; the normalization factor $k$ is applied in Eq. (7).

## 4.

## Experimental Results

The classification stage is carried out in the space of descriptors obtained with the feature extraction techniques described in Sec. 3. During this stage, the input images are transformed from raw biometric data to Zernike moments. Afterward, by means of Eqs. (6) and (7), a set of descriptors is obtained. This converts the image of $M\times N$ pixel values into a pattern vector composed of the first $\chi $ TRSI Zernike moment invariants. This method was applied to the PUMPD and also to our home database. A 3-D space of descriptors based on TRSI Zernike moment invariants is shown in Fig. 7(a).

Although some images include extra information about the hand, such as parts of the thumb, the wrist, or scars, it can be seen that each class forms a cluster because the Zernike moments are invariant to affine transformations and illumination changes. Some points in the graph are slightly scattered from their respective clusters. This dispersion occurs because the input images suffer from perspective deformations due to a nonperpendicular view (for example, shearing) during image acquisition.

As we can see, similar samples are grouped in closer proximity to each other. Nearly identical or identical samples are placed in the same cluster.

## 4.1.

### Discriminative Feature Selection Algorithm

Since feature selection is important for building an efficient classifier, a discriminative metric is implemented.^{39} It evaluates the effectiveness of a given moment invariant by means of the formula:^{40}

## Eq. (8)

$$Q(|{\tilde{\psi}}_{n,l}|,{S}_{i},{S}_{j})=\frac{\eta [\sigma ({S}_{i},|{\tilde{\psi}}_{n,l}|)+\sigma ({S}_{j},|{\tilde{\psi}}_{n,l}|)]}{m({S}_{i},|{\tilde{\psi}}_{n,l}|)-m({S}_{j},|{\tilde{\psi}}_{n,l}|)},$$

## 4.2.
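A minimal sketch of this metric, taking $\eta $ as a scaling constant equal to 1 (the function name and toy values are ours): a smaller $|Q|$ means the within-class spread is small relative to the separation of the class means, i.e., a more discriminative invariant, as in Fisher-style separability criteria.

```python
import numpy as np

def discriminative_q(feature_i, feature_j, eta=1.0):
    """Eq. (8): within-class spread over between-class mean separation for one
    moment invariant evaluated on the samples of two classes S_i and S_j."""
    spread = np.std(feature_i) + np.std(feature_j)          # sigma(S_i) + sigma(S_j)
    separation = np.mean(feature_i) - np.mean(feature_j)    # m(S_i) - m(S_j)
    return eta * spread / separation

# Toy example (hypothetical values): a well-separated invariant vs. a noisy one
good = discriminative_q(np.array([6.85, 6.86, 6.84]), np.array([7.90, 7.91, 7.89]))
poor = discriminative_q(np.array([6.5, 7.4, 6.9]), np.array([6.8, 7.6, 7.1]))
assert abs(good) < abs(poor)  # the first invariant discriminates the classes better
```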

### Correct Rate Classification Using TRSI Zernike Moment Invariants

In this work, we use WEKA software, which is commonly used as a test platform to measure the classification capacity of several well-known pattern recognition models, such as MP, BN, NB, and KNN.^{22} All of the percentages shown in this work were calculated through cross-validation. From this point of view, an experimental comparison assesses the ability of TRSI Zernike moment invariants of Eqs. (6) and (7) for vein pattern recognition using the raw biometric data.
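The classification itself runs in WEKA; purely for illustration, a cross-validated KNN evaluation of the CRC on a matrix of invariant vectors can be sketched in NumPy as follows (leave-one-out with $K=1$; the data here are synthetic, not the paper's):

```python
import numpy as np

def loo_knn_crc(features, labels):
    """Leave-one-out 1-nearest-neighbor correct rate classification (CRC), in %.

    features: (n_samples, chi) matrix of TRSI invariant pattern vectors.
    labels:   (n_samples,) class identifiers.
    """
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels)
    # Pairwise squared Euclidean distances between all sample vectors
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)          # exclude each sample from its own fold
    predicted = y[np.argmin(d2, axis=1)]  # label of the nearest other sample
    return 100.0 * np.mean(predicted == y)

# Tiny synthetic example: two well-separated classes in a 3-D feature space
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (8, 3)), rng.normal(1.0, 0.1, (8, 3))])
y = np.array([0] * 8 + [1] * 8)
crc = loo_knn_crc(X, y)
```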

## 4.2.1.

#### Set 1: PUMPD database

Let $w=({w}_{1},{w}_{2},\dots ,{w}_{W})$ with $W=500$ pattern classes of the PUMPD public database. For each class ${w}_{k}$, there are 12 acquired versions in the testing dataset. The $\chi $-dimensional pattern vector $\tilde{\psi}=({\tilde{\psi}}_{1},{\tilde{\psi}}_{2},\dots ,{\tilde{\psi}}_{\chi})$ is based on the TRSI Zernike moment invariants of Eqs. (6) and (7) for maximum order ${n}_{\mathrm{max}}={l}_{\mathrm{max}}$. The classification results for order ${n}_{\mathrm{max}}=18$ with $\chi =100$ TRSI Zernike moment invariants are shown in Fig. 8(a). Furthermore, Fig. 8(b) shows the CRC percentage as a function of the number of descriptors used in the classification stage, with MP as the classifier.

From Table 2, it can be seen that the MP classifier achieves a CRC above 99% using as few as $\chi =16$ invariant descriptors.

## Table 2

CRC classification results for the PUMPD database.

| $n_{\max}$ | $\chi(n_{\max})$ Zernike moment invariants | KNN | BN | NB | MP |
|---|---|---|---|---|---|
| 6 | 16 | 98.97 | 97.57 | 98.37 | 99.35 |
| 8 | 25 | 99.42 | 98.92 | 99.02 | 99.52 |
| 10 | 36 | 99.50 | 99.13 | 99.03 | 99.57 |
| 12 | 49 | 99.48 | 99.28 | 99.00 | 99.52 |
| 14 | 64 | 99.58 | 99.30 | 98.85 | 99.55 |
| 16 | 81 | 99.97 | 99.30 | 98.93 | 99.50 |
| 18 | 100 | 99.57 | 99.33 | 98.82 | 99.42 |

The receiver operating characteristic (ROC) curves of the four tested models show high performance; MP displays slightly better results, as shown in Fig. 9. The area under the ROC curve confirms that MP performs best, with an area of 0.9577, followed by KNN with 0.9523, NB with 0.9457, and finally BN with 0.9338.

Using the discriminative feature metric of Eq. (8), a set of ${\chi}^{\text{selected}}$ TRSI Zernike moment invariants was selected for the classification task. The results are shown in Fig. 10; through the selection of Zernike descriptors (from $\chi =100$ to ${\chi}^{\text{selected}}=46$), the input data to the classifiers are reduced by 54%. In this case, the CRC drops by less than 1%.

Since the first stage of a recognition system traditionally applies image processing methods to enhance information about the potential objects of interest in the scene, most of the papers in Table 3 use such procedures to enhance and normalize the original input images. Conversely, the proposed method analyzes the parametric space of geometric and radiometric image degradations. This method excludes contrast enhancement, the RoI extraction stage, and image normalization. Moreover, our method is robust to noise and uses a minimal number of descriptors ($\chi =16$) to obtain a CRC above 99%.

## Table 3

CRC classification results for the PUMPD database.

| Reference | Preprocessing | Feature extraction | CRC |
|---|---|---|---|
| Cao et al.^{41} | RoI extraction, contrast enhancement, multiscale Gaussian matched filter, binarization, noise reduction | Thinning algorithm | Matching score = 99.50% |
| Al-Juboori et al.^{42} | Enhancement filter | Wavelet transform, locality preserving projections (LPP), LBP variance (LBPV) | Euclidean matching = 99.86% |
| Gumaei et al.^{43} | RoI extraction, whitening filter, and contrast normalization | Normalized Gist-based feature extraction and feature reduction using autoencoder | Regularized extreme learning machine = 99.83% |
| Zhang et al.^{44} | RoI extraction | Palmprint feature extraction by texture coding, palm vein feature extraction by matched filters, postprocessing operations to remove some small regions | Score level fusion = 99.69% |
| Zhang et al.^{45} | Visual Geometry Group model F (VGG-F) | Convolutional neural networks (CNN) and vector of locally aggregated descriptors (VLAD) | Equal error rate weighted fusion = 100% |
| Proposed approach | None | TRSI Zernike moment invariants | KNN = 99.97%; BN = 99.33%; NB = 99.03%; MP = 99.57% |

## 4.2.2.

#### Set 2: UPT home database

Let $w=({w}_{1},{w}_{2},\dots ,{w}_{W})$ and $W=72$ pattern classes of the UPT home database. For each class ${w}_{k}$, there are eight acquired versions in the testing dataset. The $\chi $-dimensional pattern vector $\tilde{\psi}=({\tilde{\psi}}_{1},{\tilde{\psi}}_{2},\dots ,{\tilde{\psi}}_{\chi})$ is based on the TRSI Zernike moment invariants using Eqs. (6) and (7) for maximum order $n=l=4$. Figure 11 shows images from the UPT database.

We can observe that, in addition to the geometric and radiometric distortions, the images suffer from perspective deformations due to a nonperpendicular view. Again, some images include extra information about the hand, such as parts of the thumb, the wrist, or scars. In spite of that, Fig. 12(a) shows a CRC above 80% using only $\chi =9$ TRSI moment invariants with order $n=l=4$. An ROC curve using MP for the UPT database is shown in Fig. 12(b); the true-positive rate exceeds the false-positive rate along the entire curve.

Due to the image acquisition conditions (perspective variations and other alterations), this experiment did not reach higher performance rates; nevertheless, the area under the ROC curve is close to 0.7 (0.6783).

## 5.

## Conclusions and Discussion

In practice, factors such as environmental conditions, nonuniform illumination, and hand pose affect the image acquisition stage and increase the presence of spatial distortions and contrast changes in the sensed image. It is well known that the traditional approach to a hand vein recognition system requires RoI extraction followed by data preprocessing such as contrast enhancement, spatial filtering, binarization, and mathematical morphology. Additionally, image normalization with respect to changes of size, translation, rotation, and intensity can be required.

In this paper, we describe all images by a set of normalized features that are invariant with respect to TRSI transformations. Numerical experiments were done using a set of artificially distorted images. Figure 5 and Table 1 show that the values of the proposed TRSI Zernike moment invariants remain within a narrow range across the distortions. This means that the descriptors defined in Eq. (7) have all the properties of TRS Zernike moment invariants with the additional feature of image contrast invariance.

In this way, two experiments were carried out to evaluate the performance of the proposed TRSI Zernike moment invariants on hand vein images without any kind of preprocessing. For the PUMPD database (500 subjects with 12 versions of each subject), an optimized approach that selects ${\chi}^{\text{selected}}=46$ TRSI moment invariants achieves a CRC above 99.52% using MP as a classifier, as can be seen in Fig. 10. The results obtained from real data show that the selected invariant features require a lower computational cost compared with the existing methods listed in Table 3.

On the other hand, for the UPT home database (72 subjects with eight versions of each subject), ${\chi}^{\text{selected}}=9$ selected TRSI moment invariants achieve an 80% classification rate. In this case, in addition to the geometric and radiometric distortions, the images suffer from perspective deformations due to a nonperpendicular view during image acquisition and also include extra information about the hand, such as parts of the thumb, the wrist, or scars. In Fig. 7, we can see that similar samples are grouped in close proximity to each other; nearly identical samples are placed in the same cluster. During the pattern classification process, the recognition system is able to handle changes in the dataset attributable to spatial distortions and extra information. However, since $k$-fold cross-validation is a demanding test for classification models and the UPT home database contains several distortions, the tested MP does not reach high recognition rates. In future work, we propose to add shearing invariants to the TRSI Zernike moment invariants. In addition, more distorted samples for each vein pattern class can be added to the home database.

## Acknowledgments

R.C.-O. thanks the Consejo Nacional de Ciencia y Tecnología (CONACyT), award No. 436298. We extend our gratitude to the reviewers and to Jennifer Speier for their useful suggestions.

## References

## Biography

**Raúl Castro-Ortega** received his bachelor’s degree in computational systems from the Higher Technological Institute of Huauchinango (ITSH) and his master’s degree from the Polytechnic University of Tulancingo (UPT) in 2014 and 2015, respectively. He is a PhD degree student in optomechatronics from UPT. His current research areas include digital image processing, biometrics, and pattern recognition. He is a member of SPIE.

**Carina Toxqui-Quitl** is an assistant professor at the Polytechnic University of Tulancingo. She received her BS degree from the Puebla Autonomous University, Mexico, in 2004. She received her MS and PhD degrees in optics from the National Institute of Astrophysics, Optics, and Electronics in 2006 and 2010, respectively. Her current research areas include image moments, multifocus image fusion, wavelet analysis, and computer vision.

**Alfonso Padilla-Vivanco** received his bachelor’s degree in physics from Puebla Autonomous University, Mexico, and his MS and PhD degrees both in optics from the National Institute of Astrophysics, Optics, and Electronics in 1995 and 1999, respectively. In 2000, he held a postdoctoral position in the Physics Department at the University of Santiago de Compostela, Spain. He is a professor at the Polytechnic University of Tulancingo. His research interests include optical information processing, image analysis, and computer vision.

**Jose Francisco Solís-Villarreal** received his master’s degree in computer science from the Research Center in Information Technologies and Systems (CITIS) of the Autonomous University of Hidalgo State (UAEH) in 2004 and his doctorate in computer science from the Computer Research Center (CIC) of the National Polytechnic Institute (IPN) in 2007. Since 2012, he has been a full-time research professor at the Teotihuacan Valley University Center of the Autonomous University of Mexico State (UAEMex).

**Eber Enrique Orozco-Guillén** received his bachelor’s degree in physics from the University of the Andes, Venezuela, and his MS and PhD degrees, both in optics, from the National Institute of Astrophysics, Optics, and Electronics in 2007 and 2009, respectively. He has 18 years of teaching experience at the undergraduate and postgraduate levels and, since 2011, has been a full-time research professor at the Polytechnic University of Sinaloa (UPSIN), México. His research interests include infrared thermography, image analysis, and diffuse reflectance spectroscopy.