Iris-ocular-periocular: toward more accurate biometrics for off-angle images

Abstract. Iris is one of the best-known biometrics; it is a nonintrusive and contactless authentication technique with high accuracy, enhanced security, and unique distinctiveness. However, its dependence on image quality and its frontal image acquisition requirement limit its recognition performance and hinder its potential use in standoff applications. Standoff biometric systems require a less controlled environment than traditional systems, so their captured images will likely be nonideal, including off-angle. We present convolutional neural network (CNN)-based deep learning frameworks to improve the recognition performance of iris, ocular, and periocular biometric modalities for off-angle images. Our contribution is fourfold: first, the performances of the popular AlexNet, GoogLeNet, and ResNet50 architectures are presented for off-angle biometrics. Second, we study the effect of the gaze angle difference between training and testing images on iris, ocular, and periocular recognition. Third, we investigate the network behavior for untrained gaze angles and the information fusion capability of CNNs trained with multiple off-angle images. Finally, deep learning-based results are compared with a traditional iris recognition algorithm using the gallery approach. Our results with off-angle images ranging from −50 deg to 50 deg in gaze angle show that the proposed methods improve the recognition performance of iris, ocular, and periocular recognition.


Introduction
The spread of the coronavirus pandemic has affected our daily life and continues to impose changes on our everyday activities and behaviors in society. As protective measures against the virus increase, it has become necessary to make our close-contact day-to-day routines remote and touchless. During the COVID-19 pandemic, biometric systems were required to verify the identity of individuals and validate their credentials at a distance, without requiring them to touch a surface or remove a face mask. Compared with other biometric modalities such as fingerprint, palm, and face, iris stands out as a nonintrusive and contactless authentication method, in addition to its superior accuracy, unique distinctiveness, enhanced security, and unbreakable reliability [1]. However, its dependence on image acquisition conditions limits its accuracy and hinders its potential use in different applications. Traditional iris recognition systems can recognize individuals in a well-controlled environment, so they require capturing high-quality frontal images. Therefore, existing systems show degraded performance for images captured under nonideal conditions, including gaze angle, iris occlusion, and pupil dilation [2].
The development of standoff biometric systems enables the identification of both cooperative and noncooperative individuals in a less constrained environment [3]. Compared with existing systems, subjects have more freedom during data acquisition, and the standoff system can still capture their images while they are moving around at a distance. Since subjects are not required to stand and look at the camera in standoff biometric systems, captured images will likely be nonideal, including off-angle. Figure 1 shows an off-angle image in which the camera axis and the subject's gaze do not overlap. When an off-angle image is compared with frontal images of the same person, traditional iris recognition systems produce a similarity score close to that of images from different subjects.
Traditional iris recognition systems generally adopt a methodology similar to Daugman's initial design [1]. After image acquisition, they follow four steps for iris recognition: (1) segmenting the inner and outer iris boundaries, (2) normalizing the segmented image into dimensionless rectangular polar coordinates, (3) encoding binary features from the normalized images to generate iris codes, and (4) comparing iris codes with the enrolled ones using Hamming distances (HDs). However, existing systems depend on image quality, are less flexible to changes in iris texture, and mostly require parameter adjustment for new datasets. In addition, off-angle iris images introduce several additional challenges to the biometric community, including the refraction of light at the cornea, the appearance of three-dimensional (3D) iris textures, blur due to the loss of depth of field, accommodation of the lens, and limbus occlusion of the iris texture [4]. Therefore, off-angle iris biometrics is an emerging research area that addresses these challenging issues.
As initial attempts, elliptical unwrapping and perspective transformation methods have been proposed to improve the recognition performance using traditional iris frameworks [1,5]. Since these approaches only aim to remove the perspective distortion, they show poor recognition performance for images with gaze differences greater than 30 deg. Figure 2 shows examples of off-angle images captured in the near-infrared (NIR) spectrum from different gaze angles and their normalized iris images using the elliptical unwrapping method. The distortion differences in the normalized iris textures become more visible as the gaze angle increases from the frontal image in Fig. 2(a) to the 50-deg off-angle image in Fig. 2(f). Existing algorithms could not verify that these two images belong to the same individual. Therefore, addressing the challenging degradation factors in off-angle images is required for designing an accurate standoff biometric recognition system.
The development of convolutional neural networks (CNNs) allows researchers in different disciplines to pursue solutions for their difficult problems using deep learning methods. Related to iris biometrics, there are several deep learning-based studies focusing on iris segmentation [6], spoofing attack detection [7], and recognition performance improvement [8,9]. Although these CNN-based works investigated several feature extraction methods from iris texture and achieved performance improvement for nonideal frontal images, they do not address the challenging issues in off-angle images, and they do not report any detailed results for off-angle biometrics.
Instead of focusing on the correction of iris deformations in off-angle images, an alternative approach to improving the recognition performance is to combine the iris with other information available in a captured image, such as the pupil, sclera, eyelashes, skin, and even eyebrows. There exist several studies on other biometric modalities using ocular and periocular structures, including dynamics of eye movements [10,11], shape-based geometric features [12], conjunctival vasculature structures [13], local binary patterns (LBPs) [14], and different filter models to extract features [15,16]. These studies have shown that the fusion of multiple biometric modalities improves the overall recognition performance of biometric systems. In addition, deep-PRWIS [17] suggested using the CNN-based AlexNet architecture for visible spectrum periocular images for better performance by including the periocular structures and ignoring the ocular components. Even if these methods improved the recognition performance for nonideal frontal images, they did not include any results for off-angle images.
This paper presents deep learning frameworks to improve the recognition performance of iris, ocular, and periocular biometric modalities for off-angle images. Our contribution is fourfold: (a) we utilize three off-the-shelf CNN-based frameworks for iris/ocular/periocular biometric modalities using transfer learning, (b) we study the effect of the gaze angle difference on iris/ocular/periocular recognition, (c) we discuss the information fusion capability of CNNs for different off-angle images, and (d) we compare our results with Gabor-based traditional iris recognition frameworks. To the best of our knowledge, this is the first work to investigate the recognition performance of off-angle iris/ocular/periocular systems with respect to their gaze angle. This study helps to design standoff biometric systems that capture images from a distance and at an angle, and that can be used by a variety of applications ranging from passport control to hospital check-in. It may give the opportunity to develop fast and secure standoff biometric systems.
This paper is organized as follows: Sec. 2 summarizes works related to off-angle iris recognition and information fusion approaches for ocular and periocular biometric modalities. The adapted CNN-based deep network architectures are presented in Sec. 3. Section 4 describes the details of the data collection and dataset preparation protocols and presents the baseline iris recognition algorithm to compare with the proposed methods. Section 5 shows the experimental results and discusses the important findings. Finally, we conclude in Sec. 6.

Related Works
The biometrics literature includes several studies related to iris recognition. They have mostly contributed to designing new methods to improve the recognition performance and have proposed solutions for occlusion, refraction, variations in illumination, and blur problems in frontal images. The majority of the recent research ignores the challenging issues in standoff biometrics and off-angle images. Although some studies address problems in nonideal images, these are mainly frontal images, and only a few studies of off-angle images exist.
For off-angle iris recognition, Daugman revised his initial circle detection method, which uses the integrodifferential operator, with an active contour algorithm in which an ellipse is fitted to both the pupil and iris boundaries [1]. Then, the geometric distortion of the off-angle iris images is corrected using an affine transformation. Since a two-dimensional (2D) affine transformation is inadequate to project the 3D iris texture, it offers a limited improvement for off-angle iris images. For distortion compensation in off-angle images, a brute-force method [5] was proposed to project off-angle images into their frontal perspective for every possible angle. Instead of frontal projection of off-angle images, the elliptical unwrapping method can be used to normalize off-angle iris images directly from the elliptical segmentation of the pupil and iris boundaries [18]. While both the perspective projection and the elliptical normalization methods correct some degree of geometric deformation in off-angle images and increase the identification accuracy, their improvement is limited beyond 30 deg in angle because they do not address the challenging issues in off-angle iris images, such as the refraction of light at the cornea and iris occlusion due to the limbus.
To address these degradation factors of off-angle images, several works propose solutions within the traditional iris recognition pipeline. For example, Santos-Villalobos et al. [19] presented a frontal iris reconstruction method using ray tracing in a biometric eye model to compensate for the effect of corneal refraction. Although it proved that corneal refraction elimination is possible for synthetic off-angle images, its correction of real off-angle iris images is limited due to other challenges in off-angle images, such as limbus occlusion. Karakaya et al. [20] reported the "limbus effect" in off-angle iris images, in which the semitransparent limbus occludes iris texture at the sclera-iris boundaries and causes segmentation differences between frontal and off-angle images. Another study on off-angle iris images [21] showed that lens accommodation affects iris surface curvature and results in a performance drop. Since all of these efforts addressed each degradation factor separately, their recognition improvements on real off-angle images were limited, and how to eliminate the challenges together with a comprehensive solution is still an open question.
With the recent improvements in the area of computation and neural networks, deep learning architectures, especially CNNs [22], have attracted attention from different research communities for use in different applications, such as image classification tasks. In terms of CNN-based iris recognition, DeepIrisNet [8] was an early work in which deep networks were used for feature extraction from iris texture. Instead of converting grayscale iris texture to binary iris codes using Gabor filters, the CNN framework extracted features from a square-sized normalized iris image. For iris matching, features at the last fully connected layers were compared with those of other subjects using the Euclidean distance. Another DeepIris network [23] was proposed to improve heterogeneous iris verification; it consists of nine layers to learn pairwise filters from different sources. Minaee et al. [9] utilized the VGG-Net framework to extract features from iris texture and used principal component analysis for dimensionality reduction. They fed the raw image as an input to the deep network without segmenting and unwrapping the iris. Nguyen et al. [24] first applied the transfer learning method to adapt pretrained CNNs to extract features from segmented and normalized iris images and classified the feature vectors using multiclass support vector machines. These existing CNN-based papers focused on iris recognition for frontal images only and do not deal with the problems in off-angle images.
Data acquisition platforms usually capture the eye structures surrounding the iris, such as the pupil, sclera, eyelashes, skin, and even eyebrows. Traditional iris recognition algorithms discard the information in the ocular and periocular components during the segmentation step. However, there might be useful information in the ocular and periocular areas for developing a more reliable biometric system. Therefore, combining iris with ocular and periocular biometrics is a growing research area. The fusion of multiple biometric modalities can be performed in different ways for performance improvement in nonideal images. There are numerous studies in the biometrics literature on information fusion of iris texture with features in ocular and periocular structures, such as eye movement dynamics [10,11], shape-based geometric features [12], conjunctival vasculature structures [13], LBPs [14], and different filter models to extract features [15,16]. Since our paper is focused on iris, ocular, and periocular biometrics to improve the recognition performance for off-angle images, information fusion with other modalities is out of the scope of this paper.
The initial studies showed the feasibility of using periocular structures as a biometric measure. After aligning the images using the iris center, Park et al. [25] extracted several global and local features from the periocular region using gradient orientation histogram, LBP, and scale-invariant feature transform (SIFT) descriptors. Scores were calculated using the Euclidean distance for each descriptor, and they were fused at the score level using the weighted sum. Their results showed that including additional periocular structures in decision-making produced outcomes with higher accuracy. These promising results attracted other researchers [26-29] to investigate ocular and periocular biometric modalities using various filters for feature extraction and various fusion methods for their final matching scores. For example, Tan and Kumar [30] exploited SIFT, GIST, LBP, HoG, and LMF filters for feature extraction from the periocular region and combined their matching scores linearly with traditional iris recognition. Proenca [31] proposed a score-level fusion of separate modalities for images captured in visible light, with one exploring the iris texture and another extracting shape parameters from eyelids, eyelashes, and skin.
In addition to these feature extraction-based solutions, some deep learning-based approaches exist in the biometrics literature. The deep-PRWIS framework [17] was proposed for periocular biometrics in visible spectrum images using an AlexNet-based network. For visible spectrum images, the authors point out several degradation factors for ocular structures, including corneal reflections on the iris, motion blur, and eyelid and eyelash occlusions of the iris texture, and suggest disregarding ocular structures (i.e., iris and sclera) in the periocular input image using multiclass artificial sampling to get more accurate results. Zhang et al. [32] proposed a deep feature fusion network to fuse the discriminative features of iris and periocular biometrics for better identification performance on mobile devices. Zhao and Kumar [33] proposed a semantics-assisted CNN in which several deep networks were trained separately to learn additional semantic information, such as gender and ethnicity, from the periocular region, and their output features were combined to improve the recognition performance. These works on ocular and periocular structures show some degree of improvement in recognition performance for frontal images, but they do not include any results for off-angle images.
To address the performance degradation in off-angle images, our previous study [34] initially investigated AlexNet-based deep networks following the traditional and nontraditional iris recognition frameworks. We also presented the effects of gaze angle on iris recognition and the feasibility of information fusion for various off-angle images. This paper extends our initial work in four aspects: (i) including additional CNN architectures and comparing their performance, (ii) studying the network behavior for untrained gaze angles, (iii) investigating the effect of the gaze angle and the number of sample images per angle from each subject in the training set, and (iv) comparing the recognition performance of CNN-based iris, ocular, and periocular biometrics with a traditional state-of-the-art iris recognition algorithm. As a result, this paper allows us to investigate the effect of ocular and periocular structures on iris recognition using different input types, including cropped, masked, and normalized images.

Methods
This section describes the overall structure of the adopted CNN-based deep learning frameworks. The CNN architecture extended the application areas of neural networks from one-dimensional signals to 2D images, and its popularity has boomed in various research communities. For iris recognition, a CNN first extracts essential features from input images using the knowledge acquired by its convolutional layers during training. Then, it classifies the input into one of the classes defined in training without requiring segmentation, normalization, and encoding steps. The most critical part of a deep network is the training process, which requires thousands of images to adjust the network parameters. However, there are no publicly available biometrics datasets that have enough off-angle images for training. In addition to demanding a large dataset, training deep networks from scratch also requires extensive computational time and resources. Therefore, we adapted several pretrained networks and applied the transfer learning approach to retrain them for iris, ocular, and periocular recognition. Figure 3 shows the flowchart of the major steps in the proposed deep learning-based recognition methods.
Since the CNN is one of the most extensively used deep learning architectures in image classification applications, we adopted the most common CNN architectures: AlexNet [35], GoogLeNet [36], and ResNet50 [37]. These deep networks were trained with millions of images in the ImageNet dataset [38]. The original task of these networks is to classify an input image into one of a thousand different classes, such as cars, animals, and tools, with high accuracy. The MATLAB implementations of the AlexNet, GoogLeNet, and ResNet50 architectures consist of 25, 144, and 177 layers, respectively.
The main steps of these networks are feature extraction, starting from low-level features at shallow layers and gradually increasing in complexity at deeper layers, and a final decision based on the last fully connected layer. Compared with the others, AlexNet is the smallest network, with five convolutional layers. To reduce the parameter space for low-level feature extraction, each convolutional layer is followed by activation and max-pooling layers. GoogLeNet is a deeper network than AlexNet, in which nine inception modules are stacked upon each other. The block encapsulation enables the extraction of low-level and high-level spatial features using different sizes of filters. The deepest network adapted in this work is ResNet50. Its skip connection strategy allows the features to be fed from the previous layers to the next layers without changing them. The computational complexity reduction in GoogLeNet and ResNet50 is achieved using 1 × 1 convolutional layers, in which the number of inputs is reduced from one layer to another. This trick shrinks the size of the model and drops the number of parameters from 138 million to 4 million while allowing for an increased network depth with more layers.
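The effect of a 1 × 1 channel reduction on the number of weights can be checked with simple arithmetic. The following is a minimal sketch, not the configuration of any network used in this work: the channel counts (192, 16, 32) and the helper name `conv_weights` are hypothetical and chosen only for illustration, and biases are ignored.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_channels * out_channels


# Hypothetical layer sizes, chosen only for illustration.
in_ch, mid_ch, out_ch = 192, 16, 32

# A 5 x 5 convolution applied directly to all input channels.
direct = conv_weights(5, in_ch, out_ch)

# The same 5 x 5 convolution preceded by a 1 x 1 channel reduction.
bottleneck = conv_weights(1, in_ch, mid_ch) + conv_weights(5, mid_ch, out_ch)

print(direct, bottleneck, round(direct / bottleneck, 1))  # 153600 15872 9.7
```

Under these assumed sizes, the reduced path needs roughly one tenth of the weights of the direct convolution, which is the kind of saving that makes deeper networks affordable.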
Instead of training the AlexNet, GoogLeNet, and ResNet50 architectures from scratch, we utilized the transfer learning approach to adapt their initial weights and parameters. Several different transfer learning approaches for adapting the weights from the base network exist. Yosinski et al. [39] proposed copying the weights of the initial layers of the base network as is and letting them update freely in the retraining process with a new dataset. Instead of copying the weights, Chaabouni et al. [40] proposed adopting the weights from the base network using a velocity function with weight decay and momentum coefficients for saliency map generation. Since the initial layers of CNNs extract similar low-level Gabor-based features from the datasets, we adopted the first approach and copied the initial weights from the base networks. As shown in Fig. 3, the transfer learning drops the last three layers of these networks and replaces them with new fully connected, softmax, and classification layers. The new fully connected layer collects the final features from the previous layers and feeds them into the softmax layer. The softmax layer calculates the probability that an input belongs to each class. Since training images are captured from n subjects, the classification layer has n classes. Based on the highest probability, the classification layer gives the final decision and selects the output class. After training, each test image is compared with the n classes for identity verification. Based on the probability value, a decision is made for a match or nonmatch to the compared class. During the retraining process, the parameters and weights in each layer are updated for each image minibatch. The new network is retrained with the off-angle images using the stochastic gradient descent algorithm with several learning rates ranging from 0.0001 to 0.01 and minibatch sizes of 4, 8, and 16 images. The maximum number of epochs was set to 8, 16, or 32, depending on the training dataset.
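The decision stage described above (softmax probabilities, selection of the most likely class, and a match or nonmatch decision against a compared class) can be sketched as follows. This is a minimal illustration in plain Python rather than the MATLAB implementation used in this work; the function names, the example scores, and the 0.5 threshold are all assumptions made for the example.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def identify(logits):
    """Return (class index, probability) of the most likely class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


def verify(logits, claimed_class, threshold):
    """Match if the probability of the claimed class reaches the threshold."""
    return softmax(logits)[claimed_class] >= threshold


# Hypothetical outputs of the new fully connected layer for n = 4 subjects.
scores = [1.2, 4.0, 0.3, 2.1]
subject, prob = identify(scores)
print(subject, round(prob, 3))
print(verify(scores, claimed_class=1, threshold=0.5))
print(verify(scores, claimed_class=3, threshold=0.5))
```

Sweeping the threshold over the range of probability values, instead of fixing it, yields the operating points from which ROC curves are drawn.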

Experimental Setup and Dataset
For our experiments, we used an off-angle biometric dataset that contains 5687 frontal and off-angle images from 52 different subjects. We used NIR-sensitive IDS-UI-3240ML-NIR cameras for data acquisition. The off-angle camera was attached to a moving arm to capture images from −50 deg to +50 deg in angle with a 10-deg step size. Frontal iris images are captured at a 0-deg gaze angle while subjects look at the camera. For off-angle images, the camera is moved to the left for capturing positive angles and to the right for negative angles while subjects are still looking at a fixation point at 0 deg. Navitar Zoom 7000 lenses are attached to the cameras and are set at a 40-mm focal length with a 5.6 f-stop. The periocular region is illuminated with a 780-nm power LED. To increase the image quality, we attached a 720-nm high-pass filter to the lens.
Figure 4 shows the data acquisition setup for off-angle biometrics. The off-angle camera captures 10 images at each gaze angle, and in total it provides 10 frontal and 100 off-angle images for each subject. Example images captured from each gaze angle are shown in Fig. 5. In an original frontal image, the iris diameter is around 410 pixels. For more details about the off-angle dataset, readers can refer to our previous paper [41]. Each image includes textures from the iris, sclera, eyelashes, eyelids, eyebrows, and skin. For dataset requests, readers can contact the author. The images captured using our data acquisition platform are in the NIR spectrum, so our experiments only include NIR images. There is no available off-angle image dataset captured in the visible spectrum following the same procedure from −50 deg to +50 deg in angle with a 10-deg step size. Although existing visible spectrum datasets [42-44] have some off-angle images, they do not provide the ground truth gaze angles. Therefore, we could not include them in our study. However, we believe that this study can be extended to images in the visible spectrum.

Dataset Preparation
To investigate the recognition performance of CNNs for iris, ocular, and periocular biometric modalities, we designed several experiments with cropped, masked, and normalized images from the original off-angle images. Each CNN architecture is trained and tested with different types of biometric images that may include different ocular and periocular components. Although there are several interpretations of the ocular region, the definition of periocular is clear and common. Periocular is defined as the part of the facial region surrounding the eye, including eyelids, lashes, and eyebrows. The term "periocular" comes from the prefix "peri-," meaning "around or about," and the root word "ocular," meaning "of or relating to the eye" [45,46]. Although some papers describe the ocular region as bigger than the periocular region [47], we use the general definition in which the periocular region surrounds the ocular region. Figure 6 shows the seven types of biometric images generated from each original off-angle image in our dataset for the experiments: (a) Periocular: periocular image with pupil, iris, sclera, eyelids, eyelash, and skin textures; (b) PeriocularIris: periocular image with only iris texture; (c) PeriocularNoIris: periocular image without iris; (d) Ocular: ocular image; (e) OcularIris: ocular image with only iris; (f) OcularNoIris: ocular image without iris; and (g) NormalizedIris: normalized iris image (Table 1).
The original resolution of our off-angle biometric dataset is 1280 × 1024 pixels with a single grayscale channel. However, AlexNet, GoogLeNet, and ResNet50 require a square-sized color image as an input. The input size of AlexNet is 227 × 227 × 3. GoogLeNet and ResNet50 require 224 × 224 × 3. Therefore, we cropped, resized, and masked the images. To generate masked and normalized images, iris and pupil boundaries were segmented as ellipses (C_x, C_y, R_x, R_y, θ) using an off-angle iris segmentation algorithm [48]. The segmentation results for the iris-pupil (inner) boundary are almost perfect for all images except a few cases. However, due to changes in eyelid occlusions and gaze angle differences, iris-sclera (outer) boundary results are not consistent between the frontal and off-angle images. Therefore, the segmentation parameters are checked using a manual ground-truth tool and corrected if needed to obtain error-free and consistent segmentation results. The ellipse equation for iris and pupil boundaries is defined as

\[
\frac{\left[(x - C_x)\cos\theta + (y - C_y)\sin\theta\right]^2}{R_x^2} + \frac{\left[-(x - C_x)\sin\theta + (y - C_y)\cos\theta\right]^2}{R_y^2} = 1, \tag{1}
\]

where (C_x, C_y) is the ellipse center, (R_x, R_y) are the lengths of the major and minor semi-axes, and θ is the orientation angle. Using the ellipse parameters, we first create elliptical disks for the pupil and iris regions, where equality (=) in Eq. (1) is changed to less than (<). To exclude the inside of the ellipse, equality is changed to greater than (>). Then, we multiply these elliptical disks with the Periocular and Ocular images to generate the masked image subsets, including PeriocularIris, PeriocularNoIris, OcularIris, and OcularNoIris, as shown in Fig. 6.
To generate the seven types of image subsets, we performed the following steps:
(i) For periocular images in Figs. 6(a)-6(c), we first cropped 128 pixels from the left and right sides of each image, which yields a square 1024 × 1024 image.
(ii) For ocular images in Figs. 6(d)-6(f), we cropped 454 pixels around the iris center.
(iii) For PeriocularIris and OcularIris images in Figs. 6(b) and 6(e), we excluded the inside of the pupil and the outside of the iris by masking with the elliptical disks of the pupil and iris.
(iv) For PeriocularNoIris and OcularNoIris images in Figs. 6(c) and 6(f), we excluded the inside of the iris-sclera boundary by masking with the elliptical disk of the iris.
(v) For NormalizedIris in Fig. 6(g), we sampled the iris texture using elliptical normalization. We selected 227 pixels from 227 angles, starting from 3 o'clock in the clockwise direction (CWD).
(vi) For all images, we downsampled them to the required size.
(vii) Finally, to convert the grayscale images to three-channel RGB images, we copied them into every channel.
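The masking and channel replication in steps (iii), (iv), and (vii) can be sketched as follows. This is a minimal illustration, assuming the standard rotated-ellipse form for the boundary equation, with R_x and R_y treated as semi-axis lengths; the function names, the toy 9 × 9 image, and the ellipse parameters are hypothetical and are not taken from our segmentation tool.

```python
import math


def inside_ellipse(x, y, cx, cy, rx, ry, theta):
    """True if pixel (x, y) lies strictly inside the rotated ellipse."""
    dx, dy = x - cx, y - cy
    # Rotate the pixel into the ellipse-aligned coordinate frame.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return (u / rx) ** 2 + (v / ry) ** 2 < 1.0


def iris_only(image, pupil, iris):
    """Keep pixels inside the iris ellipse and outside the pupil ellipse."""
    return [
        [
            pix
            if inside_ellipse(x, y, *iris) and not inside_ellipse(x, y, *pupil)
            else 0
            for x, pix in enumerate(row)
        ]
        for y, row in enumerate(image)
    ]


def no_iris(image, iris):
    """Zero out everything inside the iris-sclera boundary."""
    return [
        [0 if inside_ellipse(x, y, *iris) else pix for x, pix in enumerate(row)]
        for y, row in enumerate(image)
    ]


def to_three_channels(image):
    """Replicate a grayscale image into three identical channels."""
    return [[[pix, pix, pix] for pix in row] for row in image]


# Toy 9 x 9 image with made-up ellipse parameters (cx, cy, rx, ry, theta).
image = [[255] * 9 for _ in range(9)]
pupil = (4, 4, 1.5, 1.5, 0.0)
iris = (4, 4, 3.5, 3.0, 0.0)

iris_img = iris_only(image, pupil, iris)    # PeriocularIris / OcularIris style
no_iris_img = no_iris(image, iris)          # PeriocularNoIris / OcularNoIris style
rgb = to_three_channels(iris_img)           # three-channel network input
```

Multiplying an image by a binary disk, as described in the text, is equivalent to the per-pixel selection used here; a vectorized array implementation would be preferable for full 1280 × 1024 images.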

Data Splits for Training and Testing Sets
After generating input images for the seven types of biometric datasets, we divided each dataset into eight subsets to establish the testing and training sets. One of the subsets is used as a training set, and the others are merged for testing. Since there are 10 iris images per angle for each subject, the frontal image set (0 deg in gaze angle), F, has 520 images from 52 subjects.

Baseline Iris Recognition Algorithm
To compare the recognition performance of CNN-based iris, ocular, and periocular recognition, we adopted a well-known traditional Gabor-based state-of-the-art iris recognition algorithm [1]. After elliptical iris segmentation and normalization, iris codes were generated using OSIRIS phase-quadrant demodulation [50] to calculate the HD. Every image in our off-angle dataset is compared with the others, and over 26.7 million comparisons were performed for the entire dataset. The intraclass histogram shows the comparisons between images from the same subjects. The comparisons between two different subjects are shown in the interclass histogram. Figure 7 shows the intraclass and interclass distributions of our off-angle dataset. The intraclass HD ranges from 0.01 to 0.48 with an average of 0.2735 and a standard deviation of 0.0960. The interclass comparisons range from 0.35 to 0.59 with a mean value of 0.4764 and a standard deviation of 0.0225. We observed that the intraclass distribution of the traditional method is much wider than the interclass distribution and overlaps it for HD scores between 0.35 and 0.48, which generates errors such as false matches and mismatches. The main reason for the performance degradation is the comparison of images from different angles, which change from −50 deg to +50 deg in our off-angle iris dataset.
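The way HD scores are grouped into intraclass and interclass distributions can be sketched as follows. This is a minimal illustration, not the OSIRIS implementation: the use of validity masks follows the common Daugman-style formulation and is an assumption here, each unordered pair is compared once, and the 8-bit codes and subject labels are toy values.

```python
from itertools import combinations
from statistics import mean


def hamming_distance(code_a, code_b, mask_a, mask_b):
    """Fraction of disagreeing bits among bits valid in both masks."""
    valid = [ma and mb for ma, mb in zip(mask_a, mask_b)]
    usable = sum(valid)
    if usable == 0:
        raise ValueError("no overlapping valid bits")
    differing = sum(1 for a, b, ok in zip(code_a, code_b, valid) if ok and a != b)
    return differing / usable


def split_scores(samples):
    """Compare every pair once; group scores by same/different subject."""
    intra, inter = [], []
    for (id_a, code_a, mask_a), (id_b, code_b, mask_b) in combinations(samples, 2):
        score = hamming_distance(code_a, code_b, mask_a, mask_b)
        (intra if id_a == id_b else inter).append(score)
    return intra, inter


# Toy 8-bit codes for two hypothetical subjects; all bits marked valid.
full = [1] * 8
samples = [
    ("A", [1, 0, 1, 1, 0, 0, 1, 0], full),
    ("A", [1, 0, 1, 0, 0, 0, 1, 0], full),
    ("B", [0, 1, 0, 1, 1, 0, 0, 1], full),
]
intra, inter = split_scores(samples)
print(mean(intra), mean(inter))  # 0.125 0.8125
```

Histograms of the two returned lists correspond to the intraclass and interclass distributions, and any range of scores that appears in both lists is where threshold-based decisions produce errors.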
To visualize the effect of the gaze angle differences on traditional iris recognition, we calculated the HD scores of irises with different gaze angles from the same subject, as shown in Fig. 8. For example, there is no gaze angle difference between images from −50 deg to −50 deg, 0 deg to 0 deg, 50 deg to 50 deg, and so on. The result for no gaze angle difference is shown as a green solid line with an o marker. Its HD score changes from 0.01 to 0.30 due to the dilation differences and segmentation variations between images. This distance can be tolerated in traditional iris recognition systems since their data capture scenario is based on the comparison of frontal images. However, the HD score increases as the gaze angle difference between

Results and Discussion
In this section, we first evaluate the recognition performance of the adopted pretrained deep networks with the periocular images. Second, we compare the recognition results of the ResNet50 framework for off-angle images from the Periocular, PeriocularIris, PeriocularNoIris, Ocular, OcularIris, OcularNoIris, and NormalizedIris datasets when the network is trained with only frontal images. Third, we study the effect of the gaze angle difference on iris, ocular, and periocular recognition and discuss the information fusion capability of CNNs trained with multiple off-angle images. Fourth, we investigate the network behavior for untrained gaze angles, the effect of the number of sample images per angle, and the effect of the number of subjects in the training set. Finally, the recognition performance of the CNN-based frameworks is compared with a well-known Gabor-based traditional iris recognition algorithm.
The receiver operating characteristic (ROC) curve, equal error rate (EER), and area under the curve (AUC) are calculated as metrics for the performance analysis of biometric recognition. ROC curves plot the true-positive rate (TPR) against the false-positive rate (FPR) for different thresholds applied to the final probability scores of the networks. AUC is the 2D area under each ROC curve. EER is the error rate at the point on the ROC curve where the false-positive and false-negative rates are equal. Curves closer to the upper left corner, with lower EER scores and higher AUC values, indicate higher overall accuracy for the compared experiments.
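The three metrics can be sketched in a few lines of plain Python. The example below is a minimal illustration, assuming that higher scores mean a more likely match and that genuine (same-subject) and impostor (different-subject) scores are available as two lists; the function names are hypothetical, and the EER is approximated by the sampled ROC point where FPR and the false-negative rate (1 − TPR) are closest, without interpolation.

```python
def roc_points(genuine, impostor):
    """Return (FPR, TPR) pairs obtained by sweeping a threshold over all scores."""
    thresholds = sorted(set(genuine) | set(impostor), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing accepted
    for t in thresholds:
        tpr = sum(s >= t for s in genuine) / len(genuine)
        fpr = sum(s >= t for s in impostor) / len(impostor)
        points.append((fpr, tpr))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def eer(points):
    """Approximate EER: average of FPR and FNR at the point where they are closest."""
    fpr, tpr = min(points, key=lambda p: abs(p[0] - (1 - p[1])))
    return (fpr + (1 - tpr)) / 2


if __name__ == "__main__":
    genuine = [0.9, 0.6]    # toy same-subject scores
    impostor = [0.7, 0.2]   # toy different-subject scores
    pts = roc_points(genuine, impostor)
    print(auc(pts), eer(pts))
```

With only a handful of scores the curve is coarse; on a real score set, interpolating between neighboring ROC points would give a smoother EER estimate.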

Comparison of AlexNet, GoogLeNet, and ResNet50
In our first set of experiments, we investigate the recognition performance of AlexNet, GoogLeNet, and ResNet50 architectures. We first retrained these off-the-shelf deep networks with periocular images in the frontal subset (F) and tested them with their off-angle images (O-ALL). Then, we repeated the same experiments with periocular images in the off-angle training subset (O-s1). ROC curves, shown in Fig. 9, evaluate the recognition performance of these experiments. We first observed that the deep networks performed better when trained on off-angle images (O-s1) than when trained on frontal images (F). When the deep networks are trained with only frontal images (F) and tested with off-angle images, GoogLeNet showed the lowest recognition performance with an EER of 15.12%, compared with AlexNet and ResNet50 with EER scores of 9.93% and 7.24%, respectively. However, GoogLeNet performed better than AlexNet for training with off-angle images (O-s1). Although AlexNet performed better than GoogLeNet for frontal training, its performance could not catch up to the deeper networks for off-angle training. Since AlexNet is the shallowest network, its depth may not be sufficient to learn all of the required information from multiple off-angle images. In addition, ResNet50 showed the best performance for both frontal and off-angle image subsets since it is the deepest network with more layers compared with AlexNet and GoogLeNet. Table 2 summarizes the EER and AUC scores calculated in our frontal and off-angle biometrics experiments for AlexNet, GoogLeNet, and ResNet50. Since ResNet50 outperformed the other networks with both frontal and off-angle images, it was used for the rest of our experiments.

Results for Frontal Training Images
In our second set of experiments, we first train the ResNet50 framework with seven different types of frontal (F) periocular, ocular, and iris image datasets: (1) Periocular, (2) PeriocularIris, (3) PeriocularNoIris, (4) Ocular, (5) OcularIris, (6) OcularNoIris, and (7) NormalizedIris. After training, we test each network with their corresponding off-angle image subsets (O-ALL) to compare their recognition performance and to investigate the effect of the existence of iris, ocular, and periocular structures on off-angle biometrics recognition, including the pupil, iris, sclera, eyelashes, eyelids, and skin. Note that, in each experiment, training and testing images come from one of seven biometric datasets.
Figure 10(a) shows the recognition performance analysis using ROC curves for training with frontal images from seven periocular, ocular, and iris image datasets. Since only the frontal image subset (F) is used for network training, the recognition performance for off-angle images is lower than the desirable level (TPR > 0.99 and FPR < 0.01). The main reason for the low accuracy is the gaze angle difference between the training and testing images. Table 3 shows the EERs of these seven experiments. Based on the EER and ROC results, ocular and iris images (Ocular, OcularIris, and NormalizedIris) deliver very similar results, with their EERs being 8.25%, 7.97%, and 8.92%, respectively. We observed that segmentation and normalization of iris texture do not help recognition for training with frontal images. Compared with ocular results, Periocular shows slightly lower performance with a 9.61% EER score; here, having additional periocular structures in the frontal training images did not offer additional discriminative features. In addition, experiments were conducted with the periocular image with only iris texture (PeriocularIris) and the periocular image without iris (PeriocularNoIris) datasets. PeriocularIris shows better performance than PeriocularNoIris, and it is very close to Periocular images with an EER score of 10.68%. Removing the iris texture from the periocular image does not change the accuracy much because the iris area covers <10% of the periocular image. We also observed that the performance of PeriocularIris is lower than that of OcularIris due to the lower resolution of the iris texture in the input image. Finally, OcularNoIris showed the worst accuracy with an EER of 17.78%; here, iris texture is masked out from the ocular image. Based on the experimental results and our observations, we can conclude that the iris texture information provides more distinctive features when only frontal images are available for deep network training. Iris is a more reliable biometric than ocular and periocular structures because the sclera, eyelids, and skin may appear more different than iris texture if there is a gaze angle difference between training and testing images. In addition, the iris texture resolution is directly proportional to accuracy in the masked images.
For further investigation on the effect of the gaze angle difference on biometrics recognition, Fig. 11 shows more in-depth results for images in the ocular dataset (Ocular). First, CNNs are trained with the frontal ocular image subset (F). Then, we tested the images from the ocular off-angle subset (O-ALL) at different gaze angles ranging from −50 deg to 50 deg, separately. Since the network is trained with only frontal images, its recognition performance drastically decreases as the gaze angle of the test images increases. The EER scores for each gaze angle ranging from −50 deg to 50 deg with a 10-deg step size are 16.56%, 9.22%, 5.35%, 3.09%, 0.42%, 0.01%, 0.78%, 3.01%, 6.76%, 13.96%, and 22.91%, respectively. We also observe that results for positive angles are not symmetrical to those for negative angles because of the anatomical 5-deg angle difference between the visual axis and the optical axis, aka "angle alpha." For example, the result for gaze 50 deg, marked with the dotted purple ROC curve, is worse than that for gaze −50 deg, marked with a blue curve.

Results for Off-angle Training Images
In the third set of experiments, we investigate the effect of gaze angle on deep learning-based iris, ocular, and periocular recognition. We train our networks with seven types of off-angle image subsets (O-s1) to improve the recognition and to examine the information fusion capability of CNN networks trained with multiple off-angle images. Then, we test the networks with their off-angle images (O-ALL).
Figure 10(b) shows the performance analysis using ROC for CNN frameworks trained with seven types of off-angle images from the Periocular, PeriocularIris, PeriocularNoIris, Ocular, OcularIris, OcularNoIris, and NormalizedIris datasets. As shown in Table 3, their EERs are measured as 0.012%, 0.120%, 0.016%, 0.043%, 0.053%, 0.287%, and 0.091%, respectively. Compared with frontal ROC curves, CNN shows improved performance for all biometric modalities (i.e., iris, ocular, and periocular) when it is trained with off-angle image subsets (O-s1). Including multiple off-angle iris images from different gazes in the training process helps to eliminate the effect of gaze difference between the training and testing datasets. The periocular dataset (Periocular) shows the best recognition performance compared with the ocular and iris datasets. Periocular images without iris (PeriocularNoIris) show very close accuracy to the Periocular subset. Since iris texture covers a very small area (<10%) in the periocular images, removing iris texture from periocular images only slightly degrades the recognition performance (0.4% accuracy). Moreover, periocular images with only iris (PeriocularIris) show a lower performance when the iris diameter is <90 pixels, especially in off-angle images. Compared with the results of the periocular recognition, Ocular and OcularIris images present slightly less improved performance. In addition, we observed that NormalizedIris shows lower accuracy than OcularIris because the rubber-sheet model is insufficient for off-angle images. It assumes linear deformation on the iris texture in off-angle images and ignores several challenges including corneal refraction, 3D iris textures, and the limbus effect. Therefore, iris normalization degrades the recognition performance instead of improving it. Finally, OcularNoIris gives the worst result due to the limited information in the image. As a result, having periocular and ocular structures in addition to iris texture in the off-angle input images improves the recognition performance compared with the previous frontal training results in Fig. 10(a). These results demonstrate the recognition strength and data fusion capability of the CNN-based deep learning framework, in which the information in images from multiple gaze angles is fused easily.

Training with Limited Angles, Multiple Samples, and More Subjects
In our fourth set of experiments, we investigate the network behavior for untrained gaze angles with ocular images (Ocular). We trained the network with images in the Ocular dataset from the frontal and limited off-angle image subsets, F, O-o3, O-o4, and O-o5, separately and tested with the rest of the off-angle images (O-ALL). Figure 12  To improve the recognition performance for off-angle images, it is enough to train the network with steep angles, and it is not required to train it with every gaze angle because CNN can handle small variations in the gaze angles.
We also studied the effect of the number of sample images per angle from each subject in the training set. We used the images in the ocular dataset (Ocular) from multiple sample image subsets O-s1, O-s2, and O-s3, with 1, 2, and 3 sample images per angle from each subject being included in the training set, respectively. As shown in the ROC curves in Fig. 12(b), the performance was improved as the number of sample images per angle from each subject was increased. In addition, EERs decreased to 0.043%, 0.024%, and 0.015% for subsets O-s1, O-s2, and O-s3, respectively. Even if we included off-angle images from every gaze angle in the training set, there might be a difference in pupil dilations. Therefore, including more samples per angle from each subject in the network training helps to improve the performance of deep learning-based iris recognition algorithms.
To investigate the effect of the number of subjects in the dataset on the recognition performance of CNNs, we perform additional experiments using images in the ocular dataset (Ocular) from frontal (F) and off-angle (O-s1) subsets. We train our networks with images from various numbers of subjects including 50, 75, and 90 subjects. Then, we test the networks with the rest of the images in their off-angle subset (O-ALL) to show the network behavior for a larger-scale evaluation. Figure 13 shows the ROC curves for the performance evaluations of training the network with frontal (F) and off-angle (O-s1) subsets, respectively. Although ROC curves showed that the recognition performance of the deep networks slightly changes for different numbers of subjects in the dataset, there are no significant and consistent performance changes based on the number of subjects. For frontal training of the network, EERs were 8.25%, 9.19%, and 9.02% for 50, 75, and 90 subjects, respectively. In off-angle training, EERs decreased to 0.047%, 0.039%, and 0.035% for 50, 75, and 90 subjects, respectively. Therefore, the proposed method can be applied to larger datasets with more subjects without any significant decrement in the recognition performance.

Comparison with State-of-the-Art Algorithm
Last, we present the recognition performance of the Gabor-based traditional iris recognition algorithm as a state-of-the-art algorithm to compare with our proposed deep learning frameworks. Traditional recognition systems store only one frontal iris image for each subject in their image gallery and compare it with incoming probe images. Therefore, we first included the iris codes from the frontal image subsets in our gallery and tested with off-angle (O-ALL) images. Then, we implemented the gallery approach 51 in which multiple off-angle images of the same subject are included in the gallery, and the verification decision is made based on the minimum HD between the probe and gallery images. We used the O-o3, O-o4, and O-o5 subsets as the gallery and tested with the rest of the off-angle (O-ALL) images. Figure 14 presents ROC curves, with the frontal image gallery, F, giving the worst performance compared with the other gallery approaches. As shown in Table 4, the EER scores are 1.59%, 0.41%, 0.34%, and 0.14% for subsets F, O-o3, O-o4, and O-o5, respectively. As expected, as the number of off-angle images in the gallery increases, the recognition performance improves. However, multiple images per subject in the gallery multiply the number of comparisons required for recognition, which causes a longer decision time for traditional iris recognition algorithms. Compared with deep learning frameworks, the accuracy of the traditional method is better only for frontal images. Even if we include multiple off-angle images in the gallery, the traditional method could not reach the performance of deep learning frameworks for off-angle iris images. In addition, the traditional method requires iris segmentation with eyelid detection, which is an error-prone process, and faulty segmentation may reduce the accuracy. Eyelid detection also helps to mask the occluded iris texture in the traditional method. Therefore, deep learning frameworks show superior performance for the off-angle images compared with the traditional iris recognition methods, and they also do not need iris segmentation and eyelid detection.
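The gallery decision rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: iris codes are plain lists of bits, occlusion masks are ignored for brevity, and the function names and the acceptance threshold of 0.32 are hypothetical values chosen for the example rather than settings from this study.

```python
def hamming(code_a, code_b):
    """Fractional Hamming distance between two equal-length bit sequences."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have equal length")
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)


def verify_with_gallery(probe, gallery_codes, threshold):
    """Compare the probe with every gallery code of the claimed subject.

    Returns (minimum HD, decision), where the decision is True when the
    minimum HD does not exceed the threshold.
    """
    if not gallery_codes:
        raise ValueError("gallery must contain at least one code")
    best = min(hamming(probe, g) for g in gallery_codes)
    return best, best <= threshold


if __name__ == "__main__":
    probe = [1, 0, 1, 0]
    gallery = [[0, 1, 0, 1],   # e.g., an enrolled image at a distant gaze angle
               [1, 0, 1, 1]]   # e.g., an enrolled image at a nearby gaze angle
    print(verify_with_gallery(probe, gallery, threshold=0.32))
```

Each verification costs one HD computation per gallery image, which is the linear growth in comparisons noted in the text.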

Conclusions
This study compared CNN-based AlexNet, GoogLeNet, and ResNet50 architectures to improve the recognition performance of iris, ocular, and periocular biometric modalities for off-angle images. It also investigated the effect of the gaze angle difference between training and testing images, the network behavior for untrained gaze angles, and the information fusion capability of CNN networks at multiple off-angle images.
Based on our experimental results, the important conclusions are as follows: first, ResNet50 shows the best performance for both frontal and off-angle training scenarios compared with AlexNet and GoogLeNet since it is the deepest network with more layers. Second, the network accuracy significantly decreases as the difference between gaze angles of training and test images increases. Third, even though iris texture provides more reliable information for frontal training, iris, ocular, and periocular structures can be used together to improve the recognition performance for off-angle images. Fourth, CNN-based deep learning frameworks showed data fusion capability in which they can easily fuse the information from multiple off-angle images. Since CNN can handle small gaze angle variations, training with steep angles is enough, and there is no need for every angle. As the main conclusion, compared with the Gabor-based traditional iris recognition algorithm, CNN-based deep learning architectures improve the recognition performance for off-angle images when the network is trained with multiple off-angle images.

Fig. 3 Flow chart of the proposed deep learning-based biometrics recognition method using transfer learning.

Fig. 4 Experimental setup of off-angle biometrics data acquisition.
For frontal training and off-angle testing experiments, we used 520 images in the frontal subset (F) as training, and the remaining 5167 off-angle images from the same 52 subjects were used in the testing. Since this work focuses on the effect of the gaze angle on periocular/ocular/iris biometrics, it is a closed-world biometric application in which training and testing include images from all available subjects. The open-world problem, in which images from new subjects are tested with a network that was not trained on them, is out of the scope of this study. However, this study can be extended to an open-world problem by extracting the features from the last fully connected layer for each subject and measuring the distance of the features from the new subject to each subject. For experiments with off-angle training and testing, off-angle images were split into seven different subsets for different experiments. From each subject, the O-s1 subset contains one off-angle image per angle, and the other off-angle images are placed into the O-ALL subset. For the O-s2 and O-s3 sets, two and three images per angle from each subject are selected for the training set, respectively. These training sets include images from every angle. To investigate the network behavior for untrained gaze angles, we also generated additional image subsets: every image captured at −30 deg, 0 deg, and 30 deg gaze angles is placed in the O-o3 subset for training, and images from the remaining gaze angles are used in testing. The O-o4 set includes images from −40 deg, −20 deg, 0 deg, 20 deg, and 40 deg gaze angles, and images from −50 deg, −30 deg, 0 deg, 30 deg, and 50 deg gaze angles are placed into the O-o5 set. Therefore, we guaranteed disjoint training and testing for different angles. For all of our experiments, we used MatConvNet 49 as the CNN framework in MATLAB. The experiments were performed on a DELL workstation with a 16-core Intel i9-9900K CPU at 3.6 GHz, an NVIDIA GeForce RTX 2080 GPU with 8 GB of memory, and 64 GB of main memory. It takes 173 s to retrain ResNet50 using transfer learning for 520 images with eight epochs. The testing time is 2.5 ms per image.
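The limited-angle splits described above can be expressed as a simple filter on the gaze angle of each image. The sketch below is only an illustration: the tuple layout `(subject_id, gaze_angle_deg, image_id)`, the dictionary name, and the function name are assumptions made for this example, while the angle sets follow the subset definitions given in the text.

```python
# Gaze angles (in degrees) used for training in each limited-angle subset.
SUBSET_ANGLES = {
    "O-o3": {-30, 0, 30},
    "O-o4": {-40, -20, 0, 20, 40},
    "O-o5": {-50, -30, 0, 30, 50},
}


def split_by_angle(images, subset):
    """Split images into training and testing lists by gaze angle.

    images: iterable of (subject_id, gaze_angle_deg, image_id) tuples
            (hypothetical record layout).
    subset: one of the keys of SUBSET_ANGLES.
    """
    train_angles = SUBSET_ANGLES[subset]
    train, test = [], []
    for record in images:
        (train if record[1] in train_angles else test).append(record)
    return train, test


if __name__ == "__main__":
    # One toy subject with one image at every 10-deg step from -50 to 50 deg.
    images = [(1, angle, f"img_{angle}") for angle in range(-50, 51, 10)]
    train, test = split_by_angle(images, "O-o3")
    print(len(train), len(test))
```

Because an angle is either in the training set or not, the training and testing images never share a gaze angle, which is the disjointness property stated in the text.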

Fig. 7 Histograms of HD of off-angle images. The left plot represents intraclass comparisons and the right one represents interclass comparisons.

Fig. 8 Histograms of intraclass HD distributions of off-angle iris images with different gaze angle differences. Each histogram represents comparisons of specific angle differences from 0 deg to 100 deg.

Fig. 9 Performance analysis using ROC for original images. AlexNet, GoogLeNet, and ResNet50 are trained with frontal and off-angle subsets (F and O-s1) and tested with off-angle image subset (O-ALL).

Fig. 11 Comparison of recognition performances for ocular images (Ocular) with different gaze angles. CNNs are trained with frontal ocular images (F) and tested with ocular off-angle images from each gaze angle ranging from −50 deg to 50 deg.
Figure 12(a) presents ROC curves for these experiments, and their EERs are shown in Table 4. It is observed that, when the network is trained with only the frontal image subset, F, it shows the worst performance, as marked with a solid red line with 8.25% EER. When additional off-angle images from −30 deg and 30 deg gaze angles are included in the training set (O-o3), the network performance is improved, as shown by the solid blue line with 0.40% EER. The best network performance is obtained for training with the O-o5 subset (images from −50 deg, −30 deg, 0 deg, 30 deg, and 50 deg) with 0.04% EER. Deep learning networks can learn from off-angle images even if their training does not include the images from every angle. We also observe that the results for O-o4 are not as good as the

Fig. 12 Performance analysis using ROC for ocular image (Ocular).(a) CNNs are trained with frontal and limited off-angles image subsets (F, O-o3, O-o4, and O-o5) and tested with off-angle image subset (O-ALL), (b) CNNs are trained with multiple sample off-angle image subsets O-s1, O-s2, and O-s3 and tested with off-angle (O-ALL).

Fig. 13 Performance analysis using ROC for ocular image (Ocular) for different number of subjects.(a) CNNs are trained with frontal image subset (F) and tested with off-angle image subset (O-ALL), (b) CNNs are trained with off-angle image subsets O-s1 and tested with off-angle (O-ALL).

Fig. 14 Performance analysis using ROC for traditional iris recognition with frontal (F) and multiple off-angle image subsets (O-o3, O-o4, and O-o5) included in gallery and tested with off-angle (O-ALL) images.

Table 1
Training dataset groups.

Table 2
Comparison of networks for frontal and off-angle trainings using EER and AUC.

Table 3
EER of different image subsets for frontal and off-angle trainings.

Table 4
EER of deep learning frameworks and traditional iris recognition.