The performance of face recognition can be improved using information fusion of multimodal images and/or multiple algorithms. When multimodal face images are available, cross-modal recognition is meaningful for security and surveillance applications. For example, a probe face is a thermal image (especially at nighttime), while only visible face images are available in the gallery database. Matching a thermal probe face onto the visible gallery faces requires crossmodal matching approaches. A few such studies were implemented in facial feature space with medium recognition performance. In this paper, we propose a cross-modal recognition approach, where multimodal faces are cross-matched in feature space and the recognition performance is enhanced with stereo fusion at image, feature and/or score level. In the proposed scenario, there are two cameras for stereo imaging, two face imagers (visible and thermal images) in each camera, and three recognition algorithms (circular Gaussian filter, face pattern byte, linear discriminant analysis). A score vector is formed with three cross-matched face scores from the aforementioned three algorithms. A classifier (e.g., k-nearest neighbor, support vector machine, binomial logical regression [BLR]) is trained then tested with the score vectors by using 10-fold cross validations. The proposed approach was validated with a multispectral stereo face dataset from 105 subjects. Our experiments show very promising results: ACR (accuracy rate) = 97.84%, FAR (false accept rate) = 0.84% when cross-matching the fused thermal faces onto the fused visible faces by using three face scores and the BLR classifier.