The aim of multimodal image fusion is to enhance scene perception by combining the salient features of images captured by different sensors. This paper proposes an image fusion method based on joint sparse subspace recovery (JSSR). We regard each source image as a projection of the original scene onto a low-dimensional subspace that can be learned with the orthogonal matching pursuit (OMP) algorithm, and we reconstruct the fused image from the union of these subspaces. Because the OMP algorithm is computationally expensive, we provide an optimized OMP implementation for encoding a large set of signals over the same dictionary. We evaluate the proposed JSSR fusion method on images from different spectral bands and compare it with state-of-the-art methods in terms of both visual quality and quantitative fusion evaluation metrics. The experimental results demonstrate that our approach improves the visual quality of the fused images.
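To make the sparse-coding step concrete, here is a minimal sketch of plain (non-batched) OMP in Python/NumPy. This is an illustrative assumption, not the paper's optimized implementation: the function name, variable names, and the orthonormal-dictionary usage below are hypothetical, and the paper's batch variant for many signals over one dictionary would amortize work (e.g., the dictionary correlations) that this sketch recomputes per signal.

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal matching pursuit: find a k-sparse code x with y ≈ D @ x.

    D : (m, n) dictionary whose columns (atoms) have unit norm
    y : (m,) signal to encode
    k : target sparsity (number of atoms to select)
    """
    residual = y.copy()
    support = []                          # indices of selected atoms
    x = np.zeros(D.shape[1])
    coef = np.zeros(0)
    for _ in range(k):
        # Greedy step: pick the atom most correlated with the residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Orthogonal step: least-squares fit of y on the selected atoms,
        # then update the residual to be orthogonal to their span.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x[support] = coef
    return x
```

With an orthonormal dictionary, OMP recovers an exactly sparse signal in `k` iterations, which makes a convenient sanity check; for overcomplete dictionaries, recovery holds only under the usual incoherence conditions.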