Monocular image-based three-dimensional (3-D) human pose recovery aims to retrieve 3-D poses using the corresponding two-dimensional image features. Therefore, the pose recovery performance highly depends on the image representations. We propose a multispectral embedding-based deep neural network (MSEDNN) to automatically obtain the most discriminative features from multiple deep convolutional neural networks and then embed their penultimate fully connected layers into a low-dimensional manifold. This compact manifold can explore not only the optimum output from multiple deep networks but also the complementary properties of them. Furthermore, the distribution of each hierarchy discriminative manifold is sufficiently smooth so that the training process of our MSEDNN can be effectively implemented only using few labeled data. Our proposed network contains a body joint detector and a human pose regressor that are jointly trained. Extensive experiments conducted on four databases show that our proposed MSEDNN can achieve the best recovery performance compared with the state-of-the-art methods.