Human action recognition is widely used in computer vision, pattern recognition, and human–computer interaction, and has attracted substantial attention. Combining deep learning with depth information, this paper proposes a human action recognition method based on an improved convolutional neural network (CNN). First, we use depth motion maps to extract features from each depth sequence, obtaining three projected maps corresponding to the front, side, and top views. On this basis, we construct an improved CNN that takes three-dimensional (3-D) input and performs the recognition in two dimensions, which speeds up computation and reduces the complexity of the recognition process. We evaluate our approach on two public 3-D action datasets, MSR Action3D and UT-Kinect, and on our private CTP Action3D dataset, which we collected with a Kinect sensor. The experimental results show that the proposed method achieves high average recognition rates of 91.3% on MSR Action3D, 97.98% on UT-Kinect, and 93.8% on CTP Action3D. Furthermore, a model trained on one depth video dataset can be readily generalized to other datasets without changing the network parameters.
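
The depth motion map step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a commonly used formulation in which each depth frame is projected onto the front, side, and top planes, and the absolute differences between consecutive projections are accumulated over the sequence. The function names `project_frame` and `depth_motion_maps`, the `depth_bins` parameter, and the convention that a depth value of 0 marks background are hypothetical choices made for this example.

```python
import numpy as np


def project_frame(depth, depth_bins):
    """Project one depth frame onto the front, side, and top planes.

    Assumptions (not from the paper): `depth` is an H x W array of
    integer depth indices in [0, depth_bins), and 0 means background.
    """
    depth = np.asarray(depth)
    h, w = depth.shape
    # Front view: the depth image itself (H x W).
    front = depth.astype(np.float64)
    # Side view (H x D) and top view (D x W): occupancy of each
    # (row, depth) and (depth, column) cell by a foreground pixel.
    side = np.zeros((h, depth_bins), dtype=np.float64)
    top = np.zeros((depth_bins, w), dtype=np.float64)
    ys, xs = np.nonzero(depth)
    zs = depth[ys, xs].astype(np.intp)
    side[ys, zs] = 1.0
    top[zs, xs] = 1.0
    return front, side, top


def depth_motion_maps(frames, depth_bins):
    """Accumulate |projection_t - projection_(t-1)| over a depth sequence.

    Returns a tuple (front_dmm, side_dmm, top_dmm) of 2-D arrays.
    """
    frames = list(frames)
    if len(frames) < 2:
        raise ValueError("need at least two frames to measure motion")
    prev = project_frame(frames[0], depth_bins)
    dmm = [np.zeros_like(p) for p in prev]
    for frame in frames[1:]:
        cur = project_frame(frame, depth_bins)
        for acc, c, p in zip(dmm, cur, prev):
            acc += np.abs(c - p)  # motion energy between consecutive frames
        prev = cur
    return tuple(dmm)
```

Each returned map is a 2-D array, which is consistent with feeding the three views to a network that operates in two dimensions.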