Spatial-temporal interest points (STIPs) are popular in human action recognition. However, they suffer from difficulties in determining size of codebook and losing much information during forming histograms. In this paper, spatial-temporal interest regions (STIRs) are proposed, which are based on STIPs and are capable of marking the locations of the most "shining" human body parts. In order to represent human actions, the proposed approach takes great advantages of multiple features, including STIRs, pyramid histogram of oriented gradients and pyramid histogram of oriented optical flows. To achieve this, cumulative histogram is used to integrate dynamic information in sequences and to form feature vectors. Furthermore, the widely used nearest neighbor and AdaBoost methods are employed as classification algorithms. Experiments on public datasets KTH, Weizmann and UCF sports show that the proposed approach achieves effective and robust results.