Video surveillance systems have become a routine part of daily life, and surveillance videos carry rich visual information about criminal actions occurring in the field of view. With criminal activity increasing, accurate criminal action recognition systems are needed. This paper proposes and evaluates an action recognition system for criminal actions. First, we propose a descriptor, spatiotemporal human motion acceleration (ST-HMA), computed over the improved dense trajectories (IDT) framework, to recognize criminal actions. Second, we build the hybrid criminal action (HCA) dataset by combining criminal activities (fight, kick, push, punch, shoot gun, and sword fighting) collected from existing state-of-the-art datasets. The dataset covers common on-street criminal action poses. We also evaluate other descriptors over the IDT framework. The per-class accuracies achieved are 92.85%, 92.85%, 93.33%, and 96.16% for the kick, push, punch, and fight actions, respectively. Experimental results show that ST-HMA over the IDT framework outperforms the HMA descriptor in the edge trajectory framework. The proposed framework achieves an average accuracy of 80.89% for the ST-HMA descriptor over IDT. The other descriptors applied over IDT also show good action recognition accuracy on the HCA dataset.