This paper presents a human pose recognition method which simultaneously reconstructs a human volume based on ensemble of voxel classifiers from a single depth image in real-time. The human pose recognition is a difficult task since a single depth camera can capture only visible surfaces of a human body. In order to recognize invisible (self-occluded) surfaces of a human body, the proposed algorithm employs voxel classifiers trained with multi-layered synthetic voxels. Specifically, ray-casting onto a volumetric human model generates a synthetic voxel, where voxel consists of a 3D position and ID corresponding to the body part. The synthesized volumetric data which contain both visible and invisible body voxels are utilized to train the voxel classifiers. As a result, the voxel classifiers not only identify the visible voxels but also reconstruct the 3D positions and the IDs of the invisible voxels. The experimental results show improved performance on estimating the human poses due to the capability of inferring the invisible human body voxels. It is expected that the proposed algorithm can be applied to many fields such as telepresence, gaming, virtual fitting, wellness business, and real 3D contents control on real 3D displays.
This paper presents a method for tracking human poses in real-time from depth image sequences. The key idea is to adopt recognition for generating the model to be tracked. In contrast to traditional methods utilizing a single-typed 3D body model, we directly define the human body model based on the body part recognition result of the captured depth image, which leads to the reliable tracking regardless of users' appearances. Moreover, the proposed method has the ability to efficiently reduce the tracking drift by exploiting the joint information inserted into our body model. Experimental results on real-world environments show that the proposed method is effective for estimating various human poses in real-time.