Traditional 3D human pose estimation algorithms are often limited by the accuracy of 2D keypoint detection and camera calibration, and they struggle with low-resolution and occluded scenes. To address these challenges, we propose a multi-view 3D human pose estimation method that incorporates prior information. In the 2D pose extraction stage, we design a bottom-up detection network, HRPifPaf, to achieve accurate human pose detection in low-resolution scenarios. It first constructs a high-resolution feature extraction module that fuses features from different scales. A joint prediction and association module then combines confidence scores and scale factors with vector directions pointing toward the main body parts of the joints, and a Kalman filter refines the final detection results. In the 3D pose synthesis stage, we propose a joint multi-camera parameter optimization and calibration method that leverages prior information about the human skeleton to handle body occlusion and inaccurate camera intrinsic and extrinsic parameters, designing a comprehensive cost function based on reprojection error and human-body geometry constraints.
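To make the 3D-stage objective concrete, the following is a minimal sketch (not the paper's exact formulation) of a composite cost that sums squared reprojection error over cameras with a penalty keeping estimated bone lengths close to skeleton priors. The pinhole projection model, the weight `lam`, and all function names are illustrative assumptions.

```python
import numpy as np

def project(K, R, t, X):
    """Project 3D points X (N,3) through a pinhole camera with
    intrinsics K (3,3) and extrinsics R (3,3), t (3,)."""
    Xc = X @ R.T + t              # points in camera coordinates
    uv = Xc @ K.T                 # homogeneous image coordinates
    return uv[:, :2] / uv[:, 2:3]

def composite_cost(X, cams, obs2d, bones, bone_priors, lam=1.0):
    """
    X           : (N,3) candidate 3D joint positions
    cams        : list of (K, R, t) tuples, one per camera
    obs2d       : list of (N,2) detected 2D joints, one per camera
    bones       : list of (i, j) joint-index pairs forming bones
    bone_priors : prior length for each bone (assumed known)
    lam         : weight of the geometry term (illustrative)
    """
    cost = 0.0
    # Reprojection term: squared pixel error across all views.
    for (K, R, t), pts in zip(cams, obs2d):
        cost += np.sum((project(K, R, t, X) - pts) ** 2)
    # Geometry term: deviation of bone lengths from skeleton priors.
    for (i, j), L in zip(bones, bone_priors):
        cost += lam * (np.linalg.norm(X[i] - X[j]) - L) ** 2
    return cost
```

Minimizing such a cost jointly over the 3D joints (and, in the paper's setting, the camera parameters) lets the skeleton prior compensate for occluded or miscalibrated views, since the geometry term remains informative even when a joint's reprojection evidence is missing.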