Contemporary object pose estimation algorithms predict the transformation parameters that relate an object's observed perspective to a reference pose. Learning these parameters typically requires far more data than conventional sensors provide, so synthetic data is frequently used to increase the amount of data, the number of object perspectives, and the number of object classes, all of which improve the generalization of pose estimation algorithms. However, robust synthesis of objects from different perspectives requires manually setting the angular increment between sampled poses. Consequently, learning from arbitrarily small increments demands very fine-grained sampling of existing sensor data, which increases the time, complexity, and resources needed for a larger sample size. There is therefore a need to minimize the sampling and processing required by synthesis methods (e.g., generative models), which struggle to produce samples lying outside clusters in the latent space, a failure known as mode collapse. While reducing the number of observed object perspectives directly addresses this problem, generative models also have difficulty synthesizing out-of-distribution (OOD) data. We study the effects of synthesizing OOD data by exploiting orthogonality constraints to generate intermediate poses of 3D point cloud object representations that are not observed during training. Additionally, we perform an ablation study over each axial rotation and compare the OOD generative capabilities of different model types. We test and evaluate our proposed method on objects from ShapeNet.
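To make the notion of an orthogonality-constrained intermediate pose concrete, the following minimal sketch (our own illustration, not the paper's implementation; the variable names, the 30-degree training increment, and the single-axis rotation are all assumptions) rotates a toy point cloud to an angle that falls between the training increments, checking that the rotation matrix satisfies the orthogonality constraints R^T R = I and det(R) = +1 that characterize a valid element of SO(3):

```python
# Hypothetical sketch: synthesizing an unobserved intermediate pose of a
# 3D point cloud via an orthogonal rotation matrix (not the paper's code).
import numpy as np

def rotation_z(theta):
    """Rotation matrix about the z-axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

# Toy point cloud: N x 3 array (stand-in for a ShapeNet object).
points = np.random.rand(1024, 3) - 0.5

# Training poses sampled at an assumed fixed 30-degree increment ...
observed_poses = [(rotation_z(np.deg2rad(a)) @ points.T).T
                  for a in range(0, 360, 30)]

# ... while an intermediate pose (15 degrees, never seen in training)
# lies between the increments and is "OOD" with respect to that set.
R = rotation_z(np.deg2rad(15.0))

# Orthogonality constraints: R^T R = I and det(R) = +1.
assert np.allclose(R.T @ R, np.eye(3))
assert np.isclose(np.linalg.det(R), 1.0)

intermediate_pose = (R @ points.T).T  # unobserved target pose
```

In a learned setting, the generative model would be asked to produce `intermediate_pose` from the observed increments rather than computing it analytically as above; the orthogonality checks indicate the constraint the synthesized rotation is expected to respect.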