We propose a three-dimensional video segmentation method based on deep convolutional neural networks. The algorithm combines the local gradient computed at each pixel location with a global boundary map obtained through deep learning to generate initial pixel groups by traversing from low-gradient to high-gradient regions. A local clustering method then refines these initial pixel groups. The refined subvolumes in the homogeneous regions of the video are selected as initial seeds and iteratively combined with adjacent groups based on intensity similarity. Volume growth terminates at the color boundaries of the video. The resulting oversegments are then merged hierarchically by a multivariate approach, yielding a final segmentation map for each frame. The results show that the proposed method compares favorably, both qualitatively and quantitatively, in segmentation quality and computational efficiency with recent state-of-the-art techniques on the video segmentation benchmark dataset.
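The seed-growing step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a grayscale video stored as nested lists indexed `[frame][row][column]`, grows at the level of single voxels rather than pixel groups, and stands in for the color-boundary stopping rule with a simple intensity tolerance. The names `grow_seed` and `tol` are hypothetical.

```python
from collections import deque

# Six-connected neighborhood in (frame, row, column) space.
NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def grow_seed(volume, seed, tol):
    """Grow a 6-connected region from `seed`.

    A voxel joins the region when its intensity is within `tol` of the
    seed intensity (an assumed stand-in for the paper's similarity test).
    Growth stops where the intensity jump exceeds `tol`, which plays the
    role of a boundary in this simplified setting.
    """
    frames, rows, cols = len(volume), len(volume[0]), len(volume[0][0])
    seed_value = volume[seed[0]][seed[1]][seed[2]]
    region = {seed}
    queue = deque([seed])
    while queue:
        t, y, x = queue.popleft()
        for dt, dy, dx in NEIGHBORS:
            nt, ny, nx = t + dt, y + dy, x + dx
            inside = 0 <= nt < frames and 0 <= ny < rows and 0 <= nx < cols
            if not inside or (nt, ny, nx) in region:
                continue
            if abs(volume[nt][ny][nx] - seed_value) <= tol:
                region.add((nt, ny, nx))
                queue.append((nt, ny, nx))
    return region


# Toy video: 2 frames of 2x4 pixels, dark left half and bright right half.
volume = [
    [[10, 10, 200, 200], [10, 12, 200, 200]],
    [[10, 11, 200, 200], [10, 10, 198, 200]],
]
region = grow_seed(volume, seed=(0, 0, 0), tol=5)
print(len(region))
```

On this toy volume the region spreads across both frames but stays in the dark left half, because the jump to the bright half exceeds the tolerance. A fuller treatment would compare color vectors and consult the gradient and boundary maps instead of a fixed scalar tolerance.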