We propose an approach to reconstructing a three-dimensional model from a stream of RGB-D images. Compared with existing methods, our strategy performs well in challenging cases involving fast camera motion and missing depth values. The key idea underlying our approach is to combine registration with patch segmentation based on RGB information. Using these segments together with a patch significance correspondence algorithm, we transform a single global constraint into a set of local constraints during camera movement. Furthermore, we propose a method that improves the precision of geometric registration by aligning corresponding patches rather than entire point clouds. Building on this treatment of spatial relations, we also propose a fusion strategy that extracts correct transformations even from low-quality RGB-D sequences. Results on RGB-D benchmark sequences and comparisons with the KinectFusion system show that the proposed approach substantially increases the accuracy of the reconstructed models.