Instance segmentation for obstacle detection, based on machine vision and deep learning, is important for autonomous driving systems. This paper proposes an instance segmentation method that uses Mask R-CNN with feature fusion of RGB and depth images. A two-layer Network in Network (NiN) extracts features from the depth image, and a convolution fuses the RGB and depth features and reduces their dimensionality. The edge texture in the depth image improves the accuracy of bounding-box localization. Experimental results on a typical benchmark dataset demonstrate the effectiveness of the proposed method, which improves segmentation accuracy by 4% and recall by 2%.
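
The fusion step described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the channel counts, the random weights, and the helper names (`conv1x1`, `fuse`) are hypothetical, and the NiN branch is simplified to two 1x1 convolution layers with ReLU, omitting the spatial convolution that a full NiN block would normally include. The sketch only shows how depth features can be concatenated with RGB features along the channel axis and then reduced by a 1x1 convolution.

```python
import random


def conv1x1(feature_map, weights, bias):
    """Apply a 1x1 convolution to a feature map stored as [H][W][C_in].

    weights has shape [C_out][C_in]. A 1x1 convolution applies the same
    linear map independently at every pixel.
    """
    return [
        [
            [sum(w * x for w, x in zip(w_row, pixel)) + b
             for w_row, b in zip(weights, bias)]
            for pixel in row
        ]
        for row in feature_map
    ]


def relu(feature_map):
    """Element-wise ReLU over a [H][W][C] feature map."""
    return [[[max(0.0, v) for v in pixel] for pixel in row]
            for row in feature_map]


def concat_channels(a, b):
    """Concatenate two [H][W][C] feature maps along the channel axis."""
    return [[pa + pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def random_layer(c_out, c_in, rng):
    """Hypothetical random weights and zero biases for a 1x1 layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(c_in)]
               for _ in range(c_out)]
    return weights, [0.0] * c_out


def fuse(rgb_feat, depth_img, layers):
    """Simplified two-layer NiN-style depth branch, then 1x1 fusion."""
    (w1, b1), (w2, b2), (wf, bf) = layers
    d = relu(conv1x1(depth_img, w1, b1))   # depth layer 1
    d = relu(conv1x1(d, w2, b2))           # depth layer 2
    fused = concat_channels(rgb_feat, d)   # stack RGB and depth channels
    return conv1x1(fused, wf, bf)          # fuse and reduce channels


# Assumed toy sizes: 2x3 map, 4 RGB feature channels, 1 depth channel.
H, W, C_RGB, C_HIDDEN, C_DEPTH, C_OUT = 2, 3, 4, 3, 4, 4
rng = random.Random(0)
rgb_feat = [[[rng.random() for _ in range(C_RGB)] for _ in range(W)]
            for _ in range(H)]
depth_img = [[[rng.random()] for _ in range(W)] for _ in range(H)]
layers = (
    random_layer(C_HIDDEN, 1, rng),
    random_layer(C_DEPTH, C_HIDDEN, rng),
    random_layer(C_OUT, C_RGB + C_DEPTH, rng),
)
out = fuse(rgb_feat, depth_img, layers)
print(len(out), len(out[0]), len(out[0][0]))  # prints: 2 3 4
```

In this sketch the concatenated map has 8 channels and the final 1x1 convolution brings it back to 4, so the fused output has the same channel count as the RGB features alone and could feed a backbone that expects that size.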