In a binocular vision system, acquiring matching points is the most important step in recovering the three-dimensional information of an object. The structure tensor is a matrix representation of the gradient distribution in the local neighborhood of each point. It therefore performs well in detecting regions with local structure, and it is well suited to detecting specific objects in an image, such as pedestrians, cars, and road signs. In this paper, the structure tensor is combined with luminance information to form an extended structure tensor. The directional derivatives of luminance in the x and y directions are calculated so that the local structure of the image becomes more prominent. The Euclidean distance between the eigenvectors of key points is used as the similarity metric for key points in the two images. Matching then yields precise coordinates of the matching points in the detected target. Experiments were performed on captured left and right images: after binocular calibration, image matching was performed to acquire the matching points, and the target depth was then calculated from these matching points. The comparison shows that the structure tensor can accurately acquire matching points in binocular stereo matching.
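
The pipeline summarized above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. It assumes that the extended structure tensor is the locally averaged outer product of u = [Ix, Iy, I] (luminance derivatives plus luminance), that a key point is described by the six unique entries of this symmetric tensor, and that the image pair is rectified so that depth follows Z = f·B/d. The function names, the box-filter window, and the descriptor choice are all hypothetical.

```python
import numpy as np

def box_filter(a, radius):
    """Mean filter over a (2*radius+1)^2 window, using edge padding."""
    k = 2 * radius + 1
    p = np.pad(a, radius, mode="edge")
    # Summed-area table with a leading row and column of zeros.
    s = np.zeros((p.shape[0] + 1, p.shape[1] + 1))
    s[1:, 1:] = p.cumsum(axis=0).cumsum(axis=1)
    return (s[k:, k:] - s[:-k, k:] - s[k:, :-k] + s[:-k, :-k]) / (k * k)

def extended_structure_tensor(lum, radius=2):
    """Return an (H, W, 3, 3) tensor field built from u = [Ix, Iy, I].

    Assumption: this particular form of the 'extended' tensor is
    illustrative; the paper may define it differently.
    """
    lum = np.asarray(lum, dtype=float)
    iy, ix = np.gradient(lum)                   # derivatives along y (rows), x (cols)
    u = np.stack([ix, iy, lum], axis=-1)        # (H, W, 3)
    outer = u[..., :, None] * u[..., None, :]   # (H, W, 3, 3)
    tensor = np.empty_like(outer)
    for a in range(3):
        for b in range(3):
            tensor[..., a, b] = box_filter(outer[..., a, b], radius)
    return tensor

def describe(tensor, keypoints):
    """Hypothetical descriptor: the 6 unique entries of each 3x3 tensor."""
    rows, cols = np.asarray(keypoints).T
    iu = np.triu_indices(3)
    return tensor[rows, cols][:, iu[0], iu[1]]  # (N, 6)

def match(desc_left, desc_right):
    """For each left descriptor, index of the nearest right descriptor
    under Euclidean distance."""
    diff = desc_left[:, None, :] - desc_right[None, :, :]
    return np.linalg.norm(diff, axis=2).argmin(axis=1)

def depth_from_disparity(x_left, x_right, focal_px, baseline):
    """Depth Z = f * B / d for a rectified pair (d must be nonzero)."""
    disparity = np.asarray(x_left, float) - np.asarray(x_right, float)
    return focal_px * baseline / disparity
```

In this sketch, key point detection and binocular calibration are taken as given: `keypoints` is a list of (row, column) pixel positions, `focal_px` is the focal length in pixels, and `baseline` is the camera separation, so the returned depth is in the same unit as `baseline`.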