X-ray coronary angiography (XCA) is a common, minimally invasive method for diagnosing coronary artery disease. Fast and accurate segmentation of vessel structures from XCA sequences is important for assisting doctors' treatment. However, vessel segmentation is challenging due to low contrast, noise, and overlap from adjacent tissues. In this paper, we develop a novel network model based on the conventional U-Net architecture. To exploit the rich temporal-spatial information in XCA and provide consistent context for vessel inference, we take multiple adjacent frames from an XCA sequence as input and adopt several 3D convolutional layers in the encoder stage to extract temporal-spatial feature representations. In the skip connection layers, a saliency mechanism is used to adaptively filter out noise from the raw temporal-spatial features and emphasize vessel feature components. The refined 3D temporal-spatial features are then fused along the temporal axis to generate 2D feature representations, which are passed to the corresponding 2D upsampling layers in the decoder stage. Experimental results verify the feasibility of applying the saliency mechanism. Extensive experiments further demonstrate that our method outperforms other state-of-the-art methods.
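The skip-connection path described above (reweight the 3D temporal-spatial features, then fuse them along the temporal axis into a 2D map) can be illustrated with a minimal, dependency-free sketch. It is not the model from this paper: the function names are hypothetical, the gate is assumed to be a sigmoid of each frame's mean activation, and the fusion is assumed to be a plain average over frames, whereas the abstract specifies neither operator and the actual network uses learned layers.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def saliency_gate(features):
    """Reweight each frame of a [T][H][W] feature volume by a gate in (0, 1).

    Assumption: the gate is a sigmoid of the frame's mean activation.
    A trained model would learn these weights instead.
    """
    gated = []
    for frame in features:
        count = sum(len(row) for row in frame)
        mean = sum(sum(row) for row in frame) / count
        gate = sigmoid(mean)
        gated.append([[gate * value for value in row] for row in frame])
    return gated


def fuse_temporal(features):
    """Collapse [T][H][W] to [H][W] by averaging over T (assumed fusion operator)."""
    t = len(features)
    h = len(features[0])
    w = len(features[0][0])
    return [
        [sum(features[k][i][j] for k in range(t)) / t for j in range(w)]
        for i in range(h)
    ]


def skip_connection(features):
    """Gate the 3D features, then fuse them into the 2D map fed to the decoder."""
    return fuse_temporal(saliency_gate(features))


# Toy input: 3 adjacent frames of 2x2 features; only the middle frame is active.
volume = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
fused = skip_connection(volume)
print(len(fused), len(fused[0]))  # 2 2: the temporal axis has been removed
```

The point of the sketch is the shape flow: the encoder side keeps a temporal axis, while the decoder side only ever sees 2D maps, so the temporal reduction has to happen in the skip connections.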