Video clip retrieval is an important research topic in content-based multimedia retrieval. The retrieval process generally proceeds as follows: (1) segment a video clip into shots; (2) extract a key frame from each shot as its representative; (3) represent each key frame as a feature vector, so that a video clip becomes a sequence of feature vectors; (4) retrieve matching clips by computing the similarity between the feature vector sequence of the query clip and that of each clip in the database. An index structure is indispensable for fast video clip retrieval. According to our literature survey, the S2-tree is the only index structure that has been applied to video clip retrieval; it combines characteristics of the X-tree and the suffix tree and converts vector sequence retrieval into string matching. However, the S2-tree is not applicable when the feature vectors have more than 20 dimensions, because the X-tree itself cannot support similarity queries effectively beyond that dimensionality. Furthermore, it cannot support flexible definitions of similarity between two vector sequences. The VA-file, in contrast, represents each vector approximately by compressing the original data, and it preserves the original order of the vectors in a sequence, which is a valuable property for vector sequence matching. In this paper, we propose a new video clip similarity model and a video clip retrieval algorithm based on the VA-file. Experiments show that our algorithm substantially shortens retrieval time compared with sequential scanning without an index structure.
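To make the filter-and-refine idea behind a VA-file concrete, the following is a minimal Python sketch, not the algorithm proposed in this paper. It assumes feature values normalized to [0, 1], uniform quantization with the same number of bits per dimension, and a deliberately simple stand-in similarity model (the sum of Euclidean distances between aligned key frames in a sliding window). All function names (`quantize`, `build_va_file`, `lower_bound`, `retrieve`) and the threshold are hypothetical and chosen only for illustration.

```python
import math


def quantize(vector, bits):
    """Map each coordinate in [0, 1] to one of 2**bits equal-width cells."""
    cells = 1 << bits
    return tuple(min(int(x * cells), cells - 1) for x in vector)


def build_va_file(vectors, bits):
    """Approximate every vector; the list keeps the original sequence order."""
    return [quantize(v, bits) for v in vectors]


def lower_bound(query, approx, bits):
    """Smallest possible Euclidean distance from query to any point in the cell."""
    width = 1.0 / (1 << bits)
    total = 0.0
    for q, c in zip(query, approx):
        lo, hi = c * width, (c + 1) * width
        if q < lo:
            total += (lo - q) ** 2
        elif q > hi:
            total += (q - hi) ** 2
    return math.sqrt(total)


def euclidean(a, b):
    """Exact Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_seq, db_seq, db_approx, bits, threshold):
    """Return (start, distance) for windows whose summed distance <= threshold.

    Assumed similarity model: sum of distances between aligned key frames.
    """
    hits = []
    m = len(query_seq)
    for start in range(len(db_seq) - m + 1):
        # Filter step: a cheap bound computed from the approximations only.
        bound = sum(
            lower_bound(q, db_approx[start + i], bits)
            for i, q in enumerate(query_seq)
        )
        if bound > threshold:
            continue  # the exact distance cannot be below the bound
        # Refinement step: exact distances on the surviving window.
        dist = sum(
            euclidean(q, db_seq[start + i]) for i, q in enumerate(query_seq)
        )
        if dist <= threshold:
            hits.append((start, dist))
    return hits


# Toy database of five 2-D key-frame vectors and a two-frame query.
db_seq = [(0.1, 0.1), (0.9, 0.9), (0.5, 0.5), (0.52, 0.48), (0.1, 0.9)]
db_approx = build_va_file(db_seq, bits=2)
query_seq = [(0.5, 0.5), (0.5, 0.5)]
print(retrieve(query_seq, db_seq, db_approx, bits=2, threshold=0.1))
```

Because the summed lower bound never exceeds the exact summed distance, the filter step discards no true match; it only avoids exact distance computations on windows that cannot qualify. Since the approximations are stored in the same order as the original vectors, a window over the approximations corresponds directly to a window over the key-frame sequence.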