Visual object tracking is a fundamental problem in the computer vision community and has been studied for decades. Without additional information, trackers are prone to drift over time. In this paper, we propose a self-repairing online object tracking algorithm based on features at different levels. Fine-grained low-level features are used to locate the specific object in each frame, and coarse-grained high-level features are used to describe the category-level representation. We design a tracking kernel updating mechanism based on the category-level description to correct online tracking drift. We tested the proposed algorithm on the OTB-50 dataset and compared it with several popular real-time online tracking algorithms. Experimental results demonstrate the effectiveness of the proposed method.
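The abstract does not specify the kernel updating rule, so the following is only a minimal sketch under stated assumptions. It uses the running-average template update common in correlation-filter trackers and gates it with a category-level confidence score. The names `update_kernel`, `category_score`, `lr`, and `min_score` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def update_kernel(
    kernel: Sequence[float],
    new_template: Sequence[float],
    category_score: float,
    lr: float = 0.1,
    min_score: float = 0.5,
) -> List[float]:
    """Blend the new template into the kernel only when the category-level
    description still supports the current track (assumed gating rule)."""
    if len(kernel) != len(new_template):
        raise ValueError("kernel and template must have the same length")
    if category_score < min_score:
        # Low category-level confidence: the patch may contain background,
        # so the kernel is left unchanged to avoid learning the drift.
        return list(kernel)
    # Standard linear-interpolation (running-average) update.
    return [(1.0 - lr) * k + lr * n for k, n in zip(kernel, new_template)]
```

With this gating, a frame whose patch no longer looks like the target category leaves the kernel untouched, which is one simple way to keep drift from accumulating in the model.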
We propose a simple yet effective method for long-term object tracking. Unlike traditional visual tracking methods, which mainly depend on frame-to-frame correspondence, we combine high-level semantic information with low-level correspondences. Our method is formulated as a confidence selection framework, which allows the system to recover from drift and to partly handle occlusion. The algorithm can be roughly decomposed into an initialization stage and a tracking stage. In the initialization stage, an offline detector is trained to capture the object appearance at the category level; it is used for detecting the potential target and initializing the tracking stage. The tracking stage consists of three modules: the online tracking module, the detection module, and the decision module. The pretrained detector is used to correct drift of the online tracker, while the online tracker is used to filter out false positive detections. A confidence selection mechanism is proposed to optimize the object location based on the outputs of the online tracker and the detector. If the target is lost, the pretrained detector is used to reinitialize the whole algorithm once the target is relocated. In experiments, we evaluate our method on several challenging video sequences, and it shows a substantial improvement over using detection or online tracking alone.
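The decision module described above can be illustrated with a small sketch. The abstract gives no formula for the confidence selection, so the rule below is an assumption: detections are first filtered by overlap with the tracker box, the higher-confidence source then wins, and the detector alone is used for reinitialization when the tracker is lost. The names `Candidate`, `select_location`, `min_iou`, and `lost_thresh` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Candidate:
    box: Box
    score: float  # confidence in [0, 1]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def select_location(
    track: Optional[Candidate],
    detections: List[Candidate],
    min_iou: float = 0.3,
    lost_thresh: float = 0.2,
) -> Tuple[Optional[Box], str]:
    """Return the chosen box and the name of the source that produced it."""
    if track is not None and track.score >= lost_thresh:
        # Tracker is trusted: it filters out detections far from the target.
        near = [d for d in detections if iou(d.box, track.box) >= min_iou]
        best = max(near, key=lambda d: d.score, default=None)
        if best is not None and best.score > track.score:
            return best.box, "detector"  # detector corrects tracker drift
        return track.box, "tracker"
    # Tracker is lost: reinitialize from the strongest detection, if any.
    best = max(detections, key=lambda d: d.score, default=None)
    if best is not None and best.score >= lost_thresh:
        return best.box, "reinit"
    return None, "lost"
```

In this sketch the tracker suppresses a distant high-scoring detection while it is confident, and the detector takes over only when it overlaps the track with higher confidence or when the track has been lost.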
We propose a new flexible method for calibrating the external parameters of two cameras with non-overlapping fields of view (FOV). A flexible target consisting of four spheres and a 1D bar is designed. All spheres can move freely along the bar, so that each camera can capture a clear image of two spheres. Because the radius of each sphere is known exactly, the center of each sphere in its corresponding camera coordinate system can be determined from the sphere's projection. The centers of the four spheres remain collinear throughout the calibration, so the relationship among the four centers can be expressed using only the external parameters of the two cameras. Once these expressions have been obtained for several target positions, the external parameters of the two cameras can be determined. In the proposed method, the sphere center can be determined accurately because the sphere projection does not depend on the sphere orientation, and the free movement of the spheres ensures that they are imaged clearly. Experimental results show that the proposed calibration method achieves acceptable accuracy: the calibrated vision system reaches an accuracy of 0.105 mm when measuring a distance of 1040 mm. Moreover, the calibration method is efficient, convenient, and easy to operate.
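The collinearity constraint can be written as a residual that vanishes for the correct external parameters. The sketch below is one possible formulation, not necessarily the one used in the paper: it assumes a rotation `R` and translation `t` that map camera-2 coordinates into the camera-1 frame, and it sums the distances of the two transformed camera-2 centers from the line through the two camera-1 centers. All function and variable names are hypothetical.

```python
import math
from typing import List, Sequence

Vec = Sequence[float]


def mat_vec(R: Sequence[Vec], v: Vec) -> List[float]:
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


def sub(a: Vec, b: Vec) -> List[float]:
    return [a[i] - b[i] for i in range(3)]


def add(a: Vec, b: Vec) -> List[float]:
    return [a[i] + b[i] for i in range(3)]


def cross(a: Vec, b: Vec) -> List[float]:
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def norm(a: Vec) -> float:
    return math.sqrt(sum(x * x for x in a))


def collinearity_residual(R, t, p1, p2, q1, q2) -> float:
    """p1, p2: sphere centers seen by camera 1 (camera-1 frame).
    q1, q2: sphere centers seen by camera 2 (camera-2 frame).
    Returns the summed distance of R*q + t from the line through p1 and p2."""
    direction = sub(p2, p1)
    length = norm(direction)
    total = 0.0
    for q in (q1, q2):
        q_in_cam1 = add(mat_vec(R, q), t)
        # Point-to-line distance via the cross product.
        total += norm(cross(direction, sub(q_in_cam1, p1))) / length
    return total
```

Summing this residual over several target positions and minimizing it over `R` and `t` with a nonlinear least-squares solver would be one way to estimate the external parameters; the known distances between sphere centers could be added as further constraints.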
In this paper, we propose a simple yet effective and efficient method for long-term object tracking. Unlike traditional visual tracking methods, which mainly depend on frame-to-frame correspondence, we combine high-level semantic information with low-level correspondences. Our method is formulated as a confidence selection framework, which allows the system to recover from drift and to partly handle occlusion. The algorithm can be roughly decomposed into an initialization stage and a tracking stage. In the initialization stage, an offline classifier is trained to capture the object appearance at the category level. When the video stream arrives, the pre-trained offline classifier is used to detect the potential target and initialize the tracking stage. The tracking stage consists of three parts: an online tracking part, an offline detection part, and a confidence judgment part. The online tracking part captures the appearance of the specific target, while the detection part localizes the object using the pre-trained offline classifier. Since there is no data dependence between online tracking and offline detection, the two parts run in parallel, which significantly improves the processing speed. A confidence selection mechanism is proposed to optimize the object location. In addition, we propose a simple mechanism to judge whether the object is absent. If the target is lost, the pre-trained offline classifier is used to re-initialize the whole algorithm as soon as the target is re-located. In experiments, we evaluate our method on several challenging video sequences and demonstrate competitive results.
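Because the online tracking part and the offline detection part share no data within a frame, they can be dispatched concurrently and joined before the confidence judgment. The sketch below shows this pattern with Python's standard `concurrent.futures` thread pool. The `tracker` and `detector` callables are hypothetical stand-ins for the two parts, and the choice of threads is an assumption: in Python, threads only give a real speed-up when the underlying work releases the interpreter lock, as native image-processing code typically does.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable, List, Tuple


def process_frame(
    frame: Any,
    tracker: Callable[[Any], Any],
    detector: Callable[[Any], Any],
    pool: ThreadPoolExecutor,
) -> Tuple[Any, Any]:
    """Run both parts on the same frame concurrently and wait for both."""
    track_future = pool.submit(tracker, frame)
    detect_future = pool.submit(detector, frame)
    # Both results are needed before the confidence judgment can run.
    return track_future.result(), detect_future.result()


def run(
    frames: Iterable[Any],
    tracker: Callable[[Any], Any],
    detector: Callable[[Any], Any],
) -> List[Tuple[Any, Any]]:
    """Process a sequence of frames, one (track, detection) pair per frame."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        return [process_frame(f, tracker, detector, pool) for f in frames]
```

Each returned pair would then be passed to the confidence judgment part, which decides the final object location or declares the target absent.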