Ship detection and tracking method for satellite video based on multiscale saliency and surrounding contrast analysis

Abstract. In port surveillance, monitoring based on satellite video is a valuable supplement to ground monitoring systems because of its wide monitoring range. Automatic ship detection and tracking based on satellite video is therefore an important research field. However, it is also a challenging subject because ships are small and textureless and because of interference from sea noise. An approach for automatically detecting and tracking moving ships of different sizes in satellite video is presented. First, motion compensation between two frames is performed. Second, saliency maps of the multiscale differential image are combined to create a dynamic multiscale saliency map (DMSM), which is better suited to detecting ships of different sizes. Third, candidate motion regions are segmented from the DMSM, and moving ships are detected after false alarms are removed based on the surrounding contrast. Fourth, features of the moving ships, namely centroid distance, area ratio, and histogram distance, are used to perform ship matching. Finally, ship association and tracking are realized by using the intermediate frame in every three adjacent frames. Experimental results on satellite sequences indicate that our method can effectively detect and track ships and obtain the target tracks, and that it outperforms other classical target tracking methods in terms of the defined recall and precision.


Introduction
The automatic monitoring of ships using video surveillance plays an important role in ocean security, maritime transportation, fishery management, ship traffic surveillance, and so on. In general, the video sources of ship monitoring systems are mainly ground-based [1,2] or aerial-based [3,4] cameras. However, these systems have shortcomings such as poor concealment, limited spatial coverage, and the need for sensor installation and maintenance.
Fortunately, satellite-based monitoring can overcome these shortcomings. However, video satellites were not available until recently. 6][7][8] Compared with SAR images, optical satellite images have higher spatial resolution, which makes them more suitable for ship detection, and they are an important supplement to SAR images.
1][12][13] Most of these methods adopt coarse-to-fine strategies, which can be divided into a ship candidate extraction stage and a false alarm elimination stage. In the first stage, these methods extract candidate ships according to the differences in gray values between potential targets and the background [14,15]. In the second stage, most algorithms use ship features with candidate classifiers to discriminate ships from false alarms [16], and an important issue is finding efficient descriptors for the ship targets. Further, some existing methods use a priori coastline data to detect the sea area [17]; however, they are still disturbed by coastal areas because of the low accuracy of the coastline data.
However, the low temporal resolution of existing satellite images limits the timeliness of ship detection and tracking. Fortunately, optical video satellites, such as Skysat and Jilin-1, have been developed thanks to advances in camera technology for systems with high spatial and temporal resolution. ][20] It should be noted that satellite video plays an important role in ship monitoring because of its strong concealment, wide monitoring range, and real-time continuous monitoring. Although much progress has been achieved in ship detection based on static satellite images, few studies have focused on moving ship detection and tracking based on satellite video, which will become a new research topic in the field of remote sensing. Rao et al. [21] proposed to estimate ship speed and direction by locating ships and their tracks in multisatellite imagery. Yao et al. [22] and Zhang et al. [23] proposed ship-tracking methods for GF-4 satellite sequential imagery based on the automatic identification system data of ships, which are used for the surveillance of cooperative ships.
As shown in Fig. 1, ship features can be easily extracted and recognized from ground-based video [see Fig. 1(a)] and aerial-based video [see Fig. 1(b)] because of their sufficient spatial resolution, which is convenient for feature description in ship detection and tracking. Compared with ground-based and aerial-based video, ship detection and tracking based on satellite video faces several problems (see Fig. 1). The moving ships in satellite-based video range from just a few pixels to dozens of pixels, have similar brightness because of the limited resolution, and may even exhibit low contrast to the background because of clouds and sea waves. Given these characteristics, it is difficult to extract usable appearance features or texture information inside a ship, and some small ships may be submerged in a complex background. Another problem for ship detection and tracking is the complex background caused by the large field of view, which contains disturbances such as clouds and moving targets on land. Because of these characteristics of satellite video, traditional target detection and tracking methods struggle to perform well for ships in satellite video.
The goal of this paper is to detect and track moving ships in satellite video. Inspired by the multiscale selective cognition property of the human visual system, we propose an integrated algorithm for ship detection and tracking. The main contributions can be summarized in the following three aspects: (1) To detect ships of different sizes, a dynamic multiscale saliency map (DMSM) is computed from the differential image between two frames, which can detect moving ships with small displacement and avoids the holes inside moving ships that the frame difference method may generate. (2) The surrounding contrast and ship characteristics are used to discriminate moving ships from false alarms such as clouds and moving targets on land. (3) An integrated ship detection and tracking scheme using the intermediate frame in every three adjacent frames is proposed, which avoids having to register all frames to the same reference frame.
The rest of this paper is organized as follows. Section 2 introduces the proposed moving ship detection and tracking method. Section 3 describes the experimental results and analysis, and Sec. 4 gives the conclusion.

Methodology
The flowchart of the proposed ship detection and tracking method is shown in Fig. 2. It consists of two stages: ship detection and ship tracking.
In the first stage, given two frames, camera motion is first compensated using least-squares image matching. Second, a saliency detection method is applied to the differential image to construct the DMSM. Third, the DMSM is segmented to obtain a binary image, and moving ships are then extracted with the help of ship characteristics and surrounding contrast.
In the second stage, the centroid distance, area ratio, and histogram distance are first used to perform ship matching between two frames. Ship association and tracking are then performed by using the intermediate frame in every three frames.

Motion Compensation
The video captured by a satellite-mounted camera always includes camera motion. Therefore, given two frames t − 1 and t, the natural solution is to estimate the transformation parameters and compensate for the camera motion. Many approaches for estimating the affine transformation between two images have been reported in the literature. Most of them require a set of matched feature points to estimate the affine transformation parameters. However, matched feature points can easily fall on moving objects, which results in wrong affine parameters.
Therefore, the least-squares image matching method, which does not require matched feature points, is used to compensate for motion. Denote frame t − 1 and frame t as the source frame $f_S(x, y, t-1)$ and the target frame $f_T(x, y, t)$, respectively. To account for intensity variations, the relationship between the two frames can be modeled by the following transform:

$$m_7 f_T(x, y, t) + m_8 = f_S(m_1 x + m_2 y + m_5,\; m_3 x + m_4 y + m_6,\; t-1), \tag{1}$$

where $m_1$ to $m_8$ are the transform parameters. Among them, $m_1$ to $m_4$ form the 2 × 2 affine matrix, $m_5$ and $m_6$ form the translation vector, and $m_7$ and $m_8$ embody a change in contrast and brightness. To estimate these parameters, the following quadratic error is minimized:

$$E(m_1, \ldots, m_8) = \sum_{x, y} \left[ m_7 f_T(x, y, t) + m_8 - f_S(m_1 x + m_2 y + m_5,\; m_3 x + m_4 y + m_6,\; t-1) \right]^2. \tag{2}$$

First, to simplify the minimization, the error function of Eq. (2) is approximated through a Taylor series expansion. A more accurate estimate of the actual error function can be determined using a Newton-Raphson style iterative scheme [24]. In particular, on each iteration, the estimated transformation is applied to the source image, and a new transformation is estimated between the newly registered source and target images. Second, to handle large displacements between the two frames and improve the registration speed, a coarse-to-fine registration scheme is adopted. The details can be found in Ref. 25. After the affine parameters are obtained, they are applied to the source frame to obtain the registered source frame. This method can achieve subpixel registration accuracy and can register two frames with illumination changes.
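As a small illustration of the least-squares idea, the sketch below fits only the contrast and brightness parameters, under the simplifying assumption that the two frames are already geometrically aligned, so that the photometric relation reduces to m7 · f_T + m8 ≈ f_S. The function name `fit_contrast_brightness` and the list-of-rows image layout are hypothetical; the full method also estimates the six geometric parameters iteratively, which is not shown here.

```python
def fit_contrast_brightness(source, target):
    """Closed-form least-squares fit of (m7, m8) so that m7 * target + m8 ~ source.

    Assumption for illustration: frames are lists of rows of gray values and are
    already geometrically registered, so only the photometric terms remain.
    """
    xs = [v for row in target for v in row]
    ys = [v for row in source for v in row]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("target frame is constant; contrast is not identifiable")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m7 = cov_xy / var_x            # contrast gain
    m8 = mean_y - m7 * mean_x      # brightness offset
    return m7, m8


target = [[0.0, 10.0], [20.0, 30.0]]
source = [[5.0, 25.0], [45.0, 65.0]]   # equals 2 * target + 5
m7, m8 = fit_contrast_brightness(source, target)
```

In the full scheme, an analogous linear least-squares solve would be repeated at each pyramid level and iteration after linearizing the geometric part of the model.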

Dynamic Multiscale Saliency Map
Given two registered frames obtained at different times, the simplest way to detect motion regions is frame differencing [26]. However, if the displacement of a moving ship between the two frames is small, holes often appear inside the moving ship in the differential image. Similar intensity values across the entire ship also cause holes inside the moving ship. In all these cases, it is difficult to detect a complete moving ship, as shown in Fig. 3(c).
Visual saliency is one of the preattentive processes that make us focus our eyes on attractive regions of a scene [27]. Because of its ability to capture salient regions, visual saliency has been widely applied in target detection, where it is usually used to segment the salient target [28,29]. Therefore, we take advantage of this technique to highlight ship areas while suppressing the background in the differential image.
However, each frame may contain many moving ships with different sizes and speeds, so it is difficult to extract a complete ship without holes from the single-scale saliency map of the differential image. As shown in Fig. 3(d), the single-scale saliency map with the spectral residual (SR) model [30] cannot effectively eliminate the holes inside a moving ship. Therefore, we propose to compute the DMSM of the differential image based on the SR model, which is calculated with the following steps:
Step 1: A Gaussian pyramid of the differential image is built with n scales, expressed as $\{G_i \mid i = 1, 2, \ldots, n\}$.
Step 2: Calculate the log amplitude and phase spectrum of image $G_i$:

$$F_i(f) = \mathcal{F}[G_i], \qquad L_i(f) = \log[\lVert F_i(f) \rVert], \qquad \Phi_i(f) = \mathrm{ph}[F_i(f)], \tag{3}$$

where $\mathcal{F}(\cdot)$ is the Fourier transform, $\mathrm{ph}(\cdot)$ is a function for computing the phase spectrum, $\Phi_i(f)$ and $\lVert F_i(f) \rVert$ are the phase and amplitude spectra, respectively, and $L_i(f)$ denotes the log amplitude spectrum.
Step 3: Calculate the SR of $G_i$:

$$R_i(f) = L_i(f) - h * L_i(f), \tag{4}$$

where h is the 3 × 3 averaging filter and * denotes convolution.
Step 4: Apply the inverse Fourier transform to the SR while keeping the phase spectrum to obtain the saliency map of $G_i$:

$$S_i = \left\lVert \mathcal{F}^{-1}\{\exp[R_i(f) + j\Phi_i(f)]\} \right\rVert^2. \tag{5}$$

Step 5: The saliency map $S_i$ is resized to the size of the differential image [Eq. (6)]; the resized saliency map $\tilde{S}_i$ has the same width and height as the differential image.
Step 6: The DMSM is calculated by combining the resized saliency maps of all n scales with a two-dimensional Gaussian filter g [Eq. (7)].
As shown in Fig. 3(e), regions of moving ships are highlighted in the output DMSM, and the holes inside the moving ships are successfully eliminated compared with the single-scale SR model. As seen clearly from the zoomed-in regions [Figs. 3(f1)-3(h1) and 3(f2)-3(h2)], our method performs better at eliminating holes because the DMSM is generated at different resolutions, as in the human visual system.
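The six steps above can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the pyramid is built by 2 × 2 block averaging instead of Gaussian filtering, resizing uses nearest-neighbor sampling, the per-scale maps are normalized and averaged (the exact combination rule is an assumption), and the final Gaussian smoothing with g is omitted. All function names are hypothetical.

```python
import numpy as np


def box_filter3(a):
    """3 x 3 averaging filter h, with edge replication at the borders."""
    padded = np.pad(a, 1, mode="edge")
    h, w = a.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0


def spectral_residual_saliency(img):
    """Single-scale SR saliency: log amplitude minus its local average, phase kept."""
    spectrum = np.fft.fft2(img)
    log_amp = np.log(np.abs(spectrum) + 1e-12)   # small constant avoids log(0)
    phase = np.angle(spectrum)
    residual = log_amp - box_filter3(log_amp)
    return np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2


def downsample2(img):
    """Halve the resolution by 2 x 2 block averaging (stand-in for a Gaussian pyramid)."""
    h2, w2 = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))


def resize_nearest(img, shape):
    """Nearest-neighbor resize back to the size of the differential image."""
    rows = np.arange(shape[0]) * img.shape[0] // shape[0]
    cols = np.arange(shape[1]) * img.shape[1] // shape[1]
    return img[np.ix_(rows, cols)]


def dmsm(diff, n_scales=3):
    """Average of normalized SR saliency maps over n_scales pyramid levels."""
    level = np.asarray(diff, dtype=float)
    acc = np.zeros(level.shape)
    for _ in range(n_scales):
        s = spectral_residual_saliency(level)
        s = s / (s.max() + 1e-12)                # normalize each scale to [0, 1]
        acc += resize_nearest(s, acc.shape)
        level = downsample2(level)
    return acc / n_scales


diff = np.zeros((32, 32))
diff[10:14, 20:26] = 1.0                          # toy "moving ship" in the difference image
saliency = dmsm(diff, n_scales=3)
```

Coarser levels respond to larger structures, which is the intuition behind combining scales to fill the holes left by a single-scale map.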

Ship Region Detection
Because the ship regions are relatively salient in the DMSM, ship candidates can be obtained by segmenting the DMSM. Here, a simple segmentation method based on the mean and standard deviation of the DMSM is applied to compute an adaptive threshold:

$$T = \mu_s + \lambda \sigma_s, \tag{8}$$

where $\mu_s$ and $\sigma_s$ are the mean and standard deviation of the DMSM, respectively, and λ is a coefficient that was empirically set between 1.0 and 2.0 in our experiments. Further, a morphological dilation operation is applied to eliminate the remaining holes within regions, and a binary image is obtained [Fig. 4(a)].
After the candidate ship regions are obtained, some false alarms may remain, such as ship wakes, ocean waves, land, and clouds. As shown in Figs. 4(b) and 4(c), there are two regions without moving ships. Therefore, we further eliminate obvious false candidates with the following steps:
Step 1: Each candidate region [Fig. 5(a)] is segmented by the Otsu segmentation method [31] [result shown in Fig. 5(b)].
Step 2: Ships always have a limited range of area, length, and width. According to these constraints, false candidate regions, such as very large or very small islands and clouds, can be eliminated with proper thresholds. Furthermore, ships are commonly long and thin, so the ratio of length to width is larger than a given threshold. According to this condition, obvious false alarms, including islands and clouds with very small ratios, are eliminated [13].
Step 3: To obtain more of the background region, the morphological dilation operation is further applied to Fig. 5(c) [result shown in Fig. 5(e)], and the corresponding region, which includes both the background and the foreground, is shown in Fig. 5(f). Figure 5(g) is the background region after the subtraction operation.
Step 4: Because the intensity of the sea surface in the image is quite different from that of the ship, the surrounding contrast between sea and ship helps to eliminate obvious false candidates. Based on this characteristic, we regard candidates that do not satisfy the conditions of Eq. (9) as obvious false alarms, where $\mu_{FG}$ and $\sigma_{FG}$ are the mean and standard deviation of the foreground region, $\mu_{BG}$ and $\sigma_{BG}$ are the mean and standard deviation of the background region, $\mu_{FG+BG}$ and $\sigma_{FG+BG}$ are the mean and standard deviation of the combined foreground and background region, and γ is empirically set between 1.5 and 2.0 in our experiments.
After false alarm elimination, the resulting regions are returned as moving ships.
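The adaptive segmentation and the shape constraints of Step 2 can be illustrated with a short sketch. It assumes the threshold is the mean plus λ times the (population) standard deviation of the saliency values; the area range and the minimum length-to-width ratio are hypothetical values chosen only for the example, since the paper does not report them.

```python
from statistics import mean, pstdev


def adaptive_threshold(saliency, lam=1.0):
    """Binarize a saliency map (list of rows) with T = mean + lam * std."""
    values = [v for row in saliency for v in row]
    t = mean(values) + lam * pstdev(values)
    return [[1 if v > t else 0 for v in row] for row in saliency]


def is_ship_like(area, length, width, area_range=(4, 400), min_ratio=1.5):
    """Shape test: area within a plausible range and an elongated outline.

    area_range and min_ratio are illustrative assumptions, not values from the paper.
    """
    if not area_range[0] <= area <= area_range[1]:
        return False
    return width > 0 and length / width >= min_ratio


saliency = [[0.1, 0.1, 0.1, 0.1],
            [0.1, 0.9, 0.1, 0.1],
            [0.1, 0.1, 0.1, 0.1],
            [0.1, 0.1, 0.1, 0.1]]
mask = adaptive_threshold(saliency, lam=1.0)
```

A larger λ keeps only the most salient pixels, which reduces false alarms at the cost of possibly losing faint ships.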

Ship Matching between Two Frames
Moving ships are detected in each of the two frames after false alarm elimination, as previously discussed. Suppose the moving ships in the registered source frame are expressed as $S_{j,t-1}$ and those in the target frame as $S_{i,t}$. The ship matching procedure is as follows.
Input: Moving ships $S_{i,t}$ and $S_{j,t-1}$; $T_d$: threshold of the distance between ships; $T_\psi$: threshold of the matching ships.
Output: Matched ship pairs.
for each ship i in frame t
for each ship j in frame t − 1
Step 1. Compute the centroid distance $\mathrm{Dist}^{i,j}_{(t,t-1)}$ between the ith ship $S_{i,t}$ and the jth ship $S_{j,t-1}$:

$$\mathrm{Dist}^{i,j}_{(t,t-1)} = \sqrt{(x^i_t - x^j_{t-1})^2 + (y^i_t - y^j_{t-1})^2}, \tag{10}$$

where $(x^i_t, y^i_t)$ and $(x^j_{t-1}, y^j_{t-1})$ are the coordinates of the centroids.
Step 2. If $\mathrm{Dist}^{i,j}_{(t,t-1)} < T_d$:
a. Compute the area ratio $AR(i, j)$ between the ith ship $S_{i,t}$ and the jth ship $S_{j,t-1}$ [Eq. (11)], where area(·) is the area, that is, the number of pixels. The areas of $S_{j,t-1}$ and $S_{i,t}$ should be similar even if the ship has moved or rotated.
b. Compute the histograms of $S_{i,t}$ and $S_{j,t-1}$ with H bins, $p^i = \{p^i_k\}_{k=1,2,\ldots,H}$ and $q^j = \{q^j_k\}_{k=1,2,\ldots,H}$ [Eq. (12)].
c. Compute the Bhattacharyya distance [32] between the two distributions, $BD(i, j)$ [Eq. (13)].
d. The total metric $\psi^{i,j}_{(t,t-1)}$ is finally defined by the two metrics [Eq. (14)], where δ is a weighting coefficient.
e. If $\psi^{i,j}_{(t,t-1)} > T_\psi$, $\{S_{i,t}, S_{j,t-1}\}$ is one pair of candidate matching ships.
end for j
A high total metric value indicates a good match with the target ship. If $S_{i,t}$ has several candidate matching ships that satisfy the above conditions, we take the ship pair with the highest total metric value as the matched ships.
end for i

Ship Tracking
From the above sections, the detected and matched moving ships are obtained from every two frames. Suppose two pairs of matched moving ships, $S_1$ and $S_2$, have been obtained based on frames t − 1 and t [Figs. 6(a) and 6(b)], and $S_3$, $S_4$, and $S_5$ have been obtained based on frames t and t + 1 [Figs. 6(c) and 6(d)]. Then, all the moving ships in frames t − 1, t, and t + 1 are associated using the intermediate frame t, which realizes ship tracking and gives the ship trajectories in the satellite video. Let $S^{(t-1,t)}_{i,t}$ denote ship i in frame t, detected based on frames t − 1 and t, and let $S^{(t,t+1)}_{j,t}$ denote ship j in frame t, detected based on frames t and t + 1. We define the overlap score $R_{i,j}$ between the two ships in Eq. (15). If $R_{i,j}$ exceeds the threshold $T_R$, the two ships are regarded as the same ship, through which ship association and tracking in three frames is realized. As shown in Figs. 6(b) and 6(c), $S_1$ and $S_3$, and $S_2$ and $S_4$, are two pairs of associated ships, through which ship tracking can be realized.
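The two-frame matching procedure (centroid gating, area ratio, histogram similarity, and a weighted total metric) can be sketched as below. Several choices are assumptions made for illustration only: the area ratio is taken as smaller area over larger area, histogram similarity is measured with the Bhattacharyya coefficient (so that a higher value means a better match), and the total metric is the convex combination δ · AR + (1 − δ) · similarity. The dictionary layout of a ship and all function names are hypothetical.

```python
import math


def centroid_distance(c1, c2):
    """Euclidean distance between two centroids (x, y)."""
    return math.hypot(c1[0] - c2[0], c1[1] - c2[1])


def area_ratio(a1, a2):
    """Assumed form: smaller area over larger area, so the value lies in (0, 1]."""
    return min(a1, a2) / max(a1, a2)


def bhattacharyya_coefficient(p, q):
    """Similarity of two normalized histograms; 1.0 means identical."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def match_ships(ships_t, ships_prev, t_d=70.0, t_psi=0.7, delta=0.5):
    """Return {index in ships_t: index in ships_prev} for the best match above t_psi."""
    matches = {}
    for i, cur in enumerate(ships_t):
        best_j, best_score = None, t_psi
        for j, prev in enumerate(ships_prev):
            if centroid_distance(cur["centroid"], prev["centroid"]) >= t_d:
                continue  # too far apart to be the same ship
            score = (delta * area_ratio(cur["area"], prev["area"])
                     + (1.0 - delta) * bhattacharyya_coefficient(cur["hist"], prev["hist"]))
            if score > best_score:
                best_j, best_score = j, score
        if best_j is not None:
            matches[i] = best_j
    return matches


ships_prev = [
    {"centroid": (10, 10), "area": 40, "hist": [0.5, 0.5]},
    {"centroid": (200, 200), "area": 12, "hist": [0.9, 0.1]},
]
ships_t = [
    {"centroid": (14, 13), "area": 38, "hist": [0.5, 0.5]},
    {"centroid": (205, 198), "area": 12, "hist": [0.9, 0.1]},
    {"centroid": (500, 500), "area": 30, "hist": [0.5, 0.5]},
]
matches = match_ships(ships_t, ships_prev)
```

The third ship in frame t has no counterpart within the distance threshold, so it is left unmatched and would be treated as a newly appearing ship.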
The proposed algorithm resolves some problems in multiship tracking, such as the appearance of new moving ships and the disappearance of old moving ships. As shown in Fig. 6, a new moving ship $S_5$ is detected and tracked based on frames t and t + 1. Furthermore, the proposed method avoids having to register all frames to the same reference frame for moving ship detection and tracking.
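The association through the intermediate frame can be illustrated as follows. The sketch assumes that each ship detected in frame t is summarized by an axis-aligned bounding box (x0, y0, x1, y1) and that the overlap score is intersection over union; both are assumptions for illustration, as is the function naming. Detections from the pair (t, t + 1) that are linked to no detection from the pair (t − 1, t) are reported as new ships.

```python
def box_area(box):
    """Area of an axis-aligned box (x0, y0, x1, y1); zero if the box is empty."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def overlap_score(a, b):
    """Intersection over union of two boxes (assumed form of the overlap score)."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    return inter / union if union else 0.0


def associate(boxes_prev_pair, boxes_next_pair, t_r=0.7):
    """Link detections of frame t from pair (t-1, t) to those from pair (t, t+1)."""
    links = {}
    for i, a in enumerate(boxes_prev_pair):
        best_j, best_r = None, t_r
        for j, b in enumerate(boxes_next_pair):
            r = overlap_score(a, b)
            if r > best_r and j not in links.values():
                best_j, best_r = j, r
        if best_j is not None:
            links[i] = best_j
    new_ships = [j for j in range(len(boxes_next_pair)) if j not in links.values()]
    return links, new_ships


# Frame t as seen from pair (t-1, t): two ships.
boxes_prev_pair = [(10, 10, 30, 20), (50, 50, 60, 70)]
# Frame t as seen from pair (t, t+1): the same two ships plus a new one.
boxes_next_pair = [(50, 51, 60, 70), (11, 10, 30, 20), (100, 100, 110, 105)]
links, new_ships = associate(boxes_prev_pair, boxes_next_pair)
```

Chaining such links over consecutive frame triples yields a trajectory for each ship without registering all frames to one reference frame.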

Experimental Results
Because no benchmark dataset of satellite video with moving ships is available, we use sequences from the Geostationary Orbit Space Surveillance System (GO3S) satellite video and our own synthetic satellite video to evaluate the performance of the proposed method. All the following experiments are conducted on a desktop PC with a 1.40-GHz CPU and 4 GB of memory, and our code is written in C++ in Microsoft Visual Studio 2013 with the OpenCV library.

Data Set Description
The first type of satellite image sequence is cut out from the image sequences of the geostationary GO3S satellite and includes moving ships from frame 1871 onward (see Fig. 7). Video 1 is cut out from frame 1871 (see Fig. 7) of the synthetic GO3S satellite video [33]. The cropped frame size is 650 × 360 pixels, containing ships of different sizes, and the original frame rate is 10 frames per second (fps). However, because some ships move slowly, the change in the differential image is not obvious at a high frame rate. Therefore, we reduce the frame rate to 5 fps to make the change in the differential image more obvious, and the resulting video consists of 19 frames.
For Video 2, because of the lack of benchmark datasets of satellite video with moving ships, we further evaluate our algorithm on our own synthetic satellite video, which includes large and small ships. The video consists of 19 frames, and the frame size is 1024 × 768 pixels, containing eight ships of different sizes (Fig. 9).

Parameter Selection
In this section, the parameters used in our method during detection and tracking are presented. First, to improve the processing speed in motion compensation, we decompose each of the two frames into a five-layer pyramid, and each pyramid layer needs only three iterations to obtain good results. Second, in DMSM detection, we set the number of pyramid scales to n = 3 and the kernel size of the Gaussian filter g to 15 × 15 pixels. Third, in ship region detection, the coefficient λ is set to 1.0 for the two image sequences. The structuring element in all morphological dilation operations is 3 × 3. Next, for ship matching, we set three parameters: $T_d = 70$, the threshold of the distance between ships; $T_\psi = 0.7$, the threshold of the matching ships; and δ = 0.5, the weighting coefficient. Finally, in ship association and tracking, we set $T_R$ to 70%.
We employ Recall and Precision to evaluate the performance, which are defined as

$$\mathrm{Recall} = \frac{TP}{TP + FN}, \tag{16}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \tag{17}$$

where TP (true positive) is a detected and tracked target that is actually a moving ship, FN (false negative) is an actual moving ship that is not detected and tracked, and FP (false positive) is a detected and tracked target that is not a moving ship.
In addition, we apply the widely used spatial overlap to decide whether a bounding box is a true positive or a false positive, and the threshold of the spatial overlap is set to 0.6. The spatial overlap is calculated by the following equation [34]:

$$\mathrm{Overlap}(GT_{ik}, DT_{jk}) = \frac{\mathrm{area}(GT_{ik} \cap DT_{jk})}{\mathrm{area}(GT_{ik} \cup DT_{jk})},$$

where $\mathrm{area}(GT_{ik} \cap DT_{jk})$ is the area of the intersection of the ground truth bounding box $GT_i$ and the detected bounding box $DT_j$ in frame k, and $\mathrm{area}(GT_{ik} \cup DT_{jk})$ is the area of the union of $GT_{ik}$ and $DT_{jk}$.
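The evaluation protocol can be sketched for a single frame as follows. Boxes are assumed to be axis-aligned tuples (x0, y0, x1, y1), each ground truth box may be matched by at most one detection, and a detection counts as a true positive when its spatial overlap reaches the threshold; the greedy one-to-one matching and the function names are illustrative assumptions.

```python
def spatial_overlap(gt, dt):
    """area(GT ∩ DT) / area(GT ∪ DT) for boxes given as (x0, y0, x1, y1)."""
    iw = max(0, min(gt[2], dt[2]) - max(gt[0], dt[0]))
    ih = max(0, min(gt[3], dt[3]) - max(gt[1], dt[1]))
    inter = iw * ih
    union = ((gt[2] - gt[0]) * (gt[3] - gt[1])
             + (dt[2] - dt[0]) * (dt[3] - dt[1]) - inter)
    return inter / union if union else 0.0


def recall_precision(gt_boxes, det_boxes, overlap_threshold=0.6):
    """Count TP, FP, FN with greedy one-to-one matching, then compute the two scores."""
    unmatched = list(range(len(gt_boxes)))
    tp = fp = 0
    for det in det_boxes:
        best = max(unmatched, key=lambda g: spatial_overlap(gt_boxes[g], det), default=None)
        if best is not None and spatial_overlap(gt_boxes[best], det) >= overlap_threshold:
            unmatched.remove(best)   # this ground truth ship is now accounted for
            tp += 1
        else:
            fp += 1
    fn = len(unmatched)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
det_boxes = [(1, 0, 10, 10), (50, 50, 60, 60)]   # one good detection, one false alarm
recall, precision = recall_precision(gt_boxes, det_boxes)
```

In this toy frame one ship is found, one is missed, and one detection is spurious, so both scores are 0.5; over a sequence, the counts would be accumulated across all frames before dividing.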

Effectiveness of Our Method
Example 1: In Video 1 [Figs. 8(a)-8(e)], the ships are so small that they occupy only dozens of pixels. Furthermore, the ships have similar shapes and intensity values, and some are even covered with thin clouds. All these factors make our detection and tracking task more difficult. Figure 8(a) shows the results of the proposed method for frame 2. We can observe five moving ships, including a smaller ship or yacht (ship 5) that was leaving the coast and is covered by thin clouds. However, tracking of this ship fails in frame 3, frame 4 [see Fig. 8(b)], and frame 5.
The estimated trajectories of the ships based on the proposed ship tracking algorithm are shown in Fig. 8(f). Ship 1 to ship 6 are tracked very well, except for three false alarms for ship 5 and one false alarm for ship 6. The recall and precision, calculated over all 19 frames, are listed in Table 1.

Table 1 Overall performance of the proposed algorithm for Video 1. (Columns: No. of ship, Recall, Precision.)

In frame 14 of Video 2 [Fig. 9(d)], tracking of one smaller ship (ship 8) fails because the ship disappears in the thin cloud, its contrast being similar to that of the cloud. In frame 19 [Fig. 9(e)], the wake of ship 1 is mistakenly detected as a moving ship (enclosed by a very small rectangle).
The estimated trajectories of the ships based on the proposed method are shown in Fig. 9(f). Ship 1 to ship 8 are tracked very well, except that ship 8 is not tracked successfully in two frames and there is a false alarm near ship 1 in frame 19. The recall and precision, calculated over all 19 frames, are listed in Table 2.

Saliency map
In this section, we conduct several experiments to compare our DMSM method with several saliency map methods (SR [30], Itti [27], GBVS [35], and the Signature algorithm [36]). Two examples in Figs. 10 and 11 show that the saliency map obtained by our approach performs better than those obtained by the other methods.

Ship detection
To test the effectiveness of the proposed method, we compare it with the recent R2CNN_head method [10] and several background subtraction methods [37] implemented in BGSLibrary, available in the GitHub repository https://github.com/andrewssobral/bgslibrary. For the R2CNN_head method, we use the pretrained model "ResNet_v1_101.ckpt," available in the GitHub repository https://github.com/yangxue0827/R2CNN_HEAD_FPN_Tensorflow, to initialize our network. For our dataset with ships of different sizes, we further train the model for a total of 10k iterations. The results for Video 1 and Video 2 are shown in Tables 3 and 4, respectively. As can be seen, the precision of the proposed method reaches 96% and 98.6%, and its recall reaches 93% and 99.3%, for Video 1 and Video 2, respectively, which indicates that the proposed method has high accuracy. The precision of the R2CNN_head method reaches 97.6% and 100%, whereas its recall is only 75% and 64.5% for Video 1 and Video 2, respectively, both of which contain small ships.

Ship tracking
We compared our method with the classical Lucas-Kanade tracker [43] and several other tracking methods in the VIVID Tracking Evaluation Testbed V3.0 [44] (basic mean shift [34], histogram shift, variance ratio [45], and peak difference feature shift). For the Lucas-Kanade tracker, 24 corners were first detected in one frame and tracked in the other frame. For the other tracking methods in the VIVID Tracking Evaluation Testbed V3.0, each moving ship was manually detected in the first frame and tracked in the other frames, and we computed the Recall and Precision for that ship.
Figure 12 shows the tracking results for the six ships in Video 1. Our method achieves higher recall and precision in ship tracking than the other five methods. Moreover, it is robust to ships of different sizes and provides a feasible way to detect and track ships in satellite video. As Fig. 12 shows, our method performs best among the tested methods, even for ship 5 and ship 6, which are very small. Figure 13 shows the tracking results for the eight ships in Video 2. Again, our method achieves higher recall and precision than the other methods and performs best among the tested methods for ships of different sizes. As shown in Figs. 12 and 13, the Lucas-Kanade method fails to track small ships, such as ship 5 and ship 6 in Video 1 and ship 7 and ship 8 in Video 2. The mean shift method gives poor tracking results for ships of different sizes. The histogram shift method and the peak difference method track large ships well but small ships very poorly. The variance ratio method is unstable for both large and small ships.

Conclusion
This paper presents a method for ship detection and tracking in satellite video. In the ship detection stage, motion compensation between two adjacent frames is required to stabilize the background. After that, the foreground is extracted from the background based on the DMSM of the differential image. Then, moving ships are detected based on an analysis of the surrounding contrast and ship characteristics. In the ship tracking stage, the moving ships are matched between every two frames based on a combination of centroid distance, area ratio, and histogram distance. Finally, ship tracking is realized based on a ship association scheme. Our method has been tested on a set of satellite videos containing ships of different sizes. The ships were successfully detected and tracked, and the performance was analyzed in terms of recall and precision.

Fig. 1 Comparison of three types of videos. (a) Ground video frame. (b) Aerial video frame. (c) Satellite video frame.

Fig. 2 Flowchart of the proposed ship detection and tracking method.

Fig. 3 Example of the saliency map of a differential image. (a) and (b) The registered source frame and the target frame. (c) The differential image of (a) and (b). (d) The saliency map based on the SR model. (e) DMSM of the proposed method. For clarity, two regions possibly including ships are shown in detail. (f1) and (f2) Two zoomed-in regions of (c). (g1) and (g2) Two zoomed-in regions of (d). (h1) and (h2) Two zoomed-in regions of (e).
The candidate ship regions are obtained by an AND operation between the binary image [Fig. 4(a)] and each of the two frames, respectively [Figs. 4(b) and 4(c)].

Fig. 4 Ship candidate regions in two frames. (a) Regions after morphology operation. (b) The candidate motion regions in the registered source frame. (c) The candidate motion regions in the target frame.
To eliminate the small holes that may exist in the segmented candidate region [Fig. 5(b)], the morphological dilation operation is applied to the segmentation result [result shown in Fig. 5(c)], and the corresponding moving ship is shown in Fig. 5(d).

In the example of Fig. 6, $S_4$ and $S_5$ are among the ships obtained based on frames t and t + 1 [Figs. 6(c) and 6(d)]. All the moving ships in frames t − 1, t, and t + 1 are then associated using the intermediate frame t, which realizes ship tracking and gives the ship trajectories in the satellite video. Here, $S^{(t-1,t)}_{i,t}$ denotes ship i in frame t, detected based on frames t − 1 and t.

Fig. 6 An example of ship association and tracking in three frames. (a) Two moving ships detected from frame t − 1 based on frames t − 1 and t. (b) Two moving ships detected from frame t based on frames t − 1 and t. (c) Three moving ships detected from frame t based on frames t and t + 1. (d) Three moving ships detected from frame t + 1 based on frames t and t + 1.

Fig. 7 Frame 1871 of the GO3S satellite video; the area inside the red rectangle is the image frame of Video 1.
Tracking of ship 5 fails in frame 3, frame 4 [Fig. 8(b)], and frame 5 because the contrast of this ship is similar to that of the thin cloud. Because ship 5 moves very fast, its wake is misjudged as the ship [see Fig. 8(d)]. As shown in Fig. 8(e), a yacht (ship 6) is released from ship 4; our algorithm detects and tracks this newly emerging ship.

Example 2: Video 2 is covered by thin cloud, especially on the right side of the image. Frames of this video, which include eight moving ships, are shown in Figs. 9(a)-9(e). The result for frame 14 is shown in Fig. 9(d).
Li et al.: Ship detection and tracking method for satellite video based on multiscale. . . Journal of Applied Remote Sensing 026511-10, Apr-Jun 2019, Vol. 13(2)
As the difference images show [Figs. 10(c) and 11(c)], there are holes inside the moving ships because the ship displacement between the two frames is small. In Figs. 10(d) and 11(d), the SR model cannot clearly detect the moving ships without holes. In Figs. 10(e)-10(f) and 11(e)-11(f), the Itti model and the GBVS model cannot detect all the moving ships. In Figs. 10(g) and 11(g), the signature algorithm highlights the moving ships; however, the true ship regions are enlarged too much, so ships that are close together are easily merged. In contrast, our DMSM highlights the regions with moving ships without enlarging them too much, as in Figs. 10(h) and 11(h).

Fig. 12 Recall and precision of six ships with different methods in Video 1. (a) Recall. (b) Precision.

Fig. 13 Recall and precision of eight ships with different methods in Video 2. (a) Recall. (b) Precision.

Table 2 Overall performance of the proposed algorithm for Video 2.

Table 3 Comparison results of ship detection for Video 1.

Table 4 Comparison results of ship detection for Video 2.
Liang Chen received his BS and MS degrees in photogrammetry and remote sensing from Wuhan University, Wuhan, China, in 2003 and 2006, respectively. He received his PhD from the State Key Laboratory of Remote Sensing Science, Chinese Academy of Sciences, Beijing, China, in 2009. He is currently a senior engineer at Qian Xuesen Laboratory of Space Technology, China Aerospace Science and Technology Corporation. His research interests include remote sensing target tracking, data mining, and parameter retrieval.

Feng Li (M'07-SM'10) received his BSEE degree from the Lanzhou Railway University in 1999, his MEng degree from China Academy of Space Technology in 2002, and his PhD in electrical engineering from The University of New South Wales in 2009. Following several years working on astronomical image processing at CSIRO, Australia, and on remote sensing image processing at the Chinese Academy of Sciences, he is currently a PI in the Qian Xuesen Laboratory of Space Technology. His research interests include image registration, super resolution, compressive sensing, and moving detection.

Meiyu Huang received her BS degree in computer science and technology from Huazhong University of Science and Technology, Wuhan, China, in 2010, and her PhD degree in computer application technology from the University of Chinese Academy of Sciences, Beijing, China, in 2016. She is currently an assistant researcher in the Qian Xuesen Laboratory of Space Technology, China Academy of Space Technology, Beijing, China. Her research interests include machine learning, ubiquitous computing, human-computer interaction, computer vision, and image processing.