Translation domain segmentation model based on improved cosine similarity for crowd motion segmentation

Abstract. With the continuous growth of the global population, large-scale public gatherings have become more common, and crowd management at these gatherings has become an urgent problem for public safety management. Crowd motion analysis and early warning based on crowd motion segmentation in video surveillance systems has become an important research topic in computer vision. A translation domain segmentation (TDS) model based on improved cosine similarity (ICS) is proposed to segment moving crowds with different crowding levels and complex motion modes. The method reconstructs the objective function of the basic TDS model by ICS to simultaneously measure the difference between both magnitude and direction of two vectors; thus, undersegmentation due to the magnitude difference can be avoided. By switching between “localization” and “globalization” modes of the objective function, the algorithm can be applied to segment crowds with different densities and motion states. Moreover, by simultaneously introducing local motion magnitude and local frame difference magnitude thresholds, nonforeground regions can be excluded from the initial regions during region evolution. Experimental results show that the proposed method achieves superior performance and higher accuracy compared to existing flow field-based methods when applied to complex scenes containing moving crowds.


Introduction
With the continuous growth of the global population, large-scale public gatherings, such as religious pilgrimages, parades, celebrations, and sporting events, have become more frequent; therefore, crowd management at these gatherings has become an urgent and important problem in the field of public safety management. Usually, people in these gatherings move in a limited space, such as urban streets, semienclosed squares, and fully enclosed shopping malls, which are prone to catastrophic accidents, such as crowd congestion and trampling [3][4][5][6][7][8]. Considering that the most common cause of such accidents is collision among crowds moving in different directions, dividing moving crowds into subgroups based on their movement direction is particularly important; this has been an important research topic in the field of image processing and computer vision [17][18][19][20][21][22][23][24][25][26]. Among the existing methods, the segmentation method based on single person tracking relies on the complete trajectories of all individuals (or feature points and feature areas) in the crowd; therefore, it is only suitable when there is no occlusion or only slight occlusion. Once the crowd grows, occlusion becomes severe and complete trajectories become very difficult to acquire, resulting in a significant drop in the performance of such methods.
Due to the difficulties encountered in tracking individuals (such as occlusion), and considering the overall self-organization characteristics of a moving crowd, many researchers have started to focus on the entire crowd instead of the individuals in it, so that an appropriate motion model can be adopted to model the overall motion of the crowd. At present, studies mainly focus on two kinds of methods for crowd motion modeling: flow-based models and nonflow-based models [9]. Because a flow field such as the optical flow field is usually a good estimate of the motion field, a crowd motion segmentation method based on the flow field model was proposed as one of the earliest methods for crowd motion segmentation. It is the basis for a variety of subsequently proposed methods [26]. Most existing flow-based models only focus on a specific crowd density. For example, particle flow fields and streak flow fields are generally only suitable for a high-density slow-moving crowd. When this kind of model is applied to a low-density crowd, the possibility of oversegmentation is high. To make the motion model based on the flow field applicable to crowds of any density, and considering that different flow fields are essentially vector fields, some researchers introduced a vector domain segmentation method for a specific vector distribution into the flow field model, such as the local translation domain segmentation (LTDS) proposed by Wu and Wong [20]. Although LTDS can be applied for motion segmentation of crowds with different densities, it is prone to undersegmentation between the foreground and background because its objective function considers only the direction differences between vectors, regardless of the magnitude differences.
Dynamic texture-based methods first model the moving crowd as a dynamic texture [35] with spatiotemporal statistical properties, and then use the matching among model parameters to perform crowd motion segmentation or abnormal behavior detection. As current dynamic texture models are relatively simple (such as the linear dynamic system), crowd motion segmentation methods based on dynamic texture are currently only applicable to crowds with relatively simple movement modes and medium or low densities. Tracklet-based methods first obtain the tracklets of the keypoints in the crowd using a tracker [such as Kanade-Lucas-Tomasi (KLT)], and then apply a similarity measure among the tracklets to complete the crowd motion segmentation. For example, Sharma and Guha [34] used the tracklets acquired by KLT to form a complete trajectory of keypoints, and then applied the trajectory clustering approach (TCA) to complete the segmentation of moving crowds. However, this method is not suitable for short-term motion segmentation because it requires long-term trajectories of the keypoints in the crowd. Zhou et al. [31,32] and Fan et al. [33] utilized coherent filtering (CF) based on coherent neighbor invariance to perform short-term crowd motion segmentation via the local spatiotemporal relationships and motion correlations among tracklets. However, because coherent neighbor invariance primarily focuses on pairwise motion consistency and ignores the motion difference among all the keypoints in the local region of the center point, undersegmentation occurs easily in crowds with numerous motion patterns.
In this study, a TDS method based on improved cosine similarity (ICS) is proposed for crowd motion segmentation. This method adopts the optimized vector domain segmentation model in the flow field and can be applied for the segmentation of crowds with different crowding levels and complex motion modes. The main contributions of this study are as follows:
1. ICS is used to reconstruct the objective function of the TDS model. Compared with the similarity measure used in TDS, which can only measure the direction difference between vectors, ICS can measure the difference in both magnitude and direction of two vectors, which effectively avoids undersegmentation caused by the magnitude difference; thus, it greatly improves the adaptability of the TDS model.
2. The proposed method can be applied for motion segmentation of crowds with different densities and motion modes. When used for segmenting a crowd with high density and slow movement, the objective function can be localized so that the segmentation is completed by evaluating the motion consistency among local regions; when used for segmenting a moving crowd with medium or low density, the segmentation can be accomplished directly by evaluating the motion consistency between all vectors and each initial region.
3. In the selection of initial regions, both the local motion magnitude threshold and the local frame difference magnitude threshold are applied to all candidate regions with the best local motion consistency. This excludes nonforeground regions from the initial regions as much as possible, because nonforeground initial regions inevitably lead to incorrect segmentation results.

Related Works
To overcome the problems of traditional methods based on individual tracking, researchers have performed crowd motion segmentation by establishing a series of crowd motion models according to the self-organization characteristics of a moving crowd. Currently, most crowd motion models are based on flow fields [26].

Optical Flow Field
Research on modeling a moving crowd by optical flow is the most developed. For example, Hu et al. [15] used a Gaussian adaptive resonance theory network to extract the prominent flow vectors from a dense optical flow field and then constructed a directed neighborhood graph based on the shortest path search for these flow vectors. A hierarchical agglomerative clustering algorithm was used to segment the directed neighborhood graph to obtain the final segmentation results of a moving crowd. Zhang et al. [16] obtained trajectory chains of feature points using an orientation distribution function in a sparse optical flow field; then, they used a spectral clustering method (with the number of clusters determined by prior knowledge) to classify the trajectories and thereby perform slow-motion segmentation in densely crowded locations. This type of approach usually establishes a crowd motion model directly based on the similarity among optical flow vectors. Because the model is relatively simple, it is generally only applicable to a crowd with good motion consistency.

Particle Flow Field
The particle flow field is based on the Lagrangian fluid dynamics framework. When used for motion segmentation, the particle trajectory can be estimated by numerical integration through the moving particle grid in the optical flow field. For example, Ali and Shah [22] first used the Lagrangian particle dynamic (LPD) algorithm to estimate the particle flow field based on the optical flow field, and then used the finite time Lyapunov exponent to extract the boundary among different flow fields to obtain the segmentation results for a crowd. Because a particle flow field ignores the spatial domain variation and has an obvious time delay, a crowd motion segmentation method based on the particle flow field is only suitable for a high-density slow-moving crowd.

Streak Flow Field
To obtain a better flow-based model for a moving crowd, Mehran et al. [24] introduced the streak line from fluid dynamics to estimate the motion field of a crowd, which is called streak flow. Because streak flow preserves the motion information of the flow for a period, its segmentation performance is better than that of particle flow when applied to crowds with obvious motion changes. However, the streak flow field is obtained on the basis of the optical flow field. The calculation of streak flow requires highly accurate optical flow, and this accuracy is very difficult to achieve with conventional optical flow methods.

Vector Domain Segmentation for Flow Field Model Application
To make the flow field model suitable for crowds with different densities and motion modes, some researchers have used the vector domain segmentation model [36,37] in the vector field to establish a flow-based model of a moving crowd, such as the LTDS model proposed by Wu and Wong [20]. Through the normalization of the vector magnitude and the addition of foreground area constraints, LTDS extends the TDS model proposed by Roy et al. [36] to the nonunit field; it can be applied to the segmentation of crowds with different densities by localization. However, because the objective function adopted by LTDS only considers the direction difference between vectors, undersegmentation between foreground and background can easily occur.
The proposed crowd motion segmentation method is based on a segmentation model in the optical flow field. The ICS measure replaces the similarity measure based on the normalized vector inner product in the objective function of LTDS. The improved objective function can distinguish both the magnitude and the direction of vectors to achieve better segmentation performance.

Method
The overall architecture of the crowd motion segmentation method proposed in this study is shown in Fig. 1. When the algorithm is applied to a high-density slow-moving crowd: (1) the optical flow field between two consecutive frames is first obtained by means of an optical flow extraction algorithm (such as the Brox algorithm in Ref. 38); (2) the local ICS is used to obtain the local motion consistency map (LMCM); (3) the initial regions for level set evolution are extracted from the LMCM; (4) through the level set algorithm, the objective function of ICS-based LTDS is used to complete the region evolution in the LMCM, and regions with multiple motion directions are obtained; (5) the final crowd segmentation results are obtained by merging the regions with the same direction. In contrast, when the algorithm is used for crowds with medium or low densities and complex motion modes, after the initial region extraction step, region evolution is not performed in the LMCM but in the global motion consistency map (GMCM) corresponding to each initial region. The GMCM is obtained by calculating the ICS between the average vector of each initial region and all the vectors.

TDS and LTDS
Given a unit vector field $E$, the translation domain (TD) is defined as the region in which all vectors have the same direction. The field lines in a TD are a set of parallel lines. Therefore, there must exist a unique normal vector perpendicular to the field lines, denoted as $a$, which can be used as a characteristic parameter to represent the TD; $a$ is also known as the dominant translation parameter. Roy et al. [36] proved that a vector domain $\Omega$ is a TD if and only if there is a unique vector $a(\Omega)$ that, for all $x \in \Omega$, satisfies the following equation:
$$a(\Omega) \cdot E(x) = 0. \tag{1}$$
According to Eq. (1), Roy et al. [36] proposed a TDS model for vector domain segmentation. The corresponding objective function of the model is as follows:
$$J(\tau) = \int_{\Omega(\tau)} [a(\tau) \cdot E(x)]^2 \, dx - \mu \int_{\Omega(\tau)} dx, \tag{2}$$
where $\Omega$ is a TD; $\tau$ is an evolution parameter defined to facilitate the use of the active contour algorithm; $[a(\tau) \cdot E(x)]^2$ measures the motion consistency between all vectors in $E$ and the TD determined by $a(\tau)$, and can be called the GMCM corresponding to $a(\tau)$; the integral term $\int_{\Omega(\tau)} [a(\tau) \cdot E(x)]^2 \, dx$ represents the sum of the squares of the errors in $\Omega$ in the GMCM, which reflects the motion consistency of $\Omega$; the integral term $\int_{\Omega(\tau)} dx$ represents the area of $\Omega$; and $\mu$ is a positive constant.
On the basis of Eq. (2), given a deformation velocity $V$, the Gâteaux derivative of $J$ in the direction of $V$ can be given as follows:
$$dJ(\tau, V) = -\int_{\partial\Omega(\tau)} \{[a(\tau) \cdot E(x)]^2 - \mu\} \, (V \cdot N_\tau) \, dl(x), \tag{3}$$
where $\partial\Omega$ is the boundary of $\Omega$, $dl(x)$ is the integral variable of $\partial\Omega$, and $N_\tau$ is the unit normal vector of $\partial\Omega$ pointing toward the interior of $\Omega$. Eq. (2) is then minimized with the active contour algorithm, which evolves the contour along the descent direction of Eq. (3); the corresponding evolution equation of the active contour algorithm is as follows:
$$\begin{cases} \dfrac{\partial C}{\partial \tau} = \{[a(\tau) \cdot E(x)]^2 - \mu\} \, N_\tau \\ C(\tau = 0) = C_0 \end{cases} \tag{4}$$
where $C$ is $\partial\Omega$, the contour of the TD under evolution, and $C_0$ is its initial value. Meanwhile, Roy et al. [36] also proposed an estimation method for the dominant translation parameter $a(\tau)$, which takes the first half of the right side of Eq. (2) as the objective function and obtains the optimal estimate of $a(\tau)$ by minimizing it, as follows:
$$\hat{a}(\tau) = \arg\min_{\|a\| = 1} \int_{\Omega(\tau)} [a \cdot E(x)]^2 \, dx = \arg\min_{\|a\| = 1} a^{T} Q_\tau a, \quad Q_\tau = \int_{\Omega(\tau)} E(x) E(x)^{T} \, dx. \tag{5}$$
As $Q_\tau$ is a real symmetric matrix, the optimal estimate $\hat{a}(\tau)$ of $a(\tau)$ can be determined as the eigenvector corresponding to the smallest eigenvalue of $Q_\tau$ by the quadratic programming model. Because the TDS model is proposed for the TD in the ideal unit vector field, it cannot be directly applied to the segmentation of an actual motion field. To make the TDS model applicable to crowd motion segmentation, Wu and Wong [20] established an LTDS model by normalizing the vectors, adding a foreground motion region area constraint, and localizing Eq. (2). The objective function of LTDS is as follows:
$$J(A, \tau) = \int_{\Omega(\tau)} F[A(x), x] \, dx - \mu \int_{\Omega(\tau)} G(\|U(x)\|) \, dx, \tag{6}$$
where $\Omega$ is a local TD; $U$ is a motion field whose magnitude $\|U\|$ is mapped to the range [0, 1]; $l(x)$ is a neighborhood centered on $x$ and $|l(x)|$ is the area of $l(x)$; $A(x)$ is the dominant translation parameter of $l(x)$ and can be estimated according to Eq. (5); and $F[A(x), x]$ is the mean of the sum of the squared errors inside $l(x)$, reflecting the local motion consistency. Given the size of $l(x)$, we can calculate $F[A(x), x]$ for each vector in $U$ and use these values to form an LMCM, which describes the consistent slow motion of high-density crowds better than the GMCM; $\gamma$ is a constant; $G(\|U(x)\|)$ is proportional to $\|U\|$ and measures the extent to which $x$ belongs to the foreground, so that $\mu \int_{\Omega(\tau)} G(\|U(x)\|) \, dx$ can be considered as the foreground area constraint. Similar to the evolution equation [Eq. (4)] of TDS, and considering that $F[A(x), x]$ is an even function, which makes it impossible to distinguish two completely opposite vectors, the evolution equation of the active contour algorithm corresponding to the LTDS model is given by Eq. (7), where $\bar{U}_{l(x) \cap \Omega}$ is the average vector of the region $l(x) \cap \Omega$.
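The eigenvector estimate of the dominant translation parameter described above can be illustrated for a two-dimensional field. The following is a minimal sketch, not the authors' implementation: the function name `dominant_translation_parameter` is hypothetical, the region is assumed to be given as a list of `(u, v)` tuples, and the smallest-eigenvalue eigenvector of the symmetric 2 × 2 matrix is obtained in closed form rather than with a quadratic programming solver.

```python
import math


def dominant_translation_parameter(vectors):
    """Unit vector a that minimizes sum((a . E)^2) over the 2-D vectors E.

    This is the eigenvector of Q = sum(E E^T) with the smallest eigenvalue.
    """
    qxx = sum(u * u for u, v in vectors)
    qxy = sum(u * v for u, v in vectors)
    qyy = sum(v * v for u, v in vectors)
    # Angle of the principal (largest-eigenvalue) axis of the symmetric matrix Q.
    phi = 0.5 * math.atan2(2.0 * qxy, qxx - qyy)
    # The smallest-eigenvalue eigenvector is perpendicular to the principal axis.
    return (-math.sin(phi), math.cos(phi))


# A field translating along the diagonal: the estimate is normal to every vector.
field = [(1.0, 1.0), (2.0, 2.0), (0.5, 0.5)]
a = dominant_translation_parameter(field)
residual = sum((a[0] * u + a[1] * v) ** 2 for u, v in field)
print(a, residual)
```

The sign of the returned vector is arbitrary, since both `a` and `-a` give the same squared error.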

ICS
It is well known that the cosine similarity (CS) measure uses the cosine of the angle between two vectors in the vector space to measure the difference between these two vectors:
$$CS(x, y) = \frac{x \cdot y}{\|x\| \, \|y\|} = \cos\theta_{xy}, \tag{8}$$
where $\theta_{xy}$ is the angle between vectors $x$ and $y$. The closer the cosine of the angle is to 1, i.e., the closer the angle is to 0, the more similar the two vectors are. However, as the CS normalizes the magnitudes of the two vectors during calculation, it cannot measure the difference between the magnitudes of the two vectors.
To measure both the magnitude and direction difference between two vectors, we first introduce a scale factor that measures the difference between the magnitudes of the two vectors, to complement the original CS of Eq. (8); it is defined as follows:
$$k_M(x, y) = \frac{\min(\|x\|, \|y\|)}{\max(\|x\|, \|y\|)}. \tag{9}$$
Considering that the range of $k_M(x, y)$ is [0, 1] and that of CS is [−1, 1], to unify the ranges of both values, the CS is adjusted by linear mapping to
$$CS_1(x, y) = \frac{1 + CS(x, y)}{2}. \tag{10}$$
Based on Eqs. (9) and (10), the ICS is given as follows:
$$ICS(x, y) = CS_1(x, y) \cdot k_M(x, y). \tag{11}$$
The basic ICS defined in Eq. (11) cannot be directly used for TD segmentation because it is imbalanced when evaluating the magnitude and direction difference. For example, suppose the similarity is 0.5. When $k_M \to 1$ (i.e., there is almost no difference in magnitude between the two vectors), Eqs. (10) and (11) give $\cos\theta_{xy} \to 0$, i.e., $\theta_{xy} \to \pi/2$; however, when $\theta_{xy} \to 0$ (i.e., there is almost no difference in direction between the two vectors), $CS_1(x, y) \to 1$ and thus $k_M \to 0.5$. This shows that the ICS between two vectors with a small difference in direction, one having half the magnitude of the other, is the same as the ICS between two vectors that are perpendicular to each other and have the same magnitude, as shown in Fig. 2(a). Considering that there are two vectors perpendicular to a given vector in the same plane, the basic ICS gives $ICS(x, y_1) = ICS(x, y_2) = ICS(x, y_3)$ for the vectors in Fig. 2(b). For the actual application of this study, $ICS(x, y_1)$, $ICS(x, y_2)$, and $ICS(x, y_3)$ should not be the same. Although the magnitude of $y_1$ is quite different from that of $x$, they can be considered to have good similarity because they have the same directional angle (i.e., they have exactly the same direction of motion). In contrast, both $y_2$ and $y_3$ are perpendicular to $x$, resulting in a large difference in their direction of motion. In this case, even if they have the same magnitude as $x$, the similarity between them and $x$ should be very low. Otherwise, two vectors with perpendicular or even opposite directions may be segmented into the same region, i.e., all the vectors in Fig. 2(b) will be segmented into the same region. That is, when segmenting vectors, the similarity between the directional angles is significantly more important than the similarity between the magnitudes; for the case shown in Fig. 2, the condition $ICS(x, y_1) > ICS(x, y_2) = ICS(x, y_3)$ should be satisfied. To increase the weight of the directional angle similarity in the basic ICS defined in Eq. (11), we add an exponential adjustment factor $\beta$ to $CS_1(x, y)$, which measures the directional angle difference in ICS; ICS is then redefined as follows:
$$ICS(x, y; \beta) = [CS_1(x, y)]^{\beta} \cdot k_M(x, y), \tag{12}$$
where $\beta > 1$. The larger $\beta$ is, the greater the weight of the directional angle similarity in ICS. Meanwhile, if the minimum similarity $s_{\min}$ is given, i.e., $ICS(x, y; \beta) \ge s_{\min}$, the larger $\beta$ is, the smaller the variation range (defined as $R_\theta^\beta$) of the angle difference $\theta_{xy}$, as shown in Fig. 3.
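The effect of the exponent on the balance between direction and magnitude can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, vectors are assumed to be two-dimensional tuples, and returning 0 for a zero-length vector is an assumption made here because the measure is undefined in that case.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two nonzero 2-D vectors."""
    return (x[0] * y[0] + x[1] * y[1]) / (math.hypot(*x) * math.hypot(*y))


def improved_cosine_similarity(x, y, beta=1.0):
    """ICS with exponent beta; beta = 1 gives the basic ICS (no exponent)."""
    nx, ny = math.hypot(*x), math.hypot(*y)
    if nx == 0.0 or ny == 0.0:
        return 0.0  # assumption: a zero vector is treated as dissimilar
    k_m = min(nx, ny) / max(nx, ny)           # magnitude scale factor in [0, 1]
    cs1 = (1.0 + cosine_similarity(x, y)) / 2.0
    cs1 = min(1.0, max(0.0, cs1))             # guard against rounding outside [0, 1]
    return (cs1 ** beta) * k_m


x = (1.0, 0.0)
y1 = (0.5, 0.0)    # same direction as x, half the magnitude
y2 = (0.0, 1.0)    # perpendicular to x, same magnitude
y3 = (0.0, -1.0)   # perpendicular to x, same magnitude
for beta in (1.0, 9.9969):
    print(beta, [improved_cosine_similarity(x, y, beta) for y in (y1, y2, y3)])
```

With `beta = 1` the three similarities coincide, whereas a larger exponent leaves the same-direction pair unchanged and pushes the perpendicular pairs toward zero.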
The value of $\beta$ can be determined according to the distribution of the vectors in the optical flow field, especially the distribution of the vector direction, by using the maximum allowable $R_\theta^\beta$ at the minimum similarity $s_{\min}$. For example, suppose the minimum similarity $s_{\min}$ is 0.5 and $R_\theta^\beta$ is $\pi/6$ at this similarity [as shown in Fig. 4(a)]; when $k_M = 1$, $[CS_1(x, y)]^{\beta}$ takes the minimum value of 0.5 and the corresponding direction angle $\theta_{xy}$ has the maximum value of $\pi/6$, so the following equation holds:
$$\left[ \frac{1 + \cos(\pi/6)}{2} \right]^{\beta} = 0.5. \tag{13}$$
By solving Eq. (13), we can obtain $\beta = 9.9969$. Once $\beta$ is determined, for a given minimum similarity $s_{\min} = 0.$
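The value of β quoted above follows from taking logarithms of the condition that the direction term alone equals the minimum similarity when the magnitudes are equal. A minimal sketch, with the hypothetical helper name `solve_beta`:

```python
import math


def solve_beta(s_min, max_angle):
    """Exponent beta such that ((1 + cos(max_angle)) / 2) ** beta == s_min.

    Assumes k_M = 1, 0 < s_min < 1, and 0 < max_angle < pi.
    """
    return math.log(s_min) / math.log((1.0 + math.cos(max_angle)) / 2.0)


beta = solve_beta(0.5, math.pi / 6)
print(round(beta, 4))  # approximately 9.9969, the value quoted in the text
```

A narrower allowed angle yields a larger exponent, which is consistent with the statement that a larger β shrinks the admissible range of the angle difference.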

ICS-Based TDS and ICS-Based LTDS
It can be seen from Eqs. (2) and (6) that $F[A(x), x]$ in the objective function of LTDS is similar to the term $\int_{\Omega(\tau)} [a(\tau) \cdot E(x)]^2 \, dx$ in the objective function of TDS; they essentially use the CS to measure whether the vector to be segmented is perpendicular to the dominant translation parameter $a(\Omega)$ of $\Omega$. In the actual application process, as $\Omega$ evolves, it approximates a unit TD less and less well; the difference between the magnitudes of the vectors may also increase. In this case, if the magnitude difference is not equalized due to improper selection of $\mu$, undersegmentation may occur, i.e., regions with vectors having a small direction difference but a large magnitude difference may be segmented into one region, as shown in Fig. 5. In an actual motion field, the difference in magnitude is often an important basis for distinguishing the foreground from the background; however, the error constraints in the objective functions of both TDS and LTDS do not consider the magnitude difference between vectors, which causes the segmentation results to include more background in the motion regions.
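The limitation described above can be reproduced with a few vectors. The following sketch is illustrative only: it assumes a rightward translation domain with unit normal `a = (0, 1)`, hypothetical foreground and background vectors, and the squared inner product with the normalized vector as the direction-only error.

```python
import math


def direction_error(a, u):
    """Squared inner product between the unit normal a and the normalized vector u."""
    norm = math.hypot(*u)
    return ((a[0] * u[0] + a[1] * u[1]) / norm) ** 2


def magnitude_ratio(x, y):
    """Ratio of the smaller to the larger magnitude, in [0, 1]."""
    nx, ny = math.hypot(*x), math.hypot(*y)
    return min(nx, ny) / max(nx, ny)


a = (0.0, 1.0)            # normal to a rightward translation domain (assumed)
foreground = (2.0, 0.0)   # fast rightward motion
background = (0.1, 0.0)   # weak rightward flow, e.g., estimation noise
opposite = (-2.0, 0.0)    # leftward motion

# The direction-only error is zero for all three vectors.
print([direction_error(a, u) for u in (foreground, background, opposite)])
# The magnitude ratio separates foreground from background.
print(magnitude_ratio(foreground, background))
```

The direction-only error cannot separate the weak background flow or the opposite motion from the foreground, whereas the magnitude ratio (0.05 here) exposes the first difference.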
Fig. 5 On the left, the vectors of the foreground (gray shaded region) and the background (green shaded region) have no difference in direction, but they differ obviously in magnitude. When the initial region $\Omega$ of the active contour algorithm is inside $\Omega_1$ (as indicated by the red dashed line on the left), if the magnitude difference is not equalized due to improper selection of $\mu$, the evolution result $\Omega'$ (the area within the red border on the right) will contain both the foreground and the background.
Journal of Electronic Imaging 023011-6 Mar/Apr 2019 • Vol. 28(2)
Because ICS can measure the difference in both direction and magnitude of vectors, it is expected to improve the segmentation performance by improving the objective functions of LTDS and TDS. The key is to use ICS to redefine the error constraint term in the objective functions of LTDS and TDS. Referring to Eq. (2), the objective function of the TDS redefined by ICS (ICS-TDS) is given by Eq. (14), where $b(\Omega)$ is a vector parallel to the field lines of the TD $\Omega$. Similar to $a(\Omega)$, $b(\Omega)$ can also be used as the dominant translation parameter to represent $\Omega$. However, unlike $a(\Omega)$, because the magnitude of $b(\Omega)$ is involved in the minimum and maximum calculation, its optimal estimate cannot be obtained by the quadratic programming algorithm. Instead, according to the definition of $b(\Omega)$, we can use the average vector $\bar{U}_\Omega$ of all vectors in $\Omega$ to approximate it, i.e.,
$$b(\tau) \approx \bar{U}_{\Omega(\tau)}. \tag{15}$$
Although Eq. (15) does not give the optimal estimate of $b(\Omega)$, the estimation process is greatly simplified compared to that of $a(\Omega)$; its time complexity is reduced from $O(|\Omega|^3)$ to $O(|\Omega|)$.
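Approximating the dominant translation parameter by the region average makes a global consistency map straightforward to compute. The sketch below is an assumption-laden illustration rather than the authors' code: the flow field is a nested list of `(u, v)` tuples, the region is a list of `(row, col)` indices, the map stores the raw ICS between the region average and every vector, and zero vectors are assigned a similarity of 0.

```python
import math


def ics(x, y, beta):
    """ICS between two 2-D vectors; zero vectors are assigned 0 (assumption)."""
    nx, ny = math.hypot(*x), math.hypot(*y)
    if nx == 0.0 or ny == 0.0:
        return 0.0
    cos_xy = (x[0] * y[0] + x[1] * y[1]) / (nx * ny)
    cs1 = min(1.0, max(0.0, (1.0 + cos_xy) / 2.0))
    return (cs1 ** beta) * (min(nx, ny) / max(nx, ny))


def global_motion_consistency_map(flow, region, beta):
    """ICS between the average vector of `region` and every vector in `flow`."""
    region = list(region)
    # Single pass over the region: cost is linear in the region size.
    mean_u = sum(flow[r][c][0] for r, c in region) / len(region)
    mean_v = sum(flow[r][c][1] for r, c in region) / len(region)
    return [[ics((mean_u, mean_v), vec, beta) for vec in row] for row in flow]


flow = [
    [(1.0, 0.0), (1.0, 0.0), (0.0, 0.0)],
    [(0.0, 1.0), (-1.0, 0.0), (0.5, 0.0)],
]
gmcm = global_motion_consistency_map(flow, [(0, 0), (0, 1)], beta=2.0)
for row in gmcm:
    print(row)
```

Pixels whose motion matches the initial region in both direction and magnitude score close to 1, so region evolution on this map can be driven by a single initial region at a time.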
After defining the objective function of ICS-TDS given by Eq. (14), and referring to the localization method in Eq. (6), we can obtain the objective function of ICS-based LTDS (ICS-LTDS), given by Eq. (16), by redefining $F[A(x), x]$ with ICS. In Eq. (16), $F_{ICS}[B(\tau, x), x]$ is the mean of the sum of the squares of the ICS-based errors in the neighborhood $l(x)$, which can be used to form the ICS-based LMCM; the definitions of the other parameters are the same as those in Eq. (6). It should be noted that the purpose of localization is to prevent the global parameter $a(\Omega)$ or $b(\Omega)$ from affecting the error estimation in the local range; therefore, $B(\tau, x)$ in $F_{ICS}[B(\tau, x), x]$ should be estimated not by the average vector $\bar{U}_{\Omega(\tau)}$ of $\Omega(\tau)$ defined by Eq. (15), but by the average vector $\bar{U}_{l(x)}$ of $l(x)$, as follows:
$$B(\tau, x) \approx \bar{U}_{l(x)}. \tag{17}$$
Similar to TDS and LTDS, the minimization of the objective functions in Eqs. (14) and (16) can also be performed by the active contour algorithm. The evolution equation of ICS-TDS, Eq. (18), can be obtained by rewriting Eq. (4). As $F_{ICS}[B(\tau, x), x]$ is not an even function, vectors with opposite directions can be well distinguished. Therefore, the evolution equation of the active contour algorithm corresponding to ICS-LTDS, Eq. (19), can be rewritten by referring to Eq. (7).

Acquisition of Initial Regions
For the active contour algorithm, the selection of the initial regions has a significant impact on the final segmentation results. The segmentation results obtained from erroneous initial regions may be intertwined with the correct segmentation results; therefore, they cannot be eliminated.
To obtain suitable initial regions of evolution, Wu and Wong [20] first selected the minimum point $p$ in the LMCM and then calculated the mean $\bar{F}_p$ of $F$ and the mean optical flow magnitude $\bar{M}_p$ in the neighborhood of $p$. Only the neighborhood of a point with $\bar{F}_p$ and $\bar{M}_p$ higher than given thresholds, which indicates that the motion consistency in its neighborhood is good enough and the possibility that its neighborhood is the foreground is sufficiently high, can be considered as the initial region. However, when this initial region selection is used for medium- or low-density crowds, if the optical flow estimation error leads to some larger magnitudes of optical flow in the background, some erroneous initial regions may be selected in the background, as shown in Fig. 6(b). In Fig. 6(b), the initial regions in the red circle are all erroneous initial regions.
Although the magnitudes of optical flow in these erroneous regions are relatively large, the gray level changes in these background regions are actually small and can be easily estimated by the frame difference method. Therefore, in addition to $\bar{F}_p$ and $\bar{M}_p$, we introduce the regional mean $\bar{D}_p$ based on the frame difference for initial region determination. Only when $\bar{F}_p$, $\bar{M}_p$, and $\bar{D}_p$ are all higher than given thresholds, which are, respectively, called the local error threshold $TH_F$, the local motion magnitude threshold $TH_M$, and the local frame difference magnitude threshold $TH_D$, can the neighborhood of the minimum point $p$ be confirmed as the initial region. The initial regions extracted after the introduction of $\bar{D}_p$ can be seen in Fig. 6(c). By comparing Figs. 6(b) and 6(c), it can be seen that most of the erroneous initial regions in the background are removed.
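The two foreground tests on the motion magnitude and the frame difference can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names are hypothetical, images are nested lists of gray levels, the frame difference is assumed to be the absolute gray-level difference of two consecutive frames, windows are clipped at the image border, and the consistency test on the local error mean is omitted.

```python
def window_mean(image, row, col, size=5):
    """Mean of `image` over a size x size window centered at (row, col).

    The window is clipped at the image border (an assumption).
    """
    half = size // 2
    rows = range(max(0, row - half), min(len(image), row + half + 1))
    cols = range(max(0, col - half), min(len(image[0]), col + half + 1))
    values = [image[r][c] for r in rows for c in cols]
    return sum(values) / len(values)


def passes_foreground_tests(flow_mag, prev_frame, next_frame, row, col,
                            th_m, th_d, size=5):
    """True if both the local motion magnitude and frame difference are high."""
    frame_diff = [[abs(a - b) for a, b in zip(row_next, row_prev)]
                  for row_next, row_prev in zip(next_frame, prev_frame)]
    m_p = window_mean(flow_mag, row, col, size)    # local motion magnitude mean
    d_p = window_mean(frame_diff, row, col, size)  # local frame difference mean
    return m_p > th_m and d_p > th_d


# A background patch with spuriously large optical flow but unchanged gray levels.
flow_mag = [[3.0] * 7 for _ in range(7)]
static = [[10.0] * 7 for _ in range(7)]
print(passes_foreground_tests(flow_mag, static, static, 3, 3, th_m=2.0, th_d=1.3))
```

In this example the candidate is rejected because the frame difference is zero, even though the optical flow magnitude exceeds its threshold.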

Algorithm Flow of the Proposed Method
As mentioned above, the ICS-based LMCM is more suitable for segmenting high-density crowds with slow consistent motion because it is based on the local ICS error constraints, which reflect the motion consistency between local regions well. In contrast, the ICS-based GMCM is based on the ICS error constraints between all vectors in the motion field and a given TD; therefore, it is more suitable for segmenting motion regions that are consistent with the given TD. If the initial regions contain all possible motion modes, the GMCMs corresponding to all the initial regions can be used to segment a moving crowd with medium or low density and complex motion modes.
In summary, the detailed flow of the proposed crowd motion segmentation algorithm based on ICS-LTDS/ICS-TDS is shown in Algorithm 1.
Constructing a set of the initial regions $\{C_0^1, C_0^2, \ldots, C_0^m\}$ by the undeleted points $\{p_1, p_2, \ldots, p_m\}$ in $P$
(11) if L == "high" // crowd has high density and slow motion
(12)   for j = 1 to m
(13)     Region evolution through the level set algorithm based on the ICS-LTDS-based evolution equation [Eq. (19)] with the initial region $C_0^j$ in LM
(14)   end for
(15)   Regions with multiple motion directions $\{C^1, C^2, \ldots, C^m\}$ are obtained
(16) else // crowd has medium or low density and complex motion mode
(17)   for j = 1 to m
(18)     Calculating the GMCM $GM_j$ corresponding to $C_0^j$ based on ICS
(19)     Region evolution through the level set algorithm based on the ICS-TDS-based evolution equation [Eq. (18)] with the initial region $C_0^j$ in $GM_j$
(20)   end for
(21)   Regions with multiple motion directions {C^1, C

Experiments and Discussion
In this study, two series of experiments were conducted to verify the performance of Algorithm 1. In the first series of experiments, the proposed ICS-LTDS algorithm was applied for the segmentation of high-density slow-moving crowds. In the second series of experiments, the proposed ICS-TDS algorithm was used to segment medium- or low-density crowds with complex motion modes. Most crowd video/image sequences used in the experiments were taken from two public datasets, i.e., the UCF dataset [22] and the UCSD dataset [28], and a small part from YouTube (https://www.youtube.com/). In addition, methods based on flow field models, specifically, the LTDS algorithm [20], TDS algorithm [36], LPD algorithm [22], streak flow-based algorithm [24], and the prior knowledge-based trajectory tracking algorithm [16], and recently proposed typical nonflow-based methods, specifically, the CF algorithm [32] and TCA [34], were applied to the same dataset for performance comparison. During the experiments, all algorithms were run on the MATLAB 8.6 platform in the Windows 10 Pro environment. To evaluate the segmentation performance of the algorithm, the ground truth was set manually for some representative images. Meanwhile, on the basis of the ground truth, the numerical segmentation accuracy (SA) corresponding to different directions was given by means of the Jaccard similarity coefficient [39] among sets. The numerical SA is calculated as follows:
$$SA_i = \frac{|\Omega_i \cap T_i|}{|\Omega_i \cup T_i|}, \tag{20}$$
where $i$ represents the index of the regions corresponding to different motion directions, $\Omega_i$ is the segmentation result corresponding to $i$, and $T_i$ is the ground truth region corresponding to $i$.
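The Jaccard-based accuracy can be computed directly on sets of pixel coordinates. A minimal sketch, assuming that each region is given as a collection of `(row, col)` tuples and that two empty regions are scored as a perfect match (an assumption made here, since the coefficient is undefined in that case); the function name is hypothetical.

```python
def segmentation_accuracy(segmented, ground_truth):
    """Jaccard similarity between a segmented region and its ground truth."""
    segmented, ground_truth = set(segmented), set(ground_truth)
    union = segmented | ground_truth
    if not union:
        return 1.0  # assumption: two empty regions agree perfectly
    return len(segmented & ground_truth) / len(union)


segmented = {(0, 0), (0, 1), (1, 0)}
ground_truth = {(0, 1), (1, 0), (1, 1)}
print(segmentation_accuracy(segmented, ground_truth))  # 2 shared / 4 total = 0.5
```

Computing one such value per motion direction gives the per-direction accuracies; the highest, lowest, and average over a sequence then follow from `max`, `min`, and the mean of the per-frame values.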

Segmentation of High-Density Slow-Moving Crowds
Algorithm 1 is applied for segmentation of high-density slow-moving crowds, as described in Sec. 3.5, in the LMCM by using the evolution equation [Eq. (19)] based on the objective function of ICS-LTDS ($\beta = 9.9969$). In all experiments, if the size of the entire image is mapped to $[0, 1] \times [0, 1]$, the size of the neighborhood $l(x)$ is 0.02 × 0.02. The initial region of evolution is centered on the minimum of the LMCM and its size is 5 pixels × 5 pixels. The thresholds $TH_F$ and $TH_M$ are set to 0.06 and 2, respectively. The threshold $TH_D$ is set to values between 1.3 and 3.4, depending on the image sequence. The window size for calculating $\bar{F}_p$, $\bar{M}_p$, and $\bar{D}_p$ is 5 pixels × 5 pixels. The constant $\gamma$ in $G$ is set to 0.1. The constant $\mu$ is given by Eq. (21), following LTDS [20], where $\Delta\theta$ is the angle difference between the two regions to be segmented, which can first be set from $R_\theta^\beta$ and then adjusted according to the actual situation; $\bar{F}$ and $\bar{G}$ are the average values of $F$ and $G$ in the initial region. For the settings of LTDS, parameters shared with ICS-LTDS, such as the size of $l(x)$, $TH_F$, $TH_M$, $\gamma$, and $\mu$, are all set to the same values as in ICS-LTDS for performance comparison. The representative segmentation results of the flow field model-based methods (i.e., ICS-LTDS, LTDS, LPD, streak flow, and prior knowledge-based trajectory tracking) for four image sequences of high-density crowds with slow motion are shown in Fig. 7. The SA calculated according to Eq. (20) is shown in Table 1.
To compare the segmentation results of the proposed ICS-LTDS algorithm with those of the typical nonflow-based methods in scenarios with high-density slow-moving crowds, we selected pilgrims and marathon as test sequences, and applied the proposed ICS-LTDS, TCA, 34 and the CF-based algorithm 32 to them. Representative segmentation results are shown in Fig. 8, and the segmentation accuracies calculated according to Eq. (20) are shown in Table 2. For TCA, the following parameter settings were used: p = 16, r = 40 (10 for pilgrims), τ = 30, Δ = 100, δ = 1, α = 0.1, β = 20, and γ = 150; for the CF algorithm, K = 20, z = 0.025, the upper bound of Φ was taken as 1, and the threshold coefficient α was 0.5.
Because slow-moving traffic flow and mixed flow behave similarly to the flow of high-density slow-moving crowds, the proposed ICS-LTDS algorithm can also be applied to them. We selected two image sequences for algorithm evaluation in this noncrowd scenario. The flow-based ICS-LTDS, LTDS, and LPD, and the nonflow-based TCA were selected for evaluation and comparison in the same noncrowd scenario. Representative segmentation results for the two image sequences of noncrowd scenes are shown in Fig. 9. The segmentation accuracies calculated according to Eq. (20) are shown in Table 3.
Further, we manually obtained the ground truth of 100 consecutive frames from the marathon sequence to perform evaluation on a sequence containing a slow-moving crowd. The segmentation results for five representative consecutive frames of the three best-performing flow-based algorithms (i.e., the proposed ICS-LTDS, LTDS, 20 and LPD 22) and the nonflow-based CF algorithm 32 are shown in Fig. 10. The highest, lowest, and average segmentation accuracies of all 100 frames in the entire sequence are shown in Table 4.
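The sequence-level statistics mentioned above (highest, lowest, and average SA over all frames) amount to a simple reduction over per-frame SA values. A minimal Python sketch is given below; the function name `summarize_sa` is hypothetical and the input values are made-up placeholders, not results from Table 4.

```python
def summarize_sa(per_frame_sa):
    """Return the highest, lowest, and average SA over a sequence of frames."""
    values = [float(v) for v in per_frame_sa]
    if not values:
        raise ValueError("at least one per-frame SA value is required")
    return {
        "highest": max(values),
        "lowest": min(values),
        "average": sum(values) / len(values),
    }


# Placeholder per-frame values for illustration only.
summary = summarize_sa([0.5, 0.75, 1.0])
```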
From the results in Figs. 7-10 and Tables 1-4, the following are clear: (1) As the principles of particle flow and streak flow are similar, in that both process motion information recorded over a period, the segmentation results of LPD and streak flow are similar. The proposed ICS-LTDS is an improvement of LTDS, so the segmentation results of these two methods are also similar. (2) Although particle flow and streak flow are highly suitable for segmenting slow-moving targets, when the speed of the target decreases and tends toward a stop, both methods assume that the target has not moved for a period and thus abandon the extraction of the target. For example, in the highway traffic sequence (last row in Fig. 9), the traffic that decelerates and gradually stops because of congestion is not segmented by LPD. In contrast, as ICS-LTDS and LTDS are based on real-time optical flow, both segment these targets. (3) As the prior knowledge-based trajectory tracking method is used in the sparse optical flow field formed by feature points, it is only suitable for the motion segmentation of people in extremely crowded places. For a scene where the feature points are not easily extracted or the crowd has a looser distribution, missed segmentation is highly possible, as shown in the last row of Figs. 7(a)-7(d). (4) From the discussion in Sec. 3.3, as the objective function of LTDS only distinguishes the direction differences between vectors regardless of the magnitude differences, it is prone to produce undersegmentation results between foreground and background. This phenomenon is particularly evident in the segmentation results for the roundabout and highway traffic sequences in Fig. 9(c), and the marathon sequence in Fig. 10(c), which results in a lower SA than ICS-LTDS with the improved objective function. (5) As TCA needs to record the complete trajectories of the keypoints, it cannot cope with short-term changes of the moving regions. This makes it easy to produce undersegmentation between the foreground and background, as shown in Figs. 8(c) and 9(e). (6) Except for the pilgrims and kabba sequences, the SA of ICS-LTDS is higher than that of other methods, indicating that the ICS-LTDS algorithm is more suitable for segmentation of high-density slow-moving crowds than existing flow-based and nonflow-based crowd segmentation algorithms.

Segmentation of Medium- or Low-Density Crowds with Complex Motion Modes
Algorithm 1 is applied to medium- or low-density crowds with complex motion modes, as described in Sec. 3.5, in the GMCM corresponding to each initial region by using the evolution equation [Eq. (18)] based on the objective function of ICS-TDS (β = 9.9969). In all experiments, the average vector of the initial region is taken as b(τ) to calculate the GMCM. Other parameters, such as TH_F, TH_M, TH_D, γ, and μ, are all consistent with ICS-LTDS.
In LTDS, the evolution of the objective function is performed in the LMCM, which is quite different from ICS-TDS. Therefore, in addition to LTDS, segmentation results of TDS, which also performs the evolution in the GMCM, are included for comparison. However, as the basic TDS proposed in Ref. 36 is only applicable to the unit vector field, it cannot be directly used in the actual scenes, as shown in Fig. 10. To make TDS available for crowd segmentation in the actual motion field, by referring to LTDS, we improve TDS by normalizing the motion vector and adding the foreground motion area constraint to the objective function of TDS [Eq. (22)]. The TDS defined by Eq. (22) can be called the TDS with G, abbreviated as TDS-G. According to Eqs. (4) and (7), the evolution equation of TDS-G is defined by Eq. (23).
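The normalization of the motion vectors mentioned above, which turns an actual motion field into a unit vector field, can be illustrated with the following Python sketch. It covers only the normalization step and not the TDS-G objective function or its evolution equation. The function name `normalize_flow`, the (H, W, 2) array layout, and the choice to map near-zero vectors to zero are assumptions of this sketch.

```python
import numpy as np


def normalize_flow(flow, eps=1e-8):
    """Scale each 2-D motion vector of an (H, W, 2) field to unit length.

    Vectors whose magnitude does not exceed `eps` are set to zero instead of
    being divided (an assumption of this sketch, made to avoid division by zero).
    """
    flow = np.asarray(flow, dtype=float)
    mag = np.linalg.norm(flow, axis=-1, keepdims=True)
    unit = np.zeros_like(flow)
    np.divide(flow, mag, out=unit, where=mag > eps)
    return unit


# Toy field with one moving pixel and one static pixel.
flow = np.array([[[3.0, 4.0], [0.0, 0.0]]])
unit = normalize_flow(flow)  # first vector becomes (0.6, 0.8), second stays zero
```

After this step only the direction of each vector remains, which is consistent with the observation that an objective function defined on such a field cannot distinguish differences in magnitude.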

In addition, among other methods based on flow field modeling, the particle flow-based LPD is selected for algorithm comparison. To facilitate the comparison among algorithms, the common parameters of ICS-TDS, TDS-G, and LTDS are given the same values. The representative segmentation results of ICS-TDS, TDS-G, LTDS, and LPD applied to four image sequences are shown in Fig. 11, and the segmentation accuracies are shown in Table 5.

Table 1 Segmentation accuracies of the proposed ICS-LTDS, LTDS, LPD, streak flow, and prior knowledge-based trajectory tracking for the scenes in Fig. 7.
It should be noted that in a close-up scene containing multidirectional pedestrians, as shown in the fourth row of Fig. 10, the term T_i in Eq. (20) is difficult to estimate due to the numerous moving directions of the pedestrians, mutual occlusion, and even differences in the moving directions of different parts of the same body, which makes it impossible to calculate the SA separately for different directions according to Eq. (20). Considering that the SA should simultaneously measure the similarity between both the area and the direction of the segment and the ground truth, on the basis of Eq. (20), we add the direction consistency measure between Ω_i and Ω_i ∩ T, so that the mean of the SA for different directions can be directly calculated without segmenting the ground truth T. The SA after adding the direction consistency measure is as follows:

From the results shown in Figs. 11-13 and Tables 5-7, the following is clear: (1) The particle flow-based LPD algorithm is not suitable for segmenting crowds with loose distribution, especially for segmenting crowds with complex motion modes. For example, in the segmentation results of crosswalk, as shown in the first row in Fig. 11(e) and the last row in Fig. 13(d), there are obvious missing or incorrect segmentations. Meanwhile, in the segmentation results of multidirectional pedestrians, several pedestrians with distinctly different motion directions are segmented in the same region, resulting in severe undersegmentation [as shown in Fig. 14, which is the zoomed-in version of the last row of Fig. 11(e)]. (2) As LMCM is based on the motion consistency between local regions, when pedestrians with little difference in motion direction occlude each other, their motion boundaries are not clear enough in LMCM, which makes LTDS prone to produce undersegmentation results for pedestrians that occlude each other, as shown in Fig. 15 [the zoomed-in version of the last row of Fig. 11(d)]. (3) Although TDS-G and LTDS have almost the same objective function, TDS-G evolves in the GMCM while LTDS evolves in the LMCM. Except for UCSD pedestrians, the segmentation results of TDS-G are better than those of LTDS. This indicates that the method based on the evolution in the GMCM is more suitable for segmenting medium- or low-density crowds with complex motion modes. As TDS-G has the same objective function as LTDS, which only distinguishes the vector direction but not the vector magnitude, the undersegmentation results between foreground and background caused by the objective function are still serious in the scenes shown in Figs. 11 and 13; therefore, their SA is significantly lower than that of ICS-TDS. (4) As TCA cannot cope with short-term changes in the moving regions, when it is applied to crowd scenes containing complex motion patterns, the segmentation results still contain severe undersegmentation, as shown in Fig. 16(b) [zoomed-in version of the last row of Fig. 12(c)]. Meanwhile, as the coherent neighbor invariance mainly focuses on the pairwise motion consistency and ignores the motion difference among all the keypoints in the local region of the center point, when the CF algorithm is applied to crowd scenes containing complex motion patterns, undersegmentation results are easily produced, as shown in Fig. 16(c) [zoomed-in version of the last row of Fig. 12(d)]. (5) For the four scenes shown in Fig. 10, ICS-TDS is superior to other methods in terms of both segmentation effect and SA, indicating that the ICS-TDS algorithm is more suitable for segmenting medium- or low-density crowds with complex motion modes than existing flow-based and nonflow-based crowd motion segmentation algorithms.

Conclusion
In this study, a TDS model based on ICS was proposed; this model can be used to segment moving crowds with different crowding levels and complex motion modes. The method reconstructs the objective function of the TDS model by ICS so that the objective function can simultaneously measure the difference between both magnitude and direction of two vectors; thus, it can effectively avoid undersegmentation between the foreground and background due to the difference in magnitude. By switching between the "localization" and "globalization" modes of the objective function, i.e., performing the region evolution in LMCM or GMCM, the algorithm can be applied to segment crowds with different densities and motion states. In addition, by introducing the local motion magnitude threshold and local frame difference magnitude threshold simultaneously, the nonforeground regions can be excluded from the initial regions of evolution; thus, the undersegmentation and mis-segmentation results caused by erroneous initial regions are greatly reduced. The experimental results showed that for a variety of complex scenes containing moving crowds, the segmentation performance and accuracy of the proposed ICS-LTDS/TDS-based crowd motion segmentation method are superior to existing flow field-based motion segmentation methods.

Fig. 1 Architecture of the proposed method.

Fig. 2 Imbalance of vectors with the same ICS defined in Eq. (11). (a) ICS between the two vectors with no difference in direction but half the difference in magnitude [i.e., ICS(x, y_1)] is the same as the ICS between the two vectors perpendicular to each other with the same magnitude [i.e., ICS(x, y_2) and ICS(x, y_3)]. (b) An optical flow field containing the optical flow vectors having the relationship described in panel (a) (the vector in red is the given reference vector x).

Fig. 3 Relationship between R_θ^β and β. When the minimum similarity s_min is given, i.e., ICS(x, y, β) ≥ s_min, the larger β is, the smaller R_θ^β will be.

= 2 ×
5, θ_xy is limited to π/6 irrespective of the change in k_M, as shown in the shaded area in Fig. 4(b). In other words, when β = 9.9969, in the limit case, the ICS with θ_xy = 0 and k_M = 0.5 is the same as the ICS with θ_xy = π/6 and k_M = 1, i.e., ICS(x, y_1, 9.9969) = ICS(x, y_2, 9.9969) = ICS(x, y_3, 9.9969) [as shown in Fig. 4(a)]; thus, the extreme imbalance between the magnitude difference and the direction difference shown in Fig. 2 is avoided. In this manner, when using the ICS to perform the segmentation of a TD, if the minimum similarity s_min = 0.5 is given, the maximum directional angle difference among all vectors in the TD is restricted to 2R_θ^{9.9969}(π/6) = π/3, and the normalized maximum magnitude difference is limited to [0.5, 1].

Fig. 5 Undersegmentation caused by the magnitude difference between foreground and background. On the left, the vectors of the foreground (gray shaded region) and background (green shaded region) have no difference in direction, but they have an obvious difference in magnitude. When the initial region Ω of the active contour algorithm is inside Ω_1 (as indicated by the red dashed line on the left), if the magnitude difference is not equalized due to improper selection of μ, the evolution result Ω′ (the area within the red border on the right) will contain both the foreground and the background, i.e., Ω′ = Ω_1 ∪ Ω_2.

− ICS(b(τ), U(x), β))^2 dx

Fig. 6 Example of the acquisition of initial regions. (a) The minimum (red points) extraction results in LMCM; (b) the initial regions extracted by F_p and M_p, where the initial regions within the red circle are the erroneous initial regions in the background; and (c) the initial regions extracted after the introduction of D_p, where all the erroneous initial regions in (b) are removed.

Fig. 7 Image sequences of high-density slow-moving crowds used for testing: (a) Mecca, (b) pilgrims, (c) marathon, and (d) kabba. Representative segmentation results (top to bottom): ground truth images, segmentation results of the proposed ICS-LTDS, segmentation results of LTDS, segmentation results of LPD, segmentation results of streak flow, and segmentation results of prior knowledge-based trajectory tracking.

Fig. 13 Segmentation results of the representative five consecutive frames (top to bottom) in the crosswalk sequence: (a) ground truth; (b) segmentation results of proposed ICS-TDS; (c) segmentation results of TDS-G; (d) segmentation results of LPD; and (e) segmentation results of CF algorithm.

Table 7 Highest, lowest, and average segmentation accuracies of all 100 frames in the crosswalk sequence.