Pointer-type instrument positioning method of intelligent inspection system for substation

Abstract. Robot intelligent inspection is widely used in the positioning of various pointer instruments in power, petroleum, chemical, and other industries. Aiming at the technical problems of poor adaptability, poor real-time performance, and low positioning accuracy of the pointer instrument positioning method in the existing substation intelligent inspection robot system, we propose a simple and effective pointer instrument positioning detection algorithm. The algorithm first extracts locally adaptive regression kernels (LARK) features of the input image, and the dimension of the LARK feature is reduced using the principal components analysis algorithm. Then, the template image is slid in the input image, the cosine similarity is used as an evaluation index, and the Fourier transform is used to accelerate the convolution operation in the cosine similarity calculation. Finally, the accelerated-KAZE algorithm is used to extract the feature points of the pointer-type instrument area image and the template image, and the statistical method of grid motion was used to eliminate the wrong matching points. The remaining matching points were processed by random sample consensus algorithm, and the homography matrix was obtained. The image registration was completed by the homography matrix, and the pointer-type instrument region positioning was realized. The experimental results show that the proposed method has good adaptability, strong real-time performance, and high accuracy of pointer-type instrument positioning.


Introduction
With the development of industrial automation, image processing, and pattern recognition technology, robot intelligent inspection is needed for the pointer instruments widely used in electric power, petroleum, chemical, and other industries, to replace manual inspection, improve inspection efficiency, and reduce the risk and cost of manual inspection, as shown in Fig. 1. In the existing substation intelligent inspection robot system, there are generally two methods for positioning pointer instruments 1,2 : (1) Assuming that the meter is circular, circle detection is performed by the Hough algorithm to locate the pointer instrument in the image to be recognized. (2) The feature points of the pointer instrument and of the template image are detected, and the pointer instrument is located by matching the corresponding feature points. Method (1) requires the pointer instrument to be circular and the surrounding environment to be free of other circular objects. It also requires the camera to be fixed, which limits its practical application. Hough transform circle detection has high time complexity and poor real-time performance, and it takes a long time when there are multiple pointer instruments in the image. In addition, the view-angle deflection of the pointer instrument cannot be too large; otherwise, the instrument cannot be detected. Method (2) requires that there be no other objects around the pointer instrument and that the meter occupy the main position in the input picture. In addition, the pointer instrument must have stable feature points; otherwise, feature points cannot be extracted from the instrument area, and positioning fails.
In addition, conventional machine learning and deep learning methods are prone to over-fitting 3 because few samples are available, even when the sample images are generalized, and this reduces the positioning accuracy. Target detection algorithms combined with classifiers also require a large number of samples to train each classifier, whereas calibrated samples of the various instruments are not available for the substation scene. Therefore, these methods are not suitable for pointer instrument positioning in substation scenarios. Accurate positioning is the basis of pointer-type instrument recognition. Positioning is difficult because the shooting angle, time, and distance vary, so the illumination, position, posture, size, and color of the instrument in the picture vary as well. 4 In addition, a substation contains a variety of instruments whose textures are not consistent, so a feature point matching algorithm applied directly is disturbed by other objects in the environment, resulting in positioning failure. 5 In this paper, to address the positioning problem of the pointer instrument in the substation intelligent inspection robot system, we use the locally adaptive regression kernels (LARK) feature, which is stable under different illumination and noise conditions. The principal components analysis (PCA) algorithm is used to highlight the edge features of the instrument and improve the accuracy of the instrument positioning. At the same time, multiscale detection on LARK feature maps of the input image at different scales effectively prevents missed detections and is also applicable when there are multiple pointer instruments.
Because matching the sliding window against the input image of the pointer instrument is slow, the Fourier transform is used to accelerate the convolution operation and improve the positioning speed. Finally, image registration corrects the positioning error caused by the view-angle deflection of the pointer-type instrument in the input image, improves the registration accuracy of the pointer-type instrument area positioning, and provides high-precision positioning for subsequent pointer-type meter reading recognition. In summary, the proposed method uses the Fourier transform to accelerate the convolution operation and improves on the plain cosine similarity. The method based on grid motion statistics is used to eliminate the wrong matching points, and a secondary registration of the image is completed. Together, these steps greatly improve the identification speed of the pointer instrument and effectively reduce the influence of illumination conditions, angle deflection, and other factors on the identification accuracy.

Instrument Feature Extraction
The features of the pointer instrument should be strongly distinguishable from those of the background. The LARK feature is selected as the feature of the pointer instrument. The LARK feature was proposed by Haejong 6 in 2010. It is mainly used to describe the characteristics of general objects and is applied in saliency detection, object detection, and motion detection. 7 Figure 1(b) shows a typical pointer instrument. The positioning task for pointer instruments is to find, in the image to be matched, an area similar to the template image. The input image of the pointer instrument to be identified and positioned is shown in Fig. 2, and the template image is shown in Fig. 3. Because pixel values are unstable under illumination changes and random noise, direct pixel matching cannot achieve the required accuracy. Therefore, it is desirable to extract a feature that remains stable under illumination changes and random noise. Such a stable feature should also be highly distinguishable from the background. The LARK feature is stable with respect to illumination, Gaussian noise, and perspective, and it differs strongly between the pointer instrument area and the nonpointer instrument area, so it can effectively distinguish the pointer instrument from the background image. Therefore, this paper selects the LARK feature as the feature of the pointer instrument.
The essence of LARK is to obtain local features. The LARK feature is a function related to the geodesic distance of the pixel in the center of the window and its surrounding pixels. Taking the 5 × 5 window as an example, select pixel X13 as the center, as shown in Fig. 4(a). Each pixel in the window is calculated using Eq. (1), and the image composed of the geodesic distance value is obtained, as shown in Fig. 4(b). Finally, the LARK feature map describing pixel X13 is obtained by Eq. (2).
The formulas for calculating the geodesic distance and the LARK feature values are given in Eqs. (1) and (2), where $dz$ represents the difference between the two coordinate pixels, $ds^2$ represents the arc length of the image pixel, and $K$ represents the local kernel of the central pixel.
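The window computation described above can be illustrated with a short sketch. It is not the formulation of Eqs. (1) and (2) themselves. It assumes a commonly used simplified form in which the squared arc length is $ds^2 = dx^2 + dy^2 + dz^2$, with $dz$ approximated from the image gradient at each neighbor, and the kernel is $K = \exp(-ds^2 / 2h^2)$. The function name `lark_window`, the smoothing parameter `h`, and the final normalization are hypothetical choices made for illustration.

```python
import numpy as np

def lark_window(image, row, col, window=5, h=1.0):
    """Simplified LARK-style weights for the window centred on (row, col).

    Assumed form (not taken from the paper's equations):
        dz   = gx * dx + gy * dy          (first-order intensity change)
        ds^2 = dx^2 + dy^2 + dz^2         (squared arc length on the image surface)
        K    = exp(-ds^2 / (2 h^2))       (local kernel value)
    The window must lie fully inside the image.
    """
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)              # gradients along rows (y) and columns (x)
    r = window // 2
    offsets = np.arange(-r, r + 1)
    dy, dx = np.meshgrid(offsets, offsets, indexing="ij")

    # Gradients at every neighbour inside the window.
    win = (slice(row - r, row + r + 1), slice(col - r, col + r + 1))
    dz = gx[win] * dx + gy[win] * dy

    ds2 = dx ** 2 + dy ** 2 + dz ** 2
    kernel = np.exp(-ds2 / (2.0 * h ** 2))
    return kernel / kernel.sum()           # normalise so the weights sum to 1
```

On a vertical step edge, the weights in this sketch stay high along the edge and drop sharply across it, which is the behaviour that makes the feature follow local structure rather than raw intensity.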

Feature Dimension Reduction
Extracting the LARK feature through the aforementioned process is a high-density feature, and it is apparent that the high-density feature contains a large amount of redundant information and noise. In addition, high-dimensional space has the problem of sample sparseness and difficulty in calculating distance. Therefore, data dimensionality reduction can reduce the error caused by redundant information, improve the accuracy of identification matching, and accelerate subsequent calculations.
Two dimensionality reduction algorithms are considered: PCA and locality preserving projection (LPP). PCA is used to extract the main features of the data; it transforms the original data into a group of linearly independent representation vectors through a linear transformation. 8 LPP is a locality preserving projection method based on a linear approximation of Laplacian eigenmaps. 9 It uses a linear mapping to approximate nonlinear dimensionality reduction and thus combines the advantages of manifold learning and linear dimensionality reduction. However, in addition to retaining the edge information of the instrument, the LPP algorithm also retains other edge information, whereas the PCA algorithm effectively preserves the edge information of the instrument, which is distinct from the background, and better represents the features of the image after dimension reduction.
PCA transforms the raw data, by a linear transformation, into a set of representation vectors that are linearly independent in each dimension. From the perspective of optimization, the loss is minimal when the high-dimensional features are mapped to low-dimensional feature data, that is, the reconstruction from the low-dimensional features back to the high-dimensional space is closest to the original point. Suppose there is a matrix of $m$ samples of $n$ dimensions, $X = \{x_1, x_2, x_3, \ldots, x_m\}^T$, and $W$ is an $n$-dimensional orthogonal matrix, $W = \{w_1, w_2, w_3, \ldots, w_m\}$, which satisfies Eq. (3). According to the principle of minimum loss, the required objective is given by Eq. (4). According to the relationship between the F norm and the matrix trace, Eq. (5) can be obtained. Finally, the PCA problem simplifies to Eq. (6):

$$\max_{W} \ \operatorname{trace}(W^T X^T X W) \quad \text{s.t.} \quad W W^T = I. \tag{6}$$

According to the definition of the eigenvector, the columns of $W$ are eigenvectors of the covariance matrix $X^T X$. The larger the eigenvalue, the richer the information contained in the direction of the corresponding eigenvector. Therefore, $W$ consists of the $K$ eigenvectors corresponding to the $K$ largest eigenvalues of $X^T X$.
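The step from the minimum-loss objective to the trace form follows a standard argument, sketched here for reference. The sketch assumes that $X$ is the centered $m \times n$ data matrix and that $W$ has orthonormal columns, $W^T W = I$ (for a square orthogonal $W$ this coincides with the constraint written in Eq. (6)). It is a generic derivation, not a reproduction of Eqs. (3) to (5).

```latex
\begin{aligned}
\| X - X W W^T \|_F^2
  &= \operatorname{trace}\!\left( (X - X W W^T)^T (X - X W W^T) \right) \\
  &= \operatorname{trace}(X^T X)
     - 2\,\operatorname{trace}(W^T X^T X W)
     + \operatorname{trace}(W^T X^T X W \, W^T W) \\
  &= \operatorname{trace}(X^T X) - \operatorname{trace}(W^T X^T X W).
\end{aligned}
```

Since $\operatorname{trace}(X^T X)$ does not depend on $W$, minimizing the reconstruction loss is equivalent to maximizing $\operatorname{trace}(W^T X^T X W)$.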
The PCA algorithm is used to reduce the dimensions of the LARK feature map of the input image, as shown in Fig. 5. Then, the LARK feature map of the reduced-dimensional input image is scaled to obtain a LARK feature map of the input image at a plurality of scales.
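As a minimal sketch of this dimension reduction step, the following NumPy code projects feature vectors onto the eigenvectors of $X^T X$ with the largest eigenvalues. The function name `pca_reduce` and the layout of the input (one feature vector per row) are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def pca_reduce(features, k):
    """Project row-wise feature vectors onto the top-k principal directions.

    features : (m, n) array, one n-dimensional feature vector per row (assumed layout)
    k        : number of components to keep
    Returns the (m, k) reduced features and the (n, k) projection matrix W.
    """
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)                  # centre the data
    scatter = X.T @ X                       # covariance matrix up to a 1/m factor
    eigvals, eigvecs = np.linalg.eigh(scatter)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest eigenvalues
    W = eigvecs[:, order]
    return X @ W, W
```

Using `np.linalg.eigh` rather than a general eigen-solver is a design choice: the scatter matrix is symmetric, so the eigenvalues are real and the eigenvectors are orthonormal.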

Similarity Calculation
For image detection, the sliding window method is generally used to detect the object. In this paper, the LARK feature image of the reduced-dimensional template image is used as the sliding window, and the sliding window is used to slide in the LARK feature image of the multiscale input image. The cosine similarity of the sliding window and the LARK feature image of the input image are calculated after each sliding, and multiple cosine similarities of the LARK feature image of the input image at each scale are obtained after multiple sliding. Cosine similarity is often used to offset high-dimensional Euclidean distance problems. It measures the similarity between two vectors by measuring the cosine value of the product space of two vectors, especially for the similarity comparison of high-dimensional vectors.
Since the sliding window needs to be matched against every block area, the algorithm executes too slowly. Therefore, the Fourier transform is used to accelerate the convolution operation in the cosine similarity calculation between the sliding window and the LARK feature map of the input image. 10 For the LARK feature, this paper uses the cosine similarity as the decision function; cosine similarity matching can be derived from the optimal Bayesian decision. 6 The calculation of cosine similarity is given in Eq. (7). The feature is in the form of a matrix, so the matrix cosine similarity is expressed by Eq. (8), where trace denotes the trace of a two-dimensional square matrix, that is, the sum of its diagonal elements. By transforming Eq. (8), Eq. (9) can be obtained. It can be seen from the formula that the value is equal to the cosine similarity of each feature multiplied by the inner product of . It is concluded that cosine similarity takes into account the similarity of both length and angle. Then, the cosine similarities are transformed into resemblance map (RM) similarities, and these RM similarities are used to construct the similarity graph of the input image at each scale. Instrument detection is performed on the similarity graph of the input image at each scale. Figure 6 shows the cosine similarity graph and the RM similarity graph of an input image.
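The role of the Fourier transform can be made concrete with a small sketch. It assumes the matrix cosine similarity takes the usual Frobenius form, $\rho = \operatorname{trace}(F_Q^T F_T) / (\|F_Q\|_F \|F_T\|_F)$, so that the numerator over all window positions is a cross-correlation and the local norm in the denominator is a correlation of the squared features with a box of ones. The function names and the `(height, width, channels)` array layout are hypothetical.

```python
import numpy as np

def _xcorr_valid(image, kernel):
    """'Valid' 2-D cross-correlation computed with the FFT."""
    H, W = image.shape
    h, w = kernel.shape
    shape = (H + h - 1, W + w - 1)
    # Correlation equals convolution with the kernel flipped in both axes.
    spec = np.fft.rfft2(image, shape) * np.fft.rfft2(kernel[::-1, ::-1], shape)
    full = np.fft.irfft2(spec, shape)
    return full[h - 1:H, w - 1:W]

def cosine_similarity_map(feature_map, template):
    """Matrix cosine similarity of the template with every window position.

    feature_map : (H, W) or (H, W, C) array of features of the input image
    template    : (h, w) or (h, w, C) array of features of the template
    Returns an (H - h + 1, W - w + 1) array of similarities in [-1, 1].
    """
    F = np.asarray(feature_map, dtype=float)
    Q = np.asarray(template, dtype=float)
    if F.ndim == 2:
        F, Q = F[..., None], Q[..., None]
    h, w, channels = Q.shape
    ones = np.ones((h, w))

    # Numerator: sum over channels of <window, template>.
    num = sum(_xcorr_valid(F[..., c], Q[..., c]) for c in range(channels))
    # Squared Frobenius norm of every window, also obtained by correlation.
    energy = sum(_xcorr_valid(F[..., c] ** 2, ones) for c in range(channels))

    denom = np.linalg.norm(Q) * np.sqrt(np.maximum(energy, 1e-12))
    return num / denom
```

The cost of the FFT route depends on the image size rather than on the product of image and template sizes, which is why the gain grows as the template becomes larger.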
However, cosine similarity alone cannot effectively distinguish the foreground from the background, so Eq. (10) is used to convert the cosine similarity $\rho_i$ to the RM similarity $f(\rho_i)$. In this way, several RM similarity maps of the input image at different scales are obtained. Compared with the cosine similarity $\rho_i$, using $f(\rho_i)$ as the RM similarity distinguishes background and foreground more effectively, as shown in Fig. 6(b). With the RM similarity, the extreme values are clearly more pronounced, so the difference between the foreground and the background is larger.
Each RM similarity graph is then examined. If the maximum similarity value of at least one similarity graph is greater than the set threshold of 0.5, it is confirmed that there is a pointer-type instrument area in the input image. For each similarity graph whose maximum similarity value is greater than the set threshold, the regions in the original input image corresponding to the top 1% of similarity values are selected as the preliminary pointer-type meter candidate regions. Generally, 20 to 30 preliminary pointer-type meter candidate regions are obtained, as shown in Fig. 7.
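The candidate selection just described might be sketched as follows. The mapping $f(\rho) = \rho^2 / (1 - \rho^2)$ is an assumption borrowed from the original LARK detection literature, since the exact form of Eq. (10) is not reproduced here. The threshold of 0.5 and the top 1% rule come from the text, and the function names are hypothetical.

```python
import numpy as np

def resemblance_map(rho, eps=1e-8):
    """Convert cosine similarities to RM similarities.

    Assumed mapping: f(rho) = rho^2 / (1 - rho^2), which stretches values
    near 1 and compresses values near 0.
    """
    rho2 = np.square(np.asarray(rho, dtype=float))
    return rho2 / np.maximum(1.0 - rho2, eps)

def candidate_positions(rm, threshold=0.5, top_fraction=0.01):
    """Return (row, col) window positions of the preliminary candidates.

    No candidates are returned unless the map maximum exceeds the threshold.
    Otherwise the positions with the highest `top_fraction` of values are kept.
    Each position is the top-left corner of a template-sized region.
    """
    rm = np.asarray(rm, dtype=float)
    if rm.max() <= threshold:
        return np.empty((0, 2), dtype=int)
    n_keep = max(1, int(round(top_fraction * rm.size)))
    flat = np.argsort(rm, axis=None)[::-1][:n_keep]    # largest values first
    rows, cols = np.unravel_index(flat, rm.shape)
    return np.column_stack([rows, cols])
```

Under this assumed mapping, a cosine similarity of 0.9 becomes about 4.26 while 0.1 becomes about 0.01, which illustrates why peaks stand out more clearly in the RM similarity graph.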
Since these regions overlap heavily, this paper uses the nonmaximum suppression (NMS) algorithm to eliminate the overlapping regions. The idea of the NMS algorithm is that if the overlap of two rectangular frames is greater than a certain threshold, only the rectangular frame with the largest similarity is retained, and the rectangular frames generated by small offsets are eliminated. The NMS algorithm filters out the duplicate candidate boxes well, as shown in Fig. 8. In this way, the pointer instrument area can be successfully detected by extracting the LARK feature, constructing the RM similarity map, and applying nonmaximum suppression.
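A minimal sketch of this greedy suppression is given below. It assumes that overlap is measured by intersection over union (IoU) and that boxes are stored as `(x1, y1, x2, y2)` corners. Both are illustrative choices, as is the default threshold of 0.5.

```python
import numpy as np

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes that overlap it too much.

    boxes  : (N, 4) array of (x1, y1, x2, y2) corners (assumed layout)
    scores : (N,) array of similarity values
    Returns the indices of the retained boxes, best first.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(scores)[::-1]        # highest similarity first
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Intersection rectangle between the best box and every remaining box.
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[best] + areas[rest] - inter)
        order = rest[iou <= iou_threshold]  # discard near-duplicates of the best box
    return keep
```

For example, two boxes shifted by one pixel collapse to the higher-scoring one, while a distant third box survives.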

Image Registration
In the substation intelligent inspection robot system, the shooting angle is constrained, so the image of the pointer instrument often has an angle deviation. Because this deviation is unknown, high-precision recognition of the pointer instrument reading cannot be completed directly. Therefore, it is necessary to register the image, transforming the attitude of the input image to that of the template image.
Image registration methods typically extract feature points, describe them with feature descriptors, and then match them to obtain a mapping matrix. Common feature point extraction and description algorithms include scale-invariant feature transform (SIFT), 11 speeded up robust features (SURF), 12 oriented FAST and rotated BRIEF (ORB), 13 accelerated-KAZE (AKAZE), 14 PCA-SIFT, 15 affine scale-invariant feature transform (ASIFT), 16 and learned invariant feature transform (LIFT). 17 Among them, the AKAZE algorithm not only maintains the advantages of a nonlinear scale space but also combines the stability and robustness of the key point detection of SIFT. It effectively reduces the computational complexity and the dimensionality of the feature description vectors and improves the effectiveness of feature extraction and the registration speed. Because the AKAZE algorithm also has a relatively small mean square error (MSE), this paper uses it to extract the feature points of the pointer-type instrument area image and the template image. Feature point matching is then performed. Common matching algorithms include brute force (BF) and the fast library for approximate nearest neighbors (FLANN). The BF algorithm tries all the possibilities to find the nearest neighbor, whereas the FLANN algorithm finds an approximate nearest neighbor without searching exhaustively for the optimal match. However, both methods produce a large number of erroneous matching points, as shown in Fig. 9, and these need to be eliminated. In this paper, the method based on grid motion statistics 18 is used: the image is divided into multiple small block regions, and if the matching positions in a block region are consistent, the matches in that region are judged to be correct; otherwise, they are judged to be mismatches.
After the error matching point is eliminated, the remaining matching points are obtained, as shown in Fig. 10.
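The grid idea can be illustrated with a deliberately simplified sketch. It only counts how many matches share the same pair of grid cells and keeps the matches whose cell pair is well supported. As far as we know, the published grid motion statistics method additionally pools support from neighboring cells and adapts the threshold to the match density, so this is an approximation of the principle rather than the method itself. The function name and the parameters `grid` and `min_support` are hypothetical.

```python
import numpy as np

def grid_filter_matches(pts1, pts2, size1, size2, grid=8, min_support=4):
    """Keep matches whose (cell in image 1, cell in image 2) pair is well supported.

    pts1, pts2   : (N, 2) arrays of matched (x, y) coordinates
    size1, size2 : (width, height) of the two images
    Returns a boolean mask of length N marking the retained matches.
    """
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)

    def cell_index(pts, size):
        width, height = size
        cx = np.clip((pts[:, 0] / width * grid).astype(int), 0, grid - 1)
        cy = np.clip((pts[:, 1] / height * grid).astype(int), 0, grid - 1)
        return cy * grid + cx

    c1 = cell_index(pts1, size1)
    c2 = cell_index(pts2, size2)

    # support[a, b] = number of matches going from cell a to cell b.
    support = np.zeros((grid * grid, grid * grid), dtype=int)
    np.add.at(support, (c1, c2), 1)

    # True matches move together, so their cell pair collects many votes,
    # whereas false matches scatter over many cell pairs.
    return support[c1, c2] >= min_support
```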
After the wrong matching points are eliminated, the remaining matching points are processed by the random sample consensus (RANSAC) algorithm, 19 and the homography matrix is obtained. The RANSAC algorithm iteratively searches a data set containing noise for the optimal homography matrix, so that the projection error between the model and all points in the sample set is minimized, that is, the value of Eq. (11) is the smallest, where $i$ indexes the points and $h$ denotes the parameters of the homography matrix.
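The following sketch shows one way such a RANSAC estimate could be organized. It assumes the projection error is the squared distance between each target point and the source point mapped through the homography, fits candidate models with a plain direct linear transform, and uses hypothetical names and defaults (`n_iter`, `tol`, `seed`). A production system would more likely rely on an existing library routine.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform from at least four point pairs ((N, 2) arrays)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)               # null vector of the design matrix
    return H / np.linalg.norm(H)           # scale is arbitrary for a homography

def reprojection_error(H, src, dst):
    """Squared distance between dst and src mapped through H, per point."""
    pts = np.column_stack([src, np.ones(len(src))]) @ H.T
    proj = pts[:, :2] / pts[:, 2:3]
    return np.sum((dst - proj) ** 2, axis=1)

def ransac_homography(src, dst, n_iter=500, tol=3.0, seed=0):
    """Return (H, inlier_mask) for the model with the largest consensus set."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        sample = rng.choice(len(src), size=4, replace=False)
        H = fit_homography(src[sample], dst[sample])
        with np.errstate(divide="ignore", invalid="ignore"):
            inliers = reprojection_error(H, src, dst) < tol ** 2
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 4:
        raise ValueError("no consensus set with at least four matches found")
    # Refit on all inliers so the final model minimises their total error.
    return fit_homography(src[best], dst[best]), best
```

Mapping the corners of the template through the returned matrix (or its inverse, depending on the direction of the match) then gives the registered instrument region.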
In this way, the image registration is completed by the homography matrix, and the precise positioning of the pointer instrument area in the input image is realized.

Experimental Analysis
The overall algorithm flow of the pointer instrument positioning in this paper is shown in Fig. 11. First, the LARK feature of the input image is extracted, the dimension of the LARK feature is reduced by the PCA algorithm, and the edge features are highlighted. The LARK feature of the reduced-dimensional template image is used as the sliding window and is slid over the LARK feature maps of the input image at multiple scales to obtain the cosine similarity graphs at different scales, with the Fourier transform accelerating the convolution operation in the cosine similarity calculation. Then, the cosine similarity graphs at different scales are transformed into RM similarity graphs, which form the similarity graphs of the input image at each scale, and instrument detection is carried out at each scale. If the maximum RM similarity in a similarity graph is greater than the threshold, the regions with the highest RM similarity are selected as the preliminary candidate regions of the pointer instrument. Since the selected regions overlap heavily, the NMS algorithm is used to exclude the overlapping regions and obtain the final pointer-type instrument candidate region. Finally, the AKAZE algorithm is used to extract the feature points of the pointer instrument area image and the template image, and matching is performed. The grid motion statistics method is used to eliminate the false matching points, and the RANSAC algorithm is then used to process the remaining matching points to obtain the homography matrix. Image registration is completed with the homography matrix, which achieves accurate positioning of the pointer instrument area in the input image.
The experiment first analyzes the execution efficiency of the convolution operation accelerated by the Fourier transform. When positioning the pointer instrument, the template image is often large, and the Fourier transform can effectively reduce the image processing time. To verify the speed-up of the algorithm, different target images and nontemplate images are selected for the convolution operation, and the speed-up is measured for each combination. Two target image sizes are used, 64 × 64 and 128 × 128 pixels. The template image area sizes are 364 × 243, 547 × 364, and 912 × 608 pixels, corresponding to size 1, size 2, and size 3 in Table 1, respectively. These sizes are the most common image sizes in a known image library. The results of the test are shown in Table 1.
It can be seen from Table 1 that the convolution operation using the Fourier transform greatly speeds up image processing: the speed-up factor ranges from 267 to 1210, and it increases as the image size increases. The experiment also compares several image feature point detection and matching algorithms: SIFT, SURF, ORB5000, ORB10000, and AKAZE. The comparison results are shown in Table 2.

It can be seen from Table 2 that, in terms of the MSE alone, AKAZE is the best method, with a lower error value than the other methods. AKAZE also takes the least time, only 200 ms. Therefore, AKAZE is used as the final image registration method in this paper.
Our project has collected a large number of samples, in particular from China Southern Power Grid, which provided many actual substation samples for this paper. Fig. 12 shows some of the test samples (6 × 9 = 54).
To obtain the accuracy of the algorithm to recognize the pointer meter, based on the test sample, we have done three sets of experiments. The results are shown in Table 3.
It can be seen from the experimental results that the proposed algorithm performs well on the test set, with an average recognition accuracy of the pointer instrument as high as 99.36%. This also confirms that the proposed pointer instrument positioning algorithm is robust to perspective transformation, illumination change, and other factors. High-precision pointer instrument positioning ensures the accuracy of pointer readings, with an average error of 2.10%. Therefore, the method proposed in this paper positions the pointer instrument accurately, and the pointer reading accuracy meets the actual requirements.

Conclusion
To address the positioning problem of the pointer instrument in the substation intelligent inspection robot system, the stable LARK feature is combined with the PCA algorithm to highlight the edge features of the instrument and improve the accuracy of the instrument positioning. Multiscale detection on LARK feature maps of the input image at different scales effectively prevents missed detections. The sliding window method is then used to detect the pointer instrument in the input image, which also works when there are multiple pointer instruments in the image. Because sliding window matching is slow, the Fourier transform is used to accelerate the convolution operation and improve the positioning speed. Finally, image registration corrects the positioning error caused by the view-angle deflection of the pointer instrument in the input image. The grid motion statistics method is used to eliminate the wrong matching points, and the RANSAC method is used to process the remaining matching points, which improves the registration accuracy of the pointer instrument area positioning and provides high-precision positioning for subsequent pointer instrument reading recognition. The method can accurately position pointer instruments in the intelligent inspection systems of the electric power, chemical, and petroleum industries, including when the input image contains multiple pointer instruments. The method has good adaptability, strong real-time performance, and high accuracy.