1. INTRODUCTION

As an important backbone enterprise tied to national energy security and the lifeblood of the national economy, State Grid Corporation undertakes the basic mission of providing a safe and sustainable power supply for economic and social development, and power equipment plays an important role in the national grid. Traditionally, assessing the stability of power equipment requires manual inspection: the equipment's data are measured, and its stability is then judged from those data. Three main issues need to be considered during data detection: estimation of the current health status, prediction of the future status and failure time, and determination of the impact of failure on system performance [1]. Anomaly detection specifically refers to the early detection of potential product problems [2]. At present, most countries are encouraging the development of artificial intelligence and big data technology, and in the field of power equipment fault prediction a large number of studies on the application of big data have been carried out and have achieved fruitful results [3-7]. Compared with traditional empirical knowledge and basic mathematical models for power equipment quality prediction, some scholars have applied neural networks to data prediction for power equipment [8-9]. Building on the method of [10], this paper proposes a data prediction and detection method for power equipment, followed by envelope detection with a machine learning algorithm. The method can fully mine the regularities hidden in the data: it overcomes both the large judgment deviations of traditional experience and the incompleteness of rules in existing knowledge bases, and it achieves an accuracy and efficiency that traditional mathematical models struggle to reach given the complex structure of power equipment. It therefore has reference significance.
2. LSTM DEEP LEARNING MODEL

In neural network prediction with a plain RNN, only the output of the last node can be used and the weights of the whole layer are shared, so gradient vanishing or gradient explosion occurs easily. The LSTM [11] model is a deep neural network model that can perform linear and nonlinear fitting of data features, extract deep features, and mine potential regularities. It usually shows better prediction performance than traditional models in prediction tasks.

2.1 LSTM network structure

The LSTM network layer is composed of connected cell units. The unit structure is shown in Fig. 1: the repeated structure A of the LSTM has four layers, and the cell state is the main line running horizontally through these units. The main structure of the LSTM consists of three gates that control the increase and decrease of cell information: the forget gate, the input gate and the output gate.

Forget gate. The forget gate is responsible for the reduction of information in the cell, controlling what information should be forgotten:

ft = σ(Wf · [ht−1, xt] + bf)

After concatenating ht−1 and xt, the gate multiplies the result by a weight Wf and adds a bias bf; the weights and biases are the parameters the network model needs to learn. If the hidden state size (i.e., the number of hidden-layer neurons) is hsize and the input size is xsize, then Wf has shape hsize × (hsize + xsize). The value of hsize is set manually.

Input gate. The input gate determines what information is saved in the cell and what information is added to it. It consists of a sigmoid layer and a tanh layer, whose parameters are learned in the same way as those of the forget gate:

it = σ(Wi · [ht−1, xt] + bi)
C̃t = tanh(WC · [ht−1, xt] + bC)
Ct = ft ∗ Ct−1 + it ∗ C̃t

Output gate. The output gate determines what information the cell needs to output, and it also has its own weight parameters to learn:

ot = σ(Wo · [ht−1, xt] + bo)
ht = ot ∗ tanh(Ct)
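These gate equations can be illustrated with a minimal pure-Python sketch of a single LSTM time step for the scalar case (all names here are illustrative; a real layer uses matrix weights of shape hsize × (hsize + xsize)):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step for scalar input and state.

    Each gate sees the concatenation [h_prev, x]; for scalars that is
    just two weighted terms plus a bias.
    """
    f = sigmoid(p["wf_h"] * h_prev + p["wf_x"] * x + p["bf"])          # forget gate
    i = sigmoid(p["wi_h"] * h_prev + p["wi_x"] * x + p["bi"])          # input gate
    c_tilde = math.tanh(p["wc_h"] * h_prev + p["wc_x"] * x + p["bc"])  # candidate state
    c = f * c_prev + i * c_tilde                                       # new cell state
    o = sigmoid(p["wo_h"] * h_prev + p["wo_x"] * x + p["bo"])          # output gate
    h = o * math.tanh(c)                                               # new hidden state
    return h, c

# With all parameters zero, every gate outputs sigmoid(0) = 0.5 and the
# candidate is tanh(0) = 0, so the cell state is simply halved each step.
zero = {k: 0.0 for k in ["wf_h", "wf_x", "bf", "wi_h", "wi_x", "bi",
                         "wc_h", "wc_x", "bc", "wo_h", "wo_x", "bo"]}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=zero)  # c becomes 1.0
```

The zero-parameter case makes the gating behavior visible: the forget gate alone determines how much of the previous cell state survives.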
The whole one-layer neural network is composed of several LSTM units: the original sequence (x1, x2, ⋅⋅⋅, xn) is fed into the network, passes through the LSTM layer, and is finally expanded into the required dimensions by a fully connected layer.

3. DATA DIMENSION REDUCTION AND DETECTION MODEL

The PCA algorithm is used to reduce the dimensionality of the predicted data; the processed data are more convenient for later detection, and anomaly detection is then performed with the local outlier factor (LOF) algorithm.

3.1 PCA

Principal component analysis (PCA) is a common linear dimension reduction method. Its main idea is to project the information in a high-dimensional space into a low-dimensional space through a linear projection, so that the projected data retain as much of the original information as possible while the dimensionality is reduced. Assuming that the original data set has m samples and n features, it can be expressed as the m×n matrix of Eq. 6:

X = (xij)m×n, i = 1, 2, …, m, j = 1, 2, …, n      (6)

Since the dimensional influence between features must be eliminated before computing the covariance, the data are standardized first, as shown in Eq. 7:

x*ij = (xij − x̄j) / sj      (7)

where x̄j and sj are the mean and standard deviation of the jth feature. The correlation matrix R of the standardized data is then computed as in Eq. 8:

R = (rjk)n×n, rjk = (1/m) Σi x*ij x*ik      (8)

The characteristic equation |λI − R| = 0 is solved to obtain the characteristic roots λ1 ≥ λ2 ≥ … ≥ λn ≥ 0, from which the orthogonal unitized eigenvectors e1, e2, …, en are computed. With the total variance of the data set kept unchanged, the contribution rate of the ith principal component is

λi / (λ1 + λ2 + … + λn)      (9)

After the original data matrix is reduced in dimensionality by principal component analysis, the relationship between the principal components PC1, PC2, …, PCs (s ≤ n) and the original data features xi is

PCi = ai1 x*1 + ai2 x*2 + … + ain x*n      (10)

In Eq. 10, aij are the loadings, and the principal components PCi are mutually uncorrelated.

3.2 Local outlier factor

Local outlier factor (LOF) is a density-based outlier detection method.
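For the two-feature case, the standardization and contribution-rate computation above can be sketched with the standard library alone, since the 2×2 correlation matrix [[1, r], [r, 1]] has the closed-form eigenvalues 1 ± r (a sketch only; a real pipeline would use a linear algebra library for n features):

```python
from statistics import mean, pstdev

def correlation(xs, ys):
    # Pearson correlation of two features after standardization (Eq. 7)
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return sum((x - mx) / sx * (y - my) / sy
               for x, y in zip(xs, ys)) / len(xs)

def pca_contributions_2d(xs, ys):
    """Contribution rate of each principal component for two features:
    the eigenvalues of [[1, r], [r, 1]] are 1 + |r| and 1 - |r|."""
    r = correlation(xs, ys)
    lam = [1 + abs(r), 1 - abs(r)]   # eigenvalues in descending order
    total = sum(lam)                 # equals n = 2 for correlation-based PCA
    return [l / total for l in lam]

# Two perfectly correlated features collapse onto a single component,
# so the first contribution rate is (numerically) 1.
contrib = pca_contributions_2d([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

This is why highly correlated monitoring features can be compressed into few principal components with little information loss.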
In a sample set D, the distance from a sample point o to its kth nearest neighbor is defined as the kth distance dk(o) of point o, and the kth distance neighborhood Nk(o) of point o is defined as all sample points whose distance from o is not more than dk(o), as in Eq. 11:

Nk(o) = { o′ ∈ D \ {o} | d(o, o′) ≤ dk(o) }      (11)

In Eq. 11, d(o, o′) is the distance from point o to point o′. If d(o, o′) < dk(o′), the reachability distance dreach(o, o′) of sample point o is defined as dk(o′); otherwise it is d(o, o′), as shown in Eq. 12:

dreach(o, o′) = max{ dk(o′), d(o, o′) }      (12)

Here the number of nearest neighbors k must be chosen according to the actual sample size and sample distribution. However, the sample distribution can sometimes distort the selection of outliers, so LOF uses the local reachability density to represent the density around sample point o, defined as lrdk(o):

lrdk(o) = |Nk(o)| / Σ_{o′ ∈ Nk(o)} dreach(o, o′)      (13)

where |Nk(o)| is the number of sample points contained in the kth distance neighborhood of point o. The average ratio of the local reachability density of the points in Nk(o) to that of point o is defined as the local outlier factor LOFk(o), as shown in Eq. 14:

LOFk(o) = ( Σ_{o′ ∈ Nk(o)} lrdk(o′) / lrdk(o) ) / |Nk(o)|      (14)

According to Eq. 14, the lower the local reachability density of sample point o compared with those of its kth-distance neighbors, the higher the value of LOFk(o), and the greater the possibility that point o is an outlier.

4. THE SIMULATION TEST AND RESULT ANALYSIS

Since failures of power equipment occur in high- and low-temperature environments, high- and low-temperature data are used as the detection standard. Figure 7 is the flow block diagram of the power equipment anomaly detection method proposed in this paper. The block diagram is divided into a data prediction module and an envelope detection module: the data prediction module mainly completes the data prediction, while the envelope detection module completes the feature extraction of the predicted test data and the data dimension reduction.
Finally, a data set that can be input into the local outlier factor model is formed to complete the learning and classification of input features. The specific implementation steps are as follows:
4.1 Model evaluation indicators

Anomaly detection is essentially a classification problem: all samples are divided into two categories, abnormal and normal, and the test results are described with the confusion matrix in Table 1.

Table 1. Confusion matrix

                    Predicted abnormal     Predicted normal
Actual abnormal     TP (true positive)     FN (false negative)
Actual normal       FP (false positive)    TN (true negative)
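A minimal helper for turning these confusion-matrix counts into the evaluation metrics used below (the counts in the example are illustrative, not the paper's results):

```python
def precision_recall(tp, fp, fn):
    """Precision P = TP / (TP + FP); recall R = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# e.g. 4 correctly flagged anomalies, 1 false alarm, 1 missed anomaly
p, r = precision_recall(tp=4, fp=1, fn=1)  # p = 0.8, r = 0.8
```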
This experiment uses precision (P) and recall (R) as the algorithm evaluation indicators, calculated as follows:

P = TP / (TP + FP)      (15)
R = TP / (TP + FN)      (16)

P is the proportion of samples that are both predicted abnormal and actually abnormal among all samples predicted abnormal; the larger its value, the better the anomaly detection effect of the algorithm. R is the proportion of samples that are both predicted abnormal and actually abnormal among all actually abnormal samples; again, the larger the value, the better the detection effect.

4.2 Data introduction

The experimental data are transformer data collected by the Gansu Power Company of State Grid. The transformer products use load, partial discharge, dielectric loss of the insulating oil, polarization index, volume resistivity, furfural content, grounding current of the iron core, insulation resistance of the iron core, and DC resistance of the winding as detection features. For environmental testing, the high temperature of the transformer products is 62 °C and the low temperature is −26 °C. The feature values of the transformer products are recorded every 20 minutes, and 50 transformer products are sampled. During the test of one transformer product, the monitoring data of its load feature are shown in Figure 8; the remaining features of the product show a similar tendency to the load. It can be seen that the variation trend of the product feature values is highly correlated with the ambient temperature. Therefore, in this example, the 9 features of the transformer products are analyzed as key parameters.
The transformer products are first tested at normal temperature; the load curve of a transformer product at normal temperature is shown in Figure 9 below.

4.3 Analysis of experimental process and results

Because the data have different units, their contributions to the fitting also differ, so the data must be standardized to the same order of magnitude, which also accelerates the convergence of the model. The standardization formula is shown in Eq. 17:

x* = (x − μ) / σ      (17)

where μ and σ are the mean and standard deviation of the feature. The experiment was programmed in a Python 3.8 environment, relying on the deep learning frameworks Keras and TensorFlow.

Data processing. The mean and variance of the normal-temperature data were fed into the network as the cell state and hidden state, and the temperature sequence of the test was fed into the network to predict the feature sequence in the test environment. Each individual product's data are extracted and unified to the same length; after comparison it was found that most records have a length of around 200, so 200 is taken as the unified data length. The different products are then integrated into a 3-D array, in which the first dimension represents the product number, the second dimension the data length, and the third dimension the number of features. The first 40 products are used as the training set and the last 10 products as the validation set. The LSTM network uses the sigmoid function as its transfer function, 32 hidden nodes, the Adam optimizer, a learning rate of 0.001 for the network gradient updates, a learning rate decay of 0.0002, and a maximum of 20 training epochs.

Effect display. Figure 10 shows the convergence of the loss function on the training set and validation set as the number of iterations increases. Overall, the loss value converges to about 0.09, beginning to converge after 15 training epochs.
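The preprocessing steps above can be sketched with the standard library; the z-score follows Eq. 17, while the edge-padding used to bring shorter records up to length 200 is an assumption of this sketch (the paper does not state how lengths are unified):

```python
from statistics import mean, pstdev

def standardize(series):
    """Z-score standardization of one feature series (Eq. 17)."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma for x in series]

def unify_length(series, length=200):
    """Truncate long records and edge-pad short ones (an assumed strategy)
    so every product's record has the same length and can be stacked into
    a 3-D array of shape (product, time step, feature)."""
    if len(series) >= length:
        return series[:length]
    return series + [series[-1]] * (length - len(series))

record = unify_length(standardize([float(v) for v in range(250)]))
# products 1-40 would then form the training set, 41-50 the validation set
```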
The final model prediction is compared with the real data (taking the current reserve coefficient feature of Product 1 as an example), as shown in Figure 11. The same can be done for the data prediction of abnormal products, as shown in Figure 12. It can be seen that the overall trend of the data is fitted well, which also applies to the other parameters; the method thus has practical significance and good guiding value for power equipment experimental data prediction. In order to reduce the redundancy of the data set features, speed up model training, and reduce the computational load of the anomaly detection algorithm, principal component analysis was applied to the features of the data set. The contribution rate of each principal component is shown in Figure 13, from which it can be seen that the cumulative contribution of the first two principal components is close to 50%. Therefore, in this paper the 9 transformer product features of the original data set are reduced to 2 principal component features, which effectively reduces the computation of the subsequent LOF detection algorithm and improves the efficiency of transformer product anomaly detection. The two-dimensional data after dimensionality reduction are input into the LOF detection model, which allows the envelope surface to be visualized. The classification effect of the high-temperature data model is shown in the figure: the normal products lie inside the same-colored envelope, and the abnormal products lie outside the red envelope. Similarly, the predicted low-temperature data can be extracted and, after dimensionality reduction, subjected to anomaly detection by the LOF algorithm; the envelope detection results are shown in Figure 15. Combining the high-temperature and low-temperature envelope detection maps, five abnormal points can be found.
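The LOF scoring of Section 3.2 (Eqs. 11-14) can be sketched in pure Python; the five-point one-dimensional toy set below is illustrative only and stands in for the reduced 2-D PCA data used in the experiment:

```python
def lof_scores(points, k):
    """Local outlier factor (Eqs. 11-14) for 1-D points, by brute force."""
    n = len(points)
    dist = lambda a, b: abs(points[a] - points[b])

    def k_distance(i):                  # dk(o): distance to the kth neighbour
        return sorted(dist(i, j) for j in range(n) if j != i)[k - 1]

    def neighbourhood(i):               # Nk(o), Eq. 11
        dk = k_distance(i)
        return [j for j in range(n) if j != i and dist(i, j) <= dk]

    def lrd(i):                         # local reachability density, Eq. 13
        nk = neighbourhood(i)
        reach = sum(max(k_distance(j), dist(i, j)) for j in nk)  # Eq. 12
        return len(nk) / reach

    return [sum(lrd(j) for j in neighbourhood(i))
            / (len(neighbourhood(i)) * lrd(i))                   # Eq. 14
            for i in range(n)]

# Four clustered points score about 1; the isolated point scores far higher.
scores = lof_scores([1.0, 2.0, 3.0, 4.0, 100.0], k=2)
```

Points with scores near 1 sit inside the envelope; a score far above 1 marks a density outlier, which is exactly the separation shown in the high- and low-temperature envelope maps.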
According to the product numbers, the abnormal products are numbered 45-50, and their detection data differ considerably from those of the other normal transformer products. According to this experiment, the method can effectively detect abnormal products. Meanwhile, in order to increase accuracy, bootstrap sampling is used to generate 10 new data sets, which serve as the inputs of 10 local outlier factor models for training, yielding 10 different base learners. The LOF-bagging algorithm and the plain LOF algorithm were then tested multiple times; the final results are shown in Table 2 below.

Table 2. Comparative results
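The LOF-bagging idea (several base learners trained on resampled data, combined by majority vote) can be sketched as follows. For brevity this stand-in uses subsampling without replacement instead of bootstrap draws and a robust z-score detector in place of each LOF learner; both substitutions are this sketch's, not the paper's:

```python
import random
from statistics import median

def bagged_anomaly_votes(values, n_learners=10, sample_size=6,
                         threshold=10.0, seed=0):
    """Ensemble anomaly detection in the spirit of LOF-bagging: each base
    learner is fit on a random subsample and votes on every point; the
    majority of the learners decides the final label."""
    rng = random.Random(seed)
    votes = [0] * len(values)
    for _ in range(n_learners):
        sample = [values[i] for i in rng.sample(range(len(values)), sample_size)]
        med = median(sample)
        mad = median(abs(v - med) for v in sample)
        scale = max(mad, 0.5)  # floor avoids division by ~0 on flat samples
        for idx, v in enumerate(values):
            if abs(v - med) / scale > threshold:
                votes[idx] += 1
    return [v > n_learners // 2 for v in votes], votes

# Eight clustered readings and one gross outlier: every learner flags it.
flags, votes = bagged_anomaly_votes(
    [9.0, 9.5, 10.0, 10.5, 11.0, 10.0, 9.8, 10.2, 1000.0])
```

Replacing the robust z-score with the LOF scorer and the subsamples with true bootstrap draws recovers the LOF-bagging scheme compared in Table 2.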
The experiment shows that the algorithm proposed in this paper has higher prediction accuracy than an ordinary anomaly detection algorithm.

5. CONCLUSION

In order to solve the problem of anomaly detection for traditional power equipment, this paper proposes a method of predicting and checking power equipment test data based on an LSTM-LOF model. From the earlier normal-temperature data, the LSTM model predicts the high- and low-temperature data of the actual working environment; PCA is used for dimensionality reduction of the data, and finally the LOF algorithm is used for data detection. This algorithm removes the reliance on empirical knowledge for predicting power equipment components and can assist engineers in detecting anomalies. Meanwhile, it achieves higher prediction accuracy than ordinary anomaly detection algorithms. In the future, as more data on power equipment accumulate, more base learners can be used for model training. The ensemble strategy can also be changed to a weighted average, with the weight coefficients of the different base learners adjusted dynamically during training, so as to obtain better anomaly detection results.

REFERENCES

[1] Wang, Y., Limmer, S., Olhofer, M., Emmerich, M. T. M., Back, T., "Vehicle fleet maintenance scheduling optimization by multi-objective evolutionary algorithms," in 2019 IEEE Congress on Evolutionary Computation, 442-449 (2019). https://doi.org/10.1109/CEC.2019.8790142
[2] Ordóñez, C., Lasheras, F. S., Roca-Pardiñas, J., de Cos Juez, F. J., "A hybrid ARIMA-SVM model for the study of the remaining useful life of aircraft engines," Journal of Computational and Applied Mathematics, 346, 184-191 (2019). https://doi.org/10.1016/j.cam.2018.07.008
[3] Wu, H. Y., Miao, W. W., Guo, B., et al., "Research on state prediction algorithm of power communication equipment based on improved decision tree," Computer and Digital Engineering, 49(1), 17-20, 74 (2021).
[4] Gao, Y. X., Sun, S. Z., "Application of wavelet denoising combined with ARMA model in the prediction of failure rate of power equipment," Journal of Inner Mongolia University of Technology, 38(2), 122-128 (2019).
[5] Liu, M. W., "Application analysis of improved association rule method in power equipment fault prediction," Shandong Industrial Technology, 175(3) (2018).
[6] Wu, H. Y., Chen, P., Guo, B., et al., "State prediction of power communication equipment based on attention mechanism and LSTM," Computers and Modernization, (10), 12-16 (2020).
[7] Wang, C. B., Chen, G., Zhou, R., et al., "Fault prediction method of power equipment based on survival analysis model," Power Big Data, 23(5), 1-8 (2020).
[8] Chen, H. J., Zhang, P., Jia, Y. F., et al., "Research on operating temperature prediction method of power equipment based on BP neural network," Electronic World, (10), 40-41 (2018).
[9] Yang, J. H., Liu, Y., Liu, J., et al., "Parallel F-LSTM model and its application in power communication equipment fault prediction," Journal of Wuhan University, (3), 263-268 (2019).
[10] Zhang, Y. F., Meng, F. Y., Wang, Y. Q., et al., "Research on temperature prediction method of power equipment based on improved LSTM," Journal of Electronic Measurement and Instrumentation, 35(12), 167-173 (2021).
[11] Hochreiter, S., Schmidhuber, J., "Flat minima," Neural Computation, 9(1), 1-42 (1997). https://doi.org/10.1162/neco.1997.9.1.1