Application of Machine Learning technology in Smart Grid Data Analysis - Dr. Yuan Zhihui


The smart grid is a next-generation power network built on the physical power grid and designed to be sustainable and environmentally friendly. It rests on a two-way information and communication network and integrates sensing and measurement, communication, information, computing, and intelligent decision-making and control technologies so that the grid can operate securely, stably and efficiently. As shown in Figure 1, a smart grid covers every link of the power system: generation, transmission, substations, distribution, utilization and dispatching. It coordinates electricity market demand with the roles of the various stakeholders, keeping the system running efficiently and reducing operating costs and environmental impact while improving system reliability, self-healing capability and stability as far as possible. Its main features include meeting the needs of customers, allowing all kinds of distributed new-energy generation equipment to connect to the network, resisting attacks and self-healing from faults, and allocating power market resources on demand. Data analysis is the core of big data processing in the smart grid. Because big data is massive, complex and diverse, many traditional algorithms designed for small data sets no longer apply, and new analysis methods must be adopted or existing ones improved. In recent years, with the rapid development of Artificial Intelligence (AI) and big data technology, research on how to apply these new technologies and theories to make smart grid operation and management more intelligent has attracted growing attention from academia and industry at home and abroad.
Among these topics, smart grid data analysis based on machine learning theory is one of the research hotspots in this field. It spans a wide range of disciplines, is technically complex, and still lacks a systematic methodology and industry standards.



FIG. 1 Schematic diagram of smart grid

Machine learning is an important subfield of AI that draws on multiple disciplines such as probability theory, convex analysis and matrix theory. It mainly studies how to obtain knowledge from data and use it for decision-making and inference, and it is one of the main means of realizing AI. According to the information provided by the training samples and the form of feedback, machine learning can be broadly divided into supervised learning, semi-supervised learning, unsupervised learning and reinforcement learning, as shown in Figure 2. These technologies are now widely used across industries. In the smart grid context, analysis methods should be chosen according to the specific application scenario and the cyber-physical coupling characteristics of smart meter data, with the computing performance of the machine learning algorithm taken into account, and the results should be given reasonable explanations grounded in domain knowledge, so as to improve the quality and efficiency of business analysis.


FIG. 2 Classification of machine learning

Research on smart grid data applications is a systematic undertaking, and any specific business implementation results from the synergy and integration of several technologies. As power system business applications develop further, smart grid data analysis places ever higher demands on technical performance and integration. In recent years, analysis methods such as deep learning, deep reinforcement learning and data visualization have become important technologies for extracting a more accurate and deeper understanding from smart grid data. As an important part of machine learning, deep learning mainly relies on deep neural networks to automatically extract and select high-level feature representations from raw data. Its learning modes include supervised, semi-supervised and unsupervised learning. Deep learning architectures such as recurrent neural networks, convolutional neural networks and deep belief networks are widely used in computer vision, natural language processing, speech recognition, medical diagnosis and other fields, and have achieved performance comparable to or even better than that of human experts. The rise of deep learning owes much to the support of software architectures, hardware platforms and big data. The relationship between artificial intelligence, machine learning and deep learning is shown in Figure 3. Machine learning is only one method of realizing artificial intelligence, and deep learning is in turn an important means of realizing machine learning.


FIG. 3 Schematic diagram of the relationship between artificial intelligence, machine learning and deep learning

With the continuous evolution and upgrading of the future power grid, the complexity and uncertainty of the smart grid, from investment planning to asset management and from safe operation to economic operation, keep increasing. Abstracting grid production and decision-making processes with machine learning algorithms, building intelligent learning models, and establishing logical mappings between inputs and outputs can steadily reduce the cost of manpower and material resources and improve the accuracy and efficiency of decision-making. Machine learning algorithms based on smart grid data can bring new methodologies and innovative ideas, continually improve our understanding of power grid technology and business operations, and reveal the nature of the problems. Here we introduce the application of machine learning technology in smart grid data analysis from the following three aspects.

1. Application of machine learning technology in probabilistic load forecasting

With the continuing penetration of renewable energy sources and electric vehicles, the uncertainty and complexity of the smart grid have increased significantly, making it more challenging to maintain system stability. Accurately quantifying the uncertainty of future load is therefore necessary for the safe and stable operation of the power system. In particular, probabilistic load forecasting plays an important role in many decision-making processes of power companies, such as real-time electricity price forecasting, demand response and stochastic unit commitment. Unlike traditional point forecasting, which outputs only an expected value, probabilistic load forecasting quantifies the uncertainty of future power demand in the form of quantiles, prediction intervals or probability density functions.
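To make the difference between a point forecast and a probabilistic forecast concrete, the following minimal sketch computes empirical quantiles, an 80% prediction interval, and the pinball (quantile) loss that is commonly used to score quantile forecasts. The load values and the function names are illustrative assumptions, not data or code from any study mentioned here.

```python
def pinball_loss(y_true, y_pred, q):
    """Average pinball (quantile) loss for a quantile level q in (0, 1)."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-forecasts are weighted by q, over-forecasts by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def empirical_quantile(samples, q):
    """Quantile of a list of samples using linear interpolation."""
    s = sorted(samples)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


# Hypothetical loads (kW) observed at the same hour on ten past days.
history_kw = [3.1, 2.8, 3.6, 4.0, 3.3, 2.9, 3.8, 3.5, 3.0, 3.4]

lower = empirical_quantile(history_kw, 0.1)
median = empirical_quantile(history_kw, 0.5)
upper = empirical_quantile(history_kw, 0.9)
print(f"80% prediction interval: [{lower:.2f}, {upper:.2f}] kW, median {median:.2f} kW")
```

A point forecast would report only a single value such as the median, whereas the interval between the 10% and 90% quantiles tells the operator how much the demand may plausibly deviate from it.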

At present, research on load forecasting can be roughly divided into two categories: statistical methods and machine learning methods. Statistical methods mainly learn the relationship between the target variable and the explanatory variables, and use goodness of fit to describe how well the statistical model fits the data. They need expert knowledge to guide model building, and their results are interpretable. Typical statistical models include multiple linear regression, autoregressive moving average models and their variants, and exponential smoothing. However, statistical methods take more time to tune the many hyperparameters in the model and carry a risk of overfitting. Machine learning methods, on the other hand, can be regarded as more advanced statistical methods. They are well suited to nonlinear relationships, require only a small number of parameters to be tuned, are therefore more robust to the data, and are simpler to reconfigure and more user-friendly. They are, however, black-box methods whose results lack interpretability. Typical models include support vector machines, gradient boosting, neural networks, classification and regression trees, and k-nearest neighbors.
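As a small illustration of the statistical family, the sketch below implements simple exponential smoothing, one of the models named above, as a one-step-ahead load forecast. The smoothing factor and the hourly load values are assumptions chosen only for demonstration.

```python
def exponential_smoothing_forecast(loads, alpha):
    """One-step-ahead forecast from simple exponential smoothing.

    The smoothed level is initialised with the first observation and
    updated as level = alpha * y + (1 - alpha) * level.
    """
    if not loads:
        raise ValueError("loads must contain at least one observation")
    level = loads[0]
    for y in loads[1:]:
        level = alpha * y + (1.0 - alpha) * level
    return level


# Hypothetical hourly loads (kW); alpha is the single tunable parameter.
hourly_kw = [3.0, 3.2, 3.1, 3.6, 3.4]
print("Next-hour forecast (kW):", round(exponential_smoothing_forecast(hourly_kw, 0.5), 3))
```

The single parameter `alpha` has a direct interpretation (how quickly old observations are forgotten), which is the kind of interpretability that black-box machine learning models give up.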

With the widespread deployment of smart meters, more and more meter data are being collected for demand response programs, making user-level load forecasting increasingly important. Compared with aggregated system-level load time series, meter-level data show less obvious regularity and strong diversity and time variability, so they are harder to predict. Recently, deep learning, as a powerful machine learning technology, has been applied to load forecasting by many scholars and teams. Examples include short-term load forecasting for residential users with long short-term memory (LSTM) recurrent neural networks, residential load forecasting with pooling-based deep LSTM networks, neural networks built on rough neurons to handle uncertainty in the data, LSTM networks trained with a tailored loss function to improve probabilistic load forecasting for individual customers, deep residual neural networks for probabilistic load forecasting, and improved quantile regression neural networks for probabilistic load forecasting. All of these studies show that deep learning has a wide range of applications in probabilistic load forecasting.

2. Application of machine learning technology in non-intrusive load monitoring

Non-intrusive load monitoring is a technology that infers the usage patterns and operating states of a user's electrical appliances from the power signal measured at the user's feeder. Many studies now show that, based on fine-grained customer electricity consumption information, non-intrusive load monitoring can be used to achieve demand-side energy management in the smart grid.

Non-intrusive load monitoring was originally formulated as a single-channel blind source separation problem: extracting the source signals from a mixed, noisy observed signal. Because several household appliances may switch on or off at the same time, and because prior knowledge of the appliances' electrical characteristics is lacking, non-intrusive load monitoring is essentially an unidentifiable problem. Many techniques have been proposed to tackle it, such as factorial hidden Markov models, signal processing, k-nearest neighbor and neural network methods, deep sparse coding and deep learning. In general, non-intrusive load monitoring algorithms can be classified into learning-based methods (including unsupervised, semi-supervised and supervised methods) and optimisation-based methods.
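The optimisation-based view can be illustrated with a toy disaggregation that searches all on/off combinations of appliances for the one whose total rated power best matches the aggregate reading. This is only a sketch under strong assumptions: each appliance is modelled by a single hypothetical rated power, and the appliance names and wattages are invented for the example.

```python
from itertools import product


def disaggregate(aggregate_w, appliance_ratings_w):
    """Return the on/off assignment whose summed power is closest to the reading."""
    names = list(appliance_ratings_w)
    best_states, best_err = None, float("inf")
    # Exhaustive search over 2^n on/off combinations (feasible only for small n).
    for states in product((0, 1), repeat=len(names)):
        estimate = sum(s * appliance_ratings_w[n] for s, n in zip(states, names))
        err = abs(aggregate_w - estimate)
        if err < best_err:
            best_states, best_err = states, err
    return dict(zip(names, best_states)), best_err


# Hypothetical rated powers in watts.
ratings = {"fridge": 120, "kettle": 2000, "tv": 90, "washer": 500}
states, error = disaggregate(2125, ratings)
print(states, "residual:", error, "W")
```

The search space doubles with every additional appliance, and different combinations can produce nearly the same total, which is one way to see why the problem is hard to identify from the aggregate signal alone.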

More recently, deep learning has been applied to non-intrusive load monitoring because of its powerful feature representation learning ability. Some researchers have verified that bidirectional long short-term memory (LSTM) networks and convolutional neural networks achieve higher F1 scores than combinatorial optimization or factorial hidden Markov models. Another researcher proposed a sequence-to-point model based on a convolutional neural network, whose input is a sequence of the feeder signal and whose output is the power value of one appliance. However, these methods monitor only a single appliance at a time and cannot monitor the operating states of multiple appliances simultaneously. Since standard deep learning methods cannot handle multi-label learning, some researchers have proposed deep dictionary learning and deep transform learning for non-intrusive load monitoring. These supervised deep learning methods all perform very well, but they require enough accurately labeled training samples, which are sometimes expensive and impractical to collect. In addition, although convolutional neural networks and LSTM networks are flexible and effective, they become significantly harder to train as the number of hidden layers grows. In practice these networks are therefore kept shallow, which limits their feature representation learning ability.
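The data layout behind a sequence-to-point model can be sketched without any neural network: each training example pairs a sliding window of the feeder signal with a single appliance power value. The sketch assumes the target is taken at the midpoint of the window, which is a common choice but is not stated in the text above; the function name and sample values are hypothetical.

```python
def seq2point_pairs(mains, appliance, window):
    """Build (input window, target value) pairs for sequence-to-point training.

    Assumption: the target is the appliance power at the window midpoint,
    so `window` must be odd.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a midpoint")
    if len(mains) != len(appliance):
        raise ValueError("mains and appliance series must have equal length")
    half = window // 2
    pairs = []
    for start in range(len(mains) - window + 1):
        pairs.append((mains[start:start + window], appliance[start + half]))
    return pairs


# Hypothetical aligned readings in watts: feeder total and one appliance.
mains_w = [210, 215, 2210, 2205, 2215, 220, 212]
kettle_w = [0, 0, 2000, 2000, 2000, 0, 0]
for window_in, target in seq2point_pairs(mains_w, kettle_w, 3):
    print(window_in, "->", target)
```

Because each model predicts one appliance's power, a separate set of targets (and typically a separate network) is needed per appliance, which matches the single-appliance limitation noted above.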

Practice shows that deep learning can achieve excellent performance when a large number of labeled training samples is available. However, collecting and labeling that many samples is expensive in manpower and material resources and may also raise privacy concerns. For example, collecting training samples for load monitoring requires installing a separate sensor on each appliance, which is unrealistic. Likewise, it takes medical experts a great deal of time to annotate lesions in large numbers of medical images, and in Internet recommendation systems users are asked to mark the web pages that interest them, but few are willing to spend the time. "Few labeled samples, many unlabeled samples" is therefore the norm in learning. In this case, semi-supervised learning, which also learns from unlabeled samples, is a better solution.

(1) Semi-supervised learning

Semi-supervised learning uses both labeled and unlabeled samples. In general, a small number of labeled samples combined with a large number of unlabeled samples allows a semi-supervised method to discover and learn the latent structure in the data. Compared with supervised learning, a semi-supervised model can therefore achieve better performance when only a few labeled samples are available. Recently, semi-supervised learning has also been applied to non-intrusive load monitoring. Some researchers have proposed a semi-supervised self-training nearest-neighbor method, others a semi-supervised method based on wavelet analysis and co-training, and others three graph-based semi-supervised learning methods.
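The self-training idea can be sketched with a one-nearest-neighbor classifier on a single feature. This is a generic illustration, not the specific method from the cited work: the feature (size of a power step in watts), the distance threshold and the sample values are all assumptions.

```python
def nearest_label(x, labeled):
    """Return (label, distance) of the labeled sample closest to x."""
    feature, label = min(labeled, key=lambda item: abs(item[0] - x))
    return label, abs(feature - x)


def self_train(labeled, unlabeled, max_dist):
    """Repeatedly pseudo-label unlabeled samples that lie close to labeled ones.

    A sample is adopted only if its nearest labeled neighbor is within
    max_dist; adopted samples then help to label further samples.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    changed = True
    while changed and pool:
        changed = False
        remaining = []
        for x in pool:
            label, dist = nearest_label(x, labeled)
            if dist <= max_dist:
                labeled.append((x, label))
                changed = True
            else:
                remaining.append(x)
        pool = remaining
    return labeled, pool


# Hypothetical power-step features in watts: two labeled, five unlabeled.
seed_samples = [(100, "fridge"), (2000, "kettle")]
unlabeled_steps = [130, 160, 190, 1970, 1000]
final_labeled, still_unlabeled = self_train(seed_samples, unlabeled_steps, max_dist=40)
print(final_labeled)
print("left unlabeled:", still_unlabeled)
```

The confidence threshold matters: without it, early mistakes would be propagated as if they were ground truth, which is the main risk of self-training.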

(2) Feature extraction based on deep learning 

Feature learning has also received extensive attention in non-intrusive load monitoring, because features are an important factor in the final learning result and, in many tasks, matter even more than the algorithm. In real machine learning systems, a great deal of manual effort often goes into designing and testing different features and feature combinations to improve system performance. Thanks to its multi-resolution and time-frequency localization properties, the wavelet transform is often used to extract power signal features, but the high-frequency transient features obtained from wavelet coefficients generalize poorly. Because the short-time Fourier transform (STFT) preserves the local properties of the signal in the time domain, some studies have used it to extract signal features for non-intrusive load monitoring. Other work uses delay embedding to convert power signals into multi-dimensional time-lag feature vectors. More recently, deep learning has also been applied to non-intrusive load monitoring, following its success in pattern recognition, speech recognition and computer vision. Deep learning can integrate feature representation learning and predictive model learning automatically, without human intervention. Some researchers have applied deep dictionary learning and deep transform learning to supervised feature extraction for non-intrusive load monitoring. Others use autoencoders and convolutional neural networks based on bidirectional LSTM networks to extract high-level features of power signals. However, these techniques cannot take advantage of unlabeled data because they are purely supervised learning methods.
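Two of the hand-crafted feature extractors mentioned above can be sketched in a few lines: time-lag (delay) embedding and STFT magnitudes. The sketch uses only the standard library and a direct DFT for clarity rather than speed; the frame length, hop size, Hann window and test signal are assumptions made for the example.

```python
import cmath
import math


def delay_embed(signal, dim, lag):
    """Turn a series into time-lag vectors [x[t], x[t-lag], ..., x[t-(dim-1)*lag]]."""
    span = (dim - 1) * lag
    return [[signal[t - j * lag] for j in range(dim)] for t in range(span, len(signal))]


def stft_magnitudes(signal, frame_len, hop):
    """Magnitude spectrum of each Hann-windowed frame (direct DFT, O(N^2))."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Synthetic signal: a sinusoid that completes 4 cycles per 16-sample frame.
test_signal = [math.sin(2 * math.pi * 4 * n / 16) for n in range(64)]
spectra = stft_magnitudes(test_signal, frame_len=16, hop=8)
peak_bins = [max(range(len(s)), key=s.__getitem__) for s in spectra]
print("frames:", len(spectra), "peak frequency bin per frame:", peak_bins)
print(delay_embed([1, 2, 3, 4, 5], dim=3, lag=1))
```

Both transforms are fixed in advance by the designer, which is the contrast with deep learning, where the feature mapping itself is learned from data.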

(3) Semi-supervised deep learning 

Combining deep learning with semi-supervised learning not only exploits the powerful feature representation learning ability of deep learning but also reduces labeling cost by using large amounts of unlabeled data. Some researchers designed an unsupervised transformation and stability loss function for learning from unlabeled data and integrated it into the loss function for network training. Another researcher, drawing on the ladder network and its loss function, proposed a temporal ensembling method based on convolutional neural networks for semi-supervised image classification. Inspired by temporal ensembling and Polyak averaging, other researchers proposed the mean teacher method, also based on convolutional neural networks, for semi-supervised image classification. These semi-supervised deep learning methods rely on random transformations and regularization techniques to exploit unlabeled data. All of the above methods use two-dimensional convolution to extract salient image features, such as scale invariance, for image classification, so they are not suitable for processing time series signals.
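The two ingredients of the mean teacher idea can be sketched independently of any network architecture: the teacher's weights are an exponential moving average (a form of Polyak averaging) of the student's weights, and a consistency loss penalises disagreement between student and teacher predictions on unlabeled inputs. Weights and outputs are represented as plain lists here, and the decay value is an assumed example setting.

```python
def ema_update(teacher_weights, student_weights, decay):
    """Move each teacher weight toward the student weight by a factor (1 - decay)."""
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_weights, student_weights)]


def consistency_loss(student_out, teacher_out):
    """Mean squared difference between student and teacher predictions."""
    return sum((s - t) ** 2 for s, t in zip(student_out, teacher_out)) / len(student_out)


# Hypothetical weights after one student training step.
teacher = [1.0, 0.0]
student = [0.0, 1.0]
teacher = ema_update(teacher, student, decay=0.9)
print("updated teacher weights:", teacher)

# Hypothetical class probabilities for the same unlabeled sample under two
# different random perturbations (one seen by the student, one by the teacher).
print("consistency loss:", consistency_loss([0.2, 0.8], [0.3, 0.7]))
```

No label is needed to compute the consistency term, which is how unlabeled samples contribute to training.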

3. Application of machine learning technology in microgrid energy management

In recent years, distributed energy resources such as photovoltaic generation, energy storage systems and electric vehicles have become widespread. The microgrid, as an integral part of the smart grid, can use advanced two-way information communication and energy control technology to absorb distributed energy, giving users an effective way to take a more active part in load management. Users can adjust the consumption patterns of their controllable loads in response to dynamic electricity prices or real-time incentives in order to save energy and improve energy efficiency. This approach, known as demand-response-based energy management, has attracted extensive attention at home and abroad, and reinforcement learning is a feasible technical solution for microgrid energy management.

Reinforcement learning is an important research field of machine learning. It mainly studies how an agent learns, through continuous interaction with its environment, to take actions that maximize cumulative reward. After training, the agent can act in a new, previously unseen environment according to the policy it has learned. Reinforcement learning has achieved great success in robotics, automation, games and other fields, and how to apply it to the design of demand response programs in the smart grid has also attracted extensive attention. Some researchers were the first to use and discuss reinforcement learning for scheduling optimization in customer demand response, and others decomposed the method down to the level of individual electrical devices to achieve higher computational efficiency. Some researchers further integrated smart energy hubs into customer demand response management and enabled real-time energy monitoring to speed up learning. Some researchers combined deep neural networks with reinforcement learning to form deep reinforcement learning for economical and efficient residential load scheduling control, in which the deep neural network estimates the customer's reward when making decisions and reinforcement learning determines the customer's next best action. Other researchers have considered the impact of future electricity prices and applied reinforcement learning to real-time decision-making for electric vehicles to minimize charging costs. Deep reinforcement learning has also been explored for load frequency control under the continuing penetration of new-energy generation equipment, for interruptible load control in demand response, and for reactive power control in distribution networks.
Some researchers use multi-agent reinforcement learning to coordinate and schedule the charging decisions of multiple electric vehicles to avoid overload and reduce charging costs. Others apply multi-agent reinforcement learning to the scheduling of controllable household appliances to save energy and reduce consumption. Some researchers have applied deep reinforcement learning to demand response in multi-microgrid systems with uncertainties in load, electricity price and generation. As a data-driven technology, reinforcement learning can avoid inefficient learning and is especially suitable for complex power grid models under various uncertainties, so it is of great help in solving the demand response problem of the smart grid.
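The agent-environment loop described above can be sketched with tabular Q-learning on a toy electric vehicle charging problem. Everything in this example is an assumption made for illustration: the four hourly prices, the energy requirement, the penalty for leaving with an incomplete charge, and the learning settings. It is not the formulation used in any of the studies mentioned.

```python
import random

PRICES = [5.0, 1.0, 4.0, 2.0]   # hypothetical price per unit of energy in each hour
NEED = 2                        # units of energy the vehicle must receive
PENALTY = 10.0                  # cost per unit still missing at departure
ACTIONS = (0, 1)                # 0 = idle, 1 = charge one unit


def step(hour, remaining, action):
    """Apply one action; the reward is the negative cost incurred in this hour."""
    charged = 1 if action == 1 and remaining > 0 else 0
    reward = -PRICES[hour] * charged
    remaining -= charged
    hour += 1
    done = hour == len(PRICES)
    if done:
        reward -= PENALTY * remaining
    return hour, remaining, reward, done


def train(episodes=5000, alpha=0.5, gamma=1.0, epsilon=0.3, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        hour, remaining, done = 0, NEED, False
        while not done:
            state = (hour, remaining)
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            hour, remaining, reward, done = step(hour, remaining, action)
            future = 0.0 if done else max(
                q.get(((hour, remaining), a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            # Standard Q-learning update toward reward plus best future value.
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
    return q


def greedy_schedule(q):
    """Follow the learned policy without exploration and record the actions."""
    hour, remaining, done, schedule = 0, NEED, False, []
    while not done:
        state = (hour, remaining)
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        schedule.append(action)
        hour, remaining, _, done = step(hour, remaining, action)
    return schedule


q_table = train()
print("Learned charging schedule (1 = charge):", greedy_schedule(q_table))
```

The agent is never told the price list directly; it discovers a low-cost schedule only from the rewards it receives, which is what makes the approach data-driven. Real problems have far larger state spaces, which is where the deep reinforcement learning variants discussed above come in.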

These are three typical applications of machine learning technology in smart grid data analysis. The discussion above shows that deep learning has strong capabilities in data analysis, prediction and classification, which fit the needs of big data applications in the smart grid well, and it will remain a hot research direction for machine learning in smart grid data analysis.