PHOTOVOLTAIC GENERATION PREDICTION USING THE DEEP LEARNING LONG SHORT TERM MEMORY MODEL

ABSTRACT


I. INTRODUCTION
Currently, the planet shows an increase in energy needs driven by the technological and economic development of society. One of humanity's major problems is its dependence on fossil fuels, as they cause a strong environmental impact, in addition to various changes in the economic sphere. The challenge is to ensure that renewable energy sources gradually replace traditional fossil fuels. The main advantages of renewable energies are their lower environmental impact, as they reduce the amount of pollutants in the atmosphere, and their less concentrated territorial distribution. They are continuous and inexhaustible energy sources, making them the alternative of the future [1].
One of the most widely used renewable sources today is solar energy. Photovoltaic (PV) systems connected to electrical grids are one of the distributed generation technologies with the greatest impact and growth in recent years. In fact, the world's annual solar PV capacity has increased exponentially over the last ten years. It is now the cheapest type of energy in a large number of countries due to substantially lower production costs of PV modules [1].
Atmospheric variables, such as solar irradiance, temperature, humidity and cloud properties, can directly and indirectly influence PV power generation. The dependence of the electrical energy generated in a photovoltaic farm on weather conditions, and the high variability of these conditions, make predicting the energy generated in a photovoltaic farm a complex task [2].
The forecast horizon is the future time period for PV generation prediction, that is, the duration between the present time and the effective time of the prediction [3]. Some researchers propose three categories for the forecast horizon: short term (up to 24 hours), medium term (1 day to 1 month), and long term (1 month to 1 year). Others have added a fourth category based on the requirements of the decision-making process for smart grids or microgrids, aptly named the very short-term or ultra-short-term (less than 1 hour) forecast horizon. However, so far there is no universally agreed classification criterion [4].
The large-scale penetration of PV in today's power systems requires forecasting models to operate the power grid economically and reliably [5]. Accurate solar forecasting reduces the impact of the uncertainty of solar PV power production, improves system stability, increases the penetration level of PV systems, and reduces the maintenance cost of auxiliary devices. In addition, it is a powerful tool that helps power system operators and designers to model and manage solar PV plants efficiently [6].
Since the emergence of photovoltaic plants, several techniques have been developed to obtain predictive models that contribute to improving the management of these plants. Currently, intensive work is being done on the application of artificial intelligence tools for the development of these types of models, based on the proven capacity of these techniques for handling the information contained in the large volumes of data obtained from the systems under study.
The photovoltaic plant of the Central University "Marta Abreu" of Las Villas (UCLV), in the province of Villa Clara in Cuba, has a nominal power of 1.1 MW, was put into operation in 2019 and is located in the southwest area of the Faculty of Electrical Engineering of the Central University "Marta Abreu" of Las Villas, at 22.4° north latitude and 79.96° west longitude.
This photovoltaic plant has the characteristics needed to develop such prediction models. In particular, it has a large set of historical measurements of the following variables: solar irradiance, ambient temperature, temperature of the photovoltaic modules and power generated. In other words, the elements necessary to carry out the study at this plant are available. Therefore, the main objective of this work is to predict the power generated at the UCLV photovoltaic plant using the Long Short-Term Memory (LSTM) deep learning model.

II.1 PREDICTIVE MODELS BASED ON ARTIFICIAL INTELLIGENCE TECHNIQUES WITH MACHINE LEARNING
Different international publications have addressed the issue of predicting the different variables associated with photovoltaic systems using artificial intelligence techniques.
Reference [7] presents a hybrid model for the long-term prediction of the photovoltaic power generated in a photovoltaic installation, based on an Artificial Neural Network (ANN) and fuzzy logic. It uses temperature, dew point, wind speed and direction, and solar irradiance as input variables. The proposed model is compared with other prediction models and in all cases presented superior performance, with a Mean Absolute Percentage Error (MAPE) value of 29.60%.
In reference [8], a Neural Network Ensemble (NNE) prediction model trained by Particle Swarm Optimization (PSO) is proposed to predict the day-ahead power in a smart grid. The model uses as inputs historical data of PV power, solar irradiance, wind speed, temperature and humidity. The performance of the model was measured against five other prediction methods, and the NNE method was superior, with a MAPE value of 9.75%.
Reference [9] presents a Support Vector Regression (SVR) model to predict the power output of a PV plant for a short-term time horizon. It uses as inputs PV power measurements and the solar irradiance forecast from Numerical Weather Prediction (NWP). The model is able to generate good predictions for clear and cloudy sky conditions, with a Root Mean Square Error (RMSE) value of less than 15%. The results obtained are compared with a physical model.

II.2 PREDICTIVE MODELS BASED ON DEEP LEARNING ARTIFICIAL INTELLIGENCE TECHNIQUES
Most conventional approaches to solar power forecasting are not capable of digging deep into the time series and uncovering implicit and relevant information. With the huge volumes of data in the modern power system, conventional approaches are not adequate to ensure accurate prediction. Deep learning approaches are becoming increasingly popular due to their ability to describe dependencies in time series. Recently, deep learning approaches have emerged as powerful tools that enable complicated pattern recognition, regression and prediction analysis [10], [11].
Reference [12] proposes an LSTM deep learning model for predicting PV power. The proposed model is compared with two other models and proved to be the best performing, with an RMSE value of less than 21%.
Reference [13] presents an LSTM model coupled with a deep neural network. The model is used to predict the load and the PV power generated in a smart grid. The performance of the model is compared with other models and a satisfactory result is obtained.
In [14], a Deep Convolutional Neural Network (DCNN) model is used to predict the power of a PV system. The accuracy of the model is compared with a persistence model and an SVR model, proving superior with a Normalized Mean Absolute Percentage Error (nMAPE) value of 11.80%.
Reference [15] proposes a forecasting algorithm to predict PV power generation using an LSTM neural network and a synthetic weather forecast. The proposed model is compared with several machine learning models: a Recurrent Neural Network (RNN), a Generalized Regression Neural Network (GRNN) and an Extreme Learning Machine (ELM), and the results of the LSTM model were superior in all cases analyzed.

III.1 RECURRENT NEURAL NETWORKS
Recurrent neural networks (RNN) [16] are a type of neural network in which the connections between units form a directed cycle. This creates an internal state of the network that allows it to exhibit dynamic temporal behavior. Unlike feed-forward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs.
The key idea of RNNs is to use sequential information. In a traditional neural network, all inputs and outputs are assumed to be independent of each other. However, for many applications this assumption does not hold. RNNs are called recurrent because they repeat the same task for each element of a sequence, with the output depending on previous computations. In other words, RNNs have a memory that captures information about what has been computed so far [16].
Figure 1 illustrates a typical RNN structure, where x_t is the input at time step t, h_t is the hidden state at time step t, and o_t is the output at time step t. In detail, it shows how an RNN is unfolded into a complete network: by unfolding the RNN, the network is presented in a full sequential format.
Note that the hidden state at time step t is also known as the network memory and is calculated from the hidden state at the previous time step and the input at the current time step, as suggested in the following equation:

h_t = f(h_{t-1}, x_t)

Source: [16].
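As a minimal illustration of this recurrence, one common choice of f is a hyperbolic tangent of a linear combination; the following Python sketch with NumPy uses that choice, where the weight names (W_xh, W_hh) and dimensions are our own illustrative assumptions:

```python
import numpy as np

# Sketch of the recurrence h_t = f(h_{t-1}, x_t), here with
# f = tanh(W_hh h_{t-1} + W_xh x_t + b). Sizes are illustrative.
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 5
W_xh = rng.normal(scale=0.1, size=(n_hidden, n_in))
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
b = np.zeros(n_hidden)

def rnn_step(h_prev, x):
    """One time step: the new hidden state depends on the previous
    hidden state and the current input."""
    return np.tanh(W_hh @ h_prev + W_xh @ x + b)

# Unfolding over a sequence reuses the same weights at every time step,
# which is exactly what the unfolded network in Figure 1 depicts.
h = np.zeros(n_hidden)
for x_t in rng.normal(size=(4, n_in)):  # sequence of length 4
    h = rnn_step(h, x_t)
```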
RNNs use the same model to perform the sequence prediction at different time instants t. Due to this property, they can be used to process variable-length and long sequences. When working with RNNs, it is first necessary to select the type or architecture of RNN to be used, since the transition function and the handling of the internal state of the network depend on the RNN architecture [17].
After selecting the RNN architecture, it is recommended to preprocess the data beforehand so that the model can use them efficiently. After preprocessing the data, the best model is selected based on a performance measure. While the RNN model has proven to be a powerful tool for handling data sequences, assuming that the current time step depends on previous time steps, it also suffers from some limitations: it has little ability to learn long-term dependencies because the gradient vanishes or explodes as it propagates through layers or time instants. To address this problem, new RNN architectures have emerged, among which are LSTM recurrent neural networks [17].

III.2 RECURRENT NEURAL NETWORKS OF THE LSTM TYPE
The Long Short-Term Memory (LSTM) model [17], [18] arises to overcome the vanishing gradient problem and thus to learn long-term dependencies. In this model, the hidden layer nodes are replaced by special nodes called memory cells. Figure 2 shows the structure of a memory cell of the LSTM model.
Each memory cell has a recurrent connection with a fixed weight, ensuring that the gradient does not rapidly vanish or explode as it propagates through time. A memory cell is composed of simple nodes in a specific connection pattern, and a series of gates between these nodes manages the flow of information in the unit. The components of a memory cell are detailed below [18]:
1. Input node (g_t): performs the linear combination of the input vector x_t and the output of the hidden layer at the previous instant h_{t-1}. This node delivers new information to the memory cell.
2. Internal state (c_t): the main node of the LSTM model; it uses a linear activation function and fixed weights. Since a fixed weight is used in the internal state recurrence, the error propagates through time without the problems of gradient vanishing or explosion.
3. Input gate (i_t): the first gate used by the LSTM network; it controls the flow of information entering the memory cell. If the value of the gate is zero, no new information enters the memory cell; if it is one, all new information enters.
4. Forgetting gate (f_t): restricts the information kept in the internal state of the memory cell.
5. Output gate (o_t): controls the flow of information out of the memory cell. The value finally delivered by a memory cell is given by the internal state of the unit and the output gate. When the output gate is approximately one, the information in the memory cell is passed to the hidden state to be used by the output layer; when it is approximately zero, the information is retained in the cell itself.
6. Candidate memory cell (c̃_t): its calculation is similar to that of the three gates (input, forgetting and output), but uses the tanh activation function, with a range of values between [-1, 1].
7. Hidden state (h_t): the tanh function ensures that the value of each hidden state element is between [-1, 1].
Given an input sequence {x_0, x_1, ..., x_T} and using the memory cell components, the basic behavior of the LSTM model is described by the following equations:

g_t = W_gx x_t + W_gh h_{t-1} + b_g (1)
i_t = σ(W_ix x_t + W_ih h_{t-1} + b_i) (2)
f_t = σ(W_fx x_t + W_fh h_{t-1} + b_f) (3)
o_t = σ(W_ox x_t + W_oh h_{t-1} + b_o) (4)
c̃_t = tanh(g_t) (5)
c_t = f_t • c_{t-1} + i_t • c̃_t (6)
h_t = tanh(c_t) • o_t (7)

Where W_gx, W_ix, W_fx, W_ox and W_gh, W_ih, W_fh, W_oh are the weights connecting the layers, t is the time instant of the sequence, t-1 is the previous time instant, h_{t-1} is the output value of the hidden layer of the network at the previous time instant, b_g, b_i, b_f, b_o are the bias parameters, and tanh and σ are the hyperbolic tangent and sigmoid activation functions, respectively.
The goal of LSTM neural networks is to learn when to let new information into the internal state and when to let information out of the memory cell. Back-Propagation Through Time (BPTT) is used for learning, and the weights of the gates and of the input and output nodes are adjusted [18].
In practice, the LSTM model has a better ability to learn long-term dependencies compared to simple RNNs.
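The memory-cell behavior described above can be sketched directly in code. The following is a minimal single-step implementation in Python with NumPy; the dictionary-based weight layout and all names are our own convention, not a specific library's API:

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM memory-cell step. W, U, b hold the input weights,
    recurrent weights and biases for the input node (g), input gate (i),
    forgetting gate (f) and output gate (o)."""
    sigma = lambda z: 1.0 / (1.0 + np.exp(-z))
    g = W['g'] @ x + U['g'] @ h_prev + b['g']           # input node (linear)
    c_tilde = np.tanh(g)                                # candidate memory cell
    i = sigma(W['i'] @ x + U['i'] @ h_prev + b['i'])    # input gate
    f = sigma(W['f'] @ x + U['f'] @ h_prev + b['f'])    # forgetting gate
    o = sigma(W['o'] @ x + U['o'] @ h_prev + b['o'])    # output gate
    c = f * c_prev + i * c_tilde                        # internal state update
    h = np.tanh(c) * o                                  # hidden state
    return h, c

# Toy dimensions for illustration only.
rng = np.random.default_rng(1)
n_in, n_h = 4, 8
W = {k: rng.normal(scale=0.1, size=(n_h, n_in)) for k in 'gifo'}
U = {k: rng.normal(scale=0.1, size=(n_h, n_h)) for k in 'gifo'}
b = {k: np.zeros(n_h) for k in 'gifo'}
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_h), np.zeros(n_h), W, U, b)
```

Note how the gates i, f and o each stay in (0, 1), so they act as soft switches on the information entering, remaining in, and leaving the internal state.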

III.3 FORECAST MODEL PERFORMANCE
Performance estimation is critical for assessing the accuracy of a model's predictions. Common metrics include the Mean Absolute Error (MAE), the Mean Absolute Percentage Error (MAPE) and the Root Mean Square Error (RMSE). MAE estimates the average magnitude of the errors in a forecast data set: it averages the absolute differences between actual observations and predicted outcomes across the entire test sample, assigning all individual discrepancies equal weight. RMSE estimates the error as the square root of the average of the squared differences between the predicted values and the actual observations; it therefore penalizes large deviations, which are especially undesirable, more heavily, helping the researcher to identify outliers. Both MAE and RMSE can vary from zero to infinity. In contrast, MAPE expresses the prediction error as a percentage of the measured values, which makes accuracy comparable across different data sets [1]. The equations for these metrics are as follows:

MAE = (1/N) Σ |ŷ_i − y_i|
RMSE = √( (1/N) Σ (ŷ_i − y_i)² )
MAPE = (100/N) Σ |ŷ_i − y_i| / y_i

Where ŷ_i and y_i are the corresponding predicted and measured values of PV power and N is the number of test samples.
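These three metrics are straightforward to implement; a minimal sketch in Python with NumPy follows (the array names and toy values are illustrative, not the plant's data):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the errors."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root Mean Square Error: penalizes large deviations more heavily."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in %; assumes no zero measurements."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Toy measured and predicted PV power values (MW).
y_true = np.array([1.0, 0.8, 0.5])
y_pred = np.array([0.9, 0.9, 0.4])
```

For these toy arrays, MAE and RMSE both come out to 0.1 MW, while MAPE weights each error by the measured value, giving about 14.2%.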

III.4 PROPOSED METHODOLOGY
Figure 3 shows a summary of the methodology used to predict the power generated in a photovoltaic installation using the LSTM model.
This process takes into account several fundamental steps that include obtaining historical data from the PV plant and preprocessing them to eliminate outliers in each of the time series.
Then, a data filtering process is applied in order to reduce the computational load of the prediction models and the training time; in this process, all the measurements corresponding to the nighttime hours, when the plant does not generate active power, are removed from the database. Therefore, only 14 measurements are analyzed each day, corresponding to the plant's working hours between 6:00 and 20:00. A statistical analysis of the data is also performed, including the correlation analysis between the meteorological variables and the power generated, in order to define the inputs of the prediction model.
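This filtering step can be sketched with pandas; the frame and column names below are our own assumptions for illustration, and the half-open 6:00-20:00 window is chosen here so that each day keeps exactly 14 hourly samples:

```python
import pandas as pd

# Two days of hourly measurements; 'pv_power' is a placeholder column.
idx = pd.date_range('2020-07-01 00:00', periods=48, freq='h')
df = pd.DataFrame({'pv_power': range(48)}, index=idx)

# Keep only the plant's daytime working hours, dropping nighttime rows
# in which no active power is generated (14 samples per day: 6:00-19:00).
day = df[(df.index.hour >= 6) & (df.index.hour < 20)]
```

Dropping the nighttime rows shrinks the training set by more than 40%, which is the reduction in computational load the methodology refers to.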
Subsequently, data normalization is performed to avoid distortion of the results and to achieve a more accurate prediction; the data are then divided into training data and test data. Then the characteristics of the model are defined and its training is performed, after which the proposed model is validated and tested. Several statistical indicators (MAE, RMSE, MAPE) are used to quantify the accuracy of the developed model. Finally, the designed LSTM model can be used for energy production prediction.
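These two preprocessing steps can be sketched in a few lines; min-max scaling to [0, 1] is one common normalization choice (the paper does not specify which scaling it used), and the toy series is ours:

```python
import numpy as np

# Toy PV power series (MW); min-max normalization to [0, 1].
series = np.array([0.0, 0.2, 0.5, 1.1, 0.8, 0.3, 0.9, 0.6, 0.4, 0.7])
lo, hi = series.min(), series.max()
scaled = (series - lo) / (hi - lo)

# Chronological split: the first 90% for training, the rest for testing,
# matching the split ratio reported later in the paper.
split = int(0.9 * len(scaled))
train, test = scaled[:split], scaled[split:]
```

A chronological (rather than shuffled) split is the natural choice for time series, since the model must be tested on data that comes after its training period.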

IV.1 STATISTICAL ANALYSIS OF THE DATA
A database was compiled containing measurements of active power generated (expressed in MW), solar irradiance (expressed in W/m²), PV module temperature (expressed in °C), ambient temperature (expressed in °C), wind speed (expressed in m/s) and wind direction (expressed in degrees, 0°-360°). These measurements were taken with a temporal resolution of one hour from July 2018 to April 2021. Figure 4 shows a fragment of the database.

Figure 4 (fragment of the database) contains the columns: Date, Wind direction, Wind speed, Solar irradiance, Module temperature, Ambient temperature and PV power.
Tables 1, 2, 3, 4, 5 and 6 summarize the main statistical indicators of the variables active power generated, solar irradiance, PV module temperature, ambient temperature, wind speed and wind direction for each year present in the database.
When analyzing the results shown in these tables, it can be said that in general the quality of the data was acceptable, although there were values that at first glance could be considered erroneous.
Among these were, for example, minimum values of ambient temperature and PV module temperature equal to zero, maximum values of ambient temperature higher than 40 °C, solar irradiance values higher than 1300 W/m², and power values higher than 1.1 MW; all of these measurements were evidently outliers in their respective time series.
To solve these problems and improve data quality, an outlier cleaning process was applied that took into account the extreme limits of each time series. As for missing measurements, when these corresponded to small data segments (≤ 1 h), they were replaced by the average of four observations, two points before and two points after, to maintain the originality and length of the input sequences.
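The short-gap rule can be sketched as follows; the function name and the toy series are our own, and only the "≤ 1 h gap → mean of two points before and two after" rule comes from the text:

```python
import numpy as np

def fill_short_gaps(x):
    """Replace each isolated missing value by the mean of up to two
    observations before and two after it."""
    x = x.copy()
    for t in np.flatnonzero(np.isnan(x)):
        window = np.concatenate([x[max(t - 2, 0):t], x[t + 1:t + 3]])
        x[t] = np.nanmean(window)
    return x

series = np.array([0.4, 0.5, np.nan, 0.7, 0.8])
filled = fill_short_gaps(series)  # the gap becomes (0.4+0.5+0.7+0.8)/4 = 0.6
```

Using neighbors on both sides keeps the imputed point consistent with the local trend of the hourly series without altering its length.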
Finally, when there were large sequences (from several hours to weeks) of missing or defective data, imputation was performed, although this problem was practically absent in the available database. The correlation analysis of the data was a very important aspect, since it allowed defining which variables were to be taken into account in the prediction model.
In this case, a database of the UCLV photovoltaic plant was available, which contained time series of several variables for a period close to 3 years. The variable to be predicted was the PV power generated at the facility. Therefore, in the correlation analysis, each of the variables was analyzed with respect to the PV power generated.
Table 7 shows the values obtained for the correlation of the different meteorological variables with respect to the photovoltaic power generated.
As can be seen, the highest correlation was presented by the solar irradiance variable, followed by module temperature, ambient temperature and wind speed. These correlation values were considered strong; therefore, these variables were selected as inputs to the prediction model.
The wind direction variable presented a very weak correlation and was not considered as an input to the prediction model. (Table 7 source: Authors, 2021.)
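This screening step can be sketched with pandas; the numbers below are synthetic, the column names are ours, and the 0.5 cutoff for a "strong" correlation is our own illustrative threshold (the paper does not state one):

```python
import pandas as pd

# Synthetic toy measurements: irradiance tracks PV power closely,
# wind direction does not.
df = pd.DataFrame({
    'irradiance': [0, 200, 500, 800, 1000, 600],
    'wind_dir':   [10, 350, 90, 180, 270, 45],
    'pv_power':   [0.0, 0.2, 0.5, 0.8, 1.0, 0.6],
})

# Pearson correlation of each candidate variable with the target.
corr = df.corr()['pv_power'].drop('pv_power')

# Keep only variables whose correlation is strong (illustrative cutoff).
inputs = corr[corr.abs() >= 0.5].index.tolist()
```

With these toy values, irradiance survives the screen while wind direction is discarded, mirroring the selection reported in Table 7.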

IV.2 PREDICTION RESULTS USING THE DEEP LEARNING MODEL LSTM
To predict the power generated at the UCLV photovoltaic plant, an LSTM deep learning model was implemented, as currently recommended in a considerable number of publications due to its advantages and its proven effectiveness in learning long-term dependencies compared to simple RNNs.
The implemented model contained two hidden layers, each with 200 memory cells. For its training, the initial learning rate was set to 0.05, the adaptive moment estimation (Adam) algorithm was used as the optimizer, and the number of training epochs was 250. In the training process of the LSTM model, different data-splitting variants were tested. One of the options that provided the best results was dividing the data according to the seasons of the year: the measurements of one year of operation of the photovoltaic plant were divided into three-month periods, so that each period corresponded to one season of the year under study. For training the LSTM model, 90% of each season's data was used and the remaining 10% was used to test its performance. The model was implemented in MATLAB 2019a and run on a computer with an Intel(R) Core(TM) i3 CPU at 2.4 GHz and 8 GB of memory.
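The reported configuration and the seasonal 90/10 split can be summarized as follows; the hyperparameter values come from the text, while the dictionary layout and the approximate per-season sample count (3 months × ~30 days × 14 daytime samples) are our own illustration:

```python
# Hyperparameters as reported in the paper.
config = {
    'hidden_layers': 2,
    'cells_per_layer': 200,
    'initial_learning_rate': 0.05,
    'optimizer': 'Adam',
    'epochs': 250,
}

# Approximate size of one season of daytime data (our estimate):
# 3 months x ~30 days x 14 samples per day.
season_samples = 3 * 30 * 14

# 90% of each season for training, the remaining 10% for testing.
n_train = int(round(0.9 * season_samples))
n_test = season_samples - n_train
```

Under this estimate, each season contributes roughly 1134 training samples and 126 test samples, which is small enough for the modest hardware reported above.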
To analyze the performance of the proposed model, predictions of the power generated in the PV system were made for three days with different behavior (a sunny day, a cloudy day and a partially sunny day). In this case, a short-term prediction was made (for 14 hours of the following day). The results of these predictions are shown in Figures 5, 6 and 7. As can be seen in these figures, the predictions made by the LSTM deep learning model accurately reflected the PV power generation patterns on each day analyzed, even though the behavior of each of these days was different. Evidently, the highest accuracy of the prediction model was obtained for the sunny day, while the highest prediction uncertainties appeared on the cloudy day, as expected.
Table 8 shows the prediction accuracy of the LSTM deep learning model for a short-term time horizon (14 hours) and for each of the analyzed day types. Three fundamental metrics were used: MAE, RMSE and MAPE.

Figure 2: Structure of a memory cell of the LSTM model. Source: [18].


Figure 5: Short-term forecast of PV generation for a sunny day in July 2020. Source: Authors, (2021).

Table 1: Statistical analysis of the active power generated variable.

Table 2: Statistical analysis of the solar irradiance variable.

Table 3: Statistical analysis of the PV module temperature variable.

Table 4: Statistical analysis of the ambient temperature variable.

Table 5: Statistical analysis of the wind speed variable.

Table 6: Statistical analysis of the wind direction variable.

Table 7: Correlation between PV power generated and meteorological variables.