International Journal of Electrical Power and Energy Systems 141 (2022) 108143

Journal homepage: www.elsevier.com/locate/ijepes

An advanced short-term wind power forecasting framework based on the optimized deep neural network models

Seyed Mohammad Jafar Jalali a, Sajad Ahmadian b, Mahdi Khodayar c, Abbas Khosravi a, Miadreza Shafie-khah d, Saeid Nahavandi a, João P.S. Catalão e,∗

a Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Geelong, Australia
b Faculty of Information Technology, Kermanshah University of Technology, Kermanshah, Iran
c Department of Computer Science, University of Tulsa, USA
d School of Technology and Innovations, University of Vaasa, Vaasa, Finland
e Faculty of Engineering of University of Porto and INESC TEC, Porto, Portugal

ARTICLE INFO

Keywords: Deep neural networks; Evolutionary computation; Neuroevolution; Optimization; Wind power forecasting

ABSTRACT

With the continued growth of wind power penetration into conventional power grid systems, wind power forecasting plays an increasingly important role in organizing and deploying electrical and energy systems. Wind power time series, however, often exhibit non-linear and non-stationary characteristics, which makes them quite challenging to estimate precisely. The aim of this paper is to propose a novel hybrid model named Evol-CNN, based on a deep convolutional neural network (CNN) and an evolutionary search optimizer, to predict short-term wind power at 10-min intervals up to 3 h ahead. Specifically, we develop an improved version of the Grey Wolf Optimization (GWO) algorithm by incorporating two effective modifications into its original structure. The proposed GWO algorithm is more effective than the original version because it runs faster and is able to escape from local optima.
The proposed GWO algorithm is used to find the optimal hyperparameter values of the deep CNN model, and the resulting optimal CNN model is employed to predict wind power time series. The main advantage of the proposed Evol-CNN model is that it enhances the capability of time series forecasting models to obtain more accurate predictions. Several forecasting benchmarks are compared with the Evol-CNN model to assess its effectiveness. The simulation results indicate that Evol-CNN has a significant advantage over the competing benchmarks and achieves the minimum error for 10-min, 1-h and 3-h ahead forecasting.

∗ Corresponding author. E-mail address: catalao@fe.up.pt (J.P.S. Catalão).

1. Introduction

In recent years, wind energy has gained remarkable attention as a clean source of electricity that addresses crucial environmental concerns [1–4]. The stability and reliability of energy production and the reduction of greenhouse gas emissions are significant challenges that have recently emerged in the domain of power engineering [2,5–8]. The accurate prediction of wind power, which is a highly varying time series with a stochastic and intermittent nature, plays a key role in overcoming such issues [1,9]. Even though the power generated by a wind turbine depends heavily on atmospheric conditions, accurate wind power prediction results in improved wind energy predictions. Consequently, the recent literature presents a broad variety of time series forecasting algorithms for the prediction of wind power time series. The nature of wind data is stochastic and chaotic, which means that predicting wind power with linear models is a very challenging task [10]. Furthermore, the length of the prediction horizon correlates negatively with the accuracy of the forecasting algorithm [9]. Ultrashort-term wind forecasting refers to forecasting wind data from a few minutes to one hour ahead.
This operation is primarily aimed at electricity market clearing, real-time grid operations and regulatory activities [11]. Short-term predictions generally cover horizons from one hour to several hours ahead. This type of forecasting is typically used for unit commitment and operational security in the energy industry [11].

Available online 6 April 2022. Received 11 October 2021; received in revised form 6 February 2022; accepted 18 March 2022.
https://doi.org/10.1016/j.ijepes.2022.108143
0142-0615/© 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Nomenclature

$w^{l-1}_{ik}$    Kernel of the $i$th neuron at layer $l-1$ towards the $k$th neuron of layer $l$
$\alpha$    Alpha wolf
$\beta$    Beta wolf
$\delta$    Delta wolf
$\Delta^{l}_{k}$    Delta of the $k$th neuron at layer $l$
$\Gamma$    Standard gamma function
$\hat{y}_i$    Vector of predicted wind power data
$\omega$    Omega wolf
$X_p$    Position vector of the prey
$\vec{X}$    Position vector of the grey wolf
$b^{l}_{k}$    Scalar bias of the $k$th neuron at layer $l$
$b_j$    Total number of the item type $j$
$conv1Dz(.,.)$    Fully convolutional operation in 1-D space using zero padding
$l$    Current convolution layer
$l-1$    Previous convolution layer
$lb$    Lower bound of the search space
$Levy()$    Levy flight function
$rev(.)$    Array reversing
$s^{l-1}_{i}$    Output of the $i$th neuron of layer $l-1$
$ub$    Upper bound of the search space
$x^{l}_{k}$    $k$th feature of input
$x_{ij}$    Integer values of the $X_i$ position in the $j$th dimension
$y^{l}_{k}$    Intermediate output of neuron from the input
$y_i$    Vector of observed wind power data
$y_{(t)}$    Actual wind power for the time step $t$
$y_{ij}$    Transformed real number of the $j$th dimension of individual $i$
AEMO    Australian Energy Market Operator
AI    Artificial intelligence
ANN    Artificial neural network
AR    Auto regressive
ARIMA    Auto regressive integrated moving average
ARMA    Auto regressive moving average
B$_s$    Batch size
BaNN    Bagging neural network
BP    Back propagation
CNN    Convolutional neural network
D$_r$    Dropout rate
DE    Differential evolution
Evol-CNN    Proposed evolutionary CNN framework
FFNN    Feed-forward neural network
FP    Forward propagation
GWO    Grey wolf optimizer
IGWO    Improved version of GWO
K$_s$    Kernel size
L$_r$    Learning rate
LSTM    Long short-term memory
M$_r$    Momentum rate
MAE    Mean absolute error
MAPE    Mean absolute percentage error
MI    Mutual information
MP$_s$    Maxpooling size
MSE    Mean square error
N$_c$    Number of convolutional layers
N$_e$    Number of epochs
N$_f$    Number of filters
PR    Persistent model
PSO    Particle swarm optimization
RBFNN    Radial basis function neural network
ReLU    Rectified linear unit
RMSE    Root mean square error
SAE    Stacked auto-encoder
SGD    Stochastic gradient descent
SVR    Support vector regression

The wind power forecasting methodologies presented in the recent technical literature can be categorized into four classes:

(1) The persistent model (PR) assumes that future wind measurements take values similar to the most recent historical measurement. This smoothness assumption leads to a simple method with the lowest computational requirements; however, the accuracy of this model declines significantly as the prediction horizon grows [11].

(2) Physical models work on the basis of numerical weather prediction (NWP) by considering different meteorological parameters such as temperature, pressure, and obstacles.
NWP leads to accurate predictions for long-term wind forecasting tasks and is used in problems involving large-scale predictions over very wide areas. The major disadvantage of this method is its large time and memory complexity, which is addressed by using supercomputers. Thus, this method cannot be employed for real-time predictions with limited computational resources [12].

(3) Statistical methods learn the mathematical relationship between different variables corresponding to real-time wind data samples. This category of approaches includes the auto regressive (AR), auto regressive moving average (ARMA) and auto regressive integrated moving average (ARIMA) models, the Bayesian approach, as well as grey predictions. In this domain, the authors of [13] developed a novel version of ARIMA for the short-term prediction of wind data that eases decision making in wind sectors. The research work in [14] provides a novel signal decomposition technique based on ARIMA to capture the general trend of wind speed datasets. Furthermore, a robust spline regression model optimized by the variational Bayesian method [16] is proposed in [15]. In this class of approaches, an infinite Markov switching AR model is also presented in [17] as a non-parametric Bayesian framework with a flexible posterior distribution for accurate wind forecasting on large-scale datasets.

(4) Artificial intelligence (AI) methodologies have shown great potential for solving numerous types of real-world applications [18–24]. AI algorithms such as artificial neural networks, support vector regression [25], transfer learning [26] and fuzzy systems have led to novel wind prediction algorithms in very recent studies [27]. Artificial neural networks (ANNs) are widely used as nonlinear models that extract powerful relationships between the input wind measurements and the future wind speed values.
In this group, ANNs have been successfully applied to the prediction of different weather time series with various time scales [28]. Feed-forward ANN, recurrent ANN, radial basis function (RBF) ANN, and adaptive wavelet ANN have recently been proposed for wind speed and wind power forecasting applications. While these methodologies provide a highly nonlinear regression model, they cannot handle the large variations in wind time series due to the lack of computational layers. Also, determining the optimal values of ANN parameters such as weights and biases is a challenging task in this research area [29]. As a result, recent studies focus on deep neural networks, which consist of a large number of latent layers, to extract features from wind power data in a supervised and unsupervised manner [30].

Among recent deep learning models, stacked autoencoders [31] are presented to develop an accurate multi-scale prediction method that learns heterogeneous wind features in a real-time fashion. In [32], a deep belief network is applied that can learn the most significant unsupervised wind features in a probabilistic manner. Also, the authors of [11] developed an interval version of this model that can address the uncertainties and noise in wind measurements. The Long Short-Term Memory network [33] is a recurrent version of deep ANNs that applies a large number of temporal latent layers to effectively learn powerful temporal characteristics of the wind power data. In [34], an ensemble algorithm based on deep CNN and wavelet transform is proposed for wind power forecasting using a real wind farm dataset from China.
The research work presented in [35] proposes an improved empirical mode decomposition combined with a bagging neural network (BaNN), the K-means clustering method, and the shark smell optimization (SSO) algorithm to predict wind power data. In [36], a hybrid deep CNN with radial basis function neural network (RBFNN) and double Gaussian function (DGF) approaches is applied to 24 h-ahead short-term wind power forecasting. In [37], a combination of a two-dimensional CNN and an improved wavelet transform, trained by an improved version of particle swarm optimization (PSO), is proposed to forecast wind power for short and long-term forecasting horizons. The authors of [38] apply a differential evolution (DE) algorithm to optimize the hyperparameters of deep long short-term memory (LSTM) neural networks for a time series forecasting problem. In another work [39], a forecasting algorithm based on variational-mode decomposition long short-term memory (VMD-LSTM) is proposed in order to improve the accuracy of multi-step wind power forecasting.

In most of the previous works, the architecture of the deep learning algorithm is designed manually, which is a time-consuming and difficult task [40–42]. To overcome these issues, advanced deep neuroevolution search methods have been introduced to carry out this design procedure automatically and efficiently [43,44]. These methods have been successfully implemented and used in several real-world problems such as computer vision, fraud detection and medical domains [40]. Deep neuroevolution is defined as the process of optimizing the architecture of deep learning networks by evolutionary methods in order to achieve the highest accuracy and obtain the optimal architectures for DNNs [40].

In this paper, a novel deep neuroevolution method is presented that employs the convolution operation to extract powerful temporal features from wind power signals. Our model is optimized by a new deep neuroevolution algorithm for short-term wind power prediction.
In contrast to classic deep models, we combine a modified version of the grey wolf optimizer (GWO) with CNNs to overcome local minimum issues in the optimization of the hyperparameters and to obtain the best architectures with the highest accuracy. In addition, we employ a mutual information (MI) feature extraction strategy to obtain the optimal input features for the proposed deep learning model. The proposed forecasting algorithm is applied to both ultrashort-term and short-term forecasting horizons. In summary, the main contributions of this study are fourfold:

• An improved version of the GWO algorithm is proposed as a novel evolutionary optimization approach to strengthen the search capability and reduce the possibility of being trapped in local optima.
• An efficient forecasting method is developed based on the improved GWO algorithm and the CNN deep learning model for wind power signals to obtain the highest prediction accuracy. To the best of our knowledge, this is the first attempt to solve the wind power prediction problem with a deep neuroevolution algorithm. To this end, the proposed method considers nine critical hyperparameters of deep CNNs to be optimized by the improved GWO algorithm.
• An effective mutual information feature extraction strategy is applied to increase the accuracy of the wind power forecasting procedure and obtain the optimal input features for such a highly volatile signal.
• Our simulations show the superiority of the proposed method over state-of-the-art and recently published works for 10 min up to 3 h ahead predictions in two case studies.

The rest of the paper is organized as follows: In Section 2, the theories and components of the proposed method are presented. Section 3 presents the numerical analysis for wind power data and, finally, the paper is concluded in Section 4.
2. Proposed method

In this section, we integrate a deep neural network with an evolutionary search strategy to propose a novel wind power forecasting method called Evol-CNN. To this end, the available historical time series of wind power is used as the input of a deep convolutional neural network that forecasts future wind power. The deep CNN architecture has several hyperparameters whose values affect its performance. In other words, the accuracy of the deep CNN for wind power forecasting can be improved by determining the optimal values of its hyperparameters. Therefore, we develop an improved version of the grey wolf optimization algorithm as an evolutionary search strategy to obtain the optimal hyperparameter values for the deep CNN. In the following subsections, we first discuss the representation of the solutions in the evolutionary optimization algorithm. Then, the fitness function used in the proposed method is introduced in detail. Finally, we discuss our novel search strategy for the Evol-CNN algorithm.

2.1. Representation of solutions

In the proposed method, we adopt a one-dimensional convolutional neural network (1D-CNN), since the data used in time series forecasting problems is generally one-dimensional. Thus, we apply the 1D-CNN to forecast the unknown values of wind power data. The 1D-CNN has several hyperparameters, and the aim of the improved GWO algorithm is to obtain optimal values for them. To this end, we optimize nine critical hyperparameters with Evol-CNN. Therefore, each solution in the population space of the improved GWO algorithm is defined as a vector of nine values corresponding to the considered hyperparameters. On the other hand, the basic search strategy of the GWO algorithm is mainly intended for optimization problems with a continuous domain, while the hyperparameter values of a CNN should mostly be discrete values.
Therefore, we use an encoding transformation function that maps each real-valued vector, i.e. the position of an individual in the continuous space, to a new integer vector for the individual in the discrete space as follows [45]:

$$Q_{ij} = \left\lfloor b_j \cdot \frac{x_{ij} - lb}{ub - lb} + 0.5 \right\rfloor, \quad j = 1, \ldots, n \tag{1}$$

where $x_{ij}$ represents the real value of the $X_i$ position in the $j$th dimension and $Q_{ij}$ denotes the transformed integer number of the $j$th dimension of individual $i$. $b_j$ denotes the total number of the item type $j$, and the lower and upper bounds of the search space are represented by $lb$ and $ub$, respectively. The overall schema of the representation of solutions in the population space of the IGWO algorithm for Evol-CNN is shown in Fig. 1.

Fig. 1. Representation of each solution in population space of IGWO algorithm for Evol-CNN.

2.2. Calculation of fitness function

Convolutional neural networks (CNNs) are a type of deep neural network that generally processes complex and large datasets. Although CNNs have been used effectively in a wide range of real-world problems, relatively few studies have validated CNNs for wind power forecasting. A CNN typically involves three main types of layers: convolutional layer(s), pooling layer(s), and a fully-connected layer. The convolutional layer uses a mathematical operation known as "convolution" to retrieve features from the raw data, whereas the pooling layer is used to decrease the dimension of the input data. Finally, a fully-connected layer at the end of the CNN forecasts the data points based on the retrieved features. Every convolutional layer is designed to derive patterns from the response component (wind power) as shown below:

$$y^{k}_{ij} = f\left( \left( W^{k} * h \right)_{ij} + b_k \right) \tag{2}$$

where $f$ denotes the activation function, $W^{k}$ is the kernel weight, $h$ is the input to the layer, $b_k$ is the bias, and $*$ denotes the convolution operation.
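As a minimal illustration of Eq. (2), the NumPy sketch below applies a single 1-D kernel to a short signal, adds a bias and passes the result through ReLU. The function name `conv1d_relu`, the toy signal and the kernel values are illustrative assumptions and are not taken from the paper. Note that `np.convolve` performs a true (kernel-flipped) convolution, whereas most deep learning libraries implement the layer as a cross-correlation.

```python
import numpy as np

def conv1d_relu(h, w, b):
    """Feature map f((w * h) + b) for one kernel, with f = ReLU."""
    h = np.asarray(h, dtype=float)
    w = np.asarray(w, dtype=float)
    z = np.convolve(h, w, mode="valid") + b   # "valid": no zero padding
    return np.maximum(z, 0.0)                 # ReLU activation

signal = [1.0, 2.0, 3.0, 4.0]   # toy input standing in for wind power samples
kernel = [1.0, 0.0, -1.0]       # hypothetical kernel weights
feature_map = conv1d_relu(signal, kernel, b=0.0)
```

With a sufficiently negative bias every pre-activation falls below zero and ReLU returns an all-zero feature map, which is one way to see how the bias gates the response of a filter.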
We also use the rectified linear unit (ReLU) as the activation function in the present work.

In the optimization process of Evol-CNN, the mean square error (MSE) is used to evaluate the performance of the CNN built for each solution in the search space, and it serves as the fitness function. The fitness function used in the optimization process is therefore calculated as follows:

$$E_p = MSE = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \tag{3}$$

where $y_i$ is a vector of observed wind power data points being predicted and $\hat{y}_i$ represents the vector of $n$ predictions generated from a sample of $n$ wind power data points. We use dropout and learning rate techniques in the fully-connected layer as a form of regularization to prevent over-fitting during training. The nonlinear transformation of the input data is carried out by the convolution, pooling, and fully-connected layers. In addition to these configurations, we use a momentum rate, which is a powerful strategy that helps to improve both training speed and accuracy. In this way, the distinguishing features of the input data are learnt for prediction. Finally, the training procedure is performed with the SGD method.

The aim of our novel Evol-CNN algorithm is to obtain the optimal values of the CNN hyperparameters and thereby improve the performance of wind power forecasting. To this end, we treat the set of CNN hyperparameter values as a vector representing a solution in the search space. The proposed optimization approach is then applied to find the optimal values of the CNN hyperparameters by exploring the search space. The optimization process also requires a fitness function to evaluate the quality of each solution. To this end, we apply the CNN model to the training samples, and the prediction error obtained by the CNN is taken as the fitness value.
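The two building blocks of the fitness evaluation, the decoding of Eq. (1) and the MSE of Eq. (3), can be sketched as follows. This is only a sketch under stated assumptions: the function names `decode_position` and `mse_fitness` are hypothetical, the bounds and item counts are toy values, and the CNN training step that would produce the predictions is replaced by hand-written numbers.

```python
import numpy as np

def decode_position(x, lb, ub, n_items):
    """Eq. (1): map real-valued coordinates in [lb, ub] to integer indices."""
    x = np.asarray(x, dtype=float)
    n_items = np.asarray(n_items, dtype=float)
    return np.floor(n_items * (x - lb) / (ub - lb) + 0.5).astype(int)

def mse_fitness(y_true, y_pred):
    """Eq. (3): mean square error used as the fitness value (lower is better)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Toy values: three coordinates in [0, 1], each mapped onto 10 items.
indices = decode_position([0.0, 0.5, 1.0], lb=0.0, ub=1.0, n_items=[10, 10, 10])
# Stand-in observations and predictions (no CNN is trained here).
error = mse_fitness([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Because of the `+ 0.5` term, the decoding rounds to the nearest integer, so a coordinate at the upper bound maps to `n_items` itself; an implementation that indexes into a list of candidate values would need to account for this.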
Suppose that the historical wind power values for $M$ time steps are represented by a vector as follows:

$$\vec{y} = \left( y_{(0)}, y_{(1)}, \ldots, y_{(M-1)} \right) \tag{4}$$

where $y_{(t)}$ represents the actual wind power for the time step $t$. We then use the CNN model to predict the wind power values for the next $N$ time steps. The predicted values of wind power for the next time steps are expressed as follows:

$$\vec{\hat{y}} = \left( \hat{y}_{(M)}, \hat{y}_{(M+1)}, \ldots, \hat{y}_{(M+N-1)} \right) \tag{5}$$

where $\hat{y}_{(t)}$ denotes the predicted wind power value for the time step $t$. After forecasting the wind power data with the CNN model, its error is taken as the fitness value of the corresponding solution. To this end, the MSE metric of Eq. (3) is used to calculate the error of the CNN. In the proposed model, the number of convolutional layers is obtained automatically by the optimization strategy. In addition, one max pooling layer followed by a dense fully-connected layer is used to construct the backbone of the CNN architectures.

2.3. Search strategy

In the Evol-CNN method, we design an improved version of the grey wolf optimization algorithm as an evolutionary search strategy to obtain the optimal hyperparameter values for the CNN model. The GWO algorithm is a swarm-based evolutionary meta-heuristic inspired by the encircling and hunting behaviour of grey wolves in nature [46]. Each individual in the GWO algorithm is used to configure a CNN based on the hyperparameter values it encodes. The wind power values in the training data can then be predicted by the configured CNN.

The main optimization process of Evol-CNN starts with an initialization step in which a number of individuals are initialized with random values representing the hyperparameter values of the CNN model. The number of dimensions of each individual vector is therefore equal to the number of hyperparameters optimized by the improved GWO algorithm. Each individual represents a solution containing the hyperparameter values of the CNN model.
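The initialization step can be sketched in a few lines. The function name `init_population` and the search bounds of [0, 1] are assumptions made for illustration; the population size of 20 and the nine dimensions follow the values stated in the paper.

```python
import numpy as np

def init_population(pop_size, n_dims, lb, ub, seed=None):
    """Draw pop_size random individuals uniformly inside [lb, ub)."""
    rng = np.random.default_rng(seed)
    return rng.uniform(lb, ub, size=(pop_size, n_dims))

# 20 wolves, one coordinate per optimized hyperparameter (nine in total).
population = init_population(pop_size=20, n_dims=9, lb=0.0, ub=1.0, seed=0)
```

Each row is one candidate solution in the continuous space; it would be decoded to discrete hyperparameter values with Eq. (1) before a CNN is configured from it.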
After the initialization step, the search procedure of the improved GWO algorithm produces new generations from the first population and repeats this step for a predefined number of iterations to find the optimal solution, which corresponds to the optimal hyperparameter values.

When designing the improved GWO algorithm, we consider the best solution as the alpha ($\alpha$) wolf. The second and third best solutions are called the beta ($\beta$) and delta ($\delta$) wolves, respectively. The rest of the candidate solutions are assumed to be omega ($\omega$) wolves. The hunting behaviour (optimization procedure) is driven by $\alpha$, $\beta$, and $\delta$, and the other wolves follow these three wolves. The encircling behaviour is mathematically modelled using the following equations:

$$\vec{D} = \left| C \cdot \vec{X}_p(t) - \vec{X}(t) \right| \tag{6}$$

$$\vec{X}(t+1) = \vec{X}_p(t) - A \cdot \vec{D} \tag{7}$$

where $t$ denotes the current iteration, $A$ and $C$ are coefficients, and $\vec{X}_p$ and $\vec{X}$ represent the position vectors of the prey and of a grey wolf, respectively. The coefficients $A$ and $C$ are determined as follows:

$$A = 2a \cdot r_1 - a \tag{8}$$

$$C = 2 \cdot r_2 \tag{9}$$

where the components of $a$ are decreased linearly from 2 to 0 over the iterations and the vectors $r_1$ and $r_2$ take random values in the $[0, 1]$ interval.

In order to model the hunting behaviour of the grey wolves mathematically, we presume that $\alpha$ (the best candidate solution), $\beta$, and $\delta$ have better awareness of the possible location of the prey. Accordingly, the three best solutions obtained so far are saved, and the other search agents (including the omega wolves) change their positions based on the current location of the best search agents.
In this respect, the distances between any other wolf (including the omegas) and the three best wolves are determined by the following equations, in decreasing order of their fitness:

$$\vec{D}_\alpha = \left| C_1 \cdot \vec{X}_\alpha - \vec{X} \right|, \quad \vec{D}_\beta = \left| C_2 \cdot \vec{X}_\beta - \vec{X} \right|, \quad \vec{D}_\delta = \left| C_3 \cdot \vec{X}_\delta - \vec{X} \right| \tag{10}$$

Following Eqs. (6) and (7), these distances are used to provide the new position of the wolf, $\vec{X}(t+1)$:

$$\vec{X}_1 = \vec{X}_\alpha - A_1 \cdot \vec{D}_\alpha, \quad \vec{X}_2 = \vec{X}_\beta - A_2 \cdot \vec{D}_\beta, \quad \vec{X}_3 = \vec{X}_\delta - A_3 \cdot \vec{D}_\delta \tag{11}$$

$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3} \tag{12}$$

The best solution, or prey, is found by repeatedly applying the encircling and hunting operators. In a nutshell, GWO starts with a random grey wolf population. The search mechanism is principally driven by $\alpha$, $\beta$ and $\delta$. They diverge to look for prey when $|A| > 1$ and converge to attack prey when $|A| < 1$. Eventually, when the stopping criterion is met, the optimal solution (prey) is obtained.

Although GWO is a robust optimizer that has demonstrated outstanding efficiency across a number of optimization problems, this paper offers a two-phase modification that further improves the efficiency of the algorithm. The proposed modifications of the GWO algorithm are described in the following.

- First modification: Based on the greedy selection (GS) procedure of the DE algorithm, we employ the idea of "survival of the fittest" with probability $p$. With this technique, new dominant positions within each generation are carried forward and improved in the next generations, while worse positions are ignored. The formula for this operator is as follows:

$$\vec{x}(t+1) = \begin{cases} \vec{x}(t) & f\left(\vec{x}_{new}(t)\right) > f\left(\vec{x}(t)\right) \text{ and } r_{new} < p \\ \vec{x}_{new}(t) & \text{otherwise} \end{cases} \tag{13}$$

where $f(\vec{x}(t))$ represents the fitness of the last position, $r_{new}$ and $p$ denote random values in the $(0, 1)$ range, and $\vec{x}_{new}(t)$ represents the new position obtained by Eq. (12). In each iteration, the value of $p$ in Eq. (13) is drawn at random from the $[0, 1]$ range.
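The position update of Eqs. (8) to (12) and the greedy selection of Eq. (13) can be sketched together as below. This is a minimal sketch, not the authors' implementation: the function names `gwo_update` and `greedy_selection` are hypothetical, the fitness is assumed to be minimized, and the toy population is chosen so that the result is easy to check by hand.

```python
import numpy as np

def gwo_update(wolves, leaders, a, rng):
    """Eqs. (8)-(12): move every wolf towards the alpha, beta and delta wolves."""
    candidates = []
    for leader in leaders:                            # alpha, beta, delta
        A = 2.0 * a * rng.random(wolves.shape) - a    # Eq. (8)
        C = 2.0 * rng.random(wolves.shape)            # Eq. (9)
        D = np.abs(C * leader - wolves)               # Eq. (10)
        candidates.append(leader - A * D)             # Eq. (11)
    return np.mean(candidates, axis=0)                # Eq. (12)

def greedy_selection(old, new, f_old, f_new, p, rng):
    """Eq. (13): keep the old position if the new one is worse and r_new < p."""
    r_new = rng.random(len(old))
    keep_old = (np.asarray(f_new) > np.asarray(f_old)) & (r_new < p)
    return np.where(keep_old[:, None], old, new)

rng = np.random.default_rng(0)
wolves = np.zeros((4, 2))                             # toy population
leaders = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
moved = gwo_update(wolves, leaders, a=0.0, rng=rng)   # a = 0 gives A = 0
```

With `a = 0` the coefficient `A` vanishes, so every wolf lands exactly on the average of the three leaders; this corresponds to the end of the search, where `a` has decayed to zero and the population contracts around the best solutions.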
Integrating GS into GWO enhances the search abilities, since each leader wolf gets the opportunity to survive and then share the information it has gathered with the other hunters in the next phases of the search procedure.

- Second modification: In GWO, the parameter $A$ controls the step size of the search agents and tends to decrease linearly over the iterations. In this modification phase, we use the Levy flight strategy to tune the parameter $A$. This modification continuously improves the exploration and exploitation potential of GWO. Let the parameter $\alpha$ denote the step size, as follows (here $\alpha$ and $\beta$ denote the step size and the Levy index, not the alpha and beta wolves):

$$\alpha \oplus Levy(\beta) \sim 0.01 \, \frac{p}{|q|^{1/\beta}} \left( X_i^k - X_{best}^k \right) \tag{14}$$

where the values of $p$ and $q$ are defined by:

$$p \sim N\left(0, \phi_u^2\right), \quad q \sim N\left(0, \phi_v^2\right) \tag{15}$$

$$\phi_u = \left[ \frac{\Gamma(1+\beta) \times \sin(\pi \times \beta / 2)}{\Gamma[(1+\beta)/2] \times \beta} \right]^{1/\beta}, \quad \phi_v = 1 \tag{16}$$

where $\Gamma$ denotes the standard gamma function and $\beta$ is taken in the interval $[0, 2]$. We modify the parameter $A$ using Eq. (17) as follows:

$$A = Levy(X) * u \tag{17}$$

where $X$ represents the position of the wolves and $u$ is a random value in the $[0, 1]$ range. These concepts are used to improve the global exploration as well as the local exploitation capacity of the conventional algorithm and to deepen the search advantages of GWO.

These two-stage modifications greatly improve the local exploitation and global exploration of GWO. We name the new algorithm improved GWO (IGWO). The flowchart of the proposed IGWO is shown in Fig. 2. In addition, a pseudo-code of our advanced wind power forecasting framework is provided in Algorithm 1.

Algorithm 1 Pseudo-code of the proposed wind power forecasting model
1: Input: $pop\_size$ (population size) and $n$ (maximum number of iterations).
2: Output: Predicted wind power.
3: Begin algorithm:
4: Split the dataset into a training set $Tr$ and a test set $Te$;
5: Initialize the grey wolf population $X_i$ ($i = 1, 2, \ldots, pop\_size$);
6: Initialize the parameters $a$, $A$ and $C$;
7: for (each solution $X_i$ in the grey wolf population) do
8:     Set a CNN model based on the values of solution $X_i$ as the hyperparameters;
9:     Calculate the fitness of solution $X_i$ using Eq. (3) as the MSE of the CNN model obtained on the training set $Tr$;
10: end for
11: Let $X_\alpha$ be the best solution;
12: Let $X_\beta$ be the second best solution;
13: Let $X_\delta$ be the third best solution;
14: Apply the GS strategy;
15: while (number of iterations < $n$) do
16:     for each solution $X_i$ in the grey wolf population do
17:         Update the position of $X_i$ using Eqs. (10)–(12);
18:         Set a CNN model based on the values of solution $X_i$ as the hyperparameters;
19:         Calculate the fitness of solution $X_i$ using Eq. (3) as the MSE of the CNN model obtained on the training set $Tr$;
20:     end for
21:     Update $a$, $A$ (by the Levy flight operator) and $C$;
22:     Update $X_\alpha$, $X_\beta$ and $X_\delta$;
23:     Increase the number of iterations by 1;
24: end while
25: Set a CNN model based on the values of solution $X_\alpha$ as the hyperparameters;
26: Predict the wind power data in the test set $Te$ using the CNN model;
27: End algorithm

Because the best possible set of hyperparameters and architectures has to be selected, training CNNs is considered a complex and difficult problem with an uncertain search space. In IGWO, the balance between the exploration and exploitation phases is well maintained, which can be quite effective in addressing complex challenges such as CNN training. Thus, we obtain the best solution, containing the optimal values of the CNN hyperparameters, after running the IGWO algorithm. It should be noted that the evolutionary search strategy is applied on the training data to configure the CNN model with the optimal hyperparameter values.
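The Levy flight operator used to update $A$ in Algorithm 1 can be sketched from Eqs. (14) to (16) as follows. The function names `levy_sigma` and `levy_step`, the default index `beta=1.5` and the toy positions are assumptions made for illustration. The optional flag adds the factor $2^{(\beta-1)/2}$ to the denominator, which appears in the widely used form of Mantegna's algorithm but not in Eq. (16) as printed here.

```python
import math
import numpy as np

def levy_sigma(beta, mantegna_factor=False):
    """phi_u of Eq. (16); optionally with the 2**((beta-1)/2) denominator term."""
    denom = math.gamma((1.0 + beta) / 2.0) * beta
    if mantegna_factor:
        denom *= 2.0 ** ((beta - 1.0) / 2.0)
    numer = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    return (numer / denom) ** (1.0 / beta)

def levy_step(x, x_best, beta=1.5, rng=None):
    """Step of Eq. (14): 0.01 * p / |q|**(1/beta) * (x - x_best)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    p = rng.normal(0.0, levy_sigma(beta), size=x.shape)   # Eq. (15), std phi_u
    q = rng.normal(0.0, 1.0, size=x.shape)                # Eq. (15), phi_v = 1
    return 0.01 * p / np.abs(q) ** (1.0 / beta) * (x - x_best)

# Toy positions: a nine-dimensional wolf and a best solution at the origin.
step = levy_step(np.ones(9), np.zeros(9), beta=1.5, rng=np.random.default_rng(0))
```

Because the ratio `p / |q|**(1/beta)` is heavy-tailed, most steps are small while occasional steps are very large, which is what gives the operator its ability to jump out of local optima.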
Once configured, the CNN model is applied to the test data to predict the unknown values of the wind power data points. The overall procedure of the Evol-CNN is conceptually presented in Fig. 3.

3. Experimental results and discussions

3.1. Wind power data

We evaluate the Evol-CNN algorithm on 10 min interval wind power data provided by the Australian Energy Market Operator (AEMO) for the whole year of 2010 from an existing wind farm in Australia [47]. In this work, the data of the Woolnorth wind farm, located in the northwest of Tasmania, are considered. This wind farm consists of 62 turbines with a nominal capacity of 140 megawatts (MW). It is important to note that the Woolnorth site is one of the most challenging for wind power forecasting in Australia because of its location on the edge of a cliff facing the Southern Ocean [47]. Similar to [10], we divided the dataset by season to show the performance of the wind forecasting models in different weather conditions. This scenario makes it possible to show the sensitivity of the compared models to different seasons. Then, for each season, 75 percent of the data is used for training, and the remainder is used for testing. The training set is used to train the models and the test set is used to evaluate the performance of the compared models in terms of different evaluation metrics.

Fig. 2. The flowchart of the proposed IGWO.

Table 1
The IGWO parameters.
Parameter             Value
a                     [2→0]
Population size       20
Number of iterations  20
Number of runs        10

Table 2
List of CNN hyperparameter symbols and their values.
Symbol  Value
B𝑠      [10, 20, . . . , 100]
N𝑒      [1, 300]
N𝑓      [1, 300]
K𝑠      [1, 25]
MP𝑠     [1, 15]
D𝑟      [0.2, 0.25, . . . , 0.65]
L𝑟      [0.001, 0.006, . . . , 0.1]
M𝑟      [0.05, 0.1, . . . , 0.95]
N𝑐      [1, 2, . . . , 5]

To obtain a better representation of the input feature values and improve the forecasting performance, we employ a mutual information (MI) strategy. Taking 𝑝(𝑡) as the value of wind power at time 𝑡, we measure the MI of 𝑝(𝑡 − 𝑙 + 1) and 𝑝(𝑡 + 1), where 𝑙 is the time lag of the wind power time series values. We measure the MI for the lags from 𝑙 = 1 to 𝑙 = 100. We select the time lags with MI values higher than a threshold 𝜏 = 0.4 as the input set, which yields the time lags from 𝑙 = 1 to 𝑙 = 29. Assume that we are currently at time 𝑡 and the wind power value at the future time horizon is to be predicted. Our selected input set is then the 29 + 28 = 57 dimensional set {𝑝(𝑡 − 28), 𝛥𝑝(𝑡 − 27), 𝑝(𝑡 − 27), … , 𝑝(𝑡)} with the sequential difference 𝛥𝑝(𝑡) = 𝑝(𝑡) − 𝑝(𝑡 − 1) in the wind power dataset.

3.2. Initialization setups for Evol-CNN

Selecting values for the initial parameters of the evolutionary algorithms used to train deep neural networks plays an important role in the performance of these networks, and IGWO is no exception to this rule. We chose the initial values of the IGWO algorithm based on the recommendations in [46]. These values are shown in Table 1.

It is also necessary to determine the architecture of deep CNNs before training them, which involves selecting the hyperparameters associated with each network layer. In this study, we optimize nine hyperparameters with the Evol-CNN framework: batch size (B𝑠), number of epochs (N𝑒), number of filters (N𝑓), kernel size (K𝑠), max-pooling size (MP𝑠), dropout rate (D𝑟), learning rate (L𝑟), momentum rate (M𝑟), and number of convolutional layers (N𝑐). Based on the previous literature [48], these are the most important hyperparameters, with a significant impact on the performance of CNN training.
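To make the nine-dimensional search space concrete, the following sketch shows one possible way to map a search position onto the discrete ranges of Table 2. The text does not state how positions are decoded, so the unit-interval encoding, the dictionary keys and the `decode` helper are all assumptions. For L𝑟, a 0.005 step starting at 0.001 does not land exactly on the listed upper end 0.1, so the grid below stops at 0.096.

```python
import numpy as np

# Discrete grids taken from Table 2; the key names are illustrative only.
SEARCH_SPACE = {
    "batch_size": np.arange(10, 101, 10),                         # B_s
    "epochs": np.arange(1, 301),                                  # N_e
    "filters": np.arange(1, 301),                                 # N_f
    "kernel_size": np.arange(1, 26),                              # K_s
    "pool_size": np.arange(1, 16),                                # MP_s
    "dropout": np.round(0.20 + 0.05 * np.arange(10), 2),          # D_r: 0.2 ... 0.65
    "learning_rate": np.round(0.001 + 0.005 * np.arange(20), 3),  # L_r: 0.001 ... 0.096
    "momentum": np.round(0.05 + 0.05 * np.arange(19), 2),         # M_r: 0.05 ... 0.95
    "conv_layers": np.arange(1, 6),                               # N_c
}


def decode(position):
    """Map a point in [0, 1]^9 to one grid value per hyperparameter (assumed encoding)."""
    position = np.clip(np.asarray(position, dtype=float), 0.0, 1.0)
    config = {}
    for x, (name, grid) in zip(position, SEARCH_SPACE.items()):
        index = min(int(x * len(grid)), len(grid) - 1)  # equal-width bins over the grid
        config[name] = grid[index].item()
    return config


print(decode(np.full(9, 0.5)))
```

With this encoding, the all-zeros position decodes to the lower end of every range and the all-ones position to the upper end, so any continuous optimizer operating on [0, 1]^9 can be used without producing invalid configurations.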
Table 2 shows the hyperparameters and their ranges used for the experiments in this work. These values are selected based on suggestions from the literature and on trial and error, so as to avoid over-fitting. The remaining settings used in CNN training are the activation function, which is set to ReLU, the optimizer, which is SGD, and the pooling type, which is max-pooling. The performance of the different forecasting models in this paper is evaluated using three well-known evaluation metrics: the root mean square error (RMSE), mean absolute percentage error (MAPE) and mean absolute error (MAE).

Fig. 3. An overview of the proposed wind power forecasting model.

Table 3
Error values of forecasting methods for spring season using different time horizons.
Model       10 min                        1 h                           3 h
            RMSE      MAPE      MAE       RMSE      MAPE      MAE       RMSE      MAPE      MAE
PR          0.1312    295.617   0.0972    0.1554    325.794   0.1022    0.2019    403.908   0.1328
AR          0.1108    256.223   0.0833    0.1389    221.026   0.0954    0.1961    349.241   0.09963
ARMA        0.1076    189.343   0.0782    0.1323    178.454   0.0902    0.1936    317.565   0.09577
ARIMA       0.0996    144.668   0.0754    0.1261    153.778   0.0855    0.1927    279.118   0.09042
SVR         0.0711    82.227    0.0415    0.1178    107.156   0.0734    0.1923    236.442   0.08166
CNN         0.0481    21.939    0.0277    0.0922    63.116    0.0581    0.1724    204.811   0.06343
FFNN        0.0673    37.513    0.0361    0.1124    97.212    0.0693    0.1881    225.666   0.07242
LSTM        0.0426    18.413    0.0254    0.0919    55.707    0.0577    0.1533    184.932   0.06122
DE-LSTM     0.0413    18.255    0.0247    0.0901    55.265    0.0531    0.1514    182.677   0.05896
SAE         0.0396    17.173    0.0236    0.0862    52.656    0.0472    0.1449    172.559   0.05361
DeepHybrid  0.0382    16.688    0.0222    0.0834    48.054    0.0456    0.1412    167.201   0.05032
GWO-CNN     0.0361    15.545    0.0205    0.0794    42.335    0.0424    0.1397    164.302   0.04754
Evol-CNN    0.0346    13.103    0.0177    0.0752    34.229    0.0313    0.1364    161.559   0.0403

Table 4
Error values of forecasting methods for summer season using different time horizons.
Model       10 min                        1 h                           3 h
            RMSE      MAPE      MAE       RMSE      MAPE      MAE       RMSE      MAPE      MAE
PR          0.1354    508.831   0.0415    0.1559    279.494   0.0526    0.3814    305.667   0.0855
AR          0.0938    315.992   0.0369    0.1433    178.101   0.0445    0.3177    242.881   0.0694
ARMA        0.0903    256.166   0.0361    0.1356    135.772   0.0427    0.3086    192.442   0.0668
ARIMA       0.0881    114.883   0.0346    0.1282    91.552    0.0418    0.2981    162.982   0.0627
SVR         0.0742    42.372    0.0298    0.1223    76.636    0.0397    0.2591    145.872   0.0589
FFNN        0.0632    36.856    0.0288    0.1074    69.284    0.0392    0.2451    141.208   0.0575
CNN         0.0618    34.217    0.0281    0.1052    62.646    0.0388    0.2313    135.662   0.0556
LSTM        0.0593    31.193    0.0278    0.1045    56.123    0.0385    0.2051    129.362   0.0542
DE-LSTM     0.0575    30.227    0.0272    0.1023    53.676    0.0379    0.2012    126.018   0.0527
SAE         0.0532    27.883    0.0265    0.1008    51.099    0.0376    0.1972    124.026   0.0514
DeepHybrid  0.0511    26.109    0.0258    0.0986    49.898    0.0371    0.1943    120.433   0.0498
GWO-CNN     0.0496    24.656    0.0251    0.0971    47.222    0.0365    0.1897    117.404   0.0485
Evol-CNN    0.0472    21.267    0.0232    0.0953    43.227    0.0315    0.1866    113.228   0.0462

3.3. Simulation results

In this section, we compare the performance of our proposed Evol-CNN method with the classical baselines for short-term wind power forecasting, including the persistence (PR) algorithm [9], auto-regressive (AR), auto-regressive moving average (ARMA), and auto-regressive integrated moving average (ARIMA) models. In addition, single and hybrid methods from the recent literature are compared with the proposed model. A single-model approach applies a single regression architecture to the prediction task. To demonstrate the impact of deep feature learning on wind data regression problems, we compare the Evol-CNN with shallow and deep ANN-based methods, including the feed-forward neural network (FFNN), long short-term memory (LSTM), and convolutional neural network (CNN). In addition, support vector regression (SVR) [49] is chosen as another powerful supervised learning benchmark used for regression tasks in the literature.
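As a point of reference for the simplest baseline and the three error metrics, the following sketch pairs a persistence forecast, in its usual sense of carrying the last observed value forward, with RMSE, MAE and MAPE. The helper names and the toy series are illustrative assumptions. On 10 min data, `steps_ahead` values of 1, 6 and 18 would correspond to the 10 min, 1 h and 3 h horizons.

```python
import numpy as np


def persistence_forecast(series, steps_ahead):
    """Pair each target p(t + steps_ahead) with the persistence prediction p(t)."""
    series = np.asarray(series, dtype=float)
    return series[steps_ahead:], series[:-steps_ahead]


def rmse(actual, predicted):
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def mae(actual, predicted):
    return float(np.mean(np.abs(actual - predicted)))


def mape(actual, predicted):
    # Assumes no zero targets; how zero-output samples are treated is not stated in the text.
    return float(100.0 * np.mean(np.abs((actual - predicted) / actual)))


power = [0.2, 0.4, 0.5, 0.5, 0.3]  # toy values, not taken from the AEMO dataset
actual, predicted = persistence_forecast(power, steps_ahead=1)
print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

For this toy series the three values are about 0.15, 0.125 and 34.2%. Because MAPE divides by the actual value, it grows quickly whenever the actual output is close to zero.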
Hybrid algorithms, by contrast, use multiple methods of wind feature extraction to improve the accuracy of prediction tasks. In this work, we compare the proposed Evol-CNN model with the recently proposed hybrid differential evolution-LSTM (DE-LSTM) algorithm [50], which employs DE to optimize the LSTM hyperparameters, as well as with the deep stacked auto-encoder (SAE) [9], which learns rough patterns from the input wind data. A combination of the standard version of GWO with a deep CNN (GWO-CNN) is also included in order to show the searching capability of our IGWO model. In addition, we compare our proposed Evol-CNN with the hybrid algorithm proposed in [11], named DeepHybrid. The design of this algorithm is based on a deep belief network (DBN) and a fuzzy type II inference system (FT2IS) for the supervised regression of future wind values.

Table 5
Error values of forecasting methods for autumn season using different time horizons.
Model       10 min                        1 h                           3 h
            RMSE      MAPE      MAE       RMSE      MAPE      MAE       RMSE      MAPE      MAE
PR          0.1029    111.586   0.0455    0.1188    113.868   0.0446    0.1819    166.859   0.0519
AR          0.9652    74.402    0.0405    0.1098    85.922    0.0408    0.1682    145.911   0.0462
ARMA        0.0933    65.881    0.0392    0.1094    78.912    0.0397    0.1635    141.667   0.0451
ARIMA       0.0892    58.922    0.0371    0.1089    73.443    0.0389    0.1592    138.509   0.0438
SVR         0.0826    34.832    0.0309    0.1055    52.052    0.0342    0.1476    127.669   0.0415
FFNN        0.0622    19.914    0.0289    0.0991    61.948    0.0329    0.1445    120.446   0.0397
CNN         0.0615    16.332    0.0283    0.0936    60.407    0.0324    0.1408    117.494   0.0391
LSTM        0.0598    14.601    0.0278    0.0944    53.245    0.0321    0.1373    116.221   0.0378
DE-LSTM     0.0573    13.651    0.0268    0.0913    51.806    0.0317    0.1355    115.109   0.0371
SAE         0.0521    11.282    0.0262    0.0882    49.202    0.0311    0.1317    111.865   0.0356
DeepHybrid  0.0493    10.769    0.0256    0.0853    46.099    0.0306    0.1284    107.257   0.0331
GWO-CNN     0.0472    10.121    0.0241    0.0821    44.788    0.0296    0.1266    104.556   0.0326
Evol-CNN    0.0445    9.343     0.0211    0.0794    39.545    0.0278    0.1238    102.433   0.0315

Table 6
Error values of forecasting methods for winter season using different time horizons.
Model       10 min                        1 h                           3 h
            RMSE      MAPE      MAE       RMSE      MAPE      MAE       RMSE      MAPE      MAE
PR          0.1289    125.9188  0.0358    0.1549    215.661   0.0347    0.2977    357.326   0.0496
AR          0.1124    97.2224   0.0309    0.1481    178.545   0.0325    0.2561    289.156   0.0455
ARMA        0.1053    88.1532   0.0292    0.1419    148.919   0.0317    0.2451    251.727   0.0434
ARIMA       0.0956    79.4322   0.0278    0.1356    124.771   0.0308    0.2238    238.115   0.0416
SVR         0.0847    50.1192   0.0251    0.1243    99.0987   0.0288    0.2142    192.663   0.0391
FFNN        0.0809    25.6676   0.0235    0.1285    90.3269   0.0275    0.2055    188.919   0.0362
CNN         0.0758    22.5572   0.0228    0.0121    67.8111   0.0271    0.1863    183.434   0.0346
LSTM        0.0583    18.9991   0.0224    0.1159    66.9182   0.0266    0.1791    178.934   0.0338
DE-LSTM     0.0561    17.2118   0.0218    0.1134    44.3099   0.0259    0.1761    173.244   0.0332
SAE         0.0535    15.8782   0.0215    0.1113    39.5664   0.0254    0.1722    169.092   0.0327
DeepHybrid  0.0493    12.2999   0.0208    0.1073    34.3373   0.0251    0.1693    163.455   0.0321
GWO-CNN     0.0476    11.6673   0.0204    0.1089    30.8982   0.0259    0.1674    159.915   0.0309
Evol-CNN    0.0435    10.1983   0.0198    0.1022    27.2891   0.0246    0.1635    154.676   0.0302

Table 7
The best CNN architectures found by Evol-CNN.
Dataset  Horizon  RMSE    Hyperparameters
                          N𝑓    K𝑠    N𝑒    B𝑠    MP𝑠   D𝑟    L𝑟     M𝑟    N𝑐
Spring   10 min   0.031   40    1     30    60    2     0.25  0.011  0.05  1
Spring   1 h      0.071   70    2     20    40    3     0.25  0.046  0.1   3
Spring   3 h      0.131   30    3     70    30    2     0.35  0.006  0.05  1
Summer   10 min   0.041   80    1     30    50    2     0.4   0.026  0.2   1
Summer   1 h      0.092   70    2     40    70    1     0.2   0.031  0.05  2
Summer   3 h      0.179   50    3     20    60    2     0.15  0.016  0.3   3
Autumn   10 min   0.038   55    1     30    30    3     0.35  0.041  0.1   3
Autumn   1 h      0.072   20    2     20    30    6     0.25  0.031  0.15  3
Autumn   3 h      0.114   20    3     40    20    2     0.45  0.021  0.25  1
Winter   10 min   0.043   35    1     20    30    1     0.25  0.026  0.1   2
Winter   1 h      0.098   50    1     20    40    4     0.2   0.006  0.3   2
Winter   3 h      0.156   30    2     30    30    5     0.15  0.011  0.2   1
To have a fair comparison when choosing the best hyperparameter configurations for the deep ANNs, the learnable hyperparameters, namely dropout rate (D𝑟), learning rate (L𝑟) and momentum rate (M𝑟), are taken into consideration. For the CNN, LSTM and FFNN models, D𝑟 is set to 0.3, 0.25 and 0.3, respectively. L𝑟 is equal to 0.006, 0.021 and 0.36 for the CNN, LSTM and FFNN models, respectively. Finally, M𝑟 is set to 0.05, 0.3 and 0.2 for the CNN, LSTM and FFNN models, respectively. These values have been chosen by trial and error through a grid search strategy. For the other algorithms, the optimal parameter values reported in their corresponding papers are used in the experiments. The number of runs and the number of iterations for all baseline models are the same as for our proposed Evol-CNN model. All of the algorithms are implemented using Python 3.7 on an NVIDIA GTX 1080 Ti GPU with an Intel Core i7 CPU and 32 GB RAM.

Tables 3–6 show the average RMSE and MAPE of the different methods for 10 min, 1 h and 3 h ahead forecasting of wind power data points in the different seasons. The RMSE and MAPE produced by the different algorithms for the spring dataset are tabulated in Table 3, showing that the Evol-CNN attains the lowest RMSE and MAPE at each prediction step (from the 10 min to the 3 h forecasting horizon). In Table 4, the Evol-CNN algorithm has higher prediction accuracy than the other twelve benchmarks for the summer season. Moreover, the PR and SVR perform worse than the neural network family of algorithms. This happens because the irregularity and nonlinearity of wind power data are very high, and these two methods are not able to compete with ANN algorithms. Among deep ANNs, the DeepHybrid algorithm outperforms the other deep ANN frameworks. However, the Evol-CNN has a higher modelling capability for wind power forecasting in this case.

Fig. 4. Actual vs predicted values of Evol-CNN.

Fig. 5. Convergence curves of proposed Evol-CNN algorithm.

As can be seen in Table 5, for ultrashort-term predictions, the PR model has reasonably good performance; however, it yields poor results as the time steps increase. With longer horizons, the SVR and ANN methods have significantly smaller values of RMSE and MAPE compared to PR. LSTM improves the RMSE and MAPE for 10 min, 1 h and 3 h compared to the two other NN methods, CNN and FFNN. The DE-LSTM framework outperforms LSTM in all seasons because the differential mutation operator stabilizes the search space of the LSTM algorithm and increases its accuracy. Among the deep ANNs, however, DeepHybrid outperforms the others. Among all methods, Evol-CNN has the best forecasting performance for the different horizons. The RMSEs and MAPEs produced for the winter season by the twelve benchmark methods are tabulated in Table 6, indicating that the Evol-CNN outperforms the other benchmarks at all three time horizons. This is primarily due to the extraction of more substantive features through the CNN representation and to the robustness of the extracted features resulting from the optimization process in IGWO.

Table 7 presents the best architectures, those with the lowest RMSE, found by Evol-CNN for the three time steps of the four seasons. The overall conclusion drawn from this table is that the values chosen by the Evol-CNN are not computationally expensive. For example, for N𝑓 the algorithm chooses values that range from 20 to 80, far from the upper end of the N𝑓 interval, which is 300. Thus, for network training, the Evol-CNN chooses moderate values with lower computational costs.
To present the performance of the Evol-CNN algorithm intuitively, the test dataset of the wind power time series for the spring season and its predicted values are shown in Fig. 4. In this figure, the blue and red lines indicate the actual and predicted wind power data points, respectively. For predicting the next 10 min interval, the two lines almost overlap, meaning that the predicted values are close to the actual data points. As the horizon increases, however, the performance for predicting the next 1 h and 3 h decreases. This is expected, since it is more difficult to forecast wind power 1 h and 3 h ahead than 10 min ahead.

Fig. 5 illustrates the convergence curves of the Evol-CNN algorithm using 10 independent runs for the spring dataset. According to this figure, as the forecasting horizon increases, the prediction error increases. Moreover, convergence is much easier for a forecasting horizon of 10 min ahead than for 1 h and 3 h ahead. Finally, for all forecasting horizons, the optimization process converges properly toward the end of the iterations.

Fig. 6. Violin plots of hyperparameters generated by Evol-CNN for 10 min interval.

Figs. 6–8 show the violin plots of the nine optimized hyperparameters for the three horizons of the spring season. These figures are important since they provide valuable information about the selection of the main hyperparameter values in the CNN architectural design procedure. For instance, for the dropout rate, Fig. 8 shows that the appropriate values for this hyperparameter are around 0.3. On the other hand, the value of 0.6 is not suggested for CNN training, since few of the dropout values found during the 10 runs of Evol-CNN lie near it. The same interpretation applies to the other hyperparameters in Figs. 6–8.
To determine the statistical significance of the differences between the performance of the Evol-CNN and the other benchmarks, a T-test is conducted. This test is carried out on the Evol-CNN results against each of the other benchmarks at the 5% significance level with 3 degrees of freedom. Table 8 lists the 𝑝 values obtained from the T-test. These 𝑝 values show that the null hypothesis (no significant difference) is rejected at the 5% significance level in all cases. Therefore, we can conclude that the proposed Evol-CNN model is significantly better than the other compared models at the three horizons of each season dataset.

4. Conclusion

This paper presents a novel algorithm called Evol-CNN, which combines deep CNNs with an improved version of the GWO algorithm for wind power forecasting. The aim of this algorithm is to optimize the CNN hyperparameters in a discrete space in order to improve the accuracy of wind power forecasting. We also use the MI strategy to obtain the optimal features for our proposed model. To demonstrate the effectiveness of the Evol-CNN, its performance is compared with twelve forecasting benchmarks on an Australian wind farm dataset for three different horizons. Across the different short-horizon time steps and scenarios, Evol-CNN showed better performance than the other benchmarks in terms of the RMSE and MAPE evaluation metrics.

Fig. 7. Violin plots of hyperparameters generated by Evol-CNN for 1 h interval.

Table 8
𝑝 values of T-test for Evol-CNN forecasting results vs other models.
Season  Horizon PR        AR        ARMA      ARIMA     SVR       FFNN      CNN       LSTM      DE-LSTM   SAE       DeepHybrid  GWO-CNN
Spring  10 min  9.81E−07  5.92E−06  5.53E−06  4.91E−06  1.66E−05  1.58E−05  1.54E−04  3.61E−03  8.14E−04  1.90E−03  1.16E−02    1.03E−02
Spring  1 h     7.98E−07  8.97E−06  8.41E−06  7.93E−06  6.15E−06  5.22E−06  5.13E−05  1.18E−05  1.03E−05  1.12E−04  1.09E−04    7.66E−03
Spring  3 h     3.27E−06  2.77E−06  2.41E−06  1.93E−06  8.87E−07  5.28E−07  7.78E−06  6.12E−05  3.18E−04  2.35E−03  1.61E−03    1.32E−03
Summer  10 min  3.98E−07  3.51E−07  3.21E−07  2.81E−07  1.09E−06  2.23E−04  1.91E−04  1.69E−04  1.15E−03  5.99E−03  1.63E−02    1.23E−02
Summer  1 h     1.31E−06  1.26E−06  1.02E−06  1.12E−06  1.61E−06  4.59E−05  4.26E−04  4.11E−04  6.25E−04  1.24E−03  9.83E−04    1.11E−03
Summer  3 h     1.45E−06  1.21E−06  1.02E−06  9.11E−05  1.16E−04  3.72E−07  2.71E−05  1.65E−04  1.54E−04  4.57E−05  3.43E−03    1.78E−03
Autumn  10 min  7.69E−07  6.44E−07  6.11E−07  3.16E−07  2.73E−06  2.56E−06  2.90E−04  3.18E−04  4.99E−04  9.00E−05  3.01E−03    1.91E−03
Autumn  1 h     2.45E−05  1.22E−05  7.81E−04  3.21E−04  2.85E−06  6.15E−06  1.54E−04  7.51E−05  1.16E−04  4.19E−04  1.92E−04    1.25E−03
Autumn  3 h     9.59E−05  8.12E−05  7.77E−05  7.13E−05  5.33E−06  2.12E−05  1.85E−05  1.58E−04  3.95E−04  4.57E−05  2.09E−03    1.56E−03
Winter  10 min  4.57E−07  9.11E−06  6.56E−06  4.13E−06  2.93E−06  2.77E−06  2.02E−06  4.43E−05  2.63E−04  4.57E−05  5.33E−04    2.46E−04
Winter  1 h     3.96E−06  1.90E−06  1.01E−06  9.72E−05  1.43E−04  6.12E−05  2.90E−04  9.27E−05  2.02E−03  2.18E−03  3.61E−03    2.25E−03
Winter  3 h     1.67E−07  6.22E−06  2.32E−06  1.13E−06  2.33E−03  9.43E−06  3.98E−04  1.16E−04  5.21E−04  2.22E−03  6.83E−04    3.15E−04

Fig. 8. Violin plots of hyperparameters generated by Evol-CNN for 3 h interval.

CRediT authorship contribution statement

Seyed Mohammad Jafar Jalali: Investigation, Visualization, Writing – original draft. Sajad Ahmadian: Methodology, Data curation. Mahdi Khodayar: Formal analysis. Abbas Khosravi: Supervision. Miadreza Shafie-khah: Conceptualization. Saeid Nahavandi: Validation. João P.S. Catalão: Writing – review & editing.
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

J.P.S. Catalão acknowledges the support by FEDER funds through COMPETE 2020 and by Portuguese funds through FCT, under POCI-01-0145-FEDER-029803 (02/SAICT/2017).

References

[1] Luo X, Sun J, Wang L, Wang W, Zhao W, Wu J, et al. Short-term wind speed forecasting via stacked extreme learning machine with generalized correntropy. IEEE Trans Ind Inf 2018;14(11):4963–71.
[2] Jalali SMJ, Khodayar M, Ahmadian S, Noman MK, Khosravi A, Islam SMS, et al. A new uncertainty-aware deep neuroevolution model for quantifying tidal prediction. In: 2021 IEEE industry applications society annual meeting. IEEE; 2021, p. 1–6.
[3] Jalali SMJ, Khodayar M, Khosravi A, Osório GJ, Nahavandi S, Catalão JP. An advanced generative deep learning framework for probabilistic spatio-temporal wind power forecasting. In: 2021 IEEE international conference on environment and electrical engineering and 2021 IEEE industrial and commercial power systems Europe. IEEE; 2021, p. 1–6.
[4] Jalali SMJ, Ahmadian S, Khodayar M, Khosravi A, Ghasemi V, Shafie-khah M, et al. Towards novel deep neuroevolution models: Chaotic levy grasshopper optimization for short-term wind speed forecasting. Eng Comput 2021;1–25.
[5] Khodayar M, Khodayar ME, Jalali SMJ. Deep learning for pattern recognition of photovoltaic energy generation. Electr J 2021;34(1):106882.
[6] Jalali SMJ, Ahmadian S, Khosravi A, Shafie-khah M, Nahavandi S, Catalão JP. A novel evolutionary-based deep convolutional neural network model for intelligent load forecasting. IEEE Trans Ind Inf 2021;17(12):8243–53.
[7] Jalali SMJ, Ahmadian S, Kavousi-Fard A, Khosravi A, Nahavandi S. Automated deep CNN-LSTM architecture design for solar irradiance forecasting. IEEE Trans Syst Man Cybern A 2021;52(1):54–65.
[8] Saffari M, Khodayar M, Jalali SMJ, Shafie-khah M, Catalão JP. Deep convolutional graph rough variational auto-encoder for short-term photovoltaic power forecasting. In: 2021 international conference on smart energy systems and technologies. IEEE; 2021, p. 1–6.
[9] Khodayar M, Kaynak O, Khodayar ME. Rough deep neural architecture for short-term wind speed forecasting. IEEE Trans Ind Inf 2017;13(6):2770–9.
[10] Hill DC, McMillan D, Bell KR, Infield D. Application of auto-regressive models to UK wind speed data for power system impact studies. IEEE Trans Sustain Energy 2011;3(1):134–41.
[11] Khodayar M, Wang J, Manthouri M. Interval deep generative neural network for wind speed forecasting. IEEE Trans Smart Grid 2018;10(4):3974–89.
[12] Choi I-J, Park R-S, Lee J. Impacts of a newly-developed aerosol climatology on numerical weather prediction using a global atmospheric forecasting model. Atmos Environ 2019;197:77–91.
[13] do Nascimento Camelo H, Lucio PS, Junior JBVL, de Carvalho PCM, dos Santos DvG. Innovative hybrid models for forecasting time series applied in wind generation based on the combination of time series models with artificial neural networks. Energy 2018;151:347–57.
[14] Zhang J, Wei Y, Tan Z. An adaptive hybrid model for short term wind speed forecasting. Energy 2019;115615.
[15] Wang Y, Hu Q, Srinivasan D, Wang Z. Wind power curve modeling and wind power forecasting with inconsistent data. IEEE Trans Sustain Energy 2018;10(1):16–25.
[16] Liu Y, Qin H, Zhang Z, Pei S, Wang C, Yu X, et al. Ensemble spatiotemporal forecasting of solar irradiation using variational Bayesian convolutional gate recurrent unit network. Appl Energy 2019;253:113596.
[17] Xie W, Zhang P, Chen R, Zhou Z. A nonparametric Bayesian framework for short-term wind power probabilistic forecast. IEEE Trans Power Syst 2018;34(1):371–9.
[18] Ahmadian S, Moradi P, Akhlaghian F. An improved model of trust-aware recommender systems using reliability measurements. In: 2014 6th Conference on information and knowledge technology. IEEE; 2014, p. 98–103.
[19] Tahmasebi F, Meghdadi M, Ahmadian S, Valiallahi K. A hybrid recommendation system based on profile expansion technique to alleviate cold start problem. Multimedia Tools Appl 2021;80(2):2339–54.
[20] Ahmadian M, Ahmadi M, Ahmadian S, Jalali SMJ, Khosravi A, Nahavandi S. Integration of deep sparse autoencoder and particle swarm optimization to develop a recommender system. In: 2021 IEEE international conference on systems, man, and cybernetics. IEEE; 2021, p. 2524–30.
[21] Moradi P, Rezaimehr F, Ahmadian S, Jalili M. A trust-aware recommender algorithm based on users overlapping community structure. In: 2016 sixteenth international conference on advances in ICT for emerging regions. IEEE; 2016, p. 162–7.
[22] Hasani H, Jalali SMJ, Rezaei D, Maleki M. A data mining framework for classification of organisational performance based on rough set theory. Asian J Manag Sci Appl 2018;3(2):156–80.
[23] Jalali SMJ, Hedjam R, Khosravi A, Heidari AA, Mirjalili S, Nahavandi S. Autonomous robot navigation using moth-flame-based neuroevolution. In: Evolutionary machine learning techniques. Springer; 2020, p. 67–83.
[24] Jalali SMJ, Khosravi A, Kebria PM, Hedjam R, Nahavandi S. Autonomous robot navigation system using the evolutionary multi-verse optimizer algorithm. In: 2019 IEEE international conference on systems, man and cybernetics. IEEE; 2019, p. 1221–6.
[25] Kong X, Liu X, Shi R, Lee KY. Wind speed prediction using reduced support vector machines with feature selection. Neurocomputing 2015;169:449–56.
[26] Hu Q, Zhang R, Zhou Y. Transfer learning for short-term wind speed prediction with deep neural networks. Renew Energy 2016;85:83–95.
[27] Marugán AP, Márquez FPG, Perez JMP, Ruiz-Hernández D. A survey of artificial neural network in wind energy systems. Appl Energy 2018;228:1822–36.
[28] Qian Z, Pei Y, Zareipour H, Chen N. A review and discussion of decomposition-based hybrid models for wind energy forecasting applications. Appl Energy 2019;235:939–53.
[29] Ahmadian S, Khanteymoori AR. Training back propagation neural networks using asexual reproduction optimization. In: 7th conference on information and knowledge technology. IEEE; 2015, p. 1–6.
[30] Liu X, Zhang H, Kong X, Lee KY. Wind speed forecasting using deep neural network with feature selection. Neurocomputing 2020;397:393–403.
[31] Chen J, Zhu Q, Li H, Zhu L, Shi D, Li Y, et al. Learning heterogeneous features jointly: A deep end-to-end framework for multi-step short-term wind power prediction. IEEE Trans Sustain Energy 2019.
[32] Wang K, Qi X, Liu H, Song J. Deep belief network based k-means cluster approach for short-term wind power forecasting. Energy 2018;165:840–52.
[33] Liu H, Mi X, Li Y. Smart multi-step deep learning model for wind speed forecasting based on variational mode decomposition, singular spectrum analysis, LSTM network and ELM. Energy Convers Manage 2018;159:54–64.
[34] Wang H-z, Li G-q, Wang G-b, Peng J-c, Jiang H, Liu Y-t. Deep learning based ensemble approach for probabilistic wind power forecasting. Appl Energy 2017;188:56–70.
[35] Abedinia O, Lotfi M, Bagheri M, Sobhani B, Shafie-khah M, Catalao JP. Improved EMD-based complex prediction model for wind power forecasting. IEEE Trans Sustain Energy 2020.
[36] Hong Y-Y, Rioflorido CLPP. A hybrid deep learning-based neural network for 24 h ahead wind power forecasting. Appl Energy 2019;250:530–9.
[37] Abedinia O, Bagheri M, Naderi MS, Ghadimi N. A new combinatory approach for wind power forecasting. IEEE Syst J 2020.
[38] Hu Y-L, Chen L. A nonlinear hybrid wind speed forecasting model using LSTM network, hysteretic ELM and differential evolution algorithm. Energy Convers Manage 2018;173:123–42.
[39] Han L, Zhang R, Wang X, Bao A, Jing H. Multi-step wind power forecast based on VMD-LSTM. IET Renew Power Gener 2019;13(10):1690–700.
[40] Stanley KO, Clune J, Lehman J, Miikkulainen R. Designing neural networks through neuroevolution. Nat Mach Intell 2019;1(1):24–35.
[41] Mousavirad SJ, Jalali SMJ, Ahmadian S, Khosravi A, Schaefer G, Nahavandi S. Neural network training using a biogeography-based learning strategy. In: International conference on neural information processing. Springer; 2020, p. 147–55.
[42] Ahmadian S, Jalali SMJ, Raziani S, Chalechale A. An efficient cardiovascular disease detection model based on multilayer perceptron and moth-flame optimization. Expert Syst 2021;e12914.
[43] Ahmadian S, Jalali SMJ, Islam SMS, Khosravi A, Fazli E, Nahavandi S. A novel deep neuroevolution-based image classification method to diagnose coronavirus disease (COVID-19). Comput Biol Med 2021;139:104994.
[44] Jalali SMJ, Ahmadian M, Ahmadian S, Khosravi A, Alazab M, Nahavandi S. An oppositional-Cauchy based GSK evolutionary algorithm with a novel deep ensemble reinforcement learning strategy for COVID-19 diagnosis. Appl Soft Comput 2021;111:107675.
[45] Li Z, He Y, Li H, Li Y, Guo X. A novel discrete grey wolf optimizer for solving the bounded Knapsack problem. In: International symposium on intelligence computation and applications. Springer; 2018, p. 101–14.
[46] Mirjalili S, Mirjalili SM, Lewis A. Grey wolf optimizer. Adv Eng Softw 2014;69:46–61.
[47] Cutler N, Outhred H, MacGill I. Final report on UNSW project for AEMO to develop a prototype wind power forecasting tool for potential large rapid changes in wind power. The Centre for Energy and Environmental Markets; 2011.
[48] Sun Y, Xue B, Zhang M, Yen GG. Evolving deep convolutional neural networks for image classification. IEEE Trans Evol Comput 2019.
[49] Santamaría-Bonfil G, Reyes-Ballesteros A, Gershenson C. Wind speed forecasting for wind farms: A method based on support vector regression. Renew Energy 2016;85:790–809.
[50] Peng L, Liu S, Liu R, Wang L. Effective long short-term memory with differential evolution algorithm for electricity price prediction. Energy 2018;162:1301–14.