International Journal of Electrical Power and Energy Systems 141 (2022) 108143


Contents lists available at ScienceDirect

International Journal of Electrical Power and Energy Systems

journal homepage: www.elsevier.com/locate/ijepes

An advanced short-term wind power forecasting framework based on the
optimized deep neural network models
Seyed Mohammad Jafar Jalali a, Sajad Ahmadian b, Mahdi Khodayar c, Abbas Khosravi a,
Miadreza Shafie-khah d, Saeid Nahavandi a, João P.S. Catalão e,∗

a Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Geelong, Australia
b Faculty of Information Technology, Kermanshah University of Technology, Kermanshah, Iran
c Department of Computer Science, University of Tulsa, USA
d School of Technology and Innovations, University of Vaasa, Vaasa, Finland
e Faculty of Engineering of University of Porto and INESC TEC, Porto, Portugal

A R T I C L E I N F O

Keywords:
Deep neural networks
Evolutionary computation
Neuroevolution
Optimization
Wind power forecasting

A B S T R A C T

With the continued growth of wind power penetration into conventional power grid systems, wind power
forecasting plays an increasingly important role in organizing and deploying electrical and energy systems.
Wind power time series, however, often exhibit non-linear and non-stationary characteristics, making them
quite challenging to estimate precisely. This paper proposes a novel hybrid model named Evol-CNN to
predict short-term wind power at 10-min intervals up to 3 h ahead, based on a deep convolutional
neural network (CNN) and an evolutionary search optimizer. Specifically, we develop an improved version of
the Grey Wolf Optimization (GWO) algorithm by incorporating two effective modifications into its original
structure. The proposed GWO algorithm is more effective than the original version because it converges
faster and is able to escape from local optima. It is used to find the optimal values of the
hyperparameters of the deep CNN model, and the resulting CNN model is then employed to predict wind
power time series. The main advantage of the proposed Evol-CNN model is that it enhances the capability of
time series forecasting models to obtain more accurate predictions. Several forecasting benchmarks are
compared with the Evol-CNN model to assess its effectiveness. The simulation results indicate that Evol-CNN
has a significant advantage over the competing benchmarks and achieves the minimum error for 10-min,
1-hr and 3-hr ahead forecasting.
1. Introduction

In recent years, wind energy has gained remarkable attention as
a clean source of electricity that addresses crucial environmental
concerns [1–4]. The stability and reliability of energy production and the
reduction of greenhouse gas emissions are significant challenges that have
recently emerged in the domain of power engineering [2,5–8]. The accurate
prediction of wind power, which is a highly variable time
series with a stochastic and intermittent nature, plays a key role in
overcoming such issues [1,9]. Even though the wind power generated by
a wind turbine depends heavily on atmospheric conditions, the
accurate prediction of wind power results in improved wind energy
predictions. Consequently, the recent literature presents a broad
variety of time series forecasting algorithms for the prediction of wind
power time series. The nature of wind data is stochastic and chaotic,
which means that predicting wind power with linear models is a very

∗ Corresponding author.
E-mail address: catalao@fe.up.pt (J.P.S. Catalão).

challenging task [10]. Furthermore, the length of the prediction horizon
correlates negatively with the accuracy of the forecasting algorithm [9].
Ultrashort-term wind forecasting refers to forecasting wind data
from a few minutes to one hour ahead. It is primarily
aimed at electricity market clearing, real-time grid operations
and regulatory activities [11]. Short-term predictions generally cover a
duration from one hour to several hours ahead. This type of forecasting
is typically used for unit commitment and operational safety in the
energy industry [11].

The wind power forecasting methodologies presented in the recent
technical literature can be categorized into four classes:
(1) The persistent model (PR) assumes that future wind measurements
take the same value as the most recent historical measurement.
This smoothness assumption yields a simple
method with the lowest computational requirements; however,
Available online 6 April 2022
0142-0615/© 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
https://doi.org/10.1016/j.ijepes.2022.108143
Received 11 October 2021; Received in revised form 6 February 2022; Accepted 18 March 2022




Nomenclature

$w_{ik}^{l-1}$ Kernel of the $i$th neuron at layer $l-1$ towards the $k$th neuron of the layer $l$
$\alpha$ Alpha wolf
$\beta$ Beta wolf
$\delta$ Delta wolf
$\Delta_{k}^{l}$ Delta of the $k$th neuron at layer $l$
$\Gamma$ Standard gamma function
$\hat{y}_i$ Vector of predicted wind power data
$\omega$ Omega wolf
$X_p$ Position vectors of the prey
$\vec{X}$ Position vectors of the grey wolf
$b_{k}^{l}$ Scalar bias of the $k$th neuron at layer $l$
$b_j$ Total number of the item type $j$
$conv1Dz(., .)$ Fully convolutional operation in 1-D space using the zero padding
$l$ Current convolution layer
$l-1$ Previous convolution layer
$lb$ Lower bounds of the search space
$Levy()$ Levy flight function
$rev(.)$ Array reversing
$s_{i}^{l-1}$ Output of the $i$th neuron of layer $l-1$
$ub$ Upper bounds of the search space
$x_{k}^{l}$ $k$th feature of input
$x_{ij}$ Integer values of the $X_i$ position in $j$th dimension
$y_{k}^{l}$ Intermediate output of neuron from the input
$y_i$ Vector of observed wind power data
$y(t)$ Actual wind power for the time step $t$
$y_{ij}$ Transformed real number of the $j$th dimension of individual $i$
AEMO Australian Energy Market Operator
AI Artificial intelligence
ANN Artificial neural network
AR Auto regressive
ARIMA Auto regressive integrated moving average
ARMA Auto regressive moving average
B𝑠 Batch size
BaNN Bagging neural network
BP Back propagation
CNN Convolutional neural network
D𝑟 Dropout rate
DE Differential evolution
Evol-CNN Proposed evolutionary CNN framework
FFNN Feed-forward neural network
FP Forward propagation
GWO grey wolf optimizer
IGWO Improved version of GWO
K𝑠 Kernel size
L𝑟 Learning rate
LSTM Long short-term memory
M𝑟 Momentum rate

the accuracy of this model declines significantly as the prediction
horizon is extended [11].
(2) Physical models are based on numerical weather prediction (NWP)
and consider different meteorological parameters such

MAE Mean absolute error
MAPE Mean absolute percentage error
MI Mutual information
MP𝑠 Maxpooling size
MSE Mean square error
N𝑐 Number of convolutional layers
N𝑒 Number of epochs
N𝑓 Number of filters
PR Persistent model
PSO Particle swarm optimization
RBFNN Radial basis function neural network
ReLU Rectified linear unit
RMSE Root mean square error
SAE Stacked auto-encoder
SGD Stochastic Gradient Descent
SVR Support vector regression

as temperature, pressure, and obstacles. NWP yields accurate predictions
for long-term wind forecasting tasks and is used in large-scale
prediction problems covering very wide areas. The major
disadvantage of this method is its large time and memory complexity,
which is typically addressed by using supercomputers. Thus, this method
cannot be employed for real-time predictions with limited computational
resources [12].
(3) Statistical methods learn the mathematical relationship between
different variables corresponding to real-time wind data samples. This
category of approaches includes the auto regressive (AR), auto regressive
moving average (ARMA), and auto regressive integrated moving
average (ARIMA) models, the Bayesian approach, and grey predictions.
In this domain, the authors of [13] developed a novel version of ARIMA
for the short-term prediction of wind data that eases decision
making in wind sectors. Also, the research work in [14] provides a
novel signal decomposition technique based on ARIMA to capture the
general trend of wind speed datasets. Furthermore, a robust spline
regression model optimized by the variational Bayesian method [16]
is proposed in [15]. In this class of approaches, an infinite
Markov switching AR model is also presented in [17] as a non-parametric
Bayesian framework with a flexible posterior distribution for accurate
wind forecasting on large-scale datasets.
(4) Artificial intelligence (AI) methodologies have shown great
potential for solving numerous types of real-world applications [18–24].
AI algorithms such as artificial neural networks, support vector
regression [25], transfer learning [26] and fuzzy systems have led to novel
wind prediction algorithms in very recent studies [27]. Artificial neural
networks (ANNs) are widely used as nonlinear models that extract
powerful relationships between the input wind measurements and the
future wind speed values. In this group, ANNs have been successfully applied
to the prediction of different weather time series with various time
scales [28]. Feed-forward ANNs, recurrent ANNs, radial basis function
(RBF) ANNs, and adaptive wavelet ANNs have recently been proposed for wind
speed and wind power forecasting applications.

While these methodologies provide a highly nonlinear regression
model, they cannot handle the large variations in wind time series
because they lack sufficient computational layers. Also, determining the optimal
values of ANN parameters such as weights and biases is a challenging
task in this research area [29]. As a result, recent studies have
focused on deep neural networks, which consist of a large number of latent
layers, to extract features from wind power data in a supervised or
unsupervised manner [30].

Among recent deep learning models, stacked autoencoders [31] are
presented to develop an accurate multi-scale prediction method that
learns heterogeneous wind features in a real-time fashion. In [32],



a deep belief network is applied that can learn the most significant
unsupervised wind features in a probabilistic manner. Also, the authors
of [11] developed an interval version of this model that can address
the uncertainties and noise in wind measurements. The Long Short-Term
Memory network [33] is a recurrent version of deep ANNs that applies
a large number of temporal latent layers to effectively learn powerful
temporal characteristics of the wind power data. In [34], an ensemble
algorithm based on deep CNN and wavelet transform is proposed for
wind power forecasting using a real wind farm dataset from China.
The research work presented in [35] proposes an improved empirical
mode decomposition combined with a bagging neural network (BaNN), the
K-means clustering method, and the shark smell optimization (SSO)
algorithm to predict wind power data. In [36], a hybrid deep CNN with
a radial basis function neural network (RBFNN) and double Gaussian
function (DGF) approaches is applied to 24 hr-ahead short-term wind
power forecasting. In [37], a combination of a two-dimensional
CNN and an improved wavelet transform, trained by an improved version of
particle swarm optimization (PSO), is proposed to forecast wind power
for short- and long-term forecasting horizons. The authors of [38] apply a
differential evolution (DE) algorithm to optimize the hyperparameters of deep long
short-term memory (LSTM) neural networks for a time series forecasting
problem. In another work [39], a forecasting algorithm based on
variational-mode decomposition long short-term memory (VMD-LSTM)
is proposed in order to improve the accuracy of multi-step wind power
forecasting.

In most of the previous works, the architecture of deep learning
algorithms is manually designed, which is a time-consuming and difficult
task [40–42]. In order to overcome these issues, advanced deep
neuroevolution search methods have been introduced to design this
procedure automatically and efficiently [43,44]. These methods have
been successfully implemented and used in several real-world problems
such as computer vision, fraud detection and medical domains [40].
Deep neuroevolution is defined as the process of optimizing the
architecture of deep learning networks by evolutionary methods in order
to achieve the highest accuracy and obtain the optimal architectures for
DNNs [40].

In this paper, a novel deep neuroevolution method is presented that
employs the convolution operation to extract powerful temporal
features from wind power signals. Our model is optimized by a new
deep neuroevolution algorithm for short-term wind power prediction.
In contrast to classic deep models, we introduce a modified version of
the grey wolf optimizer (GWO) with CNNs to overcome local minimum
issues in the optimization of the hyperparameters and obtain the best
architectures with the highest accuracy. Besides, we employ the mutual
information (MI) feature extraction strategy in order to obtain the
optimal input features for our proposed deep learning model. Our
proposed forecasting algorithm is applied to both ultrashort-term and
short-term forecasting horizons. The main contributions of
this study can be summarized in four main categories:

• An improved version of GWO algorithm is proposed as a novel
evolutionary optimization approach to strengthen the search
space capability and reduce the possibility of trapping into local
optima.

• An efficient forecasting method is developed based on the im-
proved GWO algorithm and CNN deep learning model for wind
power signals to obtain the highest prediction accuracy. To the
best of our knowledge, this is the first attempt to solve wind
power prediction problem by a deep neuroevolution algorithm.
To this end, the proposed method considers nine critical hyper-
parameters of deep CNNs to be optimized by the improved GWO
algorithm.

• An effective mutual information feature extraction strategy is
applied to increase the accuracy of the wind power forecasting
procedure and gain the optimal input features for such a highly
volatile signal.

• Our simulations show the superiority of the proposed method
over the state-of-the-art and recently published works in terms
of 10 min up to 3 h ahead predictions for two case studies.

The rest of the paper is organized as follows: In Section 2, the
theories and components of the proposed method are presented. Section 3
presents the numerical analysis for wind power data and finally, the
paper is concluded in Section 4.

2. Proposed method

In this section, we integrate a deep neural network with an
evolutionary search strategy to propose a novel wind power
forecasting method called Evol-CNN. To this end, the available past
wind power time series data are used as the input of a deep
convolutional neural network for forecasting future wind power.
The deep CNN architecture has several hyperparameters whose
values affect its performance. In other words,
the accuracy of the deep CNN for forecasting wind power can be improved
by determining the optimal values of its hyperparameters. Therefore,
we develop an improved version of the grey wolf optimization algorithm
as an evolutionary search strategy to obtain the optimal values of the
hyperparameters of the deep CNN. In the following subsections, we
first discuss the representation of the solutions in the evolutionary
optimization algorithm. Then, the fitness function used in the proposed
method is introduced in detail. Finally, we discuss our novel search
strategy for the Evol-CNN algorithm.

2.1. Representation of solutions

In the proposed method, we adopt a one-dimensional convolutional
neural network (1D-CNN), since the data used for time series
forecasting problems are generally one-dimensional. Thus, we apply the
1D-CNN to forecast the unknown values of wind power data. The
1D-CNN has several hyperparameters, and the aim of using the improved
GWO algorithm is to obtain optimal values for these hyperparameters.
To this end, we optimize nine critical hyperparameters with Evol-CNN.
Therefore, we define each solution in the population space of the
improved GWO algorithm as a vector of nine values corresponding to
the considered hyperparameters. On the other hand, the basic search
strategy of the GWO algorithm is mainly designed for optimization problems
with a continuous domain of individuals, while the hyperparameter
values of a CNN should mostly be discrete. Therefore, we use
an encoding transformation function that maps each real-valued vector,
i.e., the position of an individual in the continuous space,
to a new integer vector for an individual in the discrete space as
follows [45]:

\[ Q_{ij} = \left\lfloor b_j \cdot \frac{x_{ij} - lb}{ub - lb} + 0.5 \right\rfloor, \quad j = 1, \ldots, n \tag{1} \]

where 𝑥𝑖𝑗 represents the real value of the 𝑋𝑖 position in the 𝑗th
dimension and 𝑄𝑖𝑗 denotes the transformed integer number of the
𝑗th dimension of individual 𝑖. 𝑏𝑗 denotes the total number of the
item type 𝑗, and the lower and upper bounds of the search space are
represented by 𝑙𝑏 and 𝑢𝑏, respectively.
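
As an illustration of Eq. (1), the following minimal Python sketch maps a real-valued position to an integer vector. The function name `encode_position`, the use of a single scalar pair of bounds shared by all dimensions, and the example values are assumptions made for this sketch; they are not taken from the authors' implementation.

```python
import math


def encode_position(x, b, lb, ub):
    """Map a continuous position to integers, following Eq. (1).

    x  : real-valued position of one individual (one entry per dimension)
    b  : b_j, the total number of item types in each dimension
    lb : lower bound of the search space (assumed shared by all dimensions)
    ub : upper bound of the search space (assumed shared by all dimensions)
    """
    if ub <= lb:
        raise ValueError("ub must be greater than lb")
    # Q_ij = floor(b_j * (x_ij - lb) / (ub - lb) + 0.5)
    return [math.floor(b_j * (x_j - lb) / (ub - lb) + 0.5)
            for x_j, b_j in zip(x, b)]


# Hypothetical example: three dimensions, each with 10 item types, in [0, 1].
print(encode_position([0.0, 0.5, 1.0], [10, 10, 10], lb=0.0, ub=1.0))
```

Adding 0.5 before taking the floor rounds to the nearest integer, so positions near the middle of a bin are not systematically pushed to the lower index.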

The overall schema of representation of solutions in the population
space of IGWO algorithm for the Evol-CNN is shown in Fig. 1.

2.2. Calculation of fitness function

Convolutional neural networks (CNNs) are a type of deep neural
network that generally processes complex and large datasets. Although
CNNs have been used effectively in a wide range of real-world
problems, relatively few studies have validated CNNs for wind
power forecasting. A CNN mostly involves three main types of layers:
convolutional layer(s), pooling layer(s), and a fully-connected layer. The
convolutional layer uses a mathematical operation known as ‘‘convolution’’ to


Fig. 1. Representation of each solution in population space of IGWO algorithm for Evol-CNN.

retrieve features from the raw data, whereas the pooling layer
is used to reduce the dimension of the input data. Finally, a
fully-connected layer at the end of the CNN produces the forecasts based on the
retrieved features. Every convolutional layer is designed to
derive patterns from the response component (wind power) as shown
below:

\[ y_{ik}^{k} = f\left( \left( W^{k} * h \right)_{ij} + b_k \right) \tag{2} \]

where 𝑓 denotes the activation function, $W^{k}$ is the kernel
weight and * is the convolution operator. We use the rectified linear
unit (ReLU) as the activation function in the present work.
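
To make the operation in Eq. (2) concrete, the sketch below computes one 1-D feature map with a ReLU activation in plain Python. The function name `conv1d_relu`, the 'valid' (no padding) windowing, and the use of cross-correlation without kernel flipping are assumptions of this example; the paper does not state these details at this point.

```python
def conv1d_relu(h, w, b):
    """One 1-D feature map: ReLU((w * h)_i + b) over all valid windows.

    h : input sequence (e.g. lagged wind power values)
    w : kernel weights
    b : scalar bias
    Assumption: 'valid' windows and no kernel flip (cross-correlation).
    """
    k = len(w)
    out = []
    for i in range(len(h) - k + 1):
        s = sum(h[i + m] * w[m] for m in range(k)) + b
        out.append(max(s, 0.0))  # ReLU activation
    return out


# Hypothetical example: a difference kernel on a rising signal.
print(conv1d_relu([1.0, 2.0, 3.0, 4.0], [-1.0, 1.0], 0.5))
```

With the kernel reversed to `[1.0, -1.0]`, every pre-activation becomes negative and the ReLU clips the whole feature map to zero.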

In the optimization process of the Evol-CNN, the mean square error
(MSE) is used to evaluate the performance of the CNN configured for each
solution in the search space, and it is considered as the fitness function.
Therefore, the fitness function used in the optimization process can be
calculated as follows:

\[ E_p = \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \tag{3} \]

where 𝑦𝑖 is a vector of observed wind power data points being predicted
and 𝑦̂𝑖 represents the vector of 𝑛 predictions generated from a sample
of 𝑛 wind power data points.
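
The fitness value of Eq. (3) can be sketched in a few lines of Python. The function name `mse_fitness` and the example numbers are hypothetical and serve only to illustrate the formula.

```python
def mse_fitness(y_true, y_pred):
    """Mean square error of Eq. (3); lower values mean a fitter solution."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


# Hypothetical observed and predicted wind power values.
print(mse_fitness([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```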

We use dropout and learning rate techniques in the fully-connected
layer as a form of regularization to prevent over-fitting during training.
The nonlinear transformation of the input data is carried out by the
convolution, pooling, and fully-connected layers. In
addition to these configurations, we use a momentum rate, which is a
powerful strategy that helps to improve both training speed and accuracy.
Thus, the distinguishing features of the input data for prediction are
learnt in this procedure. Finally, training is performed by the
stochastic gradient descent (SGD) method.

The aim of our novel Evol-CNN algorithm is to obtain the optimal
values of the CNN hyperparameters in order to improve the performance of
wind power forecasting. To this end, we consider the set of CNN
hyperparameter values as a vector representing a solution in the
search space. Then, the proposed optimization approach can be applied
to find the optimal values of the CNN hyperparameters by exploring the
search space. In the optimization process, we also need
to define a fitness function to evaluate the quality of each solution. To
this end, we apply the CNN model to the training samples, and
the prediction error obtained by the CNN is considered as the fitness
value. Suppose that the historical wind power values for 𝑀 time
steps are represented by a vector as follows:

𝑦 = (𝑦(0), 𝑦(1),… , 𝑦(𝑀−1)) (4)

where 𝑦(𝑡) represents the actual wind power for the time step 𝑡. Then,
we utilize CNN model to predict the wind power values for the next 𝑁
time steps. The predicted values of wind power for the next time steps
are expressed as follows:

\[ \vec{\hat{y}} = \left( \hat{y}_{(M)}, \hat{y}_{(M+1)}, \ldots, \hat{y}_{(M+N-1)} \right) \tag{5} \]
where 𝑦̂(𝑡) denotes the predicted wind power value for the time step
𝑡. After forecasting the wind power data using the CNN model, its error
is taken as the fitness value of the corresponding solution. To this
end, the MSE metric is used to calculate the error of the CNN as the fitness
value using Eq. (3). In the proposed model, the number of convolutional
layers is obtained automatically by the optimization strategy. Also,
one max pooling layer followed by a dense fully connected layer is
used for constructing the backbone of the CNN architectures.

2.3. Search strategy

In the Evol-CNN method, we design an improved version of grey
wolf optimization algorithm as an evolutionary search strategy to
obtain the optimal values of hyperparameters for CNN model. GWO
algorithm is a swarm evolutionary meta-heuristic inspired by encircling
and hunting behaviour of the grey wolves in nature [46]. We apply
each individual in GWO algorithm to configure a CNN based on the
obtained values of hyperparameters. Then, the wind power values in
training data can be predicted by the configured CNN.

The main optimization process of Evol-CNN starts with an ini-
tialization step where a number of individuals are initialized with
random values representing the hyperparameters values of CNN model.
Therefore, the number of dimensions in each individual vector is equal
to the number of hyperparameters optimized by the improved GWO al-
gorithm. Each individual represents a solution containing the values of
hyperparameters for CNN model. After the initialization step, the search
procedure of improved GWO algorithm continues with obtaining new
generations of the first population and repeating this step by a number
of predefined iterations to find the optimal solution corresponding to
the optimal values of hyperparameters.

When designing the improved GWO algorithm, we consider the best
solutions as the alpha (𝛼) wolves. The second and third best solutions,
respectively, are called beta (𝛽) and delta (𝛿) wolves. It is assumed
that the rest of the candidate solutions are omega (𝜔) wolves. Hunting
behaviour (optimization procedure) is driven by 𝛼, 𝛽, and 𝛿. These
three wolves accompany the other wolves. The encircling behaviour
is mathematically modelled using the following equations:

𝐷⃗ = |𝐶.𝑋𝑝(𝑡) − 𝑋⃗(𝑡)| (6)

𝑋⃗(𝑡 + 1) = |𝑋𝑝(𝑡) − 𝐴.𝐷⃗| (7)

where 𝑡 denotes the current iteration, 𝐴 and 𝐶 indicate coefficients,
and 𝑋𝑝 and 𝑋⃗ represent the position vectors of a prey and a grey wolf,
respectively.

The coefficients 𝐴 and 𝐶 are determined as follows:

𝐴 = 2𝑎.𝑟1 − 𝑎 (8)

𝐶 = 2.𝑟2 (9)

where the components of 𝑎 are decreased linearly from 2 to 0 during
the iterations and the vectors of 𝑟1 and 𝑟2 have random values in the
[0, 1] interval.



In order to model the hunting behaviour of the grey wolves math-
ematically, we presume that 𝛼 (the best candidate solution), 𝛽, and 𝛿
have better awareness of the possible location of the prey. Accordingly,
the first three best solutions that have been obtained so far have to
be saved and the other search agents (including the omega wolves)
need to change their positions based on the current location of the best
search agents. In this respect, the distances between any other wolves
(including Omegas) and these three best wolves are determined by the
following equations in the decreasing order of their fitness:

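\[ D_\alpha = \left| C_1 \cdot X_\alpha - \vec{X} \right|, \quad D_\beta = \left| C_2 \cdot X_\beta - \vec{X} \right|, \quad D_\delta = \left| C_3 \cdot X_\delta - \vec{X} \right| \tag{10} \]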

Using the Eqs. (6) and (7), these distances are applied to provide the
new position of wolf 𝑋⃗(𝑡 + 1).

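\[ X_1 = X_\alpha - A_1 \cdot D_\alpha, \quad X_2 = X_\beta - A_2 \cdot D_\beta, \quad X_3 = X_\delta - A_3 \cdot D_\delta \tag{11} \]

\[ \vec{X}(t+1) = \frac{X_1 + X_2 + X_3}{3} \tag{12} \]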

The best solution or prey is found by repetitively deploying the two
encircling and hunting operators.

In a nutshell, the GWO starts with a random grey wolf population.
The mechanism of searching is principally driven by 𝛼, 𝛽 and 𝛿. They
diverge in looking for prey when |𝐴| > 1 and converge in attacking
prey when |𝐴| < 1. Eventually, if the stop criterion is met, the optimal
solution (prey) is obtained.
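
The hunting update of Eqs. (8)–(12) can be sketched as follows. The names `linear_a` and `gwo_step`, the choice to draw fresh random numbers per dimension and per leader, and the toy positions are assumptions of this example rather than details confirmed by the text.

```python
import random


def linear_a(t, n_iter):
    """Linearly decrease a from 2 to 0 over n_iter iterations."""
    return 2.0 - 2.0 * t / n_iter


def gwo_step(x, leaders, a, rng=random):
    """Move one wolf x toward the alpha, beta and delta wolves.

    x       : current position (list of floats)
    leaders : [x_alpha, x_beta, x_delta]
    a       : current value of the control parameter a
    """
    new_x = []
    for j in range(len(x)):
        candidates = []
        for leader in leaders:
            A = 2.0 * a * rng.random() - a         # Eq. (8)
            C = 2.0 * rng.random()                 # Eq. (9)
            D = abs(C * leader[j] - x[j])          # Eq. (10)
            candidates.append(leader[j] - A * D)   # Eq. (11)
        new_x.append(sum(candidates) / len(candidates))  # Eq. (12)
    return new_x


# Hypothetical toy run: one wolf, three leaders, first of 20 iterations.
rng = random.Random(0)
leaders = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
print(gwo_step([5.0, 5.0], leaders, linear_a(0, 20), rng))
```

When 𝑎 reaches 0, 𝐴 is 0 as well, so the update collapses to the mean of the three leader positions; this is the converging regime |𝐴| < 1 described above.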

Although GWO is a robust optimizer that has demonstrated outstanding
efficiency across a number of optimization problems, this paper
offers a two-phase modification technique that further boosts the efficiency
of the algorithm. The proposed modifications of the
GWO algorithm are described in the following.

- First modification:
Based on the greedy selection (GS) procedure of the DE algorithm, we
employ the idea of ‘‘survival of the fittest’’ with the probability 𝑝. With
this technique, new dominant positions within each generation are carried
forward to be improved in the next generations, while worse positions
are ignored. The formula for this operator is as follows:

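\[ \vec{x}(t+1) = \begin{cases} \vec{x}(t), & f\left(\vec{x}_{new}(t)\right) > f\left(\vec{x}(t)\right) \text{ and } r_{new} < p \\ \vec{x}_{new}(t), & \text{otherwise} \end{cases} \tag{13} \]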

where 𝑓 (𝑋(𝑡)) represents the fitness of the last position, 𝑟new and 𝑝 denote
random values in the (0, 1) range, and 𝑋𝑛𝑒𝑤(𝑡) represents the
new position obtained by Eq. (12). In each iteration, the value of 𝑝
in Eq. (13) is drawn randomly from the [0, 1] range. The search
abilities are enhanced by incorporating GS into GWO, since each
leader wolf gets the opportunity to stay alive and afterwards share
its observed information with the other hunters in the next phases of the
search procedure.
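
A minimal Python sketch of the greedy selection rule in Eq. (13) is given below. It assumes a minimization problem (lower fitness is better, as with MSE); the function name `greedy_select` and the toy fitness function are hypothetical.

```python
import random


def greedy_select(x_old, x_new, fitness, p, rng=random):
    """Eq. (13): keep x_old when x_new is worse and r_new < p.

    fitness : callable returning an error value (lower is better)
    p       : probability threshold, drawn from [0, 1] in each iteration
    """
    r_new = rng.random()
    if fitness(x_new) > fitness(x_old) and r_new < p:
        return x_old
    return x_new


def sphere(x):
    """Toy stand-in for the CNN training error."""
    return sum(v * v for v in x)


print(greedy_select([1.0], [2.0], sphere, p=1.0))  # worse candidate rejected
print(greedy_select([2.0], [1.0], sphere, p=1.0))  # better candidate accepted
```

Because a worse candidate is rejected only when 𝑟new < 𝑝, some worse positions are still accepted for small 𝑝, which keeps a degree of randomness in the search.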

- Second modification:
In GWO, the parameter 𝐴 controls the step size of
the search agents, which tends to decrease linearly with iterations. In
this modification phase, we use the Lévy
flight strategy in order to tune the parameter 𝐴. This modification
continuously improves the exploration and exploitation potential of GWO.
Let the parameter ∞ denote the step size, as follows:

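\[ \infty \oplus \mathrm{Levy}(\beta) \sim 0.01 \, \frac{p}{|q|^{1/\beta}} \left( X_i^{k} - X_{best}^{k} \right) \tag{14} \]

where the values of 𝑝 and 𝑞 are defined by:

\[ p \sim N\left(0, \phi_u^2\right), \quad q \sim N\left(0, \phi_v^2\right) \tag{15} \]

\[ \phi_u = \left[ \frac{\Gamma(1+\beta) \times \sin(\pi \times \beta / 2)}{\Gamma[(1+\beta)/2] \times \beta} \right]^{1/\beta}, \quad \phi_v = 1 \tag{16} \]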
where 𝛤 denotes the standard gamma function in the interval [0, 2]. We
modify the parameter 𝐴 using Eq. (17) as follows:

𝐀 = 𝐿𝑒𝑣𝑦(𝑋) ∗ 𝐮 (17)

where 𝑋 represents the position of the wolves and 𝑢 is a random value
in the [0, 1] range. These concepts are used to improve the global
exploration as well as the local exploitation capacity of the conventional
algorithm and to deepen the searching advantages of GWO.
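
The Lévy-flight step of Eqs. (14)–(17) might be sketched as below. `phi_u` follows Eq. (16) as printed in this paper; the function names, the default 𝛽 = 1.5, and the per-dimension combination of Eqs. (14) and (17) in `levy_coefficient` are assumptions of this example, since the text does not spell out how 𝐿𝑒𝑣𝑦(𝑋) is evaluated.

```python
import math
import random


def phi_u(beta):
    """Scale of p according to Eq. (16) as given in the text."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta
    return (num / den) ** (1.0 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw p / |q|^(1/beta) with p ~ N(0, phi_u^2), q ~ N(0, 1), Eq. (15)."""
    p = rng.gauss(0.0, phi_u(beta))
    q = rng.gauss(0.0, 1.0)
    while q == 0.0:  # guard against division by zero
        q = rng.gauss(0.0, 1.0)
    return p / abs(q) ** (1.0 / beta)


def levy_coefficient(x, x_best, beta=1.5, rng=random):
    """One reading of Eqs. (14) and (17): A_j = 0.01 * step * (x_j - best_j) * u."""
    return [0.01 * levy_step(beta, rng) * (xj - bj) * rng.random()
            for xj, bj in zip(x, x_best)]


# Hypothetical toy positions.
rng = random.Random(42)
print(levy_coefficient([0.8, 0.2], [0.5, 0.5], rng=rng))
```

The heavy tail of the Lévy distribution occasionally produces large values of 𝐴, which is what allows a search agent to jump out of a local optimum.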

These two-stage modifications greatly improve the local exploitation
and global exploration of GWO. We name this new algorithm
improved GWO (IGWO). The flowchart of the proposed IGWO is
shown in Fig. 2. Besides, pseudo-code of our advanced wind power
forecasting framework is provided in Algorithm 1.

Algorithm 1 Pseudo-code of the proposed wind power forecasting model
1: Input: 𝑝𝑜𝑝_𝑠𝑖𝑧𝑒 (population size) and 𝑛 (maximum number of iterations).
2: Output: Predicted wind power.
3: Begin algorithm:
4: Split dataset into two sets including training set 𝑇𝑟 and test set 𝑇𝑒;
5: Initialize the grey wolf population 𝑋𝑖 (𝑖 = 1, 2,… , 𝑝𝑜𝑝_𝑠𝑖𝑧𝑒);
6: Initialize parameters 𝑎, 𝐴 and 𝐶;
7: for (each solution 𝑋𝑖 in the grey wolf population) do
8:     Set a CNN model based on the values of solution 𝑋𝑖 as the hyperparameters;
9:     Calculate the fitness of solution 𝑋𝑖 using Eq. (3) as the MSE error of CNN model obtained based on the training set 𝑇𝑟;
10: end for
11: Let 𝑋𝛼 be the best solution;
12: Let 𝑋𝛽 be the second best solution;
13: Let 𝑋𝛿 be the third best solution;
14: Apply GS strategy;
15: while (number of iterations < 𝑛) do
16:     for each solution 𝑋𝑖 in the grey wolf population do
17:         Update the position of 𝑋𝑖 using Eqs. (11) and (12);
18:         Set a CNN model based on the values of solution 𝑋𝑖 as the hyperparameters;
19:         Calculate the fitness of solution 𝑋𝑖 using Eq. (3) as the MSE error of CNN model obtained based on the training set 𝑇𝑟;
20:     end for
21:     Update 𝑎, 𝐴 by Levy flight operator and 𝐶;
22:     Update 𝑋𝛼, 𝑋𝛽 and 𝑋𝛿;
23:     Increase the number of iterations by 1;
24: end while
25: Set a CNN model based on the values of solution 𝑋𝛼 as the hyperparameters;
26: Predict the wind power data in the test set 𝑇𝑒 using the CNN model;
27: End algorithm

Because it involves selecting the best possible sets of hyperparameters and
architectures, training CNNs is considered a complex
and difficult problem with an uncertain search space. In
IGWO, the balance between the exploration and exploitation phases is well
maintained, which can be quite effective in addressing complex challenges
such as CNN training. Thus, we can obtain the best solution containing
the optimal values of the CNN hyperparameters after running the IGWO
algorithm. It should be noted that the evolutionary search strategy
is applied on the training data to configure the CNN model
with the best hyperparameter values. Then, the configured CNN
model is applied to the test data to predict the unknown values of wind power
data points. The overall procedure of the Evol-CNN is conceptually
presented in Fig. 3.

3. Experimental results and discussions

3.1. Wind power data

We evaluate the novel Evol-CNN algorithm on 10 min intervals of
wind power data provided by the Australian Energy Market Operator
(AEMO) for the whole year of 2010 from an existing wind farm in
Australia [47]. In this work, the data of Woolnorth wind farm located
in northwest of Tasmania is taken into consideration. This wind farm
consists of 62 turbines with 140 megawatt (MW) nominal capacity.
It is important to note that the wind site of Woolnorth is one of the
most challenging situations for wind power forecasting in Australia


Fig. 2. The flowchart of the proposed IGWO.

because of its location on the edge of a cliff facing the Southern
Ocean [47]. Similar to [10], we divided the used dataset based on
ifferent seasons to show the performance of wind forecasting models
n different weather conditions. This scenario makes it possible to show
he sensitivity of compared models to different seasons. Then, for each
eason, 75 percent of the data is used for training, and the remaining is
tilized for testing. The training set is used to train the models and the
est set is used to evaluate the performance of the compared models in
6

erms of different evaluation metrics. m
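The seasonal split described above can be sketched as follows. This is a minimal illustration, not the authors' code. The month-to-season mapping (Southern Hemisphere meteorological seasons), the chronological ordering of the split, the function name `split_by_season` and the synthetic series are all assumptions, since the paper states only the 75/25 proportion per season.

```python
from datetime import datetime, timedelta

# Assumed month-to-season mapping (Southern Hemisphere); not given in the paper.
SEASON_OF_MONTH = {
    12: "summer", 1: "summer", 2: "summer",
    3: "autumn", 4: "autumn", 5: "autumn",
    6: "winter", 7: "winter", 8: "winter",
    9: "spring", 10: "spring", 11: "spring",
}

def split_by_season(timestamps, values, train_fraction=0.75):
    """Group samples by season, then split each group in time order."""
    groups = {}
    for ts, value in sorted(zip(timestamps, values)):
        groups.setdefault(SEASON_OF_MONTH[ts.month], []).append(value)
    splits = {}
    for season, series in groups.items():
        cut = int(len(series) * train_fraction)
        # First 75% of the season for training, the rest for testing (assumed order).
        splits[season] = (series[:cut], series[cut:])
    return splits

# Synthetic 10 min series covering 2010 (placeholder values, not AEMO data).
start = datetime(2010, 1, 1)
stamps = [start + timedelta(minutes=10 * i) for i in range(365 * 144)]
power = [(i % 144) / 144 for i in range(len(stamps))]
splits = split_by_season(stamps, power)
```

Each entry of `splits` holds a `(train, test)` pair for one season, so every model can be fitted and scored season by season.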
Table 1
The IGWO parameters.
Parameter              Value
a                      [2→0]
Population size        20
Number of iterations   20
Number of runs         10

Table 2
List of CNN hyperparameter symbols and their values.
Symbol   Value
B𝑠       [10, 20, ..., 100]
N𝑒       [1, 300]
N𝑓       [1, 300]
K𝑠       [1, 25]
MP𝑠      [1, 15]
D𝑟       [0.2, 0.25, ..., 0.65]
L𝑟       [0.001, 0.006, ..., 0.1]
M𝑟       [0.05, 0.1, ..., 0.95]
N𝑐       [1, 2, ..., 5]
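The ranges in Table 2 can be written down as a discrete search space. The sketch below is only one possible encoding: the dictionary keys, the helper `grid` and the sampler `random_candidate` are hypothetical names, and the integer intervals such as [1, 300] are assumed to contain every integer. Under a step of 0.005, the learning-rate grid starting at 0.001 ends at 0.096, so the listed upper bound 0.1 is treated here as a limit rather than a grid point.

```python
import math
import random

def grid(start, step, stop):
    """Evenly spaced values start, start + step, ... that do not exceed stop."""
    count = math.floor((stop - start) / step + 1e-9) + 1
    return [round(start + i * step, 6) for i in range(count)]

# Hypothetical encoding of Table 2; key names are illustrative only.
SEARCH_SPACE = {
    "batch_size": list(range(10, 101, 10)),   # B_s
    "epochs": list(range(1, 301)),            # N_e (assumed: every integer)
    "filters": list(range(1, 301)),           # N_f (assumed: every integer)
    "kernel_size": list(range(1, 26)),        # K_s
    "pool_size": list(range(1, 16)),          # MP_s
    "dropout": grid(0.2, 0.05, 0.65),         # D_r
    "learning_rate": grid(0.001, 0.005, 0.1), # L_r
    "momentum": grid(0.05, 0.05, 0.95),       # M_r
    "conv_layers": list(range(1, 6)),         # N_c
}

def random_candidate(rng):
    """Draw one hyperparameter configuration uniformly from the grid."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

candidate = random_candidate(random.Random(0))
```

A population-based optimizer would start from several such candidates and move them through this grid.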

To obtain a better representation of the input features and improve the
forecasting performance, we employ a mutual information (MI) strategy.
Taking 𝑝(𝑡) as the wind power value at time 𝑡, we measure the MI
between 𝑝(𝑡 − 𝑙 + 1) and 𝑝(𝑡 + 1), where 𝑙 is the time lag of the wind
power time series. We compute the MI for lags 𝑙 = 1 to 𝑙 = 100 and keep
the lags whose MI exceeds a threshold 𝜏 = 0.4, which yields the lags
𝑙 = 1 to 𝑙 = 29. Assume that we are currently at time 𝑡 and that the
wind power value at a future horizon is to be predicted. The selected
input set is then the 29 + 28 = 57 dimensional set
{𝑝(𝑡 − 28), 𝛥𝑝(𝑡 − 27), 𝑝(𝑡 − 27), … , 𝛥𝑝(𝑡), 𝑝(𝑡)}, where
𝛥𝑝(𝑡) = 𝑝(𝑡) − 𝑝(𝑡 − 1) is the sequential difference of the wind power
series.
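The lag selection and the construction of the 57-dimensional input can be sketched as below. The paper does not say how the MI is estimated or normalized, so the histogram estimator (in nats, with 16 bins) is an assumption, and the threshold 0.4 need not select the same lags under it. The function names and the synthetic sine series are hypothetical.

```python
import math
from collections import Counter

def mutual_information(x, y, bins=16):
    """Histogram estimate of the MI (in nats) between two equal-length sequences."""
    def digitize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant sequence
        return [min(int((v - lo) / width), bins - 1) for v in values]
    bx, by = digitize(x), digitize(y)
    n = len(bx)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(c / n * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in pxy.items())

def select_lags(series, max_lag=100, threshold=0.4):
    """Keep lags l whose MI between p(t - l + 1) and p(t + 1) exceeds the threshold."""
    selected = []
    for lag in range(1, max_lag + 1):
        past = series[: len(series) - lag]  # p(t - l + 1)
        future = series[lag:]               # p(t + 1), l steps later
        if mutual_information(past, future) > threshold:
            selected.append(lag)
    return selected

def build_input(series, t, n_lags=29):
    """Return [p(t-28), dp(t-27), p(t-27), ..., dp(t), p(t)] for n_lags = 29."""
    features = [series[t - n_lags + 1]]
    for k in range(t - n_lags + 2, t + 1):
        features += [series[k] - series[k - 1], series[k]]
    return features

# Synthetic placeholder series, not wind farm data.
series = [0.5 + 0.5 * math.sin(0.05 * i) for i in range(2000)]
lags = select_lags(series, max_lag=10)
x = build_input(series, t=100)
```

With 29 retained values and their 28 sequential differences, `x` has 29 + 28 = 57 entries, matching the input dimension stated in the text.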

3.2. Initialization setups for Evol-CNN

The initial parameter values of an evolutionary algorithm used to train
deep neural networks play an important role in the performance of these
networks, and IGWO is no exception to this rule. We chose the initial
values of the IGWO algorithm based on the recommendations in [46].
These values are shown in Table 1.
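To make the role of the Table 1 parameters concrete, the sketch below plugs them into the standard grey wolf optimizer of [46]. It is not the proposed IGWO: the two modifications introduced in this paper are not reproduced, positions are kept continuous rather than mapped to the discrete hyperparameter grid, and the sphere function is a stand-in for the CNN forecasting error. The name `gwo_minimize` and its signature are hypothetical.

```python
import random

def gwo_minimize(objective, bounds, pop_size=20, n_iter=20, seed=0):
    """Standard GWO (after [46]); the IGWO modifications are not included."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(wolves, key=objective)[:]
    history = [objective(best)]  # best-so-far score after each iteration
    for it in range(n_iter):
        # Control parameter a decreases linearly from 2 to 0 (Table 1).
        a = 2.0 * (1.0 - it / max(n_iter - 1, 1))
        # Alpha, beta and delta: the three best wolves of the current population.
        leaders = [w[:] for w in sorted(wolves, key=objective)[:3]]
        for wolf in wolves:
            for d, (lo, hi) in enumerate(bounds):
                estimate = 0.0
                for leader in leaders:
                    coef_a = 2.0 * a * rng.random() - a
                    coef_c = 2.0 * rng.random()
                    distance = abs(coef_c * leader[d] - wolf[d])
                    estimate += leader[d] - coef_a * distance
                # New position: mean of the three leader-guided estimates, clipped.
                wolf[d] = min(max(estimate / 3.0, lo), hi)
        candidate = min(wolves, key=objective)
        if objective(candidate) < history[-1]:
            best = candidate[:]
        history.append(objective(best))
    return best, history

def sphere(x):
    """Toy objective standing in for the forecasting error of a trained CNN."""
    return sum(v * v for v in x)

best, history = gwo_minimize(sphere, [(-5.0, 5.0)] * 3)
```

The "Number of runs" entry in Table 1 would correspond to repeating this call with ten different seeds.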

On the other hand, the architecture of a deep CNN must be determined
before training, which involves selecting the hyperparameters
associated with each network layer. In this study, we optimize nine
hyperparameters with the Evol-CNN framework: batch size (B𝑠), number of
epochs (N𝑒), number of filters (N𝑓), kernel size (K𝑠), maxpooling size
(MP𝑠), dropout rate (D𝑟), learning rate (L𝑟), momentum rate (M𝑟), and
number of convolutional layers (N𝑐). According to the previous
literature [48], these are the hyperparameters with the most
significant impact on the performance of CNN training. Table 2 shows
the hyperparameters and the ranges used for the experiments in this
work. These ranges were selected based on suggestions from the
literature and on trial and error, so as to avoid over-fitting. The
remaining CNN training settings are fixed: the activation function is
ReLU, the optimizer is SGD, and the pooling type is max pooling. Also,
the performance of the different forecasting models in this paper is
evaluated using three well-known evaluation metrics: the root mean
square error (RMSE), the mean absolute percentage error (MAPE) and the
mean absolute error (MAE).
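The three metrics are not written out in the text. The sketch below uses their standard definitions, which is an assumption about the exact form used by the authors. The `eps` guard in `mape` and the sample numbers are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted, eps=1e-8):
    """Mean absolute percentage error in percent; eps avoids division by zero."""
    return 100.0 * sum(abs(a - p) / max(abs(a), eps)
                       for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical values; the last actual value is close to zero.
actual = [0.50, 0.40, 0.02]
predicted = [0.45, 0.44, 0.05]
scores = (rmse(actual, predicted), mape(actual, predicted), mae(actual, predicted))
```

Because MAPE divides by the actual value, samples with power close to zero dominate it. In this example the third point alone contributes 150% before averaging, which is one way MAPE can exceed 100% while RMSE and MAE stay small.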


Fig. 3. An overview of the proposed wind power forecasting model.
Table 3
Error values of forecasting methods for spring season using different time horizons.
Model  10 min: RMSE MAPE MAE  1 h: RMSE MAPE MAE  3 h: RMSE MAPE MAE

PR 0.1312 295.617 0.0972 0.1554 325.794 0.1022 0.2019 403.908 0.1328
AR 0.1108 256.223 0.0833 0.1389 221.026 0.0954 0.1961 349.241 0.09963
ARMA 0.1076 189.343 0.0782 0.1323 178.454 0.0902 0.1936 317.565 0.09577
ARIMA 0.0996 144.668 0.0754 0.1261 153.778 0.0855 0.1927 279.118 0.09042
SVR 0.0711 82.227 0.0415 0.1178 107.156 0.0734 0.1923 236.442 0.08166
CNN 0.0481 21.939 0.0277 0.0922 63.116 0.0581 0.1724 204.811 0.06343
FFNN 0.0673 37.513 0.0361 0.1124 97.212 0.0693 0.1881 225.666 0.07242
LSTM 0.0426 18.413 0.0254 0.0919 55.707 0.0577 0.1533 184.932 0.06122
DE-LSTM 0.0413 18.255 0.0247 0.0901 55.265 0.0531 0.1514 182.677 0.05896
SAE 0.0396 17.173 0.0236 0.0862 52.656 0.0472 0.1449 172.559 0.05361
DeepHybrid 0.0382 16.688 0.0222 0.0834 48.054 0.0456 0.1412 167.201 0.05032
GWO-CNN 0.0361 15.545 0.0205 0.0794 42.335 0.0424 0.1397 164.302 0.04754
Evol-CNN 0.0346 13.103 0.0177 0.0752 34.229 0.0313 0.1364 161.559 0.0403
Table 4
Error values of forecasting methods for summer season using different time horizons.
Model  10 min: RMSE MAPE MAE  1 h: RMSE MAPE MAE  3 h: RMSE MAPE MAE

PR 0.1354 508.831 0.0415 0.1559 279.494 0.0526 0.3814 305.667 0.0855
AR 0.0938 315.992 0.0369 0.1433 178.101 0.0445 0.3177 242.881 0.0694
ARMA 0.0903 256.166 0.0361 0.1356 135.772 0.0427 0.3086 192.442 0.0668
ARIMA 0.0881 114.883 0.0346 0.1282 91.552 0.0418 0.2981 162.982 0.0627
SVR 0.0742 42.372 0.0298 0.1223 76.636 0.0397 0.2591 145.872 0.0589
FFNN 0.0632 36.856 0.0288 0.1074 69.284 0.0392 0.2451 141.208 0.0575
CNN 0.0618 34.217 0.0281 0.1052 62.646 0.0388 0.2313 135.662 0.0556
LSTM 0.0593 31.193 0.0278 0.1045 56.123 0.0385 0.2051 129.362 0.0542
DE-LSTM 0.0575 30.227 0.0272 0.1023 53.676 0.0379 0.2012 126.018 0.0527
SAE 0.0532 27.883 0.0265 0.1008 51.099 0.0376 0.1972 124.026 0.0514
DeepHybrid 0.0511 26.109 0.0258 0.0986 49.898 0.0371 0.1943 120.433 0.0498
GWO-CNN 0.0496 24.656 0.0251 0.0971 47.222 0.0365 0.1897 117.404 0.0485
Evol-CNN 0.0472 21.267 0.0232 0.0953 43.227 0.0315 0.1866 113.228 0.0462
3.3. Simulation results

In this section, we compare the performance of the proposed Evol-CNN
method with classical baselines for short-term wind power forecasting,
including the persistence (PR) algorithm [9] and the auto-regressive
(AR), auto-regressive moving average (ARMA), and auto-regressive
integrated moving average (ARIMA) models. In addition, single and
hybrid methods from the recent literature are compared with the
proposed model. A single-model approach applies a single regression
architecture to the prediction task. To demonstrate the impact of deep
feature learning on wind data regression problems, we compare Evol-CNN
with shallow and deep ANN-based methods, including the feed-forward
neural network (FFNN), long short-term memory (LSTM), and convolutional
neural network (CNN). Besides, support vector regression (SVR) [49] is
chosen as another powerful supervised learning benchmark used for
regression tasks in the literature.

On the other hand, hybrid algorithms combine multiple methods of wind
feature extraction to improve prediction accuracy. In this work, we
compare the proposed Evol-CNN model with the recently proposed hybrid
differential evolution-LSTM (DE-LSTM) algorithm [50], which employs DE
to optimize the LSTM hyperparameters, as well as the deep stacked
auto-encoder (SAE) [9], which learns rough patterns from the input wind
data. A combination of the standard GWO with a deep CNN (GWO-CNN) is
also included in order to show the search capability of our IGWO
model. In addition, we compare our proposed Evol-CNN



Table 5
Error values of forecasting methods for autumn season using different time horizons.
Model Time step

10 min 1 h 3 h

RMSE MAPE MAE RMSE MAPE MAE RMSE MAPE MAE

PR 0.1029 111.586 0.0455 0.1188 113.868 0.0446 0.1819 166.859 0.0519
AR 0.9652 74.402 0.0405 0.1098 85.922 0.0408 0.1682 145.911 0.0462
ARMA 0.0933 65.881 0.0392 0.1094 78.912 0.0397 0.1635 141.667 0.0451
ARIMA 0.0892 58.922 0.0371 0.1089 73.443 0.0389 0.1592 138.509 0.0438
SVR 0.0826 34.832 0.0309 0.1055 52.052 0.0342 0.1476 127.669 0.0415
FFNN 0.0622 19.914 0.0289 0.0991 61.948 0.0329 0.1445 120.446 0.0397
CNN 0.0615 16.332 0.0283 0.0936 60.407 0.0324 0.1408 117.494 0.0391
LSTM 0.0598 14.601 0.0278 0.0944 53.245 0.0321 0.1373 116.221 0.0378
DE-LSTM 0.0573 13.651 0.0268 0.0913 51.806 0.0317 0.1355 115.109 0.0371
SAE 0.0521 11.282 0.0262 0.0882 49.202 0.0311 0.1317 111.865 0.0356
DeepHybrid 0.0493 10.769 0.0256 0.0853 46.099 0.0306 0.1284 107.257 0.0331
GWO-CNN 0.0472 10.121 0.0241 0.0821 44.788 0.0296 0.1266 104.556 0.0326
Evol-CNN 0.0445 9.343 0.0211 0.0794 39.545 0.0278 0.1238 102.433 0.0315
Table 6
Error values of forecasting methods for winter season using different time horizons.
Model Time step

10 min 1 h 3 h

RMSE MAPE MAE RMSE MAPE MAE RMSE MAPE MAE

PR 0.1289 125.9188 0.0358 0.1549 215.661 0.0347 0.2977 357.326 0.0496
AR 0.1124 97.2224 0.0309 0.1481 178.545 0.0325 0.2561 289.156 0.0455
ARMA 0.1053 88.1532 0.0292 0.1419 148.919 0.0317 0.2451 251.727 0.0434
ARIMA 0.0956 79.4322 0.0278 0.1356 124.771 0.0308 0.2238 238.115 0.0416
SVR 0.0847 50.1192 0.0251 0.1243 99.0987 0.0288 0.2142 192.663 0.0391
FFNN 0.0809 25.6676 0.0235 0.1285 90.3269 0.0275 0.2055 188.919 0.0362
CNN 0.0758 22.5572 0.0228 0.0121 67.8111 0.0271 0.1863 183.434 0.0346
LSTM 0.0583 18.9991 0.0224 0.1159 66.9182 0.0266 0.1791 178.934 0.0338
DE-LSTM 0.0561 17.2118 0.0218 0.1134 44.3099 0.0259 0.1761 173.244 0.0332
SAE 0.0535 15.8782 0.0215 0.1113 39.5664 0.0254 0.1722 169.092 0.0327
DeepHybrid 0.0493 12.2999 0.0208 0.1073 34.3373 0.0251 0.1693 163.455 0.0321
GWO-CNN 0.0476 11.6673 0.0204 0.1089 30.8982 0.0259 0.1674 159.915 0.0309
Evol-CNN 0.0435 10.1983 0.0198 0.1022 27.2891 0.0246 0.1635 154.676 0.0302
Table 7
The best CNN architectures found by Evol-CNN.
Dataset Horizon RMSE N𝑓 K𝑠 N𝑒 B𝑠 MP𝑠 D𝑟 L𝑟 M𝑟 N𝑐
Spring 10 min 0.031 40 1 30 60 2 0.25 0.011 0.05 1
Spring 1 h 0.071 70 2 20 40 3 0.25 0.046 0.1 3
Spring 3 h 0.131 30 3 70 30 2 0.35 0.006 0.05 1
Summer 10 min 0.041 80 1 30 50 2 0.4 0.026 0.2 1
Summer 1 h 0.092 70 2 40 70 1 0.2 0.031 0.05 2
Summer 3 h 0.179 50 3 20 60 2 0.15 0.016 0.3 3
Autumn 10 min 0.038 55 1 30 30 3 0.35 0.041 0.1 3
Autumn 1 h 0.072 20 2 20 30 6 0.25 0.031 0.15 3
Autumn 3 h 0.114 20 3 40 20 2 0.45 0.021 0.25 1
Winter 10 min 0.043 35 1 20 30 1 0.25 0.026 0.1 2
Winter 1 h 0.098 50 1 20 40 4 0.2 0.006 0.3 2
Winter 3 h 0.156 30 2 30 30 5 0.15 0.011 0.2 1

with the hybrid algorithm proposed in [11], named DeepHybrid. The
design of this algorithm is based on a deep belief network (DBN) and a
fuzzy type II inference system (FT2IS) for the supervised regression of
future wind values.

To ensure a fair comparison when choosing the hyperparameter
configurations of the deep ANNs, the tunable hyperparameters including
dropout rate (D𝑟), learning rate (L𝑟) and momentum rate (M𝑟) are taken
into consideration. For the CNN, LSTM and FFNN models, D𝑟 is set to
0.3, 0.25 and 0.3, respectively. Also, L𝑟 is equal to 0.006, 0.021 and
0.36 for the CNN, LSTM and FFNN models, respectively. Finally, M𝑟 is
set to 0.05, 0.3 and 0.2 for the CNN, LSTM and FFNN models,
respectively. These values were chosen by trial and error through a
grid search strategy. For the other algorithms, the optimal parameter
values reported in their corresponding papers are used in the
experiments. The number of runs and the number of iterations for all
baseline models are the same as for our proposed Evol-CNN model. All of
the algorithms are implemented using Python 3.7 on an NVIDIA GTX 1080
Ti GPU with an Intel Core i7 CPU and 32 GB of RAM.

Tables 3–6 show the average RMSE and MAPE of the different methods for
10 min, 1 h and 3 h ahead forecasting of wind power data points in the
different seasons. The RMSE and MAPE obtained by the different
algorithms on the spring dataset are tabulated in Table 3, showing that
Evol-CNN achieves the lowest RMSE and MAPE at every prediction step
(from the 10 min to the 3 h forecasting horizon). In Table 4, the
Evol-CNN algorithm has higher prediction accuracy than the other twelve
benchmarks for the summer season. Moreover, PR and SVR perform worse
than the neural network family of algorithms. This happens because the
irregularity and nonlinearity of the wind power data are very high, and
these two methods are not able to compete with the ANN algorithms.
Among the deep ANNs, the DeepHybrid algorithm outperforms the other
deep ANN frameworks. However, Evol-CNN has a higher modelling
capability for wind power forecasting in this case.

Fig. 4. Actual vs predicted values of Evol-CNN.
Fig. 5. Convergence curves of proposed Evol-CNN algorithm.

As can be seen in Table 5, the PR model performs reasonably well for
ultrashort-term predictions but yields poor results as the time step
increases. With longer horizons, the SVR and ANN methods have
significantly smaller RMSE and MAPE values than PR. LSTM improves the
RMSE and MAPE for 10 min, 1 h and 3 h compared to the two other NN
methods, CNN and FFNN. The DE-LSTM framework outperforms LSTM in all
seasons because the differential mutation operator stabilizes the
search space of the LSTM algorithm and increases its accuracy. Among
the deep ANNs, DeepHybrid outperforms the others. Among all methods,
Evol-CNN has the best forecasting performance for the different
horizons. The RMSEs and MAPEs obtained for the winter season by the
twelve benchmark methods are tabulated in Table 6, indicating that
Evol-CNN outperforms the other benchmarks at all three time horizons.
This is primarily due to the extraction of more substantive features
through the CNN representation and also to the robustness of the
extracted features resulting from the optimization process in IGWO.

In Table 7, the best architectures (those with the lowest RMSE) found
by Evol-CNN for the three time steps of the four seasons are presented.
The overall conclusion drawn from this table is that the values chosen
by Evol-CNN are not computationally expensive. For example, for N𝑓 the
algorithm chooses values that range from 20 to 80, which are far below
the upper bound of the N𝑓 interval, 300. Thus, it can be deduced that
Evol-CNN chooses moderate values with lower computational costs for
network training.

To present the performance of the Evol-CNN algorithm intuitively, the
test dataset of the wind power time series for the spring season and
the predicted values are shown in Fig. 4. In this figure, the blue and
red lines indicate the actual and predicted wind power data points,
respectively. For predicting the next 10 min interval, the two lines
almost overlap, meaning that the predicted values are close to the
actual data points. Nonetheless, as the horizon increases, the
performance for predicting the next 1 h and 3 h decreases. This is
expected, since forecasting wind power 1 h and 3 h ahead is more
difficult than forecasting 10 min ahead.

Fig. 5 illustrates the convergence curves of the Evol-CNN algorithm
over 10 independent runs for the spring dataset. According to this
figure, as the forecasting horizon increases, the prediction error
increases. Moreover, convergence is much easier for a forecasting
horizon of 10 min ahead than for 1 h and 3 h ahead. Finally, for all
forecasting horizons, the optimization process converges properly
toward the end of the iterations.

Fig. 6. Violin plots of hyperparameters generated by Evol-CNN for 10 min interval.

Figs. 6–8 show the violin plots of the nine optimized hyperparameters
for the three horizons of the spring season. These figures are
important since they provide valuable information about the selection
of the main hyperparameter values in the CNN architectural design
procedure. For instance, for the dropout rate, Fig. 8 shows that the
appropriate values for this hyperparameter lie around 0.3. On the other
hand, the value of 0.6 is not suggested for CNN training, since few of
the dropout values selected over the 10 runs of Evol-CNN fall near it.
The same interpretation applies to the other hyperparameters in
Figs. 6–8.

To determine the statistical significance of the differences between
the performance of Evol-CNN and the other benchmarks, a T-test is
conducted. This test compares the Evol-CNN results against each of the
other benchmarks at the 5% significance level with 3 degrees of
freedom. Table 8 lists the 𝑝 values obtained from the T-test. All 𝑝
values in this table are below 0.05, so the null hypothesis (no
significant difference) is rejected at the 5% significance level in all
cases. Therefore, we can conclude that the proposed Evol-CNN model is
significantly better than the other compared models at the three
horizons of each seasonal dataset.
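The text does not specify which variant of the T-test is used or what the samples are. One reading that matches 3 degrees of freedom is a paired test on four paired error values, and the sketch below illustrates only that assumed reading. The function name and the error values are hypothetical and are not results from this paper. The constant 3.182 is the two-sided 5% critical value of Student's t distribution with 3 degrees of freedom.

```python
import math

def paired_t_statistic(errors_a, errors_b):
    """Paired t statistic and degrees of freedom for two matched error samples."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(variance / n), n - 1

T_CRIT_DF3 = 3.182  # two-sided critical value at the 5% level, 3 degrees of freedom

# Hypothetical RMSE values for two models on four matched cases.
evol_errors = [0.031, 0.033, 0.030, 0.034]
base_errors = [0.036, 0.039, 0.035, 0.038]

t_stat, dof = paired_t_statistic(evol_errors, base_errors)
significant = abs(t_stat) > T_CRIT_DF3
```

Rejecting the null hypothesis of equal mean error corresponds to `significant` being true, which is equivalent to a 𝑝 value below 0.05.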

4. Conclusion

This paper presents a novel algorithm called Evol-CNN, which combines
deep CNNs with an improved version of the GWO algorithm for wind power
forecasting. The algorithm optimizes the CNN hyperparameters in a
discrete space to improve the accuracy of wind power forecasting. We
also use the MI strategy to obtain the optimal features for our
proposed model. To demonstrate the effectiveness of Evol-CNN, its
performance is compared with twelve forecasting benchmarks on an
Australian wind farm dataset for three different horizons. Across the
different short-horizon time steps and scenarios, Evol-CNN showed
better performance than the other benchmarks in terms of the RMSE and
MAPE evaluation metrics.


Fig. 7. Violin plots of hyperparameters generated by Evol-CNN for 1 h interval.
Table 8
𝑝 values of T-test for Evol-CNN forecasting results vs other models.

Season Horizon PR AR ARMA ARIMA SVR FFNN CNN LSTM DE-LSTM SAE DeepHybrid GWO-CNN
Spring 10 min 9.81E−07 5.92E−06 5.53E−06 4.91E−06 1.66E−05 1.58E−05 1.54E−04 3.61E−03 8.14E−04 1.90E−03 1.16E−02 1.03E−02
Spring 1 h 7.98E−07 8.97E−06 8.41E−06 7.93E−06 6.15E−06 5.22E−06 5.13E−05 1.18E−05 1.03E−05 1.12E−04 1.09E−04 7.66E−03
Spring 3 h 3.27E−06 2.77E−06 2.41E−06 1.93E−06 8.87E−07 5.28E−07 7.78E−06 6.12E−05 3.18E−04 2.35E−03 1.61E−03 1.32E−03
Summer 10 min 3.98E−07 3.51E−07 3.21E−07 2.81E−07 1.09E−06 2.23E−04 1.91E−04 1.69E−04 1.15E−03 5.99E−03 1.63E−02 1.23E−02
Summer 1 h 1.31E−06 1.26E−06 1.02E−06 1.12E−06 1.61E−06 4.59E−05 4.26E−04 4.11E−04 6.25E−04 1.24E−03 9.83E−04 1.11E−03
Summer 3 h 1.45E−06 1.21E−06 1.02E−06 9.11E−05 1.16E−04 3.72E−07 2.71E−05 1.65E−04 1.54E−04 4.57E−05 3.43E−03 1.78E−03
Autumn 10 min 7.69E−07 6.44E−07 6.11E−07 3.16E−07 2.73E−06 2.56E−06 2.90E−04 3.18E−04 4.99E−04 9.00E−05 3.01E−03 1.91E−03
Autumn 1 h 2.45E−05 1.22E−05 7.81E−04 3.21E−04 2.85E−06 6.15E−06 1.54E−04 7.51E−05 1.16E−04 4.19E−04 1.92E−04 1.25E−03
Autumn 3 h 9.59E−05 8.12E−05 7.77E−05 7.13E−05 5.33E−06 2.12E−05 1.85E−05 1.58E−04 3.95E−04 4.57E−05 2.09E−03 1.56E−03
Winter 10 min 4.57E−07 9.11E−06 6.56E−06 4.13E−06 2.93E−06 2.77E−06 2.02E−06 4.43E−05 2.63E−04 4.57E−05 5.33E−04 2.46E−04
Winter 1 h 3.96E−06 1.90E−06 1.01E−06 9.72E−05 1.43E−04 6.12E−05 2.90E−04 9.27E−05 2.02E−03 2.18E−03 3.61E−03 2.25E−03
Winter 3 h 1.67E−07 6.22E−06 2.32E−06 1.13E−06 2.33E−03 9.43E−06 3.98E−04 1.16E−04 5.21E−04 2.22E−03 6.83E−04 3.15E−04


Fig. 8. Violin plots of hyperparameters generated by Evol-CNN for 3 h interval.
CRediT authorship contribution statement

Seyed Mohammad Jafar Jalali: Investigation, Visualization, Writ-
ing – original draft. Sajad Ahmadian: Methodology, Data curation.
Mahdi Khodayar: Formal analysis. Abbas Khosravi: Supervision. Mi-
adreza Shafie-khah: Conceptualization. Saeid Nahavandi: Validation.
João P.S. Catalão: Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing finan-
cial interests or personal relationships that could have appeared to
influence the work reported in this paper.

Acknowledgment

J.P.S. Catalão acknowledges the support by FEDER funds through
COMPETE 2020 and by Portuguese funds through FCT, under POCI-
01-0145-FEDER-029803 (02/SAICT/2017).
References

[1] Luo X, Sun J, Wang L, Wang W, Zhao W, Wu J, et al. Short-term wind speed
forecasting via stacked extreme learning machine with generalized correntropy.
IEEE Trans Ind Inf 2018;14(11):4963–71.

[2] Jalali SMJ, Khodayar M, Ahmadian S, Noman MK, Khosravi A, Islam SMS, et
al. A new uncertainty-aware deep neuroevolution model for quantifying tidal
prediction. In: 2021 IEEE industry applications society annual meeting. IEEE;
2021, p. 1–6.

[3] Jalali SMJ, Khodayar M, Khosravi A, Osório GJ, Nahavandi S, Catalão JP. An
advanced generative deep learning framework for probabilistic spatio-temporal
wind power forecasting. In: 2021 IEEE international conference on environment
and electrical engineering and 2021 IEEE industrial and commercial power
systems Europe. IEEE; 2021, p. 1–6.

[4] Jalali SMJ, Ahmadian S, Khodayar M, Khosravi A, Ghasemi V, Shafie-khah M,
et al. Towards novel deep neuroevolution models: Chaotic levy grasshopper
optimization for short-term wind speed forecasting. Eng Comput 2021;1–25.

[5] Khodayar M, Khodayar ME, Jalali SMJ. Deep learning for pattern recognition of
photovoltaic energy generation. Electr J 2021;34(1):106882.

[6] Jalali SMJ, Ahmadian S, Khosravi A, Shafie-khah M, Nahavandi S, Catalão JP. A
novel evolutionary-based deep convolutional neural network model for intelligent
load forecasting. IEEE Trans Ind Inf 2021;17(12):8243–53.

[7] Jalali SMJ, Ahmadian S, Kavousi-Fard A, Khosravi A, Nahavandi S. Automated
deep CNN-LSTM architecture design for solar irradiance forecasting. IEEE Trans
Syst Man Cybern A 2021;52(1):54–65.

[8] Saffari M, Khodayar M, Jalali SMJ, Shafie-khah M, Catalão JP. Deep convolu-
tional graph rough variational auto-encoder for short-term photovoltaic power
forecasting. In: 2021 international conference on smart energy systems and
technologies. IEEE; 2021, p. 1–6.

[9] Khodayar M, Kaynak O, Khodayar ME. Rough deep neural architecture for
short-term wind speed forecasting. IEEE Trans Ind Inf 2017;13(6):2770–9.

[10] Hill DC, McMillan D, Bell KR, Infield D. Application of auto-regressive models to
UK wind speed data for power system impact studies. IEEE Trans Sustain Energy
2011;3(1):134–41.

[11] Khodayar M, Wang J, Manthouri M. Interval deep generative neural network for
wind speed forecasting. IEEE Trans Smart Grid 2018;10(4):3974–89.

[12] Choi I-J, Park R-S, Lee J. Impacts of a newly-developed aerosol climatology
on numerical weather prediction using a global atmospheric forecasting model.
Atmos Environ 2019;197:77–91.

[13] do Nascimento Camelo H, Lucio PS, Junior JBVL, de Carvalho PCM, dos
Santos DvG. Innovative hybrid models for forecasting time series applied in wind
generation based on the combination of time series models with artificial neural
networks. Energy 2018;151:347–57.

[14] Zhang J, Wei Y, Tan Z. An adaptive hybrid model for short term wind speed
forecasting. Energy 2019;115615.

[15] Wang Y, Hu Q, Srinivasan D, Wang Z. Wind power curve modeling and
wind power forecasting with inconsistent data. IEEE Trans Sustain Energy
2018;10(1):16–25.

[16] Liu Y, Qin H, Zhang Z, Pei S, Wang C, Yu X, et al. Ensemble spatiotemporal
forecasting of solar irradiation using variational Bayesian convolutional gate
recurrent unit network. Appl Energy 2019;253:113596.

[17] Xie W, Zhang P, Chen R, Zhou Z. A nonparametric Bayesian framework
for short-term wind power probabilistic forecast. IEEE Trans Power Syst
2018;34(1):371–9.

[18] Ahmadian S, Moradi P, Akhlaghian F. An improved model of trust-aware
recommender systems using reliability measurements. In: 2014 6th Conference
on information and knowledge technology. IEEE; 2014, p. 98–103.

[19] Tahmasebi F, Meghdadi M, Ahmadian S, Valiallahi K. A hybrid recommendation
system based on profile expansion technique to alleviate cold start problem.
Multimedia Tools Appl 2021;80(2):2339–54.

[20] Ahmadian M, Ahmadi M, Ahmadian S, Jalali SMJ, Khosravi A, Nahavandi S.
Integration of deep sparse autoencoder and particle swarm optimization to
develop a recommender system. In: 2021 IEEE international conference on
systems, man, and cybernetics. IEEE; 2021, p. 2524–30.

[21] Moradi P, Rezaimehr F, Ahmadian S, Jalili M. A trust-aware recommender
algorithm based on users overlapping community structure. In: 2016 sixteenth
international conference on advances in ICT for emerging regions. IEEE; 2016,
p. 162–7.

[22] Hasani H, Jalali SMJ, Rezaei D, Maleki M. A data mining framework for
classification of organisational performance based on rough set theory. Asian
J Manag Sci Appl 2018;3(2):156–80.

[23] Jalali SMJ, Hedjam R, Khosravi A, Heidari AA, Mirjalili S, Nahavandi S.
Autonomous robot navigation using moth-flame-based neuroevolution. In:
Evolutionary machine learning techniques. Springer; 2020, p. 67–83.

[24] Jalali SMJ, Khosravi A, Kebria PM, Hedjam R, Nahavandi S. Autonomous robot
navigation system using the evolutionary multi-verse optimizer algorithm. In:
2019 IEEE international conference on systems, man and cybernetics. IEEE; 2019,
p. 1221–6.

[25] Kong X, Liu X, Shi R, Lee KY. Wind speed prediction using reduced support
vector machines with feature selection. Neurocomputing 2015;169:449–56.

[26] Hu Q, Zhang R, Zhou Y. Transfer learning for short-term wind speed prediction
with deep neural networks. Renew Energy 2016;85:83–95.

[27] Marugán AP, Márquez FPG, Perez JMP, Ruiz-Hernández D. A survey of artificial
neural network in wind energy systems. Appl Energy 2018;228:1822–36.

[28] Qian Z, Pei Y, Zareipour H, Chen N. A review and discussion of decomposition-
based hybrid models for wind energy forecasting applications. Appl Energy
2019;235:939–53.

[29] Ahmadian S, Khanteymoori AR. Training back propagation neural networks
using asexual reproduction optimization. In: 7th conference on information and
knowledge technology. IEEE; 2015, p. 1–6.

[30] Liu X, Zhang H, Kong X, Lee KY. Wind speed forecasting using deep neural
network with feature selection. Neurocomputing 2020;397:393–403.

[31] Chen J, Zhu Q, Li H, Zhu L, Shi D, Li Y, et al. Learning heterogeneous features
jointly: A deep end-to-end framework for multi-step short-term wind power
prediction. IEEE Trans Sustain Energy 2019.

[32] Wang K, Qi X, Liu H, Song J. Deep belief network based k-means cluster
approach for short-term wind power forecasting. Energy 2018;165:840–52.

[33] Liu H, Mi X, Li Y. Smart multi-step deep learning model for wind speed
forecasting based on variational mode decomposition, singular spectrum analysis,
LSTM network and ELM. Energy Convers Manage 2018;159:54–64.

[34] Wang H-z, Li G-q, Wang G-b, Peng J-c, Jiang H, Liu Y-t. Deep learning
based ensemble approach for probabilistic wind power forecasting. Appl Energy
2017;188:56–70.

[35] Abedinia O, Lotfi M, Bagheri M, Sobhani B, Shafie-khah M, Catalao JP. Improved
EMD-based complex prediction model for wind power forecasting. IEEE Trans
Sustain Energy 2020.

[36] Hong Y-Y, Rioflorido CLPP. A hybrid deep learning-based neural network for
24 h ahead wind power forecasting. Appl Energy 2019;250:530–9.

[37] Abedinia O, Bagheri M, Naderi MS, Ghadimi N. A new combinatory approach
for wind power forecasting. IEEE Syst J 2020.

[38] Hu Y-L, Chen L. A nonlinear hybrid wind speed forecasting model using LSTM
network, hysteretic ELM and differential evolution algorithm. Energy Convers
Manage 2018;173:123–42.

[39] Han L, Zhang R, Wang X, Bao A, Jing H. Multi-step wind power forecast based
on VMD-LSTM. IET Renew Power Gener 2019;13(10):1690–700.

[40] Stanley KO, Clune J, Lehman J, Miikkulainen R. Designing neural networks
through neuroevolution. Nat Mach Intell 2019;1(1):24–35.

[41] Mousavirad SJ, Jalali SMJ, Ahmadian S, Khosravi A, Schaefer G, Nahavandi S.
Neural network training using a biogeography-based learning strategy. In:
International conference on neural information processing. Springer; 2020, p.
147–55.

[42] Ahmadian S, Jalali SMJ, Raziani S, Chalechale A. An efficient cardiovascu-
lar disease detection model based on multilayer perceptron and moth-flame
optimization. Expert Syst 2021;e12914.

[43] Ahmadian S, Jalali SMJ, Islam SMS, Khosravi A, Fazli E, Nahavandi S. A novel
deep neuroevolution-based image classification method to diagnose coronavirus
disease (COVID-19). Comput Biol Med 2021;139:104994.

[44] Jalali SMJ, Ahmadian M, Ahmadian S, Khosravi A, Alazab M, Nahavandi S.
An oppositional-Cauchy based GSK evolutionary algorithm with a novel deep
ensemble reinforcement learning strategy for COVID-19 diagnosis. Appl Soft
Comput 2021;111:107675.

[45] Li Z, He Y, Li H, Li Y, Guo X. A novel discrete grey wolf optimizer for solving
the bounded Knapsack problem. In: International symposium on intelligence
computation and applications. Springer; 2018, p. 101–14.

[46] Mirjalili S, Mirjalili SM, Lewis A. Grey wolf optimizer. Adv Eng Softw
2014;69:46–61.

[47] Cutler N, Outhred H, MacGill I. Final report on UNSW project for AEMO to
develop a prototype wind power forecasting tool for potential large rapid changes
in wind power. The Centre for Energy and Environmental Markets; 2011.

[48] Sun Y, Xue B, Zhang M, Yen GG. Evolving deep convolutional neural networks
for image classification. IEEE Trans Evol Comput 2019.

[49] Santamaría-Bonfil G, Reyes-Ballesteros A, Gershenson C. Wind speed forecasting
for wind farms: A method based on support vector regression. Renew Energy
2016;85:790–809.

[50] Peng L, Liu S, Liu R, Wang L. Effective long short-term memory with differential
evolution algorithm for electricity price prediction. Energy 2018;162:1301–14.
