Received 31 December 2023, accepted 20 January 2024, date of publication 23 January 2024, date of current version 1 February 2024.
Digital Object Identifier 10.1109/ACCESS.2024.3357661

Utilizing Ensemble Learning for Detecting Multi-Modal Fake News

MUHAMMAD LUQMAN 1, MUHAMMAD FAHEEM 2, (Member, IEEE), WAHEED YOUSUF RAMAY 3, MALIK KHIZAR SAEED 4, AND MAJID BASHIR AHMAD 5

1 Department of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
2 School of Technology and Innovations, University of Vaasa, 65200 Vaasa, Finland
3 Department of Computer Science, Air University, Multan 60000, Pakistan
4 Department of Computer Sciences, COMSATS University Islamabad, Vehari 61000, Pakistan
5 School of Software and Microelectronics, Northwestern Polytechnical University, Xi’an 710072, China

Corresponding author: Muhammad Faheem (muhammad.faheem@uwasa.fi)

ABSTRACT The spread of fake news has become a critical problem in recent years due to the extensive use of social media platforms. False stories can go viral quickly, reaching millions of people before they can be debunked, e.g., a false story claiming that a celebrity has died when he/she is still alive. Therefore, detecting fake news is essential for maintaining the integrity of information and controlling misinformation, social and political polarization, media ethics, and security threats. From this perspective, we propose an ensemble learning-based detection of multi-modal fake news. First, it exploits Fakeddit, a publicly available dataset consisting of over 1 million samples. Next, it leverages Natural Language Processing (NLP) techniques for preprocessing the textual information of news. Then, it gauges the sentiment from the text of each news item. After that, it generates embeddings for the text and images of the corresponding news by leveraging Visual Bidirectional Encoder Representations from Transformers (V-BERT).
Finally, it passes the embeddings to the deep learning ensemble model for training and testing. The 10-fold evaluation technique is used to check the performance of the proposed approach. The evaluation results show that the proposed approach significantly outperforms the state-of-the-art approaches, with performance improvements of 12.57%, 9.70%, 18.15%, 12.58%, 0.10, and 3.07 in accuracy, precision, recall, F1-score, Matthews Correlation Coefficient (MCC), and Odds Ratio (OR), respectively.

INDEX TERMS Ensemble learning, convolutional neural network, multi-modal fake news, classification, boosted CNN, bagged CNN.

The associate editor coordinating the review of this manuscript and approving it for publication was Donato Impedovo.

I. INTRODUCTION
The concept of fake news is not new. Its roots existed long ago in our society. It refers to false information that is disseminated to mislead or deceive the public. For example, fake news about COVID-19 vaccines could discourage people from getting vaccinated, leading to increased rates of illness and death. In the past, many distinct kinds of material, such as satire, conspiracies, news manipulation, and click-bait, were considered fake news. However, fake news is now becoming jargon [1] and has a huge impact on the critical events happening in our society, e.g., the spread of fake news (false stories) on social media was very concerning during the 2016 US presidential election [2]. Fake news can spread quickly through social media and other online platforms. It can have serious consequences, such as causing panic, influencing elections, and eroding public trust in legitimate news sources. Individuals need to distinguish real news from fake news and critically evaluate sources of information before sharing or responding to them. Additionally, news organizations and social media platforms are responsible for combating the spread of fake news by fact-checking and removing false content.
Surveys show that about 70% of Americans use social media as a source of news and for circulating information [3]. Accessing news and information on the Internet is low-cost and convenient. However, spreading fake news through these channels is equally straightforward and effortless [4]. Fake news can lead to false assumptions that drastically affect our society. Consequently, it is critical to design an automated fake news detection system.

VOLUME 12, 2024, page 15037. © 2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
ORCID: https://orcid.org/0009-0008-3763-0395, https://orcid.org/0000-0003-4628-4486, https://orcid.org/0009-0009-4698-6948, https://orcid.org/0000-0002-9285-2555
M. Luqman et al.: Utilizing Ensemble Learning for Detecting Multi-Modal Fake News

Many researchers are actively developing new and better methods for identifying and combating the spread of misinformation. Some of the key research areas and trends in this field include deep learning approaches, e.g., Convolutional Neural Network (CNN); linguistic features, e.g., sentiment analysis, topic modeling, and stylometric analysis; source-based approaches, e.g., analyzing the domain name, social media presence, or history of the news source; and ensemble approaches, e.g., combining linguistic, source-based, and deep learning models to create a more robust and accurate detection system. Recent research has identified the issues of this problem and proposed different solutions, e.g., pre-trained language models such as Bidirectional Encoder Representations from Transformers (BERT) [5], OpenAI GPT [6], and ELMo [7] have shown their effectiveness in alleviating feature engineering efforts. Nevertheless, the problem still requires significant performance improvement.

From this perspective, this paper proposes an ensemble learning-based detection of multi-modal fake news (ELD-FN).
It first exploits a publicly available dataset, Fakeddit, a novel multi-modal dataset consisting of over 1 million samples from multiple categories of fake news. Second, it leverages Natural Language Processing (NLP) techniques for preprocessing the textual information of news. Third, it gauges the sentiment from the text of each news item. Fourth, it generates embeddings for the text and images of the corresponding news by leveraging V-BERT [8]. Finally, it passes the embeddings to the deep learning ensemble model for training and testing. The 10-fold evaluation technique is used to check the performance of ELD-FN. The evaluation results show that ELD-FN significantly outperforms the state-of-the-art approaches, with performance improvements of 12.57%, 9.70%, 18.15%, 12.58%, 0.10, and 3.07 in accuracy, precision, recall, F1-score, Matthews Correlation Coefficient (MCC), and Odds Ratio (OR), respectively.

The main contributions made in this paper are as follows.
• The proposed approach integrates news sentiment as a crucial feature and employs ensemble learning to identify multi-modal fake news.
• The evaluation results show that ELD-FN significantly outperforms the baseline approaches, with performance improvements of 12.57%, 9.70%, 18.15%, 12.58%, 0.10, and 3.07 in accuracy, precision, recall, F1-score, MCC, and OR, respectively.

The rest of the paper is organized as follows. Section II discusses the research background. Section III describes the details of ELD-FN. Section IV describes the evaluation methods for ELD-FN, the obtained results, and the threats to validity. Section V summarizes the paper and suggests future work.

II. RELATED WORK
Although extensive research on fake news detection has been performed [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], most research is conducted on textual data or uni-modal features.
However, the two most relevant studies [24], [25] proposed deep learning-based solutions for detecting fake news. The proposed approach (ELD-FN) differs from these baseline approaches as it not only works with multi-modal features but also considers the sentiment involved in the textual information of news. Most of the state-of-the-art fake news classification approaches can be categorized as follows: 1) fake news classification approaches for single-modality and 2) fake news classification approaches for multi-modality.

A. FAKE NEWS CLASSIFICATION APPROACHES FOR SINGLE-MODALITY
The fake news classification approaches for single-modality can be further divided into two categories based on the text/image features.

1) SINGLE-MODALITY BASED CLASSIFICATION APPROACHES USING TEXTUAL FEATURES
Textual features can be divided into generic and latent categories. Traditional machine learning algorithms usually utilize generic textual features. These algorithms analyze text at linguistic levels such as lexicon, syntax, discourse, and semantics. Previous research has compiled a detailed table summarizing these features [10]. Latent textual features, in contrast, consist of the embeddings extracted from the textual data of news at the word, sentence, or document level. Latent vectors are constructed from the textual news data and are then used as input for classifiers, e.g., SVM.

Recurrent neural networks (RNNs) are potent in modeling and analyzing sequential data. For example, Ma et al. used RNNs to capture relevant information over time by learning hidden layer representations [11]. Meanwhile, Chen et al. proposed a CNN-based approach for the classification [12]. Moreover, a novel technique, the Attention-Residual Network (ARC), is introduced to acquire long-range features. Ma et al. introduced a Generative Adversarial Network (GAN)-based model that employs a Generator network based on Gated Recurrent Units (GRU) to generate contentious instances.
Furthermore, a Discriminator network based on RNNs is designed to identify essential features [13].

RNN-based models have proven very effective in classifying fake news detection datasets. However, RNN-based models prioritize the most recent inputs of a sequence, whereas the essential features may not be located at the end of the sequence. Yu et al. proposed a CNN-based approach that resolves this issue. The proposed technique does not prioritize recent inputs and instead extracts features based on the relationships among the essential features [14]. Vaibhav and Hovy utilized a graph-based approach for classifying news articles [15]. For this purpose, they used Graph Neural Networks, such as Graph Convolutional Networks (GCN) and Graph Attention Networks (GAT), to create graph embeddings for fake news detection. Wu et al. utilized multi-task learning techniques to classify and detect fake news, in which the stance classification task optimizes the shared layers concurrently, improving news representations [16]. Cheng et al. utilized an LSTM model to classify textual news data. They used a variational autoencoder to extract essential textual features at the tweet level. Some researchers have assumed that complex and multi-dimensional news information is not accessible initially, and that the accessibility of only text-based news depends on its popularity [17]. Qian et al. developed a text-based model that uses word/sentence-level information from real articles to generate user feedback, and then uses this feedback along with that information in the classification process for early detection [18]. This addressed the scarcity of user reviews as an auxiliary source of information. Giachanou et al. investigated the influence of emotional cues.
They proposed an LSTM model that integrates emotional signals extracted from claim texts to differentiate between true and false news [19].

2) SINGLE-MODALITY BASED CLASSIFICATION APPROACHES USING IMAGE FEATURES
As multimedia becomes more prevalent in social networks, news now contains text and visual information such as images and videos that convey rich meaning. However, textual feature-based approaches face challenges in effectively capturing visual information because of the heterogeneity between text and image data. Consequently, many researchers have proposed image-based approaches for detecting fake news. Classical image-based models utilized basic numerical features of images [20], [26], such as image count, popularity [27], and type, to identify fake news. For impaired images, complex forensics features were extracted. Furthermore, post-based and user-based features were integrated to identify fake news [28]. However, it became evident that basic numerical features are inadequate to describe the complex visual information of news images.

Deep learning models such as CNNs have proven effective in capturing visual features in news images. Many studies have shown that features extracted from CNN models can be used in visual recognition tasks to generate generic image representations [29]. Building on the success of CNNs, recent studies have utilized pre-trained deep CNNs like VGG19 [30], [31] to obtain generic visual representations [32], [33]. Researchers have suggested multi-domain visual neural models to capture the inherent traits of fabricated news images more effectively. These multi-domain models merge frequency-domain and pixel-domain visual data to differentiate between genuine and fabricated news based on visual characteristics [34]. Poor quality is a common trait of fake news images. The poor-quality feature and the image semantics are visible in the frequency and pixel domains.
The quality feature is extracted by a CNN model, and the semantics of the images are extracted by a CNN-RNN model.

B. FAKE NEWS CLASSIFICATION APPROACHES FOR MULTI-MODALITY
Word-based and image-based information are both important in detecting fake news. As social networks often contain both types of information, combining them can improve performance. This section discusses the different multi-modal approaches for fake news detection, categorized based on the different perspectives they adopt.

1) PROBLEMS IN MULTI-MODALITY
Several studies have explored using visual information to complement textual information in detecting fake news. These studies typically use text-based and image-based encoders to extract textual and visual features, respectively. These feature vectors are then combined into an overall feature vector for each news item. For example, Wang et al. proposed event classification as an additional task to enhance the ability of the model to generalize to event-invariant multi-modal features [32]. Other researchers, such as Singhal et al., use a combination of text-based and image-based features. They utilize BERT and XLNet pre-trained models for encoding text-based and image-based data, respectively [35]. However, these approaches have proven to be limited in effectively detecting multi-modal fake news because of their limited ability to capture complex cross-modal correlations. More advanced multi-modal techniques are needed to improve the performance of fake news detection.

2) FLEXIBILITY IN MULTI-MODALITY
Some studies have recognized that irrelevant images are a common characteristic of multi-modal fake news and have focused on measuring the consistency between the text and visual components in detection. One approach by Zhou and Zafarani [36] used an image captioning model to generate sentences from images and then measured the similarity between those sentences and the original text.
However, this approach was constrained by the discrepancies between the training data of the image captioning model and the real news corpus. Another approach by Xue et al. projected the visual and textual features into a shared feature space and computed the similarities between the resulting multi-modal features. However, they encountered difficulties in capturing multi-modal inconsistencies because of the semantic gap between the two types of features [37]. Ghorbanpour et al. [38] proposed the Fake-News-Revealer (FNR) method, which uses a Vision Transformer [39] and BERT [5] to extract image and text features, respectively. The model extracts textual and visual features separately and determines their similarity through a loss function.

3) IMPROVEMENT IN MULTI-MODALITY
Several researchers have proposed different approaches for fake news detection using multi-modal data. Jin et al. utilized an RNN model and applied an attention mechanism to combine information extracted from textual, visual, and social context data [40]. Zhang et al. [41] used a multi-channel CNN with an attention mechanism to combine multi-modal information, while Song et al. [42] proposed the co-attention transformer to model the bidirectional enhancement between images and text. Qian et al. developed a Hierarchical Multi-modal Contextual Attention Network (HMCAN), which was designed to collectively capture multi-modal context data and the hierarchical semantics of text [43]. Wu et al. introduced the Multi-modal Co-Attention Network (MCAN), which extracts spatial-domain and frequency-domain features from the image and text, and fuses visual and textual features using multiple co-attention layers [44]. Other researchers have also utilized Graph Convolutional Networks (GCN) and entity-centric cross-modal interaction to model the relationship between word-based and image-based objects.
Finally, Zhang et al. and Laura et al. proposed a BERT-based multi-modal model to encode text-based and image-based information. The model effectively captures the interplay between text and images and employs contrastive learning to enhance multi-modal representations [24], [45]. Another study integrated visual entities to enhance the comprehension of high-level semantics in news images and to model the inconsistencies and mutual enhancements of multi-modal entities [22].

In summary, when performing multi-modal fake news detection, there are three important inductive biases to consider when examining text-image correlations. Firstly, images provide additional information to the text, highlighting the need for multi-modal analysis. Secondly, problems between text and images, such as inconsistencies, can serve as a potential signal for detecting fake news using multiple modalities. Finally, text-based and image-based data can improve performance by identifying essential features.

III. METHODOLOGY
A. OVERVIEW
The overview of ELD-FN is depicted in Fig. 1. The following are the main steps of ELD-FN.
1) First, the publicly available multi-modal dataset (Fakeddit) is collected from Google Drive (https://fakeddit.netlify.app/, accessed on 15-01-2023).
2) Next, it leverages NLP techniques, e.g., tokenization, stop-word removal, lowercase conversion, and lemmatization, for preprocessing the textual information of news.
3) Then, it computes the sentiment from the text of each news item.
4) After that, it generates embeddings for the text and images of the corresponding news by leveraging V-BERT.
5) Finally, it passes the embeddings to the deep learning ensemble model for training and testing.

B. PROBLEM DEFINITION
A news item n from a multi-modal news dataset N can be represented as follows:

$$n = \langle t, i, s \rangle \tag{1}$$

where t is the textual information of n, i is the image of n, and s is the status assigned to n, i.e., whether n is fake or true.
ELD-FN suggests the status of a new news item as either true or false, where true represents that the news is real and false represents that the news is fake. Consequently, the automatic classification of a new news item n can be defined as a mapping f:

$$f : n \rightarrow c, \quad c \in \{true, false\}, \quad n \in N \tag{2}$$

where c is a suggested status from the news status set {true, false}.

FIGURE 1. The overview of ELD-FN.

C. PREPROCESSING
The news may contain inappropriate and unnecessary text, e.g., English stop-words. Such information is considered an overhead for machine learning classification algorithms because of processing time and memory utilization. Therefore, preprocessing of the news text is essential to make ELD-FN fast and memory efficient. We perform the following preprocessing steps to clean the text of news.

1) TOKENIZATION
Text tokenization breaks down a piece of text into smaller units called tokens. Tokens are individual words, phrases, or other meaningful text elements, which can be analyzed and processed further.

2) SPECIAL CHARACTER REMOVAL
The text of news may contain special characters, e.g., the semicolon (;). This step removes the special characters from the list of tokens.

3) STOP-WORD REMOVAL
English text contains words that carry little meaning on their own but are needed to form grammatical sentences, called stop-words. This step removes stop-words from the working list.

4) SPELL CORRECTION AND LOWERCASE CONVERSION
This step identifies and corrects the spelling mistakes in the working list of tokens and converts the tokens to lowercase.

5) LEMMATIZATION
The lemmatization step reduces inflected words, such as comparatives, to their base form, e.g., lemmatization converts the word darker into dark.

We exploit the Python Natural Language Toolkit (NLTK) (https://www.nltk.org/, accessed on 15-01-2013) for the preprocessing of news. The preprocessed news can be represented as follows:

$$n' = \langle t', i, s \rangle \tag{3}$$

$$t' = \langle t_1, t_2, \ldots, t_n \rangle \tag{4}$$

where $t_1, t_2, \ldots, t_n$ are the tokens from the text of n after preprocessing.

D. SENTIMENT ANALYSIS
Sentiment analysis is an NLP technique that involves identifying and extracting subjective information from text, i.e., opinions, attitudes, emotions, and sentiments towards a particular topic. It automatically classifies the polarity of a text as positive, negative, or neutral. We exploit the TextBlob API (https://textblob.readthedocs.io/en/dev/, accessed on 15-01-2023) for the computation of sentiment. The news (mentioned in Eq. 3) after sentiment computation can be represented as follows:

$$n' = \langle v, t', i, s \rangle \tag{5}$$

where v is the sentiment of n′.

E. FEATURE MODELING
This step passes the preprocessed text and images from the multi-modal dataset to V-BERT to generate the embeddings. V-BERT is an extension of the BERT model that combines the power of BERT with a visual grounding mechanism, allowing it to understand the relationship between the text and the visual information in an image. This is achieved by combining a region-based visual feature extractor with the BERT model, where each image region is encoded into a vector using a CNN. These visual features are concatenated with the input text, and the resulting sequence is fed into the BERT model. During training, V-BERT is optimized to minimize a joint loss function. This allows V-BERT to learn language and vision representations in a unified framework and capture the complex interactions between the two modalities. The layers/steps involved in ELD-FN for identifying fake/real news are as follows.

1) BERT SHARED LAYER
For the news text, the BERT shared layer is implemented using a pre-trained Seq2Seq model [8]. Fine-tuning is indispensable to achieve better results. To improve its efficiency, separate BERT-shared layers are adopted for model-to-model textual features.
The output of the news text feature extractor $O^{T}_{BERT}$ can be represented as follows:

$$O^{T}_{BERT} = BERT^{T}(X^{T}) \tag{6}$$

where $BERT^{T}$ is the relevant BERT-shared layer modeling for news text and $X^{T}$ is the input representation of the textual data.

2) IMAGE EMBEDDING LAYER
For the news image, the Faster-RCNN model [8] is applied to extract features from the image. The detected objects may provide visual contexts of the whole picture and be linked to specific terms through detailed region details. We also add a position embedding feature to images by encoding the object location. The output of the image feature extractor $O^{I}_{BERT}$ can be represented as follows:

$$O^{I}_{BERT} = BERT^{I}(X^{I}) \tag{7}$$

where $BERT^{I}$ is the relevant BERT-shared layer modeling for images, and $X^{I}$ is the input representation of the images.

3) PRE-FEATURE EXTRACTION
The BERT-shared layer is strong enough for feature extraction. It includes a pre-feature extractor to enhance the ability of BERT to learn semantic characteristics. The pre-feature extractor consists of the Position-wise Convolution Transformation (PCT) and the Multi-Head Self-Attention (MSA) layer.

4) MULTI-MODAL FEATURE CONCATENATION
After extracting the latent features of the text and image, these are concatenated to obtain the desired multi-modal feature representations. The multi-modal concatenated features $O^{f}_{BERT}$ can be represented as follows:

$$O^{f}_{BERT} = O^{T}_{BERT} + O^{I}_{BERT} \tag{8}$$

where + denotes the concatenation of the two feature vectors.

F. ENSEMBLE MODEL
Bagging (bootstrap aggregating) and boosting [46] are two approaches to building ensembles of machine learning models. We applied both approaches with CNN and LSTM models. Four different ensemble architectures (bagged CNN, bagged LSTM, boosted CNN, boosted LSTM) were experimented with to predict fake/real news. Note that bagged CNN is the proposed ensemble model, as it yields better results than the other mentioned ensemble architectures.
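To make the bagging idea concrete, the following is a minimal sketch and not the implementation used in this study. It assumes a tiny logistic-regression member as a stand-in for the CNN, uniform member weights (alpha = 1/N), and toy two-dimensional features; the names `fit_member`, `bagged_fit`, and `bagged_predict` are hypothetical. Each member is trained on a bootstrap resample, and the prediction is accumulated as a weighted sum of member outputs, in the spirit of the accumulation loop of Algorithm 1.

```python
import math
import random


def _sigmoid(z):
    # Clamp z so that math.exp never overflows.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_member(X, y, epochs=200, lr=0.5):
    """Fit one logistic-regression member with plain SGD (stand-in for a CNN)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = _sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            g = p - yi  # gradient of the log-loss with respect to z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b


def predict_proba(member, x):
    w, b = member
    return _sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def bagged_fit(X, y, n_members=5, seed=0):
    """Train each member on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        members.append(fit_member([X[i] for i in idx], [y[i] for i in idx]))
    return members


def bagged_predict(members, x):
    """Accumulate y_hat += alpha * f_h(x) with uniform alpha (an assumption)."""
    alpha = 1.0 / len(members)
    y_hat = 0.0
    for member in members:
        y_hat += alpha * predict_proba(member, x)
    return int(y_hat >= 0.5), y_hat


# Toy features (hypothetical): label 1 = real, label 0 = fake.
X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
y = [0, 0, 0, 1, 1, 1]
members = bagged_fit(X, y)
label, score = bagged_predict(members, [1.0, 1.0])
print(label, round(score, 3))
```

Boosting would differ in that members are trained sequentially on re-weighted samples and combined with non-uniform weights; that variant is not shown here.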
The predictions through the different architectures are made using Algorithm 1.

Algorithm 1 Ensemble Model
1: procedure ENSEMBLE MODEL
2:   Input: $X_{t_{T+1}}$, $\alpha^{g}_{b}$, $\int^{g}_{b,h}(\sigma, W^{g}_{b,h}, b^{g}_{b,h}) \cdots \int^{g}_{b,N^{g}_{b}}(\sigma, W^{g}_{b,N^{g}_{b}}, b^{g}_{b,N^{g}_{b}})$
3:   Initialize: $\hat{y}_{t_{T+1}}$, $h \leftarrow 2$
4:   $X \xrightarrow{\int^{t}_{1}(\sigma, W^{t}_{1}, b^{t}_{1})} y$
5:   while $h \le N^{g}_{b}$ do
6:     $\hat{y}_{t_{T+1}} \leftarrow \hat{y}_{t_{T+1}} + \alpha^{g}_{b} \int^{g}_{b,h}(\sigma, W^{g}_{b,h}, b^{g}_{b,h}, X_{t_{T+1}})$
7:     $h = h + 1$
8:   end while
9:   Output: $\hat{y}_{t_{T+1}}$
10: end procedure

where $X_{t_{T+1}}$ is the feature set at time instances, $\alpha^{g}_{b}$ are the weights of bagging or boosting, $\int^{g}_{b,h}(\sigma, W^{g}_{b,h}, b^{g}_{b,h}) \cdots \int^{g}_{b,N^{g}_{b}}(\sigma, W^{g}_{b,N^{g}_{b}}, b^{g}_{b,N^{g}_{b}})$ is the set of ensembled bagged or boosted models, $\hat{y}_{t_{T+1}}$ is the output of the ensembled model, $X$ is the feature set, $y$ is the instance of the output, $\sigma$ is the activation function, and $W^{g}_{b,h}$ are the weights of the bagging or boosting models.

IV. EVALUATION
This section constructs the research questions to evaluate ELD-FN, explains the exploited dataset, defines the metrics and evaluation process, and reports the findings and threats to validity.

A. RESEARCH QUESTIONS (RQs)
The following research questions are investigated to evaluate ELD-FN.
• RQ1: Does ELD-FN outperform the baseline approaches?
• RQ2: Does news sentiment influence the identification of fake news?
• RQ3: Does preprocessing influence the identification of fake news?
• RQ4: Does ELD-FN outperform other classifiers regarding identifying fake news?

RQ1 compares ELD-FN with the baseline approaches [24], [25], named FakeNED and MultiFND in the rest of this paper. These approaches are selected as baselines because both were recently proposed, are closely related to our work, and exploit the same dataset.

RQ2 investigates the influence of news sentiment on detecting fake news. It evaluates whether positive news is likely to be considered true or vice versa.

RQ3 examines the impact of preprocessing the news text on detecting fake news.
RQ4 investigates the impact of different deep-learning classification algorithms on ELD-FN. We analyze ELD-FN and other deep learning approaches to evaluate the performance of ELD-FN.

B. DATASET
The description of the exploited fake news dataset, Fakeddit, is presented in Table 1. The dataset is public (available online at https://github.com/entitize/fakeddit, accessed on 15-01-2023). Nakamura et al. [47] collected the data from Reddit, a social news and discussion website. It consists of over 1 million pieces of news (1,063,106) from 22 subreddits. It is classified in three different ways: 2-way, 3-way, and 6-way. The dataset samples with 6-way classification are represented in Fig. 2. Out of the total samples, 59.12% (628,501) and 40.48% (527,049) are fake and real news, respectively. However, only 64.25% (682,966) of the samples are multi-modal. Note that we only use the multi-modal data samples with 2-way classification to evaluate the proposed approach. Moreover, Fig. 3 and Fig. 4 represent the word cloud (most common words in the dataset) and the frequency of the words, respectively.

FIGURE 2. Dataset example with 6-way classification [47].
TABLE 1. Description of Fakeddit dataset.
FIGURE 3. Word cloud - most common words in details.

C. PROCESS
This section explains the evaluation process of ELD-FN. After performing the preprocessing and feature modeling as mentioned in Section III, a 10-fold cross-validation technique is applied to train and test ELD-FN. The reason for considering 10-fold cross-validation is that it helps avoid data bias and reduces the variance in performance estimation that might be observed with a single train-test split [48]. The dataset’s multi-modal news N is broken down into ten (10) slices $C_i$, where i = 1, 2, . . . , 10.
For each cross-validation, the slices of N other than $C_i$ are selected as training samples ($N_t$), and the news from $C_i$ is used as testing samples ($N_v$). The step-by-step evaluation process for the ith cross-validation is as follows:
1) all news items in N except those in $C_i$ are extracted and combined as $N_t$;
2) an ensemble deep learning classifier is trained on $N_t$;
3) a CNN classifier is trained on $N_t$;
4) an LSTM classifier is trained on $N_t$;
5) baseline classifiers are trained on $N_t$;
6) we predict whether each news item from the testing samples $C_i$ is real or fake; and
7) the below-mentioned evaluation metrics are computed for each classifier.

FIGURE 4. Minimum and maximum words.

D. METRICS
We train and test the deep learning classifiers to evaluate the performance of ELD-FN. We select the most widely accepted metrics (accuracy, precision, recall, and f1-score) for this purpose. Furthermore, we compute the MCC and OR to check the effectiveness of the classifiers. The selected metrics can be presented as follows:

$$accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{9}$$

$$precision = \frac{TP}{TP + FP} \tag{10}$$

$$recall = \frac{TP}{TP + FN} \tag{11}$$

$$f1\text{-}score = \frac{2 \cdot precision \cdot recall}{precision + recall} \tag{12}$$

$$MCC = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{13}$$

$$OR = \frac{TP / FP}{FN / TN} \tag{14}$$

where TP and TN are the numbers of news items correctly predicted as real and fake, respectively. Similarly, FP and FN are the numbers of news items incorrectly predicted as real and fake, respectively.

TABLE 2. Performance of ELD-FN and baseline approaches.

E. RESULTS
1) RQ1: COMPARISON OF ELD-FN AGAINST BASELINE APPROACHES
Table 2 and Fig. 5 present the evaluation metrics for three different approaches (ELD-FN, FakeNED, MultiFND) based on their accuracy, precision, recall, F1-score, MCC, and OR.
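For reference, the six metrics of Eqs. 9-14 reported in Table 2 can all be derived from the four confusion-matrix counts. The sketch below is a minimal illustration under stated assumptions: the function name `classification_metrics` and the example counts are hypothetical and are not taken from this study, and every denominator is assumed to be non-zero.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute Eqs. 9-14 from confusion-matrix counts (non-zero denominators assumed)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    odds_ratio = (tp / fp) / (fn / tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
        "or": odds_ratio,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, tn=80, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

An MCC above 0 indicates better-than-chance agreement between predicted and actual classes, and an OR above 1 indicates that a positive prediction raises the odds that the item is actually positive.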
The results show that the average values of these metrics for ELD-FN, FakeNED, and MultiFND are (88.83%, 93.54%, 90.29%, 91.89%, 0.49, and 17.02), (89.25%, 91.12%, 87.54%, 89.29%, 0.45, and 15.78), and (78.91%, 85.27%, 76.42%, 80.60%, 0.39, and 13.95), respectively.

The f1-score distributions over cross-validation for ELD-FN, FakeNED, and MultiFND are presented in Fig. 6. A beanplot is a visualization that displays the distribution of a continuous variable across different groups. The beanplot compares the f1-score distributions by plotting one bean for each approach. The width of a bean represents the density of the data, with wider sections indicating higher density.

The following observations are made from Table 2, Fig. 5, and Fig. 6.
• ELD-FN attains an accuracy of 88.83%, which is slightly below that of FakeNED (89.25%) and well above that of MultiFND (78.91%), and the highest precision (93.54%), indicating that it has the highest proportion of true positives among the instances predicted as positive.
• ELD-FN has the highest recall (90.29%) and F1-score (91.89%), indicating that it has the highest ability to correctly identify positive instances and achieves the best balance between precision and recall.
• ELD-FN also has the highest MCC (0.49) and OR (17.02), indicating a better correlation between predicted and actual classifications and higher odds of event occurrence than FakeNED and MultiFND. The average MCC of ELD-FN is positive and the largest of the three approaches (0.49 > 0.45 > 0.39 > 0), and its average OR is greater than 1 and also the largest (17.02 > 15.78 > 13.95 > 1), which confirms its effectiveness.
• The minimum f1-score of ELD-FN is higher than the maximum f1-scores of FakeNED and MultiFND (shown in Fig. 6).

To validate the significance of the difference in the means of performance (f1-score) over all iterations of ELD-FN, FakeNED, and MultiFND, we perform a single-factor Analysis of Variance (ANOVA). ANOVA is a statistical method used to test whether there is a significant difference in the means of three or more independent groups or samples. It is conducted in Excel with its default settings and presented in Fig. 7.
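A single-factor ANOVA compares the variance between groups with the variance within groups. The following minimal sketch computes the F statistic by hand; the function name `one_way_anova_f` and the toy scores are illustrative assumptions and are not the per-fold f1-scores of this study. Deciding significance additionally requires the critical value or a p-value from the F distribution, which is omitted here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of the group means.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Within-group sum of squares: spread inside each group.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy scores for three approaches (hypothetical values).
toy_scores = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(toy_scores)
print(f_stat, df_b, df_w)
```

The null hypothesis of equal means is rejected when the F statistic exceeds the critical value for (df_between, df_within) degrees of freedom at the chosen significance level.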
The ANOVA result indicates that F > Fcrit and p-value < α (α = 0.05) hold for the f1-score, so the factor (the choice of approach) has a significant effect on the f1-score.
Moreover, we utilize two re-sampling methods, over-sampling and under-sampling, to tackle the class imbalance within the dataset. Over-sampling adds samples to the minority class through RandomOverSampler, while under-sampling removes surplus records from the majority class using RandomUnderSampler. The findings reveal that employing under-sampling results in accuracy, precision, recall, and F1-score values of 86.12%, 92.54%, 88.76%, and 90.61%, respectively. However, it is important to note that under-sampling reduces the number of majority class samples, leading to a loss of information. Consequently, the performance of the fine-tuned BERT model on both the majority and minority classes declines when under-sampling is applied. In contrast, the over-sampling technique yields accuracy, precision, recall, and F1-score values of 90.26%, 94.37%, 91.88%, and 93.11%, respectively. This enhancement is attributed to BERT being exposed to a larger dataset, enabling it to learn meaningful patterns more effectively.
The preceding analysis concludes that ELD-FN outperforms the baseline approaches in detecting fake news.

2) RQ2: INFLUENCE OF SENTIMENT ON ELD-FN
The evaluation results of ELD-FN with and without sentiment analysis are presented in Table 3 and Fig. 8. The results with sentiment enabled and disabled, in terms of accuracy, precision, recall, F1-score, MCC, and OR, are (88.83%, 93.54%, 90.29%, 91.89%, 0.49, and 17.02) and (88.12%, 90.38%, 89.98%, 90.17%, 0.49, and 17.02), respectively.
From Table 3 and Fig. 8, it is observed that disabling sentiment (i.e., using textual features only) reduces precision from 93.54% to 90.38% and f1-score from 91.89% to 90.17%.
However, MCC and OR remain the same. Table 5 presents the relationship between sentiment and news. It shows that 65.84% of real news is negative, whereas only 34.16% of real news is positive. Similarly, 73.71% of fake news is negative, whereas only 26.29% of fake news is positive. This means that fake news is 180.37% = (73.71% - 26.29%) / 26.29% more likely to carry negative sentiment than positive sentiment. For example, if a fake news article portrays a political figure negatively, it can contribute to a negative sentiment towards that figure among the public and will propagate quickly.

FIGURE 5. Performance of ELD-FN and baseline approaches.

FIGURE 6. Distribution of f-measure.

The preceding analysis concludes that sentiment and features are critical for detecting fake news and that disabling either would significantly reduce the performance of ELD-FN.

3) RQ3: INFLUENCE OF PREPROCESSING ON ELD-FN
The evaluation results of ELD-FN with and without preprocessing are presented in Table 4 and Fig. 9. The results with preprocessing enabled and disabled, in terms of accuracy, precision, recall, F1-score, MCC, and OR, are (88.83%, 93.54%, 90.29%, 91.89%, 0.49, and 17.02) and (88.49%, 92.95%, 90.11%, 90.50%, 0.49, and 17.02), respectively.
From Table 4 and Fig. 9, it is observed that disabling preprocessing reduces accuracy from 88.83% to 88.49%, precision from 93.54% to 92.95%, recall from 90.29% to 90.11%, and f1-score from 91.89% to 90.50%. However, MCC and OR remain the same.

FIGURE 7. ANOVA analysis on performance comparison.

The preceding analysis concludes that text preprocessing and features are critical for detecting fake news and that disabling either would significantly reduce the performance of ELD-FN.
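The relative figures quoted in RQ2 and RQ3, such as the 180.37% derived from Table 5, all follow the same arithmetic: the difference between two values divided by the reference value. A minimal sketch is given below. The helper name `relative_change` is hypothetical, and the inputs are the percentages quoted in the text above.

```python
def relative_change(new, old):
    """Percentage change of `new` relative to the reference value `old`."""
    return (new - old) / old * 100.0


# Sentiment skew of fake news: 73.71% negative versus 26.29% positive.
negative_vs_positive = relative_change(73.71, 26.29)

# Precision when sentiment is disabled (90.38%) versus enabled (93.54%).
precision_without_sentiment = relative_change(90.38, 93.54)

print(f"{negative_vs_positive:.2f}%")
print(f"{precision_without_sentiment:.2f}%")
```

A negative result indicates a drop relative to the reference configuration, which makes the ablation settings directly comparable on a single scale.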
4) RQ4: COMPARISON OF ELD-FN AGAINST OTHER CLASSIFIERS
We select two off-the-shelf deep learning classifiers, CNN and LSTM, because they are the most widely adopted and well known. Note that the preprocessed text, its sentiment, and the feature embeddings are given as input to the selected classifiers for comparative analysis. We set the hyper-parameter values as dropout = 0.2, recurrent_dropout = 0.2, loss function = binary-crossentropy, and activation = sigmoid for ELD-FN and both baseline classifiers.
Table 6 and Fig. 10 present the evaluation metrics for ELD-FN, CNN, and LSTM based on their accuracy, precision, recall, F1-score, MCC, and OR. The results show that the average values of these metrics for ELD-FN, CNN, and LSTM are (88.83%, 93.54%, 90.29%, 91.89%, 0.49, and 17.02), (86.73%, 92.56%, 85.81%, 89.06%, 0.48, and 16.97), and (86.51%, 90.22%, 86.19%, 88.21%, 0.48, and 16.92), respectively.

TABLE 3. Influence of sentiment on ELD-FN.

TABLE 4. Influence of preprocessing on ELD-FN.

FIGURE 8. Influence of sentiment on ELD-FN.

The following observations are made from Table 6 and Fig. 10.
• ELD-FN outperforms CNN and LSTM. The performance improvement of ELD-FN over CNN in accuracy, precision, recall, f1-score, MCC, and OR is 2.42%, 1.06%, 5.22%, 3.18%, 0.01, and 0.05, respectively. The performance improvement of ELD-FN over LSTM in accuracy, precision, recall, f1-score, MCC, and OR is 2.68%, 3.68%, 4.76%, 4.17%, 0.01, and 0.10, respectively.
• ELD-FN performs better than LSTM because LSTM is better suited to short text and performs sequential processing, which is unnecessary in our case. In contrast, CNN has proven efficient for long text and is better at extracting local invariant features.
The preceding analysis concludes that ELD-FN outperforms other classifiers in detecting fake news.

F. THREATS TO VALIDITY
The possibility of incorrectly labeled news is the first threat to construct validity. This research assumes that the labels assigned by Nakamura et al. [47] are correct. However, incorrectly labeled data may affect the performance of ELD-FN.

TABLE 5. Relation between sentiment and news.

The choice of assessment metrics for ELD-FN is another threat to construct validity. The chosen metrics are the most widely accepted in the literature for classification tasks.
The choice of the sentiment analysis repository is the first threat to internal validity. The chosen repository (Section III-E) is publicly available and achieves good results in computing sentiment. Exploiting other repositories may affect the performance of ELD-FN.
The implementation of ELD-FN, FakeNED, and MultiFND is the second threat to internal validity. The code and the produced results of ELD-FN, FakeNED, and MultiFND are verified to mitigate the threat. However, unknown errors may affect the performance of ELD-FN.
The hyper-parameter setting of ELD-FN is the third threat to internal validity. The hyper-parameter setting for ELD-FN is given in Section IV-E4. A change in these settings may affect the performance of ELD-FN.

FIGURE 9. Influence of preprocessing on ELD-FN.

TABLE 6. Comparison of ELD-FN against other classifiers.

FIGURE 10. Comparison of ELD-FN against other classifiers.

V. CONCLUSION AND FUTURE WORK
Automatic fake news detection is crucial to avoid spreading false information that can have serious consequences, ranging from reputational damage to social and political unrest. In some cases, fake news can even incite violence and lead to harm or loss of life. Therefore, the ability to automatically identify and flag false information can help mitigate the threats of fake news. From this perspective, this paper proposes an ensemble deep learning-based detection of fake news.
The proposed approach leverages NLP techniques for preprocessing textual information of news, computes the sentiment from the text of each news item, generates embeddings for the text and images of the corresponding news by leveraging V-BERT, and passes the embeddings to the deep learning ensemble model for training and testing. The evaluation results significantly outperform the state-of-the-art approaches in identifying fake news.
In the future, we would like to investigate the need to adapt detection algorithms to new types of media. Fake news is not limited to text-based content, and algorithms must be able to detect false information in images, videos, and audio as well. Moreover, we are interested in improving the interpretability of detection algorithms. Current methods often rely on opaque deep learning models, making it difficult to understand how decisions are being made. Future work could focus on developing more transparent models or tools that help users understand how algorithms arrive at their conclusions.

REFERENCES
[1] S. De Sarkar, F. Yang, and A. Mukherjee, "Attending sentences to detect satirical fake news," in Proc. 27th Int. Conf. Comput. Linguistics, 2018, pp. 3371–3380.
[2] H. Allcott and M. Gentzkow, "Social media and fake news in the 2016 election," J. Econ. Perspect., vol. 31, no. 2, pp. 211–236, May 2017.
[3] A. Moon. (2017). Two-Thirds of American Adults Get News From Social Media: Survey. [Online]. Available: https://uk.reuters.com/article/us-usa-internet-socialmedia/two-thirds-of-american-adults-get-news-from-social-media-survey-idUKKCN1BJ2A8
[4] K. Shu, A. Sliva, S. Wang, J. Tang, and H. Liu, "Fake news detection on social media: A data mining perspective," ACM SIGKDD Explor. Newslett., vol. 19, no. 1, pp. 22–36, 2017.
[5] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," 2018, arXiv:1810.04805.
[6] A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, "Improving language understanding by generative pre-training," Tech. Rep., 2018.
[7] J. Sarzynska-Wawer, A. Wawer, A. Pawlak, J. Szymanowska, I. Stefaniak, M. Jarkiewicz, and L. Okruszek, "Detecting formal thought disorder by deep contextualized word representations," Psychiatry Res., vol. 304, Oct. 2021, Art. no. 114135.
[8] L. Harold Li, M. Yatskar, D. Yin, C.-J. Hsieh, and K.-W. Chang, "VisualBERT: A simple and performant baseline for vision and language," 2019, arXiv:1908.03557.
[9] S. Afroz, M. Brennan, and R. Greenstadt, "Detecting hoaxes, frauds, and deception in writing style online," in Proc. IEEE Symp. Secur. Privacy, May 2012, pp. 461–475.
[10] X. Zhou, J. Wu, and R. Zafarani, "SAFE: Similarity-aware multi-modal fake news detection," in Proc. Advances in Knowledge Discovery and Data Mining. Cham, Switzerland: Springer, 2020, pp. 354–367.
[11] J. Ma, W. Gao, P. Mitra, S. Kwon, B. J. Jansen, K.-F. Wong, and M. Cha, "Detecting rumors from microblogs with recurrent neural networks," in Proc. Int. Joint Conf. Artif. Intell. (IJCAI), 2016, pp. 3818–3824.
[12] Y. Chen, J. Sui, L. Hu, and W. Gong, "Attention-residual network with CNN for rumor detection," in Proc. 28th ACM Int. Conf. Inf. Knowl. Manage., Nov. 2019, pp. 1121–1130.
[13] J. Ma, W. Gao, and K.-F. Wong, "Detect rumors on Twitter by promoting information campaigns with generative adversarial learning," in Proc. World Wide Web Conf., May 2019, pp. 3049–3055.
[14] F. Yu, Q. Liu, S. Wu, L. Wang, and T. Tan, "A convolutional approach for misinformation identification," in Proc. 26th Int. Joint Conf. Artif. Intell., Aug. 2017, pp. 3901–3907.
[15] V. Vaibhav, R. M. Annasamy, and E. Hovy, "Do sentence interactions matter? Leveraging sentence level representations for fake news classification," 2019, arXiv:1910.12203.
[16] L. Wu, Y. Rao, H. Jin, A. Nazir, and L. Sun, "Different absorption from the same sharing: Sifted multi-task learning for fake news detection," 2019, arXiv:1909.01720.
[17] M. Cheng, S. Nazarian, and P. Bogdan, "VRoC: Variational autoencoder-aided multi-task rumor classifier based on text," in Proc. Web Conf., 2020, pp. 2892–2898.
[18] F. Qian, C. Gong, K. Sharma, and Y. Liu, "Neural user response generator: Fake news detection with collective user intelligence," in Proc. 27th Int. Joint Conf. Artif. Intell., Jul. 2018, pp. 3834–3840.
[19] A. Giachanou, P. Rosso, and F. Crestani, "Leveraging emotional signals for credibility detection," in Proc. 42nd Int. ACM SIGIR Conf. Res. Develop. Inf. Retr., Jul. 2019, pp. 877–880.
[20] K. Wu, S. Yang, and K. Q. Zhu, "False rumors detection on Sina Weibo by propagation structures," in Proc. IEEE 31st Int. Conf. Data Eng., Apr. 2015, pp. 651–662.
[21] P. Li, X. Sun, H. Yu, Y. Tian, F. Yao, and G. Xu, "Entity-oriented multi-modal alignment and fusion network for fake news detection," IEEE Trans. Multimedia, vol. 24, pp. 3455–3468, 2022.
[22] P. Qi, J. Cao, X. Li, H. Liu, Q. Sheng, X. Mi, Q. He, Y. Lv, C. Guo, and Y. Yu, "Improving fake news detection by using an entity-enhanced framework to fuse diverse multimodal clues," in Proc. 29th ACM Int. Conf. Multimedia, Oct. 2021, pp. 1212–1220.
[23] C. Song, N. Ning, Y. Zhang, and B. Wu, "A multimodal fake news detection model based on crossmodal attention residual and multichannel convolutional neural networks," Inf. Process. Manage., vol. 58, no. 1, Jan. 2021, Art. no. 102437.
[24] L. D. Sciucca, M. Mameli, E. Balloni, L. Rossi, E. Frontoni, P. Zingaretti, and M. Paolanti, "FakeNED: A deep learning based-system for fake news detection from social media," in Proc. Int. Conf. Image Anal. Process., 2022, pp. 303–313.
[25] I. Segura-Bedmar and S. Alonso-Bartolome, "Multimodal fake news detection," Information, vol. 13, no. 6, p. 284, Jun. 2022. [Online]. Available: https://www.mdpi.com/2078-2489/13/6/284
[26] F. Yang, Y. Liu, X. Yu, and M. Yang, "Automatic detection of rumor on Sina Weibo," in Proc. ACM SIGKDD Workshop Mining Data Semantics, Aug. 2012, pp. 1–7.
[27] Z. Jin, J. Cao, Y. Zhang, J. Zhou, and Q. Tian, "Novel visual and statistical image features for microblogs news verification," IEEE Trans. Multimedia, vol. 19, no. 3, pp. 598–608, Mar. 2017.
[28] C. Boididou, S. Papadopoulos, D.-T. Dang-Nguyen, G. Boato, and Y. Kompatsiaris, "The certh-unitn participation @ verifying multimedia use 2015," MediaEval, vol. 1, p. 2, May 2015.
[29] B. Emek Soylu, M. S. Guzel, G. E. Bostanci, F. Ekinci, T. Asuroglu, and K. Acici, "Deep-learning-based approaches for semantic segmentation of natural scene images: A review," Electronics, vol. 12, no. 12, p. 2730, Jun. 2023. [Online]. Available: https://www.mdpi.com/2079-9292/12/12/2730
[30] Q. S. Hamad, H. Samma, and S. A. Suandi, "Feature selection of pre-trained shallow CNN using the QLESCA optimizer: COVID-19 detection as a case study," Appl. Intell., vol. 53, no. 15, pp. 18630–18652, Feb. 2023, doi: 10.1007/s10489-022-04446-8.
[31] S. R. Shah, S. Qadri, H. Bibi, S. M. W. Shah, M. I. Sharif, and F. Marinello, "Comparing inception v3, VGG 16, VGG 19, CNN, and ResNet 50: A case study on early detection of a Rice disease," Agronomy, vol. 13, no. 6, p. 1633, Jun. 2023. [Online]. Available: https://www.mdpi.com/2073-4395/13/6/1633
[32] Y. Wang, F. Ma, Z. Jin, Y. Yuan, G. Xun, K. Jha, L. Su, and J. Gao, "EANN: Event adversarial neural networks for multi-modal fake news detection," in Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2018, pp. 849–857.
[33] D. Khattar, J. S. Goud, M. Gupta, and V. Varma, "MVAE: Multimodal variational autoencoder for fake news detection," in Proc. World Wide Web Conf., May 2019, pp. 2915–2921.
[34] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, arXiv:1409.1556.
[35] S. Singhal, R. R. Shah, T. Chakraborty, P. Kumaraguru, and S. Satoh, "SpotFake: A multi-modal framework for fake news detection," in Proc. IEEE 5th Int. Conf. Multimedia Big Data (BigMM), Sep. 2019, pp. 39–47.
[36] X. Zhou and R. Zafarani, "A survey of fake news: Fundamental theories, detection methods, and opportunities," ACM Comput. Surv., vol. 53, no. 5, pp. 1–40, Sep. 2021.
[37] J. Xue, Y. Wang, Y. Tian, Y. Li, L. Shi, and L. Wei, "Detecting fake news by exploring the consistency of multimodal data," Inf. Process. Manage., vol. 58, no. 5, Sep. 2021, Art. no. 102610.
[38] F. Ghorbanpour, M. Ramezani, M. A. Fazli, and H. R. Rabiee, "FNR: A similarity and transformer-based approach to detect multi-modal fake news in social media," Social Netw. Anal. Mining, vol. 13, no. 1, pp. 1–15, Mar. 2023.
[39] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, "An image is worth 16×16 words: Transformers for image recognition at scale," 2020, arXiv:2010.11929.
[40] Z. Jin, J. Cao, H. Guo, Y. Zhang, and J. Luo, "Multimodal fusion with recurrent neural networks for rumor detection on microblogs," in Proc. 25th ACM Int. Conf. Multimedia, Oct. 2017, pp. 795–816.
[41] H. Zhang, Q. Fang, S. Qian, and C. Xu, "Multi-modal knowledge-aware event memory network for social media rumor detection," in Proc. 27th ACM Int. Conf. Multimedia, Oct. 2019, pp. 1942–1951.
[42] C. Song, C. Yang, H. Chen, C. Tu, Z. Liu, and M. Sun, "CED: Credible early detection of social media rumors," IEEE Trans. Knowl. Data Eng., vol. 33, no. 8, pp. 3035–3047, Aug. 2021.
[43] S. Qian, J. Wang, J. Hu, Q. Fang, and C. Xu, "Hierarchical multi-modal contextual attention network for fake news detection," in Proc. 44th Int. ACM SIGIR Conf. Res. Develop. Inf. Retr., Jul. 2021, pp. 153–162.
[44] Y. Wu, P. Zhan, Y. Zhang, L. Wang, and Z. Xu, "Multimodal fusion with co-attention networks for fake news detection," in Proc. IJCNLP, 2021, pp. 2560–2569.
[45] W. Zhang, L. Gui, and Y. He, "Supervised contrastive learning for multimodal unreliable news detection in COVID-19 pandemic," in Proc. 30th ACM Int. Conf. Inf. Knowl. Manage., Oct. 2021, pp. 3637–3641.
[46] T. M. Khoshgoftaar, J. Van Hulse, and A. Napolitano, "Comparing boosting and bagging techniques with noisy and imbalanced data," IEEE Trans. Syst., Man, Cybern., A, Syst. Hum., vol. 41, no. 3, pp. 552–568, May 2011.
[47] K. Nakamura, S. Levy, and W. Y. Wang, "r/Fakeddit: A new multimodal benchmark dataset for fine-grained fake news detection," in Proc. Int. Conf. Lang. Resour. Eval., 2020, pp. 1–9.
[48] M. Tausif, S. Dilshad, Q. Umer, M. W. Iqbal, Z. Latif, C. Lee, and R. N. Bashir, "Ensemble learning-based estimation of reference evapotranspiration (ETO)," Internet Things, vol. 24, Feb. 2023, Art. no. 100973. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2542660523002962

MUHAMMAD LUQMAN received the bachelor's degree in computer science from the University of Gujrat, Pakistan, in 2017, and the master's degree in computer science from Northwestern Polytechnical University, China. He is currently a young Scholar in the field of computer science. His research interests span a wide spectrum, primarily focusing on cutting-edge fields such as artificial intelligence, deep learning, and data mining.

MUHAMMAD FAHEEM (Member, IEEE) received the B.Sc. degree in computer engineering from the Department of Computer Engineering, University College of Engineering and Technology, Bahauddin Zakariya University, Multan, Pakistan, in 2010, the M.S. degree in computer science from the Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia (UTM), Johor Bahru, Malaysia, in 2012, and the Ph.D. degree in computer science from the Faculty of Engineering, UTM, in 2021. From 2012 to 2014, he was a Lecturer with the COMSATS Institute of Information and Technology, Pakistan. From 2014 to 2022, he was also an Assistant Professor with the Department of Computer Engineering, Abdullah Gul University, Turkey. He is currently a Researcher with the School of Computing (Innovations and Technology), University of Vaasa, Vaasa, Finland. He has authored several papers in refereed journals and conferences. His research interests include cybersecurity, blockchain, artificial intelligence, smart grids, and smart cities. He served as a reviewer for numerous journals in IEEE, Elsevier, Springer, Wiley, Hindawi, and MDPI.

WAHEED YOUSUF RAMAY received the Ph.D. degree from the University of Science and Technology Beijing (USTB), China. He is currently an Assistant Professor with Air University. His academic and clinical focus is the use of algorithms (deep learning, machine learning, and big data analysis), advanced text analysis techniques, and sentiment analysis.

MALIK KHIZAR SAEED received the B.S. degree in information technology from the University of Gujrat, Gujrat, Pakistan, in 2013, and the M.S. degree in computer science from COMSATS University Islamabad, Vehari Campus, Pakistan. He is currently working as a Visiting Lecturer at COMSATS University Islamabad. His research interests include machine learning, deep learning, and artificial intelligence. He is also interested in classification-related tasks using different ML approaches.

MAJID BASHIR AHMAD received the master's degree in computer science from COMSATS University Islamabad, Pakistan, in 2014, and the M.S. degree in computer science from The University of Lahore, Pakistan, in 2019. He is currently a Research Scholar in the field of computer science. His research interests include artificial intelligence, machine learning, and data mining.