Revolutionizing Urdu Sentiment Analysis : Harnessing the Power of XLM-R and GPT-2
annif.suggestions | social media|Urdu language|data mining|Pakistan|machine learning|natural language|Twitter|deep learning|Indo-Aryan languages|text mining|en | en |
annif.suggestions.links | http://www.yso.fi/onto/yso/p20774|http://www.yso.fi/onto/yso/p17877|http://www.yso.fi/onto/yso/p5520|http://www.yso.fi/onto/yso/p105965|http://www.yso.fi/onto/yso/p21846|http://www.yso.fi/onto/yso/p26762|http://www.yso.fi/onto/yso/p24097|http://www.yso.fi/onto/yso/p39324|http://www.yso.fi/onto/yso/p24821|http://www.yso.fi/onto/yso/p27112 | en |
dc.contributor.author | Ashraf, Muhammad Rehan | |
dc.contributor.author | Hussain, Muzammal | |
dc.contributor.author | Jaffar, M. Arfan | |
dc.contributor.author | Ramay, Waheed Yousuf | |
dc.contributor.author | Faheem, Muhammad | |
dc.contributor.faculty | fi=Tekniikan ja innovaatiojohtamisen yksikkö|en=School of Technology and Innovations| | - |
dc.contributor.orcid | https://orcid.org/0000-0003-4628-4486 | - |
dc.contributor.organization | fi=Vaasan yliopisto|en=University of Vaasa| | |
dc.date.accessioned | 2025-06-02T14:18:10Z | |
dc.date.accessioned | 2025-06-25T14:05:16Z | |
dc.date.available | 2025-06-02T14:18:10Z | |
dc.date.issued | 2024-07-18 | |
dc.description.abstract | Sentiment analysis extracts valuable insights from textual sources using computation, textual or systematic analysis, and natural language processing. It identifies and measures the attitudes, beliefs, and emotional states that individuals express through text data. Recent research on sentiment analysis has largely focused on the English language, so low-resource languages receive far less attention. Conducting sentiment analysis of low-resource languages is difficult because large datasets and related repositories are unavailable. To address this issue, this paper creates a new dataset for a low-resource language, Urdu. The dataset, named LUCSA-23, consists of more than 65,000 user reviews from various genres, including food, sports, showbiz, apps, and political reviews, from a developing country, i.e., Pakistan. Urdu domain experts annotated the created dataset. This paper proposes an Urdu sentiment analysis approach leveraging the transformer models XLM-R and GPT-2. It preprocesses the Urdu text input, generates BERT embeddings, and passes them to the proposed classifier as input for sentiment classification. The proposed classifier is compared with machine/deep/embedded classifiers to evaluate its performance. The findings show that the proposed classifier outperforms existing state-of-the-art approaches with an accuracy of 95%. | - |
dc.description.notification | ©2024 The Authors. This work is licensed under a Creative Commons Attribution (BY) 4.0 License. | - |
dc.description.reviewstatus | fi=vertaisarvioitu|en=peerReviewed| | - |
dc.format.bitstream | true | |
dc.format.content | fi=kokoteksti|en=fulltext| | - |
dc.format.extent | 15 | - |
dc.format.pagerange | 99779-99793 | - |
dc.identifier.olddbid | 23942 | |
dc.identifier.oldhandle | 10024/19674 | |
dc.identifier.uri | https://osuva.uwasa.fi/handle/11111/3300 | |
dc.identifier.urn | URN:NBN:fi-fe2025060257996 | - |
dc.language.iso | eng | - |
dc.publisher | IEEE | - |
dc.relation.doi | 10.1109/ACCESS.2024.3429496 | - |
dc.relation.ispartofjournal | IEEE access | - |
dc.relation.issn | 2169-3536 | - |
dc.relation.url | https://doi.org/10.1109/ACCESS.2024.3429496 | - |
dc.relation.volume | 12 | - |
dc.rights | CC BY 4.0 | - |
dc.source.identifier | WOS:001276323400001 | - |
dc.source.identifier | 2-s2.0-85199099391 | - |
dc.source.identifier | https://osuva.uwasa.fi/handle/10024/19674 | |
dc.subject | Sentiment analysis; Web sites; Analytical models; Accuracy; Video on demand; Reviews; Electronic mail; Natural language processing; Urdu; XLM-R; GPT-2; classification; BERT | - |
dc.subject.discipline | fi=Tietotekniikka|en=Computer Science| | - |
dc.subject.yso | Urdu language | - |
dc.subject.yso | deep learning | - |
dc.title | Revolutionizing Urdu Sentiment Analysis : Harnessing the Power of XLM-R and GPT-2 | - |
dc.type.okm | fi=A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä|en=A1 Peer-reviewed original journal article|sv=A1 Originalartikel i en vetenskaplig tidskrift| | - |
dc.type.publication | article | - |
dc.type.version | publishedVersion | - |
Files
- Name: Osuva_Ashraf_Hussain_Jaffar_Ramay_Faheem_2024.pdf
- Size: 1.77 MB
- Format: Adobe Portable Document Format