Comparing Fine-Tuned RoBERTa with Traditional Machine Learning Models for Stance Detection in Political Tweets

Bilal Khan; Khairullah Khan; Fida Muhammad Khan; Haseena Noureen; Ahmad Ali; Mohsin Shah

doi:10.62762/TACS.2024.928069

Volume 1, Issue 2, IECE Transactions on Advanced Computing and Systems

Volume 1, Issue 2, 2024

IECE Transactions on Advanced Computing and Systems, Volume 1, Issue 2, 2024: 78-96

Open Access | Research Article | 26 May 2024

Bilal Khan 1

Khairullah Khan 1 *

Fida Muhammad Khan 2 *

Haseena Noureen 3

Ahmad Ali 4

Mohsin Shah 5

1 Department of Computer Science and Information Technology, University of Science and Technology Bannu, Bannu, Pakistan

2 Department of Computer Science, Qurtuba University of Science and Information Technology, Peshawar Campus, Peshawar, Pakistan

3 Department of Computer Science and Information Technology, University of Malakand, Chakdara 18800, Pakistan

4 College of Mechatronics and Control Engineering, Shenzhen University, Shenzhen 518060, China

5 Department of Telecommunication, Hazara University, Mansehra, Khyber Pakhtunkhwa, Pakistan

* Corresponding Authors: Khairullah Khan, [email protected]; Fida Muhammad Khan, [email protected]

Received: 24 February 2024, Accepted: 27 April 2024, Published: 26 May 2024

Abstract

Stance detection identifies the position or attitude that a text expresses toward a given subject. A major obstacle for Roman Urdu is the lack of a publicly available dataset for political stance detection. To address this gap, we constructed a high-quality dataset of 8,374 political tweets and comments collected through the Twitter API, each annotated with one of three stance labels: agree, disagree, or unrelated. The dataset captures diverse political viewpoints and user interactions. For feature representation, we employed TF-IDF because of its effectiveness in handling high-dimensional, context-sensitive Roman Urdu text. Among the traditional machine learning classifiers we evaluated, Random Forest achieved the highest accuracy (95%). We also fine-tuned the transformer-based RoBERTa model, which outperformed the traditional methods with 97% accuracy. Our results demonstrate the potential of combining machine learning and deep learning for stance detection in low-resource languages. This study introduces a novel dataset and provides a robust evaluation of methods, highlighting the importance of modern AI techniques for processing informal and multilingual text.
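
To make the TF-IDF feature representation mentioned above concrete, the following is a minimal sketch of the classic weighting tf(t, d) * log(N / df(t)), written with the Python standard library only. It is not the authors' implementation: the function name `tf_idf`, the whitespace tokenization, the unsmoothed IDF variant, and the three toy sentences are all assumptions made for illustration, and the sentences are invented placeholders rather than samples from the paper's dataset.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    # Assumption: lowercase + whitespace split stands in for real preprocessing.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            # Term frequency is normalized by document length.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy inputs (invented for this sketch, not from the dataset).
toy_docs = [
    "hukumat ka faisla acha hai",
    "hukumat ka faisla ghalat hai",
    "aaj mausam acha hai",
]
vectors = tf_idf(toy_docs)
```

In this sketch a term that occurs in every document (here "hai") receives weight zero, while a term confined to one document (here "ghalat") receives the largest weight in that document. Sparse vectors of this kind are what a classifier such as Random Forest would consume as input features.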

Keywords

stance

roman urdu

machine learning

SVM

random forest

logistic regression

naïve bayes

decision tree

RoBERTa

Data Availability Statement

Data will be made available on request.

Funding

This work received no funding.

Conflicts of Interest

The authors declare no conflicts of interest.

Ethical Approval and Consent to Participate

Not applicable.

Cite This Article

APA Style

Khan, B., Khan, K., Khan, F. M., Noureen, H., Ali, A., & Shah, M. (2024). Comparing Fine-Tuned RoBERTa with Traditional Machine Learning Models for Stance Detection in Political Tweets. IECE Transactions on Advanced Computing and Systems, 1(2), 78–96. https://doi.org/10.62762/TACS.2024.928069

Publisher's Note

IECE stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Copyright © 2024 by the Author(s). Published by Institute of Emerging and Computer Engineers. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

IECE Transactions on Advanced Computing and Systems

ISSN: 3067-7157 (Online)

Email: [email protected]

Portico

All published articles are preserved here permanently:
https://www.portico.org/publishers/iece/
