Volume 1, Issue 2, IECE Transactions on Advanced Computing and Systems
IECE Transactions on Advanced Computing and Systems, Volume 1, Issue 2, 2024: 78-96

Open Access | Research Article | 26 May 2024
Comparing Fine-Tuned RoBERTa with Traditional Machine Learning Models for Stance Detection in Political Tweets
1 Department of Computer Science and Information Technology, University of Science and Technology Bannu, Bannu, Pakistan
2 Department of Computer Science, Qurtuba University of Science and Information Technology, Peshawar Campus, Peshawar, Pakistan
3 Department of Computer Science and Information Technology, University of Malakand, Chakdara 18800, Pakistan
4 College of Mechatronics and Control Engineering, Shenzhen University, Shenzhen 518060, China
5 Department of Telecommunication, Hazara University, Mansehra, Khyber Pakhtunkhwa, Pakistan
* Corresponding Authors: Khairullah Khan, [email protected] ; Fida Muhammad Khan, [email protected]
Received: 24 February 2024, Accepted: 27 April 2024, Published: 26 May 2024  
Abstract
Stance detection identifies a text’s position or attitude toward a given subject. A major challenge in Roman Urdu is the lack of a publicly available dataset for political stance detection. To address this gap, we constructed a high-quality dataset of 8,374 political tweets and comments using the Twitter API, annotated with stance labels: agree, disagree, and unrelated. The dataset captures diverse political viewpoints and user interactions. For feature representation, we employed TF-IDF due to its effectiveness in handling high-dimensional, context-sensitive Roman Urdu text. Several machine learning classifiers were evaluated, with Random Forest achieving the highest accuracy of 95%. Additionally, we fine-tuned the transformer-based RoBERTa model, which outperformed traditional methods with 97% accuracy. Our results demonstrate the potential of combining machine learning and deep learning for stance detection in low-resource languages. This study not only introduces a novel dataset but also provides a robust evaluation of methods, highlighting the importance of modern AI techniques in processing informal and multilingual text data.
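
The TF-IDF feature representation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant or library the authors used, so the weighting formula below (term frequency times a smoothed log inverse document frequency) is an assumption, and the three short texts are invented placeholders rather than samples from the dataset. The function name `tfidf` is likewise hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented for illustration only
# (not taken from the paper's 8,374-item dataset).
docs = [
    "policy acha hai",
    "policy theek nahi hai",
    "match kal hai",
]


def tfidf(corpus):
    """Return one {term: weight} dict per document.

    Assumed weighting: (count / doc_length) * (ln(N / df) + 1).
    Other TF-IDF variants exist; the paper's exact choice is not known.
    """
    tokenized = [doc.lower().split() for doc in corpus]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * (math.log(n_docs / df[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


vectors = tfidf(docs)
# Terms that occur in every document (here "hai") get the lowest weight,
# while terms unique to one document (here "acha") get the highest.
for term, weight in sorted(vectors[0].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

Sparse vectors of this kind are what a classifier such as Random Forest would consume as input. A fine-tuned transformer such as RoBERTa instead works from its own subword tokenization and learned embeddings.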

Keywords
stance
roman urdu
machine learning
SVM
random forest
logistic regression
naïve bayes
decision tree
RoBERTa

Data Availability Statement
Data will be made available on request.

Funding
This work received no external funding.

Conflicts of Interest
The authors declare no conflicts of interest.

Ethical Approval and Consent to Participate
Not applicable.

Cite This Article
APA Style
Khan, B., Khan, K., Khan, F. M., Noureen, H., Ali, A., & Shah, M. (2024). Comparing Fine-Tuned RoBERTa with Traditional Machine Learning Models for Stance Detection in Political Tweets. IECE Transactions on Advanced Computing and Systems, 1(2), 78–96. https://doi.org/10.62762/TACS.2024.928069

Publisher's Note
IECE stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions
Copyright © 2024 by the Author(s). Published by Institute of Emerging and Computer Engineers. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
IECE Transactions on Advanced Computing and Systems

ISSN: 3067-7157 (Online)

Email: [email protected]

Portico

All published articles are preserved here permanently:
https://www.portico.org/publishers/iece/

Copyright © 2024 Institute of Emerging and Computer Engineers Inc.