Motamot Dataset: Benchmarking LLMs vs PLMs in Bangla Political NLP

24 Jun 2026

Table Of Links

Abstract

I. INTRODUCTION

II. RELATED WORKS

III. BACKGROUND STUDY

IV. CORPUS CREATION

V. IMPLEMENTATION DETAILS

VI. RESULT ANALYSIS & DISCUSSION

VII. FUTURE RESEARCH DIRECTIONS

VIII. CONCLUSION AND REFERENCES

Political sentiment analysis [1] examines public opinion, feelings, and attitudes toward political entities, events, or ideologies, and it is especially important during Bangladeshi elections. Online newspapers such as Prothom Alo, Bangladesh Pratidin, and Samakal are important forums for political conversation and information dissemination. These platforms provide readers with election-related news, opinions, and analyses, which shape public attitudes in Bangladesh’s changing political landscape.

The evolution of pre-trained language models (PLMs) has revolutionized NLP, offering strong performance across diverse tasks with minimal fine-tuning. Models such as the Bengali-specific BanglaBERT [2] and SahajBERT, together with the multilingual mBERT [3], have significantly advanced Bengali NLP applications.

However, their dependence on large annotated datasets poses a challenge, because Bengali has limited representation in NLP resources. Large language models (LLMs) have shown remarkable accuracy without extensive task-specific training, yet models like LLaMA and GPT-3/4 face scrutiny for their limited transparency and their potential to generate misinformation.

To address this, Reinforcement Learning from Human Feedback (RLHF) [4] aims to make model responses more truthful, yet the application of LLMs to Bengali remains largely unexplored due to data scarcity. Significant progress has been achieved in sentiment analysis across a variety of domains, including Reddit market sentiment [5], student feedback [6], product reviews [7], restaurant reviews [8], and general sentiment around COVID-19 vaccination [9].

These efforts have substantially improved our understanding of public opinion across a variety of contexts. Despite this progress, a significant gap remains: little research is devoted specifically to political sentiment analysis. This study investigates public sentiment about politics in Bangladeshi online newspapers during elections. We analyze a large amount of content from these websites to understand people’s perspectives on political topics such as parties, policies, and elections.

In addition, we investigate the emotion conveyed by political parties. By capturing the views and opinions expressed by individuals, as well as the sentiments of political parties, we intend to give insight to both parties and voters, helping voters make informed decisions about which party to support. To the best of our knowledge, this is the first analysis of LLMs for political sentiment analysis in the Bengali language.

Here is a summary of the outcomes of our experiments:

• Developed a novel dataset named “Motamot,” containing 7,058 data points labeled with Positive or Negative sentiment, tailored specifically for political sentiment analysis in the Bengali language. The dataset comprises 4,132 Positive instances and 2,926 Negative instances.

• Conducted comprehensive evaluations of both PLMs (BanglaBERT, Bangla BERT Base, XLM-RoBERTa, mBERT, and SahajBERT) and LLMs (Gemini 1.5 Pro and GPT-3.5 Turbo).

• Identified that the zero-shot performance of LLMs generally lags behind state-of-the-art (SOTA) fine-tuned PLMs across most evaluation tasks, with substantial performance disparities among the LLMs themselves. This suggests that current LLMs are not well suited to low-resource language tasks in Bengali, particularly in zero-shot scenarios.

• Illustrated that few-shot learning with LLMs outperforms the PLMs, highlighting its potential as a more effective approach for Bengali political sentiment analysis. Additionally, while hallucination occurred in zero-shot scenarios, few-shot learning did not exhibit it.
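The zero-shot and few-shot settings contrasted in the last two points differ only in whether labeled examples are placed in the prompt. The Python sketch below is illustrative and is not the prompt used in this study: the instruction wording, the helper names `build_prompt` and `parse_label`, and the placeholder texts are all assumptions. Treating an off-label reply as a possible hallucination is likewise just one simple heuristic.

```python
from typing import List, Optional, Tuple

# The two sentiment labels used in the Motamot dataset.
LABELS = ("Positive", "Negative")


def build_prompt(text: str,
                 examples: Optional[List[Tuple[str, str]]] = None) -> str:
    """Build a zero-shot prompt (no examples) or a few-shot prompt.

    The instruction wording is hypothetical, chosen only for illustration.
    """
    lines = [
        "Classify the political sentiment of the Bengali news passage.",
        "Answer with exactly one word: Positive or Negative.",
        "",
    ]
    # Few-shot: prepend labeled demonstrations before the query.
    for ex_text, ex_label in examples or []:
        if ex_label not in LABELS:
            raise ValueError(f"unknown label: {ex_label!r}")
        lines += [f"Text: {ex_text}", f"Sentiment: {ex_label}", ""]
    # The query itself, left open for the model to complete.
    lines += [f"Text: {text}", "Sentiment:"]
    return "\n".join(lines)


def parse_label(reply: str) -> Optional[str]:
    """Map a raw model reply to a label.

    Returns None when the reply is neither label, which is one simple
    (assumed) way to flag an off-task or hallucinated answer.
    """
    cleaned = reply.strip().strip(".").capitalize()
    return cleaned if cleaned in LABELS else None


if __name__ == "__main__":
    demos = [("<positive Bengali passage>", "Positive"),
             ("<negative Bengali passage>", "Negative")]
    print(build_prompt("<passage to classify>"))          # zero-shot
    print(build_prompt("<passage to classify>", demos))   # few-shot
```

The resulting string would be sent to whichever LLM is being evaluated, and the reply passed through `parse_label` before scoring. Keeping the instruction identical across both settings means any difference in accuracy can be attributed to the demonstrations alone.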


This paper is available on arXiv under the CC BY 4.0 license.
