Hybrid NLP & LLM Sentiment Analysis: Multi-Domain Literature Review

24 Jun 2026

Abstract

I. INTRODUCTION

II. RELATED WORKS

III. BACKGROUND STUDY

IV. CORPUS CREATION

V. IMPLEMENTATION DETAILS

VI. RESULT ANALYSIS & DISCUSSION

VII. FUTURE RESEARCH DIRECTIONS

VIII. CONCLUSION AND REFERENCES

Xiang et al. [5] propose a semi-supervised approach to market sentiment analysis that uses LLMs to generate weak labels for Reddit posts. They incorporate Chain-of-Thought (CoT) reasoning to improve label stability and accuracy. Although the model is trained on weakly labeled data, the experimental results show performance competitive with supervised models. Additionally, Zarmeen et al. [6] present a hybrid sentiment analysis model for student feedback that integrates TF-IDF, N-gram, and lexicon-based features. They report superior performance over other methods and APIs, highlighting the model's relevance in educational contexts.
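To make the idea of combining TF-IDF, N-gram, and lexicon-based features concrete, the following is a minimal sketch and not the implementation of Zarmeen et al. [6]. The names `TOY_LEXICON`, `ngrams`, and `hybrid_features`, the smoothed IDF formula, and the example feedback sentences are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; a real system would use a curated, domain-specific one.
TOY_LEXICON = {"clear": 1, "helpful": 1, "great": 1, "boring": -1, "confusing": -1}


def ngrams(text, n_max=2):
    """Return lowercase word n-grams for n = 1..n_max."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


def hybrid_features(docs, lexicon=TOY_LEXICON):
    """Build one vector per document: TF-IDF n-gram weights plus a lexicon score."""
    doc_grams = [ngrams(d) for d in docs]
    vocab = sorted({g for grams in doc_grams for g in grams})
    # Document frequency: number of documents that contain each n-gram.
    df = Counter(g for grams in doc_grams for g in set(grams))
    n_docs = len(docs)
    vectors = []
    for text, grams in zip(docs, doc_grams):
        tf = Counter(grams)
        total = len(grams) or 1
        # Smoothed IDF (an assumption) keeps n-grams found in every document non-zero.
        tfidf = [(tf[g] / total) * (math.log((1 + n_docs) / (1 + df[g])) + 1) for g in vocab]
        # Lexicon feature: sum of the polarities of all known words.
        lex_score = sum(lexicon.get(tok, 0) for tok in text.lower().split())
        vectors.append(tfidf + [float(lex_score)])
    return vocab, vectors


# Invented student feedback, for illustration only.
feedback = ["the lectures were clear and helpful", "the slides were confusing and boring"]
vocab, vectors = hybrid_features(feedback)
print(len(vocab), [v[-1] for v in vectors])
```

The last entry of each vector carries the lexicon signal, so a downstream classifier can weigh it alongside the n-gram statistics.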

By leveraging machine learning and domain-specific lexicons, the approach enables accurate sentiment analysis, helping educators improve their teaching methods and decision-making. Meanwhile, Manal et al. [7] explore e-commerce sentiment analysis, emphasizing the large volume of online data needed to understand customer sentiment. Comparing supervised ML models built on TF-IDF, N-gram, and lexicon-based approaches, they find that positive sentiment is prevalent. After extensive text preprocessing and evaluation, logistic regression performs best, showing its efficacy in predicting customer recommendations.
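As a rough illustration of how logistic regression can predict a recommendation from review text, here is a small sketch. It is not the pipeline of Manal et al. [7]: it swaps their TF-IDF, N-gram, and lexicon features for a plain binary bag-of-words, trains with batch gradient descent, and uses invented reviews. The helper names `featurize`, `train_logreg`, and `predict`, and the learning rate and epoch count, are assumptions.

```python
import math


def featurize(text, vocab):
    """Binary bag-of-words vector over a fixed vocabulary."""
    tokens = set(text.lower().split())
    return [1.0 if w in tokens else 0.0 for w in vocab]


def train_logreg(X, y, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            # Predicted probability of the positive class for this example.
            p = 1.0 / (1.0 + math.exp(-(sum(wj * xj for wj, xj in zip(w, xi)) + b)))
            for j, xj in enumerate(xi):
                grad_w[j] += (p - yi) * xj
            grad_b += p - yi
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(text, vocab, w, b):
    """Return 1 (recommended) if the predicted probability is at least 0.5."""
    z = sum(wj * xj for wj, xj in zip(w, featurize(text, vocab))) + b
    return 1 if z >= 0 else 0


# Invented toy reviews; 1 = customer recommends the product, 0 = does not.
reviews = ["love this dress", "great fit love it", "poor quality", "terrible fit poor fabric"]
labels = [1, 1, 0, 0]
vocab = sorted({tok for r in reviews for tok in r.lower().split()})
w, b = train_logreg([featurize(r, vocab) for r in reviews], labels)
print(predict("love it", vocab, w, b), predict("poor quality", vocab, w, b))
```

Words outside the training vocabulary are simply ignored here, which is one reason the preprocessing and feature choices discussed above matter in practice.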

The findings of Manal et al. underscore the critical role of sentiment analysis in e-commerce decisions and the need to tackle challenges such as detecting fake reviews. In another study, Ehsanur et al. [8] analyze reviews of Bangladeshi food delivery apps using NLP, comparing the AFINN, RoBERTa, and DistilBERT models. Despite challenges such as limited data and noise, DistilBERT achieves the highest accuracy (77%), highlighting its effectiveness.
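AFINN, the lexicon baseline in the comparison above, scores a text by summing integer valences assigned to individual words. The sketch below shows that scoring style under stated assumptions: the `TOY_VALENCE` entries are illustrative values and not the real AFINN list, and the function names and example reviews are hypothetical.

```python
import re

# Illustrative valences only; these are NOT the real AFINN entries.
TOY_VALENCE = {"good": 3, "fast": 2, "late": -2, "cold": -1, "terrible": -3}


def lexicon_score(text, valence=TOY_VALENCE):
    """Sum the valence of every known word; unknown words contribute 0."""
    return sum(valence.get(tok, 0) for tok in re.findall(r"[a-z']+", text.lower()))


def lexicon_label(text, valence=TOY_VALENCE):
    """Map the summed score to a coarse sentiment label."""
    score = lexicon_score(text, valence)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


# Invented reviews, for illustration only.
for review in ["Good food and fast delivery!", "Food arrived late and cold.", "Not good at all"]:
    print(review, "->", lexicon_label(review))
```

The third review is labeled positive because a word-level sum ignores the negation, which is the kind of limitation that contextual models such as RoBERTa and DistilBERT are meant to address.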

The study underscores the significance of sentiment analysis in the food delivery sector and suggests that context-specific models are needed to handle the complexities of natural language. Lastly, Muntasir et al. [10] develop hybrid CNN-LSTM models with various word embeddings to detect emotions in Bangla texts. With Word2Vec embeddings, they achieve 90.49% accuracy and a 92.83% F1 score. The study aims to accurately identify three emotions (happiness, anger, and sadness), contributing to Bangla-language sentiment analysis.
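Since the studies above are compared by accuracy and F1 score, a short sketch of how the two metrics are computed for a three-class emotion task may help. The labels below are invented and are not data from any cited study, and macro averaging is an assumption: the source does not state which averaging scheme the reported F1 scores use.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (macro averaging, an assumption)."""
    scores = []
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall.
        scores.append(2 * precision * recall / (precision + recall) if precision + recall else 0.0)
    return sum(scores) / len(scores)


# Invented labels for illustration; not results from the cited work.
y_true = ["happy", "happy", "angry", "angry", "sad", "sad"]
y_pred = ["happy", "sad", "angry", "angry", "sad", "sad"]
print(round(accuracy(y_true, y_pred), 3), round(macro_f1(y_true, y_pred), 3))
```

Accuracy and F1 can diverge when errors concentrate in one class, so the two figures are best read together.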

This paper is available on arXiv under a CC BY 4.0 license.