Enhancing Thai Sentiment Analysis with WangchanBERTa and Instance-Class-Aware Dynamic Kernel Selection

Songram, Panida; Khummanee, Suchart; Muangprathub, Jirapond; Kawattikul, Khanabhorn

doi:10.3390/app16115310



¹ Department of Computer Science, Faculty of Informatics, Mahasarakham University, Mahasarakham 44150, Thailand

² Department of Applied Mathematics and Informatics, Faculty of Science and Industrial Technology, Prince of Songkla University, Surat Thani Campus, Surat Thani 84000, Thailand

³ Department of Information Technology, Faculty of Social Technology, Rajamangala University of Technology Tawan-Ok, Chanthaburi Campus, Chanthaburi 22210, Thailand

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(11), 5310; https://doi.org/10.3390/app16115310 (registering DOI)

Submission received: 10 April 2026 / Revised: 15 May 2026 / Accepted: 19 May 2026 / Published: 25 May 2026


Abstract

Thai sentiment analysis on social media remains challenging due to the lack of explicit word boundaries, informal expressions, code-mixing, short text length, and severe class imbalance, particularly in the WISESIGHT benchmark dataset. Although transformer-based models such as WangchanBERTa provide strong contextual representations for Thai language processing, conventional hybrid architectures using fixed convolutional kernels may fail to capture diverse local semantic patterns across different sentiment classes. To address this limitation, this study proposes WangchanBERTa with Instance-Class-Aware Dynamic Kernel Selection (WangchanBERTa-IC-DKS), a novel hybrid framework that integrates contextualized embeddings with multi-kernel convolutional neural networks and dynamic kernel selection at both the instance and class levels. The proposed model adaptively determines kernel importance according to sentence semantics and class-specific sentiment characteristics, enabling more effective local feature extraction and improved minority-class recognition. Experiments were conducted on the WISESIGHT sentiment benchmark using four kernel candidates: [2, 3, 4], [3, 4, 5], [2, 3, 4, 5], and [1, 2, 3, 4, 5]. The kernel candidate [2, 3, 4, 5] achieves the best performance, with a mean macro-F1 of 63.93% averaged across three independent runs using different random seeds. Ablation studies further show that combining both class-aware and instance-aware dynamic kernel selection improves balanced classification performance compared with fixed multi-kernel CNN baselines. The proposed model also outperforms the previous state-of-the-art Parallel Hybrid model by 1.23 percentage points on the WISESIGHT dataset and by 1.36 percentage points on the 40 Thai children’s tales dataset. 
These findings demonstrate that dynamic kernel selection is more effective than fixed convolutional kernels across the two evaluated Thai sentiment benchmarks, particularly for short, noisy, and imbalanced Thai texts.

Keywords: Thai sentiment analysis; Thai social media texts; dynamic kernel selection; WangchanBERTa; instance-aware; class-aware


1. Introduction

Sentiment analysis is an important methodology for investigating public sentiment, tracking brand reputation, and deriving actionable insights from large volumes of social media reviews. Although effective techniques have been proposed for sentiment analysis in highly researched languages such as English and Chinese, sentiment analysis in low-resource languages, especially Thai, remains a significant challenge. Thai is a tonal language that lacks explicit word boundaries and exhibits complex linguistic characteristics, making text processing inherently difficult. Therefore, specialized approaches are required for sentiment analysis in this language. Thai social media texts, characterized by creative spelling variations, emoji usage, code-mixing with English, and unique internet slang, complicate the task further. Many approaches have been proposed for Thai sentiment analysis. Traditional lexicon-based methods were proposed first, but they are ineffective at capturing the complexity of sentiment in social media text. Early machine learning-based methods relied heavily on manually engineered features, which often lack generalizability. Consequently, deep learning approaches have emerged as a more effective solution. Various deep learning models have been proposed for sentiment analysis, such as LSTM, BiLSTM, CNN, GRU, and BiGRU. Hybrid deep learning models were also proposed to enhance the performance of Thai sentiment analysis [1], and they demonstrate improved performance over single-architecture models. However, most existing CNN-based hybrid architectures rely on fixed combinations of convolutional kernels, assuming that all sentiment classes and all input sentences require the same receptive field. In practice, different sentiment classes may rely on different local semantic patterns.
For example, negative sentiment may be expressed through short, explicit complaints, whereas the question and positive classes may benefit from broader contextual understanding. To overcome this limitation, this study proposes WangchanBERTa with Instance-Class-Aware Dynamic Kernel Selection (WangchanBERTa-IC-DKS). This novel hybrid architecture integrates WangchanBERTa with a multi-kernel CNN and a dynamic kernel selection mechanism. Unlike conventional fixed-kernel CNN models, the proposed framework dynamically adjusts kernel importance according to both instance-level semantic characteristics and class-level sentiment preferences. This enables the model to capture diverse local semantic patterns more effectively and improves minority-class recognition on imbalanced Thai sentiment datasets.

The main contributions of this study are as follows:

1. We propose WangchanBERTa-IC-DKS, a novel hybrid framework that integrates WangchanBERTa with both instance-aware and class-aware dynamic kernel selection for Thai sentiment analysis. Unlike conventional fixed multi-kernel CNN models, the proposed framework dynamically adjusts kernel importance according to both sentence-level semantic characteristics and class-specific sentiment preferences.
2. We systematically investigate multiple kernel candidates, including [2, 3, 4], [3, 4, 5], [2, 3, 4, 5], and [1, 2, 3, 4, 5], to identify the best-performing configuration within the explored parameter space on the WISESIGHT dataset using macro-F1 as the primary evaluation metric. Experimental results show that the kernel candidate [2, 3, 4, 5] achieves the best-balanced performance within the investigated parameter space.
3. We conduct comprehensive ablation studies to evaluate the individual contributions of fixed multi-kernel CNN, class-aware dynamic kernel selection, instance-aware dynamic kernel selection, and the full IC-DKS framework, demonstrating that combining both class-aware and instance-aware mechanisms improves macro-F1 performance on the WISESIGHT dataset.
4. To ensure result reliability and robustness, all experiments are conducted over three independent runs using different random seeds, and the final performance is reported as mean ± standard deviation. The proposed model achieves the highest mean macro-F1 of 63.93%, outperforming the previous state-of-the-art Parallel Hybrid model by 1.23 percentage points on the WISESIGHT dataset and by 1.36 percentage points on the 40 Thai children’s tales dataset.

2. Related Work

The field of Thai sentiment analysis has developed since 2009, from lexicon-based methods to deep learning architectures. Past studies sought to improve sentiment analysis methods in the Thai context. In contrast, more recent studies have employed single- and hybrid transformer-based models, as well as other deep learning models, to better understand the Thai language. Previous sentiment analysis studies can be grouped into two categories as follows.

The first group emphasized lexicon-based and machine learning methods. For example, Phienthrakul et al. explored SVM with multiple kernel functions for sentiment classification in Thai text [2]. Combinations of kernel functions were compared against single kernel functions, and the combinations were reported to capture diverse linguistic features and improve classification performance over single-kernel approaches on product reviews. Lertsuksakda et al. proposed a novel approach to constructing Thai sentiment terms using the hourglass of emotions model [3]. They moved beyond simple positive–negative polarity to capture more nuanced emotional states, constructing sentiment lexicons that reflect complex emotional dimensions. Chirawichitchai investigated emotion classification in Thai text using different term weighting and machine learning techniques [4]. Various term weighting schemes were evaluated to improve classification accuracy. The results showed that appropriately weighted features could enhance the performance of machine learning models for Thai emotion classification. Chirawichitchai also proposed a term-weighting scheme based on the term occurrence ratio, optimized for sentiment analysis [5]. The scheme was formulated mathematically to calculate term importance in a way that better reflects the sentiment characteristics of Thai words. It improved classification performance by emphasizing terms with strong discriminative power for sentiment polarity.

Pasupa et al. applied SVMs to sentiment analysis of Thai children’s stories [6]. They extended sentiment analysis typically performed in domains such as product reviews and social media. Netisopakul and Lertsuksakda employed hypothesis testing based on observations from Thai sentiment classification experiments [7]. A more rigorous statistical approach was used to evaluate sentiment classification methods, employing hypothesis testing to validate performance differences between approaches. Haruechaiyasak et al. introduced S-Sense, which is a comprehensive framework for sentiment analysis. The framework was specifically designed for social media sensing [8]. It integrated multiple components, including preprocessing, feature extraction, and classification, tailored to the informal and often abbreviated language used in Thai social media. Porntrakoon and Moemeng introduced the SenseComp (Sentiment Compensation) technique for multi-dimensional analysis of consumer reviews in Thai [9]. They observed that consumer reviews often express mixed sentiment across product dimensions. As a result, a more sophisticated analysis was required than simple overall polarity classification. To address this, the proposed SenseComp technique improved sentiment analysis by evaluating multiple attributes separately. Taemung and Chirawichitchai applied SVM to analyze sentiment in Thai product reviews [10]. They focused on e-commerce applications and addressed practical challenges in classifying Thai-language customer reviews. The study showed that SVM could achieve reasonable performance on commercial sentiment analysis tasks. However, it still struggled to handle complex or ambiguous sentiment expressions.

The second group focused on deep learning and transformer approaches. For example, Vateekul and Koomsubha conducted comprehensive studies by applying deep learning techniques to Thai Twitter data [11]. Various deep learning architectures, including DCNN (Dynamic Convolutional Neural Network) and LSTM, were explored for sentiment analysis of Thai social texts. The study showed that the deep learning approaches outperformed traditional machine learning methods on Thai social media sentiment analysis. The best model was DCNN, which achieved the highest accuracy of 75.35%. Pasupa and Seneewong also conducted a comparative study of deep learning techniques for Thai sentiment analysis [12]. Various deep learning architectures were systematically evaluated, and the impact of different input representations, including word embeddings, POS tags, and sentiment features, was explored. The study showed that deep learning approaches are the most effective for Thai sentiment analysis. The CNN architecture, combined with the three feature types, achieved the highest F1 of 81.70%. Thong-iad and Netisopakul compared different methods for Thai sentence sentiment tagging using Thai sentiment resources [13]. Different sentiment lexicons and tagging approaches were evaluated for the classification task. The results showed that using adverb and adjective synsets alone could achieve the highest emotion classification accuracy. Overall, traditional machine learning methods, such as SVM and Random Forest, perform well, but deep learning models, including CNN, LSTM, and BERT-based models, often surpass them, especially when advanced techniques such as hyperparameter tuning and the incorporation of linguistic features (POS, sentiment values) are employed. SVMs offer strong performance among traditional methods, whereas hybrid deep learning models and transformer models achieve higher accuracy.

Recently, transformer-based models have significantly advanced natural language processing across diverse languages [14]. Prominent representatives of this family include BERT [15], mBERT [16], RoBERTa [17], ALBERT [18], XLM-RoBERTa [19], and ELECTRA [20]. Several of these models, including mBERT, ALBERT, and XLM-RoBERTa, support Thai language processing. However, prior studies have reported that multilingual models often achieve lower performance than monolingual models specifically trained and optimized for a single language, particularly in language-specific downstream tasks [21,22,23,24]. Although transformer-based pre-trained models are highly effective in capturing rich linguistic representations, they may not sufficiently emphasize the sentiment-discriminative features required for fine-grained sentiment classification. Consequently, integrating transformer-based pre-trained models with deep learning architectures has been proposed as an effective strategy and has demonstrated promising results for Thai sentiment analysis [25]. Lowphansirikul et al. [26] proposed WangchanBERTa, a RoBERTa-based model pre-trained on large-scale Thai corpora that provides powerful contextualized representations for capturing Thai-specific semantic and syntactic information. WangchanBERTa consistently outperforms established multilingual models such as mBERT and XLM-R across a variety of benchmarks, including NER (Named Entity Recognition), sentiment analysis, and POS tagging. It has since become a foundation model for numerous Thai NLP tasks, including sentiment analysis. Pasupa and Seneewong proposed hybrid deep learning models that combined multiple neural architectures for Thai sentiment analysis [1]. The hybrid model, integrating CNNs and LSTMs, could capture both local features and long-range dependencies, and it achieved higher performance than single-architecture models. Among the proposed hybrid models, BiLSTM-CNN achieved the highest performance, with a macro-F1 of 74.36% on ThaiTales, 77.07% on ThaiEconTwitter, and 55.21% on the WISESIGHT datasets. Jitboonyapinit et al. investigated sentiment analysis on Thai social media using convolutional neural networks combined with long short-term memory networks [27]. This work addressed the challenges posed by informal language and the unique linguistic patterns of Thai social media posts. The CNN-LSTM model successfully extracted spatial and temporal features and achieved 85.00% accuracy on social media product reviews. Khamphakdee and Seresangtakul developed an efficient deep learning approach optimized for Thai sentiment analysis [28]. This work focused on balancing model performance with computational efficiency. Modifications to the model architecture reduced training time while maintaining high accuracy.

Nokkaew et al. analyzed online public opinion regarding major infrastructure projects using advanced machine learning and deep learning for Thai sentiment analysis [29]. Their research demonstrated the application of sentiment analysis to policy-relevant social issues by analyzing public discourse on the Thailand–China high-speed train and Laos–China railway projects. Comment sentiment classification was performed using six approaches: linear regression, Naive Bayes, Random Forest, BiLSTM, BERT-Base-Thai, and WangchanBERTa. The WangchanBERTa model achieved 94.57% accuracy. Suraratchai and Phoomvuthisarn proposed a hybrid method combining WangchanBERTa with CNN and BiLSTM architectures for Thai sentiment analysis [25]. Their research was built upon the pre-trained WangchanBERTa model by adding convolutional and bidirectional recurrent layers to capture additional linguistic features. The hybrid architecture achieved competitive performance on the WISESIGHT and the Thai children’s tales datasets. The Parallel Hybrid approach, WangchanBERTa-CNN-BiLSTM, achieved the highest macro-F1, reaching 62.70% on the WISESIGHT dataset and 78.59% on the Thai Children’s Tales dataset. Satjathanakul and Siriborvornratanakul focused on improving sentiment polarity classification on the Thai product reviews dataset using modern Transformer-based architectures [30]. The fine-tuned WangchanBERTa model achieved performance metrics ranging between 66% and 93% on the product reviews dataset. Emphan et al. enhanced the performance of the sentiment analysis model using GridSearchCV for hyperparameter optimization. This work focused on classifying sentiment in electric-vehicle discussions in Thailand [31]. Hyperparameter tuning was used to identify optimal configurations for Thai sentiment classification models. The study demonstrated that careful hyperparameter tuning could improve performance.

Previous studies have shown that transformer-based models, particularly WangchanBERTa, provide strong contextual understanding for Thai sentiment analysis, whereas hybrid architectures that combine WangchanBERTa with CNN or recurrent networks further improve performance by capturing both global and local semantic information. However, most existing approaches that combine WangchanBERTa with CNN rely on fixed convolution kernel combinations. They assume that all sentiment classes and all input sentences require identical receptive fields. This assumption is particularly problematic for highly imbalanced datasets, such as WISESIGHT, where minority classes often require different semantic scopes than the dominant classes. Although previous studies have extensively explored hybrid architectures, the problem of dynamically selecting kernel importance based on both class characteristics and instance-level semantics remains largely underexplored in Thai sentiment analysis. Therefore, this study proposes WangchanBERTa-IC-DKS. This novel hybrid architecture integrates WangchanBERTa with multi-kernel CNN and dynamic kernel selection mechanisms at both the class and instance levels. This design enables adaptive kernel weighting to improve local feature extraction, enhance minority-class recognition, and improve balanced classification performance on the Thai sentiment datasets.

3. Materials and Methods

3.1. Datasets

The WISESIGHT dataset, introduced by Suriyawongkul et al. [32], has been widely used for Thai sentiment analysis. The data was collected by WISESIGHT Thailand, one of the largest social media analysis companies, which monitors social media discussions across many channels. It comprises real messages written in Thai, collected from popular social networking sites including Facebook, Twitter, Instagram, YouTube, and Pantip. The dataset is publicly available from UCI or Hugging Face. It contains social media posts annotated into four sentiment categories: positive, neutral, negative, and question. The WISESIGHT dataset is challenging for sentiment analysis because it contains informal, noisy Thai social media text with slang, misspellings, and no clear word boundaries, making accurate tokenization and interpretation difficult. Additionally, the short average context (Figure 1), mixed sentiments, class imbalance, and subtle differences between categories (e.g., neutral vs. question) further complicate sentiment classification, as shown in Figure 2. Following the competition setting used in prior studies, the WISESIGHT benchmark is split into training, validation, and testing sets, as shown in Figure 3 and Table 1.

Figures 1–3 and Table 1 clearly illustrate the imbalance among sentiment categories. The question class contains only 476 samples in the training set, 42 samples in the validation set, and 57 samples in the testing set. Because this class is so small, it is difficult to classify, and many prior studies struggled with it. The largest category in the dataset is the neutral class, with 11,795 samples in the training set, 1291 samples in the validation set, and 1453 samples in the testing set. Its size can interfere with the learning of smaller classes, such as the positive and negative classes, because models may overfit to neutral-class samples. In addition, the dataset contains highly informal, diverse language typical of Thai social media, including slang, abbreviations, emojis, mixed Thai-English content, and inconsistent spelling. These characteristics make the WISESIGHT dataset particularly challenging for sentiment analysis and require models to handle noisy and unpredictable real-world text. Therefore, the WISESIGHT dataset serves as the primary benchmark for evaluating the proposed model’s performance.

To demonstrate the proposed model’s capabilities, we further evaluated its performance on an additional Thai sentiment dataset, the 40 Thai children’s tales dataset. The 40 Thai children’s tales dataset is a Thai sentiment analysis corpus comprising 40 Thai children’s stories. It was originally introduced to support sentiment classification of short narrative text in Thai. The 40 Thai children’s tales dataset comprises 1115 labeled messages, with an imbalanced class distribution across three sentiment categories: 309 messages (27.71%) labeled as positive, 508 messages (45.56%) labeled as neutral, and 298 messages (26.73%) labeled as negative. Following prior studies, the dataset was divided into training, validation, and testing subsets with a 60:20:20 ratio to ensure a fair and consistent evaluation setting. Compared with the WISESIGHT dataset, this corpus is less imbalanced and contains shorter text sequences, making it particularly suitable for evaluating sentiment classification performance on short-form Thai narrative text. The resulting training, validation, and testing subsets with stratified class distributions are shown in Table 2.
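As an illustration of the 60:20:20 stratified division described above, the following minimal Python sketch splits each class separately so that every subset keeps roughly the same class proportions. It is not the authors' splitting code: the function name `stratified_split`, the seed, and the rounding rule are assumptions made for this example, so the resulting subset sizes need not match those reported in Table 2.

```python
import random
from collections import defaultdict

def stratified_split(labels, ratios=(0.6, 0.2, 0.2), seed=42):
    """Return (train, val, test) index lists, splitting each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        # Rounding rule is an assumption; the remainder goes to the test set.
        n_train = round(ratios[0] * len(idxs))
        n_val = round(ratios[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]
    return train, val, test

# Class counts reported for the 40 Thai children's tales dataset.
labels = ["positive"] * 309 + ["neutral"] * 508 + ["negative"] * 298
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Splitting per class rather than over the pooled data keeps the minority classes represented in the validation and testing subsets, which matters when macro-F1 is the primary metric.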

3.2. Proposed WangchanBERTa-IC-DKS Framework

Transformer-based models such as WangchanBERTa provide powerful contextualized representations and improve Thai NLP performance, but mainly focus on global context and may overlook fine-grained local features critical for sentiment classification in short texts [15,26]. Therefore, integrating approaches that capture both global and local features can improve sentiment classification performance, particularly for Thai-language texts that exhibit complex contextual dependencies and fine-grained local semantic patterns. This integration enables the model to better understand the meaning of sentences at the grammatical level while simultaneously identifying important local n-gram features. This study proposes a novel Thai sentiment classification model called WangchanBERTa-IC-DKS. The proposed architecture integrates the contextual representation capability of WangchanBERTa with a multi-kernel convolutional neural network (CNN) and a dynamic kernel selection mechanism that jointly considers both instance-aware and class-aware information. Unlike conventional BERT-CNN architectures that use fixed kernel combinations followed by simple concatenation [33,34,35], the proposed model dynamically adjusts kernel importance based on both the semantic characteristics of each input sentence and the specific requirements of each sentiment class. This design enables the model to more effectively capture local semantic patterns and improve minority-class recognition on imbalanced Thai sentiment datasets.

Figure 4 shows the architecture of the proposed WangchanBERTa-IC-DKS model. WangchanBERTa first encodes the input Thai text to generate contextual token representations. Multi-kernel CNN layers with different kernel sizes extract local semantic patterns. The proposed Instance-Class-Aware Dynamic Kernel Selection (IC-DKS) mechanism dynamically combines kernel-specific features using both instance-aware weights generated from the CLS embedding and class-aware trainable kernel weights. The fused class-specific representations are passed to independent classifiers to produce final multi-class sentiment predictions.

The overall framework of the proposed model consists of five main components: (1) input encoding, (2) contextual embedding generation using WangchanBERTa, (3) multi-kernel CNN feature extraction, (4) Instance-Class-Aware Dynamic Kernel Selection (IC-DKS), and (5) class-specific classification.

3.2.1. Input Encoding

Given an input Thai sentence

$$X = x_{1}, x_{2}, x_{3}, \dots, x_{n} \tag{1}$$

where $x_{i}$ denotes the $i$-th token and $n$ is the sequence length, the sentence is first tokenized using the SentencePiece tokenizer of WangchanBERTa. Special tokens [CLS] and [SEP] are added, and the input is converted into input IDs and attention masks before being fed into the pretrained WangchanBERTa encoder.
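The encoding step can be sketched in plain Python as follows. This is only an illustration of how special tokens, truncation, padding, and the attention mask fit together: the helper `encode` and all integer IDs (`cls_id`, `sep_id`, `pad_id`, and the subword IDs) are placeholders invented for the example, not WangchanBERTa's actual vocabulary IDs or tokenizer API.

```python
def encode(token_ids, max_len=128, cls_id=0, sep_id=1, pad_id=2):
    """Add special tokens, truncate, pad, and build the attention mask."""
    body = token_ids[: max_len - 2]          # leave room for [CLS] and [SEP]
    input_ids = [cls_id] + body + [sep_id]
    attention_mask = [1] * len(input_ids)    # 1 marks real tokens
    n_pad = max_len - len(input_ids)
    # Padding positions get pad_id and an attention mask of 0.
    return input_ids + [pad_id] * n_pad, attention_mask + [0] * n_pad

# Placeholder subword IDs standing in for SentencePiece output.
input_ids, attention_mask = encode([17, 42, 99], max_len=8)
print(input_ids)
print(attention_mask)
```

In practice the tokenizer shipped with the pretrained model would perform these steps; the sketch only makes the resulting tensors concrete.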

3.2.2. Contextual Embedding Generation

The pretrained WangchanBERTa encoder (airesearch/wangchanberta-base-att-spm-uncased, version 1.0, available via Hugging Face) produces contextual token representations:

$$H = \mathrm{WangchanBERTa}(X) \tag{2}$$

where

$$H \in \mathbb{R}^{B \times L \times d} \tag{3}$$

and $B$ denotes the batch size, $L$ denotes the sequence length, and $d$ denotes the hidden dimension of WangchanBERTa. Each token embedding captures contextual semantic information from the entire sentence.

The embedding of the [CLS] token is extracted as:

$$h_{\mathrm{CLS}} = H[:, 0, :] \tag{4}$$

where

$$h_{\mathrm{CLS}} \in \mathbb{R}^{B \times d} \tag{5}$$

This representation is used for instance-aware kernel selection.

3.2.3. Multi-Kernel CNN Feature Extraction

To capture local semantic patterns at multiple granularities, the contextual embeddings are passed through multiple one-dimensional convolutional layers with different kernel sizes:

$$\mathcal{K} = \{k_{1}, k_{2}, \dots, k_{K}\} \tag{6}$$

where $\mathcal{K}$ is the set of kernel candidates, $K$ denotes the total number of kernel candidates, and $k_{i}$ represents the kernel size of the $i$-th convolution. Each kernel captures local semantic patterns at a different receptive field.

For each kernel $k_{i}$, convolution is performed as:

$$C_{i} = \mathrm{ReLU}(\mathrm{Conv}_{k_{i}}(H)) \tag{7}$$

followed by global max pooling:

$$f_{i} = \mathrm{MaxPool}(C_{i}) \tag{8}$$

where

$$f_{i} \in \mathbb{R}^{B \times n_{f}} \tag{9}$$

and $n_{f}$ is the number of CNN filters. The features from all kernels are stacked together:

$$F = [f_{1}, f_{2}, \dots, f_{K}] \tag{10}$$

where

$$F \in \mathbb{R}^{B \times K \times n_{f}} \tag{11}$$
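To make Equations (7)–(11) concrete, the following pure-Python sketch applies a one-dimensional convolution, ReLU, and global max pooling to a single sentence for each kernel size and stacks the results. It is an illustrative sketch rather than the authors' implementation: the helper name `conv1d_relu_maxpool`, the toy dimensions, the random weights, and the absence of padding are all assumptions.

```python
import random

def conv1d_relu_maxpool(H, weights, bias):
    """H: L x d token embeddings; weights: n_f x k x d; returns n_f pooled features."""
    k = len(weights[0])
    d = len(H[0])
    pooled = []
    for W, b in zip(weights, bias):
        best = 0.0  # ReLU outputs are never below zero, so 0.0 is a safe floor
        for t in range(len(H) - k + 1):      # slide the window (no padding assumed)
            s = b + sum(W[a][j] * H[t + a][j] for a in range(k) for j in range(d))
            best = max(best, s)              # equals the max over positions of ReLU(s)
        pooled.append(best)
    return pooled

rng = random.Random(0)
L, d, n_f = 6, 4, 3                          # toy sizes, not the paper's settings
kernel_sizes = [2, 3, 4, 5]                  # the best-performing candidate in the paper
H = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(L)]

F = []                                       # stacked features, shape K x n_f
for k in kernel_sizes:
    W = [[[rng.uniform(-1, 1) for _ in range(d)] for _ in range(k)] for _ in range(n_f)]
    F.append(conv1d_relu_maxpool(H, W, [0.0] * n_f))

print(len(F), len(F[0]))
```

Because global max pooling collapses the position axis, every kernel size yields a feature vector of the same length $n_{f}$, which is what allows the kernel outputs to be stacked and later combined by weighted summation.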

3.2.4. Instance-Aware Kernel Selection

Different input sentences may require different kernel sizes for optimal representation. Therefore, instance-specific kernel weights are generated from the [CLS] embedding as follows:

$$w^{(\mathrm{inst})} = W_{\mathrm{inst}} h_{\mathrm{CLS}} + b_{\mathrm{inst}} \tag{12}$$

where

$$w^{(\mathrm{inst})} \in \mathbb{R}^{B \times K} \tag{13}$$

This mechanism allows the model to dynamically determine the importance of each kernel for each input sentence.

3.2.5. Class-Aware Kernel Selection

Different sentiment classes may rely on different local semantic patterns. To model this property, a trainable class-aware kernel weight matrix is introduced:

$$\Theta \in \mathbb{R}^{C \times K} \tag{14}$$

where $C$ denotes the number of sentiment classes. For class $c$, the corresponding kernel preference vector is defined as:

$$\theta_{c} = \Theta[c, :] \tag{15}$$

where

$$\theta_{c} \in \mathbb{R}^{K} \tag{16}$$

This enables each sentiment class to learn its own kernel preference.

3.2.6. Dynamic Kernel Selection (DKS)

The final kernel weights for class c are obtained by combining the instance-aware and class-aware weights:

$$w_{c} = \mathrm{Softmax}(w^{(\mathrm{inst})} + \theta_{c}) \tag{17}$$

where

$$w_{c} \in \mathbb{R}^{B \times K} \tag{18}$$

The softmax function normalizes the kernel importance so that the total weight across all kernels equals one.

The final class-specific feature representation is computed using weighted kernel fusion:

$$v_{c} = \sum_{k = 1}^{K} w_{c, k} \cdot f_{k} \tag{19}$$

where

$$v_{c} \in \mathbb{R}^{B \times n_{f}} \tag{20}$$

This operation allows the model to emphasize the most informative kernel features for each sentiment class.
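The interaction between the instance-aware logits, the class-aware logits, and the weighted fusion can be traced with a small pure-Python sketch for a single sentence. It follows Equations (12), (17), and (19) as written, but it is not the authors' code: the function names and all numeric values are hypothetical and were chosen so that the effect of the class-aware term is easy to see.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def instance_logits(h_cls, W_inst, b_inst):
    """Equation (12) for one sentence: a K-dimensional vector of kernel logits."""
    return [sum(w * h for w, h in zip(row, h_cls)) + b
            for row, b in zip(W_inst, b_inst)]

def ic_dks(F, w_inst, Theta):
    """Equations (17) and (19): per-class kernel weights and fused features."""
    weights, fused = [], []
    for theta_c in Theta:
        w_c = softmax([wi + tc for wi, tc in zip(w_inst, theta_c)])
        v_c = [sum(w_c[k] * F[k][j] for k in range(len(F)))
               for j in range(len(F[0]))]
        weights.append(w_c)
        fused.append(v_c)
    return weights, fused

# Toy values: K = 4 kernels, n_f = 2 filters, d = 3, C = 2 classes.
F = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [1.0, 3.0]]
h_cls = [0.5, -1.0, 2.0]
W_inst = [[0.0, 0.0, 0.0] for _ in range(4)]  # zero projection: logits equal the bias
b_inst = [0.0, 0.0, 0.0, 0.0]
Theta = [[0.0, 0.0, 0.0, 0.0],    # class 0: no kernel preference
         [10.0, 0.0, 0.0, 0.0]]   # class 1: strongly prefers the first kernel

w_inst = instance_logits(h_cls, W_inst, b_inst)
weights, fused = ic_dks(F, w_inst, Theta)
print(weights[0], fused[0])
print(weights[1], fused[1])
```

With no preference from either source, class 0 averages the four kernel features uniformly, whereas the large class-aware logit for class 1 concentrates almost all weight on the first kernel. Adding the two logit vectors before the softmax lets either source shift the kernel distribution for a given class.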

3.2.7. Class-Specific Classification

Each class-specific feature vector is passed through an independent class-specific linear classifier to produce one logit for each sentiment class.

$$y_{c} = W_{c} v_{c} + b_{c} \tag{21}$$

where

$$y_{c} \in \mathbb{R}^{B \times 1} \tag{22}$$

All class logits are concatenated as:

$$Y = [y_{1}, y_{2}, \dots, y_{C}] \tag{23}$$

where

$$Y \in \mathbb{R}^{B \times C} \tag{24}$$

Finally, the prediction probabilities are obtained using the Softmax function:

$$\hat{y} = \mathrm{Softmax}(Y) \tag{25}$$

where $\hat{y}$ represents the probability distribution over all sentiment classes. The class with the highest probability is selected as the final sentiment prediction.
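A minimal sketch of Equations (21)–(25) for a single sentence is given below. Each class has its own weight vector and bias applied to its own fused feature vector, and the resulting logits are normalized jointly. The function name `classify`, the class ordering in `CLASSES`, and all numeric values are assumptions made for illustration only.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def classify(V, W, b):
    """V, W: C x n_f; b: length C. Returns (probabilities, predicted class index)."""
    # One independent linear classifier per class, each producing a single logit.
    logits = [sum(w * v for w, v in zip(W[c], V[c])) + b[c] for c in range(len(V))]
    probs = softmax(logits)
    return probs, max(range(len(probs)), key=probs.__getitem__)

CLASSES = ["positive", "neutral", "negative", "question"]  # order is an assumption
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]       # hypothetical fused features
W = [[1.0, 1.0], [2.0, 0.0], [1.0, 1.0], [1.0, 1.0]]       # hypothetical classifier weights
b = [0.0, 0.0, 0.5, 0.0]

probs, pred = classify(V, W, b)
print(CLASSES[pred], probs)
```

Unlike a single shared output layer, each logit here depends only on the representation fused with that class's kernel weights, so the class-specific kernel preferences carry through to the final prediction.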

3.3. Evaluation Metrics

To evaluate model performance across categories in the two datasets, we use several standard classification metrics: accuracy, F1, macro-precision, macro-recall, and macro-F1, as defined in (26)–(30), where $c$ denotes each sentiment category, $n$ is the number of categories, $N$ represents the total number of predictions, and $TP_{c}$ refers to the number of correctly predicted samples for class $c$. Due to class imbalance across the datasets, macro-F1 is used as the primary evaluation metric in this study to ensure that the performance of minority classes is not overlooked. This metric provides a more reliable and fair assessment of model effectiveness in real-world sentiment classification scenarios.

$$\mathrm{Accuracy} = \frac{\sum_{c = 1}^{n} TP_{c}}{N} \tag{26}$$

$$F1_{c} = 2 \times \frac{\mathrm{Precision}_{c} \times \mathrm{Recall}_{c}}{\mathrm{Precision}_{c} + \mathrm{Recall}_{c}} \tag{27}$$

$$\text{Macro-precision} = \frac{1}{n} \sum_{c = 1}^{n} \mathrm{Precision}_{c} \tag{28}$$

$$\text{Macro-recall} = \frac{1}{n} \sum_{c = 1}^{n} \mathrm{Recall}_{c} \tag{29}$$

$$\text{Macro-F1} = \frac{1}{n} \sum_{c = 1}^{n} F1_{c} \tag{30}$$
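The metrics in (26)–(30) can be computed directly from the true and predicted labels, as in the following sketch. The function name `classification_metrics` and the toy label lists are hypothetical, and returning zero when a denominator is zero is a convention assumed here rather than one stated in the paper.

```python
def classification_metrics(y_true, y_pred, n_classes):
    """Accuracy and macro-averaged precision, recall, and F1 (Equations 26-30)."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        # Zero-division convention (assumed): the score is 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    return {
        "accuracy": accuracy,
        "macro_precision": sum(precisions) / n_classes,
        "macro_recall": sum(recalls) / n_classes,
        "macro_f1": sum(f1s) / n_classes,
    }

# Toy labels with an imbalanced class distribution (class 2 has one sample).
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 0, 2]
metrics = classification_metrics(y_true, y_pred, n_classes=3)
print(metrics)
```

In this toy example the accuracy is 5/7, while the macro-F1 is 0.75 because each class contributes equally to the average regardless of its size. This equal weighting is why macro-F1 is sensitive to minority-class performance.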

4. Experiment Results and Discussion

4.1. Experimental Setup

In the experimental setup, the input sequence length was limited to 128 tokens. The model was trained with a batch size of 32 and optimized using AdamW with a learning rate of $2 \times 10^{-5}$ and a weight decay of 0.01 to mitigate overfitting. A dropout rate of 0.2 was also applied as a regularization technique to improve model generalization. All models were trained for up to 20 epochs, with early stopping (patience of 5) based on validation performance to prevent overfitting. The best checkpoint for each model was selected based on the highest validation macro-F1. To ensure the robustness and stability of the experimental results, each model was trained and evaluated three times using different random seeds: 42, 123, and 777. The final reported results are presented as the mean and standard deviation across these three runs. All experiments were conducted on Google Colab using an NVIDIA Tesla T4 GPU (12 GB VRAM).
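The checkpoint selection and reporting protocol can be sketched as follows. The early-stopping logic uses the settings stated above (at most 20 epochs, patience of 5, selection by validation macro-F1), but the function name, the validation curve, and the per-seed scores are hypothetical placeholders, not results from the paper. The use of the sample standard deviation (`statistics.stdev`) is also an assumption, since the text does not say which estimator was used.

```python
import statistics

def early_stopping(val_scores, patience=5, max_epochs=20):
    """Return (best_epoch, last_epoch_run) given per-epoch validation macro-F1."""
    best_score, best_epoch, waited, last_epoch = float("-inf"), -1, 0, -1
    for epoch, score in enumerate(val_scores[:max_epochs]):
        last_epoch = epoch
        if score > best_score:
            # New best checkpoint: remember it and reset the patience counter.
            best_score, best_epoch, waited = score, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no improvement for `patience` consecutive epochs
    return best_epoch, last_epoch

# Hypothetical validation macro-F1 curve (placeholder values).
val_curve = [0.50, 0.55, 0.58, 0.57, 0.56, 0.58, 0.57, 0.55, 0.54, 0.53]
best_epoch, last_epoch = early_stopping(val_curve)
print(best_epoch, last_epoch)

# Hypothetical test macro-F1 for the three seeds (placeholder values).
seed_scores = {42: 0.60, 123: 0.62, 777: 0.64}
mean = statistics.mean(seed_scores.values())
std = statistics.stdev(seed_scores.values())  # sample standard deviation (assumed)
print(f"{mean:.4f} ± {std:.4f}")
```

Ties do not count as improvements in this sketch, so a later epoch that merely matches the best score does not reset the patience counter.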

4.2. Kernel Size Comparison

The kernel size comparison experiment was conducted to investigate the impact of different kernel size combinations on the performance of the proposed WangchanBERTa-IC-DKS model. Since convolutional kernel sizes determine the range of local semantic patterns captured by the CNN module, selecting an appropriate kernel combination is crucial for effective sentiment classification.
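To make the notion of receptive field concrete, the following sketch enumerates the token windows that a one-dimensional convolution of a given kernel size slides over. It assumes stride 1 and no padding, which the text does not specify, and the helper names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def ngram_windows(tokens: Sequence[str], kernel_size: int) -> List[Tuple[str, ...]]:
    """Contiguous windows covered by a width-`kernel_size` filter (stride 1, no padding)."""
    return [tuple(tokens[i:i + kernel_size])
            for i in range(len(tokens) - kernel_size + 1)]


def window_counts(seq_len: int, kernel_sizes: Sequence[int]) -> Dict[int, int]:
    """Number of filter positions per kernel size for an unpadded sequence."""
    return {k: max(seq_len - k + 1, 0) for k in kernel_sizes}


# For the 128-token limit used in the experiments and the candidate [2, 3, 4, 5]:
counts = window_counts(128, [2, 3, 4, 5])

# Placeholder tokens, only to show which spans a kernel of size 3 sees.
windows = ngram_windows(["t1", "t2", "t3", "t4"], 3)
```

A kernel of size k therefore responds to k consecutive subword tokens at a time, so a candidate set such as [2, 3, 4, 5] covers spans from token pairs up to five-token phrases.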

Table 3 shows that the kernel candidate [2, 3, 4, 5] provides the most effective feature extraction strategy for the proposed WangchanBERTa-IC-DKS model on the WISESIGHT dataset. It achieves the highest macro-F1 (63.93%), compared with 56.20% for [2, 3, 4], 57.40% for [3, 4, 5], and 51.27% for [1, 2, 3, 4, 5]. It also produces the best accuracy (73.58%), macro-precision (67.53%), and macro-recall (62.52%).

4.3. Ablation Experiment

Table 4 presents the ablation study of the proposed WangchanBERTa-IC-DKS model using the kernel candidate [2, 3, 4, 5] on the WISESIGHT sentiment dataset. The objective of this experiment is to verify the contribution of each dynamic kernel selection component by comparing four variants: the fixed multi-kernel CNN, class-aware dynamic kernel selection (class-aware DKS), instance-aware dynamic kernel selection (instance-aware DKS), and the full proposed Instance-Class-Aware Dynamic Kernel Selection (IC-DKS) framework. All models use the same kernel candidates [2, 3, 4, 5]. The baseline model, WangchanBERTa-CNN with fixed multi-kernel convolution, achieves a macro-F1 of 56.25%, indicating that the combination of WangchanBERTa and a multi-kernel CNN already captures local semantic patterns for Thai sentiment classification. However, fixed kernels assume that all sentiment classes and all input sentences require the same receptive field, which limits the model’s ability to handle semantic diversity across classes and sentence structures. When the class-aware DKS mechanism is introduced, the model achieves higher accuracy (70.58%) and macro-precision (63.59%) but slightly lower macro-recall (54.34%) than the fixed multi-kernel convolution model, and the macro-F1 improves to 56.66%. This indicates that class-aware weighting helps the model learn class-specific kernel preferences. With the instance-aware DKS mechanism, accuracy improves slightly over the baseline to 70.81%, as the model can dynamically adjust kernel importance for each sentence based on the CLS representation. However, the macro-F1 (55.35%) is slightly lower than the baseline. This suggests that relying solely on instance-level adaptation without class-level structural guidance may lead to unstable kernel selection, particularly on imbalanced datasets, where minority classes require stronger prior information.
The proposed WangchanBERTa-IC-DKS model, which integrates class-aware and instance-aware mechanisms, achieves the best macro-F1 score (63.93%). Compared with the fixed multi-kernel baseline, the proposed model improves macro-F1 by more than 7 percentage points, demonstrating substantial gains in balanced classification performance, especially for minority classes where macro-F1 is a more reliable metric than accuracy.
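The size of each ablation effect follows directly from the macro-F1 values quoted above. The short sketch below recomputes the differences relative to the fixed multi-kernel baseline; the dictionary keys and the function name are illustrative labels, not identifiers from the study.

```python
from typing import Dict

# Macro-F1 (%) on WISESIGHT as quoted in the text for Table 4.
ABLATION_MACRO_F1: Dict[str, float] = {
    "fixed multi-kernel CNN": 56.25,
    "class-aware DKS": 56.66,
    "instance-aware DKS": 55.35,
    "IC-DKS (full)": 63.93,
}


def deltas_vs_baseline(scores: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Macro-F1 difference of each variant from the baseline, in percentage points."""
    base = scores[baseline]
    return {name: round(value - base, 2)
            for name, value in scores.items() if name != baseline}


deltas = deltas_vs_baseline(ABLATION_MACRO_F1, "fixed multi-kernel CNN")
# Sum of the two single-mechanism effects, for comparison with the joint effect.
sum_of_single_effects = round(deltas["class-aware DKS"] + deltas["instance-aware DKS"], 2)
```

The two mechanisms in isolation change macro-F1 by +0.41 and −0.90 percentage points, whereas the full model gains 7.68 points, so the joint effect is far larger than the sum of the individual effects.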

The results indicate that the class-aware and instance-aware mechanisms function in complementary rather than independent ways. The class-aware component captures relatively consistent kernel preference tendencies at the class level, while the instance-aware module dynamically refines these preferences according to the semantic characteristics of each input sentence. Their integration enables fine-grained dynamic kernel selection, thereby enabling the model to capture diverse local semantic patterns better and achieve improved classification performance on the WISESIGHT sentiment benchmark. Therefore, the performance of the proposed WangchanBERTa-IC-DKS model does not come from simply increasing model complexity, but from the effective interaction between class-level preference priors and instance-level semantic adaptation. These results support the effectiveness of the proposed framework and suggest that dynamic kernel selection may be more effective than fixed convolutional kernels for handling diverse sentiment expressions in real-world Thai social media text.
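One simple way to picture this interaction is as a class-level prior over kernels that is reweighted by an instance-level gate. The sketch below uses a normalized product of two softmax distributions purely as an illustration; it is an assumption and not the fusion rule defined for IC-DKS in the method section, and all logits, names, and the class association in the comments are hypothetical.

```python
import math
from typing import List, Sequence

KERNEL_SIZES = (2, 3, 4, 5)


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_kernel_weights(class_logits: Sequence[float],
                        instance_logits: Sequence[float]) -> List[float]:
    """Normalized product of a class-level prior and an instance-level gate.

    Assumed fusion rule for illustration only; it is equivalent to a softmax
    over the element-wise sum of the two logit vectors.
    """
    prior = softmax(class_logits)       # class-aware kernel preference
    gate = softmax(instance_logits)     # sentence-specific adjustment
    product = [p * g for p, g in zip(prior, gate)]
    total = sum(product)
    return [x / total for x in product]


# Hypothetical logits: the class prior leans toward k = 4, while this
# particular sentence pushes strongly toward k = 5.
class_prior_logits = [0.1, 0.3, 0.9, 0.2]
instance_gate_logits = [0.0, 0.0, 0.0, 2.0]
fused = fuse_kernel_weights(class_prior_logits, instance_gate_logits)
preferred_kernel = KERNEL_SIZES[max(range(len(fused)), key=fused.__getitem__)]
```

Under this assumed rule, an uninformative gate leaves the class prior unchanged, while a confident gate can override it for an individual sentence, which mirrors the complementary behavior described above.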

4.4. Internal Baseline Comparison

Table 5 presents the comparative performance of the proposed WangchanBERTa-IC-DKS model against the baseline WangchanBERTa and its hybrid variants on the WISESIGHT dataset. The purpose of this experiment is to evaluate whether the proposed dynamic kernel selection mechanism provides consistent performance improvements over the baseline WangchanBERTa and conventional WangchanBERTa-based hybrid architectures, including recurrent architectures such as BiLSTM and BiGRU. WangchanBERTa-IC-DKS achieves the highest macro-F1 of 63.93%, outperforming the original WangchanBERTa baseline, WangchanBERTa-BiGRU, and WangchanBERTa-BiLSTM. The WangchanBERTa-BiLSTM model achieved strong performance with an accuracy of 72.42% and a macro-F1 score of 60.09%, indicating that recurrent architectures can effectively capture sequential dependencies in Thai sentiment classification. In contrast, WangchanBERTa-BiGRU yielded lower results, with a macro-F1 of only 51.31%, suggesting that the simpler gating mechanism of GRU may be insufficient to handle the complex contextual ambiguity and informal linguistic patterns found in Thai social media text. Compared with the strongest recurrent baseline, WangchanBERTa-BiLSTM, the proposed model improves macro-F1 by 3.84 percentage points. These results demonstrate that dynamic kernel selection is more effective than both recurrent sequence modeling and static convolutional feature extraction on the WISESIGHT dataset.

4.5. Computational Efficiency and Performance Trade-Off

To further validate the practical applicability of the proposed model, we conducted an additional comparison of computational efficiency. The comparison includes total parameter size, inference time on the full test set, and macro-F1 score, as shown in Table 6. The standard WangchanBERTa model comprises the pretrained WangchanBERTa encoder (105.24 M parameters) and a simple linear classification head (0.01 M parameters), yielding a total of approximately 105.25 M parameters. In contrast, the proposed WangchanBERTa-IC-DKS model replaces the standard classifier with the IC-DKS classification head (1.38 M parameters), resulting in a total of approximately 106.62 M parameters. Similarly, the other hybrid models replace the standard classifier with alternative task-specific heads such as CNN, BiLSTM, and BiGRU. The proposed WangchanBERTa-IC-DKS achieves the highest macro-F1 while maintaining the same parameter size as the standard WangchanBERTa-CNN baseline, despite introducing both instance-aware and class-aware dynamic kernel selection mechanisms. This indicates that the performance improvement is not primarily due to increased model complexity, but rather to more effective adaptive feature extraction enabled by dynamic kernel selection. In terms of inference efficiency, averaged over three runs, the proposed WangchanBERTa-IC-DKS achieves 18.85 s for full test-set inference, only slightly higher than the fixed multi-kernel CNN baseline (18.80 s). This relatively small difference suggests that the dynamic kernel weighting mechanism introduces only modest additional computational overhead. Compared with sequential models such as BiLSTM and BiGRU, the proposed model also provides a better balance between efficiency and performance. Although WangchanBERTa-BiLSTM achieves competitive performance (60.09%), it requires the highest computational cost (19.19 s) and the largest parameter size (107.35M). 
In contrast, WangchanBERTa-IC-DKS improves predictive performance while maintaining reasonable inference latency and parameter efficiency. As a result, the proposed framework provides a favorable balance between classification performance and computational efficiency.
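The parameter totals and relative overheads discussed above can be reproduced from the quoted figures. The sketch below is simple arithmetic on the numbers given in the text; the variable and function names are illustrative.

```python
ENCODER_PARAMS_M = 105.24                        # pretrained WangchanBERTa encoder
HEAD_PARAMS_M = {"linear": 0.01, "IC-DKS": 1.38}  # classification heads


def total_params_m(head: str) -> float:
    """Total parameters (in millions) for the encoder plus the given head."""
    return round(ENCODER_PARAMS_M + HEAD_PARAMS_M[head], 2)


def relative_overhead_pct(new: float, reference: float) -> float:
    """Relative increase of `new` over `reference`, in percent."""
    return round(100.0 * (new - reference) / reference, 2)


linear_total = total_params_m("linear")
ic_dks_total = total_params_m("IC-DKS")
param_overhead = relative_overhead_pct(ic_dks_total, linear_total)
# Full test-set inference time: IC-DKS (18.85 s) vs. fixed multi-kernel CNN (18.80 s).
latency_overhead = relative_overhead_pct(18.85, 18.80)
```

The IC-DKS head adds about 1.3% to the parameter count of the standard model, and its inference time exceeds that of the fixed multi-kernel CNN by roughly 0.27%.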

4.6. Comparison with Previous Studies

Table 7 presents the overall performance comparison between the proposed WangchanBERTa-IC-DKS model and previous methods on the WISESIGHT dataset. The results show that the proposed model achieves the highest macro-F1 of 63.93%. The proposed model outperformed the previous BiLSTM-CNN model (55.21%) and the state-of-the-art Parallel Hybrid model (62.70%) by 8.72 and 1.23 percentage points, respectively. Although the numerical improvement over the state-of-the-art Parallel Hybrid model appears modest, this gain is meaningful because macro-F1 is the most appropriate evaluation metric for the WISESIGHT dataset due to its severe class imbalance, particularly the very small size of the question class. Unlike accuracy, macro-F1 assigns equal weight to all classes and more accurately reflects the model’s effectiveness on minority classes.

Table 8 further provides per-class performance comparison between the previous state-of-the-art Parallel Hybrid model and the proposed WangchanBERTa-IC-DKS model on the WISESIGHT dataset. The results show that the proposed model achieves substantial improvements in the negative and question classes, with F1 improvements from 76.74% to 78.11% and from 42.96% to 45.62%, respectively. This demonstrates that the proposed model is more effective at handling minority and difficult classes, a key challenge in Thai sentiment classification on social media text. For the positive class, the F1 also improves slightly, from 52.33% to 53.83%, indicating that the model can identify more positive samples despite the strong semantic overlap between the positive and neutral classes. Although the neutral class shows a slight decrease in F1 from 78.76% to 78.16%, this reduction is relatively small, whereas improvements in the minority classes lead to better-balanced classification performance. This trade-off is desirable because improving minority classes typically contributes more to macro-F1 optimization and reflects more balanced classification performance.
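As a consistency check, the macro-F1 values in Table 7 can be recovered from the per-class F1 scores quoted above by applying Equation (30). The sketch below performs this arithmetic; the dictionary layout and function name are illustrative.

```python
from typing import Dict

# Per-class F1 (%) on WISESIGHT as quoted in the text for Table 8.
PER_CLASS_F1: Dict[str, Dict[str, float]] = {
    "Parallel Hybrid": {
        "negative": 76.74, "question": 42.96, "positive": 52.33, "neutral": 78.76,
    },
    "WangchanBERTa-IC-DKS": {
        "negative": 78.11, "question": 45.62, "positive": 53.83, "neutral": 78.16,
    },
}


def macro_f1(per_class: Dict[str, float]) -> float:
    """Unweighted mean of the per-class F1 scores, Equation (30)."""
    return sum(per_class.values()) / len(per_class)


macro = {model: macro_f1(scores) for model, scores in PER_CLASS_F1.items()}
per_class_gain = {
    label: round(PER_CLASS_F1["WangchanBERTa-IC-DKS"][label]
                 - PER_CLASS_F1["Parallel Hybrid"][label], 2)
    for label in PER_CLASS_F1["Parallel Hybrid"]
}
```

The per-class values average to 63.93% for the proposed model and to about 62.70% for the Parallel Hybrid model, matching the reported macro-F1 scores. The gains of 1.37, 2.66, and 1.50 points on the negative, question, and positive classes outweigh the 0.60-point decrease on the neutral class because each class carries the same weight of 1/4.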

These findings support the effectiveness of the proposed framework on the WISESIGHT dataset. This combination of class-aware and instance-aware modules enables the model to capture both global class characteristics and local sentence-specific semantic patterns more effectively. As a result, the proposed model not only improves overall performance but, more importantly, enhances robustness to difficult and underrepresented classes, a critical requirement for real-world sentiment analysis systems.

4.7. Kernel Weight Analysis on the WISESIGHT Dataset

To improve the interpretability of the proposed WangchanBERTa-IC-DKS framework, we further analyze the kernel importance distributions on the WISESIGHT dataset from both class-aware and instance-aware perspectives. Since the proposed model dynamically assigns kernel importance based on both sentiment classes and individual sentence characteristics, understanding these distributions helps explain how different kernel sizes contribute to sentiment classification performance. Of the two perspectives, class-aware analysis provides the primary evidence for model interpretability because it reflects global kernel preference patterns at the sentiment-class level, whereas instance-aware analysis serves as supporting evidence by illustrating local sample-specific adaptation. Table 9 presents the mean ± standard deviation of class-aware kernel weights across the three random seeds. The class-aware kernel weight analysis reveals that overall kernel preference patterns are partially consistent across random seeds, although the degree of stability varies across sentiment classes. Because some standard deviations remain relatively large, the interpretation focuses on consistent ranking tendencies across seeds rather than exact numerical values.

For the positive class, larger kernels, especially k = 5, tend to receive the highest average importance (0.3964), suggesting that broader contextual patterns are often useful for capturing positive sentiment expressions. However, both k = 2 and k = 5 show relatively large standard deviations, indicating that this preference is not fully stable across runs. This may be because positive sentiment expressions in Thai social media are linguistically diverse, ranging from short emotional expressions such as “ดีมาก” (very good) and “ชอบมาก” (really like) to longer context-dependent statements. As a result, the model may rely on either short-range or broader contextual patterns depending on training dynamics.

For the neutral class, kernels k = 4 and k = 5 receive the highest average importance, with k = 5 showing the largest mean kernel weight (0.4672), suggesting that broader contextual patterns are more useful for modeling neutral sentiment expressions. Nevertheless, the relatively large standard deviations for both kernels indicate that the dominance between these receptive fields varies across runs, reflecting the difficulty of distinguishing neutral sentiment from weak positive or weak negative expressions.

For the negative class, kernel k = 4 has the smallest standard deviation (±0.0180) while maintaining consistently high importance across all seeds. This suggests that medium-range local semantic patterns may be useful for identifying negative sentiment expressions.

For the question class, k = 5 has the highest mean importance (0.4025) but also a very large standard deviation (±0.3546), suggesting strong instability across seeds. This likely reflects both the inherent ambiguity of the question samples and the severe class imbalance in the WISESIGHT dataset, where the question class has the fewest training examples. Consequently, kernel preference for this class becomes more sensitive to initialization and optimization dynamics.
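The stability discussion above rests on the mean and standard deviation of each kernel weight across the three seeds. The sketch below shows how such a summary and a relative spread (standard deviation divided by mean) can be computed. It assumes the sample standard deviation, since the text does not state whether the sample or population form is used, and the per-seed weights in the example are hypothetical values, not entries of Table 9.

```python
import statistics
from typing import Sequence, Tuple


def mean_and_std(weights_per_seed: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation (n - 1) of one kernel weight across seeds."""
    return statistics.mean(weights_per_seed), statistics.stdev(weights_per_seed)


def coefficient_of_variation(mean: float, std: float) -> float:
    """Relative spread; values near 1 mean the spread is as large as the mean."""
    return std / mean


# Values reported in the text for the question class, k = 5.
question_k5_cv = coefficient_of_variation(0.4025, 0.3546)

# Hypothetical per-seed weights for seeds 42, 123, and 777 (NOT from Table 9),
# used only to exercise the helper.
example_mean, example_std = mean_and_std([0.30, 0.32, 0.34])
```

For the question class, the standard deviation of the k = 5 weight is about 88% of its mean, which quantifies the instability noted above.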

While class-aware analysis provides global class-level tendencies, instance-aware analysis demonstrates how the proposed model dynamically adjusts kernel importance for individual samples. Table 10 presents representative examples from different sentiment classes.

The instance-aware analysis shows that kernel selection varies across individual samples, as expected, because the proposed Instance-Aware DKS adapts kernel importance to sentence-level semantic characteristics. For example, long positive and interrogative sentences such as S5 and S8 favor k = 5, indicating that broader contextual patterns are important for understanding such expressions. In contrast, some negative samples, such as S7, prefer k = 3 with relatively low standard deviation, suggesting medium-range local semantic patterns. These results demonstrate that kernel selection should vary at the sample level rather than remain fixed for all inputs, which supports the motivation of the proposed DKS framework. Together, the class-aware and instance-aware analyses provide complementary insights into how the proposed WangchanBERTa-IC-DKS framework adapts kernel importance across sentiment classes and individual samples, thereby improving the interpretability of the model’s behavior. Overall, the results from Table 9 and Table 10 suggest that kernel preference distributions should be interpreted as empirical tendencies rather than definitive linguistic conclusions. While some relatively consistent patterns can be observed, particularly for the negative class, the variability across random seeds indicates that strong causal interpretations should be avoided. Therefore, although the class-aware and instance-aware kernel analyses provide useful interpretability evidence for understanding the proposed model, these observations should be framed cautiously as reasonable hypotheses rather than empirically proven linguistic properties.

4.8. Error Analysis on the WISESIGHT Dataset

The confusion matrix in Figure 5 and the complete error analysis in Table 11 provide deeper insight into the classification behavior of the proposed model on the WISESIGHT test set. The confusion matrix and error analysis correspond to the run whose macro-F1 was closest to the average performance across multiple random seeds, providing a representative view of the model behavior. The results show that the most frequent confusion occurred between the positive and neutral classes. Specifically, 219 positive samples were incorrectly predicted as neutral, while 144 neutral samples were misclassified as positive. This pattern is further supported by the examples shown in Table 11 (S12–S14), where positive texts were predicted as neutral despite high confidence scores (0.917–0.986). These examples reveal that many positive expressions in Thai social media are implicit and do not contain strong sentiment words. For instance, expressions such as successful expectations, product availability inquiries with positive intent, or weak approval statements can appear semantically similar to neutral statements. As a result, the model tends to assign them to the neutral class.

A similar issue appears between the negative and neutral classes. The confusion matrix indicates that 143 negative samples were misclassified as neutral and 117 neutral samples were misclassified as negative. Table 11 provides representative examples of both directions. In S11, a neutral statement regarding a promotion limit (“only 6 items”) was misclassified as negative because complaint-like lexical cues triggered a negative interpretation. In contrast, S15 contains a polite apology message that was labeled as negative but predicted as neutral, in which softened complaint expressions weakened the negative sentiment signal. These cases demonstrate that sentiment polarity in Thai often depends on pragmatic interpretation rather than explicit emotional words. The misclassification of positive as neutral in S12–S14 indicates that the model often struggles with implicit or weak positive sentiment. These samples do not contain strong positive sentiment words, but their meanings can still be interpreted as positive from the context. The question class also suffers from substantial confusion with the neutral class. Table 11 (S16–S17) shows that short interrogative expressions, such as asking product availability or requesting a car price, were predicted as neutral with high confidence. This occurs because many Thai questions resemble simple information requests and often lack explicit lexical indicators that clearly signal a question, leading the model to interpret them as neutral informational statements rather than genuine questions. An important observation from Table 11 is that several misclassified samples were predicted with very high confidence, often above 0.90. This indicates that the errors are not caused solely by model uncertainty, but rather by semantic overlap between classes and by annotation ambiguity inherent in the dataset. 
Therefore, the main challenge is not only feature extraction but also the subtle boundary between sentiment categories, especially positive–neutral and question–neutral pairs.
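The confusion counts quoted above can be aggregated per class pair to see which boundary accounts for most of the reported errors. The sketch below uses only the four off-diagonal counts mentioned in the text; the remaining cells of the confusion matrix in Figure 5 are not reproduced, and the function name is illustrative.

```python
from typing import Dict, Tuple

# (true label, predicted label) -> count, as quoted in the text for Figure 5.
REPORTED_CONFUSIONS: Dict[Tuple[str, str], int] = {
    ("positive", "neutral"): 219,
    ("neutral", "positive"): 144,
    ("negative", "neutral"): 143,
    ("neutral", "negative"): 117,
}


def symmetric_confusion(confusions: Dict[Tuple[str, str], int]) -> Dict[Tuple[str, str], int]:
    """Sum both error directions for each unordered pair of classes."""
    totals: Dict[Tuple[str, str], int] = {}
    for (true_label, predicted_label), count in confusions.items():
        pair = tuple(sorted((true_label, predicted_label)))
        totals[pair] = totals.get(pair, 0) + count
    return totals


pair_totals = symmetric_confusion(REPORTED_CONFUSIONS)
# Errors in which a non-neutral sample is absorbed by the neutral class.
absorbed_by_neutral = sum(count for (_, predicted), count in REPORTED_CONFUSIONS.items()
                          if predicted == "neutral")
```

Among the quoted errors, the positive–neutral boundary accounts for 363 cases and the negative–neutral boundary for 260, and 362 of these 623 errors are non-neutral samples predicted as neutral.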

Overall, both the confusion matrix and qualitative examples consistently indicate that the primary challenge in Thai sentiment classification lies in distinguishing semantically similar classes and handling minority classes. Although the proposed instance-class-aware dynamic kernel selection mechanism improves local semantic representation and achieves the best overall macro-F1 performance, implicit sentiment expressions and severe class imbalance remain important limitations, and they constitute promising directions for future work.

4.9. Performance of the Proposed Model on the 40 Thai Children’s Tales Dataset

Table 12 presents the performance comparison between the previous studies and the proposed WangchanBERTa-IC-DKS model with kernel candidate [2, 3, 4, 5] on the 40 Thai children’s tales dataset. The results show that the proposed model achieves a macro-F1 of 79.95%, outperforming the previous SOTA result of 78.59%. The proposed model can improve the F1 for the positive class (a minority class) from 75.35% to 80.79%, but with slightly lower F1 for the neutral and negative classes (as shown in Table 13). Although the numerical improvement of macro-F1 is 1.36 percentage points, this gain is meaningful because macro-F1 is the most reliable metric for multi-class sentiment classification, particularly when balanced performance across all sentiment classes is required rather than overall accuracy alone. The proposed model also achieves the highest accuracy (80.42%), macro-precision (81.38%), and macro-recall (79.61%), indicating that the improvement is not limited to a single evaluation aspect but reflects a more robust classification capability across all sentiment categories. The performance of WangchanBERTa-IC-DKS demonstrates that dynamic kernel selection remains effective not only for social media sentiment classification, such as WISESIGHT, but also for literary and narrative text classification in children’s tales.

Figure 6 and Table 14 consistently demonstrate that the main classification difficulty in the 40 Thai children’s tales dataset lies in distinguishing between neutral, positive, and negative sentiments, particularly when emotional polarity is weak, implicit, or context-dependent. From the confusion matrix in Figure 6, substantial confusion arises between negative and neutral classes, with 17 negative samples incorrectly predicted as neutral, and between positive and neutral classes, with 13 positive samples misclassified as neutral. This suggests that the model tends to exhibit a strong neutral bias, particularly when emotional signals are subtle or indirect. This observation is supported by the detailed error cases presented in Table 14. For example, sample S21 (Neu→Neg) shows that the presence of context-related negative cues may have biased the model toward the negative class. Similarly, S22 (Pos→Neu) demonstrates that weak positive sentiment and contextual ambiguity make positive meaning difficult to detect, leading the model to prefer the neutral class. In S23 (Neg→Neu), complaint-like lexical cues trigger a misleading negative interpretation despite the actual neutral intent, while S24 (Pos→Neu) shows that softened or polite expressions weaken clear emotional polarity. Another frequent issue appears in sentences with ambiguous structural patterns. For instance, S25 (Neg→Neu) shows that question-like sentence forms resemble neutral information requests, making classification difficult without strong interrogative markers. Likewise, S26 (Neu→Pos) indicates that short interrogative expressions lacking explicit question indicators confuse the classifier, causing incorrect polarity assignment. Sample S27 (Neu→Pos) further shows that mixed contextual cues and ambiguous sentiment polarity blur class boundaries and reduce prediction reliability.

Overall, most errors arise not from strong sentiment expressions but from implicit sentiment, softened emotional language, complaint-like wording, and ambiguous contextual clues, which are common characteristics of Thai children’s tales. These findings explain why the model frequently defaults to the neutral class and highlight the limitation of relying primarily on surface lexical features. Future improvements may require stronger context-aware semantic modeling and better handling of pragmatic and implicit sentiment expressions to reduce these boundary-level classification errors.

5. Conclusions and Future Work

This study proposed WangchanBERTa with Instance-Class-Aware Dynamic Kernel Selection (WangchanBERTa-IC-DKS) for sentiment analysis on short, noisy, and highly imbalanced Thai texts. The proposed framework integrates the contextual representation capability of WangchanBERTa with a multi-kernel convolutional neural network and a dynamic kernel selection mechanism that jointly considers both instance-aware and class-aware information. Unlike conventional WangchanBERTa-CNN models that rely on fixed combinations of convolutional kernels, the proposed IC-DKS model dynamically adjusts kernel importance based on both sentence-level semantic characteristics and sentiment-class preferences, enabling more effective local feature extraction and improved minority-class recognition. Experimental results on the WISESIGHT sentiment benchmark showed that the kernel candidate [2, 3, 4, 5] achieved the best overall performance within the investigated parameter space, suggesting that kernel size selection is important for Thai sentiment classification. The observed kernel preference distributions suggest that different sentiment classes may benefit from different receptive fields, indicating that fixed kernel selection may be less effective for capturing diverse local semantic patterns across sentiment classes. The proposed WangchanBERTa-IC-DKS model achieved the highest macro-F1 of 63.93%, outperforming the previous state-of-the-art Parallel Hybrid model (62.70%) by 1.23 percentage points. Ablation studies also verified that combining both class-aware and instance-aware dynamic kernel selection improves balanced classification performance compared with fixed multi-kernel CNN baselines. The effectiveness of the proposed model was further validated on the 40 Thai children’s tales dataset, which comprises shorter, less noisy narrative texts with imbalanced classes. 
On the 40 Thai children’s tales dataset, the proposed model achieved a macro-F1 of 79.95%, outperforming the previous state-of-the-art result (78.59%) by 1.36 percentage points. These findings indicate that dynamic kernel selection is more effective than fixed convolutional kernels for real-world Thai sentiment analysis, particularly on highly imbalanced datasets such as WISESIGHT. In addition, the proposed framework improves robustness for difficult minority classes, which remain among the most challenging categories in Thai sentiment classification.

Although the proposed model achieves competitive performance, several limitations remain. First, the experiments were conducted on only two Thai sentiment datasets, namely WISESIGHT and the 40 Thai children’s tales dataset. Although these datasets represent both noisy social media text and short narrative text, further evaluation on additional Thai sentiment datasets from different domains would provide stronger empirical support for the robustness of the proposed model across different classification settings. Second, severe ambiguity between the neutral and positive classes, as well as between the neutral and negative classes, remains a challenging issue. Future work may explore hierarchical two-stage classification strategies, in which the model first distinguishes between neutral and non-neutral instances and then further classifies instances into positive, negative, and question classes. This may help reduce the strong neutral bias observed in the confusion matrix. In addition, neutral-bias calibration mechanisms, dynamic threshold-adjustment techniques, and hard example mining based on confusion-prone samples could be further investigated to improve minority-class robustness and balanced classification performance.
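The hierarchical two-stage strategy mentioned above can be sketched as a decision rule over the outputs of two hypothetical classifiers. Nothing in this sketch has been implemented or evaluated in this study; the function name, the probability inputs, and the threshold value are assumptions used only to illustrate the idea.

```python
from typing import Dict


def two_stage_predict(p_non_neutral: float,
                      fine_grained_probs: Dict[str, float],
                      threshold: float = 0.5) -> str:
    """Stage 1 separates neutral from non-neutral; stage 2 picks a non-neutral class.

    p_non_neutral: stage-1 probability that the text is not neutral (hypothetical).
    fine_grained_probs: stage-2 probabilities over positive/negative/question.
    threshold: stage-1 decision threshold; lowering it counteracts a neutral bias.
    """
    if p_non_neutral < threshold:
        return "neutral"
    return max(fine_grained_probs, key=fine_grained_probs.get)


# Hypothetical scores for a weakly positive sentence.
stage_two = {"positive": 0.6, "negative": 0.3, "question": 0.1}
default_label = two_stage_predict(0.42, stage_two)                    # threshold 0.5
calibrated_label = two_stage_predict(0.42, stage_two, threshold=0.4)  # lowered threshold
```

In this toy case, the default threshold assigns the sentence to the neutral class, while a lowered threshold passes it to the second stage, where it is labeled positive. This is the kind of adjustment that the neutral-bias calibration and dynamic threshold ideas above would tune on validation data.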

Author Contributions

Conceptualization, P.S. and K.K.; methodology, P.S. and K.K.; software, P.S. and K.K.; validation, P.S., S.K., J.M. and K.K.; formal analysis, P.S. and K.K.; investigation, P.S. and K.K.; resources, P.S. and K.K.; writing—original draft preparation, P.S. and K.K.; writing—review and editing, P.S. and K.K.; visualization, P.S. and K.K.; supervision and project administration, P.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research project was financially supported by Mahasarakham University.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The WISESIGHT Sentiment Analysis dataset (version 1.1) used in this study is available on GitHub at https://github.com/PyThaiNLP/wisesight-sentiment (accessed on 9 January 2026) [32]. The 40 Thai Children Stories dataset (first release) is available on GitHub at https://github.com/dsmlr/40-Thai-Children-Stories (accessed on 9 January 2026) [36].

Conflicts of Interest

The authors declared no potential conflicts of interest with respect to the research.

References

Pasupa, K.; Seneewong Na Ayutthaya, T. Hybrid Deep Learning Models for Thai Sentiment Analysis. Cogn. Comput. 2022, 14, 167–193. [Google Scholar] [CrossRef]
Phienthrakul, T.; Kijsirikul, B.; Takamura, H.; Okumura, M. Sentiment Classification with Support Vector Machines and Multiple Kernel Functions. In Neural Information Processing, ICONIP 2009; Leung, C.S., Lee, M., Chan, J.H., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2009; Volume 5864, pp. 583–592. [Google Scholar] [CrossRef]
Lertsuksakda, R.; Netisopakul, P.; Pasupa, K. Thai sentiment terms construction using Hourglass of Emotions. In Proceedings of the 2014 6th International Conference on Knowledge and Smart Technology (KST), Chon Buri, Thailand, 30–31 January 2014; pp. 46–50. [Google Scholar] [CrossRef]
Chirawichitchai, N. Emotion classification of Thai text using machine learning techniques. In Proceedings of the 2014 11th International Joint Conference on Computer Science and Software Engineering (JCSSE), Chon Buri, Thailand, 14–16 May 2014; pp. 91–96. [Google Scholar] [CrossRef]
Chirawichitchai, N. Developing Term Weighting Scheme Based on Term Occurrence Ratio for Sentiment Analysis. In Information Science and Applications; Kim, K., Ed.; Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2015; Volume 339. [Google Scholar] [CrossRef]
Pasupa, K.; Netisopakul, P.; Lertsuksakda, R. Sentiment analysis of Thai children stories. Artif. Life Robot. 2016, 21, 357–364. [Google Scholar] [CrossRef]
Netisopakul, P.; Pasupa, K.; Lertsuksakda, R. Hypothesis testing based on observation from Thai sentiment classification. Artif. Life Robot. 2017, 22, 184–190. [Google Scholar] [CrossRef]
Haruechaiyasak, C.; Palingoon, P.; Trakultaweekoon, K. S-Sense: A Sentiment Analysis Framework for Social Media Monitoring. Inf. Technol. J. KMUTNB 2018, 14, 11–22. [Google Scholar]
Porntrakoon, P.; Moemeng, C. Thai Sentiment Analysis for Consumer’s Review in Multiple Dimensions Using Sentiment Compensation Technique (SenseComp). In Proceedings of the 2018 15th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), Chiang Rai, Thailand, 18–21 July 2018; pp. 25–28. [Google Scholar] [CrossRef]
Tesmuang, R.; Chirawichitchai, N. Sentiment Analysis of Thai Online Product Reviews using Genetic Algorithms with Support Vector Machine. Prog. Appl. Sci. Technol. 2020, 10, 7–13. [Google Scholar]
Vateekul, P.; Koomsubha, T. A study of sentiment analysis using deep learning techniques on Thai Twitter data. In Proceedings of the 13th International Joint Conference on Computer Science and Software Engineering (JCSSE), Khon Kaen, Thailand, 13–15 July 2016. [Google Scholar] [CrossRef]
Pasupa, K.; Seneewong Na Ayutthaya, T. Thai sentiment analysis with deep learning techniques: A comparative study based on word embedding, POS-tag, and sentic features. Sustain. Cities Soc. 2019, 50, 101615. [Google Scholar] [CrossRef]
Thong-Iad, K.; Netisopakul, P. Comparison of Thai Sentence Sentiment Tagging Methods Using Thai Sentiment Resource. In Recent Advances in Information and Communication Technology 2019; Boonyopakorn, P., Meesad, P., Sodsee, S., Unger, H., Eds.; Advances in Intelligent Systems and Computing; Springer: Cham, Switzerland, 2019; Volume 936, pp. 89–98. [Google Scholar] [CrossRef]
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. In Advances in Neural Information Processing Systems (NeurIPS); Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30, Available online: https://arxiv.org/abs/1706.03762 (accessed on 19 January 2026).
Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018, arXiv:1810.04805.
Wu, S.; Dredze, M. Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT. arXiv 2019, arXiv:1904.09077.
Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv 2019, arXiv:1907.11692.
Lan, Z.; Chen, M.; Goodman, S.; Gimpel, K.; Sharma, P.; Soricut, R. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. arXiv 2019, arXiv:1909.11942.
Conneau, A.; Khandelwal, K.; Goyal, N.; Chaudhary, V.; Wenzek, G.; Guzmán, F.; Grave, E.; Ott, M.; Zettlemoyer, L.; Stoyanov, V. Unsupervised Cross-lingual Representation Learning at Scale. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 8440–8451.
Clark, K.; Luong, M.-T.; Le, Q.; Manning, C. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. arXiv 2020, arXiv:2003.10555.
Gurgurov, D.; Bäumel, T.; Anikina, T. Multilingual Large Language Models and Curse of Multilinguality. arXiv 2024, arXiv:2406.10602.
Chang, T.A.; Arnett, C.; Tu, Z.; Bergen, B.K. When Is Multilinguality a Curse? Language Modeling for 250 Languages. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Miami, FL, USA, 12–16 November 2024; pp. 4074–4096.
Blevins, T.; Limisiewicz, T.; Gururangan, S.; Li, M.; Gonen, H.; Smith, N.A.; Zettlemoyer, L. Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Miami, FL, USA, 12–16 November 2024; pp. 10822–10837.
Zhao, Z.; Aletras, N. Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language Models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), Mexico City, Mexico, 16–21 June 2024; pp. 3226–3244.
Suraratchai, K.; Phoomvuthisarn, S. Thai Language Sentiment Analysis with a Hybrid Method on WangchanBERTa-CNN-BiLSTM. J. Inf. Sci. Technol. 2024, 14, 1–11.
Lowphansirikul, L.; Polpanumas, C.; Jantrakulchai, N.; Nutanong, S. WangchanBERTa: Pretraining transformer-based Thai Language Models. arXiv 2021, arXiv:2101.09635.
Jitboonyapinit, C.; Maneerat, P.; Chirawichitchai, N. Sentiment Analysis on Thai Social Media Using Convolutional Neural Networks and Long Short-Term Memory. Int. Sci. J. Eng. Technol. 2023, 7, 74–80.
Khamphakdee, N.; Seresangtakul, P. An Efficient Deep Learning for Thai Sentiment Analysis. Data 2023, 8, 90.
Nokkaew, M.; Nongpong, K.; Yeophantong, T.; Ploykitikoon, P.; Arjharn, W.; Siritaratiwat, A.; Narkglom, S.; Wongsinlatam, W.; Remsungnen, T.; Namvong, A.; et al. Analyzing online public opinion on Thailand-China high-speed train and Laos-China railway mega-projects using advanced machine learning for sentiment analysis. Soc. Netw. Anal. Min. 2023, 14, 15.
Satjathanakul, J.; Siriborvornratanakul, T. Sentiment analysis in product reviews in Thai language. Int. J. Inf. Technol. 2025, 17, 1979–1985.
Emphan, C.; Tiamkaew, E.; Khruahong, S. Enhancing the Performance of Sentiment Analysis Models Using GridSearchCV: A Case Study on Electric Vehicles in Thailand. J. Appl. Inform. Technol. 2026, 8, 260631.
Suriyawongkul, A.; Chuangsuwanich, E.; Chormai, P.; Chantarapratin, N.; Prasertsom, P.; Sawatphol, J.; Yamada, N.; Rutherford, A.; Polpanumas, C.; Udomcharoenchaikit, C. PyThaiNLP/Wisesight Sentiment Corpus with Word Tokenization Label (v1.1). Zenodo 2024.
Kim, Y. Convolutional Neural Networks for Sentence Classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1746–1751.
Minaee, S.; Kalchbrenner, N.; Cambria, E.; Nikzad, N.; Chenaghlu, M.; Gao, J. Deep Learning-based Text Classification: A Comprehensive Review. ACM Comput. Surv. 2021, 54, 1–40.
Kalchbrenner, N.; Grefenstette, E.; Blunsom, P. A Convolutional Neural Network for Modelling Sentences. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, MD, USA, 23–24 June 2014.
Data Science and Machine Learning Research Group. 40-Thai-Children-Stories. 2019. Available online: https://github.com/dsmlr/40-Thai-Children-Stories (accessed on 28 April 2026).

Figure 1. Text length distribution per category.

Figure 2. Class distribution of the WISESIGHT dataset.

Figure 3. Class distribution across train, validation, and test sets.

Figure 4. Framework of the proposed WangchanBERTa-IC-DKS model.

Figure 5. Confusion matrix of the proposed WangchanBERTa-IC-DKS model on the WISESIGHT test set.

Figure 6. Confusion matrix of the proposed WangchanBERTa-IC-DKS model on the 40 Thai children’s tales test set.

Table 1. Dataset statistics for training, validation, and testing subsets of the WISESIGHT dataset.

Number of Messages	Training	Validation	Testing
Total	21,628	2404	2671
#Neutral	11,795	1291	1453
#Negative	5491	637	683
#Positive	3866	434	478
#Question	476	42	57
Avg. Words	27.21	27.18	27.12
Avg. Chars	89.82	89.50	90.36
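
The class imbalance visible in Table 1 can be made concrete by converting the training counts into shares. The following is a minimal sketch that uses only the counts from Table 1; the variable names are illustrative and not taken from the paper.

```python
# Training-split class counts copied from Table 1.
train_counts = {"Neutral": 11795, "Negative": 5491, "Positive": 3866, "Question": 476}

total = sum(train_counts.values())  # should equal the "Total" row (21,628)

# Share of each class in the training split, in percent.
shares = {label: 100.0 * n / total for label, n in train_counts.items()}

for label, pct in shares.items():
    print(f"{label:<8} {pct:5.2f}%")

# Ratio between the largest and the smallest class.
imbalance_ratio = max(train_counts.values()) / min(train_counts.values())
print(f"Neutral/Question ratio: {imbalance_ratio:.1f}")
```

On these counts, the Question class makes up roughly 2% of the training messages, about 25 times fewer than the Neutral class.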

Table 2. Dataset statistics for training, validation, and testing subsets of the 40 Thai children’s tales dataset.

Number of Messages	Training	Validation	Testing
Total	669	223	223
#Neutral	305	102	101
#Negative	179	59	60
#Positive	185	62	62
Avg. Words	17.41	17.66	17.24
Avg. Chars	68.86	69.49	67.82

Table 3. Kernel size selection comparison for WangchanBERTa-IC-DKS on the WISESIGHT dataset.

Kernel Candidate	Accuracy	Macro-Precision	Macro-Recall	Macro-F1
[2, 3, 4]	71.09 ± 0.0169	63.92 ± 0.0056	54.07 ± 0.0604	56.20 ± 0.0512
[3, 4, 5]	69.84 ± 0.0306	59.89 ± 0.0356	56.28 ± 0.0534	57.40 ± 0.0530
[2, 3, 4, 5]	73.58 ± 0.0073	67.53 ± 0.0489	62.52 ± 0.0266	63.93 ± 0.0138
[1, 2, 3, 4, 5]	68.90 ± 0.0195	64.50 ± 0.0612	50.57 ± 0.0805	51.27 ± 0.0641

Table 4. Ablation study of dynamic kernel selection components on the WISESIGHT dataset.

Model	Accuracy	Macro-Precision	Macro-Recall	Macro-F1
WangchanBERTa-CNN (Fixed Multi-Kernel)	69.81 ± 0.0222	62.03 ± 0.0225	54.59 ± 0.0379	56.25 ± 0.0267
WangchanBERTa-CNN (Class-Aware DKS Only)	70.58 ± 0.0217	63.59 ± 0.0311	54.34 ± 0.0659	56.66 ± 0.0492
WangchanBERTa-CNN (Instance-Aware DKS Only)	70.81 ± 0.0265	63.14 ± 0.0280	53.26 ± 0.0446	55.35 ± 0.0564
WangchanBERTa-IC-DKS (Proposed Model)	73.58 ± 0.0073	67.53 ± 0.0489	62.52 ± 0.0266	63.93 ± 0.0138

Table 5. Performance comparison of the internal baselines and the proposed WangchanBERTa-IC-DKS model on the WISESIGHT dataset.

Model	Accuracy	Macro-Precision	Macro-Recall	Macro-F1
WangchanBERTa	66.09 ± 0.0520	59.10 ± 0.0639	49.03 ± 0.0888	51.75 ± 0.0877
WangchanBERTa-BiLSTM	72.42 ± 0.0274	61.16 ± 0.0186	60.16 ± 0.0146	60.09 ± 0.0185
WangchanBERTa-BiGRU	67.90 ± 0.0437	54.54 ± 0.0862	52.30 ± 0.1260	51.31 ± 0.1174
WangchanBERTa-IC-DKS (Proposed Model)	73.58 ± 0.0073	67.53 ± 0.0489	62.52 ± 0.0266	63.93 ± 0.0138

Table 6. Comparison of computational efficiency and predictive performance across different models.

Model	Total Params (M)	Inference Time (s)	Macro-F1
WangchanBERTa	105.25	18.49 ± 0.0103	51.75 ± 0.0877
WangchanBERTa-CNN (Fixed Multi-Kernel)	106.62	18.80 ± 0.0388	56.25 ± 0.0267
WangchanBERTa-CNN (Class-Aware DKS Only)	106.62	18.75 ± 0.0398	56.66 ± 0.0492
WangchanBERTa-CNN (Instance-Aware DKS Only)	106.62	18.74 ± 0.0202	55.35 ± 0.0564
WangchanBERTa-IC-DKS (Proposed Model)	106.62	18.85 ± 0.0107	63.93 ± 0.0138
WangchanBERTa-BiLSTM	107.35	19.19 ± 0.0227	60.09 ± 0.0185
WangchanBERTa-BiGRU	106.82	19.05 ± 0.0291	51.31 ± 0.1174
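
The trade-off in Table 6 is easier to read as relative overhead against the plain WangchanBERTa baseline. The sketch below is an illustrative calculation on the mean values reported in Table 6; the helper name `overhead` is hypothetical, and the standard deviations are ignored.

```python
# Values copied from Table 6:
# (parameters in millions, mean inference time in s, mean macro-F1).
table6 = {
    "WangchanBERTa": (105.25, 18.49, 51.75),
    "WangchanBERTa-IC-DKS": (106.62, 18.85, 63.93),
    "WangchanBERTa-BiLSTM": (107.35, 19.19, 60.09),
}

def overhead(model, baseline="WangchanBERTa"):
    """Return (parameter overhead %, time overhead %, macro-F1 gain in points)."""
    p, t, f = table6[model]
    p0, t0, f0 = table6[baseline]
    return 100.0 * (p - p0) / p0, 100.0 * (t - t0) / t0, f - f0

for name in ("WangchanBERTa-IC-DKS", "WangchanBERTa-BiLSTM"):
    dp, dt, df = overhead(name)
    print(f"{name}: +{dp:.2f}% params, +{dt:.2f}% time, +{df:.2f} macro-F1 points")
```

On these figures, the proposed model adds about 1.3% parameters and about 2% inference time over plain WangchanBERTa, for a gain of about 12 macro-F1 points.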

Table 7. Performance comparison of the proposed WangchanBERTa-IC-DKS model with previous studies on the WISESIGHT dataset.

Model	Accuracy	Macro-Precision	Macro-Recall	Macro-F1
BiLSTM-CNN [1]	N/A	N/A	N/A	55.21
Parallel Hybrid [25] (SOTA)	73.64	67.04	60.19	62.70
WangchanBERTa-IC-DKS (Proposed Model)	73.58 ± 0.0073	67.53 ± 0.0489	62.52 ± 0.0266	63.93 ± 0.0138

N/A indicates that the corresponding metric was not reported in the original study.

Table 8. Per-class performance comparison between the previous state-of-the-art Parallel Hybrid model and the proposed WangchanBERTa-IC-DKS model on the WISESIGHT dataset.

Class	Model	F1
Negative	Parallel Hybrid (SOTA) [25]	76.74
Negative	WangchanBERTa-IC-DKS (Proposed Model)	78.11 ± 0.0064
Neutral	Parallel Hybrid (SOTA) [25]	78.76
Neutral	WangchanBERTa-IC-DKS (Proposed Model)	78.16 ± 0.0069
Positive	Parallel Hybrid (SOTA) [25]	52.33
Positive	WangchanBERTa-IC-DKS (Proposed Model)	53.83 ± 0.0046
Question	Parallel Hybrid (SOTA) [25]	42.96
Question	WangchanBERTa-IC-DKS (Proposed Model)	45.62 ± 0.0432
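
Macro-F1 is the unweighted mean of the per-class F1 scores, so the per-class values in Table 8 can be checked against the macro-F1 values in Table 7. The following sketch performs that check on the reported means only; the function name `macro_f1` is illustrative.

```python
# Per-class F1 scores copied from Table 8 (means only, standard deviations omitted).
per_class_f1 = {
    "Parallel Hybrid": {
        "Negative": 76.74, "Neutral": 78.76, "Positive": 52.33, "Question": 42.96,
    },
    "WangchanBERTa-IC-DKS": {
        "Negative": 78.11, "Neutral": 78.16, "Positive": 53.83, "Question": 45.62,
    },
}

def macro_f1(scores):
    """Unweighted mean of per-class F1: every class counts equally, whatever its size."""
    return sum(scores.values()) / len(scores)

results = {model: macro_f1(scores) for model, scores in per_class_f1.items()}
for model, value in results.items():
    print(f"{model}: macro-F1 = {value:.2f}")

gain = results["WangchanBERTa-IC-DKS"] - results["Parallel Hybrid"]
print(f"Difference: {gain:.2f} percentage points")
```

Because every class carries equal weight, the 2.66-point gain on the small Question class contributes as much to macro-F1 as the same gain on a large class would.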

Table 9. Class-aware kernel weight distributions across three random seeds.

Class	$k = 2$	$k = 3$	$k = 4$	$k = 5$
Positive	0.3202 ± 0.3050	0.1661 ± 0.0802	0.1173 ± 0.0521	0.3964 ± 0.3484
Neutral	0.1092 ± 0.0328	0.0721 ± 0.0407	0.3515 ± 0.2942	0.4672 ± 0.2668
Negative	0.1275 ± 0.1072	0.1797 ± 0.1618	0.3149 ± 0.0180	0.3779 ± 0.1521
Question	0.1520 ± 0.1656	0.2615 ± 0.1196	0.1837 ± 0.1511	0.4025 ± 0.3546
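
A quick sanity check on Table 9 is that the mean kernel weights of each class sum to approximately one. This is what would be expected if the weights are normalized per class, for example with a softmax, which is an assumption here rather than something the table states. The sketch below uses the reported means only.

```python
# Mean class-aware kernel weights copied from Table 9 (columns k = 2, 3, 4, 5).
kernel_sizes = [2, 3, 4, 5]
class_weights = {
    "Positive": [0.3202, 0.1661, 0.1173, 0.3964],
    "Neutral":  [0.1092, 0.0721, 0.3515, 0.4672],
    "Negative": [0.1275, 0.1797, 0.3149, 0.3779],
    "Question": [0.1520, 0.2615, 0.1837, 0.4025],
}

row_sums = {}
top_kernel = {}
for label, weights in class_weights.items():
    row_sums[label] = sum(weights)
    # Kernel size with the largest mean weight for this class.
    top_kernel[label] = kernel_sizes[weights.index(max(weights))]
    print(f"{label:<8} sum={row_sums[label]:.4f} largest mean weight at k={top_kernel[label]}")
```

The largest mean weight falls on k = 5 for every class, but the standard deviations in Table 9 are large (up to about 0.35), so this ranking may not hold for each individual seed.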

Table 10. Instance-aware kernel weight distributions for representative samples.

ID	True Label	$k = 2$	$k = 3$	$k = 4$	$k = 5$
S1	Positive	0.1614 ± 0.1534	0.3589 ± 0.1548	0.1815 ± 0.0800	0.2359 ± 0.2338
S2	Neutral	0.2042 ± 0.1374	0.4000 ± 0.1067	0.1535 ± 0.0976	0.2423 ± 0.2447
S3	Neutral	0.2884 ± 0.2314	0.2821 ± 0.1660	0.0995 ± 0.0787	0.3301 ± 0.3195
S4	Question	0.1526 ± 0.1344	0.1957 ± 0.2700	0.2425 ± 0.2270	0.4098 ± 0.2530
S5	Question	0.0322 ± 0.0292	0.0197 ± 0.0221	0.0604 ± 0.0549	0.8876 ± 0.1034
S6	Negative	0.1208 ± 0.1384	0.2799 ± 0.1717	0.4179 ± 0.2257	0.1814 ± 0.1930
S7	Negative	0.1317 ± 0.1331	0.5009 ± 0.0462	0.2792 ± 0.1178	0.0887 ± 0.0948
S8	Positive	0.0084 ± 0.0100	0.0983 ± 0.1644	0.0053 ± 0.0085	0.8882 ± 0.1912

Notes: S1: รักก้อนนะ (Love Kon, dear.); S2: จัดเลยม๊ะ (Shall we go for it?); S3: ครับผม (Yes, sir.); S4: รสชาติ เป็นไงมังคับ (How does it taste?); S5: พี่แพรมีน้ำหอม Armani code women ไหมค่ะ (P’Prae, do you have Armani Code Women perfume?); S6: จืดกว่าไฮเนเก้นอีกหรอ (Is it even blander than Heineken?); S7: ไม่ไปแล้ว แยกย้าย (I’m not going anymore, let’s go our separate ways.); S8: สำหรับงาน Transmisson: The Spirit of Warrior ต้องขวดนี้ค่ะ Jack Daniel’s Tennessee Whiskey honey เป็นวิสกี้ที่โดดเด่นในเรื่องของความหอมหวานราวกับน้ำผึ้ง(เดือน5) เพราะนำเอา Honey Liqueur ถึง 4 ชนิดผสมเข้าไป ทำให้ได้รสเข้มข้นเหมือนกับดื่มน้ำผึ้งอยู่จริง ๆ แต่ไม่ได้หวานมากจนบาดคอนะคะ ผู้ชาย ก็ชอบผู้หญิงก็ต้องบอกว่าใช่!! แล้วเจอกันค่ะ <3 (For the event Transmission: The Spirit of Warrior, this has to be the bottle—Jack Daniel’s Tennessee Whiskey Honey. It is a whiskey well known for its rich aroma and honey-like sweetness because it is blended with four types of Honey Liqueur, giving it a rich flavor as if you were actually drinking honey. However, it is not overly sweet or too strong on the throat. Men like it, and women would definitely say yes too!! See you there <3).

Table 11. Examples of misclassified samples from the WISESIGHT test set for error analysis.

ID	Confidence	Error Type	Reason
S11	0.988	Neu→Neg	Complaint-like lexical cues lead to negative interpretation
S12	0.986	Pos→Neu	Implicit positive sentiment without strong sentiment words
S13	0.986	Pos→Neu	Weak positive expression and ambiguous sentiment
S14	0.917	Pos→Neu	Weak positive sentiment easily confused with neutral
S15	0.903	Neg→Neu	Polite complaint weakens negative sentiment signal
S16	0.895	Q→Neu	Question form resembles neutral information request
S17	0.881	Q→Neu	Short interrogative without explicit sentiment polarity

Notes: S11: โปรเฉพาะแค่ 6 ชิ้นหรอ ไปซื้อมาจะซื้อ 10 ชิ้น พนักงานบอกโปรแค่ 6 ชิ้น (The promotion is limited to only 6 items. I am planning to buy 10, but the staff told me the promotion applies to only 6 items.); S12: จิ๋ว เชียงราย - ต๋อง มัตซูชิ รอบ 16 คน ระบบ 5/9 เฟรม ชนะ 3 เฟรม คว้าแชมป์ไปครองได้สำเร็จ (Jiew Chiang Rai defeated Tong Matsushi in the round of 16 under the 5/9 frame format, winning 3 frames and successfully claiming the championship.); S13: หาซื้อผ้าอนามัยลอรีเอะ ซูเปอร์ เจนเทิล พลัส ได้ที่ไหนคะ ตอนนี้หาซื้อไม่ได้เลยค่ะ (I am looking for Laurier Super Gentle Plus sanitary pads, but I haven’t been able to find them anywhere right now.); S14: โอเคอยู่ ใช้งานได้ปกติ ไม่มีปัญหาอะไร (It is okay and works normally without any problems.); S15: ทีมงานต้องขออภัยเป็นอย่างยิ่งครับ เนื่องจากรถยนต์ไม่สามารถส่งมอบได้ตามกำหนด (The team sincerely apologizes because the vehicle could not be delivered as scheduled.); S16: มีขายไหมครับ สนใจมากครับ (Is it available for sale? I’m very interested.); S17: ขอราคา New Pajero Sport คับ (Could I get the price of the New Pajero Sport?).

Table 12. Performance comparison on the 40 Thai children’s tales dataset.

Model	Accuracy	Macro-Precision	Macro-Recall	Macro-F1
BiLSTM-CNN [1]	N/A	N/A	N/A	74.36
Parallel Hybrid [25] (SOTA)	79.42	79.09	78.18	78.59
WangchanBERTa-IC-DKS (Proposed Model)	80.42 ± 0.0093	81.38 ± 0.0201	79.61 ± 0.0216	79.95 ± 0.0125

N/A indicates that the corresponding metric was not reported in the original study.

Table 13. Per-class performance comparison between the previous state-of-the-art Parallel Hybrid model and the proposed WangchanBERTa-IC-DKS model on the 40 Thai children’s tales dataset.

Class	Model	F1
Negative	Parallel Hybrid (SOTA) [25]	77.59
Negative	WangchanBERTa-IC-DKS (Proposed Model)	77.18 ± 0.0268
Neutral	Parallel Hybrid (SOTA) [25]	82.02
Neutral	WangchanBERTa-IC-DKS (Proposed Model)	81.88 ± 0.0198
Positive	Parallel Hybrid (SOTA) [25]	75.35
Positive	WangchanBERTa-IC-DKS (Proposed Model)	80.79 ± 0.0315

Table 14. Examples of misclassified samples from the 40 Thai children’s tales test set for error analysis.

ID	Confidence	Error Type	Reason
S21	0.952	Neu→Neg	Context-related negative cues may have biased the model toward the negative class
S22	0.889	Pos→Neu	Weak positive sentiment and contextual ambiguity make positive meaning difficult to detect
S23	0.878	Neg→Neu	Complaint-like lexical cues trigger negative interpretation despite neutral intent
S24	0.850	Pos→Neu	Instructional or explanatory statement rather than a clear expression of emotional sentiment
S25	0.849	Neg→Neu	Question form resembles neutral information request without clear interrogative markers
S26	0.805	Neu→Pos	Short interrogative sentence lacks explicit question indicators, causing neutral prediction
S27	0.788	Neu→Pos	Ambiguous sentiment polarity with mixed contextual cues confuses class boundaries

Notes: S21: ท่ามกลางความมืดมิดในยามค่ำคืน ชายชราคนหนึ่งกำลังนั่งอ่านหนังสืออย่างเงียบๆ (Amid the darkness of the night, an old man is sitting quietly reading a book.); S22: พระอาทิตย์พูดขึ้นว่า ฉันจะสร้างสะพานนี้เพื่อเดินทางไปหาเมฆเพื่อนรัก (The sun said, ‘I will build this bridge to travel and meet my dear friend, the cloud.’); S23: เธอสงสารเจ้าไมกี้มาก (She feels very sorry for Mikey.); S24: พ่อจะสอนให้เจ้าขุดรูใต้ดิน เจ้าจะได้รู้ว่าจะซ่อนลูกโอ๊กไว้ที่ไหนเมื่อมันสุกงอม (Father will teach you how to dig holes underground, so you will know where to hide the acorns when they are fully ripe.); S25: โอ้ ไม่ใช่ มันร้อง ข้ายังไม่ได้วิ่งเข้าเส้นชัยนี่ (Oh, no! it cried. I haven’t crossed the finish line yet!); S26: ฉันตระหนักถึงเรื่องเหล่านี้ดี แต่ฉันอดนึกถึงหมาป่าตัวหนึ่งที่ฉันเคยเห็นอยู่บนเนินเขาไม่ได้ (I am well aware of these things, but I cannot help thinking of a wolf I once saw on the hillside.); S27: ต้นสนน้อยและเพื่อนเดินทางมา จากในป่าให้พวกเราได้ใช้งาน (The little pine tree and its friends have traveled from the forest for us to use.).

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

