Article

DPIBP: Dining Philosophers Problem-Inspired Binary Patterns for Facial Expression Recognition

by Archana Pallakonda 1, Rama Muni Reddy Yanamala 2, Rayappa David Amar Raj 3, Christian Napoli 5,6 and Cristian Randieri 4,5,*
1 Department of Computer Science and Engineering, National Institute of Technology Warangal, Warangal 506004, Telangana, India
2 Department of Electronics and Communication Engineering, Indian Institute of Information Technology Design and Manufacturing (IIITD&M) Kancheepuram, Chennai 600127, Tamil Nadu, India
3 Amrita School of Artificial Intelligence, Amrita Vishwa Vidyapeetham, Coimbatore 641112, Tamil Nadu, India
4 Department of Theoretical and Applied Sciences, eCampus University, Via Isimbardi 10, 22060 Novedrate, Italy
5 Department of Computer, Control, and Management Engineering, Sapienza University of Rome, Via Ariosto 25, 00185 Rome, Italy
6 Department of Artificial Intelligence, Czestochowa University of Technology, ul. Dąbrowskiego 69, 42-201 Czestochowa, Poland
* Author to whom correspondence should be addressed.
Technologies 2025, 13(9), 420; https://doi.org/10.3390/technologies13090420
Submission received: 19 August 2025 / Revised: 5 September 2025 / Accepted: 10 September 2025 / Published: 18 September 2025

Abstract

Emotion recognition plays a crucial role in day-to-day communication, and detecting emotions is one of the most formidable tasks in human–computer interaction (HCI). Facial expressions are the most straightforward and efficient way to identify emotions. Although automatic facial expression recognition (FER) is essential for numerous real-world applications in computer vision, developing a feature descriptor that accurately captures the subtle variations in facial expressions remains a significant challenge. To address this issue, this work proposes a novel feature extraction technique inspired by the Dining Philosophers Problem, named Dining Philosophers Problem-Inspired Binary Patterns (DPIBP). The proposed DPIBP methods extract three features in a local 5 × 5 neighborhood by considering the impact of both the neighboring pixels and the adjacent pixels on the current pixel. To categorize facial expressions, the system uses a multi-class Support Vector Machine (SVM) classifier. Reflecting real-world use, the method was tested on the JAFFE, MUG, CK+, and TFEID benchmark datasets using a person-independent protocol. The proposed DPIBP achieved superior performance compared to existing techniques that rely on handcrafted features.

Graphical Abstract

1. Introduction

Understanding emotions is crucial for effective communication, but pinpointing them through technology remains a complex challenge. Facial expressions offer the most direct and readily interpretable window into a person's emotional state [1,2]. Humans can learn a great deal about others' emotions from facial expressions, which serve as emotional markers [3]. As a result, people can quickly determine the emotional state of another person from facial expressions alone [4]. The capability to recognize emotions through facial expressions is vital for machines to interact effectively with humans. Consequently, facial expression data is a fundamental component in developing automatic emotion recognition systems. Facial expression recognition systems leverage changes in facial features to infer emotional states.
Face analysis tasks have many real-time applications in virtual reality, animation, human–computer interaction, pain diagnosis, driver mood detection, etc. [5]. Recent work has demonstrated that compact FER pipelines combining spatial transformer modules and multi-head self-attention can both improve robustness and be coupled in real time to ambient actuators (for example, RGB lighting) in Ambient Assisted Living scenarios [6]. FER systems have surged in popularity within the research community, fueled by their vast potential across various applications. A typical facial expression recognition system has three primary stages: face acquisition, feature extraction, and classification [4]. The face acquisition stage segments the facial region within the input image: facial landmarks are detected, and the image is cropped to isolate the facial region of interest (ROI) and exclude irrelevant background information. Features are then extracted from the selected facial region. Since the extracted features largely determine how well a FER system performs, efficiently extracting important information from the face is essential [7]. To achieve efficient expression classification, the extracted features are fed into machine learning algorithms known for their effectiveness in this domain, including K-Nearest Neighbors (KNN) for fast, similarity-based classification, Support Vector Machines (SVMs) for robust decision boundaries, neural networks (NNs) for capturing complex relationships in the data, and Decision Trees (DTs) for interpretable classification rules. Existing studies show that SVM outperforms other classifiers for FER [5,7,8]. FER systems aim to classify facial images into a set of basic emotions, typically including surprise, sadness, neutrality, joy, fear, disgust, and anger [7].
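The three-stage pipeline described above can be illustrated with a minimal sketch. The Python snippet below uses a Haar-cascade face detector for acquisition, a uniform-LBP histogram as a stand-in texture descriptor (not the DPIBP descriptor proposed later), and a one-vs-rest linear SVM for classification; all function names and parameter values are illustrative assumptions rather than the paper's implementation.

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

# Stage 1: face acquisition -- detect the largest face and crop/resize the ROI.
_face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def acquire_face(image_bgr, size=(120, 120)):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = _face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])   # keep the largest detection
    return cv2.resize(gray[y:y + h, x:x + w], size)

# Stage 2: feature extraction -- here a uniform-LBP histogram as a placeholder descriptor.
def extract_features(face_roi):
    lbp = local_binary_pattern(face_roi, P=8, R=1, method="uniform")
    hist, _ = np.histogram(lbp, bins=10, range=(0, 10))
    return hist / (hist.sum() + 1e-8)

# Stage 3: classification -- one-vs-rest SVM over the seven basic expressions.
def train_classifier(features, labels):
    clf = SVC(kernel="linear", decision_function_shape="ovr")
    clf.fit(features, labels)
    return clf
```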
Feature extraction is arguably the most critical stage in FER systems. The quality of the extracted features heavily influences classification accuracy: even the most sophisticated classification algorithms will struggle to categorize expressions accurately if provided with irrelevant or uninformative features [7]. The differences between facial expressions are often minimal, and accurately classifying them requires capturing those minute differences. According to the studies [4,5,7,8], texture-based feature descriptors can efficiently extract noteworthy features from a facial image. A higher level of security for biometric systems has recently been achieved through the use of facial expressions [9]. Convolutional neural networks (CNNs) can automatically extract features from face images [10,11,12,13,14]. However, CNNs suffer from limited model generalization and high training-data and hardware requirements. To address this challenge, this work introduces a novel feature extraction technique called Dining Philosophers Problem-Inspired Binary Patterns (DPIBP).
Here are the key advancements/innovations/contributions this work introduces:
  • Inspired by the Dining Philosophers Problem, a novel feature extraction method named Dining Philosophers Problem-Inspired Binary Patterns (DPIBP) is proposed to extract robust features in a local 5 × 5 neighborhood.
  • Four variants of the DPIBP method, namely DPIBP1, DPIBP2, DPIBP3, and DPIBP4, are proposed, corresponding to angles of 0°, 90°, 180°, and 270°.
  • Each DPIBP method generates three feature codes by considering the positions of philosophers, chopsticks, and noodles in a local 5 × 5 neighborhood, with lower feature dimensionality than traditional variants of Local Binary Patterns (LBPs).
  • The proposed DPIBP methods have been evaluated on standard FER datasets such as JAFFE, CK+, MUG, and TFEID in person-independent protocol to validate the efficiency in real-time scenarios.

2. Related Work

For FER-based systems to operate better, it is imperative to extract reliable, accurate, and stable features. Two categories of face feature extraction methods are found in the literature: appearance-based and geometry-based [7,15]. Geometry-based methods extract facial features by considering the shapes and positions of distinct facial components. In contrast, appearance-based techniques use image filters either universally to capture holistic features or locally to extract fine details from facial images [16]. Even though researchers have extensively studied global methods like Eigenfaces [17] and Fisher faces [18], local feature descriptors have gained favor due to their efficient computation and resilience to variations in lighting. This makes them well-suited for real-world applications of facial expression recognition. Local approaches focus on two main areas: texture and edges [7]. The most often used texture-based feature descriptor for local based approaches that can withstand monotonic fluctuations in light is Local Binary Pattern (LBP). Regional Adaptive Affinitive Patterns (RADAP-LOs) [7] are designed to be robust against variations within emotions and uneven lighting. To capture even more subtle expressions, the researchers created variations of RADAP-LO, namely XRADAP-LO, ARADAP-LO, and DRADAP-LO. Recent research has highlighted key challenges and complementary directions in facial analysis. A survey on deep learning-based FER [19] reviews datasets, protocols, and architectures, emphasizing persistent issues such as overfitting, illumination, pose, and identity bias. Similarly, advances in face manipulation detection [20] propose methods that both detect forgeries and localize manipulated regions using semantic segmentation and noise maps. These studies underline the progress and remaining challenges in building robust and trustworthy systems, reinforcing the need for lightweight yet effective descriptors such as the proposed DPIBP.
Kola et al. [5] combined wavelet transform with local gradient coding based on horizontal and diagonal (LGC-HD) operators and singular values to generate robust features, resulting in high recognition rates on benchmark FER datasets. Kartheek et al. [21] proposed Chess Pattern (CSP), a technique based on chess piece movements, which outperformed the existing systems by effectively extracting features from different facial regions. Radial Mesh Pattern (RDMP) [4] has been proposed to address the limitations of existing techniques and to produce distinct feature codes for various image regions. Another FER method [22], inspired by the Knight’s Tour problem, utilizes Knight Tour Patterns for feature extraction from local textures in facial images. By varying neighborhood sizes and exploring diverse weighting schemes, the approach aims to enhance data description efficiency. Kartheek et al. introduced three feature descriptors for facial expression recognition: Radial Cross Pattern, Chess Symmetric Pattern, and Radial Cross Symmetric Pattern [23] for addressing limitations of existing methods and for enhancing distinctive feature extraction within a 5 × 5 pixel neighborhood. A novel method is proposed by Shanthi et al. [24] to enhance the FER by fusing LBP and Local Neighborhood Encoded Pattern (LNEP) to capture the pixel relationships for improved accuracy and noise resilience in computer vision. To address the limitations of traditional approaches like LBP, Kola et al. [8] introduced LBP Adaptive Window by analyzing diagonal, horizontal, and vertical neighbors separately and by incorporating adaptive windowing and radial averaging.
Compass masks are applied in a local neighbourhood by the local edge-based methods [25,26,27], which generate feature codes using the top responses. Extracting features from small areas can be problematic for edge-based methods. Noise and slight distortions can significantly affect them, leading to inconsistent codes for different edge orientations. To address this, Iqbal et al. proposed the Neighborhood-Aware Edge Directional Pattern [28]; however, because the coding system only takes into account a 3 × 3 neighborhood for feature extraction, NEDP occasionally suffers from conserving the global shapes. A novel method named Multi-View Laplacian Least Squares (MVLLS) is introduced [29] for emotion recognition in computer interactions and Artificial Intelligence (AI), addressing the challenges of multi-view learning and local variations within emotion categories. A lightweight neural network, HiNet [30], is proposed for FER, leveraging the HyFeat block to capture essential and refined features. HiNet demonstrates high accuracy with fewer parameters, outperforming existing methods in experiments across multiple datasets. An efficient FER approach proposed by Sun et al. [31] addresses the limitations in existing methods by autonomously identifying key features, particularly focusing on subtle details like eyes and mouth. This method introduces self-adaptive feature learning, separating identity from expression, building active feature dictionaries, reducing redundancy, and employing an active learning model for classification.
By leveraging feature combination and a voting strategy, another FER method improves accuracy [33], addressing challenges in AI-driven FER. A FER deep neural net (FER-net) [34] uses deep networks to enhance FER accuracy, addressing the challenges posed by manual feature design in existing methods; FER-net outperforms other tested methods across various datasets, highlighting its efficacy in human–computer interaction and psychological applications. A novel approach utilizing a generative adversarial network (GAN) improved multi-view FER by training a CNN on frontal faces [35], embedding it into a GAN for synthesizing frontal views from tilted faces, and retraining the classifier on both original and synthesized images. Shi et al. [36] proposed the Multiple Branch Cross-Connected Convolutional Neural Network (MBCC-CNN) to improve the ability of FER systems to identify expressions; this network combines residual connections, Network in Network (NiN), and tree structures. The authors in [37] discuss how AI, including FER approaches, improves journalism by examining audience emotions during interviews or political debates, and emphasize how FER approaches like DPIBP can help in personalized content delivery and emotion analysis in the media. The review in [38] concentrates on the application of FER in elderly care, emphasizing its usefulness in detecting emotional states for better caregiving; it highlights how DPIBP offers a resource-efficient, lightweight alternative to deep learning-based approaches for emotion recognition in elderly care.
Existing studies have shown that edge-based descriptors are often affected by small local distortions and can produce inconsistent, unreliable codes in smoother regions. Deep learning models, even without capturing every intricate detail, have demonstrably improved the overall accuracy of facial expression recognition systems. However, the generalizability of CNN-based approaches is not guaranteed; as highlighted by recent comparative analyses in manipulation detection, their performance strongly depends on the intrinsic characteristics of the datasets employed [39]. While deep learning models boast impressive capabilities, their power comes at a cost: their intricate nature necessitates massive datasets and substantial computational resources for proper training. If these resources are limited, the model may overfit, performing well on the training data but struggling with new, unseen information.
Extracting deep features for facial expressions is computationally expensive, and a significant portion of the extracted features may not be relevant for recognizing emotions. Furthermore, "in the lab" FER datasets contain relatively few images, so a restricted amount of data is available for training. Because the differences between two facial expressions can be very small, capturing them correctly is critical for classification. Research suggests that local texture-based approaches are effective in extracting these subtle details from facial images, even when limited training data is available. As a result, the proposed method has been designed, inspired by local approaches, to capture minute expression changes in facial images. In facial expression recognition, Local Binary Pattern (LBP) and its variants [40] have been extensively employed for feature extraction due to their simplicity and computational efficiency. Uniform or traditional LBP captures local texture by comparing a central pixel's intensity to its neighbors. The extended LBP enhances feature representation by considering more neighbors, while Rotated LBP and Rotation-Invariant LBP address rotation invariance, making them more robust to facial pose changes. Recent LBP optimization strategies, such as LBP feature learning, use deep learning to automatically obtain discriminative binary patterns, while Adaptive LBP dynamically modifies window sizes for enhanced performance under variable lighting. Despite this progress, we selected the handcrafted DPIBP for its balance of interpretability and computational efficiency, which is especially appropriate in resource-constrained environments where optimized LBP techniques or deep learning models may be impractical due to their higher complexity and data requirements.
The proposed DPIBP is motivated by the Dining Philosophers Problem, where the relationships between neighboring pixels (philosophers) and their spatial context (chopsticks and noodles) are essential to capturing subtle facial expression variations. Unlike traditional Local Binary Patterns, which concentrate only on intensity comparisons within a fixed neighborhood, DPIBP incorporates both local pixel relationships and contextual information, allowing it to distinguish slight differences in facial expressions more effectively. Additionally, DPIBP is computationally efficient, delivering real-time processing with lower memory usage and inference time than complex deep learning models such as CNNs, which makes it well suited to resource-constrained environments such as mobile devices. This blend of high recognition accuracy and efficiency makes DPIBP an effective solution for real-time facial expression recognition tasks.

3. Proposed Method

3.1. Dining Philosophers Problem

The Dining Philosophers problem is a classic example of a challenge in coordinating multiple tasks. Imagine five philosophers sitting at a round table with a bowl of noodles in the center and only five chopsticks [41,42]. Each philosopher wants to alternate between thinking and eating, but needs both the left and right chopstick to eat. The difficulty is that a philosopher can only pick up one chopstick at a time; if the desired chopstick is unavailable, the philosopher must put down the chopstick currently held and wait. Figure 1 illustrates this scenario, where P0 to P4 represent the philosophers and C0 to C4 represent the chopsticks. Inspired by this problem, this work proposes a new method for extracting features from images, called Dining Philosophers Problem-Inspired Binary Patterns (DPIBP), which analyzes small local areas (5 × 5 pixels) of an image.
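As a concrete illustration of the coordination problem, the short sketch below simulates five philosophers sharing five chopsticks with Python threads; acquiring the lower-numbered chopstick first is one standard way to avoid deadlock. This is only an illustration of the classic problem that inspires the descriptor, not part of the proposed method.

```python
import threading

N = 5
chopsticks = [threading.Lock() for _ in range(N)]   # C0..C4, each shared by two neighbours

def philosopher(i, meals=3):
    left, right = i, (i + 1) % N
    first, second = sorted((left, right))            # fixed acquisition order avoids circular wait
    for _ in range(meals):
        # think ...
        with chopsticks[first]:
            with chopsticks[second]:
                pass                                  # eat: both chopsticks held at once

threads = [threading.Thread(target=philosopher, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```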

3.2. Theoretical Grounding of the Dining Philosophers Analogy

The analogy between the Dining Philosophers Problem (DPP) and pixel relationships in DPIBP can be described in terms of modeling coordination and dependency within a local image neighborhood. In the DPP, each philosopher (agent) needs both neighboring chopsticks (resources) to eat, creating a dependency on its neighbors. DPIBP maps this setting by treating chosen peripheral pixels in a 5 × 5 neighborhood as philosophers, the center pixel as the noodles, and neighboring context pixels as chopsticks. "Availability" is encoded by comparing pixel intensities with the center using a binary function
$T(a, b) = \begin{cases} 1, & \text{if } a \ge b \\ 0, & \text{otherwise} \end{cases}$  (1)
as described in Equation (1).
Joint spatial dependency is modeled via logical AND of context pixel comparisons
$T(a_1, b) \wedge T(a_2, b)$  (2)
as shown in Equation (2).
Rotational variants (DPIBP1–DPIBP4) employ shifts to guarantee orientation invariance. The final DPIBP feature is a weighted binary sum capturing multiple neighborhood dependencies, as represented in Equation (3):
$DPIBP_{p} = \sum_{k} T(S_k, S_{\text{center}}) \cdot w_k$  (3)
where $S_k$ denotes the philosopher pixels and $w_k$ their spatial weights. This formalism provides a mathematically grounded, context-aware descriptor inspired by resource allocation dynamics in the DPP.
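A direct transcription of Equations (1)–(3) into code may help fix the notation. The sketch below is a minimal Python rendering, assuming the comparison is taken as a ≥ b (the usual LBP-style thresholding) and using the weight vector [1, 2, 4, 8, 16] from Equation (6); the helper names are chosen for illustration only.

```python
def T(a, b):
    """Binary availability test, Eq. (1): 1 if a >= b, else 0."""
    return 1 if a >= b else 0

def joint_dependency(a1, a2, b):
    """Eq. (2): both context (chopstick) pixels must be 'available' -- logical AND."""
    return T(a1, b) & T(a2, b)

def weighted_code(pixel_values, center, weights=(1, 2, 4, 8, 16)):
    """Eq. (3): weighted binary sum over the selected (philosopher) pixels."""
    return sum(T(s, center) * w for s, w in zip(pixel_values, weights))
```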

3.3. Feature Extraction Using DPIBP Method

DPIBP is a handcrafted, local texture-based feature extraction technique inspired by the Dining Philosophers problem. The DPIBP method uses overlapping squares of size 5 × 5 to extract local features from the face. A sample 5 × 5 block with pixel representations is shown in Figure 2a. As shown in Figure 1, in the Dining Philosophers problem, there are five philosophers, five chopsticks, and noodles placed at the center of the table. For extracting valuable features, the philosophers, chopsticks, and noodles are logically placed in a 5 × 5 neighborhood, as shown in Figure 2b. The pixel positions corresponding to the placement of philosophers, chopsticks, and noodles with respect to Figure 2a,b are shown in Figure 2c. For feature extraction, DPIBP utilizes a 5 × 5 neighborhood around a central pixel and analyzes the relative positions of neighboring pixels (philosophers) and their diagonal neighbors (chopsticks) within this area. Four variants of the DPIBP method, namely DPIBP1, DPIBP2, DPIBP3, and DPIBP4, are proposed, corresponding to angles of 0°, 90°, 180°, and 270°. Each DPIBP method extracts three feature codes (DPIBPpi, DPIBPci, and DPIBPpci, where i = 1, 2, 3, 4) locally. The positions of philosophers, chopsticks, and noodles in the 5 × 5 neighborhood for each variant, along with the corresponding pixel positions, are shown in Figure 2, Figure 3, Figure 4 and Figure 5.
All DPIBP methods yield three complementary feature codes, DPIBPp, DPIBPc, and DPIBPpc, which jointly capture complementary facets of local texture in a 5 × 5 neighborhood. DPIBPp encodes the direct intensity relationships between the philosopher pixels (chosen peripheral neighbors) and the center pixel (noodles), capturing primary local differences. DPIBPc compares the chopstick pixels (adjacent diagonal neighbors) with the center, encoding contextual and edge-adjacent information that complements DPIBPp. Finally, DPIBPpc integrates these concepts by applying a logical AND between the pair of chopstick pixels neighboring each philosopher, mimicking the "both resources available" condition from the Dining Philosophers analogy and thereby capturing joint pixel dependencies and more complex spatial interactions. The concatenation of the histograms of these three features therefore combines pixel intensity differences, contextual adjacency, and joint dependency patterns into one robust descriptor, promoting better discrimination of subtle facial expression changes. Their interaction balances local detail with contextual structure, improving classification performance by exploiting heterogeneous but complementary data characteristics.

3.3.1. DPIBP1 Feature Extraction

To extract discriminant information in a local neighborhood, the effect of the neighboring pixels on the current pixel is often studied [5,7,43]. For extracting features related to DPIBPp1, the positions of philosophers in a 5 × 5 neighborhood are considered and compared with the noodles (center pixel), as shown in Equations (4) and (5). As there are five philosophers, DPIBPp1 generates a feature with 5 bits, which is then transformed to a decimal number as shown in Equations (6) and (7). To capture features that describe DPIBPc1, the positions of chopsticks in a 5 × 5 neighborhood are considered and compared with the noodles (center pixel), as shown in Equation (8). As there are five chopsticks, DPIBPc1 generates a feature with 5 bits, which is converted into decimal form, as shown in Equation (9). Thus, for extracting DPIBPp1 and DPIBPc1, the pixels corresponding to philosophers and chopsticks are chosen and are correspondingly compared with the center pixel (S13).
$DPIBP_{pa} = \{\, T(S_3, S_{13}),\ T(S_{15}, S_{13}),\ T(S_{24}, S_{13}),\ T(S_{22}, S_{13}),\ T(S_{11}, S_{13}) \,\}$  (4)
$T(a, b) = \begin{cases} 1, & \text{if } a \ge b \\ 0, & \text{otherwise} \end{cases}$  (5)
$w = [\,1,\ 2,\ 4,\ 8,\ 16\,]$  (6)
$DPIBP_{p1} = \sum_{k=1}^{5} DPIBP_{pa}(k)\, w(k)$  (7)
$DPIBP_{ca} = \{\, T(S_9, S_{13}),\ T(S_{20}, S_{13}),\ T(S_{23}, S_{13}),\ T(S_{16}, S_{13}),\ T(S_7, S_{13}) \,\}$  (8)
$DPIBP_{c1} = \sum_{k=1}^{5} DPIBP_{ca}(k)\, w(k)$  (9)
In the Dining Philosophers problem, a philosopher can only eat the noodles if the chopsticks on either side are freely available. Based on this idea, the third feature, named DPIBPpc1, is extracted as shown in Equation (10). The logical AND operator is used in Equation (10) because the philosopher can only eat if both chopsticks are readily available. Like DPIBPp1 and DPIBPc1, DPIBPpc1 also generates a feature with 5 bits, which is then converted to a decimal number, as shown in Equation (11). Previous studies [44,45] have demonstrated that local facial regions are substantially correlated with expressional changes and that the use of histogram-based features in local descriptors makes them resilient to changes in expression, position, illumination, and noise [46]. Hence, histogram-based features have been extracted for each of DPIBPp1, DPIBPc1, and DPIBPpc1 and are correspondingly shown in Equations (12)–(14). Finally, all the features are horizontally concatenated to form a feature vector corresponding to DPIBP1, as shown in Equation (15).
$DPIBP_{pca} = \{\, T(S_7, S_3) \wedge T(S_9, S_3),\ T(S_9, S_{15}) \wedge T(S_{20}, S_{15}),\ T(S_{20}, S_{24}) \wedge T(S_{23}, S_{24}),\ T(S_{23}, S_{22}) \wedge T(S_{16}, S_{22}),\ T(S_{16}, S_{11}) \wedge T(S_7, S_{11}) \,\}$  (10)
$DPIBP_{pc1} = \sum_{k=1}^{5} DPIBP_{pca}(k)\, w(k)$  (11)
The calculation of DPIBPpc1 uses DPIBPpca, a 5-bit pattern obtained from ten pairwise intensity comparisons in the 5 × 5 neighborhood: for each philosopher, the two adjacent chopstick pixels are compared with that philosopher pixel, and the two binary outcomes are combined with a logical AND, yielding one bit per philosopher, as in Equation (10). The 5-element weight vector w assigns a weight to each philosopher position according to its placement within the neighborhood, underscoring the significance of specific pixel locations in the feature extraction approach. To compute DPIBPpc1, the 5-bit pattern DPIBPpca is multiplied element-wise by w, and the weighted bits are summed. The result is then expressed as a decimal number, which constitutes the third feature code, DPIBPpc1.
$H_{DPIBP_{p1}} = h(DPIBP_{p1})$  (12)
$H_{DPIBP_{c1}} = h(DPIBP_{c1})$  (13)
$H_{DPIBP_{pc1}} = h(DPIBP_{pc1})$  (14)
$F(DPIBP1) = [\, H_{DPIBP_{p1}},\ H_{DPIBP_{c1}},\ H_{DPIBP_{pc1}} \,]$  (15)
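To make the DPIBP1 construction concrete, the sketch below computes the three codes of Equations (4)–(11) for a single 5 × 5 block. It assumes the pixels S1–S25 are numbered row-wise so that S13 is the centre pixel (an assumption made here for illustration); the index sets are read directly from Equations (4), (8), and (10), and the helper names are not the paper's.

```python
import numpy as np

# Index sets for DPIBP1 (Eqs. (4), (8), (10)); S_k is assumed numbered row-wise,
# so S13 is the centre ("noodles") pixel.
PHIL_1  = [3, 15, 24, 22, 11]                              # philosopher pixels
CHOP_1  = [9, 20, 23, 16, 7]                               # chopstick pixels
PAIRS_1 = [(7, 9), (9, 20), (20, 23), (23, 16), (16, 7)]   # chopsticks flanking each philosopher
W = np.array([1, 2, 4, 8, 16])                             # weights, Eq. (6)

def _val(block, k):
    """Intensity of S_k in a 5x5 block under row-wise numbering."""
    return block[(k - 1) // 5, (k - 1) % 5]

def T(a, b):
    return 1 if a >= b else 0                              # Eq. (5)

def dpibp1_codes(block):
    """Return (DPIBPp1, DPIBPc1, DPIBPpc1) for one 5x5 block."""
    c = _val(block, 13)
    p_bits  = [T(_val(block, k), c) for k in PHIL_1]                     # Eq. (4)
    c_bits  = [T(_val(block, k), c) for k in CHOP_1]                     # Eq. (8)
    pc_bits = [T(_val(block, a), _val(block, ph)) &                      # Eq. (10): both
               T(_val(block, b), _val(block, ph))                        # chopsticks available
               for ph, (a, b) in zip(PHIL_1, PAIRS_1)]
    return (int(np.dot(p_bits, W)),                                      # Eq. (7)
            int(np.dot(c_bits, W)),                                      # Eq. (9)
            int(np.dot(pc_bits, W)))                                     # Eq. (11)
```

Applying this to every overlapping 5 × 5 window of the 120 × 120 face image yields three code maps, whose block-wise histograms are concatenated as in Equation (15).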

3.3.2. DPIBP2

DPIBP2 has been proposed by correspondingly rotating the logical placement of philosophers and chopsticks by 90°, as shown in Figure 3a,b. For extracting features related to DPIBPp2, the positions of philosophers in a 5 × 5 neighborhood are considered, and upon comparing with the noodles (center pixel), a 5-bit feature is generated which is then converted to a decimal number, as shown in Equations (16) and (17). For extracting features related to DPIBPc2, the positions of chopsticks in a 5 × 5 neighborhood are considered, and upon comparing with the noodles (center pixel), a 5-bit feature is generated and converted into decimal form, as shown in Equations (18) and (19). Thus, for extracting DPIBPp2 and DPIBPc2, the pixels corresponding to philosophers and chopsticks are chosen and are correspondingly compared with the center pixel (S13).
$DPIBP_{pb} = \{\, T(S_{15}, S_{13}),\ T(S_{23}, S_{13}),\ T(S_{16}, S_{13}),\ T(S_6, S_{13}),\ T(S_3, S_{13}) \,\}$  (16)
$DPIBP_{p2} = \sum_{k=1}^{5} DPIBP_{pb}(k)\, w(k)$  (17)
$DPIBP_{cb} = \{\, T(S_{19}, S_{13}),\ T(S_{22}, S_{13}),\ T(S_{11}, S_{13}),\ T(S_2, S_{13}),\ T(S_9, S_{13}) \,\}$  (18)
$DPIBP_{c2} = \sum_{k=1}^{5} DPIBP_{cb}(k)\, w(k)$  (19)
For a philosopher, if the chopsticks on either side are readily available, then he can consume the noodles immediately. Based on this idea, DPIBPpc2 is extracted by considering the positions of the philosophers and the availability of chopsticks on either side, as shown in Equation (20). Like DPIBPp2 and DPIBPc2, DPIBPpc2 also generates a feature with 5 bits, which is then converted to a decimal number, as shown in Equation (21). Histogram-based features have been extracted for each of DPIBPp2, DPIBPc2, and DPIBPpc2 and are correspondingly shown in Equations (22)–(24). Finally, all the features are horizontally concatenated to form a feature vector corresponding to DPIBP2, as shown in Equation (25).
$DPIBP_{pcb} = \{\, T(S_9, S_{15}) \wedge T(S_{19}, S_{15}),\ T(S_{19}, S_{23}) \wedge T(S_{22}, S_{23}),\ T(S_{22}, S_{16}) \wedge T(S_{11}, S_{16}),\ T(S_{11}, S_6) \wedge T(S_2, S_6),\ T(S_2, S_3) \wedge T(S_9, S_3) \,\}$  (20)
$DPIBP_{pc2} = \sum_{k=1}^{5} DPIBP_{pcb}(k)\, w(k)$  (21)
$H_{DPIBP_{p2}} = h(DPIBP_{p2})$  (22)
$H_{DPIBP_{c2}} = h(DPIBP_{c2})$  (23)
$H_{DPIBP_{pc2}} = h(DPIBP_{pc2})$  (24)
$F(DPIBP2) = [\, H_{DPIBP_{p2}},\ H_{DPIBP_{c2}},\ H_{DPIBP_{pc2}} \,]$  (25)
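The remaining variants reuse exactly the same machinery; only the index sets change with the rotation. Continuing the sketch above (and reusing its helpers _val, T, and W), the DPIBP2 sets below are read off Equations (16), (18), and (20); DPIBP3 and DPIBP4 follow the same pattern with the indices from Equations (26)–(30) and (36)–(40).

```python
# Index sets for DPIBP2 (Eqs. (16), (18), (20)); only these change per variant.
PHIL_2  = [15, 23, 16, 6, 3]
CHOP_2  = [19, 22, 11, 2, 9]
PAIRS_2 = [(9, 19), (19, 22), (22, 11), (11, 2), (2, 9)]

def dpibp_variant_codes(block, phil, chop, pairs, w=W):
    """Generic form of Eqs. (16)-(21): philosopher, chopstick, and joint codes
    for one 5x5 block, given a variant's index sets."""
    c = _val(block, 13)
    p_bits  = [T(_val(block, k), c) for k in phil]
    c_bits  = [T(_val(block, k), c) for k in chop]
    pc_bits = [T(_val(block, a), _val(block, ph)) & T(_val(block, b), _val(block, ph))
               for ph, (a, b) in zip(phil, pairs)]
    return tuple(int(np.dot(bits, w)) for bits in (p_bits, c_bits, pc_bits))

# Example: p2, c2, pc2 = dpibp_variant_codes(block, PHIL_2, CHOP_2, PAIRS_2)
```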

3.3.3. DPIBP3

DPIBP3 has been proposed by correspondingly rotating the logical placement of philosophers and chopsticks by 180°, as shown in Figure 4a,b. For extracting features related to DPIBPp3, the positions of philosophers in a 5 × 5 neighborhood are considered, and upon comparing with the noodles (center pixel), a 5-bit feature is generated which is then converted to a decimal number, as shown in Equations (26) and (27). For extracting features related to DPIBPc3, the positions of chopsticks in a 5 × 5 neighborhood are considered, and upon comparing with the noodles (center pixel), a 5-bit feature is generated and then converted to a decimal number, as shown in Equations (28) and (29). Thus, for extracting DPIBPp3 and DPIBPc3, the pixels corresponding to philosophers and chopsticks are chosen and are correspondingly compared with the center pixel (S13).
$DPIBP_{pc} = \{\, T(S_{23}, S_{13}),\ T(S_{11}, S_{13}),\ T(S_2, S_{13}),\ T(S_4, S_{13}),\ T(S_{15}, S_{13}) \,\}$  (26)
$DPIBP_{p3} = \sum_{k=1}^{5} DPIBP_{pc}(k)\, w(k)$  (27)
$DPIBP_{cc} = \{\, T(S_{17}, S_{13}),\ T(S_6, S_{13}),\ T(S_3, S_{13}),\ T(S_{10}, S_{13}),\ T(S_{19}, S_{13}) \,\}$  (28)
$DPIBP_{c3} = \sum_{k=1}^{5} DPIBP_{cc}(k)\, w(k)$  (29)
DPIBPpc3 is extracted by considering the positions of philosophers and the availability of chopsticks on either side, as shown in Equation (30). Like DPIBPp3 and DPIBPc3, DPIBPpc3 also generates a feature with 5 bits, which is then transformed to a decimal number, as shown in Equation (31). Histogram-based features have been extracted for DPIBPp3, DPIBPc3, and DPIBPpc3 and are correspondingly shown in Equations (32)–(34). Finally, all the features are horizontally concatenated to form a feature vector corresponding to DPIBP3, as shown in Equation (35).
$DPIBP_{pcc} = \{\, T(S_{19}, S_{23}) \wedge T(S_{17}, S_{23}),\ T(S_{17}, S_{11}) \wedge T(S_6, S_{11}),\ T(S_6, S_2) \wedge T(S_3, S_2),\ T(S_3, S_4) \wedge T(S_{10}, S_4),\ T(S_{10}, S_{15}) \wedge T(S_{19}, S_{15}) \,\}$  (30)
$DPIBP_{pc3} = \sum_{k=1}^{5} DPIBP_{pcc}(k)\, w(k)$  (31)
$H_{DPIBP_{p3}} = h(DPIBP_{p3})$  (32)
$H_{DPIBP_{c3}} = h(DPIBP_{c3})$  (33)
$H_{DPIBP_{pc3}} = h(DPIBP_{pc3})$  (34)
$F(DPIBP3) = [\, H_{DPIBP_{p3}},\ H_{DPIBP_{c3}},\ H_{DPIBP_{pc3}} \,]$  (35)

3.3.4. DPIBP4

DPIBP4 has been proposed by correspondingly rotating the logical placement of philosophers and chopsticks by 270°, as shown in Figure 5a,b. For extracting features related to DPIBPp4, the positions of philosophers in a 5 × 5 neighborhood are considered, and upon comparing with the noodles (center pixel), a 5-bit feature is generated and then converted to a decimal number, as shown in Equations (36) and (37). For extracting features related to DPIBPc4, the positions of chopsticks in a 5 × 5 neighborhood are considered, and upon comparing with the noodles (center pixel), a 5-bit feature is generated and then converted to a decimal number, as shown in Equations (38) and (39). Thus, for extracting DPIBPp4 and DPIBPc4, the pixels corresponding to philosophers and chopsticks are chosen and correspondingly compared with the center pixel (S13).
$DPIBP_{pd} = \{\, T(S_{11}, S_{13}),\ T(S_3, S_{13}),\ T(S_{10}, S_{13}),\ T(S_{20}, S_{13}),\ T(S_{23}, S_{13}) \,\}$  (36)
$DPIBP_{p4} = \sum_{k=1}^{5} DPIBP_{pd}(k)\, w(k)$  (37)
$DPIBP_{cd} = \{\, T(S_7, S_{13}),\ T(S_4, S_{13}),\ T(S_{15}, S_{13}),\ T(S_{24}, S_{13}),\ T(S_{17}, S_{13}) \,\}$  (38)
$DPIBP_{c4} = \sum_{k=1}^{5} DPIBP_{cd}(k)\, w(k)$  (39)
DPIBPpc4 is extracted by considering the positions of philosophers and the availability of chopsticks on either side, as shown in Equation (40). Like DPIBPp4 and DPIBPc4, DPIBPpc4 also generates a feature with 5 bits, which is then converted to a decimal number, as shown in Equation (41). Histogram-based features have been extracted for DPIBPp4, DPIBPc4, and DPIBPpc4 and are correspondingly shown in Equations (42)–(44). Finally, all the features are horizontally concatenated to form a feature vector corresponding to DPIBP4, as shown in Equation (45).
$DPIBP_{pcd} = \{\, T(S_{17}, S_{11}) \wedge T(S_7, S_{11}),\ T(S_7, S_3) \wedge T(S_4, S_3),\ T(S_4, S_{10}) \wedge T(S_{15}, S_{10}),\ T(S_{15}, S_{20}) \wedge T(S_{24}, S_{20}),\ T(S_{24}, S_{23}) \wedge T(S_{17}, S_{23}) \,\}$  (40)
$DPIBP_{pc4} = \sum_{k=1}^{5} DPIBP_{pcd}(k)\, w(k)$  (41)
$H_{DPIBP_{p4}} = h(DPIBP_{p4})$  (42)
$H_{DPIBP_{c4}} = h(DPIBP_{c4})$  (43)
$H_{DPIBP_{pc4}} = h(DPIBP_{pc4})$  (44)
$F(DPIBP4) = [\, H_{DPIBP_{p4}},\ H_{DPIBP_{c4}},\ H_{DPIBP_{pc4}} \,]$  (45)

3.4. Distinctiveness of DPIBP4 and Its Contribution

DPIBP4 differs from the previous DPIBP variants (DPIBP1, DPIBP2, DPIBP3) mainly through a rotational transformation of the neighborhood pattern by 270°, yielding a distinct arrangement of philosopher, chopstick, and noodle pixel positions relative to the central pixel. This rotation captures spatial relationships that are complementary to, rather than redundant with, those encoded by the 0°, 90°, and 180° variants. The 270° orientation introduces different directional dependencies among pixels, allowing the descriptor to better characterize facial textures and subtle expression-related variations that manifest differently under rotation. Empirically, DPIBP4 shows a modest but consistent improvement in recognition accuracy; in particular, on the TFEID dataset it achieved the highest accuracy (94.78%) among all DPIBP variants, suggesting that the 270° rotation contributes non-overlapping feature information that improves the descriptor's ability to distinguish subtle facial expression nuances.
Thus, DPIBP4 is not merely a rotated copy but an essential component of the multi-orientation encoding scheme, collectively enhancing robustness to pose variations and contributing to strong overall performance.

4. Experimental Results and Analysis

This section assesses the effectiveness of the proposed DPIBP methods on four popular benchmark datasets commonly used in facial expression recognition research: JAFFE [47], MUG [48], CK+ [49], and TFEID [50]. The seven expressions considered in the experimental study are disgust, sadness, anger, fear, surprise, neutrality, and happiness. Sections 4.1 and 4.2 describe the datasets used for the experimental evaluation and the experimental settings, and Section 4.3 presents the comparison analysis. The proposed method's performance is evaluated using confusion matrices and a comparative analysis of recognition accuracy against recent handcrafted FER methods.

4.1. Datasets

The Japanese Female Facial Expression (JAFFE) dataset consists of 213 facial images of 10 Japanese female subjects, covering seven different facial expressions. In this dataset, there are nearly four images for every facial expression associated with a given subject. The facial images are TIFF files with a dimension of 295 × 256 pixels.
The MUG dataset was created by the Multimedia Understanding Group. Images of 86 subjects (51 male and 35 female) are available in this dataset. The facial images are stored in JPG format with a resolution of 896 × 896 pixels. To ensure a manageable sample size for evaluation, only images from 45 of the 86 subjects were chosen. In this dataset, five samples of every expression are associated with each selected subject.
Encompassing 123 participants, the CK+ dataset features 593 image sequences capturing facial expressions. Every sequence starts with a neutral expression and ends with the peak of the expression. The analysis focuses on extracting keyframes from the image sequences: the first frame is used as the neutral expression, while three frames depicting the peak of each emotional expression are chosen for each sequence [7,21]. The facial images are stored in PNG format, captured at 30 frames per second, with a resolution of 640 × 480 pixels.
The Taiwanese Facial Expression Image Database (TFEID), maintained by National Yang-Ming University, contains 7200 facial expression images from 40 participants (20 male and 20 female). Each image has a resolution of 480 × 600 pixels and is stored in JPG format. The dataset covers a wide range of facial expressions.

4.2. Experimental Setup

For the datasets chosen for experimental evaluation, the facial images are resized to 120 × 120 pixels to maintain uniformity across datasets [7,21]. The DPIBP operator, as defined in Section 3, uses a 5 × 5 local neighborhood to generate feature maps. To include spatial information, these feature maps are then split into a grid of non-overlapping blocks; histograms of the feature codes are calculated for the individual blocks and concatenated to construct the final feature vector. For all four datasets, the grid size was empirically set to 8 × 8 (block size L = 8), as this design delivered the optimal balance between recognition accuracy and feature vector dimensionality. The selection of images for each expression within the four FER datasets is detailed in Figure 6.
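A minimal sketch of this spatial pooling step is given below: a 120 × 120 code map is split into an 8 × 8 grid of non-overlapping 15 × 15 blocks, each block is histogrammed, and the histograms are concatenated. The choice of 32 bins assumes 5-bit DPIBP codes (values 0–31), and the function names are illustrative.

```python
import numpy as np

def blockwise_histogram(code_map, grid=8, n_bins=32):
    """Split a feature-code map into grid x grid non-overlapping blocks,
    histogram each block, and concatenate the histograms."""
    h, w = code_map.shape
    bh, bw = h // grid, w // grid
    hists = []
    for i in range(grid):
        for j in range(grid):
            block = code_map[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            hist, _ = np.histogram(block, bins=n_bins, range=(0, n_bins))
            hists.append(hist)
    return np.concatenate(hists)

def dpibp_feature_vector(code_maps):
    """Concatenate the block-wise histograms of the three code maps (cf. Eq. (15))."""
    return np.concatenate([blockwise_histogram(m) for m in code_maps])
```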
For the experimental studies, a Windows 10 PC with 16 GB of RAM and MATLAB R2018a is used. For the experimental evaluation, a One-versus-All (OVA) multi-class SVM has been employed. The experiments are conducted in a person-independent (PI) setting using the leave-one-subject-out (LOSO) cross-validation method for the JAFFE, MUG, and TFEID datasets. The CK+ dataset employs ten-fold person-independent cross-validation because not all participants display every facial expression [4,7,21]. In the LOSO protocol, all of the images from a given subject are reserved for testing and are not included in the training set; as a result, expressions are always classified from subjects unseen during training. A total of ten, forty-five, and forty subjects were chosen for the JAFFE, MUG, and TFEID experimental investigation, so the LOSO procedure is repeated 10, 45, and 40 times for these datasets, respectively. Subject-wise recognition results are averaged to determine the overall accuracy of the system, as defined in Equation (46). A confusion matrix combines predicted and actual labels; the final confusion matrix is obtained by summing the confusion matrices of all subjects.
$\text{Recognition Accuracy} = \dfrac{\text{Number of correct predictions}}{\text{Total number of predictions}}$  (46)
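The evaluation protocol can likewise be sketched. The snippet below implements leave-one-subject-out cross-validation with a one-vs-all linear SVM using scikit-learn (a substumed substitute for the MATLAB implementation used in the paper): subject-wise accuracies are averaged as in Equation (46), and the per-subject confusion matrices are summed into the final matrix.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.metrics import confusion_matrix

def loso_evaluate(X, y, subjects, n_classes=7):
    """X: feature vectors, y: expression labels, subjects: subject id per sample
    (all numpy arrays). Returns the subject-wise mean accuracy and the summed
    confusion matrix."""
    logo = LeaveOneGroupOut()
    total_cm = np.zeros((n_classes, n_classes), dtype=int)
    accuracies = []
    for train_idx, test_idx in logo.split(X, y, groups=subjects):
        clf = SVC(kernel="linear", decision_function_shape="ovr")   # one-vs-all SVM
        clf.fit(X[train_idx], y[train_idx])
        pred = clf.predict(X[test_idx])
        total_cm += confusion_matrix(y[test_idx], pred, labels=list(range(n_classes)))
        accuracies.append(np.mean(pred == y[test_idx]))              # Eq. (46), per subject
    return float(np.mean(accuracies)), total_cm
```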

4.3. Comparison Analysis

Table 1 presents the results of evaluating the proposed DPIBP methods on benchmark FER datasets collected in a laboratory setting. As observed from Table 1, the optimal recognition accuracy is 61.50% on the JAFFE dataset using the DPIBP3 method, 85.21% on the MUG dataset and 90.79% on the CK+ dataset using the DPIBP2 method, and 94.78% on the TFEID dataset using the DPIBP4 method. In the following subsections, the performance of DPIBP is compared with existing handcrafted FER methods.

4.4. JAFFE Dataset

The confusion matrix corresponding to the DPIBP3 method for the classification of the seven expressions in the JAFFE dataset is shown in Figure 7. Compared to other expressions, happy and angry expression images are more accurately detected using the proposed DPIBP3 method. According to Table 2, the proposed DPIBP outperformed RADAP-LO by 5.29%, WGC by 3.30%, CSP by 1.56%, RDMP by 1.48%, and KP by 0.51% in terms of facial expression recognition accuracy. The proposed DPIBP also outperforms the LBP-based methods in references [21,40], achieving 61.50% accuracy on the JAFFE dataset compared with 50.01% for LBP in [40] and 53.65% for LBP in [21]. This improvement stems from the feature extraction approach inspired by the Dining Philosophers Problem, which captures additional spatial relationships and subtle texture differences in facial expressions: while LBP only considers pixel intensity differences, DPIBP incorporates multiple contextual features through its philosopher, chopstick, and noodle pixel placements, leading to more discriminative features and better recognition accuracy.

4.5. MUG Dataset

The confusion matrix corresponding to the DPIBP2 method for the classification of the seven expressions in the MUG dataset is shown in Figure 8. Compared to other expressions, happy and angry expression images are more accurately detected using the proposed DPIBP2 method. According to Table 3, the proposed DPIBP outperformed RADAP-LO by 5.05%, DCFA-CNN by 2.12%, CSP by 2.41%, RDMP by 1.74%, and KP by 2.10% in terms of facial expression recognition accuracy. The proposed DPIBP achieved 85.21% accuracy on the MUG dataset, outperforming other methods, including the LBP-based models in references [21,40], which achieved 80.01% and 76.16%, respectively.

4.6. CK+ Dataset

The confusion matrix corresponding to the DPIBP2 method for the classification of the seven expressions in the CK+ dataset is shown in Figure 9. Compared to other expressions, happy and disgusted expression images are more accurately detected using the proposed DPIBP2 method. According to Table 4, the proposed DPIBP outperformed HiNet by 2.18%, WGC by 20.18%, CSP by 4.55%, RDMP by 4.25%, and KP by 3.57% in terms of facial expression recognition accuracy. The proposed DPIBP also outperforms the LBP-based models on the CK+ dataset, achieving 90.79% accuracy compared to 88.07% for LBP in [52] and 86.71% for LBP in [53]. DPIBP captures richer, spatially diverse features through its approach inspired by the Dining Philosophers Problem, which allows it to better distinguish facial expressions than traditional LBP methods that focus primarily on pixel intensity comparisons. This results in a significant performance boost, particularly in identifying subtle emotional variations across different facial expressions.

4.7. TFEID Dataset

The confusion matrix corresponding to the DPIBP4 method for the classification of the seven expressions in the TFEID dataset is shown in Figure 10. Compared to other expressions, happy and surprised expression images are more accurately detected using the proposed DPIBP4 method. According to Table 5, the proposed DPIBP outperformed DAMCNN by 1.42%, MSDV by 1.28%, CSP by 0.38%, and RDMP by 0.14% in terms of facial expression recognition accuracy. The proposed DPIBP also outperforms the LBP model on the TFEID dataset, achieving 94.78% accuracy compared to 92.02% for LBP in [21].
Inference time refers to the time taken by a model to process a single image and produce an output. For DPIBP, the inference time is relatively low because features are extracted from a small 5 × 5 neighborhood of pixels, which is computationally far less expensive than CNN-based models that perform many complex operations such as convolutions, pooling, and fully connected layers; CNN models therefore typically take longer to process each image. Furthermore, DPIBP has lower memory requirements, since it only stores the extracted features for classification and operates on small fixed neighborhoods (5 × 5); memory usage can be measured with system profiling tools by tracking RAM consumption while processing the images. In contrast, CNN-based models demand much more memory due to the large number of parameters in the convolutional and fully connected layers, which can easily reach hundreds of MB or more.
The computational efficiency of the proposed DPIBP framework is summarized through key parameters in Table 6. We evaluated DPIBP on the four benchmark datasets and report Precision, Recall, F1-Score, Accuracy (where applicable), Total Runtime, Per-Image Runtime, and approximate Memory Footprint. The results demonstrate that DPIBP exhibits efficient performance with relatively low runtime and memory consumption across all datasets. In particular, the per-image runtime for JAFFE is 5.63 s, while it ranges from 2.02 to 2.50 s for the other datasets. The memory footprint also remains low (∼50 MB) during inference for all datasets, emphasizing DPIBP's suitability for real-time and embedded applications. These results further underline the efficiency of DPIBP compared with standard deep learning models, making it a feasible solution for resource-constrained environments.

5. Conclusions

Expression recognition thrives on accurate facial feature representation. To this end, new texture-based feature descriptors (DPIBP) for facial expression recognition from static images have been proposed, drawing inspiration from the Dining Philosophers Problem. Four variants of the DPIBP method, namely DPIBP1, DPIBP2, DPIBP3, and DPIBP4, have been proposed, corresponding to angles of 0°, 90°, 180°, and 270°. Each of these feature descriptors produces three feature codes by encoding the neighboring and adjacent pixel relationships within a local neighborhood. The DPIBP methods have been evaluated on four well-known FER datasets: JAFFE, MUG, CK+, and TFEID. On the JAFFE, MUG, CK+, and TFEID datasets, the optimal recognition accuracies of the proposed DPIBP descriptors were 61.50%, 85.21%, 90.79%, and 94.78%, respectively. Images expressing happiness and anger are identified more accurately with the proposed DPIBP approaches than other expressions. While this study focused on features from static images, the method can be extended to recognize natural expressions in video sequences. Similar to challenges faced in unmanned aerial vehicle perception systems operating under adverse environmental conditions, where robust detection of obstacles and other aircraft is essential for autonomy [56], future FER methods must also address degraded sensing scenarios such as low light, occlusions, or motion blur. Integrating our novel descriptors with deep learning models, inspired by advances in computer vision, has the potential to unlock even higher accuracy in expression recognition. The self-attention mechanism can also be further explored by combining different data features in FER.

Author Contributions

Conceptualization, A.P., C.R. and C.N.; methodology, A.P. and C.N.; software, R.D.A.R. and R.M.R.Y.; formal analysis, R.M.R.Y. and A.P.; investigation, C.R. and R.D.A.R.; data curation, R.D.A.R. and A.P.; resources, R.D.A.R. and R.M.R.Y.; supervision, C.N. and C.R.; project administration, C.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable. This study did not involve humans or animals and therefore did not require ethical approval.

Informed Consent Statement

Not applicable. This study did not involve human participants.

Data Availability Statement

The datasets used in this study are publicly available and can be accessed through the original source cited in the manuscript [47,48,49,50]. No new data were generated in this study.

Acknowledgments

The authors thank their respective institutions for providing computational resources and infrastructure. No external administrative or technical support was involved.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ramakrishnan, S.; Upadhyay, N.; Das, P.; Achar, R.; Palaniswamy, S.; Kumaar, A.N. Emotion recognition from facial expressions using images with arbitrary poses using siamese network. In Proceedings of the 2021 2nd International Conference on Smart Electronics and Communication (ICOSEC), Trichy, India, 7–9 October 2021; pp. 268–273. [Google Scholar]
  2. Gupta, P.K.; Varadharajan, N.; Ajith, K.; Singh, T.; Patra, P. Facial Emotion Recognition Using Computer Vision Techniques. In Proceedings of the 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kamand, India, 24–28 June 2024; pp. 1–7. [Google Scholar]
  3. Niu, B.; Gao, Z.; Guo, B. Facial expression recognition with LBP and ORB features. Comput. Intell. Neurosci. 2021, 2021, 8828245. [Google Scholar] [CrossRef]
  4. Kartheek, M.N.; Prasad, M.V.; Bhukya, R. Radial mesh pattern: A handcrafted feature descriptor for facial expression recognition. J. Ambient. Intell. Humaniz. Comput. 2023, 14, 1619–1631. [Google Scholar] [CrossRef]
  5. Kola, D.G.R.; Samayamantula, S.K. Facial expression recognition using singular values and wavelet-based LGC-HD operator. IET Biom. 2021, 10, 207–218. [Google Scholar] [CrossRef]
  6. Russo, S.; Tibermacine, I.E.; Randieri, C.; Rabehi, A.; Alharbi, A.H.; El-kenawy, E.S.M.; Napoli, C. Exploiting facial emotion recognition system for ambient assisted living technologies triggered by interpreting the user’s emotional state. Front. Neurosci. 2025, 19, 1622194. [Google Scholar] [CrossRef]
  7. Mandal, M.; Verma, M.; Mathur, S.; Vipparthi, S.K.; Murala, S.; Kranthi Kumar, D. Regional adaptive affinitive patterns (RADAP) with logical operators for facial expression recognition. IET Image Process. 2019, 13, 850–861. [Google Scholar] [CrossRef]
  8. Kola, D.G.R.; Samayamantula, S.K. A novel approach for facial expression recognition using local binary pattern with adaptive window. Multimed. Tools Appl. 2021, 80, 2243–2262. [Google Scholar] [CrossRef]
  9. Kumar Tataji, K.N.; Kartheek, M.N.; Prasad, M.V. CC-CNN: A cross connected convolutional neural network using feature level fusion for facial expression recognition. Multimed. Tools Appl. 2024, 83, 27619–27645. [Google Scholar] [CrossRef]
  10. Selvin, S.; Vinayakumar, R.; Gopalakrishnan, E.; Menon, V.K.; Soman, K. Stock price prediction using LSTM, RNN and CNN-sliding window model. In Proceedings of the 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, 13–16 September 2017; pp. 1643–1647. [Google Scholar]
  11. Priya, S.S.; Sanjana, P.S.; Yanamala, R.M.R.; Amar Raj, R.D.; Pallakonda, A.; Napoli, C.; Randieri, C. Flight-Safe Inference: SVD-Compressed LSTM Acceleration for Real-Time UAV Engine Monitoring Using Custom FPGA Hardware Architecture. Drones 2025, 9, 494. [Google Scholar] [CrossRef]
  12. Yeddula, L.R.; Pallakonda, A.; Raj, R.D.A.; Yanamala, R.M.R.; Prakasha, K.K.; Kumar, M.S. YOLOv8n-GBE: A Hybrid YOLOv8n Model with Ghost Convolutions and BiFPN-ECA Attention for Solar PV Defect Localization. IEEE Access 2025, 13, 114012–114028. [Google Scholar] [CrossRef]
  13. Randieri, C.; Perrotta, A.; Puglisi, A.; Grazia Bocci, M.; Napoli, C. CNN-Based Framework for Classifying COVID-19, Pneumonia, and Normal Chest X-Rays. Big Data Cogn. Comput. 2025, 9, 186. [Google Scholar] [CrossRef]
  14. Osheter, T.; Campisi Pinto, S.; Randieri, C.; Perrotta, A.; Linder, C.; Weisman, Z. Semi-Autonomic AI LF-NMR Sensor for Industrial Prediction of Edible Oil Oxidation Status. Sensors 2023, 23, 2125. [Google Scholar] [CrossRef] [PubMed]
  15. Makhmudkhujaev, F.; Iqbal, M.T.B.; Ryu, B.; Chae, O. Local directional-structural pattern for person-independent facial expression recognition. Turk. J. Electr. Eng. Comput. Sci. 2019, 27, 516–531. [Google Scholar] [CrossRef]
  16. Iqbal, M.T.B.; Ryu, B.; Rivera, A.R.; Makhmudkhujaev, F.; Chae, O.; Bae, S.H. Facial expression recognition with active local shape pattern and learned-size block representations. IEEE Trans. Affect. Comput. 2020, 13, 1322–1336. [Google Scholar] [CrossRef]
  17. Turk, M.A.; Pentland, A. Face recognition using eigenfaces. In Proceedings of the 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Maui, HI, USA, 3–6 June 1991; Volume 91, pp. 586–591. [Google Scholar]
  18. Belhumeur, P.N.; Hespanha, J.P.; Kriegman, D.J. Eigenfaces vs. fisherfaces: Recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 711–720. [Google Scholar] [CrossRef]
  19. Li, S.; Deng, W. Deep facial expression recognition: A survey. IEEE Trans. Affect. Comput. 2020, 13, 1195–1215. [Google Scholar] [CrossRef]
  20. Kong, C.; Chen, B.; Li, H.; Wang, S.; Rocha, A.; Kwong, S. Detect and locate: Exposing face manipulation by semantic-and noise-level telltales. IEEE Trans. Inf. Forensics Secur. 2022, 17, 1741–1756. [Google Scholar] [CrossRef]
  21. Kartheek, M.N.; Prasad, M.V.; Bhukya, R. Chess pattern with different weighting schemes for person independent facial expression recognition. Multimed. Tools Appl. 2022, 81, 22833–22866. [Google Scholar] [CrossRef]
  22. Kartheek, M.N.; Madhuri, R.; Prasad, M.V.; Bhukya, R. Knight tour patterns: Novel handcrafted feature descriptors for facial expression recognition. In Proceedings of the International Conference on Computer Analysis of Images and Patterns, Virtual, 28–30 September 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 210–219. [Google Scholar]
  23. Kartheek, M.N.; Prasad, M.V.; Bhukya, R. Modified chess patterns: Handcrafted feature descriptors for facial expression recognition. Complex Intell. Syst. 2021, 7, 3303–3322. [Google Scholar] [CrossRef]
  24. Shanthi, P.; Nickolas, S. An efficient automatic facial expression recognition using local neighborhood feature fusion. Multimed. Tools Appl. 2021, 80, 10187–10212. [Google Scholar] [CrossRef]
  25. Rivera, A.R.; Castillo, J.R.; Chae, O.O. Local directional number pattern for face analysis: Face and expression recognition. IEEE Trans. Image Process. 2012, 22, 1740–1752. [Google Scholar] [CrossRef] [PubMed]
  26. Rivera, A.R.; Castillo, J.R.; Chae, O. Local directional texture pattern image descriptor. Pattern Recognit. Lett. 2015, 51, 94–100. [Google Scholar] [CrossRef]
  27. Ryu, B.; Rivera, A.R.; Kim, J.; Chae, O. Local directional ternary pattern for facial expression recognition. IEEE Trans. Image Process. 2017, 26, 6006–6018. [Google Scholar] [CrossRef]
  28. Iqbal, M.T.B.; Abdullah-Al-Wadud, M.; Ryu, B.; Makhmudkhujaev, F.; Chae, O. Facial expression recognition with neighborhood-aware edge directional pattern (NEDP). IEEE Trans. Affect. Comput. 2018, 11, 125–137. [Google Scholar] [CrossRef]
  29. Guo, S.; Feng, L.; Feng, Z.B.; Li, Y.H.; Wang, Y.; Liu, S.L.; Qiao, H. Multi-view laplacian least squares for human emotion recognition. Neurocomputing 2019, 370, 78–87. [Google Scholar] [CrossRef]
  30. Verma, M.; Vipparthi, S.K.; Singh, G. Hinet: Hybrid inherited feature learning network for facial expression recognition. IEEE Lett. Comput. Soc. 2019, 2, 36–39. [Google Scholar] [CrossRef]
  31. Sun, Z.; Chiong, R.; Hu, Z.p. Self-adaptive feature learning based on a priori knowledge for facial expression recognition. Knowl.-Based Syst. 2020, 204, 106124. [Google Scholar] [CrossRef]
  32. Reddy, A.H.; Kolli, K.; Kiran, Y.L. Deep cross feature adaptive network for facial emotion classification. Signal Image Video Process. 2022, 16, 369–376. [Google Scholar] [CrossRef]
  33. Wang, Y.; Li, M.; Wan, X.; Zhang, C.; Wang, Y. Multiparameter space decision voting and fusion features for facial expression recognition. Comput. Intell. Neurosci. 2020, 2020, 8886872. [Google Scholar] [CrossRef]
  34. Mohan, K.; Seal, A.; Krejcar, O.; Yazidi, A. FER-net: Facial expression recognition using deep neural net. Neural Comput. Appl. 2021, 33, 9125–9136. [Google Scholar] [CrossRef]
  35. Han, Z.; Huang, H. Gan based three-stage-training algorithm for multi-view facial expression recognition. Neural Process. Lett. 2021, 53, 4189–4205. [Google Scholar] [CrossRef]
  36. Shi, C.; Tan, C.; Wang, L. A facial expression recognition method based on a multibranch cross-connection convolutional neural network. IEEE Access 2021, 9, 39255–39274. [Google Scholar] [CrossRef]
  37. Al Masum Molla, M.; Manjurul Ahsan, M. Artificial Intelligence and Journalism: A Systematic Bibliometric and Thematic Analysis of Global Research. arXiv 2025, arXiv:2507.10891. [Google Scholar] [CrossRef]
  38. Gaya-Morey, F.X.; Buades-Rubio, J.M.; Palanque, P.; Lacuesta, R.; Manresa-Yee, C. Deep Learning-Based Facial Expression Recognition for the Elderly: A Systematic Review. arXiv 2025, arXiv:2502.02618. [Google Scholar]
  39. Dell’Olmo, P.V.; Kuznetsov, O.; Frontoni, E.; Arnesano, M.; Napoli, C.; Randieri, C. Dataset Dependency in CNN-Based Copy-Move Forgery Detection: A Multi-Dataset Comparative Analysis. Mach. Learn. Knowl. Extr. 2025, 7, 54. [Google Scholar] [CrossRef]
  40. MK, L.M.; Modepalli, D.; Shaik, M.B.; Busi, M.; Venkataiah, C.; Y, M.R.; Alkhayyat, A.; Rawat, D. Efficient Feature Extraction for Recognition of Human Emotions through Facial Expressions Using Image Processing Algorithms. E3S Web Conf. 2023, 391, 01182. [Google Scholar]
  41. Yue, K.B. Dining philosophers revisited, again. ACM Sigcse Bull. 1991, 23, 60–64. [Google Scholar] [CrossRef]
  42. Davidrajuh, R. Verifying solutions to the dining philosophers problem with activity-oriented petri nets. In Proceedings of the 2014 4th International Conference on Artificial Intelligence with Applications in Engineering and Technology, Kota Kinabalu, Malaysia, 3–5 December 2014; pp. 163–168. [Google Scholar]
  43. Tuncer, T.; Dogan, S.; Ataman, V. A novel and accurate chess pattern for automated texture classification. Phys. A Stat. Mech. Its Appl. 2019, 536, 122584. [Google Scholar] [CrossRef]
  44. Happy, S.; Routray, A. Automatic facial expression recognition using features of salient facial patches. IEEE Trans. Affect. Comput. 2014, 6, 1–12. [Google Scholar] [CrossRef]
  45. Majumder, A.; Behera, L.; Subramanian, V.K. Emotion recognition from geometric facial features using self-organizing map. Pattern Recognit. 2014, 47, 1282–1293. [Google Scholar] [CrossRef]
  46. Dubey, S.R. Local directional relation pattern for unconstrained and robust face retrieval. Multimed. Tools Appl. 2019, 78, 28063–28088. [Google Scholar] [CrossRef]
  47. Lyons, M.; Akamatsu, S.; Kamachi, M.; Gyoba, J. Coding facial expressions with gabor wavelets. In Proceedings of the Third IEEE International Conference on Automatic Face and Gesture Recognition, Nara, Japan, 14–16 April 1998; pp. 200–205. [Google Scholar]
  48. Aifanti, N.; Papachristou, C.; Delopoulos, A. The MUG facial expression database. In Proceedings of the 11th International Workshop on Image Analysis for Multimedia Interactive Services WIAMIS 10, Desenzano del Garda, Italy, 12–14 April 2010; pp. 1–4. [Google Scholar]
  49. Lucey, P.; Cohn, J.F.; Kanade, T.; Saragih, J.; Ambadar, Z.; Matthews, I. The extended cohn-kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Workshops, San Francisco, CA, USA, 13–18 June 2010; pp. 94–101. [Google Scholar]
  50. Yang, T.; Yang, Z.; Xu, G.; Gao, D.; Zhang, Z.; Wang, H.; Liu, S.; Han, L.; Zhu, Z.; Tian, Y.; et al. Tsinghua facial expression database–A database of facial expressions in Chinese young and older women and men: Development and validation. PLoS ONE 2020, 15, e0231304. [Google Scholar] [CrossRef] [PubMed]
  51. Chandra Sekhar Reddy, P.; Vara Prasad Rao, P.; Kiran Kumar Reddy, P.; Sridhar, M. Motif shape primitives on fibonacci weighted neighborhood pattern for age classification. In Soft Computing and Signal Processing: Proceedings of ICSCSP 2018, Volume 1; Springer: Berlin/Heidelberg, Germany, 2019; pp. 273–280. [Google Scholar]
  52. Yang, J.; Adu, J.; Chen, H.; Zhang, J.; Tang, J. A facial expression recognition method based on dlib, ri-lbp and resnet. J. Phys. Conf. Ser. 2020, 1634, 012080. [Google Scholar] [CrossRef]
  53. Dong, F.; Zhong, J.; Wang, W.; Han, J.; Chen, T. A Novel Facial Expression Recognition Algorithm Based on LBP and Improved ResNet. In Proceedings of the 2024 7th International Conference on Sensors, Signal and Image Processing, Shenzhen, China, 22–24 November 2024; pp. 103–108. [Google Scholar]
  54. Mukhopadhyay, M.; Dey, A.; Shaw, R.N.; Ghosh, A. Facial emotion recognition based on textural pattern and convolutional neural network. In Proceedings of the 2021 IEEE 4th International Conference on Computing, Power and Communication Technologies (GUCON), Kuala Lumpur, Malaysia, 24–26 September 2021; pp. 1–6. [Google Scholar]
  55. Xie, S.; Hu, H.; Wu, Y. Deep multi-path convolutional neural network joint with salient region attention for facial expression recognition. Pattern Recognit. 2019, 92, 177–191. [Google Scholar] [CrossRef]
  56. Randieri, C.; Ganesh, S.V.; Raj, R.D.A.; Yanamala, R.M.R.; Pallakonda, A.; Napoli, C. Aerial Autonomy Under Adversity: Advances in Obstacle and Aircraft Detection Techniques for Unmanned Aerial Vehicles. Drones 2025, 9, 549. [Google Scholar] [CrossRef]
Figure 1. Dining Philosophers Problem.
Figure 2. Neighborhood pixels used in DPIBP’s feature extraction process. (a) Sample 5 × 5 block with pixel representations. (b) Logical positioning of philosophers, chopsticks, and noodles in a 5 × 5 block for extracting the features using DPIBP1. (c) Pixels chosen for extracting the features using DPIBP1 as per the logical placement shown in (b).
Figure 3. Pixels considered for feature extraction using DPIBP2. (a) Logical positioning of philosophers, chopsticks, and noodles in a 5 × 5 block for feature extraction using DPIBP2. (b) The pixels chosen for extracting the features using DPIBP2 as per the logical placement shown in (a).
Figure 4. Pixels considered for feature extraction using DPIBP3. (a) Logical positioning of philosophers, chopsticks, and noodles in a 5 × 5 block for feature extraction using DPIBP3. (b) The pixels chosen for extracting the features using DPIBP3 as per the logical placement shown in (a).
Figure 5. Pixels considered for feature extraction using DPIBP4. (a) Logical positioning of philosophers, chopsticks, and noodles in a 5 × 5 block for extracting the feature using DPIBP4. (b) The pixels chosen for extracting the feature using DPIBP4 as per the logical placement shown in (a).
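To make the pixel-grouping idea in Figures 2–5 more concrete, the following minimal Python sketch shows how a DPIBP-style binary code could be computed over a 5 × 5 neighborhood and aggregated into a histogram feature. The (row, column) offsets named PHILOSOPHERS and CHOPSTICKS are hypothetical placeholders chosen only for illustration; the actual philosopher, chopstick, and noodle positions used by DPIBP1–DPIBP4 are those depicted in the figures, and the function names below are not from the original work.

```python
import numpy as np

# Hypothetical offsets for illustration only: the real DPIBP1-DPIBP4
# positions are the ones shown in Figures 2-5 of the paper.
PHILOSOPHERS = [(-2, -2), (-2, 2), (0, -2), (2, -2), (2, 2)]   # assumed outer pixels
CHOPSTICKS   = [(-1, -1), (-1, 1), (0, -1), (1, -1), (1, 1)]   # assumed paired pixels

def dpibp_like_code(block):
    """Encode one 5x5 block into a 5-bit pattern by comparing each assumed
    'philosopher' pixel with its paired 'chopstick' pixel."""
    c = 2  # index of the center row/column in a 5x5 block
    bits = 0
    for k, ((pr, pc), (qr, qc)) in enumerate(zip(PHILOSOPHERS, CHOPSTICKS)):
        if block[c + pr, c + pc] >= block[c + qr, c + qc]:
            bits |= 1 << k
    return bits  # value in [0, 31]

def dpibp_like_histogram(image, bins=32):
    """Slide a 5x5 window over a grayscale image and accumulate a histogram
    of the resulting codes, which serves as the feature vector."""
    h, w = image.shape
    hist = np.zeros(bins, dtype=np.float64)
    for i in range(2, h - 2):
        for j in range(2, w - 2):
            hist[dpibp_like_code(image[i - 2:i + 3, j - 2:j + 3])] += 1
    return hist / max(hist.sum(), 1.0)  # L1-normalize

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = rng.integers(0, 256, size=(48, 48)).astype(np.uint8)  # stand-in for a cropped face
    print(dpibp_like_histogram(face).shape)  # (32,)
```

In practice such histograms would be computed per facial region and concatenated before classification; this sketch only illustrates the comparison-and-histogram mechanism, not the specific DPIBP pixel layout.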
Figure 6. Number of images chosen from each expression for experimental analysis across four FER datasets.
Figure 7. Confusion matrix for the JAFFE dataset obtained using the DPIBP3 method.
Figure 8. Confusion matrix for the MUG dataset obtained using the DPIBP2 method.
Figure 9. Confusion matrix for the CK+ dataset obtained using the DPIBP2 method.
Figure 10. Confusion matrix for the TFEID dataset obtained using the DPIBP4 method.
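For the confusion matrices in Figures 7–10, the sketch below shows one way a person-independent (leave-one-subject-out) evaluation with a multi-class SVM could be organized, assuming DPIBP histogram features X, expression labels y, and a per-image subject identifier. The linear kernel, the helper name person_independent_eval, and the synthetic data are assumptions for illustration, not the authors' exact protocol.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.metrics import confusion_matrix, accuracy_score

def person_independent_eval(X, y, subjects, labels):
    """Leave-one-subject-out evaluation: every subject's images are held out
    exactly once, so training and test sets never share a person."""
    y_true_all, y_pred_all = [], []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
        clf = SVC(kernel="linear")  # assumed kernel; a multi-class SVM as in the paper
        clf.fit(X[train_idx], y[train_idx])
        y_pred_all.extend(clf.predict(X[test_idx]))
        y_true_all.extend(y[test_idx])
    cm = confusion_matrix(y_true_all, y_pred_all, labels=labels)
    return cm, accuracy_score(y_true_all, y_pred_all)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((60, 32))                # stand-in DPIBP histogram features
    y = rng.integers(0, 7, size=60)         # seven expression classes
    subjects = np.repeat(np.arange(10), 6)  # ten hypothetical subjects, six images each
    cm, acc = person_independent_eval(X, y, subjects, labels=list(range(7)))
    print(cm.shape, round(acc, 3))
```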
Table 1. Accuracy comparison using DPIBP for seven expressions.

Dataset   DPIBP1   DPIBP2   DPIBP3   DPIBP4
JAFFE     60.36    61.32    61.50    58.93
MUG       85.14    85.21    83.37    84.25
CK+       89.90    90.79    89.87    90.30
TFEID     94.29    93.51    94.23    94.78
Table 2. Recognition accuracy on JAFFE dataset.

Method            Accuracy (%)
RADAP-LO [7]      56.21
WGC [5]           58.20
CSP [21]          59.94
RDMP [4]          60.02
KP [22]           60.99
LBP in [40]       50.01
LBP in [21]       53.65
Proposed DPIBP    61.50
Table 3. Recognition accuracy on MUG dataset.

Method            Accuracy (%)
RADAP-LO [7]      80.16
DCFA-CNN [51]     83.09
CSP [21]          82.80
RDMP [4]          83.47
KP [22]           83.11
LBP+SVM [40]      80.01
LBP+KNN [40]      71.43
LBP [21]          76.16
Proposed DPIBP    85.21
Table 4. Recognition accuracy on CK+ dataset.

Method            Accuracy (%)
HiNet [30]        88.61
WGC [5]           70.61
CSP [21]          86.24
RDMP [4]          86.54
KP [22]           87.22
LBP in [52]       88.07
LBP in [53]       86.71
LBP+CNN [54]      79.56
Proposed DPIBP    90.79
Table 5. Recognition accuracy on TFEID dataset.

Method            Accuracy (%)
DAMCNN [55]       93.36
MSDV [33]         93.50
CSP [21]          94.40
RDMP [4]          94.64
LBP [21]          92.02
Proposed DPIBP    94.78
Table 6. Computational efficiency of DPIBP across benchmark datasets.

Dataset   Precision   Recall   F1-Score   Accuracy   Total Runtime (min)   Per-Image Runtime (s)   Memory Footprint
JAFFE     0.6123      0.6173   0.6020     —          20                    5.63                    Low (∼50 MB)
MUG       0.8526      0.8519   0.8503     —          20                    2.29                    Low (∼50 MB)
CK+       0.9010      0.8530   0.8720     0.906      20                    2.02                    Low (∼50 MB)
TFEID     0.9550      0.9480   0.9510     0.947      20                    2.50                    Low (∼50 MB)
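As a rough illustration of how the runtime and memory figures in Table 6 could be gathered, the sketch below times a feature extractor over a batch of images and records peak Python heap usage with tracemalloc. The profile_extractor helper and the stand-in intensity-histogram extractor are hypothetical and not part of the original pipeline; real measurements would run the DPIBP extractor on the actual dataset images.

```python
import time
import tracemalloc
import numpy as np

def profile_extractor(extract_fn, images):
    """Measure total runtime, per-image runtime, and peak Python heap usage
    while running a feature extractor over a list of images."""
    tracemalloc.start()
    t0 = time.perf_counter()
    feats = [extract_fn(img) for img in images]
    total_s = time.perf_counter() - t0
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "total_runtime_min": total_s / 60.0,
        "per_image_runtime_s": total_s / max(len(images), 1),
        "peak_memory_mb": peak_bytes / 1e6,
        "n_features": int(np.asarray(feats).shape[1]),
    }

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    imgs = [rng.integers(0, 256, size=(48, 48)).astype(np.uint8) for _ in range(20)]
    # Stand-in extractor: a plain intensity histogram in place of DPIBP features.
    stats = profile_extractor(lambda im: np.bincount(im.ravel(), minlength=256), imgs)
    print({k: round(v, 4) for k, v in stats.items()})
```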
