Article

A Deep Learning Approach to Automated Treatment Classification in Tuna Processing: Enhancing Quality Control in Indonesian Fisheries

by Johan Marcus Tupan 1, Fredrik Rieuwpassa 2, Beni Setha 2, Wilma Latuny 1,* and Samuel Goesniady 1
1 Department of Industrial Engineering, Faculty of Engineering, Pattimura University, Ambon 97233, Maluku, Indonesia
2 Department of Fish Processing Technology, Pattimura University, Ambon 97233, Maluku, Indonesia
* Author to whom correspondence should be addressed.
Fishes 2025, 10(2), 75; https://doi.org/10.3390/fishes10020075
Submission received: 12 November 2024 / Revised: 7 February 2025 / Accepted: 7 February 2025 / Published: 13 February 2025
(This article belongs to the Special Issue Management and Technology for Tuna Fisheries)

Abstract

The Indonesian maritime territory harbors a rich diversity of marine resources, making up approximately 37% of global fish species diversity. Tuna, particularly in Maluku Province, stands out as a vital economic asset with growing production and export numbers. Current practices for processing and evaluating tuna meat, however, face significant limitations due to basic infrastructure and reliance on manual inspection methods, leading to potential contamination risks and treatment identification errors. This research addresses these challenges by implementing an advanced deep learning solution based on convolutional neural networks (CNNs) to automatically identify three distinct treatment categories for tuna loin: No-Treatment, CO-Treatment, and CS-Treatment. Trained on a comprehensive image dataset, the model demonstrated exceptional performance with 95% accuracy. While field testing confirmed the model’s strong performance in correctly identifying treatment categories, occasional classification errors highlighted areas for improvement in data preprocessing. This study provides a significant step forward in automated fish processing assessment technology, offering a promising solution to longstanding challenges in the marine processing industry.
Key Contribution: This study presents the development and implementation of an automated, highly accurate CNN-based system for the real-time classification of tuna loin treatment status, revolutionizing traditional manual inspection methods in the Indonesian tuna processing industry.

1. Introduction

Indonesia’s marine waters are home to an impressive 37% of global fish species diversity. These waters contain various high-value marine species, including tuna, shrimp, lobster, reef fish, ornamental fish, shellfish, and seaweed. While the sustainable marine fish potential in Indonesia reaches 12.54 million tons yearly across its waters and exclusive economic zone (ZEEI), the permitted catch quota (JTB) is limited to 80% of this potential, approximately 10.03 million tons annually. In 2019, the actual fisheries production reached 7.53 million tons, utilizing only 69.59% of the JTB. The potential of marine microflora and fauna remains largely unexplored, though it shows promise as a future source of functional food [1].
Maluku Province’s economy significantly benefits from its fisheries sector, which contributed an average of 12.64% to the region’s gross regional domestic product (PDRB) from 2016 to 2020. Tuna, particularly yellowfin tuna, stands as a primary commodity in the region. Production saw remarkable growth from 15,608 tons in 2015 to 30,804 tons in 2019 before experiencing a sharp 53% decline to 14,349 tons in 2020 due to COVID-19 impacts. Among Maluku’s island clusters, Cluster 7 dominated production with 12,293 tons (48%), followed by South Seram Cluster 5 with 4510 tons (18%) and Banda Cluster 6 with 3704 tons (12%) [2]. The significant expansion of tuna production in Maluku began with direct tuna exports in 2018, with volumes increasing from 1432 tons in 2019 to 1601 tons in 2020, marking an 11.8% growth.
Tuna fisheries in Maluku are known for their fishermen, who still use traditional fishing techniques. As a result, fishermen in the region tend to immediately process the tuna they catch into loin cuts on board. This is done to efficiently manage storage space. Although loin processing is carried out on board, the facilities used by fishermen are not always adequate. The sanitation conditions of the boats are often not guaranteed, increasing the risk of contamination of the catch. This contamination can negatively impact the quality of tuna loin during distribution and marketing [3]. Fish quality assessment encompasses multiple factors, including safety, food attributes, physical properties, nutritional content, availability, freshness, and overall integrity. To assess fish quality, various chemical, physical, biochemical, and microbiological tools and sensor-based methods have also been created [4].
Numerous chemical and biochemical techniques are available to determine fish freshness, but these methods often require significant cost, long processing times, and specialized skilled labor [5]. Another quality characteristic is color, which strongly shapes human perception and is one of the attributes used to grade the freshness of seafood samples based on color changes. In industry, traditional manual sorting by fishermen can be inconsistent due to visual fatigue, and inspection that relies solely on physical handling often damages fish intended for consumption. Research on fish freshness quality inspection has been widely conducted. The visual quality of seafood, such as color, skin appearance, and eyes, directly affects its economic value and consumer acceptance, and consumers routinely judge the freshness and quality of seafood products by their color [6]. Traditional methods for evaluating freshness are, moreover, time-consuming and can only be performed by trained operators.
The machine vision system (MVS) offers a solution to traditional method limitations by integrating information and image processing techniques for seafood assessment [7]. This system consists of a camera to capture images, an illumination system, and computer software for image analysis [8]. MVS technology applications primarily focus on evaluating seafood quality, including morphology assessment, species detection, and several physical and chemical characteristics during the processing and storage stages [7]. Early studies on MVS application were conducted to evaluate fish freshness by observing changes in the eyes [9] and extracting gill features [10]. These studies indicate that MVS technology reflects seafood quality well and can be used to predict seafood freshness levels. A core technique in such systems is digital image processing, which extracts information from, and recognizes objects in, images; it also underpins deep learning-based identification methods. Deep learning, a comparatively recent branch of machine learning, has proven highly suitable for computer vision, and convolutional neural networks in particular now deliver high accuracy in tasks such as object detection. Research has also been conducted to identify fish freshness, for example by comparing K-nearest neighbor algorithms [11].
Research using tuna loin images with deep learning models based on convolutional neural networks is still very limited. Previous studies primarily utilized computer vision and machine learning models to evaluate fish meat quality, as performed by [12,13,14]. Furthermore, studies using artificial intelligence methods in fisheries and marine fields have also been conducted by many researchers, especially for predicting fish freshness. The most commonly used method for predicting fish freshness is deep learning with convolutional neural network (CNN) algorithms [15,16,17,18,19,20,21,22,23,24]. The architectures used to build and evaluate these prediction models also vary, including YOLO architectures [18,21], the MobileNet architecture [22,23,24], the VGG-16 architecture [17,20,22], and a combination of the Xception, MobileNet V1, ResNet 50, and VGG-16 architectures [22]. Combinations of CNN-based prediction models built on architectures such as DenseNet, ResNet, and Inception, which can provide prediction accuracy above 85%, have not yet been explored by previous researchers.
Given these considerations, there is a clear need for comprehensive research analyzing tuna loin raw material quality. This should include developing an automated image processing-based quality prediction model that combines various architectural approaches using convolutional neural network algorithms to compare prediction accuracy in Ambon Island’s fish processing industry.
The primary contributions of this study are summarized as follows:
  • It provides an overview of advancements in fisheries and marine science research, emphasizing the application of machine learning and deep learning models, with a particular focus on tuna, through a comprehensive literature review.
  • It introduces a non-destructive framework based on DL-CNN for identifying treatments of tuna loin by analyzing and interpreting the color characteristics of tuna loin meat.
  • It evaluates the performance of prediction models for tuna loin treatments by utilizing multiple CNN architectures.

2. Related Works

According to Pianta et al. [25], quality refers to meeting requirements while minimizing potential defects, also known as the “zero defect” standard. Meanwhile, according to [26], quality is a multifaceted concept that encompasses products, services, people, processes, and environments, defined by their ability to meet or exceed expectations. Over the past decade, diverse perspectives have emerged to deepen the understanding of “quality”. Customers have unique requirements, which must be converted into measurable attributes to ensure satisfaction. Ultimately, quality aims to provide value that meets customer expectations for services or products. Some key definitions from [27] are as follows: quality as suitability for a purpose, quality as a measure of customer satisfaction, quality as precision in meeting design or specifications, quality as adherence to standards or norms, and quality as a level of excellence.
The definition of frozen tuna loin, according to the Indonesian National Standard (SNI) 01-4104.1-2006 [28], refers to a processed fishery product made from fresh or frozen tuna that undergoes the following treatments: receiving, gutting (or without gutting), washing, loin cutting, skinning and trimming, quality sorting, wrapping, freezing, weighing, packing, labeling, and storage.
According to SNI 01-2729.1-2006, the characteristics of fresh fish are as follows: the pupils of the eyes are black and prominent with a clear cornea; the gills are dark red and free of mucus; the flesh is elastic (springy) and firm when pressed; and the mucus on the skin’s surface is clear and colorless. On the other hand, the characteristics of fish that is not fresh include the following: the pupils appear very cloudy; the color of the gills has turned brownish; the texture of the flesh is soft; and an unpleasant odor begins to be detected. Freshness significantly contributes to the quality of fish or fishery products. According to [29], there are nine quality indicators for fishery products: availability, convenience, safety, packaging color, freshness, nutritional properties, price value, consistency, and sensory properties. Specifically, for the freshness indicator, there are seven sub-indicators: microbiology, volatiles, sensory analysis, physical properties, protein, lipids, and ATP.
The sorting of tuna aims to categorize fresh tuna that meets export quality standards. Various factors that can cause differences in quality include the time of death, live cutting methods, fish handling, blood removal effectiveness, sanitation, duration at sea, and the implementation of the cold chain. The sorting process is conducted organoleptically, involving the assessment of appearance, skin, eyes, texture, meat firmness, and meat color. Organoleptic assessment, particularly related to texture, firmness, and color, is directed at fish meat samples taken from the tail and the rear of the ventral fin. This is done to prevent physical damage to the tuna that will be exported. The quality or grade of tuna at the transit point is classified into four categories: grade A, B, C, and D. The sorting process is carried out by an inspector using a coring tube, which is a sharp rod-like tool made of iron. Sampling is conducted on both sides of the fish (the rear fin or tail on the right and left) by inserting the coring tube into the fish’s body to obtain a piece of tuna meat.

2.1. Machine Learning

Machine learning is a field of study based on the concept that machines can learn independently without needing to be explicitly programmed. The data used by the system to learn is known as a dataset; the collection of examples used for learning is called the training set, and each individual example is a sample. The quality of learning improves with an increase in the amount of available data [30]. Deep learning is a sub-field of machine learning introduced in 1986 and later implemented in 2000 in artificial neural networks (ANNs) [31]. Deep learning enables computers to perform tasks similarly to humans. With hundreds of layers or even more, deep learning is known as a “deep” method. Deep learning is a machine learning technique that uses multiple layers of non-linear processing to extract features, recognize patterns, and perform classification [32]. The difference between machine learning and deep learning lies in feature extraction: in conventional machine learning, feature extraction is performed separately from the classification process, whereas in deep learning, these processes can be combined.

2.2. Deep Learning

Deep learning methods continue to evolve, with convolutional neural networks (CNNs) producing the most significant results in image recognition to date [33]. Research has shown that convolutional neural networks (CNNs) are deep learning algorithms that can acquire human-like proficiency in specific visual tasks. They are specifically designed to handle data inputs in the form of images, assign weight values, and differentiate between objects. CNNs eliminate the need for manual feature extraction. Instead, CNNs directly extract features from a series of image data. Features related to the task are not used in prior training but are learned during the network’s training process on image datasets. This automatic feature extraction method is the most accurate learning model for image processing tasks such as object detection, classification, and recognition [34].
The classification of fish freshness is critical in fisheries and food safety, with traditional methods being increasingly replaced by more objective and reliable approaches enabled by deep learning (DL) and image processing technologies. DL models, particularly convolutional neural networks (CNNs), analyze features such as eye clarity, gill color, skin texture, and flesh color to assess freshness. Performance indicators like accuracy, sensitivity, specificity, and F1 score are used to evaluate model effectiveness. Other machine learning algorithms, including support vector machines (SVMs), decision trees, and k-nearest neighbors (KNN), also contribute to fish freshness classification. The choice of method is influenced by factors such as data availability and industry requirements. Ongoing research continues to refine these approaches, aiming to provide more accurate, efficient, and scalable solutions for the fisheries sector. A summary of the relevant literature on fish freshness classification can be found in Table 1, Table 2, Table 3, Table 4, Table 5 and Table 6.
The sensory quality of tuna is determined by the color and texture of the meat, which are important factors in product evaluation by consumers, fishermen, and the fisheries processing industry. The fisheries processing industry in Maluku, Indonesia, produces tuna loin products, including fresh tuna loin sashimi (fresh loin sashimi), frozen loin treated with carbon monoxide (frozen loin CO), and frozen loin treated with clear smoke (frozen loin CS). Tuna loin with CO and CS treatments generally exhibits a brighter and more visually appealing color compared to untreated tuna loin. The injection of CO or CS gas into the tuna loin aims to evenly distribute the fresh red color of hemoglobin throughout the loin.
Based on a literature review, research using images of yellowfin tuna meat and applying deep learning models based on convolutional neural networks remains limited. Similarly, the use of CNN architectures for meat classification and fish freshness detection is still limited. This study will investigate the application of deep learning using multi-architecture CNN models (ResNet, DenseNet, Inception) to classify the color of tuna loin meat in untreated (fresh) meat, CO-treated tuna loin, and clear smoke (CS)-treated tuna loin. The performance indicators used in this study include accuracy, precision, recall, F1-score, ROC, AUC, and kappa score.

3. Materials and Methods

This study was conducted in the fisheries processing industry in Maluku, Indonesia. The stages of the frozen tuna loin production process include (1) raw material receiving, (2) temporary storage, (3) washing I, (4) debagging, (5) washing II, (6) removal of black meat and skinning, (7) trimming, (8) weighing, (9) carbon monoxide treatment or filtered wood smoke (FWS) treatment, (10) chilling, (11) ozone treatment, (13) retouching, (14) weighing and labeling, (15) vacuum sealing, (16) freezing, (17) metal detection, (18) packing and labeling, (19) storage, (20) loading, and (21) stuffing. This study uses a sample dataset consisting of fresh tuna loin (No-Treatment class), frozen loin with CO treatment (CO-Treatment class), and frozen loin with CS treatment (CS-Treatment class), collected at different stages of the tuna loin production process. “No-Treatment” refers to fresh tuna loin obtained from suppliers and processed into tuna loin after the removal of black meat and skinning and trimming stages. “CO-Treatment” refers to tuna loin obtained after the carbon monoxide gas injection and chilling process. “CS-Treatment” refers to tuna loin obtained after the filtered wood smoke (FWS) treatment and chilling process.
Images of the tuna loin are captured from multiple angles using a digital camera (Table 7 and Figure 1) to ensure a comprehensive analysis of all physical characteristics. Each image is labeled according to the quality grade of the tuna loin, as assessed by an experienced quality control checker following established evaluation standards. This labeling process serves as the foundation for training the prediction model.

3.1. Research/System Overview

The methodology section will provide a comprehensive description of both the application development and the data preparation pipeline for deep learning model implementation. The data preparation workflow encompasses all stages, from initial image collection to final model-ready processing, including image resizing, augmentation techniques, and preprocessing steps. This systematic documentation aims to ensure reproducibility and facilitate future improvements in the research.
The stages of implementing DCNN to predict tuna loin with CO or CS treatment or without treatment can be seen in Figure 1.

3.2. Image Datasets

The initial phase of data collection involved obtaining detailed images of tuna loin meat, carefully capturing samples across different classifications. The complete dataset comprised 565 images (215 No-Treatment; 214 CO-Treatment; 136 CS-Treatment), providing a robust foundation for analysis. To demonstrate the classification categories, Figure 2 presents representative examples from each class: Figure 2A depicts a sample from the No-Treatment class, Figure 2B showcases the CO-Treatment class, and Figure 2C illustrates the CS-Treatment class. These selected images serve as visual references for the classification categories employed in this research.
The data preparation phase involves several preprocessing techniques, including image resizing and augmentation methods. To optimize model performance, all images are standardized to 224 × 224 pixels, striking a balance between computational efficiency and information preservation. The analysis focuses specifically on segmented pixels rather than the entire image, enabling the model to concentrate on relevant features.
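To illustrate this focus on segmented pixels, the sketch below shows one way the loin region could be isolated with a simple color threshold before analysis. The study does not specify its segmentation procedure, so the OpenCV-based approach, the file names, and the HSV bounds here are assumptions for demonstration only.

```python
# Hypothetical segmentation sketch: isolate reddish loin pixels with an HSV threshold.
# The input file name and threshold bounds are illustrative assumptions.
import cv2
import numpy as np

img = cv2.imread("loin.jpg")                        # hypothetical input image
img = cv2.resize(img, (224, 224))                   # match the standardized input size
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Red hues wrap around 0 in HSV, so two ranges are combined into one mask.
mask_low = cv2.inRange(hsv, np.array([0, 40, 40]), np.array([15, 255, 255]))
mask_high = cv2.inRange(hsv, np.array([165, 40, 40]), np.array([180, 255, 255]))
mask = cv2.bitwise_or(mask_low, mask_high)

segmented = cv2.bitwise_and(img, img, mask=mask)    # keep only the segmented pixels
cv2.imwrite("loin_segmented.jpg", segmented)
```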

3.3. Image Resizing and Image Data Augmentation

All collected images were resized to a standardized resolution of 224 × 224 pixels, a common format used in deep learning architectures such as MobileNet. The original images, captured at a resolution of 6000 × 4000 pixels, were subjected to dimensional reduction using a resizing function. This transformation converted high-resolution images into the standardized format, enhancing computational efficiency while maintaining uniformity across the dataset. Standardization is essential to ensure consistency when processing images of varying dimensions and facilitates reliable model training and evaluation.
To increase the diversity of the training dataset, an augmentation phase was implemented. This phase generated computationally distinct variants of the original images through various transformations, including rotation, adjustments to width and height, shear transformations, zoom modifications, horizontal and vertical flips, brightness alterations, and scaling adjustments. These augmentation techniques introduced a broader range of image variations, allowing the model to generalize more effectively during training. Image data augmentation creates unique samples that convolutional neural networks (CNNs) recognize as distinct inputs. This approach was employed in this study to overcome the limitations of a small dataset, ensuring the generation of diverse representations and enhancing the robustness of the training dataset.
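The sketch below illustrates how such a resizing and augmentation pipeline can be configured with the Keras ImageDataGenerator. The parameter values, directory layout, and batch size are illustrative assumptions rather than the exact settings used in this study.

```python
# Hedged sketch of the preprocessing described above: every listed augmentation has a
# direct ImageDataGenerator argument, and flow_from_directory resizes images to 224x224.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,            # scaling adjustment
    rotation_range=20,            # rotation
    width_shift_range=0.1,        # width adjustment
    height_shift_range=0.1,       # height adjustment
    shear_range=0.1,              # shear transformation
    zoom_range=0.2,               # zoom modification
    horizontal_flip=True,         # horizontal flip
    vertical_flip=True,           # vertical flip
    brightness_range=(0.8, 1.2),  # brightness alteration
)

# "data/train" with one sub-folder per class (No-Treatment, CO-Treatment, CS-Treatment)
# is a hypothetical layout; class_mode="categorical" yields one-hot labels.
train_generator = train_datagen.flow_from_directory(
    "data/train",
    target_size=(224, 224),
    batch_size=16,
    class_mode="categorical",
)
```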

3.4. Model Architectures

3.4.1. DenseNet

DenseNet is constructed by applying identity connections at each layer, enabling the concatenation of residual mappings from all preceding layers. As a result, each layer receives input from the feature maps of all previous layers, and the output of one layer serves as input for all subsequent layers. This structure effectively improves feature utilization across all layers without significantly increasing model complexity. In this study, the DenseNet 121 architecture was selected as one of the commonly utilized DenseNet structures [50].

3.4.2. ResNet

ResNet-50 is a convolutional neural network architecture consisting of 50 layers, as detailed by [51]. One of the key innovations in convolutional neural network (CNN) architectures, particularly in ResNet-50, is the introduction of shortcut connections, as discussed by [52]. The ResNet-50 model is notable for its use of skip connections and residual learning structures, which help mitigate training issues such as gradient vanishing and gradient explosion, as explained by [53].

3.4.3. Inception

Inception V3, developed by Google for the ImageNet Large Scale Visual Recognition Challenge, is a deep convolutional network model. It employs a diverse set of filters within its convolutional layers, differing from traditional approaches by combining these filters using channel concatenation operations before moving to the next iteration [38]. The inception module’s primary role is to serve as a multi-level feature extractor, merging results from various convolutional filters within a single module. These filter outputs are organized into channel dimensions for further processing in subsequent layers.
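To make the comparison between backbones concrete, the sketch below shows one way the three architectures could be instantiated in Keras with a shared three-class head. The ImageNet weights, global average pooling, and softmax head are illustrative assumptions and not the study's documented configuration.

```python
# Hedged sketch: swap DenseNet121, ResNet50, or InceptionV3 behind an identical
# three-class classification head (No-Treatment, CO-Treatment, CS-Treatment).
from tensorflow.keras import layers, models
from tensorflow.keras.applications import DenseNet121, InceptionV3, ResNet50

BACKBONES = {"densenet": DenseNet121, "resnet": ResNet50, "inception": InceptionV3}

def build_model(name: str, num_classes: int = 3) -> models.Model:
    base = BACKBONES[name](
        include_top=False,          # drop the original ImageNet classifier
        weights="imagenet",         # assumed transfer-learning starting point
        input_shape=(224, 224, 3),  # matches the resized dataset
    )
    x = layers.GlobalAveragePooling2D()(base.output)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(inputs=base.input, outputs=outputs)

model = build_model("densenet")     # "resnet" or "inception" swaps the backbone
```

Calling build_model("resnet") or build_model("inception") exchanges the backbone while keeping the head identical, which mirrors the multi-architecture comparison pursued in this study.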

3.5. Metric Evaluation

Seven evaluation metrics (Equations (1)–(8)) are employed to assess the performance of the proposed model [54,55,56,57]. These metrics are derived from the confusion matrix, which compares actual classes with predicted ones. Correctly classified instances are represented along the diagonal of the confusion matrix.
$P_a = \frac{TP_a}{TP_a + FP_a}$ (1)
$R_a = \frac{TP_a}{TP_a + FN_a}$ (2)
$F1_a = \frac{2 \times P_a \times R_a}{P_a + R_a}$ (3)
$ACC = \frac{TP_a + TN_a}{TP_a + TN_a + FP_a + FN_a}$ (4)
where TPa, TNa, FPa, and FNa represent the true positive, true negative, false positive, and false negative for class “a”. Similarly, Pa, Ra, and F1a represent precision, recall, and F1-score for class “a”:
$TPR = \frac{TP_a}{TP_a + FN_a}$ (5)
$FPR = \frac{FP_a}{FP_a + TN_a}$ (6)
where TPR is the true positive rate, and FPR is the false positive rate. TPR and FPR represent the receiver operating characteristic/ROC.
$AUC = 0.5 \times \sum_{i=1}^{n} \frac{TPR_i}{n}$ (7)
where AUC is the area under the curve, and 0.5 is the threshold.
$\text{Kappa Score} = \frac{P_0 - P_e}{1 - P_e}$ (8)
where P0 is the observed agreement ratio, and Pe is the hypothetical probability of chance agreement.
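For reference, all of these metrics can be computed directly from model outputs. The minimal sketch below uses scikit-learn, with small hypothetical y_true, y_pred, and y_prob arrays standing in for the test-set labels, predicted classes, and class probabilities.

```python
# Hedged sketch of Equations (1)-(8) via scikit-learn; the arrays are hypothetical.
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score, confusion_matrix,
                             precision_recall_fscore_support, roc_auc_score)

y_true = np.array([0, 1, 2, 1, 0, 2])                  # true classes (0/1/2)
y_pred = np.array([0, 1, 2, 1, 1, 2])                  # predicted classes
y_prob = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1],   # predicted class probabilities
                   [0.1, 0.1, 0.8], [0.2, 0.7, 0.1],
                   [0.4, 0.5, 0.1], [0.1, 0.2, 0.7]])

print("confusion matrix:\n", confusion_matrix(y_true, y_pred))
precision, recall, f1, _ = precision_recall_fscore_support(y_true, y_pred, average=None)
print("per-class precision:", precision, "recall:", recall, "F1:", f1)
print("accuracy:", accuracy_score(y_true, y_pred))
print("kappa score:", cohen_kappa_score(y_true, y_pred))
print("AUC (one-vs-rest):", roc_auc_score(y_true, y_prob, multi_class="ovr"))
```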

3.6. Model Training and Testing

The research implements three distinct CNN architectures—ResNet, DenseNet, and Inception—each chosen for their unique capabilities: ResNet’s skip connections enable effective deep network training; DenseNet’s dense connections provide efficient feature map utilization; and Inception’s multiscale processing allows for feature capture at different resolutions. The prepared dataset of 565 images underwent augmentation and segmentation before being split into two portions: a training set containing 80% (453 images) with balanced grade distribution and a testing set with the remaining 20% (112 images).
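A minimal sketch of such a stratified 80:20 split is shown below; the folder layout and the way file paths are gathered are assumptions used only to show how balanced class proportions can be preserved in both portions.

```python
# Hedged sketch: stratified 80:20 split of image paths so each treatment class keeps
# the same proportion in the training and testing sets. Paths are hypothetical.
from pathlib import Path
from sklearn.model_selection import train_test_split

image_paths, labels = [], []
for class_dir in Path("data/all_images").iterdir():    # one sub-folder per class
    for img_file in class_dir.glob("*.jpg"):
        image_paths.append(str(img_file))
        labels.append(class_dir.name)                   # No-Treatment / CO-Treatment / CS-Treatment

train_paths, test_paths, train_labels, test_labels = train_test_split(
    image_paths, labels, test_size=0.20, stratify=labels, random_state=42
)
print(len(train_paths), "training images /", len(test_paths), "testing images")
```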

3.7. Implementation

The modeling for predicting tuna loin treatments was performed in Python using Keras. The hyper-parameters can be found in Table 8. We split each dataset into training and test sets at an 80:20 ratio per category [58].
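The sketch below outlines what such a Keras training run could look like, reusing the model and train_generator names from the earlier sketches. The optimizer, learning rate, and loss are assumptions standing in for the hyper-parameters listed in Table 8; the 15 epochs and the held-out validation portion follow the text.

```python
# Hedged training sketch; optimizer and learning rate are assumed, not taken from Table 8.
from tensorflow.keras.optimizers import Adam

model.compile(
    optimizer=Adam(learning_rate=1e-4),     # assumed learning rate
    loss="categorical_crossentropy",        # three-class softmax output
    metrics=["accuracy"],
)

# test_generator is assumed to be built like train_generator but with rescaling only
# (no augmentation), pointing at the held-out 20% of images.
history = model.fit(
    train_generator,
    validation_data=test_generator,
    epochs=15,                              # matches the epoch count in Section 4.1
)
```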

4. Results and Discussion

In this section, we will discuss the model training for each DCNN architecture: ResNet, DenseNet, and Inception. The discussion will include evaluating each architecture’s performance in predicting tuna loin quality based on seven evaluation metrics using samples of tuna loin images not used in the training and validation processes. We will also compare the performance with different architectures reported in previous research.

4.1. Training Model Result

This section evaluates the performance and training results of the ResNet classification model shown through the performance graph over 15 epochs, including validation loss, validation accuracy, training loss, and training accuracy (Figure 3). The graph shows the validation loss (blue line), exhibiting a significant decrease from epochs 1 to 4 and then gradually declining until epoch 15, indicating that the model is successfully learning from the data. The training loss (orange dashed line) drops sharply at first but then tends to stabilize at a lower value than the validation loss, indicating signs of overfitting. Training accuracy (red dashed line) increases gradually until reaching around 0.95 in the final epoch, while validation accuracy appears to fluctuate with a downward trend at the end (green dashed line). The significant gap between the high training accuracy and lower, unstable validation accuracy further confirms that the model is experiencing overfitting—where the model excels at learning training data but is less optimal in generalizing to unseen validation data. Overall, the ResNet model demonstrates impressive training performance, with a high training accuracy of 93% and consistently decreasing loss values, showing strong learning capabilities on the training data. While there is room for improvement in validation performance through optimization techniques, the model establishes a solid foundation for classification tasks.
Figure 4 illustrates the performance and training evaluation results of the DenseNet classification model shown through the performance graph over 15 epochs, including validation loss, validation accuracy, training loss, and training accuracy. The graph shows that the validation loss (blue line) starts remarkably high but drops very sharply in the first two epochs and continues to decrease steadily until epoch 15, indicating efficient initial learning. The training loss (orange dashed line) shows a rapid improvement in the first few epochs and then stabilizes at a consistently low value of around 0.95–1.0, suggesting strong learning on the training data. Training accuracy (red dash–dot line) demonstrates steady improvement, reaching and maintaining approximately 90–95% accuracy after epoch 8. However, the validation accuracy (green dotted line) shows concerning instability and fluctuations throughout training, with a general decreasing trend and significant variance between epochs, despite the decreasing validation loss. The substantial gap between the stable, high training accuracy and the unstable, lower validation accuracy, combined with the diverging loss metrics, strongly indicates that the DenseNet model is experiencing notable overfitting issues. Overall, the DenseNet model showcases exceptional learning efficiency with rapid initial convergence and excellent training accuracy above 95%, demonstrating strong feature extraction capabilities. While the validation metrics suggest room for fine-tuning, the model’s robust training performance and steady loss reduction provide a promising platform for further optimization.
Figure 5 displays the performance and training results of the Inception classification model shown through the performance graph over 15 epochs, including validation loss, validation accuracy, training loss, and training accuracy. The graph shows that the validation loss (blue line) starts notably high at around 2.0 but demonstrates a dramatic decrease in the first four epochs and continues to decline steadily until epoch 15, indicating effective initial learning. The training loss (orange dashed line) shows a gradual increase after epoch 4 and stabilizes at a higher value of around 0.9–1.0, displaying a different pattern compared to the validation loss. Training accuracy (red dash–dot line) shows moderate improvement, reaching and maintaining approximately 75–80% accuracy throughout the training process. The validation accuracy (green dotted line) starts high but shows a concerning downward trend with significant fluctuations, particularly after epoch 8, despite the consistently decreasing validation loss. The divergent patterns between training and validation metrics, especially the increasing training loss while validation loss decreases, suggest unique learning dynamics in the Inception model. Overall, the Inception model exhibits remarkable efficiency in minimizing validation loss and demonstrates stable training accuracy around 80%, showing promising feature extraction capabilities. While the validation metrics indicate potential areas for improvement, the model’s unique learning patterns and steady validation loss reduction provide interesting insights for further architectural optimization.
The evaluation of three classification models, ResNet, DenseNet, and Inception, over 15 epochs highlights their performance trends, learning dynamics, and challenges: (1) ResNet demonstrates strong training performance, achieving high training accuracy (~95%) and consistent loss reduction, indicating effective learning from the training data. However, the model shows signs of overfitting, evidenced by a significant gap between high training accuracy and fluctuating, lower validation accuracy. Despite the overfitting, ResNet establishes a solid foundation for classification tasks. (2) DenseNet achieves rapid initial convergence and maintains exceptional training accuracy (above 95%) with steadily decreasing loss values, reflecting its strong feature extraction capabilities. However, the model also exhibits overfitting, as validation accuracy fluctuates significantly with a downward trend, highlighting the need for optimization to improve generalization. (3) The Inception model effectively minimizes validation loss, indicating strong initial learning. While training accuracy stabilizes at 75–80%, the validation accuracy fluctuates and trends downward after epoch 8, suggesting unique learning dynamics. The divergence between decreasing validation loss and increasing training loss requires further investigation. Nonetheless, the model shows potential in feature extraction and provides valuable insights for future optimization.
While all three models showcase robust training performance and effective learning on the training data, their generalization capabilities remain a challenge due to varying degrees of overfitting. ResNet and DenseNet exhibit high training accuracy but unstable validation metrics, while Inception demonstrates unique learning behavior with potential for refinement. These insights underline the need for optimization techniques to enhance validation performance, providing a pathway for future model improvements.

4.2. Model Performance

Performance measurement plays a pivotal role in assessing the effectiveness and reliability of prediction models. It encompasses various evaluation metrics, such as accuracy, precision, recall, F1 score, and area under the curve (AUC), to gauge the model’s ability to generalize and make accurate predictions. These metrics collectively assess the overall correctness of the model, its balance in handling false and true predictions, and its robustness, particularly in image classification tasks. Advanced methods, including receiver operating characteristic (ROC) analysis and kappa statistics, offer deeper insights by addressing probabilistic predictions and agreement beyond chance. This comprehensive evaluation framework provides a clear understanding of the model’s performance, guiding its optimization and suitability for deployment (Figure 6, Figure 7 and Figure 8 and Table 9).
Figure 6 displays a confusion matrix that shows the performance of the ResNet model in classifying three classes: No-Treatment, CO-Treatment, and CS-Treatment. The matrix indicates that for No-Treatment, the model correctly classified 38 instances, with 5 instances misclassified as CO-Treatment and 1 as CS-Treatment. For CO-Treatment, the model achieved perfect classification, with all 42 instances correctly identified, showing no misclassifications. In the case of CS-Treatment, the model accurately predicted 25 instances, with only 1 instance misclassified as No-Treatment, demonstrating strong performance for this class. The matrix reveals that the model’s main challenge lies in distinguishing No-Treatment cases, while it particularly excels in identifying CO-Treatment instances.
Figure 7 displays a confusion matrix that shows the performance of the DenseNet model in classifying three classes: No-Treatment, CO-Treatment, and CS-Treatment. The matrix indicates that for No-Treatment, the model correctly classified 43 instances, with only 1 instance misclassified as CO-Treatment. For CO-Treatment, the model performed excellently, with 41 correct classifications and only 1 misclassification as No-Treatment. In the case of CS-Treatment, the model accurately predicted 23 instances, with 1 instance misclassified as No-Treatment and 2 as CO-Treatment. The matrix demonstrates that the model maintains high accuracy across all classes, with minimal misclassifications.
Figure 8 displays a confusion matrix that shows the performance of the Inception model in classifying three classes: No-Treatment, CO-Treatment, and CS-Treatment. For No-Treatment, the model correctly classified 40 instances, with 2 instances misclassified as CO-Treatment and 2 as CS-Treatment. In the CO-Treatment category, the model achieved 41 correct classifications, with only 1 misclassification as No-Treatment. For CS-Treatment, the model accurately predicted 21 instances, while misclassifying 4 instances as No-Treatment and 1 instance as CO-Treatment. This pattern shows that while the model performs well overall, it has some challenges, particularly in distinguishing CS-Treatment cases.
Table 9 outlines the performance metrics, including accuracy, precision, recall, F1 Score, ROC, AUC, and kappa score, for the ResNet, DenseNet, and Inception architectures across various datasets: No-Treatment, CO-Treatment, and CS-Treatment.
The ResNet model achieves a consistent accuracy of 93.75% across all classes, highlighting strong overall classification performance. Precision values vary across classes, with No-Treatment achieving the highest at 97.44%, followed by CS-Treatment at 96.15% and CO-Treatment at 89.36%, suggesting particularly reliable predictions for No-Treatment cases. Recall scores reveal perfect performance (1.0000) for CO-Treatment, a high recall of 0.9615 for CS-Treatment, and a slightly lower recall of 0.8636 for No-Treatment. F1 scores, which balance precision and recall, demonstrate excellent results: CS-Treatment leads with 0.9615, followed by CO-Treatment at 0.9438 and No-Treatment at 0.9157. Overall, the ResNet model exhibits robust classification capabilities, excelling in CO-Treatment and CS-Treatment identification while showing room for improvement in No-Treatment classification.
The DenseNet model, as detailed in Table 9, demonstrates improved performance over ResNet, with an accuracy of 95.54% across all classes. Precision scores are exceptional, with CS-Treatment achieving perfect precision (1.0000), No-Treatment at 0.9556, and CO-Treatment at 0.9318, reflecting the model’s high reliability in positive predictions, particularly for CS-Treatment cases. Recall scores are strong for No-Treatment (0.9773) and CO-Treatment (0.9762) but slightly lower for CS-Treatment (0.8846), indicating occasional missed detections in the latter. F1 scores show balanced performance, with No-Treatment at 0.9663, CO-Treatment at 0.9535, and CS-Treatment at 0.9388. The DenseNet model outperforms ResNet, delivering superior classification capabilities, especially in No-Treatment and CO-Treatment cases, while maintaining strong accuracy in CS-Treatment classification.
The Inception model, also summarized in Table 9, achieves a stable accuracy of 91.07% across all classes, indicating solid classification performance. Precision scores are reliable, with CO-Treatment at 0.9318, CS-Treatment at 0.9130, and No-Treatment at 0.8889, highlighting the model’s effectiveness in positive predictions, particularly for CO-Treatment cases. Recall values are strong for CO-Treatment (0.9762) and No-Treatment (0.9091) but lower for CS-Treatment (0.8077), suggesting challenges in identifying all CS-Treatment instances accurately. F1 scores reflect balanced performance, with CO-Treatment leading at 0.9535, followed by No-Treatment at 0.8989 and CS-Treatment at 0.8571. The Inception model exhibits robust classification capabilities, excelling in CO-Treatment identification, while maintaining reliable performance for No-Treatment and showing potential for improvement in CS-Treatment detection.
ROC values reflect the trade-off between true positive rate (TPR) and false positive rate (FPR) across datasets. For ResNet, the ROC value of 1.000 for both CO-Treatment and CS-Treatment indicates perfect classification, with no trade-off between TPR and FPR. DenseNet consistently maintained high ROC values, though slightly lower than ResNet, still demonstrating excellent performance across all datasets. Inception exhibited comparatively lower ROC values, particularly for the CS-Treatment dataset (0.840), indicating some challenges in effectively distinguishing between classes.
The AUC measures the overall ability of the model to distinguish between classes. ResNet demonstrated consistently high AUC values (0.987 across all datasets), reflecting strong classification performance with high separability. DenseNet performed slightly lower but remained competitive, showing AUC values around 0.966 across datasets, which underscore its robust classification capability. Inception exhibited the lowest AUC values among the three architectures. While its performance for No-Treatment and CO-Treatment was satisfactory (0.920), it declined to 0.840 for CS-Treatment, suggesting less reliable performance for this dataset.
The kappa score measures the agreement between predicted and actual classifications beyond what would be expected by chance. DenseNet achieved the highest kappa scores across all datasets, reaching a peak of 0.932 for the No-Treatment and CO-Treatment datasets, indicating superior agreement. ResNet maintained slightly lower but still strong kappa scores (0.904 across all datasets), reflecting consistent reliability. Inception showed a notable decline in kappa scores (0.804 across datasets), indicating weaker agreement compared to both ResNet and DenseNet.
In summary, the Inception model exhibits robust performance in differentiating between the three treatment categories, demonstrating reliable classification capabilities supported by favorable evaluation metrics across all classes. The model’s consistent accuracy and strong performance indicators suggest it can be effectively deployed for similar classification tasks within this specific dataset. However, it is important to acknowledge that these results are context-specific and may not necessarily translate to different datasets or scenarios. To ensure broader applicability and validate the model’s generalization capabilities, it would be prudent to conduct further evaluation using an entirely separate, independent dataset that is not used during the training phase. This additional validation step would provide more comprehensive insights into the model’s real-world effectiveness and adaptability.
For comparison purposes, an analysis was performed on several methods employed for classifying fish freshness, using accuracy as the primary indicator, as presented in Table 10.
Inception V3 has achieved high accuracy scores, such as 100%, as demonstrated in studies like [39], underscoring its strong performance in image classification tasks. The architecture of Inception V3 efficiently captures multiscale features, making it particularly effective for complex datasets. The inception modules of Inception V3 enable it to extract detailed spatial patterns across different scales, which enhances its ability to accurately classify complex images. However, the deep and complex nature of the architecture makes it computationally demanding, potentially limiting its application in resource-constrained environments.
VGG-16 has demonstrated good performance, with an accuracy of 98%, shown in studies like [20]. This architecture employs multiple stacked convolutional layers to learn high-level features from images. VGG-16 is known for its simplicity and robustness in image feature learning, making it a reliable choice for a variety of image classification tasks. Its deep structure effectively captures hierarchical patterns, which is advantageous for standard image classification tasks. However, this deep architecture can lead to increased computational costs, and it may not perform as well as more advanced models on complex datasets that require handling large amounts of data and variability.
Xception can achieve an accuracy of 77% as reported by [22]. It incorporates depth-wise separable convolutions to enhance computational efficiency while maintaining high accuracy. The design of Xception strikes a good balance between computational efficiency and classification accuracy, making it well-suited for tasks that require moderate computational resources. The depth-wise separable convolutions significantly reduce the number of parameters, allowing the model to operate efficiently even on mobile and edge devices. However, its performance may not match that of more specialized models like DenseNet or ResNet for certain complex image classification tasks that require detailed feature extraction.
MobileNet, a lightweight model, has achieved varying accuracies ranging from 63.21% to 81% across different studies, such as those conducted by [23,24]. Its design is specifically optimized for mobile and embedded devices. The architecture employs depth-wise separable convolutions, enhancing computational efficiency and increasing processing speed, making it ideal for applications requiring rapid inference and lower computational demands. However, while MobileNet excels in speed and efficiency, it may not achieve the high accuracy levels of more complex models like Inception V3 or DenseNet on challenging datasets that necessitate more advanced feature extraction.
ResNet 50 has demonstrated strong performance, with an accuracy of 87% reported in [21]. The architecture employs residual connections to alleviate the vanishing gradient problem and enables the training of deep networks. This residual learning approach allows ResNet 50 to maintain high accuracy even at deeper layers, making it suitable for tasks that require deeper network architectures. Its ability to capture deeper hierarchical features and address gradient issues makes it a powerful choice for complex image classification tasks. However, despite being more efficient than very deep models, ResNet 50 remains more computationally intensive than lighter architectures like MobileNet, which may restrict its applicability in environments with limited computational power.
YOLO (you only look once) achieved an accuracy of 95% in [21], making it particularly well-suited for real-time object detection tasks where speed is of the utmost importance. Its design is optimized for fast inference, as it predicts both bounding boxes and class probabilities in a single evaluation pass, which makes it ideal for real-time applications such as analyzing video streams or live camera feeds. However, while YOLO excels in speed, its accuracy may not match that of more complex models like Inception V3 or DenseNet when fine-grained classification is required, as it prioritizes speed over detailed feature extraction.
DenseNet demonstrated superior performance compared to other architectures, achieving an accuracy of 95.54% in the current study. It utilizes dense connections, allowing feature maps from each layer to be fully utilized by all subsequent layers. This architecture design improves gradient flow and reduces the number of parameters, significantly enhancing feature reuse and connectivity, which in turn increases accuracy and efficiency. DenseNet’s dense connections enable it to capture detailed feature dependencies, making it particularly effective for tasks that require fine-grained image analysis. However, the dense connections also contribute to higher computational complexity, making DenseNet more resource-demanding compared to simpler architectures like MobileNet. This increased complexity can limit its deployment on resource-constrained devices.

4.3. Model Implementation Test Results

The evaluation phase of model implementation focuses on assessing the trained model’s effectiveness using novel fish meat imagery to determine its classification accuracy on unseen data. This validation process is essential for establishing the model’s practical utility and real-world applicability. The assessment encompasses both successful and unsuccessful classification outcomes. As illustrated in Table 11, the successful classification results demonstrate the model’s ability to accurately categorize new test samples, affirming its effectiveness in the intended application. Nevertheless, the model exhibits certain limitations, as evidenced by instances of misclassification. These imperfections are anticipated, given that the model’s performance is not absolute. Table 12 documents cases where the model failed to correctly classify new samples, highlighting the inherent constraints in its classification capabilities. Such classification errors are consistent with the model’s measured accuracy levels and represent typical challenges in machine learning applications.

5. Conclusions

Based on the conducted research and testing, we have successfully developed and implemented an AI model for fish meat classification. The CNN architecture demonstrated exceptional feature extraction capabilities, achieving a notable accuracy of 95%, which validates the model’s optimization and effectiveness in performing classification tasks. Testing with novel data yielded encouraging results, with the model successfully categorizing samples into the three designated classes: No-Treatment, CO-Treatment, and CS-Treatment. Despite these achievements, several areas for potential enhancement have been identified. The data processing pipeline could be improved, and instances of misclassification in new test data indicate opportunities for model refinement. To address these challenges, expanding the training dataset’s size and diversity through data augmentation techniques could enhance the model’s generalization capabilities, reducing overfitting issues and improving overall accuracy. Future research directions could explore more sophisticated approaches, such as implementing complex CNN architectures and advanced preprocessing methods to capture more detailed image features. Additionally, extending the platform to include an Android-based mobile application would significantly enhance its practical utility, particularly in industrial settings. This mobile solution would enable real-time classification, making it more accessible for quality control processes in the food processing industry. Through the continued development and exploration of these improvements, this research has the potential to contribute significantly to the advancement of AI-based quality assessment technologies in food processing applications.

Author Contributions

Conceptualization, J.M.T., F.R., B.S. and W.L.; methodology, J.M.T., F.R., B.S., W.L. and S.G.; software, J.M.T., W.L. and S.G.; validation, J.M.T., W.L. and S.G.; formal analysis, J.M.T., W.L. and S.G.; investigation, J.M.T., W.L. and S.G.; resources, J.M.T.; data curation, J.M.T., W.L. and S.G.; writing—original draft preparation, J.M.T., W.L. and S.G.; writing—review and editing, J.M.T., W.L. and S.G.; visualization, B.S.; supervision, F.R., B.S. and W.L.; project administration, J.M.T., W.L. and S.G.; funding acquisition, J.M.T. and W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Pattimura University with grant number 1069/UN13/SK/2024, and the APC was funded by Pattimura University (30%) and by the authors (70%).

Institutional Review Board Statement

The samples we utilized, in the form of images, were derived from processed samples produced by the company, so ethical approval was not needed.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Gustiano, R.; Kurniawan, K.; Haryono, H. Optimizing the Utilization of Genetic Resources of Indonesian Native Freshwater Fish. Asian J. Conserv. Biol. 2021, 10, 189–196. [Google Scholar] [CrossRef]
  2. Tauda, I.; Hiariey, J.; Lopulalan, Y.; Bawole, D. Management policy of small-scale tuna fisheries based on island cluster in Maluku. IOP Conf. Ser. Earth Environ. Sci. 2021, 777, 012011. [Google Scholar] [CrossRef]
  3. Suryaningrum, T.D.; Ikasari, D.; Octavini, H. Evaluation of Fresh Tuna Loin Quality for Sashimi Processed on Boat during Handling and Distribution in Ambon. JPB Kelaut. Dan Perikan. 2017, 12, 163–178. [Google Scholar]
  4. Kılıçarslan, S.; Çiçekliyurt, M.M.H.; Kılıçarslan, S. Fish Freshness Detection Through Artificial Intelligence Approaches: A Comprehensive Study. Turk. J. Agric. Food Sci. Technol. 2024, 12, 290–295. [Google Scholar] [CrossRef]
  5. Sengar, N.; Dutta, M.K.; Travieso, C.M. Computer vision based technique for identification and quantification of powdery mildew disease in cherry leaves. Computing 2018, 100, 1189–1201. [Google Scholar] [CrossRef]
  6. Shi, C.; Qian, J.; Zhu, W.; Liu, H.; Han, S.; Yang, X. Nondestructive determination of freshness indicators for tilapia fillets stored at various temperatures by hyperspectral imaging coupled with RBF neural networks. Food Chem. 2019, 275, 497–503. [Google Scholar] [CrossRef]
  7. Dowlati, M.; de la Guardia, M.; Mohtasebi, S.S. Application of machine-vision techniques to fish-quality assessment. TrAC Trends Anal. Chem. 2012, 40, 168–179. [Google Scholar] [CrossRef]
  8. Hong, H.; Yang, X.; You, Z.; Cheng, F. Visual quality detection of aquatic products using machine vision. Aquac. Eng. 2014, 63, 62–71. [Google Scholar] [CrossRef]
  9. Murakoshi, T.; Masuda, T.; Utsumi, K.; Tsubota, K.; Wada, Y. Glossiness and perishable food quality: Visual freshness judgment of fish eyes based on luminance distribution. PLoS ONE 2013, 8, e58994. [Google Scholar] [CrossRef]
  10. Issac, A.; Dutta, M.K.; Sarkar, B. Computer vision based method for quality and freshness check for fish from segmented gills. Comput. Electron. Agric. 2017, 139, 10–21. [Google Scholar] [CrossRef]
  11. Prasetyo, E.; Suciati, N.; Fatichah, C.; Pardede, E. Standardizing the fish freshness class during ice storage using clustering approach. Ecol. Inform. 2024, 80, 102533. [Google Scholar] [CrossRef]
  12. Lugatiman, K.; Fabiana, C.; Echavia, J.; Adtoon, J.J. Tuna meat freshness classification through computer vision. In Proceedings of the 2019 IEEE 11th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM), Laoag, Philippines, 29 November–1 December 2019; pp. 1–6. [Google Scholar]
  13. Moon, E.J.; Kim, Y.; Xu, Y.; Na, Y.; Giaccia, A.J.; Lee, J.H. Evaluation of salmon, tuna, and beef freshness using a portable spectrometer. Sensors 2020, 20, 4299. [Google Scholar] [CrossRef] [PubMed]
  14. Medeiros, E.C.; Almeida, L.M.; Filho, J.G.D.A.T. Computer Vision and Machine Learning for Tuna and Salmon Meat Classification. Informatics 2021, 8, 70. [Google Scholar] [CrossRef]
  15. Saputra, S.; Yudhana, A.; Umar, R. Implementation of Naïve Bayes for fish freshness identification based on image processing. J. RESTI (Rekayasa Sist. Dan Teknol. Inf.) 2022, 6, 412–420. [Google Scholar] [CrossRef]
  16. Wu, T.; Yang, L.; Zhou, J.; Lai, D.C.; Zhong, N. An improved nondestructive measurement method for salmon freshness based on spectral and image information fusion. Comput. Electron. Agric. 2019, 158, 11–19. [Google Scholar] [CrossRef]
  17. Taheri-Garavand, A.; Nasiri, A.; Banan, A.; Zhang, Y.-D. Smart deep learning-based approach for non-destructive freshness diagnosis of common carp fish. J. Food Eng. 2020, 278, 109930. [Google Scholar] [CrossRef]
  18. Anas, D.F.; Jaya, I. Design and implementation of fish freshness detection algorithm using deep learning. IOP Conf. Ser. Earth Environ. Sci. 2021, 944, 12007. [Google Scholar] [CrossRef]
  19. Priya, K.A.; Kaladevi, A.C.; Perumal, R. Detection of Sardine Fish Freshness Using Deep Convolution Neural Network. Ann. Rom. Soc. Cell Biol. 2021, 25, 16063–16070. [Google Scholar]
  20. Rayan, M.A.; Rahim, A.; Rahman, M.A.; Marjan, M.A.; Ali, U.A.M.E. Fish freshness classification using combined deep learning model. In Proceedings of the 2021 International Conference on Automation, Control and Mechatronics for Industry 4.0 (ACMI), Rajshahi, Bangladesh, 8–9 July 2021; pp. 1–5. [Google Scholar]
  21. Ayuningtias, I.; Jaya, I.; Iqbal, M. Identification of yellowfin tuna (Thunnus albacares), mackerel tuna (Euthynnus affinis), and skipjack tuna (Katsuwonus pelamis) using deep learning. IOP Conf. Ser. Earth Environ. Sci. 2021, 944, 12009. [Google Scholar] [CrossRef]
  22. Prasetyo, E.; Suciati, N.; Fatichah, C. Yolov4-tiny and spatial pyramid pooling for detecting head and tail of fish. In Proceedings of the 2021 International Conference on Artificial Intelligence and Computer Science Technology (ICAICST), Yogyakarta, Indonesia, 29–30 June 2021; pp. 157–161. [Google Scholar]
  23. Prasetyo, E.; Purbaningtyas, R.; Adityo, R.D.; Suciati, N.; Fatichah, C. Combining MobileNetV1 and Depthwise Separable convolution bottleneck with Expansion for classifying the freshness of fish eyes. Inf. Process. Agric. 2022, 9, 485–496. [Google Scholar] [CrossRef]
  24. Hanifa, M.F.; Ramadhan, A.T.; Widiyono, N.A.; Mubarak, R.S.; Putri, A.A.; Priyanta, S. Fishku Apps: Fishes Freshness Detection Using CNN with MobilenetV2. IJCCS (Indones. J. Comput. Cybern. Syst.) 2023, 17, 67–78. [Google Scholar] [CrossRef]
  25. Pianta, R.; Downer, J.; Hamre, B. Quality in early education classrooms: Definitions, gaps, and systems. Future Child. 2016, 26, 119–137. [Google Scholar] [CrossRef]
  26. Goetsch, D.L.; Davis, S.B. Quality Management for Organizational Excellence; Pearson: Upper Saddle River, NJ, USA, 2014. [Google Scholar]
  27. Kumar, P.; Raju, N.V.S.; Kumar, M.V. Quality of quality definitions-an analysis. Int. J. Sci. Eng. Technol. 2016, 5, 142–148. [Google Scholar]
  28. SNI 01-4104.3-2006; Stages of Handling and Processing Frozen Tuna Loin for Fresh Tuna Raw Material. Indonesian National Standardization Agency: Jakarta, Indonesia, 2006.
  29. Olafsdottir, G.; Martinsdóttir, E.; Oehlenschläger, J.; Dalgaard, P.; Jensen, B.; Undeland, I.; Mackie, I.M.; Henehan, G.; Nielsen, J.; Nilsen, H. Methods to evaluate fish freshness in research and industry. Trends Food Sci. Technol. 1997, 8, 258–265. [Google Scholar] [CrossRef]
  30. Wlodarczak, P. Machine Learning and Its Applications; CRC Press: Boca Raton, FL, USA, 2019. [Google Scholar]
  31. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117. [Google Scholar] [CrossRef]
  32. Deng, L.; Yu, D. Deep learning: Methods and applications. Found. Trends Signal Process. 2014, 7, 197–387. [Google Scholar] [CrossRef]
  33. Zhao, Z.-Q.; Zheng, P.; Xu, S.; Wu, X. Object detection with deep learning: A review. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 3212–3232. [Google Scholar] [CrossRef]
  34. Krishna, S.T.; Kalluri, H.K. Deep learning and transfer learning approaches for image classification. Int. J. Recent Technol. Eng. 2019, 7, 427–432. [Google Scholar]
  35. Sengara, N.; Dutta, M.K.; Sarkar, B. Computer vision based technique for identification of fish quality after pesticide exposure. Int. J. Food Prop. 2017, 20, 1160–1173. [Google Scholar] [CrossRef]
  36. Fitriyah, H.; Syauqy, D.; Susilo, F.A. Automatic Detection of Yellowfin Tuna (Euthynnus affinis) Freshness Based on Eye Image Using Binary Similarity: Deteksi kesegaran ikan tongkol (Euthynnus affinis) secara otomatis berdasarkan citra mata menggunakan binary similarity. J. Teknol. Inf. Dan Ilmu Komput. (JTIIK) 2020, 7, 879–886. [Google Scholar] [CrossRef]
  37. Saputra, S.; Yudhana, A.; Umar, R. Fish Freshness Identification Using KNN Algorithm Based on Digital Image: Identifikasi kesegaran ikan menggunakan algoritma KNN berbasis citra digital. Kre-TIF J. Tek. Inform. 2022, 10, 1–9. [Google Scholar]
  38. Pujiarini, E.H.; Lenti, F.N. Convolutional Neural Network for Identification of Tilapia Fish Freshness Level Based on Eye Color Changes: Convolution neural network untuk identifikasi tingkat kesegaran ikan nila berdasarkan perubahan warna mata. J. Khatulistiwa Inform. 2023, 11, 21–25. [Google Scholar] [CrossRef]
  39. Kalista, A.; Redjo, A.; Rosidah, U. Application of Image Processing for Tilapia (Oreochromis niloticus) Freshness Level Assessment: Penerapan image processing untuk tingkat kesegaran ikan nila (Oreochromis niloticus). J. Pengolah. Has. Perikan. Indones. 2019, 22, 229–235. [Google Scholar]
  40. Hernandez, A.A. Classification of Nile Tilapia using Convolutional Neural Network. In Proceedings of the 2019 IEEE 9th International Conference on System Engineering and Technology (ICSET), Shah Alam, Malaysia, 7 October 2019; pp. 126–131. [Google Scholar] [CrossRef]
  41. Hu, J.; Li, D.; Duan, Q.; Han, Y.; Chen, G.; Si, X. Fish species classification by color, texture and multi-class support vector machine using computer vision. Comput. Electron. Agric. 2012, 88, 133–140. [Google Scholar] [CrossRef]
  42. Fouad, M.M.M.; Zawbaa, H.M.; El-Bendary, N.; Hassanien, A.E. Automatic Nile Tilapia fish classification approach using machine learning techniques. In Proceedings of the 2013 13th International Conference on Hybrid Intelligent Systems (HIS), Gammarth, Tunisia, 4–6 December 2013; pp. 173–178. [Google Scholar]
  43. Jose, J.A.; Kumar, C.S.; Sureshkumar, S. A deep multi-resolution approach using learned complex wavelet transform for tuna classification. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 6208–6216. [Google Scholar] [CrossRef]
  44. Tolentino, L.K.S.; Orillo, J.W.F.; Aguacito, P.D.; Colango, E.J.M.; Malit, J.R.H.; Marcelino, J.T.G.; Nadora, A.C.; Odeza, A.J.D. Fish freshness determination through support vector machine. J. Telecommun. Electron. Comput. Eng. 2017, 9, 139–143. [Google Scholar]
  45. Navotas, I.C.; Santos, C.N.V.; Balderrama, E.J.M.; Candido, F.E.B.; Villacanas, A.J.E.; Velasco, J.S. Fish identification and freshness classification through image processing using artificial neural network. ARPN J. Eng. Appl. Sci. 2018, 13, 4912–4922. [Google Scholar]
  46. Diamante, R.A. Identification of fish freshness using image processing and machine learning techniques. Comput. Electron. Agric. 2018, 157, 363–373. [Google Scholar]
  47. Arora, M.; Mangipudi, P.; Dutta, M.K. A low-cost imaging framework for freshness evaluation from multifocal fish tissues. J. Food Eng. 2022, 314, 110777. [Google Scholar] [CrossRef]
  48. Yudhana, A.; Umar, R.; Saputra, S. Fish freshness identification using machine learning: Performance comparison of k-NN and Naïve Bayes classifier. J. Comput. Sci. Eng. 2022, 16, 153–164. [Google Scholar] [CrossRef]
  49. Aziz, M.A.; Fudholi, D.H.; Kurniawardhani, A. Non-destructive Fish Freshness Detection on Mobile Applications Using YOLOv4 and YOLOv4-Tiny: Deteksi kesegaran daging ikan bersifat non-destructive pada aplikasi mobile menggunakan YOLOv4 dan YOLOv4-Tiny. J. Tek. Inform. Dan Sist. Inf. 2023, 10, 126–141. [Google Scholar]
  50. Zhou, X.; Li, C.; Rahaman, M.; Yao, Y.; Ai, S.; Sun, C.; Wang, Q.; Zhang, Y.; Li, M.; Li, X.; et al. A comprehensive review for breast histopathology image analysis using classical and deep neural networks. IEEE Access 2020, 8, 90931–90956. [Google Scholar] [CrossRef]
  51. Elsharif, A.A.E.F.; Abu-Naser, S.S. Retina diseases diagnosis using deep learning. Int. J. Acad. Eng. Res. (IJAER) 2022, 6, 11–37. [Google Scholar]
  52. Nashrullah, F.; Wibowo, S.A.; Budiman, G. The Investigation of Epoch Parameters in ResNet-50 Architecture for Pornographic Classification: Investigasi Parameter Epoch Pada Arsitektur ResNet-50 Untuk Klasifikasi Pornografi. J. Comput. Electron. Telecommun. 2020. [Google Scholar] [CrossRef]
  53. Liu, F.; Xu, H.; Qi, M.; Liu, D.; Wang, J.; Kong, J. Depth-wise separable convolution attention module for garbage image classification. Sustainability 2022, 14, 3099. [Google Scholar] [CrossRef]
  54. Shahi, T.B.; Sitaula, C.; Neupane, A.; Guo, W. Fruit classification using attention-based MobileNetV2 for industrial applications. PLoS ONE 2022, 17, e0264586. [Google Scholar] [CrossRef]
  55. Shahi, T.B.; Xu, C.-Y.; Neupane, A.; Guo, W. Recent Advances in Crop Disease Detection Using UAV and Deep Learning Techniques. Remote Sens. 2023, 15, 2450. [Google Scholar] [CrossRef]
  56. Sitaula, C.; Shahi, T.B. Monkeypox Virus Detection Using Pre-Trained Deep Learning-Based Approaches. J. Med. Syst. 2022, 46, 78. [Google Scholar] [CrossRef]
  57. Carrington, A.M.; Manuel, D.G.; Fieguth, P.W.; Ramsay, T.; Osmani, V.; Wernly, B.; Bennett, C.; Hawken, S.; Magwood, O.; Sheikh, Y.; et al. Deep ROC Analysis and AUC as Balanced Average Accuracy, for Improved Classifier Selection, Audit and Explanation. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 45, 329–341. [Google Scholar] [CrossRef]
  58. Afaq, S.; Rao, S. Significance of Epochs on Training a Neural Network. Int. J. Sci. Technol. Res. 2020, 9, 485–488. [Google Scholar]
  59. Natsir, A.M.F.M.; Achmad, A.; Hazriani. Classification of Export-Quality Tuna Using Convolutional Neural Network: Klasifikasi ikan tuna layak ekspor menggunakan convolutional neural network. J. Ilm. Sist. Inf. Dan Tek. Inform. (JISTI) 2023, 6, 172–183. [Google Scholar] [CrossRef]
Figure 1. General pipeline of DL-based tuna loin treatment detection.
Figure 2. (A) Images of No-Treatment class. (B) Images of CO-Treatment class. (C) Images of CS-Treatment class.
Figure 3. ResNet training model result.
Figure 4. DenseNet training model result.
Figure 5. Inception training model result.
Figure 6. ResNet confusion matrix.
Figure 7. DenseNet confusion matrix.
Figure 8. Inception confusion matrix.
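Figures 6–8 summarize each model's per-class errors as confusion matrices. Purely as an illustration, a matrix of this kind could be computed and plotted roughly as in the sketch below (assumptions: scikit-learn and matplotlib are available; the labels and predictions shown are placeholders, not the study's actual test outputs).

```python
# Minimal sketch of producing a confusion matrix like those in Figures 6-8
# (assumption: scikit-learn/matplotlib tooling; y_true/y_pred are hypothetical).
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

class_names = ["No-Treatment", "CO-Treatment", "CS-Treatment"]
y_true = [0, 0, 1, 1, 2, 2, 2, 1, 0]   # hypothetical true treatment labels
y_pred = [0, 2, 1, 1, 2, 0, 2, 1, 0]   # hypothetical model predictions

cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
ConfusionMatrixDisplay(cm, display_labels=class_names).plot(cmap="Blues")
plt.title("Tuna loin treatment - confusion matrix (illustrative)")
plt.show()
```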
Table 1. Summary of literature review on accuracy prediction of fish quality based on image/data with eye attributes.

Ref. | Research Object | Image | Number of Samples | Method | Metric Evaluation
[15] | Fish (digital images of fish) | Fish eyes | 210 fish-eye digital images divided into training (70%) and testing (30%) data | Naïve Bayes algorithm | Accuracy: 79.37%
[22] | Milkfish | Fish eyes | 234 images (78 very fresh fish, days 1 and 2; 80 fresh fish, days 3 and 4; 80 non-fresh fish, days 5 and 6) | Transfer learning convolutional neural network using four architectures: Xception, MobileNet V1, ResNet 50, and VGG 16 | Xception: 77%; MobileNet V1: 81%; ResNet 50: 87%; VGG 16: 97%
[23] | Chanos chanos, Johnius trachycephalus, Nibea albiflora, Rastrelliger faughni, Upeneus moluccensis, Eleutheronema tetradactylum, Oreochromis mossambicus, Oreochromis niloticus | Fish eyes | 4392 fish-eye images from 8 fish species, categorized as highly fresh (days 1 and 2), fresh (days 3 and 4), and not fresh (days 5 and 6) | Convolutional neural network (CNN), MobileNetV1 bottleneck with expansion (MB-BE) | Accuracy: 63.21%; ResNet50: 84.86%
[24] | Tuna, milkfish, mackerel | Fish eyes | Captured datasets of fresh and non-fresh fish, resized to 224 × 224 pixels | CNN, transfer learning with MobileNetV2 | Accuracy: tuna 97%; milkfish 94%; mackerel 93%
[18] | Euthynnus affinis, Chanos chanos, Rastrelliger | Fish eyes | 3378 images classified into good, medium, and poor quality (according to SNI 01-2729-2013) | Deep learning with Tiny Yolov2 architecture | Precision: 72.9%; Recall: 57.5%; Accuracy: 57.5%
[35] | Indian Rohu (L. rohita) | Fish eyes | Database of eight fish samples with three repetitions, sampled over six days (6 days × 8 fish × 3 replicates = 144 samples) | Random forest classifier, decision tree | Accuracy: 96.87%; Sensitivity: 100%
[36] | Skipjack tuna | Fish eyes | 30 images (12 fresh fish images; 18 non-fresh fish images) | Binary similarity | Accuracy: 60%
[37] | Selar fish | Fish eyes | 150 images at intervals of 1, 5, and 10 h | k-NN, RGB color features | Accuracy: 93.33%
[38] | Nile tilapia | Fish eyes | 50 images | CNN | Accuracy: 93%
Table 2. Summary of literature review on accuracy prediction of fish quality based on image/data with gill attributes.

Ref. | Research Object | Image | Number of Samples | Method | Metric Evaluation
[39] | Nile tilapia | Gills | Images captured of the gills of Nile tilapia with 3 repetitions against a white background; capture period of 12 h (at 4 h intervals) | Image processing program built in Visual Basic 6.0 | Non-destructive method using image processing to determine fish freshness level across several categories; for Nile tilapia, very fresh (high quality) had a red color percentage of 82.18%, fresh (good quality) 67.10%, limit of acceptability 38.52%, and spoiled 9.92%
Table 3. Summary of literature review on accuracy prediction of fish quality based on image/data with skin attributes.

Ref. | Research Object | Image | Number of Samples | Method | Metric Evaluation
[5] | Labeo rohita (Rohu) | Skin | 30 fish sample images | Image processing techniques | Accuracy: 96.66%
[20] | Nile tilapia (ikan nila) | Skin | 4000-image dataset (2000 fresh; 2000 non-fresh); fresh fish: 1500 training and 50 test images; non-fresh fish: 1500 training and 50 test images | Convolutional neural network (VGG-16 architecture) combined with a bi-directional LSTM (CNN Bi-LSTM architecture) | Accuracy: 98%; Precision: 96%; Recall: 100%; Specificity: 96.15%; F1 score: 97.96%; Classification error: 2%
Table 4. Summary of literature review on accuracy prediction of fish quality based on image/data with meat attributes.

Ref. | Research Object | Image | Number of Samples | Method | Metric Evaluation
[12] | Yellowfin tuna | Meat | 60 samples; 1–2, 3–4, and 5–8 h | Computer vision, RGB extraction, KNN, and Waikato Environment for Knowledge Analysis (WEKA) | Accuracy: 86.67%
[13] | Tuna, salmon, beef | Meat | Atlantic salmon: 15 samples; Pacific salmon: 15 samples; tuna: 17 samples; beef: 16 samples | Machine learning, portable spectrometer | Accuracy: approximately 85% for salmon, 88% for tuna, and 92% for beef
[14] | Tuna and salmon | Meat | Tuna: 4 levels; salmon: 3 levels | Computer vision, machine learning | Accuracy: 100%
[16] | Salmon | Meat | 2336 salmon samples; 1869 randomly selected for training and 467 used for testing | Convolutional neural network modeling | Accuracy: 74.2%
Table 5. Summary of literature review on accuracy prediction of fish quality based on image/data with whole-fish attributes.

Ref. | Research Object | Image | Number of Samples | Method | Metric Evaluation
[21] | Thunnus albacares, Euthynnus affinis, Katsuwonus pelamis | Whole fish | 550 images (188 Thunnus albacares, 202 Katsuwonus pelamis, and 160 Euthynnus affinis) | Deep learning with YOLOv5 architecture | Training loss: 0.000253; Accuracy: 95%; Precision: 98.1%; Recall: 93.9%; F1 score: 96%
[40] | Nile tilapia | Whole fish | Total dataset of 2000 images; 900 training images and 100 testing images | Convolutional neural network with Inception V3 architecture | Accuracy: 50% at 0 iterations; 73.5% at 10 iterations; 98% beyond 100 iterations; 100% beyond 200 iterations
[41] | Six freshwater fish species common to China: grass carp (Ctenopharyngodon idellus), silver carp (Hypophthalmichthys molitrix), bighead carp (Aristichthys nobilis), snakehead murrel (Channa striata), Wuchang bream (Megalobrama amblycephala), and red-bellied pacu (Colossoma brachypomum) | Images of the fish | Images of the fish (1024 × 768) captured with a Nokia N8-00 smartphone camera | Multi-class support vector machine using computer vision | Average accuracy: 97.77%
[42] | Tilapia | Whole fish | 96 images of tilapia and 55 images of non-tilapia fish | Feature extraction algorithms, namely scale-invariant feature transform (SIFT) and speeded-up robust features (SURF), with machine learning classifiers, namely artificial neural network (ANN), support vector machine (SVM), and k-nearest neighbor (k-NN) | Accuracy: 94.4%
[19] | Sardine | Whole fish | 2127 images (1049 fresh and 1078 non-fresh sardines) | Deep convolutional neural network | Sensitivity: 96.2%; Specificity: 92.3%; Positive predictive value: 92.6%; Negative predictive value: 96%; Accuracy: 99.5%; F1 score: 94%
[43] | Bigeye tuna, skipjack tuna, yellowfin tuna | Whole fish | 657 images in total (220 bigeye tuna, 215 skipjack tuna, and 222 yellowfin tuna) | k-nearest neighbor (kNN), support vector machine (SVM), kernel extreme learning machine (KELM), linear discriminant analysis (LDA), random forest, probabilistic neural network (PNN), and artificial neural network (ANN) | Accuracy: 94.58%; Precision: 94.72%; Recall: 89.64%; F1 score: 92.04%; MCE: 5.42%
Table 6. Summary of literature review on accuracy prediction of fish quality based on image/data with combined attributes of eyes, gills, skin, meat, and whole fish.

Ref. | Research Object | Image | Number of Samples | Method | Metric Evaluation
[44] | Milkfish, round scad, short mackerel scad | Fish eyes, fish gills | Network database of 720 images for milkfish, 480 for round scad, and 480 for short mackerel scad | Support vector machine classifier | Accuracy: 98%
[45] | Milkfish, round scad, tilapia | Fish eyes, gills | 30 fish samples per species, used to obtain a total of 800 images each for the eyes and gills | Artificial neural network, feed-forward neural network, digital image processing | Accuracy: milkfish 90%; round scad 93.33%; tilapia 100%
[46] | Milkfish (Chanos chanos) | Eyes, gills, and body images | 72 cropped images used as a validation dataset for the body; 39 of these validated as fresh milkfish gills | Confusion matrix, Coiflet wavelet transform, region of interest, support vector machine | Accuracy: 85.407% in region-of-interest detection and 98% in the confusion matrix for classification
[47] | Labeo rohita (Rohu) | Gills, eyes, and skin | 288 sample images | New mathematical model for a novel Q-score: computation of slopes and SC of all focal tissues, computation of weights of focal tissues and features, normalization of weighted parameters, and computation of the Q-score | Accuracy: 98.07%
[48] | Selaroides leptolepis | Eyes, body | 160 images, divided into 80 fresh-class and 80 rotten-class images | k-NN and naïve Bayes classifiers | Average accuracy: 97%; Precision: 97%; Recall: 97%; Specificity: 97%; AUC: 97%
[49] | Euthynnus affinis (Tongkol Deho), Priacanthus tayenus (Manglah), Rastrelliger brachysoma (Solok), Scomber australasicus (Mackerel), Caranx melampygus (Kuwe Lilin), Nemipterus virgatus (Teribang), Rastrelliger kanagurta (Banyar), and Atule mate (Kolong) | Eyes and skin | N/A | Deep learning, Yolov4, Yolov4-tiny, mobile application | Accuracy: Yolov4 99.17%; Yolov4-tiny 97.25%
Table 7. Camera settings for tuna loin image/photo capture.

Distance to object | 10 cm
Grade/treatment | Fresh tuna loin, tuna loin CO, and tuna loin CS
Image quality | JPEG normal (8.6 MB) [2.3] K (good, basic normal)
Lens | DX VR (AF-P NIKKOR 18–55 mm, 1:3.5–5.6 G)
Touch shutter | OFF
Image size | Large (L)
Release mode | Continuous H
Focus mode | Single-servo AF (AF-S)
Flash mode | Auto
Resolution | 6000 × 4000
ISO | Automatic (ISO-A 6400)
Time setting | 2–20 s
Table 8. Detailed hyper-parameters used in our study.

Parameters | Value
Image size | 224 × 224
Color mode | RGB
Class mode | Categorical
Classes | {"No-Treatment": 0, "CO-Treatment": 1, "CS-Treatment": 2}
Batch size | 64
Epochs | 15
Rotation range | 90
Width shift range | 0.05
Height shift range | 0.05
Shear range | 0.05
Horizontal flip | True
Vertical flip | True
Optimizer | Adam
Brightness range | [0.75, 1.25]
Rescale | 1/255
Validation split | 0.2
Loss | Categorical cross-entropy
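The augmentation and training settings in Table 8 read like a standard Keras image pipeline, although the paper does not name its framework. The sketch below is a minimal, hypothetical reconstruction under that assumption; the dataset directory, the ResNet50 backbone, and all variable names are illustrative, not taken from the study.

```python
# Hypothetical Keras setup mirroring the Table 8 hyper-parameters (assumption:
# a directory "data/tuna_loin" with one sub-folder per treatment class).
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1.0 / 255,               # Rescale 1/255
    rotation_range=90,               # Rotation range 90
    width_shift_range=0.05,
    height_shift_range=0.05,
    shear_range=0.05,
    horizontal_flip=True,
    vertical_flip=True,
    brightness_range=[0.75, 1.25],
    validation_split=0.2,            # 80/20 train/validation split
)

common = dict(target_size=(224, 224), color_mode="rgb",
              class_mode="categorical", batch_size=64)
train_gen = datagen.flow_from_directory("data/tuna_loin", subset="training", **common)
val_gen = datagen.flow_from_directory("data/tuna_loin", subset="validation", **common)

# Example backbone (illustrative choice): ResNet50 with a 3-class softmax head.
base = tf.keras.applications.ResNet50(include_top=False, pooling="avg",
                                      input_shape=(224, 224, 3))
model = tf.keras.Sequential([base, tf.keras.layers.Dense(3, activation="softmax")])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(train_gen, validation_data=val_gen, epochs=15)
```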
Table 9. Accuracy, precision, recall, F1 score, ROC, AUC, and kappa score.

Data Set | Method | Accuracy | Precision | Recall | F1 Score | ROC (TPR) | ROC (FPR) | AUC | Kappa Score
No-Treatment | ResNet | 0.9375 | 0.9744 | 0.8636 | 0.9157 | 0.974 | 0.065 | 0.987 | 0.904
No-Treatment | DenseNet | 0.9554 | 0.9556 | 0.9773 | 0.9663 | 0.977 | 0.024 | 0.966 | 0.932
No-Treatment | Inception | 0.9107 | 0.8889 | 0.9091 | 0.8989 | 0.952 | 0.029 | 0.920 | 0.804
CO-Treatment | ResNet | 0.9375 | 0.8936 | 1.0000 | 0.9438 | 1.000 | 0.000 | 0.987 | 0.904
CO-Treatment | DenseNet | 0.9554 | 0.9318 | 0.9762 | 0.9535 | 0.953 | 0.022 | 0.966 | 0.932
CO-Treatment | Inception | 0.9107 | 0.9318 | 0.9762 | 0.9535 | 0.976 | 0.015 | 0.920 | 0.804
CS-Treatment | ResNet | 0.9375 | 0.9615 | 0.9615 | 0.9615 | 1.000 | 0.013 | 0.987 | 0.904
CS-Treatment | DenseNet | 0.9554 | 1.0000 | 0.8846 | 0.9388 | 0.958 | 0.046 | 0.966 | 0.932
CS-Treatment | Inception | 0.9107 | 0.9130 | 0.8077 | 0.8571 | 0.840 | 0.054 | 0.920 | 0.804
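The per-class figures in Table 9 correspond to standard multi-class classification metrics. As a reference only, values of this kind could be computed with scikit-learn as in the sketch below (assumption: the study does not specify its tooling; y_true and y_prob are placeholders, not the paper's test data).

```python
# Illustrative computation of accuracy, per-class precision/recall/F1,
# Cohen's kappa, and one-vs-rest AUC for the three treatment classes.
import numpy as np
from sklearn.metrics import (accuracy_score, classification_report,
                             cohen_kappa_score, roc_auc_score)

class_names = ["No-Treatment", "CO-Treatment", "CS-Treatment"]
y_true = np.array([0, 1, 2, 1, 0, 2])            # hypothetical true labels
y_prob = np.array([[0.90, 0.05, 0.05],           # hypothetical softmax outputs
                   [0.10, 0.80, 0.10],
                   [0.20, 0.10, 0.70],
                   [0.30, 0.60, 0.10],
                   [0.70, 0.20, 0.10],
                   [0.15, 0.25, 0.60]])
y_pred = y_prob.argmax(axis=1)

print("Accuracy:", accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=class_names))
print("Cohen's kappa:", cohen_kappa_score(y_true, y_pred))
print("Macro AUC (one-vs-rest):", roc_auc_score(y_true, y_prob, multi_class="ovr"))
```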
Table 10. Comparison of methods and classification accuracy for fish freshness.

Ref. | Method | Accuracy (%)
[16] | Convolutional neural network modeling | 74.2
[14] | Computer vision, machine learning | 100
[19] | Deep convolutional neural network | 99.5
[38] | CNN | 93
[40] | Inception V3 | 100
[20] | Convolutional neural network (VGG-16) | 98
[22] | Xception, MobileNet V1, ResNet 50, VGG 16 | Xception: 77; MobileNet V1: 81; ResNet 50: 87; VGG 16: 97
[59] | VGG16 | 81.9
[23] | MobileNet, ResNet50 | MobileNetV1: 63.21; ResNet50: 84.86
[24] | CNN, transfer learning with MobileNetV2 | Tuna: 97; Milkfish: 94; Mackerel: 93
[18] | Tiny Yolov2 | 57.5
[21] | Yolov5 | 95
[49] | Yolov4, Yolov4-tiny, mobile application | Yolov4: 99.17; Yolov4-tiny: 97.25
Table 11. Test results of correct data classification by model (ResNet).

Result | Actual Class | Predicted Class | Description
(sample image) | No-Treatment | No-Treatment | Succeeded
(sample image) | CO-Treatment | CO-Treatment | Succeeded
(sample image) | CS-Treatment | CS-Treatment | Succeeded

Table 12. Test results of incorrect data classification by model (ResNet).

Result | Actual Class | Predicted Class | Description
(sample image) | No-Treatment | CS-Treatment | Failed
(sample image) | CO-Treatment | CS-Treatment | Failed
(sample image) | CS-Treatment | No-Treatment | Failed
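Tables 11 and 12 report single-image field tests with the ResNet model. A hedged sketch of that inference step is given below (assumptions: a saved Keras model file named "tuna_cnn.h5" and the example image path are hypothetical; preprocessing mirrors the 224 × 224 resize and 1/255 rescale listed in Table 8).

```python
# Illustrative single-image scoring, as in the field tests of Tables 11 and 12.
import numpy as np
import tensorflow as tf

class_names = ["No-Treatment", "CO-Treatment", "CS-Treatment"]
model = tf.keras.models.load_model("tuna_cnn.h5")            # hypothetical saved model

img = tf.keras.utils.load_img("samples/loin_01.jpg", target_size=(224, 224))
x = tf.keras.utils.img_to_array(img) / 255.0                  # same rescaling as training
probs = model.predict(x[np.newaxis, ...])[0]

print("Predicted class:", class_names[int(probs.argmax())])
print({name: round(float(p), 3) for name, p in zip(class_names, probs)})
```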