Bruise damage in fruit is very common and is known to be one of the major contributors to degradation and loss of quality in fruit [1]. A wide range of preharvest and postharvest factors contributing to fruit bruising susceptibility and incidence have been reported [3]; however, bruise damage continues to occur due to the inherent susceptibility of produce and inadequate application of control measures. The presence of bruising leads to reduced market acceptability of fresh produce and to postharvest loss along the horticultural value chain due to downgrading or outright rejection, thereby contributing to wastage and associated negative socio-economic and environmental impacts [6]. Extensive research on assessment [4] and non-destructive detection [10] of bruise damage has been conducted over the past few decades, generating a certain level of understanding of the phenomenon and how to manage it [4]. Nevertheless, challenges remain, including that of effective and fast detection of such defects at early stages of development. Furthermore, bruised fruit undergo accelerated ripening and senescence, which necessitates further detection and control measures [12].
The term ‘early detection’ of bruise damage in fruit material relates to assessment carried out at early stages of bruise development. The term is somewhat controversial, since bruise development is benchmarked against visual assessment, whereby, against some background colors [1], bruises might be present, even externally, and yet not be noticeable. In many research reports, early bruise detection refers to an assessment conducted soon after (typically within a few hours of) bruise induction [13].
For example, in a study on detecting ‘early’ bruises using thermal imaging combined with hyperspectral imaging, a cylindrical weight of 0.2 kg was dropped from heights of 200 and 400 mm to induce bruise damage in peaches 1 h before the bruised samples were assessed [15]. Considering such a drop as nearly free, the energy used to bruise the peaches would range from 0.39 to 0.78 J. While investigating the detection of fresh bruises (2 h old) in apples using short wave infrared hyperspectral imaging, Keresztes et al. (2016) induced bruises by applying an impact energy of 0.41 J [17]. Elmasry et al. (2008) dropped a 250 g flat steel plate from a height of 10 cm to induce bruises in McIntosh apples, which corresponds to an estimated impact energy of approximately 0.25 J [18]. Recently, Li et al. (2018) developed a segmentation method to investigate the detection, based on hyperspectral images, of early (1 h old) bruises induced by applying an impact energy of approximately 0.49 J. In another study, on modelling apple bruise susceptibility under the influence of temperature, radius of curvature and acoustic stiffness, bruises were established at impact energies as low as 0.048 J and assessed after 48 h, though the focus was not on early bruises [19]. Since the manifestation of bruises depends on the type of fruit (attributes such as firmness, color, etc.), the energy applied to induce the damage and the time elapsed thereafter, it would make sense to add to the definition of ‘early bruise’ those bruises that have yet to manifest to visual inspection (latent bruises).
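The energy estimates quoted for the cited studies can be reproduced from the potential energy of the falling mass. The worked values below assume a frictionless free fall and g ≈ 9.81 m/s², neither of which is stated in the cited reports:

```latex
E = m\,g\,h
\qquad
\begin{aligned}
E_{200\,\mathrm{mm}} &= 0.2\,\mathrm{kg} \times 9.81\,\mathrm{m\,s^{-2}} \times 0.2\,\mathrm{m} \approx 0.39\,\mathrm{J},\\
E_{400\,\mathrm{mm}} &= 0.2\,\mathrm{kg} \times 9.81\,\mathrm{m\,s^{-2}} \times 0.4\,\mathrm{m} \approx 0.78\,\mathrm{J},\\
E_{\mathrm{plate}} &= 0.25\,\mathrm{kg} \times 9.81\,\mathrm{m\,s^{-2}} \times 0.1\,\mathrm{m} \approx 0.25\,\mathrm{J}.
\end{aligned}
```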
In all the above-mentioned cases where the non-destructive detection of early bruises in fruit was investigated, the bruising energy level was greater than 0.2 J and the definition of early bruises practically meant fresh bruises. However, it was found prior to this work that bruises could be produced in apples even at energy levels lower than those commonly used in research experiments, and that the damage could remain embedded in a latent state. In such a case, traditional methods of assessing bruise volume and susceptibility, which are mainly based on visual aspects, would be highly challenged and likely unreliable.
On the bright side, many instruments have shown potential for objectively detecting bruises, and near infrared (NIR) based techniques are at the forefront because they offer more convenience and flexibility, which is desirable for industrial application. NIR based techniques are already commercially available for quality screening of foodstuff. Additionally, developments in spectral data-driven solutions are ongoing, which are likely to help achieve the requirements for effective detection of various latent defects, such as damage and pathological infection, among others [20].
The hyperspectral imaging (HSI) technique has proven versatile in detecting defects in fruit and plant material in general [22], for it offers both imaging and spectroscopic capabilities. In particular, the use of HSI for early detection of bruise damage has seen progress, as per recent reports of application to peaches [14], mango [13] and apple [16], among others.
Therefore, an HSI system operating in the wavelength range of 930–2500 nm was used in this project with the aim of assessing the feasibility of detecting latent bruise damage.
In this work, early detection of bruises in apple fruit was investigated by monitoring fruit samples exposed to impact forces, before and until bruise marks become evident to visual inspection. ‘Golden Delicious’ apples were used since they provide a high visibility of surface injuries and therefore allow for good assessment of bruise manifestation.
The objectives were to determine the state of bruise latency as it relates to impact energy and proven by objective image processing and visualization, to assess the detectability of such latent defects using classification learners of HSI data and evaluate the performance of derived binary and quantitative bruise classification models.
Impact bruises were created on apples at energies below the lowest threshold found in the literature, resulting in soft damage undetectable by the naked eye. Visualization of the bruises at specific wavelengths highlighted their subdermal existence, and machine learning classification models were used to establish the feasibility of segregating bruised from healthy tissue to aid rapid detection for sorting and grading.
2. Materials and Methods
2.1. Fruit Material and Bruising Experiment
Fruit material was sourced from a local retail market for locally grown fruits. In total, 36 Golden Delicious apples, free from visible defects, were selected from a set of packs and split into two batches for use in calibration (Batch 1) and validation (Batch 2), respectively. Three apples were used per bruising level for each batch, thereby resulting in six bruises per level per batch at a given scanned time during the entire experimental period.
Bruises were created on two opposite sides of each fruit, in the middle area between the pedicel and peduncle, by dropping a steel ball (63.79 g) onto the fruit in the equatorial area, after which the bruised fruits were stored at room temperature (25 °C, 65% RH) for nearly an hour prior to image acquisition. The impactor was dropped from different heights ranging from 0.02 to 0.32 m to create levels of bruise severity, L1 to L6, as denoted in Table 1. Assuming the fall was nearly free, impact energies applied on the fruit surface were calculated according to the method in [26] and ranged approximately from 0.013 J to 0.200 J (see Table 1), with respect to drop heights. It should be stressed that the three lowest energy levels used (L1–L3) were lower than the lowest experimental impact energy (0.048 J) previously found in the literature [19], by nearly a quarter. In this manner, boundaries of bruising energy were pushed in order to create latent bruise damage.
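The reported energy range can be checked with a short calculation. The sketch below assumes the method in [26] reduces to the free-fall potential energy E = mgh with g = 9.81 m/s²; the function name is illustrative, and only the two end-point drop heights given in the text are used.

```python
G = 9.81                 # m/s^2, standard gravity (assumed value)
BALL_MASS_KG = 0.06379   # steel ball mass given in the text (63.79 g)


def impact_energy(mass_kg: float, drop_height_m: float, g: float = G) -> float:
    """Potential energy (J) released by a mass falling freely from drop_height_m."""
    if mass_kg < 0 or drop_height_m < 0:
        raise ValueError("mass and height must be non-negative")
    return mass_kg * g * drop_height_m


if __name__ == "__main__":
    # Lowest and highest drop heights stated in the text
    for h in (0.02, 0.32):
        print(f"h = {h:.2f} m -> E = {impact_energy(BALL_MASS_KG, h):.3f} J")
```

Under these assumptions the two end points round to 0.013 J and 0.200 J, in agreement with the range stated above.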
2.2. Hyperspectral Imaging System
An HSI unit from the Central Analytical Facility at Stellenbosch University was used for all image acquisitions. The system is equipped with a line scanning hyperspectral camera (HySpex SWIR-384 from HySpex, Oslo, Norway) operating in the short-wave infrared (SWIR) range, from 930 to 2500 nm, in steps of approximately 5.45 nm, which results in 288 contiguous spectral wavebands in total. A moving sample stage allows the sample speed to be varied, and an illumination unit consisting of two DC regulated light sources, capable of delivering up to 150 W each, provides a stable (0.1% regulation, 0.1% noise) light intensity. The light sources are positioned at two opposite sides, directed towards the sample, focusing the illumination onto a line overlapping with the camera field of view. System operation and image acquisition were carried out using ‘Breeze’ software (version 2019.1.0, Prediktera, Umeå, Sweden) installed on a computer running the Windows 10 operating system.
2.3. Image Acquisition
Prior to image acquisition, the system was set up as follows. The distance between sample and camera was set to 20.5 cm; the grey standard was fixed at 68 mm above the moving stage, in alignment with the sample surface; the integration time was set to 3000 μs and the saturation of the grey standard was set to 50%. The scanned length along the stage was 10 cm, with a reference collection time of 60 s. Images of bruised apples were acquired on fruits moving at a speed of 5.47 mm/s using a ‘HySpex SDK’ camera lens with 30 cm focal length and a 95 mm field of view. Image acquisition was performed at 100 frames/s, with 404 frames per image and 384 pixels per line, setting the hypercubes at 404 × 384 × 288 in size. Both bruised sides of the fruit samples were scanned separately and used as individual samples. In addition to image acquisition within the first hour of bruising, images were also taken after 6, 18, 48 and 72 h to record the temporal bruise development in fruits initially bruised at low impact energy, as presented in Table 1 under the “Times scanned” column. Given that the main focus of this work was to assess the feasibility of detecting latent bruises, and that at higher impact energies such as L5 and L6 bruises were already severe enough within a day after impact, the latter were scanned at fewer and earlier inspection times than the lower impact levels.
2.4. Image Analysis
Three-dimensional hyperspectral images were imported into Evince software (version 2.7.10, Prediktera, Sweden) for pre-processing and background removal. The background was removed by interactively separating (selecting, excluding and reconstructing) the background pixels from the fruit pixels in principal component analysis (PCA) based contour plots, applied to the hypercubes. Segmented images were exported to a format recognizable by MATLAB (version R2019a, Mathworks, Natick, MA, USA) for further processing and converted into MATLAB’s dataset objects (DSO) for subsequent use in unfolding 3D hypercubes into 2D data matrices, a technique of dimensionality reduction without loss of information [27]. The exported images were converted into hyperspectrograms for use in subsequent image reconstruction to visualize bruises at the latent stage. Image reconstruction was based on the individual principal components (PCs) generated to constitute the hyperspectrograms, using a MATLAB based conversion software tool (HyperspectrogramsGUI) available on request from the authors [28]. The hyperspectrograms method is useful in reducing the dimensionality of hyperspectral data, making it easier to handle the huge amounts of data typical of hyperspectral imaging [22]. In addition to data visualization using image reconstruction from hyperspectrograms, the image player of the MATLAB Image Processing Toolbox was used to visualize wavelength-specific images that best highlighted well contrasted bruises. Furthermore, regions of interest (ROIs) were extracted manually, comprising 186 bruised areas and 287 areas of unbruised tissue. From each region of interest the corresponding average spectrum was extracted for use in subsequent classification tasks.
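The unfolding, PC score-image reconstruction and ROI averaging described above can be illustrated with a few lines of array code. This is a minimal sketch in Python with NumPy, not the Evince/MATLAB/HyperspectrogramsGUI workflow used in this work; the function names, the SVD-based PCA and the synthetic cube are assumptions made for illustration only.

```python
import numpy as np


def unfold(cube: np.ndarray) -> np.ndarray:
    """Reshape a (rows, cols, bands) hypercube into a (rows*cols, bands) matrix."""
    rows, cols, bands = cube.shape
    return cube.reshape(rows * cols, bands)


def pca_score_image(cube: np.ndarray, component: int = 0) -> np.ndarray:
    """Score image of one principal component, refolded to (rows, cols)."""
    rows, cols, _ = cube.shape
    X = unfold(cube).astype(float)
    Xc = X - X.mean(axis=0)                 # mean-centre each waveband
    # SVD of the centred matrix: rows of Vt are the PC loadings
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[component]             # project every pixel onto the chosen PC
    return scores.reshape(rows, cols)


def roi_mean_spectrum(cube: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Average spectrum over the pixels where the boolean mask is True."""
    return cube[mask].mean(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cube = rng.random((40, 38, 288))        # small synthetic stand-in for a 404 x 384 x 288 cube
    mask = np.zeros((40, 38), dtype=bool)
    mask[10:20, 10:20] = True               # hypothetical manually drawn ROI
    print(unfold(cube).shape)
    print(pca_score_image(cube).shape)
    print(roi_mean_spectrum(cube, mask).shape)
```

Unfolding keeps every pixel spectrum as one row, so the original image can be recovered by reshaping; this is why a score vector computed on the 2D matrix can be refolded into an image for visual inspection.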
2.5. Bruise Detection
In order to evaluate the implementation of discerning bruises at their latent stage, three aspects were established. First, bruise detection models were evaluated for their performance at predicting latent bruises. Classification models that encompass all used levels of severity were trained and tested on predicting the status of latent bruises. The models detect differences between bruised and sound tissue and are tested on latent bruises to decide whether they are recognized as bruises or not.
A second aspect that was investigated is the effect of the temporal evolution of bruises on the performance of the detection models. The detection models, trained on all data, were tested on latent bruises at different scanned times.
Thirdly, a quantitative model that specifically considers bruise severity levels as categories was built in order to establish a framework for the identification of bruises quantitatively. Classification models were built to differentiate between these categories and were tested on samples at latent levels that were previously excluded from training.
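The second and third aspects amount to two bookkeeping steps: scoring a trained model separately for each scanned time, and keeping selected severity levels out of the training set. A minimal sketch follows; the record layout (`level`, `hours`, `label`, `spectrum`), the function names and the toy threshold model are all hypothetical and serve only to show the splitting logic.

```python
from collections import defaultdict


def split_by_level(samples, held_out_levels):
    """Split samples into (train, test) so held-out severity levels are never trained on."""
    train = [s for s in samples if s["level"] not in held_out_levels]
    test = [s for s in samples if s["level"] in held_out_levels]
    return train, test


def accuracy_by_scan_time(samples, predict):
    """Fraction of correct predictions, grouped by hours elapsed since bruising."""
    hits, totals = defaultdict(int), defaultdict(int)
    for s in samples:
        totals[s["hours"]] += 1
        hits[s["hours"]] += int(predict(s["spectrum"]) == s["label"])
    return {t: hits[t] / totals[t] for t in sorted(totals)}


if __name__ == "__main__":
    # Hypothetical records; real spectra would hold one value per waveband
    samples = [
        {"level": "L1", "hours": 1, "label": "bruised", "spectrum": [0.40]},
        {"level": "L1", "hours": 6, "label": "bruised", "spectrum": [0.55]},
        {"level": "L5", "hours": 1, "label": "bruised", "spectrum": [0.70]},
        {"level": "sound", "hours": 1, "label": "sound", "spectrum": [0.20]},
    ]
    train, test = split_by_level(samples, held_out_levels={"L1"})
    toy_model = lambda spectrum: "bruised" if spectrum[0] > 0.5 else "sound"
    print(len(train), len(test))
    print(accuracy_by_scan_time(test, toy_model))
```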
Classification models were built using learning algorithms available in the machine learning toolbox of MATLAB software, including decision trees, naïve Bayes, support vector machine, nearest neighbors, linear discriminant analysis and ensemble subspace discriminant algorithms. Only the best performing models were reported, some of which are described in the next section. Figure 1 summarizes the analytical workflow that was followed in a schematic diagram.
2.6. Description of Classification Learners
The Support Vector Machine (SVM) algorithm for classification seeks an optimal hyperplane, used as the decision function, that maximizes the margin between classes. The basic SVM deals with two-class situations, whereby the hyperplane separating the data is defined by its margin to the nearest data points, known as support vectors [29]. SVM has a reputation for being highly effective at avoiding overfitting, an issue typical of high-dimensionality data such as that generated by HSI systems. It is known for excellent performance in classification and prediction [30]. In multiclass situations, methods of combining multiple two-class SVMs are used. These methods include building all possible combinations of “one against one” two-category classification problems and using either a voting strategy or an acyclic graph to decide a sample’s class, or building all possible “one versus all the rest” two-category problems in training and applying a decision function to determine a new sample’s category [31]. In this work, SVM models were developed using a quadratic kernel function and a “one versus one” method.
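The “one versus one” voting strategy can be sketched independently of the SVM training itself. In the example below, `binary_predict` is a hypothetical stand-in for a trained two-class SVM, the nearest-centroid rule in the demo is purely illustrative, and the kernel offset `c = 1` is an assumption rather than a setting reported in this work.

```python
from collections import Counter
from itertools import combinations


def n_pairwise_classifiers(n_classes: int) -> int:
    """Number of binary problems in a one-versus-one scheme: k(k-1)/2."""
    return n_classes * (n_classes - 1) // 2


def quadratic_kernel(u, v, c: float = 1.0) -> float:
    """Polynomial kernel of degree 2: (u.v + c)^2."""
    return (sum(a * b for a, b in zip(u, v)) + c) ** 2


def one_vs_one_predict(x, classes, binary_predict):
    """Predict by majority vote over every pair of classes.

    binary_predict(x, a, b) must return either a or b.
    """
    votes = Counter(binary_predict(x, a, b) for a, b in combinations(classes, 2))
    # Ties are resolved in favour of the class that received a vote first
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical one-dimensional class centroids used by the stand-in binary rule
    centroids = {"sound": 0.2, "L1": 0.5, "L2": 0.8}

    def nearest_of_two(x, a, b):
        return a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b

    print(one_vs_one_predict(0.55, list(centroids), nearest_of_two))
    print(n_pairwise_classifiers(6))
```

Because the number of binary models grows as k(k−1)/2, a six-class problem already requires 15 pairwise classifiers, which is one reason prediction cost rises with the number of categories.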
In k-Nearest Neighbor (kNN), classification is performed by computing the Euclidean distance between an unknown sample and each of the samples in the training set. Once the k nearest ones are found, the unknown is assigned to the class that has the most members among these neighbors [32]. A nonlinear classification algorithm, kNN is known to perform reasonably well in multiclass problems [34]. In this work, preference was given to the “fine kNN” (FKNN) algorithm, using 1 neighbor and the Euclidean distance metric with equal distance weights.
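The kNN rule described above fits in a few lines. The sketch below is a plain-Python illustration with equal weights and Euclidean distance, not the MATLAB implementation used in this work; the function names and the toy data are assumptions.

```python
import math
from collections import Counter


def euclidean(u, v) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def knn_predict(x, train_X, train_y, k: int = 1):
    """Classify x by an equal-weight majority vote among its k nearest training samples."""
    order = sorted(range(len(train_X)), key=lambda i: euclidean(x, train_X[i]))
    nearest = [train_y[i] for i in order[:k]]
    # Ties go to the label of the closest neighbour among the tied classes
    return Counter(nearest).most_common(1)[0][0]


if __name__ == "__main__":
    # Toy two-feature data; real inputs would be mean ROI spectra
    train_X = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    train_y = ["sound", "sound", "bruised"]
    print(knn_predict([4.0, 4.0], train_X, train_y, k=1))
```

With k = 1, as in the “fine kNN” setting, the prediction is simply the label of the single closest training spectrum, which makes the model sensitive to local structure and to noisy samples.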
The linear discriminant analysis (LDA) method employs the Mahalanobis distance to estimate linear decision boundaries, which are defined in order to maximize the ratio of between-class to within-class dispersion [35], under the assumption that the variance-covariance matrices of the classes are equal [36]. LDA is a popular method in chemometrics for its effectiveness in pattern recognition in multivariate data analysis.
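The role of the Mahalanobis distance and of the equal-covariance assumption can be made concrete with a small sketch. The code below assumes equal class priors and uses a pseudo-inverse so that it still runs when the pooled covariance is singular (as can happen with 288 wavebands and few samples); both choices, and the function names, are assumptions for illustration rather than the settings used in this work.

```python
import numpy as np


def fit_lda(X, y):
    """Estimate class means and the inverse of the pooled within-class covariance."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    classes = np.unique(y)
    means = {c: X[y == c].mean(axis=0) for c in classes}
    # Pooling the residuals of all classes encodes the equal-covariance assumption
    resid = np.vstack([X[y == c] - means[c] for c in classes])
    pooled = resid.T @ resid / (len(X) - len(classes))
    return means, np.linalg.pinv(pooled)


def predict_lda(x, means, inv_cov):
    """Assign x to the class whose mean is nearest in squared Mahalanobis distance."""
    x = np.asarray(x, dtype=float)

    def d2(m):
        diff = x - m
        return float(diff @ inv_cov @ diff)

    return min(means, key=lambda c: d2(means[c]))


if __name__ == "__main__":
    # Toy two-feature data; "s" = sound, "b" = bruised
    X = [[0, 0], [1, 0], [0, 1], [1, 1], [5, 5], [6, 5], [5, 6], [6, 6]]
    y = ["s"] * 4 + ["b"] * 4
    means, inv_cov = fit_lda(X, y)
    print(predict_lda([1, 2], means, inv_cov))
```

Since every class shares the same covariance matrix, comparing Mahalanobis distances to the class means yields decision boundaries that are linear in the features.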
Ensemble predictors combine results from many base learners, also referred to as ‘weak learners’, into one of higher performance, using methods such as bagging, subspace, boosting, etc. In Ensemble Subspace Discriminant (ESD), discriminant learners are trained on random selections of features (subspaces), and a majority voting rule between the predictors is used to adopt the classification result [37]. The ESD-based classification in this work used 30 learners and a subspace dimension of 128.
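The random subspace idea with majority voting can be sketched as follows. To keep the example short and dependency-free, a nearest-class-mean rule is used as a stand-in for the discriminant base learner, so this is an illustration of the ensemble mechanism only and not of ESD as implemented in MATLAB; the function names and the seed are likewise assumptions. The defaults mirror the 30 learners and subspace dimension of 128 stated above.

```python
import random
from collections import Counter


def fit_subspace_ensemble(X, y, n_learners=30, subspace_dim=128, seed=0):
    """Train n_learners nearest-class-mean models, each on a random feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_learners):
        feats = rng.sample(range(n_features), subspace_dim)  # random subspace
        means = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            means[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]
        ensemble.append((feats, means))
    return ensemble


def predict_subspace_ensemble(x, ensemble):
    """Majority vote over the base learners, each seeing only its own features."""
    votes = Counter()
    for feats, means in ensemble:
        sub = [x[f] for f in feats]
        label = min(
            means,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(sub, means[c])),
        )
        votes[label] += 1
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy spectra with 288 identical band values per sample
    X = [[0.0] * 288, [0.1] * 288, [1.0] * 288, [0.9] * 288]
    y = ["sound", "sound", "bruised", "bruised"]
    ens = fit_subspace_ensemble(X, y)
    print(predict_subspace_ensemble([0.95] * 288, ens))
```

Each learner sees only 128 of the 288 wavebands, which decorrelates the base models and keeps each one cheaper to fit, at the cost of evaluating all 30 at prediction time.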
Successful applications of approaches that used the spectral average of ROIs were reported in previous studies of bruise detection [1]. The models developed in this work are based on spectral data, on the premise that the bruised region has already been located on the fruit. This work extended such application to the detection of latent bruises in particular, and explored the possibility of implementing a quantitative prediction model for bruise damage. In an industrial application scenario, it would be convenient to detect bruised regions using methods such as watershed segmentation [14], other thresholding methods [40] or pixel-based bruise segregation [17], prior to evaluating the severity of the defects as demonstrated in this work. However, these bruise segregation methods are yet to be tested on latent bruises. It is also worth mentioning that for methods of locating bruised regions that do not take the spectral dimension of hypercubes into account, such a task would be highly challenging, or at least computationally costly; thus a quick determination of optimal wavelengths for visualizing latent bruises, as in this work, would be very useful. Alternatively, methods of image synthesis [43] can be useful in highlighting faint features such as latent bruises and in enhancing the efficiency of search algorithms.
The analytical workflow used in this work combined various software applications. However, in real life applications it would be advantageous to work with a single end-to-end software platform, which would combine search and segregation of regions of interest (e.g., region with defect), extraction of features (e.g., spectral data) and prediction of the state and/or severity of the bruises. Such end-to-end solutions are unknown to the authors and likely do not currently exist. Future work should put effort into developing such solutions. The use of deep learning methods [44] is one approach that could enable compact workflows for detecting defects on whole images of fruit, requiring fewer or no steps of prior ROI selection than in this study, and it is intended to explore this in future work. It is important to note that deep learning methods require extensive computational infrastructure and much larger image datasets than were used in this work.
Results showed that the behavior of classifiers in two-class problems for bruise detection differed from that in multi-class problems for quantitative models. FKNN was the fastest in quantitative models, whereas SVM was the fastest in detection tasks. Though SVM achieved the best performance in terms of accuracy and prediction speed for detection tasks, it did not perform well for quantitative models. However, it was observed that when the number of response classes (31 groups) was reduced to nearly half (14 groups), the performance of SVM models improved, which suggests that the lower the number of categories in a classification task, the better SVM is likely to perform. The results suggest that SVM is recommendable for latent bruise detection tasks, while for quantitative models ESD and LDA would be a better fit to achieve high accuracy, albeit at a loss in prediction speed.
This lower prediction speed, especially for ESD, is only relative to the other tested algorithms. In applications where such speed is crucial, such as inline sorting, the speed requirements can still be met by the algorithms. For example, in a typical fruit sorting system, speeds from 5 samples/s [17] up to 10 samples/s (about 1 to 2 m/s) in commercial sorting lines [45] are used, while in-field defect detection has been proven feasible at a speed of 0.5 m/s [46]. Within such speed requirements one has to account for all the analytical steps of a detection task, such as data (image) acquisition, processing, feature extraction, ROI selection and classification. The speeds reported here (>400 samples/s) relate only to, and are sufficient for, the classification part. The analytical steps prior to classification have been shown to be achievable using some versions of scanning systems, and researchers [46] have reported with confidence that current technical advancements will help meet such requirements, and more, for the agricultural and food industry.
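The margin left for the remaining analytical steps can be estimated with simple arithmetic. The sketch below takes the 10 samples/s line rate and the 400 samples/s classification rate from the text; treating 400 samples/s as the worst case, and the function names, are assumptions made for illustration.

```python
def time_budget_ms(samples_per_second: float) -> float:
    """Time available per sample (ms) at a given sorting-line throughput."""
    return 1000.0 / samples_per_second


def classification_share(line_rate: float, classifier_rate: float) -> float:
    """Fraction of the per-sample time budget consumed by classification alone."""
    return line_rate / classifier_rate


if __name__ == "__main__":
    line_rate = 10.0         # samples/s, upper commercial line speed quoted in the text
    classifier_rate = 400.0  # samples/s, lower bound of the reported prediction speed
    budget = time_budget_ms(line_rate)
    share = classification_share(line_rate, classifier_rate)
    print(f"budget per sample: {budget:.1f} ms")
    print(f"classification uses at most {share:.1%} of it")
```

Under these assumptions classification takes no more than 2.5 ms of a 100 ms budget per fruit, leaving the bulk of the time for acquisition, processing and ROI selection.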
Bruise severity modelling is an important prerequisite for tasks of grading fruit based on level of damage. However, it is not commonly reported on in postharvest research. The modelling approach presented here can be adopted for the detection of various other defects, especially for severity estimation models. Although this work was based on an imaging system, the application of this methodology has extensive relevance to spectroscopic systems as well.
In this work, a single background (bright green) was considered, but future work should expand the training dataset to include other aspects, such as various backgrounds and catching angles for light during scanning. The findings show that latent bruises can be accurately detected using hyperspectral imaging data after quick search and determination of regions of interest (bruised area).
One important aspect of machine learning models that is required for successful application is the ability of the models to generalize. To ensure this, calibration models were tested on new, unseen samples, initially set aside from the overall dataset; however, there is room for improvement, and it is suggested that future work should consider data augmentation techniques such as in [48] to further improve the generalization ability of models such as those developed in this study.