1. Introduction
The dimensions (length, diameter, area, etc.) and shape characteristics of agricultural products are significant for the management of product processing, quality control, classification, storage, packaging, and marketing processes [
1]. In machine vision systems utilized for automatic fruit classification, the color and size characteristics of the product are employed as references for optimization studies. The projection area, which defines the surface area of the fruit in a two-dimensional plane, is a feature that determines the accuracy of the robotic harvesting system in recognizing the object [
2]. The surface area of agricultural products is an important measure for determining the energy required by the drying process [
3].
In the context of the food industry and the expectations of consumers, the classification of products according to their size and shape is imperative in order to satisfy market demands. The implementation of various image-processing algorithms is crucial to ensure the reliability and efficiency of the classification process [
4]. The objective of these algorithms is to enhance automated classification systems that utilize the size and shape characteristics of products. In the currently available literature, a range of image-processing algorithms have been employed for the identification of the size and shape of onions [
5], kiwis [
6] and pomegranates [
7]. The divergent morphologies exhibited by diverse agricultural products necessitate the optimization of these algorithms on a per-product basis [
8,
9].
Strawberries (
Fragaria × ananassa Duch.) exhibit considerable morphological variability in size and shape, which directly affects their industrial processing and commercial classification. In recent years, numerous studies have focused not only on the physical characterization of strawberries but also on their biochemical composition and postharvest utilization potential. The fruit and its by-products are rich sources of bioactive compounds, including anthocyanins, ellagitannins, flavanols, and phenolic acids, whose concentration and profile vary among cultivars, ripening stages, and growing conditions [
10]. However, despite the increasing use of strawberries in the production of jams, juices, purees, dried and frozen products, and as flavoring or coloring agents in the dairy and confectionery industries, comprehensive data correlating morphological attributes with the retention and recovery of these functional compounds during processing are still lacking. Recent findings have also demonstrated that agro-industrial strawberry by-products (such as calyx, stems, and unmarketable fruit portions) contain higher phenolic contents and antioxidant capacities than the edible parts, suggesting an underexploited potential for revalorization in food, nutraceutical, and cosmetic formulations [
11]. Nonetheless, the stability of these compounds under industrial conditions and the optimization of extraction methods remain major challenges, highlighting the need for integrated studies linking fruit geometry, compositional variability, and processing performance.
Numerous studies have examined the influence of cultivation conditions on the morphological characteristics of strawberries, including fruit length, diameter, and weight, which are critical indicators of external quality and market value. Güzdüz and Özdemir [
12] reported that environmental conditions significantly influenced fruit size, with the Camarosa cultivar producing larger fruits under greenhouse conditions, while open-field cultivation increased dimensional variability. In a similar vein, Sarıdaş et al. [
13] demonstrated that irrigation regimes influenced fruit morphology, where moderate water restriction reduced fruit length and mass but improved uniformity under controlled irrigation. In a recent study, Öztürk Erdem [
14] examined the impact of varying spring planting periods on the pomological and phytochemical characteristics of four strawberry varieties (Albion, Monterey, Portola, and Pineberry). The study revealed that planting time significantly influenced fruit weight and size, with the Albion and Monterey cultivars demonstrating superior performance in mid-spring plantings. Collectively, these studies indicate that environmental and management factors play a decisive role in shaping strawberry morphology. However, the number of studies that specifically and quantitatively assess shape and dimensional attributes under varying cultivation conditions remains limited, highlighting a research gap that warrants further investigation into the morphological determinants of fruit development.
In this study, the size and shape parameters of strawberry fruit are examined within a broad framework. The research aims to generate data for product processing technologies, machine systems, or robotic systems built on deep learning algorithms, and to facilitate the design and development of new systems. Specifically, it seeks to identify the limits of size variation for standard varieties, to determine the number of shape groups, and to reveal the shape factors that produce differences in contour geometry.
2. Materials and Methods
2.1. Strawberry Varieties
The strawberry varieties Albion, Monterey, and San Andreas were used in the present study because they are day-neutral. The fruits were obtained from the greenhouses of the Agricultural Application and Research Centre at Bilecik Şeyh Edebali University (40°06′48.3″ North, 30°00′08.4″ East) [
14].
2.2. Imaging of Strawberry Samples
After harvesting, the strawberry samples were stored in PET (polyethylene terephthalate) containers and transported to the imaging laboratory on the same day. The leaves and stems of each sample and any other foreign plant matter were carefully removed, and damaged fruits were discarded. Approximately 450 fruit samples (33% Albion, 29% Monterey, 37% San Andreas) were collected during the one-week harvesting period, and their mass was measured on a precision balance (±0.01 g accuracy). The samples were then imaged on a bottom-illuminated fiberglass surface, as illustrated in
Figure 1.
2.3. Vision System and Image Acquisition Setup
The images of the strawberry samples were obtained by means of a digital camera (the Panasonic Lumix DMC-FZ50, manufactured by Panasonic Corp., Osaka, Japan) that was equipped with a 1/1.8″ CCD sensor, which provided an effective resolution of 10.1 megapixels (maximum image size: 3648 × 2736 pixels). The lens employed in this study was the built-in LEICA DC VARIO-ELMARIT lens (Leica Camera AG, Wetzlar, Germany), which has a focal range of 7.4–88.8 mm (equivalent to 35–420 mm in the 35 mm format) and an aperture range of f/2.8–3.7. The lens was utilized at a fixed focal length of 35 mm with the objective of minimizing perspective distortion. The camera was mounted in a vertical orientation at a fixed height of 55 cm above a transparent fiberglass plate. This ensured that the optical axis was perpendicular, thus facilitating geometric consistency among all samples.
The illumination was provided from below by a uniform diffuse LED light source integrated into the imaging platform. This bottom-up configuration minimized shadow formation and reflections on the strawberry surface. As contour detection was based on binary threshold segmentation, minor variations in light intensity did not influence the extracted fruit boundaries, so the illumination system ensured stable and reproducible imaging conditions for accurate geometric analysis. All images were captured in JPEG format (4:3 aspect ratio) with constant exposure parameters (ISO 100, aperture f/4.0, shutter speed 1/125 s) under darkened laboratory conditions that eliminated ambient light. The built-in MEGA Optical Image Stabilization (O.I.S.) system was activated, and each image was triggered remotely via a wired port connection to eliminate hand-induced vibrations during capture. A millimeter-scale calibration ruler was placed in each image to establish the relationship between pixel and millimeter dimensions, ensuring accurate geometric scaling for subsequent morphometric analyses.
The camera was mounted vertically at a constant height, and all samples were positioned within the central field of view to minimize radial distortion. The calibration ruler placed in each image served not only as a scale reference but also as a check on geometric consistency, confirming that optical distortion within the imaging area was negligible.
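The pixel-to-millimeter scaling described above can be sketched as follows. This is a minimal illustration, not the procedure built into the software used in the study; the function names and the example ruler coordinates are hypothetical.

```python
import math


def mm_per_pixel(ruler_length_mm, start_px, end_px):
    """Scale factor from two pixel coordinates that mark a known ruler length.

    start_px and end_px are (x, y) pixel positions picked on the calibration
    ruler (hypothetical inputs; the study does not list such coordinates).
    """
    pixel_distance = math.hypot(end_px[0] - start_px[0], end_px[1] - start_px[1])
    if pixel_distance == 0:
        raise ValueError("ruler end points must differ")
    return ruler_length_mm / pixel_distance


def length_to_mm(length_px, scale):
    """Convert a length (perimeter, Feret diameter, ...) from pixels to mm."""
    return length_px * scale


def area_to_mm2(area_px, scale):
    """Convert an area from square pixels to mm^2 (the scale enters squared)."""
    return area_px * scale ** 2


# Example with made-up numbers: a 50 mm ruler segment spanning 500 pixels.
scale = mm_per_pixel(50.0, (0, 0), (300, 400))
projection_area_mm2 = area_to_mm2(10000, scale)
```

Lengths scale linearly with the factor and areas scale with its square, which is why the projection area is more sensitive to calibration error than the length or perimeter.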
2.4. Image Processing
The digital files containing the images of the strawberries were subjected to image processing using ImageJ v.1.38x software [
15]. The image-processing operations conducted at this stage are shown in sequence in
Figure 2.
2.5. Size and Shape Characteristics
The length (L, mm) and width (W, mm) of the strawberry fruit were measured in the horizontal orientation, while the thickness (T, mm) was measured in the vertical orientation (Figure 3). The aspect ratio (E) of the fruit was calculated automatically in each orientation by dividing the maximum length value by the minimum length value. The area enclosed by the fruit contour was measured and defined as the projection area (PA, mm²). The perimeter of the fruit (P, mm) was measured as the length of the closed contour curve. The Feret diameter (FD, mm) was measured as the maximum distance between two points of the projection area. The roundness (R) value was calculated automatically by the software. Shapes with an R value of 1 are perfect circles, whilst all other shapes take values greater than 1 [16].
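The contour-based measurements in this subsection can be illustrated with a short sketch that works on a closed polygon of (x, y) points. The roundness formula is not reproduced in this text, so the form used here, R = P² / (4π·PA), is an assumption chosen because it equals 1 for a circle and exceeds 1 for every other shape, as stated above. The function name is hypothetical, and the software may compute these quantities differently.

```python
import math


def contour_descriptors(points):
    """Projection area, perimeter, Feret diameter and roundness of a closed polygon.

    points: list of (x, y) contour coordinates in mm, in drawing order.
    """
    n = len(points)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]          # wrap around to close the contour
        twice_area += x0 * y1 - x1 * y0       # shoelace formula
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(twice_area) / 2.0
    # Feret diameter: largest distance between any two contour points.
    feret = max(math.dist(p, q) for p in points for q in points)
    # Assumed roundness definition (inverse circularity): 1 for a circle, >1 otherwise.
    roundness = perimeter ** 2 / (4.0 * math.pi * area)
    return {"PA": area, "P": perimeter, "FD": feret, "R": roundness}


# Example: a 10 mm x 10 mm square outline.
square = contour_descriptors([(0, 0), (10, 0), (10, 10), (0, 10)])
```

The pairwise Feret search is quadratic in the number of contour points, which is acceptable for a sketch; a convex-hull step would be the usual refinement for dense contours.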
The three-dimensional parameters of the strawberry fruit were calculated from the measurements obtained from the images. The geometric mean diameter (Dg, mm), surface area (SA, cm²), and sphericity (Φ, %) were calculated using Equations (1), (2) and (3), respectively [8,9].
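Equations (1) to (3) are not reproduced in this text. The sketch below assumes the forms commonly used for agricultural products, Dg = (L·W·T)^(1/3), SA = π·Dg², and Φ = (Dg / L) × 100; these are an assumption and may differ from the cited equations. The function name is hypothetical.

```python
import math


def spatial_parameters(length_mm, width_mm, thickness_mm):
    """Geometric mean diameter (mm), surface area (cm^2) and sphericity (%).

    Assumed forms (not taken from the source equations):
      Dg  = (L * W * T) ** (1/3)
      SA  = pi * Dg ** 2          (mm^2, divided by 100 to give cm^2)
      Phi = Dg / L * 100
    """
    dg = (length_mm * width_mm * thickness_mm) ** (1.0 / 3.0)
    surface_area_cm2 = math.pi * dg ** 2 / 100.0
    sphericity = dg / length_mm * 100.0
    return dg, surface_area_cm2, sphericity


# Example with made-up dimensions of 40 x 30 x 28 mm.
dg, sa, phi = spatial_parameters(40.0, 30.0, 28.0)
```

Under these assumed forms a fruit with equal length, width, and thickness has a sphericity of 100%, and elongated fruits fall below that value.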
2.6. Elliptic Fourier Analysis
Elliptic Fourier analysis (EFA) was performed to reveal the morphological differences between strawberry samples, regardless of size. The outer contour of each fruit was modeled as a closed curve, and 24-bit BMP digital files were used so that the resulting curves could be compared between samples. Analyses were performed using the SHAPE (v. 1.03) software package [17]. The analysis comprised four steps: digitizing the closed curve, determining the x and y coordinates of each point on the curve, converting the coordinate values into a mathematical function, and obtaining the function coefficients. Twenty harmonics were used to generate these coefficients [18]. The EFA procedures were performed using the ChainCoder, Chc2Nef, PrinComp, and PrinPrint modules of the software. With these modules, the color images were converted to monochrome and, following noise reduction, the shape contours were coded. The contour codes were then normalized to obtain EF descriptors. Principal component analysis (PCA) was applied to the normalized shape descriptors, and the shape variations in the fruit contours were visualized. PCA thus identified the components that explain most of the shape variation, and component scores for 422 fruit samples were obtained for each component.
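The step of converting contour coordinates into function coefficients can be sketched with the standard elliptic Fourier formulation for a closed polygonal contour. This is an illustrative implementation under that assumption, not the code of the SHAPE package, and it omits the normalization for size, rotation, and starting point that the package applies. The function name is hypothetical.

```python
import math


def elliptic_fourier(points, harmonics=20):
    """Elliptic Fourier coefficients (a_n, b_n, c_n, d_n) of a closed contour.

    points: list of (x, y) contour coordinates in drawing order.
    Returns one 4-tuple per harmonic, n = 1 .. harmonics.
    """
    count = len(points)
    # Steps between consecutive points, wrapping around to close the contour.
    dx = [points[(i + 1) % count][0] - points[i][0] for i in range(count)]
    dy = [points[(i + 1) % count][1] - points[i][1] for i in range(count)]
    dt = [math.hypot(u, v) for u, v in zip(dx, dy)]
    # Cumulative arc length at each point; t[-1] is the total contour length.
    t = [0.0]
    for step in dt:
        t.append(t[-1] + step)
    total = t[-1]

    coefficients = []
    for n in range(1, harmonics + 1):
        k = 2.0 * math.pi * n / total
        scale = total / (2.0 * n * n * math.pi ** 2)
        a = b = c = d = 0.0
        for i in range(count):
            if dt[i] == 0.0:
                continue                      # skip repeated points
            dcos = math.cos(k * t[i + 1]) - math.cos(k * t[i])
            dsin = math.sin(k * t[i + 1]) - math.sin(k * t[i])
            a += dx[i] / dt[i] * dcos
            b += dx[i] / dt[i] * dsin
            c += dy[i] / dt[i] * dcos
            d += dy[i] / dt[i] * dsin
        coefficients.append((scale * a, scale * b, scale * c, scale * d))
    return coefficients


# Example: a circle of radius 5 sampled at 360 points.
circle = [(5 * math.cos(2 * math.pi * i / 360), 5 * math.sin(2 * math.pi * i / 360))
          for i in range(360)]
efd = elliptic_fourier(circle, harmonics=20)
```

For a circle, almost all of the signal sits in the first harmonic, which describes the best-fitting ellipse; the higher harmonics carry the finer contour detail that the subsequent PCA summarizes.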
2.7. Statistical Analysis
All data pertaining to the size and shape characteristics of strawberries were recorded in MS Excel 2021, and histograms were created by classifying each variable into its own categories. Furthermore, a macro in the software was used to calculate the arithmetic mean, standard deviation, median, minimum and maximum values, and the skewness and kurtosis statistics of the data. The histograms and statistics were compared in order to evaluate the data distribution. Distributions with skewness and kurtosis coefficients between −1 and +1 are considered to be normal. Distributions with a median equal to the mean are deemed to be symmetrical. Those with a median smaller than the mean are recognized as right-skewed (positive), and those with a median larger than the mean are recognized as left-skewed (negative). A positive kurtosis coefficient indicates a distribution that is more peaked than the normal distribution, whilst a negative coefficient indicates a distribution that is flatter [
19,
20]. In order to ascertain the variation in size and shape characteristics according to horizontal and vertical orientation, a one-way analysis of variance (ANOVA) was applied to all data.
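The distribution checks described in this subsection can be sketched as below. The formulas of the macro used in the study are not given, so this sketch assumes the bias-corrected sample skewness and excess kurtosis that spreadsheet SKEW and KURT functions compute; the function name and the example data are hypothetical.

```python
import statistics


def describe(values):
    """Mean, median, sample skewness, excess kurtosis and the rules of thumb above.

    Requires at least four observations with non-zero spread.
    """
    n = len(values)
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)             # sample standard deviation (n - 1)
    z = [(v - mean) / sd for v in values]
    skewness = n / ((n - 1) * (n - 2)) * sum(t ** 3 for t in z)
    kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(t ** 4 for t in z)
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    if median < mean:
        direction = "right-skewed"            # median smaller than the mean
    elif median > mean:
        direction = "left-skewed"             # median larger than the mean
    else:
        direction = "symmetrical"
    return {
        "mean": mean,
        "median": median,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "direction": direction,
        # Rule of thumb from the text: both coefficients within -1 .. +1.
        "near_normal": abs(skewness) <= 1 and abs(kurtosis) <= 1,
    }


# Example with made-up projection areas (mm^2) containing one large fruit.
summary = describe([310, 330, 350, 360, 400, 420, 980])
```

In the made-up example the single large value pulls the mean above the median, so the data are flagged as right-skewed.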
The analysis of EF revealed the proportion of total variance explained by the principal components that best account for the morphological differences among strawberry fruits. The k-means algorithm was utilized for the purpose of grouping the fruits according to their morphological characteristics. The number of groups was determined on the basis of the number of principal components explaining the highest proportion of variance. Discriminant analysis was performed in order to ascertain the efficacy of the principal components in distinguishing between the fruit groups. To this end, the groups were categorized according to shape and assigned as the dependent variable, whilst the scores for each principal component were assigned as the independent variable. The analysis demonstrated the efficacy of the principal components in classifying strawberries according to their shape. Statistical analyses were conducted utilizing SPSS 20.0 software.
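The grouping step can be illustrated with a minimal k-means sketch operating on principal component scores. This is not the SPSS implementation used in the study: the farthest-point initialization, the function name, and the example scores are assumptions made for the illustration.

```python
import math


def kmeans(scores, k, max_iterations=100):
    """Group score vectors into k clusters with Lloyd's algorithm.

    scores: list of equal-length tuples (for example, PC1..PC7 scores per fruit).
    Returns (labels, centers).
    """
    # Deterministic farthest-point initialization (an assumption of this sketch).
    centers = [tuple(scores[0])]
    while len(centers) < k:
        centers.append(tuple(max(
            scores, key=lambda p: min(math.dist(p, c) for c in centers))))

    labels = None
    for _ in range(max_iterations):
        # Assignment step: each sample goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in scores]
        if new_labels == labels:
            break                              # assignments stable: converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(scores, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Example with made-up two-component scores forming three loose groups.
example_scores = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
                  (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
                  (-5.0, 5.0), (-5.1, 5.0)]
labels, centers = kmeans(example_scores, k=3)
```

The resulting labels play the role of the shape groups, which the discriminant analysis then treats as the dependent variable with the component scores as predictors.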
3. Results
As demonstrated in
Table 1, the min-max values indicate a considerable degree of variation in terms of size among the strawberries. In contrast, the variation in shape data is confined to a more limited range when compared with the size data. The findings of the variance analysis demonstrate that the means obtained in the horizontal and vertical orientations are different (
p < 0.01). The mean values for length and projection area obtained from horizontal measurements are greater than those obtained from vertical measurements. Because R equals 1 for a perfect circle and increases for other shapes, the higher roundness values and width-to-length ratios in the horizontal plane indicate a less circular outline.
Because the median of most of the size and shape data is lower than the mean, the data distributions are skewed to the right (positive skewness). The skewness coefficient closest to zero is obtained in the sphericity dataset, while the greatest skewness occurs in the surface area data. According to the kurtosis coefficient, the length dataset resembles a normal distribution. The surface area data, which have the largest kurtosis coefficient, are more peaked than the normal distribution.
Positive skewness in the size and shape data is evident in the histogram graphs in
Figure 4. The size data indicate that at least 80% of the strawberry samples had a projection area between 200 and 1000 mm². Their feret diameter and length ranged between 20 and 40 mm, while their perimeter ranged between 60 and 120 mm. In the morphological evaluation, the roundness value of 80% of the strawberry samples was within the range of 1.00–1.12, and the width-to-length ratio was between 1.00 and 1.30.
As demonstrated in
Table 2, a strong positive correlation is evident between the projection area and other size variables. While the relationship between size and shape variables is statistically significant, the correlation coefficient is low.
As illustrated in
Table 3, 90.77% of the shape variation in strawberry fruit is explained by the first seven principal components, for which the contour differences are visualized. In standard strawberry varieties, the average shape is spherical–conical, and the contours are grouped according to the change within the ±2 SD range. PC1 carries the greatest share of the shape variation and covers contours ranging from long-conical to spherical geometry. The variation in the other components can be described similarly: for PC2, within the conical–pointed oval range; for PC3, within the oval–conical range; for PC4, within the range of asymmetrical spherical–cone appearances; for PC5, within the range of pointed (necked) conical–cylindrical base appearances; and for PC6 and PC7, within the range of asymmetrical conical appearances.
As demonstrated in
Figure 5, the k-means algorithm was used to categorize the strawberry images into seven distinct groups. The images within each group exhibit notable similarities. Group 5 demonstrates a high degree of similarity in contour geometry, while Group 7 exhibits the least overlap with Group 5.
Table 4 summarizes the effect of the independent variables, namely the first seven principal components (PC1–PC7), on the categorization of fruit shape. The components in the PC1–PC5 range were found to be highly significant (
p < 0.01) determinants. Discriminant analysis was therefore performed using PC1–PC5 as independent variables, and the contribution of the resulting five discriminant functions to explaining shape variation was examined. The eigenvalue statistics showed that the first function accounted for the largest share of the variance, with the shares of the remaining functions decreasing in order. The variance explained by the fifth function was comparatively low. The canonical correlation coefficient of the first function is close to 1, which suggests that it has strong power in differentiating fruit contours. Wilks’ Lambda statistics show the unexplained portion of the variance [
21]. Therefore, when all functions are considered together, the unexplained variance is 2.4%, which is low. Conversely, the fifth function alone does not have a significant effect on differentiating shape differences. The structure matrix gives the correlations between the independent variables and the discriminant functions; variables with high correlations contribute more to the relevant function and are therefore more effective discriminators. The discriminant functions assigned the strawberry fruits to their own shape groups with 94.1% accuracy.
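The relationship between the eigenvalues of the discriminant functions, their canonical correlations, and Wilks' Lambda can be sketched with the standard textbook identities. The eigenvalues below are made-up numbers for illustration and are not the values of Table 4; the function names are hypothetical.

```python
import math


def variance_shares(eigenvalues):
    """Proportion of between-group variance carried by each discriminant function."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def canonical_correlation(eigenvalue):
    """Canonical correlation of one function: sqrt(lambda / (1 + lambda))."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


def wilks_lambda(eigenvalues):
    """Wilks' Lambda for a set of functions: product of 1 / (1 + lambda_i).

    Values near 0 mean little variance is left unexplained by the functions.
    """
    result = 1.0
    for ev in eigenvalues:
        result *= 1.0 / (1.0 + ev)
    return result


# Hypothetical eigenvalues for five functions, largest first.
eigenvalues = [4.0, 1.5, 0.6, 0.25, 0.05]
shares = variance_shares(eigenvalues)
first_correlation = canonical_correlation(eigenvalues[0])
all_functions = wilks_lambda(eigenvalues)          # tests functions 1 through 5
last_function = wilks_lambda(eigenvalues[-1:])     # tests function 5 alone
```

With these identities a large first eigenvalue gives a canonical correlation close to 1, while a small last eigenvalue leaves Wilks' Lambda for that function alone close to 1, which matches the pattern of a weak fifth function described above.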
As demonstrated in
Figure 6, the differences between discriminant functions are evident from the central distribution of the groups. The first two functions from the canonical discriminant analysis explain 89.0% of the variance. Although the strawberry samples are morphologically different, the proximity of the group centers along the horizontal and vertical axes indicates how far apart the groups are separated. On the horizontal axis (discriminant function 1), the distances between the group centers show that the 3rd, 4th and 7th shape groups have distinctive features. Similarly, on the vertical axis (discriminant function 2), the samples in the 1st and 2nd groups lie far from the center, which indicates that they exhibit different shape characteristics.
4. Discussion and Conclusions
The mass and size characteristics of the standard strawberry varieties utilized in the study correspond to the data documented in the extant literature. A study of strawberry fruit weight, based on both variety and genotype, revealed a range from 5.26 g to 35.00 g [
22,
23,
24,
25,
26].
Fruit size differs among varieties as a result of genetic composition, ecological conditions, and cultural practices. Consequently, reported fruit length and width range from 22.2 to 66.3 mm and from 17.4 to 47.3 mm, respectively [
27,
28,
29]. In the extant literature, fruit size is defined solely by length and width measurements. In this study, however, the size variables presented across a wide range (projection area, perimeter, feret diameter, length, width, and thickness) are strongly and positively correlated with each other, meaning that all size data obtained support the existing literature. The correlation between the size and shape variables of strawberry fruit is positive but weak. This finding suggests that the shape characteristic of strawberry fruit is not a size-dependent factor.
In this study, the shape category of strawberry fruit was not created based on an operator-dependent intuitive judgment. The results of the EF analysis demonstrate that strawberry samples can be divided into seven shape groups. The numerical analysis revealed the average contour geometry, and it was determined that the standard strawberry variety has a spherical–conical shape. The geometry of the contour is conical along the long axis and exhibits a bulging appearance on both sides. Strawberry samples with average contours demonstrated the greatest variability between long conical and spherical geometries. As asserted by Ishikawa et al. [
30], the morphology of the strawberry fruit can be categorized into nine distinct shapes: (1) reniform (kidney-shaped); (2) conical; (3) cordate (heart-shaped); (4) ovoid (oval); (5) cylindrical; (6) rhomboid (parallelogram); (7) oblate; (8) globose; and (9) wedged. Shape categories were previously defined by Li et al. [31] as follows: globose, globose-conic, conic, long-conic, biconic (double cone), conic-wedge, wedge, square, and miscellaneous. To classify strawberry fruit visually into shape categories, three characteristics are examined: the width-to-length ratio, the shape of the upper region where the calyx is attached, and the shape of the lower region of the fruit. Each of the nine combinations formed by these three characteristics is directly defined by a single shape geometry [
30].
In the present study, the differentiation of strawberry fruit into seven distinct shape categories was determined through a data-driven process rather than by subjective visual judgment. The number of categories was determined through the implementation of k-means clustering on the initial seven principal components derived from elliptic Fourier descriptors. These components collectively accounted for 90.77% of the total shape variance. The present study proposes an optimized statistical classification that differs from previous studies, including those by Ishikawa et al. [
30] and Li et al. [
31], which identified nine visually defined categories. The variation in the number of shape classes among studies can be attributed not only to methodological differences but also to genotype–environment interactions. It is evident that even within the same cultivar, environmental and agronomic factors may exert a substantial influence on fruit development and shape geometry. Such factors may include soil texture, climate conditions, irrigation frequency, and water quantity [
12,
13,
14]. Consequently, the present study was restricted to three neutral-day cultivars (Albion, Monterey, and San Andreas) to guarantee morphological uniformity. However, further investigations incorporating a broader range of cultivars grown under diverse ecological conditions are required to establish a comprehensive morphological taxonomy of strawberry fruit.
Furthermore, only healthy and undamaged fruits were included in the analyses in order to create a standardized reference dataset that represents intrinsic morphological variability without the confounding effects of mechanical or pathological deformation. In the food industry, the shape of fruit exerts a direct influence on consumer preferences and is a determining factor in marketable quality. In practice, fruits exhibiting irregular or defective shapes can be efficiently detected and removed by AI-based sorting systems. From a machine learning standpoint, the geometric irregularities of deformed fruits necessitate separate model training, as they represent distinct contour structures not captured in the current dataset. Consequently, the present study deliberately concentrated on healthy fruits in order to establish fundamental geometric data for future research aimed at developing deep-learning models capable of identifying and categorizing defective or non-standard fruits.
Finally, although discriminant analysis successfully classified strawberry shapes with a 94.1% accuracy rate, the present study did not include an external validation set, as its primary objective was to assess intrinsic shape variability rather than predictive performance. It is recommended that future studies incorporate cross-validation or independent datasets in order to evaluate the generalization capacity of the proposed classification framework. Furthermore, for cultivars devoid of defined shape descriptors, additional research combining machine learning and elliptic Fourier analysis will be imperative to engineer novel shape recognition algorithms. The present work thus provides a foundational contribution by quantifying the morphological spectrum of strawberry fruit and highlighting the need for algorithmic development in data-driven fruit classification systems.
Discriminant analysis is a statistical technique that is employed to identify the distinguishing characteristics between different shape categories. In addition to this, it is also able to indicate the proximity of groups with different shape geometries. This proximity is of particular importance in the context of minimizing misclassification in automatic classification systems that are based on visual perception. Despite the performance of shape discrimination, it is predicted that the success rate of image-based systems will increase by the elimination of minor geometrical discrepancies that are associated with shape proximity. The utilization of deep learning structures, which are capable of constructing networks utilizing extensive raw datasets, has the capacity to rectify issues that may arise during image recognition, thereby enabling the system to function with a high degree of accuracy [
32].
The most significant factor hindering effective classification by the discriminant functions is the presence of outliers in the dataset. Although these values make the distribution less uniform, keeping such samples in a dataset drawn entirely at random is important for determining the true success of the system.
Consequently, the size and shape characteristics of strawberries are of paramount importance for the design and development of classification, packaging, and product processing systems (drying, slicing, mixing, sorting, etc.). The development of alternative products containing strawberries in the food industry, or ensuring consumer satisfaction in fresh consumption, is possible with standardized, high-quality, and sustainable production. The present study determined that the size characteristics of the standard strawberry variety vary considerably and can be classified into seven different groups based on shape.