Automating Leaf Area Measurement in Citrus: The Development and Validation of a Python-Based Tool

Suarez, Emilio; Blaser, Manuel; Sutton, Mary

doi:10.3390/app15179750

Open AccessArticle

Automating Leaf Area Measurement in Citrus: The Development and Validation of a Python-Based Tool

by

Emilio Suarez

¹

,

Manuel Blaser

² and

Mary Sutton

^1,*

¹

Department of Horticulture, University of Georgia, Tifton, GA 31793, USA

²

Department of Crop and Soil Sciences, University of Georgia, Tifton, GA 31793, USA

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(17), 9750; https://doi.org/10.3390/app15179750

Submission received: 30 July 2025 / Revised: 28 August 2025 / Accepted: 2 September 2025 / Published: 5 September 2025

(This article belongs to the Special Issue Artificial Intelligence Applications in Precision Agriculture)

Download

Browse Figures

Versions Notes

Abstract

Featured Application

This paper presents a fully automated Python-based tool that enables the rapid and reproducible measurement of citrus leaf area from scanned images, making it ideal for high-throughput phenotyping in agricultural research. Its batch-processing capability and robust performance under challenging imaging conditions make it especially valuable for breeding programs, physiological studies, and precision agriculture applications where large-scale, consistent leaf area data are essential.

Abstract

Leaf area is a critical trait in plant physiology and agronomy, yet conventional measurement approaches such as those using ImageJ remain labor-intensive, user-dependent, and difficult to scale for high-throughput phenotyping. To address these limitations, we developed a fully automated, open-source Python tool for quantifying citrus leaf area from scanned images using multi-mask HSV segmentation, contour-hierarchy filtering, and batch calibration. The tool was validated against ImageJ across 11 citrus cultivars (n = 412 leaves), representing a broad range of leaf sizes and morphologies. Agreement between methods was near perfect, with correlation coefficients exceeding 0.997, mean bias within ±0.14 cm², and error rates below 2.5%. Bland–Altman analysis confirmed narrow limits of agreement (±0.3 cm²) while scatter plots showed robust performance across both small and large leaves. Importantly, the Python tool successfully handled challenging imaging conditions, including low-contrast leaves and edge-aligned specimens, where ImageJ required manual intervention. Processing efficiency was markedly improved, with the full dataset analyzed in 7 s compared with over 3 h using ImageJ, representing a >1600-fold speed increase. By eliminating manual thresholding and reducing user variability, this tool provides a reliable, efficient, and accessible framework for high-throughput leaf area quantification, advancing reproducibility and scalability in digital phenotyping.

Keywords:

citrus; digital phenotyping; high-throughput measurement; color-based segmentation; ImageJ; image analysis; leaf area; Python

Graphical Abstract

1. Introduction

Artificial intelligence (AI) and precision agriculture (PA) are revolutionizing how crops are monitored, managed, and improved [1]. By integrating sensor data, digital tools, and machine learning, these technologies enable site-specific management, reduce labor demands, and optimize resource use [2,3]. Within this context, AI-powered image analysis and automation are transforming how plant traits are assessed—making data collection faster, more accurate, and scalable [4]. These innovations support digital phenotyping efforts that enhance research efficiency and field-level decision-making, particularly in agricultural systems [5]. As agriculture continues to adopt AI-driven tools, open-source and accessible platforms are essential to ensure widespread implementation and long-term impact [6,7].

Leaf area is a fundamental indicator of plant growth and function closely tied to photosynthesis, transpiration, and biomass accumulation [8,9]. In citrus production, the precise measurement of leaf area is critical for evaluating canopy performance, plant health, and physiological status [10]. It also enables predictions of crop responses to environmental stress and supports the optimization of cultural practices such as pruning, irrigation, and fertilization [11,12]. Moreover, leaf area provides valuable insight for monitoring tree productivity and health across diverse orchard conditions, facilitating more targeted and sustainable management decisions [13]. Despite their importance, traditional leaf area measurement methods remain labor-intensive or costly, limiting their practical application in large-scale or high-throughput studies [14]. While semi-automated, open-source tools such as ImageJ have made leaf area measurement more accessible, accurate, repeatable, and cost-effective [15,16], several limitations remain. These include the potential underestimation of area, sensitivity to image quality and sample preparation, and the need for manual steps that introduce user-dependent variability and limit scalability for high-throughput applications [17,18].

To enhance throughput and reproducibility in plant phenotyping, recent studies have developed automated, open-source tools for estimating leaf area from digital images. Huang et al. [19] presented a Python-based workflow using OpenCV that integrated image preprocessing, segmentation, and contour detection to extract morphological traits from Populus simonii leaves with high accuracy (r² > 0.97). However, their method depended on a bulky Scan1200 imaging device requiring a fixed power supply. Likewise, Easlon and Bloom [20] developed Easy Leaf Area, which rapidly distinguished leaves from the background using pixel color ratios and eliminated manual scale measurements, enabling much faster analysis than ImageJ (<5 s vs. ~3 min per leaf). Yet, Easy Leaf Area was optimized for Arabidopsis rosettes and can be less accurate in species with overlapping leaves, angled foliage, or non-green coloration; its calibration for other crops often requires additional coding adjustments. Jiang et al. [21] introduced a Python package using GANs to reconstruct damaged leaves and quantify herbivory with high accuracy (RMSE = 1.6%), but the method struggled with heavily damaged leaves and irregular damage types and required clean backgrounds without overlapping foliage. In field applications, Lee et al. [22] demonstrated a portable RGB-D system capable of non-destructive leaf area estimation outdoors, yet its accuracy decreased with dense canopies, high light intensity, or occluded leaves, requiring artificial shading or additional filters to maintain reliability.

Beyond open-source research pipelines, several commercial platforms exist for leaf area measurement. The CI-203 Handheld Laser Leaf Area Meter (CID Bio-Science, Camas, WA, USA) uses a sweeping laser and optical sensor for direct measurement but is a costly, dedicated instrument. The WinDIAS system (Delta-T Devices, Cambridge, UK) offers configurations from scanners to high-throughput conveyor units, yet all require licensed software and specialized hardware. Similarly, WinFOLIA (Regent Instruments Inc., Quebec City, QC, Canada) provides detailed morphological analysis but depends on licensed software and calibrated scanners. More recently, Petiole Pro (Petiole Ltd., Nottingham, UK) has introduced a smartphone-based solution that improves accessibility but remains a closed, company-dependent application that restricts algorithmic transparency. Collectively, these systems demonstrate commercial capabilities, but their proprietary nature, high costs, and limited customizability constrain broader adoption.

In this paper, we present a fully automated, open-source, Python-based tool for quantifying citrus leaf area from scanned images that requires only a standard flatbed scanner and computer—equipment readily available in most laboratories—offering a transparent, cost-free, and adaptable alternative to commercial systems. The tool integrates three key innovations: (1) multi-mask HSV segmentation to capture heterogeneous leaf coloration and improve robustness under variable imaging conditions; (2) contour-hierarchy filtering to ensure the accurate measurement of leaves with irregular shapes or partial damage from insects and diseases by excluding internal contours; and (3) a batch-calibration system with open-source flexibility that enables rapid, scalable processing across hundreds of images while allowing users to tailor masks, kernels, and thresholds for different crops and experimental contexts. Based on these innovations, we hypothesize that (i) multi-mask segmentation will reduce threshold-related failures compared with ImageJ, (ii) contour-hierarchy filtering will improve accuracy in cases of irregularly shaped or damaged leaves, and (iii) batch calibration combined with customizable parameters will significantly decrease analysis time while ensuring accuracy and reproducibility across diverse imaging conditions. Our specific objective was to rigorously evaluate the performance of the Python-based tool by statistically assessing its accuracy, precision, and measurement agreement relative to ImageJ across genetically diverse citrus cultivars. Additionally, to promote broader adoption and reproducible research practices, we provide open access to the code and documentation in a publicly accessible repository. By making this resource available, we aim to empower researchers and agronomists with a reliable, efficient, and adaptable method for precise leaf-area quantification, thereby advancing agricultural research.

2. Materials and Methods

2.1. Plant Material Collection

Leaf samples were collected on 28 May 2025 from 11 citrus cultivars (Citrus spp.) growing in a citrus orchard in Valdosta, Georgia (lat. 30.8228520° N, long. 83.2366239° W). These cultivars included several mandarins and their hybrids [Citrus reticulata; ‘USDA 88-2’, ‘Cleopatra’, ‘Early Pride’, ‘Fairchild’, ‘Gold Nugget’, ‘SugarBelle, ‘Tango’], two satsumas [Citrus unshiu; ‘Owari’ and ‘Orange Frost’], a sweet orange [Citrus sinensis; ‘Early Valencia-2’ (EV-2), and a trifoliate rootstock [Poncirus trifoliata × Citrus reticulata; ‘US-942’]. For each cultivar, a composite sample of leaves was collected from field-grown trees. Leaves were randomly selected across different canopy levels and standardized to a similar developmental age (4–6 months old). Immediately after collection, the leaves were placed in labeled plastic bags and stored in a cooler with ice packs to preserve leaf turgor and prevent dehydration. Samples were transported to the laboratory and scanned within the same day to ensure tissue integrity and minimize physical distortion during imaging.

2.2. Image Acquisition

The adaxial side of each leaf was scanned using a high-resolution flatbed scanner (Perfection V850 Pro; Epson America, Inc., Los Alamitos, CA, USA) at 300 dpi. Leaves were grouped and scanned by cultivars. Due to differences in leaf size and shape across cultivars, leaves were arranged in varying non-overlapping patterns to maximize the use of the scanner bed so that as many leaves as possible were included in each scan. All leaves were placed flat against the scanner’s white background to ensure high contrast and consistent image quality. A separate image of a ruler marked in centimeters was scanned under the same settings, named scale.jpg, and used as the universal reference for both ImageJ and Python-based measurements. All images were saved in JPEG format and organized in a centralized directory for batch analysis.

2.3. Image Analysis Tools

ImageJ (version 1.54) was used as the manual reference method [23]. The scanned ruler image was opened first, and a 1 cm line was drawn using the straight-line tool. The known distance was set to 1.0 cm with a pixel aspect ratio of 1.0, units were set to centimeters, and the scale was applied globally. Each image—containing multiple leaves—was then converted to 8-bit grayscale (Image > Type > 8-bit). Leaf segmentation was performed using the Adjust Threshold tool in the Default method with B&W display, with the options “Dark background” enabled and “Don’t reset range” checked. The “Limit to threshold” option was enabled in Set Measurements so that only thresholded regions were measured. Individual leaves were highlighted using the wand tool and recorded with the “Measure” function, which reported leaf areas in cm² rounded to the nearest thousandth. All images were processed by the same operator to ensure consistency and minimize user-induced variability.

The Python-based image analysis tool was developed using OpenCV and NumPy in Python (version 3.11.9, 64-bit) and executed in Microsoft Visual Studio as the integrated development environment (IDE). OpenCV was implemented for color space conversion, morphological operations, and contour extraction [24] while NumPy facilitated efficient array processing and numerical computations [25]. Before batch processing begins, the script automatically opens the scanned scale image and prompts the user to click two points exactly 1 cm apart to calibrate the pixel-to-centimeter ratio. This calibration is then applied globally to all leaf images. The pipeline utilizes HSV color space segmentation with three distinct color masks (green, brown, and yellow) to comprehensively isolate leaves of varying health states. Morphological opening and closing (with a 5 × 5 kernel, 2 iterations each) are applied to remove noise and refine the segmented masks. Contour detection (using RETR_TREE) with hierarchy filtering (selecting only parent contours, where hierarchy = −1) is then used to extract individual leaves. Contours are smoothed using the Douglas–Peucker algorithm [26] (ε = 0.0025 × perimeter) and filtered by a minimum area threshold of 0.10 cm² to exclude small debris and artifacts. The tool calculates total, average, and individual leaf areas per image and exports two CSV files (a summary and a detailed individual leaf dataset), along with annotated images where each detected leaf is outlined and numbered. The default parameter settings and recommended tunable ranges for HSV segmentation, morphological operations, and contour filtering are provided in Table 1.

A step-by-step comparison of the Python and ImageJ workflow is shown in Figure 1. All analyses were performed on a Windows 11 Pro (64-bit; Microsoft Corporation, Redmond, WA, USA) system equipped with a 12th Gen Intel^® Core™ i5-1245U CPU (1.60 GHz; Intel Corporation, Santa Clara, CA, USA), 16 GB RAM, and Intel^® Iris^® Xe Graphics (Intel Corporation, Santa Clara, CA, USA).

2.4. Statistical Analysis

All statistical analyses and visualizations were performed using RStudio software (version 2025.05.0 Build 496) with the ggplot2, ggpubr, dplyr, and car packages [27]. To evaluate agreement between the Python-based tool and ImageJ, a cultivar-specific and combined statistical workflow was applied.

For each citrus cultivar, the difference between Python and ImageJ leaf area measurements (Python − ImageJ) was calculated. The Shapiro–Wilk test was used to assess the normality of these differences as it is widely regarded as the most powerful test for normality, especially with the small-to-medium sample sizes typical of the individual cultivar datasets. The standard deviation (SD) of these differences was also calculated to check for computationally negligible variance. If the assumption of normality was met (p > 0.05) and the SD of the differences was greater than a machine tolerance threshold (>.Machine$double.eps^0.5, approximately 1.5 × 10⁻⁸), a paired t-test was performed; otherwise, a Wilcoxon signed-rank test was used. This additional check prevented numerical precision errors in the t-test when differences were nearly identical. To account for multiple hypothesis testing across cultivars, the Benjamini–Hochberg procedure was applied to the resulting p-values to control the False Discovery Rate (FDR); adjusted p-values were reported. This correction was only applied to the cultivar-level comparisons as the combined analysis involved a single paired test.

To further assess agreement, linear regression was performed for each cultivar, and the slope and intercept of the regression line were reported along with their 95% confidence intervals (CIs). A hypothesis test was conducted to determine whether the slope significantly differed from 1, indicating potential systematic bias.

Agreement was also evaluated using Bland–Altman analysis, which included calculation of the mean bias (average difference between methods) and the 95% limits of agreement (LoA), defined as follows:

LoA = Bias \pm 1.96 \times SD

where Bias was the mean of the differences and SD was their standard deviation. These metrics were visualized using Bland–Altman plots to identify patterns of deviation across leaf sizes.

Additional descriptive metrics included the coefficient of variation (CV) for each method, calculated as follows:

Coefficient of Variation = (\frac{SD}{Mean}) \times 100

and the mean percent error, calculated as follows:

Mean % error = (\frac{\sum (Phyton - ImageJ) / ImageJ}{n}) \times 100

Percent error calculations excluded any leaves with zero or missing ImageJ values to avoid division errors. These metrics quantified relative variability and bias, respectively, and were used to compare consistency and accuracy across cultivars.

To complement the cultivar-level analysis, a combined dataset including all 412 observations was analyzed using the same statistical procedures, with the exception that the Kolmogorov–Smirnov (K–S) test was used for normality assessment due to the large sample size where this test is considered more appropriate. Scatter plots were generated to visualize the relationship between Python and ImageJ measurements, including regression lines and confidence intervals. Bland–Altman plots were used to assess agreement and identify outliers or systematic deviations. A histogram of the differences was also generated to visually assess the distribution and normality of the data, with a normal curve overlaid for comparison.

3. Results

3.1. Validation Metrics Comparing Python and ImageJ

Validation metrics across 11 citrus cultivars demonstrated that the Python-based analyzer closely mirrors ImageJ in estimating individual leaf area (Table 2). Statistically significant differences between the two methods were observed in seven cultivars: ‘USDA 88-2’, ‘Cleopatra’, ‘Early Pride’, ‘Orange Frost’, ‘US-942’, ‘Sugar Belle’, and ‘Tango’ (raw p < 0.05). However, after applying the Benjamini–Hochberg correction to account for multiple comparisons, only five cultivars retained significance (‘Cleopatra’, ‘Orange Frost’, ‘US-942’, ‘Sugar Belle’, and ‘Tango’), indicating that most observed differences were modest and potentially due to random variation. Even among cultivars with significant differences, the absolute mean offsets were small, ranging from –0.14 cm² (‘Orange Frost’) to +0.06 cm² (‘US-942’), and all standard deviations of the differences were ≤0.33 cm². This suggests minimal bias between methods.

The coefficient of variation (CV) varied across cultivars, with ‘Early Valencia-2’ showing the highest variability (≈57%) and ‘US-942’ the lowest (≈17–18%). Importantly, the difference in CV between Python and ImageJ within each cultivar was consistently small (<0.8 percentage points), indicating comparable precision. The mean percent error remained within ±0.5% for 10 cultivars. The only exception was ‘US-942’, where the Python tool overestimated leaf area by 2.43%. Across all cultivars, the standard deviation of percent error did not exceed 1.05%, further supporting the consistency of the Python-based measurements.

A combined analysis of all 412 observations also showed strong agreement between methods, with a mean percent error of 0.16% and a CV of approximately 56% for both tools. Although the paired test for the combined dataset was statistically significant (p = 0.0012), the small bias (–0.04 cm²) and narrow limits of agreement (–0.35 to 0.28 cm²) suggest that the Python tool performs reliably across diverse leaf morphologies.

Linear regression and Bland–Altman analyses further quantified the relationship and agreement between the Python and ImageJ leaf area measurements across citrus cultivars (Table 3). The coefficient of determination (R²) exceeded 0.997 for all cultivars, with most values above 0.999, confirming an exceptionally strong linear relationship between the two methods.

The regression slope was not statistically different from 1.0—the value indicating perfect proportionality—in six of the eleven cultivars (‘USDA 88-2’, ‘Early Valencia-2’, ‘Orange Frost’, ‘US-942’, ‘Sugar Belle’, and ‘Tango’). In the remaining five cultivars, the slope was marginally but significantly less than 1.0 (Cleopatra’, ‘Early Pride’, ‘Fairchild’, and ‘Gold Nugget’) or greater than 1.0 (‘Owari’), suggesting subtle proportional biases in these specific cases. The intercept of the regression was not significantly different from zero for most cultivars, indicating that no substantial fixed bias was introduced by the Python tool.

Bland–Altman analysis revealed that the mean bias (Python-ImageJ) was minimal across all cultivars, ranging from –0.14 cm² (‘Orange Frost’) to +0.06 cm² (‘US-942’). The 95% limits of agreement were narrow, demonstrating high precision. For instance, the limits of agreement for the combined dataset (–0.35 to 0.28 cm²) indicate that for most leaves, the Python tool’s measurement will be within approximately ±0.3 cm² of the ImageJ value. The one exception was ‘US-942’, which showed a consistent positive bias, with all differences falling within a tight, positive range (0.02 to 0.11 cm²).

3.2. Scatter-Plot Agreement Between Python and ImageJ

The scatter-plot analysis (Figure 2) confirmed a striking, cultivar-wide concordance between the Python analyzer and ImageJ. In all 11 cultivars, individual leaf-area values clustered tightly along the 1:1 line. Pearson correlation coefficients were uniformly 1.00 (p < 2.2 × 10⁻¹⁶), indicating virtually excellent linear agreement across the entire range of leaf sizes, from the smallest ‘US-942’ blades (~2–4 cm²) to the largest ‘EV-2’ and ‘Tango’ leaves (>70 cm²). No cultivar exhibited visible systematic bias or heteroscedastic spread and only a single leaf from ‘Early Pride’ deviated modestly from parity. This exceptional agreement was further validated by the combined scatter plot of all 412 leaves, which showed a near-perfect linear fit (R² = 0.9999) and tight clustering of points around the regression line across the full spectrum of leaf sizes.

3.3. Bland–Altman Agreement Analysis

The Bland–Altman plots (Figure 3) confirmed strong agreement between the Python analyzer and ImageJ across all 11 citrus cultivars and the combined data set. The mean bias, represented by the solid red line, remained close to zero, suggesting minimal systematic error between the two methods. While most data points fell within the 95% limits of agreement (dashed gray lines), a small number of outliers were observed in each cultivar, ranging from one to four points. These deviations were expected due to the randomized sampling approach, which occasionally included leaves that were broken, curled, or otherwise atypical. Importantly, the outliers were isolated and did not follow a consistent pattern across the range of leaf sizes, suggesting no proportional error or heteroscedasticity. Cultivars such as ‘Fairchild’, ‘Sugar Belle’, ‘Tango’, and ‘US-942’ exhibited particularly tight agreement while others like ‘Cleopatra’ and ‘Orange Frost’ showed slightly wider dispersion. In the combined plot looking at all cultivars, the mean bias remained near zero, and the majority of data points were contained within the 95% limits of agreement, further confirming the strong overall agreement. Examples of processed leaf images, including those contributing to observed outliers, can be found in the Supplementary Materials (Figures S1–S11).

3.4. Distribution of Differences

The distribution of the differences between the two methods for the combined dataset, including all cultivars, is shown in a histogram (Figure 4). This visualization confirms that the differences were normally distributed, with the data points forming a bell-shaped curve that closely followed the overlaid normal distribution line. The histogram was centered near a mean bias of −0.04 cm², reinforcing the minimal systematic difference between the two tools. The tight clustering of the histogram bars around the mean, with the majority of values falling between the 95% limits of agreement [−0.35, 0.28], visually confirmed the high degree of precision and consistency of the Python tool when compared to ImageJ.

3.5. Processing Time Efficiency

To evaluate processing efficiency, we benchmarked both ImageJ and the Python-based tool using the same dataset of 48 scanned images containing a total of 412 citrus leaves. Each flatbed scan included approximately 7–9 leaves, with the exception of scans for the trifoliate rootstock ‘US-942’ where 29 and 18 leaves were imaged per scan due to their smaller leaf size. Image capture required ~3 min per scan, which included arranging the leaves on the scanner bed. This was consistent with previous reports that high-resolution scans may take 1–2 min per image and an additional few minutes to process each image [28].

The subsequent analysis revealed substantial differences between methods. Davidson [29] suggested that ImageJ analysis requires ~3 min per leaf. While this estimate is broadly representative, actual performance is affected by user efficiency and workflow setup. In our study, ImageJ processing was accelerated by using multiple monitors to open up to eight images simultaneously. Even with this optimized workflow, analyzing all 412 leaves (48 scans) required an average of 3 h and 12 min (11,520 s), including threshold adjustment, manual leaf selection, and careful data entry to ensure accuracy. Moreover, variability in leaf morphology across citrus cultivars increased the manual correction burden. This manual process is also inherently prone to operator error, such as misclicking during leaf selection, skipping leaves, or transcribing data incorrectly.

In contrast, the Python-based tool, run on a standard desktop computer, completed the analysis of all 48 scans (412 leaves) in just 7 s following a single initial calibration step to set the segmentation threshold. Output files (CSV tables and annotated images) were generated automatically, eliminating manual data transcription and the associated risk of human error. This demonstrated a >1600-fold reduction in processing time compared with ImageJ, highlighting the tool’s scalability, reproducibility, and significantly higher reliability while minimizing operator workload.

3.6. Improved Performance Under Challenging Imaging Conditions

In addition to quantitative validation, two qualitative examples illustrate the key limitations of ImageJ that our Python-based tool overcomes. In the first scenario (Figure 5), an ‘Owari’ leaf located at the edge of the scanned image could not be accurately segmented by ImageJ using the wand tool. The tool selected the image border, preventing accurate leaf area measurement. In contrast, the Python tool successfully detected and measured all leaves, including those touching the image border.

The second scenario (Figure 6) highlights challenges in thresholding due to leaf brightness and color. A ‘Tango’ leaf with low contrast against the background required multiple manual adjustments in ImageJ to isolate the leaf, increasing both processing time and potential user bias. The Python tool, however, handled the same image automatically, detecting all leaf contours—including the problematic one—without manual input.

3.7. The Impact of Leaf Morphology on Area Measurement Accuracy

The analysis revealed that leaf morphology, particularly complex and highly segmented shapes, can influence the agreement between measurement methods (Figure 7). This was most apparent in the trifoliate cultivar ‘US-942’, which exhibited a consistent, small positive bias (+2.43% mean error, Python vs. ImageJ). This discrepancy is not an error of the Python tool in isolation but arises from the cumulative effects of different image processing pipelines applied to challenging structures. The manual ImageJ protocol, which served as our reference standard, can be susceptible to minor overestimation during the thresholding step. The application of a uniform threshold value to convert a grayscale image to a binary mask may inadvertently include pixels at the leaf boundary with subtle color variations or low-contrast shadows, slightly inflating the area measurement. This effect is amplified in leaves with intricate contours such as the three leaflets seen in trifoliate citrus (‘US-942’).

Conversely, the Python tool’s automated contour-based approach introduces a different potential for bias. The algorithm’s use of contour smoothing via the Douglas–Peucker algorithm, while essential for creating robust, continuous boundaries, can lead to a slight expansion of the leaf perimeter. This smoothing effect is most pronounced on leaves with complex geometries, such as the sharp angles and deep lobes characteristic of trifoliate leaves, resulting in a consistent slight overestimation relative to the ImageJ standard. Thus, the observed bias for ‘US-942’ is best interpreted as a systematic difference between two measurement methodologies, each with unique sensitivities to extreme leaf morphology, rather than a standalone error.

4. Discussion

This study demonstrated that the fully automated Python-based leaf area measurement tool achieves near-perfect agreement with ImageJ across a diverse set of citrus cultivars. Across all comparisons, biases were minimal, and correlation coefficients approached unity, confirming that the Python tool is as reliable as ImageJ in quantifying individual leaf areas. Although a few cultivars showed statistically significant differences after multiple-comparison correction, the absolute mean offsets were ≤0.14 cm², well below biologically meaningful thresholds. This distinction highlights that statistical significance did not translate into practical significance, and the tool can be confidently applied across diverse leaf types. Linear regression and Bland–Altman analyses further supported these findings. Slopes were not statistically different from 1.0 in more than half the cultivars, and where deviations occurred, they were minor and cultivar-specific. The narrow 95% limits of agreement (approximately ±0.3 cm²) demonstrated that the Python tool consistently tracks ImageJ within a very small error margin. Importantly, scatter plots confirmed that this agreement held true across the full range of leaf sizes—from the smallest ‘US-942’ to the largest ‘EV-2’ and ‘Tango’—indicating that performance is robust regardless of scale.

Notably, the tool outperformed ImageJ under conditions where manual segmentation is error-prone—such as leaves touching the image border or exhibiting low color contrast—highlighting its robustness and consistency across variable imaging conditions. This strength was especially clear in the “Owari” border-leaf and “Tango” low-contrast examples, where ImageJ failed to isolate the target leaves without time-intensive manual adjustment but the Python tool completed detection seamlessly. These case studies illustrate its advantage under realistic imaging variability.

Previous studies have noted that ImageJ, while widely adopted, is sensitive to user-dependent thresholding and segmentation, particularly under inconsistent lighting or overlapping plant structures, which may compromise reproducibility [30]. Our tool addresses these limitations by implementing HSV-based color masking and automated contour detection, eliminating the need for manual parameter tuning. Similar to the findings of Zhang et al. [31], who demonstrated the efficiency of HSV color models for accurate, non-destructive leaf feature extraction, our design improves accuracy while minimizing user intervention. Furthermore, in alignment with the broader digital phenotyping objectives outlined by Kim et al. [32], the automation of our workflow enhances throughput and objectivity, making it suitable for high-throughput applications.

Processing efficiency is another critical advantage revealed in this study. Whereas optimized ImageJ workflows still required over three hours to process 412 leaves, the Python tool completed the same dataset in 7 s—a >1600-fold speed improvement. This dramatic gain in efficiency not only reduces operator workload and error but also makes large-scale or multi-season phenotyping studies feasible, where manual approaches would be prohibitively time-consuming. While the tool is highly reproducible under standard conditions, a few potential sources of user error remain. As with ImageJ, incorrect calibration, such as selecting the wrong 1 cm reference during pixel-to-length conversion, can yield inaccurate area measurements. Additionally, users unfamiliar with running Python scripts in an IDE may encounter challenges, although comprehensive documentation has been provided to mitigate this issue. Once calibrated and understood, the tool performs reliably and consistently across large batches of images. Cultivar-specific differences, such as the small positive bias in trifoliate ‘US-942’, underscore that both ImageJ and the Python approach introduce subtle, method-specific sensitivities. In this case, ImageJ thresholding tended to slightly overestimate boundary pixels while Python’s contour smoothing slightly expanded leaf perimeters. These findings should be interpreted as methodological trade-offs rather than tool-specific errors, and they highlight opportunities for future refinement, such as adaptive contour smoothing tailored to leaf morphology.

Although leaf overlapping is a potential challenge in image-based phenotyping, no overlapping leaves were present in the image sets analyzed in this study. To proactively enhance the tool’s robustness for future applications where human error might result in overlapping leaves, a Watershed segmentation algorithm could be incorporated into subsequent versions. This would help automatically separate touching or overlapping objects, improving accuracy in complex image sets [33]. Although this version is optimized for scanned images with high-contrast backgrounds, future development could also extend functionality to field-acquired images, integrate machine learning for species recognition or damage classification, and offer a graphical user interface to further lower the barrier to use.

Summary and Applicability

This tool is ideally suited for high-throughput leaf area quantification. Its design prioritizes accessibility, requiring only a flatbed scanner, a ruler for calibration, and a standard computer, thereby eliminating the need for expensive or proprietary hardware and software. For researchers intending to apply this tool, we emphasize a clear operational framework to ensure success. The tool was validated under specific conditions—using leaves scanned on a pure white background at a resolution of 300 dpi—and this protocol is strongly recommended for obtaining reliable results. Users should be aware that performance can be compromised by several factors, including low contrast between the leaf and background, the presence of shadows or glare, and overlapping leaves. To mitigate potential errors, successful application depends on two critical best practices: first, meticulous calibration using the included 1 cm reference; and second, a routine visual quality check of the output binary masks to quickly identify and address any segmentation errors. As an open-source platform, this tool provides a robust foundation for automated, reproducible phenotyping, and we encourage community adoption and development to expand its applicability to a wider range of imaging conditions and plant species.

The successful implementation of this automated tool aligns with the core objectives of AI-driven precision agriculture by enabling the rapid, reproducible, and scalable quantification of plant traits. By reducing manual input and standardizing image analysis processes, the tool significantly improves data consistency and throughput—both of which are essential for integrating phenotypic data into artificial intelligence models and decision-support frameworks [34]. Its open-source design not only fosters transparency and reproducibility but also encourages collaboration and future development. This foundation supports potential integration with machine learning pipelines for trait prediction, stress detection, or crop performance modeling. As digital agriculture continues to advance, tools such as this offer scalable, field-adaptable solutions for real-time, high-resolution monitoring, thereby enhancing site-specific crop management and informed decision-making.

5. Conclusions

This study validated a fully automated, open-source Python-based tool for citrus leaf area quantification that requires only a flatbed scanner and a standard computer, making it accessible to most laboratories without the need for specialized hardware. By integrating three innovations—multi-mask HSV segmentation, contour-hierarchy filtering, and batch calibration—the tool achieved near-perfect agreement with ImageJ across 11 diverse citrus cultivars. These design elements collectively reduced threshold-related failures, improved accuracy for irregular or damaged leaves, and provided scalable batch processing that drastically decreased analysis time. Our findings confirmed the hypotheses outlined at the outset: (i) multi-mask segmentation minimized failures compared with ImageJ, particularly in low-contrast or border-leaf cases; (ii) contour-hierarchy filtering improved robustness when handling irregular leaf morphologies; and (iii) automated calibration enabled reproducible, high-throughput analysis, with a >1600-fold reduction in processing time relative to ImageJ. Together, these results demonstrate that the Python tool not only matches ImageJ in accuracy and precision but also surpasses it in efficiency, reproducibility, and robustness under challenging imaging conditions.

Beyond citrus, the tool represents a practical and adaptable solution for high-throughput leaf area measurement in other crops and experimental contexts using flatbed scanners. It is important to note that the current method has been optimized for these controlled conditions; extrapolation to field-acquired images would require additional validation and algorithmic adjustments to handle variable lighting, complex backgrounds, and challenges like leaf overlap. Its open-source distribution ensures transparency, reproducibility, and flexibility for user customization, supporting broader adoption in digital phenotyping. Future development may extend functionality to field-acquired images—which would require advanced solutions for segmentation and occlusion—and integrate machine learning for trait recognition and incorporate user-friendly interfaces. In its current form, however, this tool already provides the plant science community with a reliable, low-cost, and scalable resource for rapid, reproducible leaf-area quantification, advancing the goals of precision agriculture and digital crop monitoring.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app15179750/s1. Supplementary File: Word document containing Figures S1–S11. Processed leaf images of the following citrus cultivars: ‘USDA 88-2’, ‘Cleopatra’, ‘Early Pride’, ‘Early Valencia-2’, ‘Fairchild’, ‘Gold Nugget’, ‘Orange Frost’, ‘US-942’, ‘Owari’, ‘Sugar Belle’, and ‘Tango’.

Author Contributions

E.S. and M.S.; methodology, E.S.; software, E.S. and M.B.; validation, M.B.; formal analysis, E.S.; investigation, E.S. and M.S.; resources, M.S.; data curation, E.S.; writing—original draft preparation, E.S.; writing—review and editing, M.S.; visualization, E.S.; supervision, M.S.; project administration, M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The Python-based tool created in this study for automated leaf area analysis, along with its source code and documentation, is publicly available on GitHub and Zenodo at: https://github.com/esuarez-12/Leaf-Area-Analyzer, accessed on 26 August 2025, and a permanent, citable version of the tool, corresponding to version v1.0.0, has been archived on Zenodo with the following DOI: https://doi.org/10.5281/zenodo.16951132. These materials are openly accessible and provided under an open-source license to support reproducibility and further research.

Acknowledgments

The authors would like to thank Jake Price and the UGA Cooperative Extension Lowndes County Office for the use of their citrus trees. The UGA Citrus Lab is committed to advancing citrus research and supporting the development of the citrus industry in Georgia and the southeastern region. The authors also acknowledge the contributions of beta testers who provided valuable feedback during the development and validation of the Python-based leaf area tool. We also extend our gratitude to Xuelin Luo for her invaluable statistical guidance in addressing the reviewers’ comments on this manuscript. During the preparation of this manuscript, the authors used Grammarly (Grammarly Inc., San Francisco, CA, USA) for grammar and style checking. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations have been used in this manuscript:

HSV	Hue, Saturation, Value (color space)
IDE	Integrated Development Environment
OpenCV	Open Source Computer Vision Library

References

Sharma, A.; Jain, A.; Gupta, P.; Chowdary, V. Machine learning applications for precision agriculture: A comprehensive review. IEEE Access 2021, 9, 4843–4873. [Google Scholar] [CrossRef]
Xu, J.; Gu, B.; Tian, G. Review of agricultural IoT technology. Artif. Intell. Agric. 2022, 6, 10–22. [Google Scholar] [CrossRef]
Liakos, K.G.; Busato, P.; Moshou, D.; Pearson, S.; Bochtis, D. Machine learning in agriculture: A review. Sensors 2018, 18, 2674. [Google Scholar] [CrossRef] [PubMed]
Pound, M.; Atkinson, J.; Townsend, A.; Wilson, M.; Griffiths, M.; Jackson, A.; Bulat, A.; Tzimiropoulos, G.; Wells, D.; Murchie, E.; et al. Deep machine learning provides state-of-the-art performance in image-based plant phenotyping. GigaScience 2017, 6, gix083. [Google Scholar] [CrossRef] [PubMed]
Araus, J.L.; Cairns, J.E. Field high-throughput phenotyping: The new crop breeding frontier. Trends Plant Sci. 2014, 19, 52–61. [Google Scholar] [CrossRef]
Linaza, M.; Posada, J.; Bund, J.; Eisert, P.; Quartulli, M.; Döllner, J.; Pagani, A.; Olaizola, I.; Barriguinha, A.; Moysiadis, T.; et al. Data-driven artificial intelligence applications for sustainable precision agriculture. Agronomy 2021, 11, 1227. [Google Scholar] [CrossRef]
Bökle, S.; Gscheidle, M.; Weis, M.; Paraforos, D.; Griepentrog, H. A concept of a decentral server infrastructure to connect farms, secure data, and increase the resilience of digital farming. Smart Agric. Technol. 2025, 5, 100701. [Google Scholar] [CrossRef]
Potter, J.R.; Jones, J.W. Leaf area partitioning as an important factor in growth. Plant Physiol. 1977, 59, 10–14. [Google Scholar] [CrossRef]
Pandey, S.K.; Singh, H. A simple, cost-effective method for leaf area estimation. J. Bot. 2011, 2011, 658240. [Google Scholar] [CrossRef]
Mazzini, R.B.; Ribeiro, R.V.; Pio, R.M. A simple and non-destructive model for individual leaf area estimation in citrus. Fruits 2010, 65, 269–275. [Google Scholar] [CrossRef]
Lo, S.; Jia, Z.; Herniter, I.; Seymour, D.; Kahn, T.; Bowman, C.; Roose, M.; Yu, L.; Lui, J.; Hiraoka, Y.; et al. Exploring the phylogenetic relationship among citrus through leaf shape traits: A morphological study on citrus leaves. Horticulturae 2023, 9, 793. [Google Scholar] [CrossRef]
Wu, J.; Yang, F.; Xu, G.; Xu, W.; Lan, Y.; Wu, J. Multiscale inversion of leaf area index in citrus tree by merging UAV LiDAR with multispectral remote sensing data. Agronomy 2023, 13, 2747. [Google Scholar] [CrossRef]
Zhang, Y.; Lu, X.; Li, W.; Xiao, J.; Yang, J.; Zhu, H.; Xu, X.; Lan, Y.; Yang, D. Inversion of leaf area index in citrus trees based on multi-modal data fusion from UAV platform. Remote Sens. 2023, 15, 3523. [Google Scholar] [CrossRef]
Thompson, F.; Leyton, L. Method for measuring the leaf surface area of complex shoots. Nature 1971, 229, 572. [Google Scholar] [CrossRef] [PubMed]
Kalshetti, O.; Kadam, K.; Kadam, A.; Sangnure, V.; Parashuram, S.; Madave, A.; Sowjanya, P.; Marathe, R. Evaluation of leaf area parameter of pomegranate (Punica granatum L.) germplasm by using ImageJ in comparison with manual method. J. Agric. Res. Technol. 2024, 49, 90–94. [Google Scholar] [CrossRef]
Hernandez-Monterroza, J.; Agehara, S.; Pride, L.; Gallardo, M. A simple, inexpensive, and portable image-based technique for nondestructive leaf area measurements. EDIS 2020, 2020, HS1395. [Google Scholar] [CrossRef]
Lopes, T.; Marques, P.; Pacheco, A.; Moura, L.; Nascimento, J.; Coelho, R.; Duarte, S. Non-destructive and destructive methods to determine the leaf area of zucchini. J. Anim. Sci. 2020, 8, 295–309. [Google Scholar] [CrossRef]
Koyama, K. Leaf area estimation by photographing leaves sandwiched between transparent clear file folder sheets. Horticulturae 2023, 9, 709. [Google Scholar] [CrossRef]
Huang, M.; Chen, S.; Zhang, L.; Zhang, H.; Zhu, S. Large-scale and high-accuracy phenotyping of Populus simonii leaves using the colony counter and OpenCV. Forests 2023, 14, 1766. [Google Scholar] [CrossRef]
Easlon, H.; Bloom, A. Easy Leaf Area: Automated digital image analysis for rapid and accurate measurement of leaf area. Appl. Plant Sci. 2014, 2, e1400033. [Google Scholar] [CrossRef]
Jiang, Y.; Diallo, A.; Wang, Z.; Kembel, S.W. Deep learning- and image processing-based methods for automatic estimation of leaf herbivore damage. Methods Ecol. Evol. 2024, 15, 732–743. [Google Scholar] [CrossRef]
Yau, W.K.; Ng, O.-E.; Lee, S.W. Portable device for contactless, non-destructive and in situ outdoor individual leaf area measurement. Comput. Electron. Agric. 2021, 187, 106278. [Google Scholar] [CrossRef]
Schneider, C.A.; Rasband, W.S.; Eliceiri, K.W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 2012, 9, 671–675. [Google Scholar] [CrossRef] [PubMed]
Xu, Z.; Baojie, X.; Guoxin, W. Canny edge detection based on OpenCV. In Proceedings of the 2017 13th IEEE International Conference on Electronic Measurement & Instruments (ICEMI), Yangzhou, China, 20–22 October 2017; pp. 53–56. [Google Scholar] [CrossRef]
Harris, C.R.; Millman, K.J.; van der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J.; et al. Array programming with NumPy. Nature 2020, 585, 357–362. [Google Scholar] [CrossRef] [PubMed]
Douglas, D.; Peucker, T. Algorithms for the Reduction of the Number of Points Required to Represent a Digitized Line or Its Caricature. Cartogr. Int. J. Geogr. Inf. Geovisualization 1973, 10, 112–122. [Google Scholar] [CrossRef]
Posit Team. RStudio: Integrated Development Environment for R; Posit Software, PBC: Boston, MA, USA, 2025; Available online: http://www.posit.co/ (accessed on 16 July 2025).
Thompson, G.L.; Haynes, C.L.; Lyle, S.A. Botanical scans as a learning aid in plant identification courses. Hort. Technol. 2022, 32, 398–400. [Google Scholar] [CrossRef]
Davidson, A. Measuring Leaf Perimeter and Leaf Area. Prometheus Protocols. Available online: https://prometheusprotocols.net/structure/morphology/leaf-size-and-shape/measuring-leaf-perimeter-and-leaf-area/ (accessed on 26 August 2025).
Treder, W.; Klamkowski, K.; Tryngiel-Gać, A.; Wójcik, K. Application of ImageJ Software in the assessment of flowering intensity and growth vigor of pear trees. J. Hortic. Res. 2021, 29, 85–94. [Google Scholar] [CrossRef]
Zhang, M.; Wang, Y.; Jia, Y.; Bai, Y.; Wei, Q. Digital image-based method of leaf color and area feature recognition. Adv. Comput. Commun. 2024, 5, 122–127. [Google Scholar] [CrossRef]
Kim, D.; Putra, G.; Kim, H.; Lee, U.; Chang, S. An automated, high-throughput plant phenotyping system using machine learning-based plant segmentation and image analysis. PLoS ONE 2018, 13, e0196615. [Google Scholar] [CrossRef]
Vincent, L.; Soille, P. Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Trans. Pattern Anal. Mach. Intell. 1991, 13, 583–598. [Google Scholar] [CrossRef]
Gupta, G.; Kumar Pal, S. Applications of AI in precision agriculture. Discov. Agric. 2025, 3, 61. [Google Scholar] [CrossRef]

Figure 1. Leaf area measurement workflow: A step-by-step diagram illustrating the workflow for measuring leaf area using ImageJ (manual) and Python (automated). The diagram highlights shared preparation steps and then bifurcates into separate pipelines, showing scale calibration, image processing, and output generation.

Figure 2. Scatter plots comparing individual leaf area measurements from the Python script and ImageJ across citrus cultivars and for all cultivars combined. Each point represents a single scanned leaf. The solid line shows the linear regression fit; the dashed blue line is the 1:1 reference (perfect-agreement) line. For each panel, the coefficient of determination (R²) and p-value are reported. Axes are scaled equally, so the 1:1 line reflects perfect agreement.

Figure 3. Bland–Altman plots assessing agreement between Python and ImageJ across citrus cultivars and for all cultivars combined. Each point represents a single leaf; the x-axis is the mean leaf area (Python + ImageJ)/2, and the y-axis is the difference (Python-ImageJ). The solid red line denotes mean bias; dashed gray lines indicate the 95% limits of agreement (mean ± 1.96 × SD); the dotted black line marks perfect agreement (zero difference).

Figure 4. Distribution of differences between Python and ImageJ leaf area measurements. The histogram shows the frequency of differences; the dashed red line represents the theoretical normal distribution with the same mean and standard deviation; the solid blue line indicates the mean bias; dashed gray lines show the 95% limits of agreement.

Figure 5. Limitations of ImageJ in detecting leaves touching image borders. The arrow indicates an ‘Owari’ leaf positioned at the edge of the scanned image, which causes ImageJ to incorrectly select the entire image border when using the wand tool, preventing accurate area calculation. (A) Original image with leaf at the edge. (B) ImageJ output showing failed segmentation. (C) Python tool output successfully detecting and measuring all leaves, including those touching the edge.

Figure 6. Limitations of ImageJ thresholding due to leaf color and brightness. The red circle highlights a ‘Tango’ leaf with low contrast against the background, which made segmentation difficult in ImageJ. Manual threshold adjustments were required to isolate the leaf, increasing user workload and variability. (A) Original image showing the problematic leaf. (B) ImageJ output showing incomplete segmentation despite manual tuning. (C) Python tool output accurately detecting all leaves, including the low-contrast one, without user intervention.

Figure 7. Segmentation challenges of trifoliate leaf morphology. The top-right-corner red rectangle highlights a zoomed ‘US-942’ leaf. Black arrows around the pictures highlight how those leaves were processed either in ImageJ or the Python tool. (A) Original image showing the ‘US-942’ leaves scan. (B) The ImageJ output illustrates how the thresholding process can result in a slight overestimation by including background pixels. (C) The Python tool output where the contouring algorithm’s smoothing process slightly expands the boundary to trace the leaf’s intricate shape.

Table 1. Parameter settings of the Python-based leaf-area tool and recommended tunable ranges.

Component	Parameter	Default	Tunable Range	Rationale/Notes
Color mask (HSV)	Green (Lower/Upper)	[25, 40, 40]/ [90, 255, 255]	H ± 10; S floor 30–80; V floor 30–80	For healthy foliage, use a lower S/V floor in low light and a higher one to reduce noise.
	Brown (Lower/Upper)	[10, 40, 20]/ [30, 255, 200]	H ± 10; V max 180–220	Includes senescent/necrotic tissues; cap V to avoid bright backgrounds.
	Yellow (Lower/Upper)	[15, 80, 120]/ [35, 255, 255]	H ± 10; S floor 60–100	Extends to chlorotic/yellowing leaves; keep sufficient S to avoid beige paper.
Morphology	Kernel shape and size	Square 5 × 5	3 × 3–7 × 7	Opening then closing to remove speckles and fill small gaps.
Morphology	Opening/Closing iterations	2/2	1–3 each	Increase if background noise persists; decrease to preserve fine serrations.
Contours	Retrieval/Approximation	RETR_TREE/ approxPolyDP	ε = 0.25% of perimeter (0.1–0.5%)	RETR_TREE provides hierarchy; Douglas–Peucker smoothing stabilizes areas.
Contours	Parent-only selection	Yes (hierarchy parent = −1)	-	Excludes internal holes/children; robust with irregular or insect/disease damage.
Area filter	Min contour area	0.10 cm²	0.05–0.50 cm²	Drops tiny artifacts/debris while preserving small leaflets; raise for noisy scans.
Scale	Calibration method	Two clicks over 1 cm	-	One-time pixel to cm calibration applied globally.
Batch	Processing mode	All images in folder	-	Exports summary and per-leaf CSVs; saves annotated images with labels.

Implementation note: Defaults performed consistently across 11 citrus cultivars without parameter changes. Tunable ranges are provided for users adapting to different scanners, lighting, or backgrounds.

Table 2. Validation metrics comparing the Python leaf area tool and ImageJ across citrus cultivars.

Cultivar	Sample Size (n)	p_ normality ⁱ	Statistical Test	p_ paired ⁱⁱ	p_paired_adj ⁱⁱⁱ	CV Python (%) ^iv	CV ImageJ (%) ^iv	Mean % Error ^v	SD % Error ^v
USDA 88-2	28	0.0006	Wilcoxon	0.0489	0.0768	36.69	36.85	−0.10	0.81
Cleopatra	38	<0.001	Wilcoxon	1.5 × 10^–6	7.2 × 10^–6	41.23	41.44	−0.48	0.46
Early Pride	33	<0.001	Wilcoxon	0.0454	0.0768	41.83	42.17	−0.38	0.99
Early Valencia-2	33	<0.001	Wilcoxon	0.442	0.541	57.06	57.15	−0.12	0.54
Fairchild	41	0.0027	Wilcoxon	0.121	0.166	37.08	37.29	−0.03	0.47
Gold Nugget	41	<0.001	Wilcoxon	0.841	0.841	34.05	34.22	+0.07	0.6
Orange Frost	36	<0.001	Wilcoxon	2.0 × 10^–6	7.2 × 10^–6	37.23	37.2	−0.43	0.57
US-942	47	0.681	Paired t-test	1.7 × 10^–22	1.8 × 10^–22	17.3	17.91	+2.43	1.05
Owari	31	0.0146	Wilcoxon	0.764	0.840	37.12	37	−0.07	0.40
Sugar Belle	54	0.097	Paired t-test	0.0106	0.0233	42.82	42.87	+0.2	0.53
Tango	30	0.146	Paired t-test	0.00236	0.0065	38.03	38.02	−0.12	0.23
Combined	412	<0.001	Wilcoxon	0.00122	-	56.42	56.58	+0.16	1.06

ⁱ Shapiro–Wilk test for normality of paired differences; p < 0.05 indicates non-normal distribution. ⁱⁱ Raw p-value for paired test. ⁱⁱⁱ Benjamini–Hochberg multiple-comparison correction applied to cultivar-level tests. ^iv Coefficient of variation (Python and ImageJ measurements). ^v Error metrics: Mean and SD of percent error (Python vs. ImageJ).

Table 3. Regression and agreement metrics comparing Python tool and ImageJ across citrus cultivars.

Cultivar	Slope (95% CI) ⁱ	Intercept (95% CI) ⁱⁱ	p_slope ≠ 1 ⁱⁱⁱ	R^{2 iv}	Mean Bias (cm²) ^v	SD of Diff (cm²) ^vi	95% LoA ^vii
USDA 88-2	0.994 (0.987–1.001)	0.108 (–0.076–0.293)	0.089	0.9997	–0.04	0.17	[–0.37, 0.29]
Cleopatra	0.989 (0.985–0.993)	0.126 (0.017–0.235)	<0.001	0.9998	–0.13	0.16	[–0.45, 0.19]
Early Pride	0.986 (0.976–0.997)	0.207 (–0.070–0.484)	0.012	0.9992	–0.13	0.33	[–0.77, 0.51]
Early Valencia-2	0.997 (0.994–1.000)	0.048 (–0.057–0.152)	0.051	0.9999	–0.04	0.15	[–0.34, 0.25]
Fairchild	0.994 (0.990–0.997)	0.121 (0.038–0.204)	0.001	0.9999	–0.02	0.10	[–0.22, 0.18]
Gold Nugget	0.995 (0.991–0.999)	0.142 (0.032–0.252)	0.009	0.9999	0.00	0.12	[–0.23, 0.24]
Orange Frost	0.997 (0.992–1.001)	–0.023 (–0.197–0.150)	0.144	0.9998	–0.14	0.18	[–0.49, 0.21]
US-942	0.987 (0.973–1.001)	0.100 (0.060–0.139)	0.069	0.9977	0.06	0.02	[0.02, 0.11]
Owari	1.003 (1.000–1.006)	–0.099 (–0.189 to–0.009)	0.038	0.9999	–0.01	0.09	[–0.18, 0.16]
Sugar Belle	1.001 (0.998–1.003)	0.022 (–0.040–0.084)	0.731	0.9999	0.03	0.09	[–0.14, 0.20]
Tango	0.999 (0.997–1.001)	–0.006 (–0.076–0.065)	0.293	0.99997	–0.04	0.07	[–0.17, 0.09]
Combined	0.996 (0.995–0.997)	0.067 (0.037–0.097)	<0.001	0.9999	–0.04	0.16	[–0.35, 0.28]

ⁱ Regression slope of Python vs. ImageJ measurements; CI = confidence interval. Values close to 1 indicate strong agreement. ⁱⁱ Expected Python–ImageJ difference when ImageJ = 0; CI = confidence interval. Values close to 0 indicate minimal systematic bias. ⁱⁱⁱ Significance test of whether the slope differed statistically from 1 (proportionality). ^iv Coefficient of determination; higher values indicate a stronger linear fit. ^v Average difference (Python−ImageJ); negative values = Python slightly underestimates relative to ImageJ. ^vi Standard deviation of Python–ImageJ. ^vii Bland–Altman limits of agreement (bias ± 1.96 × SD); indicate the range within which most differences fall.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Suarez, E.; Blaser, M.; Sutton, M. Automating Leaf Area Measurement in Citrus: The Development and Validation of a Python-Based Tool. Appl. Sci. 2025, 15, 9750. https://doi.org/10.3390/app15179750

AMA Style

Suarez E, Blaser M, Sutton M. Automating Leaf Area Measurement in Citrus: The Development and Validation of a Python-Based Tool. Applied Sciences. 2025; 15(17):9750. https://doi.org/10.3390/app15179750

Chicago/Turabian Style

Suarez, Emilio, Manuel Blaser, and Mary Sutton. 2025. "Automating Leaf Area Measurement in Citrus: The Development and Validation of a Python-Based Tool" Applied Sciences 15, no. 17: 9750. https://doi.org/10.3390/app15179750

APA Style

Suarez, E., Blaser, M., & Sutton, M. (2025). Automating Leaf Area Measurement in Citrus: The Development and Validation of a Python-Based Tool. Applied Sciences, 15(17), 9750. https://doi.org/10.3390/app15179750

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Automating Leaf Area Measurement in Citrus: The Development and Validation of a Python-Based Tool

Abstract

Featured Application

Abstract

1. Introduction

2. Materials and Methods

2.1. Plant Material Collection

2.2. Image Acquisition

2.3. Image Analysis Tools

2.4. Statistical Analysis

3. Results

3.1. Validation Metrics Comparing Python and ImageJ

3.2. Scatter-Plot Agreement Between Python and ImageJ

3.3. Bland–Altman Agreement Analysis

3.4. Distribution of Differences

3.5. Processing Time Efficiency

3.6. Improved Performance Under Challenging Imaging Conditions

3.7. The Impact of Leaf Morphology on Area Measurement Accuracy

4. Discussion

Summary and Applicability

5. Conclusions

Supplementary Materials

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI