Comparative Assessment of On-Site and Commercial Laboratory near Infrared Reflectance Spectrometer Measurements of Fresh Maize

Shinners, Kevin J.; Schade, Peter; Timm, Aaron J.; Digman, Matthew F.

doi:10.3390/agriengineering8020059


by Kevin J. Shinners ¹, Peter Schade ², Aaron J. Timm ¹ and Matthew F. Digman ¹,*

¹ Department of Biological Systems Engineering, University of Wisconsin, Madison, WI 53706, USA

² John Deere European Technology Innovation Center, 67657 Kaiserslautern, Germany

* Author to whom correspondence should be addressed.

AgriEngineering 2026, 8(2), 59; https://doi.org/10.3390/agriengineering8020059

Submission received: 15 December 2025 / Revised: 2 February 2026 / Accepted: 4 February 2026 / Published: 7 February 2026


Abstract

Whole-plant maize (corn) (WPC) is a critical forage in ruminant diets, and rapid, reliable measurement of its nutritional composition is essential for precision feeding. We hypothesized that an on-site near-infrared spectroscopy (OS-NIRS—specifically, HarvestLab™ 3000) sensor would provide within-laboratory repeatability comparable to commercial analytical laboratories (ALs) and inter-laboratory reproducibility similar to conventional laboratory analyses. To test this, WPC samples were collected across three experiments and two countries (USA and Germany) and analyzed by both OS-NIRS and ALs, with precision metrics calculated according to ISO 5725. Results showed that OS-NIRS achieved intra-laboratory repeatability equal to or greater than ALs, particularly for protein and starch. The repeatability performance of the OS-NIRS sensors was similar to that of ALs for moisture and NDF. Inter-laboratory reproducibility varied widely across constituents and experiments. Including OS-NIRS data with AL measurements produced inconsistent effects—sometimes narrowing confidence intervals but more often widening them—while OS-NIRS data alone demonstrated repeatability on par with ALs but mixed reproducibility outcomes. Inclusion of OS-NIRS data did not introduce systematic bias and, in some cases, improved consistency. These findings indicate that OS-NIRS can complement laboratory analyses by providing timely, farm-level measurements that enhance decision-making in feed management.

Keywords: feed analysis; corn; maize; NIRS; nutrition

1. Introduction

Maize harvested as whole-plant corn (WPC) is an essential and widely utilized feed in both dairy and beef production systems [1]. When preserved through ensiling, it provides a major source of energy and digestible nutrients that support ruminant productivity. Its nutritional value is largely defined by constituents such as dry matter, crude protein, starch, and fiber. These components can be quantified through conventional wet chemistry analyses or more rapidly estimated using sensing technologies, most commonly near-infrared (NIR) spectroscopy [2,3]. Conventional wet chemistry methods require specialized equipment, are costly, and are poorly suited to the rapid turnaround needed for strategic storage and ensiling management, as analyses of selected samples may not capture material variability or provide results in a timely manner [4]. In contrast, on-site NIR spectroscopy (OS-NIRS) can be deployed at the farm-level to provide a rapid, non-destructive, and cost-effective approach for estimating WPC constituents. Its ability to process large numbers of samples in a short time enables more timely ration adjustments and improved precision in livestock nutrition management [5].

To estimate WPC constituents using NIRS techniques, the material under study is illuminated by a light source of the NIR constituent sensor. In a reflection measurement configuration, the sensor captures light reflected from the sample and determines the optical reflection spectrum over a specific spectral region. Typically, a section in the near-infrared spectral region, between 900 nm and 1700 nm, is covered. As molecules of many relevant constituents, such as moisture, protein, starch, and fiber, have material specific absorption characteristics within this region, the spectrum carries information about the abundance of these constituents within the material presented to the sensor. The content of each material is finally estimated from the spectrum with the help of material- and constituent-specific calibration functions [3,6].

Near-infrared instruments used for WPC analysis are broadly categorized as either lab-scale/benchtop or transportable/on-site systems. Benchtop instruments are laboratory-grade analyzers that typically provide a high level of spectral resolution, stability, and calibration robustness, but they require controlled environments, trained operators, and substantial capital investment [7]. Laboratory-based NIR spectroscopy is typically performed on dried and finely ground WPC, producing a homogeneous sample that minimizes variation due to particle size and moisture, thereby improving spectral consistency and calibration accuracy.

On-site systems are more compact and cost-effective, allowing widespread deployment. These instruments enable more rapid and frequent analyses than laboratory-based systems, but potentially with some compromise in analytical precision and spectral range [8,9]. On-site instruments typically analyze fresh, undried samples, which are inherently more heterogeneous and subject to greater variability in moisture and particle distribution [5]. While this can reduce analytical precision relative to laboratory methods, it eliminates the need for drying and grinding, enabling a far greater number of samples to be scanned rapidly and with minimal preparation [10].

In commercial forage laboratories, stationary NIRS analyses typically begin by splitting the delivered sample, with a portion used for moisture determination and the remainder oven-dried at moderate temperatures (<60 °C) to preserve chemical integrity [5,11,12,13]. The dried material is then ground to a uniform particle size (~1 mm) to produce a homogeneous sample suitable for NIRS scanning. Spectra are compared against calibrations developed from reference wet chemistry analyses, enabling accurate prediction of key nutritional constituents [5,12,14,15].

Although numerous studies have compared the performance of portable on-site NIRS constituent sensors with NIRS measurements from certified laboratories, there has been limited systematic evaluation of measurement precision for both methods [16,17,18]. Precision refers to the closeness of agreement among repeated measurements of the same constituent on a given sample. When replicated samples are analyzed across multiple laboratories, two components of precision are typically considered: repeatability and reproducibility. Repeatability refers to the variability observed when replicate measurements are made under the same conditions within a single laboratory (within-lab error). Reproducibility refers to the variability observed when measurements are made on the same replicated sample under differing conditions across laboratories; it therefore includes both repeatability error and the additional inter-laboratory error [19,20]. Ensuring reproducibility among OS-NIRS sensors is critical in precision agriculture, as consistent performance across devices is needed for reliable, actionable compositional estimates in field-level decision making.

Chopped WPC exhibits considerable heterogeneity, both in anatomical composition and particle-size distribution, which poses significant challenges to obtaining homogeneous samples for analysis [21]. The sampling process typically begins at harvest or during feed-out from the silo, wherein numerous small subsamples are collected, pooled, and thoroughly mixed. These composite samples are then systematically partitioned into subsamples of appropriate size for subsequent analysis. Unfortunately, even under stringent sampling protocols, variability among subsample replicates can introduce error, thereby affecting the comparability of measurements within and between analytical laboratories.

The objectives of this study were to evaluate the intra- and inter-laboratory precision of nutritional composition analyses of WPC, using both commercial analytical forage laboratories and on-site NIRS sensors. Toward this goal, samples were collected and analyzed under typical user conditions to represent practical, on-site application of the technology by producers and nutritionists. Specifically, the study aimed to: (a) quantify the measurement variation in nutritional parameters of replicate WPC samples analyzed across different commercial laboratories and on-site NIRS sensors, and (b) assess the agreement between laboratory analyses and on-site NIRS sensor measurements. We hypothesized that (a) on-site NIR sensors would demonstrate within-laboratory repeatability comparable to that of commercial analytical laboratories, and (b) incorporating on-site NIR sensors would yield inter-laboratory reproducibility similar to that of conventional laboratory analyses. This expectation was based on OS-NIRS systems integrating a larger effective sample volume through a substantially greater number of measurements than is possible with discrete laboratory analyses of limited sample mass.

2. Materials and Methods

2.1. Sample Collection and Analysis

To capture a broad range of compositional and regional variation in WPC, samples were collected and analyzed on-site and in analytical laboratories in the United States and Germany. The goal was to assess the variability in WPC composition measurements that a typical forage producer or ruminant nutritionist might encounter. Three separate experiments were conducted over a three-year period at different locations. In each experiment, three common constituents of WPC were analyzed: protein, starch, and neutral detergent fiber (NDF). Moisture content (% w.b.) was measured only in the second and third experiments.

For Experiments 1 (2021) and 3 (2024), eighteen WPC samples were collected on farms near Arlington, WI, USA (43.3380°, −89.3804°). These samples were collected from four diverse farm locations and fields. In 2021, corn at Arlington experienced a warmer-than-normal growing season combined with substantial moisture deficits, particularly from mid-season onward, which likely increased drought stress during grain fill and favored higher fiber concentration and reduced starch deposition in corn silage. In contrast, 2024 featured excess early-season rainfall with generally adequate heat accumulation, supporting strong vegetative growth and kernel set, followed by a drier late summer that likely promoted good starch accumulation and harvest dry-down, conditions generally favorable for higher silage energy density.

For Experiment 2 (2022), six WPC samples were collected on farms near Neuhemsbach, DE (49.5360°, 7.9038°). Drought conditions negatively impacted WPC growth at this location [22]. Samples were collected and analyzed under typical user conditions to represent practical on-site application of the technology. Harvester parameters, including length-of-cut and crop processor roll clearance, were maintained at values consistent with prevailing regional practices.

In all cases, WPC was collected from multiple fields to capture inherent variability. Material was sampled from the transport container after unloading, but prior to deposition in the bunker silo. The collected material was progressively halved several times to create smaller, manageable subsamples while maintaining representative composition. Samples analyzed with the on-site NIRS sensors were immediately brought to the sensors and scanned as described below. Samples designated for analytical laboratories were sealed in plastic bags, and within several hours were placed in a cooler set to approximately 5 °C, and shipped to the laboratories to ensure arrival within one to two days of collection. The number of samples and replicates is shown in Table 1.

In this research all samples were analyzed using near-infrared reflectance spectroscopy analytical equipment. The commercial laboratories analyzed the WPC samples using benchtop scanning monochromator NIR instruments similar to the widely used FOSS NIRSystem 6500 (1100–2500 nm, 2 nm step, FOSS, Hillerod, Denmark) [23]. The on-site NIRS sensors were the HarvestLab™ 3000 (950–1650 nm, 2 nm step, Deere & Co., Moline, IL, USA). Commercial benchtop instruments are denoted as analytical laboratories (AL), and mobile instruments as on-site NIRS (OS-NIRS) sensors. Each OS-NIRS unit was treated as an independent laboratory for intra- and inter-laboratory analyses; thus, the terms laboratories or labs refer collectively to AL and OS-NIRS sensors.

The OS-NIRS instruments consisted of a sensor body with the diode array spectrometer and a sampling unit with a rotating borosilicate glass bottom dish located above a halogen light source. The system operated with internal black and white references. Care was taken to ensure that the sampling dish was clean and dry prior to loading with a WPC sample. The measurement was repeated three times per replicate with mixing of the subsample in between. Mean values of moisture content (% w.b.) and crude protein, starch, and neutral detergent fiber (NDF) as concentrations of DM were recorded. The mean value of the three measurements was taken as the representative value for each replicate. Moisture content and constituent estimates were determined using the WPC calibration models published by John Deere in the year the studies were conducted.

In Experiment 3, WPC moisture content was measured by residual weight after drying subsamples according to NFTA procedures 2.1.1, 2.1.2, or 2.1.3 [13]. Constituent analysis in Experiments 1 and 3 was conducted using dried and ground samples (<60 °C, 1 mm), and analyzed for protein, starch, and NDF using NIRS Forage and Feed Consortium protocols [24]. In Experiment 2, WPC moisture was determined according to VDLUFA III, 3.1. methods. Constituent analysis was performed using dried and ground samples according to VDLUFA III, 31.2–31.3 methods.

2.2. Statistical Analysis

The statistical calculations follow the approach described in ISO standard 5725-2 [25]. The analysis result of a physical material sample y_SPL is described by:

y_{SPL} = m + \delta_{S} + \delta_{P} + \delta_{L}    (1)

Here m is the mean over all analysis results of a constituent in one experiment, δ_S is the random deviation from the mean of an individual sample in the experiment, δ_P is the laboratory bias, and δ_L denotes a random fluctuation of the analysis of the samples within the laboratory.

In the following statistical analysis, the standard deviations SD_S = σ(δ_S), SD_P = σ(δ_P), and SD_L = σ(δ_L) are estimated for each experiment. These describe the natural variability of constituent concentrations within an experiment, the systematic variance in laboratory-to-laboratory analysis results and the inter-laboratory variance of analysis results when analyzing identically prepared samples, respectively.

Where appropriate, data were analyzed using three approaches: AL data only, AL combined with OS-NIRS data, and OS-NIRS sensor data alone. In the statistical calculations described below, the subscript i denotes the sample number, j denotes the replicate number for a given sample, and k denotes the laboratory identifier. The evaluations were conducted separately for each of the three experiments.

2.2.1. Assessment of Sample Variability Within an Experiment

The variability of the mean constituent content within the samples of each experiment, SD_S, was calculated by:

SD_{S} = \sqrt{\frac{\sum_{s=1}^{n} (\bar{x}_{s} - m)^{2}}{n - 1}} \times CF    (2)

CV_{S} = 100 \times \frac{SD_{S}}{m}    (3)

where n denotes the number of samples, \bar{x}_{s} is the mean across all laboratory mean values \bar{x}_{i} for sample s, m is the mean across all laboratory mean values of the experiment, and CF is a correction factor to adjust for the small number of laboratories. A small sample variability suggests physically similar samples were collected within the experiment, while a large sample variability indicates the samples collected were diverse and cover a relevant range of the variability of analysis results of samples to be expected in the field.

When the degrees of freedom were low (i.e., small number of labs, samples, or replicates), the sample standard deviation would be a biased estimator that systematically underestimates the population standard deviation. This bias arises due to the limited information available to reliably estimate variability, particularly when the sample variance is calculated using a small number of independent observations [26]. To adjust for this, a correction factor (CF) can be applied to the standard deviation calculation, scaling it upward to provide a more accurate estimate of the population value. In this study, a CF was applied when the degrees of freedom were fewer than 10, with the CF values derived from those reported in [27].
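Equations (2) and (3), together with the small-sample correction, can be sketched in Python as follows. This is an illustration only: the CF is assumed here to be the reciprocal of the c4 unbiasing constant for a normal population, whereas the study takes its values from [27], which may differ. The function names and the starch values are hypothetical, and m is taken as the mean of the per-sample means, which matches the definition only for a balanced design.

```python
import math

def c4_correction(n):
    """Assumed CF: reciprocal of the c4 unbiasing constant for n observations."""
    c4 = math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2.0) / math.gamma((n - 1) / 2.0)
    return 1.0 / c4

def sample_variability(sample_means, df_threshold=10):
    """Return (SD_S, CV_S) from per-sample means, following Equations (2)-(3)."""
    n = len(sample_means)
    m = sum(sample_means) / n  # experiment mean (balanced design assumed)
    sd = math.sqrt(sum((x - m) ** 2 for x in sample_means) / (n - 1))
    # Scale upward only when the degrees of freedom fall below the threshold.
    cf = c4_correction(n) if (n - 1) < df_threshold else 1.0
    sd_s = sd * cf
    return sd_s, 100.0 * sd_s / m

# Hypothetical per-sample starch means (% of DM) for six samples
starch_means = [28.0, 31.5, 25.2, 33.1, 29.4, 30.8]
sd_s, cv_s = sample_variability(starch_means)
print(f"SD_S = {sd_s:.2f}, CV_S = {cv_s:.1f}%")
```

With six samples the degrees of freedom are five, so the correction is applied; with eighteen samples it would not be.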

To assess the extent of constituent variability captured within each experiment, we calculated the relative range (RR_E), defined as the experiment-specific range of a constituent (Max_E–Min_E) expressed as a percentage of the global range of that constituent observed across all experiments (Max_G–Min_G):

RR_{E} = 100 \times \frac{Max_{E} - Min_{E}}{Max_{G} - Min_{G}}    (4)

High relative range RR_E indicated that individual experiments encompassed most of the overall constituent variability.
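As a small illustration of Equation (4), the relative range can be computed as below. The function name and the starch values are hypothetical and serve only to show the arithmetic.

```python
def relative_range(experiment_values, all_values):
    """RR_E (Equation (4)): experiment range as a percentage of the global range."""
    span_e = max(experiment_values) - min(experiment_values)
    span_g = max(all_values) - min(all_values)
    return 100.0 * span_e / span_g

# Hypothetical starch values (% of DM): one experiment versus all experiments
experiment = [24.0, 27.5, 31.0, 34.0]
all_experiments = experiment + [14.0, 18.5, 22.0, 36.0]
print(f"RR_E = {relative_range(experiment, all_experiments):.1f}%")
```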

2.2.2. Within-Laboratory Repeatability

The pooled standard deviation (SD_P) of the replicate measurements within each sample was calculated to quantify the inherent variability and repeatability associated with measuring replicate subsamples within an individual laboratory:

SD_{P} = \sqrt{\frac{\sum_{i=1}^{n} \sum_{j=1}^{d} (x_{ijk} - \bar{x}_{ik})^{2}}{n \times (d - 1)}}    (5)

where n is the number of samples analyzed by a given laboratory; d is the number of replicate measurements per sample; x_{ijk} is the measurement of the j-th replicate of sample i in laboratory k; and \bar{x}_{ik} is the mean of the d replicates for sample i in laboratory k.
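A minimal Python sketch of Equation (5) for a single laboratory is given below. It assumes every sample has the same number of replicates d, as the equation does; the function name and the protein values are hypothetical.

```python
import math

def pooled_sd(replicates_by_sample):
    """Pooled within-laboratory SD (Equation (5)) for one laboratory.

    replicates_by_sample: list of lists, one inner list of d replicate
    values per sample; every sample is assumed to have the same d.
    """
    n = len(replicates_by_sample)
    d = len(replicates_by_sample[0])
    ss = 0.0
    for reps in replicates_by_sample:
        mean_ik = sum(reps) / d  # replicate mean for this sample
        ss += sum((x - mean_ik) ** 2 for x in reps)
    return math.sqrt(ss / (n * (d - 1)))

# Hypothetical protein results (% of DM): three samples, three replicates each
lab_k = [[7.1, 7.4, 7.0], [8.2, 8.0, 8.5], [6.9, 7.2, 7.1]]
print(f"SD_P = {pooled_sd(lab_k):.3f}")
```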

To evaluate whether within-laboratory repeatability (SD_P) differed among laboratories and sensors, Cochran’s C test was applied as recommended in ISO 5725-2. When Cochran’s C test indicated significant heterogeneity among variances (α = 0.05), the largest standard deviation was sequentially removed and the test repeated until a subset of variances among the remaining groups satisfied the assumption of homogeneity. For subsets containing only two remaining groups, an F-test was used to assess variance compatibility. Groups whose standard deviations were found to be statistically compatible were considered to have equivalent repeatability and were assigned the same grouping designation. This procedure was repeated independently for all subsets to identify groups of laboratories or sensors with comparable within-laboratory variability. The F-test was conducted as follows:

F = \frac{\max(SD_{P,1}^{2}, SD_{P,2}^{2})}{\min(SD_{P,1}^{2}, SD_{P,2}^{2})}    (6)

where SD_P,1 and SD_P,2 are the pooled within-laboratory standard deviations (Equation (5)) for the two groups being compared. The calculated F value was compared to the critical value from the F-distribution at α = 0.05 using the appropriate numerator and denominator degrees of freedom. If F > Fcritical, repeatability (variance) was concluded to differ significantly between the two groups.
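The sequential screening described above might be sketched as follows. This is a simplified illustration, not the authors' implementation: Cochran's C is computed as the largest variance divided by the sum of variances, the critical values are supplied by the caller (in practice from published tables for the relevant number of groups and replicates), and the constant critical value in the example is a placeholder, not a tabulated value. All names and pooled SDs are hypothetical.

```python
def cochran_c(variances):
    """Cochran's C statistic: largest variance over the sum of variances."""
    return max(variances) / sum(variances)

def f_ratio(var_1, var_2):
    """Variance ratio of Equation (6): larger variance over smaller variance."""
    return max(var_1, var_2) / min(var_1, var_2)

def homogeneous_subset(pooled_sds, critical_c):
    """Drop the largest SD until Cochran's C is no longer significant.

    pooled_sds: dict mapping laboratory label -> pooled SD (SD_P).
    critical_c: callable taking the number of groups and returning the
                critical C value (hypothetical; look up real values in tables).
    Stops when two groups remain; those are compared with f_ratio instead.
    """
    remaining = dict(pooled_sds)
    while len(remaining) > 2:
        variances = {lab: sd ** 2 for lab, sd in remaining.items()}
        if cochran_c(list(variances.values())) <= critical_c(len(variances)):
            break  # variances are treated as homogeneous
        worst = max(variances, key=variances.get)
        del remaining[worst]
    return remaining

# Hypothetical pooled SDs and a placeholder critical value (NOT from tables)
sds = {"Lab A": 0.40, "Lab B": 0.45, "Sen. U": 0.38, "Lab G": 1.60}
subset = homogeneous_subset(sds, critical_c=lambda groups: 0.6)
print(sorted(subset))
```

The removed groups would then be screened among themselves in the same way, as the procedure is repeated independently for all subsets.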

The pooled standard deviation (SD_P) provided an overall measure of repeatability within each lab by combining information from all samples and their replicates. However, this pooled value could obscure differences in variability among individual samples. To quantify the uncertainty associated with the pooled SD_P, we estimated a standard error (SE_P) based on the variability in the replicates across samples. Specifically, for each sample within a lab, we first calculated the standard deviation of the replicate measurements. We then computed the standard deviation of the sample-level standard deviations (SD_SLP) and divided it by the square root of 2(n_s − 1), where n_s is the number of samples tested:

SE_{P} = \frac{SD_{SLP}}{\sqrt{2 \cdot (n_{s} - 1)}}    (7)
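The steps leading to Equation (7) can be sketched in Python as below. The sketch follows the denominator as written in the equation; the function name and data are hypothetical.

```python
import math
import statistics

def se_of_pooled_sd(replicates_by_sample):
    """SE_P (Equation (7)) for one laboratory.

    replicates_by_sample: list of lists of replicate values, one per sample.
    """
    # Step 1: standard deviation of the replicates within each sample
    sample_sds = [statistics.stdev(reps) for reps in replicates_by_sample]
    # Step 2: standard deviation of those sample-level SDs (SD_SLP)
    sd_slp = statistics.stdev(sample_sds)
    # Step 3: scale by the square root of 2 * (n_s - 1)
    n_s = len(replicates_by_sample)
    return sd_slp / math.sqrt(2 * (n_s - 1))

# Hypothetical NDF results (% of DM): four samples, three replicates each
lab_k = [[41.2, 40.8, 41.5], [44.0, 43.1, 44.6], [39.5, 39.9, 39.6], [42.3, 41.0, 42.9]]
print(f"SE_P = {se_of_pooled_sd(lab_k):.3f}")
```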

To quantify the uncertainty inherent in the measurement process when samples were analyzed within individual labs, a pooled standard deviation (SD_CP) was calculated by aggregating the residual variation across all replicates, samples, and laboratories:

SD_{CP} = \sqrt{\frac{\sum_{k=1}^{l} \sum_{i=1}^{n} \sum_{j=1}^{d_{ik}} (x_{ijk} - \bar{x}_{ik})^{2}}{\sum_{k=1}^{l} \sum_{i=1}^{n} (d_{ik} - 1)}}    (8)

where x_{ijk} is the value of the j-th replicate for sample i in laboratory k, and \bar{x}_{ik} is the mean of the replicates for sample i in laboratory k.

We applied the same pairwise F-test approach to the pooled within-laboratory standard deviation across all laboratories (SD_CP) to determine whether including the OS-NIRS sensor data significantly affected the repeatability estimate. In this case, the variances corresponding to SD_CP calculated with and without OS-NIRS data were compared, using the same variance ratio method and critical values from the F-distribution described above.

To enable comparison of repeatability estimates across experiments, we quantified the precision of the pooled within-laboratory standard deviation (SD_CP). For each experiment, the standard error of the pooled within-laboratory standard deviation (SE_CP) was calculated to express the uncertainty associated with SD_CP. The standard error was computed as:

SE_{CP} = \frac{SD_{CP}}{\sqrt{2 \times df}}    (9)

where SD_CP is the pooled within-laboratory standard deviation, and df is the degrees of freedom used to calculate SD_CP. The degrees of freedom were determined as:

df = \sum_{k=1}^{l} \sum_{i=1}^{n} (d_{ik} - 1)    (10)

where l is the number of laboratories, n is the number of samples per laboratory, and d_ik is the number of replicate measurements for sample i in laboratory k.
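Equations (8) to (10) can be illustrated with the short Python sketch below, which allows the number of replicates to differ between sample and laboratory cells. The standard error is computed with the common approximation SD/√(2·df), which has the same units as SD_CP; treating Equation (9) this way is an assumption of the sketch. The data layout, names, and values are hypothetical.

```python
import math

def combined_pooled_sd(data):
    """Return (SD_CP, df, SE_CP) across laboratories.

    data: dict mapping laboratory -> dict mapping sample -> list of replicates.
    """
    ss = 0.0
    df = 0
    for samples in data.values():
        for reps in samples.values():
            mean_ik = sum(reps) / len(reps)
            ss += sum((x - mean_ik) ** 2 for x in reps)  # residual sum of squares
            df += len(reps) - 1                          # Equation (10)
    sd_cp = math.sqrt(ss / df)                           # Equation (8)
    se_cp = sd_cp / math.sqrt(2 * df)                    # assumed reading of Equation (9)
    return sd_cp, df, se_cp

# Hypothetical moisture results (% w.b.) for two laboratories and two samples
moisture = {
    "Lab A": {"s1": [62.1, 61.7, 62.4], "s2": [58.9, 59.3]},
    "Sen. U": {"s1": [61.9, 62.0, 62.2], "s2": [59.0, 59.1, 58.8]},
}
sd_cp, df, se_cp = combined_pooled_sd(moisture)
print(f"SD_CP = {sd_cp:.3f}, df = {df}, SE_CP = {se_cp:.3f}")
```

Running the function on AL data only and again on AL plus OS-NIRS data gives the two variances that the pairwise F-test compares.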

2.2.3. Inter-Laboratory Reproducibility

To characterize variability among laboratories, several metrics were calculated. The first (SD_L) quantified inter-laboratory variability on a sample-by-sample basis and reflected the reproducibility of individual sample results across laboratories. It was based on the deviation of each laboratory’s mean measurement (averaged across replicates) from the overall mean for that sample across all laboratories. This approach isolated the true inter-lab variability by removing within-lab variation from the calculation.

SD_{L} = \sqrt{\frac{\sum_{i=1}^{n} \sum_{k=1}^{l} (\bar{x}_{ik} - \bar{x}_{i})^{2}}{n \cdot (l - 1)}}    (11)

where n is the number of samples; l is the number of laboratories; \bar{x}_{ik} is the average of all replicates for sample i measured by laboratory k; and \bar{x}_{i} is the average of all replicates across all laboratories for sample i. A two-sided 95% confidence interval was obtained by scaling SD_L by 1.96, providing a range that reflects expected inter-laboratory variation in both directions.
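A minimal Python sketch of Equation (11) and the 1.96 scaling follows. It takes the per-sample mean as the mean of the laboratory means, which equals the average of all replicates only when every laboratory has the same number of replicates; that simplification, the function name, and the starch values are assumptions of the sketch.

```python
import math

def sd_between_labs(lab_means):
    """SD_L (Equation (11)).

    lab_means[i][k]: replicate mean of sample i in laboratory k; every
    sample is assumed to have a value from every laboratory.
    """
    n = len(lab_means)
    l = len(lab_means[0])
    ss = 0.0
    for row in lab_means:
        sample_mean = sum(row) / l  # mean across laboratories for this sample
        ss += sum((x - sample_mean) ** 2 for x in row)
    return math.sqrt(ss / (n * (l - 1)))

# Hypothetical starch means (% of DM): three samples, three laboratories
starch = [[30.1, 31.4, 29.8], [26.5, 27.9, 26.0], [33.2, 34.0, 32.1]]
sd_l = sd_between_labs(starch)
print(f"SD_L = {sd_l:.2f}, 95% CI scaling = {1.96 * sd_l:.2f}")
```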

The second metric, SD_G, quantified the extent of systematic differences between laboratories. Specifically, it measured how much each laboratory’s overall mean (averaged across all samples and replicates) deviated from the global mean across all data. Unlike SD_L, which varies by sample, this metric provides a single summary measure of lab-to-lab consistency, reflecting any persistent bias or offset in measurements attributable to individual laboratories:

SD_{G} = \sqrt{\frac{\sum_{k=1}^{l} (\bar{x} - \bar{x}_{k})^{2}}{l - 1}} \times CF    (12)

where l is the number of laboratories; \bar{x} is the grand mean across all laboratories, samples and replicates; and \bar{x}_{k} is the average for laboratory k across all samples and replicates. A two-sided 95% confidence interval was obtained by scaling SD_G by 1.96, providing a range that reflects expected inter-laboratory variation in both directions.
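Equation (12) can be sketched in the same way. Here the grand mean is taken as the mean of the laboratory means, which matches the definition only for a balanced design, and the correction factor is passed in by the caller rather than derived; both choices, along with the names and values, are assumptions made for illustration.

```python
import math

def sd_global(lab_overall_means, cf=1.0):
    """SD_G (Equation (12)) from each laboratory's overall mean.

    cf: small-sample correction factor supplied by the caller (1.0 = none).
    """
    l = len(lab_overall_means)
    grand_mean = sum(lab_overall_means) / l  # balanced design assumed
    ss = sum((grand_mean - x_k) ** 2 for x_k in lab_overall_means)
    return math.sqrt(ss / (l - 1)) * cf

# Hypothetical overall NDF means (% of DM) for five laboratories or sensors
ndf_means = [41.8, 43.0, 42.2, 40.9, 42.6]
sd_g = sd_global(ndf_means)
print(f"SD_G = {sd_g:.2f}, 95% CI scaling = {1.96 * sd_g:.2f}")
```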

To quantify the overall variability expected when samples are measured across laboratories, the inter-laboratory reproducibility standard deviation (SD_R) was calculated following the ISO 5725 standard [25]. This method separates total variation into within-lab repeatability (SD_r^2) and between-lab variability (SD_L^2). Repeatability was first estimated for each sample–lab pair (SD_{r,ik}^2, Equation (13)), then averaged across all samples and labs to obtain SD_r^2 (Equation (14)). The total reproducibility (Equation (15)) was computed as the square root of the sum of SD_r^2 and SD_L^2, the latter from Equation (11). Unlike the approaches used in Equations (11) and (12), which summarize inter-lab differences directly from sample or global means, the ISO method formally incorporates both random and systematic variation to provide a comprehensive estimate of reproducibility.

SD_{r,ik}^{2} = \frac{1}{d_{ik} - 1} \sum_{j=1}^{d_{ik}} (x_{ijk} - \bar{x}_{ik})^{2}    (13)

SD_{r}^{2} = \frac{1}{n \cdot l} \sum_{i=1}^{n} \sum_{k=1}^{l} SD_{r,ik}^{2}    (14)

SD_{R} = \sqrt{SD_{L}^{2} + SD_{r}^{2}}    (15)
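Equations (13) to (15) can be combined in a short Python sketch. The between-laboratory term SD_L is passed in as an argument, since it comes from Equation (11); the data layout, function name, and numbers are hypothetical, and every sample is assumed to be measured by every laboratory so that the average runs over n × l cells.

```python
import math
import statistics

def reproducibility_sd(data, sd_l):
    """SD_R (Equation (15)) from replicate data and a supplied SD_L.

    data: dict mapping laboratory -> dict mapping sample -> list of replicates.
    sd_l: between-laboratory SD from Equation (11).
    """
    # Equation (13): replicate variance for every sample-laboratory cell
    cell_variances = [
        statistics.variance(reps)
        for samples in data.values()
        for reps in samples.values()
    ]
    # Equation (14): average over all n * l cells
    sd_r_squared = sum(cell_variances) / len(cell_variances)
    # Equation (15): combine within-lab and between-lab components
    return math.sqrt(sd_l ** 2 + sd_r_squared)

# Hypothetical protein results (% of DM) and a hypothetical SD_L
protein = {
    "Lab A": {"s1": [7.2, 7.5, 7.1], "s2": [8.1, 8.4, 8.0]},
    "Sen. U": {"s1": [7.0, 7.1, 7.0], "s2": [7.9, 8.0, 8.1]},
}
print(f"SD_R = {reproducibility_sd(protein, sd_l=0.25):.3f}")
```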

3. Results

3.1. Comparison of Analytical Lab Averages with OS-NIRS Sensor Averages

Moisture measurements from the OS-NIRS sensors generally tracked AL values across the range of sample moisture levels, with the closest agreement for high-moisture samples (Figure 1). Greater differences emerged at lower moisture levels, where the OS-NIRS sensors tended to report lower values than the ALs. For protein content, the large scatter within each experiment showed wide variability in measurement consistency across both the ALs and the OS-NIRS sensors.

In Experiments 1 and 3, starch and NDF measurements from the OS-NIRS sensors generally aligned well with those obtained from ALs (Figure 1). In contrast, Experiment 2 showed substantial discrepancies, with the OS-NIRS sensors consistently reporting higher starch and lower NDF values than the ALs. These discrepancies may indicate that the OS-NIRS sensors were not sufficiently calibrated for samples with low starch and high NDF contents.

Across both moisture and protein measurements, analytical laboratories generally exhibited greater variability than the OS-NIRS sensors (Figure 2). For moisture content, the range of lab results was similar to OS-NIRS in Experiment 2 but substantially greater in Experiment 3. For protein content, lab ranges exceeded those of OS-NIRS in Experiments 2 and 3, suggesting that sensor-based measurements were more repeatable across samples and experiments.

For both starch and NDF, the analytical laboratories generally exhibited greater variability than the OS-NIRS sensors (Figure 2). Differences between methods were minor in Experiment 1, but in Experiments 2 and 3, laboratory results showed considerably wider ranges than OS-NIRS, indicating that sensor-based measurements were more repeatable for these constituents under those conditions.

3.2. Sample Variability Within an Experiment

Assessment of sample variability within each experiment revealed differing degrees of consistency among replicate samples analyzed by the ALs and the OS-NIRS sensors (Table 2). Moisture and protein values generally exhibited low to moderate CV (i.e., CV less than 10%), indicating reasonable sample homogeneity. For example, in Experiment 2, moisture and protein had CVs of 3.6% and 3.8%, respectively, suggesting small variability among all replicates. Although the CVs for moisture and protein were greater in Experiments 1 and 3, for the most part, they were less than 12%.

Starch content showed the highest variability across experiments. The CV for starch in Experiment 2 was 22.3%, indicating substantial heterogeneity among replicate samples or challenges in accurate quantification at low concentrations (average 21.3%). Similarly, starch CVs were moderately high in Experiments 1 (10.0%) and 3 (12.2%), further suggesting that starch was more sensitive to sampling inconsistencies or analytical precision limits. Variability of NDF content was moderate and relatively consistent, with CVs ranging from 5.4% to 9.7% across all three experiments.

For most constituents, the relative range exceeded 50%, indicating that each experiment captured more than half of the total observed variability across all experiments. This suggests that the sampling within each experiment was sufficiently broad to represent a relevant portion of the range of values encountered in the study, supporting the robustness of comparisons across experiments.

3.3. Within-Laboratory Repeatability

Moisture content measurements in Experiment 2 were highly repeatable within both ALs and OS-NIRS sensors, as indicated by low pooled SD (Figure 3). Although there were some significant differences in repeatability among the ALs or OS-NIRS sensors in this experiment, the SEs were small for both methods. In Experiment 3, measurement variability was more pronounced. Lab G, in particular, exhibited significantly higher pooled SD and SE than other methods. In contrast, the OS-NIRS sensors showed consistently lower variability, reflecting significantly greater repeatability in moisture content measurements than any of the ALs.

Although Lab A had the greatest pooled SD, there were no significant differences in protein content among methods in Experiment 1 (Figure 3), indicating that all ALs and OS-NIRS sensors exhibited relatively consistent measurement repeatability. All three OS-NIRS sensors had greater SEs than the two ALs. In Experiment 2, significant differences were observed among methods. Labs D and E showed the highest pooled SDs and the largest SEs, reflecting greater variability, while Lab F exhibited repeatability similar to the rest of the comparisons. The OS-NIRS sensors (Sen. X, Y, Z) and Lab C had significantly lower pooled SDs, indicating better repeatability. In Experiment 3, the OS-NIRS sensors outperformed the ALs, with Sen. U and Sen. V producing the lowest pooled SDs and smallest SEs. Conversely, Labs A, B, and G demonstrated significantly higher variability. These results suggest that, under the conditions of Experiment 3, the OS-NIRS sensors provided more repeatable protein measurements than the ALs.

In Experiment 1, pooled SDs for starch varied across methods (Figure 3). Lab A had the lowest pooled SD, indicating the best repeatability. In contrast, the OS-NIRS sensors (Sen. U, V, W) showed significantly higher pooled SDs and larger SEs, suggesting lower repeatability. However, no significant differences were detected among the sensors or between Lab B and the sensors. In Experiment 2, Lab E and the OS-NIRS sensors (Sen. X, Y, Z) had the lowest pooled SDs and smallest SEs, indicating the most consistent starch measurements. Labs D and F exhibited significantly greater variability than sensors Y and Z. In Experiment 3, the OS-NIRS sensors (Sen. U and V) again showed the highest repeatability, with the lowest pooled SDs and SEs among all methods. Labs A, B, and G displayed significantly greater variability, with Lab B having the highest pooled SD. These results show the consistent repeatability of OS-NIRS sensors for starch measurement under the conditions of these experiments.

In Experiment 1, NDF measurement repeatability varied widely among methods (Figure 3). Lab A had the lowest pooled SD, indicating the most consistent measurements, while the OS-NIRS sensors (Sen. U, V, W) showed significantly higher pooled SDs and larger SEs, suggesting lower repeatability. Lab B’s variability was not significantly different from either Lab A or the OS-NIRS sensors. In Experiment 2, pooled SDs were generally lower than in Experiment 1, with only a few significant differences among methods, indicating comparable repeatability across the ALs and OS-NIRS sensors. In Experiment 3, the OS-NIRS sensors (Sen. U and V) had the lowest pooled SDs and SEs, reflecting the highest repeatability, while Labs A, B, and G exhibited significantly greater variability.

While SD_P reflects the repeatability of a single lab for a given sample set, SD_CP represents the average within-lab repeatability across all methods included in the calculation; a smaller SD_CP indicates that, on average, the applied method produced more consistent repeated measurements. In this analysis, we compared the SD_CP using two different methods: AL data alone and OS-NIRS data alone. Methods in Experiment 2 produced more consistent moisture measurements than those in Experiment 3, as shown by the generally lower SD_CP and SE_CP values (Figure 4). Relying solely on OS-NIRS sensor data significantly (p < 0.05) reduced measurement consistency in Experiment 2 but improved it in Experiment 3.
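The pooling idea behind SD_P and SD_CP can be sketched as follows. This is a minimal illustration that assumes the textbook pooled standard deviation, in which each group's variance is weighted by its degrees of freedom; Equations (5) and (8) are defined in the Methods and may include corrections that are not reproduced here. The function name `pooled_sd` and the replicate values are hypothetical.

```python
import math
from statistics import variance

def pooled_sd(groups):
    """Pool replicate SDs across groups, weighting each variance by its degrees of freedom."""
    weighted_sum = 0.0
    dof = 0
    for reps in groups:
        if len(reps) < 2:
            continue  # a single replicate carries no within-group information
        weighted_sum += (len(reps) - 1) * variance(reps)
        dof += len(reps) - 1
    if dof == 0:
        raise ValueError("need at least one group with two or more replicates")
    return math.sqrt(weighted_sum / dof)

# Hypothetical protein replicates (% of DM): three samples measured by one lab
lab_reps = [[7.1, 7.3, 7.2], [7.8, 7.6, 7.9], [6.9, 7.0, 7.2]]
sd_p = pooled_sd(lab_reps)
```

Under the same assumption, a value analogous to SD_CP could be obtained by passing the replicate groups of every lab or sensor in a method category to the same function.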

In all three experiments, relying on OS-NIRS alone generally improved the consistency of protein measurements, although not all differences were significant (Figure 4). A similar pattern was observed for starch and NDF in Experiments 2 and 3, whereas the opposite was true in Experiment 1. Consistency for starch and NDF was also greater in Experiments 2 and 3 than in Experiment 1, as indicated by lower SD_CP and SE_CP values.

3.4. Inter-Laboratory Reproducibility

To characterize variability among laboratories, we again compared results using three approaches: AL data alone, AL data combined with OS-NIRS data, and OS-NIRS data alone. Variability was evaluated in relation to the average constituent values using two metrics: SD_L, which quantified inter-laboratory variability on a sample-by-sample basis (Equation (11)), and SD_G, which measured systematic differences among laboratories (Equation (12)). For both metrics, two-sided 95% confidence intervals were obtained by scaling with 1.96.
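The 1.96 scaling can be illustrated with a short sketch. It assumes only that a standard deviation (SD_L or SD_G) has already been computed; the function name `ci95` and the SD value of 1.0 are hypothetical and are not results from this study. Both the half-width and the full width are returned, because this section does not state which of the two the reported CI values represent.

```python
def ci95(mean, sd, z=1.96):
    """Return (lower, upper, half_width, full_width) for the interval mean +/- z * sd."""
    half_width = z * sd
    return mean - half_width, mean + half_width, half_width, 2 * half_width

# Hypothetical inputs: mean moisture of 77.7 % w.b. and an SD of 1.0 percentage point
lower, upper, half_width, full_width = ci95(77.7, 1.0)

# Half-width expressed as a percentage of the mean
relative_half_width = 100 * half_width / 77.7
```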

In Experiment 2, the SD_L CIs for moisture content were 2.44, 3.43, and 3.48 percentage points for the three methods, respectively, while the SD_G CIs were 2.58, 3.72, and 3.74 (Figure 5). In Experiment 3, the corresponding SD_L spans were 4.22, 4.36, and 1.69, and the SD_G spans were 2.59, 2.68, and 2.30. Given that average moisture was in the 75% range in Experiment 2 and in the 55% range in Experiment 3, these values represent differences of only a few percentage points relative to the means. Overall, incorporating OS-NIRS data with the AL measurements, or relying solely on OS-NIRS sensor data, had minimal impact on inter-laboratory reproducibility.

Across Experiments 1, 2, and 3, the SD_L CIs for protein content ranged from 0.33 to 1.40 percentage points, and the SD_G CIs from 0.32 to 1.44 (Figure 5). Given the magnitude of the protein concentrations, these confidence intervals reflect substantial inter-laboratory variability and suggest poor reproducibility among laboratories (i.e., ALs and OS-NIRS), especially in Experiments 1 and 3. Incorporating OS-NIRS data with the AL measurements, or relying solely on OS-NIRS sensor data, slightly improved the inter-laboratory reproducibility for these two experiments.

Across Experiments 1, 2, and 3, the SD_L CIs for starch content ranged from 2.62 to 9.43 percentage points, and the SD_G CIs from 1.95 to 9.53 (Figure 6). Starch measurement uncertainty was substantial in all experiments, particularly in Experiment 2, where the average starch concentration was low. The wide confidence intervals reflect substantial inter-laboratory variability and limited reproducibility across methods. Adding OS-NIRS sensor data to the AL measurements further increased uncertainty in all three experiments. When considered alone, OS-NIRS sensor data showed reproducibility comparable to AL data in Experiments 1 and 3.

Across Experiments 1, 2, and 3, the SD_L CIs for NDF content ranged from 1.56 to 10.88 percentage points, and the SD_G CIs from 1.09 to 8.13 (Figure 6). Measurement uncertainty was considerable in Experiment 2, coinciding with the highest average NDF concentration. The effect of including OS-NIRS sensor data with the AL data was inconsistent, narrowing confidence intervals in Experiment 2 but widening them in Experiments 1 and 3; a similar pattern was observed when analyzing OS-NIRS sensor data alone.

Constituent means obtained from the three analytical approaches (ALs only, ALs combined with OS-NIRS, and OS-NIRS sensors only) were generally in close agreement across all experiments (Table 3). Differences were typically within the range of the corresponding reproducibility standard deviations (SD_R). For most constituents, SD_R values were similar between analytical approaches, indicating that inclusion of OS-NIRS-derived data did not materially affect overall reproducibility, which reflects both within- and between-laboratory variability. Coefficients of variation (CV) depended more on constituent type and experiment than on analytical approach. Moisture consistently displayed low CVs (< 5%), whereas protein, starch, and NDF demonstrated higher relative variability. Across the three approaches, CV values were generally less than 10%, with occasional values greater than 10%, and no single approach consistently showed superior reproducibility. These patterns suggest that inherent sample heterogeneity and constituent-specific analytical challenges were the primary drivers of variability, rather than the choice of analytical approach.
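The comparison of mean differences with SD_R can be made concrete with the Experiment 3 values from Table 3. The sketch below is only an illustration: the criterion used (the absolute difference between approaches A and B must not exceed the smaller of the two SD_R values) is an assumption chosen to be conservative, not a test applied in this study, and the function name `within_sd_r` is hypothetical.

```python
# Table 3, Experiment 3: (mean, SD_R) for approach A (AL only) and B (AL + OS-NIRS)
exp3 = {
    "moisture": {"A": (58.1, 2.14), "B": (57.4, 1.92)},
    "protein": {"A": (7.8, 0.74), "B": (7.5, 0.85)},
    "starch": {"A": (40.5, 3.40), "B": (41.7, 3.08)},
    "ndf": {"A": (32.3, 3.01), "B": (33.0, 2.66)},
}

def within_sd_r(row):
    """True if |mean_A - mean_B| does not exceed the smaller of the two SD_R values."""
    (mean_a, sd_a), (mean_b, sd_b) = row["A"], row["B"]
    return abs(mean_a - mean_b) <= min(sd_a, sd_b)

results = {name: within_sd_r(row) for name, row in exp3.items()}
```

The same check can be applied to the other experiments and approach pairs in Table 3; individual comparisons need not all pass, which is consistent with the word "typically" above.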

4. Discussion

The objective of this study was to evaluate intra- and inter-laboratory precision of whole-plant corn (WPC) compositional analyses, with particular emphasis on the inclusion of on-site NIRS (OS-NIRS) sensors alongside commercial analytical laboratories (ALs). Across the three experiments, our results demonstrated that OS-NIRS measurements of moisture, starch, and NDF generally agreed with AL values, though experiment-specific biases were observed, especially under conditions of low starch and high NDF (Experiment 2). This finding aligns with prior reports that calibration robustness is a key determinant of sensor performance, particularly when sample composition deviates from the calibration range [8,9,28]. Nonetheless, the ability of OS-NIRS sensors to generate results comparable to laboratory instruments supports their potential role as complementary tools for forage quality assessment [16,17].

Physical differences in WPC morphology, including the relative proportions of leaves, stems, and kernels and associated variations in tissue structure and hardness, can influence NIR reflectance and compositional predictions, suggesting that local bias adjustments or region-specific calibrations may improve accuracy across diverse germplasm and environments [28]. In addition, the physical state of samples influences NIR reflectance because water in fresh, intact plant tissue produces strong absorption features and the larger surface heterogeneity of unground material alters light scattering, often reducing calibration precision compared to dried, ground samples; this effect has been observed in forage and silage NIR studies and highlights how sample state contributes to variability in predictive performance [29].

Several factors should be considered when interpreting these results. OS-NIRS measurements used a common manufacturer-supplied calibration that was not independently validated with reference analyses; accordingly, results are best interpreted in terms of relative differences and analytical precision, with sensor reproducibility reflecting instrument and sampling effects, while inter-laboratory comparisons also include differences among proprietary calibrations. Because OS-NIRS calibrations are based on laboratory reference values, the sensor can reproduce the mean behavior of repeated laboratory measurements, but variability in the reference method is necessarily reflected in the calibration, so observed differences are not solely attributable to the sensor. The experiments were few in number and were conducted in a small number of regions, so findings may not capture the full diversity of cropping systems, environmental conditions, or management practices. Regional variability could influence calibration robustness and the transferability of OS-NIRS performance. The results observed in Experiment 2 may reflect compositional extremes and regional forage characteristics, and while local calibration adjustment using reference analyses could improve performance under such conditions, this was beyond the scope of the present study. Differences in maize hybrid genetics and growing conditions were not explicitly controlled in this study and may contribute to variability in spectral response; however, these factors were not expected to materially affect the comparative assessment of OS-NIRS and laboratory precision under practical use conditions. In Experiment 1, inter-laboratory reproducibility was estimated using only two laboratories and therefore represents a limited assessment that should be interpreted cautiously relative to formal ISO 5725 reproducibility evaluations. Future multi-region studies incorporating independent reference analyses are needed to assess generalizability and calibration transferability across diverse production systems.

With respect to intra-laboratory repeatability, OS-NIRS sensors performed similarly to, or in some cases better than, ALs. For instance, OS-NIRS provided highly repeatable protein and starch measurements in Experiments 2 and 3, often outperforming several ALs that had greater within-lab variability. These results support our first hypothesis that OS-NIRS would achieve within-laboratory repeatability comparable to ALs, even when analyzing heterogeneous, undried, and unground samples [5,10]. The enhanced repeatability of OS-NIRS is particularly notable given the lack of sample drying and grinding, which are standard practices in ALs to minimize heterogeneity [12,14]. This suggests that modern portable OS-NIRS technology has advanced sufficiently to deliver consistent results despite greater sample variability, an outcome also reported in recent inter-comparison studies [18]. The improved repeatability observed for some constituents, and in particular protein, with OS-NIRS may partly reflect the larger effective sample volume integrated across repeated measurements, which can reduce sampling-related variance compared with laboratory analyses based on small subsamples. While not explicitly tested here, this interpretation is consistent with established sampling error theory in near-infrared spectroscopy [30].

In terms of inter-laboratory reproducibility, substantial variability persisted across both ALs and OS-NIRS for starch and NDF, consistent with the inherent heterogeneity of WPC samples [21]. However, the inclusion of OS-NIRS data did not materially degrade reproducibility estimates for most constituents. In some cases, such as NDF in Experiment 2, OS-NIRS inclusion even reduced overall uncertainty. These findings partly confirm our second hypothesis that incorporating OS-NIRS would yield reproducibility metrics similar to those obtained when only AL data were considered. They also reflect broader challenges in achieving high reproducibility across forage laboratories, where differences in calibration sets, instrument configurations, and sample handling contribute to variability [16,20].

Although inclusion of OS-NIRS data occasionally widened confidence intervals for inter-laboratory comparisons, particularly for starch, these effects were modest relative to the overall magnitude of sample variability. Importantly, constituent means derived from combined AL and OS-NIRS analyses were nearly identical to those from ALs alone, with differences well within reproducibility standard deviations. This suggests that OS-NIRS sensors can be integrated into multi-lab networks without introducing systematic bias, provided that calibration models remain current and representative of the full range of expected forage compositions [3,24].

When directly comparing analytical strategies, clear differences emerged. Including OS-NIRS data alongside AL data generally improved intra-laboratory repeatability for protein and sometimes starch, but its effect on inter-laboratory reproducibility was more variable. In contrast, OS-NIRS data used alone provided repeatability metrics on par with, and in some cases superior to, ALs, though reproducibility outcomes were mixed. For example, OS-NIRS-only analysis narrowed confidence intervals for starch and NDF in Experiment 2 but widened them in Experiments 1 and 3. These contrasting patterns indicate that OS-NIRS is best positioned to strengthen within-laboratory consistency, while its effect on cross-laboratory reproducibility depends strongly on constituent type and experimental context.

Overall, our findings highlight the trade-offs inherent in deploying OS-NIRS in practical on-site WPC forage analysis. While OS-NIRS may not fully eliminate inter-laboratory variability, it offers substantial benefits in measurement repeatability, speed, and accessibility at the farm level. This capability enables producers and nutritionists to generate timely, reliable compositional estimates that support strategic storage decisions at the time of harvest, while still maintaining compatibility with commercial laboratory analyses. When results from the OS-NIRS devices are compared with those from the ALs for the constituents tested, absolute differences on the order of the typical reproducibility standard deviations of both methods (Table 3) can be expected. Continued efforts to improve calibration transferability and to expand calibration datasets will further enhance the role of OS-NIRS in complementing laboratory-based systems [7,21].

5. Conclusions

This study confirmed that on-site NIRS sensors can achieve within-laboratory repeatability comparable to commercial laboratories. Although inter-laboratory reproducibility remained variable, inclusion of OS-NIRS data produced results broadly consistent with those of conventional analytical laboratory analyses. When used alone, OS-NIRS sensors generated constituent means closely aligned with ALs and provided repeatability equal to or greater than laboratories, though reproducibility outcomes were mixed and strongly dependent on constituent type and experimental context. Future research should prioritize improving OS-NIRS sensor calibration robustness under diverse conditions and expanding validation datasets to strengthen reproducibility and ensure seamless integration of OS-NIRS into forage analysis networks.

Author Contributions

Conceptualization, P.S. and M.F.D.; methodology, M.F.D., K.J.S. and P.S.; formal analysis, K.J.S. and P.S.; investigation, M.F.D. and A.J.T.; resources, M.F.D.; data curation, K.J.S., A.J.T. and P.S.; writing—original draft preparation, K.J.S. and P.S.; writing—review and editing, K.J.S. and M.F.D.; supervision, M.F.D.; project administration, M.F.D.; funding acquisition, M.F.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by John Deere Material Properties Sensing Group.

Data Availability Statement

The data are not publicly available but may be made available on request from the corresponding author.

Acknowledgments

During the preparation of this manuscript, the authors used ChatGPT version 5 to edit the grammar and flow of the original draft. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare that researchers affiliated with the funding organization were directly involved in the design of the study, the collection and analysis of data, and the preparation and development of this manuscript.

Abbreviations

The following abbreviations are used in this manuscript:

CI	Confidence interval
CV	Coefficient of variation
DM	Dry matter
OS-NIRS	On-Site Near-Infrared Spectroscopy
NDF	Neutral detergent fiber
SD	Standard deviation
SE	Standard error
w.b.	Wet basis
WPC	Whole plant corn

References

1. Klopfenstein, T.J.; Erickson, G.E.; Berger, L.L. Maize is a critically important source of food, feed, energy and forage in the USA. Field Crops Res. 2013, 153, 5–11.
2. Hossain, M.E.; Kabir, M.A.; Zheng, L.; Swain, D.L.; McGrath, S.; Medway, J. Near-infrared spectroscopy for analysing livestock diet quality: A systematic review. Heliyon 2024, 10, e40016.
3. Vincent, B.; Dardenne, P. Application of NIR in Agriculture. In Near-Infrared Spectroscopy; Springer: Singapore, 2021; pp. 331–345.
4. Harris, P.A.; Nelson, S.; Carslake, H.B.; Argo, C.M.; Wolf, R.; Fabri, F.B.; Brolsma, K.M.; van Oostrum, M.J.; Ellis, A.D. Comparison of NIRS and wet chemistry methods for the nutritional analysis of haylages for horses. J. Equine Vet. Sci. 2018, 71, 13–20.
5. Weiss, W.; Hall, M.B. Laboratory Methods for Evaluating Forage Quality. In Forages; Moore, K.J., Collins, M., Nelson, C.J., Redfearn, D.D., Eds.; Wiley: Hoboken, NJ, USA, 2020; pp. 659–672.
6. Dias, C.M.; Nunes, H.; Borba, A. Near-Infrared Spectroscopy in animal nutrition: Historical insights, technical principles, and practical applications. Analytica 2024, 5, 481–498.
7. Ozaki, Y.; Huck, C.W.; Beć, K.B. Near-IR Spectroscopy and Its Applications. In Molecular and Laser Spectroscopy; Gupta, V.P., Ed.; Elsevier: San Diego, CA, USA, 2018; pp. 11–38.
8. Yang, X.; Cherney, J.H.; Casler, M.D.; Berzaghi, P. Forage calibration transfer from laboratory to portable near infrared spectrometers. J. Near Infrared Spectrosc. 2023, 31, 126–140.
9. Acosta, J.J.; Castillo, M.S.; Hodge, G.R. Comparison of benchtop and handheld near-infrared spectroscopy devices to determine forage nutritive value. Crop Sci. 2020, 60, 3410–3422.
10. Yang, X.; Cerezo, A.A.; Berzaghi, P.; Magrin, L. Comparative near Infrared (NIR) spectroscopy calibrations performance of dried and undried forage on dry and wet matter bases. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2024, 316, 124287.
11. Cassida, K.A.; Collins, H. USDA LTAR Common Experiment measurement: Obtaining quality metrics in forage aboveground biomass v1. Protocols 2024.
12. Alomar, D.; Fuchslocher, R.; de Pablo, M. Effect of preparation method on composition and NIR spectra of forage samples. Anim. Feed Sci. Technol. 2003, 107, 191–200.
13. Undersander, D.J.; Mertens, D.R.; Thiex, N. Forage Analysis Procedures; National Forage Testing Association: Omaha, NE, USA, 1993.
14. Jeong, E.C.; Han, K.J.; Ahmadi, F.; Li, Y.F.; Wang, L.L.; Yu, Y.S.; Kim, J.G. Application of near-infrared spectroscopy for hay evaluation at different degrees of sample preparation. Anim. Biosci. 2024, 37, 1196–1203.
15. Shenk, J.S.; Workman, J.J.J.; Westerhaus, M.O. Application of NIR spectroscopy to agricultural products. In Handbook of Near-Infrared Analysis; Burns, D.A., Ciurczak, E.W., Eds.; Dekker Inc.: New York, NY, USA, 2001; pp. 383–431.
16. Castillo, M.S.; Griggs, T.C.; Digman, M.F.; Vendramini, J.M.B.; Dubeux, J.C.B.; Pedreira, C.G.S. Reporting forage nutritive value using near-infrared reflectance spectroscopy. Crop Sci. 2025, 65.
17. Le Cocq, K.; Harris, P.; Bell, N.; Burden, F.; Lee, M.R.F.; Davies, D.R. Comparisons of commercially available NIRS-based analyte predictions of haylage quality for equid nutrition. Anim. Feed Sci. Technol. 2022, 283, 115158.
18. Loučka, R.; Jambor, V.; Nedělník, J.; Lang, J.; Homolka, P.; Jančík, F.; Koukolová, V.; Kubelková, P.; Tyrolová, Y.; Výborná, A. Differences between chemical analysis and portable near-infrared reflectance spectrometry in maize hybrids. Czech J. Anim. Sci. 2022, 67, 176–184.
19. ISO 5725-1:2023; Accuracy (Trueness and Precision) of Measurement Methods and Results – Part 1: General Principles and Definitions. ISO: Vernier, Switzerland, 2023. Available online: https://www.iso.org/standard/69418.html (accessed on 19 August 2025).
20. Saha, U.K.; Kern-Lunbery, R. Accuracy and Precision of Near Infra-Red Spectroscopy (NIRS) Versus Wet Chemistry in Forage Analysis. [Online]. Available online: https://www.foragetesting.org/ (accessed on 19 August 2025).
21. Malebana, I.M.M.; Cherney, D.J.R.; Parsons, D.; Cox, W.J. Corn silage analysis as influenced by sample size collected. Anim. Feed Sci. Technol. 2015, 210, 17–25.
22. Nagpal, M.; Heilemann, J.; Samaniego, L.; Klauer, B.; Gawel, E.; Klassert, C. Measuring extremes-driven direct biophysical impacts in agricultural drought damages. Nat. Hazards Earth Syst. Sci. 2025, 25, 2115–2135.
23. Feng, X.; Cherney, J.H.; Cherney, D.J.R.; Digman, M.F. Practical Considerations for Using the NeoSpectra-Scanner Handheld Near-Infrared Reflectance Spectrometer to Predict the Nutritive Value of Undried Ensiled Forage. Sensors 2023, 23, 1750.
24. McIntosh, D.W.; Anderson-Husmoen, B.J.; Kern-Lunbery, R.; Goldblatt, P.; Lemus, R.; Griggs, T.; Bauman, L.; Boone, S.; Shewmaker, G.; Teutsch, C. Guidelines for Optimal Use of NIRSC Forage and Feed Calibrations in Membership Laboratories, 2nd ed.; The University of Tennessee Press: Knoxville, TN, USA, 2022; Available online: https://trace.tennessee.edu/utk_planpubs/100 (accessed on 23 August 2025).
25. ISO 5725-2:2019; Accuracy (Trueness and Precision) of Measurement Methods and Results – Part 2: Basic Method for the Determination of Repeatability and Reproducibility of a Standard Measurement Method. ISO: Vernier, Switzerland, 2019. Available online: https://www.iso.org/standard/69419.html (accessed on 28 August 2025).
26. Gurland, J.; Tripathi, R.C. A simple approximation for unbiased estimation of the standard deviation. Am. Stat. 1971, 25, 30–32.
27. Duncan, A.J. Quality Control and Industrial Statistics, 4th ed.; Richard D. Irwin, Inc.: Homewood, IL, USA, 1974.
28. Pupo, M.R.; Diepersloot, E.C.; de Paula, E.M.; Dórea, J.R.R.; Ghizzi, L.G.; Ferraretto, L.F. Real-time dry matter prediction in whole-plant corn forage and silage using portable near-infrared spectroscopy. Animals 2025, 15, 2349.
29. Ikoyi, A.Y.; Younge, B.A. Influence of forage particle size and residual moisture on near infrared reflectance spectroscopy (NIRS) calibration accuracy for macro-mineral determination. Anim. Feed Sci. Technol. 2020, 270, 114674.
30. Esbensen, K.H.; Romañach, R.J. A Framework for Representative Sampling for NIR Analysis: Theory of Sampling (TOS). In Handbook of Near-Infrared Analysis; Ciurczak, E.W., Igne, B., Workman, J., Jr., Burns, D.A., Eds.; CRC Press: Boca Raton, FL, USA, 2021; pp. 415–462.

Figure 1. Comparison of OS-NIRS sensors and analytical laboratory measurements of moisture content (% wet basis) and protein, starch and NDF (% of DM) for whole-plant corn samples. Each point represents the mean constituent value of a sample, averaged across all replicates, with the x-axis showing analytical laboratory values and the y-axis showing OS-NIRS sensor values. Data is from Experiments 1 (green circles); 2 (red squares); and 3 (blue triangles). The dashed line denotes the 1:1 line of identity, illustrating agreement between methods.

Figure 2. Range of constituent values (maximum–minimum) for OS-NIRS sensors and analytical laboratories across three experiments. Each box represents the interquartile range (IQR), the line indicates the median, whiskers extend to the minimum and maximum values, and the cross marks the mean.

Figure 3. Pooled standard deviation (SD_P; Equation (5)) for moisture content (% w.b.) and protein, starch, and NDF (% of DM) across laboratories and OS-NIRS sensors. Error bars represent standard error (SE_P; Equation (7)). Data is from Experiments 1 (blue bars); 2 (yellow bars); and 3 (green bars). Within each experiment, columns with different letters denote significantly different pooled SD_P based on Cochran’s C test and the F-test (p < 0.05).

Figure 4. Pooled standard deviation (SD_CP; Equation (8)) for moisture content (% w.b.) and protein, starch, and NDF (% of DM) across laboratories. Error bars represent standard error (SE_CP; Equation (9)). Data is from Experiments 1 (blue bars); 2 (yellow bars); and 3 (green bars). Within each experiment, columns with different letters denote significantly different pooled SD_CP based on F-test (p < 0.05). A = analytical laboratory (AL) data only; C = OS-NIRS data only.

Figure 5. Average moisture and protein content for each experiment. Solid error bars show the 95% CI for inter-laboratory variability among samples (SD_L × 1.96; Equation (11)). Dashed error bars show the 95% CI for systematic lab-to-lab differences (SD_G × 1.96; Equation (12)). Data is from Experiments 1 (blue bars); 2 (brown bars); and 3 (green bars). A = analytical laboratory (AL) data only; B = both AL and OS-NIRS data; C = OS-NIRS data only.

Figure 6. Average starch and NDF content for each experiment. Solid error bars show the 95% CI for inter-laboratory variability among samples (SD_L × 1.96; Equation (11)). Dashed error bars show the 95% CI for systematic lab-to-lab differences (SD_G × 1.96; Equation (12)). Data is from Experiments 1 (blue bars); 2 (brown bars); and 3 (green bars). A = analytical laboratory (AL) data only; B = both AL and OS-NIRS data; C = OS-NIRS data only.

Table 1. Overview of whole-plant corn sample distribution, number of replicates per sample, and total samples analyzed by analytical laboratories and on-site NIRS sensors across three experiments.

		Analytical Laboratories			On-Site NIRS
Experiment No.	No. of Samples	No. of Labs	No. of Replicates ^[a]	Total Samples Analyzed	No. of Sensors	No. of Replicates ^[a]	Total Samples Analyzed
1	8	2	4	64	3	2	48
2	6	4	3	72	3	5	90
3	10	3	3	90	2	2	40

[a] Number of replicates analyzed per sample.
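The totals in Table 1 follow from multiplying the number of samples by the number of labs (or sensors) and by the replicates per sample. The short check below reproduces that arithmetic from the table values; the names `TABLE_1` and `total_analyses` are hypothetical and used only for illustration.

```python
# Table 1 rows: experiment -> (samples, labs, lab_reps, lab_total, sensors, sensor_reps, sensor_total)
TABLE_1 = {
    1: (8, 2, 4, 64, 3, 2, 48),
    2: (6, 4, 3, 72, 3, 5, 90),
    3: (10, 3, 3, 90, 2, 2, 40),
}

def total_analyses(samples, units, replicates):
    """Each sample is measured `replicates` times by each lab or sensor."""
    return samples * units * replicates

# Compare the computed totals with the totals reported in the table
consistent = {}
for exp, (n, labs, lab_reps, lab_total, sensors, sen_reps, sen_total) in TABLE_1.items():
    consistent[exp] = (
        total_analyses(n, labs, lab_reps) == lab_total
        and total_analyses(n, sensors, sen_reps) == sen_total
    )
```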

Table 2. Variability in constituent content across all samples and replicates in an individual experiment for moisture content (% wet basis) and protein, starch, and NDF (% of DM) in whole-plant corn.

Experiment No.	Constituent	Mean ^[a]	Std Dev ^[b]	CV ^[c] (%)	Relative Range ^[d] (%)
1	Protein	7.3	0.50	6.9	75.8
	Starch	43.7	3.24	7.4	38.8
	NDF	34.0	2.60	7.7	48.1
2	Moisture	77.7	2.85	3.7	28.6
	Protein	7.4	0.34	4.6	53.6
	Starch	20.0	5.39	27.0	50.4
	NDF	47.3	2.79	5.9	56.1
3	Moisture	58.1	6.93	11.9	60.3
	Protein	7.8	0.55	7.0	82.3
	Starch	40.5	5.43	13.4	55.1
	NDF	32.3	3.03	9.4	50.1

[a] Overall mean across all samples and replicates. [b] Standard deviation of average of all samples per experiment—see Equation (2). [c] Coefficient of variation—see Equation (3). [d] Relative range of constituent compared to the global range—see Equation (4).

Table 3. Mean constituent values, reproducibility standard deviation (SD_R), and coefficients of variation (CV) for three experiments using three analytical approaches.

Experiment	Parameter	Moisture (% w.b.)			Protein (% of DM)			Starch (% of DM)			NDF (% of DM)
No.		A	B	C	A	B	C	A	B	C	A	B	C
Exp. 1	Mean ^[a]				7.3	6.9	6.6	43.7	43.3	43.1	34.0	35.3	36.1
	SD_R ^[b]				0.88	0.87	0.59	2.96	3.85	4.85	2.50	3.80	4.67
	CV ^[c]				12.0	12.7	9.0	6.8	8.9	11.3	7.4	10.8	12.9
Exp. 2	Mean ^[a]	77.7	76.7	75.8	7.4	7.5	7.6	20.0	23.6	26.8	47.3	45.0	43.4
	SD_R ^[b]	1.19	1.79	2.27	0.59	0.62	0.56	2.55	2.75	2.82	2.92	2.54	2.31
	CV ^[c]	1.5	2.3	3.0	8.0	8.2	7.3	12.8	11.7	10.5	6.2	5.7	5.3
Exp. 3	Mean ^[a]	58.1	57.4	56.1	7.8	7.5	6.9	40.5	41.7	44.5	32.3	33.0	34.7
	SD_R ^[b]	2.14	1.92	2.11	0.74	0.85	0.53	3.40	3.08	3.83	3.01	2.66	3.18
	CV ^[c]	3.7	3.3	3.8	9.4	11.3	7.7	8.4	7.4	8.6	9.3	8.1	9.2

[a] Constituent average across all samples and replicates for each of three approaches. [b] SD_R: reproducibility standard deviation; includes both within-laboratory and between-laboratory sources of variability (see Equation (15)). [c] CV: coefficient of variation; SD_R expressed as a percentage of the overall average. A = analytical laboratory (AL) data only; B = both AL and OS-NIRS data; C = OS-NIRS data only.
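As footnote [c] states, each CV in Table 3 is SD_R expressed as a percentage of the corresponding mean. The sketch below reproduces the Experiment 2 moisture CVs from the tabulated means and SD_R values; the function name `cv_percent` is hypothetical, and small rounding differences can occur for other cells because the table reports rounded inputs.

```python
def cv_percent(sd_r, mean):
    """Coefficient of variation: SD_R as a percentage of the mean."""
    return 100.0 * sd_r / mean

# Table 3, Experiment 2, moisture (% w.b.): approach -> (mean, SD_R)
exp2_moisture = {"A": (77.7, 1.19), "B": (76.7, 1.79), "C": (75.8, 2.27)}

cvs = {
    approach: round(cv_percent(sd_r, mean), 1)
    for approach, (mean, sd_r) in exp2_moisture.items()
}
```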

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
