A Novel k-Nearest Neighbors Approach for Forecasting Sub-Seasonal Precipitation at Weather Observing Stations

Stanteen, Sean Guidry; Su, Jianzhong; Flanagan, Paul; Zhang, Xunchang John

doi:10.3390/forecast7040076

Open AccessArticle

A Novel k-Nearest Neighbors Approach for Forecasting Sub-Seasonal Precipitation at Weather Observing Stations

¹

Mathematics Department, University of Texas at Arlington, Arlington, TX 76019, USA

²

United States Department of Agriculture—Agricultural Research Service, El Reno, OK 73036, USA

^*

Author to whom correspondence should be addressed.

Forecasting 2025, 7(4), 76; https://doi.org/10.3390/forecast7040076

Submission received: 30 September 2025 / Revised: 30 November 2025 / Accepted: 2 December 2025 / Published: 10 December 2025

(This article belongs to the Section Weather and Forecasting)

Download

Browse Figures

Review Reports Versions Notes

Highlights

What are the main findings?

The proposed GEM-kNN has a statistically significant improvement in performance compared to the tested controls.

What is the implications of the main findings?

There may be a way to forecast monthly precipitation to give farmers more information on the necessity of irrigation.
With improvement to what data weather stations collect daily, this method could become even more efficient.

Abstract

This study introduces a novel k-nearest neighbors (kNN) method of forecasting precipitation at weather-observing stations. The method identifies numerous monthly temporal patterns to produce precipitation forecasts for a specific month. Compared to climatological forecasts, which average the observed precipitation over the prior thirty years, and other existing contemporary iterations of kNN, the proposed novel kNN method produces more accurate forecasts on a consistent basis. Specifically, the novel kNN method produces improved root mean square errors (RMSE), mean relative errors, and Nash–Sutcliffe coefficients when compared to climatological and other kNN forecasts at five weather stations in Oklahoma. Rather than looking at the daily data for feature vectors, this novel kNN method takes so many days and evenly groups them, using the resulting average as one feature each. All methods tested were lacking in the ability to forecast wet extremes; however, the novel kNN method produced more frequent higher precipitation forecasts compared to climatology and the two other kNN methods tested.

Keywords:

k-nearest neighbors; precipitation; supervised machine learning; sub-seasonal to seasonal; forecasting

1. Introduction

Accurate seasonal forecasts would assist agricultural stakeholders in minimizing losses that might occur from planting crops that require more precipitation than will occur [1,2,3]. Reliable forecasts allow for more dynamic planning and have the potential to increase a field’s production capabilities greatly [4,5,6]. As it stands, the National Oceanic and Atmospheric Association (NOAA), specifically the Climate Prediction Center (CPC), provides seasonal forecasts for precipitation in the form of probabilistic distributions of 30-day and 90-day totals. However, these forecasts are not useful for agricultural stakeholders, due to the limited predictability in terms of accuracy and inadequate spatial scale they present [5,7,8]. As such, the exploration of different forecasting methods is warranted.

Sub-seasonal to seasonal (S2S) forecasting has been and remains a challenge to this day [9,10,11,12]. For an in-depth discussion on the S2S topic, [13], however a short description of the issue will be presented here. Long thought of as the “predictability desert” by forecasters and researchers alike, S2S forecasting is hindered by the short memory of initial atmospheric conditions and the minimum impact sea-surface temperatures have in S2S forecasts [13,14]. Thus, the S2S gap between two-week weather forecasts and long-term climate forecasts remains an important area of research. Two different approaches exist to predict the S2S time scale: numerical weather/climate prediction using dynamical models and statistical methods using historical data to predict weather/climate at various time scales [13]. Statistical methods cover a variety of problems, from simple persistence models to much more complex machine learning algorithms of pattern recognition and analogue pattern-matching methods. However, statistical methods have long had issues forecasting precipitation because its higher spatial and temporal variability, compared to temperature for example, is a hindrance to statistical techniques [15,16]. Thus, dynamical models have been a focus of the S2S problem in recent years, primarily due to the increase in predictive skill on the S2S time scale by atmospheric models in the last two decades [17]. Operational S2S forecasts using dynamical models are becoming more of a normal in recent years, for example the European Centre for Medium-Range Weather Forecasts (ECMWF) producing operational forecasts out to 42 days, and the CPC producing experimental 3-to-4-week forecasts using dynamical model output. However, the primary drawback of dynamical models is the complexity and computational expense of producing these forecasts [18]. While dynamical models have shown skill in predicting the S2S time scale, statistical models still have their place, given their relative simplicity compared to dynamical models. Thus, there is still a need for research and analysis on improving S2S forecasts from statistical models, especially within the realm of S2S precipitation-forecasting.

k-nearest neighbors (kNN) is a non-parametric method of pattern classification/regression, respectively denoting data with categorical labels (dry, wet, stormy, etc.) and numerical labels (0 mm, etc.). This algorithm utilizes a set of predictands referred to as labels, each of which have a set of numerical representations of certain properties called predictors or features, stored in a feature vector (we shall refer to these as labeled feature vectors or historical feature vectors). We have our operational (target) data whose feature vector is known, but its label is not (in our case because the target has not yet occured). The objective of kNN is to find which k labeled feature vectors are most similar to, or “nearest,” the target’s feature vector. Once those labeled feature vectors are found, their labels are used to predict the label of the target feature vector. For classification, whichever label is most represented by the k nearest neighbors is predicted to apply best to our target feature vector. For regression, an average of the k nearest neighbors’ predictands is taken and used as the average prediction, which is what was performed for this study.

The foundations of kNN were first laid in 1951 [19]. Initially, the method was conceived as the compliment to Fix and Hodges’s naive kernel estimate, which is discussed in both the original paper and the commentary released shortly thereafter [19,20]. It was not until the next year that the researchers would introduce the terminology which gives the method its modern name [21]. In summary, given two distributions (akin to “labels”) F and G with an equal number of p-dimensional samples (

p = M + 1

where M is the number of predictors), a p-dimensional sample with an unknown distribution (our target feature vector), and an odd positive integer k (to prevent a tie), the distances of all samples with known distributions from that of the one whose distribution is unknown are found. Whichever of the two distributions owns a majority of the nearest k samples, it is predicted that the target belongs to that distribution. In addition to this, they gave two important findings; the sample size has a negative correlation with the probability of error, while the number of features in a feature vector has—at least, at its simplest form possible—a positive correlation with the probability of error. In 1967, Cover and Hart proved the upper bound on the method’s probability of error was twice that for Bayes’s method when

k = 1

[22]. While using only the single nearest neighbor k = 1 may make the most intuitive sense, it runs the risk of allowing noise or outliers to have an undue effect, especially when several distributions are available, or the data in question are particularly volatile. In 1970, Hellman published his proposed solution and brought us one step closer to the method we know today; the

(k, k^{'})

nearest neighbor method [23]. Given two positive integers

k^{'}, k

such that

k^{'} < k

, the target is predicted to have a label if at least

k^{'}

of the k nearest neighbors share it. This allowed for sample sets to be composed of samples from more than just two distributions, while retaining the requirement that a significant number of the neighbors should be the same.

As valuable as all this information is, its usefulness is mitigated by one simple fact; an estimation of future precipitation is desired, rather than a classification that denotes a range of quantities. For this method to be of any use, it needs to be adaptable to not just a continuous space but a time series. Fortunately, in 1968, Cover was able to extend his upper error bound from classification to regression. He showed that the large sample risk of the nearest neighbor method was at least less than or equal to half the risk presented by probability distributions such as normal, uniform, and Gaussian [24].

As such, several attempts at producing an algorithm to predict precipitation with this method have been made by several individuals (e.g., [25,26,27,28,29]), by comparing feature vectors of daily precipitation and other variables to predict the precipitation of the next day, adding that prediction to the data set and using it to predict the day after, and so on until we have a prediction of the desired length. Attempts at reproducing this have found that while it can be technically accurate on a day-to-day basis, the actual quantity of forecasts can be unduly impacted by extreme bouts of precipitation in the past. Indeed, looking to prior attempts, while kNN’s ability to predict temperature is impressive, its predictions of precipitation, while promising, can leave something to be desired [26]. While they can certainly present impressive results, said results tend to be for only a single year, which leaves it open as to whether the method is of quality or simply a favorable year for the method. Given how volatile precipitation can be, improvements are needed for kNN to be useful for precipitation prediction.

This paper proposes a novel solution. Rather than looking at the daily data for feature vectors, for some target day t, this novel kNN method takes so many days and evenly groups them, using the resulting average as one feature each, which is referred to as the

(a, b)

pair. In this context, a is the number of groups, while b is the number of days in each group. Given this method of building a feature vector, the novel KNN results will be used to forecast the total precipitation that occurs 30 days after t. This length of time, one month, is used in place of typical daily or weekly forecasts to assist farmers in determining whether they can expect rain or drought over a long period. This is valuable information when it comes to deciding what to plant, as well as when to perform irrigation.

2. Methodology

2.1. Data Preparation

Before feature vectors can be built, there is the matter of preparing the data. Precipitation and temperature cannot be added together, so to overcome this the data must be made unitless. This is performed through standard normalization wherein the mean and standard deviation of each data set was determined by all days before the earliest date used for validation. Normalization was also performed separately for each station. In this study, three sets of daily data (the same ones used in experiments performed by contemporaries [26]) were used: precipitation (mm), minimum temperature (°C), and maximum temperature (°C), henceforth the ordered sets

P, T_{m i n}, T_{m a x}

, respectively. While other variables exist which may affect precipitation to some degree in the study region, the three variables used are the most likely to be recorded by any weather station regardless of location or resources, and so are also best to consider as a baseline for developing an operational forecast tool.

With the inputs understood, the outputs are decided, and furthermore, what would be considered “good.” Without these definitions firmly established, an algorithm can be neither created nor judged to any useful degree. To demonstrate, suppose for a target day t, a forecast five days long could be made, and have three potential analogues (in mm):

A : [0, 0, 0, 0, 50]

B : [0, 0, 0, 10, 15]

C : [25, 0, 0, 0, 0]

After waiting for five days, the actual precipitation that occurred each day was recorded, which turned out to be as follows:

O : [0, 0, 0, 0, 25]

Which five-day forecast could be considered the best?

Of these, A matches the first four days of O, overestimating the fifth by 25 mm. Meanwhile the remaining analogues, B and C, match only three of the five days. By instinct, then, one might conclude that A is the best forecast, and by extension, a good forecasting model would choose A from the other analogues.

However, A also forecasts twice the total precipitation that is observed, whereas the others forecast precisely 25 mm in total. While B projects the 25 mm occurring over two days, it is entirely possible that B’s precipitation began at 11:57 PM, and continued on through the night. Disregarding a forecast for being merely three minutes off from perfection is simply unreasonable. As for C, while the order is inaccurate, it does perfectly forecast four days of no precipitation, and one day of 25 mm. Is it right to disregard such information as well?

To resolve this dilemma, the period’s total precipitation is considered instead. Now, B and C forecast a total of 25 mm, with A forecasting 50 mm, and O observing 25 mm. By our new criterion, B and C are equally valid forecasts, and both superior to A.

2.2. Generalized Feature Vectors

Generalized feature vectors (GFV) are denoted generally by the ordered pair

(a, b)

, which in this context can be read as “a spans of b days,” where a span is simply a length of time. For example, a week is one span of seven days, so

(1, 7)

, three weeks is

(3, 7)

, one month is

(1, 30)

, thirty days is

(30, 1)

, etc. A GFV is then formed by averaging the normalized data of the b days for each of the a spans. The Features were averaged, rather than taking their maximum or variance, while building GFVs because the average is the most similar to the sum which is forecast for precipitation. The sum was not used for the features because taking the sum of temperatures is not a practical measure. An example of this process is given in Figure 1. As well, a more detailed breakdown of the method is available in the Supplementary Materials.

Once a GFV is decided, the target feature vector

\vec{t}

can be built. And while normally people record and look at daily time series for precipitation and other meteorological variables, with GFVs we are able to look at two-day, three-day, etc. series as well and determine which of these is best for the current situation. Figure 2 and Figure 3 is a good representation of this method in chart form.

2.3. Good Enough Method

As it stands, using GFVs does nothing to curtail the issues inherent to precipitation forecasting with kNN, but instead gives said issues notation. This is where the good enough method (GEM) comes in. Let M be a set of days from the same month as t, but whose average precipitation f days afterwards are known. For this example, this would be a set of days in April from years 2021 and earlier. For each of these days, run through several GFVs and determine which return acceptable results based on some standard. For this study, producing a hindcast whose RMSE was lower than that of the hindcast produced by climatology, was considered better, or “good”. This was performed as climatology was superior to both SotA and typical kNN, as will be shown in Section 4. Because of this, it was used as a baseline to decide the quality of the method.

Specifically, the decision is made by month for each station. First, a date from M is selected. For each of the potential GFVs, kNN is performed with a

k = ⌊\sqrt{N}⌋

[28], where N is the number of potential analogues, and the RMSE of each hindcast is noted. Then, the hindcast produced by climatology, and its RMSE, are also calculated. If the RMSE produced by a potential GFV is lower than that produced by climatology, it is awarded one point. This is then repeated for each target day in the list M. Whichever potential GFV has the most total points will be selected to produce the current period’s forecast for the month. In the case of tied scores, the GFV with both the highest score and the lowest average RMSE across all days in M will be selected. A visualization of this method is demonstrated in Figure 2.

True to its name, GEM focuses not on finding the GFVs for each target day that produce the lowest error in their forecasts. Instead, it tries to find an acceptable (good enough) GFV that can work for that specific month a majority of the time.

Figure 3. Chart of a typical kNN progression.

3. Testing

This method was tested on five stations from Oklahoma, USA, each with at least 90 years of precipitation, minimum temperature, and maximum temperature data where missing values were filled using data from adjacent stations. The annual average of each station, as well as the database period, is given in Table 1, along with a map of the stations given in Figure 4. Lahoma, Weatherford, and Chandler are all located in central Oklahoma, with southerly flow driving increases in humidity and related precipitation during the warm season and producing less harsh winter temperatures. Hooker, the northwest-most station located in the Oklahoma panhandle, is an arid region, where precipitation comes in bursts due to isolated thunderstorm activity and infrequent convective systems. Idabel is meanwhile the southeast-most station and likewise the most humid and whose larger precipitation totals come primarily from organized convective systems and synoptic wave activity. For target dates, the 9th, 12th, 15th, 18th, and 21st of each month of the five most recent years made available for this study (2004–2008 for all but Lahoma, which is 2002–2006) were used to ensure diverse results for validation. This gave 25 test target dates per month per station, 300 per station, or 1500 in total. The k nearest neighbors were identified for each target date to forecast the average precipitation over the next 30 days.

As mentioned in the introduction, a large volume of tests were completed to provide robust quantification of the skill of this method compared to climatology and other kNN methods. Before this, however, a GFV must be selected. For each month, those same 5 days (9th, 12th, 15th, 18th, and 21st) were used in the 45 years prior to the testing years for validation, resulting in 225 validation target dates per month used to select a GFV via GEM using 1 to 180 spans each 1 to 20 days long, such that the product of the two is between 30 days (one month) and 365 days (one year). The reader is once again directed towards the Supplementary Materials for a more notational explanation. All possible combinations were exhaustively calculated to find which GFV was most likely to perform well for that particular month. In order to maintain computational efficiency, the calculations were parallelized on the length of the spans (

b = 1, 2, \dots, 20

) so days would only need to be averaged together once, as in Figure 1. Furthermore, to avoid overfitting, bootstrapping of the 45 years prior was used as appropriate for each of the validation years.

For comparison, four additional methods of forecasting were implemented: a typical kNN method [26] as demonstrated graphically in Figure 3, the state of the art method [26] (SotA), a support vector machine (SVM) method [30], and the climatological method. Three of these methods (typical, SotA, and GEM) are similar systems using the same basic concepts of kNN; however the typical and SotA kNN systems specifically use daily data by fixing their GFV to (365, 1) rather than search through a large set of potential GFVs to identify a good fit as is performed in GEM. Furthermore, the typical method is different from SotA in terms of what it forecasts. SotA forecasts the total precipitation 30 days after t all at once. Meanwhile, the typical is recursive, only forecasting the precipitation of day

t + 1

(one day at a time), before adding the predicted forecast to the end of the observed data and repeating the process until f days are forecast. Once this is calculated, alongside the forecasts for

t + 1

of all other variables, said forecasts are normalized and the target feature vector is updated to include them. Then, all historical feature vectors are shifted to include the new “future” day and its data, and these new feature vectors are used to forecast day

t + 2

. This process is repeated up to day

t + 30

, after which the precipitation forecasted for each day is added together to attain the total forecast. As for climatology, it is simply the average precipitation that occurred from the target date to f days later averaged over the 30 years before the target date. This method is used as a baseline to evaluate the usefulness and skills of each forecast. Finally, to compare all of the above with a more widely-used methodology, a support vector machine regression (SVM) methodology was implemented. The reader is directed to [31] for more information on the SVM’s algorithm. Here, however, we shall mention that in order to build the feature vectors necessary to train the SVM, the same method used for SotA was used. A Gaussian kernel function was used for training and hyperparameter optimization, and data was standardized in the same manner described for all the kNN experiments. The optimization routine used was the Sequential Minimization Optimization as selected by default, as well as all other defaults used by MATLAB’s (R2023b) fitrsvm function which was used for building this method of the study.

The root mean squared error (RMSE) is used to assess the quality of forecasts. Let e be the set of forecasts (either those found with GEM, SotA, typical, or climatology) and o the corresponding set of observed data.

R M S E = \sqrt{\sum_{i = 1}^{m} \frac{{(e_{i} - o_{i})}^{2}}{m}}

(1)

where m is the number of target days tested.

While useful, the RMSE can be oversensitive to outliers and extreme errors. As such, the absolute mean relative error (MRE) is also used, where

M R E = \frac{1}{m} \sum_{i = 1}^{m} \frac{| e_{i} - o_{i} |}{o_{i}} .

(2)

A comparison to the overall observed precipitation average was implemented, with said average being

\bar{o} = \frac{1}{m} \sum_{i = 1}^{m} o_{i}

. Then, the Nash-Sutcliffe coefficient (NS), given by

N S = 1 - \frac{\sum_{i = 1}^{m} {(e_{i} - o_{i})}^{2}}{\sum_{i = 1}^{m} {(\bar{o} - o_{i})}^{2}},

(3)

is a comparison with the observed average, with a range

(- \infty, 1]

. If

N S

is negative, the forecasting method tested performed poorly, the observed average being generally superior. If

0 \leq N S < 1

, the method performed at least as well as the observed average, with greater

N S

correlating to greater performance. An

N S = 1

means the method forecasted all target dates perfectly, meaning it had an RMSE of 0.

Significance testing was performed via the Mann-Whitney U-test [32] to ensure that the differences between GEM and SotA results were significant. All tests were performed for each individual station with a significance level of

α

= 0.05, meaning that if the p-value given by the U-test was below 0.05, GEM’s distribution is assumed to be significantly different to that of SotA. Seeing as how being both superior and distinct to SotA was the priority, U tests were only performed on GEM and SotA together.

And finally, reliability graphs [33] were composed to look at how consistently the predictions performed. To create these graphs, all forecast data, and their corresponding observed values, are binned into discrete 10 mm wide bins. Then, the average values of both the binned forecasts and their corresponding observations are taken. An example of this process is given in the form of Table 2. The final reliability graph depicts the average forecast versus average observed values for those forecasts to show how “reliable” the forecast system is in reproducing specific binned values of data. Observed averages below the forecast average represent an overestimation for that specific forecast bin and when the observed average is higher than the forecast average that means the forecast system is underestimating the observed values for that range of forecast values.

4. Results and Discussion

The RMSEs, MREs, and NSs of each station are given in Figure 5. Idabel had the highest RMSE across all five stations, whereas Hooker had the lowest. In all five cases GEM produced a lower RMSE than those produced by climatology and the other kNN methods. Figure 6 and Figure 7 contain scatter plots of forecasts produced by GEM, SotA, and climatology, all being compared to the average precipitation observed f days after the target day. Due to the similarities of SotA and the typical kNN results, plots of the latter were omitted from Figure 6 and Figure 7.

Applying the GEM method resulted in non-ideal but nonetheless higher quality forecasts. For the target dates tested, GEM produced a greater Nash-Sutcliffe coefficient than either the SotA, typical

k N N

, or the climatological forecasts. Save for Hooker and Idabel, all GEM NS are greater than 0.17. For typical kNN, Hooker and Idabel have negative NS, with Weatherford at 0.026 and Chandler the only one to exceed 0.15. SotA has somewhat better results, in that Lahoma’s NS is positive, and Chandler and Weatherford also saw minor improvements to their NS. However, Hooker and Idabel performed markedly worse when using SotA compared to the typical kNN. And for climatology, Hooker and Idabel are below 0.1, with the rest being below 0.2.

All stations achieved superior RMSE, MRE, and NS when using GEM compared to SotA, typical, or climatology. This indicates that GEM not only produced superior forecasts, but when it did, these forecasts had a non-trivial improvement over climatology.

It should be noted that there is a positive correlation between average total annual precipitation and RMSE, in that for

i = 1, \dots, 5

, the station with the

i^{t h}

greatest total annual precipitation also had the

i^{t h}

greatest RMSE. This is the case regardless of the forecasting method used.

Looking at Figure 6, Figure 7 and Figure 8 gives important insights into these statistical measures. Climatology is extremely prone to underestimation, typically several centimeters short of the observed amounts on higher precipitation days. Using SotA does fix this to an extent, especially regarding the cold season, but it is only when GEM is implemented that more extreme forecasts are made. Unfortunately, this also brings about an opposite problem: overestimation. Both of these problems are to be expected, seeing as the forecasts are built using weighted averages. As such, while all methods have instances of this, GEM’s willingness to make forecasts that exceed the upper bounds of the others’ predictions can result in forecasts that go further past lower observations as they do approach higher ones. And, as can be seen in Figure 8, the SVM is even less likely to make extreme forecasts, with the forecasts it makes clustered into several more noticeable clumps. With the exception of Lahoma, forecasts using this method also seem to lock up in a straight line similar to what occurs in climatology. This falls in line with the performance seen in [34], where both kNN and SVM performed poorly.

One can also look to the aforementioned figures for the differences in performance at different times of the year, with red circles indicating the warm season of March through September, and blue being the cold, October through February. During the warm season, GEM is likely to make lower forecasts, regardless of observed precipitation. Interestingly, though, Idabel is just as likely to over-forecast during the warm season, and in Chandler it rarely under-forecasts at all. For both SotA and climatology, forecasts skew into a mix of over and under-forecasting for both the warm and cold seasons, though much like with GEM, their seasonal forecasts at all stations sans Idabel can be clearly distinguished by the range of values each method was willing to put out.

Figure 9, meanwhile, is a demonstration of how GEM (the highest performing of the three kNN implementations) performed in monthly forecasts of the testing data. The purpose of this is to give a visual aid for not only how GEM is meeting individual forecasts, but how well it is able to match the trends of the observed data. For GEM, all stations had a correlation coefficient above 0.25, with the highest being Chandler at 0.512, and the lowest being Idabel at 0.263. Alphabetically, GEM has an RMSE of 54.465, 41.937, 75.829, 41.32, and 56.953 mm for these graphs. With climatology, the resulting correlation coefficients were universally lower, with its higest being Weatherford at 0.428 and its lowest being Idabel at 0.101. Alphabetically, climatology has an RMSE of 59.015, 43.280, 78.937, 44.552, and 58.634 mm for these graphs. Alphabetically, SVM has an RMSE of 60.412, 44.995, 80.040, 46.488, and 61.358 mm for these graphs. By its nature as the 30-year average, the climatological forecast takes on a seasonal cycle, making any similarities to the observed trend coincidental. GEM, however, has a much more interesting relationship. Although it rarely predicts the heavier precipitation totals accurately, GEM’s forecasts often will ”peak” in comparison to the forecasts made a month prior and after, suggesting a level of sensitivity to the larger precipitation values. Examples of this trend can be seen in most larger precipitation totals, with exceptions for Lahoma, and the earliest peak that occurs in Weatherford. It should be noted that the GEM forecast shown in the above results is a weighted average from the k nearest neighbors identified by the GEM system, thus this represents not a single instance of a precipitation prediction but a smoothed average. The spread of the nearest neighbor forecast values extends beyond the observed values in most cases (outside of extreme values).

Of the 5 stations, Hooker, Lahoma, and Weatherford’s GEM forecasts were found to be significantly different from those of SotA, with p-values of 0.029, 0.019, and 0.002, respectively. Chandler and Idabel, meanwhile, had p-values of 0.800 and 0.144, respectively, which means they failed to provide statistically significant improvements. Given that, as shown in Figure 4, these two stations experience the most extreme precipitation, the GEM method may have a ceiling regarding its usefulness in making quality forecasts for higher. It may also be worth noting that Chandler and Idabel are the two east-most of the five stations, potentially suggesting a decrease in GEM’s feasibility as stations move closer to the ocean. In either case, further study is necessary before conclusions should be drawn. When compared with climatology, the differences are more apparent both in Figures and in these statistics, with Chandler, Hooker, Lahoma, and Weatherford having p-values of less than 0.001 while only Idabel had an extreme p-value of 0.947.

Figure 10 shows the reliability graphs for each station. For lower precipitation totals, the averaged forecasts and observed precipitations are relatively close. The primary exception to this is Idabel at 30–50 mm. Once the expected forecasts pass a certain threshold, however, they are found to be generally lacking. In the cases of Chandler and Weatherford, they tend to underestimate, while Hooker and Lahoma overestimate. This points to a systemic bias in the methodologies provided, specifically a preference for lower forecasts. Once again, the exception seems to be Idabel; however this should not necessarily be taken as evidence of quality. Looking back to Figure 7, while many of the extreme forecasts are accurate, moderate forecasts have several examples of over and under-forecasting to give the results seen in Figure 10. The best way to counter this shortcoming is to lower the k value needed to create quality forecasts. Since the forecast itself is an average of prior precipitation amounts, by definition, the few extreme precipitation events considered will be diluted by typical. Another option is to perhaps give more weight to extreme precipitation when calculating the total forecast; however, this runs the risk of creating an opposite problem where typical occurrences of precipitation become difficult to forecast.

Table 3. The correlation coefficients of each of the methods tested to the observed forecasts.

Stations	$r_{GEM}$	$r_{Clim}$	$r_{Typical}$	$r_{SotA}$	$r_{SVM}$
Chandler	0.512	0.419	0.224	0.517	0.218
Hooker	0.382	0.276	0.296	0.292	0.130
Idabel	0.263	0.101	−0.157	0.105	0.082
Lahoma	0.487	0.385	0.379	0.451	0.241
Weatherford	0.482	0.428	−0.303	0.511	0.360

For SotA, some similarities in shape to GEM can be observed in Chandler and Hooker; however, it is otherwise quite different. In the cases of Chandler, Idabel, and Lahoma, SotA was usually unable to produce forecasts greater than those of GEM. While it is able to do so in Weatherford, it is only able to make over-forecasts, whereas in Hooker, where both GEM and SotA tend to over-forecast, SotA’s are slightly more extreme.

Perhaps unsurprisingly, climatology has the fewest extremes of these three forecasting methods. It will have the lowest range of forecasted precipitation, particularly noticeable in Chandler, Lahoma, and Weatherford, and lacking extreme deviation from the 1:1 line, or at least rarely to the extreme of its kNN counterparts. The main exceptions to this are in Hooker where its lowest forecasts correlate to observed precipitation nearly seven times greater, as well as Weatherford where forecasts around 140 mm correlate to observed precipitation nearly double that on average.

Take note of the performance during the warm and cold seasons for most stations. In Figure 6 and Figure 7, while there can be some overlap, forecasts from each season are clearly segregated. This is not the case for the more humid region of Idabel, where the warm and cold seasons’ forecasts are mixed. Most likely, the change in performance is a consequence of the greater range of possible precipitation over the span of 30 days. Since humid regions such as Idabel have such variance in their precipitation, averages of past precipitation tend to not allow for forecasts close to 0 mm, unlike stations in arid climates. In short, the differences in precipitation variances between wet and dry months and between wet and cold months might have played some role in affecting the forecast performance between Idabel and Hooker stations.

5. Conclusions

In this paper, a novel method of performing kNN was introduced and tested for five different stations across Oklahoma, being compared to the climatological average as well as the typical and state of the art methodologies for kNN. This comparison was performed by noting the RMSE, MRE, and NS of all methodologies’ forecasts of the total precipitation 30 days after several target dates. Additionally, statistical significance of the results were measured, and reliability graph were made to determine whether the novel GEM method was not only superior, but distinct to those others.

The GEM method was demonstrated as superior to all other methods in this study. While far from perfect, GEM universally had a lower RMSE and MRE than all other methods, and was the only method to maintain a positive NS across all five stations. The conclusions drawn by GEM were found to be statistically significantly different from those of SotA, which implies that with refinement the program could continue to improve on those results. However, the improvement, while distinct as told by the statistical significance tested, is minor, suggesting that the skill of kNN as a whole may be limited in this context.

A greater quantity of historical data, as well as parameters with a correlation to precipitation such as el nino/la nina patterns and moisture flux, would greatly improve the quality of the method, which will be investigated in the future. Further, studies of data outside of Oklahoma would be useful in expanding our understanding of the limitations of this method. As it stands, this research only examines the method’s performance for regions in Oklahoma. While the climates of Oklahoma can in and of themselves be volatile, testing in different regions is necessary to assess this method’s effectiveness beyond Oklahoma. A look into gridded rainfall data may also be necessary, as the station-centric methodology employed here can be blind to other forms of data useful to forecasting precipitation. Additionally, efforts will be made to create a range in which forecasts are made to try and include the observed forecast in as small a range as feasible, using as few years of historical data as possible to try and discern the minimum necessary data to create accurate predictions.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/forecast7040076/s1.

Author Contributions

Conceptualization, J.S. and X.J.Z.; methodology, S.G.S. and J.S.; software, S.G.S.; validation, S.G.S., J.S., X.J.Z., and P.F.; formal analysis, J.S. and X.J.Z.; resources, X.J.Z.; data curation, X.J.Z.; writing—original draft preparation, S.G.S. and P.F.; writing—review and editing, J.S., X.J.Z., and P.F.; supervision, J.S. and X.J.Z.; project administration, J.S. and X.J.Z.; funding acquisition, J.S. and X.J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the United States Department of Agriculture (USDA) and the University of Texas at Arlington (UTA). All data used was provided by the USDA under Non-Assistive Cooperative Agreement #58-3070-0-006. The authors were partially supported by USDA NIFA grant award #2022-67037-36259, “Developing an Alliance for Training and Apprenticeship in Climate-Smart Agriculture (DATA-Ag)” and # 2018-38422-28564 “ALFA-IoT: ALliance For Smart Agriculture in the Internet of Things Era”. USDA is an equal opportunity provider and employer. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture.

Data Availability Statement

Data is available at https://github.com/SeanGuSt/GEM-kNN-Precipitation-Forecasting (accessed on 25 October 2025).

Conflicts of Interest

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

S2S	Sub-seasonal to seasonal
kNN	k-nearest neighbor
GEM	Good enough method
SotA	State of the art
RMSE	Root mean square error
MRE	Mean relative error
NS	Nash-Sutcliffe coefficient
SVM	Support Vector Machine

References

Carberry, P.; Hammer, G.; Meinke, H.; Bange, M. The Potential Value of Seasonal Climate Forecasting in Managing Cropping Systems. In Applications of Seasonal Climate Forecasting in Agricultural and Natural Ecosystems; Springer Nature: Berlin/Heidelberg, Germany, 2000; pp. 167–181. [Google Scholar] [CrossRef]
Jones, J.; Hansen, J.; Royce, F.; Messina, C. Potential benefits of climate forecasting to agriculture. Agric. Ecosyst. Environ. 2000, 82, 169–184. [Google Scholar] [CrossRef]
Meinke, H.; Stone, R. Seasonal and Inter-Annual Climate Forecasting: The New Tool for Increasing Preparedness to Climate Variability and Change in Agricultural Planning and Operations; Springer Nature: Berlin/Heidelberg, Germany, 2005; Volume 70, pp. 221–253. [Google Scholar] [CrossRef]
Bruno Soares, M.; Dessai, S. Barriers and enablers to the use of seasonal climate forecasts amongst organisations in Europe. Clim. Chang. 2016, 137, 89–103. [Google Scholar] [CrossRef]
Klemm, T.; McPherson, R. Assessing Decision Timing and Seasonal Climate Forecast Needs of Winter Wheat Producers in the South-Central United States. J. Appl. Meteorol. Climatol. 2018, 57, 2129–2140. [Google Scholar] [CrossRef]
Nicholls, J. Economic and social benefits of climatological information and services: A review of existing assessments. In World Climate Applications and Services Programme; Procedia Environmental Sciences; World Meteorological Organization: Geneva, Switzerland, 1996. [Google Scholar]
Garbrecht, J.; Schneider, J. Climate forecast and prediction product dissemination for agriculture in the United States. Aust. J. Agric. Res. 2007, 58, 966–974. [Google Scholar] [CrossRef][Green Version]
Schneider, J.; Wiener, J. Progress toward filling the weather and climate forecast need of agricultural and natural resource management. J. Soil Water Conserv. 2009, 64, 100A–106A. [Google Scholar] [CrossRef]
Merryfield, W.; Baehr, J.; Batté, L.; Becker, E.; Butler, A.; Coelho, C.; Danabasoglu, G.; Dirmeyer, P.; Doblas-Reyes, F.; Domeisen, D.; et al. Current and Emerging Developments in Subseasonal to Decadal Prediction. Bull. Am. Meteorol. Soc. 2020, 101, E869–E896. [Google Scholar] [CrossRef]
Zhang, L.; Kim, T.; Yang, T.; Hong, Y.; Zhu, Q. Evaluation of Subseasonal-to-Seasonal (S2S) precipitation forecast from the North American Multi-Model ensemble phase II (NMME-2) over the contiguous U.S. J. Hydrol. 2021, 603, 127058. [Google Scholar] [CrossRef]
White, C.; Domeisen, D.; Acharya, N.; Adefisan, E.; Aura, S.; Balogun, A.; Bertram, D.; Bluhm, S.; Brayshaw, D.; Browell, J.; et al. Advances in the Application and Utility of Subseasonal-to-Seasonal Predictions. Bull. Am. Meteorol. Soc. 2021, 103, E1448–E1472. [Google Scholar] [CrossRef]
Hwang, J.; Orenstein, P.; Cohen, J.; Pfeiffer, K.; Mackey, L. Improving Subseasonal Forecasting in the Western U.S. with Machine Learning. arXiv 2019, arXiv:1809.07394. [Google Scholar] [CrossRef]
Vitart, F.; Robertson, A.W. Chapter 1—Introduction: Why Sub-seasonal to Seasonal Prediction (S2S)? In Sub-Seasonal to Seasonal Prediction; Robertson, A.W., Vitart, F., Eds.; Elsevier: Amsterdam, The Netherlands, 2019; pp. 3–15. [Google Scholar] [CrossRef]
Weyn, J.A.; Durran, D.R.; Caruana, R.; Cresswell-Clay, N. Sub-Seasonal Forecasting With a Large Ensemble of Deep-Learning Weather Prediction Models. J. Adv. Model. Earth Syst. 2021, 13, e2021MS002502. [Google Scholar] [CrossRef]
Applequist, S.E.A. Comparison of Methodologies for Probabilistic Quantitative Precipitation Forecasting. Weather Forecast. 2002, 17, 783–799. [Google Scholar] [CrossRef]
Li, G.; Chang, W.; Yang, H. A Novel Combined Prediction Model for Monthly Mean Precipitation With Error Correction Strategy. IEEE Access 2020, 8, 141432–141445. [Google Scholar] [CrossRef]
Vitart, F. Evolution of ECMWF sub-seasonal forecast skill scores. Q. J. R. Meteorol. Soc. 2014, 140, 1889–1899. [Google Scholar] [CrossRef]
Ding, T.; Ke, Z. A Comparison of Statistical Approaches for Seasonal Precipitation Prediction in Pakistan. Weather Forecast. 2013, 28, 1116–1132. [Google Scholar] [CrossRef]
Fix, E.; Hodges, J.L. Discriminatory Analysis. Nonparametric Discrimination: Consistency Properties. Int. Stat. Rev./Rev. Int. De Stat. 1989, 57, 238–247. [Google Scholar] [CrossRef]
Silverman, B.; Jones, M.C.E. Fix and J.L. Hodges (1951): An Important Contribution to Nonparametric Discriminant Analysis and Density Estimation: Commentary on Fix and Hodges (1951). Int. Stat. Rev. 1989, 57, 233. [Google Scholar] [CrossRef]
Fix, E.; Hodges, J.L. Discriminatory Analysis-Nonparametric Discrimination: Small Sample Performance; Technical Report; California Univ Berkeley: Berkeley, CA, USA, 1952. [Google Scholar]
Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27. [Google Scholar] [CrossRef]
Hellman, M.E. The Nearest Neighbor Classification Rule with a Reject Option. IEEE Trans. Syst. Sci. Cybern. 1970, 6, 179–185. [Google Scholar] [CrossRef]
Cover, T. Estimation by the nearest neighbor rule. IEEE Trans. Inf. Theory 1968, 14, 50–55. [Google Scholar] [CrossRef]
Bannayan, M.; Hoogenboom, G. Predicting realizations of daily weather data for climate forecasts using the non-parametric nearest-neighbour re-sampling technique. Int. J. Climatol. 2007, 28, 1357–1368. [Google Scholar] [CrossRef]
Bannayan, M.; Hoogenboom, G. Weather analogue: A tool for real-time prediction of daily weather data realizations based on a modified k-nearest neighbor approach. Environ. Model. Softw. 2008, 23, 703–713. [Google Scholar] [CrossRef]
Zhang, X. Calibration, refinement, and application of the WEPP model for simulating climatic impact on wheat production. Trans. Am. Soc. Agric. Eng. 2004, 47, 1075–1085. [Google Scholar] [CrossRef]
Yates, D.; Gangopadhyay, S.; Rajagopalan, B.; Strzepek, K. A technique for generating regional climate scenarios using a nearest-neighbor algorithm. Water Resour. Res. 2003, 39, 1199. [Google Scholar] [CrossRef]
Huang, M.; Lin, R.; Huang, S.; Xing, T. A novel approach for precipitation forecast via improved K-nearest neighbor algorithm. Adv. Eng. Inform. 2017, 33, 89–95. [Google Scholar] [CrossRef]
Müller, K.R.; Smola, A.; Rätsch, G.; Schölkopf, B.; Kohlmorgen, J.; Vapnik, V. Using Support Vector Machines for Time Series Prediction; MIT Press: Cambridge, MA, USA, 1999; pp. 243–253. [Google Scholar] [CrossRef]
Fan, R.E.; Chen, P.H.; Lin, C.J. Working Set Selection Using Second Order Information for Training Support Vector Machines. J. Mach. Learn. Res. 2005, 6, 1889–1918. [Google Scholar]
Mann, H.B.; Whitney, D.R. On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other. Ann. Math. Stat. 1947, 18, 50–60. [Google Scholar] [CrossRef]
Roberts, A. Calibration Curve: What You Need to Know. 2023. Available online: https://arize.com/blog-course/what-is-calibration-reliability-curve/ (accessed on 1 December 2025).
Zhang, X.; Shen, M.; Chen, J.; Homan, J.; Busteed, P. Evaluation of Statistical Downscaling Methods for Simulating Daily Precipitation Distribution, Frequency, and Temporal Sequence. Trans. ASABE 2021, 64, 771–784. [Google Scholar] [CrossRef]

Figure 1. Example of GFV (3, 2).

Figure 2. Chart of Good Enough Method (GEM) progression.

Figure 4. Physical location of Oklahoma weather stations, along with total annual precipitation (mm) under each station. The color fill in this plot was computed from the National Centers for Environmental Information’s (NCEI) 1991 to 2020 monthly precipitation climatology. Each tick in latitude and longitude is equivalent to 1 degree.

Figure 5. RMSE, MRE, and NS data presented as a bar graph.

Figure 6. Scatter plots of mean monthly precipitation forecasts using the GEM, SotA, and climatology in Chandler and Hooker. Red circles are forecasts for March through September (the warm season of the year), and blue circles are for all other months (the cold season). The orange line is simply a 1:1 line to assist in judging the quality of forecasts, and does not point to correlation or lack thereof.

Figure 7. Scatter plots of mean monthly precipitation forecasts using GEM, SotA, and climatology in Idabel, Lahoma, and Weatherford. Red circles are forecasts for March through September (the warm season of the year), and blue circles are for all other months (the cold season). The orange line is simply a 1:1 line to assist in judging the quality of forecasts, and does not point to correlation or lack thereof.

Figure 8. Scatter plots of mean monthly precipitation forecasts using SVM. Red circles are forecasts for March through September (the warm season of the year), and blue circles are for all other months (the cold season). The orange line is simply a 1:1 line to assist in judging the quality of forecasts, and does not point to correlation or lack thereof.

Figure 9. A comparison of forecasts developed by climatology and GEM to the observed monthly precipitation from 2004 to 2008 at Chandler, Hooker, Idabel, Lahoma, and Weatherford. Also given are the correlation coefficients for GEM and climatology to the observed precipitation with

r = \frac{\sum (e_{i} - \bar{e}) (o_{i} - \bar{o})}{\sqrt{\sum {(e_{i} - \bar{e})}^{2} \sum {(o_{i} - \bar{o})}^{2}}}

where

\bar{e}

and

\bar{o}

are the respective means of the forecast being correlated and the observer precipitation.

Figure 9. A comparison of forecasts developed by climatology and GEM to the observed monthly precipitation from 2004 to 2008 at Chandler, Hooker, Idabel, Lahoma, and Weatherford. Also given are the correlation coefficients for GEM and climatology to the observed precipitation with

r = \frac{\sum (e_{i} - \bar{e}) (o_{i} - \bar{o})}{\sqrt{\sum {(e_{i} - \bar{e})}^{2} \sum {(o_{i} - \bar{o})}^{2}}}

where

\bar{e}

and

\bar{o}

are the respective means of the forecast being correlated and the observer precipitation.

Figure 10. Reliability graphs for each of the five stations, showing the average of expected precipitation and how it compares to the average of the observed precipation those particular days were trying to forecast. See Table 3 for an example.

Table 1. Physical location, historical annual average of minimum temperature (Min. Temp.), maximum temperature (Max. Temp.), total annual precipitation, and the period of data available for each study site.

Station	Latitude	Longitude	Precipitation	Min. Temp.	Max. Temp.	Database
Station	(N)	(W)	(mm)	(°C)	(°C)	Period
Chandler	35″42′	−96″53′	886.5	9.49	22.78	1902–2009
Hooker	36″51′	−101″12′	473.8	5.32	22.02	1907–2009
Idabel	33″53′	−94″49′	1175.1	10.93	24.96	1907–2009
Lahoma	36″5′	−96″55′	790.9	9.14	22.15	1909–2007
Weatherford	35″31′	−98″42′	754.4	8.85	22.7	1905–2009

Table 2. Example of how each point of the reliability graphs is calculated. The forecasts that belong to each bin are averaged together and compared to the average of their corresponding observed data. All values in this table are given in mm.

Bins	Forecast Data	Observed Data	Forecast Average	Observed Average
20–30	21, 24, 29, 22, 26	32, 10, 25, 29, 40	24.4	27.2
30–40	32, 35, 37, 38	51, 39, 28, 45	35.5	40.75
40–50	42, 49, 45	32, 44, 53	45.3	43

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Stanteen, S.G.; Su, J.; Flanagan, P.; Zhang, X.J. A Novel k-Nearest Neighbors Approach for Forecasting Sub-Seasonal Precipitation at Weather Observing Stations. Forecasting 2025, 7, 76. https://doi.org/10.3390/forecast7040076

AMA Style

Stanteen SG, Su J, Flanagan P, Zhang XJ. A Novel k-Nearest Neighbors Approach for Forecasting Sub-Seasonal Precipitation at Weather Observing Stations. Forecasting. 2025; 7(4):76. https://doi.org/10.3390/forecast7040076

Chicago/Turabian Style

Stanteen, Sean Guidry, Jianzhong Su, Paul Flanagan, and Xunchang John Zhang. 2025. "A Novel k-Nearest Neighbors Approach for Forecasting Sub-Seasonal Precipitation at Weather Observing Stations" Forecasting 7, no. 4: 76. https://doi.org/10.3390/forecast7040076

APA Style

Stanteen, S. G., Su, J., Flanagan, P., & Zhang, X. J. (2025). A Novel k-Nearest Neighbors Approach for Forecasting Sub-Seasonal Precipitation at Weather Observing Stations. Forecasting, 7(4), 76. https://doi.org/10.3390/forecast7040076

Article Menu

A Novel k-Nearest Neighbors Approach for Forecasting Sub-Seasonal Precipitation at Weather Observing Stations

Highlights

Abstract

1. Introduction

2. Methodology

2.1. Data Preparation

2.2. Generalized Feature Vectors

2.3. Good Enough Method

3. Testing

4. Results and Discussion

5. Conclusions

Supplementary Materials

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

Abbreviations

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI