Article

Credit Risk Index as a Support Tool for the Financial Inclusion of Smallholder Coffee Producers

by María-Cristina Ordoñez *, Ivan Dario López, Juan Fernando Casanova Olaya and Javier Mauricio Fernández
Ecotecma SAS, Popayán 190001, Colombia
* Author to whom correspondence should be addressed.
J. Risk Financial Manag. 2026, 19(1), 73; https://doi.org/10.3390/jrfm19010073
Submission received: 27 November 2025 / Revised: 29 December 2025 / Accepted: 5 January 2026 / Published: 16 January 2026
(This article belongs to the Special Issue Lending, Credit Risk and Financial Management)

Abstract

This study aimed to develop a credit risk index to classify coffee producers according to socioeconomic, agronomic, and financial performance variables, with the purpose of strengthening financial inclusion. We combined qualitative and quantitative methods to understand credit risk factors among smallholder coffee producers. The study followed a descriptive-analytical approach structured in consecutive methodological phases. The systematic review, conducted following the Kitchenham protocol, identified theoretical factors associated with credit risk, while fieldwork with 300 producers provided the socioeconomic and productive contexts of coffee-growing households. Producer income, cost of living, and farm management expenses were modeled using regression, statistical, and machine learning methods. Subsequently, these variables were integrated to construct a financial risk index, which was normalized using expert scoring. The index was validated using data from 100 additional producers, for whom annual repayment capacity and maximum loan amounts were estimated according to their risk level. The results indicated that incorporating municipal-level economic variables, such as estimated average prices, income, and expenses, enhanced predictive accuracy and improved the rational allocation of loan amounts. The study concludes that credit risk analysis based on variables related to human, productive, and economic capital constitutes an effective strategy for improving access to finance in rural areas.

1. Introduction

Financial inclusion is widely recognized as a key driver of poverty reduction and sustainable economic growth, particularly in rural and agricultural communities (Demirgüç-Kunt et al., 2020). Recent studies indicate that advances in digital inclusive finance can strengthen the creditworthiness of small agricultural units by expanding information sources and reducing transaction costs in rural contexts (Zhang & Li, 2025). Nevertheless, its implementation continues to face significant challenges that prevent small agricultural producers from accessing formal credit markets. Persistent barriers include distrust in banking institutions and the heightened risk associated with financing poorly diversified activities that are exposed to both production-related and market uncertainties (Villarreal, 2017).
These constraints become more pronounced when examining the specific conditions of credit access, given the mismatch between available financial services and the actual needs of small producers (Beck et al., 2011; Ghosh, 2013). Recent evidence for Colombia confirms that these barriers persist, particularly among small rural producers, who face restrictions related to limited credit information, high financial costs, and insufficient adaptation of credit products to their production cycles (Estrada et al., 2025).
These challenges are especially evident in the coffee sector, which is characterized by small producers cultivating less than five hectares of coffee. Long production cycles, price volatility, and high climate sensitivity increase their financial vulnerability and hinder access to formal credit (ICO, 2021). These factors are compounded by structural constraints such as the limited availability of tailored financial products and the low ownership of assets suitable as collateral (Villarreal, 2017), as well as by the role of financial literacy and climate risk management capacity in shaping producers’ credit behavior.
Against this backdrop, credit risk indices have emerged as key analytical tools to mitigate information asymmetries between lenders and borrowers in rural contexts, thereby contributing to a more efficient allocation of credit without excluding producers with limited or no formal credit history (Hurley & Adebayo, 2017). Recent literature emphasizes that the construction of such indices should go beyond traditional financial variables by incorporating socio-productive dimensions—such as agricultural experience and prior relationships with financial institutions—which allow for a more accurate characterization of risk profiles in the agricultural sector (Roy & Shaw, 2021; Björkegren & Grissen, 2020). The inclusion of external factors, including market conditions, production costs, and price fluctuations, is also essential. In particular, explicitly accounting for price volatility and climatic variables has been shown to significantly improve the estimation of default risk in agricultural activities (Castro & García, 2014; Liu et al., 2025). Along these lines, hybrid approaches that combine machine learning models with explainable criteria and expert judgment have demonstrated substantial improvements in the assessment of farmers’ credit risk, while maintaining the level of interpretability required for financial decision-making (Y. Li & Zhang, 2025).
Despite these advances, traditional agricultural credit risk models face three critical limitations: (1) they rely heavily on formal credit histories and collateral requirements, inherently excluding small farmers; (2) they treat coffee producers as a homogeneous group rather than recognizing context-specific risk factors; and (3) they fail to integrate machine learning–based predictions with expert judgment to achieve a holistic risk assessment. This study addresses these limitations through the development of a composite index that incorporates non-traditional socioeconomic and agronomic variables, validated using empirical data from 100 coffee producers in Colombia.
In response to these limitations, the development of a financial risk index grounded in producers’ human, productive, and economic capital enables a more precise and personalized evaluation of credit applicants (Moreno-Menéndez et al., 2025). Complementarily, aligning repayment schedules with coffee production cycles constitutes an effective mechanism for matching income flows with financial obligations, significantly reducing default rates and increasing farmers’ willingness to demand formal credit (Dorfleitner et al., 2017).
On this conceptual basis, the central hypothesis of this study is that a financial risk index model developed using productive and socioeconomic data—at both the general context level and the specific production unit level—together with external variables affecting the crop, can enhance the understanding of the credit profile of this segment and serve as a foundation for more inclusive credit decisions.
Consistent with this hypothesis, the main objective of this study is to propose and validate a financial risk index that characterizes small coffee producers in terms of their probability of default and credit vulnerability. The model design draws on contributions from the literature on rural financial inclusion and recent approaches that employ non-traditional data for agricultural risk analysis.
In contrast to previous studies, this research develops a comprehensive and multidimensional index for small coffee producers (less than five hectares). Its novelty lies in four main aspects: (1) the simultaneous integration of human capital (education, experience, household size), productive capital (agricultural practices, certifications, technical assistance), and economic capital (income diversification, cost structures) within a single validated framework; (2) the use of machine learning–based production forecasts to reduce information asymmetry; (3) the incorporation of expert-based scoring; and (4) the adaptation of loan amounts and repayment schedules to the seasonal cycle of coffee production. In addition, the model includes contextual variables at the municipal level (prices, regional costs, and climatic patterns), which enhance predictive accuracy.
In conclusion, the findings suggest that a financial risk index tailored to the context of small-scale coffee producers contributes to strengthening their financial inclusion. The model opens new avenues for more equitable and evidence-based rural financing, provided that variable selection, community validation, and continuous monitoring are carefully implemented to avoid biases or unintended exclusions.

2. Materials and Methods

2.1. Study Design and Methodological Approach

This study was conducted with the objective of designing a credit risk index that would allow the classification of coffee producers based on socioeconomic, agronomic, and productive performance variables. A mixed-methods approach was employed because the process integrated qualitative and quantitative analyses to achieve a comprehensive understanding of the factors influencing credit risk among small producers. The methodology was structured in five consecutive and interrelated phases, as shown in Figure 1.

2.1.1. Rationale for Mixed-Methods Design

A mixed-methods approach was selected because the research required both exploratory (qualitative) and confirmatory (quantitative) components that inform different stages of the analysis.
Qualitative component (systematic review following the Kitchenham protocol and semi-structured interviews): This component identifies context-specific variables relevant to the credit risk of small-scale coffee producers. These variables cannot be fully identified a priori from the theoretical literature alone, as they reflect the lived experiences, challenges, and adaptation strategies of small-scale farmers in Colombia.
Quantitative component (regression modeling, machine learning algorithms, expert scoring, and CHAID classification tree validation): This component enables statistical testing of identified relationships, prediction of outcomes, and validation of the index through empirical data from independent producer samples.

2.1.2. Systematic Review

A systematic review following the approach proposed by Kitchenham (2007) was conducted. The process includes the following steps, as shown in Figure A1:
Planning of the literature review: A protocol was defined, including objectives, research questions, keywords, search equations, and sources, inclusion and exclusion criteria, and data extraction templates.
Identification of studies: Documents were searched using the search equations and sources defined during the planning stage.
Selection of primary studies: Titles and abstracts were reviewed according to the established inclusion and exclusion criteria.
Quality assessment of studies: The selected studies were evaluated using quality criteria, and only those that met the minimum required score proceeded to the data extraction phase.
Data extraction: The required data were extracted using the templates established during the planning phase.
These steps supported the analysis of relationships among variables, the prediction of outcomes, and the validation of the index using empirical data from independent producer samples.

2.1.3. Collection of Qualitative Information and Socioeconomic Characterization

Semi-structured interviews and direct field observations were conducted with coffee producers in the selected region to understand their production, economic, social, and credit access conditions. This information was used to support the contextual identification of variables relevant to the credit risk index, as shown in Figure A1.
Sociodemographic variables were evaluated, including age, educational level, years of experience in productive activities, and household size. Economic and financial variables included income from non-agricultural activities, previous access to loans, land tenure (owned or leased), possession of agricultural insurance, and coffee marketing channels. Production variables included farm size, planting density, access to technical assistance, membership in coffee associations, possession of certifications, and the availability of soil analysis data.

2.1.4. Modeling of Composite Variables

The modeling of income, cost of living, and per-hectare management costs (Table 1) allows the integration of financial information associated with coffee production into the credit risk index. These components are described in detail below.
Income
Income was calculated by multiplying the number of loads of dry parchment coffee by the price paid per load (Equation (1)).
Income = Loads of dry parchment coffee (a) × Price per coffee load (b)
(a) The number of loads of dry parchment coffee was calculated from cherry coffee production, which was estimated using the coffee production estimation model developed from the analysis of the datasets following the CRISP-DM methodology (IBM, n.d.), as shown in Figure 2.
A brief description of the main phases of the selected methodology is provided below:
  • Business understanding: Potential meteorological variables and regression techniques for estimating coffee production were identified. In addition, relevant sources of meteorological and coffee production data in Colombia were identified to support the construction of the dataset used to train and evaluate the prediction models.
  • Data understanding: The data obtained from the identified sources were collected, described, explored, and evaluated for quality.
  • Data preparation: The data collected in the previous phase were selected and cleaned. Meteorological data were then integrated with coffee production data, followed by formatting procedures, as the information originated from heterogeneous sources and was structured in different formats.
  • Modeling: The models for estimating coffee production based on meteorological variables were developed and trained using the dataset created in the previous phase. Necessary adjustments and calibrations were made until optimal model performance was achieved.
  • Evaluation: This phase was conducted in parallel with the modeling stage. Once the trained model results were obtained, their performance was evaluated using different metrics to select the model with the best performance.
  • Deployment: The dataset was stored in a database along with the trained estimation model. An Application Programming Interface (API) was developed to manage and automate the extraction of new meteorological data and to generate new production estimates using the trained model.
The following tasks were conducted within the data understanding and data preparation phases to construct the final dataset:
  • Data collection: Coffee production data and meteorological data were collected from 112 different locations in Colombia, selected due to their importance in national coffee production. The production data covered the period from 2007 to 2023, while the meteorological data covered 2006 to 2023.
  • Data aggregation: The collected data were aggregated to a 0.5° × 0.5° spatial scale (latitude and longitude) to homogenize the information and facilitate analysis.
  • Descriptive analysis: A descriptive analysis was conducted to identify key characteristics of the datasets, including the number of records and variables, as well as the temporal and spatial distribution of the data.
  • Analytical tools: Various statistical and data visualization tools were used to interpret the collected information. These tools facilitated the identification of patterns and trends in both production and meteorological data.
  • Data validation: Validation procedures were performed to ensure the accuracy and consistency of the collected and aggregated data.
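The aggregation step can be illustrated with a minimal sketch that assigns each record to a 0.5° × 0.5° cell and averages the values in that cell. Averaging is an assumption here, since the text does not state which aggregation statistic was used. The function names and station records are hypothetical.

```python
import math
from collections import defaultdict

CELL = 0.5  # grid cell size in degrees, as stated in the text


def grid_cell(lat, lon, cell=CELL):
    """Return the south-west corner of the grid cell containing (lat, lon)."""
    return (math.floor(lat / cell) * cell, math.floor(lon / cell) * cell)


def aggregate_mean(records, cell=CELL):
    """Average a value over all records that fall in the same grid cell.

    Assumption: the mean is used as the aggregation statistic.
    """
    buckets = defaultdict(list)
    for lat, lon, value in records:
        buckets[grid_cell(lat, lon, cell)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


# Hypothetical records: (latitude, longitude, monthly rainfall in mm)
stations = [
    (2.44, -76.61, 180.0),
    (2.30, -76.70, 220.0),
    (1.85, -76.05, 150.0),
]
grid = aggregate_mean(stations)
```

The first two hypothetical stations fall in the same cell and are averaged, while the third falls in its own cell.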
(b) The price of the coffee load was calculated using reports from the National Federation of Coffee Growers (Equation (2)).
$$\text{Price of the coffee load} = (\text{New York futures market price} + \text{Colombian premium} - \text{Coffee contribution} - \text{Costs and expenses}) \times \text{Exchange rate} \times 205.25 \tag{2}$$
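As an illustration of Equations (1) and (2), the following sketch computes a price per load and the resulting income. It assumes that the coffee contribution and the costs and expenses are subtracted from the external price, and that the result is multiplied by the exchange rate and the constant 205.25. The units (USD per pound, COP per USD) and all input values are hypothetical.

```python
LOAD_FACTOR = 205.25  # constant that appears in Equation (2)


def price_per_load(ny_price, premium, contribution, costs, exchange_rate):
    """Price of a coffee load following the structure of Equation (2).

    Assumption: contribution and costs are deducted from the external
    price before conversion to local currency.
    """
    net_external = ny_price + premium - contribution - costs
    return net_external * exchange_rate * LOAD_FACTOR


def income(loads, price):
    """Equation (1): loads of dry parchment coffee times price per load."""
    return loads * price


# Hypothetical inputs, for illustration only
price = price_per_load(
    ny_price=2.00,       # assumed USD per pound
    premium=0.30,
    contribution=0.06,
    costs=0.24,
    exchange_rate=4000.0,  # assumed COP per USD
)
annual_income = income(loads=6, price=price)
```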
Expenses
A. Producer Living Costs
  • Definition of attributes and data extraction
A set of expenditure categories defined by DANE for Colombian households was used as a reference. These categories include food and non-alcoholic beverages, housing, water, electricity, gas, miscellaneous goods and services, transportation, clothing, health, information and communication, and other items (education, entertainment and culture, fuels).
Historical data were obtained from the National Household Budget Survey (ENPH). The average monthly current expenditure for populated centers and dispersed rural areas in 2016–2017 (DANE, n.d.) was used as the baseline for the reference period (Equation (3)).
$$\text{Cost}_t = \text{Cost}_{t-1} \times (1 + \text{Monthly variation}_t) \tag{3}$$
where the following applies:
  • $\text{Cost}_t$ is the estimated cost of the category in month t
  • $\text{Cost}_{t-1}$ is the cost in the previous month
  • $\text{Monthly variation}_t$ is the monthly CPI variation for that category
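The recursive update in Equation (3) can be sketched as a simple loop that carries a baseline cost forward month by month. The baseline amount and the CPI variations below are hypothetical, and variations are assumed to be expressed as proportions (0.01 = 1%).

```python
def roll_forward(base_cost, monthly_variations):
    """Apply Equation (3) month by month and return the full cost path.

    The first element is the baseline; each later element is the previous
    cost multiplied by (1 + monthly CPI variation).
    """
    path = [base_cost]
    for variation in monthly_variations:
        path.append(path[-1] * (1 + variation))
    return path


# Hypothetical baseline (COP) and monthly CPI variations for one category
food_path = roll_forward(500_000.0, [0.01, 0.02, -0.005])
```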
  • Projection of Cost of Living through linear regression
A linear regression model (Equation (4)) was estimated using the derived values to project the cost of each expenditure category for 2025 and 2026. Model accuracy was assessed by comparing the projected values with the most recent observed data and by examining the mean squared error (MSE).
$$\text{Cost}_t = \beta_0 + \beta_1 t + \epsilon \tag{4}$$
where the following applies:
  • $\text{Cost}_t$ is the projected cost in month t
  • $\beta_0$ and $\beta_1$ are the coefficients estimated using ordinary least squares
  • t represents time (in months since the start of the analysis)
  • $\epsilon$ is the model error term
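A minimal sketch of the trend model in Equation (4) follows, using the closed-form OLS estimates for a single predictor and the MSE used to assess accuracy. The cost series is hypothetical, and time is indexed from 0 at the first observation as an assumption.

```python
def fit_trend(costs):
    """Closed-form OLS estimates (beta0, beta1) for Cost_t = beta0 + beta1 * t.

    Assumption: t = 0 corresponds to the first observation in `costs`.
    """
    n = len(costs)
    t = list(range(n))
    t_mean = sum(t) / n
    c_mean = sum(costs) / n
    sxy = sum((ti - t_mean) * (ci - c_mean) for ti, ci in zip(t, costs))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    beta1 = sxy / sxx
    beta0 = c_mean - beta1 * t_mean
    return beta0, beta1


def project(beta0, beta1, month):
    """Projected cost for a given month index."""
    return beta0 + beta1 * month


def mse(costs, beta0, beta1):
    """Mean squared error of the fitted line on the observed series."""
    return sum((c - project(beta0, beta1, t)) ** 2
               for t, c in enumerate(costs)) / len(costs)


# Hypothetical monthly cost series (thousand COP)
series = [100.0, 102.0, 104.5, 106.0, 108.5]
b0, b1 = fit_trend(series)
fit_error = mse(series, b0, b1)
next_year = project(b0, b1, month=len(series) + 11)
```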
B. Coffee Management Costs per Hectare
  • Data collection and organization
A database was constructed using historical production cost data from farms of different sizes, supplemented by official annual inflation records published by DANE for the period 2018–2019. Farms were classified into three categories based on area: less than 5 ha, between 5 and 10 ha, and more than 10 ha.
  • Definition of variables
The dependent variable was the annual production cost (COP) at the farm level. The independent variables were:
  • Year (continuous time variable)
  • Annual inflation (economic variable, expressed as a proportion)
  • Farm area (grouped categorical variable, with the three categories defined above)
  • Statistical model construction
Multiple linear regression was applied for each farm category, using year and inflation as predictors (Equation (5)). Parameter estimation was conducted using ordinary least squares (OLS) in Microsoft Excel. Model fit was evaluated using the coefficient of determination (R2), the statistical significance of the estimated coefficients (p-values), and residual diagnostics.
$$\text{Management cost} = \beta_0 + \beta_1(\text{Year}) + \beta_2(\text{Inflation}) + \epsilon \tag{5}$$
where the following applies:
  • $\beta_0$ is the intercept
  • $\beta_1$ and $\beta_2$ are coefficients associated with the explanatory variables
  • $\epsilon$ is the error term
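The study estimated Equation (5) in Microsoft Excel. For readers who prefer a scripted version, the sketch below fits the same two-predictor model by solving the OLS normal equations. The data are hypothetical, and the year is expressed as an offset from the first year (an assumption made to keep the system well conditioned).

```python
def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """OLS for cost = b0 + b1 * year + b2 * inflation.

    rows: list of (year_offset, inflation) pairs; returns [b0, b1, b2].
    """
    x = [[1.0, yr, inf] for yr, inf in rows]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data for one farm-size category: (year offset, inflation)
observations = [(0, 0.03), (1, 0.04), (2, 0.02), (3, 0.05), (4, 0.09), (5, 0.06)]
# Hypothetical costs (million COP), generated from known coefficients
costs = [5.0 + 0.4 * yr + 20.0 * inf for yr, inf in observations]
b0, b1, b2 = fit_ols(observations, costs)
```

Because the hypothetical costs are generated without noise, the fit recovers the generating coefficients. With real data, the residual diagnostics and p-values described above would still be needed.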

2.1.5. Financial Risk Index

Once the relevant risk factors and their corresponding levels were defined, a composite financial risk index was constructed to represent the overall credit risk profile of small-scale coffee producers. The index integrates expert-based assessments of socioeconomic, productive, and environmental dimensions into a single standardized measure of credit risk. This composite indicator captures producer heterogeneity and supports empirical validation and classification.
Expert Panel Composition and Mathematical Aggregation
To incorporate qualitative risk dimensions not directly observable through quantitative data, this study employed an expert-based scoring approach combined with mathematical aggregation. Expert judgment is widely recognized as an appropriate method for agricultural risk assessment when historical credit data are limited or incomplete (Clemen & Winkler, 1999; Green et al., 2015).
Expert Panel Composition
A panel of five experts was selected based on standardized criteria: (i) a minimum of ten years of professional experience in agricultural credit, risk management, or coffee sector analysis; (ii) demonstrated expertise in rural finance, agronomy, or environmental risk; and (iii) familiarity with Colombian coffee production systems. The panel included specialists in agricultural economics, credit risk management, agronomy, environmental science, and agricultural engineering. Prior literature indicates that panels of five to seven experts provide an optimal balance between diversity of perspectives, bias reduction, and practical feasibility (Keeney et al., 2011).
Scoring Procedure
Each expert independently evaluated 28 predefined risk factor levels using a standardized ordinal scale ranging from 0 to 3 (0 = no risk, 3 = high risk), reflecting the perceived contribution of each factor to credit default probability. Experts received a detailed scoring protocol containing operational definitions, supporting literature, and illustrative examples to ensure consistency. Independent scoring without interaction among panel members was used to prevent groupthink and preserve the authenticity of individual judgments.
Aggregation Method and Reliability Assessment
Expert scores were aggregated using simple arithmetic averaging, assigning equal weights to all experts. This approach was selected due to the absence of theoretical justification for differential weighting, its transparency, and empirical evidence showing that simple averages perform comparably to more complex aggregation schemes in expert-based forecasting (Green et al., 2015; Genova et al., 2012).
To document the level of agreement among experts, inter-rater reliability was assessed using Cronbach’s alpha. This metric was calculated based on the matrix of expert scores across all factor levels, following standard procedures for evaluating internal consistency in expert judgment studies.
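The aggregation and reliability steps can be sketched as follows. The score matrix is hypothetical (the study used five experts and 28 factor levels). Treating experts as the "items" and factor levels as the "cases" in Cronbach's alpha is an assumption about how the matrix was arranged.

```python
from statistics import mean, variance


def aggregate(scores):
    """Equal-weight mean score per factor level.

    scores: rows are factor levels, columns are experts.
    """
    return [mean(row) for row in scores]


def cronbach_alpha(scores):
    """Cronbach's alpha with experts as items and factor levels as cases."""
    k = len(scores[0])                      # number of experts
    expert_columns = list(zip(*scores))     # one tuple of scores per expert
    item_var = sum(variance(col) for col in expert_columns)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical scores on the 0-3 scale: 4 factor levels x 5 experts
scores = [
    [0, 0, 1, 0, 0],
    [1, 1, 1, 2, 1],
    [2, 2, 3, 2, 2],
    [3, 3, 3, 3, 2],
]
level_scores = aggregate(scores)
alpha = cronbach_alpha(scores)
```

Values of alpha close to 1 indicate that the experts rank the factor levels consistently. Perfect agreement yields exactly 1.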
Empirical Validation of the Financial Risk Index
The empirical validation of the financial risk index was conducted using a classification tree approach to examine whether the expert-derived risk categories were consistent with observable producer characteristics. An independent sample of 100 Colombian coffee producers was used to avoid endogeneity between index construction and validation.
Validation was performed using the Chi-squared Automatic Interaction Detection (CHAID) algorithm, which is suitable for categorical dependent variables and mixed-type predictors. The dependent variable was the credit risk category derived from the financial risk index (no risk, low risk, and medium risk). Predictor variables included technical, social, and economic characteristics of producers, such as coffee-growing experience, access to technical assistance, household size, farm size, income diversification, and commercialization channels.
The corrected chi-square statistic was employed as the splitting criterion, with statistical significance defined at p ≤ 0.05. The analysis was implemented in SPSS version 25. The CHAID method was selected due to its interpretability, ability to capture non-linear relationships, and compatibility with ordinal risk categories, making it appropriate for validating the structure and applicability of the proposed financial risk index.
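To make the splitting criterion concrete, the sketch below computes a Pearson chi-square test for one candidate predictor against the three risk categories. It shows only the core test. It does not reproduce the category merging, the Bonferroni adjustment, or the corrected statistic used by the SPSS CHAID procedure. The counts are hypothetical, and the p-value shortcut is valid only for two degrees of freedom (a 2 × 3 table).

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df2(stat):
    """Upper-tail p-value for a chi-square with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2)


# Hypothetical counts. Rows: technical assistance (yes, no).
# Columns: risk category (no risk, low risk, medium risk).
table = [
    [30, 15, 5],
    [10, 20, 20],
]
stat = chi_square(table)
p = p_value_df2(stat)
is_candidate_split = p <= 0.05  # significance threshold stated in the text
```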

2.1.6. Estimation of Annual Payment Capacity

Payment capacity was estimated as the difference between monthly income and monthly expenses, multiplied by 12, consistent with the annual coffee production cycle. This approach accounts for the fact that coffee growers generate liquidity primarily during harvest periods, which makes regular monthly payments impractical (FNC, 2020; BID, 2018) (Equation (6)).
Annual Payment Capacity = (Income − Expenses) × 12
Adjustment by Risk Score
Payment capacity was adjusted according to the financial risk level: low, medium, medium-high, and high (Table 2). Each level was assigned a maximum percentage of payment capacity that could be allocated to loan repayment, following a prudential credit framework (World Bank, 2020b).
Calculation of Maximum Loan Amount
The present value of the available payment amount was estimated using a monthly interest rate of 1.22% (approximately 15.65% EAR), consistent with Colombian agricultural credit market rates (Banco Agrario, 2023) (Equation (7)).
$$P = \frac{\text{Available amount}}{(1 + i)^{n}} \tag{7}$$
where the following applies:
  • P = maximum loan amount
  • i = monthly interest rate
  • n = loan term in months
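The chain from payment capacity to maximum loan amount can be sketched as below. The monthly rate is the one stated in the text. The risk-level caps are placeholders, because the actual percentages are defined in Table 2 and are not reproduced here. The income and expense figures are hypothetical.

```python
MONTHLY_RATE = 0.0122  # monthly interest rate stated in the text

# Placeholder caps (share of payment capacity usable for repayment).
# Assumption: the real percentages come from Table 2 of the study.
RISK_CAPS = {"low": 0.40, "medium": 0.30, "medium-high": 0.20, "high": 0.10}


def annual_payment_capacity(monthly_income, monthly_expenses):
    """Equation (6): (income - expenses) x 12."""
    return (monthly_income - monthly_expenses) * 12


def available_amount(capacity, risk_level, caps=RISK_CAPS):
    """Portion of payment capacity that may be committed to repayment."""
    return capacity * caps[risk_level]


def max_loan(available, term_months, rate=MONTHLY_RATE):
    """Equation (7): present value of the available amount."""
    return available / (1 + rate) ** term_months


def effective_annual_rate(rate=MONTHLY_RATE):
    """Annual rate implied by compounding the monthly rate 12 times."""
    return (1 + rate) ** 12 - 1


# Hypothetical producer (COP per month)
capacity = annual_payment_capacity(1_500_000, 1_100_000)
available = available_amount(capacity, "medium")
loan = max_loan(available, term_months=12)
ear = effective_annual_rate()
```

Compounding the 1.22% monthly rate over 12 months gives an effective annual rate of roughly 15.7%, in line with the approximate figure quoted above.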
To identify the determinants of the available credit amount for each coffee producer (available amount, COP), a multiple linear regression model was estimated using SPSS v.25. The dependent variable was the available loan amount. Independent variables included socioeconomic factors (age, education, household size, additional income), productive factors (coffee experience, farm size, coffee type, commercialization strategy, production costs per hectare, expected productivity), and management and service access variables (association membership, land tenure, previous credit, crop insurance, technical assistance, soil analysis, infrastructure).
The model was estimated using ordinary least squares (OLS). Assumptions of normality, homoscedasticity, and absence of multicollinearity were verified, with multicollinearity assessed using the variance inflation factor (VIF) and the tolerance statistic. Model fit was assessed using analysis of variance (ANOVA), and predictor significance was evaluated using Student’s t-tests at a 95% confidence level (p ≤ 0.05).

2.2. Sampling Frame, Geographic Scope, and Representativeness

A. Fieldwork Sample (n = 300)
The fieldwork sample was drawn from the Department of Cauca, one of Colombia’s main coffee-growing regions. Cauca accounts for approximately 10% of national coffee production and includes about 93,000 registered smallholder coffee producers (holdings < 5 ha), according to FNC (2023a).
  • Sampling frame and method
The sampling frame consisted of all smallholder producers registered in the FNC Register of Coffee Growers in Cauca. We applied stratified random sampling with two stratification variables: (i) municipality and (ii) farm-size category (<1 ha, 1–2 ha, 2–5 ha). Within each municipality-by-size stratum, producers were selected at random from the FNC Register.
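The stratified draw can be sketched as follows. The register entries and municipality labels are hypothetical. Drawing an equal number of producers per stratum is an assumption, since the allocation rule is not stated. The handling of the boundary values (1 ha and 2 ha) is also an assumption.

```python
import random
from collections import defaultdict


def size_class(hectares):
    """Farm-size category; boundary handling is an assumption."""
    if hectares < 1:
        return "<1 ha"
    if hectares < 2:
        return "1-2 ha"
    return "2-5 ha"


def stratified_sample(register, per_stratum, seed=0):
    """Draw up to `per_stratum` producers at random from each
    municipality-by-size stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for producer in register:
        key = (producer["municipality"], producer["size_class"])
        strata[key].append(producer)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical register: (municipality label, farm size in ha)
plots = [("M1", 0.6), ("M1", 0.8), ("M1", 1.5),
         ("M2", 0.4), ("M2", 3.0), ("M2", 4.2), ("M2", 2.5)]
register = [
    {"id": i, "municipality": m, "size_class": size_class(ha)}
    for i, (m, ha) in enumerate(plots)
]
sample = stratified_sample(register, per_stratum=2, seed=42)
```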
Sample characteristics: The fieldwork sample (n = 300) had a mean farm size of 1.5 ha (SD = 1.2), a mean annual coffee income of approximately 9 million COP (SD = 4 million), and a mean yield of 1020 kg/ha. Mean producer age was 54 years (SD = 8), with most producers having completed primary or secondary education (mean educational attainment: 5.3 years).
Data saturation: Semi-structured interviews were designed to systematically identify credit risk factors and producer profiles. Data saturation, the point at which additional interviews generated minimal new information, was reached at approximately n = 250. Interviews 1–80 continuously produced new categories; interviews 81–250 mainly refined existing categories; and interviews 251–300 contributed less than 5% novel information, serving primarily to confirm the stability of the identified patterns.
B. Validation Sample (n = 100 producers)
The validation sample comprised 100 additional smallholder coffee producers from two other coffee-growing departments: Huila and Nariño. These departments were selected to test the transferability of the index across geographically and institutionally distinct, but agroecologically comparable, contexts.
Sampling Frame and Method: The validation sample was extracted from the FNC Register using identical stratification (municipality and farm-size category). Producers were purposively selected from different municipalities and producer organizations than those in the fieldwork phase to ensure organizational independence and authentic external testing.
Sample Characteristics and Transferability Assessment: The two departments selected for validation represent diverse agroecological zones within Colombia’s main coffee region. Together, these departments account for 25–40% of national coffee production, ensuring that validation spans Colombia’s main coffee-growing regions outside Cauca.
C. Assessment of Representativeness
The fieldwork sample’s mean farm size (1.5 ha vs. ~1.6–1.8 ha in FNC statistics for Cauca), median annual coffee income (8–10 million COP vs. 8–12 million COP), and mean yield (1020 kg/ha vs. 900–1100 kg/ha) show close correspondence to Cauca smallholder population parameters. This indicates that the sample is reasonably representative of Cauca smallholder coffee producers.
However, the index construction sample is geographically restricted to Cauca (≈10% of national coffee production), and the external validation covers only Huila and Nariño in addition to Cauca (together ≈25–30% of national production). The index is therefore most directly applicable to smallholder coffee producers in this Cauca–Huila–Nariño corridor; application to other Colombian coffee-growing regions would require additional empirical validation.

3. Results

3.1. Selection of Parameters for the Credit Risk Index for Coffee Producers

Based on a systematic literature search using keywords related to financial inclusion and credit risk, 183 articles were initially identified. After applying the quality and relevance criteria defined in the Kitchenham methodology, the number was refined to 24 studies that met the required standards for analysis (as shown in Figure 3).
The findings showed that credit risk indicators aimed at promoting financial inclusion should not rely exclusively on formal credit history, as this criterion restricts the participation of small producers who lack prior financial records. Instead, the reviewed studies emphasized the importance of incorporating variables that reflect productive experience, economic stability, and the applicant’s management capacity. These factors allow for a more comprehensive assessment of the producer’s responsibility and commitment when applying for credit.
The semi-structured interviews also yielded relevant insights. Specifically, there is a significant relationship between access to credit and variables such as age, years of experience, technical assistance, and membership in an association. Farmers with greater experience and stronger connections are more likely to access credit, likely due to increased trust from financial institutions and better resource management. Income diversification and access to financial services and products also positively influence the likelihood of accessing credit.
In line with these results, the variables and indicators used for the construction of the credit risk index were classified into social, productive, and economic dimensions, as detailed below (Table 3):

3.2. Modeling of Socioeconomic Indicators of Smallholder Producers

3.2.1. Income

Coffee Production Estimation
We developed a machine learning model to estimate coffee production based on meteorological variables and historical production data. This model serves as a proxy for producer yield expectations and is subsequently integrated into the credit risk assessment model.
Model Development and Selection
We compared three regression techniques: Multiple Linear Regression (MLR), Support Vector Regression (SVR), and Random Forest Regression (RFR).
Models were evaluated using: R2 (coefficient of determination), MAE (mean absolute error), RMSE (root mean square error), and MAPE (mean absolute percentage error).
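For reference, the four evaluation metrics can be computed as in the sketch below, using their standard definitions. The observed and predicted values are hypothetical and are not the study's data. MAPE is returned as a proportion and requires non-zero observed values.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, MAE, RMSE, and MAPE (as a proportion) for paired values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "r2": 1 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "mape": sum(abs(e / o) for e, o in zip(errors, observed)) / n,
    }


# Hypothetical values, e.g. ln(tons) of production
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
metrics = regression_metrics(observed, predicted)
```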
Detailed model comparison results, including all 132 test configurations, are presented in Appendix B, Table A5.
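The four evaluation metrics listed above can be computed as in the following minimal sketch. The function name and the sample values are hypothetical and serve only to illustrate the definitions; they are not the authors' code or data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R2, MAE, RMSE, and MAPE for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # MAPE is undefined when an observed value is zero.
        "mape": sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }

# Hypothetical log-production values, for illustration only.
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 6.5, 8.0]
metrics = regression_metrics(observed, predicted)
```

MAPE is returned as a fraction here, so a value of 0.061 corresponds to 6.1%.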
Optimal Model Results
The Random Forest Regression model achieved the best performance when using the natural logarithm of production as the target variable (rather than yield in kg/ha):
  • R2: 0.9128
  • MAE: 0.3775 ln (tons)
  • RMSE: 0.5437 ln (tons)
  • MAPE: 0.061 (6.1%)
  • Correlation Coefficient: 0.9563
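Because the target is the natural logarithm of production, the MAE and RMSE above are expressed in log units. As a reading aid (an interpretation, not a result reported by the authors), an absolute error of ε on the log scale corresponds to a multiplicative error of e^ε on the original scale:

```latex
\[
\left|\ln \hat{y} - \ln y\right| = \varepsilon
\;\Longleftrightarrow\;
\frac{\hat{y}}{y} \in \left\{ e^{\varepsilon},\, e^{-\varepsilon} \right\},
\qquad
e^{0.3775} \approx 1.46, \quad e^{0.5437} \approx 1.72 .
\]
```

Under this reading, a typical prediction differs from observed production by a factor of roughly 1.46. If the MAPE was computed on the transformed target, the 6.1% figure describes relative error in the log values rather than in tons.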
This model was trained on historical meteorological data (2006–2023) and municipal-level coffee production records (2007–2023) aggregated to a 0.5° × 0.5° spatial resolution across 577 Colombian municipalities.
See Figure 4 for predicted vs. observed values across all regions.
Why RFR Outperformed Other Models
RFR’s superior performance is attributable to its ability to capture non-linear relationships between meteorological variables and coffee production, as well as to handle the complex interactions among temperature, precipitation, and production dynamics specific to different agroecological zones.
Dataset Aggregation
Two datasets were tested:
  • Aggregated dataset: meteorological variables spatially aggregated to 0.5° × 0.5° (faster computation, better performance)
  • Interpolated dataset: variables interpolated to municipal coordinates (higher resolution, slower computation)
The aggregated dataset produced superior metrics and was selected for the final model.
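One plausible way to aggregate point observations to a 0.5° × 0.5° grid is sketched below. The text does not describe the aggregation procedure, so the cell indexing, the use of a simple mean, and the sample readings are all assumptions made for illustration.

```python
import math
from collections import defaultdict

CELL_SIZE_DEG = 0.5  # grid resolution reported in the text

def grid_cell(lat, lon, size=CELL_SIZE_DEG):
    """Return the (row, col) index of the grid cell containing a point."""
    return (math.floor(lat / size), math.floor(lon / size))

def aggregate_to_grid(records, size=CELL_SIZE_DEG):
    """Average a variable over all points falling in the same grid cell.

    records: iterable of (lat, lon, value) tuples.
    Returns a dict mapping cell index -> mean value.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, value in records:
        cell = grid_cell(lat, lon, size)
        sums[cell] += value
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

# Hypothetical readings (lat, lon, monthly rainfall in mm).
readings = [
    (2.44, -76.61, 180.0),
    (2.46, -76.58, 200.0),
    (1.85, -76.05, 150.0),
]
cell_means = aggregate_to_grid(readings)
```

The first two readings fall in the same cell and are averaged, while the third occupies its own cell. Coarser cells pool more observations, which may help explain why the aggregated dataset was both faster and more stable.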
Integration with Subsequent Econometric Analyses
The trained RFR model generates yield predictions (tons ha−1) for each producer based on their location and historical meteorological patterns. These predictions serve as an independent variable in the multiple linear regression model (Section 2.1.6) that determines maximum loan amounts. By incorporating machine learning-based production expectations rather than relying solely on farmers’ self-reported yields, we reduce information asymmetry and improve the accuracy of loan sizing. The statistical significance of yield predictions in the credit amount regression (Table A4) confirms that this integration enhances model performance.

3.2.2. Expenditure

Cost of Living
Base prices for rural household expenditures for the 2016–2017 period were obtained from the National Household Budget Survey conducted by DANE (n.d.).
Monthly Consumer Price Index (CPI) variation from January 2019 to January 2025 was extracted from DANE’s (n.d.) Consumer Price Index records (Table 4).
Over this period, the data show an average monthly variation of 0.67 percent, with a general upward trend between 2019 and 2023, followed by mild stabilization in subsequent years. The highest monthly value occurred in January 2023 (3.98 percent), while the lowest was observed in April 2020 (−0.89 percent), reflecting periods of economic volatility, including post-pandemic fluctuations.
Using the projection model, the evolution of the monthly cost of living from 2019 to 2025 was estimated. The results show a clear upward trend throughout the analyzed period, indicating a sustained increase in the average expenditure of rural households.
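The projection model itself is defined in the methods section. As a minimal sketch, and assuming it compounds a base expenditure by each month's CPI variation, the calculation could look like this. The function name and input values are hypothetical.

```python
def project_cost_of_living(base_cost, monthly_variation_pct):
    """Compound a base monthly cost by successive monthly CPI variations.

    monthly_variation_pct: list of month-over-month CPI changes in percent.
    Returns the projected cost after each month.
    """
    projected = []
    cost = base_cost
    for variation in monthly_variation_pct:
        cost *= 1.0 + variation / 100.0
        projected.append(cost)
    return projected

# Hypothetical inputs: a base cost in COP and three monthly variations.
path = project_cost_of_living(1_000_000.0, [1.0, 0.5, -0.2])
```

Because the variations compound, a negative month lowers the level without undoing the accumulated increases of earlier months.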
The largest annual variation occurred in 2022, the year in which the monthly cost of living exceeded one million Colombian pesos for the first time (1,221,454 COP in December), driven by rising prices of food, transportation, and basic services. In contrast, the lowest variation was observed in 2019, with monthly values close to 950,000 COP and relatively stable inflation.
On average, the cost of living increased approximately 7.1 percent annually during the period, rising from 934,420 COP in January 2019 to 1,382,111 COP in December 2025.
At the monthly level, increases were more pronounced during the first quarter of each year, particularly between January and March, coinciding with price adjustments and the beginning of the economic cycle.
Based on this analysis, a linear regression model was developed using monthly cost-of-living data over time. The model achieved a coefficient of determination of 0.95 (Table 5).
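A linear trend of this kind can be fitted with ordinary least squares against a time index, as in the sketch below. The series shown is hypothetical and is not the data behind Table 5.

```python
def fit_linear_trend(values):
    """Ordinary least squares fit of values against a 0-based time index.

    Returns (intercept, slope, r_squared).
    """
    n = len(values)
    times = range(n)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    s_tv = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    s_tt = sum((t - mean_t) ** 2 for t in times)
    slope = s_tv / s_tt
    intercept = mean_v - slope * mean_t
    ss_res = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    ss_tot = sum((v - mean_v) ** 2 for v in values)
    return intercept, slope, 1.0 - ss_res / ss_tot

# Hypothetical monthly cost-of-living series in COP.
series = [934_000.0, 941_000.0, 946_000.0, 955_000.0, 960_000.0]
intercept, slope, r_squared = fit_linear_trend(series)
```

The slope is the estimated monthly increase in COP, and a coefficient of determination close to 1 indicates that a straight line describes the series well.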
Production Management Costs per Hectare
Multiple linear regression estimated production costs by farm size. Both year and inflation significantly affected production costs, although their effects varied across farm categories.
In general, the models show high explanatory power (R2 between 0.86 and 0.91), confirming that the two macroeconomic variables considered, year and inflation, explain a significant share of the variation in production costs (Table 6). Table 7 presents projected production costs for 2025 under an inflation scenario of 5.2 percent, differentiated by farm size.

3.2.3. Credit Risk Level

Based on the scoring assigned by the expert panel for each factor and level, a credit risk matrix consisting of 16 factors was constructed. The average values obtained for each level, based on expert judgment (Table 8), were used to construct a standardized risk score that classifies coffee producers into four risk categories.
The final scale, presented in Table 9, ranges from no risk to high risk, with credit scores spanning from 0 to 40.99. Each category is associated with a qualitative interpretation that guides the evaluation of the producer’s credit risk profile.
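The mapping from factor scores to a risk category can be sketched as follows. The cut-offs are hypothetical placeholders, since the actual thresholds are those of Table 9. Two further assumptions are made: the total score is the sum of the 16 factor scores, and higher scores indicate higher risk.

```python
from bisect import bisect_right

# Hypothetical upper bounds for the first three categories; the actual
# cut-offs are those of Table 9 and are not reproduced here.
CUTOFFS = [10.0, 20.0, 30.0]
LABELS = ["no risk", "low risk", "medium risk", "high risk"]
MAX_SCORE = 40.99  # upper end of the scale reported in the text

def risk_score(factor_scores):
    """Sum the expert-derived scores of the 16 factors for one producer."""
    if len(factor_scores) != 16:
        raise ValueError("expected one score per factor (16 in total)")
    return sum(factor_scores)

def risk_category(score):
    """Map a score to a category, assuming higher scores mean higher risk."""
    if not 0.0 <= score <= MAX_SCORE:
        raise ValueError("score outside the 0 to 40.99 scale")
    return LABELS[bisect_right(CUTOFFS, score)]

# Hypothetical producer with identical scores on every factor.
example_score = risk_score([1.5] * 16)
example_label = risk_category(example_score)
```

Keeping the cut-offs in one list makes periodic recalibration a matter of updating that list, without touching the classification logic.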
Credit Risk Profiles
The CHAID classification tree was used to identify distinct credit risk profiles among the 100 validated coffee producers. The model segments producers into three categories—low risk, medium risk, and no risk—based on statistically significant splits (Figure 5), which are summarized into operational profiles in Table 10.
The first and most influential split is soil analysis (χ2 = 25.455; df = 2; p < 0.001), indicating that the adoption of basic technical practices is the strongest discriminator of credit risk. Among producers who conduct soil analyses, 68% are classified as low risk, 30% as medium risk, and only 2% as no risk. In contrast, producers who do not conduct soil analyses are predominantly classified as medium risk (80%).
Among producers who perform soil analyses, technical assistance is the second-level discriminator (χ2 = 8.981; df = 2; p = 0.011). Of those receiving technical assistance, 82.8% fall in the low-risk category, while those without assistance are mainly classified as medium risk (52.4%). Within the latter group, household size further differentiates risk: small households (1–3 members) are 100% medium risk, whereas 76.9% of medium-sized households (4–6 members) are low risk. This suggests that household labor availability partially compensates for the absence of formal technical support.
For producers who do not conduct soil analyses, membership in a producers’ association is the main mitigating factor (χ2 = 6.272; df = 1; p = 0.012). Among non-members, 90.6% fall in the medium-risk category, while among association members this share drops to 61.1%. Within this subgroup, additional income sources further reduce risk: 23.1% of producers with supplementary income are low risk, compared with 100% medium risk among those without income diversification. Coffee-growing experience also plays a relevant role: 66.7% of producers with more than five years of experience are low risk, whereas 88.9% of those with five years or less are medium risk.
The CHAID model identifies a small number of coherent and interpretable risk profiles. Low-risk producers are characterized by the adoption of technical practices (soil analysis and technical assistance), greater experience, and complementary sources of income or household labor stability. Medium-risk producers are primarily those lacking soil analysis, organizational membership, or income diversification, with probabilities of medium risk exceeding 80%.
Although the model distinguishes three risk categories, the “no risk” category is empirically rare (2% of the validation sample). This reflects the conservative construction of the index, where “no risk” represents a theoretical benchmark rather than a typical real-world condition. Even technically advanced producers remain exposed to climatic, price, and institutional risks. For this reason, in applied credit decisions, the “no risk” and “low risk” categories may be interpreted jointly as a low-risk segment, while their conceptual distinction is preserved within the index framework.
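The p-values attached to the CHAID splits in this subsection can be checked from the reported chi-square statistics and degrees of freedom. The sketch below assumes unadjusted Pearson chi-square p-values and uses closed forms that hold only for one or two degrees of freedom. The function name is hypothetical.

```python
import math

def chi2_pvalue(statistic, df):
    """Upper-tail p-value of a chi-square statistic for df = 1 or 2.

    Closed forms: df = 1 uses the complementary error function and
    df = 2 reduces to an exponential; other df are not handled here.
    """
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        return math.exp(-statistic / 2.0)
    raise ValueError("only df = 1 or df = 2 is supported in this sketch")

# Split statistics as reported in the text.
splits = {
    "soil analysis": (25.455, 2),
    "technical assistance": (8.981, 2),
    "association membership": (6.272, 1),
}
p_values = {name: chi2_pvalue(stat, df) for name, (stat, df) in splits.items()}
```

The resulting values are consistent with those reported: below 0.001 for soil analysis, about 0.011 for technical assistance, and about 0.012 for association membership.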

3.2.4. Characteristics of Coffee Growers Receiving Higher Loan Amounts

The multiple regression model (ANOVA, F = 168.478; p < 0.001; Table A2) explained 98% of the variance in the loan amount received (R2 = 0.977; adjusted R2 = 0.971; Table A3 and Table A4). Model assumptions were satisfied, including normality (Kolmogorov–Smirnov p = 0.090; Shapiro–Wilk p = 0.116), homoscedasticity, and the absence of severe multicollinearity (VIF values below 5).
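The fit statistics quoted above follow directly from the sums of squares and degrees of freedom in Table A2. The following sketch recomputes them as a consistency check. The function name is hypothetical, and the inputs are the values printed in the appendix.

```python
# Sums of squares and degrees of freedom as reported in Table A2.
SS_REGRESSION = 4_305_720_320_332_348.0
SS_RESIDUAL = 100_948_225_050_433.83
DF_REGRESSION = 20
DF_RESIDUAL = 79

def anova_summary(ss_reg, ss_res, df_reg, df_res):
    """Derive R2, adjusted R2, and the F statistic from an ANOVA table."""
    ss_total = ss_reg + ss_res
    df_total = df_reg + df_res
    r2 = ss_reg / ss_total
    adjusted_r2 = 1.0 - (ss_res / df_res) / (ss_total / df_total)
    f_stat = (ss_reg / df_reg) / (ss_res / df_res)
    return r2, adjusted_r2, f_stat

r2, adjusted_r2, f_stat = anova_summary(
    SS_REGRESSION, SS_RESIDUAL, DF_REGRESSION, DF_RESIDUAL
)
```

With 20 predictors and 100 observations, the gap between R2 and adjusted R2 reflects the penalty for model size.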
The model identified six variables as significant predictors of the amount of credit available to coffee growers (Table A3 and Table A4). First, coffee-growing experience (β = 794,705.19; p = 0.03) was positively associated with credit availability. Technical assistance (β = 558,280.15; p = 0.031) was also positively associated with higher loan amounts. Similarly, the presence of additional income (β = 629,263.25; p = 0.017), farm size (β = 0.068; p = 0.046), and total income (β = 0.0614; p < 0.001) all exhibited positive and statistically significant effects. In contrast, cost per hectare had a significant negative effect (β = −0.790; p = 0.007), indicating that higher production costs are associated with lower loan amounts.
Overall, these results suggest that coffee growers with greater experience, access to technical support, diversified income sources, larger farm sizes, and controlled production costs tend to receive the highest loan amounts within the rural financial system.

3.3. Summary of Key Findings

This section synthesizes the main empirical findings of the study to facilitate a concise understanding of the core results.
  • Finding 1: Distinct Credit Risk Profiles among Coffee Producers
The CHAID analysis reveals clearly differentiated credit risk profiles among smallholder coffee producers. These profiles are primarily shaped by productive and technical characteristics rather than by traditional financial variables alone. In particular, the adoption of soil analysis practices and access to technical assistance emerge as the strongest predictors of lower credit risk.
  • Finding 2: Key Determinants of Credit Amounts
The regression analysis indicates that the loan amount accessible to producers is mainly determined by productive capacity and financial resilience. Greater farming experience, access to technical support, income diversification, and lower production costs are consistently associated with higher approved loan amounts.
  • Finding 3: Added Value of the Composite Risk Index
By integrating socio-economic, productive, and contextual variables, the proposed risk index allows for a more nuanced classification of producer risk profiles compared to traditional approaches. This enables a better alignment between credit conditions and producers’ repayment capacity, particularly in contexts where collateral and formal credit histories are limited.

4. Discussion

4.1. Credit Risk Profiles Among Coffee Growers

The identified profiles confirm that the credit risk index for small-scale coffee producers is not determined solely by financial variables but reflects the interaction among technical, social, and productive dimensions. In this regard, factors such as soil analysis, technical assistance, and participation in producer associations act as mitigating factors against credit risk. Notably, soil analysis emerged as the initial variable in the classification tree, functioning as an indicator of the producer’s level of technological adoption. Consistent with these findings, several studies have reported that producers who conduct technical diagnostics exhibit greater productive capacity and stronger financial management, thereby reducing credit risk (ICO, 2021; Villarreal, 2017; De Janvry et al., 2010).
Moreover, membership in associations and income diversification operate as mechanisms of economic resilience. Access to formal associations not only facilitates technical assistance and collective negotiation but also enhances the confidence of financial institutions by mitigating information asymmetry (Beck et al., 2011; Roy & Shaw, 2021). These findings align with previous research emphasizing the role of social and technical capital in reducing the perceived risk among financial entities (De Janvry et al., 2010; Ghosh, 2013).
Finally, coffee-growing experience also functions as a mitigating factor for risk. Producers with more than five years of experience in coffee cultivation demonstrate a greater capacity to manage price fluctuations and production shocks. This outcome is consistent with the literature linking longer productive trajectories to a lower probability of financial default (Dercon, 2004; Breeden, 2021).

4.2. Characteristics of Coffee Growers with a Higher Probability of Receiving Large Loans

Access to rural credit in the coffee sector is determined by factors related to human, productive, and economic capital, which is consistent with previous research by Leibovich et al. (2023) and Wossen et al. (2017), where agricultural experience and technical assistance were found to enhance the confidence of financial institutions. Additionally, the presence of income sources beyond coffee production reflects a degree of economic diversification, which reduces producers’ vulnerability to productive risks, such as those arising from climatic conditions (Barrett et al., 2001).
This relationship aligns with evidence from developing economies, where property rights and productive assets significantly enhance credit access (Carter & Olinto, 2003), and where improved institutional frameworks for credit markets have demonstrated substantial welfare impacts (Diagne & Zeller, 2001).
Conversely, production costs exert a negative influence on credit allocation. Producers with high production expenses are perceived by financial institutions as less financially sustainable, as reduced profit margins directly affect the repayment capacity of coffee growers (FNC, 2022).

4.3. Financial Risk Indices for the Financial Inclusion of Coffee Producers

The construction of an alternative credit risk index based on variables related to human, productive, and economic capital constitutes an effective strategy for improving access to financing in rural areas. This approach supports the proposed hypothesis that incorporating non-financial information derived from agricultural activity can enhance the prediction of credit risk, thereby contributing to greater financial inclusion among small producers who have traditionally been excluded from the formal financial system.
First, structuring credit products with a single annual payment reflects the productive reality of Colombian coffee growers, whose income to meet financial obligations typically derives from the main harvest. Reports indicate that more than 70% of producers in the municipalities of Pitalito and Chinchiná depend on a single main source of income from this activity (FNC, 2020). In this context, the monthly payment schemes required by traditional banking institutions fail to account for the liquidity dynamics of coffee producers (BID, 2018). Consequently, financial products that are aligned with the agricultural cycle have been shown to significantly reduce delinquency rates and increase farmers’ willingness to apply for credit (Dorfleitner et al., 2017).
Second, the development of a qualitative credit risk index represents a major contribution to financial inclusion. Unlike traditional models based on credit history and physical collateral, this approach incorporates factors such as coffee-growing experience, participation in associations, land tenure, and adoption of technical assistance as variables that enhance financial stability. These findings are consistent with De Janvry et al. (2010), who demonstrated that membership in agricultural associations strengthens the confidence of financial institutions in highly informal contexts. Similarly, studies in India and Sub-Saharan Africa have shown that the inclusion of socio-productive variables improves the prediction of default in rural microcredit (Ghosh, 2013; Björkegren & Grissen, 2020). Education is also recognized as a relevant factor for assessing farmers’ knowledge and managerial skills, which may influence their financial management capacity (Z. Li & Zhang, 2022; Urrea & Maldonado, 2011).
Income diversification is another factor that increases the likelihood of credit access. This finding is consistent with studies conducted in Nicaragua and Guatemala, which revealed that farmers engaged in complementary economic activities presented a lower probability of loan default (Guirkinger & Boucher, 2008). Similarly, Nuñez and Osorio-Caballero (2021) found that households with additional income sources such as remittances or off-farm employment were able to maintain more stable investments in coffee plantations during periods of price decline.
Third, applying a percentage adjustment by risk level (high, medium-high, medium, and low) to the maximum loan amount helps prevent over-indebtedness and ensures financial sustainability by linking loan size to the borrower’s risk profile. This mechanism balances financial inclusion with the mitigation of systemic risks (World Bank, 2020a). Such a model is particularly useful in managing default risk within the agricultural sector, where the non-performing loan ratio reached 7.2% in 2022 (Banco de la República, 2023). Furthermore, assessing repayment capacity in relation to the profitability of agricultural activities contributes to lower financial risk and facilitates access to agricultural credit (Aitken, 2017).
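The percentage adjustment described in the preceding paragraph amounts to scaling the maximum loan amount by a factor tied to the risk level. The text does not state the percentages applied, so the factors in the sketch below are purely hypothetical, as is the function name.

```python
# Hypothetical adjustment factors; the text does not state the percentages.
ADJUSTMENT_BY_RISK = {
    "low": 1.00,
    "medium": 0.80,
    "medium-high": 0.60,
    "high": 0.40,
}

def adjusted_loan_ceiling(max_loan_cop, risk_level):
    """Scale the maximum loan amount by the factor for the risk level."""
    try:
        factor = ADJUSTMENT_BY_RISK[risk_level]
    except KeyError:
        raise ValueError(f"unknown risk level: {risk_level!r}") from None
    return max_loan_cop * factor

ceiling = adjusted_loan_ceiling(10_000_000, "medium")
```

Under these illustrative factors, a medium-risk producer with a 10,000,000 COP maximum would face a ceiling of 8,000,000 COP.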
Fourth, incorporating economic variables at the municipal level—such as estimated prices, average income, and expenses of producers—enhances both the credit risk model and the determination of appropriate loan amounts (BID, 2018; ICO, 2021). This is because such variables help contextualize financial performance in relation to external factors such as climatic variations and international price volatility (FAO, 2020). This finding aligns with studies by the World Bank (2020a) and De Janvry et al. (2010), which emphasize the importance of integrating contextual and productive variables into credit models, as they reduce the probability of default and allow for the allocation of loan amounts that are more consistent with the producers’ economic reality.
Finally, the relevance of this credit risk index lies in its ability to capture the realities of the coffee sector, where over 90% of producers operate on plots smaller than five hectares, and profit margins are constrained by rising costs of fertilizers, labor, and services (ICO, 2021; FNC, 2020). For instance, production costs per hectare of coffee in Colombia exceed 11 million pesos, while income depends heavily on productivity and the volatility of international coffee prices (DANE, 2023).

5. Conclusions

The alternative credit risk index that incorporates variables related to human, productive, and economic capital is an effective tool to enhance financial inclusion for small-scale coffee growers, overcoming the limitations of traditional models based solely on financial information.
Adapting financial products to the coffee production cycle, such as annual payments aligned with the main harvest, increases producers’ willingness to apply for loans and reduces default rates, highlighting the need for flexible schemes in the rural sector.
Factors such as farming experience, participation in associations, income diversification, and the adoption of technical assistance are key determinants for credit access and the reduction in default risk, underscoring the importance of evaluating technical and social dimensions in addition to financial ones.
The integration of contextual economic variables, such as estimated prices and municipal-level costs, allows for better adjustment of loan amounts and improves the accuracy of risk assessment, contributing to a more equitable, sustainable, and context-appropriate rural financial system for coffee growers.

Limitations and Scope of Generalizability

The proposed index is calibrated for coffee producers in Colombia, specifically in the Cauca–Huila–Nariño corridor. It reflects the agroecological, institutional, and production conditions of this region. Application to other Colombian coffee regions, crops, or countries requires empirical re-validation and recalibration of factor weights to account for differences in production systems, market access, and institutional settings.
The model relies on formal documentation, including soil analysis records, technical assistance participation, and production histories. Consequently, information-poor and less formalized producers may be underrepresented or face greater uncertainty in scoring. Future applications should incorporate proxy variables and alternative data-collection strategies to reduce the risk of exclusion.
Expert-derived weights were elicited at a single point in time. Risk relationships may evolve with changes in prices, costs, climate conditions, and institutional coverage. Periodic recalibration and continued empirical validation are therefore recommended. Risk scores should be interpreted as a decision-support tool to promote responsible credit allocation and financial inclusion, rather than as a deterministic assessment.

Author Contributions

Conceptualization, J.F.C.O. and M.-C.O.; methodology, M.-C.O., I.D.L., J.F.C.O. and J.M.F.; formal analysis, M.-C.O. and I.D.L.; investigation, M.-C.O.; data curation, M.-C.O., I.D.L. and J.M.F.; writing—original draft preparation, M.-C.O. and J.F.C.O.; supervision, J.F.C.O.; project administration, J.F.C.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Colombian Ministry of Science, Technology and Innovation through Call 934 of 2023 for mission-oriented postdoctoral fellowships.

Institutional Review Board Statement

Ethical review and approval were waived for this study due to its exclusive use of minimal-risk, voluntary interviews with adult participants, without the collection of identifiable or sensitive personal information.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data supporting this study are not publicly available due to corporate privacy and confidentiality restrictions imposed by Ecotecma SAS. The data may be obtained from the corresponding author upon reasonable request and with permission from Ecotecma SAS.

Acknowledgments

During the preparation of this manuscript, the authors used ChatGPT (OpenAI, GPT-5.1) for text generation and language editing. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors were employed by Ecotecma SAS. All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DANE: Departamento Administrativo Nacional de Estadística
FNC: Federación Nacional de Cafeteros de Colombia

Appendix A

Figure A1. Stages of variable selection for the credit risk index.
  • Methodological Diagram of the Index Design Process
To summarize the stages, activities, and methodological decisions made throughout the research, a flow diagram was developed using Business Process Model and Notation (BPMN), representing the process from the systematic literature review to validation and loan amount estimation (Figure A2).
Figure A2. BPMN Diagram of the Methodological Process for the Design of the Credit Risk Index. Note: BPMN notation is used: circles indicate start/end events, triangles represent gateways or intermediate events, solid arrows show process flow, and dashed lines denote conditional stages.
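Table A1. Normality test.
| | Kolmogorov–Smirnov a: Statistic | Kolmogorov–Smirnov a: df | Kolmogorov–Smirnov a: Sig. | Shapiro–Wilk: Statistic | Shapiro–Wilk: df | Shapiro–Wilk: Sig. |
|---|---|---|---|---|---|---|
| Standardized Residual | 0.083 | 100 | 0.090 | 0.979 | 100 | 0.116 |
a Lilliefors significance correction.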
Table A2. ANOVA test.
| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 4,305,720,320,332,348.000 | 20 | 215,286,016,016,617.400 | 168.478 | 0.000 a |
| | Residual | 100,948,225,050,433.830 | 79 | 1,277,825,633,549.795 | | |
| | Total | 4,406,668,545,382,782.000 | 99 | | | |
Dependent variable: credit_amount (COP); a Predictors: (Constant), Total income, Coffee ID, Age (years), Coffee experience (years), Family members, Land ownership, Agricultural insurance, Technical assistance, Previous credits, Additional income, Association member, Infrastructure, Type of marketing, Coffee type, Soil analysis, Expenses (COP), Educational level, Farm size (ha), cost_per_hectare (COP), Yield prediction (Ton ha−1).
Table A3. Multiple regression.
| Model | R | R Square | Adjusted R Square | Standard Error of the Estimate |
|---|---|---|---|---|
| 1 | 0.988 a | 0.977 | 0.971 | 1,130,409.49817 |
a Predictors: (Constant), Total income, Coffee ID, Age (years), Coffee experience (years), Family members, Land ownership, Agricultural insurance, Technical assistance, Previous credits, Additional income, Association member, Infrastructure, Type of marketing, Coffee type, Soil analysis, Expenses (COP), Educational level, Farm size (ha), cost_per_hectare (COP), Yield prediction (Ton ha−1).
Table A4. Regression Coefficients and Collinearity Diagnostics for the Credit Amount Model a.
| Model 1 | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. | Collinearity Statistics |
|---|---|---|---|---|---|---|
| (Constant) | −10,989,198.734 | 3,321,727.038 | | −3.308 | | |
| Age (years) | 115,076.395 | 162,044.594 | 0.013 | 0.710 | | |
| Family members | 8611.623 | 154,341.965 | 0.001 | 0.056 | | |
| Educational level | 179,623.144 | 163,415.836 | 0.022 | 1.099 | | |
| Coffee experience (years) | 794,705.194 | 257,812.064 | 0.060 | 3.082 | | |
| Technical assistance | 558,280.157 | 253,640.720 | 0.042 | 2.201 | | |
| Soil analysis | 488,137.012 | 264,178.561 | 0.037 | 1.848 | | |
| Coffee type | 61,886.670 | 95,449.324 | 0.012 | 0.648 | | |
| Infrastructure | 63,517.876 | 241,965.371 | 0.005 | 0.263 | | |
| Association member | 410,575.894 | 254,416.504 | 0.031 | 1.614 | | |
| Land ownership | −287,678.654 | 240,709.448 | −0.022 | −1.195 | | |
| Additional income | 629,263.253 | 258,358.888 | 0.047 | 2.436 | | |
| Farm size (ha) | 401,455.387 | 230,019.261 | 0.068 | 1.745 | | |
| Previous credits | 213,702.691 | 253,741.750 | 0.016 | 0.842 | | |
| Agricultural insurance | 286,933.931 | 241,376.201 | 0.022 | 1.189 | | |
| Coffee ID | 122,631.480 | 254,022.025 | 0.009 | 0.483 | | |
| Type of marketing | 221,577.711 | 265,911.256 | 0.017 | 0.833 | | |
| Cost_per_hectare (COP) | −0.790 | 0.284 | −0.135 | −2.780 | | |
| Expenses (COP) b | 2.303 × 10^−12 | 0.000 | 0.004 | 0.186 | | |
| Yield prediction (Ton ha−1) | −1,263,408.036 | 1,311,554.419 | −0.059 | −0.963 | | |
| Total income | 0.614 | 0.040 | 1.070 | 15.399 | | |
a Dependent variable: credit_amount (COP); b Total expenses scale with farm size and income and therefore lose explanatory power once these variables are controlled for, while cost per hectare captures cost intensity and remains a significant predictor of loan capacity.

Appendix B

This appendix contains technical details of model development and comparison. Table A5 summarizes results from 132 model configurations, including different:
  • Regression techniques (MLR, SVR, RFR)
  • Feature selection strategies
  • Time ranges and temporal aggregations
  • Geographic regions and target variables
Complete descriptions of hyperparameter optimization procedures, K-fold cross-validation strategies, and detailed computational specifications are available from the corresponding author upon request.
Table A5. Tests for the coffee cherry production estimation model.
| Dataset Type | Regression Model | Feature Selection | Time Range | Temporal Aggregation | Region | Target Variable | MAE | MAPE | R2 | Max Error | RMSE | Selected Features |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Aggregated | RFR | No | First 6 months | Month | South | Log_Production | 0.23 | 0.03 | 0.96 | 0.70 | 0.29 | 21 |
| Aggregated | RFR | No | First 6 months | Stages | Central South | Log_Production | 0.17 | 0.02 | 0.95 | 0.53 | 0.22 | 18 |
| Aggregated | RFR | No | First 6 months | Stages | National | Log_Production | 0.38 | 0.06 | 0.91 | 3.82 | 0.55 | 10 |
| Aggregated | RFR | No | First 6 months | Stages | National | Log_Production | 0.38 | 0.06 | 0.91 | 3.99 | 0.55 | 19 |
| Aggregated | RFR | Yes | First 6 months | Stages | National | Log_Production | 0.38 | 0.06 | 0.91 | 3.88 | 0.55 | 19 |
| Aggregated | RFR | Yes | First 6 months | Stages | National | Log_Production | 0.39 | 0.06 | 0.91 | 3.72 | 0.56 | 19 |
| Aggregated | RFR | No | First 6 months | Stages | National | Log_Production | 0.38 | 0.06 | 0.91 | 3.84 | 0.56 | 19 |
| Aggregated | RFR | No | First 6 months | Month | Central North | Log_Production | 0.28 | 0.03 | 0.90 | 1.38 | 0.38 | 21 |
| Aggregated | RFR | Yes | First 6 months | Month | Central South | Log_Production | 0.24 | 0.03 | 0.89 | 0.82 | 0.31 | 21 |
| Aggregated | RFR | No | First 6 months | Month | Central South | Log_Production | 0.24 | 0.03 | 0.89 | 0.82 | 0.31 | 21 |
| Aggregated | RFR | No | First 6 months | Month | National | Log_Production | 0.41 | 0.07 | 0.89 | 4.44 | 0.61 | 22 |
| Aggregated | RFR | No | First 6 months | Month | National | Log_Production | 0.43 | 0.07 | 0.88 | 4.40 | 0.63 | 15 |
| Aggregated | RFR | No | First 6 months | Month | National | Log_Production | 0.43 | 0.07 | 0.88 | 4.61 | 0.64 | 22 |
| Aggregated | RFR | No | First 6 months | Stages | Central North | Log_Production | 0.26 | 0.03 | 0.88 | 2.42 | 0.41 | 18 |
| Aggregated | RFR | Yes | First 6 months | Month | Central North | Log_Production | 0.29 | 0.04 | 0.87 | 2.47 | 0.43 | 7 |
| Aggregated | RFR | No | Coffee Year | Month | National | Log_Production | 0.49 | 0.08 | 0.85 | 3.64 | 0.71 | 40 |
| Aggregated | RFR | Yes | First 6 months | Month | National | Log_Production | 0.48 | 0.08 | 0.82 | 4.98 | 0.79 | 22 |
| Aggregated | RFR | No | Coffee Year | Stages | National | Log_Production | 0.46 | 0.07 | 0.81 | 4.81 | 0.80 | 28 |
| Aggregated | RFR | Yes | Coffee Year | Stages | National | Log_Production | 0.58 | 0.09 | 0.80 | 4.25 | 0.83 | 9 |
| Aggregated | RFR | Yes | Coffee Year | Month | National | Log_Production | 0.60 | 0.09 | 0.80 | 3.86 | 0.83 | 10 |
| Aggregated | RFR | No | First 6 months | Stages | North | Log_Production | 0.52 | 0.08 | 0.80 | 3.03 | 0.70 | 18 |
| Aggregated | RFR | No | First 6 months | Month | North | Log_Production | 0.55 | 0.09 | 0.78 | 3.01 | 0.73 | 21 |
| Aggregated | RFR | Yes | First 6 months | Month | National | Log_Production | 0.51 | 0.08 | 0.76 | 4.94 | 0.90 | 22 |
| Aggregated | SVR | Yes | First 6 months | Month | Central North | Log_Production | 0.24 | 0.03 | 0.93 | 0.77 | 0.31 | 7 |
| Aggregated | SVR | No | First 6 months | Stages | Central South | Log_Production | 0.24 | 0.03 | 0.90 | 0.82 | 0.30 | 18 |
| Aggregated | SVR | No | First 6 months | Stages | South | Log_Production | 0.39 | 0.05 | 0.85 | 1.58 | 0.53 | 18 |
| Aggregated | SVR | No | First 6 months | Month | Central South | Log_Production | 0.23 | 0.03 | 0.89 | 1.35 | 0.32 | 21 |
| Aggregated | MLR | No | First 6 months | Stages | Central South | Log_Production | 0.33 | 0.04 | 0.77 | 1.63 | 0.46 | 18 |
| Aggregated | MLR | No | First 6 months | Month | Central South | Log_Production | 0.31 | 0.04 | 0.80 | 1.57 | 0.44 | 21 |

References

  1. Aitken, R. (2017). “All data is credit data”: Constituting the unbanked. Competition & Change, 21(4), 1–28.
  2. Banco Agrario. (2023). Informe de tasas de interés crédito agropecuario. Banco Agrario de Colombia.
  3. Banco de la República. (2023). Informe de estabilidad financiera. Banco de la República.
  4. Banco Interamericano de Desarrollo (BID). (2018). Inclusión financiera en América Latina y el Caribe: Acceso, uso y calidad. BID.
  5. Barrett, C. B., Reardon, T., & Webb, P. (2001). Nonfarm income diversification and household livelihood strategies in rural Africa. Food Policy, 26(4), 315–331.
  6. Beck, T., Demirgüç-Kunt, A., & Martínez Peria, M. S. (2011). Bank financing for SMEs: Evidence across countries and bank ownership types. Journal of Financial Services Research, 39(1–2), 35–54.
  7. Björkegren, D., & Grissen, D. (2020). Behavior revealed in mobile phone usage predicts loan repayment. The World Bank Economic Review, 34(3), 618–634.
  8. Breeden, J. L. (2021). A survey of machine learning in credit risk. Journal of Credit Risk, 17(3), 1–60.
  9. Carter, M. R., & Olinto, P. (2003). Getting institutions right for whom? Credit constraints and the impact of property rights on the quantity and composition of investment. American Journal of Agricultural Economics, 85(1), 173–186.
  10. Castro, C., & Garcia, K. (2014). Default risk in agricultural lending: The effects of commodity price volatility and climate. Inter-American Development Bank.
  11. Clemen, R. T., & Winkler, R. L. (1999). Combining probability distributions from experts in risk analysis. Risk Analysis, 19, 187–203.
  12. De Janvry, A., McIntosh, C., & Sadoulet, E. (2010). The supply- and demand-side impacts of credit market information. Journal of Development Economics, 93(2), 173–188.
  13. Demirgüç-Kunt, A., Klapper, L., Singer, D., Ansar, S., & Hess, J. (2020). The global findex database 2017: Measuring financial inclusion and the fintech revolution. The World Bank Economic Review, 34, 2–8.
  14. Departamento Administrativo Nacional de Estadística (DANE). (n.d.). Índice de precios al consumidor (IPC) histórico. DANE. Available online: https://www.dane.gov.co/index.php/estadisticas-por-tema/precios-y-costos/indice-de-precios-al-consumidor-ipc/ipc-historico (accessed on 20 February 2025).
  15. Departamento Administrativo Nacional de Estadística (DANE). (2018). Encuesta nacional de presupuesto de los hogares 2016–2017. DANE. Available online: https://www.dane.gov.co/files/investigaciones/boletines/enph/boletin-enph-2017.pdf (accessed on 20 February 2025).
  16. Departamento Administrativo Nacional de Estadística (DANE). (2023). Sistema de información de precios y abastecimiento del sector agropecuario (SIPSA)—Informe anual 2023. DANE.
  17. Dercon, S. (2004). Growth and shocks: Evidence from rural Ethiopia. Journal of Development Economics, 74(2), 309–329.
  18. Diagne, A., & Zeller, M. (2001). Access to credit and its impact on welfare in Malawi (153p, Research Report No. 116). International Food Policy Research Institute.
  19. Dorfleitner, G., Just-Marx, S., & Priberny, C. (2017). What drives the repayment of agricultural micro loans? Evidence from Nicaragua. The Quarterly Review of Economics and Finance, 63, 89–100.
  20. Estrada, D., Granger, C., Salas, V., & Segura, J. S. (2025). Characterization of credit for small rural producers in Colombia: Recent developments and challenge. Documentos de Trabajo sobre Economía Regional y Urbana 337. Banco de la República de Colombia.
  21. Federación Nacional de Cafeteros de Colombia (FNC). (2019). Costos de producción de café en Colombia 2018–2019. FNC. Available online: https://federaciondecafeteros.org/app/uploads/2019/12/Econom%C3%ADa-Cafetera-No.-30_Web.pdf (accessed on 20 February 2025).
  22. Federación Nacional de Cafeteros de Colombia (FNC). (2020). El café en Colombia: Informe anual 2020. FNC.
  23. Federación Nacional de Cafeteros de Colombia (FNC). (2022). Informe de sostenibilidad económica del sector cafetero colombiano. FNC.
  24. Federación Nacional de Cafeteros de Colombia (FNC). (2023a). Estadísticas de productores cafeteros, departamento del Cauca. FNC.
  25. Federación Nacional de Cafeteros de Colombia (FNC). (2023b). Informe de gestión 2023. FNC.
  26. Food and Agriculture Organization of the United Nations (FAO). (2020). The state of agricultural commodity markets 2020: Agricultural markets and sustainable development—Global value chains, smallholder farmers and digital innovations. FAO. Available online: https://www.fao.org/interactive/state-of-agricultural-commodity-markets/2020/en/ (accessed on 5 September 2025).
  27. Genova, M. C., Yu, M., & Highhouse, S. (2012). Pushing the limits for judgmental consistency: Comparing random and unit weighting schemes to expert judgment. Journal of Behavioral Decision Making, 25(5), 435–449.
  28. Ghosh, M. (2013). Microfinance and rural poverty. In Liberalization, growth and regional disparities in India. India Studies in Business and Economics. Springer India.
  29. Green, K. C., Armstrong, J. S., & Cuzán, A. G. (2015). The aggregation of expert judgment: Do good things come to those who weight? Risk Analysis, 35(1), 5–11.
  30. Guirkinger, C., & Boucher, S. (2008). Credit constraints and productivity in Peruvian agriculture. Agricultural Economics, 39(3), 295–308.
  31. Hurley, M., & Adebayo, J. (2017). Credit scoring in the era of big data. Yale Journal of Law and Technology, 18(1), 148–216. Available online: https://yjolt.org/credit-scoring-era-big-data (accessed on 5 September 2025).
  32. IBM. (n.d.). Conceptos básicos de ayuda de CRISP-DM—Documentación de IBM. ibm.com. Available online: https://www.ibm.com/docs/es/spss-modeler/saas?topic=dm-crisp-help-overview (accessed on 25 October 2025).
  33. Instituto de Hidrología, Meteorología y Estudios Ambientales—IDEAM. (2024). Datos climatológicos y meteorológicos históricos. Sistema de Información Ambiental de Colombia (SIAC).
  34. International Coffee Organization (ICO). (2021). Coffee development report 2020: The value of coffee—Sustainability, inclusiveness, and resilience of the coffee global value chain. Available online: https://icocoffee.org/wp-content/uploads/2022/11/CDR2020.pdf (accessed on 5 September 2025).
  35. Keeney, R. L., McDaniels, T. L., & Swalm, A. (2011). Value-focused thinking: A path to creative decision making. University of Cambridge Press.
  36. Kitchenham. (2007). Guidelines for performing systematic literature reviews in software engineering. Keele University and Durham University Joint Report, UK, Evidence-Based Software Engineering (EBSE) Group 2007-01. Available online: https://legacyfileshare.elsevier.com/promis_misc/525444systematicreviewsguide.pdf (accessed on 5 May 2011).
  37. Leibovich, J., Córdoba, C., Méndez, J. D., Aguinaga, M., Izquierdo, J., & Buitrago, L. A. (2023). Determinantes socioeconómicos, productivos y de disponibilidad de productos financieros para el acceso y uso de crédito de los productores de café en Colombia (Working Paper). Universidad del Rosario.
  38. Li, Y., & Zhang, C. (2025). Farmers’ credit risk evaluation with an explainable hybrid machine learning model. Pacific-Basin Finance Journal, 89, 102744.
  39. Li, Z., & Zhang, Q. (2022). Credit index screening model of family farms and family ranches based on fuzzy bayesian theory of depth weighting. Complexity, 2022, 5381208.
  40. Liu, B., Ren, B., & Jin, F. (2025). Does climate risk affect the ease of access to credit for farmers? Evidence from CHFS. International Review of Economics & Finance, 97, 103813.
  41. Moreno-Menéndez, F. M., González-Prida, V., Pariona-Amaya, D., Zacarías-Rodríguez, V. E., Zacarías-Vallejos, V., Zacarías-Vallejos, S. R., Aguilar-Cuevas, L. A., & Campos-Carpena, L. P. (2025). Improving financial sustainability through effective credit risk management and human talent development in microfinance institutions. International Journal of Financial Studies, 13(2), 60.
  42. Nuñez, R., & Osorio-Caballero, M. I. (2021). Remittances, migration, and poverty. A study for Mexico and Central America. Investigación Económica, 80(318), 98.
  43. Roy, P. K., & Shaw, K. (2021). A multicriteria credit scoring model for SMEs using hybrid BWM and TOPSIS. Financial Innovation, 7(1), 77.
  44. Urrea, M. A., & Maldonado, J. H. (2011). Vulnerability and risk management: The importance of financial inclusion for beneficiaries of conditional transfers in Colombia. Canadian Journal of Development Studies, 32(4), 381–398.
  45. Villarreal, F. G. (Ed.). (2017). Inclusión financiera de pequeños productores rurales. Libros de la CEPAL, N° 147 (LC/PUB.2017/15-P). Comisión Económica para América Latina y el Caribe (CEPAL).
  46. World Bank. (2020a). Enabling inclusive agricultural finance. World Bank.
  47. World Bank. (2020b). World development report 2020: Trading for development in the age of global value chains. World Bank Group.
  48. Wossen, T., Abdoulaye, T., Alene, A., Feleke, S., Menkir, A., & Manyong, V. (2017). Impacts of extension access and cooperative membership on technology adoption and household welfare. Journal of Rural Studies, 54, 223–233.
  49. Zhang, C., & Li, Y. (2025). Digital inclusive finance harvest: Cultivating creditworthiness for small agricultural businesses. Pacific-Basin Finance Journal, 91, 102731.
Figure 1. Flowchart of the five-phase research methodology for developing and validating the credit risk index for smallholder coffee producers.
Figure 2. CRISP-DM methodology.
Figure 3. Summary of the systematic review.
Figure 4. Predicted vs. observed values across all regions. (a) Scatter plot (b) Time-series comparison plot.
Figure 5. Classification tree (CHAID) according to credit risk level.
Table 1. Composite variables and calculation methods.
| Category | Variable | Scale | Data Type | Source | Calculation Method |
| --- | --- | --- | --- | --- | --- |
| Income | 3.1 Producer income | Continuous | Quantitative | (FNC, 2023b; IDEAM, 2024) | Income = Load of dry parchment coffee × Price per load. The price was estimated using the equation: Price per load = (NY market price + Colombian premium − Coffee contribution − Costs) × Exchange rate × 205.25. The load of dry parchment coffee (production) was estimated using the CRISP-DM method. |
| Expenses | 3.2.1 Cost of living of the producer | Continuous | Quantitative | (DANE, 2018, 2023) | Monthly adjustment of living costs based on CPI variation. Projection performed using linear regression: Cost_t = β_0 + β_1 t + ε. |
| Expenses | 3.2.3 Management costs per hectare of coffee | Continuous | Quantitative | (FNC, 2019; DANE, 2018) | Multiple linear regression model: Cost = β_0 + β_1 (Year) + β_2 (Inflation) + ε. Farms were classified by size: <5 ha, 5–10 ha, >10 ha. |
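As a rough illustration of the income row in Table 1, the price and income formulas can be sketched in Python. Only the formula structure and the 205.25 factor come from the table; the function names, parameter names, units, and input values below are illustrative assumptions, not figures from the study.

```python
def price_per_load(ny_price, premium, contribution, costs, exchange_rate,
                   factor=205.25):
    """Price per load, following the structure of the Table 1 formula.

    Assumption: all price terms share one unit basis before the exchange
    rate is applied; the study's exact units are not restated here.
    """
    return (ny_price + premium - contribution - costs) * exchange_rate * factor


def producer_income(loads, load_price):
    """Income = loads of dry parchment coffee x price per load."""
    return loads * load_price


# Hypothetical inputs, chosen only to exercise the formula.
price = price_per_load(ny_price=2.00, premium=0.20, contribution=0.06,
                       costs=0.10, exchange_rate=4000.0)
income = producer_income(loads=10, load_price=price)
print(f"price per load: {price:,.0f}")
print(f"income: {income:,.0f}")
```

In the study the number of loads comes from the production model, so `loads` here stands in for that model's output.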
Table 2. Adjustment of payment capacity by risk score.
| Financial Risk Level | Percentage of Payment Capacity Allowed |
| --- | --- |
| No risk | 100% |
| Low risk | 80% |
| Medium risk | 60% |
| High risk | Credit denied |
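The adjustment in Table 2 amounts to a lookup followed by a multiplication. A minimal sketch, assuming the percentage is applied directly to an estimated annual repayment capacity; the function name and the example amount are hypothetical:

```python
# Share of estimated repayment capacity allowed per risk level (Table 2).
# None marks the "credit denied" case.
ALLOWED_SHARE = {
    "No risk": 1.00,
    "Low risk": 0.80,
    "Medium risk": 0.60,
    "High risk": None,
}


def adjusted_payment_capacity(annual_capacity, risk_level):
    """Return the capacity counted toward a loan, or None if credit is denied."""
    share = ALLOWED_SHARE[risk_level]
    if share is None:
        return None
    return annual_capacity * share


# Hypothetical annual repayment capacity in COP.
for level in ALLOWED_SHARE:
    print(level, adjusted_payment_capacity(5_000_000, level))
```

Returning `None` for the high-risk level keeps "credit denied" distinct from a capacity of zero, which a caller might otherwise treat as a valid amount.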
Table 3. Selected factors for the construction of the credit risk index.
| Category | Factor | Levels |
| --- | --- | --- |
| Social | Age | Young (<30 years); Middle-aged (30–50 years); Older (>50 years) |
| Social | Household size | Small household (1–3); Medium household (3–5); Large household (>6) |
| Social | Educational level | Basic or no education; Secondary; Higher or technical education |
| Social | Coffee-growing experience greater than 5 years | Yes; No |
| Technical capacity and experience | Technical assistance | Yes; No |
| Technical capacity and experience | Soil analysis | Yes; No |
| Technical capacity and experience | Type of coffee | Conventional; Organic; Certified; Differentiated; Organic + certified |
| Technical capacity and experience | Coffee processing infrastructure | Yes; No |
| Associativity | Membership in an association | Yes; No |
| Finance | Land tenure | Owned; Rented |
| Finance | Additional income | Yes; No |
| Productivity | Farm size | ≤1 ha; 1–5 ha; 5–10 ha; ≥10 ha |
| Relationship with financial entities | Previous loans | Yes; No |
| Relationship with financial entities | Agricultural insurance | Yes; No |
| Relationship with financial entities | Coffee grower ID card | Yes; No |
| Commercialization | Type | Direct; Indirect |
Table 4. Base prices of household expenses 2016–2017.
| Category | Base Price 2016–2017 (COP) |
| --- | --- |
| Food | 310,000 |
| Housing, water, electricity, gas | 229,000 |
| Goods and services | 159,000 |
| Transportation | 98,000 |
| Clothing | 68,000 |
| Health | 29,000 |
| Information and communication | 29,000 |
| Other | 33,000 |
Table 5. Linear regression results, errors, and cost of living projections.
| Equation | R² | RMSE | MAE | Projection January 2026 | Projection January (Next Value Unclear) |
| --- | --- | --- | --- | --- | --- |
| Price(t) = 861,498 + 7236.2 × t | 0.95 | 39,453.58 | 32,708.66 | 1,563,409 | 1,650,244 |
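The fitted line in Table 5 can be evaluated directly. In the sketch below, the time index is an assumption: t is taken to count months, and t = 97 is simply the value that reproduces the January 2026 column, while t = 109 (twelve steps later) reproduces the second projection. The function name is hypothetical.

```python
def projected_cost_of_living(t):
    """Evaluate the fitted line from Table 5: Price(t) = 861,498 + 7236.2 * t."""
    return 861_498 + 7236.2 * t


# Assumption: t = 97 matches the January 2026 column; the study's own
# time origin is not restated in this section.
for t in (97, 109):
    print(t, round(projected_cost_of_living(t)))
```

Rounded to whole pesos, the two evaluations give 1,563,409 and 1,650,244, the two projections listed in the table.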
Table 6. Multiple linear regression results by farm size category.
| Farm Category | Estimated Model | Adjusted R² | F Critical Value |
| --- | --- | --- | --- |
| ≤5 ha | −1,490,846,920.44 + 741,459.48 × Year + 29,277,281.88 × Inflation | 0.96 | 0.0189 |
| 5–10 ha | −1,053,855,607.54 + 525,762.93 × Year + 37,743,972.76 × Inflation | 0.96 | 0.0183 |
| ≥10 ha | −1,534,129,603.20 + 764,435.34 × Year + 20,809,731.14 × Inflation | 0.72 | 0.0662 |
Table 7. Coffee production costs per hectare (COP).
| Year | Inflation | ≤5 ha | 5–10 ha | ≥10 ha |
| --- | --- | --- | --- | --- |
| 2019 | 0.038 | 6,910,000 | 8,932,000 | 9,169,300 |
| 2020 | 0.0161 | 7,744,800 | 8,630,300 | 10,493,100 |
| 2021 | 0.0562 | 9,324,000 | 11,405,900 | 13,376,400 |
| 2022 | 0.1312 | 12,486,000 | 14,179,000 | 14,626,000 |
| 2023 | 0.0928 | 11,536,000 | 13,027,000 | 13,263,000 |
| 2024 | 0.052 | 11,389,478 | 12,251,253 | 14,169,633 |
| 2025 | 0.049 | 11,301,646 | 12,138,021 | 14,107,204 |
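As a consistency check, the Table 6 models can be evaluated with the 2024 inputs from Table 7. The sketch below copies the published coefficients; the dictionary keys and function name are illustrative. With Year = 2024 and Inflation = 0.052, each result lands within a few pesos of the 2024 row of Table 7 (11,389,478; 12,251,253; 14,169,633), the small gaps presumably reflecting rounding of the published coefficients.

```python
# Coefficients copied from Table 6: (intercept, year, inflation).
MODELS = {
    "<=5 ha": (-1_490_846_920.44, 741_459.48, 29_277_281.88),
    "5-10 ha": (-1_053_855_607.54, 525_762.93, 37_743_972.76),
    ">=10 ha": (-1_534_129_603.20, 764_435.34, 20_809_731.14),
}


def management_cost(category, year, inflation):
    """Cost per hectare (COP) from the Table 6 model for one farm-size class."""
    intercept, b_year, b_inflation = MODELS[category]
    return intercept + b_year * year + b_inflation * inflation


# Inflation is entered as a fraction (0.052), matching the Table 7 column.
for category in MODELS:
    print(category, round(management_cost(category, 2024, 0.052)))
```

Because the intercepts and year terms are both on the order of 10^9 and nearly cancel, the output is sensitive to the last digits of the coefficients, which is one reason to keep them at full published precision.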
Table 8. Selected factors and average scores for the credit risk index.
| Category | Factor | Levels | Average Score |
| --- | --- | --- | --- |
| Social | Age | Young (<30 years) | 2.0 |
| | | Middle (30–50 years) | 0.0 |
| | | Older (>50 years) | 1.0 |
| | Household size | Small household (1–3) | 1.2 |
| | | Medium household (3–5) | 0.5 |
| | | Large household (>6) | 2.0 |
| | Educational level | Basic or no education | 3.0 |
| | | Secondary | 1.5 |
| | | Higher or technical | 0.2 |
| | Coffee farming experience > 5 years | Yes | 0.2 |
| | | No | 2.3 |
| Technical capacity and experience | Technical assistance | Yes | 0.2 |
| | | No | 2.8 |
| | Soil analysis | Yes | 0.3 |
| | | No | 2.5 |
| | Coffee type | Conventional | 2.0 |
| | | Organic | 1.0 |
| | | Certified | 0.0 |
| | | Differentiated | 0.0 |
| | | Organic + certified | 0.0 |
| | Coffee processing infrastructure | Yes | 0.2 |
| | | No | 2.0 |
| Associativity | Membership in an association | Yes | 0.2 |
| | | No | 2.3 |
| Finances | Land tenure | Owned | 0.2 |
| | | Leased | 2.0 |
| | Additional income | Yes | 0.2 |
| | | No | 2.2 |
| Productivity | Farm size | ≤1 ha | 3.0 |
| | | 1–5 ha | 2.0 |
| | | 5–10 ha | 1.0 |
| | | ≥10 ha | 0.0 |
| Relationship with financial institutions | Previous loans | Yes | 0.8 |
| | | No | 1.8 |
| | Agricultural insurance | Yes | 0.2 |
| | | No | 2.7 |
| | Coffee grower ID | Yes | 0.2 |
| | | No | 2.0 |
| Commercialization | Type | Direct | 0.8 |
| | | Indirect | 2.2 |
Table 9. Categories, score ranges, and interpretation of the credit risk index.
| Category | Score Range | Interpretation |
| --- | --- | --- |
| High risk | 31–40.99 | High level of risk exposure with an elevated probability of default. |
| Medium risk | 21–30.99 | Significant presence of unfavorable factors. Requires monitoring. |
| Low risk | 11–20.99 | Some risk factors are present but not decisive. |
| No risk | 0–10.99 | Optimal profile, with all or nearly all factors in favorable conditions. |
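Tables 8 and 9 together suggest a simple scoring routine. The sketch below assumes the index is the plain sum of the Table 8 average scores for a producer's levels, and that the Table 9 bands can be read as lower bounds at 11, 21, and 31. Both are assumptions made for illustration, as are the key names and the example producer.

```python
# Average expert scores from Table 8, keyed by factor and level.
SCORES = {
    "age": {"young": 2.0, "middle": 0.0, "older": 1.0},
    "household_size": {"small": 1.2, "medium": 0.5, "large": 2.0},
    "education": {"basic_or_none": 3.0, "secondary": 1.5,
                  "higher_or_technical": 0.2},
    "experience_over_5_years": {"yes": 0.2, "no": 2.3},
    "technical_assistance": {"yes": 0.2, "no": 2.8},
    "soil_analysis": {"yes": 0.3, "no": 2.5},
    "coffee_type": {"conventional": 2.0, "organic": 1.0, "certified": 0.0,
                    "differentiated": 0.0, "organic_certified": 0.0},
    "processing_infrastructure": {"yes": 0.2, "no": 2.0},
    "association_member": {"yes": 0.2, "no": 2.3},
    "land_tenure": {"owned": 0.2, "leased": 2.0},
    "additional_income": {"yes": 0.2, "no": 2.2},
    "farm_size": {"<=1 ha": 3.0, "1-5 ha": 2.0, "5-10 ha": 1.0, ">=10 ha": 0.0},
    "previous_loans": {"yes": 0.8, "no": 1.8},
    "agricultural_insurance": {"yes": 0.2, "no": 2.7},
    "coffee_grower_id": {"yes": 0.2, "no": 2.0},
    "commercialization": {"direct": 0.8, "indirect": 2.2},
}


def risk_index(profile):
    """Sum the Table 8 score of the chosen level for every factor."""
    return sum(SCORES[factor][profile[factor]] for factor in SCORES)


def risk_category(index):
    """Map an index to a Table 9 band (assumed lower bounds: 11, 21, 31)."""
    if index < 11:
        return "No risk"
    if index < 21:
        return "Low risk"
    if index < 31:
        return "Medium risk"
    return "High risk"


# Hypothetical producer, invented for illustration only.
producer = {
    "age": "middle", "household_size": "medium", "education": "secondary",
    "experience_over_5_years": "yes", "technical_assistance": "no",
    "soil_analysis": "no", "coffee_type": "conventional",
    "processing_infrastructure": "yes", "association_member": "yes",
    "land_tenure": "owned", "additional_income": "no", "farm_size": "1-5 ha",
    "previous_loans": "no", "agricultural_insurance": "no",
    "coffee_grower_id": "yes", "commercialization": "indirect",
}
index = risk_index(producer)
print(round(index, 1), risk_category(index))
```

Under these assumptions the example producer scores about 21.2 and falls in the medium-risk band. Taking the highest score for every factor gives 36.8, so the sum stays inside the 0–40.99 range covered by Table 9.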
Table 10. Summary of credit risk profiles according to the CHAID classification tree.
| Profile | Main Conditions | Risk Distribution | Interpretation |
| --- | --- | --- | --- |
| Consolidated low-risk | Conduct soil analyses; receive technical assistance; households with 4–6 members; experience > 5 years; additional income | Low: 67–83%; Medium: 13–33%; No risk: up to 3% | Most robust profile, characterized by technical practices, household stability, and income diversification. |
| Partial low-risk | Conduct soil analyses; no technical assistance; medium-sized households (4–6 members) | Low: 76.9%; Medium: 23.1%; No risk: 0% | Although they lack technical assistance, household size mitigates risk. |
| Moderate medium-risk | Do not conduct soil analyses; members of a producers’ association; additional income | Medium: 61–77%; Low: 23–39%; No risk: 0% | Membership in an association and external income reduce risk, although moderate exposure remains. |
| High medium-risk | Do not conduct soil analyses; not association members; no additional income | Medium: 90–100%; Low: 0–9% | Medium risk predominates. Lack of technical practices and organizational support increases vulnerability. |
| Critical medium-risk | Small households (1–3 members); no technical assistance; no additional income | Medium: 100% | Most vulnerable profile, with small households and limited capacity for technological adoption. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
