1. Introduction
Non-payment of mandatory membership fees constitutes a systemic financial risk that undermines both the institutional sustainability of professional associations and the continuity of licensed practices. This phenomenon generates recurring liquidity constraints, disrupts medium- and long-term budget planning, and limits investment capacity in infrastructure expansion, welfare facilities, and continuing professional development programs. Consequently, membership delinquency emerges as a critical management challenge that requires evidence-based risk-mitigation strategies. This structural vulnerability extends beyond administrative inefficiency and directly constrains institutional growth, infrastructure maintenance, and service quality, thereby reinforcing the need for proactive financial governance. Previous studies demonstrate that delinquency reflects structured behavioral patterns, rather than random non-compliance. Age, income level, marital status, professional seniority, and employment status systematically influence the propensity to default or incur financial delinquency [
1,
2,
3,
4,
5,
6]. These findings establish that payment behavior is shaped by identifiable sociodemographic and professional characteristics, suggesting the feasibility of predictive approaches to delinquency management. In the field of credit risk management, machine learning methods, particularly ensemble algorithms, consistently outperform traditional statistical models on complex classification tasks. For example, Random Forest achieves accuracy above 89% in insurance applications [
7], while XGBoost demonstrates strong predictive performance and computational efficiency in financial risk assessment [
8]. Moreover, integrating these algorithms with balancing techniques such as the Synthetic Minority Oversampling Technique (SMOTE) effectively addresses imbalanced distributions in binary classification problems [
9]. These advances indicate that ensemble learning provides robust solutions for high-dimensional and imbalanced datasets in finance. Despite these advances, predictive modeling remains underexplored in professional association contexts, particularly in relation to disqualification due to non-payment. Conventional approaches, including logistic regression and discriminant analysis, often perform suboptimally in datasets characterized by nonlinear relationships, high dimensionality, and pronounced class imbalance [
10,
11]. Furthermore, few studies have integrated institutional variables, including membership type, payment history, banking institution, and payment location, with sociodemographic and academic characteristics into a unified predictive framework [
10,
11]. This limitation restricts the development of institution-specific risk mitigation strategies tailored to the operational realities of professional licensing organizations. This study develops a supervised machine learning model to estimate the probability of disqualification due to delinquency among licensed professionals in Peru. By integrating institutional financial transaction records with member registry data, this study evaluates multiple supervised algorithms under controlled conditions to identify the most effective approach for delinquency prediction. The analysis incorporates demographic, academic, and financial-institutional variables within a comprehensive predictive framework. The findings demonstrate that gradient-boosting frameworks, particularly CatBoost and XGBoost, exhibit robust discriminatory capacity in identifying members at risk of disqualification, outperforming traditional statistical methods and other machine-learning approaches. By systematically evaluating multiple algorithms across a large-scale institutional dataset, the analysis establishes an empirical foundation for preventive delinquency management systems within professional associations. The integration of institutional and sociodemographic variables with advanced modeling techniques enables the development of proactive, data-driven intervention strategies. The remainder of this article is structured as follows:
Section 2 establishes the Research Hypothesis and Conceptual Model;
Section 3 reviews the relevant literature;
Section 4 describes the data and methodology;
Section 5 presents the empirical results;
Section 6 discusses their implications; and
Section 7 concludes and outlines directions for future research.
2. Research Hypothesis and Conceptual Model
Building on behavioral economics, financial risk theory, life-cycle theory, and human capital theory, this study advances a framework that conceptualizes professional disqualification risk as a multifactorial institutional outcome. Non-payment does not occur by chance; it emerges from the systematic interaction of structural, professional, and behavioral determinants that can be modeled through supervised machine learning [
1,
10,
11]. By integrating individual attributes with institutional conditions, the framework delivers a coherent predictive architecture grounded in economic reasoning.
Behavioral economics and financial life-cycle theory treat age as a signal of economic stability, accumulated capital, and financial maturity [
2,
4,
5]. Evidence indicates that early-career professionals experience greater income volatility and reduced capacity to absorb shocks, thereby increasing delinquency risk [
1,
6,
12]. Accordingly, H1 predicts a negative association between age and professional disqualification risk: younger professionals are more likely to be classified as high-risk. In this way, life-cycle dynamics function as a structural driver of financial vulnerability.
Sociodemographic variables—including gender, marital status, and country of origin—capture structural differences in financial stability and compliance behavior [
1,
2,
5,
12]. These factors reflect broader socioeconomic conditions that shape distinct institutional risk profiles. Accordingly, H2 posits a significant association between sociodemographic characteristics and professional disqualification risk, extending the analysis beyond purely financial indicators.
Human capital theory provides the foundation for H3 by conceptualizing institutional seniority and number of registered specialties as accumulated investments in education, experience, and labor market positioning [
13]. Greater specialization and longer tenure signal more stable income expectations and stronger organizational commitment, thereby reducing the default probability. Accordingly, H3 posits a negative association between academic and professional variables and disqualification risk, framing them as protective factors against financial instability.
Within financial risk theory, historical payment behavior constitutes the most robust predictor of future default [
8,
11,
14,
15]. Indicators such as payment frequency, proportion of months with outstanding debt, and balance variability capture persistent behavioral patterns, rather than static attributes [
1,
10,
11]. Because these measures reflect dynamic financial discipline or instability, they are expected to display superior discriminatory capacity. Accordingly, H4 proposes that financial and payment behavior variables exhibit greater predictive power than static sociodemographic characteristics.
From an organizational perspective, administrative infrastructure (payment group, payment method, and banking institution) structures the context in which compliance decisions are made [
1,
10]. These institutional conditions may introduce operational frictions or facilitate timely payments through automated mechanisms. Thus, H5 posits that payment infrastructure variables contribute significantly to the classification of disqualification risk.
Finally, methodological considerations motivate H6. Research on imbalanced classification consistently shows that gradient boosting ensemble methods outperform traditional linear models when nonlinear relationships and high-dimensional interactions are present [
7,
8,
9,
16]. Given the heterogeneity and interplay among structural and behavioral predictors in this framework, H6 predicts that supervised ensembles—XGBoost, LightGBM, and CatBoost—achieve superior predictive performance compared to conventional statistical models for identifying high-risk members. Collectively, these hypotheses translate the conceptual framework into a coherent theoretical and methodological basis for predictive institutional risk assessment.
Conceptual Model
The model conceptualizes professional disqualification risk, operationalized as a binary dependent variable, as the outcome of systematic interactions among three theoretically grounded dimensions. The first dimension encompasses sociodemographic attributes that capture individual structural conditions; the second includes academic and professional characteristics that reflect educational trajectory and accumulated human capital [
13]; and the third comprises financial-institutional factors that integrate historical payment behavior with relevant administrative features [
8,
10,
11,
14,
15]. This tripartite structure delineates the determinants of the phenomenon with conceptual precision and aligns observable indicators with explicit theoretical foundations.
The framework guides variable selection and avoids a purely exploratory data mining approach. By integrating structural and behavioral dimensions within a supervised modeling architecture, the study compares predictive algorithms and evaluates the explanatory contribution of each conceptual domain to the risk classification. This design ensures coherence between theoretical foundations and empirical implementation.
Accordingly, the study extends beyond a technical comparison of modeling methods and advances a theory-grounded framework for preventive institutional risk management. By aligning conceptual rationale, variable operationalization, and empirical evaluation, it enhances explanatory rigor and establishes a solid foundation for decision-making based on predictive evidence.
3. Related Works
The use of supervised machine learning models has been widely documented for predicting complex events across various domains. This section organizes the literature into studies directly related to financial risk, delinquency prediction, and institutional compliance modeling, with the objective of positioning the present work within the current state of the art.
3.1. Models Applied to Financial Risk and Fraud Detection
Recent literature demonstrates that machine learning models consistently outperform traditional statistical approaches in fraud detection and financial risk prediction, particularly when integrating data balancing techniques and hybrid optimization strategies. In the health insurance domain, Nabrawi and Alanazi [
17] compare Random Forest, Logistic Regression, and Artificial Neural Networks on imbalance-corrected datasets and report anomaly-detection precision exceeding 98%, identifying policy type, education level, and claimant age as key determinants. Their findings confirm the superiority of ensemble-based methods in heterogeneous financial contexts.
Similarly, in the U.S. banking sector, Kusaya and O’Keefe [
18] evaluate supervised models—including Artificial Neural Networks, Gradient Boosting, Random Forest, Logistic Regression, and deep autoencoders—for predicting internal abuse and fraud. They demonstrate that model selection must prioritize predictive performance and out-of-sample generalization capacity rather than theoretical simplicity.
In financial risk management, Sundar et al. [
14] propose a hybrid framework combining CatBoost and Support Vector Machines with SMOTE and Principal Component Analysis, achieving 95.93% accuracy and outperforming Logistic Regression and Random Forest. Complementarily, Brygala and Korol [
15] show that gradient-based algorithms such as LightGBM, CatBoost, and XGBoost enhance personal insolvency prediction and employ SHAP values to interpret variable contributions, including income, credit denial history, and payment delays.
Collectively, these studies consolidate empirical evidence that advanced ensemble algorithms provide scalable, high-precision, and interpretable solutions for delinquency and fraud detection, particularly in high-dimensional and imbalanced financial datasets. However, their application to professional association contexts remains limited, thereby justifying the present study.
3.2. Models for Social and Professional Phenomena
Al-Alawi et al. [
13] proposed a hybrid model to predict academic and psychological stress in students, combining k-NN, Random Forest, XGBoost, and weighted voting techniques. The integration of multiple models enabled the effective handling of interrelated variables.
Akinyemi et al. [
19] applied SVM, Random Forest, k-NN, and neural networks to predict student dropout based on personal, academic, and social variables. Random Forest achieved accuracy above 90%, supported by systematic cross-validation and attribute selection.
Naik et al. [
20] evaluated Naïve Bayes, k-NN, and Decision Trees to predict salary class using the UCI Census dataset. Their findings validate the predictive value of demographic and professional attributes in administrative contexts, a methodological principle shared with the present research.
3.3. Theoretical Foundations and Variable Selection in Delinquency Modeling
Payment behavior has been extensively analyzed in behavioral economics and financial risk theory, which demonstrate that delinquency is shaped by sociodemographic, academic, and institutional determinants rather than purely by income constraints [
1,
6,
12]. Age, marital status, and gender are systematically associated with financial stability and compliance patterns, consistent with life-cycle finance theory and household decision-making models [
6,
12].
Academic and professional variables, including tenure and number of specialties, operate as proxies for accumulated human capital and employment stability. Human capital theory posits that higher specialization increases income expectations and reduces default probability [
13]. Institutional variables, such as payment method and banking institution, capture operational compliance mechanisms and transactional friction [
10,
11].
Prior research on delinquency prediction also emphasizes the importance of integrating complementary data sources to adequately represent the domain structure. For instance, Nurdin et al. [
21] analyze default risk in educational financing using demographic, academic, and financial variables—including gender, study program, funding type, installment amount, payment status, and historical arrears—demonstrating that combining individual attributes with historical payment behavior improves predictive accuracy.
Following this methodological logic, the present study integrates two complementary institutional databases—“Payment Records” and “Membership Registry”—to capture both transactional dynamics and structural professional attributes. The “Payment Records” dataset includes transactional variables such as payment date, amount, payment type, payment method, and banking institution, enabling the construction of behavioral indicators such as frequency, punctuality, and cumulative arrears. The “Membership Registry” dataset incorporates sociodemographic and administrative attributes, including tenure, gender, age, marital status, country of qualification, professional category, and number of specialties, which characterize the structural profile of each member.
This dual-database structure ensures coherent representation of the problem domain and aligns with best practices in delinquency modeling, where preliminary variable assessment and domain-specific feature engineering are critical for robust predictive performance.
3.4. Studies Focused on Benchmarking and Algorithm Evaluation
Tousi et al. [
22] emphasize that model selection must balance predictive accuracy, interpretability, and computational efficiency. Belavagi et al. [
23] highlight the importance of F1-score, AUC, and cross-validation in imbalanced classification contexts. These methodological criteria guide the evaluation framework adopted in the present study.
Table 1 summarizes the principal studies related to this research, detailing context, algorithms, evaluation metrics, and methodological contributions.
4. Methodology
4.1. Research Design
This study is an empirical, applied, and quantitative investigation aimed at developing a predictive model using supervised learning techniques [
28,
29] to estimate the risk of disqualification among licensed professionals in Peru, based on sociodemographic, occupational, and financial characteristics. The analysis was conducted at the individual level using anonymized administrative records provided by a national professional association.
4.2. Data Sources
Dataset 1 contains information on payment history, charges, and deposits, with approximately 5.7 million records. Dataset 2 includes demographic, academic, and professional information on 28,802 active and historical members. Both datasets were integrated to construct the final dataset. Each record represents a member, with transactions aggregated at the monthly level. The target variable (risk_level) was defined as binary (0 = low risk; 1 = high risk) according to internal accounting rules, based on accumulated debt relative to institutional delinquency thresholds. An exploratory analysis was also conducted to characterize initial patterns in both numerical and categorical variables, with the objective of examining their association with the observed levels of disqualification risk (risk_level).
4.3. Data Preprocessing and Variable Construction
Data processing was performed in Python 3.9 with libraries including pandas, scikit-learn, xgboost, catboost, and lightgbm.
The steps were as follows:
Initial cleaning: Duplicate entries were removed.
Monthly aggregation: Monthly financial indicators were calculated for each member, including variables such as collections_month, payments_month, and accumulated_debt.
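Let $i$ denote an individual identified by a unique member code, and let $t$ denote a calendar month defined by truncating the transaction date to the year-month level. Let $x_{i,t,k}$ denote the signed amount of the $k$-th transaction of individual $i$ in month $t$, with positive values for collections and negative values for payments. Based on the individual transaction records, the following variables were defined:
$$\overline{C}_{i,t} = \frac{1}{n^{+}_{i,t}} \sum_{k:\, x_{i,t,k} > 0} x_{i,t,k}$$
This variable represents the average of positive collection transactions for individual $i$ during month $t$, that is, the amounts actually collected.
$$C_{i,t} = \sum_{k:\, x_{i,t,k} > 0} x_{i,t,k}$$
This variable corresponds to the total sum of positive amounts collected from individual $i$ during month $t$.
$$P_{i,t} = \sum_{k:\, x_{i,t,k} < 0} x_{i,t,k}$$
This variable captures the total amount paid by individual $i$ during month $t$, represented as negative values in the transaction records.
Here $n^{+}_{i,t}$ and $n^{-}_{i,t}$ represent, respectively, the number of positive (collections) and negative (payments) transactions for individual $i$ in month $t$.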
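The monthly aggregation described above can be illustrated with a minimal sketch that uses only the Python standard library. The record layout (member code, transaction date, signed amount), the function name `aggregate_monthly`, and the output field names are assumptions made for illustration and are not taken from the study's codebase.

```python
from collections import defaultdict
from datetime import date


def aggregate_monthly(transactions):
    """Aggregate (member_id, date, amount) rows per member and calendar month.

    Positive amounts are treated as collections and negative amounts as
    payments, following the sign convention described in the text.
    """
    groups = defaultdict(lambda: {"pos": [], "neg": []})
    for member_id, tx_date, amount in transactions:
        month = tx_date.strftime("%Y-%m")  # truncate the date to year-month
        if amount > 0:
            groups[(member_id, month)]["pos"].append(amount)
        elif amount < 0:
            groups[(member_id, month)]["neg"].append(amount)

    result = {}
    for key, g in groups.items():
        n_pos = len(g["pos"])
        total_collected = sum(g["pos"])
        result[key] = {
            "avg_collection": total_collected / n_pos if n_pos else 0.0,
            "total_collected": total_collected,
            "total_paid": sum(g["neg"]),  # stays negative, as in the records
            "n_collections": n_pos,
            "n_payments": len(g["neg"]),
        }
    return result


# Hypothetical records: (member code, transaction date, signed amount)
sample = [
    ("A001", date(2023, 1, 5), 100.0),
    ("A001", date(2023, 1, 20), 50.0),
    ("A001", date(2023, 1, 25), -120.0),
    ("A001", date(2023, 2, 3), 100.0),
]
monthly = aggregate_monthly(sample)
print(monthly[("A001", "2023-01")])
```

With roughly 5.7 million records, a grouped aggregation in pandas would be the more practical choice; the loop above is only meant to make the definitions concrete.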
Target variable (high risk): The target variable, denoted as high risk, was defined according to institutional accounting rules, using accumulated debt and total disqualification duration as classification criteria. An individual was labeled high-risk if their total debt exceeded a threshold equivalent to 25% of the number of months they remained disqualified. Formally, let $D_i$ denote the accumulated debt for individual $i$, and let $M_i$ be the total number of disqualification months. The high-risk condition is then defined as follows:
$$\text{high\_risk}_i = \begin{cases} 1, & \text{if } D_i > 0.25\, M_i \\ 0, & \text{otherwise} \end{cases}$$
The 25% threshold was determined based on internal policy and empirically validated through a sensitivity analysis aimed at optimizing the F1 score and balancing debt recovery with classification accuracy for the minority class.
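As a minimal sketch of the labeling rule, the function below marks a member as high-risk when accumulated debt exceeds 25% of the number of disqualification months. The function name, the example values, and the use of a strict inequality (read from "exceeded") are assumptions; the source does not state the units in which debt is measured.

```python
def label_high_risk(accumulated_debt, months_disqualified, ratio=0.25):
    """Return 1 (high risk) when debt exceeds ratio * months, else 0.

    The default ratio mirrors the 25% threshold in the text; the strict
    comparison is an assumption about how ties are handled.
    """
    if months_disqualified < 0:
        raise ValueError("months_disqualified must be non-negative")
    return 1 if accumulated_debt > ratio * months_disqualified else 0


# Hypothetical members: code -> (accumulated debt, disqualification months)
members = {"A001": (5.0, 12), "A002": (2.0, 12), "A003": (3.0, 12)}
labels = {code: label_high_risk(debt, months)
          for code, (debt, months) in members.items()}
print(labels)  # {'A001': 1, 'A002': 0, 'A003': 0}
```

Exposing `ratio` as a parameter makes it straightforward to rerun the labeling at alternative thresholds, which is the kind of sensitivity check the text mentions.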
Feature engineering: Variables such as the following were derived:
Encoding: Categorical variables were encoded using target encoding or one-hot encoding, depending on the model. Numerical variables were normalized using MinMaxScaler.
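The three transformations named in the encoding step can be sketched in plain Python to show what each one computes. These helpers are illustrative stand-ins, not the scikit-learn implementations used in the study, and the sample values are hypothetical.

```python
def min_max_scale(values):
    """Rescale numbers to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def one_hot(values):
    """Return (sorted categories, one indicator row per input value)."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def target_encode(values, targets):
    """Replace each category with the mean target observed for it."""
    sums, counts = {}, {}
    for v, t in zip(values, targets):
        sums[v] = sums.get(v, 0) + t
        counts[v] = counts.get(v, 0) + 1
    return [sums[v] / counts[v] for v in values]


ages = [25, 40, 55]                      # hypothetical numeric feature
methods = ["bank", "office", "bank"]     # hypothetical categorical feature
risk = [1, 0, 0]                         # hypothetical binary target

print(min_max_scale(ages))           # [0.0, 0.5, 1.0]
print(one_hot(methods))              # (['bank', 'office'], [[1, 0], [0, 1], [1, 0]])
print(target_encode(methods, risk))  # [0.5, 0.0, 0.5]
```

Because target encoding uses the label, the category means should be estimated on training data only and then applied to validation data; otherwise label information leaks into the features.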
4.4. Predictor Variables
The selection of predictor variables is guided by a theory-driven approach to capture the multifactorial nature of institutional financial risk [
30,
31,
32]. Rather than relying exclusively on transactional indicators, this study integrates sociodemographic, academic, and institutional variables to represent the structural, behavioral, and organizational determinants of payment compliance.
Twelve predictor variables were included in the analysis (see
Table 2), organized into three conceptual categories to reflect their distinct analytical roles. The first category comprised demographic characteristics, specifically age, gender, marital status, and country of birth, which capture individual-level background attributes. The second category encompassed academic and professional factors, including tenure in the association, registered specialties, country of degree, and membership type, thereby representing educational trajectory and professional positioning. The third category consisted of financial and institutional variables, namely payment group, payment method, bank, and payment office, which characterize economic arrangements and administrative structures. This classification establishes a structured framework for examining how individual, professional, and institutional dimensions jointly relate to the outcome of interest.
4.5. Modeling Framework
Figure 1 presents the proposed methodological framework for estimating risk levels through a structured machine learning pipeline composed of seven sequential stages. The process begins with data extraction and proceeds to data cleaning and transformation, where validation, standardization, and indicator construction are performed to ensure data integrity, consistency, and analytical reliability. Subsequently, multiple data sources are integrated to construct a unified and coherent analytical dataset aligned with the problem domain.
The fourth stage involves feature engineering, aimed at identifying and selecting variables with the highest predictive relevance. This step supports the subsequent training and validation of multiple supervised classification models, including Logistic Regression (LR), Linear Discriminant Analysis (LDA), Gaussian Naïve Bayes (GNB), k-Nearest Neighbors (k-NN), Decision Trees (DT), Random Forest (RF), Multilayer Perceptron (MLP), XGBoost, LightGBM, and CatBoost. Model performance is then systematically evaluated and compared using appropriate classification metrics to identify the algorithm with the strongest predictive accuracy and generalization capacity. Finally, the selected model generates a prediction output that classifies individuals into mutually exclusive risk categories—high or low—thereby completing the analytical workflow of the proposed framework.
4.6. Predictive Modeling
The selection of supervised learning algorithms in this study was based on a strategic balance between generalization capacity, computational efficiency, and model interpretability [
11]. First, traditional linear models such as logistic regression (LR) were included, serving as a robust reference due to their transparency and ease of interpretation in regulated sectors, enabling stakeholders to understand the functional relationships between input variables and financial risk [
10,
11]. These models are valued for their training efficiency and flexibility regarding class distributions in the feature space [
26,
33]. However, to capture the multifactorial nature and nonlinear interactions inherent in institutional data, tree-based and ensemble methods—such as Random Forest (RF), XGBoost, LightGBM, and CatBoost—were incorporated [
9]. The Random Forest algorithm was selected for its capacity to reduce variance and overfitting by averaging multiple independent decision trees [
10,
11]. In contrast, gradient boosting methods, particularly XGBoost, offer enhanced scalability and effective handling of imbalanced data through regularization and gradient-based optimization [
8]. Notably, state-of-the-art models such as CatBoost and LightGBM have demonstrated high effectiveness in managing high-cardinality categorical variables, minimizing encoding bias, and optimizing computational resources through leaf-wise tree growth strategies [
34]. This combination of approaches enables the development of models that not only achieve high predictive accuracy ($R^2 > 0.90$) but also maintain robustness against volatility in historical data [
22,
34].
Ten supervised learning algorithms were systematically compared to evaluate their predictive performance across methodological families. The first group comprised base models, including Logistic Regression (LR), Linear Discriminant Analysis (LDA), Gaussian Naïve Bayes (GNB), and the k-Nearest Neighbors (k-NN) algorithm, which represent established statistical and distance-based classification approaches. The second group consisted of tree-based, boosting, and neural network approaches, specifically Decision Trees (DT), Random Forest (RF), XGBoost, LightGBM, and CatBoost, as well as the Multilayer Perceptron (MLP), which model nonlinear relationships and complex feature interactions through hierarchical partitioning, boosting, or layered network architectures. This design allows a systematic comparison of linear, probabilistic, instance-based, tree-based, boosting, and neural network paradigms.
Hyperparameter Tuning
GridSearchCV with stratified k-fold cross-validation was applied to each model, optimizing the F1-score. The random seed (random_state = 42) was fixed to ensure reproducibility.
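To make the stratification idea concrete, the following sketch assigns sample indices to folds class by class, so that each fold keeps roughly the overall class proportions. It is a simplified stand-in for the splitting that scikit-learn's StratifiedKFold provides; the value k = 5 and the 90/10 label split are assumptions chosen for illustration, since the number of folds is not given here.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=42):
    """Assign each sample index to one of k folds, class by class,
    so every fold keeps roughly the overall class proportions."""
    rng = random.Random(seed)  # fixed seed for reproducible splits
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across folds
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


labels = [1] * 90 + [0] * 10  # hypothetical 90/10 class split
folds = stratified_folds(labels, k=5)
for fold in folds:
    n_minority = sum(labels[i] == 0 for i in fold)
    print(len(fold), n_minority)  # each fold: 20 samples, 2 from the minority class
```

In a grid search, every hyperparameter combination would be scored on each held-out fold in turn and the mean F1-score compared across combinations.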
4.7. Class Balancing
Due to the severe imbalance in the target variable (89% high-risk; 11% low-risk), SMOTE (Synthetic Minority Oversampling Technique) was applied within each training fold, and class weighting (class_weight = 'balanced') was used for algorithms that support it. This revision explicitly reports the absolute number of records per class, thereby clarifying the magnitude of the imbalance in the original dataset.
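The core of SMOTE is interpolation between a minority-class sample and one of its nearest minority-class neighbours. The sketch below shows that step using only the standard library; it is not the imbalanced-learn implementation, and the function name, the value k = 2, and the toy points are assumptions for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=42):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # k nearest neighbours of the base point among the other minority points
        others = [p for j, p in enumerate(minority) if j != base_idx]
        others.sort(key=lambda p: math.dist(p, base))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class points in a two-feature space
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority, n_new=6)
print(len(new_points))  # 6
```

Generating synthetic points only from the training portion of each fold, as the text specifies, keeps the validation data free of interpolated samples and avoids optimistic performance estimates.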
4.8. Evaluation Metrics
In classification problems with extreme class imbalance, where the incidence rate may be as low as 1%—as in fraud or dropout detection—accuracy becomes a misleading metric that systematically favors the majority class [
19]. In such scenarios, a naive model that classifies all instances as negative can achieve 99% accuracy, yet entirely fails to identify critical cases of institutional risk [
9]. From a risk assessment perspective, it is essential to prioritize discriminative metrics. Although AUC-ROC is a widely used standard, under severe imbalance, it tends to overestimate performance, as the false-positive rate is artificially suppressed by the large volume of negative cases [
9]. Therefore, the PR-AUC (area under the precision–recall curve) is preferred, as it focuses exclusively on minority class performance and more accurately reflects the trade-off between case detection and prediction precision [
9]. Accordingly, the F1-score was adopted as the primary optimization metric during hyperparameter tuning, balancing sensitivity and precision for the positive class while reducing bias toward majority-class predictions [
9].
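One common way to summarize the precision-recall curve is average precision, which averages the precision obtained at each rank where a positive case is retrieved. The sketch below assumes that this summary is an acceptable stand-in for PR-AUC and that scores contain no ties; the labels and scores are hypothetical.

```python
def average_precision(y_true, scores):
    """Average precision: mean of the precision values at each rank
    where a positive instance is retrieved (scores sorted descending)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_pos


y_true = [1, 0, 1, 0]            # hypothetical labels
scores = [0.9, 0.8, 0.7, 0.1]    # hypothetical predicted probabilities
print(round(average_precision(y_true, scores), 4))  # 0.8333
```

Here the two positives are retrieved at ranks 1 and 3, giving precisions of 1 and 2/3, whose mean is about 0.833. True negatives never enter the calculation, which is why this summary is not inflated by a large majority class.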
Model performance was evaluated using complementary classification metrics that capture overall predictive accuracy and class-specific discrimination. Accuracy was quantified as the proportion of correctly classified instances among all observations. Precision measured the proportion of true positives among predicted positives, thereby reflecting the reliability of positive classifications. Recall, or sensitivity, assessed the proportion of actual positives correctly identified, indicating the model’s ability to detect relevant cases. The F1-score, the harmonic mean of precision and recall, provided a single summary measure that is less sensitive to class imbalance than accuracy. The area under the receiver operating characteristic curve (AUC–ROC) evaluated the model’s discriminatory capacity across classification thresholds. Confusion matrices and ROC curves were examined for the best-performing models to enable a detailed assessment of classification errors and threshold-dependent performance.
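The metric definitions above follow directly from the confusion-matrix counts. The sketch below computes them in plain Python and reproduces the accuracy paradox for a hypothetical dataset with a 1% positive rate; the function name and data are illustrative assumptions.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical data: 1 positive among 100 cases, and a naive model
# that predicts the negative class for every instance.
y_true = [1] + [0] * 99
y_pred = [0] * 100
naive = classification_metrics(y_true, y_pred)
print(naive["accuracy"], naive["recall"], naive["f1"])  # 0.99 0.0 0.0
```

The naive model scores 99% accuracy while detecting none of the positive cases, which is why F1 and recall are reported alongside accuracy.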
4.9. Implementation Environment
The experiments were conducted in a hybrid computing environment, consisting of an institutional Linux server (32 GB RAM, Xeon 3.2 GHz CPU, Lima, Peru) and local workstations (Intel Core i7, 16 GB RAM, Lima, Peru). The codebase was version-controlled using GitHub. Workflow traceability and experimental replicability were ensured throughout the process.
6. Discussion
From a theoretical perspective, the findings support the idea that the risk of professional disqualification is a multifactorial institutional phenomenon in which contextual and organizational variables play a role comparable to that of traditional financial indicators. This is consistent with contemporary data-driven governance frameworks, which emphasize the integration of administrative, behavioral, and sociodemographic information to support preventive decision-making in institutional risk management.
The results empirically validate the conceptual framework by aligning theoretical foundations with predictive evidence. H1 supports financial life-cycle theory: age consistently differentiates risk, with early-career professionals more likely to be classified as high-risk [
1,
2,
4,
6]. Life-course dynamics therefore shape financial vulnerability. H2 shows that sociodemographic variables have moderate predictive relevance [
1,
6,
12]. Although weaker than behavioral indicators, they improve model performance and structure differentiated risk profiles beyond strictly financial data. For H3, the evidence confirms human capital theory [
13]. Institutional seniority and specialization are negatively associated with disqualification risk. Accumulated experience acts as a protective factor, even if its effect is less pronounced than that of financial behavior. Hypothesis H4 receives strong empirical support, in line with the credit risk modeling literature [
8,
10,
11]. Payment behavior indicators rank among the most influential variables in ensemble models. This result confirms that historical financial conduct constitutes the most robust predictor of future default. With respect to H5, variables related to payment infrastructure provide additional explanatory power, albeit secondary to direct behavioral patterns [
1,
10]. This pattern suggests that administrative conditions indirectly shape members’ financial behavior and complement individual-level risk determinants. Finally, the findings confirm H6 and align with evidence demonstrating the superiority of ensemble algorithms in imbalanced data settings [
7]. Gradient boosting models outperform traditional statistical approaches across key performance metrics because they capture nonlinear relationships and complex interactions more effectively. Taken together, these results reinforce the validity of the conceptual model and demonstrate that disqualification risk primarily arises from financial behavioral patterns embedded within a broader structural framework. The integration of economic theory, human capital perspectives, and computational modeling provides a robust foundation for institutional governance grounded in predictive evidence.
The results confirm the effectiveness of machine learning-based approaches for predicting delinquency in professional contexts, particularly when working with highly imbalanced and multidimensional datasets. The CatBoost algorithm ranked as the most effective model on the final test set, achieving an F1-score of 57.96% and an accuracy of 79.92%, owing to its capacity to handle categorical variables without prior encoding and to model complex nonlinear relationships. However, in the cross-validation process, XGBoost achieved a higher F1-macro (0.585), suggesting greater overall stability across different data subsets. This difference underscores the need to consider both average performance and production behavior when selecting optimal models. The observed F1-score indicates that the model identifies high-risk professionals who might otherwise go undetected, enabling early interventions to reduce delinquency rates and administrative costs.
These findings are consistent with previous studies in which XGBoost has demonstrated advantages in financial and credit risk scenarios, especially when combined with balancing techniques such as SMOTE [
9]. However, as [
27] also points out, high overall accuracy can be misleading in imbalanced contexts, as it tends to favor the majority class. A similar pattern emerged in this study: models such as Random Forest and k-NN achieved high accuracy (>90%) but lower F1-macro scores, reflecting a bias toward the majority class (high risk).
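The gap between accuracy and F1-macro described above can be reproduced with a small worked example. The sketch below uses hypothetical toy labels (not the study's data) and plain Python: a classifier that always predicts the majority class reaches high accuracy yet scores poorly on F1-macro, while a classifier that recovers part of the minority class loses accuracy but gains F1-macro.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 for one class, computed from true positives, false positives, false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so each class counts equally."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 90 majority-class cases (1) and 10 minority-class cases (0).
y_true = [1] * 90 + [0] * 10

# Degenerate classifier: always predicts the majority class.
y_majority = [1] * 100

# Classifier that recovers 7 of 10 minority cases but misclassifies 15 majority cases.
y_balanced = [1] * 75 + [0] * 15 + [0] * 7 + [1] * 3

for name, y_pred in [("always-majority", y_majority), ("minority-aware", y_balanced)]:
    print(name, round(accuracy(y_true, y_pred), 3), round(macro_f1(y_true, y_pred), 3))
```

In this toy setting the always-majority classifier has the higher accuracy but the lower F1-macro, which is the same qualitative pattern reported for Random Forest and k-NN.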
The weight adjustment strategy in XGBoost, combined with oversampling, improved the recall of the minority class (low risk) from 39.3% to 60.6%, reducing false negatives. This improvement advances one of the key objectives of any risk prediction tool: correctly identifying cases that would otherwise be overlooked [
9]. Regarding model interpretability, the analysis presents a technical discussion of the performance metrics but does not include variable importance. Incorporating tools such as SHAP (SHapley Additive exPlanations) would clarify which attributes most influence risk prediction and strengthen transparency toward institutional authorities and auditors. This is particularly relevant in applications where decisions carry direct professional consequences.
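SHAP attributes a prediction to individual features using Shapley values from cooperative game theory. As a minimal illustration of that idea, and not of the SHAP library's API or of the study's models, the sketch below computes exact Shapley values for a hypothetical three-feature risk score by averaging each feature's marginal contribution over all feature orderings. The feature names and score increments are assumptions made only for this example.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            totals[f] += after - before
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_risk_score(present):
    """Hypothetical risk score as a function of which features are 'known'."""
    score = 0.10  # baseline risk with no information
    if "late_payments" in present:
        score += 0.30
    if "payment_channel" in present:
        score += 0.05
    # Interaction term: seniority only matters together with late payments.
    if "late_payments" in present and "seniority" in present:
        score += 0.10
    return score


features = ["late_payments", "payment_channel", "seniority"]
contributions = shapley_values(toy_risk_score, features)
for name, value in contributions.items():
    print(name, round(value, 3))
```

The contributions sum to the difference between the full score and the baseline, and the interaction term is split evenly between the two features involved. Exact enumeration grows factorially with the number of features, which is why practical tools rely on approximations or model-specific algorithms.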
The set of variables used—covering demographic, financial, and institutional attributes—represents an improvement over studies that focus exclusively on credit or clinical aspects. As shown by [
13,
19], incorporating contextual variables enhances model performance in social and administrative prediction tasks.
Estimating risk at a specific point in time aligns methodologically with institutional decision-making processes, which operate through discrete evaluations under operational constraints. The literature on financial risk management demonstrates that predictive models applied to information available at a given moment can identify individuals with a high probability of default before adverse events occur [
17,
18]. Likewise, behavioral economics and life cycle theory indicate that sociodemographic variables observed at a given moment are systematically associated with future financial behavior [
1,
6,
12]. Consequently, the cross-sectional prediction adopted in this study is conceptually robust and empirically supported for practical institutional implementation.
However, this study presents several important limitations. Independent validation was not conducted: all data were assessed exclusively through k-fold cross-validation. This methodological choice may limit the models' capacity to generalize beyond the analyzed dataset and thereby weakens the robustness of the resulting inferences.
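One common way to complement k-fold cross-validation is to set aside a stratified holdout partition before any model selection and to run the folds only on the remaining data. The following sketch shows that arrangement in plain Python under stated assumptions: the labels, the 20% holdout fraction, and k = 4 are hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction, seed=0):
    """Split indices into train/test while preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test.extend(idxs[:n_test])
        train.extend(idxs[n_test:])
    return sorted(train), sorted(test)


def stratified_folds(indices, labels, k, seed=0):
    """Distribute the given indices into k folds, class by class, round-robin."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds


# Hypothetical imbalanced labels: 80 cases of class 1 and 20 of class 0.
labels = [1] * 80 + [0] * 20

# The holdout is separated first and never used during cross-validation.
train_idx, test_idx = stratified_holdout(labels, test_fraction=0.2)
folds = stratified_folds(train_idx, labels, k=4)

print(len(train_idx), len(test_idx), [len(f) for f in folds])
```

A random holdout of this kind tests generalization only within the same period and institution. Temporal or external validation would require splitting by date or evaluating on data from a different source.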
The variables used are structured and quantifiable, while qualitative dimensions such as member satisfaction, perceived institutional value, and motivations for payment compliance are excluded. Future research could incorporate these factors through surveys or sentiment analysis of digital platforms.
From an institutional perspective, the study provides a basis for developing preventive monitoring systems, early warning mechanisms, and evidence-based financial retention strategies. Implementing models such as the one proposed would enable targeted interventions for high-risk subgroups and the design of differentiated strategies aligned with sociodemographic and academic profiles.
The institutional implications identified align with empirical evidence from other contexts of financial risk management in the FinTech sector. Moise et al. [
35] demonstrate that institutions design differentiated strategies based on users’ demographic characteristics and optimize technologies that precisely respond to the priorities of each segment. In the context of professional associations, the predictive models proposed in this study enable the identification of specific risk groups based on sociodemographic and academic profiles, thereby facilitating targeted interventions and segmented campaigns. Similarly, Alamsyah et al. [
36] argue that incorporating alternative data sources into risk assessment frameworks improves decision-making and strengthens institutional adaptation to the evolving digital environment. Their analysis of credit risk using professional social network data shows that the integrated use of demographic, behavioral, and relational information reduces defaults and expands access to credit by decreasing reliance on traditional scoring systems. Consequently, the findings of the present study indicate that professional associations can enhance their predictive capacity by integrating behavioral and digital engagement variables, thereby enabling a more comprehensive assessment of delinquency risk and potential professional disqualification.
Consistent with the institutional implications discussed above, the findings show that machine learning models, particularly XGBoost and CatBoost, offer robust solutions for anticipating disqualification risk among licensed professionals when combined with class-balancing techniques and complemented by interpretability tools. Their implementation enhances preventive financial management and strengthens the long-term sustainability of professional associations.
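The class-balancing step mentioned above can be illustrated with two simple operations: computing the majority-to-minority ratio, which is the kind of value commonly passed to XGBoost's `scale_pos_weight` parameter in binary problems, and oversampling the minority class. The sketch below is a simplification under stated assumptions. It uses plain random oversampling (duplicating minority rows) as a stand-in for SMOTE, which instead synthesizes new minority points by interpolation, and the data are hypothetical. The study does not state which exact weighting parameters were used.

```python
import random
from collections import Counter


def imbalance_ratio(labels, minority_label):
    """Majority-to-minority count ratio for a binary problem."""
    counts = Counter(labels)
    n_minority = counts[minority_label]
    if n_minority == 0:
        raise ValueError("minority class is absent")
    return (len(labels) - n_minority) / n_minority


def random_oversample(rows, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority rows until both classes have equal size."""
    rng = random.Random(seed)
    minority = [i for i, lab in enumerate(labels) if lab == minority_label]
    if not minority:
        raise ValueError("minority class is absent")
    deficit = (len(labels) - len(minority)) - len(minority)
    extra = [rng.choice(minority) for _ in range(max(deficit, 0))]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Hypothetical binary training set: 900 majority cases (0) and 100 minority cases (1).
rows = list(range(1000))
labels = [0] * 900 + [1] * 100

ratio = imbalance_ratio(labels, minority_label=1)
balanced_rows, balanced_labels = random_oversample(rows, labels, minority_label=1)

print(ratio, Counter(balanced_labels))
```

Resampling should be applied only to the training portion of each split. Oversampling before splitting lets copies of the same case appear in both training and evaluation data and inflates the reported metrics.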
7. Conclusions
This study demonstrates that supervised machine learning can estimate disqualification risk due to delinquency using large-scale institutional data. By consolidating more than 5.7 million transactions from 27,964 members, it developed and tested a predictive framework grounded in historical and contextual variables. Among ten algorithms, CatBoost achieved the strongest performance (F1-score = 57.96%; AUC = 0.72), followed closely by XGBoost after class balancing. These results confirm the effectiveness of gradient boosting for modeling structurally imbalanced institutional risk data.
Methodologically, the study integrates sociodemographic, academic, and financial-institutional variables within a unified predictive architecture. Individual-level analysis and standardized metrics (F1-score, AUC) ensure rigorous evaluation under asymmetric class conditions. The findings demonstrate that contextual and organizational attributes enhance prediction beyond purely financial indicators.
Operationally, the results provide a solid empirical basis for early-warning systems within professional associations. Estimating risk at specific decision points reflects the discrete logic of institutional governance. Evidence from financial risk and behavioral economics supports the use of contemporaneous data to anticipate future payment behavior. Implementing CatBoost- or XGBoost-based systems would enable proactive identification of high-risk members and reinforce financially sustainable governance. Incorporating interpretability tools such as SHAP would strengthen transparency and accountability.
Beyond its predictive utility, the risk scoring model strengthens the operational capacity of professional associations by structuring portfolio management around quantifiable criteria. Used as a periodic classification tool, it allows cases to be prioritized by exposure level, which optimizes the allocation of follow-up resources and reduces the administrative costs associated with indiscriminate collection processes. Segmentation by profile supports differentiated strategies, such as adjustments in contact frequency, personalized payment plans, and targeted actions based on sociodemographic and professional characteristics, and thereby improves collection efficiency without affecting the institutional relationship with members. Likewise, the analysis of variables linked to the payment infrastructure (bank, modality, and channel) provides concrete inputs for redesigning internal processes through automation, operational simplification, and reduction of administrative friction. Integrated into regular governance mechanisms, the model enables the establishment of performance indicators (risk ratios by segment, differentiated recovery rates, and average management times) and the periodic review of intervention criteria, thereby strengthening financial planning and the consistency of institutional decisions.
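The prioritization and segment-level indicators described above could be operationalized along the following lines. This is a sketch under stated assumptions: the tier cutoffs (0.7 and 0.4), the segment names, and the member records are hypothetical, and the risk scores stand in for the predicted probabilities that a trained model would supply.

```python
from collections import defaultdict

# Hypothetical cutoffs; in practice they would be set from institutional capacity and costs.
TIER_THRESHOLDS = ((0.7, "high"), (0.4, "medium"))


def assign_tier(score, thresholds=TIER_THRESHOLDS):
    """Map a risk score in [0, 1] to a follow-up tier."""
    for cutoff, tier in thresholds:
        if score >= cutoff:
            return tier
    return "low"


def prioritize(members):
    """Return members sorted by descending risk score, each tagged with a tier."""
    ranked = sorted(members, key=lambda m: m["risk_score"], reverse=True)
    return [dict(m, tier=assign_tier(m["risk_score"])) for m in ranked]


def high_risk_ratio_by_segment(prioritized):
    """Share of members in the 'high' tier within each segment."""
    totals, high = defaultdict(int), defaultdict(int)
    for m in prioritized:
        totals[m["segment"]] += 1
        if m["tier"] == "high":
            high[m["segment"]] += 1
    return {seg: high[seg] / totals[seg] for seg in totals}


# Hypothetical member records with model-produced risk scores.
members = [
    {"member_id": "A1", "segment": "early_career", "risk_score": 0.82},
    {"member_id": "B2", "segment": "senior", "risk_score": 0.35},
    {"member_id": "C3", "segment": "early_career", "risk_score": 0.55},
    {"member_id": "D4", "segment": "senior", "risk_score": 0.91},
]

queue = prioritize(members)
ratios = high_risk_ratio_by_segment(queue)
print([(m["member_id"], m["tier"]) for m in queue], ratios)
```

Keeping the cutoffs in a single named constant makes the periodic review of intervention criteria a matter of changing one value and recomputing the indicators.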
Limitations remain. The absence of temporal or external validation limits the assessment of model stability under changing economic or regulatory conditions. Reliance on structured quantitative variables also excludes psychosocial dimensions such as institutional engagement, perceived value, or motivational drivers of compliance.
Future research should prioritize longitudinal validation, integration of behavioral and psychosocial indicators, and experimentation with hybrid or deep learning architectures paired with interpretability techniques. Advancing these directions will enhance the robustness, generalizability, and strategic value of predictive systems for preventive disqualification risk management in professional associations.