A Machine Learning Approach to Predict Stress Hormones and Inflammatory Markers Using Illness Perception and Quality of Life in Breast Cancer Patients

Psychosocial factors have become central concepts in oncology research. However, their role in the prognosis of the disease is not yet well established, and studies on this subject report contradictory findings. We examine whether illness perception and quality of life reports measured at baseline can predict stress hormones and inflammatory markers in breast cancer survivors one year later. We use statistics and machine learning methods to analyze our data and find the best prediction model. Patients with stage I to III breast cancer (N = 70) were assessed twice, at baseline and one year later, and completed scales assessing quality of life and illness perception. Blood and urine samples were obtained to measure stress hormones (cortisol and adrenocorticotropic hormone (ACTH)) and inflammatory markers (C-reactive protein (CRP), erythrocyte sedimentation rate (ESR) and fibrinogen). Family quality of life is a strong predictor of ACTH. Women who perceive their illness as being more chronic at baseline have higher ESR and fibrinogen values one year later. The artificial intelligence (AI) data analysis yields the highest prediction score of 81.2% for the ACTH stress hormone, and 70% for the inflammatory marker ESR. A chronic timeline, illness control, health and family quality of life were important features associated with the best predictive results.


Introduction
Breast cancer is the most common malignancy in women [1]. In Romania, the Ministry of Health reports an increase of over 15% in breast cancer incidence over the past decades, with 9629 new cases in 2018 [2]. Moreover, Romania shows a worrisome increase in breast cancer mortality. While mortality followed a downward trend between 2005 and 2012, the number of deaths has been rising constantly since 2013. In Western Europe, the 5-year survival rate improved to 80% in the past decade. Romania has one of the lowest survival rates in the European Union, due to reduced breast cancer screening and delayed diagnosis [2,3].
Progress in oncology research has improved survivorship in all cancer patients, shifting the focus from simply surviving to living well after cancer. Resiliency grew into a central concept in cancer research as quality of life and distress became the sixth vital sign along with temperature, respiration, heart rate, blood pressure and pain [4][5][6][7]. Psychosocial factors turned out to be even more important during the COVID-19 pandemic, when 64% of cancer patients (breast cancer patients included) experienced moderate or high stress associated with uncertainty, life changes, coping strategies, communication, experience or health services [8]. Generally, most cancer patients do not report clinical levels of depression [9], but the diagnosis, the symptoms and the treatment significantly decrease their reported quality of life [10].
The studies exploring the relationship between quality of life and cancer prognosis have produced contradictory results. While some investigations found that quality of life might be a prognostic factor for survival in cancer patients, in general [11,12], others report negative or inconsistent results [13]. Lee et al. [14] found that quality of life dimensions are not consistent predictors for illness outcome at first diagnosis. The association becomes significant in case of a relapse and is stronger later in the course of the recurrent disease [14]. The reason for the heterogeneous results remains unclear. The significant associations between quality of life and illness outcomes in more advanced forms of cancer might be linked to differences in illness perceptions.
Illness perceptions precede the cancer diagnosis, but continue to develop and change after it. Generally, they are associated with cancer patients' adherence to treatment, survival outcomes and perceived severity of symptoms [15][16][17]. In particular, breast cancer patients report more negative illness perceptions [18]. Breast cancer patients who report more negative emotions associated with cancer and expect more negative consequences related to their illness have higher mortality rates [15] and poorer health-related quality of life [19]. The link between illness perception and mortality rates in cancer patients could be explained through the body's stress response.
Multiple studies report dysregulations in the endocrine and sympathetic nervous systems in breast cancer patients [20], and stressful life events have been associated with physiologic disturbances. For example, acute stress elicits an adaptive response in the human body, stimulating the nervous and the endocrine systems to cope with the stressor. Nonessential functions of the body such as reproduction, digestion and growth are inhibited. Glucose and free fatty acids are increased. Stress hormones such as adrenaline and cortisol are released to prepare the person to fight the threat. The hypothalamic-pituitary-adrenal (HPA) axis is one system responsible for the body's stress response. In stressful situations, the paraventricular nucleus in the hypothalamus discharges corticotropin-releasing hormone (CRH), causing the pituitary to release adrenocorticotropic hormone (ACTH), which, in turn, triggers the release of cortisol from the adrenal glands. To avoid overuse of the system, cortisol then inhibits the discharge of CRH. In the short term, these reactions are helpful and necessary. Generally, the body returns to its normal functions once the perceived danger has passed [21]. However, breast cancer is a more chronic stressor, and patients might feel vulnerable for longer periods of time. In chronic stress, the production of stress hormones loses its balance and the body cannot return to normal [22]. Several studies show that prolonged stressful experiences are associated with both hyper- and hypo-cortisol regulation [23]. Cortisol has a strong anti-inflammatory function, preventing widespread tissue and nerve impairment due to inflammation [24], but long-lasting chronic stress results in cortisol dysfunction associated with an unmodulated inflammatory response to both pathogens and psychological stress [25]. These dysfunctions could explain how psychosocial factors influence cancer prognosis and survival outcomes [26].
Despite the consistent body of research showing a significant relationship between psychosocial factors and breast cancer survival [27], the physiological mechanisms involved are still controversial. While several studies suggest that the HPA axis is an important biological system associated with psychosocial factors and survival outcomes [28,29], others find no significant relationships between stress markers and psychological measures [30,31].
In the present study, we examine whether psychosocial factors (illness perception and quality of life reports) can predict stress hormones and inflammatory markers in breast cancer survivors one year later. We conjecture that lower quality of life at baseline predicts higher levels of stress hormones and inflammatory markers one year later. Based on previous findings, we also expect negative illness perceptions to predict higher levels of stress hormones and inflammatory markers over time. We use statistics and machine learning methods to analyze our data and find the best prediction model.

Participants
The patients were recruited in one medical establishment in Iasi, where they came for their periodic medical examination. Baseline data collection took place during March-May of 2018. All participants were also invited to take part in the second assessment, one year after the first, when they were scheduled for their next check-up. The inclusion criteria comprised a diagnosis of stage I to III breast cancer and treatment completion. The exclusion criteria were a potentially fatal comorbid diagnosis or a stage IV cancer diagnosis.

Measures
The Quality of Life Index (QLI)-Cancer III Version [32] was used to measure both satisfaction and importance regarding different aspects of life. Final scores report satisfaction with the aspects of life valued by the person. It contains 4 sub-scales that offer independent scores measuring satisfaction in different domains: health and functioning (α = 0.80), psychological/spiritual (α = 0.84), social and economic (α = 0.73) and family (α = 0.75). Items can be summed to generate a total quality of life score (α = 0.90).
The revised version of the Illness Perception Questionnaire (IPQ-R) [33] was used to assess illness perception. The questionnaire measures nine dimensions of illness perception. Five dimensions assess negative illness perceptions, such as attributing more negative consequences, emotions and symptoms to the illness and perceiving it as chronic: identity, timeline, consequences, time cyclical and emotional representations. Higher scores on these dimensions indicate a more negative illness perception. The other three dimensions assess positive perceptions, namely treatment control, personal control and illness coherence, with higher scores indicating more positive beliefs. The questionnaire has been used to measure illness perceptions among patients with different diseases, including cancer, with good psychometric properties [34][35][36]. The Cronbach's alpha coefficients for the translated Romanian version ranged between 0.68 and 0.85 for the 9 dimensions.
The blood and urine samples were obtained at baseline, and one year later, to measure stress hormones (cortisol and ACTH) and inflammatory markers (C-reactive protein (CRP), erythrocyte sedimentation rate (ESR) and fibrinogen). The samples were processed in the hospital laboratory. We chose these markers based on previous studies identifying stress hormones and inflammatory markers associated with illness evolution in cancer patients and on the laboratory tests routinely available in the medical institution. The participants were instructed to collect their urine over a period of 24 h. They were asked to urinate at 7 a.m. and to discard that urine. For the next 24 h, they were told to collect all urine in a clean 2-3 L container, until 7 a.m. the next day. They were asked to homogenize the collected urine by stirring, measure the entire quantity and retain 10 mL in a disposable plastic container. The samples were to be stored at 2-8 °C until they were processed. We clearly explained the procedure and the importance of collecting all urine over the day. The patients knew they would receive the test results and discuss them with their doctor. Noncompliance with the sampling instructions should therefore be minimal [37]. The urine was used to test levels of free urinary cortisol. The blood samples were drawn between 8:00 and 11:30 a.m. for each patient and for both assessments; the patients were instructed to fast after midnight and drink liquids as needed.

Procedure
After the study was approved by the review board, we approached prospective participants and explained the objectives, risks and benefits of our study. The participants were informed that they were free to withdraw at any time. The study discussions took place away from any member of the patient's medical team to ensure that they would not feel any outside constraint to participate. After we obtained their written consent, they received the self-report questionnaires. Quality of life and illness perception were measured only at baseline. The blood and urine samples were obtained as part of their periodical check-ups. One year later, they repeated the biological tests.

Data Analysis
The SPSS 25.0 program (IBM Corp, Armonk, NY, USA) was used for preliminary data analysis. We used descriptive statistics, including frequencies, percentages, means and standard deviations, to describe our sample at baseline. Pearson correlations were used to explore the relationships between the research variables. Multiple hierarchical regression was used to predict total quality of life using illness perception domains. ACTH was also predicted using family quality of life. Paired samples t-tests were used to compare initial levels of stress hormones and inflammatory markers with values obtained one year later. There were 3% missing data, which were replaced with the sample mean.
An a priori power analysis was performed to estimate the minimal number of patients needed for hierarchical linear regression. Power calculations were performed with G*Power 3.1 (Franz Faul, Kiel University, Germany) for a power level of 0.80 and a 5% level of significance, and the sample size was estimated at 61 participants. Given a 10% probability of loss of participants, and to achieve higher accuracy, we approached more patients than the calculated minimum.
For more in-depth analysis, we used machine learning to explore the predictive value of the chosen variables. We tested six different algorithms on our datasets: logistic regression, linear discriminant analysis, K-nearest neighbors, classification and regression trees, Naive Bayes, and support vector machine. We chose these machine learning algorithms based on previous research studying breast cancer risk calculation and prognosis using machine learning. We included the support vector machine algorithm because multiple studies report that it was the most accurate in predicting breast cancer risk.

Characteristics of Breast Cancer Patients
A total of 125 breast cancer patients were assessed for eligibility; 81 agreed to take part in our study and completed a baseline assessment; 11 women of the original sample did not take part in the second assessment, one year later. The analytic sample therefore included 70 breast cancer patients, resulting in an 86% retention rate. No significant differences existed in the baseline data (of age and explored variables) of the participants who took part in the second assessment and those who dropped out. The mean age of the participants was 53 years (SD = 11.6). The mean duration between completion of cancer treatments and study entry was 4.7 years (SD = 5.01) (Table 1).

Quality of Life and Illness Perception
We conducted Pearson correlations to explore the associations between the quality of life and illness perception dimensions (Table 2).
Table 2. Correlations between quality of life (health, social, psychological and family) and illness perception dimensions.
Women who feel their illness is more permanent show a lower level of psychological quality of life. A cyclical perception of symptoms is associated with lower health-related quality of life. Patients who associate more negative consequences with the illness show lower levels of health, social and psychological quality of life. Perceiving higher coherence in one's symptoms and associating fewer negative emotions with the illness are associated with higher levels of quality of life in all domains. The sociodemographic and illness-related variables were examined in relation to illness perception and quality of life. The older women reported perceiving less illness coherence (r = −0.28, p = 0.019). Women who are closer to the time of their treatment and diagnosis associate more negative emotions with their illness (r = −0.34, p = 0.005). There are no other significant correlations between age, time since treatment, number of births and illness perception or quality of life.
We conducted Mann-Whitney tests to examine the differences between women who had mastectomy and those with conservative interventions. Women with conservative intervention perceived more personal control over their illness (M = 42.08) compared with women who had mastectomy (M = 30.56, U = 308.50, p = 0.026). There are no other differences between the two groups' quality of life and illness perception.
We ran a multiple regression analysis to explore whether illness perception dimensions predict quality of life. We selected the dimensions that showed significant correlations with the total quality of life score. Our predictors were: time cyclical, consequences, coherence and emotions. The regression model explained 42% of the variance. Illness coherence and emotional representations significantly predicted global quality of life (Table 3).

Stress Hormones and Psychosocial Factors
To explore the changes in stress hormones over 12 months, we computed paired samples t-tests between the two assessments. There were no significant differences between the two measures for ACTH, t(69) = 1.45, p = 0.150, or for cortisol, t(69) = 0.99, p = 0.325 (Figure 1, Table 4).
We also conducted Pearson correlations between quality of life and illness perception dimensions and stress hormones (Table 5). Women with higher quality of life in their family have lower levels of ACTH one year later (r = −0.57, p < 0.001). There is also a marginally significant correlation between treatment control and ACTH. Women who perceive having more control over their treatment exhibit lower levels of ACTH one year later (r = −0.24, p = 0.090). There are no significant correlations between free urinary cortisol at follow-up and psychosocial factors. We used hierarchical multiple regression analysis to explore whether family quality of life and perception of treatment control at baseline predict ACTH levels one year later. We controlled for age, cancer stage and years since the diagnosis. The regression model explained 48% of the variance in ACTH. Family quality of life significantly predicted ACTH levels one year later (Table 6).

Inflammatory Markers and Psychosocial Factors
To explore the changes in the inflammatory markers over 12 months, we computed paired samples t-tests between the two assessments. There were no significant differences between the two measures for ESR: t(69) = 1.45 (Table 7).
Table 7. Correlations between quality of life, illness perception dimensions and inflammatory markers.

Data Analysis with Artificial Intelligence (AI) Methods
We created five .csv files using the general database (Table 8). We placed the eight illness perception features and the four quality of life features in columns. The last column of each file contained one target variable: the stress hormones ACTH and CLU, or the inflammatory markers CRP, ESR and FBG, as indicated in the image below. All missing values were replaced with the average score for each variable. We assessed several machine learning algorithms on the 5 datasets in Python (Python Software Foundation. Python Language Reference, version 2.7. Available at http://www.python.org) with scikit-learn. We used the same test harness to evaluate the algorithms, and we summarized the results both numerically and using a box and whisker plot. We used the gradient boosting ensemble from scikit-learn for classification and then explored the effect of the gradient boosting model hyperparameters on the model performance.
To prepare the machine learning data in Python with scikit-learn, we applied 4 different automatic feature selection techniques to our datasets: univariate selection, recursive feature elimination, principal component analysis and feature importance. Appendix A contains more details about the process of comparing the machine learning algorithms in Python with scikit-learn.
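The mean replacement of missing values described above can be sketched as follows. This is only an illustration under stated assumptions: it uses scikit-learn's SimpleImputer on a small made-up matrix, the helper name impute_with_column_means is hypothetical, and it is written for Python 3 rather than the Python 2.7 setup reported in the text.

```python
import numpy as np
from sklearn.impute import SimpleImputer


def impute_with_column_means(X):
    """Replace each NaN with the mean of the observed values in its column."""
    imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
    return imputer.fit_transform(X)


if __name__ == "__main__":
    # Made-up scores for 4 patients on 3 questionnaire sub-scales
    # (illustrative values only, not study data).
    X_toy = np.array([
        [10.0, 2.0, np.nan],
        [12.0, np.nan, 5.0],
        [14.0, 4.0, 7.0],
        [np.nan, 6.0, 9.0],
    ])
    print(impute_with_column_means(X_toy))
```

In a cross-validated workflow, one would typically fit the imputer on the training folds only; the sketch mirrors the simpler whole-dataset replacement that the text describes.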

Comparing Consistently the Machine Learning Algorithms
We evaluated each algorithm identically on the same data, using a consistent test harness. We compared six different algorithms: logistic regression (LR), linear discriminant analysis (LDA), K-nearest neighbors (KNN), classification and regression trees (CART), Naive Bayes (NB) and support vector machine (SVM) [38].
We analyzed a standard binary classification dataset (ACTH.csv), with two classes and twelve numeric input variables at different scales. The 10-fold cross-validation procedure was used to evaluate each algorithm, configured with the same random seed to ensure that the same splits of the training data were used and that each algorithm was evaluated in exactly the same way (Appendix B) (Table 9). Our results suggest that both KNN and SVM are algorithms worthy of further study in connection with this problem.
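A comparison harness of this kind might look like the sketch below. It is not the code from Appendix B: the data are a synthetic stand-in generated with make_classification (70 rows, 12 numeric inputs, 2 classes), the helper names and the seed value are hypothetical, and all models use scikit-learn defaults apart from max_iter for logistic regression.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_toy_data(seed=0):
    """Synthetic stand-in for ACTH.csv (assumption: not the study data)."""
    return make_classification(n_samples=70, n_features=12, n_informative=4,
                               n_redundant=2, random_state=seed)


def compare_algorithms(X, y, seed=7):
    """Return {name: (mean accuracy, std)} using identical 10-fold splits."""
    models = {
        "LR": LogisticRegression(max_iter=1000),
        "LDA": LinearDiscriminantAnalysis(),
        "KNN": KNeighborsClassifier(),
        "CART": DecisionTreeClassifier(random_state=seed),
        "NB": GaussianNB(),
        "SVM": SVC(),
    }
    results = {}
    for name, model in models.items():
        # Same seed for every model, so each one sees the same folds.
        cv = KFold(n_splits=10, shuffle=True, random_state=seed)
        scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
        results[name] = (scores.mean(), scores.std())
    return results


if __name__ == "__main__":
    X, y = make_toy_data()
    for name, (mean_acc, std_acc) in compare_algorithms(X, y).items():
        print(f"{name}: {mean_acc:.3f} ({std_acc:.3f})")
```

Because the inputs are on different scales, wrapping KNN and SVM in a pipeline with a scaler would be a natural variation; the sketch leaves the features unscaled to stay close to the description above.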

Gradient Boosting for Classification
We analyzed the use of gradient boosting for a classification problem. We included a more detailed description of this process in Appendix C [39][40][41][42]. We loaded the ACTH.csv dataset and evaluated a gradient boosting algorithm on this dataset. We assessed the model using repeated stratified k-fold cross-validation, with three repetitions and 10 folds. We reported the mean and standard deviation of the model accuracy for all iterations and folds (Appendix D).
Running the example, we obtained a mean accuracy of 0.695 and a standard deviation of 0.142 for this model.
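The evaluation procedure just described (10 folds, three repetitions, mean and standard deviation of accuracy) can be sketched as below. The data are again a synthetic stand-in, the function names and seed are hypothetical, and the classifier uses scikit-learn's default hyperparameters, so the printed numbers will not match the 0.695 reported for ACTH.csv.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score


def make_toy_data(seed=0):
    """Synthetic stand-in for ACTH.csv (assumption: not the study data)."""
    return make_classification(n_samples=70, n_features=12, n_informative=4,
                               n_redundant=2, random_state=seed)


def evaluate_gbm(X, y, seed=1):
    """Accuracy scores from 3 x 10-fold repeated stratified cross-validation."""
    model = GradientBoostingClassifier(random_state=seed)
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=seed)
    return cross_val_score(model, X, y, scoring="accuracy", cv=cv)


if __name__ == "__main__":
    X, y = make_toy_data()
    scores = evaluate_gbm(X, y)
    print(f"Mean accuracy: {scores.mean():.3f} ({scores.std():.3f})")
```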

Grid Search for Hyperparameters
We used a search process to find model hyperparameters that work well or best for a given predictive modeling problem. Popular search processes include a random search and a grid search.
We examined commonly used grid search ranges for the key hyperparameters of the gradient boosting algorithm, which can serve as a starting point for similar projects. This was done using the GridSearchCV class and specifying a dictionary that maps the names of the model hyperparameters to the values to search.
In this case, we searched the grid for four key hyperparameters of gradient boosting: the number of trees used in the ensemble, the learning rate, the size of the sub-sample used to train each tree and the maximum depth of each tree. For each hyperparameter, we used a series of values that are widely used because they tend to perform well. Each configuration was evaluated using repeated k-fold cross-validation, and the configurations were compared using the average score, in this case, the classification accuracy. The complete example of the grid search over the key hyperparameters of the gradient boosting algorithm on our classification dataset is listed in Appendix E. The configuration with the best score is reported first, followed by the scores for all other configurations considered.
We observed that a configuration with a learning rate of 0.0001, a maximum depth of 3 levels, 10 trees and a sub-sample of 50% performed best, with a classification accuracy of about 81.2%. The model might perform better with more trees, such as 1000 or 5000; these configurations were not tested, so that the grid search could be completed within a reasonable time.
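A grid search over these four hyperparameters could be set up roughly as follows. This is a sketch, not the code from Appendix E: the candidate values in the default grid are illustrative assumptions (the exact grid is not given in the text), the data are a synthetic stand-in, and grid_search_gbm is a hypothetical helper name.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold


def make_toy_data(seed=0):
    """Synthetic stand-in for ACTH.csv (assumption: not the study data)."""
    return make_classification(n_samples=70, n_features=12, n_informative=4,
                               n_redundant=2, random_state=seed)


def grid_search_gbm(X, y, param_grid=None, n_splits=10, n_repeats=3, seed=1):
    """Fit a GridSearchCV over trees, learning rate, sub-sample and depth."""
    if param_grid is None:
        # Illustrative candidate values; assumed, not taken from the paper.
        param_grid = {
            "n_estimators": [10, 50],
            "learning_rate": [0.0001, 0.01, 0.1],
            "subsample": [0.5, 1.0],
            "max_depth": [3, 7],
        }
    cv = RepeatedStratifiedKFold(n_splits=n_splits, n_repeats=n_repeats,
                                 random_state=seed)
    search = GridSearchCV(GradientBoostingClassifier(random_state=seed),
                          param_grid, scoring="accuracy", cv=cv)
    return search.fit(X, y)


if __name__ == "__main__":
    X, y = make_toy_data()
    # Fewer folds and repeats than the defaults to keep the demo quick.
    search = grid_search_gbm(X, y, n_splits=5, n_repeats=1)
    print(f"Best: {search.best_score_:.3f} using {search.best_params_}")
    results = search.cv_results_
    for mean_acc, params in zip(results["mean_test_score"], results["params"]):
        print(f"{mean_acc:.3f} with {params}")
```

The number of model fits grows as the product of the grid sizes times the number of folds and repeats, which is why large tree counts quickly become expensive.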
The example in the appendix demonstrates this on our binary classification dataset (Appendix F). The example fits the gradient boosting ensemble on the entire dataset, and the fitted model is then used to make a prediction on new data, as we would do in applications.
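Fitting a final model on all available rows and predicting for a new case might be sketched as below. The hyperparameter values are the best configuration reported above; everything else is assumed for illustration: the data are synthetic, the "new" row is simply the first row of that synthetic set, and fit_final_model is a hypothetical name.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier


def make_toy_data(seed=0):
    """Synthetic stand-in for ACTH.csv (assumption: not the study data)."""
    return make_classification(n_samples=70, n_features=12, n_informative=4,
                               n_redundant=2, random_state=seed)


def fit_final_model(X, y, seed=1):
    """Fit gradient boosting on the whole dataset with the reported settings."""
    model = GradientBoostingClassifier(n_estimators=10, learning_rate=0.0001,
                                       subsample=0.5, max_depth=3,
                                       random_state=seed)
    return model.fit(X, y)


if __name__ == "__main__":
    X, y = make_toy_data()
    model = fit_final_model(X, y)
    new_row = X[:1]  # placeholder for a new patient's 12 feature values
    print(f"Predicted class: {model.predict(new_row)[0]}")
```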

Feature Selection for Machine Learning in Python
The features of the data used to train machine learning models have significant influence on the performance that can be achieved. Irrelevant or partially relevant features may have a negative impact on the model's performance. In the following, we present the automatic feature selection techniques we used to prepare the Python machine learning data with scikit-learn [43,44].

Feature Selection
Feature selection is a process in which we select the features from our data that contribute most to the prediction variable or the output in which we are interested. Irrelevant features in data can reduce the accuracy of many of the models, especially in the case of linear algorithms, such as linear and logistic regression.

Feature Selection for Machine Learning
We list here the 4 recipes for selecting features for machine learning in Python that we applied to our data. Each recipe is complete and independent, so it can be copied directly into a project and used immediately. The recipes use our datasets to demonstrate how to select features. This is a binary classification problem in which all attributes are numeric.

Univariate Selection
Statistical tests can be used to select those features that have the strongest relationship with the output variable. The scikit-learn library offers the SelectKBest class, which can be used with a suite of different statistical tests to select a specific number of features. Many different statistical tests can be used with this selection method. For example, the ANOVA F-value method is suitable for numeric inputs and a categorical target. It can be used via the f_classif() function. We selected the best 4 features using this method (Appendix G). The features with indices 0 (IPQtimeline), 3 (IPQperscontrol), 8 (hfsub) and 11 (famsub) had the highest scores.
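Univariate selection with SelectKBest and f_classif can be sketched as follows. The data are a synthetic stand-in, so the selected indices will generally differ from those reported for ACTH.csv, and select_top_k is a hypothetical helper name.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif


def make_toy_data(seed=0):
    """Synthetic stand-in for ACTH.csv (assumption: not the study data)."""
    return make_classification(n_samples=70, n_features=12, n_informative=4,
                               n_redundant=2, random_state=seed)


def select_top_k(X, y, k=4):
    """Return (indices of the k best features, ANOVA F-scores, reduced X)."""
    selector = SelectKBest(score_func=f_classif, k=k)
    X_reduced = selector.fit_transform(X, y)
    return selector.get_support(indices=True), selector.scores_, X_reduced


if __name__ == "__main__":
    X, y = make_toy_data()
    indices, scores, X_reduced = select_top_k(X, y)
    print("Selected feature indices:", indices)
    print("F-scores:", scores.round(3))
    print("Reduced shape:", X_reduced.shape)
```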

Recursive Feature Elimination
The recursive feature elimination (or RFE) works by recursively deleting features and building a model on those remaining attributes. We used the accuracy of the model to identify which attributes (and combination of attributes) contribute the most to predicting the target variable.
The example in Appendix H uses RFE with the logistic regression algorithm to select the top 3 features. The choice of algorithm is not critical, as long as it is skillful and consistent. RFE selected IPQtimeline, hfsub and pspsub as the top 3 features.
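Recursive feature elimination around a logistic regression model might look like the sketch below. It runs on synthetic stand-in data, so the selected columns are not the ones named above; the helper name and the max_iter setting are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression


def make_toy_data(seed=0):
    """Synthetic stand-in for ACTH.csv (assumption: not the study data)."""
    return make_classification(n_samples=70, n_features=12, n_informative=4,
                               n_redundant=2, random_state=seed)


def run_rfe(X, y, n_features=3):
    """Fit RFE with logistic regression, keeping n_features columns."""
    rfe = RFE(estimator=LogisticRegression(max_iter=1000),
              n_features_to_select=n_features)
    return rfe.fit(X, y)


if __name__ == "__main__":
    X, y = make_toy_data()
    rfe = run_rfe(X, y)
    print("Number of selected features:", rfe.n_features_)
    print("Selected mask:", rfe.support_)
    print("Feature ranking (1 = selected):", rfe.ranking_)
```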

Principal Component Analysis
Principal component analysis (PCA) uses linear algebra to transform a dataset into a compressed form. A useful property of PCA is that we can choose the number of dimensions, or principal components, in the transformed result. In our example, we used PCA and selected 3 principal components (Appendix I). Note that the transformed dataset no longer resembles the source data, because each component is a linear combination of the original features.
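A reduction to 3 principal components can be sketched as below, again on synthetic stand-in data and with a hypothetical helper name. The sketch applies PCA to the raw features as described; standardizing the inputs first is a common variation when features are on different scales.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA


def make_toy_data(seed=0):
    """Synthetic stand-in for ACTH.csv (assumption: not the study data)."""
    return make_classification(n_samples=70, n_features=12, n_informative=4,
                               n_redundant=2, random_state=seed)


def reduce_with_pca(X, n_components=3):
    """Return (fitted PCA object, data projected onto the components)."""
    pca = PCA(n_components=n_components)
    X_projected = pca.fit_transform(X)
    return pca, X_projected


if __name__ == "__main__":
    X, _ = make_toy_data()
    pca, X_projected = reduce_with_pca(X)
    print("Explained variance ratio:", pca.explained_variance_ratio_.round(3))
    print("Component loadings shape:", pca.components_.shape)
    print("Projected data shape:", X_projected.shape)
```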

Feature Importance
Bagged decision trees, such as random forest and extra trees, can be used to estimate feature importance.
We built an ExtraTreesClassifier for our datasets (Appendix J). The classifier assigns an importance score to each attribute; the higher the score, the more important the attribute (e.g., IPQtimeline, IPQtimecycle and famsub).
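Estimating feature importance with an extra-trees ensemble might be sketched as follows. The data are synthetic, the number of trees and the seed are assumed values, and rank_features is a hypothetical helper; the ranking it prints says nothing about the study's variables.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier


def make_toy_data(seed=0):
    """Synthetic stand-in for ACTH.csv (assumption: not the study data)."""
    return make_classification(n_samples=70, n_features=12, n_informative=4,
                               n_redundant=2, random_state=seed)


def rank_features(X, y, seed=7):
    """Return (importances, feature indices sorted from most to least important)."""
    model = ExtraTreesClassifier(n_estimators=100, random_state=seed)
    model.fit(X, y)
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1]
    return importances, order


if __name__ == "__main__":
    X, y = make_toy_data()
    importances, order = rank_features(X, y)
    for idx in order:
        print(f"feature {idx}: {importances[idx]:.3f}")
```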
Thus, feature selection prior to entering the data in the model led to reduced overfitting, improved accuracy and reduced training time.

Machine Learning Results
For ACTH, using features 0, 1, 3, 8, 10 and 11 (selected as important through the reduction methods) led to the best results with the SVM algorithm, 0.809524 (0.132993), and the KNN algorithm, 0.778571 (0.157952), reported as mean accuracy (standard deviation). We ran the test program for machine learning algorithms on ACTH.csv, which contains all the features, and on ACTH_1.csv, which contains only the features 0, 3, 8 and 11. The results are collected in Table 10 (Figure 2). As expected, there is an improvement in all variables. Initial results for all variables are collected in Table 11. Completing the analysis with artificial intelligence (AI) methods, the highest prediction score was obtained with a GBM algorithm after adjusting the hyperparameters: 81.2% for the ACTH stress hormone and 70% for the inflammatory marker ESR. By selecting the relevant features prior to entering the data in the model, we obtained better results for all 7 machine learning algorithms used, as expected.

Illness Perception and Quality of Life
Consistent with previous research, our results suggest that negative illness perceptions are associated with lower quality of life in breast cancer patients [19,45]. The patients who perceive breast cancer as a serious condition, with major consequences on their lives, show lower levels of health, social and psychological quality of life. Moreover, seeing cancer as a permanent condition, with an unpredictable course, is associated with lower psychological and consequentially lower health quality of life. Previous studies found similar results, especially in older breast cancer patients reporting less positive illness perception and lower wellbeing [34]. Our findings also underline the predictive value of patients' illness perception. Emotion representations and illness coherence significantly predicted global quality of life. Our results are consistent with the studies showing that more negative illness perception is associated with lower wellbeing and quality of life [19,34]. Emotional representations were the strongest predictor. While emotional representations are reported as an important quality of life predictor in other studies as well, illness coherence is less central [19,34]. The Romanian patients might have lower cancer literacy [46], which could explain the stronger relationship between illness coherence and quality of life. Those who better understood their illness and symptoms reported higher quality of life in all domains. The older women reported feeling more puzzled about their symptoms, showing less illness coherence [34]. The intervention efforts should take into account this specific need, giving patients more information about breast cancer symptoms and signs, helping them to perceive cancer as more coherent.

Illness Perception, Quality of Life and Stress Hormones
The present study investigated if illness perception and quality of life can predict the levels of stress hormones one year later. Traditional regression analysis shows that ACTH levels could be predicted using family quality of life. Women who report higher quality of life in their family have lower levels of ACTH one year later. We could not find other associations between psychosocial measures and ACTH. This is consistent with previous research on the subject reporting inconclusive results [22]. The importance of family quality of life can be explained by the extensively documented role of perceived social support in breast cancer patients' adjustment [34], and good family life may have a protective role over time.
The artificial intelligence analysis yielded the highest prediction score for ACTH, 81.2%. The most important features, selected through the reduction methods, were perception of a chronic timeline, perceived personal control, health and family quality of life. The importance of perceived personal control and the chronic timeline was also underlined in a study exploring breast cancer patients' perceptions of gene expression profiling [47]. The perceived personal control and the will to prevent a chronic timeline were part of the patients' tendency to overestimate the importance of gene expression profiling. Perceived helplessness and the fear of the chronic timeline could lead to stress, explaining higher levels of ACTH. Illness perceptions and quality of life are associated with cancer mortality risk [15], and our findings highlight parts of the biological mechanism involved.
We did not find any significant Pearson correlation between free urinary cortisol at follow-up and the psychosocial factors. The machine learning algorithm found a weak prediction score of 52% for urinary cortisol using perceived illness consequences, social, psychological and family quality of life as predictors, which is close to random guessing and consistent with the statistical analysis above. Breast cancer is accompanied by long-lasting stress, which is known to result in cortisol dysfunction associated with an unmodulated inflammatory response [25]. While some studies have found positive relationships between cortisol and psychosocial factors in cancer patients, others have shown no relationship [48,49].

Quality of Life, Illness Perception and Inflammatory Markers
We also explored the relationship between quality of life, illness perceptions and inflammatory markers. Most psychosocial measures do not correlate with inflammatory markers. However, women who perceive their illness as being more chronic at baseline have higher levels of ESR and fibrinogen one year later. The machine learning algorithm found a 70% prediction score for ESR (using the perceived illness coherence and identity, chronic timeline and treatment control as predictors), and a 68% prediction score for fibrinogen. For fibrinogen, the most important features were illness coherence and identity, health and psychological quality of life. For CRP, the algorithm found a 61% prediction score, based on perceived chronic timeline, illness coherence, social and psychological quality of life. Previous studies also suggest that negative perceptions of consequences, timeline, identity and emotions are associated with higher mortality risks [15]. Inflammation might be the framework explaining how illness perception can predict breast cancer survival outcomes. Previous studies show that breast cancer survivors have high CRP levels immediately after treatment, but that these tend to normalize over time [50]. Our patients were at various points in time after treatment, and this high variability might explain the lower prediction scores. The mean duration between completion of cancer treatments and study entry was 4.7 years.

Limitations and Future Directions
The results should be considered within the limitations of this study. Our patients filled in the self-report measures when they came for their periodic medical examinations. Their answers might have been influenced by this stressful situation. As such, it would be important for future research to examine psychosocial factors in different contexts, using multiple recurrent assessments.
Both stress hormones and inflammatory markers fluctuate from the time of diagnosis through treatment and survival. Our sample consisted of breast cancer survivors with varying years since diagnosis. Future studies should aim for more homogeneous samples in terms of time passed since diagnosis and treatment completion. Larger samples would also offer more reliable results.
We could only measure psychosocial factors at baseline. It would have been useful to have a second assessment one year later. Intermediate measurements of the biological markers could also help to better understand the dynamics of their evolution.

Conclusions
Our study adds to the growing body of research exploring the relationship between psychosocial factors and biological markers in cancer patients. For many years, unidentified psychosocial distress has been linked to weaker adherence to treatment recommendations, more healthcare needs for nonmedical concerns, maladaptive coping mechanisms and chronic mental health issues in cancer patients and survivors [51]. Three conclusions can be drawn from this study. First, perceived illness coherence and negative emotions are significant predictors of breast cancer patients' quality of life. While, in previous studies, negative emotions were strongly associated with quality of life, the predictive value of illness coherence may be more specific to the Romanian context, where cancer literacy is lower. Older patients report lower illness coherence. Psychosocial intervention efforts should include illness coherence among their objectives, prioritizing older patients. Second, perception of a chronic timeline, perceived personal control, health and family quality of life at baseline show an 80% prediction score for ACTH one year later. Familial support through cancer survivorship might be a vital resource. Addressing strained family relations and increasing personal control through counselling and psychotherapy could help cancer prognosis. Third, perceived illness coherence and identity, chronic timeline and treatment control at baseline show a 70% prediction score for ESR. Stress hormones and inflammatory processes might be the framework explaining how illness perception and quality of life can predict breast cancer survival outcomes.

Institutional Review Board Statement:
The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board (or Ethics Committee) of "Grigore T. Popa" University of Medicine and Pharmacy, Iași (UMF141218).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
We systematically compared the performance of several machine learning algorithms. In the following, we describe how we created a test harness to compare several machine learning algorithms in Python with scikit-learn. We used this test harness as a template for the machine learning problems in the analyzed database, and we added several different algorithms in order to compare them [38].

Appendix A.2. Choosing the Best Machine Learning Model
There are several good models for machine learning projects. Each model has different performance characteristics. Using resampling methods, such as cross-validation, one obtains an estimate of how accurately each model can predict unseen data. We used these estimates to choose one or two of the best models from the suite of models.
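The selection step described above can be illustrated with a minimal sketch. It is not the code used in the study: the data are a synthetic stand-in generated with scikit-learn (70 samples, four predictors, a binary target), and the four candidate models, the 5-fold stratified scheme and the names `models`, `cv_scores` and `best_two` are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the study data (assumption): 70 samples,
# 4 predictors, binary target. Replace with the real feature matrix.
X, y = make_classification(
    n_samples=70, n_features=4, n_informative=3, n_redundant=1, random_state=0
)

# Hypothetical suite of candidate models; the study's actual suite may differ.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "k_nearest_neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "random_forest": RandomForestClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# The same folds are used for every model so the estimates are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

cv_scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in models.items()
}

# Rank models by mean cross-validated accuracy and keep the top two.
ranked = sorted(cv_scores, key=lambda name: cv_scores[name].mean(), reverse=True)
best_two = ranked[:2]

for name in ranked:
    print(f"{name}: mean accuracy = {cv_scores[name].mean():.3f}")
print("Selected:", best_two)
```

Using one shared `StratifiedKFold` object for all models keeps the class balance similar across folds, which matters for a sample as small as the one assumed here.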

Appendix A.3. Careful Comparison of Machine Learning Models
When working with a new dataset, it is recommended to visualize the data using different techniques and to examine them from different perspectives. The same idea applies to model selection. We looked at the estimated accuracy of the machine learning algorithms in several different ways in order to choose one or two to complete the analysis. Different visualization methods can be used to highlight the average accuracy, the variance and other properties of the distribution of the models' accuracies. We did this in Python with scikit-learn.
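As a hedged illustration of looking at more than the average accuracy, the sketch below collects the per-fold accuracies of two candidate models and reports their mean, standard deviation, minimum and maximum. The synthetic data, the two models, the repeated 5-fold scheme and the helper `summarize` are assumptions made for this example, not the procedure of the study. It prints a numerical summary; the same per-fold scores could equally be passed to a box plot.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Synthetic stand-in for the study data (assumption).
X, y = make_classification(
    n_samples=70, n_features=4, n_informative=3, n_redundant=1, random_state=1
)

# Hypothetical candidates to compare.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "gradient_boosting": GradientBoostingClassifier(random_state=1),
}

# Repeating the 5-fold split 3 times gives 15 accuracy estimates per model,
# which describes the spread better than a single 5-fold run.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=1)


def summarize(scores):
    """Return basic descriptive statistics of the per-fold accuracies."""
    return {
        "mean": float(np.mean(scores)),
        "std": float(np.std(scores)),
        "min": float(np.min(scores)),
        "max": float(np.max(scores)),
    }


summaries = {}
for name, model in candidates.items():
    fold_scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    summaries[name] = summarize(fold_scores)
    s = summaries[name]
    print(
        f"{name}: mean={s['mean']:.3f} std={s['std']:.3f} "
        f"min={s['min']:.3f} max={s['max']:.3f}"
    )
```

A model with a slightly lower mean but a much smaller spread may be preferable when the sample is small, which is why the spread is reported alongside the mean.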
Next, we describe how to develop a gradient boosting ensemble for classification.
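A minimal sketch of such an ensemble with scikit-learn's `GradientBoostingClassifier` is given here for orientation. The synthetic data, the 70/30 stratified split and the hyperparameter values (which are the library defaults for `n_estimators`, `learning_rate` and `max_depth`) are assumptions for illustration and are not the settings reported in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the study data (assumption): 70 samples,
# 4 predictors, binary target.
X, y = make_classification(
    n_samples=70, n_features=4, n_informative=3, n_redundant=1, random_state=2
)

# Hold out 30% of the samples for evaluation, preserving class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=2
)

# Gradient boosting fits shallow trees sequentially, each one correcting
# the errors of the ensemble built so far. Values below are illustrative.
gb_model = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=2
)
gb_model.fit(X_train, y_train)

# Evaluate on the held-out samples.
y_pred = gb_model.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(f"Held-out accuracy: {test_accuracy:.3f}")

# Relative contribution of each predictor to the fitted ensemble.
feature_importances = gb_model.feature_importances_
for index, importance in enumerate(feature_importances):
    print(f"feature {index}: importance = {importance:.3f}")
```

The `feature_importances_` attribute is one way to inspect which predictors contribute most to the fitted ensemble; with small samples these values can vary noticeably between splits.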