A Predictive Risk Score to Diagnose Adrenal Insufficiency in Outpatients: A 7 Year Retrospective Cohort Study

Background: The diagnosis of adrenal insufficiency (AI) requires dynamic tests which may not be available in some institutions. This study aimed to develop a predictive risk score to help diagnose AI in outpatients with indeterminate serum cortisol levels. Methods: Five hundred and seven patients with intermediate serum cortisol levels (3–17.9 µg/dL) who had undergone ACTH (adrenocorticotropin) stimulation tests were included in the study. A predictive risk score was created using significant predictive factors identified by multivariable analysis using Poisson regression clustered by ACTH dose. Results: The seven predictive factors used in the development of a predictive model with their assigned scores are as follows: chronic kidney disease (9.0), Cushingoid appearance in exogenous steroid use (12.0), nausea and/or vomiting (6.0), fatigue (2.0), basal cortisol <9 µg/dL (12.5), cholesterol <150 mg/dL (2.5) and sodium <135 mEq/L (1.0). Predictive risk scores range from 0–50.0. A high risk level (scores of 19.5–50.0) indicates a higher possibility of having AI (positive likelihood ratio (LR+) = 11.75), while a low risk level (scores of <19.0) indicates a lower chance of having AI (LR+ = 0.09). The predictive performance of the scoring system was 0.82 based on the area under the curve. Conclusions: This predictive risk score can help to determine the probability of AI and can be used as a guide to determine which patients need treatment for AI and which require dynamic tests to confirm AI.


Introduction
Adrenal insufficiency (AI) can be categorized into primary and secondary AI. The main causes of primary AI worldwide are tuberculosis and autoimmune adrenalitis [1,2]. Post-glucocorticoid therapy-induced AI has often been cited as the most common cause of secondary AI. Previous studies have identified multiple clinical and biochemical factors associated with AI [3-7]. Clinical factors include a history of cirrhosis, autoimmune diseases, hepatitis C, HIV infection, chronic kidney disease (CKD), or glucocorticoid use, as well as fatigue, nausea and vomiting, symptoms of hypotension, and Cushingoid appearance. Biochemical factors associated with AI include low basal cortisol, cholesterol, and sodium [7].
Different protocols for AI diagnosis have been used in various institutions. Those suspected of having AI may proceed directly to adrenocorticotropin (ACTH) stimulation testing without screening for serum morning cortisol levels [8]. Some protocols propose that a serum cortisol level below 3-5 µg/dL (83-138 nmol/L) in a sample drawn at 08:00 is strongly suggestive of AI, so that other dynamic tests, e.g., an ACTH stimulation test or insulin-induced hypoglycemia, are not necessary [8,9]. Another study suggested that if the 08:00 serum cortisol level is >15 µg/dL (414 nmol/L), a diagnosis of AI is less likely [10]. When the serum cortisol levels are in the indeterminate range, dynamic tests are mandatory.
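The paired units quoted above can be checked with a short conversion. The following is an illustrative sketch only: it assumes a molar mass of 362.46 g/mol for cortisol (a value not stated in this paper), which gives a factor of about 27.59 nmol/L per µg/dL, and the function name is hypothetical.

```python
# Illustrative sketch: convert serum cortisol from µg/dL to nmol/L.
# Assumption: cortisol molar mass of 362.46 g/mol (not stated in the text).
CORTISOL_MOLAR_MASS_G_PER_MOL = 362.46


def cortisol_ug_dl_to_nmol_l(ug_dl: float) -> float:
    """Return the cortisol concentration in nmol/L for a value given in µg/dL."""
    # 1 µg/dL = 10 µg/L; dividing by g/mol gives µmol/L; x1000 gives nmol/L.
    return ug_dl * 10.0 / CORTISOL_MOLAR_MASS_G_PER_MOL * 1000.0


for level in (3, 5, 15):
    print(f"{level} µg/dL ≈ {round(cortisol_ug_dl_to_nmol_l(level))} nmol/L")
```

With this factor, 3, 5, and 15 µg/dL round to 83, 138, and 414 nmol/L, matching the values quoted in the text.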
A frequently encountered problem in health care centers is the lack of access to diagnostic procedures such as ACTH stimulation tests or insulin-induced hypoglycemia tests. The ACTH stimulation test is currently the preferred diagnostic test as it is both safe and reliable [8]. In some institutions, the ACTH utilized in the tests may be in short supply or unavailable. In situations where supplies are limited, ACTH diluted to a low dose (1-5 µg) may be employed instead of the usual high dose of ACTH (250 µg). The use of diluted ACTH can lead to errors in the interpretation of the results and diagnosis if the diluted ACTH is not properly prepared [8].
Multiple reports have recommended upper and lower cut-off levels for serum cortisol levels to help rule out and rule in the presence of AI [3,11]. It has been estimated that using these cut-off levels could potentially diminish the number of dynamic tests by approximately 30% [3]. However, a problem occurs when serum cortisol levels fall in the intermediate level where AI cannot be either excluded or diagnosed; in those cases, dynamic tests are mandatory. A simple predictive tool that incorporates readily available clinical and laboratory data and that increases the accuracy of the prediction of AI could potentially reduce the number of ACTH stimulation procedures. Such a method would be particularly valuable where supplies of ACTH are limited or nonexistent. To date, there have been no reports of such a tool that could predict the risk of AI in cases where serum cortisol levels are in the intermediate range.
This study aimed to design a simple-to-use predictive score based on easy-to-obtain clinical and biochemical parameters to facilitate the prediction of secondary AI in patients with intermediate levels of cortisol.

Materials and Methods
A 7 year retrospective cohort study was conducted at the adult endocrinology outpatient department of Maharaj Nakhon Chiang Mai Hospital, Thailand. All data were acquired during January 2010-December 2016. The study was approved by the institutional review board of the Faculty of Medicine, Chiang Mai University. The ethics approval code is EXEMPTION-6193/2019 and the date of approval is 27 March 2019. Informed consent was waived by the ethics committee. Inclusion criteria were adult patients aged more than 18 years with an 08:00 serum morning cortisol level of 3-17.9 µg/dL (83-500 nmol/L) who had undergone ACTH stimulation testing. We excluded patients suspected of having primary AI or congenital adrenal hyperplasia, those with incomplete data for ACTH stimulation tests, females currently on hormonal therapy or taking oral contraceptive pills containing estrogen, and patients who had undergone pituitary surgery within the previous 2 months. The method used has been described in a previous study conducted by the authors [7].

ACTH Stimulation Test Protocol
Details of the ACTH stimulation test protocol have been described previously [7]. In brief, patients currently taking glucocorticoids were instructed to discontinue the medications at least 24 h before the tests. During May 2010-March 2014, only low-dose ACTH stimulation tests were performed in Thailand due to a shortage of ACTH. Serum cortisol was obtained at 0 (basal cortisol), 30, and 60 min after either 1 or 250 µg of ACTH had been administered intravenously.

Definitions
In this study, AI was defined as a peak serum cortisol level at 30 or 60 min after ACTH stimulation of less than 18 µg/dL (<500 nmol/L). Normal adrenal response was defined as a peak serum cortisol level 30 or 60 min after ACTH stimulation of ≥18 µg/dL (≥500 nmol/L). CKD was diagnosed if the patient had an estimated glomerular filtration rate (eGFR) of less than 30 mL/min/1.73 m² as calculated using the modification of diet in renal disease (MDRD) formula. Fatigue, nausea/vomiting, and orthostatic hypotension were symptoms reported by the patients and documented in the medical record by a medical practitioner. The definition of weight loss was a loss of 5% of body weight in one month or 10% over a period of six months or longer [12]. Cushingoid appearance was defined as at least one sign of glucocorticoid excess documented in the medical record by the medical practitioners, e.g., moon face, facial plethora, dorsocervical fat pad, proximal muscle weakness, easy bruising, and hirsutism [7].
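The outcome and CKD definitions above reduce to simple threshold rules. The sketch below is a minimal illustration, assuming that "peak" means the higher of the 30 and 60 min values and that the eGFR has already been calculated; the function and constant names are hypothetical and not part of the study.

```python
# Illustrative sketch of the threshold definitions used in this study.
AI_PEAK_THRESHOLD_UG_DL = 18.0  # peak cortisol below this value defines AI
CKD_EGFR_THRESHOLD = 30.0       # eGFR below this value (mL/min/1.73 m²) defines CKD


def classify_acth_response(cortisol_30_min: float, cortisol_60_min: float) -> str:
    """Classify an ACTH stimulation test from 30 and 60 min cortisol (µg/dL)."""
    # Assumption: the peak is the higher of the two post-stimulation values.
    peak = max(cortisol_30_min, cortisol_60_min)
    if peak < AI_PEAK_THRESHOLD_UG_DL:
        return "AI"
    return "normal adrenal response"


def has_ckd(egfr: float) -> bool:
    """Return True if a precomputed eGFR meets the study's CKD definition."""
    return egfr < CKD_EGFR_THRESHOLD


print(classify_acth_response(14.2, 16.8))  # peak 16.8 µg/dL
print(classify_acth_response(15.0, 19.3))  # peak 19.3 µg/dL
print(has_ckd(25.0))
```

The first call falls below the 18 µg/dL threshold and is classified as AI; the second reaches it at 60 min and is classified as a normal adrenal response.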

Predictive Variables
Clinical and biochemical data were obtained from electronic medical records. Clinical data included demographic information, e.g., age, sex, and underlying diseases. Indications for ACTH stimulation testing were also collected. Biochemical factors such as serum albumin, creatinine, and cholesterol were acquired within 3 months before or following the tests. Serum 08:00 morning cortisol and basal cortisol levels were obtained using an electrochemiluminescence immunoassay (ECLIA) (Elecsys® Cortisol 1010, Roche Diagnostics, Laval, QC, Canada). The intra- and inter-assay coefficients of variation for serum cortisol were <10%.

Outcome Variable
The results of the ACTH stimulation tests were categorized into 2 groups: AI and normal adrenal response.

Statistical Analysis
STATA program version 15.1 (Stata Corp., College Station, TX, USA) was used for analysis. The statistical significance level was defined as p-value < 0.05 for two-tailed tests. Data are presented either as count and percentage or as mean and standard deviation (SD). For univariable comparisons, Fisher's exact test was used for categorical data and the t-test or the Mann-Whitney U test for continuous data. Multivariable analysis was performed using Poisson regression clustered by ACTH dose; the results are reported as a coefficient value and a 95% confidence interval (CI). Significant predictive factors identified in a multivariable model from an earlier study were employed in the current model to predict AI [7].
Item scores were calculated by the transformation of the regression coefficient. The coefficient of each level for each factor was divided by the smallest coefficient of the model and rounded to the nearest 0.5. Item scores were then added together to calculate a total score. The total scores were then divided into 2 risk levels: groups at a low risk and at a high risk of having AI. The cut-off point for the risk levels was acquired from the level which yielded the lowest positive likelihood ratio (LHR+) of AI and the highest LHR+ of AI for the low-risk and the high-risk group, respectively. Discrimination of the prediction scores is presented as the area under the receiver operating characteristic (AuROC) curve and a 95% CI. Internal validation was performed using a resampling technique (bootstrapping method) and the concordance index (C-index) was reported. To give 80% power at the 5% significance level (two-sided with an odds ratio of 0.42 of detecting AI for a specific risk factor), a sample size of at least 430 patients was estimated to be needed [3].
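The coefficient-to-score transformation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' STATA procedure: the coefficients are hypothetical placeholders rather than the study's estimates, all coefficients are assumed to be positive, ties are assumed to round upward, and the function names are invented for the example.

```python
import math


def round_to_nearest_half(value: float) -> float:
    """Round to the nearest 0.5; ties round up (an assumption of this sketch)."""
    return math.floor(value * 2 + 0.5) / 2


def coefficients_to_item_scores(coefficients: dict) -> dict:
    """Divide each coefficient by the smallest one, then round to the nearest 0.5."""
    # Assumption: every coefficient is positive, so the smallest is a valid divisor.
    smallest = min(coefficients.values())
    return {
        name: round_to_nearest_half(beta / smallest)
        for name, beta in coefficients.items()
    }


# Hypothetical coefficients for illustration only; not the study's estimates.
example_coefficients = {"factor_a": 0.10, "factor_b": 0.26, "factor_c": 0.93}
print(coefficients_to_item_scores(example_coefficients))
```

By construction, the factor with the smallest coefficient receives a score of 1.0, and every other factor is expressed as a multiple of it in steps of 0.5.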

Results
A total of 527 patients who had serum morning cortisol between 3-17.9 µg/dL were included in this study. Three patients with serum morning cortisol <3 or ≥18 µg/dL, 2 patients with incomplete results from the ACTH stimulation tests, 1 patient who was on oral contraceptive pills, 2 patients who had pituitary surgery in the past 2 months, 2 patients with congenital adrenal hyperplasia, and 10 patients with primary AI were excluded. Therefore, 507 patients were enrolled and included in the predictive model analysis. Baseline characteristics and biochemical investigation results are shown in Table 1. Of the enrolled patients, 24.7% (n = 125/507) were diagnosed with AI. AI was significantly more common in patients aged ≥50; those with hypertension, CKD, fatigue, or a history of pituitary surgery; and in patients with exogenous steroid use and Cushingoid appearance. Baseline biochemical investigations found that serum albumin was significantly lower in the AI group than in the normal adrenal response group (p < 0.001).
Seven initial predictors of AI were chosen based on the predictive clinical factors previously reported by Manosroi et al. [7]. Those factors were CKD, Cushingoid appearance in patients with exogenous steroid use, symptoms of nausea/vomiting, and fatigue. The biochemical factors were serum basal cortisol <9 µg/dL (<248 nmol/L), serum cholesterol <150 mg/dL, and serum sodium <135 mEq/L. Risk scoring was created to predict the probability of patients with a normal adrenal response having AI. The transformed scores ranged from 1.0 to 13.5. The scoring scheme is shown in Table 2. The predictive ability of the scoring system with the transformed scores of all seven predictive factors represented by AuROC was 0.82 (95% CI 0.78-0.86), which is similar to the predictive ability of the model before transforming the scoring system (AuROC 0.84, 95% CI 0.80-0.88) (Figure 1). The total scores were classified into two groups: a low-risk group (scores 0-20.0) and a high-risk group (scores 20.5-50.0) (Tables 3 and 4). Patients with a normal adrenal response were more common in the low-risk group (72.9%).
In the low-risk group, 370 patients had normal adrenal responses and 77 patients had AI, which demonstrated a 61.6% specificity. In the high-risk group, 48 patients had AI while 12 patients had normal adrenal responses, which showed a 96.9% specificity. Figure 2 shows the relationship between the proportion of patients with AI and the total scores. The higher the score, the greater the proportion with AI. The accuracy of our model was further verified by bootstrap validation. The C-index was 0.77 (95% CI 0.72-0.82). The proposed predictive criteria are shown in Table 4.
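Applying the score amounts to summing the item scores for the factors present and comparing the total with the cut-off. The sketch below is illustrative only: it takes the item scores as listed in the abstract and the 20.5 cut-off given in this section, and the dictionary keys and function names are hypothetical identifiers introduced for the example.

```python
# Item scores as listed in the abstract; the keys are hypothetical identifiers.
ITEM_SCORES = {
    "ckd": 9.0,
    "cushingoid_with_exogenous_steroid": 12.0,
    "nausea_or_vomiting": 6.0,
    "fatigue": 2.0,
    "basal_cortisol_below_9_ug_dl": 12.5,
    "cholesterol_below_150_mg_dl": 2.5,
    "sodium_below_135_meq_l": 1.0,
}
HIGH_RISK_MIN_SCORE = 20.5  # lower bound of the high-risk group in this section


def total_score(findings: set) -> float:
    """Sum the item scores for the predictive factors present in a patient."""
    unknown = findings - set(ITEM_SCORES)
    if unknown:
        raise ValueError(f"Unknown predictive factors: {sorted(unknown)}")
    return sum(ITEM_SCORES[name] for name in findings)


def risk_level(score: float) -> str:
    """Return 'high' for scores of 20.5 or above, otherwise 'low'."""
    return "high" if score >= HIGH_RISK_MIN_SCORE else "low"


patient = {"fatigue", "nausea_or_vomiting", "basal_cortisol_below_9_ug_dl"}
score = total_score(patient)
print(score, risk_level(score))
```

In this example the three factors sum to 20.5, which places the hypothetical patient at the lower boundary of the high-risk group.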

Discussion
The present study has proposed that the predictive risk score system for facilitating the prediction of AI shows good diagnostic accuracy: 82% based on AuROC. This clinical prediction model represents a simple and affordable tool to facilitate the diagnosis of AI. Present AI diagnostic procedures require multiple steps, including the screening for serum morning cortisol followed by ACTH stimulation tests. In institutions where ACTH is not available, patients suspected of having AI may need to be transferred to other institutions where the tests are available. With the predictive risk score system, the number of patient referrals as well as the number of tests could potentially be reduced, representing time and cost savings for patients, healthcare practitioners, and health care facilities.
To maximize the potential for high diagnostic specificity (96.9%) and to minimize the false positive rate, a high LHR+ for the cut-off point was employed for the group with a high risk of having AI. Similarly, a low LHR+ cut-off point was used with the group at low risk of having AI in order to minimize the number of false negatives. A single cut-off point that demonstrates both high sensitivity and high specificity simultaneously cannot be achieved. To reduce the number of false positive diagnoses, the proposed cut-off demonstrated a high specificity for diagnosing AI. The scoring system categorized patients into two groups: those with a high risk of AI and those with a low risk. Patients with scores of 20.5 or above were in the high-risk group. Cushingoid appearance in patients with exogenous glucocorticoid use and serum basal cortisol levels <9 µg/dL (<248 nmol/L), which scored 12.0 and 12.5, respectively, played a major role in the predictive score. Patients who had at least one of these factors in addition to other factors were promptly categorized into the AI high-risk group. In terms of clinical application, easy-to-use predictive criteria for AI were suggested based on the risk score system.
Of the patients at high risk for AI based on their predictive risk score, all but 12 had AI. Based on these findings, we recommended that those in the high-risk group proceed directly to AI treatment including the initiation of physiologic doses of glucocorticoid. For those in the low-risk group, we recommended dynamic tests such as the ACTH stimulation test to rule out AI and to preclude a misdiagnosis of this disease, as a misdiagnosis of AI can potentially lead to a critical and even life-threatening situation. Applying this predictive risk score system to patients suspected of having AI could potentially decrease the number of ACTH stimulation tests by 9.5% (n = 48/507). This clinical prediction model is intended for use with patients who have intermediate serum morning cortisol levels, i.e., levels between 3-17.9 µg/dL (83-500 nmol/L). It can help to guide decision making by physicians regarding whether or not further dynamic tests are indicated.
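The specificity and the proportion of tests avoided quoted above follow from the counts given in the Results (48 patients with AI and 12 with a normal response in the high-risk group; 77 with AI and 370 with a normal response in the low-risk group). The sketch below reproduces that arithmetic; the function name is hypothetical, and the likelihood ratio it returns is a by-product of these counts that is not claimed to reproduce the tabulated value.

```python
def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute sensitivity, specificity, and LR+ from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / false_positive_rate,
    }


# "High risk" is treated as a positive test; counts are taken from the Results.
metrics = diagnostic_metrics(tp=48, fn=77, fp=12, tn=370)
print(f"specificity = {metrics['specificity']:.1%}")
print(f"sensitivity = {metrics['sensitivity']:.1%}")

# Proportion of ACTH stimulation tests avoided, as calculated in the text.
print(f"tests avoided = {48 / 507:.1%}")
```

With these counts the specificity is 370/382 (96.9%) and the proportion of tests avoided is 48/507 (9.5%), in agreement with the figures in the text.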
The proposed predictive risk score incorporates both clinical and biochemical predictive factors. Some of the variables in the final clinical prediction model have been reported in previous studies to be associated with AI, including Cushingoid appearance in patients with exogenous steroid use, nausea/vomiting, fatigue, low basal cortisol, low serum cholesterol, and hyponatremia [5,7,13-16]. Among the factors included in the model, Cushingoid appearance and serum basal cortisol levels <9 µg/dL showed a very high level of association with AI. Data regarding the association between AI and basal cortisol levels were recently published by our group [11]. That report showed that basal cortisol can be employed as an alternative method for the diagnosis of AI. The relationship between CKD and AI remains controversial, with some studies reporting that most CKD patients have normal adrenal function, while another study indicates that higher serum creatinine is associated with a lower risk of having AI [17-19].
To the best of our knowledge, previously reported tools to help diagnose AI, including screening and diagnostic algorithms, have all been evaluated exclusively in cirrhotic patients [20]. Other studies have explored groups of factors which can potentially predict the occurrence of AI [3,21]. The present study developed a simple and practical scoring system with good diagnostic accuracy that is suitable for use in normal clinical practice. A strength of this study is that the population used to create the scoring system had various indications for ACTH stimulation testing, which supports the generalizability of the new scoring system. Additionally, the present study had a large sample size, which provided adequate power of analysis. Finally, the observed relationships are not likely to have occurred by chance, as most of the factors related to AI in this study can be explained by the underpinning pathophysiology.
We acknowledge some limitations in this study. Symptoms of nausea/vomiting and fatigue are subjective and are only perceived by the patient; the documentation of these symptoms depends on the decision of the clinicians. Additionally, only patients with intermediate serum cortisol levels (3-17.9 µg/dL (83-500 nmol/L)) were included, making the results most relevant to that subgroup. Although the summary of recommendations from the Endocrine Society guidelines suggested that the low-dose (1 µg) corticotropin test can be used for the diagnosis of AI when the substance is in short supply [8], the variation within and between low-dose Synacthen dilution methods can result in an inaccurate dosage, leading to invalid results [22].
Finally, this study was internally verified based on a patient population in a single institution; external validation should be accomplished to confirm the predictive ability of this model.

Conclusions
The proposed predictive risk score system and criteria to diagnose secondary AI have an acceptable diagnostic accuracy. This system can potentially reduce the number of dynamic ACTH stimulation tests required, saving time, money, and resources. The scoring system can be utilized as a guide for clinicians in institutions where ACTH stimulation testing is limited or not available. In low-risk groups (scores 0-20.0), ACTH stimulation tests or other dynamic tests should be performed. In high-risk groups (scores 20.5-50.0), AI treatment is indicated. Future external validation of this predictive risk score is warranted.

Author Contributions:
W.M. designed the study; collected, analyzed, and interpreted the data; and was the major contributor to writing the manuscript. P.A., J.K., and M.P. performed the data analyses and wrote the manuscript. T.P. edited the manuscript. All authors have read and agreed to the published version of the manuscript.

Informed Consent Statement:
Patient consent was waived by the Ethics Committee due to the retrospective nature of the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Acknowledgments:
The authors are grateful to G. Lamar Robert, and Chongchit S. Robert, for reviewing the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.