Clinical Implication of the Relationship between Antimicrobial Resistance and Infection Control Activities in Japanese Hospitals: A Principal Component Analysis-Based Cluster Analysis

There are few multicenter investigations regarding the relationship between antimicrobial resistance (AMR) and infection-control activities in Japanese hospitals. Hence, we aimed to identify Japanese hospital subgroups based on facility characteristics and infection-control activities. Moreover, we evaluated the relationship between AMR and hospital subgroups. We conducted a cross-sectional study using administrative claims data and antimicrobial susceptibility data in 124 hospitals from April 2016 to March 2017. Hospitals were classified using cluster analysis based the principal component analysis-transformed data. We assessed the relationship between each cluster and AMR using analysis of variance. Ten variables were selected and transformed into four principal components, and five clusters were identified. Cluster 5 had high infection control activity. Cluster 2 had partially lower activity of infection control than the other clusters. Clusters 3 and 4 had a higher rate of surgeries than Cluster 1. The methicillin-resistant Staphylococcus aureus (MRSA)/S. aureus detection rate was lowest in Cluster 1, followed, respectively, by Clusters 5, 2, 4, and 3. The MRSA/S. aureus detection rate differed significantly between Clusters 4 and 5 (p = 0.0046). Our findings suggest that aggressive examination practices are associated with low AMR whereas surgeries, an infection risk factor, are associated with high AMR.


Introduction
Antimicrobial resistance (AMR) is an emerging global public health crisis. In Japan, the National Action Plan on AMR 2016-2020 issued in 2016, demanded medical institutions to promote comprehensive AMR control, linking the efforts of the existing infection control team (ICT) at the field level with those of the antimicrobial stewardship program (ASP) [1]. Methicillin-resistant Staphylococcus aureus (MRSA) is the most common AMR and should be monitored by each institution [2]. One of the goals of the action plan was to reduce the MRSA resistance rate to 20% by 2020 [1]. In 2020, the MRSA resistance rate decreased; however, it is still much higher than the corresponding outcome indices [3].
Quality indicators (QIs) related to infection control and the proper use of antibacterial agents including hand disinfection compliance rates, implementation of facility guidelines [4] and the implementation rate of Therapeutic Drug Monitoring (TDM), have been proposed [5]. In Japan, a surveillance system called Japan Surveillance for Infection Prevention and Healthcare Epidemiology (J-SIPHE) is in operation, and clinical indicators (e.g., antibacterial drug usage, blood culture collection rate, etc.) of participating institutions were collected [6]. In this way, various Qis have been proposed as infection control indicators, but implementing them remains difficult as there is no clear benchmark. Individual indicators are intricately intertwined and should be evaluated comprehensively [7].
The Japanese Ministry of Health, Labour and Welfare (MHLW) incorporated the medical fee for infection prevention and control (IPC), which is classified into two types (IPC type 1 or IPC type 2) [8]. Type 1 applies to physicians or nurses with >0.8 full-time equivalent (FTE), while type 2 applies to each member with >0.5 FTE [9,10]. However, some facilities had difficulties claiming the IPC medical fee because of shortages in infection control supply materials, as well as human resources and facility equipment [11,12]. Therefore, we believe that it is necessary to evaluate not only IPC and ASP but also the status of the facility when considering the relationship with AMR.
The optimal Qis for evaluating IPC are unclear. Additionally, there are differences in medical care, human resources, and physical resources even in facilities with a similar type of IPC medical fee. Thus, it is difficult to distinguish between facilities based on specific QI or medical fee. A more practical approach to understanding individual IPC and its underlying mechanism is to consider the specific facility factors and infection control measures. Furthermore, there is a need to evaluate the factors that may be related to AMR.
In this study, we aimed to objectively summarize the variables associated with infection control and identify facility clusters based on structure and process factors. Moreover, we assessed the relationships between AMR and these clusters to clarify the factors that may affect AMR. The conceptual framework that guided our study was the quality of care model developed by Donabedian [13]. This conceptual framework comprises the following three domains: structure factors, process factors, and AMR ( Figure 1). We conducted a principal component analysis to transform the selected variables included in the cluster analysis [14].

Results
A total of 124 hospitals were analyzed, excluding 21 hospitals from 145 participating hospitals in this study. Data from these 21 hospitals were missing the following variables: the diagnostic procedure combination/per-diem payment system (DPC/PDPS) data file (4), the Japan Nosocomial Infections Surveillance (JANIS) specimen data (n = 10), JANIS patient age data (n = 6), and JANIS microbial susceptibility data (n = 1). Characteristics of the 124 analyzed hospitals are described in Table 1. The structure factors represented by the median number of beds in the facilities was 330 (interquartile range [IQR]: 223-478) beds; the median length of hospital stay was 12.4 (IQR: 11.1-14.1) days; the rate of surgeries for surgical site infection (SSI) surveillance among all surgeries was 26.5% ± 5.6%; 59.7% of the facilities were located in the eastern region of Japan; and 92.7% claimed IPC type 1 medical fee. The process-related factors were represented by the TDM implementation rate for vancomycin (79.2%), multiple sets of blood culture (81.1%), blood culture contamination (3.1%), number of Clostridioides difficile (CD) detection tests (4.2/1000 bed days), blood culture collected prior to broad spectrum antibiotic therapy (60.1%), and the number of bacterial tests (9.8/100 bed days). The median MRSA/Staphylococcus aureus (S. aureus) detection rate was 42.3% (IQR: 33.3-52.5). We performed a principal component analysis based on an optimal subset of ten selected variables (in the first principal component created with all variables, there were no variables with eigenvectors >0.4, when eleven variables were selected, the cumulative eigen-value was less than 60%). Four structure factors including average length of stay, rate of surgeries, region, and IPC type were selected. By contrast, six process factors including the TDM implementation rate for vancomycin, multiple sets of blood cultures, blood culture contamination, the number of bacterial tests, the number of CD-detected tests, and blood culture collected prior to broad spectrum antibiotic therapy were selected. The first four principal components with eigenvalues >1 explained 63.0% of the variance and were therefore retained for further cluster analysis (Table A1). The first principal component accounted for 24.9% of variance for which the three factor scores with the largest eigenvectors were the number of bacterial tests, blood culture collected prior to broad spectrum antibiotic therapy, and number of CD detected tests. Therefore, the first principal component was characterized as the component for "bacterial test". Likewise, the second principal component, accounting for 14.8% of variance, was characterized as "surgeries", because the components were the rate of surgery, average length of stay, and region. The third principal component, which accounted for 12.9% of variance, was characterized as the "ability of infection control team", as the components were the TDM implementation rate for vancomycin, medical fee for IPC type, and rate of blood culture contamination. The fourth principal component, which accounted for 10.8% of variance, was characterized as "skill in performing blood cultures", as the components were multiple sets of blood cultures performed and blood culture contamination.
A hierarchical cluster analysis was performed among the 124 facilities based on the four principal components derived from the principal component analysis, and six distinct clusters were obtained ( Figure 2). Cluster 6 included only one facility with data on the number of bacterial tests (52.0/100 bed days) and the number of CD detected tests (27.0/1000 bed days), and those values were considered outliers because they were significantly different from those of the other clusters. Therefore, we excluded this Cluster from further analysis. Cluster 1 (n = 25; 20%), Cluster 2 (n = 13; 11%), Cluster 3 (n = 5; 4%), Cluster 4 (n = 49; 40%), and Cluster 5 (n = 31; 25%) were established.  Table 2 shows the clinical characteristics of the five clusters. The median number of beds decreased from the highest to lowest, as follows: Cluster 5, Cluster 4, Cluster 2, Cluster 1, and Cluster 3. Cluster 2, Cluster 4, and Cluster 5 included almost all facilities with IPC type 1. Cluster 5 had the shortest average length of stay (10.9 days) and the highest number of bacterial tests (14.6/100 bed days). Cluster 4 had the highest rate of surgeries (29.3%) and second longest average length of stay (13.2 days). Additionally, Cluster 4 had a lower number of blood cultures collected prior to broad spectrum antibiotic therapy (59.2%) and a lower number of bacterial tests (8.5/100 bed days) than Cluster 5. All facilities in Cluster 2 claimed IPC type 1. Meanwhile, Cluster 2 had the lowest number of multiple sets of blood cultures (46.3%) and the highest rate of contaminated blood cultures (4.1%). Cluster 1 was composed of facilities with both IPC types 1 and 2 and the lowest rate of surgeries (22.1%), while the number of blood cultures collected prior to broad spectrum antibiotic therapy (40.6%) was lower than for Cluster 2 and Cluster 5. Cluster 1 also had the second lowest number of bacterial tests (6.8/100 bed days). Cluster 3 was composed of facilities with IPC type 2 only. The average length of stay in Cluster 3 was longer than that in other clusters (14.9 days), and it had the lowest TDM implementation rate for vancomycin (9.7%), the lowest rate for blood culture collected prior to broad spectrum antibiotic therapy (30.1%), and the lowest number of bacterial tests (6.7/100 bed days). Table A2 shows more detailed data.  Figure 3 shows a summary of the characteristics of the five clusters (based on Tables 1 and 2) in addition to the relationship between clusters and the MRSA/S. aureus detection rate. The MRSA/S. aureus detection rate ranked from the lowest to highest as follows: Cluster 1, Cluster 5, Cluster 2, Cluster 4, and Cluster 3 (36.8%, 37.3%, 40.6%, 47.0%, and 50.0%, respectively). The Kruskal-Wallis test followed by the Steel-Dwass test revealed significant differences between Cluster 4 and Cluster 5 (p = 0.0046).  (Table 1) and the comparison of that variable with each cluster (

Discussion
In this study, we implemented a principal component analysis-based cluster analysis of structure and process factors, which eventually resulted in the identification of five clusters. Our results indicate that when evaluating the MRSA detection rate for each facility, consideration should be paid to not only process factors but also structure factors including resources (e.g., number of medical staff) or risk factors (e.g., surgery), as these are also related with AMR.
We conducted multicenter analyses by principal component analysis-based cluster analysis. This method required many samples, which were not only useful in reducing dimensionality but also in detecting the key features of the data. This method has been applied to common diseases, such as chronic obstructive pulmonary disease, dermatomyositis, and chronic heart failure [15][16][17][18]. However, it has scarcely been used in evaluating infection control in health facilities.
This study showed the need to consider AMR in terms of infection control and facility characteristics, especially the number of bacterial examinations and surgeries related to SSI. For instance, the MRSA/S. aureus ratio between Cluster 4 and Cluster 5 had significant differences. In this study, Cluster 5 had the highest rate of activity for infection control. International guidelines demonstrate the detection of bacterial species, followed by optimization of antimicrobial stewardship, and prevention of growing AMR [19]. In our study, Cluster 4 had the highest rate of surgeries. The National Healthcare Safety Network reported that S. aureus is the most common pathogen of SSIs and that the proportion of SSIs caused by S. aureus increased to 30%, with MRSA comprising 49.2% [20]. Many surgeries for SSI surveillance may have reflected the high detection rate of AMR. Cluster 2 showed a higher MRSA/S. aureus detection rate than Cluster 5. A previous descriptive study reported by a Japanese university hospital showed that ICT recommendation improved the ratio of multiple sets of blood culture, similarly, affecting AMR [21]. Partial infection control activities may be insufficient (e.g., education of blood culture for medical staff) and may represent poor ICT activity, thereby affecting AMR. Cluster 3 was composed of facilities collecting IPC type 2 medical fees. These facilities did not have enough human resources, medical resources, and stratified facilities, as reported in previous studies [9,10]. In this study, the MRSA/S. aureus rate was the highest in Cluster 3, which suggests that IPC type 2 facilities require national level support for better implementation of infection control. Cluster 1 had low rates of infection control activities, surgeries, and MRSA/S. aureus detection. A recent retrospective study of one small-sized and eleven medium-sized hospitals (range: 73-354 number of beds) reported that these hospitals had a low number of bacterial tests, which meant that patients with serious infections were likely to be transferred to an acute care hospital. Therefore, AMR-isolation rates were reflected as lower in these hospitals [22]. Cluster 1 may reflect only a few incidents of encountering AMR, and infection-control activity could also possibly be low.
Nevertheless, there are also several limitations to our study. First, various factors that may affect resource use were not considered in our study because of the lack of available data. For example, a previous study reported that compliance with hand hygiene reduced nosocomial infections and MRSA transmission [23]. However, the usage of alcohol disinfection in the hospital is not recorded in the administrative data, and hence could not be used. These data may influence the MRSA/S. aureus detection rate. Second, as participation in this study was voluntary, the participating hospitals may not be representative of all hospitals in Japan. Moreover, a project managed by Kyushu University believes that the hospitals that participated in JANIS and participated in this study were already highly conscious of infection control; however, we were not able to evaluate associated biases. Future studies should evaluate not only the recruited hospitals but also all hospitals in Japan. Third, our two sets of data were not linked by individual patient identification. This limitation meant that patients in the DPC data cannot be linked with the JANIS data. Therefore, it was not possible to directly evaluate the medical procedures of patients with submitted bacterial tests. As an alternative, facility-level variables were obtained from the data to achieve our purpose. Our conceptual model showed that AMR is affected by the activity of the facilities; hence, we found the evaluation of variables at the facility-level to be appropriate. In the future, more detailed analyses may be possible if databases containing the provided medical care data and detected bacterial data at the individual patient-level are constructed.

Data Source
The administrative claims data were based on the DPC/PDPS, which is a national administrative claims database for acute inpatient care in Japan. Nationally uniform electronic DPC data include facility information (e.g., number of beds), patient clinical information (e.g., age, sex, disease, and admission and discharge date), and information on medical procedures (e.g., drug administration, surgery, examination, procedure, their codes, and cost) [24,25] used for patient classification and the DPC-based reimbursement system. They are used to improve systems and policies, including hospital management and service reimbursement. The JANIS data, which include microbiological data, provide information regarding patient demographics, specimen reception dates, specimen sources, types of bacteria, and susceptibility test results [26]. Moreover, the JANIS data encompass all testing results regardless of characteristics, such as infection status, colonization, or carrier status. JANIS has been launched as a voluntary AMR surveillance funded by the MHLW and managed by the National Institute of Infectious Diseases. JANIS clinical laboratory division collects comprehensive specimen-based data, which comply with JANIS data format, from participating hospitals each month.
We conducted a cross-sectional database analysis. We used both administrative claims data and microbiological data obtained from a research project managed by Kyu-shu University [27]. The project aimed to analyze the association between infection control, including antimicrobial consumption, and AMR. Kyushu University collected two sets of data from 145 acute care Japanese hospitals, which responded to the call for voluntary study participation.
The DPC/PDPS data and the JANIS data have different datasets. These data have common facility identifications, but not common patient identifications. Furthermore, our research objective was at the facility-level and not at the patient-level. Therefore, the variables collected and calculated were based on the identification of each facility.

Study Inclusion and Exclusion Criteria
This study included patients who had been admitted and included in the DPC data in each facility as well as those with specimen reception dates as indicated in the JANIS data between 1 April 2016, and 31 March 2017. Facilities without records of patient characteristics, hospitalizations, procedures, drugs, surgeries, or microbial susceptibility data were excluded because these variables were necessary for the principal component analysis and cluster analysis.

Variable Definitions and Facility Categories
Variables from the DPC data and JANIS data were obtained, calculated, and summarized at the facility-level and were evaluated as potential candidate variables for the following three domains: structure factor, process factor, and outcome. These definitions are presented in Table A3. The region and number of hospital beds were obtained from the DPC data. The following variables were also calculated from the DPC data: number of patient admissions per year, average length of stay, rate of surgeries, intensive care unit patient admissions, central venous catheter use, urinary catheter use, medical fee for IPC type (type 1 or type 2), hospital status (teaching or non-teaching), hospital charge index (7:1 or 10:1), and pharmaceutical services (claimed or non-claimed). Process factors included the following variables calculated from the JANIS data: multiple sets of blood cultures and contamination of blood cultures. Moreover, the following variables were calculated from DPC data: TDM implementation rate for vancomycin, number of CD detected tests, blood culture collected prior to broad spectrum antibiotic therapy, specimens for culture prior to broad spectrum antibiotic therapy, number of bacterial tests, antimicrobial use density (AUD) of antibiotic injections, and days of therapy (DOT) of antibiotic injections. AMR included the MRSA/S. aureus detection rate calculated from the JANIS data.

Principal Component Analysis and Cluster Analysis
Principal component analysis is a dimension-reducing method that replaces the variables in a dataset with a smaller number of derived variables. Dimension reduction may help to remove redundant variables that could possibly hinder the clustering process [18,28]. The goal of the principal component analysis was to extract a number of principal components to be used as clustering variables. One way to perform principal component analysis is by selecting a subset of variables that best approximates all the variables [29]. During variable selection for principal component analysis, we performed the following steps: first, we selected variables that can be considered resources for hospital infection control and excluded similar variables that reflect the size of the hospitals. In this study, medical fee for IPC type had been selected. The AMR domain was also selected for the outcome measure. Second, for the independent variables, correlations between survey variables were computed. Variables that were highly correlated (γ > 0.70) were avoided for simultaneous use in the principal component analysis and only either one was used. The coefficient of variation (CV) was calculated for highly correlated variables, and those with the largest CVs were selected for principal component analysis. Third, the optimal variable subset was explored in performing principal component analysis following these criteria: (1) greater number of variables, (2) fewer number of principal components with an eigenvalue >1, and (3) cumulative eigenvalue >60%. Variables with eigenvectors >0.4 were considered principal components.
A hierarchical cluster analysis was performed based on the Euclidean distances among the principal components [30]. We implemented the Ward method to minimize the total variance within clusters [31]. The number of clusters were based on the distribution of variables among the clusters.

Statistical Analyses
After performing the cluster analysis and choosing the number of clusters, each variable was compared among the clusters, and each cluster was labeled. We then compared the outcome measure (MRSA/S. aureus detection rate), which was the parameter that had not been used for the principal component analysis and cluster analysis. Continuous variables are expressed as means (standard deviation [SD]) or medians (interquartile range [IQR]) depending on whether their distributions are normal or skewed. Categorical variables are expressed as numbers (percentages) and binary variables are coded as 0/1 forms. Differences in characteristics between the clusters were assessed using an analysis of variance for continuous normally distributed variables, and non-parametric Kruskal-Wallis tests were used for non-normally distributed variables. All statistical analyses were performed with SAS 9.4 (SAS Institute, Cary, NC, USA), and p < 0.05 was defined as statistically significant.

Conclusions
We established a novel exploratory statistical methodology for multicenter Japanese hospitals, which led to the identification of subgroups based on structure and infection control factors. Our results suggest the importance of a multidimensional assessment of AMR, involving structure and infection control factors. Antimicrobial stewardship, including aggressive examination, is associated with low AMR while surgeries, an infection risk factor, are associated with high AMR. For more detailed research, it would be desirable to establish a database associating patient information with microbiological detection.  Data Availability Statement: The data are not publicly available as the participants of this study did not agree for their data to be shared publicly.

Conflicts of Interest:
The authors declare no conflict of interest.