How Should the Social Service Quality Evaluation in South Korea Be Verified? Focusing on Community Care Services

The quality evaluation (QE) of social services tends to have a large variation in results depending on the object and method of service measurement. To overcome these limitations, an analysis of the internal consistency or validity of the social service QE index is necessary, but meta-research on this is insufficient. This study analyzes the internal consistency and validity of evaluation indexes based on the results of social service QE. We utilized the social services QE manual of the Social Security Information Service’s Facility Evaluation Department. The social service QE indexes implemented in 2013 and 2016 were coded and analyzed. We found that there was internal consistency between the results of the care services evaluation in 2013 and 2016. In addition, there were differences between the care services QE indexes by service type in 2013 and 2016. It is necessary to construct effective indexes by simplifying, diversifying, and differentiating social service QE indexes. In addition, control devices for external factors (region, composition of the evaluation team, etc.) must be prepared to maintain the consistency of evaluation scores, and in the long term, standardization of social service QE indexes is necessary.


Introduction
Social service quality indicates the degree to which the needs of service users are satisfied [1,2]; in quality control for social services, quality should be assessed and managed in a macroscopic and broad social context and should include the assessment of individual services [3][4][5]. In particular, for the quality control of social services, separate interventions in service quality are required to meet the users' quality expectations. To protect the rights of service users, ensure public good, and prevent the adverse selection of suppliers, etc., quality control efforts are vital [6][7][8].
Among social services, care services can be seen as a better alternative to facility admission in terms of rights guarantees and accessibility. Accordingly, developed countries, including the United States, provide incentives to expand care services [9,10]. In Korea, the government mainly provides financial support for care services, such as services for postpartum women and infants, home and health (H&H) help, and elderly care, which carry a high degree of public responsibility as a public service model [11,12].
In Korea, since 2010, as in the US, the work of facility evaluation has been entrusted to the Social Security Information Service, a non-profit central organization. A pilot project for the QE of social services was promoted; after the Act on the Use of Social Services and the Management of Social Service Vouchers was enacted in 2011, the QE project for social services started in 2012. As of 2018, more than six QEs of social services had been conducted by the Social Security Information Service (SSIS). For care service projects, excepting regional investment projects, evaluations were performed in 2013 and 2016, and the data related to the evaluation results have accumulated. As of 2019, QEs of social services have been implemented for care services, and work to improve the QE indexes is proceeding along multiple lines. Nevertheless, Korea has no certification system based on meeting certain indicators, unlike the US, and no service quality standards, unlike the UK; it merely assigns grades of A, B, C, and D through a social service QE conducted every three years. It is therefore difficult to guarantee the reliability and validity of the QE, because the indexes vary across evaluation periods [13][14][15].
In the QE of social services, deviation in the results tends to increase with the measurement targets and methods when the service contents are specialized or the service users are relatively vulnerable in terms of access to service information [16,17]. Moreover, service quality differs depending on the characteristics of the provided service types or organizations, and it is difficult to reflect this in evaluations [17,18]. Although the social service QE system has recently entered a period of settlement, the QE of social services remains limited, and there is demand for improvement. There has been no in-depth analysis of the differences between the 2013 and 2016 social service QE results conducted by the Social Security Information Service, or of differences by service type and institution type. In addition, research on the internal consistency and validity of social service QE indexes is lacking; in particular, meta-studies on the QE of social services, especially those centered on care services, are scarce [19]. Therefore, social demand is emerging for improvements that would make the current social service QE practically applicable.
Korean social services are simultaneously market-oriented and public. Although most social service providers belong to the private sector, most of the resources required to provide social services are public funds from the central and local governments. As publicly financed social services account for an overwhelming share, the importance of both the public sector and QE for social services is emphasized. However, the QE of social services is operated as a for-profit model transferred to private institutions. Like the US and the UK, Korea transferred the QE of social services to private institutions, but the government's level of regulation is not as high as theirs. Also, although public institutions oversee social services, a strong national QE system, such as Sweden's, is lacking. Therefore, systematic indexes that enable practical evaluations are judged necessary for the QE of social services.
Thus, in this study, we analyzed the internal consistency and validity of QE indexes based on the results of the two QEs of social services, focusing on care services. Through this, we intended to highlight the problems with the QE indexes of social services, including care services, and to propose directions for improving these indexes. The results of this study serve as a basis for standardizing social service QE indexes and can help establish a strong national-level guideline to ensure the quality of social services. Evaluation criteria designed in this way will yield evaluation indexes that meet global standards and can improve the quality of social services by resolving user complaints and broadening user choice.

Definition of the Variables
Among the QEs of social services implemented under the SSIS since 2012, QEs were conducted in 2013 and 2016 on care services, including services for postpartum women and infants, H&H help, and elderly care.
The quality of care services was evaluated using indexes designed to measure it, and, based on the QE results, an assessment of each care service institution and an improvement plan for institutional operation were derived. Comparing the QE results of 2013 and 2016, institutions that received a field evaluation in 2013 (1st period) outperformed non-evaluated institutions, demonstrating a learning effect from evaluation. Since a majority of projects improved or maintained their social service QE grades, it is necessary to examine the internal consistency of the evaluation results. It is also necessary to analyze differences in evaluation index scoring, as the projects whose grades improved vary by service type.
Meanwhile, a comparative analysis of the social service QE indexes for care services in 2013 and 2016 showed that a total of 14 indexes were consistent: six in institutional operations, five in human resource management, and three in service areas. First, in institutional operations, operational regulations and operating plans were derived as common indexes for the operating system; information protection and information security were common indexes for information management; accounting management and settlement disclosure were common indexes in accounting management; there was no common index in project evaluation and publicity. Second, within human resource management, the recruiting process, labor contracts, and standard compliance were common indexes in the manpower management sector; in the educational system, education time was applied in common. Common indexes did not exist in business control, educational content, and rights guarantees. Third, among service areas, attire management was a common index for the service environment, and the tenure rate was likewise a common index. In establishment of plans, the counseling plan and record management appeared as common indexes; user satisfaction was common in implementation and monitoring. In service linkage and termination, contract termination and document filing were common indexes. Fourth, the field evaluation itself was commonly used in both 2013 and 2016. In view of this, we judged it necessary to analyze the 2013 and 2016 evaluations by matching their weights on the common indexes.
In this study, we derived the improvement directions and priority order for QE indexes of social services by analyzing the internal consistency and validity of the social service QE indexes of care services, which have been conducted twice. The analysis framework is as follows. A paired t-test was conducted to analyze the internal consistency between the QE results of 2013 and 2016. To analyze the validity of the QE indexes of care services from 2013 and 2016, a factor analysis was utilized. To analyze the difference between the QE results by profit type and service type in 2013 and 2016, we utilized an analysis of variance (Figure 1).
The hypotheses for this study are as follows:

Hypothesis 1a.
There will be internal consistency between the QE results of care services in 2013 and those in 2016.

Hypothesis 1b.
There will be internal consistency between the QE indexes of care services by service type in 2013 and those in 2016.

Hypothesis 2.
There will be individual validity in the QE indexes of care services in 2013 and 2016.

Method
Internal consistency analysis is a method of evaluating reliability that divides one measurement tool into two halves, each with the same number of questions, and then evaluates the correlation between the two overall scores [20]. Methods of evaluating internal consistency include the reliability coefficient (Cronbach's α) and Cohen's Kappa coefficient. The reliability coefficient takes a value from 0 to 1; values of 0.7 or higher are considered highly reliable. Cohen's Kappa coefficient measures the agreement between two evaluators; a value of 0.6 or higher can be considered consistent [21]. In this study, preliminary measurement of Cronbach's α and the Kappa coefficient did not yield significant results. Accordingly, a test-retest method of reliability evaluation was used. In other words, internal consistency was evaluated by comparing the consistency of the results at each time point with the average internal consistency rating for each care service type, targeting the 423 institutions that share the same indexes and were evaluation subjects in both 2013 and 2016 [20,22]. Validity indicates whether a particular index sufficiently reflects the actual meaning of the concept under consideration [23]. Methods of measuring validity include content validity, construct validity, and criterion validity [24]; in this study, factor analysis was utilized to verify construct validity. Factor analysis is a method of classifying multiple interrelated variables into a smaller number of common factors. It includes exploratory and confirmatory factor analysis; this study analyzed which common factors the shared 2013 and 2016 evaluation indexes group into [25].
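For illustration, the two reliability coefficients described above can be computed directly from their definitions. The sketch below uses small hypothetical score arrays (none of the data or function names come from the study itself):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance of total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def cohen_kappa(rater1, rater2):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e),
    where p_o is observed and p_e is chance agreement."""
    r1, r2 = np.asarray(rater1), np.asarray(rater2)
    p_o = np.mean(r1 == r2)
    cats = np.union1d(r1, r2)
    p_e = sum(np.mean(r1 == c) * np.mean(r2 == c) for c in cats)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical data: three identical item columns form a perfectly consistent scale.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(cronbach_alpha(np.column_stack([x, x, x])))  # -> 1.0
print(cohen_kappa([0, 0, 1, 1], [0, 0, 1, 0]))     # -> 0.5
```

A scale whose α falls below 0.6, as reported later for the evaluation factors, would fail the acceptability threshold this function operationalizes.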
Therefore, in this study, the following research methods were used to analyze the internal consistency and validity of the QEs of social services for care services. First, we attempted to use the Kappa coefficient to analyze the internal consistency of the 2013 and 2016 QEs of social services, but no significant value was derived; therefore, we conducted a paired t-test on the common QE indexes for 2013 and 2016. If the score of an evaluation index improved in 2016 compared with 2013, we judged that internal consistency existed, attributing the improvement to a learning effect from the previous evaluation indexes.
Second, to analyze internal consistency by service type (postpartum women and infants, H&H help, and elderly care), an ANOVA was performed, with Scheffé's method adopted for post-hoc validation. If there was a difference in averages across service types, internal consistency was considered low, because the reliability of an evaluation index falls if its score varies by service type.
Third, to analyze the validity of the QE indexes of care services, we used factor analysis to verify the validity of the areas of institutional operation, human resource management, service, and field evaluation.
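The first two steps can be sketched with standard statistical routines. The scores below are synthetic placeholders (the study's actual institution-level data are not reproduced here), and scipy is assumed available; a Scheffé post-hoc test would follow a significant F statistic but is not built into scipy and is omitted:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Step 1: paired t-test on the same 423 institutions' common-index scores.
# A positive shift in the 2016 scores mimics the hypothesized learning effect.
scores_2013 = rng.normal(loc=7.0, scale=1.0, size=423)
scores_2016 = scores_2013 + rng.normal(loc=0.5, scale=0.5, size=423)
t_res = stats.ttest_rel(scores_2016, scores_2013)
print(f"paired t = {t_res.statistic:.2f}, p = {t_res.pvalue:.4f}")

# Step 2: one-way ANOVA comparing mean scores across the three service types.
postpartum = rng.normal(loc=6.5, scale=1.0, size=60)
hh_help    = rng.normal(loc=7.2, scale=1.0, size=60)
elderly    = rng.normal(loc=7.3, scale=1.0, size=60)
f_res = stats.f_oneway(postpartum, hh_help, elderly)
print(f"ANOVA F = {f_res.statistic:.2f}, p = {f_res.pvalue:.4f}")
```

Under the study's logic, a significant positive t would indicate a learning effect, while a significant F (scores differing by service type) would indicate low internal consistency of the index.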

Data Collection
In the data collection stage, we utilized the social service QE manual of the Facility Evaluation Department of the SSIS, and the secondary data were quantified based on the analysis results [26,27]. To this end, drawing on information held by the SSIS, we newly coded scores for the indexes commonly used in both the 2013 and 2016 evaluations. To analyze the internal consistency and validity of the 2013 and 2016 QE indexes for care services, the following steps were conducted, centered on the common indexes. For institutional operation, 1 point was commonly applied to operational regulations, operation plan, information protection, information security, accounting management, and settlement disclosure.
For human management, the recruitment process and labor contracts were unified into 1 point; the standard compliance and education time were assigned 1 and 2 points, respectively, in accordance with the mark distribution criteria of 2013.
For the service area, 1 point, 3 points, 1 point, 1 point, 1 point, and 1 point were assigned to attire management, tenure rate, counseling plan, record management, community, and contract termination, respectively, according to the 2013 criteria. In the case of satisfaction, as the difference in the allotment of points was excessively large (1 point in 2013 versus 25 in 2016), only 1 point was commonly applied.
For field evaluations, 6 points were allotted based on the 2013 point distribution criteria. The specific common indexes for QEs of social services are listed in Table A1.
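The point harmonization described above amounts to rescaling each year's raw score onto the common (2013-based) allotment. The sketch below covers only a few indexes with illustrative names; the 25-point 2016 satisfaction maximum is from the text, while the other raw maxima are assumptions:

```python
# Hypothetical recode of raw index scores onto the common 2013-based point scale.
COMMON_POINTS = {                 # common allotment per the 2013 criteria
    "operational_regulations": 1,
    "tenure_rate": 3,
    "satisfaction": 1,
    "field_evaluation": 6,
}
RAW_MAX = {                       # maximum raw score per evaluation year
    2013: {"satisfaction": 1},
    2016: {"satisfaction": 25},   # 2016 allotted 25 points to satisfaction
}

def to_common_scale(index: str, year: int, raw: float) -> float:
    """Rescale a raw score so both years share the common point allotment."""
    max_raw = RAW_MAX[year].get(index, COMMON_POINTS[index])
    return raw * COMMON_POINTS[index] / max_raw

print(to_common_scale("satisfaction", 2016, 25))  # -> 1.0
print(to_common_scale("tenure_rate", 2013, 2))    # -> 2.0
```

Coding both years onto one scale in this way is what makes the paired t-test on common indexes meaningful.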

Status of Evaluation Target Institutions
To evaluate the quality of social services, the evaluation was performed based on data entered into the information system, collected through the facility evaluation information system of the SSIS; the QE of care services was likewise executed by the SSIS. The status of providers is as follows. In 2016, the target institutions for the social service QE totaled 705: 409 elderly care institutions, 202 postpartum women and infants care institutions, and 94 home and health (H&H) help institutions. Gyeonggi-do had the largest numbers of institutions providing elderly care and postpartum women and infants services, with 64 and 53, respectively; Jeollabuk-do had the largest number of institutions providing H&H services, with 14. Because the 2013 evaluation covered all service providers, there were more providers than in 2016, but the national distribution shows a similar tendency. Since this study measures the internal consistency of the indexes that measure social service quality, the number of institutions was not expected to have a significant effect. Elderly care and H&H services are concentrated in large cities and rural areas, and both the absolute population and the proportion of elderly residents are judged to affect this. For postpartum women and infants services, the proportion in the metropolitan area, where 40% of the total population is concentrated, was high (Table 1). Table 1. Distribution status of institutions targeted by the social service evaluation (unit: number (%)).

Internal Consistency between 2013 and 2016
To verify the internal consistency of the common indexes of the 2013 and 2016 QEs of social services, a paired t-test was performed. We analyzed whether the quality of social services was measured consistently by verifying the internal consistency of the indexes that evaluate the service quality of providers. If there was a statistically significant difference between the QE results of 2013 and 2016, it would be difficult to judge the indexes as internally consistent (Table 2).
First, the analysis showed that the number of users and sales differed significantly in the performance area: both increased in 2016 compared with 2013. Most social services in Korea are provided by private institutions, and institutional sales are related to human resource management and institutional operation; an increase in sales therefore appears to have improved service quality and ultimately increased the number of users. Second, in institutional operation, there were significant differences in all evaluation indexes except operation plans, with higher scores in 2016, suggesting a learning effect under the 3-year evaluation cycle. Third, in human resource management, there were significant differences in the recruitment process, standard compliance, and education time, but not in labor contracts; however, the scores for standard compliance and education time were lower in 2016 than in 2013, showing low internal consistency for these indexes. Fourth, in the service area, the 2016 scores improved for attire management, tenure rate, record management, and contract termination, but the scores for satisfaction and community linkage were lower than in 2013. Fifth, the field evaluation scores also improved in 2016 over 2013.
In conclusion, the paired t-test comparison of the 2013 and 2016 QE indexes showed no change in the operation plan (institutional operation), labor contracts (human resource management), or the counseling plan and document filing. Most other evaluation indexes showed a statistically significant increase, attributable to learning effects and the like; however, standard compliance and education time in human resource management, and satisfaction and community linkage in the service area, scored lower in 2016 than in 2013, which lowered the internal consistency of those QE indexes.

Internal Consistency by Service Type
To compare the internal consistency of the evaluation scores for each type of service (postpartum women and infants, H&H help, and elderly care), a one-way analysis of variance was conducted. We examined how the social service evaluation scores, as measured by the QE indexes, differed by service type and which factors affected them, in order to pursue diversity in the evaluation indexes.
First, the mean difference by service type was analyzed based on the QE results of social services in 2013. The results showed that services for postpartum women and infants had a higher number of users, and in the case of sales, services for postpartum women and infants and elderly care were higher than H&H help.
Second, in operational planning, information protection, information security, accounting management, accounting disclosure, etc., in the institutional operations area, the services of H&H help and elderly care had higher evaluation scores than services for postpartum women and infants. Third, in recruitment process, labor contract, standard compliance, etc., in the human resource management area, the evaluation scores of H&H help and elderly care services appeared higher than those of services for postpartum women and infants.
Fourth, in attire management, tenure rate, record management, community, contract termination, education time, etc., in the service area, the evaluation scores of H&H help and elderly care services were statistically higher than those of services for postpartum women and infants. Fifth, in field evaluation, the scores of H&H help and elderly care services also appeared higher than those of services for postpartum women and infants. The inconsistency in evaluation scores across the three service types means that differentiated application of the evaluation indexes should be considered. In particular, for postpartum women and infants services, it is necessary to develop evaluation indexes suited to the service characteristics through adjustment of the evaluation indexes (Table 3).
Next, we compared the internal consistency of each service type (postpartum women and infants, H&H help, and elderly care) in the 2016 QE of social services. The results were as follows. First, the number of service users was higher for postpartum women and infants services; in terms of sales, postpartum women and infants and elderly care services scored higher than H&H help services. Second, in institutional operations, for the evaluation indexes of operational regulations, information security, settlement disclosure, etc., the scores of H&H help and elderly care services were higher than those of postpartum women and infants services. Third, in human resource management, for labor contracts, education time, etc., the scores of postpartum women and infants services were lower than those of H&H help and elderly care. Fourth, in the service area, for satisfaction, community linkage, contract termination, etc., the scores of H&H help and elderly care services were higher than those of postpartum women and infants services. Fifth, in the field evaluation as well, the scores of H&H help and elderly care services were higher than those of postpartum women and infants services. In conclusion, internal consistency by service type was low. Thus, for the 2016 social service QEs as well, the evaluation indexes for postpartum women and infants services need to be specialized, and differentiated evaluation indexes by service type must be developed (Table 3).

Validity of QE Indexes of Social Services
A factor analysis was conducted to verify the validity of the indexes of the QE of social services implemented in both 2013 and 2016. Principal component analysis was selected as the extraction method, and varimax rotation was used. We verified the validity of the evaluation indexes in three areas (institutional operation, human resource management, and the service area), excluding field evaluation.
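For illustration, a factor analysis with varimax rotation can be sketched as follows. The study used principal-component extraction, whereas scikit-learn's `FactorAnalysis` uses maximum-likelihood estimation; it supports the same varimax rotation and serves here only as a stand-in, applied to synthetic data shaped like the study's matrix (423 institutions by 14 common indexes):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Synthetic stand-in for the index-score matrix: 423 institutions x 14 indexes,
# generated from three latent factors plus noise (purely illustrative).
latent = rng.normal(size=(423, 3))
loadings = rng.normal(size=(3, 14))
X = latent @ loadings + 0.5 * rng.normal(size=(423, 14))

# Extract three factors and apply varimax rotation to the loadings.
fa = FactorAnalysis(n_components=3, rotation="varimax", random_state=0)
fa.fit(X)
rotated_loadings = fa.components_  # shape: (3 factors, 14 indexes)
print(rotated_loadings.shape)
```

Construct validity is then judged by whether each index loads cleanly on the factor for its intended area. The KMO and Bartlett sphericity statistics reported below are not in scikit-learn; the third-party factor_analyzer package provides them.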
A factor analysis was first conducted on the 2013 QE indexes; the Kaiser-Meyer-Olkin (KMO) value was 0.807, above the 0.5 threshold, and the p-value of Bartlett's test was 0.001, so factor analysis was considered appropriate. In Factor 1, operational plans, operational regulations, information protection, and information security, which relate to institutional operation, were unified into a single factor that also included evaluation indexes such as the recruitment process in human resource management and community and attire management in the service area. In Factor 2, standard compliance, education time, and labor contracts in human resource management were integrated into a single factor containing accounting management and settlement disclosure in institutional operation. Factor 3 consisted of document filing, contract termination, satisfaction, the counseling plan, and record management in the service area. Thus, we judged that the validity of the 2013 evaluation indexes was low in the areas of institutional operations and human resources (Table 4).
Note: a and b indicate groups with a mean difference; parentheses show standard deviations; the number of samples per service type is 30 or more, which satisfies the normality assumption for the ANOVA. Meanwhile, the reliability coefficient (Cronbach's α) of the evaluation indexes was measured. A Cronbach's α of 0.9 or higher corresponds to high-stakes testing, and 0.7 or higher to low-stakes testing; at least 0.6 is required to be acceptable. Factor 1 was 0.694, Factor 2 was 0.264, and Factor 3 (service area) was 0.478; even Factor 3, which appeared relatively valid, did not reach acceptable reliability.
A factor analysis was then conducted to verify the validity of institutional operation, human resource management, and the service area in the 2016 QE indexes. As the KMO value was 0.813 and the p-value of Bartlett's test was 0.001, factor analysis was considered appropriate. In Factor 1, operational regulations, operational plans, information security, and information protection in the institutional operations area were unified into a single factor that also included education time in human resource management and attire management, contract termination, document filing, and community in the service area. Factor 2 included settlement disclosure in institutional operation; labor contracts and the recruitment process in human resource management; and tenure rate and satisfaction in the service area. Factor 3 comprised record management and the counseling plan in the service area and accounting management in institutional operation. In conclusion, the 2016 QE indexes had lower construct validity than those of 2013; it is thus difficult to distinguish the evaluations of the three areas (Table 4).
Meanwhile, measurement of the reliability coefficient (Cronbach's α) of the evaluation indexes found that Factor 1 was 0.643, Factor 2 was 0.311, and Factor 3 was 0.280. In other words, the 2016 social service QE indexes showed neither validity nor reliability.

Conclusions
In this study, to analyze the internal consistency and validity of the social service QE system, we utilized the evaluations of care services performed in 2013 and 2016. For the research data, we used the QE results for postpartum women and infants, H&H help, and elderly care services, executed by the SSIS in 2013 and 2016, and we selected the indexes commonly applied in both years. For the research method, a paired t-test and an ANOVA were used to verify the internal consistency of the social service QE system, and a factor analysis was used for the validity analysis.
First, as a result of the analysis, Hypothesis 1a was accepted. After comparing the internal consistency of the 2013 and 2016 QE indexes through the paired t-test, most of the evaluation indexes showed a significant increase in their scores, likely due to learning effects. However, standard compliance and education time in human resource management, and satisfaction and community linkage in the service area, had lower scores. Overall, the 2016 social service evaluation scores increased compared with 2013, particularly for the performance and institutional operation variables. Service institutions that received the social service QE in 2013 improved their service quality through supplementation; in other words, the evaluation scores are judged to have increased due to a learning effect among social service institutions.
Second, Hypothesis 1b was rejected. The QE scores by service type for services for postpartum women and infants were lower than those for H&H help and elderly care services. These results arose because regional differences were not reflected when evaluating user satisfaction and because the users' characteristics differed [28].
Third, Hypothesis 2 was rejected. In the 2013 QE index, the factor analysis found only the service area to be valid; the institutional operation and human resource management areas were not valid. The 2016 evaluation indexes were found to be invalid in all three areas: institutional operations, human resource management, and services. From this, we judged that applying the same evaluation indexes in 2013 and 2016, even though the characteristics of the project users differed by service type, reduced the validity of the evaluations.
Based on these results of the internal consistency and validity analyses of the social service QE system, the priorities for improving the social service QE system, centered on care services, are as follows.
First, the QE indexes of social services should be simplified so that they provide a valid evaluation of the actual service, not merely a nominal evaluation for evaluation's sake. Unnecessary evaluation indexes should be removed, and the composition of effective indexes should be discussed [29]. In its scheme research to improve the social service QE system in 2019, the Facility Evaluation Department of the SSIS abolished the settlement disclosure item, integrated the accounting management item, and repealed the document filing index. In addition, provider (manpower) satisfaction was added alongside user satisfaction so that the satisfaction of both consumers and suppliers is now measured [30].
Second, the QE indexes should be diversified and differentiated by service type. This study also found that the evaluation scores for services for postpartum women and infants were markedly lower than those for H&H help and elderly care services. Accordingly, the Facility Evaluation Department of the SSIS intends to diversify the evaluation indexes by adding a visiting counseling management index to the H&H help and elderly care services and a purchasing conversion rate index to the services for postpartum women and infants [22,31].
Third, it is necessary to compose a pool of QE indexes for social services and to introduce a modular approach that sorts them into essential indexes, which contribute to the improvement of social service quality, and optional indexes. Indexes that have contributed to service quality improvement only to some extent should then be excluded from the total set, and new sub-indexes should be added [32].
Fourth, a control system should be put in place for external factors, such as regional characteristics and evaluation team composition, which constrain the fairness of social service QE. In other words, the differences between the characteristics of service users in large cities and those in rural areas should be reflected in the evaluation index, and sufficient training is required to maintain the consistency of evaluation scores across different evaluation team compositions.
Fifth, standardization of QE indexes for social services should be attempted for strategic quality control in the long term. That is, it is necessary to establish a standardization basis for the evaluation indexes based on internal consistency and validity and to restructure the evaluation indexes to meet global standards [33].
This study attempted to provide a direction for the future improvement of the social service QE indexes, based on the social service QE that has been conducted triennially since 2013. However, in 2019, QE was conducted with fewer detailed items at each level in order to reduce the quantitative burden of the evaluation; because of this, there was a limit to finding common indexes across the 2013 and 2016 evaluation indexes. In future studies, verification at an empirical level should be conducted through a comparative analysis of the three rounds of social service QE indexes from 2013, 2016, and 2019. Nevertheless, this study suggested directions for conducting QE for social services more effectively. In particular, the direction for improving the QE system derived from the analysis of the internal consistency and validity of the evaluation indicators can serve as basic data for constructing a fourth social service QE model in the future. In conclusion, the framework offered here can serve as the basis for system development and an operational direction for operationalizing social service QE at the practical level.