3.1. Data and Variables
The DB of firms used in this study is the survey of business activities DB of Statistics Korea. The survey of business activities DB includes comprehensive data from a survey that not only collects financial information, such as sales, number of employees, assets, and liabilities, but also business strategies, such as IT application utilization/non-utilization, strategic alliances between companies, outsourcing, existence of a parent or daughter company, and foreign ownership. The survey of business activities began in 2006 and is conducted every year. Since a complete survey is conducted for firms with more than 50 permanent employees and capital of over KRW 300 million, the proportion of firms that enter or drop out of the DB is low, facilitating the long-term observation of panel data. 
Table 2 shows the descriptive statistics of the survey of business activities for the selected years (monetary values were deflated using the GDP deflator, with 2010 as the base year). In 2013, the total number of firms was 11,572, and each year 5.5% of firms newly entered the DB and 5.3% of firms dropped out. Thus, a relatively stable panel structure was maintained. In 2016, average sales were about KRW 208 billion and the average firm age was about 22 years. Earnings before tax (EBT) were about 4.9% of sales in 2016, and the average number of employees increased from 290 in 2007 to 357 in 2016. About half of all firms conducted R&D, and about half owned patents or utility model rights.
Neirotti and Paolucci [12] argued that the information intensity of production processes, products, and process innovation varied depending on the industrial sector and that, consequently, firm performance due to IT utilization differed between industrial sectors. Consistent with Neirotti and Paolucci [12], all industries were divided into industrial sectors; explanations of each sector are provided in Table 3.
In the survey of 2016, the agriculture, forestry, and fishery sectors and mining sector accounted for 0.3% of all firms, while traditional manufacturing accounted for 21.0%; medium-tech manufacturing, 16.4%; high-tech manufacturing, 11.7%; material service sectors, 25.2%; and information service sectors, 25.4%. Since the agriculture, forestry, and fishery sectors and mining sector were less than 0.5% of all firms, they were excluded when the sectoral analysis was conducted. 
The seven IT applications related to the utilization/non-utilization of e-business systems in the survey of business activities DB were used to examine the impact of a firm’s IT application utilization on firm performance. 
Table 4 provides the definitions for the seven applications. 
Figure 2 shows the proportion of firms using at least one of the IT applications each year and in each industrial sector. From 2006 to 2014, the proportion of firms using IT applications increased in every industrial sector. In 2014, the proportion using IT was high in the high-tech and medium-tech manufacturing sectors, about 76% of firms. This was followed by the traditional manufacturing sector, where about 70% of firms were using IT in 2014. The proportion of firms in service sectors utilizing IT applications was lower: about 57% for the material service sector and about 51% for the information service sector; the agriculture, forestry, and fishery sectors had the lowest IT application utilization. 
Table 5 shows the applications primarily used by IT-utilizing firms in each industrial sector (since some firms used two or more applications, the total exceeded 100%). ERP systems were the most used in the manufacturing sector (among manufacturing firms utilizing IT, 94% had adopted an ERP), and applications related to transactions among businesses, such as B2BI and SCM, were also actively used. In the service sectors, the most utilized application was also ERP (among service sector firms utilizing IT, 78.5% had adopted an ERP), and other applications suited to each sector’s characteristics were also actively used. Technologies related to customer management or internal human resource management were actively used in the information service sector: CRM, HRMS, and LMS were used by 27.4%, 19.9%, and 5.5% of firms, respectively. Furthermore, 14.2% of firms in the information service sector used KMS to manage the firm’s accumulated knowledge. In the material service sector, CRM and HRMS were also important: 20.3% of firms were using CRM and 15.3% were using HRMS.
3.2. Definition of IT-Starting Firms and Non-IT Firms
To examine the changes that follow the initial adoption of IT applications, the present study defined IT-starting firms and non-IT firms as follows. First, companies that reported using no IT applications were selected from the firm panel database. These companies were then divided into those that began to use at least one IT application in the following year and those that did not, and the performance of the two groups was compared.
That is, in the actual analysis, the performance of firms that began utilizing IT was compared with that of firms that continued not to utilize IT. Among the firms not using any of the seven IT applications (explained in Table 4) in period t − 1, those that started using at least one IT application in period t were labeled “IT-starting firms,” and those that still did not utilize any IT application in period t were labeled “non-IT firms.” These labels are used throughout this study.
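The labeling rule above can be sketched in a few lines of Python. This is only an illustration: the panel layout (a dictionary keyed by firm and year), the function name `label_firm`, and the sample firms are hypothetical and are not taken from the survey DB.

```python
from typing import Dict, Optional, Tuple

# Hypothetical panel: (firm_id, year) -> True if the firm reports using
# at least one of the seven IT applications in that year.
Panel = Dict[Tuple[str, int], bool]


def label_firm(panel: Panel, firm_id: str, t: int) -> Optional[str]:
    """Return 'IT-starting', 'non-IT', or None for treatment year t."""
    before = panel.get((firm_id, t - 1))
    now = panel.get((firm_id, t))
    # Only firms observed in both years with no IT use in t - 1 are eligible.
    if before is None or now is None or before:
        return None
    return "IT-starting" if now else "non-IT"


panel = {
    ("A", 2009): False, ("A", 2010): True,   # adopts IT in 2010
    ("B", 2009): False, ("B", 2010): False,  # still no IT in 2010
    ("C", 2009): True,  ("C", 2010): True,   # already used IT: excluded
}
labels = {firm: label_firm(panel, firm, 2010) for firm in ("A", "B", "C")}
print(labels)
```

Firms such as "C", which already used IT in t − 1, receive no label and drop out of the comparison, which is what keeps the two groups comparable at the initial stage.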
The rationale for this analysis strategy is as follows. The IT adoption whose effectiveness the study investigates must be clearly defined. Companies continuously invest in and use IT applications as a type of management activity [31]. Even when the same type of IT application is used, the degree of usage and the technological level can differ across companies. Therefore, when examining the effects of IT applications across the entire firm list, it is difficult to discern whether the applications are being implemented for the first time or have been in use for years, whether their technological level is high or low, and whether the firm’s IT utilization level is increasing or decreasing. Moreover, the meaning of year-by-year changes in performance after IT adoption becomes vague. Therefore, instead of analyzing the entire firm list, the current study examined the impact of IT applications after adoption by companies that initially did not use them. This approach enables a comparison between IT-starting firms and non-IT firms because no company in either group uses IT applications at the initial stage.
During the analysis period, 83.3% of IT-starting firms began using IT applications by introducing ERP systems. In addition, the shares of companies that adopted B2BI, CRM, HRMS, LMS, SCM, and KMS as their first IT application were 13.0%, 9.2%, 8.7%, 5.4%, 4.2%, and 2.0%, respectively. (Since some firms began using two or more applications, the total exceeded 100%.)
A total of 80.2% of IT-starting firms initiated their IT utilization by adopting a single IT application, 15.8% began with two, and 4.0% introduced three or more.
T-tests were performed to compare the characteristics of IT-starting firms and non-IT firms before the adoption of IT (year t − 1) in each industrial sector (pooling t = 2010, 2011, and 2012); the results are shown in Table 6. First, comparing the firm populations in Table 6 and Table 2, the companies with no IT utilization in Table 6 are much smaller than the full set of companies in Table 2: they have fewer than 200 employees on average and thus fall into the SME category. Meanwhile, Table 6 shows that, although there were differences between the industrial sectors, IT-starting firms had larger sales and more employees than non-IT firms, as well as higher labor productivity, higher profitability, a lower debt ratio, and a higher patent application rate due to more R&D. Furthermore, in many cases IT-starting firms had a sister or parent company and provided various incentives to employees. That is, firms that were competitive in size and other financial indices started using IT [24,44].
Thus, the characteristics of IT-starting firms and non-IT firms differed even before the use of IT, and evaluating the performance of IT utilization through a simple comparison between the two groups would suffer from selection bias. The PSM methodology used in this study is known to be appropriate for addressing such selection bias [45,46].
3.3. Methodologies
In this section, we introduce propensity score matching, which was adopted to measure the effect of IT utilization. The effect of a certain treatment (such as IT application adoption in this study) on a firm can be measured as the value added created by receiving the treatment. It can be framed as the question “What would have happened to those firms that did, in fact, receive the treatment if they had not received it?” Hence, a mere comparison between a treated firm group and an untreated firm group cannot identify the exact additional effect of the treatment, since their characteristics before receiving the treatment were already different (which is generally referred to as selection bias). Modern evaluation methods focus on estimating this counterfactual [47]. Given the counterfactual problem, the most appropriate measure of the effectiveness of a treatment on a firm might be a comparison of the performance of two firms with the same characteristics, where one received the treatment and the other did not. However, it is hard to find an appropriate comparison group to represent the untreated firms so that the effect of the treatment can be evaluated. We apply the propensity score matching (PSM hereinafter) methodology in this study, which allows us to construct a comparison group by matching firms from the population of untreated firms on the basis of their propensity scores. We expect this approach to address the selection bias problem and to enable a comparison of the factual and counterfactual outcomes in order to estimate the effect of IT adoption.
The PSM methodology was first introduced by Rosenbaum and Rubin [48]. PSM requires that the conditional independence assumption (CIA) be fulfilled. This means that, conditional on the observable characteristics (X variables) of firms, the decision to receive the treatment should be independent of the potential outcome measures. The CIA can be written as follows:

$$(Y_1, Y_0) \perp D \mid X \tag{1}$$

where $\perp$ denotes independence, $Y_1$ indicates the potential outcome of the treated firm, and $Y_0$ denotes the potential outcome of the untreated firm. $D$ is an indicator variable denoting receipt of the treatment. If the CIA is satisfied and a sufficient number of observable variables related to the characteristics of the participants exist, it is theoretically possible to obtain an unbiased estimate of the treatment effect. A propensity score indicates the conditional probability that a firm receives the treatment given the observable characteristics (covariates) of participating firms. In other words,

$$P(X) = \Pr(D = 1 \mid X) \tag{2}$$

Rosenbaum and Rubin [48] proved that under the CIA, with the propensity score defined as in Equation (2), all biases due to observable variables can be removed by conditioning solely on the propensity score.

$$(Y_1, Y_0) \perp D \mid P(X) \tag{3}$$

Based on Equation (3), for a population of firms denoted by i, we can define the effect of a treatment, which is expressed as the difference between real and counterfactual outcomes, as the average effect of the treatment on the treated (ATT) as follows:

$$ATT = E[Y_{1i} - Y_{0i} \mid D_i = 1] = E\big\{ E[Y_{1i} \mid D_i = 1, P(X_i)] - E[Y_{0i} \mid D_i = 0, P(X_i)] \;\big|\; D_i = 1 \big\} \tag{4}$$

where the outer expectation is taken over the distribution of $P(X_i)$ in the population of treated firms, $D_i = 1$. ATT is the difference in average outcomes of the treated and untreated firms, where the untreated firm group is formed by matching units based on the propensity score.
The propensity score (PS) is normally obtained through logistic regression, which uses as covariates various observable firm variables that help predict treatment assignment. In recent PS estimation, the role of the PS in balancing covariates has been emphasized. The covariate balancing condition stipulates that, using the PS as a weight, there should be no significant difference in covariate values between the matched treated firm group and the untreated firm group. In this study, we use the covariate balancing propensity score (CBPS) methodology recently developed by Imai and Ratkovic [46]. In the CBPS methodology, the balance condition is incorporated during PS estimation and parameter values are estimated to achieve balance.

According to Imai and Ratkovic [46], using logistic regression, the propensity score, written $\pi_\beta(X_i)$ to make its dependence on the parameter $\beta$ explicit, is expressed as follows.

$$\pi_\beta(X_i) = \Pr(D_i = 1 \mid X_i) = \frac{\exp(X_i^\top \beta)}{1 + \exp(X_i^\top \beta)} \tag{5}$$

Maximizing the log-likelihood function implied by Equation (5), the parameter $\beta$ is estimated as follows.

$$\hat{\beta}_{MLE} = \arg\max_{\beta} \sum_{i=1}^{N} \big\{ D_i \log \pi_\beta(X_i) + (1 - D_i) \log[1 - \pi_\beta(X_i)] \big\} \tag{6}$$

In Equation (6), if $\pi_\beta(X_i)$ is twice continuously differentiable in $\beta$, the following first-order condition can be obtained.

$$\frac{1}{N} \sum_{i=1}^{N} \left\{ \frac{D_i \, \pi'_\beta(X_i)}{\pi_\beta(X_i)} - \frac{(1 - D_i) \, \pi'_\beta(X_i)}{1 - \pi_\beta(X_i)} \right\} = 0 \tag{7}$$

Here, $\pi'_\beta(X_i) = \partial \pi_\beta(X_i) / \partial \beta^\top$. Equation (7) can also be interpreted as a balancing condition for $\pi'_\beta(X_i)$, which is a function of covariates. Equation (7) can be rewritten as follows:

$$\frac{1}{N} \sum_{i=1}^{N} \frac{D_i \, \pi'_\beta(X_i)}{\pi_\beta(X_i)} = \frac{1}{N} \sum_{i=1}^{N} \frac{(1 - D_i) \, \pi'_\beta(X_i)}{1 - \pi_\beta(X_i)} \tag{8}$$

In Equation (8), the left-hand side is the weighted average (with a weight of $1/\pi_\beta(X_i)$) of $\pi'_\beta(X_i)$ for the treated firm group, and the right-hand side is the weighted average (with a weight of $1/[1 - \pi_\beta(X_i)]$) for the untreated firm group.

Meanwhile, the property for covariate balancing can be expressed as follows using the concept of inverse propensity score weighting.

$$E\left\{ \frac{D_i \, \tilde{X}_i}{\pi_\beta(X_i)} - \frac{(1 - D_i) \, \tilde{X}_i}{1 - \pi_\beta(X_i)} \right\} = 0 \tag{9}$$

Here, $\tilde{X}_i = f(X_i)$ is a function of the covariate $X_i$. Equation (7) is then the special case of Equation (9) in which $\tilde{X}_i = \pi'_\beta(X_i)$.

If the ATT is to be obtained, then it is recommended that the weighted covariate distribution of the untreated firm group be equal to that of the treated firm group. In this case, covariate balancing can be expressed as in Equation (10). Equation (10) can be interpreted as the average difference of a function of the covariates after standardized mortality ratio weighting.

$$E\left\{ D_i \, \tilde{X}_i - \frac{\pi_\beta(X_i) \, (1 - D_i) \, \tilde{X}_i}{1 - \pi_\beta(X_i)} \right\} = 0 \tag{10}$$

The CBPS method uses the generalized method of moments with an iterative computational procedure to find estimates of $\beta$ that jointly optimize the likelihood condition (i.e., the specified likelihood function) and the balance condition (i.e., the specified balance function). Finally, we estimate the PS so that each covariate is balanced between the treated firm group and the untreated firm group.
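To make the ATT balance condition concrete, the following minimal Python sketch evaluates its sample analogue for a single covariate: treated firms receive a weight of 1 and untreated firms receive the odds weight PS/(1 − PS). It does not estimate the CBPS itself. The function name, the division by the number of treated firms, and the toy data are illustrative assumptions.

```python
from typing import Sequence


def att_balance_moment(x: Sequence[float], d: Sequence[int],
                       ps: Sequence[float]) -> float:
    """Sample analogue of the ATT balancing moment for one covariate.

    Treated firms (d = 1) contribute x; untreated firms (d = 0) contribute
    -x weighted by the odds ps / (1 - ps). The sum is divided by the number
    of treated firms (an assumed normalization, for readability only).
    """
    n_treated = sum(d)
    total = 0.0
    for xi, di, pi in zip(x, d, ps):
        if di == 1:
            total += xi
        else:
            total -= pi / (1.0 - pi) * xi
    return total / n_treated


# Toy data: one treated firm and four untreated firms (hypothetical values).
d = [1, 0, 0, 0, 0]
x = [2.5, 1.0, 2.0, 3.0, 4.0]

balanced = att_balance_moment(x, d, [0.2] * 5)    # odds 0.25 per control
unbalanced = att_balance_moment(x, d, [0.5] * 5)  # odds 1.0 per control
print(balanced, unbalanced)
```

With a PS of 0.2, the four controls carry a total weight of 1 and their weighted covariate sum equals the treated value, so the moment is (numerically) zero. With a PS of 0.5 the controls are over-weighted and the moment is far from zero; this is the kind of imbalance that the CBPS estimation is designed to remove.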
However, estimating the propensity score alone is not sufficient to estimate the ATT of interest, because the probability of observing two firms with exactly the same PS value is in principle zero, since P(X) is a continuous variable. Various matching methods have been proposed in the literature (see Caliendo [14] for a review), and we applied the recently developed radius matching method with bias adjustment suggested by Huber et al. [45]. When matching on the PS, radius matching pairs a treated firm with all untreated firms within a certain radius. Unlike 1:1 pair matching, radius matching allows one treated firm to be matched with multiple untreated firms, which lowers the variance of the estimated ATT and increases the efficiency of the estimator.
In this analysis, treatment was defined as newly adopting IT applications. The analysis was conducted by pooling IT-starting firms and non-IT firms for 2010, 2011, and 2012 (since the matching procedure requires two years of data before the adoption of IT applications, and we observed outcomes up to four years after the adoption, we used the firm DB from 2008 to 2016). Pooling was used for two reasons. First, it increases the number of observations and thereby the precision of the estimates. Second, there were no major policy or macroeconomic changes in the pooled years, leading us to believe that there would not be large differences in the treatment impact across the years. The pooled data were divided into five industry types according to the industry classifications in Table 3, and the CBPS methodology was applied to each industry to estimate the PS. Table 6 shows the number of IT-starting firms and non-IT firms in the pooled data by industry: 292 and 2169, respectively, in traditional manufacturing; 186 and 1198 in medium-tech manufacturing; 145 and 902 in high-tech manufacturing; 310 and 3332 in material service; and 228 and 3811 in information service.
In PS estimation, Li [49] and Steiner et al. [50] recommend including as covariates not only the variables related to treatment but also the many variables related to a firm’s potential outcome. In addition, when using company data for matching, McKenzie [51] recommends using not only data from the year prior to the treatment year but also data from as many earlier years as possible. Therefore, in this study, numerous variables related to a firm’s IT adoption and potential outcome were used as covariates in matching, and growth rates before IT adoption (comparing 1 year and 2 years before IT adoption) were also included. The covariates were selected based on previous literature [25,32,41,52]. Table 7 presents a list of the covariates used in the analysis. Using the estimated PS, the covariate balance between the IT-starting group and the non-IT group was examined using the standardized mean difference method proposed in Greifer [53]. Since the difference between the groups was below the threshold of 0.1 for all covariates, the covariate balance was judged to be appropriate. Additional diagnostic plots and statistics on the matching used in the study are summarized in Appendix A.
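The balance check against the 0.1 threshold can be illustrated with a short Python sketch of a standardized mean difference for one covariate. Scaling by the standard deviation of the treated group is an assumption made here because the target is the ATT; the function names, data, and weights are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def smd_att(x_treated: Sequence[float], x_control: Sequence[float],
            w_control: Sequence[float]) -> float:
    """Standardized mean difference for one covariate.

    The difference between the treated mean and the weighted control mean
    is divided by the treated-group standard deviation (assumed scaling).
    """
    return (mean(x_treated) - weighted_mean(x_control, w_control)) / stdev(x_treated)


x_treated = [10.0, 12.0, 14.0]
x_control = [8.0, 11.0, 13.0]

raw = smd_att(x_treated, x_control, [1.0, 1.0, 1.0])      # before weighting
matched = smd_att(x_treated, x_control, [1.0, 2.0, 5.0])  # hypothetical weights
print(round(raw, 3), round(matched, 4), abs(matched) < 0.1)
```

In this toy case the unweighted difference is well above 0.1, while the weighted difference falls below it, so the covariate would be judged balanced after weighting.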
Using the obtained PS, the radius matching employed in Huber et al. [45] was used to obtain the industry-specific ATT. When matching on the PS, IT-starting firms and non-IT firms were required to match within the same year. The radius was set to three times the 90% quantile of the PS distance distribution (calculated from 1:1 pair matching of each IT-starting firm to the non-IT firm population). When computing the local mean outcome, non-IT firms within the radius were weighted in proportion to the inverse of their distance to the IT-starting firm to which they were matched. The weights obtained from the matching were then used in a weighted linear regression to remove biases due to mismatches (see Huber et al. [45] for more detail on the matching protocol).
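The matching steps described above can be sketched as follows. This is a simplified illustration and not the protocol of Huber et al. [45]: the bias-adjusting weighted regression is omitted, the quantile rule, the fallback to the nearest neighbor when no control lies within the radius, and the small constant guarding against zero distances are all assumptions, and every treated firm is assumed to have at least one same-year control.

```python
import math
from typing import List, Tuple

Firm = Tuple[int, float, float]  # (year, propensity score, outcome)


def quantile(sorted_vals: List[float], q: float) -> float:
    """Simple empirical quantile (assumed definition)."""
    idx = max(0, math.ceil(q * len(sorted_vals)) - 1)
    return sorted_vals[idx]


def radius_att(treated: List[Firm], controls: List[Firm], q: float = 0.9,
               mult: float = 3.0, eps: float = 1e-12) -> Tuple[float, float]:
    """Return (ATT estimate, radius) from same-year radius matching on the PS."""
    def pool(t: Firm) -> List[Firm]:
        # Exact matching on year: only controls from the same year qualify.
        return [c for c in controls if c[0] == t[0]]

    # Distances from 1:1 nearest-neighbor matching define the radius.
    nn = sorted(min(abs(t[1] - c[1]) for c in pool(t)) for t in treated)
    radius = mult * quantile(nn, q)

    effects = []
    for t in treated:
        cands = [(abs(t[1] - c[1]), c[2]) for c in pool(t)]
        inside = [dc for dc in cands if dc[0] <= radius]
        if not inside:  # assumed fallback: use the nearest neighbor
            inside = [min(cands)]
        # Weights proportional to the inverse PS distance.
        weights = [1.0 / (dist + eps) for dist, _ in inside]
        local_mean = sum(w * y for w, (_, y) in zip(weights, inside)) / sum(weights)
        effects.append(t[2] - local_mean)
    return sum(effects) / len(effects), radius


# Hypothetical firms: (year, PS, outcome).
treated = [(2010, 0.30, 5.0), (2011, 0.50, 7.0)]
controls = [(2010, 0.28, 3.0), (2010, 0.34, 4.0), (2010, 0.90, 100.0),
            (2011, 0.49, 6.0), (2011, 0.80, 0.0)]
att, radius = radius_att(treated, controls)
print(round(att, 4), round(radius, 4))
```

In the toy data the 2010 control with a PS of 0.90 lies outside the radius and is ignored, while the two nearby controls are combined with weights of roughly 2:1 in favor of the closer one.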
We observed differences between the IT-starting firm group and the non-IT firm group in the growth of sales, labor productivity (value added per employee), profitability (earnings before tax per sales), number of employees, numbers of male and female employees, wage level (total labor cost per employee), and exports. For each of years t + 1 through t + 4, the performance indicator was defined as the difference between firm performance in that year and firm performance in year t − 1, and the ATT of firms utilizing IT relative to firms that did not was estimated on this difference.
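As a small illustration of how such a performance indicator could be built for one firm, the Python sketch below takes the difference between each post-adoption year and year t − 1. Whether the study uses raw or log differences is not stated here, so the raw difference is an assumption, and the function name and sales figures are hypothetical.

```python
from typing import Dict


def outcome_changes(perf: Dict[int, float], t: int) -> Dict[int, float]:
    """Change in a performance measure from year t - 1 to years t + 1 .. t + 4."""
    base = perf[t - 1]
    # Keys are the horizons k = 1..4; years missing from the panel are skipped.
    return {k: perf[t + k] - base for k in range(1, 5) if (t + k) in perf}


# Hypothetical sales series (arbitrary units) for a firm adopting IT in 2010.
sales = {2009: 100.0, 2010: 104.0, 2011: 110.0,
         2012: 118.0, 2013: 121.0, 2014: 130.0}
changes = outcome_changes(sales, 2010)
print(changes)  # {1: 10.0, 2: 18.0, 3: 21.0, 4: 30.0}
```

Computing the same differences for the matched non-IT firms and comparing the two sets, horizon by horizon, gives the year-specific ATT described above.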