Sustainable Technology Analysis of Blockchain Using Generalized Additive Modeling

Blockchain is a secure distributed management technology for data. Until now, blockchain technology has been intensively developed in financial applications such as Bitcoin. As blockchain technology develops, its application fields are expected to expand further. We propose a technology analysis method for the sustainability of blockchain technology. For this sustainable technology analysis, we analyzed patent documents related to blockchain. To carry out the analysis, we preprocessed the patent documents and built structured data in the form of a document-term matrix. In general, most elements of this matrix are zeros, so its distribution is very skewed. Because of this skewness, traditional statistical methods have difficulty with technology analysis. To overcome this problem, we propose a technology analysis method based on generalized additive modeling. To show how the proposed method can be applied in practice, we collected and analyzed patent documents on blockchain technology.


Introduction
Since technology has various effects on many areas of our society, we need to understand and cope with technology correctly. Technology changes society, and society calls for new developments that can improve the quality of human life [1]. Much research into technology analysis has been performed in various areas [2][3][4][5]. Choi et al. (2016) introduced a patent analysis method for sustainable technology management and used the results of patent analysis for management of technology (MOT) with sustainability. Park and Jun (2017) selected three-dimensional printing as the target of their technology analysis. They also considered technological sustainability as the main goal of their analysis because sustainable technology is important to the competitiveness of companies in the market. Kim et al. (2017) carried out a technology analysis of Apple according to its products, such as the iPhone, iPad, and iPod, and showed an approach to product-based technology analysis in MOT. Kim et al. (2019) combined a statistical method and a machine learning algorithm to analyze sustainable technology.

In most previous studies of technology analysis, patent documents were used for quantitative analysis. The researchers searched for patent documents related to the target technology and transformed the collected documents into structured data for analysis by statistics and machine learning algorithms. This is because most analysis methods based on statistics and machine learning require structured data in the form of a matrix. Each row of this matrix represents a patent document and each column represents a term. Each element represents the frequency with which the term occurs in the patent document. In general, this matrix is very sparse because many of its elements are zero [6][7][8]. Jun et al. (2014) tried to overcome the sparsity by dimension reduction and a support vector machine. They reduced the column size of the document-term matrix using principal component analysis and addressed the sparseness by combining highly correlated columns. However, dimension reduction loses information from the data set, which can degrade the performance of the analysis model. Jun (2018) and Uhm et al. (2020) studied Bayesian approaches to the sparseness problem of the patent document-term matrix. In addition, data with the zero-sparsity problem have a skewed distribution, with most values concentrated at zero and a long tail.

In this research, we propose a technology analysis method using generalized additive modeling to address the sparsity and skewed distribution problems, and we apply the proposed method to sustainable technology analysis of blockchain. The generalized additive model (GAM) uses diverse smoothers, such as regression splines, to deal with sparseness and skewness [9][10][11][12]. Our aim is to find sustainable technologies in the blockchain field with an objective and quantitative technology analysis method rather than the existing subjective methods. To this end, after collecting and preprocessing patent documents related to blockchain, we performed sustainable technology analysis using statistical methods based on the generalized additive model.

The remainder of this paper is organized as follows. In Section 2, we explain the concept of blockchain technology related to our study. In Section 3, we present the proposed method of sustainable technology analysis of blockchain using generalized additive modeling. The next section provides the results of our case study for blockchain technology analysis. In the discussion section, we present the motivation and subject of our research, the importance and contribution of the results, and the limitations. Lastly, we conclude and describe our future work related to sustainable technology analysis of blockchain.

Blockchain Technology
With the introduction of bitcoin in 2009, interest has focused on digital currency and cryptocurrency. Central banks in various countries have been discussing changes in monetary policy, such as supplementing the payment system using bitcoin. Along with this, interest in blockchain, the underlying technology that supports the use of bitcoin in the Internet environment, has increased significantly. Blockchain technology, which started as the underlying technology of bitcoin, shows the possibility of unique applications in various fields beyond bitcoin. Figure 1 shows the web search result of Google Trends related to blockchain technology [13]. Each number on the Y-axis in Figure 1 is the search volume for a specific term expressed relative to its peak: the highest point is set at 100 and the remaining values are expressed relative to it. As shown in Figure 1, interest in blockchain exploded between 2017 and 2018, and various companies, mainly financial companies, have been developing blockchain technology. In addition, blockchain technology has rapidly emerged in the startup market, and venture capitalists are also investing a lot of capital in blockchain-related startups. Table 1 shows various definitions of blockchain from previous research [14][15][16].
From the various studies related to blockchain so far, we define blockchain as a secure distributed management technology for data, that is, a technology that shares and manages a large volume of total transaction information (a ledger), in which block-level information is connected and stored in a chain [17][18][19][20][21]. Until now, blockchain technology has been used in diverse forms in various fields [25]. We carried out a technology analysis of blockchain and tried to find its technological sustainability.

Additive Modeling
GAM provides a general form of extending the linear model by allowing nonlinear functions of each variable while maintaining additivity [26]. Like the linear model, GAM can be applied to both categorical and continuous response variables, so it can be used for both regression and classification problems. First, for the regression task, we consider multiple linear regression as follows.

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \varepsilon \qquad (1)$$

where $\beta_0$ and $\beta_i$ are the intercept and the regression coefficient of $x_i$, respectively, and $\varepsilon$ is the error term. In addition, $n$ and $k$ are the numbers of data points and explanatory variables, respectively. In GAM, each linear term $\beta_i x_i$ of this model is replaced by a nonlinear function $g_i(\cdot)$ as follows [26].

$$y = \beta_0 + \sum_{i=1}^{k} g_i(x_i) + \varepsilon \qquad (2)$$

Using this model, we can overcome the limitations of the linear regression model. Second, the following equation represents a logistic model for binary classification tasks.

$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \sum_{i=1}^{k} \beta_i x_i \qquad (3)$$

where $p = P(y = 1)$, and $y = 1$ means that an event of interest has occurred. As in Equation (2), we transform this model into the following nonlinear classification model.

$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \sum_{i=1}^{k} g_i(x_i) \qquad (4)$$

This model is also an efficient approach to various classification tasks. In this paper, we apply GAM to regression tasks of patent keyword analysis for sustainable technology analysis.
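The difference between the linear and additive forms described above can be illustrated with a minimal Python sketch. The function names, the coefficient values, and the two smooth functions (a square and a square root) are hypothetical choices made only for illustration; they are not fitted to any data in this paper.

```python
import math

def linear_predictor(x, beta0, betas):
    """Linear form without the error term: beta0 + sum_i beta_i * x_i."""
    return beta0 + sum(b * xi for b, xi in zip(betas, x))

def additive_predictor(x, beta0, smooths):
    """Additive form without the error term: beta0 + sum_i g_i(x_i)."""
    return beta0 + sum(g(xi) for g, xi in zip(smooths, x))

def inverse_logit(eta):
    """Map a log-odds value back to a probability p = P(y = 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical inputs and parameters, chosen only to make the arithmetic easy to follow.
x = [1.0, 4.0]
linear_value = linear_predictor(x, beta0=0.1, betas=[0.5, -0.25])
additive_value = additive_predictor(x, beta0=0.1, smooths=[lambda v: v ** 2, math.sqrt])
probability = inverse_logit(additive_value)
```

In both cases the contributions of the variables are simply summed; only the shape of each contribution changes, which is what is meant by maintaining additivity.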

Generalized Additive Modeling for Sustainable Technology Analysis of Blockchain
With the introduction of blockchain technology more than 10 years ago, it is now time for us to prepare for the blockchain era. Until now, studies on blockchain technology fell into one of two categories. The first category was studies on the social ramifications of blockchain technology. The second category was research on the underlying technology that can efficiently apply blockchain to the financial field. However, research on the methodology to objectively analyze the developed blockchain technology and use the results to confirm the sustainability of this technology in the future has not been conducted so far. Sustainable technology analysis of blockchain enables effective contribution to research and development (R&D) planning and technology management of blockchain technology in the future. In this paper, we studied generalized additive modeling for sustainable technology analysis of blockchain using text mining, statistical analysis, and visualization. The goal of this paper was to study solutions to the following two research questions (RQs): (RQ.1) How can we find the technological structure and relationships for sustainability of blockchain technology? (RQ.2) How can we solve the skewed and sparse problems that occur in the preprocessing of patent big data?
In order to answer RQ.1, we surveyed previous research related to sustainability in technology areas and the relations between various technologies. Sott et al. (2020) used network analysis and machine learning algorithms to find emerging technologies in the coffee sector [27]. Several studies have also been introduced on how to identify or evaluate the sustainability of a technology [28][29][30][31][32]. In addition, Schinckus (2020) surveyed the sustainability of blockchain from a management and financial point of view [33]; that paper described good and bad aspects of blockchain technology in societal issues. In our research, we proposed a new approach to finding the sustainability of blockchain technology: we tried to understand it through an objective analysis of the technology itself. Therefore, for RQ.1, we searched for patent documents related to blockchain technology and analyzed the patent data. Using the results of the blockchain patent data analysis, we constructed a technology diagram of blockchain for understanding the sustainability of blockchain technology. For patent-based technology analysis, we first had to build structured data by preprocessing. In this process, we encountered a skewed, sparse structure in the data, that is, a matrix with patents as rows and keywords as columns. To answer RQ.2, we used a method based on advanced statistical analysis called the generalized additive model. By addressing the two RQs, we proposed a new method for sustainable technology analysis of blockchain.
For technology analysis, we had to prepare patent or paper documents related to the target technology. In this paper, we collected patent documents related to blockchain technology from worldwide patent databases and used them for sustainable technology analysis of blockchain. First, we preprocessed the document data using text mining methods, because statistical and machine learning algorithms require structured data. From the preprocessing result, we built structured data in the form of a matrix whose rows are patent documents and whose columns are keywords. Each element of the matrix is the frequency with which a keyword occurs in a patent document. In general, this matrix is very sparse because most elements are zeros, so the distributions of keyword frequencies are concentrated at zero and skewed. However, most previous research on patent data analysis applied statistical methods without considering this skewed distribution, which degrades model performance in statistical technology analysis. To solve this problem, we proposed a technology analysis model using generalized additive modeling, with a focus on patent analysis for the sustainability of blockchain technology. Given a dependent variable (keyword) $y$ and an independent vector (keyword vector) $(x_1, x_2, \ldots, x_k)$, the generalized additive model is expressed in Equation (5) [9,11].

$$y = \beta_0 + \sum_{j=1}^{k} f_j(x_j) + \varepsilon \qquad (5)$$

where $f_j(x_j)$ is an unspecified nonparametric function such as a smoothing spline, and the additive predictor is related to $y$ through a link function under a response distribution such as the Poisson or negative binomial [10,34]. In addition, $\beta_0$ and $\varepsilon$ are the bias (or intercept) and error of the statistical model, respectively. In generalized additive modeling, we compute $f_j(x_j)$ for each $x_j$ and then sum $f_j(x_j)$ over $j = 1, 2, \ldots, k$.
So, the generalized additive model (GAM) is an extended version of the generalized linear model (GLM) in Equation (6).

$$y = \beta_0 + \sum_{j=1}^{k} \beta_j x_j + \varepsilon \qquad (6)$$

Using the smoother $f_j(\cdot)$ in Equation (5), we can overcome the sparsity and skewness problem in the patent-keyword matrix. In this paper, we used a regression spline as the smoother, which we represent in Equation (7).

$$f_j(x_j) = \sum_{m} \beta_{jm} \, \varphi_m(x_j) \qquad (7)$$

where $\varphi_m(x)$ and $\beta_{jm}$ are the basis functions and model parameters, respectively. To fit the model parameters, we minimized the following objective in Equation (8).
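For reference, a standard penalized least-squares objective for an additive model can be written as below. This is a textbook form that assumes a squared-error loss and a second-derivative roughness penalty; the objective actually minimized by a given fitting package (for example, a penalized likelihood for count data) may differ. The indexing $x_{ij}$, meaning the value of keyword $j$ in patent $i$, is an assumption introduced here.

```latex
\min_{\beta_0,\, f_1, \ldots, f_k} \;
\sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{k} f_j(x_{ij}) \Bigr)^{2}
\; + \; \sum_{j=1}^{k} \lambda_j \int f_j''(t)^{2} \, dt
```

The first term rewards fidelity to the observed keyword frequencies, and the second term penalizes the wiggliness of each smooth function separately, so a larger $\lambda_j$ pushes $f_j$ toward a straight line.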
In Equation (8), $\lambda_j$ represents the regularization strength for $f_j(\cdot)$, and the GAM is fitted by minimizing this objective function. In our research, $y$ was the keyword blockchain and $(x_1, x_2, \ldots, x_k)$ were the explanatory keywords that affect blockchain. So, we built a GAM to explain the technological trend of blockchain based on the influences of other technological keywords as follows.

$$\text{blockchain} = \beta_0 + \sum_{j=1}^{k} f_j(\text{keyword}_j) + \varepsilon \qquad (9)$$

In Equation (9), we used all patent keywords except blockchain as independent variables. Each keyword represents a detailed sub-technology of blockchain. For example, the keyword blockchain represents the blockchain technology itself, and the keyword bitcoin represents the detailed technology related to bitcoin. We present the proposed method step by step as follows.
Step 1. Collecting patent documents related to blockchain technology from patent databases.
Step 2. Preprocessing collected patent documents using text mining techniques.
Step 3. Selecting significant keywords that affect technological development of blockchain using GAM.
Step 4. Performing trend analysis of significant keywords for blockchain technology using regression plotting.
Step 5. Building a technology diagram for understanding sustainability of blockchain technology.
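The structured data produced in Step 2 can be illustrated with a minimal Python sketch that builds a document-term matrix and measures how sparse it is. The study itself used R and its text mining package; the three toy documents, the simple lowercase tokenizer, and the function names below are assumptions made only for illustration.

```python
import re
from collections import Counter

def build_document_term_matrix(documents):
    """Return (vocabulary, matrix), where matrix[i][j] counts term j in document i."""
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts.get(term, 0) for term in vocabulary])
    return vocabulary, matrix

def sparsity(matrix):
    """Fraction of matrix elements that are zero."""
    cells = [value for row in matrix for value in row]
    return sum(1 for value in cells if value == 0) / len(cells)

# Toy stand-ins for patent titles and abstracts (hypothetical text).
docs = [
    "blockchain ledger stores bitcoin transactions",
    "distributed ledger with consensus",
    "bitcoin wallet address",
]
vocab, dtm = build_document_term_matrix(docs)
zero_fraction = sparsity(dtm)
```

Even in this tiny example more than half of the elements are zero, and the proportion grows quickly as the number of documents and keywords increases, which is the sparsity that motivates the modeling in Step 3.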
In Step 1, we searched for patent documents related to blockchain technology in patent databases around the world, such as the United States Patent and Trademark Office (USPTO) [35]. The search equation for blockchain patents was as follows: ((bitcoin or distribution or decentralization or decentral) and (ledger or exchange or database or storage or consensus)) or (bankcard and (authentication and encash)) or cryptocurrency or (crypto and currency) or (coin and cyber) or (virtual and (currency or money)) and blockchain or (block and chain) or network and (secretkey and nonaccount). The retrieved patents have to be transformed into structured data, such as a patent-keyword matrix, for generalized additive modeling. So, in Step 2, we preprocessed the collected patent documents using various text mining methods. We used the R language and its text mining package [36,37]. Figure 2 shows our text mining process for making structured data. This figure illustrates all text mining phases from patent data collection to patent data analysis. Using the collected patent documents, we constructed a text corpus and parsed it to build a text database [37]. By extracting technology keywords from the text database, we built a patent-keyword matrix as structured data for GAM-based patent analysis. Its rows (observations) are patents and its columns (variables) are the technology keywords extracted from the blockchain patents, which were used as the dependent and independent variables in the GAM. The dependent variable was the keyword blockchain.

In Step 3, we carried out generalized additive modeling. First, we found the best GAM among various response distributions for Equation (5).
We considered the Poisson, negative binomial, Poisson inverse Gaussian, and normal distributions [38][39][40]. The best model among the competing GAMs was determined by the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which are computed by the following equations [11,38].

$$\mathrm{AIC} = -2 \log L(\hat{\theta}) + 2k \qquad (10)$$

$$\mathrm{BIC} = -2 \log L(\hat{\theta}) + k \log n \qquad (11)$$

where $L(\hat{\theta})$ is the likelihood evaluated at $\hat{\theta}$, the maximum likelihood estimator (MLE) of the parameter $\theta$, and $k$ and $n$ are the number of parameters and the data size, respectively. We chose the GAM with the smallest AIC and BIC values. We used the p-values from the best GAM to select the variables that are statistically significant for the blockchain variable, keeping the independent variables with p-values less than 0.1 (the 10% significance level, that is, the 90% confidence level) [41].

In Step 4, we visualized the trend of the blockchain keyword (Y-axis) against each significant keyword (X-axis) using the result of Step 3. Using the visualization results, we classified the explanatory keywords as having positive or negative trends with respect to blockchain. Lastly, in Step 5, we built a technology diagram from the results of Steps 3 and 4 to understand the sustainability of blockchain technology. These results can support the management of sustainable technology in the blockchain field. We also visualized regression plots of the explanatory keywords ($x$) and blockchain ($y$) with a prediction interval. The $100(1 - \alpha)\%$ prediction interval at $x = x_0$ is defined in Equation (12) [42,43], where $\hat{y}_0$ is the predicted value of $y$ at $x = x_0$, $n$ is the data size, and $(n - 1)$ is the degrees of freedom of the t distribution. The mean square residual (MSR) is calculated as in Equation (13), where $y_i$ and $\hat{y}_i$ are the observed and predicted values at $i$. In this paper, we selected the keywords that affect blockchain using the trend and variance of the prediction interval.

Finally, we performed hypothesis testing for feature selection in our proposed model. The null and alternative hypotheses for testing the significance of $x_i$ are as follows [42].

$$H_0: \beta_i = 0 \quad \text{versus} \quad H_1: \beta_i \neq 0 \qquad (14)$$

In Equation (14), the null hypothesis $H_0$ states that the model coefficient of $x_i$ ($\beta_i$) is zero, which means that $x_i$ cannot explain the response variable $y$. So, when $H_0$ is rejected, the explanatory variable is statistically significant and is selected as a feature for the final model. We used the p-value as the criterion for hypothesis testing. The p-value takes values between 0 and 1, and we reject $H_0$ when the p-value is less than 0.1 or 0.05, corresponding to the 90% or 95% confidence level [34]. Next, we illustrate how our proposed model can be applied to a real domain by a case study using patent documents related to blockchain. We illustrate the overall procedure of our proposed method in the following figure.
As shown in Figure 3, using the keyword equation of blockchain, we retrieved the patent documents related to blockchain technology from patent databases. Next, we made a patent-keyword matrix using text mining techniques. This matrix was used for patent keyword analysis by generalized additive modeling. To build the technology diagram of blockchain, we used the GAM results and regression plots of keywords. Finally, we found the sustainable technology structure of blockchain from the technology diagram. This can be used for various areas of management of blockchain technology such as R&D planning and new service developments.
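The two decisions made in Step 3, choosing the distribution family with the smallest AIC and BIC and then keeping the keywords whose p-values fall below 0.1, can be sketched in Python as follows. The log-likelihoods, parameter counts, sample size, and the keywords `keyword_a` and `keyword_b` are hypothetical placeholders, not values from this study; only the p-value of databank (0.0019) is taken from the reported results.

```python
import math

def aic(log_likelihood, num_params):
    """AIC = -2 log L + 2k."""
    return -2.0 * log_likelihood + 2.0 * num_params

def bic(log_likelihood, num_params, num_obs):
    """BIC = -2 log L + k log n."""
    return -2.0 * log_likelihood + num_params * math.log(num_obs)

def best_model(fits, num_obs):
    """fits maps a family name to (maximized log-likelihood, number of parameters)."""
    by_aic = min(fits, key=lambda name: aic(*fits[name]))
    by_bic = min(fits, key=lambda name: bic(*fits[name], num_obs))
    return by_aic, by_bic

def significant_keywords(p_values, alpha=0.1):
    """Keep keywords whose p-value is below the significance level alpha."""
    return sorted(name for name, p in p_values.items() if p < alpha)

# Hypothetical fit summaries for three candidate families.
fits = {
    "poisson": (-520.0, 12),
    "negative_binomial": (-480.0, 13),
    "normal": (-610.0, 13),
}
by_aic, by_bic = best_model(fits, num_obs=200)

# databank's p-value is from the reported results; the other two are hypothetical.
p_values = {"databank": 0.0019, "keyword_a": 0.04, "keyword_b": 0.35}
selected = significant_keywords(p_values)
```

When the two criteria agree, as in this toy example, the choice of family is unambiguous; when they disagree, BIC favors the model with fewer parameters because its penalty grows with the sample size.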
In this paper, we used the R language and its packages for text-mining-based preprocessing [36,37]. In our patent preprocessing, we first extracted titles and abstracts from the collected patent documents. Next, we built a document-term matrix as structured data for generalized additive modeling. The rows and columns of the matrix are patents and terms, respectively, and each element is the frequency with which the term occurs in the patent. Using the matrix, we carried out a case study of sustainable technology analysis for blockchain. We built GAMs with various probability distributions. The keyword blockchain was used as the dependent variable and all other keywords were used as independent variables. Table 2 shows the performance comparison between GAMs with different distributions. In Table 2, the AIC value of the GAM with the negative binomial family is the smallest, so we can select the negative binomial GAM as the best model for blockchain technology analysis. The negative binomial GAM also has the smallest BIC value, so the BIC agrees with the AIC. Therefore, in this paper, we used generalized additive modeling with the negative binomial distribution. Table 3 presents the result of the negative binomial GAM for blockchain technology analysis, based on one dependent variable (keyword blockchain) and 95 independent variables (all remaining keywords). In Table 3, we show the 32 keywords with p-values less than 0.1. These keywords are statistically significant in explaining the dependent variable (keyword blockchain). Each of the 32 keywords represents a detailed technology for blockchain technology development. For example, the p-value of the keyword databank is 0.0019, which means that the technology related to databank significantly affects the technology development of blockchain.
Other keyword-based technologies, like databank technology, also influence blockchain's technological development. In this paper, we used the 32 keywords of Table 3 for sustainable technology analysis of blockchain. The keywords address, android, bankcard, bitcoin, configuration, cryptocurrency, currency, databank, ledger, media, metric, secretkey, trace, transform, and voucher were more significant than the others because their p-values were less than 0.01 (99% confidence level). Next, we drew regression plots of blockchain against each of the 32 keywords in Table 3.

Sustainability 2020, 12, 10501

Figure 4 shows the regression plots of keyword group I: access, address, android, assort, authentication, bankcard, bitcoin, and configuration. In Figure 4, the Y-axis represents the keyword blockchain and the X-axis is one of the 32 keywords of Table 3. We found that the keywords access, address, and configuration were positively correlated with the keyword blockchain. On the other hand, the fitted curves for android, bankcard, and bitcoin sloped downward, so blockchain tended to decrease as these keywords increased. In addition, the predictive variances of android, bitcoin, and configuration were relatively large, because the width of their prediction interval bands tended to increase.

Figure 5 illustrates the regression plots of the next eight keywords: cryptocurrency, currency, databank, disconnect, distributor, encash, exclusive, and forbid. Figure 5 shows the trend of each keyword and the interval around the predicted value; the interval also indicates the degree of dispersion (variance) of the prediction. In Figure 5, the keywords databank, disconnect, encash, and forbid were positively correlated with blockchain, because their slopes increased. We did not find keywords with large predictive variance in this group.

Figure 6 shows the regression plots for the eight keywords genetics, individual, infra, ledger, media, metric, network, and nonaccount. The keywords genetics, ledger, media, metric, network, and nonaccount were positively correlated with blockchain. Two keywords, individual and media, had larger predictive variance than the other keywords in keyword group III.

Figure 7 shows the regression plots of the remaining keywords: rebate, scan, secretkey, trace, transform, url, voucher, and wearable. The three keywords rebate, transform, and url showed increasing trends in their regression plots, so they were positively correlated with blockchain. However, the keywords scan, secretkey, trace, voucher, and wearable were negatively correlated with blockchain, because their slopes were decreasing in the regression plots. The predictive variance was relatively large for the keywords rebate, scan, trace, and url. The influences of the independent variables (the 32 statistically significant keywords) on blockchain are shown in Table 4.

Table 4. Influences of independent variables on blockchain.

Influence Independent Variables
Positive (16) access, address, configuration, databank, disconnect, encash, forbid, genetics, ledger, media, metric, network, nonaccount, rebate, transform, url Neutral (4) assort, authentication, exclusive, infra Negative (12) android, bankcard, bitcoin, cryptocurrency, currency, distributor, individual, scan, secretkey, trace, voucher, wearable In Table 4, we identified how and which keywords correlate with blockchain. Sixteen keywords were positively correlated with blockchain, and 12 keywords were negatively correlated. In addition, only 4 keywords were weakly correlated with blockchain. Table 5 shows the keywords that had a relatively large dispersion of prediction compared to other keywords.  The three keywords rebate, transform, and url showed increasing trends in their regression plots. So, the keywords were positively correlated with blockchain. However, the keywords scan, secretkey, trace, voucher, and wearable were negatively correlated with blockchain, because their slopes were decreasing in the regression plots. The predictive variance was relatively large in the keywords rebate, scan, trace, and url. The results of influence of independent variables (32 keywords with statistical significance) on blockchain are shown in Table 4. In Table 4, we identified how and which keywords correlate with blockchain. Sixteen keywords were positively correlated with blockchain, and 12 keywords were negatively correlated. In addition, only 4 keywords were weakly correlated with blockchain. Table 5 shows the keywords that had a relatively large dispersion of prediction compared to other keywords.  Table 5 shows keywords that had a relatively large dispersion of prediction compared to other keywords. Because the keywords shown in Table 5 fluctuate largely, they had more influence on blockchain than other keywords. 
Because the keywords configuration, media, rebate, and url in Table 5 belong to the positive-influence group in Table 4, the technologies based on these four keywords were expected to have a large influence on the technology development of blockchain. In contrast, the keywords android, bitcoin, distributor, individual, scan, and trace in Table 5 are in the negative-influence group in Table 4. Therefore, we have to manage the technologies related to these six keywords efficiently and effectively for the sustainability of blockchain technology. Using all previous experimental results, we built a technology diagram as shown in Figure 8. In this paper, we proposed a methodology for sustainable technology analysis of blockchain using generalized additive modeling. Figure 8 is the final result of our proposed methodology for understanding blockchain technology from the point of view of sustainability. The 32 keywords extracted from the 96 keywords related to blockchain technology were divided into three classes according to their trend directions: positive, negative, and normal (neutral).
The positive trend class contained 16 keywords: access, address, configuration, databank, disconnect, encash, forbid, genetics, ledger, media, metric, network, nonaccount, rebate, transform, and url. We used the technologies based on these keywords for collaborating with blockchain technology. Next, four keywords, assort, authentication, exclusive, and infra, showed a normal (neutral) trend with respect to blockchain. So, we considered the technologies related to these keywords as a technological field subject to general management. The most important technologies to be dealt with for sustainable technology management of blockchain are those in the keyword class with a negative trend, which includes the 12 keywords android, bankcard, bitcoin, cryptocurrency, currency, distributor, individual, scan, secretkey, trace, voucher, and wearable. Therefore, we have to deal with the various technologies based on these keywords effectively and efficiently.
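The assignment of a keyword to a trend class, and the reading of its interval band, can be illustrated with a minimal sketch. It assumes that the fitted curve and its lower and upper interval bounds have been exported on an increasing grid of keyword frequencies. The function name classify_trend, the thresholds tol and widen_ratio, and the toy numbers are all hypothetical and are not taken from the paper, which read the trends from the regression plots.

```python
from statistics import mean


def classify_trend(fitted, lower, upper, tol=0.05, widen_ratio=1.5):
    """Return (direction, widening) for one keyword's fitted curve.

    fitted, lower, upper: equal-length sequences on an increasing grid
    (hypothetical export of a fitted smooth and its interval band).
    tol: hypothetical threshold on the net change of the fitted curve.
    widen_ratio: hypothetical factor for calling the band "widening".
    """
    if not (len(fitted) == len(lower) == len(upper)) or len(fitted) < 2:
        raise ValueError("need three equal-length sequences with at least 2 points")

    # Direction: sign of the net change between the two ends of the curve.
    net_change = fitted[-1] - fitted[0]
    if net_change > tol:
        direction = "positive"
    elif net_change < -tol:
        direction = "negative"
    else:
        direction = "neutral"

    # Dispersion: compare the mean band width of the second half with the first half.
    widths = [u - l for l, u in zip(lower, upper)]
    half = len(widths) // 2
    widening = mean(widths[half:]) > widen_ratio * mean(widths[:half])
    return direction, widening


# Toy curve with made-up numbers (not read from Figures 4-7).
fitted = [0.0, 0.1, 0.25, 0.4]
lower = [-0.1, -0.05, 0.0, 0.0]
upper = [0.1, 0.25, 0.5, 0.8]
print(classify_trend(fitted, lower, upper))
```

A rule of this kind only makes the visual reading reproducible; the thresholds would need to be chosen and justified for real data.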
In developing a target technology, we have to consider the technologies with positive and negative influences on it at the same time. In addition, the technologies with normal trends relative to the target technology are meaningful. In Figure 8, we represent the positive, negative, and normal technologies influencing blockchain technology for understanding the sustainability of blockchain. This means that we can manage blockchain technology more efficiently than previous technology analysis approaches allowed. Technology experts can use this diagram for their technology management of blockchain.
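The cross-reference between the influence groups of Table 4 and the high-dispersion keywords can be sketched as a set intersection. The groups below are copied from Table 4. The set HIGH_DISPERSION contains only the ten keywords that the text names when discussing Table 5, so it may not reproduce that table in full. The names TABLE4, HIGH_DISPERSION, and cross_reference are illustrative.

```python
# Influence groups as listed in Table 4.
TABLE4 = {
    "positive": {
        "access", "address", "configuration", "databank", "disconnect",
        "encash", "forbid", "genetics", "ledger", "media", "metric",
        "network", "nonaccount", "rebate", "transform", "url",
    },
    "neutral": {"assort", "authentication", "exclusive", "infra"},
    "negative": {
        "android", "bankcard", "bitcoin", "cryptocurrency", "currency",
        "distributor", "individual", "scan", "secretkey", "trace",
        "voucher", "wearable",
    },
}

# Assumption: only the Table 5 keywords that are named in the text.
HIGH_DISPERSION = {
    "configuration", "media", "rebate", "url",
    "android", "bitcoin", "distributor", "individual", "scan", "trace",
}


def cross_reference(groups, high_dispersion):
    """For each influence group, list its keywords that also show large dispersion."""
    return {
        name: sorted(members & high_dispersion)
        for name, members in groups.items()
    }


flagged = cross_reference(TABLE4, HIGH_DISPERSION)
for name, words in flagged.items():
    print(f"{name}: {', '.join(words) if words else '(none)'}")
```

With these inputs, the positive group yields the four keywords configuration, media, rebate, and url, and the negative group yields the six keywords android, bitcoin, distributor, individual, scan, and trace, matching the discussion of Table 5.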

Discussion
We selected blockchain as the target technology in this paper because this technology is very important and will continue to be so in the future. We studied the sustainability of blockchain technology by analyzing technological keywords extracted from patent documents related to blockchain technology. To analyze the patent documents, we constructed a patent-keyword matrix as the structured data for statistical analysis. Each element of this matrix was the frequency with which a keyword occurred in a patent document. In this matrix, most elements were zero, so the matrix was sparse and skewed toward zero. However, we found that few attempts had been made in previous research to solve this problem. In our research, we also encountered this skewness problem in our patent data. In this paper, we proposed a methodology of blockchain technology analysis using GAM, with which we overcame the problem and analyzed the patent data of blockchain successfully. This paper contributes to the technology analysis field of patent document analysis. We did not consider objective and efficient metrics, such as accuracy or mean squared error (MSE), for evaluating the validity of the proposed technology analysis model. Instead, we illustrated how the proposed research can be applied to practical domains of blockchain. To develop such a metric, we should consider a data science approach to technology analysis. Using various evaluation methods such as bootstrap resampling, we plan to develop an objective metric for verifying the performance of technology analysis in our future work.
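As one possible starting point for such a metric, the following minimal sketch estimates an out-of-bag MSE by bootstrap resampling. It is an illustration under stated assumptions, not the evaluation planned by the authors: an ordinary least-squares line stands in for the GAM fit, the data are made-up zero-heavy counts, and the names fit_line and bootstrap_mse are hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for the GAM fit (assumption)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my  # degenerate resample: fall back to predicting the mean
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def bootstrap_mse(xs, ys, n_boot=200, seed=0):
    """Average squared error on the points left out of each bootstrap resample."""
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]  # out-of-bag rows
        if not oob:
            continue  # every row was drawn; nothing left to evaluate on
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        scores.append(mean((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in oob))
    return mean(scores)


# Made-up keyword and blockchain frequencies, mostly zeros as in a sparse matrix.
keyword_freq = [0, 0, 0, 1, 0, 2, 0, 0, 3, 0, 1, 0]
blockchain_freq = [0, 1, 0, 2, 0, 3, 0, 0, 5, 1, 1, 0]
print(round(bootstrap_mse(keyword_freq, blockchain_freq), 3))
```

Replacing fit_line with the actual model fit would give a resampling-based error estimate that could be compared across candidate models.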

Conclusions
In this paper, we proposed a technology analysis method for the sustainability of blockchain technology. We collected patent documents related to blockchain technology for technology analysis. First, we preprocessed the documents by text mining techniques and built structured data for statistical data analysis. The structured data was a matrix in which patent documents and technology keywords made up the rows and columns, respectively. Each element of this matrix represented the frequency with which a keyword appeared in a patent document. This patent-keyword matrix was very sparse because most elements were zero, so the matrix was extremely skewed. This problem degrades the performance of patent analysis models. To solve this problem, we considered generalized additive modeling for blockchain technology analysis. From the experimental results, we made a technology diagram of blockchain. This diagram shows various technological relationships between keyword-based sub-technologies for the sustainability of blockchain technology. The diagram can be used for R&D planning, new service developments, etc. in the MOT of blockchain.
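The structure of the patent-keyword matrix and its sparsity can be illustrated with a small sketch. The three documents and the keyword list are toy examples, whitespace splitting stands in for the text mining preprocessing, and the names build_dtm and sparsity are hypothetical.

```python
from collections import Counter


def build_dtm(documents, keywords):
    """Rows are documents, columns are keywords, cells are term frequencies."""
    rows = []
    for text in documents:
        counts = Counter(text.lower().split())  # naive tokenization (assumption)
        rows.append([counts[k] for k in keywords])
    return rows


def sparsity(matrix):
    """Share of cells in the matrix that are zero."""
    cells = [value for row in matrix for value in row]
    return sum(1 for value in cells if value == 0) / len(cells)


# Toy documents, not real patent text.
docs = [
    "blockchain ledger stores each transaction on the network",
    "a wearable device scans a voucher",
    "the ledger network validates the blockchain address",
]
keywords = ["blockchain", "ledger", "network", "address", "voucher", "wearable"]

dtm = build_dtm(docs, keywords)
for row in dtm:
    print(row)
print("sparsity:", sparsity(dtm))
```

Even in this toy case half of the cells are zero; with thousands of patents and many keywords the share of zeros is typically far higher, which is the skewness that motivates the modeling choice.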
In our future work, we will study more advanced methods for technology analysis of blockchain. We will use research papers related to blockchain as well as patents. We will illustrate business applications of blockchain and apply newly developed models to improve the competitiveness and sustainability of blockchain technology.
Author Contributions: S.J. designed this study and collected the data for the experiment. S.P. preprocessed the data and selected valid patents and analyzed the data to show the performance of the study. S.J. and S.P. wrote the paper and carried out all the research steps. All authors have read and agreed to the published version of the manuscript.