Knowledge Structure of the Application of High-Performance Computing: A Co-Word Analysis

Abstract: As high-performance computing (HPC) plays a key role in the Fourth Industrial Revolution, the application of HPC in various industries is becoming increasingly important. Several studies have reviewed the research trends of HPC but considered only the functional aspects, causing limitations when discussing the application. Thus, this study aims to identify the knowledge structure of the application of HPC, enabling practical and policy support in various industrial fields. Co-word analysis is mainly used to establish the knowledge structure. We first collected 28,941 published papers related to HPC applications and built a co-word network that used author keywords. We performed centrality analysis and cluster analysis of the co-word network; as a result, we derived the major keywords and 18 areas of HPC applications. To validate the knowledge structure, we conducted a case study to find opportunities for HPC research plans in the research community. As a result, we discovered 17 new research topics and presented their research priorities by conducting expert interviews and Analytic Hierarchy Process. The findings of this study contribute to an understanding of the application of HPC, to exploring promising research fields for technological and social development, and to supporting research plans for successful technology commercialization.


Introduction
At the World Economic Forum 2016, held in Davos, the Fourth Industrial Revolution was a key topic that was discussed on all industrial, academic, and governmental levels [1]. It has been defined as a blurring of the boundaries between the physical, digital, and biological spheres by the use of mobile internet, small and strong sensors, and artificial intelligence (AI) [2]. In particular, as a general-purpose technology, AI is increasingly important in today's changing industrial structure due to improvements in machine learning algorithms and increases in computing power [3,4]. High-performance computing (HPC) has been key to the success of AI, so HPC is also considered to be another general-purpose technology in the Fourth Industrial Revolution [5]. In addition, the estimated size of the global datasphere was 33 zettabytes in 2018 and is forecast to grow to 175 zettabytes by 2025 [6], so the ability to use HPC to quickly process large amounts of data has become very important.
The Fourth Industrial Revolution goes beyond industrial transformation to change our lives and lead to an "intelligent information society" [7,8]. It refers to a society in which cutting-edge technologies have become a driving force to improve human life [9]. We are already entering the intelligent information society. We check weather forecasts calculated by supercomputers, ride in self-driving cars that use AI, watch computer graphics movies rendered by high-performance computers, and use a variety of products developed using computer modeling and simulations powered by HPC. In the intelligent information society, AI collects, analyzes, and manages large amounts of information, such as sensor and transaction data, extracts insights from the data, and makes intelligent predictions or decisions [10]. HPC enables advances in the use of AI and big data technologies in our society, so the importance of the application of HPC continues to increase [11].
Technology innovation through the application of HPC is expected to provide opportunities for technology commercialization and acquisition to various organizations in the intelligent information society. Private companies can engage in research and development (R&D) and technology commercialization to create and lead the market, and public institutions can secure technologies to improve the quality of life of citizens and solve social problems. Since not all technologies developed through R&D are successful or profitable in the marketplace, successful technology commercialization requires a systematic approach and strategic actions at the point of R&D [12]. Understanding the HPC application fields from the perspective of R&D participants in the intelligent information society is required prior to R&D planning. Prior studies have reviewed the application and research trends of HPC but have limitations in that they addressed specific fields or discussed the functional aspects of HPC.
Thus, this paper presents a knowledge structure of the application of HPC that enables practical and policy support in various industrial fields, with the goal of preparing for transformation to an intelligent information society and exploring future opportunities for technological, economic, and social development. We use co-word analysis to identify major keywords and application fields in the knowledge structure, conducting a case study to validate the applicability and usefulness of the knowledge structure by discovering promising research opportunities suitable for technology commercialization in an intelligent information society. The remainder of the paper is organized as follows: Section 2 describes related work; Section 3 introduces the framework and methods of this study; Section 4 outlines the dataset and reports the results of the analysis; Section 5 describes a case study for a research institute; Section 6 discusses the research findings; Section 7 presents our conclusions.

Intelligent Information Society
The term "intelligent information society" (IISoc), as used to describe a future society that will develop with the application of artificial intelligence, is mentioned in several publications. The South Korean government published the "Mid-to Long-Term Master Plan in Preparation for the Intelligence Information Society", which contains national policies that respond to the changes and challenges of the Fourth Industrial Revolution [13]. This report explains IISoc as one in which new value is generated and progress is achieved by analysis of massive volumes of data by cutting-edge information and communications technologies (ICTs) in all aspects of the economy, society, and human life. Other publications refer to IISoc as a highly digitized, intelligence-inspired, and globally data-driven society, or to using AI and deep-learning technologies to advance our society [14,15]. In addition, the political or social implications of the response of the IISoc have been discussed in various fields, such as healthcare, crime prevention, and education [8,9,14,16,17].
Several publications describe concepts similar to the IISoc under different terms. The Japanese government proposed the "Society 5.0" initiative to create a sustainable society for human security and wellbeing in the 5th Science and Technology Basic Plan, with a vision toward creating a "Super Smart Society" that achieves economic development and solves social challenges [18]. The authors of [19,20] described a "smart city" that connects physical, social, and business infrastructure, using ICT to bring technological advances and innovative solutions to improve the sustainability and habitability of the city. The authors of [21] studied the environment of an ambient intelligence information society to promote the use of ICT by all citizens and to improve economic strength and employment opportunities. These publications described the technological, economic, and social improvements that will be achieved by advances in technology in the Fourth Industrial Revolution. In this paper, we define the IISoc as a society that leads economic growth and solves social problems by applying ICT and the technology that it improves.

HPC in Intelligent Information Society
The IISoc requires the active use of HPC and AI. Here, we focus on HPC, firstly because in a data-intensive environment, HPC is a starting point and a key requirement as AI requires an HPC infrastructure. Secondly, HPC was applied to solve complex problems in science, engineering, and business areas, even before AI had achieved its current capabilities [22,23]. HPC generally entails massively parallel processing techniques to analyze data to solve complex computational problems [24]. The discipline of using computational science to solve complex problems began by identifying the possibility that the computing capability of computers could be applied to science and engineering fields [25]. This computational science approach is recognized as a third facet of the scientific method, in addition to traditional theory and experiment, and has been established as a separate discipline to understand scientific phenomena by computer modeling and simulation that uses numerical analysis. HPC developed alongside computational science and serves as a tool to find answers to many fundamental questions in science, especially those with wide social, political, and scientific impacts [26]. Computing power is being increased by using parallel computing techniques, and the applications of HPC have expanded beyond computational science to other academic fields and industries in which vast amounts of data have been accumulated.
The development and application of HPC is essential to strengthen the competitiveness of private industry, to increase national scientific and technological prowess, and to enable the social development of the public sector. Accordingly, many countries have made significant investments and related policies to take advantage of HPC [24]. The United States enacted the High-Performance Computing Act of 1991 and has recently launched the National Strategic Computing Initiative to advance HPC leadership, in collaboration with industry and academia. Other nations and global regions, including China, Japan, the European Union, South Korea, and Russia, have also created national programs that are investing large sums of money to develop an HPC infrastructure. The obtained HPC infrastructure becomes a fundamental resource for large-scale scientific research, known as big science, at the national (or multi-national) level [27]. Big science and public R&D with the application of HPC contribute to finding answers to grand challenges related to social problems [26,28].

Knowledge Structure of the Application of HPC
Numerous studies have reviewed research trends or the literature of HPC from various perspectives. These studies have considered algorithms and techniques for the effective use of HPC [29,30], HPC environment and programming models [31,32], and future challenges for the development of HPC [33,34]. These reviews considered advances in HPC infrastructure, not the perspective of its application. Other studies have shown the application of HPC. For example, some researchers [35] used machine learning and mathematical modeling techniques to trace high-load infrastructure transactions of big-data-based projects in a distributed environment, while other subjects include the use of HPC in specific areas, such as biology [36], chemistry [37], physics [38], and medical science [39]. Any attempt to highlight the potential of HPC's numerous uses beyond specific areas will require a comprehensive understanding of the knowledge structure of the uses of HPC across application areas.
Co-word analysis can show the knowledge structure of a research field by analyzing the co-occurrence frequency of words used in various parts of a document, including the title, abstract, and keywords [40]. Keywords are provided by authors to identify important concepts in a paper. Co-word analysis presumes that a group of aggregated keywords could reveal an underlying theme, and that co-occurrences of keywords could represent associations among the themes [41]. Studies based on co-word analysis have mainly built a network and analyzed it to identify the knowledge structure, and their findings effectively show the knowledge links [42]. Co-word analysis helps to identify major topics in a field, identify research patterns and trends, and develop subject clusters to suggest future research pathways [41,43].
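The co-occurrence counting that underlies co-word analysis can be illustrated with a minimal sketch. The toy keyword lists and the function name `cooccurrence_counts` are assumptions made for illustration only; they are not the data or code used in this study.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(papers):
    """Count how often each unordered keyword pair appears in the same paper."""
    counts = Counter()
    for keywords in papers:
        # Deduplicate within a paper and sort so each pair has one canonical form.
        unique = sorted(set(keywords))
        for a, b in combinations(unique, 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical toy data: each inner list holds the author keywords of one paper.
papers = [
    ["machine learning", "image processing", "cancer"],
    ["machine learning", "cancer"],
    ["molecular dynamics", "drug discovery"],
]
edges = cooccurrence_counts(papers)
for pair, count in sorted(edges.items()):
    print(pair, count)
```

Each key of `edges` corresponds to an edge of a co-word network, and the count can serve as the edge weight.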
Some researchers have used co-word analysis to identify the knowledge structure of HPC. One study [44] identified seven topics of HPC research by using topic modeling to analyze abstracts from publications referenced in the Web of Science database, while another [45] discussed the topical structure of computational science and its chronological changes by analyzing papers from the International Conference on Computational Science. A third study [46] analyzed the research trends of HPC by identifying keywords and link strengths extracted from co-word analysis. These studies have suggested the knowledge structure of HPC and some future trends but generally considered abstract functional aspects, which have limitations when discussing the application areas of HPC and social development. Therefore, this study identifies the knowledge structure with a focus on the application of HPC to social development in the IISoc.

Methods
We used three steps to identify the knowledge structure of the application of HPC (Figure 1). The first step was to build a co-word network of keywords related to HPC application research. For this purpose, research data on the application of HPC were extracted from the Web of Science (WoS), an interdisciplinary database with records from several bibliographic databases. This study focused on the use of HPC resources, so words related to HPC and its applicable fields were used for data extraction. Then, a co-word network was built using this extracted data by counting the frequency of simultaneous appearances of each keyword. The nodes of the built co-word network are the keywords of published papers related to the application of HPC, and the edges represent the co-occurrence of two keywords in the same paper. The resulting co-word network forms the "Keyword Network of HPC Application Research", showing associations between various keywords that are pertinent to the application of HPC. The second step was to identify important keywords on the network. This goal was achieved by quantifying the network centrality of each node that comprises the "Keyword Network of HPC Application Research". In accordance with prior studies, we used two types of centrality: degree centrality refers to the number of direct relations a node has with other nodes and simply counts the number of in-/out-degrees of each node [47–49]; betweenness centrality calculates the number of shortest paths between nodes i and j that pass through a given node v, then divides this by the number of all shortest paths between i and j [48,49]. These types of centrality have different ranges of values, depending on their characteristics, so the centrality value was normalized as a t-score before analysis. Nodes that scored above average in both types of network centrality were regarded as major keywords.
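The second step can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the R code used in the study: the five-node graph is hypothetical, betweenness is computed with Brandes' algorithm on an unweighted, undirected graph, and the t-score is assumed to be the common form T = 50 + 10z, which the text does not spell out. The "above average in both" selection does not depend on that choice of scaling.

```python
from collections import deque
from statistics import mean, pstdev


def degree_centrality(adj):
    """Number of direct neighbours of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph (unnormalized)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}   # BFS distance from s
        sigma[s], dist[s] = 1, 0
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every undirected pair (i, j) was visited from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def t_scores(values):
    """Assumed T-score form: 50 + 10 * z (population standard deviation)."""
    mu, sd = mean(values.values()), pstdev(values.values())
    return {k: (50 + 10 * (x - mu) / sd) if sd else 50.0
            for k, x in values.items()}


# Hypothetical toy network: nodes stand in for keywords.
adj = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d", "e"],
    "d": ["c"],
    "e": ["c"],
}
degree_t = t_scores(degree_centrality(adj))
between_t = t_scores(betweenness_centrality(adj))
# Major keywords: above average (T > 50) on both centrality measures.
major = sorted(v for v in adj if degree_t[v] > 50 and between_t[v] > 50)
print(major)
```

Standardizing both measures before comparison puts degree and betweenness on a common scale, so "above average" means the same thing for each.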
In the final step, we performed a cluster analysis to detect research areas of HPC applications located in the "Keyword Network of HPC Application Research". We constructed the maximum spanning tree by applying the Prim algorithm [50] to easily identify the structure of the keyword network. The maximum spanning tree was calculated with a general minimum spanning tree algorithm by using the reciprocal value of the simultaneous appearance frequency of a keyword as the weight of the edge. Then, we analyzed the clusters on the network using the Girvan and Newman algorithm [51]. The subject of each cluster is named using the result of centrality analysis, which is then used to specify the research area of HPC applications.
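The reciprocal-weight construction described above can be sketched as follows. This is a minimal illustration, assuming a connected undirected graph; the keyword names and co-occurrence counts are hypothetical, and the function name `maximum_spanning_tree` is not taken from the study. Running Prim's minimum spanning tree procedure on costs of 1/frequency yields the tree that keeps the strongest co-occurrence links. The Girvan and Newman clustering step is not shown here.

```python
import heapq


def maximum_spanning_tree(weights):
    """Prim's algorithm on reciprocal weights.

    weights: dict mapping (u, v) to a positive co-occurrence count for an
    undirected, connected graph. Returns the list of tree edges.
    """
    adj = {}
    for (u, v), w in weights.items():
        cost = 1.0 / w  # a high co-occurrence count becomes a low cost
        adj.setdefault(u, []).append((cost, u, v))
        adj.setdefault(v, []).append((cost, v, u))

    start = next(iter(adj))
    visited = {start}
    heap = list(adj[start])
    heapq.heapify(heap)
    tree = []
    while heap and len(visited) < len(adj):
        cost, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # this edge would close a cycle
        visited.add(v)
        tree.append((u, v))
        for edge in adj[v]:
            if edge[2] not in visited:
                heapq.heappush(heap, edge)
    return tree


# Hypothetical co-occurrence counts between four keywords.
weights = {
    ("ml", "ai"): 9,
    ("ml", "image"): 7,
    ("ai", "image"): 2,
    ("image", "cancer"): 4,
    ("ml", "cancer"): 1,
}
tree = maximum_spanning_tree(weights)
print(tree)
```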

Datasets
To build the co-word network, we extracted 28,941 papers published from 2017 to 2018 (the dataset was up to date at the time of the study, but publication was delayed due to security issues at the case-study institution), by performing searches on the WoS Core Collection database for terms related to HPC, including big science, computational science, and national and social challenges, and their extended meanings. Among the 28,941 papers, the subject categories that accounted for the largest number were, in descending order: Engineering-Electrical and Electronic, Computer Science-Theory and Methods, Artificial Intelligence-Information Systems, Interdisciplinary Applications-Telecommunications, and Materials Science-Multidisciplinary.
The total number of keyword occurrences in these articles was 75,447; they were then preprocessed, as seen in Figure 1. (1) Keywords that have the same meaning but are expressed differently, due to spaces, singular or plural forms, and abbreviations, were treated as a single term (70,043 remained). (2) By considering the WoS subject category, keywords from articles that belong to categories in which the proportion of keywords related to HPC, computational science, and big science appeared to be below 0.1 (e.g., Family Studies, History, Law) were excluded from the dataset (66,255 remained). (3) To secure rigorous analysis results, only the top 20% of the most frequent keywords (≥ 19 appearances) were analyzed (457 remained). (4) To focus on the derivation of the knowledge structure for the application of HPC, keywords for the construction of the HPC infrastructure and for general analytic terms were also excluded, including "GPU", "distributed computing", "NP-hard", "computational modeling", "optimization", "scheduling", and "heuristics". The remaining 186 keywords were analyzed with the "Keyword Network of HPC Application Research".
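Preprocessing steps (1), (3), and (4) can be sketched as below. This is an illustrative sketch only: the synonym table, the toy papers, and the threshold of 2 are hypothetical (the study used a threshold of 19 appearances), the exclusion set lists just three of the terms named above, and the category-based filter in step (2) is omitted because it requires WoS metadata.

```python
from collections import Counter

# Hypothetical synonym table merging abbreviations and plural forms.
SYNONYMS = {
    "cfd": "computational fluid dynamics",
    "neural networks": "neural network",
}
# A few of the infrastructure and general analytic terms named in the text.
EXCLUDED = {"gpu", "optimization", "scheduling"}


def normalize(keyword):
    """Lowercase, collapse whitespace, then map known variants to one term."""
    key = " ".join(keyword.lower().split())
    return SYNONYMS.get(key, key)


def select_keywords(papers, min_count):
    """Keep normalized keywords that are frequent enough and not excluded."""
    counts = Counter()
    for keywords in papers:
        # Count each normalized keyword at most once per paper.
        counts.update({normalize(k) for k in keywords})
    return {k: c for k, c in counts.items()
            if c >= min_count and k not in EXCLUDED}


# Hypothetical toy input: author keywords of four papers.
papers = [
    ["CFD", "GPU"],
    ["Computational  Fluid Dynamics", "Neural Networks"],
    ["cfd", "neural network", "gpu"],
    ["Cancer"],
]
selected = select_keywords(papers, min_count=2)
print(selected)
```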
To build and analyze the network, we used R, a programming language and software environment. With this network, centrality and cluster analysis identified major keywords and application areas of HPC.

Analysis Results
The centralities of all nodes on the network were analyzed to identify the major nodes (Figure 2). In all, 41 nodes had above-average t-scores for both the degree and betweenness centrality. These nodes indicate keywords that have significant meanings in the application of HPC in various fields, such as physics, biology, medical science, ICT, and nanotechnology. "Machine learning" had by far the highest value in both degree and betweenness centrality, and keywords that were mainly driven by machine learning, such as "artificial intelligence", "unmanned aerial vehicles", and "image processing" also had high values. Several medical keywords (e.g., "cancer", "drug discovery", and "Alzheimer's disease") and computational methods and their applied studies (e.g., "finite element method" and "computational fluid dynamics") were also among the major keywords. The analysis of the "Keyword Network of HPC Application Research" identified the maximum spanning tree (Figure 3) and detected 18 clusters (Table 1). Some keyword groups that were recognized as separate fields within the cluster were artificially divided into sub-clusters for analytical clarity. We named each cluster by referring to the content of the keywords belonging to it, and to the results of centrality analysis. These clusters refer to application areas of study that use HPC. Machine learning, which was derived as the most important node in centrality analysis, was located at the center of the network, and its application areas, such as AI, autonomous driving and image processing, and genetic and biological research, were clustered around it. Fluid mechanics, aerodynamics, cosmology, climate, and the environment were centered on computational fluid mechanics. Molecular or nanoscale analysis, earthquake study, and protein structure and interactions were centered on molecular dynamics and finite element methods. Chemical structure and thermodynamics were centered on density functional theory. Other areas such as smart communications, cranial nerve analysis, and quantum and exascale computing were also detected.

Case Study
The knowledge structure of the application of HPC provides an analytical basis for exploring technology commercialization opportunities for technological, business, and social development in the IISoc. For example, a review of the application areas of HPC helps to pioneer new business areas as a driving force for economic growth and to improve public services to increase social benefits. To validate the applicability of the knowledge structure and confirm its usefulness, this section presents a case study that explores opportunities for planning a new research direction in a research community.
The case-study institute (GRI-K) is a government-funded research institute that built a national supercomputing ecosystem for science and technology innovation in South Korea. It operated the 11th most powerful supercomputer in the world in June 2018, is responsible for R&D in leading fields of national supercomputer applications, and provides technical and human resources support for supercomputer users. For GRI-K, the need for the discovery and planning of research projects using HPC has emerged to meet the institution's technology commercialization goals, to increase the utilization of the supercomputer, and to respond to the South Korean Mid-to Long-Term Master Plan in Preparation for the Intelligence Information Society.
In this case study, we used the knowledge structure derived from this study to guide our exploration of new research topics that will apply HPC resources and propose a prioritization of research projects, to support institutional policymaking on the initiation of the projects for technology commercialization. This case study proceeded as follows: (1) to derive the keyword network for HPC applications, (2) to develop new research topics by expert interviews, (3) to establish evaluation criteria and a method for the developed topics, and (4) to prioritize the developed topics for research planning. We use the "Keyword Network of HPC Application Research" and cluster analysis results presented in this study as case materials.
From the results of the cluster analysis, we chose HPC application areas in which GRI-K would like to explore research topics. Clusters G07-2, G07-3, G09-1, G09-2, G12, G13, G14, G17 were selected; they are areas in which GRI-K can perform alone or in cooperation with other researchers. To develop research topics, we recruited experts in the selected fields and conducted in-depth interviews. The interviews were conducted with 13 experts, consisting of researchers inside GRI-K and outside experts. Taking into account the keywords that belong to the clusters, and the results of the centrality analysis, the experts proposed 17 research topics that will use HPC (Table A1 in Appendix A). When developing the research topics, the purpose of the application of HPC was considered in terms of technology innovation, technology commercialization, and social development [5].
To determine the suitability of the research topics and to suggest priorities for the assignment of the topics in GRI-K, we developed evaluation criteria for the topics (Table 2) and a scoring method for evaluation. Due to the characteristics defined in the IISoc [18,26], evaluation criteria were constructed in consideration of technological, economic, social, and public aspects. A strategic aspect was added to the evaluation criteria to consider the strategic conformity of the research topics with GRI-K. Each topic was evaluated by the sum of the product of the relative weights of the criteria and the surveyed score per criterion. The weights of the criteria were derived using the analytic hierarchy process (AHP), which is a structured technique for organizing and analyzing complex decisions [52]. The AHP-based approach is a very effective method to assess technologies and to analyze the policy implications [53,54]; the authors of [53] proposed an AHP-based decision-making model to investigate the social sustainability of the technology management process in the banking industry, and the authors of [54] used an integrated SWOT-AHP analysis, with the priorities of SWOT factors of biomethane, to support a green revolution in European transport. To evaluate the research topics in this study, the weights were analyzed with 10 experts in the selected fields (Figure 4). The surveyed score per criterion was derived from 7-point Likert-scale expert surveys asking whether each topic meets the criterion. We surveyed 34 experts in the selected fields to obtain the scores. The priorities for the assignment of research topics were presented by plotting on a two-by-two matrix consisting of two axes: the technological, economic, social and public aspects (x-axis) from the GRI-K's external perspective, and the strategic aspect (y-axis) from the GRI-K's internal perspective (Figure 5). Each research topic was placed in the matrix by the weighted sum of the surveyed score and the weights of the criteria corresponding to the x and y axes, where the weighted sum value was normalized as a t-score. The final evaluation scores and priority results for each topic were obtained (Table A2 in Appendix A; Figure 5). Finally, "Research on next-generation video coding standards and efficiency improvement" (TI01), "Use of deep learning to improve object-recognition accuracy" (TI02), "Development of quantum computer error correction code" (TI04), "Development of new drugs by considering protein structure and models of interaction" (TS01), and "Genomic sequence analysis and new drug development by machine learning on experimental data" (TS02) were identified as research topics that use HPC, and that should be performed preferentially in GRI-K.
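The weighted-sum scoring can be illustrated with a short sketch. Everything numeric here is hypothetical: the three criteria, the pairwise comparison matrix, and the Likert scores are invented for illustration, and the row geometric-mean method is one common way to approximate AHP priority weights; the text does not state which derivation the study used.

```python
from math import prod


def ahp_weights(matrix):
    """Approximate AHP priority weights by the row geometric-mean method.

    matrix[i][j] states how much more important criterion i is than j;
    a reciprocal matrix has matrix[j][i] == 1 / matrix[i][j].
    """
    n = len(matrix)
    geo_means = [prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def weighted_score(weights, scores):
    """Sum of (criterion weight x surveyed score) over all criteria."""
    return sum(w * s for w, s in zip(weights, scores))


# Hypothetical pairwise comparisons for three criteria
# (for example technological, economic, strategic).
pairwise = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical mean 7-point Likert scores of one topic on the three criteria.
topic_scores = [7, 4, 5]
score = weighted_score(weights, topic_scores)
print([round(w, 3) for w in weights], round(score, 2))
```

For a perfectly consistent matrix such as this toy one, the geometric-mean weights coincide with the principal-eigenvector weights; with real expert judgments, a consistency check would normally accompany the weights.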

Discussion
IISoc is a society that brings improvements to our lives by exploiting the analytic capabilities of AI and the computing power to process large-scale data. In a situation in which major countries are exposed to the threat of low economic growth, an aging workforce, and national problems such as diseases, disasters, climate change, and natural resource crises [67], this move toward the IISoc is expected to be able to solve these difficulties facing our society. HPC is a requirement for running AI and processing big data, so the applications of HPC in various areas provide a beginning to meeting this expectation. On-demand cloud computing platforms (e.g., Amazon Web Services, https://aws.amazon.com/ (accessed on 12 October 2021)) and cyberinfrastructure for scientific activities (e.g., HUBzero, https://hubzero.org/ (accessed on 12 October 2021)) that support HPC facilities are further accelerating the use of HPC. Thus, many organizations are looking for opportunities to address these challenges in the public sector and to gain a competitive advantage in the private sector, where the knowledge structure of applications of HPC can help.
Opportunities to be found in the public sector include improving the quality of life for citizens and seeking solutions to social problems. As a real-world example, the Human Brain Project uses HPC capabilities to understand the organization and functioning of the brain and its diseases through the use of brain molecular and cellular simulations [68]. This project can be seen as cluster G11 among the HPC application areas of the research findings and is expected to be developed into research to improve public healthcare for illnesses such as Alzheimer's and Parkinson's disease. Other real-world examples in the public sector include: climate simulations for early warnings of storms, and strategies to mitigate or adapt to climate change (G17); transportation systems to manage traffic flows, to manage potential safety problems, and to automate freight delivery (G09-1, G12); real-time detection of suspected criminals and counter-terrorism (G09-2); and infectious disease detection and virus genomic sequence analysis (G04-2, G07-2) [68–70]. These examples suggest the utility of HPC in the application areas of climate, transportation, security, healthcare, and other fields, within the knowledge structure.
In the private sector, HPC is applied to capturing opportunities for product development innovation, such as speeding up product development and reducing product design risks and costs. For example, General Electric, a major producer of turbines, used supercomputers to model the unsteady flow of industrial turbines that are deployed in jet engines and power stations. Using the calculation model, designers made advanced and fine-grained adjustments to the turbomachinery while increasing operational and fuel efficiency [71]. Similarly, the BMI Corporation pursued fuel savings for long-haul trucks by developing computational models of aerodynamic wheel fairing and of a special airflow-directing mechanism [71]. The previous examples match the application areas in G16 and G18. The development of personalized medicine (G01, G07) is another possible new business opportunity.
The knowledge structure found in this study can describe prior HPC application studies, and conversely, can be used to explore technology commercialization opportunities for the application of HPC in the IISoc. The structure enables R&D planning when establishing a science and technology policy, and can help to explore the applied research of HPC in each research field. To illustrate this utility, we conducted a case study to develop research topics that use HPC in a research institute, then proposed 17 topics. Derivation of the knowledge structure from the viewpoint of the application of HPC may facilitate the discovery of practical research topics using HPC. Prior studies [44–46] were limited to discussing the results of quantitative analysis and suggesting conceptual future research directions within the results. Meanwhile, this study proposes practical research topics via a balanced approach that uses co-word network analysis, AHP, and expert interviews, considering both quantitative and qualitative analyses. In the process of evaluating research topics for the IISoc, a multi-criteria evaluation was applied that considers technological, economic, social and public, and strategic aspects.
In the research findings of the knowledge structure of the application of HPC, machine learning had the highest centrality and appeared to be the core node, located at the center of the "Keyword Network of HPC Application Research". HPC resources are used with machine learning techniques to extract valuable information from complex datasets, rather than simply as a calculator for scientists. Machine learning may have an important function in the execution of the scientific workflow, both now and in the future [72]. Methods or theories that are mainly used in academic disciplines, such as the finite element method, density functional theory, molecular dynamics, and computational fluid dynamics, also had high centrality and were in key positions on the network. The cluster analysis results showed the areas of research in which HPC is applied. Some research areas used computer models to study molecular systems, ranging from nanoscale chemical systems to large biological molecules and material assemblies. Another was the study that sought to solve complex problems in thermodynamics or fluid dynamics by computer simulation. These studies used HPC resources to simulate experiments that are not feasible in the laboratory or to make challenging problems tractable [25]. HPC is also actively used in certain biological fields including genome research, medicine, and bioinformatics. In information and communication technologies, in addition to machine learning research on autonomous driving and image processing, which are already widely studied, HPC is being used to develop technologies related to the construction of next-generation communication networks as a new trend [73]. Research on quantum and exascale computing for the development of the next-generation HPC platform, and research on climate, environment, or disaster prevention as a global problem, are also noteworthy for new research areas regarding the application of HPC.
Prior studies on building the knowledge structure of HPC have explained the structure with a focus on the intrinsic functions of HPC and the implementation of HPC infrastructure. The authors of [44] presented seven HPC research topics: energy efficiency, heterogeneous systems, large-scale applications, networks, parallel algorithms, interconnected systems, and parallel software. Similarly, the study by [46] clustered the research areas of HPC into computational modeling and theory, computational mathematical modeling, computer simulations, parallel computing, improved visualization, and large-scale application software. The paper by [45] analyzed trends and correlations between topics in a hierarchical topic structure, composed mainly of the functional and conceptual terms of HPC (i.e., modeling, visualization, numerical, parallel, data-driven). Meanwhile, this study analyzed the knowledge structure in terms of the application of HPC. As mentioned in the previous paragraph, the results refer to those keywords applicable to various industries and academic fields in the intelligent information society. Thus, the findings of this study contribute to practical and policy support for R&D managers regarding technology commercialization plans. While the prior studies only researched the knowledge structures, this study has practical and policy significance in that it further suggested promising future technologies and research plans in the case study. In addition, we present a more detailed sub-disciplinary level of HPC work than in previous research by using cluster analysis. This facilitates the identification of detailed topics and related theories, as well as research keywords in the cluster. It may help researchers identify the research methods and theories used in their study approach.
This study also offers academic contributions by revealing the knowledge structure of existing research on the use of HPC and by proposing a combined quantitative and qualitative research framework for exploring research opportunities. It provides a new research direction by exploring research topics with a quantitative method that builds co-word networks from the keywords of papers. Prior studies mainly relied on expert-dependent, qualitative methodologies to find research topics, so this balanced approach usefully complements those earlier qualitative works.
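As an illustration of the quantitative step discussed above, the following minimal Python sketch builds a co-word network from author keywords and computes degree centrality. It is a sketch under stated assumptions, not the implementation used in this study: the function names `build_coword_network` and `degree_centrality`, the lowercase normalization of keywords, the choice of degree centrality as the measure, and the three toy keyword lists are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_coword_network(papers):
    """Count how often each unordered keyword pair co-occurs in a paper."""
    edges = Counter()
    for keywords in papers:
        # Normalize and de-duplicate so a pair is counted once per paper
        # (assumed preprocessing; the study's own cleaning rules may differ).
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Fraction of the other keywords that each keyword is directly linked to."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {k: 0.0 for k in neighbors}
    return {k: len(v) / (n - 1) for k, v in neighbors.items()}


# Hypothetical toy records, not data from the study.
papers = [
    ["machine learning", "molecular dynamics", "GPU"],
    ["machine learning", "computational fluid dynamics"],
    ["Machine Learning", "genomics", "GPU"],
]

edges = build_coword_network(papers)
centrality = degree_centrality(edges)
top_keyword = max(centrality, key=centrality.get)
```

In this toy input, the keyword that co-occurs with every other keyword receives the highest score, which mirrors how a core node emerges at the center of a keyword network. Edge weights (co-occurrence counts) are retained in `edges` so that weighted measures or a clustering step could be applied afterwards.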

Conclusions
With the impending development of the IISoc, HPC contributes to technological, economic, and social development in various fields. As the application of HPC grows in importance, R&D, technology commercialization, and technology acquisition plans that use HPC for the IISoc are becoming increasingly important. Such planning first requires an understanding of the knowledge structure. Hence, we identified the knowledge structure of the application of HPC. We organized the concept of the IISoc and identified major keywords and areas of HPC application. To validate the applicability of the knowledge structure, we conducted a case study and proposed practical research topics that use HPC. The analysis identified 18 HPC application areas, and the case study derived 17 research topics as new technologies and prioritized them. According to expert interviews in the case study, technologies such as object recognition, video coding, drug development, and genomic sequence analysis, which typically require high computational complexity in information and communication technology and biotechnology, are expected to be promising HPC application fields (Table A1 in Appendix A). The findings of this study have practical and policy implications as basic data for understanding HPC applications, exploring HPC application opportunities in the IISoc, suggesting possible directions for R&D planning, and establishing technology commercialization strategies. This study contributes a quantitative and comprehensive understanding of the application fields of HPC, consolidating previously scattered HPC works. By analyzing the topic in terms of HPC applications, the results can help market participants lead the industry in responding to changes in the IISoc, and may help spread the active use of HPC to general researchers and R&D administrators in other fields, rather than only to researchers who have been involved in the development of HPC. The findings can also be used as a way to move from reviewing institutional policies to establishing national science and technology R&D programs.
This study presents the knowledge structure of the application of HPC in a specific period. As more data accumulate, further investigation of trends in HPC application research could identify changes in the knowledge structure through chronological analysis. This study has another limitation in that the analysis was confined to academic papers. Such papers represent research activities and contain keywords that reliably explain the research content and are, therefore, suitable data for the analysis of knowledge structures. If the scope of data collection is extended to other sources with different characteristics, such as patents and web-scraped data, other application areas or topics that do not appear in this knowledge structure may be discovered. Although this case study validated the applicability of the knowledge structure in the research community, further validation in other fields, such as the identification of business opportunities in the private sector and policymaking in the public sector, would strengthen its applicability.

Conflicts of Interest:
The authors declare no conflict of interest.

Table A1. Research topics that use HPC, derived in the case study.

TI01
Research on next-generation video coding standards and efficiency improvement
- Securing technology that shows higher coding efficiency than the existing latest video compression standards
- Standard preoccupation representing the recent trend of approach that uses HPC on large amounts of computational results, rather than efficient processing of transform and information compression used in an existing low-spec computing environment

[ID missing]
Quantum Silicon Chip and its production process design
- Design of circuits and production processes to manufacture quantum computer processors (Quantum Silicon Chip)
- Utilization of HPC resources in driving computer-aided design (CAD) programs for quantum silicon chip design

[ID and topic title missing]
- Development of technology that can be used to build a smart city by processing data collected using Internet of Things (IoT) technology
- Building a data-based smart city framework to improve the quality of life of citizens, such as transportation, energy, and security, from sensors and information obtained from citizens

TS01
Development of new drugs by considering protein structure and models of interaction
- Formulating protein structures by using HPC and identifying the optimal protein structure to maximize the required properties (physical-chemical approach)
- Predicting the interaction between proteins and designing protein structures to develop new drugs that can interact with a target

TS02
Genomic sequence analysis and new drug development by machine learning on experimental data
- Predicting interactions and physicochemical phenomena by applying deep learning technology to experimental data on molecular structure (data-driven approach)
- Research on gene expression, customized medicine, or new drug development by high-speed analysis of genome sequence (next-generation sequencing)

TS03
Medical image analysis through search for optimal neural network structure
- Supporting fast and sophisticated medical diagnosis by learning of millions of high-resolution medical images

TS04
Analysis of extreme values in extreme event by analysis of the probability distribution function of extreme weather and climate
- Quantitative analysis of probability distribution of global warming caused by carbon dioxide generated by human activities
- Quantitative estimation of the increase in frequency or intensity of extreme events due to global warming

TS05
Weather phenomena analysis by learning of weather maps and anomaly patterns
- Analysis on the effect of distant weather changes on current location weather changes by using artificial intelligence-based machine learning
- Mid- to long-term forecasting by learning the difference between the average weather condition and the current condition (anomaly pattern)

TS06
Typhoon course prediction using high-performance computing
- Identification of the path of typhoons by short-cycle calculations using real-time observation data

Research topics using HPC were indexed by technology innovation (TI), technology for commercialization in business (TB), and technology for social or public issues (TS).