Research on Knowledge Gap Identification Method in Innovative Organizations under the “Internet+” Environment

Under the “Internet+” environment, the R&D intensity of products and services has increased; hence, organizations need to improve their ability to integrate knowledge and technology resources. Knowledge gaps will arise when an organization’s knowledge reserves fail to meet the needs of innovation activities. This research established a network of complete knowledge topics under the “Internet+” environment based on the Word2Vec model. The word vectors and word frequencies of organizational reserve knowledge texts were analyzed to establish an organizational reserve knowledge topic network. The Term Frequency-Inverse Document Frequency algorithm was used to identify the demanded knowledge topic. The satisfaction capability of demanded knowledge in the reserve knowledge topic network was calculated via the eigenvector centrality and the fuzzy evaluation method. The corresponding strategies were then put forward to make up the knowledge gap. Finally, a case study was conducted and compared with SWOT (Strengths, Weaknesses, Opportunities, and Threats) and Venn diagram analysis on the economic and management college of a university in Beijing to verify the effectiveness of this method.


Introduction
The Internet, as a representative of rapidly developing information technology, has gradually been integrated into the real economy and is now widely used in all walks of life. Driven by a new round of scientific and technological revolution, more and more countries and regions have launched Internet development strategies, which treat the network as an important means of improving competitiveness and continually promote the integration of the Internet with various fields of the economy and society [1]. Under the "Internet+" environment, fierce market competition has accelerated the pace of product and service research and development, forcing organizations to improve their ability to integrate large amounts of knowledge and technical resources.
As these innovations demand ever more systematized knowledge, a single organization facing rapid and fierce market competition can rarely complete the fast iteration of its products and services by relying on its own capabilities alone. Knowledge gaps may arise when an organization's knowledge reserves and existing technologies fail to meet the needs of innovation activities [2,3]. From a strategic perspective, it is quick and reasonable to choose the path Information 2020, 11, 572 3 of 14

The Definition and Identification of Knowledge Gaps
Knowledge gaps play an important role in organizational competition and innovation. There are two typical definitions of the knowledge gap. One definition is that the knowledge gap originates from the strategic gap, which is the gap between the knowledge needed by the organization to implement the strategy and the knowledge actually possessed [8]. Another definition for the knowledge gap is the knowledge that the organization lacks at any moment, but this knowledge is crucial for the survival and growth of the organization and must be filled [15]. In the theoretical research of knowledge gap analysis, significant research works on its basic concepts and recognition methods have been conducted by academic and industry scholars. By analyzing the application scenarios, Vos divided the classification of the knowledge gap into product R&D, manufacturing, marketing, and management and proposed a process paradigm for identifying the needed knowledge by SMEs to explore market opportunities [16]. Chen introduced a Venn diagram to analyze the knowledge gap of the enterprises and provided compensation strategies for different knowledge gaps according to the status of the enterprises' knowledge reserve [10]. Dang proposed a strategy for finding and replenishing technological innovation gaps under the network environment and conducted a strategic analysis of knowledge security [17].

The Application of Knowledge Gap Identification
In applied research on knowledge gaps, Lafuente-Ruiz-de-Sabando proposed that knowledge gaps concerning college image and reputation should be identified and compensated by stakeholders in the process of effective resource input in colleges and universities [18]. Based on an analysis of the causes of the knowledge gap in the manufacturing industry, Li built an evolutionary game model revealing the game behavior between manufacturing enterprises and customers and analyzed the model's equilibrium points and their stability under different situations [19]. Malhotra et al. analyzed the knowledge gap caused by group diversity in an online platform strategy selection process and proposed four methods to reduce the risks [20]. Qiu et al. studied a method for constructing organizations' knowledge structures based on text mining, designed a tree-structured knowledge expression method, used a tree-matching algorithm to identify knowledge gaps, and empirically studied the method using the patent literature of an organization [9,21]. Li et al. established an element matrix with members, knowledge, and goals and proposed a knowledge gap identification method for scientific research teams based on the decomposition of goals and knowledge [22]. In addition, there is research on innovative knowledge, construction projects, and gap countermeasures that analyzes the impact mechanisms of different knowledge gaps [6,23–25].

The Construction of the Network of Complete Knowledge Topics under the "Internet+" Environment
The network of complete knowledge topics under the "Internet+" environment is an undirected weighted connected graph $G = (V, E)$. The nodes $V = \{v_1, v_2, \cdots, v_m\}$ represent knowledge topics, and the edges $E = \{e_{ij} \mid v_i, v_j \in V\}$ represent associations between topics. The weight $d_{ij}$ of edge $e_{ij}$ is the strength of the association between the two topics; the greater the value, the stronger the association. Constructing the network of complete knowledge topics under the "Internet+" environment therefore means determining the node set, the association set, and the strength of association $d_{ij}$ between any two nodes. Under the "Internet+" environment, the network of complete knowledge topics is the sum of all kinds of knowledge carriers in the form of texts on the Internet, and the content of this large quantity of knowledge carriers emerges as knowledge topics related by semantic and co-occurrence similarity. Identifying knowledge topics from the knowledge carriers under the "Internet+" environment and obtaining the associations between different topics is key to building a network of complete knowledge topics.
Based on the Word2Vec model, this study represents the text of networked knowledge carriers as word vectors carrying semantic information, so that the knowledge topics are semantically vectorized and their semantic similarity can be obtained. The Word2Vec model is a shallow neural network. Given words and their context as input, it maps words to vectors in an embedding space without supervised learning, and the dense, continuous mapping avoids the curse of dimensionality while realizing the semantic vectorization of words [26,27]. The Word2Vec model includes the continuous bag of words (CBOW) model and the skip-gram model. The CBOW model predicts a word from its given context, whereas the skip-gram model predicts the context from a given input word. In this study, the analysis of an innovative organization's reserve knowledge is a process of predicting semantics from given input words; therefore, the skip-gram model is selected. In the skip-gram model, $\omega_t$ represents the current word, $c$ represents the length of the context window, and $p(\omega_i \mid \omega_t)$ $(t - c \le i \le t + c)$ is the probability that the word $\omega_i$ in the window appears together with the current word. The training goal of the model is to maximize the $H$ value, where $T$ is the length of the text.
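For reference, the standard skip-gram objective can be written in the symbols defined above ($\omega_t$, $c$, $T$) as an average log-probability over every word and its context window. It is offered here as an assumption about the intended form of $H$, not as the authors' exact equation.

```latex
H = \frac{1}{T} \sum_{t=1}^{T} \;
    \sum_{\substack{t-c \,\le\, i \,\le\, t+c \\ i \ne t}}
    \log p(\omega_i \mid \omega_t)
```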
After each round of training, the Softmax classification function is used to calculate the loss and perform backpropagation. After training, the vector representation $v_i = (S_{i1}, S_{i2}, \ldots, S_{in})$ of the topic $v_i$ can be extracted from the hidden layer of the neural network, where $n$ is the vector dimension and $S_{in}$ is the value of the $n$-th dimension of the vector. The Pearson correlation coefficient (PCC) is used to express the semantic relevance $d_{ij}$ of the topics $v_i$ and $v_j$.
where $\bar{S}_i$ and $\bar{S}_j$ are the means over all dimensions of the vector representations of the two topics; the larger the value of $P_{ij}$, the stronger the semantic association between the topics $v_i$ and $v_j$.
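The Pearson correlation coefficient between two topic vectors has the textbook form $P_{ij} = \sum_k (S_{ik} - \bar{S}_i)(S_{jk} - \bar{S}_j) \big/ \sqrt{\sum_k (S_{ik} - \bar{S}_i)^2 \sum_k (S_{jk} - \bar{S}_j)^2}$. The sketch below computes it with the standard library only. The function name `pearson` and the two short vectors are hypothetical and serve only as an illustration.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(u) != len(v) or len(u) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    # Numerator: sum of products of the centered components.
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        raise ValueError("constant vector: correlation is undefined")
    return cov / math.sqrt(var_u * var_v)

# Hypothetical 4-dimensional topic vectors (the case study uses 150 or 200 dimensions).
s_i = [0.2, 0.4, 0.6, 0.8]
s_j = [0.1, 0.3, 0.5, 0.9]
p_ij = pearson(s_i, s_j)
```

A value close to 1 indicates that the two topics vary together across the embedding dimensions, which the text interprets as a strong semantic association.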

The Construction of Reserved Knowledge Topic Network
Under the "Internet+" environment, the network of complete knowledge topics reflects the distribution and association of knowledge topics across society. An innovative organization, however, focuses on a cluster formed by one or more knowledge topics, and the relations among its topics also differ from the social average. Therefore, it is necessary to establish the organization's reserve knowledge topic network and identify its reserve knowledge topics and topic associations.
The innovative organization's reserve knowledge topic network $G' = (V', E')$ is also an undirected weighted connected graph, where $V' \subseteq V$ and $E' \subseteq E$. The topic relations of the reserve knowledge topic network take both co-occurrence associations and semantic associations into consideration. The semantic relevance is obtained by the Word2Vec model in Section 2.1. The co-occurrence correlation degree is obtained by analyzing the co-occurrence frequency of the topic words and is quantified by the Ochiai coefficient. The relationship is shown in Equation (3).
where $tf_{ij}$ is the total number of times the topic words $v_i$ and $v_j$ appear in the same document, $tf_i$ is the total number of occurrences of the topic word $v_i$, $tf_j$ is the total number of occurrences of the topic word $v_j$, and $O_{ij}$ is the Ochiai coefficient between the topic words $v_i$ and $v_j$. The larger the value of $O_{ij}$, the stronger the co-occurrence relationship between the topics $v_i$ and $v_j$ [28,29]. Finally, the degree of association between the topics $v_i$ and $v_j$ in the knowledge reserve semantic network can be expressed as: where $\beta \in (0, 1)$ is the weight coefficient [30]. The correlation between the knowledge topics in the reserve knowledge topic network can be used to amend the topic relevance of the network of complete knowledge topics. The specific formula is as follows:
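The Ochiai coefficient has the standard form $O_{ij} = tf_{ij} / \sqrt{tf_i \cdot tf_j}$. The following sketch computes it and then blends it with a semantic relevance value. The blend $\beta d_{ij} + (1 - \beta) O_{ij}$ is an assumption: the text states only that $\beta \in (0, 1)$ is a weight coefficient, so which term carries $\beta$ and the function name `blended_association` are hypothetical.

```python
import math

def ochiai(tf_ij, tf_i, tf_j):
    """Ochiai coefficient: co-occurrence count over the geometric mean of the totals."""
    if tf_i <= 0 or tf_j <= 0:
        return 0.0
    return tf_ij / math.sqrt(tf_i * tf_j)

def blended_association(d_ij, o_ij, beta=0.5):
    """ASSUMED convex combination of semantic (d_ij) and co-occurrence (o_ij) relevance."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie in (0, 1)")
    return beta * d_ij + (1.0 - beta) * o_ij

# Hypothetical counts: the two topic words co-occur 6 times and occur 9 and 16 times overall.
o = ochiai(tf_ij=6, tf_i=9, tf_j=16)                 # 6 / sqrt(144) = 0.5
r = blended_association(d_ij=0.8, o_ij=o, beta=0.5)  # 0.5 * 0.8 + 0.5 * 0.5 = 0.65
```

With $\beta = 0.5$, as used in the case study, the two kinds of association contribute equally under this assumed form.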

The Required Knowledge Topic Identification
The TF-IDF algorithm is used to extract the topic words from the text of the demanded knowledge carrier [31,32] and thereby identify the demanded knowledge topics. The TF-IDF algorithm considers both the term frequency and the inverse document frequency. From the perspective of term frequency, the higher the frequency of a word in a single document, the more prominently the word represents the topic of that document. From the perspective of inverse document frequency, if a word appears with high frequency across all documents, the word is common in general and the topic it represents is less significant. The term frequency $tf_{ij}$ is denoted as follows: where $n_{ij}$ is the frequency of word $v_i$ in document $d_j$, and $\sum_k n_{k,j}$ is the total frequency of all words in document $d_j$. The inverse document frequency $idf_i$ is expressed as follows: where $|D|$ is the total number of documents describing demanded knowledge, and $|\{j : t_i \in d_j\}|$ is the number of documents including $v_i$. Considering the term frequency $tf_{ij}$ and the inverse document frequency $idf_i$ together, the importance of a topic word is shown in Formula (8).
The importance threshold is set to α, and the demanded knowledge topic set is as the weight of v i in demanded knowledge topic set. Thus, the weight coefficient matrix of the demanded knowledge topic set V can be written as follows: where n is the number of elements in V .
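The extraction step described in this section can be sketched with the common textbook definitions $tf_{ij} = n_{ij} / \sum_k n_{k,j}$, $idf_i = \log(|D| / |\{j : t_i \in d_j\}|)$, and importance $tf_{ij} \cdot idf_i$. Several details are assumptions made for illustration: the natural logarithm, counting the target document as part of $D$, keeping words whose score is strictly greater than $\alpha$, and all function names and sample tokens.

```python
import math
from collections import Counter

def tf_idf(target_doc, background_docs):
    """Return {word: tf * idf} for one tokenized document.

    tf  = occurrences of the word in target_doc / total tokens in target_doc
    idf = log(|D| / number of documents containing the word), where D is the
          background corpus plus the target document (an ASSUMPTION that
          keeps the document count in the denominator non-zero).
    """
    corpus = [target_doc] + list(background_docs)
    counts = Counter(target_doc)
    total = len(target_doc)
    scores = {}
    for word, n in counts.items():
        df = sum(1 for doc in corpus if word in doc)
        scores[word] = (n / total) * math.log(len(corpus) / df)
    return scores

def demanded_topics(scores, alpha):
    """Keep the words whose importance exceeds the threshold alpha."""
    return {w: s for w, s in scores.items() if s > alpha}

# Hypothetical tokenized texts, for illustration only.
proposal = ["network", "topology", "network", "industry", "park"]
background = [["industry", "policy", "tax"], ["industry", "park", "finance"]]
scores = tf_idf(proposal, background)
topics = demanded_topics(scores, alpha=0.1)
```

In this toy corpus, "industry" appears in every document, so its inverse document frequency is zero and it is dropped regardless of how often it occurs in the proposal.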

Knowledge Gap Identification and Filling
Building on the knowledge collection network, the reserve knowledge topic network, and the demanded knowledge topics obtained in the previous sections, this section identifies the knowledge gaps in the demanded knowledge topics and proposes corresponding compensation methods. This study holds that whether a demanded knowledge topic $v_i$ can be met within the organization depends on whether the topic exists in the reserve knowledge topic network and whether the topic is important in that network. This standard for measuring importance considers both the number of related neighbor topics and the importance of the neighbor topics. Therefore, eigenvector centrality from the complex network model can be used to describe the importance of a knowledge topic in the organization's reserve knowledge topic network. The formula is as follows: where $\gamma$ is a proportional constant, $R_{ij}$ is the degree of association of topics $v_i$ and $v_j$ in the reserve knowledge topic network, $E_i$ is the eigenvector centrality of topic $v_i$ in the reserve knowledge topic network, and $E_j$ is the eigenvector centrality of the neighbor topic $v_j$ of topic $v_i$. For a demanded knowledge topic $v_i$, $E_i$ reflects the ability of the reserve knowledge topic network to meet that topic. However, because the organization's reserve knowledge topics, the demanded knowledge topics, and the knowledge gap are all inherently vague, a crisp numerical value of the satisfaction ability may not describe it well. Therefore, the ability to satisfy a knowledge topic needs to be fuzzified.
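Eigenvector centrality is commonly defined by the relation $E_i = \gamma \sum_j R_{ij} E_j$, which matches the symbols described above, and it can be approximated by power iteration. The sketch below is a minimal version under stated assumptions: the weight matrix is symmetric and non-negative, the scores are scaled so that the largest equals 1, and the function name and the four-topic matrix are hypothetical.

```python
def eigenvector_centrality(weights, iterations=200, tol=1e-10):
    """Power iteration on a symmetric, non-negative weight matrix R.

    Scores are scaled so the largest equals 1 (a scaling ASSUMPTION;
    the eigenvector itself is only defined up to a constant factor).
    """
    n = len(weights)
    if n == 0:
        return []
    scores = [1.0] * n
    for _ in range(iterations):
        # Iterating with R + I (adding the previous score) keeps the same
        # eigenvectors but avoids oscillation on bipartite graphs.
        new = [scores[i] + sum(weights[i][j] * scores[j] for j in range(n))
               for i in range(n)]
        top = max(new)
        new = [x / top for x in new]
        if max(abs(a - b) for a, b in zip(new, scores)) < tol:
            return new
        scores = new
    return scores

# Hypothetical 4-topic reserve network: topic 0 is tied to every other topic.
R = [
    [0.0, 0.9, 0.8, 0.7],
    [0.9, 0.0, 0.1, 0.0],
    [0.8, 0.1, 0.0, 0.0],
    [0.7, 0.0, 0.0, 0.0],
]
E = eigenvector_centrality(R)
```

Topic 0 receives the highest score because it is strongly tied to all other topics, and the remaining topics are ranked by the strength of their ties to it. This reflects the point made in the text that importance depends on both the number and the importance of neighbors.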
In order to map the exact eigenvector centrality to the fuzzy domain of knowledge satisfaction, a fuzzy evaluation set needs to be established as follows: where $u_1$, $u_2$, and $u_3$ denote the degrees of membership of a knowledge topic's satisfaction ability in the levels poor, general, and good, respectively, and the linear membership functions are defined as follows: The fuzzy membership of the topic $v_i$ in the demanded knowledge topic set for each capability level can be calculated according to Equations (13)-(15). The fuzzy relation matrix can then be obtained by: Through the composition of fuzzy relations, the fuzzy evaluation vector for knowledge satisfaction ability is obtained when the organization is oriented to the demanded knowledge topic set.
The organizational knowledge satisfaction ability is identified according to the membership degree distribution of the components of the vector $B$. If the evaluation of satisfaction ability is good, the topics in the demanded knowledge topic set have corresponding topics in the reserve knowledge network, and these topics, together with their neighbor topics, occupy the center of the network. This indicates that the organization has long engaged in knowledge innovation activities on such topics and that the demanded knowledge can be fully met within the organization. If the evaluation is poor, the knowledge topics in the demanded knowledge topic set either do not appear in the reserve knowledge network or lie, together with their neighbor topics, at the edge of the network, and the organization needs to seek support from outside. If the evaluation is general, the demanded knowledge topics have corresponding topics in the reserve knowledge network, but these topics and their neighbors do not occupy the ideal central position; in this case, the organization can gradually meet the demanded knowledge topics through special training and other means.
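The fuzzification step can be illustrated with one possible set of piecewise-linear membership functions over breakpoints $a < b < c < d$ and a weighted-average composition of the topic weights with the fuzzy relation matrix. Both the function shapes and the composition operator are assumptions chosen for this sketch, since the text states only that the membership functions are linear. All names and sample values are hypothetical.

```python
def memberships(e, a=0.10, b=0.20, c=0.30, d=0.40):
    """Return (poor, general, good) membership degrees for a centrality e.

    ASSUMED shapes: "poor" falls from 1 to 0 on [a, b], "good" rises from
    0 to 1 on [c, d], and "general" takes whatever remains.
    """
    if e <= a:
        return (1.0, 0.0, 0.0)
    if e < b:
        up = (e - a) / (b - a)
        return (1.0 - up, up, 0.0)
    if e <= c:
        return (0.0, 1.0, 0.0)
    if e < d:
        up = (e - c) / (d - c)
        return (0.0, 1.0 - up, up)
    return (0.0, 0.0, 1.0)

def fuzzy_evaluation(weights, centralities):
    """Weighted-average composition: weights (normalized) times the relation matrix."""
    total = sum(weights)
    rows = [memberships(e) for e in centralities]  # one row per demanded topic
    return tuple(
        sum(w / total * row[k] for w, row in zip(weights, rows))
        for k in range(3)
    )

# Hypothetical topic weights and centralities, for illustration only.
B = fuzzy_evaluation(weights=[0.5, 0.3, 0.2], centralities=[0.05, 0.25, 0.45])
```

Because each row of memberships sums to 1 and the weights are normalized, the components of the resulting vector also sum to 1, so they can be read as shares of the demanded knowledge falling into each level.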

Case Study
In order to analyze the effectiveness of this method, an empirical study was conducted in the college of economics and management at a university in Beijing. The college has eight undergraduate majors, including economics, international trade, accounting, financial management, marketing, business administration, quality management, and human resource management, with three authorized disciplines of first-level Masters, namely management science and engineering, business administration, and applied economics. It has a research foundation in econometrics, knowledge management, science and technology management, quality management, human resources, asset evaluation, securities investment, corporate growth and mergers and acquisitions, financial accounting, business management teaching, and circular economy.
The Chinese Wikipedia corpus was taken as the complete knowledge set under the "Internet+" environment. PyNLPIR (a Chinese word segmentation system) provided by the Chinese Academy of Sciences was used for word segmentation, and the Word2Vec model was used for semantic modeling [33]. In this study, the skip-gram model in Word2Vec was used for training. The dimension of the word vectors was set to 200. Because the word frequencies in the corpus follow a power-law distribution, low-frequency words occurring fewer than five times were filtered out to reduce the size of the knowledge collection network.
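The low-frequency filter mentioned above can be sketched in a few lines. This is a stand-alone illustration and not the preprocessing code used in the study; the function name `filter_rare` and the sample tokens are hypothetical, and the documents are assumed to be already segmented into words.

```python
from collections import Counter

def filter_rare(tokenized_docs, min_count=5):
    """Drop every word that occurs fewer than min_count times in the whole corpus."""
    counts = Counter(tok for doc in tokenized_docs for tok in doc)
    return [[tok for tok in doc if counts[tok] >= min_count]
            for doc in tokenized_docs]

# Hypothetical segmented corpus: "system" occurs 5 times, "rare" only once.
docs = [["system", "system", "rare"], ["system", "system", "system"]]
kept = filter_rare(docs, min_count=5)
```

With a power-law frequency distribution, most distinct words are rare, so a small threshold removes a large share of the vocabulary while keeping most of the running text.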
In order to establish the organization's reserve knowledge network, the reserve knowledge text carrier was obtained through the China Knowledge Network full-text database. The college was used as the author unit to search for journal articles published in the range of 2014-2018, and 336 articles were obtained. The topics and abstract information of these articles were exported, and the PyNLPIR thesaurus was used to segment the abstract information: (1) Semantic modeling was performed using the Word2Vec model. The skip-gram model was selected, and the dimension of the word vectors was set to 150. The frequency threshold for filtering words was set to 5, and the semantic relevance $d_{ij}$ between high-frequency topic words was obtained. (2) The TF-IDF algorithm was used to extract keywords and obtain the co-occurrence degree $O_{ij}$ between high-frequency topic words. The intersection of the topic words extracted from the complete set network, the topic words obtained by the semantic modeling of the organization's reserve knowledge, and the keywords obtained from the co-occurrence analysis of the reserve knowledge was taken as the topic set of the organizational reserve knowledge network. With $\beta = 0.50$, an organizational reserve knowledge network was built with 517 nodes, 5228 edges, an average degree of 20.22, an average clustering coefficient of 0.43, an average path length of 2.87, and a degree distribution that obeys a power law. The topology of the network is shown in Figure 1.
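The reported average degree can be cross-checked from the node and edge counts alone, because every undirected edge adds one to the degree of two nodes. The short calculation below does this; the network density it also computes is a derived quantity that is not reported in the text.

```python
nodes = 517
edges = 5228

# Each undirected edge contributes to the degree of two nodes.
average_degree = 2 * edges / nodes

# Share of all possible node pairs that are actually connected
# (derived here for context; not a figure from the case study).
density = edges / (nodes * (nodes - 1) / 2)
```

Rounded to two decimals, the average degree from these counts agrees with the value of 20.22 given for the network.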
It can be seen from Figure 1 that the reserve knowledge of the economic and management college has formed three clusters of topics, two of which are closely associated and located at the center of the network. The first is the financial management topic cluster, with "economic benefits" as the core and "banking," "capital," "tax," "investor," "inventory," "reward rate," "return rate," and so on as important topic terms. The second is the system evaluation topic cluster, with "features" as the core and "system," "modeling," "structure," "function," "efficiency," and so on as important topic terms. It is worth noting that there is also a circular economy topic cluster with the main keywords of "population," "region," "cluster," "area," and "ecology," shown in Figure 1. Compared with the above two topic clusters, the circular economy topic cluster is still smaller, the association between its topic words is still weak, and its network location is far from the center, which is consistent with the status quo and trend of the discipline development of the college.
The topic of demanded knowledge is extracted from the demanded knowledge text carrier by applying the TF-IDF model. The demanded knowledge text carrier is the research content and research method of the project application of the Beijing Philosophy and Social Science Planning Office. Through this empirical analysis, we can understand whether this project can be completed independently in this college. In order to obtain the background corpus required for the calculation of the inverse document frequency, a wide range of materials were used to extract the corpus material of research content and the research method section, including 26 applications for the National 863 Program project since 2002, 14 applications for the 973 Program, 10 applications for the National Science and Technology Research Project, 121 applications for the National Natural Science Foundation of China and general programs, 17 applications for youth and general programs of Beijing Natural Science Foundation, 18 applications for national social science fund projects, and 27 applications for social science fund projects in Beijing.
At α = 0.1 and after applying Formulas (6)-(8) to the above corpus materials to extract high-frequency topic words, 18 topic words were obtained. The weight $tfidf_i$ of each topic word in the demanded knowledge was calculated according to Formula (9). At γ = 1, the eigenvector centrality $E_i$ of each topic word in the reserve knowledge network was calculated according to Formula (11). At a = 0.10, b = 0.20, c = 0.30, and d = 0.40, Formulas (13)-(15) were used to calculate the degrees of membership of each topic word's knowledge satisfaction ability in the levels poor, general, and good. The corresponding results are shown in Table 1. The degree of satisfaction of the organization on the reserve knowledge topic set was calculated according to Formula (17), and the fuzzy evaluation vector was B = (0.51, 0.34, 0.15). It can be seen from the fuzzy evaluation vector that the membership degree of the "poor" level is the largest, at 0.51. According to the principle of maximum membership degree, the ability of the organization to meet the current knowledge needs is "poor." It is recommended to seek support from outside the organization, namely to complete the project through cooperation with other research units.
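The principle of maximum membership degree used above amounts to picking the level with the largest component of the evaluation vector. A minimal sketch, with the hypothetical helper name `classify`, applied to the vector reported in the case study:

```python
LEVELS = ("poor", "general", "good")

def classify(evaluation_vector):
    """Return the level whose membership degree is the largest."""
    if len(evaluation_vector) != len(LEVELS):
        raise ValueError("expected one membership degree per level")
    best = max(range(len(LEVELS)), key=lambda k: evaluation_vector[k])
    return LEVELS[best]

B = (0.51, 0.34, 0.15)   # fuzzy evaluation vector reported in the case study
level = classify(B)      # "poor"
```

The margin between the first and second components (0.51 against 0.34) also gives a rough sense of how clear-cut the classification is.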

Results and Discussion
In this section, the natural language processing-based method of identifying and filling knowledge gaps proposed in this study is compared with SWOT analysis [8] and the Venn diagram [10] to verify the effectiveness of the method.
(1) Analysis based on SWOT analysis. With a focus on industry trends, research directions of the organization, existing research capabilities, and current research needs, 12 professors and young teachers were invited to brainstorm and discuss in a conference room of the college on November 17, 2020. The analysis shows that the opportunities faced by the organization are "the contradiction of resources and environment facing economic development is prominent." The threat is that "similar colleges in the local region have relatively distinctive industry characteristics." The strength is that the disciplines are relatively complete. The weakness is that the disciplines of economics and financial management are relatively deep, whereas the advantage of management is not prominent. The strategy of organizational development should be "condensing the industry characteristics of circular economy and management," and the knowledge gap is "management decision-making in the field of the circular economy, including system analysis, evaluation, and decision-making." The knowledge gap obtained by the SWOT analysis method is shown in Figure 2.
(2) Analysis based on the Venn diagram. On November 19, 2020, a total of 15 representatives (project team members, professors, and young teachers) were invited to a conference room of the college to conduct expert interviews based on the project research content to be completed in Section 4. According to the content of the interview meeting and combined with the Venn diagram method, the knowledge demand set and the knowledge reserve set needed to complete the project were sorted to obtain the knowledge gap. It can be seen that the set of the organization's reserve knowledge includes "Finance," "Accounting," "Investment," "Policy," "Tax Revenue," "Environment," "Industry," "Park," and so on, whereas the set of demanded knowledge includes "Network," "Environment," "Industry," "Park," and so on.
Among them, "Environment," "Industry," "Park," and so on form the intersection of the reserve knowledge set and the demanded knowledge set and are the knowledge needs that can be satisfied, whereas "Network" is the knowledge gap. In contrast with the method proposed in this study, knowledge gaps such as "Network structures," "Complexity," "Network Topology," and "Measurement" under the concept of "Network" were not identified. This is because, in expert interviews, the overall structure of knowledge that lies outside the experts' experience becomes a blind spot. As a result, the intensity of the gap is difficult to quantify. Therefore, the analysis based on the Venn diagram considers "Network" as "a knowledge gap with certain knowledge accumulation." The knowledge gap obtained by the Venn diagram analysis method is shown in Figure 3. The comparison of SWOT, the Venn diagram, and the method in this research on knowledge gap identification and filling is listed in Table 2.
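The Venn diagram comparison described above reduces to two set operations: the intersection gives the needs that the reserve can satisfy, and the set difference gives the gap. The sketch below uses only the example topics named in the text; both sets are partial, since the original lists end with "and so on."

```python
# Example topics quoted in the text (the full sets are not listed there).
reserve = {"Finance", "Accounting", "Investment", "Policy",
           "Tax Revenue", "Environment", "Industry", "Park"}
demanded = {"Network", "Environment", "Industry", "Park"}

satisfied = demanded & reserve   # needs already covered by the reserve
gap = demanded - reserve         # needs with no counterpart in the reserve
```

The result is a yes-or-no membership test per topic, which illustrates why this approach flags "Network" as a whole but cannot grade the intensity of the gap or reveal subtopics that the experts did not name.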

Table 2. Comparison of SWOT, Venn diagram, and the method in this research on knowledge gap identification and filling.

| Step | SWOT [8] | Venn Diagram [10] | Method in This Research |
|---|---|---|---|
| Set up the knowledge requirements set | Organization members adopt brainstorming and other methods to discuss and clarify the strategic intention of the organization and determine the knowledge needed to carry out its expected strategy. | Set up a knowledge demand set and draw a knowledge structure chart by means of brainstorming, interview, and investigation. | The TF-IDF algorithm is used to extract the subject words in the text of the required knowledge carrier and construct the requirement knowledge network. |
| Create a knowledge store set | Perform a knowledge-based SWOT analysis to create a map of existing knowledge resources. | Establish a knowledge storage set, describe organizational status, and draw a knowledge distribution map. | Semantic vectorization is carried out based on the Word2Vec model, and the knowledge co-occurrence relationship and semantic association are considered to establish the subject network of reserve knowledge. |
| Identification of knowledge gaps | Identify knowledge gaps by matching organizational knowledge resources and capabilities to strategic opportunities and threats. | Manually compare knowledge structure diagrams and knowledge distribution diagrams to identify the knowledge gap. | Eigenvector centrality is used to describe the importance of the required knowledge topic in the reserve knowledge topic network and identify organizational knowledge gaps. |
| Knowledge gap compensation method | Transform an organization's knowledge strategy into an organizational and technical architecture to support knowledge creation, management, and utilization processes to bridge these gaps. | Proposed three kinds of knowledge gaps, including knowledge gaps with knowledge accumulation and knowledge gaps without knowledge accumulation. | Establish a fuzzy evaluation set to evaluate organizational knowledge satisfaction ability. If the ability evaluation is good, the knowledge required by the organization can be fully satisfied within the organization. Otherwise, seek support outside the organization or gradually meet the requirements of knowledge topics through special training and other means. |

Conclusions and Discussion
Under the "Internet+" environment, on the one hand, the rate of organizational knowledge innovation has accelerated significantly. On the other hand, the volume of knowledge text carriers distributed over the Internet has increased dramatically, producing an information explosion. Therefore, using the knowledge in network text carriers to identify the knowledge gaps of innovative organizations efficiently and accurately, and providing corresponding gap compensation methods, is the key to winning the knowledge innovation competition under the "Internet+" environment.
In view of the above problems, this study builds a network of complete knowledge topics under the "Internet+" environment based on the Word2Vec model. In the context of this complete knowledge topic network, the word vectors and word frequencies of an organization's reserve knowledge texts are analyzed, and the organization's reserve knowledge topic network is established. Eigenvector centrality is used to analyze how well the reserve knowledge topics can satisfy knowledge demands. The demanded knowledge topics are identified based on the TF-IDF model. The fuzzy evaluation method is then used to assess the ability of the reserve knowledge topic network to satisfy the demanded knowledge topics, and corresponding compensation methods are proposed.
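To make the role of eigenvector centrality concrete, the following minimal sketch computes it by power iteration on a toy undirected topic network. It is an illustration under stated assumptions, not the implementation used in this study: the edge list and topic names are hypothetical, and the function `eigenvector_centrality` is written here from scratch. The iteration uses A + I rather than A, which leaves the dominant eigenvector unchanged but avoids oscillation on bipartite graphs.

```python
import math

def eigenvector_centrality(adj, iters=200, tol=1e-10):
    """Power iteration on an undirected graph given as {node: set(neighbors)}."""
    nodes = sorted(adj)
    x = {v: 1.0 for v in nodes}
    for _ in range(iters):
        # (A + I) x: own score plus the scores of all neighbors
        new = {v: x[v] + sum(x[u] for u in adj[v]) for v in nodes}
        norm = math.sqrt(sum(s * s for s in new.values()))
        new = {v: s / norm for v, s in new.items()}
        if max(abs(new[v] - x[v]) for v in nodes) < tol:
            return new
        x = new
    return x

# Hypothetical reserve-topic co-occurrence edges (not taken from the case study)
edges = [
    ("Finance", "Management"),
    ("Finance", "Evaluation"),
    ("Management", "Evaluation"),
    ("Evaluation", "Economy"),
    ("Economy", "Network"),
]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

scores = eigenvector_centrality(adj)
for topic, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{topic:12s} {s:.3f}")
```

In this toy network "Evaluation" ranks highest and "Network" lowest. A demanded topic with low centrality sits at the periphery of the reserve network, which is the intuition behind reading centrality as a proxy for how well the reserve supports that topic.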
These methods were applied to the college of economics and management of a university in Beijing as the research object. The resulting reserve knowledge topic network has 517 nodes, 5228 edges, an average degree of 20.22, an average clustering coefficient of 0.43, and an average path length of 2.87, and its degree distribution obeys the power-law characteristic of a scale-free network. The network has already formed two thematic clusters, financial management and economic system evaluation, and a circular economy topic cluster is being formed. The demanded knowledge was obtained from the research content of a project application of the Beijing Philosophy and Social Science Planning Office, and 18 topics were extracted. The fuzzy evaluation vector obtained is B = (0.51, 0.34, 0.15), which indicates that the college's current ability to meet the above knowledge needs is "poor," and it is recommended to seek external support.
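The reported figures can be checked with a short sketch. The average degree of an undirected network follows from the node and edge counts, and the verdict follows from the fuzzy evaluation vector if the maximum-membership principle is applied. Two assumptions are made here: that the maximum-membership principle is the decision rule, and that the grade labels other than "poor" are "general" and "good" (only "poor" is stated in the text). The function name `max_membership` is likewise hypothetical.

```python
# Network size reported for the reserve knowledge topic network
nodes, edges = 517, 5228
avg_degree = 2 * edges / nodes   # each undirected edge contributes to two node degrees
print(round(avg_degree, 2))      # 20.22

# Only "poor" is given in the text; "general" and "good" are assumed labels
grades = ("poor", "general", "good")
B = (0.51, 0.34, 0.15)

def max_membership(grades, B):
    """Return the grade whose membership value in B is largest."""
    return max(zip(grades, B), key=lambda pair: pair[1])[0]

print(max_membership(grades, B))  # poor
```

The computed average degree matches the reported 20.22, and the largest component of B, 0.51, falls on the first grade, consistent with the "poor" verdict.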
Among existing knowledge gap identification methods, SWOT analysis is a tool for strategic analysis; the knowledge gap it identifies is difficult to quantify, and it is not suitable for rapid-response identification. The Venn diagram is a quantitative method to study knowledge gaps, but it ignores the hierarchy and correlation between knowledge, and the strength of the knowledge gaps obtained is affected by expert experience. This study adopts a quantitative analysis method based on the Word2Vec model to obtain the co-occurrence and semantic relationships among knowledge, establishes the fuzzy evaluation set according to the fuzzy comprehensive evaluation method, and derives different compensation methods for different degrees of satisfaction. This paper thus presents a fast, objective, and quantitative method for knowledge gap identification and filling that addresses organizations' demand for rapid collaborative innovation under the "Internet+" environment. Digital document resources in distributed network storage are mined, and the reserve knowledge topics and their correlations are expressed in the form of a network graph, reflecting the background knowledge structure inside the organization. The knowledge gap of the organization is determined by matching the knowledge needs against the background knowledge of the organization. The corresponding remedy strategies are given according to the degree to which the knowledge demands are satisfied.
This study can be further improved in the following respects: (1) The fuzzy evaluation results are affected by the values of the parameters a, b, c, and d of the linear membership function. In the future, these parameters can be adjusted through feedback on the organization's ability to complete knowledge innovation activities. (2) Some demanded topics have no corresponding topic in the reserve knowledge network because the Chinese Wikipedia corpus and the word segmentation library used to establish the knowledge network lack domain specificity. A corpus and a word segmentation library dedicated to knowledge topic mining in innovation activities should be built in the future. (3) This research addresses the level of the innovative organization. Under the "Internet+" environment, the method can also be applied at the level of individual innovative talents to achieve more precise knowledge topic analysis and gap identification. (4) In addition to knowledge gap identification, the method proposed in this study can be applied to the topic analysis and matching of technology, data, service, and content resources.
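The sensitivity noted in point (1) can be illustrated with a small sketch. The exact form of the linear membership function is not restated in this section, so a trapezoidal shape with breakpoints a < b <= c < d is assumed here, and all numeric values are hypothetical.

```python
def trapezoid(x, a, b, c, d):
    """Piecewise-linear membership: 0 outside (a, d), 1 on [b, c], linear ramps between."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    if x <= c:
        return 1.0                 # plateau
    return (d - x) / (d - c)       # falling edge

score = 0.45  # hypothetical centrality-based score of a demanded topic

# Two hypothetical parameter settings for the same grade
m1 = trapezoid(score, 0.2, 0.4, 0.6, 0.8)
m2 = trapezoid(score, 0.3, 0.5, 0.7, 0.9)
print(m1, m2)
```

Shifting the breakpoints by 0.1 moves the same score from the plateau onto the rising edge, lowering its membership from 1.0 to about 0.75. This is why tuning a, b, c, and d against observed outcomes would stabilize the evaluation.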