Patent Analysis on the Development of the Shale Petroleum Industry Based on a Network of Technological Indices

Abstract: This study investigated technological developments in the shale petroleum industry by analyzing patent data using a network of technological indices. Technological development was promoted by the emergence of the shale industry and, after the first five years, showed a more complex pattern with the convergence of critical technologies. This paper describes progress in shale petroleum technologies as changes in the relatedness networks of technological components. Relatedness represents the degree of convergence between technological components, and the betweenness centrality of the network represents the priority of technological components. The results show that critical technologies such as directional drilling, increasing permeability, and smart systems were actively developed from 2012 to 2016. In particular, the unconverged technology of increasing permeability and the converged technology of directional drilling and smart systems have been intensively developed. Some technological components of the critical technologies are more significant in the form of converged technology.


Introduction
Unconventional petroleum has gathered great interest in recent times, especially with growing concerns over the depletion of conventional petroleum sources. In a few countries, unconventional petroleum is already being produced economically along with a steady growth in the industry.
In the US, production of unconventional petroleum, such as shale gas and tight oil, has increased. In 2008, shale gas and tight oil constituted as much as 16% and 12% of US production of natural gas and crude oil, respectively. In 2018, these shares reached 70% and 60%, respectively [1]. This economical production of shale petroleum (shale gas and tight oil) can be considered an achievement of technological advancement in shale petroleum. Despite the economic benefit of shale petroleum, environmental problems are the most important factor that could pose a threat to its production. According to Cooper et al. [2], shale petroleum causes greenhouse gas (GHG) emissions, water overuse, and local issues around the production site.
According to Holditch [3], the mechanisms for producing unconventional petroleum versus conventional petroleum are distinguishable in two ways. First, increasing the permeability of the underground formation is a critical mechanism for producing unconventional petroleum such as shale gas and tight oil, which are typically held in impermeable formations. Second, reducing the viscosity of hydrocarbons is a critical mechanism for producing unconventional petroleum such as heavy oil and oil shale. Permeable underground formations contain excessively viscous petroleum, making it difficult the technologies, such as equipment and device for drilling, extraction exploration, feed purification, and technology for digital simulation. Kim and Lee [25] argued that the critical technological aspects of the shale petroleum industry, such as obtaining resources, investigation, and data processing, were actively developed from 2010 to 2016.
This study describes the development of directional drilling (DD) and increasing permeability (IP) technologies, which are core technologies of the shale industry, in terms of their convergence with smart systems (SS). This framing reflects the fact that the shale industry's technological developments were focused on productivity improvement. This study found the following: First, the intensity of technological development, measured by the incidence of patents, increased significantly from 2012 to 2016. This point is concordant with the results of Sandrea [9], who observed progress in the productivity of shale petroleum technology. Second, since 2012, a high proportion of DD technologies have been developed in the form of convergence with SS. Third, IP technology had a low tendency to converge with DD and SS, but both the number of patents developed and the complexity of the technology were the highest. Furthermore, "reinforcing fractures by using prop" appeared to be the most critical field of IP technologies since 2012. This result is consistent with the results of Shah et al. [14], who also found that productivity improvement in the shale industry was achieved through reinforcing.

Data and Methodology
This study investigated the development of shale petroleum technologies by analyzing patent data. It utilized patent data related to production technologies of shale petroleum, such as IP, DD, and SS, from 1997 to 2016. For the analysis, this study calculated the association strength and betweenness centrality by utilizing the most finely distinguished scope (full digit) of the technological index (TI) of a patent, such as the international patent classification (IPC) and the cooperative patent classification (CPC).
The analysis of this study has some distinctive features. From the data collection stage, it focuses on only a portion of the technologies of the shale petroleum industry: 26 of the approximately 3900 technology indices. Thus, it is limited in its ability to present a comparative analysis of various technologies or to present new technologies in an exploratory manner. Still, this study has some advantages. It focuses on the critical technologies of the shale petroleum industry and uses association strength, instead of cosine similarity, as the similarity measure. According to van Eck and Waltman [26], association strength is an unbiased measure compared to other similarity measures, such as cosine similarity. This is because association strength is not substantially correlated with the occurrence of the input data, whereas cosine similarity is positively correlated with it. That is, under cosine similarity, a frequently occurring TI tends to have higher similarity than a less frequently occurring TI. Lastly, this study distinguishes and presents the results by technological domain, in both numerical and visualized form.
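The frequency bias mentioned above can be illustrated with a small sketch. It assumes the textbook forms of the two measures, c_ij / (s_i s_j) for association strength and c_ij / sqrt(s_i s_j) for cosine similarity, and uses invented counts; the function names and numbers are hypothetical and are not taken from the patent data of this study.

```python
import math

def association_strength(c_ij, s_i, s_j):
    """Co-occurrences divided by the product of the occurrence counts."""
    return c_ij / (s_i * s_j)

def cosine_similarity(c_ij, s_i, s_j):
    """Co-occurrences divided by the geometric mean of the occurrence counts."""
    return c_ij / math.sqrt(s_i * s_j)

# Hypothetical counts in a set of 1000 patents. Both pairs co-occur exactly
# as often as independence would predict (s_i * s_j / 1000).
frequent = dict(c_ij=160, s_i=400, s_j=400)  # two frequently occurring TIs
rare = dict(c_ij=10, s_i=100, s_j=100)       # two less frequent TIs

for name, pair in (("frequent", frequent), ("rare", rare)):
    print(name, association_strength(**pair), cosine_similarity(**pair))
```

In this toy case the association strength is identical for both pairs, while the cosine similarity is four times higher for the frequent pair, even though neither pair is more related than chance.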

Data
We retrieved patent data from the Korea Intellectual Property Rights Information Service (KIPRIS), an online patent database system of the Korean Intellectual Property Office (KIPO) [27]. The retrieved data set includes patents for the technologies of American shale petroleum. The retrieval process was as follows: First, we built a search query focusing on three critical technologies of unconventional petroleum, namely "directional drilling," "stimulating production by increasing permeability," and "smart system for control, surveying, or testing." This is because, as described in Section 1, DD and the stimulation technologies of unconventional petroleum have played the critical role in production and initiated the industrial growth of unconventional petroleum in the US. Furthermore, SS are required to facilitate productive operation, advance the apparatus or method of DD, and increase permeability [28]. Second, we focused on patent applications filed with the US patent office. This is because only the US has seen substantial growth of an unconventional petroleum (shale gas and tight oil) industry since 2007.
To collect patent data, we built a search query that comprised the TIs of three technological domains: directional drilling (DD), stimulating production by increasing permeability (IP), and smart systems for control, surveying, or testing (SS). The TIs pertaining to the three technological domains, with their descriptions, are shown in Table 1. In Table 1, the abbreviations "DD," "IP," and "SS" indicate the TIs "E21B 7/," "E21B 43/," and "E21B 44/," respectively. For example, TI IP26 indicates E21B 43/26. As shown in Table 1, this study classifies the retrieved patents into six kinds of technologies: the three unconverged technologies (DD, IP, and SS) and three kinds of converged technologies, namely the convergence of DD and IP (CDI), the convergence of DD and SS (CDS), and the convergence of IP and SS (CIS), each of which involves TIs from both of the respective domains.
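The six-way classification described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes the mapping DD = E21B 7/, IP = E21B 43/, SS = E21B 44/ (consistent with the example that IP26 denotes E21B 43/26), it matches on the whole prefix rather than on the 26 specific TIs of Table 1, and the label "other" for patents that match none or all three domains is a hypothetical choice because the text does not say how such patents are handled.

```python
# Assumed prefix-to-domain mapping (see the lead-in for the caveats).
PREFIX_TO_DOMAIN = {"E21B 7/": "DD", "E21B 43/": "IP", "E21B 44/": "SS"}

# Labels for the three unconverged and three converged technologies.
LABELS = {
    frozenset({"DD"}): "DD",
    frozenset({"IP"}): "IP",
    frozenset({"SS"}): "SS",
    frozenset({"DD", "IP"}): "CDI",
    frozenset({"DD", "SS"}): "CDS",
    frozenset({"IP", "SS"}): "CIS",
}

def domains_of(codes):
    """Return the set of domains touched by a patent's classification codes."""
    found = set()
    for code in codes:
        for prefix, domain in PREFIX_TO_DOMAIN.items():
            if code.startswith(prefix):
                found.add(domain)
    return found

def classify(codes):
    """Map a patent's codes to one of the six labels, or 'other' (hypothetical)."""
    return LABELS.get(frozenset(domains_of(codes)), "other")

print(classify(["E21B 7/04", "E21B 44/00"]))  # a DD code plus an SS code
print(classify(["E21B 43/267"]))              # an IP code only
```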
Through the search query, we found 12,964 patent applications filed from 1960 to 2019. However, this study focused only on the 6421 granted patents that were applied for from 1997 to 2016, as shown in Figure 1. In Figure 1, the blue line represents the number of applied patents, the orange line represents the number of granted patents, and the gray dashed line represents the granted ratio. The granted patent count is ordered by application date. The blue line rises in two phases, around 1997 and 2007. The supposed reasons for this trend are that the spot prices of natural gas were listed at Henry Hub in January 1997, and that the shale petroleum industry started commercial production in early 2007. In addition, the applied patent counts, the granted patent counts, and the granted ratio have decreased rapidly since 2017. This sharp decline in observations (blue and orange lines) is attributed to incomplete aggregation in the database. Thus, this study excludes the observations applied for since 2017. Moreover, this study recognizes the occurrence of a technology only when its patent data occur consecutively over at least 5 years; discontinued observations are therefore excluded.
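The continuity rule at the end of this paragraph can be read in more than one way. The sketch below shows one possible reading, in which a TI is kept only if it appears in at least five consecutive application years. The function names, the `min_run` parameter, and the sample data are hypothetical.

```python
def longest_consecutive_run(years):
    """Length of the longest run of consecutive calendar years in `years`."""
    best = run = 0
    previous = None
    for year in sorted(set(years)):
        # Extend the run if this year directly follows the previous one.
        run = run + 1 if previous is not None and year == previous + 1 else 1
        best = max(best, run)
        previous = year
    return best

def recognized(years_by_ti, min_run=5):
    """Keep only the TIs observed in at least `min_run` consecutive years."""
    return {
        ti for ti, years in years_by_ti.items()
        if longest_consecutive_run(years) >= min_run
    }

# Hypothetical application years per TI.
years_by_ti = {
    "IP26": [2010, 2011, 2012, 2013, 2014],  # five consecutive years
    "DD04": [2005, 2007, 2008, 2009, 2010],  # gap in 2006, longest run is four
}
print(recognized(years_by_ti))
```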

Methodology
This study utilizes association strength as the measure of technological relatedness between the TIs of patents [30], and betweenness centrality as a priority measure of a TI [31]. The betweenness centrality was calculated using the software package NetworkX [32].

Technological Relatedness: Association Strength
In this study, we calculated the association strength similarity. The calculation followed the formulas provided below [30].
In Equation (1), $cti_{ij}$ is the number of co-occurrences of TIs $i$ and $j$. $ti_{pi}$ and $ti_{pj}$ are indicator terms for counting the TIs in patent $p$: $ti_{p\cdot}$ is one ($ti_{p\cdot} = 1$) when TI $i$ or $j$ exists in patent $p$, and zero ($ti_{p\cdot} = 0$) when it does not. Thus, a patent contributes one to $cti_{ij}$ only when both $ti_{pi}$ and $ti_{pj}$ are one. $m$ is the total number of patents in a five-year research period.
In Equations (2) and (3), $sti_{i}$ or $sti_{j}$ is the total number of co-occurrences of TI $i$ or $j$ in a five-year research period. $T$ is the sum of the total co-occurrence numbers over all TIs.
In Equation (5), $S^{C}_{ij}$ is the similarity measure between TIs $i$ and $j$ that occur in a 5-year research period.
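The equations referred to above are not reproduced in this text. The following is a tentative reconstruction based only on the symbol definitions given here. Equations (1) to (3) follow directly from those definitions. Equation (4) and the factor $T$ in Equation (5) are assumptions: the proportional form $cti_{ij} / (sti_{i}\, sti_{j})$ is the usual definition of association strength, but the exact normalization used in the study may differ.

```latex
\begin{align}
cti_{ij} &= \sum_{p=1}^{m} ti_{pi}\, ti_{pj} \tag{1} \\
sti_{i}  &= \sum_{j \neq i} cti_{ij} \tag{2} \\
sti_{j}  &= \sum_{i \neq j} cti_{ij} \tag{3} \\
T        &= \sum_{i} sti_{i} \tag{4, assumed} \\
S^{C}_{ij} &= \frac{T \, cti_{ij}}{sti_{i}\, sti_{j}} \tag{5, assumed normalization}
\end{align}
```

Under this reading, $S^{C}_{ij}$ is the ratio of the observed co-occurrences of $i$ and $j$ to the number expected if co-occurrences were distributed independently of the TIs involved, so values above one indicate TIs that appear together more often than chance.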

Betweenness Centrality
This study uses betweenness centrality to determine the comparative importance of TIs in the graph of shale petroleum technologies [31].
$$C_B(k) = \sum_{v \neq k \neq w \in K} \frac{\sigma_{vw}(k)}{\sigma_{vw}} \qquad (6)$$
In Equation (6), $C_B(k)$ is the betweenness centrality of TI $k$. $k$ is an element of $K$, the set of all TIs. Each TI is a node in the graph, which comprises the TIs and their similarities. $\sigma_{vw}$ denotes the number of shortest paths from node $v$ to node $w$, and $\sigma_{vw}(k)$ denotes the number of those shortest paths that pass through node $k$.
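To make the measure concrete, the sketch below computes unnormalized betweenness centrality on a toy graph with a breadth-first variant of Brandes' algorithm. It is a dependency-free illustration rather than the study's code: the study used NetworkX, whose `betweenness_centrality` function normalizes by default and can take edge weights, whereas this sketch assumes an undirected, unweighted graph. The toy edges are invented.

```python
from collections import deque

def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected, unweighted graph.

    `adj` maps each node to the set of its neighbours.
    """
    centrality = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search: count shortest paths and record predecessors.
        sigma = {v: 0 for v in adj}
        sigma[source] = 1
        dist = {source: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair (v, w) was visited from both ends.
    return {v: c / 2 for v, c in centrality.items()}

# Hypothetical network: SS00 links three TIs that are not linked to each other.
graph = {
    "SS00": {"DD04", "DD06", "IP26"},
    "DD04": {"SS00"},
    "DD06": {"SS00"},
    "IP26": {"SS00"},
}
result = betweenness_centrality(graph)
print(result)
```

Here SS00 lies on the only shortest path between each of the three pairs of other nodes, so its centrality is 3 and that of every other node is 0.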

Development of Unconventional Petroleum
Section 3 focuses on describing the technological development and convergence of the unconventional petroleum technologies. First, we can easily identify differences in the extent of technological development of unconventional petroleum by validating the annual patent counts of each technological domain. Figure 2 presents information concerning the granted patent counts of three technological domains and their converged technologies, and the weight of converged technology of DD, IP, and SS.
In Figure 2A, the orange, gray, and yellow lines represent the annual counts of granted patents that include the TIs of DD, IP, and SS, together with their converged technologies. Broadly, the granted patent counts of the three technological domains increased from 1997 to 2014. In particular, from 2009 to 2014, patents related to IP (gray line) rapidly expanded from 170 to 464 patents per annum. From 2011 to 2014, patents related to DD (orange line) rapidly expanded from 99 to 186 patents per annum, and patents related to SS (yellow line) rapidly expanded from 62 to 186 patents per annum. Figure 2B shows the weight of converged technology in the three technological domains (DD, IP, and SS). Interestingly, the weight of converged technology for IP (gray line) is very low. The weights of converged technology for DD and SS (orange and yellow lines, respectively) fluctuated around 0.2 from the 1990s to the 2000s and have increased from 0.2 to 0.4 since 2011. Figure 2C shows the annual counts of granted patents in converged technologies. The convergence of DD and SS (orange line) always shows higher annual counts than the others and expanded from 2011 to 2014. The convergence of IP and SS (gray line) showed a zero count before 2009 and then reached 11 patents per annum at its peak in 2014. The convergence of DD and IP (yellow line) also showed a zero count before 2004; it expanded from 2 patents per annum in 2011 to 14 patents per annum in 2016. Figure 2D shows the trend in the annual count of granted patents for DD, IP, and SS, which is very similar to the orange, gray, and yellow lines of Figure 2A. This is because the weights of the converged technologies are quite stable over the research period, as shown in Figure 2B.
In summary, the intensity of technological development has increased in the last 20 years. Moreover, in the past 10 years, converged technologies such as CDS, CIS, and CDI have been developed. Technologies related to DD and SS show a lower extent of technological development with a relatively higher weight of converged technology than IP. Technology related to IP shows a higher intensity of technological development with a lower weight of converged technology. Only two technological domains, CDS and CDI, rebounded in their annual count of granted patents in 2016. In the next subsections, this study presents technological development from the network aspect of technological relatedness.

Network of Shale Petroleum Technologies
This subsection describes the technological development of shale petroleum by presenting network properties and visualizing the networks of technological relatedness. The network properties show the development of the patent set from the aspect of a network of technological relatedness. Table 2 presents the network properties of patent technological relatedness in 5-year periods and thereby describes the network development across research periods. The patent counts grew by 94% on average in each period, while the number of nodes grew by 83%, the number of edges by 123%, and the ratio between edges and nodes by 21%. As described in the previous subsection, the values in Table 2 also show that the intensity of development in shale gas technology has increased. In addition, the number of nodes, that is, the number of TIs, has also increased. Meanwhile, the connections between nodes have increased more rapidly. The increase in edges and nodes in these networks could mean that the patents contain more combinations of TIs than before. An increased combination of TIs means an increased combination of new technological components, and thus the emergence of new technology is expected.
Figure 3 shows networks of technological relatedness to help understand the growth of the networks. Only the networks of the first and last periods are visualized; the intermediate periods are omitted because the networks show a steadily increasing trend during the research period. Figure 3A,B show the visualized networks of technological relatedness of shale petroleum technologies from 1997 to 2001 and from 2012 to 2016, respectively. Figure 3 was drawn using Gephi, a network visualization software [33]. In Figure 3, the letters in orange, red, and blue represent the TIs of IP, DD, and SS, respectively.
When Figure 3A is compared to Figure 3B, the latter has a more compact shape than the former. This difference in the visualized results is due to the quantitative differences between the two patent groups in patent count, in the number of nodes and edges, and in the ratio between the number of edges and nodes, as shown in Table 2. In addition, the TIs of DD and SS have moved closer together, while the TIs of IP are still separated from the other TIs. This seems to be influenced by the high number of CDS patents, as shown in Figure 2C. These differences in the distance between TIs can be described by the association strength similarity of the TIs, which represents technological relatedness. The technological relatedness between technological domains is presented in Table 3.
Energies 2020, 13, x FOR PEER REVIEW 9 of 16
Table 3 presents the association strength between the center nodes of the technological domains, the number of edges between technological domains, and the sum of the association strength similarity of the edges between technological domains. Rows 1-3 of Table 3 show the association strength, which represents the weight of the edge between the center nodes of technological domains such as DD, IP, and SS. These association strength similarities show the relatedness between the technological domains. In addition, the higher the association strength, the closer the distance between the converged technologies in the network of TIs. In rows 4-6 of Table 3, the number of combinations of TIs between technological domains represents the number of ways in which technologies converge. Thus, the convergence of DD and SS represents CDS, the convergence of IP and SS represents CIS, and the convergence of DD and IP represents CDI. In rows 7-9 of Table 3, the sum of association strength over each combination of TIs shows the changes in the aggregate quantitative relatedness between technologies.
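The two aggregate quantities described for Table 3, the number of edges between technological domains and the sum of their association strengths, can be computed from an edge list as in the sketch below. The edge weights are invented, and reading the domain from the first two letters of a TI label (for example, "DD04" belongs to DD) is an assumption based on the labelling convention of Table 1.

```python
from collections import defaultdict

def domain_of(ti):
    """Read the domain from a TI label, e.g. 'DD04' -> 'DD' (assumed convention)."""
    return ti[:2]

def cross_domain_summary(edges):
    """Count the edges and sum the association strength for each pair of domains.

    `edges` is an iterable of (ti_a, ti_b, association_strength) tuples.
    Edges inside a single domain are skipped.
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    for ti_a, ti_b, strength in edges:
        pair = tuple(sorted((domain_of(ti_a), domain_of(ti_b))))
        if pair[0] == pair[1]:
            continue
        counts[pair] += 1
        totals[pair] += strength
    return dict(counts), dict(totals)

# Hypothetical edges; the weights are not values from Table 3.
edges = [
    ("DD04", "SS00", 0.5),
    ("DD06", "SS005", 0.25),
    ("IP26", "SS00", 0.125),
    ("DD04", "DD06", 1.0),  # within DD, ignored
]
counts, totals = cross_domain_summary(edges)
print(counts)
print(totals)
```

In this toy input, the pair ("DD", "SS") corresponds to CDS with two edges, and ("IP", "SS") corresponds to CIS with one edge.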
In the case of CDS, the association strength similarity between DD (DD04) and SS (SS00) increased by 2.80% over approximately 20 years. The number of combinations approximately doubled over the same period, and the sum of the association strength increased by 33.82%. These changes are in concordance with the results from Figure 3. They show that technological convergence is taking place in a more detailed way, which implies an improvement in the level of technological development.
In the case of CIS, the association strength similarity between IP (IP26) and SS (SS00) appeared only in the last two periods. Compared to its initial state, the association strength from 2012 to 2016 increased by 35.61%, the sum of association strength increased by 175.33%, and the number of combinations more than doubled. However, the number of edges was the lowest among the converged technologies. Furthermore, compared to the other technologies, CIS emerged recently and has not grown yet. Information pertaining to granted patent counts and other network properties by technological domain is summarized in Table 4. In Table 4, while the granted patent counts of CIS increased, the ratio between the edges and nodes of CIS in the period from 2012 to 2016 decreased compared to its initial state.
In the case of CDI, the association strength between DD (DD04) and IP (IP26) appeared in the last three periods. Compared to its initial state, the association strength increased by 121.53%, while the sum of association strength increased by 305.76%. Interestingly, the number of edges doubled in each of these periods. These results show that CDI has developed by increasing the variety of ways in which the technologies converge. In particular, the newly emerged ways of technological convergence seem to increase the technological relatedness between DD and IP. In Table 4, the granted patent count for CDI is six to ten times larger in the period 2012 to 2016 than in the previous periods, and the ratio between edges and nodes doubled compared to the previous periods. Table 4 also shows that IP (15.93) has the highest ratio between edges and nodes among the technology areas, followed by CDS (12.96), DD (12.03), SS (10.84), CDI (9.95), and CIS (6.48). Interestingly, although IP has converged less with DD and SS, the technological development of IP shows that the combination of technology elements has progressed in a more complex pattern (high connectivity).
As described above, not only have shale petroleum technologies developed, but the relationships between technologies have also changed. Next, this study attempts to determine the priorities of technological components, which are assumed to have changed with the development of technology, by presenting the betweenness centrality of the TIs.

Priority of Technological Components
In this section, the betweenness centrality is presented to validate the priority of technological components by technological domain. Table 5 presents the betweenness centrality of the top 10 TIs for the four periods between 1997 and 2016. In Table 5, SS00 (automatic control system for drilling) is always ranked at the top. This result implies that the technological development of the shale petroleum industry has focused on the smart system, which also has a high weight of convergence with DD (Figure 2B). IP267 (reinforcing fractures by using prop) was ranked the second highest from 2007 to 2016. This result is in concordance with the results of Shah et al. [14]. Furthermore, DD04 (directional drilling), DD046 (horizontal drilling), DD06 (deflecting direction of borehole), and DD061 (tools such as shaft rotating inside a non-rotating guiding traveling) have been ranked between second and fifth. Interestingly, DD046 was ranked eighth in the last period, while DD06 was still ranked fourth. This result may imply that the optimization of the production process depends more on the technologies related to deflecting the direction of the borehole than on horizontal drilling, which is a recent technological development. SS005 (underground automatic control system), SS06 (tool feeds' automatic control, which responds to the flow or pressure of the motive fluid of drive), SS02 (automatic control of the tool feed), and DD068 (drilling by using down-hole drilling motor) have frequently been ranked between sixth and tenth. These TIs have one point in common: they can be applied to underground drilling systems. Lastly, IP263 (fracturing by using explosive) has risen in betweenness centrality rank since 2007. This rise may indicate that the diversification of fracturing methods has gathered interest since the year the industry came into existence.
To describe the primary focus of the technological domains in the recent period, this study presents the betweenness centrality of the TIs by the six technological domains from 2012 to 2016 in Table 6. Results with values less than 0.001 are omitted. As shown, there is a large difference between the betweenness centralities of DD06 and DD046: that of DD06 is approximately seven times larger than that of DD046. Thus, the unconverged DD technologies have recently focused on the deflecting direction technologies. The unconverged technology of IP has mainly focused on IP267, reinforcing fractures (refracturing), in the recent period. In the case of the unconverged technologies of SS, the betweenness centrality of SS00 is approximately 11 times larger than that of SS02, which is ranked second highest. The detailed component of the technology is therefore not important for the development of the unconverged technology of SS. In the case of CDS, the betweenness centralities of the top two TIs are larger than those of the others, and the top two TIs do not relate to the detailed function of the technologies. This shows that although the technological development of CDS is undertaken with a higher intensity than that of the other converged technologies, the form of technological development is less related to the details of the technological component. CIS has only one TI, which shows that CIS has not grown yet. In the case of CDI, the betweenness centralities of the TIs are quite similar. Interestingly, a more general level of TI has the lowest ranking value. Thus, it can be presumed that although CDI has a low intensity of technological development, these developments have shown relatively specific functions. In addition, converged technologies such as CIS and CDI show less variety in the priority technological components that have high betweenness centrality.
Interestingly, some technology components, such as DD046, DD067, and SS005, ranked relatively higher within the converged technological domains than within the unconverged technological domains. DD046 shows a lower priority than DD04 and DD06 within DD in Table 6, yet it shows the highest priority within CDI. DD067 is not shown within DD, but it has the fourth priority within CDS. SS005 shows a lower priority than SS00 and SS02 within SS, but a higher priority than SS02 within CDS. Thus, DD046 is an important component in the form of CDI technology; when DD046 is developed in the form of CDI, it is considered more important as a technological component. Similarly, DD067 and SS005 are considered important technological components in the form of CDS technology.

Conclusions
This study attempted to shed light on the technological development of the US shale petroleum industry. It focused on the developments and convergences of critical technologies such as directional drilling, increasing permeability, and smart systems. To analyze the technological progress, this study measured association strength as the relatedness of technological components, described technologies as a network of technological components, and measured betweenness centrality as the priority of technological components. The results can be summarized as follows: First, technological development has been intensively conducted since 2012. Second, the development of DD technologies was closely related to SS from 2012 to 2016. Except for CDS, the development of converged technologies is lower in intensity and variety than that of unconverged technologies. Third, the IP technologies are less converged with the other technological domains of directional drilling and smart systems. However, IP technologies have developed intensively, with higher complexity, more components, and a larger number of inventions than the other technological domains. Fourth, some technologies are more significant in the form of converged technology. Horizontal drilling (DD046) is significant in the form of CDI. Tools locking sections of underground apparatus (DD067) and underground automatic control systems (SS005) are significant in the form of CDS.
This study has limitations, as its focus is restricted to only some technologies of the shale petroleum industry. However, it provides specific information on the respective technologies by using the most specific level of data (full digit of TI). Thus, further investigation is required to analyze other key technologies of the shale petroleum industry using the most specific level of data.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations and symbols are used in this manuscript:
$cti_{ij}$: Number of co-occurrences of technological indices $i$ and $j$
$i, j$: Technological index in a patent $p$
$m, n$: Number of patents for some five-year research period
$ti_{pi}, ti_{pj}$: Term for counting the number of technological indices in a patent
$p$: One out of many patents for some research period
$sti_{i}, sti_{j}$: Total co-occurred number of technological index $i$ or $j$ for some research period
$T$: Sum of total co-occurred number of whole technological indices
$S^{C}_{ij}$: Similarity measure between technological indices $i$ and $j$
$C_B(k)$: Betweenness centrality of technological index $k$
$\sigma_{vw}(k)$: Number of shortest paths through node $k$
$\sigma_{vw}$: Number of shortest paths from node $v$ to node $w$
$K$: Set of whole technological indices
$k, v, w$: Technological index constituting a node of the association similarity (technological relatedness) network