Patent Technology Network Analysis of Machine-Learning Technologies and Applications in Optical Communications

Abstract: As the Internet of Things (IoT) develops, applying machine learning to optical communications has become a promising field of research. Scholars have mostly concentrated on algorithmic techniques or specific applications but have not addressed the distribution of machine-learning technologies and the development of their applications in optical communications from a macro perspective. Therefore, in this paper, machine-learning patents in optical communications are taken as the analytical basis for constructing a patent technology network. The study results revealed that key technologies were primarily in data input and output devices, data-processing methods, wireless communication networks, and the transmission of digital information in optical communications. Such technologies were also applied to measurement for diagnostic purposes and to medical diagnoses. The technology network model proposed in this paper explores the technological development trends of machine learning in optical communications and serves as a reference for allocating research and development resources.


Introduction
Optical communications have greatly advanced in signal serial communication speed, agile channel spacing, modulation formats, and coding schemes. However, relevant technologies have yet to fully meet the complexity and performance requirements of future optical communication system networks. The distinguishing features of machine learning are its autonomous learning and evolution ability. With new information, a system can modulate its structure and parameters to build a new mapping network and enable new skills [1]. At present, machine-learning technologies play a significant role in network planning, failure prediction, and optical performance monitoring in optical communication systems [2][3][4]. In the future, intelligent optical communication system networks will be automated and adaptive and become capable of predicting traffic demands to maximize performance. To achieve this goal, the integration of machine-learning mathematics, programming, and algorithms is necessary in optical communications, and these are the key directions of future optical communication development.
Integrating machine learning and optical communication technologies (in essence, combining computer science with communication) is a forward-looking research field. Machine-learning technologies have been highly effective in classification tasks, particularly when signals are non-linear and distorted [5]. Machine learning predicts and eliminates defects in a system by learning its properties. Current signal analysis systems are ineffective in classifying signals. However, machine learning can be applied to such systems to identify patterns in collected data and boost the systems' signal analysis performance. Indeed, more scholars have begun to focus on developing this technology [2,4,6].
To satisfy the growing traffic demands of mobile communications and to transmit diverse, high-quality data through technologies such as the IoT, the development of optical communications aims to integrate machine learning and create deeply intelligent control and management systems. Research topics involving machine learning in optical communications are diverse and include optical performance monitoring, fiber non-linearity compensation, cognitive network failure prediction, dynamic planning, the cross-layer optimization of software-defined networks, the quality of transmission estimation, and the physical layer design of optical communication systems [23]. These techniques involve optics, mathematics, computer science, communications, and semiconductors and belong to cross-disciplinary technical fields. Developments in IoT technologies, mobile communications, and optical communications have led multiple governments to focus on machine-learning applications in optical communications. Optical communications will gradually become widespread among most users. The sending of large volumes of sound, video, and image data relies on AI technologies such as machine learning for computing and transmission, and the range of applicable fields is wide. Therefore, this study analyzed the technological development and applicable fields of machine learning in optical communications, and patents were examined to identify key technologies. These key technologies were explored through network analysis, which is further explained in the following section.

Technology Network Analysis Model
In recent years, studies have explored the development and trends of technological innovation through network analysis methods [24,25] or sought to understand the technical partnership between institutions or inventors through network analysis [26,27] to explore the flow of knowledge [28,29]. Network analysis can be used to accurately illustrate the transmission paths and evolution of technology and knowledge. Analysis of patent data in particular can provide objective and feasible data, such as the year the patent was approved, the quantity of patents, and the type of technology used [13]. Therefore, this study used patent data to analyze the development of specific technologies, and based on the features of network analysis, the relationships between technical fields were analyzed using co-classification. Because each patent may be involved in multiple technical fields, co-classification can be used to define the relationship between technical fields [30,31] and pinpoint key technologies on the basis of technology networks. The classification framework of technical fields was based on the current IPC system, and this study used the technology network analysis model to explore key technical fields in machine learning in optical communications.

Retrieval Strategy and Data Source
In this study, the patent analysis was based on data from the United States Patent and Trademark Office (USPTO), a historic organization whose development and data can be traced back to 1975. Because the United States is the largest commercial trade market in the world, most inventors who apply for patents in other countries also submit patent applications in the United States. Therefore, researchers generally use the USPTO database to examine global innovation activities [13,14]. Patents explored in this study were limited to US patents that were announced between January 2015 and December 2019. In this study, the patents were retrieved using SSTO, and the search criteria were (SSTO/Machine Learning) and (SSTO/Optical Communications), which returned 824 patents in total.

Network Centrality Analysis
In this study, key technologies in patent technology networks were identified through technology network centrality. The methods of measuring network centrality are explained as follows.

Degree Centrality
Degree centrality is the number of nodes that are adjacent to a specific node and can be used to evaluate the core of a patent technology network. High degree centrality represents a greater number of connected nodes in a network, and degree centrality in specific networks of links represents critical transitions that will become a hot spot in the network [32].
$$C_d(i) = \sum_{j} m_{ji}$$
where $m_{ji} = 1$ if nodes $i$ and $j$ are connected and $m_{ji} = 0$ otherwise.
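The degree count described above can be sketched in a few lines of Python. This is a minimal illustration, assuming an undirected network stored as a NumPy adjacency matrix; the function name `degree_centrality` and the four-node example network are hypothetical and not taken from the study's data.

```python
import numpy as np

def degree_centrality(adjacency: np.ndarray) -> np.ndarray:
    """Count, for each node, how many distinct nodes it is linked to."""
    binary = (adjacency > 0).astype(int)  # m_ji = 1 if nodes i and j are connected
    np.fill_diagonal(binary, 0)           # a node is not adjacent to itself
    return binary.sum(axis=1)

# Hypothetical 4-node network: node 0 is linked to every other node.
A = np.array([
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
])
degrees = degree_centrality(A)
print(degrees)  # [3 2 2 1]
```

Link weights are binarized first because the definition counts adjacent nodes rather than co-occurrence frequencies.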

Eigenvector Centrality
Eigenvector centrality measures the influence of a node in a network. In addition to whether a node is connected with other nodes, relevant analysis focuses on whether the nodes connected to this node are linked with other nodes. The centrality of a node is determined by the centrality of its adjacent nodes. If a node is connected to nodes with high centrality, the node in question has higher eigenvector centrality. Thus, two nodes with the same number of links can differ in importance when their adjacent nodes differ in centrality. Analysis of eigenvector centrality can determine the relative importance of a node and constitutes a crucial research field in technology networks.
$$C_e(i) = \frac{1}{\lambda} \sum_{j} a_{ij} C_e(j)$$
where $C_e(i)$ and $C_e(j)$ are the eigenvector centrality of nodes $i$ and $j$, respectively; $a_{ij}$ is the $(i, j)$ entry of the adjacency matrix $A$; and $\lambda$ is a constant, namely the largest eigenvalue of the adjacency matrix $A$. In this equation, eigenvector centrality views the centrality of a single node as the linear combination of the centrality of all other nodes to derive a linear function [33].
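One way to see how each node's score depends on its neighbors' scores is power iteration, sketched below. This is an illustrative sketch rather than the software routine used in the study: it assumes a connected, undirected network with a symmetric adjacency matrix, and the function name, the identity shift, and the example network are assumptions made here.

```python
import numpy as np

def eigenvector_centrality(adjacency, max_iter=1000, tol=1e-12):
    """Return (centrality vector, largest eigenvalue) by power iteration."""
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    shifted = A + np.eye(n)        # shift avoids oscillation on bipartite networks
    c = np.ones(n) / np.sqrt(n)    # start from equal scores
    for _ in range(max_iter):
        nxt = shifted @ c          # each node collects its neighbors' scores
        nxt /= np.linalg.norm(nxt)
        done = np.linalg.norm(nxt - c) < tol
        c = nxt
        if done:
            break
    lam = c @ A @ c                # Rayleigh quotient; c has unit length
    return c, lam

# Hypothetical network: a triangle (0, 1, 2) with node 3 attached to node 0.
A = np.array([
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
])
centrality, lam = eigenvector_centrality(A)
print(np.round(centrality, 3), round(lam, 3))
```

At convergence the returned vector satisfies `A @ centrality ≈ lam * centrality`, which is the matrix form of the defining relation between a node's score and the scores of its adjacent nodes.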

Structural Hole
Structural holes can be used to assess the ability of a node as a mediator in the overall network. This concept describes the characteristics of a node that occupies the main communication and information channels in the network, which is associated with the hole effect [34]. This is the degree to which the connections between clusters depend on the node in question. The gaps between clusters provide opportunities to build a network of bridges. If any technology becomes a bridge to connect two non-overlapping technology clusters, that bridging technology gains a spatial advantage in the overall technology network. Burt [34] argued that structural hole effects can be measured using a network constraint index, with a value between 0 and 1. A high network constraint index indicates low autonomy and a low structural hole effect in the entity.
where $C_{ij}$ is the score of the constraint on node $i$ by node $j$; $P_{ij}$ is the proportion of connections with node $j$ among the connections of node $i$; $P_{jk}$ is the relational ratio of node $j$ with the other connections of all nodes; and $P_{kj}$ is the ratio of all other nodes with node $j$ connections.
This equation sums over all nodes $j$, and this sum is the total constraint on node $i$ in the network [34]; that is, $C_i = \sum_j C_{ij}$.
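Burt's constraint is commonly written as $c_{ij} = (p_{ij} + \sum_{q \neq i,j} p_{iq} p_{qj})^2$, where $p_{ij}$ is the share of node $i$'s ties that go to node $j$. The sketch below assumes that standard form and sums $c_{ij}$ over the direct contacts of node $i$; the function name and the star-shaped example are hypothetical, and the software used in the study may apply a different convention.

```python
import numpy as np

def burt_constraint(adjacency):
    """Network constraint index C_i for every node (assumed standard Burt form)."""
    A = np.asarray(adjacency, dtype=float)
    A = A + A.T                              # combine both directions of each tie
    np.fill_diagonal(A, 0.0)
    strength = A.sum(axis=1, keepdims=True)  # total tie strength of each node
    P = np.divide(A, strength, out=np.zeros_like(A), where=strength > 0)
    indirect = P @ P                         # sum over q of p_iq * p_qj
    C = (P + indirect) ** 2                  # dyadic constraint c_ij
    C[A == 0] = 0.0                          # keep direct contacts only
    return C.sum(axis=1)

# Hypothetical star network: node 0 bridges three otherwise unconnected nodes.
star = np.array([
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
])
constraint = burt_constraint(star)
print(np.round(constraint, 3))  # [0.333 1.    1.    1.   ]
```

The bridging node has the lowest constraint (about 0.333) and therefore the strongest structural hole effect, whereas each peripheral node depends entirely on the bridge and reaches the maximum value of 1.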

Patent Retrieval Results
Before performing technology network analysis, the patent retrieval results were analyzed to gain a preliminary overview of technological development. Machine-learning technologies in optical communications involved 407 four-digit IPCs, which indicated a wide scope of involvement. Table 1 presents the quantity of the top 10 four-digit IPCs. In Table 1, the frequency denotes the number of patents classified under an IPC; for example, 170 patents were classified under G06F17. The percentage represents the proportion accounted for by an IPC classification out of the total number of IPC classifications; for example, the 824 patents contained 3291 IPC classifications (one patent might have more than two IPC classifications), and G06F17 appeared 170 times, accounting for 5.17%. The results indicated that the technologies were mostly concentrated under the classifications G06F17, G06F3, G06K9, H04L29, and A61B5 (Table 1); Appendix A displays the definition of each IPC code. According to the IPC definitions, G06F17 refers to data-processing methods, and G06F3 is the classification for data input and output devices (e.g., interface arrangements). G06K9 covers methods and devices for recognizing patterns, and H04L29 is communication control. A61B5 is the classification for measurement for diagnostic purposes. Among these, G06F17 appeared the most frequently and was more closely related to AI [35].
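As a check on how the percentage column is read, the G06F17 share follows directly from the two counts given in the text (170 occurrences out of 3291 IPC classifications):

```latex
\[
\frac{170}{3291} \approx 0.0517 = 5.17\%
\]
```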
The analysis of the top ten patentees revealed that Apple Inc., which focuses on AI and communication technology development, holds the greatest number of patents, followed by SAS Institute Inc., Intel Corporation, and International Business Machines Corporation, which are global leaders in intelligent software and services (Table 2). The next patentee, Volcano Corporation, develops biological diagnostic systems with ultrahigh resolution and holds numerous patents for technologies that can be applied to medical diagnostics. Furthermore, the three-digit IPC G06N describes computer systems based on specific computational models, and G06N is subdivided into four-digit IPCs that distinguish different computational models. Therefore, the patents collected in this study reveal the computational models in use (i.e., G06N). In this study, four-digit IPCs related to G06N appeared 131 times; their distribution, including computer systems based on biological models (G06N3), computer systems using knowledge-based models (G06N5), and subject matter not provided for in other groups of this subclass (G06N99), is shown in Table 3.

Technology Network Analysis
The results of previous studies have suggested that technology co-classification analysis can be used to analyze the relationship between fields of technology [30,31]. Because patents may be subject to multiple patent classification codes, co-classification information can be used to define the relationship between technical fields, as shown in Figure 1. The numbers in the matrix in Figure 1 represent the frequencies with which different IPCs appear in the same patent. A greater number represents a stronger technical connection between IPCs. For example, patent P1 is classified under IPC1, IPC2, IPC3, and IPC4. Because these four IPCs appear together in P1, a technical connection exists among them. Moreover, because IPC1 appears together with IPC2, IPC3, and IPC4 only in P1, the technical connections between IPC1 and each of IPC2, IPC3, and IPC4 are all "1", as shown in the first column of the matrix. Because IPC1 does not appear together with IPC5 or IPC6 in any patent, the technical connection of IPC1 to IPC5 and IPC6 is "0". This approach was applied step by step to develop the co-classification matrix of all technical fields, from which the technology network map was plotted. Table 4 presents all parameters used in the technology network analysis. Figure 2 presents the network model of key technologies, and the key IPCs are listed in Table 5. The centrality performance index of the top 10 IPC codes in the frequency analysis (Table 1) has been added to Appendix B.
The overall network, as shown in Table 4, was composed of 407 nodes and 206,194 links; a total of 407 IPCs were related to machine learning technologies in optical communications. The network density and compactness were 0.159 and 0.377, respectively, indicating that the network was sparsely distributed and that the interaction frequency between nodes was low. The average path length was 2.688, meaning that connecting a node to other nodes required nearly 3 steps, on average.
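The construction of the co-classification matrix can be sketched as follows. Only patent P1 and its four IPCs come from the example in the text; patents P2 and P3, the function names, and the binary definition of density (distinct connected pairs divided by all possible pairs) are assumptions made for illustration, and the study's reported density may have been computed under a different convention.

```python
from itertools import combinations
import numpy as np

def co_classification_matrix(patents, ipcs):
    """Count how often each pair of IPCs appears in the same patent."""
    index = {code: k for k, code in enumerate(ipcs)}
    M = np.zeros((len(ipcs), len(ipcs)), dtype=int)
    for codes in patents.values():
        for a, b in combinations(sorted(set(codes)), 2):
            M[index[a], index[b]] += 1
            M[index[b], index[a]] += 1  # keep the matrix symmetric
    return M

def density(M):
    """Share of possible IPC pairs that co-occur at least once (binary density)."""
    n = M.shape[0]
    ties = np.count_nonzero(np.triu(M, k=1))
    return ties / (n * (n - 1) / 2)

ipcs = ["IPC1", "IPC2", "IPC3", "IPC4", "IPC5", "IPC6"]
patents = {
    "P1": ["IPC1", "IPC2", "IPC3", "IPC4"],  # example from the text
    "P2": ["IPC2", "IPC5"],                  # hypothetical
    "P3": ["IPC5", "IPC6"],                  # hypothetical
}
M = co_classification_matrix(patents, ipcs)
print(M[0])                  # [0 1 1 1 0 0]
print(round(density(M), 3))  # 0.533
```

The first row reproduces the pattern described for IPC1: a value of 1 for IPC2, IPC3, and IPC4, and 0 for IPC5 and IPC6.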
The technology nodes depicted in Figure 1 are crucial technology nodes linking more than 60 different technology nodes; that is, they represent the more critical technology fields in patents for machine-learning technologies in optical communications. Regarding degree centrality, eigenvector centrality, and structural hole, the IPCs G06F3, G06F17, H04W4, and H04L12 had more than two indices in the top five technology fields (Table 5). In the overall IPC codes, the mean eigenvector centrality was 0.028, and a larger value indicated greater relative importance of a node. The structural hole effect was measured using the network constraint index. The mean of said index was 0.519, and a higher value indicated a lower structural hole effect. This suggested that key machine-learning technologies in optical communications are mainly concentrated in data input and output devices (e.g., interface arrangements; G06F3), data-processing methods (G06F17), wireless communication networks (H04W4), and the transmission of digital information (H04L12). The IPCs A61B5 and G06F19 only appeared in the structural hole column, which signified that these two technology fields belonged to different technology clusters and played a cross-disciplinary role. Thus, key technologies in the cross-disciplinary uses of machine learning in optical communications were for measurements for diagnostic purposes (A61B5) and information and communication technology (ICT) specially adapted for specific application fields (G06F19).
Additional insights available through network centrality metrics were as follows. Despite the slightly lower frequency of certain technology fields appearing in patents, in terms of the overall technology network, the connected technology nodes were more diverse and had an interdisciplinary nature in terms of applications. For example, although H04W4 and G06F19 are not listed in the top 10 technology fields with the highest frequency in Table 1, H04W4 was observed to connect to more different nodes in terms of degree centrality. Eigenvector centrality considered whether H04W4 was connected to a node with relatively high centrality, whereas structural holes revealed that G06F19 and H04W4 occupied the main channel of network communication; that is, the degree to which the connection between technology clusters depended on G06F19 and H04W4 was revealed.

Country-Technology Two-Mode Network Analysis
To broaden the findings, country-technology two-mode network analysis and factions analysis were used to understand the strategic clusters of patent technology deployment in each country. Factions analysis is an explorative tool used to identify subclusters in a social network [36], and it was employed here to conduct a complete survey of small-world structures. In all, four factions were present. The final proportion correct was 0.703, suggesting a favorable fit. The faction analysis results are presented in Table 6 and Figure 3.
Table 6. Faction analysis results.
Table 6 and Figure 3 can be used to identify the technology clusters of invention in the most prominent countries and the proximity of technical fields. For example, China and Germany belong to the same technology cluster, with their patents presenting high technical closeness in the A61B5 field. Australia and Belgium have high connectivity in the G06Q50 and H04L9 fields. Switzerland, the United Kingdom, and Israel have high connectivity in the G06F17, G06K9, and H04B10 fields. The United States and Japan have high technical closeness in the G06F3 and H04W4 fields.
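The input to a two-mode analysis of this kind is a country-by-technology matrix. The sketch below shows one way such a matrix could be built and how shared technology fields between countries can be counted. It is not the factions routine itself. The country and IPC pairings follow the examples in the text, but the individual records and their counts are invented for illustration, and all names are hypothetical.

```python
import numpy as np

def two_mode_matrix(records, countries, technologies):
    """Rows are countries, columns are IPCs, cells are patent counts."""
    row = {c: k for k, c in enumerate(countries)}
    col = {t: k for k, t in enumerate(technologies)}
    B = np.zeros((len(countries), len(technologies)), dtype=int)
    for country, ipc in records:
        B[row[country], col[ipc]] += 1
    return B

countries = ["US", "JP", "CN", "DE"]
technologies = ["G06F3", "H04W4", "A61B5"]
# Hypothetical (country, IPC) records; the counts are not from the study.
records = [
    ("US", "G06F3"), ("US", "G06F3"), ("US", "H04W4"),
    ("JP", "G06F3"), ("JP", "H04W4"),
    ("CN", "A61B5"), ("DE", "A61B5"),
]
B = two_mode_matrix(records, countries, technologies)
present = (B > 0).astype(int)
shared = present @ present.T  # number of IPC fields two countries have in common
print(B)
print(shared)
```

In this toy data the United States and Japan share two fields and China and Germany share one, while the two groups share none, which is the kind of block pattern that a partition into factions tries to recover.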

Key Machine-Learning Technologies in Optical Communications over the Years
The changes in G06F3, G06F17, H04W4, and H04L12 patents over the years were analyzed to understand the development trajectory of machine learning in optical communications (Figure 4). The results indicated that the application and development of G06F3 technologies (optical signals or data input and output devices) have gradually received more attention in recent years (Figure 2).


Results Discussion
In this study, network analysis was performed to explore key machine learning technologies in optical communications. The findings revealed that data input and output devices, data-processing methods, wireless communication networks, and digital information transmission were key technologies that were not clustered in specific fields. The findings also indicated that machine-learning technologies in optical communications were applied to measurement for diagnostic purposes. Therefore, medical diagnostic applications are a direction that merits future study. In addition, comparing the network analysis with the 10 most frequent four-digit IPCs revealed that wireless communication networks (H04W4) were among the top five in the network analysis; however, in the frequency analysis, their frequency of occurrence was not in the top 10. This indicated that although few patents for machine-learning technologies in optical communications were directly related to wireless communication networks, the connected technology fields were quite diverse. This highlights the importance of wireless communication networks and the advent of the era of wireless optical communications.
Furthermore, analysis of major patentees revealed that the major developers are leaders in ICT and intelligent software and services: Apple Inc., SAS Institute Inc., Intel Corporation, and International Business Machines Corporation. Apple Inc. focuses on AI applications in communications, whereas SAS Institute Inc. conducts data exploration to advance algorithmic applications and deploy AI to more industries. Intel Corporation and International Business Machines Corporation are major vendors in chips and information. Although most of the top 10 patentees are leading ICT developers and the number of academic patents is lower than that of patents held by general enterprises [37], academic and scientific research has a substantial influence on technology patents. The methods and specific techniques proposed in academic and scientific studies affect industrial development. For example, in the sample analyzed in this study, Carnegie Mellon University's patent (US10436615B2), granted in 2019, uses machine learning: the computer system trains a classifier to serve as a virtual sensor for an event that is correlated with the data from one or more sensor streams within the featured sensor data. The technology is related to the recording of measured values and can be applied in many industrial fields. Another example is a University of Central Oklahoma patent granted in 2018 (US9922291B2) that proposes a method and apparatus for providing personalized configuration of physical supports for the human body, which includes accepting input such as an individual's demographic information. These patents provide new methods and modes of thinking for computer systems based on specific computational models.
Factions analysis was adopted to determine the technical identification of inventions in prominent countries and to provide references for governments with regard to patent deployment. The results of the factions analysis revealed competition between countries in inventions, as well as the focal fields of each country. China and Germany belong to the same technical cluster, whereas the United States and Japan are in different clusters from those of the United Kingdom and Israel. Moreover, the proximity of technical fields can be observed through the results. For example, methods and devices for recognizing patterns (G06K9) and transmission systems employing electromagnetic waves other than radio-waves (H04B10) belong to the same faction; the co-occurrence of these technological fields in the same patent is highly likely. In technical applications of machine learning in optical communications, high co-occurrence and closeness are present.
The development trend of key machine-learning technologies in optical communications was concentrated in G06F3 (data input and output devices). Recent studies have argued that machine learning and neural networks can precisely reconstruct digital images, convert blurry and unrecognizable speckle patterns into recognizable digital images, and process distortions caused by environmental disturbances to optic fibers [38]. This technological development is anticipated to advance endoscopic imaging and medical diagnoses [38,39]. In addition, in terms of the time series of technologies, patents for machine-learning technologies in optical communications were at first mainly focused on the technology field of data-processing methods (G06F17), which emphasizes complex mathematical computation. Recently, patents for optical signals and data input and output devices (G06F3) began to increase substantially. This indicates that with the development of communication technology and big data, in addition to the improvement of early mathematical computation through the development and application of technology, input devices that transform signals into digital data formats that can be processed by computers have gradually attracted attention, leading to further technological developments and relevant applications.

Implications Discussion
For theoretical contributions, studies on machine learning in optical communications have mostly explored algorithmic techniques [4,7] or researched specific applications [2,8]. However, these studies have failed to identify focal technical fields, development trends, and network distribution channels among technical fields from a macro perspective, particularly regarding indispensable technologies for the future development of machine learning for optical communications. This observation of technological distribution is particularly critical. This study filled this research gap and adopted a new perspective that centers on technical fields.
In technology development, as was described earlier, the focus of key machine-learning technologies in the optical communications field has been mainly on G06F3. In other words, machine learning is applied to data input and output devices in optical communications. Whether optical communications can achieve the prediction of transmission quality through machine learning is crucial in the development of optical communications technology. Traditional optical communications performance relies on the calculation of network layer parameters, which are based solely on the available flow and flow load of the network. However, whether optical communications are blocked is determined by not only the network layer but also the physical layer. The optical communications network must effectively predict the quality of transmission before new channel deployment of optical communications. The quality of transmission involves physical layer parameters such as signal-to-noise ratio and symbol error rate. How machine learning can be used to effectively predict the quality of transmission is the key to future technology development. The use of analysis models to estimate the damage of physical layers for the provision of accurate results is a fundamental challenge in the implementation of optical communications.
To assist in policy suggestions, this paper provides industries and governments with valuable information in a technical map of machine learning in optical communications. The analysis and understanding of technological development foci in technology networks inform industries on allocating research and development resources and inform governments on promoting emerging technologies. Given that the development of IoT applications is driving the transmission volume of digital data, the optical communication field will require the integration of machine learning to construct adaptive smart optical communication system networks. This study found that major key technologies were concentrated in electric digital data processing, wireless communication networks, and the transmission of digital information. Therefore, governments should implement long-term financial support and training programs for talent in these technologies to increase the overall research and development capacity of the optical communication industry.

Limitations and Future Research Directions
First, this study used the patent keyword classifications organized by Derwent as the basis for patent screening. Although the Derwent database has several hundred experts who examine publicly available patent information to manually sort technical keywords, the development of machine-learning technologies in optical communications spans several fields and technical applications; therefore, some patents may not have been included in this analysis despite falling within its scope. Furthermore, this study analyzed large-scale holistic technology networks. Therefore, the empirical basis was limited to the number of approved patents, and the study did not make value judgements for individual patents. For example, this study lacked a discussion on whether the income of the patentees was correlated with the patents they owned, as well as the cost structure that patents imposed on the patentees. In the future, individual case studies of specific high-value patents can include expert interviews or other research methods. Finally, because of personnel and financial constraints, this study used only the USPTO, the patent office of the largest commercial trading market in the world, as its source of information on patents. Although this database is widely used to measure global innovations [13,14], future researchers with sufficient time and funding should include other patent data sources for observation and verification, such as approved standard essential patents in communications or their citation documents, to expand their research breadth.