An Approach for Chart Description Generation in Cyber–Physical–Social System

Abstract: Charts generated by the social interaction environment are increasingly used in manufacturing enterprise applications. To transform these massive amounts of unstructured chart data into decision-support knowledge for demand-capability matching in manufacturing enterprises, we propose a manufacturing enterprise chart description generation (MECDG) method, a two-phase automated solution: (1) extracting chart data based on optical character recognition (OCR) and deep learning methods; (2) generating chart descriptions from user input based on natural language generation (NLG) methods and matching the descriptions with the extracted chart data. We verified and compared the processing at each phase of the method and applied the method to the interactive platform of a manufacturing enterprise. The ultimate goal of this paper is to promote the knowledge extraction and scientific analysis of chart data in the context of manufacturing enterprises, so as to improve the analysis and decision-making capabilities of enterprises.


Introduction
The cyber-physical-social system (CPSS) [1][2][3] has recently attracted great attention as a new computing paradigm [4] that integrates physical systems, social systems, and information systems with the development of intelligent computing, control, and communication technologies. CPSS extends the cyber-physical system (CPS) [5,6] with social characteristics and interactions, aiming to understand the relationship and mutual influence between CPS and human society, so as to better guide the network, physical, and social elements in the system. CPSS connects physical systems, information systems, and social systems through sensor networks and intelligently manages various complex systems through intelligent human-computer interaction. Many manufacturing enterprises have adopted CPSS to integrate multi-source heterogeneous manufacturing resources and network services, to coordinate and control the manufacturing process accurately [7], to distribute manufacturing resources within and across enterprises, and to achieve more intelligent production control and decision making [8]. However, the increasing use of such systems in the CPSS environment has resulted in an explosion of data. Among these data, unstructured chart data occupy a large proportion but are difficult to process with traditional data analysis methods due to their irregular structure. This situation consequently leaves two challenges for the analysis and application of chart data.
The first challenge is the recognition and extraction of chart data in manufacturing enterprises. Manufacturing enterprises generate large amounts of statistical data on product research, production quality, sales, after-sales service, etc., during the production process. Most of these data are saved directly as charts because the vivid graphic structure can intuitively and effectively reflect the relationships and characteristics of the data [9]. The information in charts can be used by decision makers to understand the full life cycle of a product and identify potential problems in a timely manner so as to make better production schedules. However, a chart is generally saved in a JPG or SVG format [10,11], which makes it difficult to obtain the underlying data directly.

Related Work
This section surveys the advancements in two research directions related to social context extraction and text description generation and identifies the deficiencies that need to be overcome.

Manufacturing Chart Data Extraction
Manufacturing chart data extraction in CPSS aims to extract useful information and knowledge from unstructured manufacturing charts. Owing to the diversity of charts, many scholars have focused on specific chart types, such as bar charts [15], line charts [16], and scatter charts [17]. In addition, most existing research integrates OCR, image recognition, and deep learning technologies to extract knowledge from charts. For example, Savva and Kong [18] designed the ReVision system to extract data from pie charts and bar charts by locating marked points in the charts, using an SVM to classify chart types. Choudhury and Wang [19] used a clustering method to extract data from line charts. Siegel and Horvitz [20] used a CNN to classify chart types and extracted chart data with legend information. Choi and Jung [21] extracted chart data from graphs based on a deep neural network (DNN). Jung and Kim [22] designed a semi-automatic, interactive extraction algorithm for different chart types. Poco and Heer [23] proposed an automatic text extraction pipeline to recover the visual encoding from a chart. Luo and Li [24] extracted chart data based on OCR and a key point detection network.
The studies above report two data extraction strategies. The first is a two-phase process: chart types are classified using machine learning or deep learning methods, and chart data are then extracted with a method specific to the chart type. Approaches based on this strategy achieve high accuracy but do not generalize well across chart types; for example, an extraction framework for bar charts cannot be applied to line charts. The second strategy uses key point detection to extract the underlying data for all types of charts and then transforms the underlying data into structured chart information. The second strategy obtains chart information more directly than the first, regardless of chart type. This study focuses on chart data extraction from manufacturing charts, adopts the second strategy, and adds a chart text extraction process to obtain legend and coordinate information.

Chart Description Generation
Significant progress on image description approaches based on NLG has been made in the field of visualization. Compared with general image description, chart description pays more attention to precision and relevance, which requires a high-level understanding of the statistical characteristics of the chart. Current NLG methods are mainly template-based or deep learning-based. Most existing works use predefined templates to generate text descriptions for a certain type of chart and then extend the approach to other chart types. Several approaches have been reported in the literature. Al-Zaidy and Giles [25] used image processing and text recognition technology to extract bar chart data and generated chart descriptions with the "protoform" method [26]. Oliveira and Silva [15] defined textual description templates with relevant characteristics of chart elements to verbalize the extracted data of a bar chart. Bryan and Ma [27] presented temporal summary images as an approach for both exploring chart data and generating descriptions from it. Hullman and Diakopoulos [28] designed the Calliope system to generate visual data descriptions, incorporating a new logic-oriented Monte Carlo tree search algorithm. Mahmood and Bajwa [29] used a template-based NLG approach to describe data extracted from pie charts in the form of natural language summaries. Kallimani and Srinivasa [30] designed a system to identify and interpret bar graphs and generate bar chart descriptions from the extracted semantic information. Liu and Xie [31] proposed an approach to automatically generate chart descriptions and designed a summary template that can be extended to different chart types. These methods are simple to construct and easy to implement, but the generated descriptions are relatively rigid and cannot fully meet the interactive and diverse needs of manufacturing companies in the CPSS environment.

Data Information Extraction from Chart
Our first contribution is a pipeline model for chart information extraction based on OCR and deep learning, which can be divided into two major parts: chart text extraction and key point detection. Chart text extraction covers legend and coordinate information extraction and classification. Key point detection adopts CornerNet [32] to extract the most valuable points in the chart. Finally, we obtain the numerical chart information by integrating the text information with the key point data.

Chart Text Extraction
Chart text extraction can help obtain the common information in the chart, including chart title, legend, and coordinate range. The framework of this section is based on OCR and CNN technology, as shown in Figure 2.
Beginning with preprocessing, we binarize the chart image with a threshold approach to facilitate subsequent processing. A CNN-based text pixel classifier then removes non-text pixels from the chart image to obtain an image with only text pixels remaining, which facilitates the text recognition work. In the text recognition phase, we perform OCR with an open source engine [33] to obtain a set of text contents. To classify the text content appropriately, we consider four text roles (chart title, legend, x-axis, and y-axis) and train an SVM with a radial basis function (RBF) kernel to classify text elements. Many machine learning methods are available for this text role classification task, such as k-nearest neighbors (KNN) [34] and SVM. An SVM easily handles linearly separable and low-dimensional problems and can also solve high-dimensional problems through an appropriate kernel function, but it is inefficient on large datasets and sensitive to noisy feature vectors. In contrast, a KNN classifier requires no training process and is easy to understand, but it does not work well on high-dimensional problems. Text role classification is a high-dimensional classification problem: when KNN calculates the similarity between the features of two texts, the cost is quite high and the classification speed is unsatisfactory. Meanwhile, the chart text datasets of manufacturing enterprises are relatively small, so deep learning-based models are prone to overfitting. We therefore adopt an SVM as the text role classification method.
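As an illustration of the preprocessing step, the following sketch binarizes a grayscale chart image with a simple fixed threshold. The threshold value and the toy pixel array are illustrative assumptions, not parameters given in the paper.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 pixel values) to a binary mask:
    dark ink/text pixels -> 1, light background -> 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

# Toy 3x3 "chart crop" with a dark vertical stroke in the middle column.
img = [[255, 30, 255],
       [255, 30, 255],
       [200, 10, 220]]
print(binarize(img))  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

In practice the threshold would be chosen per image (for example, with Otsu's method), and the binarized image would then feed the CNN text pixel classifier.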

Key Point Detection
Key point extraction allows valuable chart points to be extracted regardless of chart type. We adopt CornerNet with an HourglassNet [35] backbone for key point proposal. After a series of down-sampling and up-sampling operations, we obtain a probability map that highlights the pixels at key point locations, which is passed through the prediction module to obtain the thermal feature map, embedding feature map, and offset feature map of the chart. The framework of key point detection is shown in Figure 3.
The thermal feature map is used to predict the positions of the upper-left and lower-right corner points. The number of channels equals the number of categories in the training set, indicating the category probability of each corner point. The loss function of the thermal feature map is the focal-style detection loss of CornerNet:

L_det = -(1/N) Σ_{c=1}^{C} Σ_{i=1}^{H} Σ_{j=1}^{W} { (1 - p_cij)^α log(p_cij), if y_cij = 1; (1 - y_cij)^β (p_cij)^α log(1 - p_cij), otherwise }

where N is the number of key points in the chart, α and β are hyperparameters that determine the contribution of each point and are set to 2 and 3, respectively, p_cij is the score of category c at position (i, j) (the higher the score, the higher the probability of a corner point), and y_cij is the ground-truth thermal map computed by a Gaussian formula. The term (1 - y_cij) can be understood as the distance between the predicted corner point and the real corner point after Gaussian non-linearization.

The embedding feature map is used to match the upper-left and lower-right key points of the same target. The main idea is to minimize the distance between embeddings belonging to the same target and to increase the distance between embeddings of different targets. The loss functions are

L_pull = (1/N) Σ_{k=1}^{N} [(e_tk - e_k)^2 + (e_bk - e_k)^2]
L_push = (1/(N(N-1))) Σ_{k=1}^{N} Σ_{j=1, j≠k}^{N} max(0, Δ - |e_k - e_j|)

where L_pull is the loss that minimizes the distance between corner points of the same group and L_push is the loss that distinguishes corner points of different groups. e_tk is the embedding feature of the upper-left corner point of category k, e_bk is the embedding feature of the lower-right corner point, and e_k is their average value.
The offset feature map is used to correct the position of the key points. The offset is added to the predicted corner point position to reduce the error caused by a series of up-sampling and down-sampling operations through the hourglass network.
The key point positions can be determined through the three feature maps. Finally, we combine the key point feature data with the text information to get the chart data.
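To make the decoding step concrete, the following minimal sketch shows how corner locations could be read off a 2-D probability map and refined with the offset map. All scores, offsets, and thresholds here are toy assumptions, not the paper's trained outputs.

```python
def top_k_points(heatmap, offsets, k=2, min_score=0.5):
    """Pick the k highest-scoring cells of a 2-D probability map and refine
    each (row, col) location with the predicted (dy, dx) offset, mimicking
    how the thermal and offset feature maps are combined."""
    scored = []
    for i, row in enumerate(heatmap):
        for j, score in enumerate(row):
            if score >= min_score:
                scored.append((score, i, j))
    scored.sort(reverse=True)  # strongest responses first
    points = []
    for score, i, j in scored[:k]:
        dy, dx = offsets[i][j]          # sub-pixel correction
        points.append((i + dy, j + dx, score))
    return points

heat = [[0.1, 0.9, 0.2],
        [0.1, 0.2, 0.8],
        [0.0, 0.1, 0.3]]
offs = [[(0.0, 0.0)] * 3 for _ in range(3)]
offs[0][1] = (0.25, -0.25)  # correction for the strongest point
print(top_k_points(heat, offs))  # [(0.25, 0.75, 0.9), (1.0, 2.0, 0.8)]
```

A real decoder would also apply non-maximum suppression and match upper-left with lower-right corners via the embedding distances.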

Problem Description and Assumption
Chart descriptions generated with natural language generation technology can greatly improve the comprehensibility and interactivity of unstructured data, promoting the integration and application of cross-enterprise manufacturing data in the CPSS environment.
The framework of traditional NLG [36,37] methods is based on "white box" design patterns, such as template-based and grammatical rule-based methods, in which the functions of each module are relatively clear, and have strong interpretability and controllability. Deep learning-based NLG methods adopt end-to-end "black box" design patterns, which are language probability models that predict the character or word with the highest probability of appearing based on the existing text sequence.
NLG methods based on templates and grammatical rules can generate text descriptions in predefined formats. However, such descriptions are not enough for manufacturing companies in the CPSS environment. In most cases, they have higher requirements for the flexibility and interactivity of data analysis and hope to obtain descriptions that meet users' demands instead of results containing a lot of useless information. For this purpose, we adopt a deep learning-based NLG method to generate chart descriptions.
Compared with general deep learning-based NLG tasks, manufacturing chart description differs in its large amount of numerical data. Numerical data do not help the semantic representation; in fact, they may even mislead the semantic generation process of the model. Therefore, we design a deep learning-based NLG model that ignores numerical data. We mask all numerical data and other data related to chart properties during model training to help the model focus on generating semantic representations. At the same time, an output branch is added to distinguish the user's intention. By adding the corresponding numerical data to the generated semantic representation according to the user's intention, we obtain more reliable chart descriptions.
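As a sketch of the masking idea, the snippet below replaces every numeric literal in a training sentence with a placeholder token. The single generic token `{y value}` is an illustrative assumption; the actual method would distinguish mask types for different chart properties.

```python
import re

# Matches integers and decimals such as "1" or "3.4".
NUM = re.compile(r"\d+(?:\.\d+)?")

def mask_numbers(text, token="{y value}"):
    """Replace every numeric literal with a placeholder token so the
    model trains on semantics rather than specific values."""
    return NUM.sub(token, text)

s = "workshop 1 has least orders in Jan, with a value of 3.4"
print(mask_numbers(s))
# workshop {y value} has least orders in Jan, with a value of {y value}
```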

The Model of Natural Language Generation for Chart Description
We design an NLG model based on the long short-term memory (LSTM) network [38] to generate text descriptions for the chart data of manufacturing enterprises.
As shown in Figure 4, the initial sequence of the model is the manufacturing enterprise user's input, which is first transformed into word vectors through the embedding layer, because a neural network cannot directly process a text sequence. We chose the pre-trained Bert [39] as the embedding layer, a popular language representation model that has achieved excellent results in many natural language processing tasks thanks to a huge corpus, complex structure, and powerful computing capabilities. The word vectors obtained through the embedding layer are passed to the LSTM network. Compared with the traditional recurrent neural network (RNN) [40], the LSTM introduces a memory module and cell state to control and store information. As shown in Figure 4, the memory module has three gates: the forget gate, input gate, and output gate. The forget gate determines how much of the previous cell state c_{t-1} is kept in the current cell state c_t and is defined as

f_t = σ(w_f · [h_{t-1}, x_t] + b_f)

where w_f is the weight matrix of the forget gate, [h_{t-1}, x_t] is the concatenation of the two vectors, σ is the sigmoid function, and b_f is the bias matrix. The input gate determines how much of the current input x_t is stored in c_t and is defined as

i_t = σ(w_i · [h_{t-1}, x_t] + b_i)

where w_i and b_i are the weight matrix and bias matrix of the input gate.
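The forget and input gates described above can be sketched numerically as follows. For brevity, the weights and biases are toy scalars (not trained values), and the concatenation [h_{t-1}, x_t] is reduced to a scalar sum.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gates(x_t, h_prev, w_f=1.0, b_f=0.0, w_i=1.0, b_i=0.0):
    """Scalar sketch of the LSTM forget and input gates."""
    z = h_prev + x_t                  # stand-in for [h_{t-1}, x_t]
    f_t = sigmoid(w_f * z + b_f)      # forget gate: keep fraction of c_{t-1}
    i_t = sigmoid(w_i * z + b_i)      # input gate: admit fraction of input
    return f_t, i_t

f_t, i_t = gates(x_t=1.0, h_prev=0.0)
print(round(f_t, 4), round(i_t, 4))  # 0.7311 0.7311
```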
The cell state is updated with the results of the forget gate and input gate as

c̃_t = tanh(w_c · [h_{t-1}, x_t] + b_c), c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t

where ⊙ denotes element-wise multiplication. The output o_t of the LSTM is determined by the output gate according to the current cell state c_t as

o_t = σ(w_o · [h_{t-1}, x_t] + b_o), h_t = o_t ⊙ tanh(c_t)

In the output layer, we set up two fully connected structures, y_1 and y_2, for semantic representation generation and user intention recognition, respectively. A softmax classification layer is added after the y_1 layer to select a word according to the probability distribution over the current sequence, which is then appended to the sequence. This process loops until the expected description, with numerical data masked, is generated. From the perspective of probabilistic modeling, it can be expressed as

P_θ(Y|X) = ∏_{i=1}^{n} P_θ(y_i | Y_{<i}, X)

where X is the input sequence, Y = (y_1, y_2, ..., y_n) represents the output of the model, and y_i (i = 1, 2, ..., n) represents a single character or word. The purpose of model training is to obtain the conditional probability P_θ(Y|X). The model generates one character or word y_i at a time by sampling from the probability distribution P_θ(y_i | Y_{<i}, X). To improve the interactivity of the description, we choose a smoother sampling strategy:

P(y_i) = exp(z_i / t) / Σ_j exp(z_j / t)

where z_i is the model's output score for candidate y_i and t is a parameter that controls the randomness of sampling. The larger the value of t, the stronger the diversity of sampling and the more varied the generated descriptions. For example, if the user input is "max," a chart description obtained through the sampling process could be: "In the {chart title} chart, {y axis} get maximum value in {x value}, with a value of {y value}." Obviously, four masks need to be replaced in the next steps. We also add a softmax layer after the y_2 layer to distinguish the user's intent, but we retain only the result of the initial sequence as the intent information.
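The temperature-controlled sampling strategy can be sketched as a softmax over the output scores. The logits below are toy values chosen for illustration.

```python
import math

def temperature_probs(logits, t=1.0):
    """Softmax with temperature t: larger t flattens the distribution,
    increasing the diversity of the sampled words."""
    scaled = [z / t for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
sharp = temperature_probs(logits, t=0.5)   # low t: near-greedy
smooth = temperature_probs(logits, t=2.0)  # high t: more diverse
print(round(sharp[0], 3), round(smooth[0], 3))  # 0.864 0.502
```

Sampling then draws the next word from these probabilities instead of always taking the argmax.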
For the previous example, the model distinguishes that what the user wants is the maximum value in the chart, so we select the numerical value and other chart properties from the extracted chart data to prepare the description. The last step of the model is to replace the masks in the generated chart description with chart data according to the user's intent. The final chart description is: "In the Workshop order count in the first half of 2020 chart, workshop 1 has least orders in Jan, with a value of 3.4."
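The final mask-filling step can be sketched as simple template substitution. The chart values below are hypothetical placeholders, not data from the paper's case.

```python
def fill_description(template, chart_data):
    """Replace each {mask} in the generated description with the
    corresponding value selected from the extracted chart data."""
    out = template
    for mask, value in chart_data.items():
        out = out.replace("{" + mask + "}", str(value))
    return out

template = ("In the {chart title} chart, {y axis} get maximum value "
            "in {x value}, with a value of {y value}.")
chart_data = {
    "chart title": "Workshop order count in the first half of 2020",
    "y axis": "workshop 1",
    "x value": "Jun",   # hypothetical values for illustration
    "y value": 9.8,
}
print(fill_description(template, chart_data))
```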

Application Cases and Experiments
In this section, two sets of experiments are carried out to test the efficiency and practicality of the proposed MECDG model. The first set of experiments tests the effectiveness of chart data extraction through comparative experiments on the Vega [41] dataset and the self-collected manufacturing enterprise chart dataset (MECD), which is collected from the workshop and production system of the manufacturing enterprise. The second set of experiments tests the performance of chart description generation through questionnaires and case comparisons. Note that the dataset of the second set of experiments is built according to the analysis needs of the manufacturing enterprise users.

Dataset and Settings
We chose the charts automatically generated by the Vega system together with MECD as the dataset for chart data extraction. The dataset contains bar charts, line charts, and scatter charts. The details of the dataset are shown in Table 1. During the experiment, we divided the dataset into training and test sets at a ratio of 3:1. For chart description generation, we constructed a manufacturing enterprise intention and semantic representation dataset (MEISRD), which collects 4971 samples of chart descriptions and 3918 samples of user intent under various manufacturing enterprise demands.
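The 3:1 partition can be sketched as a shuffled split; the seed is an arbitrary choice for reproducibility, not a value from the paper.

```python
import random

def split_3_to_1(samples, seed=42):
    """Shuffle a dataset and split it 3:1 into training and test sets."""
    rng = random.Random(seed)
    items = list(samples)
    rng.shuffle(items)
    cut = (3 * len(items)) // 4  # 75% for training
    return items[:cut], items[cut:]

train, test = split_3_to_1(range(100))
print(len(train), len(test))  # 75 25
```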
The hardware environment is configured with an Intel Xeon E5-2630V4 CPU (10 cores, clocked at 2.2 GHz) and 128 GB of memory. The MECDG model is deployed on a symmetric multi-processing node equipped with 2 NVIDIA 2080Ti graphics cards, and the data are stored on a local disk with a capacity of 1 TB. The experiment is based on the Keras deep learning framework and adopts the Adam optimizer with learning rate 2.5 × 10−4, decreased to 2.5 × 10−5 for the last 100 batches, to train the natural language generation model. The batch size is set to 32, whereas α and β are set to 2 and 4, respectively. For all types of chart images, we use the same HourglassNet-based network with 104 layers.

Chart Data Extraction Evaluation
To evaluate the performance of chart data extraction in MECDG, we introduce two benchmark models in Table 2. The first competitor employs the ReVision [18] system to extract chart data; the second adopts the ChartSense [22] method. Both are appropriate baselines for evaluating the performance of MECDG on chart data. We conducted comparative experiments with three standard evaluation metrics, precision, recall, and F1 score:

precision = true positives / (true positives + false positives) (10)

recall = true positives / (true positives + false negatives) (11)

F1 score = 2 · precision · recall / (precision + recall) (12)

where true positives are positive samples predicted to be positive by the model, false positives are negative samples predicted to be positive, and false negatives are positive samples predicted to be negative. The detailed comparison results are shown in Table 2. We give the values of the three metrics for each method on three chart types: bar chart, scatter chart, and line chart. To reflect the overall performance more intuitively, the average value is also displayed at the bottom of Table 2. Limited by the size of the table, we use "Prec" for precision, "Rec" for recall, and "F1" for F1 score.
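Equations (10)-(12) can be computed directly from the raw counts; the counts below are toy numbers for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 score (Equations (10)-(12))
    from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(p, r, round(f1, 3))  # 0.9 0.75 0.818
```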
It can be seen from the table that the average precision of our method across chart types is 90.1%, significantly higher than ReVision and ChartSense. In particular, the data extraction precision reaches 91.2% on bar charts. The intuitive reason is that the key points in a bar chart are more prominent, which makes it easier for CornerNet to extract the chart data.
Since chart text extraction and key point extraction are performed independently, evaluation values are given for each step. For the first step, we use the titles, coordinates, and legend data in the dataset as the ground truth. For the second step, we manually mark the coordinates of the key points as the comparison standard. The results are shown in Table 3. Both parts perform well; the text extraction result is slightly worse than key point extraction, but it is sufficient for the extraction of chart data.

Chart Description Generation Evaluation
Language descriptions are flexible and changeable, and so is chart description generation. Unlike general deep learning-based tasks, the effectiveness of the chart description generated by an NLG model is difficult to measure: in many cases, differently worded descriptions may all be appropriate. We adopt the bilingual evaluation understudy (BLEU) [42] as the evaluation method. BLEU is used in machine translation to evaluate the quality of translated text by comparing the similarity between the machine-translated text and the human-translated text; generally, the higher the similarity, the higher the quality. Based on this concept, we can also use BLEU to evaluate chart descriptions by comparing the generated description with a high-quality expected description: first calculate the score of each generated description, then average the scores of all descriptions to obtain the overall quality score. The relevant equations are defined as

P_n = Σ_i Σ_k min(h_k(c_i), max_j h_k(s_i,j)) / Σ_i Σ_k h_k(c_i)

BP = 1 if l_c > l_s; exp(1 − l_s / l_c) if l_c ≤ l_s

BLEU = BP · exp(Σ_{n=1}^{N} w_n log P_n)

where P_n represents the n-gram precision of generated description c_i compared with standard description s_i,j, BP represents the brevity penalty, and BLEU represents the final score. h_k(c_i) represents the number of occurrences of the k-th phrase in the generated description, and h_k(s_i,j) represents the number in the standard description. Generally, P_n is sufficient for effective evaluation, but the precision of n-grams may improve as sentences become shorter, so BP is introduced to punish descriptions that are too short; l_c represents the length of the generated description and l_s the length of the standard description. To balance the effect of the n-order statistics, we compute the geometrically weighted average of the P_n and multiply it by the length penalty factor to obtain the BLEU score. The value of BLEU always lies between 0 and 1, and the closer to 1, the higher the quality of the generated description.
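A minimal sentence-level BLEU with clipped n-gram precision and brevity penalty, following the description above, can be sketched as below. This is a simplified sketch with uniform weights and a small floor for zero-count n-grams, not the exact implementation the paper evaluates with.

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of clipped n-gram precisions
    (uniform weights) times the brevity penalty."""
    c, r = candidate.split(), reference.split()
    log_p = 0.0
    for n in range(1, max_n + 1):
        c_ngrams = Counter(tuple(c[i:i + n]) for i in range(len(c) - n + 1))
        r_ngrams = Counter(tuple(r[i:i + n]) for i in range(len(r) - n + 1))
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(cnt, r_ngrams[g]) for g, cnt in c_ngrams.items())
        total = max(sum(c_ngrams.values()), 1)
        log_p += math.log(max(clipped, 1e-9) / total) / max_n
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(c) > len(r) else math.exp(1 - len(r) / max(len(c), 1))
    return bp * math.exp(log_p)

ref = "workshop 1 has the least orders in Jan"
print(round(bleu(ref, ref), 3))  # identical sentences score 1.0
```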
In this comparative experiment, we introduce two benchmark models: the original RNN and LSTM models. The BLEU results are shown in Table 4. MECDG greatly improves the quality of the description by splitting chart description generation into two parts, intention recognition and description generation. In this way, the adverse effects of chart attribute values and numerical data on the semantic understanding of the model are avoided.

Application Case in Manufacturing Enterprise
We apply the MECDG method proposed in this paper to an enterprise's quality data integration and visual analysis platform [43], which integrates, manages, and analyzes the data generated by the company in a unified manner. The platform has four main parts: data management, algorithm management, data analysis, and data visualization. We add MECDG to the algorithm management module according to the access rules and specifications of the platform and apply it as one of the analysis algorithms for manufacturing chart data. On the platform, we first upload the chart data in the data management module, then use MECDG in the data analysis module to obtain the chart description, and finally display the description in the data visualization module. Figure 5 shows the implementation of the MECDG method for chart analysis on the enterprise interactive platform. The figure on the left lists the interaction statistics of the platform, including interaction time, content, and user information, while the other two figures show the operation of chart information analysis and extraction.
Figure 5. Application case results in the manufacturing enterprise platform.

First, manufacturing users upload a chart image to be analyzed; the format and size requirements for the uploaded image are listed below it. Second, clicking the "Start Extraction" button extracts the data information from the chart image. In the result interface, users can click "Extraction Result" to obtain the extracted chart data, displayed as a table on the web page with data serial number, data type, and content, or click "Analysis Result" to enter the chart analysis page. On the chart analysis page, users enter their analysis requirements and click the "Search" button to obtain the desired chart analysis description on the web page.


Evaluation and Discussion
In this section, we evaluate the MECDG method as applied in manufacturing enterprises in the CPSS environment from two aspects: practicality and effectiveness.

Evaluating the Practicality of MECDG Model
The goal of the first phase is to collect as much feature information as possible from the charts of manufacturing enterprises for subsequent text description generation. We first create the manufacturing enterprise chart dataset MECD and then run the extraction algorithm on these charts to capture text and data information. We use two methods to extract data and text information, respectively, to ensure the completeness and accuracy of information extraction. Because the number of charts collected from the manufacturing companies is limited, the dataset includes two parts: charts generated by the Vega system and charts from the manufacturing companies. The Vega system has also been used in much chart data extraction research, which helps improve the generalization ability of the model. Our dataset contains three chart types, which roughly represent the chart types used by manufacturing companies.
Based on the extracted chart information, the second phase is illustrated by a practical example of a chart description from a manufacturing enterprise's workshop: "In the 'Workshop order count in the first half of 2020' chart, workshops 1-3 show an increasing trend." Figure 5 presents the graphical user interface for chart text description in the manufacturing enterprise data analysis platform. In the search box below the uploaded chart, users enter the key text of the chart analysis they want and press "Enter." The platform then calls the model to obtain the results to be displayed through two parts: chart data extraction and description generation. The final chart description is displayed in the result area below the figure. This method greatly improves the ability of manufacturing enterprises to analyze and understand chart data.
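The workshop example can be sketched as a simple post-processing step over an extracted data series: classify the trend from consecutive differences and slot the result into the description. This is a deliberately simplified illustration; the function names and the majority-vote rule are assumptions, not the paper's generation model.

```python
def trend_label(series):
    """Classify an extracted data series as rising, falling, or flat
    by majority vote over the signs of consecutive differences."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    if not diffs:
        return "flat"
    ups = sum(d > 0 for d in diffs)
    downs = sum(d < 0 for d in diffs)
    if ups > downs:
        return "rising"
    if downs > ups:
        return "falling"
    return "flat"

def describe(title, name, series):
    """Render a chart description from the extracted title, series name, and values."""
    return f"In the '{title}' chart, {name} shows a {trend_label(series)} trend."
```

For the extracted order counts of a single workshop, `describe("Workshop order count in the first half of 2020", "workshop 1", [10, 12, 15, 14, 18, 21])` produces a description of the same shape as the example above.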

Evaluating the Effectiveness of MECDG Model
A difficulty in generating chart text descriptions is how to evaluate the quality of the generated results, as there is no uniform standard. A chart description generation model is therefore usually evaluated against a "ground truth," which relies on human evaluation. We assembled a questionnaire based on the descriptions generated by the model and conducted computational experiments to assess the statistical accuracy of the model.
To evaluate the quality of the descriptions generated by the model, we randomly sent online questionnaires to data analysts, information graphics designers, UI designers, data visualization technicians, and corporate management users of a manufacturing company, and obtained 60 valid responses. Respondents mainly evaluated the descriptions generated by the model. The questionnaire contained the processing results of 10 different types of chart data, and scores were based on how well respondents recognized the description text. After the questionnaires were collected, the results were statistically analyzed; the specific results are shown in Figure 6.

The results in the figure show that, across the four evaluation criteria for the description text, most respondents gave quite satisfactory evaluations, which means that our model can meet users' needs and generate correct and understandable descriptions. We also received many suggestions hoping for more diversified text descriptions; these can be addressed by expanding the corpus data in the future.
We also compared the results with the traditional template-based natural language generation method. Figure 7 shows the semantic expressions produced by the template method in the manufacturing enterprise platform, with preset maximum and minimum expressions. As Figure 7 shows, compared with the chart description generated by the MECDG method in Figure 5, the traditional template-based method requires rigid expressions to be set in advance, and as requirements grow, the semantic descriptions become more cumbersome and harder to read. In contrast, displaying analysis results that correspond to the input requirements makes the chart analysis results more targeted and readable.
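The rigidity of the template baseline can be made concrete with a small sketch: every chart receives the same preset max/min sentences regardless of what the user actually asked, so each new requirement means another hard-coded template. The template strings and function name here are hypothetical illustrations of the baseline, not the platform's actual templates.

```python
# Preset, fixed templates: adding a new kind of analysis means
# hand-writing another sentence pattern for every chart type.
TEMPLATES = [
    "The maximum value of {series} is {max_v}.",
    "The minimum value of {series} is {min_v}.",
]

def template_description(series_name, values):
    """Fill every preset template with the chart's statistics,
    ignoring the user's specific analysis requirement."""
    slots = {"series": series_name, "max_v": max(values), "min_v": min(values)}
    return " ".join(t.format(**slots) for t in TEMPLATES)
```

As the list of templates grows, the output concatenates every preset sentence, which is the cumbersome, hard-to-read behavior the comparison in Figure 7 illustrates; a request-driven method instead emits only the description matching the user's input.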

Discussion
The MECDG approach is more user-friendly and interactive than the traditional pipeline model of chart analysis because, in enterprise applications, chart images and analysis requirements are given by the user through the interactive system, achieving "what you want is what you get." Especially for manufacturing enterprises in the CPSS scenario, what is valuable is not only the data analysis results but also the interaction information between the enterprise and the platform, such as interaction requirements. By analyzing this interaction information from the application process, enterprises and society can explore users' deeper needs and thus make more intelligent decisions about the production and future planning of manufacturing enterprises.
As mentioned above, each functional module in the MECDG method is independent. The advantage is that the results produced at each phase of the model are intuitive and interpretable, but the model becomes somewhat redundant. Moreover, in the analysis and generation phase, the model is trained on a self-collected manufacturing enterprise dataset, which is relatively small. In future research, we will consider organizing the chart data extraction and analysis processes into an integrated model using more appropriate strategies, and training on larger and more credible datasets to improve the quality of the chart analysis results.

The MECDG method is essentially an image description model: it extracts data from different types of chart images and uses the extraction results to generate chart descriptions. The method can easily be extended to other fields, such as using the extracted chart data to redesign and improve chart visualizations, improving their comprehensibility, reducing design bias, or improving their aesthetics, and building accessibility systems with the generated descriptions that express chart information through sound or touch via wearable devices to help visually impaired users understand charts.

Conclusions
This paper proposes a two-phase unified model to analyze the unstructured chart data of manufacturing enterprises in the CPSS environment. The first phase presents an approach to recognize and extract data and text information from different types of charts; the model uses convolutional neural networks and OCR technology to obtain chart data characteristics as the basis for generating descriptions. The second phase presents an approach to generate descriptions of the chart feature information according to user needs. The contributions of this study include the following: (1) Instead of analyzing a single specific chart type, the proposed MECDG method can analyze three different types of charts. (2) The MECDG method allows manufacturing users to obtain visual analysis from charts in a more interactive way; generating the chart description according to user needs makes the method more flexible in application. (3) The experiments show that the proposed MECDG method is more efficient at generating chart descriptions for manufacturing than other recent methods. However, our model as applied to the manufacturing enterprise data analysis platform is at an early phase and has some limitations:
(1) To analyze more types of charts and more complex charts, the model in the data extraction phase should be optimized.
(2) To generate more accurate and flexible language descriptions, we need to continuously enrich the corpus to improve the language quality and descriptive ability of the generated text descriptions.