Developing an Risk Signal Detection System Based on Opinion Mining for Financial Decision Support

Yoon, Byungun; Roh, Taeyeoun; Jang, Hyejin; Yun, Dooseob

doi:10.3390/su11164258

by Byungun Yoon *, Taeyeoun Roh, Hyejin Jang and Dooseob Yun

Department of Industrial and Systems Engineering, Dongguk University, Seoul 100715, Korea

* Author to whom correspondence should be addressed.

Sustainability 2019, 11(16), 4258; https://doi.org/10.3390/su11164258

Submission received: 20 June 2019 / Revised: 22 July 2019 / Accepted: 31 July 2019 / Published: 7 August 2019


Abstract:

Companies have long sought to detect financial risks and prevent crises in their business activities. Investors also have a great need to identify risks and utilize them for investment. Thus, several studies have attempted to detect financial risk. However, these studies had limitations in that various data were not exploited and diverse perspectives of the firm were not reflected. This can lead to wrong choices for investment. Thus, the purpose of this study was to propose risk signal prediction models based on firm data and opinion mining, reflecting both the perspectives of firms and investors. Furthermore, we developed a process to obtain real-time firm-related data and visualize them conveniently. To develop this process, a credit event was defined as an event that led to a critical risk of the firm. In the next step, the firm risk score was calculated for a firm having a possible credit event. This score was calculated by combining the firm activity score and the opinion mining score. The firm activity score was calculated based on financial statement and disclosure data indicators, while the opinion mining score was calculated based on a sentiment analysis of news and social data. As a result, the total firm risk grade was derived, and the risk level was proposed. These processes were developed into a system and illustrated with real firm data. The results of this study demonstrate that it is possible to derive risk signals through integrated monitoring indicators and provide useful information to users. This study can help users make decisions. It also provides users an opportunity to identify new investment momentum.

Keywords:

risk signal detection; opinion mining; financial decision support

1. Introduction

Numerous recessions in the past have wreaked havoc on the roots of state-based industries with many negative effects, including economic slowdown. Global threats, such as the 2008 subprime mortgage crisis in the US and the 2016 financial crisis in China, may affect other countries, as well. Therefore, there is an increasing need to prepare for these threats by detecting their risk. Several studies have been conducted to predict the possibility of financial accidents through monitoring. However, if such monitoring is not performed correctly, it may lead to more adverse effects. Furthermore, if misleading policies are established for a company, the company may face crucial credit consequences, such as bankruptcy. In addition, investors are particularly sensitive to monitoring. If they make a wrong choice, they may pay for it dearly. Therefore, investors are trying to obtain meaningful results for their investment by acquiring and analyzing a variety of information using their own methods. It is important for investors to know what kind of data to acquire and how to acquire new data. If investors can acquire diverse data, they are more likely to make unbiased decisions. Furthermore, if they acquire newer data, they may obtain more favorable results. Besides accuracy, acquiring unbiased and fresh data is also important to assist users to invest.

Many studies have exploited firm data, such as financial statements and disclosure data. These data can be used as the basic data for analyzing companies because they objectively represent the company [1,2]. Firm data can be used to assess the risk level of the firm. They also provide information about the superiority of the enterprise [3,4]. Other studies have proposed that companies could be evaluated with not only firm data but also in other ways, such as with social data. Several studies have shown that risk detection can be performed through social data [5,6,7,8,9,10]. These studies have suggested that economic news and social data may trigger financial market movements and organizational behavior, and that patterns of SNS (Social Network Services) investors can have a significant impact on other investors’ trading behaviors. Because stock price manipulation through organizational behavior and arbitrage transactions using non-transaction information have been increasing recently, news and social data can be used to detect these movements. However, the data used to analyze firms in previous studies have limitations. First, in the case of disclosure data or financial statements, the results of the analysis may differ from actual investment results. Even companies that appear sound in their financial statements may manipulate their accounts or have incidents that are not recorded in their accounts. Moreover, by the time such data are confirmed and an investment is linked to them, it is too late for investors to respond, because the information is likely to be reflected in the market already. Second, in previous studies exploiting opinion mining, real-time updates were not made because the data were acquired and analyzed in advance. As a result, the data used in previous studies may be biased. In addition, data analysis may not be performed in real time. Therefore, it is possible to make wrong choices when investing on the basis of the acquired information.

To overcome the limitations of previous studies, it is important to develop an intelligent investment decision support algorithm. This algorithm combines disclosure data, artificial intelligence, and opinion mining with vast amounts of social data and news and can be used to not only analyze firm data, but also to reflect investors’ opinions. To do this, a database is constructed in this study by collecting investment-related data through web crawling. In addition, a sentiment lexicon associated with investment is created for machine learning. With the collected data, we conducted semi-supervised learning and detected a promising/risk signal for each firm. Finally, by selecting the investment risk group according to the detected signal, we identify at-risk firms, visualize the result, and present the result as a system. Through this study, users can minimize bias in the data by acquiring and analyzing various data sources. In addition, these data can be collected periodically so that the results reflect updated information and users can quickly identify meaningful information. Our ultimate goal is to present a tool that users can easily grasp, by suggesting and grading risk based on the obtained information.

Chapter 2 describes background theories related to financial decision-making support systems, opinion mining, and risk detection. Chapter 3 designs a system for developing a financial risk prediction algorithm. Chapter 4 illustrates analysis results based on actual cases and visualizes them as a system UI. Chapter 5 discusses issues of the proposed system’s output. Finally, Chapter 6 presents the conclusions and limitations of this study and suggests future research.

2. Background

2.1. Financial Decision-Making Support System

One of the most important activities in business management is decision-making. The business environment is becoming more complex. The connectivity of various elements is also increasing. Thus, decision-making is becoming difficult and unpredictable. A decision support system is currently being used to assist decision makers not only in firms, but also in individual units. Its use is increasing day by day. Different systems are being used in various fields, as there are various methodologies for providing solutions. Yazdani et al. [11] conducted research to provide an ideal solution based on the Quality Function Deployment (QFD) in the agricultural supply chain. Scalia et al. [12] proposed a multi-variate decision-making system for pancreatic islet transplant. A decision-making system that assists in uncertainty during medical diagnosis using a fuzzy system was also developed [13].

In the stock market, decision-making systems are used in various ways. Paniagua et al. [14] suggested an autonomous emotional decision-making system using artificial emotions to improve the decision-making autonomy of the system and achieve better investment results. An adaptive stock index trading decision system has been proposed to predict the movement of stock index prices, capture trading opportunities, and minimize errors and costs [15]. Weng et al. [6] developed a financial expert system for predicting short term stock prices based on counts and sentiment scores of news articles. Another system to assist users in financial trading has also been proposed, to predict stock trading patterns by integrating the support vector machine and portfolio selection theory [16].

However, these previous studies mainly reflected the perspective of the company, while the perspective of the investor was not reflected at the same time. Since the investor plays one of the most important roles in the financial field, the investor's perspective needs to be reflected to derive an accurate result. Thus, a business-oriented indicator considering the objectivity of a firm is proposed in this study. Another type of indicator that reflects the perspective of investors is also proposed. Based on both types of indicators, the results reflect both perspectives and show the proper overall risk level of the enterprise, to assist investors in future investment.

2.2. Opinion Mining

Opinion mining, also known as sentiment analysis, refers to the extraction of subjective emotional information. Human language, which is freely expressed, is divided into positive and negative elements using natural language processing and text mining [17]. Opinion mining research can be categorized by whether it derives quantitative or qualitative information. Research that derives quantitative information focuses on emotional analysis, while research that derives qualitative information focuses on content analysis. Quantitative opinion mining assigns a sentiment value to information based on the degree of positive and negative opinions. The sentiment value expresses the relative strength of negative and positive opinions as a number. It is not a binary classification of affirmative or negative. Singh et al. [18] analyzed movie review data and derived an evaluation from users. Min and Park [19] proposed a methodology for screening and using review data based on sentiment value. Qualitative opinion mining aims to extract specific contents from opinions. Lee [20] extracted the needs of users from online reviews. Thorleuchter [21] explored potential customers based on the characteristics of the reviewers. Ghazizadeh et al. [22] tried to determine the complaints of specific targets from various reviews and proposed ways to resolve these complaints. Alkubaisi et al. [5] suggested a classification model for the stock market exchange based on a sentiment analysis of Twitter. Zhang et al. [7] developed a model for the reliability of stock comments by considering analysts’ opinions and their shifting patterns.

In the current study, factors that could influence the investment choice of each firm were proposed from the perspective of a combination of quantitative and qualitative analyses. The influence of each factor was then presented as a quantitative value. News and subjective investor opinions were analyzed to determine the objective and subjective opinions of each company. Since news conveys objective facts for each company, it can be interpreted as a factor through which investors can decide whether to invest or not. Furthermore, subjective investor opinions of objective facts about companies, expressed in SNS and the security community, can be used as an effective basis for decision-making by other investors who do not express such opinions. The derived data were exploited to extract positive and negative core keywords that could influence investment choice. Each core keyword is derived by a Naïve Bayes Classifier. By using graph-based semi-supervised learning, the emotional values of the SNS and news documents were derived by calculating the emotional values of the factors related to the company, excluding core keywords. Finally, the risk signal is predicted from four perspectives to derive a unified indicator of the investor’s perspective. This value is then used to monitor bad credit events for the company with a credit risk.

2.3. Risk Detection in Finance

Firms need to consider many aspects of their business. They especially need to monitor risks that may arise. To avoid or prepare for risks, firms have been constantly exploring the development of systems that can anticipate and alert firms about risks in advance. An early warning system (EWS) is a representative system that can reduce the risk of an enterprise or an individual by predicting an unusual situation. It can also assist in the identification of risks quantitatively based on current conditions and possible risks [23]. EWS can be used in various fields, including financial monitoring and reporting of potential problems, risks, and opportunities before they may affect financial statements. Therefore, EWSs provide opportunities to prevent or mitigate potential problems [23].

The EWS used in the financial field has been developed based on several models. A traditional EWS exploits regression models, such as logistic and probit models [24,25]. A model that reflects country-specific characteristics has been proposed using various indicators [26]. A methodology based on multiple-criteria decision-making (MCDM) has also been used [27]. An EWS using an artificial neural network has also been proposed. For example, Brockett and Cooper [28] proposed an EWS using a neural network based on various variables. Kim et al. [29] used EWS models based on an analysis of the Korean economy using an Artificial Neural Network (ANN) as an indicator. Yang et al. [30] exploited the EWS with ANN to detect the financial risk of a bank. However, these studies focused on indicators that could represent the status of companies, while the opinions of investors were not sufficiently reflected. This problem needs to be overcome because an enterprise-centered evaluation can lead to biased results.

Therefore, the present study suggests a new concept in risk detection by adding a process that can reflect the opinions of investors. This concept exploits indicators that can represent the company. The risk level from the firm’s perspective and the opinion level from the investor’s perspective are linked with each other to derive a comprehensive risk level. The degree of risk is then determined by judging the trend or situation of the firm. Based on the degree of risk, investment strategies are provided. Finally, a system is proposed and results derived from the system are illustrated.

3. System Design

3.1. Basic Concept

The main purpose of crisis detection research is to find signals by collectively using data related to companies. Disclosure data and financial statements are usually exploited to identify risk signals because these data describe the firm objectively. These data can be used to identify firm activity and credit events of the past. Thus, they are useful to determine corporate soundness and pinpoint companies that need monitoring. Although some companies can be evaluated as sound companies by data such as financial statements, they may possess hidden risks, such as accounting manipulations or serious events that are not presented in the disclosure data. For such companies, a risk signal can be derived by analyzing data other than financial data, such as the opinions of investors. Therefore, opinion mining is exploited to derive hidden information or reflect investors’ opinions about the firm. These two datatypes are then exploited together to calculate a risk signal indicator for each company. The risk signal indicator is defined based on the probability of the occurrence of a credit event. The risk level of the company is also derived by the calculated indicator. Finally, a series of processes are systemized to enable continuous risk signal detection of the firm. Thus, this study evaluates a crisis in two aspects: (1) the opinion mining aspect of corporate data that can objectively express the company; and (2) investors’ opinions of the company as shown in Figure 1. Such a comprehensive risk assessment could lead to more accurate risk level monitoring for the firm.

3.2. System Process

3.2.1. Credit Risk Evaluation

For credit risk evaluation, financial statements and disclosure data are used so that the risk degree of a company can be assessed quantitatively. Financial statements and disclosure data are quantitative documents that show the financial status and business performance of a company. As described in Section 2, these data have been analyzed in various studies. Financial indicators based on disclosure data are selected to identify the possibility of the following credit events that could occur in the company: (1) abolishment of a listing, (2) insolvency, (3) rehabilitation proceedings, and (4) bankruptcy. The related indicators for each credit event, and their explanations, are listed in Table 1.

The monitoring indicator is proposed to evaluate the risk level of a firm based on a financial statement and prior studies of financial indicators [31,32,33,34,35,36,37]. Indicators already used by banks are examined first, due to their availability, to choose proper indicators. The final set of assessment indicators was reviewed by the researchers and financial experts at a leading financial investment company in Korea. These indicators used by banks are shown in Table 2. Table 3 shows monitoring indicators associated with each credit event based on financial statements. Monitoring indicators based on disclosure data are counted whenever an incident-related indicator of each type is registered. The validity of the disclosure data is measured within three months, six months, and one year, for the following reasons. The Korean Financial Services Commission has suggested that the situation of a debtor should be monitored for about three months for its ability to repay future obligations. The world’s top three rating agencies (Moody’s Corporation, S&P, and Fitch Group) have a grace period of six months for credit rating adjustments. KOSDAQ and KOSPI in the Korean stock market currently evaluate delisting criteria over a period of one year. Referring to these cases, the disclosure data of a firm are evaluated by dividing the period into three sections (3 months, 6 months, and 1 year).

Some proposed disclosures may have positive and negative effects on the market. For instance, the disposal of treasury stock can be interpreted as positive if it is used as a means to improve the firm’s environment or to raise investment resources for the long term. On the other hand, there is room for negative interpretations. For example, it can be interpreted that the company lacks money for financing. Table 4 shows the list of the possibilities through which the monitoring index can be evaluated as positive or negative and the proposed criteria for evaluating the decision. Candidates for each indicator were selected by referring to previous studies [34,38,39] and, finally, after a review by experts, as shown in Table 2 and Table 3. If an indicator is evaluated as positive, it is counted as −1. If the indicator is evaluated as negative, it is counted as +1.
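The counting rule for disclosure-based indicators can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `disclosure_scores`, the `(date, polarity)` input encoding, and the day counts used to approximate 3 months, 6 months, and 1 year are all assumptions made for the example.

```python
from datetime import date, timedelta

# Evaluation windows in days (assumed approximations of 3 months, 6 months, 1 year).
WINDOWS = {"3m": 91, "6m": 182, "1y": 365}

# Counting rule from the text: a negative disclosure counts +1, a positive one -1.
POLARITY_SCORE = {"negative": 1, "positive": -1}


def disclosure_scores(disclosures, as_of):
    """Sum the +1/-1 counts of disclosures that fall inside each window.

    `disclosures` is a list of (filing_date, polarity) pairs; this encoding
    of the expert judgment is hypothetical.
    """
    scores = {}
    for name, days in WINDOWS.items():
        start = as_of - timedelta(days=days)
        scores[name] = sum(
            POLARITY_SCORE[polarity]
            for filed_on, polarity in disclosures
            if start < filed_on <= as_of
        )
    return scores


sample = [
    (date(2017, 8, 1), "negative"),
    (date(2017, 4, 15), "negative"),
    (date(2016, 11, 1), "positive"),
]
scores = disclosure_scores(sample, as_of=date(2017, 8, 31))
print(scores)
```

With this sample, the most recent negative disclosure is the only one inside the 3-month window, while the 1-year window also includes the older positive disclosure, which offsets one negative count.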

3.2.2. Opinion Mining

The opinion mining process evaluates companies based on corporate news and community opinions. First, firm-related news and opinions are gathered. Unnecessary information and advertisements are then removed from the collected data. The refined data are then parsed for sentiment analysis and word2vec analysis. After data collection and refinement, stock prices are exploited to assign sentiment values. It is assumed that news and social data can affect stock price volatility for a period of three days. For instance, if the stock price of company A rises on September 1, opinions registered between August 29 and August 31 are considered positive. This three-day period was chosen through qualitative evaluation as the standard period within which opinions are well reflected in the stock price.
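The three-day labelling rule can be sketched as below. This is an illustrative sketch under stated assumptions: the function `label_documents`, the `price_moves` mapping of dates to "up" or "down", and the tie-break of using the first price move found in the following three days are not specified in the text and are introduced only for the example.

```python
from datetime import date, timedelta

LOOKBACK_DAYS = 3  # influence period stated in the text


def label_documents(doc_dates, price_moves):
    """Label each document date by the price move it is assumed to precede.

    `price_moves` maps a date to "up" or "down" (hypothetical input format).
    A document takes the label of the first price move found within the
    next LOOKBACK_DAYS days (an assumed tie-break); otherwise it stays None.
    """
    labels = {}
    for doc_date in doc_dates:
        label = None
        for offset in range(1, LOOKBACK_DAYS + 1):
            move = price_moves.get(doc_date + timedelta(days=offset))
            if move is not None:
                label = "positive" if move == "up" else "negative"
                break
        labels[doc_date] = label
    return labels


# The example from the text: the price rises on September 1.
moves = {date(2017, 9, 1): "up"}
docs = [date(2017, 8, 28), date(2017, 8, 29), date(2017, 8, 31), date(2017, 9, 1)]
labels = label_documents(docs, moves)
print(labels)
```

Documents dated August 29 to August 31 receive a positive label, while the August 28 and September 1 documents fall outside the window and remain unlabelled.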

Keywords with definite polarity values are selected. Sentiment values are then assigned to opinions to derive the sentiment value for each word. Four keywords with the largest sentiment values among the calculated words are selected. Based on the co-occurrence of the word using previously parsed data, word embedding is performed with word2vec. The coordinate value of the word is then derived based on embedding. Finally, sentiment values and coordinates for the core keyword derived from the above are associated with the word embedding result. The sentimental value for each word is then derived using the distance between the core keyword and other words. When the core keyword is positive, similar characteristic words will be embedded near the core keyword. Thus, they can be evaluated as positive words. Likewise, when the core keyword is negative, embedded words near the core keyword can be evaluated as negative words. Ultimately, the sentiment values of all words can be calculated based on their distance from the core keyword. After calculating the sentiment values for all words, the sentiment value per document is calculated.

The distance between a word and a core keyword is measured as the Euclidean distance, and the sentiment value is allocated in inverse proportion to the squared distance. The sentimental value of keyword i, the proximity index between keyword i and core keyword j, and the distance between keyword i and core keyword j are as follows:

\[ \text{Sentimental value of keyword } i = \sum_{j = 1}^{m} (\text{Proximity index between keyword } i \text{ and core keyword } j) \]

\[ \text{Proximity index between keyword } i \text{ and core keyword } j = \frac{1}{r_{j}^{2}} \times (\text{Sentimental value of core keyword } j) \]

\[ r_{j} = \text{Euclidean distance between keyword } i \text{ and core keyword } j = \sqrt{(x_{i} - x_{j})^{2} + (y_{i} - y_{j})^{2}} \quad (j = 1, \dots, m). \]

For example, suppose there are two core keywords, ‘shipping’ and ‘buying’. The coordinates of ‘shipping’ are (1.2522, 0.0515), and its sentiment value is −17.1. The coordinates of ‘buying’ are (−5.1012, 3.1122), and its sentiment value is 20. If the coordinates of the word ‘high altitude’ are (1.8050, 0.7221), the inverse squared distance 1/r² between ‘high altitude’ and ‘shipping’ is approximately 1.324, while that between ‘high altitude’ and ‘buying’ is approximately 0.0187. Multiplying each core keyword's sentiment value by its 1/r² factor and summing, as in the above equations, gives a sentiment value of −22.2658 for ‘high altitude’.
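The worked example can be checked with a few lines of code. The sketch below applies the inverse squared distance weighting from the equations to the coordinates and sentiment values quoted in the example; the function name `sentiment_value` and the data layout are assumptions made for illustration.

```python
def sentiment_value(word_xy, core_keywords):
    """Sum of core-keyword sentiment values weighted by 1 / r^2.

    `core_keywords` is a list of ((x, y), sentiment) pairs; r is the
    Euclidean distance between the word and each core keyword.
    """
    total = 0.0
    for (cx, cy), core_sentiment in core_keywords:
        r_squared = (word_xy[0] - cx) ** 2 + (word_xy[1] - cy) ** 2
        total += core_sentiment / r_squared
    return total


# Values quoted in the example: 'shipping' and 'buying' are the core keywords.
core = [((1.2522, 0.0515), -17.1), ((-5.1012, 3.1122), 20.0)]
value = sentiment_value((1.8050, 0.7221), core)
print(round(value, 4))
```

The result agrees with the −22.2658 reported in the example to four decimal places. Note that the sketch does not guard against a word that coincides exactly with a core keyword (r = 0), which the text does not discuss.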

The monitoring index for opinion mining is then calculated based on the sentiment value of each document. The monitoring indicators express how negative the opinion about a firm is, for example through the ratio of negative documents or the period during which the negative opinion is maintained, in order to determine the risk signals of the firm. Since the units of the various indicators differ, standardization is applied to integrate the indicators. Definitions and explanations of each indicator and unit are shown in Table 5.
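Two of the indicator types mentioned above, the ratio of negative documents and the period over which negative opinion persists, can be sketched as follows. The exact definitions are those of Table 5 and are not reproduced here, so the functions below are assumptions: a document is treated as negative when its sentiment value is below zero, persistence is measured as the longest run of consecutive negative days, and standardization is shown as min-max rescaling to the range 0 to 1.

```python
def negative_ratio(doc_sentiments):
    """Share of documents whose sentiment value is below zero (assumed definition)."""
    if not doc_sentiments:
        return 0.0
    negatives = sum(1 for s in doc_sentiments if s < 0)
    return negatives / len(doc_sentiments)


def longest_negative_run(daily_sentiments):
    """Longest run of consecutive days with negative sentiment (assumed definition)."""
    longest = current = 0
    for s in daily_sentiments:
        current = current + 1 if s < 0 else 0
        longest = max(longest, current)
    return longest


def min_max(values):
    """Rescale values to 0..1 so indicators with different units can be combined.

    Min-max scaling is an assumption; a constant input maps to 0.0.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


print(negative_ratio([-1.0, 2.0, -3.0, 4.0]))
print(longest_negative_run([1.0, -1.0, -2.0, 3.0, -1.0]))
print(min_max([2.0, 4.0, 6.0]))
```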

3.2.3. Signal Detection

The risk signal of a company is explored by utilizing the monitoring index, including the index based on disclosure data and financial statements and the index based on opinion mining. Financial statements and disclosure data can be treated as external evaluations of corporate activities and the status of the company. Opinion mining is not only related to corporate activities but also to investors’ opinions or external environmental assessments. Financial statements and disclosure data are updated regularly or irregularly according to the time of submission. On the other hand, the sentiment value of opinion mining always changes. Therefore, the signal should be divided into two perspectives. The value is expressed as follows:

\[ \text{Firm activity index} = a \times (\text{weighted sum of indicators based on financial statements}) + b \times (\text{weighted sum of indicators based on disclosure data}) \]

\[ \text{Opinion mining index} = \text{weighted sum of opinion mining indicators}. \]

To suggest a risk signal, each type of indicator is classified into five grades. To classify these grades, cut-off values are defined. A total of 528 cases are analyzed to determine the cut-off values, which are set at the top 20%, 40%, 60%, and 80% of each index, as shown in Table 6. Each indicator was standardized from 0 to 1, and an integrated index was calculated as the weighted sum of the indicators. For the firm activity index, the weight of the financial statement indicators was set to 40% and the weight of the disclosure data indicators to 60%. Financial statements do not change significantly within one year. Disclosure data, however, can be obtained at any time, so they are assigned a higher weight because they are highly likely to affect investment in the company. In other words, it is possible to make a more accurate prediction by using an indicator that constantly fluctuates from a fixed perspective. Finally, based on the measured firm activity index and opinion mining index, the interval average grade is measured for 1 year, 6 months, and 3 months. Furthermore, the total risk grade is derived by following the total risk grade matrix based on the calculated firm activity grade and the opinion mining grade. This matrix is shown in Figure 2. Grades are then classified into danger, warning, or attention signals. The total risk grade is evaluated from two perspectives, the firm activity grade and the opinion mining grade, and takes a value from 1 to 5. For example, if the firm activity grade is more than 0.45, and the opinion mining grade is less than 0.20, then the total risk grade is 1, the lowest risk grade. In contrast, if the firm activity grade is less than 0.20, and the opinion mining grade is more than 0.44, the total risk grade is 5, the highest risk grade. The criteria used for distinguishing these risk signals are shown in Figure 3.
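The index calculation and five-grade classification can be sketched as below. The 40%/60% weights come from the text; everything else is an assumption for illustration: the function names, the use of `statistics.quantiles` to obtain four quintile cut points from a sample of index values, and in particular the direction of the mapping (a higher index giving a higher grade). The actual cut-off values are those of Table 6 and are not reproduced here.

```python
import bisect
import statistics


def firm_activity_index(fs_score, dd_score, a=0.4, b=0.6):
    """Combine the financial statement and disclosure weighted sums.

    a = 0.4 and b = 0.6 are the weights stated in the text.
    """
    return a * fs_score + b * dd_score


def quintile_cutoffs(index_values):
    """Four cut points that split observed index values into five groups."""
    return statistics.quantiles(index_values, n=5)


def grade(index_value, cutoffs):
    """Map an index value to a grade from 1 to 5.

    Assumes ascending cutoffs and that a higher index means a higher
    grade; the direction used in the original system may differ.
    """
    return bisect.bisect_right(cutoffs, index_value) + 1


# Hypothetical sample of standardized index values between 0 and 1.
sample_indices = [i / 10 for i in range(11)]
cutoffs = quintile_cutoffs(sample_indices)

index = firm_activity_index(fs_score=0.5, dd_score=1.0)
print(index, grade(index, [0.2, 0.4, 0.6, 0.8]))
```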

In this study, the risk was conservatively recognized. The average of two indicator classes was rounded up for strict management. However, if the difference between the two grades is 4, the integrated risk grade itself is defined as 4 because grades deviating from each other may produce distorted results when using averages.
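The combination rule described in this paragraph can be written as a short function. This is a sketch of the stated rule only (average rounded up, with the special case for a difference of 4); the function name `total_risk_grade` is hypothetical, and the matrix in Figure 2 remains the authoritative definition.

```python
import math


def total_risk_grade(firm_grade, opinion_grade):
    """Combine two grades (1 to 5) conservatively.

    The average is rounded up; when the grades differ by 4 (one is 1 and
    the other is 5), the combined grade is fixed at 4, as stated in the text.
    """
    if abs(firm_grade - opinion_grade) == 4:
        return 4
    return math.ceil((firm_grade + opinion_grade) / 2)


print(total_risk_grade(2, 3))
print(total_risk_grade(1, 5))
```

For grades 2 and 3 the average of 2.5 is rounded up to 3. For grades 1 and 5 the plain average would be 3, but the special case raises the result to 4.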

3.3. System Architecture

In this section, the types and modules of DB (Database) are defined. These types and modules are required for the system to systematize the research result. The system architecture shows what data are required for each module and where these data come from. A real system user interface will be illustrated in Chapter 4.

3.3.1. Database

To derive results from the risk assessment process, data are required. Since the data required for each process module are different, databases associated with these data are also different. The DB required for this process is divided into three categories: a corporate information DB, which contains disclosure data and financial statement of the firm, a news DB, where corporate news is collected, and a social DB that collects social data. First, the corporate information DB collects data related to company activities that are disclosed by each company. It contains financial statements and disclosure data. The news DB collects news data by searching articles linked to Korean search portal sites. These are collected by web crawling. Finally, the social DB exploits company-specific keywords on several community sites. Twitter is a representative example. The social DB also exploits web crawling for data collection. Unnecessary information and advertisements are then removed from news and social data. Preliminary processing is then performed by parsing for sentiment analysis and word2vec analysis.

3.3.2. Modules

Modules are needed to construct the risk assessment process. There are three modules: the credit risk assessment module, the opinion mining module, and the signal detection module. First, the credit risk assessment module evaluates the financial soundness of a company based on its business activities. The module then identifies companies that are predicted to be at risk of bankruptcy as the risk group and uses them as the target companies for analysis. Based on the collected companies’ information, a Support Vector Machine (SVM) analysis is conducted to predict companies that have a possibility of default. The SVM exploits past corporate data to distinguish between the credit event occurrence group and the normal repayment group. Financial indicators (business performance, profitability, stability evaluation) and non-financial indicators (number of disclosures) are exploited as independent variables. Credit events are used as dependent variables. The opinion mining module exploits news and social data and assigns sentiment values to opinions based on the fluctuation of stock prices. If the stock price rises, news and social comments created on that date are evaluated as positive. On the other hand, when the stock price declines, news and social comments are considered to be negative.

Finally, the signal detection module predicts the credit risk of companies that are selected as risk companies. Based on the characteristics of each signal, the possibility of future credit events and the risk level are then derived. To this end, the firm activity index and opinion mining index are used to generate ratings for each indicator of the company and identify the integrated risk level for each period. As output, the module plots changes in the risk level over time and presents an investment risk signal. This signal is divided into three stages (danger, warning, and attention), thereby presenting a risk level for each company.

3.3.3. Functions

This system can automatically check news articles, personal postings, and public information about the pre-selected company and provide the results of a numerical index-based risk assessment based on a sentiment analysis. In addition, system users can directly participate in the evaluation by marking data sources, such as news articles, as positive or negative. The results are reflected in the learning process of sentiment analysis and are dynamically updated to enhance analysis performance. For systemization, each module loads different data from the DB. First, the credit risk assessment module retrieves data from the corporate information database because the module is based on the company’s information. Its data include financial statements and disclosure data that contain a history of business activities. This module reads positive integers, such as the number of times each major credit event has occurred, and normalized values that describe the variables in the financial statements. Next, the opinion mining module retrieves data parsed from the news DB and social DB. Finally, the signal detection module exploits the sentiment value of the opinion mining module and company information from the credit risk assessment module.

4. Illustration

4.1. Data Collection

In this section, we illustrate the system with a real firm listed on the stock market. Company A was chosen as the target for analysis. Company A produces titanium dioxide and cobalt sulfate in Korea. It is currently an important company. Its stock has risen more than three times over the entire stock market and the KOSDAQ market, as of September. In addition, opinions about it are diverse, and it has various disclosures. Thus, it is appropriate to select Company A as a subject to be analyzed in this study. A period of one year (from September 2016 to August 2017) was selected as the analysis period. First, the data of company A were collected. The collected data of company A are shown in Table 7. To collect the firm data, we developed R code for web crawling.

4.2. Credit Risk Evaluation

To evaluate the business activity index for company A, data were taken from the company information DB. The number of financial statements and credit events in the disclosure data are the required data. For company A, the number of disclosures in 2017 increased significantly. In addition, its corporate activity index changed drastically due to the issuance of corporate bonds (CB, EB). As a result, company A had 8 credit events. Company A is evaluated as a company that has a good asset size and quality. However, its derived business performance is not particularly good, and its derived repayment ability is weak. Specific details are shown in Table 8.

4.3. Opinion Mining

To evaluate the opinion mining index for company A, collected news and social data were fetched. Based on these data, a word2vec analysis was performed and the sentiment value was propagated to neighboring words based on the sentiment value of the core keyword. Sentiment scores of the entire document were calculated using the sentiment value of each word. Ultimately, it was possible to calculate the opinion mining index for company A. Through this process, the following results were derived. In May 2017, the negative word ‘split’ appeared. In August 2017, negative words such as ‘convertible bonds’ were mentioned frequently. Very negative keywords appeared in relation to the issuance of corporate bonds in March 2017. In August, there were sudden market fluctuations. At the same time, negative terms such as ‘convertible bonds’ overlapped, indicating very negative values. On the other hand, in accordance with the government’s electric car policy, there was a period in which company A emerged as a beneficiary and was evaluated positively. In that period, a positive score was derived.
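The propagation step can be sketched as follows. This is only one plausible reading of the procedure: each word inherits the sentiment of its most similar core keyword, scaled by cosine similarity, and a document score is the mean over its scored words. The toy two-dimensional vectors stand in for trained word2vec embeddings, and the names `propagate_sentiment`, `document_score`, and `min_similarity` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def propagate_sentiment(vectors, seeds, min_similarity=0.5):
    """Spread seed sentiment to neighboring words.

    vectors: word -> embedding (every seed word must be present).
    seeds: core keyword -> sentiment value in [-1, 1].
    Assumption: a word takes the value of its most similar seed,
    scaled by that similarity, if the similarity is high enough.
    """
    scores = dict(seeds)
    for word, vec in vectors.items():
        if word in seeds:
            continue
        best_sim, best_val = 0.0, 0.0
        for seed, value in seeds.items():
            sim = cosine(vec, vectors[seed])
            if sim > best_sim:
                best_sim, best_val = sim, value
        if best_sim >= min_similarity:
            scores[word] = best_sim * best_val
    return scores


def document_score(tokens, scores):
    """Mean sentiment over the tokens that have a score; 0.0 if none do."""
    values = [scores[t] for t in tokens if t in scores]
    return sum(values) / len(values) if values else 0.0


# Toy embeddings; a real system would load them from a trained model.
vectors = {
    "split": (1.0, 0.0),
    "spin-off": (0.8, 0.6),
    "beneficiary": (0.0, 1.0),
    "policy": (0.6, 0.8),
}
seeds = {"split": -1.0, "beneficiary": 1.0}
scores = propagate_sentiment(vectors, seeds)
doc = document_score(["split", "spin-off", "unlisted-word"], scores)
```

Because a word can sit close to both a positive and a negative seed, taking only the nearest seed is a simplification; averaging over several neighbors is an equally reasonable alternative.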

4.4. Signal Detection

The business activity index and opinion mining index for company A are shown in Table 9. Company A showed large fluctuations in both ratings. In particular, the business activity grade deteriorated from 2 to 5 within one year. This was because the number of disclosures increased significantly in 2017. In 2017, the Korea Financial Authority requested disclosure from company A four times to verify rumors about company A. In addition, company A issued two convertible bonds (CB) to secure operating funds. The risk level for company A continued to fluctuate while it proceeded with a spin-off. As a result, company A’s integrated risk rating was evaluated as very dangerous (4.4 for one year, 4.31 for the past six months, and 4.50 for the most recent three months). A high score was derived for its business activity. The opinion mining index, which scored over 3.5, was simultaneously calculated to be high. Thus, the integrated risk rating was evaluated as ‘caution’. This indicates that there is a need to pay attention to investment in company A. It also indicates that continuous observation is necessary for company A.

4.5. Interface Implementation of Risk Signal Detection System

After the integrated risk rating of a company is evaluated, positive / negative keywords and relevant contents are visualized, and key perspective points are presented for each company. Positive / negative keywords for each key point in the corporate valuation are presented. Keyword-related company data are then derived in the form of a keyword map. Customized monitoring and investment strategies for each risk assessment are then derived. In addition to the above-mentioned integrated risk level, this systemization will provide users with necessary information. This system aims to visualize the results after taking the required data from the DB and performing the necessary processes for each module, as described in Section 3 above.

The process of collecting data begins by collecting newly published news, community comments, and DART announcements on a daily basis. When the program is executed, data collection starts automatically for the analysis of the target companies entered in advance. These collected data are presented in the UI (User Interface) as shown in Figure 4. The period of analysis can also be set. In addition, positive and negative words registered in advance can be checked to improve the accuracy of the sentiment analysis for each company as shown in Figure 5. Word importance cut-off values and an analysis target period for learning the data are then set. When the setting is completed, the ‘start learning’ button can be pressed to move to the next step.

The main interface searches for companies and provides information about the opinion mining of the searched companies, as shown in Figure 6. It provides opinion mining scores along with the daily increase and decrease in news for all observed companies that the user has registered in advance. In addition, the main interface provides weekly positive and negative changes in the opinion index and news counts, and a word cloud shows which core keywords appear in positive news and negative news.

If users want to acquire detailed information about an individual company, they can click on the name of the company on the left to go to the UI, which provides detailed information about the company. As shown in Figure 7, users can see company-specific news, community opinions, and disclosure data in the UI.

In order to acquire information about companies that are not registered, the system needs to learn about companies, as mentioned in the framework. A dedicated UI screen is provided for learning a company that has not yet been registered. The system then adds the company to the DB. First, the name of the company to be added can be entered in the lower left corner as shown in Figure 8. The period of data to be learned is then set. The cut-off value that distinguishes between positive and negative news is set, and learning is conducted for company data. It is possible to increase the accuracy of the data by determining the positive or negative values of words by using the dictionary function or by adding the company’s core keyword as shown in Figure 9.

Based on the learned data, users can check what types of news have been reported for companies registered in the DB. These data are presented in the UI, as shown in Figure 10. Using the proposed system UI, we can see that both the credit risk assessment module and the opinion mining module explained in the framework are utilized.

Next, signal detection is performed based on the derived firm activity and opinion mining grade. As shown in Figure 11, the time period in which the rating changes sharply in the risk level graph is highlighted. For the highlighted interval, the core keyword of the period is presented. Each core keyword is then divided into detailed keywords that describe the core keyword. A news article or a social opinion related to the keyword is then visualized. The system then suggests what type of risk signal the company is. Ultimately, visualization of the system allows users to acquire information more easily. By obtaining these refined data, users will be able to make more informed decisions.
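Highlighting the periods in which the rating changes sharply can be approximated by comparing consecutive risk levels. The sketch below assumes that "sharply" means an absolute change of at least `min_jump` between neighboring periods; that threshold, the function name `sharp_changes`, and the sample series are hypothetical.

```python
def sharp_changes(levels, min_jump=1.0):
    """Return intervals whose risk level moved by at least min_jump.

    levels: list of (period_label, risk_level) in chronological order.
    Each result is (start_label, end_label, change).
    """
    highlighted = []
    for (prev_label, prev), (label, cur) in zip(levels, levels[1:]):
        change = cur - prev
        if abs(change) >= min_jump:
            highlighted.append((prev_label, label, change))
    return highlighted


# Hypothetical quarterly risk levels.
series = [
    ("2016-09", 2.0),
    ("2016-12", 2.5),
    ("2017-03", 4.0),
    ("2017-06", 4.5),
]
intervals = sharp_changes(series)
```

The highlighted intervals can then be used to select the news and social posts from which the core keywords of that period are drawn.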

5. Discussion

The risk signal detection system proposed in this study can improve usability by providing immediate convenience to users and real-time analysis. Indeed, many securities firms are collecting real-time data. However, most of the data that play a key role for a company, such as news and disclosure data, are collected by hand. This may produce a slow response from the brokerage firm when an incident occurs. In the worst-case scenario, even a major event may be missed. If our system is utilized, it will be possible to improve accuracy and automate work that currently depends on manual collection. In other words, it is possible to minimize the above-mentioned problems.

In addition, not all events occurring in the securities market can be determined in advance. In some cases, events that are normally evaluated as negative can be evaluated positively and can be reflected positively in stock prices. On the other hand, events that were evaluated as positive might have an adverse effect on the stock price. Therefore, the most important part of making a decision on securities investments should be to quickly understand how investors’ opinions are reflected in the stock price. To this end, the system exploits disclosure data and financial statements, which can be assessed objectively. The system also quantifies the opinions of individuals, which are subjective. Thus, the system provides a basis for supporting decision-making.

Finally, the system has the advantage of utilizing qualitative and quantitative methodologies appropriately in the risk assessment process. If the system is biased toward a qualitative method, there may be a question about the reliability of the system. If it lacks numerical visualization, it can confuse users. Conversely, if the system is biased in a quantitative way, it may exclude unseen information, resulting in a result that does not reflect reality. For this reason, it is important to properly combine both methodologies. This system demonstrates a risk management technique that properly utilizes both quantitative and qualitative information.

However, this system has a drawback in that it is too dependent on keywords. Depending on the positive or negative aspect of the keyword, the sentiment value of neighboring words also changes. It is possible that positive and negative keywords are mixed due to the natural characteristics of word2vec. Thus, it is necessary to suggest complements or alternatives for the visualization data of word2vec in the system because it can be difficult to grasp the whole flow. Even visualized data can provide inaccurate views when users interpret the data. In addition, since the system relies on NLTK, a natural language processing package, to process Korean text, it has a problem in that the name of the firm is not recognized as a proper noun. This can degrade the accuracy of the analysis, requiring the cumbersome process of adding the word to the dictionary in advance.

In order to improve the accuracy of the analysis, the selection of words, which were reflected in the dictionary and limited to noun forms, should be refined. Although different parts of speech, such as verbs and adjectives, could affect the sentiment value, these parts of speech were excluded because they created a large amount of noise. Therefore, the sentiment analysis needs to be undertaken more precisely.

Furthermore, this study can extend the framework by applying a deep learning-based model that automatically optimizes parameters according to learning data [40,41]. In addition, corporate opinion data can be applied to a variety of areas in managerial works [42]. In other words, the purpose of analysis can be expanded to a model applicable to various industries by applying it to marketing and other areas, such as the development of products and technologies, rather than from a financial investment perspective.

6. Conclusions

In this study, an intelligent security investment decision support system was developed by linking opinion mining of news data, social data, and corporate data to existing security investments. First, to screen at-risk firm groups, firm data were obtained, and the system tried to exploit SVM and identify the keyword appearance frequency of credit events. From the credit event identification process, a firm assessment score was derived. After the screening process, the sentiment value of the document was calculated by exploiting the core keyword sentiment value derived from the news and investor opinion. Thus, each risk value of the firm (the so-called opinion mining score) was derived. With the firm assessment score and the opinion mining score, the risk value of the firm was calculated. This value was exploited to forecast credit risk and suggest information about the analyzed firm. All derived data could be visualized and were utilized in the system.

In previous studies, the development of such an algorithm was mainly conducted by using either firm data, such as disclosure data and financial statements, or opinion mining. However, biased results may be derived in this way. Thus, both types of data were exploited in this study. First, it is possible to evaluate the company as at-risk by using only firm data. However, investors are the subjects of the investment, and their opinions are not reflected in the company’s activities. Thus, it is possible to acquire information that cannot be reflected by corporate data by using opinion mining to reflect the opinions of investors. For instance, important news and rumors about a company may not be reflected in the disclosure data or financial statements. However, they can be reflected in investors’ opinions. In other words, it is possible to capture rumors before they are reflected in corporate data. This can help investors act before the speculation materializes and is reflected in stock prices. In addition to company A illustrated in this study, a medium-sized company B was also analyzed. Company B’s firm activity grade was derived as 3, meaning that the firm’s business was relatively good. However, the opinion rating for company B was over 4.5. Thus, negative signals were found, and the investment outlook was determined to be in danger. Immediately after the analysis of company B, the stock price of company B fell by more than 30%.

Conversely, an analysis focused only on opinion mining can overlook business performance. In the case of company A, the opinion mining rating was about 3.5. However, its business activity rating was 5. This means that investors evaluated the company positively, even though the actual performance of company A was not good. In fact, the stock price of company A continued to rise. Investors had many positive opinions about company A. As a result, the company’s stock price was 10 times higher than its price a year earlier, just after the analysis period. However, as the company continued to struggle with its poor financial conditions and management rights, it eventually recorded a decline. As a result, the stock price fell by 40% during the analyzed period. Thus, in order to evaluate a company, it is necessary to evaluate the company’s data along with opinion mining.

Although quantitative methodologies have been widely used in existing risk management systems, they are difficult to understand. In addition, the objectivity can deteriorate if a methodology relies heavily on the qualitative opinions of experts. On the other hand, our system visualizes all results and makes it easy for the public to grasp the information. By automating most processes, users do not have to expend significant resources to derive the information that they need, because they can easily process the information. The proposed system succeeds in ensuring objectivity by relying less on expert opinions. The contribution of the proposed system in this study is that text-based structured qualitative data is used as a quantitative corporate risk assessment indicator through data analysis. This system allows users to see objective corporate evaluation results at a glance by utilizing not only internal confidential information but also shared information on the Internet. In addition, users can easily evaluate and participate in the sentiment evaluation of each data source based on their on-site know-how. This participation is reflected in real-time learning in the system.

From a theoretical point of view, this study suggests a novel methodology to detect a company’s risk signal by utilizing both an indicator of structured corporate data alongside unstructured social data-based opinion mining in the financial stock market. This study has developed a methodology for helping financial investment entities achieve their goals, with the aim of detecting financial risks, indexing corporate information (including financial statements), and analyzing the degree of positivity and negativity through the opinion mining of social data. A novel methodology was presented on the basis of the hypothesis that information on financial risk can be quantitatively analyzed through sentiment analysis on social data extracted from news articles and internet postings.

Based on results of this study, it is possible to derive risk signals through integrated monitoring. It is also possible to obtain various types of decision data for users. Through analysis of one-year data, we can provide signals according to the risk level for each company and provide information that can help investors make decisions based on risk signals. Besides analyzing existing financial statements and disclosure data, we sought to utilize information that could not be derived from existing data through opinion mining. This provides an opportunity to identify new investment momentums while at the same time identifying the risk factors of a company.

However, in this study, since the cut-off value of the risk level was set according to 534 cases of existing companies, the criteria may differ depending on the underlying data set. In addition, the characteristics of company information can differ by industry. Such characteristics were not included in this study. Moreover, there is a lack of expertise in the area of investments according to the signals presented through literature research and case studies. Thus, investment knowledge needs to be added and supplemented. In addition, because data collection takes a long time, real-time analysis may be difficult. Indeed, the system might be unsuitable for short-term stock trading, such as scalping or high-frequency trading. Opinion mining also uses very subjective indicators, which makes it easy to understand the overall flow. However, confidence in the accuracy of opinion mining is not high. Finally, in the case of proper nouns, there is a disadvantage in having to add a terminology dictionary because, due to the limitations of natural language processing, a company’s name might not be recognized correctly. In addition, social data usage requires the pre-validation of data sources to screen out, for example, false articles on the Internet. Since this study did not cover fake news filtering, further studies will require an in-depth study of the process of data refining.

In future research, it will be necessary to define not only enterprise analysis for investment, but also an enterprise evaluation index for various utilization purposes. In addition, further studies need to provide an evaluation result based on indicators. In addition, the scope of the analysis should be extended to individuals, rather than confined to companies, so that it becomes possible to utilize artificial intelligence for customer-specific credit analysis and new customer credit analysis. This will set parameters for each customer, analyze the credit for each type of customer, and process it into another system.

Author Contributions

B.Y. designed the study, outlined the methodology, performed data analysis, and wrote the manuscript. T.R. analyzed the data, interpreted the results, and wrote the manuscript. H.J. performed data analysis and wrote the manuscript. D.Y. designed the study, outlined the methodology, and helped draft the paper. All authors have read and approved the final manuscript.

Funding

This work was supported by National Research Foundation of Korea Grant funded by the Korean Government (NRF-2017R1D1A1A09000758).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Ravisankar, P.; Ravi, V.; Rao, G.R.; Bose, I. Detection of financial statement fraud and feature selection using data mining techniques. Decis. Support Syst. 2011, 50, 491–500.
2. Lin, C.C.; Chiu, A.A.; Huang, S.Y.; Yen, D.C. Detecting the financial statement fraud: The analysis of the differences between data mining techniques and experts’ judgments. Knowl. Based Syst. 2015, 89, 459–470.
3. Güler, A.; Aybars, A.; Kutlu, O. Managing corporate performance: Investigating the relationship between corporate social responsibility and financial performance in emerging markets. Int. J. Product. Perform. Manag. 2010, 59, 229–254.
4. Chen, Y.; Srinivasan, K.; Goodson, G.; Katz, R. Design implications for enterprise storage systems via multi-dimensional trace analysis. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, Cascais, Portugal, 23–26 October 2011; ACM.
5. Alkubaisi, G.A.A.J.; Kamaruddin, S.S.; Husni, H. Stock Market Classification Model Using Sentiment Analysis on Twitter Based on Hybrid Naive Bayes Classifiers. Comput. Inf. Sci. 2018, 11, 52–64.
6. Weng, B.; Lu, L.; Wang, X.; Fadel, M.M.; Martinez, W. Predicting short-term stock prices using ensemble methods and online data sources. Expert Syst. Appl. 2018, 112, 258–273.
7. Zhang, C.; Wang, H.; Du, C.; Wang, Y.; Chen, C.; Yin, H. Stockassistant: A stock AI assistant for reliability modeling of stock comments. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; ACM.
8. Mittal, A.; Goel, A. Stock Prediction Using Twitter Sentiment Analysis; CS229. Stanford University, 2011. Available online: http://cs229.stanford.edu/proj2011/GoelMittal-StockMarketPredictionUsingTwitterSentimentAnalysis.pdf (accessed on 25 August 2017).
9. Johan, B.; Mao, H.; Pepe, A. Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. In Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, Barcelona, Spain, 17–21 July 2011.
10. Si, J.; Mukherjee, A.; Liu, B.; Li, Q. Exploiting topic based twitter sentiment for stock prediction. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Sofia, Bulgaria, 5 August 2013.
11. Yazdani, M.; Chatterjee, P.; Kazimieras, E.; Sarfaraz, Z.; Zolfani, H. Integrated QFD-MCDM framework for green supplier selection. J. Clean. Prod. 2017, 142, 3728–3740.
12. Giada, L.S.; Aiello, G.; Rastelliniet, C.; Micale, R. Multi-criteria decision making support system for pancreatic islet transplantation. Expert Syst. Appl. 2011, 38, 3091–3097.
13. Malmir, B.; Amini, M.; Shing, I.C. A medical decision support system for disease diagnosis under uncertainty. Expert Syst. Appl. 2017, 88, 95–108.
14. Daniel, C.; Cubillos, C.; Vicari, R.M.; Coloma, E.U. Decision-making system for stock exchange market using artificial emotions. Expert Syst. Appl. 2015, 42, 7070–7083.
15. Chiang, W.C.; Enke, D.; Wu, T.; Wang, R. An adaptive stock index trading decision support system. Expert Syst. Appl. 2016, 59, 195–207.
16. Paiva, F.D.; Cardoso, R.; Hanaoka, G.P. Wendel Moreira Duarte. Decision-making for financial trading: A fusion approach of machine learning and portfolio selection. Expert Syst. Appl. 2019, 115, 635–655.
17. Penalver-Martinez, I.; Garcia-Sanchez, F.; Valencia-García, R.; Rodríguez-García, M.Á. Feature-based opinion mining through ontologies. Expert Syst. Appl. 2014, 41, 5995–6008.
18. Singh, V.K.; Piryani, R.; Uddin, A.; Waila, P. Sentiment analysis of movie reviews: A new feature-based heuristic for aspect-level sentiment classification. In Proceedings of the 2013 IEEE International Multi-Conference on Automation, Computing, Communication, Control and Compressed Sensing (iMac4s), Kottayam, India, 22–23 March 2013.
19. Min, H.J.; Jong, C.P. Identifying helpful reviews based on customer’s mentions about experiences. Expert Syst. Appl. 2012, 39, 11830–11838.
20. Lee, S.H. How do online reviews affect purchasing intention. Afr. J. Bus. Manag. 2009, 3, 576–581.
21. Thorleuchter, D.; van den Poel, D.; Prinzie, A. Analyzing existing customers’ websites to improve the customer acquisition process as well as the profitability prediction in B-to-B marketing. Expert Syst. Appl. 2012, 39, 2597–2605.
22. Mahtab, G.; McDonald, A.D.; Lee, J.D. Text mining to decipher free-response consumer complaints: Insights from the NHTSA vehicle owner’s complaint database. Hum. Factors 2014, 56, 1189–1203.
23. Serhan, K.A.; Ozgulbas, N. Financial early warning system model and data mining application for risk detection. Expert Syst. Appl. 2012, 39, 6238–6253.
24. Martin, D. Early warning of bank failure: A logit regression approach. J. Bank. Finance 1977, 1, 249–276.
25. Boitan, I. Development of an early warning system for evaluating the credit portfolio’s quality. A case study Romania. Prague Econ. Pap. 2012, 21, 347–362.
26. Lestano, J.J.; Kuper, G.H. Indicators of financial crises do work! An early-warning system for six Asian countries. EconWPA. Crisis 2001, 1970, 12.
27. Gang, K.; Peng, Y.; Wang, G. Evaluation of clustering algorithms for financial risk analysis using MCDM methods. Inf. Sci. 2014, 275, 1–12.
28. Brockett, P.L.; Cooper, W.W. Report to the State Auditor and the State Board of Insurance on Early Warning Systems to Monitor the Performance of Insurance Companies in Texas; Office of the State Auditor: Austin, TX, USA, 1990.
29. Yoon, K.T.; Joo, K.; Sohn, O.I.; Hwang, C. Usefulness of artificial neural networks for early warning System of economic crisis. Expert Syst. Appl. 2004, 26, 583–590.
30. Yang, B.; Xia, L.; Ji, L.H.; Xu, J. An early warning system for loan risk assessment using artificial neural networks. Knowl. Based Syst. 2001, 14, 303–306.
31. Ou, J.A.; Stephen, H.P. Financial statement analysis and the prediction of stock returns. J. Account. Econ. 1989, 11, 295–329.
32. Sundararajan, V.; Enoch, C.; José, A.S.; Hilbers, P.; Krueger, R.; Moretti, M.; Slack, G. Financial Soundness Indicators: Analytical Aspects and Country Practices; International Monetary Fund: Washington DC, USA, 2002; Volume 212.
33. Mark, I.; Liu, Y. Measuring financial stress in a developed country: An application to Canada. J. Financ. Stab. 2006, 2, 243–265.
34. Cecchini, M.; Haldun, A.; Gary, J.K.; Pathak, P. Making words work: Using financial text as a predictor of financial events. Decis. Support Syst. 2010, 50, 164–175.
35. Owen, L.; Polk, C.; Saá-Requejo, J. Financial constraints and stock returns. Rev. Financ. Stud. 2001, 14, 529–554.
36. Ferson, W.E.; Warther, V.A. Evaluating fund performance in a dynamic market. Financ. Anal. J. 1996, 52, 20–28.
37. Erica, F.; Pande, R.; Papp, J.; Park, Y.J. Repayment flexibility can reduce financial stress: A randomized control trial with microfinance clients in India. PLoS ONE 2012, 7, e45679.
38. Gitman, L.J.; Juchau, R.; Flanagan, J. Principles of Managerial Finance; Pearson Higher Education AU: Sydney, Australia, 2015.
39. Sornette, D. Why Stock Markets Crash: Critical Events in Complex Financial Systems; Princeton University Press: Princeton, NJ, USA, 2017; Volume 49.
40. Ziniu, H.; Liu, W.; Bian, J.; Liu, X. Listening to chaotic whispers: A deep learning framework for news-oriented stock trend prediction. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, Marina Del Rey, CA, USA, 5–9 February 2018; ACM.
41. Liu, J.; Liu, X.; Lin, H.; Xu, B.; Ren, Y.; Diao, Y.; Yang, L. Transformer-Based Capsule Network for Stock Movement Prediction. In Proceedings of the First Workshop on Financial Technology and Natural Language Processing, Macao, China, 12 August 2019.
42. Picasso, A.; Merello, S.; Ma, Y.; Onetoa, L.; Cambria, E. Technical Analysis and Sentiment Embeddings for Market Trend Prediction. Expert Syst. Appl. 2019, 135, 60–70.

Figure 1. Risk detection process.

Figure 2. Total risk grade matrix.

Figure 3. Risk signal decision process.

Figure 4. Main page of the risk signal detection system.

Figure 5. The opinion mining score page of each firm.

Figure 6. News collection for each firm.

Figure 7. Disclosure data collection of each firm.

Figure 8. Firm dictionary setting page.

Figure 9. Common dictionary setting page.

Figure 10. Classifying the sentiment news of each firm.

Figure 11. Signal detection page.

Table 1. Financial indicators related to each credit event.

Credit Event	Pattern of Credit Event Symptom	Disclosure	Criteria
Abolishment of listing	Financing	Issuing private placement corporate bonds	Number of bonds with warrants issues
		Issuing private placement corporate bonds	Number of convertible bonds issues
		Capital reduction without refund	Number of enforced capital reductions without refund
		Issuing small public offering corporate bonds	Number of small public offering corporate bonds issued
	Governance & Management Authority	Replacement of major shareholder	Number of major shareholders replaced in the last 2 years
		Replacement of CEO	Number of CEOs replaced in the last year
		Suspicion of embezzlement and breach of duty	Amount of embezzlements and breaches of duty in the last year
	Sales method	Acquisition of other corporate stocks or investment certificates	Number of other corporate stocks acquired in the last year
	Sales method	Change of business purpose	Number of business purpose changes in the last year
	Audit opinion	Ongoing concern of uncertainty	Number of ongoing concerns and uncertainty in audit opinions
	Audit opinion	Adverse opinion	Number of qualifications, adversions, and disclaimers in audit opinions
Insolvency	Lack of liquidity	Accrued principal and interest	Amount of accrued principal and interest
	Lack of liquidity	Overdue loans	Number of overdue loans
	Malicious hearsay	Request for an inquiry notice	Number of inquiry notices requested in the last year
	Governance & Management Authority	Replacement of major shareholders	Number of major shareholders replaced in the last 2 years
		Replacement of CEO	Number of CEOs replaced in the last year
		Blocking sales from major stakeholders	Number of sales blocked from major stakeholders in the recent year
	Audit opinion	Ongoing concern of uncertainty	Number of ongoing concerns and uncertainty in audit opinions
	Audit opinion	Adverse opinion	Number of qualifications, adversions, and disclaimers in audit opinions
Rehabilitation proceeding	Default	Accrued principal and interest	Amount of accrued principal and interest
	Default	Delinquent loans	Number of ongoing delinquent loans
	Restructuring	Acquisition of other corporate stocks or selling certifications	Number of investments in other companies’ stocks
		Change in essential business	Number of essential business changes
		Transaction suspension	Number of transaction suspensions
		Improving financial structure	Number of self-rescue plans submitted for improving financial structure
	Rehabilitation	Rehabilitation proceedings	Number of applied rehabilitation proceedings
Bankruptcy	Default	Accrued principal and interest	Amount of accrued principal and interest
	Default	Delinquent loans	Number of ongoing delinquent loans
	Rehabilitation	Rehabilitation proceedings	Number of applied rehabilitation proceedings
	Sale of assets	Sales of other corporate stocks or shares	Number of sales of other companies' stocks or shares

Table 2. Financial indicators exploited by the bank.

	Criteria	Relationship with Credit Event
	Criteria	Abolishment of Listing	Insolvency	Rehabilitation Processing	Bankruptcy
Asset Scale 1	$\log\left[\frac{\text{Market Capitalization}}{\log(\text{trailing PER})}\right]$	O	O
Asset Scale 2	$\log\frac{(\text{Total Asset} - \text{IA}) + (\text{Total Sales})}{2}$	O	O
Asset Scale 3	$\log\frac{\operatorname{avg}(\text{Sales}_{3\,\text{years}})}{\operatorname{sd}(\text{Sales}_{3\,\text{years}})}$	O
Quality of Asset 1	$\frac{\text{Non-Current Assets} - \text{Intangible Asset}}{\text{Total Asset}}$			O
Business Performance 1	$\frac{\text{Operating Income}}{\text{Sales}}$	O
Quality of Outcome 1	$\frac{a(\text{NCF}_{3\,\text{years}})}{a(\text{Sales}_{3\,\text{years}})}$	O		O
Quality of Outcome 2	$\frac{a(\text{FCF}_{3\,\text{years}})}{a(\text{Total Asset}_{3\,\text{years}})}$	O		O
Financial Burden 1	$\frac{\text{Total Liabilities}}{\text{Stockholders' Equity}}$	O	O		O
Financial Burden 2	$\frac{\text{Total Borrowings}}{\text{Total Asset}}$	O			O
Repayment Ability 1	$\log\frac{\text{OFC} + \text{TIE}}{\text{Short-term Borrowings} + \text{TIE}}$			O
Repayment Ability 2	$\frac{\text{External FCI} - \text{External FCO}}{\text{Total Borrowings}}$			O

a = Accumulated. NCF = Net Cash Flow. TIE = Total Interest Expense. IA = Intangible Asset. FCI = Funding Cash inflow. FCO = Funding Cash Outflow.
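
As a rough illustration of how a few of the single-period ratios in Table 2 could be evaluated, the sketch below computes them from one set of financial statement figures. The `Financials` class and its field names are hypothetical, the natural logarithm is assumed because the table does not state a base, and any rescaling applied before the values are reported is not covered here.

```python
import math
from dataclasses import dataclass


@dataclass
class Financials:
    # Hypothetical field names; all amounts are assumed to share one currency unit.
    total_assets: float
    intangible_assets: float
    non_current_assets: float
    total_sales: float
    operating_income: float
    total_liabilities: float
    stockholders_equity: float
    total_borrowings: float


def asset_scale_2(f: Financials) -> float:
    # log(((Total Asset - IA) + Total Sales) / 2); natural log is an assumption.
    return math.log(((f.total_assets - f.intangible_assets) + f.total_sales) / 2)


def quality_of_asset_1(f: Financials) -> float:
    # (Non-Current Assets - Intangible Asset) / Total Asset
    return (f.non_current_assets - f.intangible_assets) / f.total_assets


def business_performance_1(f: Financials) -> float:
    # Operating Income / Sales
    return f.operating_income / f.total_sales


def financial_burden_1(f: Financials) -> float:
    # Total Liabilities / Stockholders' Equity
    return f.total_liabilities / f.stockholders_equity


def financial_burden_2(f: Financials) -> float:
    # Total Borrowings / Total Asset
    return f.total_borrowings / f.total_assets
```

The multi-year indicators (Asset Scale 3, Quality of Outcome 1 and 2) would need three years of figures and are left out of this sketch.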

Table 3. Financial statement based monitoring indicators associated with each credit event.

Credit Event	Reason	Related indicator
Abolishment of listing	1. If the firm’s financial statement follows the equation under 50% for two years in a row, the firm is going to abolish its listing: $\text{Impaired capital ratio} = \frac{\text{capital stock} - \text{stockholders' equity}}{\text{capital stock}}$; 2. If sales are less than 5 billion won for 2 consecutive years; 3. Financial condition vulnerability; 4. Possibility of default or bankruptcy.	Asset Scale 1 Asset Scale 2 Asset Scale 3 Business performance 1 Quality of outcome 1 Quality of outcome 2 Financial burden 1 Financial burden 2
Rehabilitation proceeding	1. If a company is worth keeping in business but cannot cover its debt with operating profits because of over-investment or financial accidents, it will follow a corporate rehabilitation process; 2. In other words, if the debt is larger than the operating profit, the company is likely to undergo the rehabilitation process.	Asset Scale 1 Asset Scale 2 Financial burden 1
Bankruptcy	Capital adequacy ratio, current ratio, checking ratio, and ordinary income of total assets are the cause of corporate default.	Quality of Asset 1 Quality of outcome 1 Quality of outcome 2 Repayment Ability 1 Repayment Ability 2
Insolvency	It is necessary to review whether the debtor has become insolvent or overdue. It is necessary to look at the financial statements, the lists of assets, and the liabilities borne by the debtor. If there are more assets than liabilities, bankruptcy accounting is suspected, and bankruptcy applications are not possible. In other words, to identify the insolvency, financial burden should be checked.	Financial burden 1 Financial burden 2
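
The two quantitative reasons listed for abolishment of listing can be sketched as small helper functions. This is only an illustration: the function names are hypothetical, the impaired capital ratio is taken as (capital stock − stockholders' equity) / capital stock, and the comparison of that ratio against the 50% level is deliberately left to the caller.

```python
def impaired_capital_ratio(capital_stock: float, stockholders_equity: float) -> float:
    # (capital stock - stockholders' equity) / capital stock
    return (capital_stock - stockholders_equity) / capital_stock


def low_sales_two_years(yearly_sales_won, threshold_won=5_000_000_000):
    """Return True if the two most recent yearly sales are both below the threshold.

    yearly_sales_won is assumed to be ordered from oldest to newest, in won.
    """
    if len(yearly_sales_won) < 2:
        return False  # not enough history to evaluate two consecutive years
    return all(s < threshold_won for s in yearly_sales_won[-2:])
```

For example, `low_sales_two_years([6e9, 4e9, 3e9])` looks only at the last two entries, both of which are under 5 billion won.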

Table 4. Monitoring index based on financial disclosure data.

	Concepts	Evaluation
Capital increase	An activity utilized by companies to raise share capital by providing new or existing shareholders the right to subscribe to new shares.	In general, negatively recognized.
Change of the largest shareholder	The possibility of delisting because of frequently changing the largest shareholder.	Checking the frequency of changing the largest shareholder within 2 years.
Change of a representative director	Causing financial risks and a frequent change of the company’s strategy.	Checking the frequency of changing a representative director within 2 years.
Merging at face value	Merging the number of shares ready to trade at a ratio by changing the price of shares.	Positively evaluated.
Sales of treasury stock	Selling the treasury stock of a company at a discounted price.	Positively evaluated.
Change of essential business	Problematic in frequently changing the essential business.	Checking the frequency of changing the essential business.
Addition of essential business	Problematic in adding new essential businesses for a short-term increase of stock price.	Checking the relations with existing businesses.
Request to disclose	Enabling the monitoring of malicious rumors on a company.	Checking the answers on inquired disclosure.
Acquisition of another company’s stock	The possibility of acquiring stock for M&A.	Checking the purchase price and purposes of acquisition.

Table 5. Monitoring index based on opinion mining.

Indicator	Unit	Definition and Explanation
Average of accumulated sentiment value	Score	The rate of change can be interpreted as a single signal. Because this signal exploits the accumulated value, not the average, it can be interpreted as different from the ratio.
The number of the same sentiment days	Day	The same evaluation is considered to be maintained for as long as the same sentiment is retained. If a positive sentiment lasts a long time, the probability of a negative credit event is low. Conversely, if a negative sentiment lasts a long time, the probability of a negative credit event is high.
The number of positive and negative opinions overlapping each other	Number	The number of conversions from positive to negative or from negative to positive opinions within a certain period. A high value indicates that many opposing opinions appeared within the period and that investors were paying attention to the firm.
Changing ratio of average sentiment score per day	%	The index indicates the changing ratio of the daily sentiment score.
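
Three of the indicators in Table 5 can be sketched from a list of daily average sentiment scores. The function names and exact definitions below are assumptions made for illustration: a positive score counts as positive sentiment, a negative score as negative, zero as neutral, and the change ratio is measured relative to the previous day's absolute score.

```python
def _sign(x: float) -> int:
    # 1 for positive, -1 for negative, 0 for neutral
    return (x > 0) - (x < 0)


def same_sentiment_days(scores) -> int:
    """Length of the most recent run of days sharing the latest day's sign."""
    if not scores:
        return 0
    last = _sign(scores[-1])
    run = 0
    for s in reversed(scores):
        if _sign(s) != last:
            break
        run += 1
    return run


def sentiment_flips(scores) -> int:
    """Number of positive/negative conversions, skipping neutral (zero) days."""
    signs = [_sign(s) for s in scores if s != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def daily_change_ratio(scores):
    """Percent change of each day's score versus the previous day.

    Returns None for a day whose previous score is zero (ratio undefined).
    """
    ratios = []
    for prev, cur in zip(scores, scores[1:]):
        ratios.append(None if prev == 0 else (cur - prev) / abs(prev) * 100)
    return ratios
```

The accumulated-sentiment indicator is omitted because its definition in the table is not specific enough to reproduce.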

Table 6. Index cut-off value.

Standard	Firm Activity Score	Opinion Mining Score
Top 20%	0.1996	0.1967
Top 40%	0.2632	0.2855
Top 60%	0.3340	0.3711
Top 80%	0.4507	0.4396
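
One way to read the cut-off values in Table 6 is as boundaries between five grade bands. The sketch below assumes, without confirmation from the table itself, that a score at or below the "Top 20%" cut-off maps to grade 1 and a score above the "Top 80%" cut-off maps to grade 5; the function and constant names are hypothetical.

```python
from bisect import bisect_left

# Cut-off values copied from Table 6, ordered from "Top 20%" to "Top 80%".
FIRM_ACTIVITY_CUTOFFS = [0.1996, 0.2632, 0.3340, 0.4507]
OPINION_MINING_CUTOFFS = [0.1967, 0.2855, 0.3711, 0.4396]


def grade(score: float, cutoffs) -> int:
    # Assumed mapping: the count of cut-offs strictly below the score, plus one.
    # A score equal to a cut-off stays in the lower band.
    return bisect_left(cutoffs, score) + 1
```

Under this assumed mapping, `grade(0.30, FIRM_ACTIVITY_CUTOFFS)` falls between the second and third cut-offs and yields grade 3.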

Table 7. The number of data-points.

Data Collection Period	The Number of Data-Points Collected
Data Collection Period	News	Community	Total
09/13/2016–09/12/2017	187	1733	1920

Table 8. Credit risk evaluation of company A.

Indicator Designation	Value
Asset Scale 1	1
Asset Scale 2	1
Asset Scale 3	0.729942
Quality of Asset 1	0.623354
Business Performance 1	0.113553
Quality of Outcome 1	−0.20392
Quality of Outcome 2	0.662191
Financial Burden 1	1
Financial Burden 2	0.347014
Repayment Ability 1	−1
Repayment Ability 2	−0.48343

Table 9. Total risk grade of the company.

Period	Index Grade
Period	Firm Activity Score	Opinion Mining Score	Total Score
Whole section	5.00	3.60	4.40
Recent 6 months	5.00	3.31	4.31
Recent 3 months	5.00	3.67	4.50

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

