Building an Operational Solution Assistant System for Foreign SMEs in ROK

Foreign Direct Investment (FDI) is an important resource that helps accelerate the development of a country's economy, adds substantial funding for growth and facilitates technology transfer. The Republic of Korea (ROK) is one of the world's developed countries, with a dynamic economy and advanced science and technology. In recent years, the Korean government has continuously formulated tax policies, business support policies and import policies to support foreign businesses in Korea. The Pangyo Creative Economy Valley is being groomed as a global startup hub in Asia. Small and medium enterprises (SMEs) in foreign countries are increasingly interested in and eager to seek investment opportunities in the Korean market. Nonetheless, for these companies, language barriers and cultural and institutional differences make it more difficult and time-consuming to learn about the Korean market (investment trends, laws, visa policies, taxes, business establishment issues, etc.). In this study, we explored the process of searching for information and seeking investment opportunities and built a business consulting and support application for the first stages of starting a business in ROK to increase effectiveness and save time, which is also an innovative business practice with ROK as the use case. We designed a Virtual Assistant system that can crawl and analyze data on foreign investments in ROK from open data resource websites (data.co.kr) and used analytic and aggregation techniques to explore trends in the investments of foreign enterprises. We also researched the process of searching for information and seeking investment opportunities for SMEs investing in ROK, government support policies, laws and taxes, as well as a number of other related issues. We built datasets and used Natural Language Processing (NLP) together with Natural Language Understanding (NLU) algorithms to build chatbot applications. A friendly framework that lets new developers add to and build up the AI Assistant's dataset was built by providing functions for inputting intent data, entity data and utterance data, as well as training and test functions. In addition, we built a web-app connected to the server to visualize all the results of the research so that SME owners can easily use it to look for information on investments. Based on the research results, we can make recommendations to SMEs in keeping with the changing investment trends in ROK.


Introduction
Foreign Direct Investment (FDI) is an important resource that helps accelerate the development of a country's economy, adds substantial funding for growth and facilitates technology transfer, aside from strengthening export capabilities and creating more jobs [1]. Although the ROK is a growing economy without the competitive advantage of cheap labor, it has a great geographic advantage (lying between China and Japan, the world's second and third largest economies, respectively); it has also entered the 5G era, with new industries emerging in recent years. FDI in the Republic of Korea (ROK) has been increasing continuously, especially in the service sectors where IT/ICT solutions are extensively utilized, but it is notable that a number of global companies are also constantly seeking investment opportunities in areas where high returns can be expected, establishing their headquarters or R&D centers in the country to participate in businesses involving cutting-edge technologies or advanced materials. According to a 6 January 2020 report by the Ministry of Industry, Trade and Resources of Korea, total registered FDI in 2019 reached US$23.3 billion, the second highest in history [2]. The amount of FDI actually disbursed was US$12.8 billion, the fourth highest of all time. Compared to the record high of US$26.9 billion in 2018, registered FDI in 2019 fell 13.3%, and disbursed capital fell 26% [2]. ROK has attracted many investments and foreign enterprises in various development industries thanks to policies that support enterprises in a timely manner. Figure 1 shows foreign-invested companies in Korea.
In the past, there have been many companies and organizations seeking investment, cooperation and development opportunities in ROK, a country of potential. According to representatives of Vietnam's Ministry of Foreign Affairs at the 2019 workshop titled "Starting and registering a business in Korea", Vietnam-ROK relations are developing well in all fields, in the context of the two countries celebrating the 10th anniversary of the establishment of the "Vietnam-Korea strategic cooperation partnership" (2009-2019). In terms of investment, ROK is the largest investor in Vietnam, with total accumulated investment capital of US$62.5 billion as of the end of 2018 (accounting for 18.3%), with 7460 projects creating jobs for 70,000 employees and contributing about 30% of the total export value of Vietnam [3]. With regard to trade, ROK is Vietnam's second largest trading partner (after China), with total two-way trade turnover of US$65.7 billion in 2018 and a target of US$100 billion by 2020 [3]. As for tourism, there were nearly 3.5 million ROK tourists visiting Vietnam in 2018, whereas the number of Vietnamese tourists to ROK reached nearly 500,000, up 42.1%. People-to-people relations take place widely at all levels and in all fields. Currently, each country has more than 150,000 citizens studying, living and working in the other country (Vietnam's foreign ministry).
Like many countries, the ROK is striving to maintain its economic growth rate to ensure economic success in the era of the 4th industrial revolution, which can guarantee higher employment rates and GDP levels. In this effort, the ROK government is encouraging and offering a series of startup programs, providing substantial support to promising young entrepreneurs and establishing global startup hubs such as Pangyo Creative Economy Valley. In the context of investment opportunities in Korea, many small and medium enterprises (SMEs) from many different countries are participating in investment; due to cultural differences, they have encountered certain challenges in the process of seeking opportunities and investments. According to the Future of Business Survey [4], some of the major challenges that SMEs may encounter include increasing revenue, maintaining profitability, attracting customers, securing financing for expansion, developing new products/innovation, finding/working with suppliers, tax laws and rules, and other government regulations, as shown in Figure 2.

Foreign business owners in particular are expected to have language difficulties, especially when seeking information on investment issues and company establishment at the initial stage of entering the Korean market. Nor do they have much money to invest in building a dedicated legal team in ROK.
In this study, we focused on building an operational solution for small and medium businesses in the form of a Virtual Assistant system that can provide information on finding/working with suppliers, tax laws and rules, other government regulations, investment trends and new investment sectors in Korea, investment regions, etc. We present methods of collecting and exploiting big data on foreign investment in Korea, methods of building data sets on investment, and the use of Natural Language Processing (NLP) and Natural Language Understanding (NLU) technologies to build chatbot applications through which the assistant system responds directly to users.
The rest of this paper is organized as follows: Section 1 gives an introduction to the investment trends of foreign companies in ROK and some difficulties that foreign SMEs, particularly Vietnamese SMEs, face in the foundation-stage information search and in looking for investment opportunities due to language differences; Section 2 provides an overview of previous research related to understanding the initial information needs of foreign enterprises, the process of foreign investment in Korea and the techniques used to build the assistant and chatbot; Section 3 introduces the methodologies for building Assistant Systems, the research design, the process of building a chatbot and the building of Operational Solutions; Section 4 presents the results and discussion of the research; Section 5 presents the conclusions of this research and future work to improve the results and contribute to Innovative Business not only in Korea but also in other potential countries.

Methods
As for the methodology of this study, data were explored and built according to the process of searching for information and seeking investment opportunities; a business consulting and support application dealing with the first stages of starting a business in ROK was then built to increase effectiveness and reduce time, which is also an innovative business practice with ROK as the use case. We designed our operational solution as a Virtual Assistant system that can crawl and analyze data on foreign investments in ROK from open data resource websites (data.co.kr, accessed on 24 July 2020) and used analytic techniques to explore the investment trends of foreign enterprises.
We also researched the process of searching for information and seeking investment opportunities for SMEs investing in ROK, government support policies, laws and taxes, as well as a number of other related issues. Then, we built data sets and used Natural Language Processing (NLP) and Natural Language Understanding (NLU) algorithms to build chatbot applications. We built a web-app connected to the server to visualize all the results of the research so that SME owners can easily use it to look for information on investments. Based on the research results, we can make recommendations to SMEs according to the changing investment trends in ROK.

Foreign Investors and Business Establishment in Korea
FDI can play a role in restructuring the industrial sector in Korea through competition [5]. Over the past decades, the Korean government has made every effort to follow its development roadmap; as technological progress and human resource development have led to an increase in the assets created by all products, the relative significance of both intra-industry and inter-industry trade and of attracting foreign direct investment (FDI) has increased [6]. The continuous increase in foreign direct investment in the Republic of Korea (ROK) is being led by China, which is now on the threshold of becoming an industrialized country, and by some of the cash-rich Middle Eastern countries seeking lucrative investment opportunities by taking advantage of free trade systems such as the WTO, FTAs and regional trade agreements [7]. Recently, such investors have been turning their eyes not only to the Korean service and cultural industries but also to areas where the latest Korean technologies are being adopted or used to develop new service systems or materials [8]. The World Bank regards the Republic of Korea as a country with a highly developed business environment, ranking it 5th in Doing Business 2020 [9].
Factors that encourage foreign investors to invest in the Korean market can usually be classified into four main groups: a highly educated, skilled labor force and an improving labor climate; excellent social infrastructure; technology and major industries; and finance, tax and accounting. A number of organizations have been established to support business creation and investment by foreign investors in Korea, such as the Korea Foreign Company Association (FORCA), Invest Korea and the Korea Trade-Investment Promotion Agency (KOTRA). Invest KOREA's headquarters and KOTRA's 126 overseas offices, 36 of which are overseas FDI offices, are devoted to attracting foreign investment to Korea (source: investkorea.org, accessed on 24 July 2020). KOTRA operates a Foreign Investor Support Center to handle investment licensing and procedures, manage investment notification and other civil affairs, provide investment consulting services (accounting, tax, legal affairs, etc.) and help foreign investors settle down in Korea (one-day secretarial and consulting services). Foreign investors can simply go to the KOTRA office or use its Online Consulting service by phone. Consulting and support for foreign businesses are still delivered in traditional ways and are limited: businesses find information and opportunities through seminars, support programs, websites or business forums, or get direct support from staff at investment advisory centers. Finding such information takes business owners a lot of time and effort. Therefore, in this research, we propose building a virtual assistant that provides information on investment and business registration in Korea.

Natural Language Processing (NLP) and Natural Language Understanding (NLU) Techniques
Language is a complex system of communication used by higher-order, thought-capable animals such as humans. Natural language is not the same as artificial languages such as computer languages (C, PHP). There are currently about 7000 languages in the world. There are many ways to classify languages, with some common classifications based on origin and characteristics. A formal language is a set of strings built on an alphabet and bound by predefined rules or grammars. The alphabet can be a set of characters from a natural language or a set of self-defined characters. Formal models of natural language were first formalized by Noam Chomsky in the 1950s in what became known as the Chomsky hierarchy, whose simplest level corresponds to finite-state (Markov) models. Chomsky proposed a series of great simplifications and abstractions for the empirical field of natural language. In particular, this approach completely ignores meaning, along with all issues related to the use of expressions, such as their frequency, context dependency and processing complexity [10]. These models were later used to create programming languages and in automated translation studies.
Ever since computers emerged, programmers have tried to write programs that can understand natural language. The reason is quite clear: humans have been writing for thousands of years, and it would be very useful if a computer could read and understand all the data in the massive number of texts written over those years. With the development of technology and engineering, the volume of natural language text worldwide has surged, carrying a large amount of knowledge, but it is increasingly difficult for people to sift through it and discover the knowledge it contains, particularly within a limited time [11]. Natural Language Processing (NLP) and Natural Language Understanding (NLU) aim to do this job efficiently and accurately, just as humans do (for a limited amount of text).
In recent years, conversational UIs (chatbots, bots) have been built, researched and developed by many companies such as Facebook, Google, Amazon and Apple, and they can recognize and act on commands given as text, voice, images, etc. [12]. In addition to the advantage of always being available to communicate with and advise users 24/7, conversational UIs also help save human resources and time and allow staff to focus on other activities. In this research, we built a chatbot that draws on a given conversation database (a corpus-based chatbot). This document store can be built by extracting large amounts of data from user conversations. The system retrieves information (information retrieval) and uses machine learning methods to create answers based on the context of the conversation with the user.

Natural Language Processing (NLP)
In the field of human language technology, Natural Language Processing (NLP), which adopts a series of computational techniques for the analysis and/or representation of language-based expressions [13], plays an essential part in machine translation, question answering, text summarization, topic modeling, opinion mining, information retrieval (IR) and information extraction (IE). In this research, NLP was used for text tokenization (sentence and word tokenization) and to identify and exclude stop-words. Tokenization is a basic process in NLP [14] in which a string is broken into a series of tokens according to a specific rule. Sentence tokenization separates and lists the sentences in a text corpus, whereas word tokenization breaks up the words in a sentence based on spaces, commas, periods and/or carriage return characters. Generally, word tokens are separated by blank spaces and sentence tokens by full stops. Removing stop words is an important step in NLP text processing: it filters out high-frequency words that add little or no semantic value to a sentence.
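The tokenization and stop-word steps described above can be illustrated with a minimal sketch. It uses only the Python standard library; the stop-word list and the function names are assumptions made for illustration, not the list or code used in this study.

```python
import re

# Illustrative stop-word list (an assumption); a real system would use a fuller list.
STOP_WORDS = {"a", "an", "the", "is", "are", "in", "of", "to", "for",
              "and", "do", "i", "how", "what"}

def sentence_tokenize(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def word_tokenize(sentence):
    """Lower-case a sentence and return its word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", sentence.lower())

def remove_stop_words(tokens):
    """Drop high-frequency words that carry little meaning on their own."""
    return [t for t in tokens if t not in STOP_WORDS]
```

For example, `remove_stop_words(word_tokenize("How do I register a company in Korea?"))` keeps only `register`, `company` and `korea`, which are the words an intent or entity matcher would work with.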
As an offshoot of Artificial Intelligence, NLP studies the interaction between computers and natural human languages in the form of speech or text. NLP can be divided into two large, not completely independent branches: speech processing and text processing. Speech processing focuses on the research and development of algorithms and computer programs that process human language in the form of voice (audio data). Important applications of speech processing include speech recognition and speech synthesis. Whereas speech recognition converts language from speech to text, speech synthesis converts language from text to speech. Text processing focuses on text data analysis. Important applications of text processing include information search and retrieval, machine translation, automatic text summarization and automatic spell checking.
Text processing is sometimes further divided into two smaller branches: text understanding and text generation [14]. In this research, we focused on collecting and processing text data.
Text processing consists of the following four main steps:
• Morphological analysis: The identification, analysis and description of the structure of the morphemes of a given language and of other linguistic units, such as root words, verbs, affixes, word classes, etc. In Vietnamese language processing, two typical problems at this step are word segmentation and part-of-speech tagging.
• Parsing: The process of analyzing a series of symbols in a natural language or computer language according to a formal grammar. Formal grammars commonly used for parsing natural languages include Context-Free Grammar (CFG), Combinatory Categorial Grammar (CCG) and Dependency Grammar (DG). The input to the parser is a sentence consisting of a series of words and their part-of-speech tags, and the output is a parse tree representing the sentence's syntactic structure.
• Semantic analysis: The process of relating syntactic structures, from the phrase, clause, sentence and paragraph level up to the level of the whole article, to their language-independent meanings. In other words, it finds the semantics of the verbal input. Semantic analysis has two levels: lexical semantics, which expresses the meanings of the component words and distinguishes between word senses; and compositional semantics, which concerns the way words combine to form broader meanings.
• Discourse analysis: Text analysis that considers the relationship between language and its context of use. Discourse analysis is therefore performed at the level of the paragraph or the whole text rather than at the sentence level alone.
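To make the parsing step concrete, the following sketch parses a sentence with a toy context-free grammar in Chomsky normal form, using the CYK algorithm. The choice of CYK, the grammar and the lexicon are assumptions made for illustration; the text does not state which parser was used in this study.

```python
# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
BINARY_RULES = {            # (B, C) -> A encodes the rule A -> B C
    ("NP", "VP"): "S",
    ("V", "NP"): "VP",
    ("Det", "N"): "NP",
}
LEXICON = {                 # word -> tag (a one-word noun phrase is tagged NP directly)
    "investors": "NP",
    "register": "V",
    "a": "Det",
    "company": "N",
}

def cyk_parse(words):
    """Return a parse tree (nested tuples) rooted at S, or None if no parse exists."""
    n = len(words)
    # chart[i][j] maps a nonterminal to one tree spanning words[i:j]
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        tag = LEXICON.get(word)
        if tag is None:
            return None     # unknown word: this toy grammar cannot parse it
        chart[i][i + 1][tag] = (tag, word)
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):   # split point between left and right child
                for left_sym, left_tree in chart[i][k].items():
                    for right_sym, right_tree in chart[k][j].items():
                        parent = BINARY_RULES.get((left_sym, right_sym))
                        if parent and parent not in chart[i][j]:
                            chart[i][j][parent] = (parent, left_tree, right_tree)
    return chart[0][n].get("S")
```

Calling `cyk_parse("investors register a company".split())` returns a nested tuple whose root is `S`, with an `NP` subtree for the subject and a `VP` subtree for the verb and its object, which is the kind of parse tree the parsing step produces.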
There are many different approaches used to process natural language in NLP, and they fall roughly into four categories: symbolic, statistical, connectionist and hybrid. They have different foundations, typical techniques, differences in processing and system aspects, and robustness, flexibility and suitability for various tasks [15].

Natural Language Understanding (NLU)
Natural Language Understanding (NLU) can be regarded as an essential first step of NLP when dealing with the semantics of a specific text [16]. However, a solution (e.g., an AI algorithm) that supports NLU perfectly is yet to be developed, as NLU often has to identify the semantics of text that contains human errors. There are a few methods that can be applied to NLU when analyzing and understanding semantics, but they all require a specific lexicon and a parser, along with grammatical rules that can be used to separate sentences or words. Developing a content-rich lexicon with a well-defined ontology, such as WordNet, is not easy and often takes years of effort and human resources [17]. Playing a basic role for NLP, NLU attempts to understand natural languages and represent their semantics in a form that can be interpreted by computers when NLP performs syntactic/semantic analysis [18].

Personal Assistant Technology
A virtual assistant, which can also be called a digital assistant, a voice assistant or an AI assistant, is a task-oriented application that recognizes human voices and executes commands spoken by the user. There are three main ways to interact with a virtual assistant: text, which includes online chat (in a messaging app or another app such as Facebook, Viber or Skype), SMS text and e-mail; voice, as with Amazon Alexa [19], Google Assistant, Cortana and Siri [20]; and image data. Voice assistants typically rely on an Automatic Speech Recognition (ASR) system that works across three levels, starting with capturing sound from the microphone and breaking it down into phonemes for processing into text. Phonemes are the basic unit of speech recognition; decoding at the phoneme level gives better results than decoding whole words while ignoring contextual constraints. Finally, the system models the language to find probabilistic information according to the recorded context. The popular virtual assistants Google Assistant, Cortana and Siri [20] differ in structure but all operate on neural network technology (loosely modeled on networks of cells in the human brain) deep in the backend. Virtual assistants are provided for free on the devices that these big companies release, such as iPads, cell phones, laptops, iPods, smart watches and other home electronic devices, as added value for users [19]. In this research, we focus on exploring and setting up experiments with the text interaction method, based on chatbot technology.
The chatbot is a popular functional technique for virtual assistants and has been developed and used in most modern fields such as marketing, education, healthcare and customer services [21]. A chatbot combines pre-existing scenarios with self-learning during interaction. When a question is posed, the chatbot uses Natural Language Processing systems to analyze the data and then, based on Natural Language Understanding systems and machine learning algorithms that generate different types of reactions, predicts and responds as accurately as possible. The chatbot uses multiple systems to scan for keywords within the input; the bot then initiates an action, pulls the answer with the most relevant keywords and responds with information from a database/API, or hands the conversation over to a human. If the situation has not occurred before (it is not in the database), the chatbot will ignore it but will also learn from it to apply to future chats. There are many different concepts for building and executing a chatbot, such as Artificial Intelligence Markup Language (AIML), which creates natural language agents using pattern recognition and pattern matching techniques and is available in various programming languages (for example Java, Ruby, Python, C) [22]; Latent Semantic Analysis (LSA), a distributional semantics technique that analyzes relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms [22]; Natural Language Processing (NLP) [23,24]; and Natural Language Understanding (NLU) [25,26]. Utterance, intent and entity are the three most important terms in building a chatbot.
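The three terms can be illustrated with a small data structure: an intent is what the user wants, utterances are example phrasings of that intent, and entities are the specific values mentioned in an utterance. The sketch below is a minimal illustration under stated assumptions; the intent names, example utterances, entity values and function names are all hypothetical and are not the dataset or input functions built in this study.

```python
# Hypothetical training data: intent name -> example utterances.
TRAINING_DATA = {
    "visa_policy": ["what visa do i need to invest in korea",
                    "how to apply for an investor visa"],
    "tax_rate": ["what is the corporate tax rate in korea",
                 "how much tax does a foreign company pay"],
}

# Hypothetical entity dictionary: entity name -> known values (lower-case).
ENTITY_VALUES = {
    "visa_type": ["d-8", "d-9"],
    "city": ["seoul", "busan", "pangyo"],
}

def add_utterance(intent, utterance):
    """Register a new example utterance under an intent, creating the intent if needed."""
    TRAINING_DATA.setdefault(intent, []).append(utterance.lower())

def extract_entities(utterance):
    """Return {entity name: value} for every known value found in the utterance.

    If several values of the same entity appear, the last one listed wins.
    """
    text = utterance.lower()
    return {name: value
            for name, values in ENTITY_VALUES.items()
            for value in values
            if value in text}
```

With this layout, the utterance "Can I get a D-8 visa for an office in Seoul?" would carry the entities `visa_type = "d-8"` and `city = "seoul"`, while the intent would be decided separately by a trained classifier.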
In our study, we build the virtual assistant as a chatbot that can be much more interactive and personal than rule-based chatbots. Such a chatbot can interact with users in the same way humans converse and communicate in real-world situations. Its conversational skills enable it to deliver what the user is looking for: it can understand the context and intent of complex conversations and attempt to provide more relevant responses.

Data Collection
At this stage, we focused on understanding the factors that investors may be interested in and the business development opportunities open to investors when starting a business in Korea. We also compiled documents and information on the topics that investors are interested in searching for, as shown in Figure 3. We collected all this data, saved it in MongoDB and prepared it for the construction of the chatbot application in the next stage.
We also collected data on foreign investment in Korea by country and by year, data on new business registrations by locality, and data on invested industries from the website "data.co.kr" (accessed on 24 July 2020). These data were analyzed to find investment trends in Korea and used to give advice and suggestions to users of the system.
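The kind of aggregation involved in finding investment trends can be sketched as follows. This is a minimal illustration in plain Python rather than the analysis code of this study: the record layout (`year`, `country`, `amount`) is an assumption, and the values are placeholders, not real FDI figures.

```python
from collections import defaultdict

# Placeholder records, NOT real figures; the field names are assumptions.
records = [
    {"year": 2018, "country": "A", "amount": 100.0},
    {"year": 2018, "country": "B", "amount": 50.0},
    {"year": 2019, "country": "A", "amount": 90.0},
    {"year": 2019, "country": "B", "amount": 75.0},
]

def totals_by(rows, key):
    """Sum the 'amount' field of the rows, grouped by the given key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)

def year_on_year_change(yearly_totals):
    """Return {year: percentage change versus the previous year in the data}."""
    years = sorted(yearly_totals)
    return {
        cur: (yearly_totals[cur] - yearly_totals[prev]) * 100.0 / yearly_totals[prev]
        for prev, cur in zip(years, years[1:])
    }
```

Grouping by `"year"` and passing the result to `year_on_year_change` gives the overall trend, while grouping by `"country"` shows which investor countries contribute most; the same grouping could equally be expressed as a database aggregation once the records are stored in MongoDB.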

Proposed Architecture of the Operational Solution Assistant System
The following are the major features of the Operational Solution Assistant System we built: Data set about Korean business establishment procedures, laws, tax and labor policies, immigration policies and investment policies. The user sends a message through the Assistant interface. The message goes to the Message box in the Assistant interface, triggers a callback to a Messenger webhook and gets sent to the API gateway. The API Gateway passes the message to the Python NLP functions. In this NLP function, the message is processed with the NLP code written in Python to remove stop words in the message and extract and organize entities of user's messages and is saved in MongoDB. After that, all the entities of the user's messages pass the Chatbot SNS topic and get sent to the Python NLU function.
The Python NLU function contains all the logic algorithm of the assistant, which will be presented in detail in the next subsection. All of the user's ideas kept in the Python NLU function are saved in MongoDB. After this function is finished, the results go to the sender SNS topic and get sent to the Python Sender function. This function sends the API of the most suitable answer to the Assistant Interface, which then shows the answer to the end user. The system also has another component called Analytics Market Data. This component is used to collect data about the realistic investment market in Korea from reports by quarter, by year or changing policy of FDI in Korea.  All data from this component are sent to the Python Analysis function and saved in MongoDB. When an assistant user asks about the trends of the market, the Python NLU function sends a request to MongoDB and collects this data to answer the assistant user.
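The message path described above (NLP function, NLU function, Sender function) can be illustrated with a minimal sketch. It is not the system's actual code: the stop-word list, the entity synonyms, the matching rule and the placeholder answers are all assumptions made for illustration, and a plain Python list stands in for the MongoDB writes and the SNS topics.

```python
import re

# Hypothetical stop words and entity synonyms; the paper does not publish its lists.
STOP_WORDS = {"a", "an", "the", "is", "do", "i", "to", "in", "for", "of",
              "which", "what", "need"}
ENTITY_SYNONYMS = {"visa": "visa", "visas": "visa", "tax": "tax", "taxes": "tax",
                   "company": "company", "business": "company"}


def nlp_function(message):
    """Remove stop words and extract known entities from one user message."""
    tokens = re.findall(r"[a-z0-9]+", message.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    entities = sorted({ENTITY_SYNONYMS[t] for t in kept if t in ENTITY_SYNONYMS})
    return {"text": message, "tokens": kept, "entities": entities}


def nlu_function(parsed, knowledge):
    """Toy matching rule: pick the answer sharing the most entities with the message."""
    wanted = set(parsed["entities"])
    best_key = max(knowledge, key=lambda key: len(wanted & set(key)))
    answer = knowledge[best_key] if wanted & set(best_key) else None
    return {"entities": parsed["entities"], "answer": answer}


def sender_function(result):
    """Return the text that the Assistant interface would display."""
    return result["answer"] or "Sorry, I could not find an answer."


def handle_message(message, knowledge, store):
    """Run one message through the NLP -> NLU -> Sender chain."""
    parsed = nlp_function(message)
    store.append(parsed)   # stands in for the MongoDB write after the NLP step
    result = nlu_function(parsed, knowledge)
    store.append(result)   # stands in for the MongoDB write after the NLU step
    return sender_function(result)


# Placeholder answers keyed by the entities they cover (illustrative only).
KNOWLEDGE = {
    ("company", "visa"): "(placeholder) guidance on visas for opening a company",
    ("tax",): "(placeholder) guidance on corporate tax",
}
log = []
reply = handle_message("Which visa do I need to open a company?", KNOWLEDGE, log)
print(reply)
```

In a deployment like the one described, each function would run as a separate cloud function and the list would be replaced by database writes; the sketch only shows the order of the steps.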

AI ASSISTANT Flowchart
The AI Assistant's main workflow is shown in the flowchart in Figure 5. When a user sends a message to the system, the workflow starts with the Python NLP functions defining the intent in the field of investment. The system then defines the entities belonging to the field of investment. After that, it builds the utterance pattern and sends it to the TensorFlow training model, and the chatbot uses the trained model to interact with users. If the result has good confidence, the session ends. If not, the system stores the input and suggests related intents.
The flowchart in Figure 6 presents the process of defining Intent, which is a critical factor in chatbot functionality because the chatbot's ability to parse intent is what ultimately determines the success of the interaction.
The flowchart in Figure 7 presents the process of defining Entity. Entities are knowledge repositories used by the bot to provide personalized and accurate responses. Entities can help the system extract important information from the ongoing conversation and catch important data.
The flowchart in Figure 8 presents the process of defining Utterance. Utterances are the user inputs from which the chatbot derives intents and entities. To train a chatbot to extract intents and entities accurately from the user's dialog input, it is imperative to capture a variety of example utterances for every intent.
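To make the role of example utterances concrete, the following sketch turns a few utterances per intent into bag-of-words vectors and integer labels, a common input format for a small neural intent classifier. The intent codes, the utterances and the bag-of-words encoding itself are assumptions for illustration; the paper does not state how utterance patterns are encoded before they reach the TensorFlow model.

```python
import re

# Hypothetical example utterances, grouped by intent code.
TRAINING_UTTERANCES = {
    "visa_info": ["which visa lets me open a company", "visa for foreign investors"],
    "tax_info": ["what is the corporate tax rate", "how are taxes filed"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(training):
    """Collect every distinct token seen in the example utterances."""
    words = {w for utterances in training.values() for u in utterances for w in tokenize(u)}
    return sorted(words)


def encode(text, vocabulary):
    """Bag-of-words vector: 1 if the vocabulary word occurs in the text, else 0."""
    present = set(tokenize(text))
    return [1 if word in present else 0 for word in vocabulary]


def build_training_matrix(training):
    """Return vocabulary, intent names, feature rows and integer labels."""
    vocabulary = build_vocabulary(training)
    intents = sorted(training)
    features, labels = [], []
    for index, intent in enumerate(intents):
        for utterance in training[intent]:
            features.append(encode(utterance, vocabulary))
            labels.append(index)
    return vocabulary, intents, features, labels


vocabulary, intents, features, labels = build_training_matrix(TRAINING_UTTERANCES)
print(len(vocabulary), intents, labels)
```

The more varied the utterances listed for each intent, the larger the vocabulary and the more phrasings the classifier can recognize, which is why a variety of examples per intent matters.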
The flowchart in Figure 9 shows the workflow of the chatbot. There are four main stages in this process: (i) Train model; (ii) User Interface; (iii) Understanding Process; and (iv) Result. The process starts in stage (i), where the outputs of Natural Language Processing and Natural Language Understanding are sent to the Train and Build model step. After training, we deployed the chatbot engine in the User Interface. When the user inputs a sentence to start the conversation, the sentence is split into entities; if the system has good confidence in any intent, it classifies the sentence and prints out the answer. If the confidence is not good enough, the case is inserted into the training model for updating. In stage (iv), after sending the output answer, the system checks for similar questions; if there are any, it suggests the related information for the user to select. If there is none, the process ends.
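The confidence check in this workflow can be sketched as a simple routing rule. The threshold value, the function name `respond`, the intent names and the placeholder answers are assumptions; the paper does not report the cut-off it uses or how many related items are suggested.

```python
CONFIDENCE_THRESHOLD = 0.7  # assumed cut-off; the paper does not report a value
retrain_queue = []          # low-confidence inputs kept for the next training round


def respond(sentence, scores, answers, threshold=CONFIDENCE_THRESHOLD):
    """Answer when the top intent is confident enough, otherwise queue for retraining.

    scores maps each intent to the classifier's confidence; answers maps each
    intent to its answer text.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_intent, best_score = ranked[0]
    others = [intent for intent, _ in ranked[1:3]]
    if best_score >= threshold:
        # Confident: give the answer and offer related topics for the user to select.
        return {"answer": answers[best_intent], "suggestions": others}
    # Not confident: record the sentence and suggest the closest intents instead.
    retrain_queue.append(sentence)
    return {"answer": None, "suggestions": [best_intent] + others}


ANSWERS = {
    "visa_info": "(placeholder) visa guidance",
    "tax_info": "(placeholder) tax guidance",
    "labor_info": "(placeholder) labor guidance",
}
confident = respond("Which visa lets me open a company",
                    {"visa_info": 0.9, "tax_info": 0.07, "labor_info": 0.03}, ANSWERS)
unsure = respond("I am thinking about Korea",
                 {"visa_info": 0.4, "tax_info": 0.35, "labor_info": 0.25}, ANSWERS)
print(confident, unsure, retrain_queue)
```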

UML Diagram and Application
Figure 10 shows the entire UML framework of the Operational Solution Assistant system. The user can use the chatbot in MainActivity, which provides the chatbot application interface. MainActivity connects with the server through the Internet. The process starts once the user is selected, and the terms of agreement for collecting and using the user's information while using the application are shown. After the user provides a user_ID, the application queries the user database on the server through the Internet connection. This user_data contains personal data and, if available, the history of previous conversations with the application. When a user starts a conversation, the application sends the message to the NLP and NLU functions. When the user asks the chatbot a question, the chatbot sends it to the server through the API. The server also connects with the NLP and NLU functions to pass the output answer to the sender, which sends this message to the user.

Building the Data Set for the System
As a result, we built a training data set that answers questions on the following main topics: Investment in Korea: Introduction; Korea's Investment Climate; Investment Guide; Government support policies and programs; and Legal, Tax and Labor. We also built an interactive system for entering data directly into the database and testing the model, as shown in Figures 11-13.

Figure 11 presents the input interface for chatbot intent data. Developers can input the Intent code, the name of the intent, the synonym of the intent in Korean, the Intent type and a description of the intent. Figure 12 shows the input interface for chatbot entity data. Developers can input the Entity name, the Entity value, a description of the entity and Synonym tags (users can express the same Entity in many ways). Figure 13 presents the input interface for chatbot utterance data. First, when the developer chooses an Intent code, the intent name and the Korean intent appear. After that, the developer chooses the question type and inputs the question together with a suitable answer and the answer type.
The data set sample of Intent after input is presented in Figure 14, including the Intent code, name of intent, synonym of intent in Korean, Intent type and description of the intent. The data set sample of Entity after input is presented in Figure 15, including Entity names, Entity value, description of the entity and Synonym tag (users can express the same Entity in many ways). The data set sample of Utterance after input is presented in Figure 16, including the Intent code, Intent, Intent Korean, Entity group, Question's Type, Question, Answer's Type and Answer.
Data are stored in MongoDB as documents composed of field-value pairs. The value of a field can be any of the supported data types [27], as presented in Figure 17. We chose this database because of the properties of natural language data: they are unstructured or semi-structured and cannot be stored in fixed formats such as tables. In this use case, we stored each data record as field-value pairs.
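As an illustration of field-value storage, the sketch below writes one intent, one entity and one utterance record as Python dictionaries, the form in which they could be handed to a MongoDB driver. The field names are our own snake_case renderings of the fields listed above, and every value is a made-up placeholder rather than content from the actual data set.

```python
import json

# Field names follow the fields listed in the text; all values are placeholders.
intent_doc = {
    "intent_code": "INT001",
    "name": "visa_info",
    "korean_synonym": "비자 정보",
    "intent_type": "information",
    "description": "Questions about visas for company founders",
}
entity_doc = {
    "entity_name": "visa",
    "entity_value": "visa",
    "description": "Residence status of the investor",
    "synonym_tags": ["visa", "visas", "residence permit"],
}
utterance_doc = {
    "intent_code": "INT001",
    "intent": "visa_info",
    "intent_korean": "비자 정보",
    "entity_group": ["visa", "company"],
    "question_type": "text",
    "question": "Which visa lets me open a company in Korea?",
    "answer_type": "text",
    "answer": "(placeholder answer text)",
}

REQUIRED_FIELDS = {
    "intent": {"intent_code", "name", "korean_synonym", "intent_type", "description"},
    "entity": {"entity_name", "entity_value", "description", "synonym_tags"},
    "utterance": {"intent_code", "intent", "intent_korean", "entity_group",
                  "question_type", "question", "answer_type", "answer"},
}


def missing_fields(doc, kind):
    """List the required fields that a document of the given kind lacks."""
    return sorted(REQUIRED_FIELDS[kind] - doc.keys())


# Documents are schema-free, so a light check before insertion catches typos.
for kind, doc in [("intent", intent_doc), ("entity", entity_doc),
                  ("utterance", utterance_doc)]:
    print(kind, missing_fields(doc, kind), len(json.dumps(doc, ensure_ascii=False)))
```

Because values may be strings, lists or nested documents, a record such as `entity_doc` can carry its list of synonym tags directly, which a fixed table layout would need a separate table to express.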
In our system, after inputting Entity, Intent and Utterance data, developers can use the testing functions we built; the evaluation results of the chatbot are shown in Figure 18.
We then collected investment data for ROK from Korean government public portals such as the Korea Public Data Portal (data.go.kr (accessed on 24 July 2020)) and the Busan Provincial Portal (bigdata.busan.go.kr (accessed on 24 July 2020)), as shown in Figure 19. We mainly used data from these websites because they are accurate and updated yearly, which holds for the investment data. Collecting the data correctly ensures the efficiency and accuracy of the system later. We mainly collected Invest attraction data and industry register data. Figure 20 presents the structure of the Invest attraction data saved in the database of our system. After processing, Invest attraction data are stored in MongoDB with the following fields: Id, name of country, year and the amount of money invested in ROK. Each record is stored separately by Id for easier extraction and analysis.
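The record layout described above (Id, country, year, amount) lends itself to simple grouping. The following sketch, under the assumption that records are available as dictionaries with the hypothetical keys `id`, `country`, `year` and `amount`, sums investment per country and ranks the countries. The records are made-up placeholders, not real statistics.

```python
from collections import defaultdict

# Illustrative records only; countries and amounts are placeholders.
records = [
    {"id": 1, "country": "Country A", "year": 2018, "amount": 120.0},
    {"id": 2, "country": "Country B", "year": 2018, "amount": 80.0},
    {"id": 3, "country": "Country A", "year": 2019, "amount": 150.0},
    {"id": 4, "country": "Country B", "year": 2019, "amount": 60.0},
]


def find_by_id(rows, record_id):
    """Return the record with the given id, or None if it is absent."""
    return next((row for row in rows if row["id"] == record_id), None)


def total_by_country(rows):
    """Sum the invested amount per country across all years."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


def rank_countries(rows):
    """Countries ordered from the largest to the smallest total investment."""
    return sorted(total_by_country(rows).items(), key=lambda item: item[1], reverse=True)


print(find_by_id(records, 3))
print(rank_countries(records))
```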

Data Analysis
In addition to static information such as laws and government regulations related to investing, statistical data and analyses of investment trends and business support policies were also collected and aggregated. For example, information on the countries that invested in Korea from 2013 to 2019 is shown in Figures 22a,b and 23.

Finally, to assist foreign businesses seeking information on cooperation with Korean businesses, enterprise information was collected as shown in Figure 21. With the information above, the aggregation and analysis data sets were built to help the Virtual Assistant learn from the data.

Operational Solution Assistant: Chatbot
We built an Assistant system, a smart assistant for foreign investors looking for information on topics such as Investment in Korea: Introduction, Korea's Investment Climate, Investment Guide, Government support policies and programs, and Legal, Tax and Labor. Users receive different types of information.
As shown in Figures 26 and 27, when the user asks the Assistant in a declarative sentence rather than a question which type of visa allows opening a company in Korea, the Assistant can give a correct answer. The Assistant system can also interact with users naturally, like a human.

In addition, the Assistant Application can provide analysis information, as shown in Figure 28. We collected data about the supported industries in the regions of Korea and analyzed them, so the assistant can also provide these analysis results. As shown in Figure 29, the Assistant Application can provide information on other companies in Korea, such as addresses and contact details.
Our Assistant Application is also available on websites; users can access it there, and the conversational data recorded from their interactions help to improve its performance. To evaluate the performance of the application, we set up a realistic experiment with 20 colleagues and friends to test all functions of our system and to estimate the accuracy of the chatbot. First, all the functions (chatbot, visualization and developer) worked well and provided a friendly environment for users looking for information and seeking investment opportunities in ROK. Second, we calculated the accuracy of the chatbot as the percentage of correct answers or suggestions in response to users' questions or interactive communication. Three outcomes can occur: (a) the chatbot gives a correct answer, (b) the chatbot gives a wrong answer, or (c) the chatbot gives a correct answer but does not understand or misses the referred context. In testing, the chatbot could answer most questions, and the number of wrong answers was very small (5 wrong answers out of 92). Because the number of testers and the size of the database are still very small, the accuracy measured at this stage is expected to be high. In the future, we will expand the database as well as the number of testers to reach more accurate conclusions.
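The accuracy figure implied by the reported counts can be worked out directly. The sketch below assumes that every answer not counted as wrong is counted as correct; the paper does not report how many answers fell into outcome (c), so that split is not modeled here. The function name `chatbot_accuracy` is ours.

```python
def chatbot_accuracy(total_answers, wrong_answers):
    """Share of answers that were not wrong, as a fraction between 0 and 1."""
    if total_answers <= 0:
        raise ValueError("total_answers must be positive")
    if not 0 <= wrong_answers <= total_answers:
        raise ValueError("wrong_answers must lie between 0 and total_answers")
    return (total_answers - wrong_answers) / total_answers


# Counts reported in the experiment: 5 wrong answers out of 92.
rate = chatbot_accuracy(92, 5)
print(f"{rate:.1%}")  # prints 94.6%
```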

Web-App Visualized Analysis Investment Data
As a result, we also built a web app to visualize the analyzed investment data collected from public websites in ROK. Specifically, we collected data about the countries that invested in ROK, business registration information for new companies, industry investment and the number of investment companies from 2013 to 2019 from the public provider website in ROK (data.go.kr, accessed on 24 July 2020). After pre-processing and aggregating the data with Python, we built tables and simple charts so that users can easily access and use the information.
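A minimal sketch of the kind of aggregation that feeds such tables and charts is shown below: yearly totals and the year-over-year change. The rows are made-up placeholders and the helper names are ours; the actual pre-processing steps are not described in the paper.

```python
from collections import defaultdict

# Placeholder rows (year, country, amount); not real investment figures.
rows = [
    (2017, "Country A", 60.0), (2017, "Country B", 40.0),
    (2018, "Country A", 75.0), (2018, "Country B", 50.0),
    (2019, "Country A", 70.0), (2019, "Country B", 30.0),
]


def yearly_totals(data):
    """Total invested amount per year, ordered by year."""
    totals = defaultdict(float)
    for year, _country, amount in data:
        totals[year] += amount
    return dict(sorted(totals.items()))


def year_over_year(totals):
    """Percentage change of each year's total against the previous year."""
    years = list(totals)
    return {
        year: round((totals[year] - totals[prev]) / totals[prev] * 100, 1)
        for prev, year in zip(years, years[1:])
    }


def render_table(totals, changes):
    """Plain-text table of the kind a web page could display."""
    lines = ["year  total  change_%"]
    for year, total in totals.items():
        lines.append(f"{year}  {total:5.1f}  {changes.get(year, '')}")
    return "\n".join(lines)


totals = yearly_totals(rows)
changes = year_over_year(totals)
print(render_table(totals, changes))
```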

Conclusions and Future Work
This study focused on building an Assistant system to help foreign investors overcome language barriers and cultural and institutional differences by providing tools to search for information and seek investment opportunities in ROK. We designed a Virtual Assistant system that can crawl and analyze data on foreign investments in ROK from open data websites and used analytic and aggregation techniques to explore trends in the investments of foreign enterprises. We built data sets of information and investment data for SMEs in ROK, including government support policies, laws, taxes, investment data and data on newly established enterprises. We built a friendly framework that lets new developers add to and build up the data set of the AI Assistant by providing functions for inputting intent data, entity data and utterance data, as well as training and testing functions. In addition, we used cloud computing combined with programming techniques such as Python natural language processing, natural language understanding and web applications to build the core processing engine of a virtual assistant. We also built a web app connected to the server to visualize the results of the analyzed data so that SME owners can easily use and look for information on investments. Based on the research results, we can make recommendations to SMEs in keeping with the changing investment trends in ROK.
In our research, we focused only on collecting and constructing investment data sets (including government support policies, laws, taxes, investment data and data on newly established enterprises) for ROK, so the data set is not yet diverse. These data are only public data, so they may not be complete and accurate. For the application to generalize information further, it is necessary to cooperate with governments and with Korean and foreign enterprises to build a big data collection and a data set updated year by year, along with other useful information such as investment surveys, information on government assistance programs and surveys on the difficulties SMEs face. This study does not assess user acceptance or the performance of the virtual assistant in practice, so the practical applicability of the system has not yet been evaluated. In the future, we will conduct more empirical studies with SMEs in ROK. As a next step, the big data set will include more information related to investment issues in Korea. In particular, the real-time market analysis data and forecasts that foreign investors desire will be provided through cooperation with the government and with domestic and foreign organizations in Korea such as KOTRA, FORCA and Invest Korea, to complete the project and bring more value. Making virtual assistants friendly and easier to apply for developers without strong technical knowledge, in other fields such as health care, social welfare, education and insurance, is also part of our future work.