Hit or Miss? Evaluating the Potential of a Research Niche: A Case Study in the Field of Virtual Quality Management

When knowledge is developed fast, as it is the case so often nowadays, one of the main difficulties in initiating new research in any field is to identify the domain’s specific state-of-the-art and trends. In this context, to evaluate the potential of a research niche by assisting the literature review process and to add a new and modern large-scale and automated dimension to it, the paper proposes a methodology that uses “Latent Semantic Analysis” (LSA) for identifying trends, focused within the knowledge space created at the intersection of three sustainability-related methodologies/concepts: “virtual Quality Management” (vQM), “Industry 4.0”, and “Product Life-Cycle” (PLC). The LSA was applied to a significant number of scientific papers published around these concepts to generate ontology charts that describe the knowledge structure of each by the frequency, position, and causal relation of associated notions. These notions are combined for defining the common high-density knowledge zone from where new technological solutions are expected to emerge throughout the PLC. The authors propose the concept of the knowledge space, which is characterized through specific descriptors with their own evaluation scales, obtained by processing the emerging information as identified by a combination of classic and innovative techniques. The results are validated through an investigation that surveys a relevant number of general managers, specialists, and consultants in the field of quality in the automotive sector from Romania. This practical demonstration follows each step of the theoretical approach and yields results that prove the capability of the method to contribute to the understanding and elucidation of the scientific area to which it is applied. Once validated, the method could be transferred to fields with similar characteristics.


Introduction
In the paradigm of the knowledge-based society, knowledge becomes an indispensable resource in the further development of the contemporary social and economic status. Hence, in recent years, to support the required advancements of modern society, the speed at which knowledge is generated has increased exponentially. Yang Lu in Reference [1] associates this phenomenon with the Fourth Industrial Revolution and the emergence of the "Internet-of-Things", "Cyber-Physical Systems", and "Enterprise Integration"; however, the increasing computational capacity and connectivity of computers could also be mentioned as facilitators. Often, this high dynamic in research (knowledge generation), especially in interdisciplinary fields, brings the need to express new specific contents, meanings or trends in a linguistic form. In this way, new syntagmas or concepts are brought to light and adopted by professionals as part of the specific jargon in the field. Even if their creators endowed them with a clear meaning at an incipient stage, when they become more popular in an emerging area, these concepts are quickly surrounded by a large amount of new knowledge that is developed with an amazing speed, enriching and enlarging their initial sphere.
The "virtual Quality Management" (vQM) concept could be a significant example of the circumstances described previously. It was born through a semantic operation joining two established and mature concepts, "virtual" and "QM"; thus, it is representative of an area that is in a period of highly dynamic development and of interest for companies preoccupied with sustainability from the perspective of operations management and organizational culture.
In this context, in which the amount of information relating to new concepts quickly reaches unmanageable levels regardless of the field, solutions that can analyze extended documentation with the purpose of disambiguating information and capturing the essentials, thus creating knowledge, become the focus of attention and gain in importance. Traditional solutions for that purpose lie in the literature review process, which tries to collect, select, filter, and structure the existing and relevant information in a synthetic way. However, this solution has limits: on the one hand, the large amounts of data can lead to a decrease in the ability of a research team to handle that information and, on the other, the selection and interpretation of the chosen references is exposed to strong subjectivity.
Natural Language Processing (NLP), a field of computational linguistics, can be considered a possible alternative for the above-stated issues, making users capable of going through large amounts of information fast while retaining only important data. It deals with translating human language to computers, enabling them to summarize and essentialize inputted information from an unlimited number of references. The computer programs deployed for this task use various algorithms, each with their own advantages and disadvantages. Among these, worth mentioning are: Part-of-Speech Tagging; Named Entity Recognition; Semantic Role Labeling; and Latent Semantic Analysis (LSA).
The current paper proposes a framework for identifying trends, based on LSA by analyzing the knowledge space resulting from the intersection of two knowledge structures given by the vQM and Industry 4.0 concepts.
The investigatory process for these two concepts was done by applying LSA to a significant number of scientific papers published around them, selected after they underwent an initial raw filtration process proposed by the authors. As a result of this analysis, the generated ontology charts defined the knowledge structures of the two concepts by the frequency, position in text, and causal relation of associated notions. Combining these notions with one another resulted in meaningful descriptors. The common ones were retained and further used for identifying new technological solutions and industrial applications, which are considered to emerge from this particular intersection of concepts. They were finally sorted and ranked within the PLC stages using the affinity diagram (KJ diagram) and the matrix diagram.
The intended added value of this approach is to supersede the classic literature review in the state-of-the-art analysis with automated information processing tools, both classic (affinity diagramming, conceptual mapping, and analysis based on ontology charts) and modern (Latent Semantic Analysis), for investigating complex knowledge spaces born at the intersection of new concepts. The illustration of this endeavor was done with the help of the vQM and Industry 4.0 concepts, and its end result is the set of current and emerging practical Industry 4.0 applications related to vQM throughout all PLC stages, particularly in the automotive industry.

Objective of the Research
The main objective of the research presented in this paper is to develop an innovative and expedient approach to evaluate the result potential and dissemination outlook of a research topic at the beginning of an investigation endeavor, especially in emerging scientific domains that are characterized by a high degree of entropy and uncertainty. The presented methodology relies on the use of information processing tools, techniques, and methods (combining both classic and new) to address the most common concerns of research teams and help them arrive at a timely answer with an acceptable degree of validity to the issue concerning the relation between invested resources (time, staff, equipment, etc.) and foreseeable impact.
Sustainability 2019, 11, 1450
In this regard, the research approached in the current paper proposes, firstly, to analyze the knowledge density zone obtained by intersecting two knowledge structures, represented here by the vQM and Industry 4.0 axes, throughout the PLC stages (Figure 1).
Secondly, in addition to concept investigation and disambiguation, it aims to identify and hierarchize industrial applications (or new technological solutions) that are expected to emerge.
Finally, the paper intends to validate the proposed methodology through an online survey.
The need to develop this algorithm arose from the authors' involvement in highly dynamic manufacturing research projects, where they were compelled to generate practical solutions for companies before the theories were settled. As such, the validation included in this paper was also based on the vQM and Industry 4.0 concepts, but the approach can be transferred to any domain.
To achieve these aims, the following stages were established:

•
In the first stage, relevant papers and scientific references were identified for each of the two main concepts individually: vQM and Industry 4.0. After that, a filtration process was performed to limit as much as possible the amount of irrelevant information. The ontology charts, which are similar in nature as knowledge structures for both vQM and Industry 4.0, were determined by applying LSA (using a dedicated software program) to the final sets of bibliographic references.

•
The second stage was reserved for weighting notions that were in relation with the main concepts. Based on the ontology charts and other information provided by the software, each notion relating to vQM and Industry 4.0 was ranked by calculating its CN (correlation number). Next, associations were made between them with the aim of obtaining syntagmas (descriptors) that characterized the two main concepts. A score was also calculated for each of them, signifying its importance to both concepts.

•
The third stage began with extracting the common meaningful descriptors between the two main concepts. Based on these expressions, applications were identified using search engines. This way the focus was kept only on those applications that were in close reference to vQM and Industry 4.0.

•
The fourth stage focused, firstly, on identifying the product life cycle stages and, secondly, on the construction and completion of the matrix diagram. It had, as inputs on the left, the product life cycle stages together with the identified applications and, above, the common descriptors between vQM and Industry 4.0 (associated by their weighted importance). It illustrates not only the hierarchy of the main categories of applications throughout the product life cycle stages, but also the rank of every sub-category application within the main categories.

•
Within the fifth stage, an online questionnaire was applied for checking results obtained from the matrix diagram against an average rating coming from experts from Romania's automotive industry.
The research methodology is summarized in the following figure (Figure 2) showing also inputs and outputs for all its phases in the form of an extended flowchart:



The Review of the Scientific Literature
Considering the objectives set forth by the paper (detailed previously), in this section the authors assembled the elements relating to the background and literature review into two categories: aspects relating to the need to assist the literature review process with modern instruments and techniques; and background aspects relating to the main concepts forming the analyzed knowledge space.

The Need for New Approaches
If the statement that the development of the current society is strongly linked to that of knowledge is already a truism, what seemed fascinating to scientists was and is its speed of growth. The study of knowledge growth has enjoyed attention starting with Price's theory on the exponential growth of science [2], passing through Buckminster Fuller's knowledge doubling curve [3], and arriving at recent works such as Reference [4], which summarizes mathematical models for knowledge growth, or References [5,6], which focus on modern digital knowledge spreading channels, to mention just some of the countless publications on this topic in recent years.
At the communication level, new knowledge usually calls for new concepts to formalize its significance. The emergence of new concepts or concept groups generates, in their semantic proximity, domain-specific or interdisciplinary knowledge spaces. Any connected new research passes through an exploratory phase in which a preliminary investigation of these spaces is necessary to identify the state-of-the-art, directions, or trending elements in its area. This exploratory investigation is often carried out with the help of literature reviews, which are complex, laborious, and time-consuming tasks, whose results are strongly dependent on the author's critical thinking ability, experience, and competence in the domain. The unprecedented increase in the amount of information that needs to be processed to capture the relevant aspects of the knowledge space in question leads to either the reduction of the reference sample, with the risk of losing important inferences from the analysis result, or the supporting use of modern computer-assisted tools able to deliver a primary structure of the conceptual framework for the targeted knowledge spaces.

Literature Review Methods and Tools
A scientific literature review "has become the most common way of acquiring knowledge and oftentimes sets the direction for a study" [7]. Alongside the methods used and the constraints imposed, the type of review can also differ. The authors of Reference [8] identified a typology of 14 main review types for investigating and evaluating specialized literature, out of which worth mentioning are: "critical review"; "mapping review"; "rapid review"; "state-of-the-art review"; and "systematic search and review".
From the tools and techniques' perspective, two categories are identifiable: classic tools and those that use computer-assisted instruments.One might argue that, currently, literature reviews are carried out using tools from the latter category; however, some that have more of a classic background are still promoted: concept mapping (structuring the knowledge from a certain area based on main concepts in the form of highly-visual impact maps) [9,10]; affinity diagramming (structuring main ideas extracted from an initial literature search) [11]; snowball sampling (finding a network of professionals specific to a certain area by conducting interviews and asking them for other references) [12]; and literature tables used both for managing and ensuring the traceability of identified references, and for summarizing information and extracting main ideas.
The computer-assisted tools comprise software programs that help the user make this review more efficient, considering the following points:

Natural Language Processing Approaches
Natural Language Processing (NLP), a field of computational linguistics, deals with translating human language to computers, enabling them to understand, summarize, and essentialize inputted information from an unlimited number of references. An analysis of the specialized literature groups the most commonly used models and algorithms (incorporated into software solutions) into those that perform basic text processing (part-of-speech tagging; named entity recognition), those that use statistical language models (bag-of-words model; n-gram models; neural network language model), and those that interpret the text based on the semantic meaning of the words within it (semantic role labeling; latent semantic analysis). Based on these, the software assists the user by increasing the quality of the review and the amount of information subjected to the analysis, while reducing the time the user would need to process that same amount of text (literature) individually.
Part-Of-Speech (POS) Tagging assigns parts of speech to words from a text body, allowing words that have the same written form but different meanings in different contexts to be understood accordingly by the software. Developed by Reference [16] and later advanced by other contributors [17–19], the method is enjoying recent applications regarding the analysis of written information on social media websites [20–22], dealing with grammar-related aspects [23,24], identifying plagiarism [25], or even acting as a human-machine interface [26].
Named Entity Recognition enables the localization and categorization of "important and proper nouns in a text" [27], helping the reviewer to focus on important concepts that characterize the text. According to Reference [28], "this is an important task because its performance directly affects the quality of many succeeding NLP applications such as information extraction". Its application recently gained popularity for processing semi-structured knowledge bases regarding entity disambiguation/mapping [29–31] and extracting/retrieving information [32], or for analyzing content generated on social media [33–35].
The Bag-of-Words Model identifies and represents by histograms the frequency of words appearing in the text [36], enabling the reviewer to conduct an initial text categorization. Although its application is grounded in NLP (e.g., opinion mining on information generated by social media users [37,38]), recent approaches prove its utility in other fields as well, such as image processing [39–41].
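As an illustration of the word-frequency histogram described above, the following minimal sketch counts token occurrences in a short text. The tokenization rule (lowercasing and keeping only alphanumeric runs), the function name `bag_of_words`, and the sample sentence are assumptions made for this example; they are not taken from the software used in the study.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Return a word -> frequency histogram, ignoring case and punctuation."""
    # Assumed tokenization: lowercase the text and keep alphanumeric runs only.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tokens)


# Hypothetical sample text, used only to exercise the function.
sample = "Quality management supports virtual quality planning."
counts = bag_of_words(sample)

# The most frequent words give a first, rough categorization of the text.
print(counts.most_common(3))
```

Because word order is discarded, such a histogram supports only a coarse first categorization, which is consistent with the role the model plays in the description above.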
n-Gram Models help determine the probability of a sequence of words in a sentence or in a text. Their application varies from identifying patterns in text [42] to data extraction [43], automatic speech recognition, machine translation, and spell checking [44,45]. Neural Network Language models offer an improved version [46], both having the potential to be integrated into computer-assisted tools for supporting text reviewers.
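To make the idea of a sequence probability concrete, the sketch below estimates a bigram (n = 2) model by maximum likelihood, without smoothing, and multiplies the conditional probabilities along a word sequence. The toy corpus and the helper names `train_bigram` and `sequence_probability` are hypothetical and serve only as an illustration.

```python
from collections import Counter


def train_bigram(tokens):
    """Return a function prob(prev, word) estimating P(word | prev) by counting."""
    # Count each token as a context only when it is followed by another token.
    contexts = Counter(tokens[:-1])
    bigrams = Counter(zip(tokens, tokens[1:]))

    def prob(prev, word):
        if contexts[prev] == 0:
            return 0.0  # unseen context: no smoothing in this sketch
        return bigrams[(prev, word)] / contexts[prev]

    return prob


def sequence_probability(tokens, prob):
    """Probability of the sequence given its first word, under the bigram model."""
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        p *= prob(prev, word)
    return p


# Hypothetical toy corpus.
corpus = "smart factory uses smart sensors and smart factory data".split()
prob = train_bigram(corpus)
print(sequence_probability(["smart", "factory", "data"], prob))
```

Real applications would add smoothing so that unseen word pairs do not force the whole sequence probability to zero.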
Semantic Role Labeling traditionally uses a shallow parsing algorithm that identifies "arguments within the local context of a predicate" [47], enabling the software solutions, to a certain degree, to summarize information [48]. However, recent advancements, presented by References [49] and [50], outline the possibility of using this technique without syntactic parsing.
Latent Semantic Analysis (LSA) analyzes the relationship between terms contained in several documents and outputs a set of concepts that are semantically related to the terms and documents subjected to the analysis, and has multiple applications in the fields of information processing and knowledge discovery [51–53]. It uses Singular Value Decomposition (SVD), a mathematical tool that allows the reduction of matrix rows without seriously compromising the structure of columns. The matrix rows m (constructed out of the body of text) are the individual word types, and the columns n represent the sentences and paragraphs in which the meaning of those words is encoded. The resulting cells contain information about the reoccurrence of a word in a paragraph [54]. After applying the SVD, and based on information obtained from it, an additional step is carried out, "sentence selection", in which the "most characteristic parts of text" are selected to illustrate ideas that are essential in the body of text [55].
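The matrix construction and decomposition described above can be sketched as follows, assuming raw term counts as cell values and NumPy's `numpy.linalg.svd` for the decomposition. The four short documents, the choice of k = 2 retained dimensions, and the helper names are assumptions made for illustration; this is not the dedicated LSA software used in the study, and the "sentence selection" step is not reproduced.

```python
import re

import numpy as np


def term_document_matrix(docs):
    """Build a count matrix with one row per word type and one column per document."""
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    row = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(tokenized):
        for t in doc:
            A[row[t], j] += 1  # cell = occurrences of the word in the document
    return A, vocab


def truncated_svd(A, k):
    """Keep the k largest singular values and their singular vectors."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Hypothetical mini-corpus: two texts about quality, two about smart factories.
docs = [
    "virtual quality management simulation",
    "simulation supports quality planning",
    "smart factory sensors network",
    "cyber physical systems smart factory",
]
A, vocab = term_document_matrix(docs)
U_k, s_k, Vt_k = truncated_svd(A, k=2)

# Each document becomes a point in the reduced k-dimensional "concept" space.
doc_vectors = (np.diag(s_k) @ Vt_k).T

print(cosine(doc_vectors[0], doc_vectors[1]))  # documents on the same topic
print(cosine(doc_vectors[0], doc_vectors[2]))  # documents on different topics
```

In this toy corpus, the two quality-related documents end up close to each other in the reduced space, while documents from the other topic are nearly orthogonal to them, which is the kind of latent grouping LSA is meant to expose.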

Virtual Quality Management (vQM)
The expansion of applications in the virtual domain also opened new horizons for "Quality Management", which evolved alongside the trends of modern industry. A new concept was thus born, called "virtual Quality Management" (vQM).
The concept of vQM is based on, but not limited to, simulation studies, which are efficiently deployed for the sole purpose of "generating resilient knowledge and dimensioning quality techniques" [56] that can be applied either to products or processes before they physically exist. By doing so, products and processes reach a certain level of maturity in the planning stage, so they can be introduced straight into production, having an increased level of performance compared to ones developed using conventional methods.
As identified from several references approaching the virtual area of quality management in manufacturing, simulation also plays an important role for vQM in developing models capable of foreseeing, adapting to, and optimizing different scenarios by analyzing existing data [57], of increasing efficiency and implementing sustainable development [58], or even of reproducing measurement uncertainty due to temperature, humidity, or other external influences [59]. Moreover, the approach proposed by Reference [60] regarding stochastic simulations for tolerance analysis "contributes to the improvement of virtual quality assessment", which, in this case, takes place already in the design stage of a product's life cycle. The combination of "simulation" and "virtual reality" allows operators to interact virtually with the fabrication process [61], which in turn enables virtual control and even virtual inspection of the manufacturing flow [62]. Immersive virtual reality applications can also be used efficiently in problem-solving activities, as demonstrated by Reference [63].
The nature of vQM is highly innovative, as it is capable of providing the necessary information for deploying quality management techniques before the actual start of the manufacturing processes [59]. Quality and process parameters are obtained with the help of advanced modeling and simulation tools, all contributing to the process design of a manufacturing system. The architecture of vQM, presented in Figure 3, illustrates precisely the above-stated particularities.
Another important side of vQM is communication. The development of virtual channels (supported by the internet) brought several advantages in this field, such as improving the way we interact with each other or even with machines. They also increased the speed of information transfer, narrowed the gap between supplier and customer, and enabled product developers to provide their contributions remotely by forming "virtual teams" [64].
Although the definition of the concept is quite clear, the impact vQM has on current production engineering and quality management related concepts, and how it interacts with them, is yet to be determined.
A possible research instrument that can capture exactly and contribute to the disambiguation of the vQM concept is LSA. This type of instrument is proposed because the clarification process should begin right at the semantic value of the expression, by analyzing it in the already used or possible contexts put forth by the specialized literature of recent years.


Industry 4.0
The fourth industrial revolution (smart automation) is on the verge of unfolding, and its driver will be the integration of physical objects in the information network. The main promoter of this leap in the manufacturing industry is Germany, which introduced its own term "Industry 4.0" (or Industrie 4.0 in German) in 2011 at the Hannover Messe trade fair.
The year 2013 brought a clear definition of what the requirements are for the next industrial leap. In this sense, the report "Recommendations for Implementing the Strategic Initiative Industry 4.0" was published in Germany, written by the German Communication Promoters Group of the Industry-Science Research Alliance in collaboration with the National Academy of Science and Engineering and sponsored by the Federal Ministry of Education and Research. According to this report (also referred to as the Industry 4.0 National Working Group report), every physical object that is connected with the manufacturing process will be interlinked into a single network through the IoT. This network will incorporate everything from the factory floor to the delivery process, i.e., multiple systems that are laid out as "Cyber-Physical Systems", forming the so-called "Smart Factory". This sort of system has the capability to be self-aware; thus, it can actively intervene in the manufacturing processes to prevent potential faults [66]. Its self-awareness is given by the sensory data collected within the system. Actions that are precisely ordered are based on previously stored information, thus making the system not only self-aware, but also self-maintained and self-learning. Due to all these technological advancements, Industry 4.0 is viewed as one of the fastest developing trends in production engineering.

Research Questions and Rationale
Within this context, the current research demarche begins with a set of questions developed from the main premise stated above: under severe time constraints, complex and resource-intensive scientific investigations should only be initiated after a preliminary analysis of the available information concludes that there will be sufficient return on investment in the end. The authors laid out a possible formulation of these questions in a concentrated form:

•
What is an ontology chart that depicts the layout of a concept, and how can it be obtained?
The input selection in the case of the two main concepts was made in the same way. Mostly, the authors focused on gathering references with the help of search engines that have access to large databases of scientific research and offer the possibility to access full-text articles (Elsevier's Scopus, Elsevier's Science Direct, and Google Scholar). However, if additional documents were found with the help of traditional search engines (like Google or Yahoo), they were not disregarded, but were retained only if they were considered to be relevant after the first filtration process. This type of documentation was manually converted to a format supported by the LSA software (typically .pdf or .txt).
As search engines work with keywords, the challenge was to determine those words that accurately represent the studied concepts. In the first instance, the search was limited to identifying references in relation to the name of the concept; then it was extended to include certain keywords obtained from the top ten most cited scientific papers. These were restricted to six at most, to avoid the display of insignificant documentation. The timeframe was also preset to the last 7 years to limit outdated information. It was considered that documentation outside this timeframe cannot accurately capture state-of-the-art advancements in the studied fields. After a brief review of the displayed papers (the focus was kept on the abstract of the paper, but not limited to it), a total of 35 references were considered to be sufficient for illustration purposes for each main concept. They were downloaded and saved in separate folders. The brief review was chosen instead of extensive literature browsing for two reasons. Firstly, extensive browsing would not serve the proposed objective, which is the quick identification of trending elements and concepts from a specific field of interest (it would be much more time consuming). Secondly, there were no guarantees that a paper read in its entirety could be used in the semantic analysis, so there was a chance that it would be disregarded after the review. We point out that the choices made here represent one of the limitations of the proposed methodology: for making these decisions relating to the scope of the research, it must rely on the experience and expertise of the reviewers, on the quantity and quality of the available literature, and on the relative positioning of the studied notions within the processed material when performing the first quantitative evaluation.
To make sure that every selected paper was indeed relevant for the analysis, the methodology included an additional filtration process performed by the review team. Each keyword (mentioned above) was analyzed from the perspective of its distribution throughout all 35 papers. If the maximum of all keyword frequencies in a paper was not greater than five (viewed as a minimal threshold, considering that all of the studied papers were more than five pages in length), the paper was considered to be irrelevant and was removed from the analysis folder, where all papers were saved in the agreed format. A paper was thus removed when the following condition was fulfilled:
$$\max_{i} x_i \leq 5$$
where $x_i$ represents the frequency of keyword $i$ ($i = 1, \dots, 8$). For illustration purposes, Figure 4 contains the frequency of the mentioned keywords in a certain paper.
Figure 4 contains the number of occurrences of each keyword for one of the 35 papers. As can be seen, the keywords "management" and "organization" are the two least frequent ones. Although "management" appears only twice, this is not sufficient to consider the paper irrelevant; for a certain scientific reference to be entirely removed from the analysis, it must be scarce in all keywords (their maximum must be lower than or equal to five). It can happen that a certain paper is scarce in one keyword but rich in others; in this case, the paper is taken into account, as in this example, again appealing to the reviewers' competence, but to a reduced degree compared to the classic literature review. After this final filtration process, a set of 30 papers remained, which were subject to the LSA.
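The filtration rule described above can be sketched in a few lines of Python. This is only an illustration: the function names, the paper identifiers, and the keyword counts are hypothetical, and only the threshold of five comes from the text.

```python
from typing import Dict, List

THRESHOLD = 5  # minimal keyword frequency stated in the text


def is_relevant(keyword_counts: Dict[str, int], threshold: int = THRESHOLD) -> bool:
    """Keep a paper when its most frequent keyword occurs more than `threshold` times."""
    return max(keyword_counts.values(), default=0) > threshold


def filter_papers(papers: Dict[str, Dict[str, int]]) -> List[str]:
    """Return the names of the papers that pass the filtration step."""
    return [name for name, counts in papers.items() if is_relevant(counts)]


# Hypothetical keyword counts for two papers (not taken from the study).
papers = {
    "paper_a": {"quality": 14, "management": 2, "organization": 1},
    "paper_b": {"quality": 3, "management": 5, "organization": 0},
}
kept = filter_papers(papers)
```

In this made-up example, "paper_a" is retained although it is scarce in "management", because at least one keyword exceeds the threshold, while "paper_b" is removed because its maximum frequency equals five.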
The semantic analysis was carried out using a software program called Tropes v8.4 (English version) which is available free of charge and is capable of processing and perceiving, to a sufficiently advanced level, relationships between words, identifying or disregarding words specific to a domain of interest (based on input settings), counting the reoccurrence of words within different contexts, and retrieving neighboring concepts for certain notions or words.

Practical Implications (Defining the Ontology Charts)
As a direct result of the semantic analysis, an ontology chart of vQM was obtained and is presented in Figure 5.
The vQM ontology chart (Figure 5) contains notions that are strongly related to quality management (measurement, control, quality, innovation, SPC). Some make reference to instruments deployed through virtual means (such as simulation, modeling, and optimization), and the chart introduces "communication" and "collaboration" as notions linked with vQM (although they have a low frequency, in sentences they were used close to vQM).
For "Industry 4.0", considerably more related notions appeared on the ontology chart (Figure 6) than in the case of vQM. This can be explained by the fact that, although the two concepts are considered to be relatively new, "Industry 4.0" is more developed and established in the scientific community, as it is promoted as the next generation of production engineering. Also, the known information on this subject has reached a more mature level due to a higher number of publications.
The related notions in this chart include the main pillars of Industry 4.0, such as IoT (Internet of Things), cyber (making reference to "Cyber-Physical Systems"), "Big Data", and "Cloud Computing"; some design principles of Industry 4.0 as stated in Reference [67] (interoperability, decentralization, virtualization); and some notions which are also common to vQM.
The correlation degree of each current notion (denoted k) with regard to the main concept is reflected in its position on the chart. The position can be expressed as a function of the causal relation (the x-axis) and the concentration of relations, or frequency of association (the y-axis) [68]. Thus, the graph supports a visual comparison of the weight of relations between all concepts present on the chart and the main one.
On the x-axis, the number of words separating one concept from another in sentences can be considered a direct reference for establishing their causal relation. Taking this into account, for quantifying the causal relation for both the "actant" (predecessor, enabler or source) and "acted" (successor, consequence or attribute) concepts, a scale from 0 to 1 was attributed from the center to both ends of the chart. The further a certain notion k is placed on the chart from the main concept (closer to the value 1), the more words are placed between them in sentences. This value, called the "Causal Relationship Value" (CRV), takes values from 0 to 1:
$$0 \leq \mathrm{CRV}_k \leq 1$$
On the y-axis, a fraction is represented between the number of relations (or associations) that each concept k has with the main one and the total number of relations. This value is called the "Concentration Value" or CV. The chart shows the minimum (bottom) and maximum (top) values, as they were calculated by the software program used to complete the semantic analysis. The formula for expressing the CV is:
$$\mathrm{CV}_k = \frac{B_k}{A}$$
where $A$ is the total number of relations and $B_k$ is the number of relations that concept k has with the main one.

Theoretical Aspects
As a consequence of attributing the two scales for the "actant" and "acted" concepts, the strength of their causal relation with the main concept is inversely proportional to the CRV.
Since the CV and the inverse of the CRV have completely different calculation rules but are comparable in terms of magnitude, a correlation number (CN), obtained as the product of the two, was considered to be significant for expressing the importance of each concept in relation to the main one. The formula used for determining the CN is:
$$\mathrm{CN}_k = \mathrm{CV}_k \cdot \frac{1}{\mathrm{CRV}_k}$$
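As a hedged illustration of how the correlation number could be computed and ranked once the CV and CRV of each notion have been read from the chart, consider the following Python sketch. The notion names and numeric values are hypothetical, and the guard against a zero CRV is an assumption added here, since the text does not say how a notion located exactly at the center of the chart is handled.

```python
from typing import Dict, List, Tuple


def correlation_number(cv: float, crv: float) -> float:
    """CN as the product of the CV and the inverse of the CRV."""
    if crv <= 0:
        # Assumption: a CRV of zero is rejected, as its inverse is undefined.
        raise ValueError("CRV must be positive for its inverse to be defined")
    return cv * (1.0 / crv)


def rank_notions(notions: Dict[str, Tuple[float, float]]) -> List[Tuple[str, float]]:
    """Sort notions by CN, from the highest to the lowest."""
    scored = [(name, correlation_number(cv, crv)) for name, (cv, crv) in notions.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical (CV, CRV) pairs, not values from the study.
example = {
    "simulation": (0.5, 0.25),
    "control": (0.75, 0.25),
    "communication": (0.25, 0.5),
}
ranking = rank_notions(example)
```

With these made-up inputs, "control" ranks first (CN = 3.0), followed by "simulation" (CN = 2.0) and "communication" (CN = 0.5).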

Practical Implications
The CNs of the notions from the ontology charts are centralized in Tables 1 and 2, starting from the highest and descending to the lowest.
In the next step, combinations between the "actant" and "acted" notions were made to ensure increased homogeneity between all the terms of each concept. This action was taken to bring the specific notions of each concept more closely together, with the aim of finding common ground between them.
A cumulative score can be calculated for each resulting association by multiplying the individual correlation values of the two forming notions, thus focusing on their conceptual intersections. This score can be used to determine the hierarchy between the obtained associations. However, only the meaningful associations were retained (thus becoming descriptors for each concept) and ordered in a descending manner in Tables 3 and 4.
In the case of the two concepts, the expressions obtained by pair-wise associations were filtered for redundancies, and their real-life applicability was verified with the help of search engines. In some cases, connection words were added to further refine the expressions and to ensure their coherence with the main concept.
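The pair-wise scoring step can be sketched as follows, assuming the CN of each "actant" and "acted" notion is already available. The notion names and CN values are hypothetical; the subsequent retention of only the meaningful associations is a manual expert step in the methodology and is not automated here.

```python
from itertools import product
from typing import Dict, List, Tuple


def association_scores(
    actant_cn: Dict[str, float], acted_cn: Dict[str, float]
) -> List[Tuple[Tuple[str, str], float]]:
    """Score every actant/acted pair by the product of the two CNs, highest first."""
    scores = {
        (actant, acted): actant_cn[actant] * acted_cn[acted]
        for actant, acted in product(actant_cn, acted_cn)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical CN values, not taken from Tables 1 and 2.
actant = {"simulation": 2.0, "modeling": 1.5}
acted = {"control": 3.0, "measurement": 1.0}
pairs = association_scores(actant, acted)
```

Here the pair ("simulation", "control") obtains the highest cumulative score (6.0); the reviewers would then decide which of the ranked pairs are meaningful enough to become descriptors.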

Theoretical Aspects
Based on the above steps, we can now characterize the previously ambiguous knowledge space found at the intersection of the two notions, which are connected through a common goal, namely, sustainability. By overlapping the tables containing the descriptors of vQM and Industry 4.0 (Tables 3 and 4), it can be observed that some descriptors are common to both. These are used for determining the common ground between the two juxtaposed concepts. They were ranked by calculating their weighted average (thus still respecting their importance to each main concept) and, consequently, their weighted importance expressed in percentages.
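One possible reading of this ranking step is sketched below. The text does not state the weights used, so the sketch makes a loudly labeled assumption: each descriptor score is first expressed as a share of its concept's total over the common descriptors, the two shares are averaged, and the result is rescaled to percentages. The function name and all numbers are hypothetical.

```python
from typing import Dict


def weighted_importance(vqm: Dict[str, float], i40: Dict[str, float]) -> Dict[str, float]:
    """Weighted importance (in %) of descriptors common to both concepts.

    Assumption (not from the source): scores are normalized by each
    concept's total over the common descriptors before being averaged.
    """
    common = sorted(set(vqm) & set(i40))
    total_vqm = sum(vqm[d] for d in common)
    total_i40 = sum(i40[d] for d in common)
    averaged = {
        d: 0.5 * (vqm[d] / total_vqm + i40[d] / total_i40) for d in common
    }
    norm = sum(averaged.values())
    return {d: 100.0 * value / norm for d, value in averaged.items()}


# Hypothetical descriptor scores with very different totals per concept.
vqm_scores = {"simulation": 6.0, "control": 2.0, "spc": 1.0}
i40_scores = {"simulation": 30.0, "control": 30.0, "iot": 50.0}
importance = weighted_importance(vqm_scores, i40_scores)
```

Normalizing by each concept's total is what keeps a concept with much larger raw scores from dominating the ranking; other weighting schemes would be equally compatible with the description in the text.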

Practical Implications
As seen in Table 5, there can be a considerable difference between the total scores of vQM and Industry 4.0. The weighted average provides an aggregate that accurately reflects the importance of each descriptor to both main concepts, regardless of their score difference. Although each common notion has its own LSA score depending on the main concept (vQM or Industry 4.0), the ranking presented in the table was made according to the calculated weighted average of the score pairs. These theoretical results can be further expanded for the use of practitioners by mapping onto the knowledge space the adequate concrete implementations that were found in industry. The proposed solution for identifying industrial applications uses the authors' expertise and is similar to identifying relevant papers for the LSA. Firstly, the top five most important common descriptors (obtained from the LSA; see Table 5) were entered into search engines, followed by keywords extracted from the top 10 most cited "Industry 4.0" references, together with the "application" keyword, which was added to limit the amount of retrieved information. This ensured that the search was focused on practical applications and that these were related to "Industry 4.0", as its representative keywords were used. As "Industry 4.0" is considered to be the new trend in production engineering, applications related to it are also categorized as emergent. The retained applications were those that had connections with, or can be used in relation to, vQM. A list of applications is obtained, but they are not sorted in any way.

Theoretical Aspects
Focusing on the theoretical connections uncovered before, this last step of the research methodology deploys the new structure of the two-fold common knowledge space into workable results that can be applied in companies. In other words, by using the proper applications in the proper manner, based on the guidelines below, firms can employ vQM tools to reap the competitive benefits of Industry 4.0. Taking into account the development (third) dimension, another axis, the PLC, is added to the two existing axes (vQM and Industry 4.0), along which the identified applications are grouped into its stages (see Figure 1 for an overview).
There are many views when it comes to precisely defining the product life cycle stages. For this reason, the authors synthesized information from an internationally accepted standard which covers all stages of a product life cycle: Reference [69]. The stages were grouped into main stages and sub-stages accordingly. These are presented in Table 6.

Practical Implications
The affinity diagram (also known as a KJ diagram) was applied for grouping the set of applications into the corresponding life cycle stages, defining both categories and sub-categories of applications. The matrix diagram (presented in Table A1 in Appendix A) contains both these and the common notions with their weighted importance (expressed in percentages).
The cells at the intersection between main or sub-category applications and the common notions were filled out according to the correlation between them, with a number on a scale of 0 to 3 (0 means that there is no relation between the two and 3 signifies a strong relation). The sum of products between these correlation numbers and the weighted importance of the common notions was calculated on each line, thus obtaining an aggregate number that can be viewed as the importance of each application. If a main category application contains several sub-category items, its score is given by their average (indicated in bold in Table A1).
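The scoring of the matrix diagram can be illustrated with a short Python sketch. The application names, correlation ratings, and importance percentages below are hypothetical. Dividing the percentages by 100, so that the aggregate stays on the same 0 to 3 scale as the ratings, is an assumption made here for readability and is not stated in the text.

```python
from typing import Dict, List


def application_score(correlations: Dict[str, int], importance_pct: Dict[str, float]) -> float:
    """Sum of products between 0-3 correlation ratings and weighted importance."""
    for notion, rating in correlations.items():
        if not 0 <= rating <= 3:
            raise ValueError(f"rating for {notion!r} must be between 0 and 3")
    # Assumption: percentages are converted to fractions to keep a 0-3 scale.
    return sum(
        correlations.get(notion, 0) * weight / 100.0
        for notion, weight in importance_pct.items()
    )


def category_score(sub_scores: List[float]) -> float:
    """Score of a main category as the average of its sub-category scores."""
    return sum(sub_scores) / len(sub_scores)


# Hypothetical weighted importance of three common notions (sums to 100%).
importance_pct = {"simulation": 50.0, "control": 30.0, "communication": 20.0}
app_a = application_score({"simulation": 3, "control": 1, "communication": 0}, importance_pct)
app_b = application_score({"simulation": 1, "control": 2, "communication": 3}, importance_pct)
main_category = category_score([app_a, app_b])
```

In this made-up case the two sub-category applications score 1.8 and 1.7, so the main category that contains them receives 1.75.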
Having the importance scores for every sub- or main-category application, they can be ranked for all the stages of the product life cycle (in Table A1 the applications are already ranked). Thus, the matrix diagram contains a ranking of identified vQM applications in the context of Industry 4.0 that are or will become essential in each stage of the product life cycle. Where blank squares exist for a life cycle stage, no suitable applications were identified that fit the previously mentioned search conditions (see Section 4.3). All identified applications grouped into the PLC stages are presented in Figure 7.
This representation combines PLC stages with the identified significant notions and their potential applicability degree to provide an overview for potential academic and industrial users. The circles represent the main categories of applications, and their size is proportional to their importance as calculated in the matrix diagram (Table A1). The squares correspond to actual applications, and their size also represents their priority in reference to the category they belong to. The name of each application is presented below the figure.

Research Implications
The main implication of the current research is uncovered by carrying out a validation stage of the proposed LSA-based methodology. This was done through an online questionnaire, applied to groups of professionals with relevant knowledge of the actual technological developments in the automotive industry, which is considered in Romania to have the highest degree of proficiency in quality management and industrial practices, and which is also subjected to the most intense competition on a European scale.
The validation demarche raised the following challenges:

•
covering the important companies from the automotive sector such that, even if statistical relevance is not reached, this is compensated by including within the survey all or most of the major actors from this field;

•
selecting the target group such that it has access to relevant information;

•
designing and constructing the questionnaire.
Statistically speaking, there are 510 automotive companies (NACE code 29) in Romania (according to the online database of Reference [70]), but not all of them are equally relevant to our analysis. An analysis of the technological level of these companies clearly shows that their relevance to our analysis is inhomogeneous. One cannot expect to obtain similar information on new technological developments regarding Industry 4.0/vQM from Renault or Ford (big players, global automakers) and from a small firm that supplies low-complexity parts. For this reason, this study focused on the few major manufacturers from this industry branch and on questioning consultants, specialists, quality auditors or managers who draw their expertise from not just one, but several important companies and can provide answers with a high degree of relevance. This way, the answers of one individual can represent feedback from more than one company.
Considering the above-mentioned aspects, the companies were grouped into three main categories: the first two categories (A. automotive manufacturers and B. manufacturers of major components, e.g., Dacia-Renault, Ford, Bosch, Continental, Leoni, Rombat, Daimler, Pirelli, RAAL) represent manufacturers with international notoriety; all other companies were left for a third (C) category. The first category had 10 representatives, the second had around 150, and the third category included 350 companies.
The investigation was guided towards the first two categories, but there was also a concern to obtain some samples from the third category companies, mainly from those which were preoccupied with new technologies. The reason behind this approach is that SME-type companies (from the C category) are mostly focused on concrete measures of physical QM due to resource constraints, lack of maturity, and limited openness towards adopting new technologies, and due to the need to obtain quick and measurable results with a noticeable impact on their process performance and bottom line.
The surveyed target group consisted of consultants, specialists, quality auditors or even managers who drew their expertise from not just one, but several important companies and could provide answers with a high degree of relevance. An important issue addressed was information confidentiality: the research team made sure that the expertise and feedback of the focus group was captured, but that company-specific information was filtered out.
Within the survey, the abovementioned groups were provided with a list of 41 applications determined through the LSA (presented in the matrix diagram, Tables A1 and A2), and they had to rate them considering their implementation or future implementation in the companies whose manufacturing processes they were familiar with. Responders also had the possibility to name and rate additional applications that were not included, but that they considered to be important (centralized in Table A3).
The scale for rating each application was kept the same as in the case of the matrix diagram, from 0 to 3, thus enabling an objective and direct comparison between the score calculated from the matrix diagram and the average score obtained from the survey (0-"it's not likely to be used by manufacturing companies"; 1-"only some manufacturing companies have or will implement it"; 2-"most of the manufacturing companies have or will implement it"; 3-"all manufacturing companies have or will implement it").
Table 7 includes the number of responses, the number of companies covered from each type, and the percentage of coverage. As can be seen, not all of the 510 companies were covered; however, all of the type A and most of the type B companies were reached within the study.
For analyzing the responses, the chosen reference values were the ones obtained from the questionnaire, because they are considered to be the "voice" of the industry. A margin of error was calculated for each score obtained from the matrix diagram, and the findings showed that it was not more than ±8% (representing the deviation from the reference). The average margin of error was even lower, being just under 4% (representing the mean of the absolute deviation values).
The list of applications alongside their ratings and margins of error can be consulted in Table A2. It not only presents the accuracy of the research methodology proposed in this paper, but also prioritizes trending applications that companies will need for their own development, thus defining the emerging research directions.
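The comparison between the matrix diagram scores and the survey averages can be sketched as below. The text does not specify how the percentage deviation is normalized, so this sketch assumes, purely for illustration, that the deviation is expressed relative to the full rating scale (0 to 3); the score pairs are hypothetical.

```python
from typing import List, Tuple

SCALE_MAX = 3.0  # upper end of the 0-3 rating scale used in the survey


def margin_of_error(matrix_score: float, survey_score: float, scale_max: float = SCALE_MAX) -> float:
    """Signed deviation of the matrix score from the survey reference, in %.

    Assumption (not from the source): the deviation is normalized by the
    full rating scale rather than by the reference value itself.
    """
    return 100.0 * (matrix_score - survey_score) / scale_max


def average_margin(pairs: List[Tuple[float, float]]) -> float:
    """Mean of the absolute deviations over all rated applications, in %."""
    return sum(abs(margin_of_error(m, s)) for m, s in pairs) / len(pairs)


# Hypothetical (matrix score, survey average) pairs, not data from Table A2.
pairs = [(1.8, 2.0), (2.4, 2.25), (0.9, 0.9)]
individual = [margin_of_error(m, s) for m, s in pairs]
mean_abs = average_margin(pairs)
```

Normalizing by the scale avoids a division by zero when a survey average is 0; normalizing by the reference value instead would be an equally plausible reading of "deviation from the reference".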

Conclusions
The current paper focused on analyzing the common knowledge space in specific areas by intersecting the knowledge structures obtained through a concept disambiguation process. The results of this analysis were further used to determine the technological solutions that are expected to emerge from that concept intersection space. In particular, the possibility of identifying emergent industrial applications by intersecting the knowledge structures of two concepts, Industry 4.0 and vQM, was analyzed.
The investigation and disambiguation process was carried out with the help of a fairly new knowledge extraction method called latent semantic analysis (applied by a specialized software program), which is capable of analyzing large amounts of text and quickly identifying the relationship, causality, and frequency of words, thus providing an interpretation of the state-of-the-art literature (in close connection with the two main concepts, vQM and Industry 4.0). By applying this technique, the common areas of interest (common knowledge space) were identified in the form of descriptors that were further used for identifying applications that are defined by or relate to them.
The set of applications was grouped into the stages of the product life cycle according to the ISO/IEC/IEEE 15288:2015 standard (with the help of a classic sorting tool, the affinity diagram) and ranked considering their importance scores resulting from applying a matrix diagram. These calculated scores reflect the weighted importance of the common descriptors between vQM and Industry 4.0, as they also served as inputs into the matrix diagram.
The research loop was closed with a survey that focused on obtaining information from experts (from the field of quality) with close ties to the automotive industry, which confirmed the reliability of the presented methodology with an average margin of error of just under 4%.
As illustrated through the course of this paper, one can note the important advantages of this LSA-based methodology:

•
Emerging technological trends can be identified from the common knowledge space of two or more concepts;

•
The state-of-the-art knowledge discovery process (concept disambiguation) is based upon an exponentially greater number of scientific references and it is reduced to a few seconds;

•
The outputs of the LSA can be further analyzed, sorted, and grouped by deploying various tools and techniques, thus obtaining more useful information.
Considering the abovementioned key points of the methodology, it can be stated that the resulting applications are the product of current research concerns in the two analyzed fields, vQM and Industry 4.0. The constructed matrix diagram also provides a useful preview of industrial applications throughout the product life cycle and ranks them to depict those that are more critical to quality based on their importance scores.
The current framework has a high degree of portability and it can be applied to other concept associations as well for determining common areas or emergent aspects in a specific context.
In a broader understanding, the model can make forecasts for applications that are more likely to become popular in the near future. Organizations that are capable of foreseeing them and of focusing on their development and implementation have a clear competitive advantage towards achieving sustainable growth and advancement. The applicability of the methodology and of the results described in this paper is two-fold. From an academic point of view, they can provide a clarification of the concepts for use in training and instruction, as well as an approach to investigate the two domains in the future to gauge their development. On the other hand, industrial users could benefit from the identified ranking of potential applications to simplify their implementation projects and make them more efficient, allowing them to keep the focus on the bottom line and on the concrete results that are needed for market success.

Figure 1. Knowledge density zone between vQM and Industry 4.0 along the PLC stages.

Figure 4. Occurrence of keywords in one identified paper.

Figure 7. Industrial applications along the PLC stages.

Table 7. Number of responses and percentage of coverage.

Table A2. List of rated applications.

Table A3. List of applications nominated by survey responders.