Industry 4.0 Technological Advancement in the Food and Beverage Manufacturing Industry in South Africa—Bibliometric Analysis via Natural Language Processing

The food and beverage (FOODBEV) manufacturing industry is a significant contributor to global economic development, but it is also subject to major global competition. Manufacturing technology evolution is rapid and, with the Fourth Industrial Revolution (4IR), ever accelerating. Thus, the ability of companies to review and identify appropriate, beneficial technologies and forecast the skills required is a challenge. 4IR technologies, as a collection of tools to assist technological advancement in the manufacturing sector, are essential. The vast and diverse global technology knowledge base, together with the complexities associated with screening in technologies and the lack of appropriate enablement skills, makes technology selection and implementation a challenge. This challenge is premised on the knowledge that there are vast amounts of information available on various research databases and web search engines; however, the extraction of specific and relevant information is time-intensive. Whilst existing techniques such as conventional bibliometric analysis are available, there is a need for dynamic approaches that optimise the ability to acquire the relevant information or knowledge within a short period with minimum effort. This research study adopts smart knowledge management together with artificial intelligence (AI) for knowledge extraction, classification, and adoption. This research defines 18 FOODBEV manufacturing processes and adopts a two-tier Natural Language Processing (NLP) protocol to identify technological substitution for process optimisation and the associated skills required in the FOODBEV manufacturing sector in South Africa.


Introduction
The FOODBEV manufacturing industry contributes significantly to the global economy. In South Africa, the sector plays a fundamental role in economic growth and employment creation. Based on its gross domestic product (GDP) contribution, FOODBEV is the third largest manufacturing sector in South Africa, contributing 27% of the country's manufacturing GDP [1]. The sector encounters numerous challenges, including changing consumer preferences, compliance, escalating population growth, and rapid advances in disruptive technologies brought about by the advent of Industry 4.0. While the sector is growing, the impact of new technology adoption on the sector's growth and skills requirements is unclear. Most importantly, with rapidly evolving technologies, it is challenging to identify contemporary substitute production technologies.
Technological innovation is a cornerstone of the FOODBEV manufacturing industry. The mechanism and path of the rise of the manufacturing value chain can be categorized into innovation- and technology-driven paradigms [2]. Organisations like Boeing have achieved a dominant role in their respective manufacturing industries due to their smart value chains, created by adopting Industry 4.0 technologies such as Big Data Analytics (BDA), Internet of Things (IoT), and robotics [3]. The South African FOODBEV sector can adopt similar technologies across entire value chains to realize a competitive advantage, overcome production bottlenecks, and improve productivity and efficiency [2,4]. One crucial constraint is the speed of new technological advancements. Clearly demarcated methods for implementing Industry 4.0 technologies or substituting existing technologies in manufacturing entities cannot be identified in the literature [4]. This implies significant opportunities for Industry 4.0 to impact the sector and ultimately indicates a research gap. Whilst there is a wide array of information in scientific databases such as Web of Science or Scopus, it is a mammoth task to acquire specific details that guide and inform the implementation of specific technologies [5]. A bibliometric analysis (BA) has traditionally enabled the identification of research trends and patterns with the aid of information sourced from research databases. However, a BA is challenging and time-consuming: the researcher must still manually review papers and eliminate those that are not relevant, and, depending on the search criteria, the set can range from ten to hundreds of papers [6]. This points to a research opportunity to identify a more expeditious approach.
This research study thus seeks to address the research gap by exploring the overall impact of Industry 4.0 implementation in the South African FOODBEV sector. Further, this research seeks to explore which existing or new Industry 4.0 technologies can be introduced to the benefit of the sector, taking into consideration the rapidly changing landscape. Finally, the research seeks to map skills associated with newly identified technologies. The research is initiated by an extensive literature review to explore the impact of Industry 4.0 on manufacturing value chains, followed by a detailed structured methodology entailing a technologies substitution framework developed for the FOODBEV manufacturing sector.

Literature
The FOODBEV manufacturing industry encompasses any company that produces, processes, manufactures, and sells FOODBEV products [3,7]. Agricultural production and food processing are rapidly growing industries; this is due to escalating global population growth, multiplicity of consumer desires, and diverse consumer needs for processed foods [8]. In South Africa, the FOODBEV manufacturing sector accounts for ZAR 522.7 billion in terms of income contribution [7].
Over the years, the sector has experienced a significant increase in FOODBEV trade [1]. Between 2019 and 2020, the South African FOODBEV sector generated the largest share of export revenue when compared to others in the manufacturing sector [7]. Export revenue increased from ZAR 49.7 billion to ZAR 77 billion, driven largely by South Africa's weaker domestic currency. Imports experienced slower growth than exports, with a slight increase of ZAR 2.5 billion from ZAR 70 billion in 2019 to ZAR 72.5 billion in 2020 [1]. While the sector is growing, it is still unknown how the adoption of rapidly evolving Industry 4.0 technologies is influencing its growth.

The FOODBEV Industry and Industry 4.0
The FOODBEV Manufacturing Cycle (FBMC) is an essential value chain. The FBMC's basic model is maximum value addition at optimum efficiency with cost-effective production. This is valid across the spectrum from the processing of agricultural goods into ready-for-consumption commodities [8]. Due to the escalating global population and increased demand for processed foods, opportunities for product differentiation and value addition for raw goods have increased [9]. Through monitoring and controlling the factors that lead to food corruption or inefficient production processes, the primary objective of food quality and safety, from farm to fork in food manufacturing, is preserved and maintained [8]. The emergence of Industry 4.0 and associated technological drivers can enable this primary objective. 4IR refers to the innovative processes that are fully or partially automated (digitalised) through technology and devices which autonomously communicate with each other [4]. It is premised on the intelligent networking of innovative Information Technology (IT) systems, electrical equipment, and machines, thus facilitating process optimization and accelerated productivity of value chain creation activities and processes [7]. Digital transformation is a principal component of the ever-evolving industrial revolution. Digitalisation is not the simple transfer or transition from "analogic" to digital documents and data but instead signifies the networking between business processes, developed interfaces, and the exchange and management of data [10]. Manufacturing and production models are evolving via the development and formulation of digital (smart) or innovative technologies such as artificial intelligence (AI), a new generation of sensors, Machine Learning (ML), Machine-to-Machine (M2M) communication, and cloud computing (CC).
The utilisation of key enabling technologies (KETs) in existing plants enables a novel phase of automation, which in turn results in innovative and more efficient processes, products, and services [7]. Industry 4.0 offers a valuable opportunity to propel the sustainability of the food sector [3]. There are nine Industry 4.0 technologies and approaches applicable to the food industry: autonomous robots, Big Data Analytics (BDA), horizontal and vertical integration, cybersecurity, simulation, Industrial Internet of Things (IIoT), additive manufacturing, cloud, and augmented reality [11]. The adoption of Industry 4.0 technologies will propel the rapid transformation of industry and change the business abruptly, enabling faster manufacturing and the production of higher-quality food products at reduced cost [7].

Impact of 4IR in the FOODBEV Manufacturing Sector
The implementation of technologies results in benefits and challenges. The areas of impact of 4IR on the FOODBEV manufacturing sector are detailed in Figure 1 below.
Intelligent manufacturing-"Smart Factories" intertwine virtual and physical worlds by applying innovative digital technology such as Cyber Physical System (CPS), IoT, IIoT, cloud computing, AI, 3D printers, advanced robotics, data capture and analytics, advanced marketing models, hi-tech sensors, and software-as-a-service (SaaS). Many systems and functions found in a FOODBEV enterprise are embracing and adopting 4IR technologies, such as Food Quality Assurance, Enterprise Resource Planning (ERP), Facilities Management, Research and Development, and Manufacturing Execution System (MES) [12].
Systems inclusive of physical devices such as manufacturing equipment and sensor devices and ICT monitor and analyse processes, detect deviations, and trigger corrective adjustments with minimum human interaction. Using artificial intelligence (AI) techniques, the system learns from past experience through collected datasets, adjusts to new inputs from the surrounding environment, performs human-like tasks, and memorises inputs for future optimization [11].
Quality Control-The integration of digital image processing with robots entails a sequential series of activities commencing with the (1) capture of real-time images via a contactless manner, (2) visual representation in the computer, (3) automatic analysis, and (4) control commands' generation premised on measurement results or findings. This is especially advantageous during the inspection of food quality activities such as verifying the accuracy of labelling, colours, and volume of dimensions [9]. Analytical monitoring activates necessary adjustments based on detected deviations to fulfil the standards of food safety and facilitate the early or prompt detection of defects, thus subsequently minimising food wastages or costly food recalls. Simultaneously, the technology automatically stores data for evidence and documentation purposes should a customer raise complaints in the future [11].
Food Traceability System-Traceability is costly and challenging with an increase in value chain complexity. This complexity can be integrated to novel attributes of the food material components undergoing dynamic transformation from raw material to individual food products. Radio Frequency Identifiers (RFIDs) have been applied in chicken meat tracing [13]. The system is applied throughout the entire value chain, from the farmhouse to the slaughterhouse, food processing factory, retailer, and consumer. Traceability information is collected and registered via RFID readers and transmitted to a central database [14]. At particular stations, devices exist whereby consumers are able to read data about the chicken meat from the central database. A cheaper alternative for a smart traceability system is the Quick Response (QR) code, whereby consumers are able to obtain data pertaining to the specific food item via scanning the code. This is also achievable through utilising a reader application which is installed on a smartphone [11].
Manufacturing Design-Within plant operations, 4IR is elevating simulations. Simulation software leverages real-time information or data and models actual manufacturing ecosystems in a virtual model to include materials, processes, machines, processing lines, humans, and material handling systems [7]. Testing, analysis, and optimisation are executed virtually, prior to any physical changeover at the real factory. An example is the design of a new brewery that allows the simulation of the holistic production process and the evaluation of numerous planning strategies beforehand [10]. Reliable decision making leads to effective cost saving and planning. Potential production failures or downtime can thus be identified as early as the start-up phase [11].
Automation for Repetitive Tasks-Loading or unloading, assembly, packaging, palletization, sorting, and piling are common in the food sector and are a robot's "specialization". Despite the slow robotic implementation progress in replacing the human workforce, the potential of robots is encouraged by robot manufacturers due to advantages such as achieving hygiene or food safety standards and requirements, resource efficiency improvement, simplification of maintenance, and human injury prevention [8]. The gripper technology is a sub-system of the equipment which gets into contact with gripped objects, with advantages such as not leaving visible marks on items after gripping and hygiene standards. Such technology has eliminated the need for pipes or tubes that are difficult to clean [11].
Marketing-Augmented Reality (AR) is reshaping marketing. Current mobile technology improvements in built-in cameras, sensors, computational resources, and mobile cloud computing enable AR on mobile devices [8]. AR allows the consumer to engage personally with products as if the products were proximal to them. This cuts expenditure on resources, logistics, marketing personnel, and advertising material. The technology stores and provides instantaneous data about customer behaviour and feedback without a conventional post-purchase survey [11].
Training-AR supports enhanced learning and training. Trainees can comprehend the subject faster than with conventional learning and can do so without disturbing real production activities [11].
Customer management-Accomplishing individual customer preferences has affected areas such as product design, order management, R&D, commissioning, shipment, utilisation, and the recycling of products. Considering the increasing popularity of individualism in customer requirements, 3D printers or additive manufacturing technologies are used in food fabrication [15]. Based on the desired layout configuration, recipe, or shape, raw materials are deposited in sequential layers during the manufacture of products by a 3D food printer. Binding printers can adhere materials together with edible cement [9]. The 3D food printers possess more advanced technology, featuring lasers, high-tech nozzles, syringes, and robotic arms which work with powdery material so as to manufacture customized patterned chocolate or geometrically different pastry [11].
Technological developments enable higher manufacturing efficiency rates and lower production costs, which is critical for a manufacturing organisation's competitive advantage. Moreover, food security and safety have recently been a great concern and a top global priority [16]. 4IR technologies facilitate the overall optimisation of the FOODBEV processing cycle and improve food security. A key consideration is the identification and implementation cycle for new technologies or the process of digitalisation.

Digitalisation
Digitalisation is the transformation of services and products via the usage of digital technologies so as to entirely replace them or enhance their features [17]. Several emerging Industry 4.0 technologies are converging to provide digital solutions. Prior works have proposed maturity models for implementing these technologies [4]. Others have studied the impact of such technologies on industrial performance [18]. Though these studies provide useful details on the implementation aspects of digital manufacturing, none of them enables a holistic comprehension of the processes for implementation [14].
In 2019, Frank et al. [14] proposed a conceptual framework for Industry 4.0 implementation, as illustrated in Figure 2. At the centre of the framework are the "front-end technologies", which transform manufacturing activities based on emerging technologies (Smart Manufacturing) and products (Smart Products) and, thus, are concerned with operational and market needs. The framework also considers raw materials and product delivery (Smart Supply Chain) and new ways in which workers perform activities based on emerging technology support (Smart Working). The front-end layer relies on the "base technologies" layer, which allows front-end technologies to be integrated in a complete manufacturing system.

Knowledge Extraction for Digitalisation
Given the rapid rate at which Industry 4.0 technologies are evolving, it is challenging to decipher which ones (existing or current) can be implemented in FOODBEV manufacturing across entire value chains [19]. Furthermore, there is a wide array of information on FOODBEV manufacturing processes scattered across the internet in diverse formats, such as internet sources, journal publications, and white papers [5]. Searching for information on a specific production process is time-consuming and strenuous; thus, researchers require tools that aid bibliographic searches in huge collections of this nature [5]. Common bibliographic search engines utilise keyword queries, which are too limited to account for such variability. For instance, a keyword query such as "cheese" fails to retrieve all relevant information because it misses documents where a proper cheese name ("Brie") is used instead of the term "cheese". Queries which include all cheese names are not practical to formulate and maintain. Moreover, keyword queries are unsuitable for retrieving processes and waste or by-products produced during manufacture [20]. The rapid evolution of Industry 4.0 technologies and the vast FOODBEV manufacturing information available in a wide array of formats thus present a challenge in gathering precise information on FOODBEV manufacturing processes or the associated Industry 4.0 technologies utilised.
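The limitation of literal keyword queries described above can be illustrated with a minimal Python sketch. The document titles and the synonym table are invented for illustration and are not data from this study; the sketch only shows why a literal match on "cheese" misses a document that names "Brie", and why a hand-maintained expansion list is needed to recover it.

```python
# Toy corpus; the titles are invented for illustration only.
DOCUMENTS = [
    "Ripening conditions for cheese in small dairies",
    "Surface mould control during Brie maturation",
    "Sensor-based monitoring of beer fermentation",
]

# Hypothetical, hand-maintained synonym table (not from the study).
EXPANSIONS = {"cheese": ["cheese", "brie", "cheddar", "gouda"]}


def keyword_search(query, documents):
    """Return documents containing the literal query term (case-insensitive)."""
    term = query.lower()
    return [doc for doc in documents if term in doc.lower()]


def expanded_search(query, documents, expansions):
    """Return documents containing the query term or any listed variant."""
    terms = expansions.get(query.lower(), [query.lower()])
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]


plain_hits = keyword_search("cheese", DOCUMENTS)
expanded_hits = expanded_search("cheese", DOCUMENTS, EXPANSIONS)
print(len(plain_hits), len(expanded_hits))
```

The expanded query finds the Brie document, but only because the variant was listed by hand, which is the maintenance burden the paragraph above refers to.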

Mapping Skills Associated with Newly Identified Technologies
The main observed consequence of technological changes on the food industry is the fast-growing demand for technological skills, ranging from basic digital skills to advanced technological skills such as programming [12]. Awareness of data security, data processing, and data protection gains more importance due to this demand. The demand for emotional or social skills (which machines are a long way from learning) is increasing rapidly due to advanced technology adoption. Because of the increasing digitalisation and automation of industrial process activities, the labour force is required to execute more complex tasks [19]. Executing these tasks entails solid literacy, numeracy, problem-solving techniques, and ICT skills and the soft skills of autonomy, coordination, and collaboration. Higher cognitive skills, for example critical thinking, creativity, problem solving, teamwork, lifelong learning, and decision making, are becoming critical [15]. Skills such as independent problem solving, critical thinking, and decision making will become particularly critical in technocratic job profiles, such as control technicians or production operators. Demand for organisational, communication, and managerial skills will significantly increase [12].

Summary of Literature Review
The knowledge covered in the literature review alludes to the time-consuming and mammoth task of identifying Industry 4.0 technologies that can be implemented or substituted in the FOODBEV manufacturing industry. This is due to the wide array of information on scientific databases and its varying structure.
The research gap is thus the lack of a dynamic approach that is time- and effort-effective for extracting information on Industry 4.0 technologies that can be substituted in the FOODBEV manufacturing sector. This study develops a digital tool, based on NLP, that can effectively extract and store data on Industry 4.0 technologies from various sources. These data are then accumulated in a single structure for analysis, so that 4IR technologies can be identified for substitution in specific processes, together with the skills required for the implementation and operation of the identified technology.

Introduction
The research approach of this study is quantitative and premised on knowledge management, which is subdivided into the extraction of unstructured knowledge and the manipulation of knowledge so that it is managed in a structured manner [19]. The literature highlights the availability of immense information on the FOODBEV industry manufacturing processes, existing mostly in unstructured formats [5]. The literature further reveals the availability of Industry 4.0 technologies for addressing various challenges in the FOODBEV sector. However, knowledge extraction and the management of relevant content of applicable Industry 4.0 technologies for the FOODBEV industry is a time-consuming and colossal task [20]. Knowledge extraction and content management methods are complemented by systems such as ERP and MES and AI using Structured Query Language (SQL) and NLP [19]. The methodology thus seeks to consolidate both knowledge extraction and management in order to develop a smart digital tool that facilitates the efficient and automated digital extraction of information [19]. The search is premised on a multi-tier NLP optimised keyword search [5]. Figure 3 illustrates the various techniques adopted in developing the smart technological substitution and associated skills tool for the FOODBEV manufacturing sector.

FOODBEV Specific Knowledge
Background information gathering commences with an extensive literature review to collate information on the various FOODBEV manufacturing value chain categories. The current structure of the FOODBEV sector in SA is referenced. The construct of the background is based on Frank et al. [14], whereby the process and the systems are considered in a structured manner. Figure 4 illustrates the structure for the initial data extraction.

The information sources for the various manufacturing processes include peer-reviewed journal publications and the South African FOODBEV SETA website. The knowledge is both qualitative and quantitative. Qualitative information is gathered to identify the FOODBEV value chains categories, such as dairy and wine, whereas quantitative information is gathered by collating all the variables that are monitored or controlled during production, such as pressure and temperature [16]. This knowledge provides a foundation upon which the research and development of the digital knowledge management tool is based and defines the overall structure in alignment to Frank et al. [14].

Upon summarising the FOODBEV value chains categories, further knowledge gathering is executed to gather process steps of each value chain category, from raw material procurement to final packaging and the distribution of the manufactured product. The MES and ERP are typical system software that are used in manufacturing environments to assist smart and efficient product production. Identifying MES or ERP tasks that are relevant for each process activity is critical in offering insight into which Industry 4.0 technologies may be offered for replacement. The information acquired for each value chain category is saved on multiple sheets in an Excel file to improve the readability and migration of this information into an SQL database, as explained in successive steps. Refer to Figure 4 for the structure adopted in configuring the tables.

Database Migration
Data recorded in the Excel sheets are transferred to a Structured Query Language (SQL) database, thus enabling the storage and access of all information (value chain process steps and variables monitored) from a single database [20], unlike the multiple sheets in Excel. SQL is more advantageous compared to Excel, since Excel is slow when dealing with huge datasets [21]. This may not be an issue presently, but as we further develop the database in the future and add more FOODBEV manufacturing processes, data volumes will increase; thus, SQL will be advantageous. Adding more products necessitates the creation of more Excel spreadsheets, which would result in even more complications, such as storage. Furthermore, the SQL database enables easier extraction of information for the development of a Graphical User Interface (GUI) and Python coding. The development of the GUI with the aid of Python is detailed below.
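As a minimal sketch of this migration step, the Python example below copies rows from several per-process sheets into a single SQL table. It assumes SQLite as the database engine and assumes the sheet rows have already been read into Python (for example with a spreadsheet library); the table name, column names, and sample rows are hypothetical and are not taken from the study's actual schema.

```python
import sqlite3

# Hypothetical stand-in for rows already read from the Excel sheets;
# the process steps and variables shown are illustrative only.
SHEETS = {
    "beer": [
        {"step_no": 1, "process": "Mashing", "variables": "time, temperature"},
        {"step_no": 2, "process": "Fermentation",
         "variables": "time, temperature, pressure"},
    ],
    "cheese": [
        {"step_no": 1, "process": "Pasteurisation", "variables": "time, temperature"},
    ],
}


def migrate(conn, sheets):
    """Copy every sheet into one table, tagging rows with their sub-process."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS process_steps (
               sub_process TEXT NOT NULL,
               step_no     INTEGER NOT NULL,
               process     TEXT NOT NULL,
               variables   TEXT,
               PRIMARY KEY (sub_process, step_no)
           )"""
    )
    rows = [
        (name, r["step_no"], r["process"], r["variables"])
        for name, sheet in sheets.items()
        for r in sheet
    ]
    # Parameterised inserts avoid building SQL strings by hand.
    conn.executemany("INSERT INTO process_steps VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database for the sketch
inserted = migrate(conn, SHEETS)
beer_steps = conn.execute(
    "SELECT process FROM process_steps WHERE sub_process = ? ORDER BY step_no",
    ("beer",),
).fetchall()
print(inserted, beer_steps)
```

Keeping all sub-processes in one table keyed by sub-process and step number means that adding a new FOODBEV process adds rows rather than new spreadsheets.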

Graphical User Interface (GUI) Formulation
A GUI is created to enable the user to easily extract the required information from the database, without prior knowledge or the use of SQL [20]. The stored information in the database is used to concatenate keyword phrases that are subsequently employed in the program for knowledge extraction. The GUI is developed to make the tool user-friendly and allow users to concatenate a string of keyword phrases to use in search of particular information. The GUI is essentially a facilitated mechanism to develop a search string based on current FOODBEV specific information.
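The concatenation of keyword phrases can be sketched as follows. The function name, its parameters, and the sample selections are assumptions made for illustration; in the tool described here, the selections would come from the database through the GUI rather than from literals.

```python
def build_search_string(sub_process, step, technology_terms):
    """Join the user's selections into one keyword query string."""
    parts = [sub_process, step] + list(technology_terms)
    # Drop empty selections and surrounding whitespace.
    cleaned = [p.strip() for p in parts if p and p.strip()]
    # Quote multi-word phrases so a search engine treats each as a unit.
    quoted = [f'"{p}"' if " " in p else p for p in cleaned]
    return " ".join(quoted)


# Illustrative selections only.
query = build_search_string("beer", "fermentation", ["Industry 4.0", "sensor"])
print(query)
```

Quoting multi-word phrases is a design choice of this sketch, not a documented feature of the tool.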

Knowledge Extraction
Web scraping is an automated technique used to extract information from websites using programming languages [21]. It enables the retrieval of large volumes of data in a short amount of time, which is advantageous in a world where information is vast and rapidly updating and expanding [21]. Web scraping typically consists of two programs: a crawler and a scraper. The crawler is responsible for systematically navigating the web by following links. Its main objective is to discover and index web pages, such as those found on Google Scholar [22]. On the other hand, a scraper is designed to extract specific data from the pages which are discovered or indexed by the crawler [21]. In this work, both crawler and scraper programs were implemented using Python programming language.
The crawler program takes the concatenated string from the GUI as input. The string is inserted into the Google Scholar search URL [23]. Subsequently, the program traverses Google Scholar, indexing the first 50 pages of search results. The scraper program is then employed to extract data, such as the URL link to the article, article title, date of publication, and citation count. These details are subsequently stored in the SQL database. These stored data are further categorized into "highly cited" and "most recent" publications [21]. The two categories are compared, and duplicates are removed. Using the URL link to the article, the 50 "highly cited" articles and 50 "most recent" articles are downloaded and stored in PDF format.
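Two parts of this step lend themselves to a short Python sketch: building one search URL per results page, and splitting the scraped records into the two categories without overlap. The sketch assumes that Google Scholar accepts the `q` and `start` query parameters and returns ten results per page; the record fields and sample values are hypothetical, and removing duplicates from the "most recent" list is only one possible reading of the deduplication described above. No network request is made here.

```python
from urllib.parse import urlencode

RESULTS_PER_PAGE = 10  # assumption: default Google Scholar page size


def scholar_urls(query, pages):
    """Yield one search URL per results page for the given query."""
    base = "https://scholar.google.com/scholar"
    for page in range(pages):
        params = {"q": query, "start": page * RESULTS_PER_PAGE}
        yield f"{base}?{urlencode(params)}"


def categorise(records, top_n=50):
    """Split records into highly cited and most recent, without overlap."""
    highly_cited = sorted(records, key=lambda r: r["citations"], reverse=True)[:top_n]
    seen = {r["url"] for r in highly_cited}
    by_year = sorted(records, key=lambda r: r["year"], reverse=True)
    # Skip anything already selected as highly cited.
    most_recent = [r for r in by_year if r["url"] not in seen][:top_n]
    return highly_cited, most_recent


# Invented records standing in for scraper output.
RECORDS = [
    {"url": "a", "citations": 120, "year": 2015},
    {"url": "b", "citations": 80, "year": 2023},
    {"url": "c", "citations": 5, "year": 2024},
    {"url": "d", "citations": 40, "year": 2019},
]

urls = list(scholar_urls("beer fermentation sensor", 2))
cited, recent = categorise(RECORDS, top_n=2)
print(urls[1], [r["url"] for r in cited], [r["url"] for r in recent])
```

Any real crawler built on this would also need to respect the site's terms of use and rate limits, which the sketch does not address.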

Natural Language Processing (NLP)
Following from the knowledge extraction above with the aid of web scraping, natural language processing (NLP) is then used to process the downloaded articles. NLP is a subfield of artificial intelligence and computational linguistics that is concerned with the extraction of meaning from text [24]. A notable application of NLP is text summarization, which efficiently condenses large volumes of text while retaining valuable information. Extractive text summarization is one example of text summarization that creates a summary by extracting a subset of existing words, phrases, or sentences from the original text [25]. This type of summarization uses a statistical approach such as Term Frequency-Inverse Document Frequency (TF-IDF) for selecting important words, sentences, or phrases in a document. TF-IDF is a numerical technique which reflects how important a word is to a document [24]. The TF-IDF technique generates a matrix consisting of the documents as column names, and the rows represent the extracted key words. The entries of the matrix represent the frequency count of the keywords in each document. A high frequency count indicates the importance of the key word in the document. Based on this information, this TF-IDF approach is adopted for this study. Firstly, the downloaded documents undergo pre-processing using the NLP Python NLTK (Natural Language ToolKit) package [25]. This involves tasks such as removing punctuation and common words (e.g., "is", "the", "as") and converting all words to lower case. Once the data are pre-processed, a TF-IDF matrix is constructed. This matrix is made of document titles and the frequency count of each key word.
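The pre-processing and matrix construction can be sketched in plain Python as below. This is a minimal illustration rather than the study's implementation: a small hand-written stop-word set stands in for the NLTK stop-word list, the sample texts are invented, and the weighting is the textbook form (relative term frequency multiplied by the logarithm of the inverse document frequency). Under this weighting a term that occurs in every document scores zero, so the entries are weighted scores rather than raw frequency counts.

```python
import math
import re
from collections import Counter

# Assumption: a tiny stop-word set standing in for NLTK's English list.
STOP_WORDS = {"is", "the", "as", "a", "an", "and", "of", "in", "to", "for"}


def preprocess(text):
    """Lower-case the text, strip punctuation, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf_matrix(documents):
    """Return {term: {title: score}}: rows are terms, columns are documents."""
    counts = {title: Counter(preprocess(text)) for title, text in documents.items()}
    n_docs = len(counts)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for c in counts.values() for term in c)
    matrix = {}
    for term, df in doc_freq.items():
        idf = math.log(n_docs / df)
        matrix[term] = {
            title: (c[term] / (sum(c.values()) or 1)) * idf
            for title, c in counts.items()
        }
    return matrix


# Invented sample texts keyed by document title.
DOCS = {
    "Paper A": "The sensor monitors the fermentation temperature.",
    "Paper B": "Robots automate the packaging line; the sensor is optional.",
}
matrix = tfidf_matrix(DOCS)
print(sorted(matrix))
```

In this sample, "sensor" appears in both documents and therefore receives a score of zero, whereas "fermentation" is distinctive to Paper A.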

NLP and Final Data Filtering
The extracted keywords are presented to the user. The user selects relevant keywords, and the program retrieves the most relevant articles for the user. The skills required for the selected technology are also compiled via an NLP analysis utilising skill-related keywords. This forms the basis of the skills recommendations sent to the user together with the relevant filtered papers.
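A possible form of this final filtering step is sketched below. The function name `shortlist`, the input layout, and the rule that a paper is retained only if it contains every user-selected keyword are assumptions for illustration, not details confirmed by the study.

```python
def shortlist(keyword_counts, selected, top_n=10):
    """Rank papers that contain all user-selected keywords.

    keyword_counts maps a paper title to {keyword: count}.
    Papers are ranked by the summed count of the selected keywords.
    """
    scores = {}
    for title, counts in keyword_counts.items():
        # Assumed relevance rule: every selected keyword must occur.
        if all(counts.get(keyword, 0) > 0 for keyword in selected):
            scores[title] = sum(counts[keyword] for keyword in selected)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

The same function could be applied with a list of skill-related keywords in place of the user-selected technology keywords to compile the skills recommendations.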

Results Discussion
The methodology is applied in a structured manner in order to achieve the goals of this study. The block diagram in Figure 5 summarises the findings, which are explained in the paragraphs that follow. Reference [16] further categorized the five FOODBEV categories into manufacturing processes, as illustrated in Table 1. A literature review is conducted to define the processes of the five FOODBEV categories by filtering journal publications. From the five (5) FOODBEV value chain processes, eighteen (18) sub-processes in total are identified. Whilst this categorisation could have been achieved via a conventional bibliometric analysis [23], the categorisation is guided by the South African FOODBEV SETA and is a structured approach towards achieving the study objectives.
Sequential production steps for each of the eighteen sub-processes are then gathered from the literature, and eighteen Excel spreadsheets of collated information are compiled. Table 2 illustrates an Excel sheet for beer production. The first column (titled process) identifies each manufacturing production step. Each step is documented so that, as one searches for technology substitution information, the precise technologies applicable to each step are identified. Variables monitored during production (time, temperature, and pressure) and MES and ERP systems are also captured.

Database Migration and Front End
Following the collection of the process information in Excel spreadsheets, as illustrated in Table 2, the data are migrated to an SQL database. All information from the eighteen Excel spreadsheets is stored in one SQL table called "Product-Process", as illustrated in Figure 6. The migration to SQL is conducted to gain the storage and data-volume advantages over Excel noted in the methodology.
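The consolidation of several sheets into the single "Product-Process" table can be sketched as follows. The example uses Python's built-in sqlite3 module as a stand-in for the study's SQL server, and the column names and the in-memory layout of the sheets are hypothetical; only the table name comes from the text.

```python
import sqlite3


def migrate(sheets, conn):
    """Load all sheets into one "Product-Process" table.

    sheets maps a product name to a list of
    (process_step, variables, system) rows (assumed layout).
    Returns the number of rows inserted.
    """
    # The table name contains a hyphen, so it must be quoted in SQL.
    conn.execute(
        'CREATE TABLE IF NOT EXISTS "Product-Process" ('
        "product TEXT, process_step TEXT, variables TEXT, system TEXT)"
    )
    rows = [(product, *row) for product, sheet in sheets.items() for row in sheet]
    conn.executemany('INSERT INTO "Product-Process" VALUES (?, ?, ?, ?)', rows)
    conn.commit()
    return len(rows)


def process_steps(conn, product):
    """Return the process steps of one product in insertion order."""
    cursor = conn.execute(
        'SELECT process_step FROM "Product-Process" '
        "WHERE product = ? ORDER BY rowid",
        (product,),
    )
    return [row[0] for row in cursor]
```

A query such as `process_steps(conn, "Beer")` is the kind of lookup a front end could use to populate its process-step selection for a chosen product.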

GUI Development
Following the development of the SQL database, the Excel process structures and supporting data are imported into SQL. The Python programming language is used to connect the SQL database to the GUI via the Python library pyodbc. The product type, process step, operating conditions, and 4IR system are extracted and displayed in the Graphical User Interface. The user is guided via the GUI to configure their search for new technologies based on their specific needs, as illustrated in Figure 7.

Figure 7. Front-end application to enable user-friendly applicability.
The system guides the user in a sequential, structured, and detailed manner. The user keywords, once selected, are displayed in the right-hand block of the GUI. The user validates their search and executes it by pressing the "Keyword search" icon, as illustrated in Figure 7.
The user keywords are converted into a logical search string, as illustrated in Figure 8. The search string is passed to Google Scholar. The search seeks papers containing the user-specified strings and retrieves the 50 most recent papers and the 50 most cited papers. This extraction configuration is intended to maximise the likelihood that the extracted content is relevant and recent. The Google Scholar search is constrained to the last 10 years of publications.
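The conversion of user keywords into a search string and a query URL can be sketched as below. The TITLE-ABS-KEY layout follows the search strings shown in the paper; the function names, the Google Scholar URL layout, and the use of the `q` and `as_ylo` query parameters for the keyword query and the lower year bound are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_search_string(keywords):
    """Join the keywords into a logical AND search string."""
    quoted = " AND ".join(f'"{keyword}"' for keyword in keywords)
    return f"TITLE-ABS-KEY ({quoted})"


def build_scholar_url(keywords, current_year, window_years=10):
    """Build a query URL restricted to the last `window_years` years."""
    params = {
        # Quoted terms ask for exact matches of each keyword.
        "q": " ".join(f'"{keyword}"' for keyword in keywords),
        # Assumed parameter for the earliest publication year.
        "as_ylo": current_year - window_years,
    }
    return "https://scholar.google.com/scholar?" + urlencode(params)
```

For example, `build_search_string(["Chocolate", "manufacture"])` produces `TITLE-ABS-KEY ("Chocolate" AND "manufacture")`.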

Figure 8. Logical search strings.
TITLE-ABS-KEY ("Chocolate" AND "manufacture" AND "coaching" AND "Temperature" AND "Pressure" AND "ERP" AND "MES")
TITLE-ABS-KEY ("Diary" AND "manufacture" AND "tempering" AND "Pressure" AND "ERP" AND "MES")
TITLE-ABS-KEY ("Chocolate" AND "manufacture" AND "moulding" AND "Temperature" AND "Pressure" AND "ERP" AND "MES")
TITLE-ABS-KEY ("Chocolate" AND "manufacture" AND "shaping" AND "Pressure" AND "ERP" AND "MES")
TITLE-ABS-KEY ("Dairy" AND "manufacture" AND "baking" AND "Temperature" AND "Pressure" AND "ERP" AND "MES")
TITLE-ABS-KEY ("Chocolate" AND "manufacture" AND "enrobing" AND "Temperature" AND "Pressure" AND "ERP" AND "MES")
For instance, for the search string of TITLE-ABS-KEY ("Chocolate" AND "manufacture" AND "coaching" AND "Temperature" AND "Pressure" AND "ERP" AND "MES"), the Google Scholar results are illustrated in Figure 9.
Each paper is stored with a unique ID, the paper title, the URL of the extracted paper, the number of citations, and the published year. Python is then configured to filter and remove duplicates. Using the "url of paper", the highest cited and most recent papers are downloaded and stored in PDF format.

Figure 9. Index of stored papers.
The next step is to use NLP to extract frequent words from the stored PDFs. The NLP is configured to exclude "common words", and the research team has developed a protocol that stores common words in a reserve list. Upon review, new common words can be added to the master common word list. This ensures the system becomes more focused. The remaining words found in the list of papers are then filtered, classified, consolidated, and exported to the user and displayed in the form of a GUI. A typical result is illustrated in Figure 10 with the paper, the table headers, and the keywords on the horizontal axis.
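The common-word protocol can be illustrated with a small sketch. The function names, the use of plain Python sets for the reserve and master lists, and the simple alphabetic tokeniser are assumptions for illustration rather than the study's implementation.

```python
import re
from collections import Counter


def frequent_words(text, master_common, top_n=5):
    """Return the most frequent words that are not in the master list."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in master_common)
    return counts.most_common(top_n)


def promote(master_common, reserve, approved):
    """Move reviewed words from the reserve list to the master list."""
    for word in approved:
        if word in reserve:
            reserve.discard(word)
            master_common.add(word)
```

After a review, a call such as `promote(master_common, reserve, ["improves"])` excludes the approved word from all subsequent keyword extractions, which is how the filter becomes more focused over time.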
The filtered keywords provide the user with the opportunity to refine the search based on the available knowledge. As the search keywords are selected, the Python code filters the papers and provides a final shortlist. The user then has the option to receive the full papers for review. The Python code runs a skill-related NLP analysis and presents the user with the keywords with the highest occurrence per paper, arranged in hierarchical order (see Table 3), together with the associated skills required by the technologies identified. The outputs of the Python model and filtering, as illustrated in Table 3, provide technological insights for FOODBEV process substitution.

Data Extraction: Sample of Results
The research team conducts a set of trial runs to identify digital substitutions and skills. For this purpose, 12 searches with variable constraints are conducted. The results are illustrated in Table 4 below.
Table 4. Paper extraction results.

Columns: Search String; User-Selected Keywords; Results (Technologies); Skills
Production, processing, and preservation of meat, fish, fruit, vegetables, oil, and fats
Qualitative spectroscopy and chemometrics [27]: innovative technology that makes the counterfeiting or falsifying of fish products difficult.
['Salting', 'Processing']
Ultrasound instrument for meat salting in pork [28]: enhancement of salt distribution during meat processing, thus compliance to quality standard of processed meat.
Near-Infrared Spectroscopy (NIRS) for salted composition diagnostics [29]: diagnosis of minced meat at varying temperatures using NIRS.
Food Science [29]; Electronics [29]; Quality Controller [28]
Manufacture of food preparation products
Adoption of automation and robotics in precision agriculture [9]: use of robotics equipment to enable farmers to execute agricultural operations in a timely manner, such as planting, inspection, and spraying with minimum costs.
Robotics in packaging of farm produce via HSV analysis [30]: robot utilisation to package farm produce based on colour or size.
Robotics [30]; Electrical Engineering and Computer Science [9]; Agronomy [9]
Convolutional Neural Networks
3D printing [33]: emerging as a popular technique for customised food manufacture.
Cloud manufacturing (CM) [34]: service-oriented business model to share manufacturing capabilities and resources on a cloud platform via collaborative design, greater automation, improved process resilience, and enhanced waste reduction, reuse, and recovery.
Food Science [32]; Process Control [34]; IT [34]; AI and robotics [32]
['coffee', 'by-product', 'processing', 'sustainable']
Dry processing of coffee silverskin (CSS) [35]: dry processing of CSS to produce wood polymer composites due to its high fibre content and antioxidant properties. Significant reduction in energy-consuming dry processing of CSS since it demands less thermal processing energy.
Integrated bio-refinery valorisation [36]: valorisation of food waste via integrated bio-refinery approaches to produce pharmaceutical, cosmetic, food, and non-food applications.
Thiobarbituric Acid Reactive Substances (TBARS) Assay [37]: the discovery of the addition of two levels of CSS to new formulations of chicken meat burgers. Innovative addition of natural ingredients derived from coffee silverskin in the new formulations of chicken meat burgers, thus limiting food waste via formulation of new chicken burgers.
Culinary arts [37]; Biotechnology [36]; Biochemistry [36]; Pharmacy [35]
Manufacture of beverages
['design']
Hazard Analysis Critical Control Points [38]: simple, specialized method to prevent health hazards emanating from consuming contaminated food and beverages.
Replacement of malted by raw barley [38]: raw barley can be processed by hammer mills (roller mills also). Hammer mills ensure efficient extraction in raw barley compared to malted barley by producing finer grist and larger surface area for enzymatic hydrolysis of endosperm.
Analytical Skills [39]; Industrial Engineering [13]
Manufacture of breakfast products
Industrial durum wheat processing and addition of additives in pasta manufacture [41]: industrial durum wheat processing using a durum mill and the addition of additives, whose quality definition dictates the control of fungal phytopathogens to control fungal disease.
Agronomy [41]; Chemistry [41]
['corn', 'germ', 'pro-
['process']
['preparation']
Reduced-lactose ice cream using dried rice protein concentrate (DRPC) [43]: this by-product from the milling of rice is used as a new technology substitution with low-fat, low-lactose properties.
Date syrup substituent [44]: use of date syrup to alter the viscosity of ice cream and as a substitute for sugar in ice cream manufacture.
Fat replacers with different fat content in ice cream manufacture [45]: carbohydrate-based fat replacers like maltodextrin, inulin, or modified tapioca starch on the sensory and physical properties of reduced-fat and low-fat coconut milk ice cream containing different fat levels.
Non-nutritive sweeteners in ice cream manufacture [46]: sweeteners with low nutritional and low calorie values in ice cream manufacture.
Thermal treatment to ensure the sensory quality of non-dairy-based additives as substituents of milk [47]: Ultra-High Temperature (UHT) as heat treatment of plant-based beverages which are used as substituents of milk in the dairy industry.
High-Pressure Homogenisation (HPH) or Ultra-High-Pressure Homogenisation [48]: heat treatment to improve stability of plant-based emulsions and their physicochemical properties.

Conclusions
The research team identifies the challenges associated with the transition to digital in the FOODBEV industry, which is essential for sustainability. These challenges stem from the diversity of the FOODBEV and 4IR research domains and from the time and effort required to acquire specific knowledge. The failure of keywords to extract all relevant information during search queries further adds to this challenge. In order to facilitate a smart structure and expedited skills and technological responses, an NLP-enabled tool is developed. The development and enablement protocol for the tool is detailed. Five searches across the various FOODBEV categories are presented, identifying potential technology for substitution and the associated skill requirement. The research team's recommendations based on this research are threefold. First, the toolset should be enabled by piloting the system so that the AI models can gather learning data. Second, the AI-based toolset should be embedded into the FOODBEV website, based on advanced integration with the UJ servers. Finally, skills development and support should be provided for the internal adoption of the advanced toolset for technological substitution.