A Natural Language Interface to Relational Databases Using an Online Analytic Processing Hypercube

Structured Query Language (SQL) is commonly used in Relational Database Management Systems (RDBMS) and is currently one of the most popular data definition and manipulation languages. Its core functionality is implemented, with only some minor variations, throughout all RDBMS products. It is an effective tool in the process of managing and querying data in relational databases. This paper describes a method to effectively automate the conversion of a data query from a Natural Language Query (NLQ) to Structured Query Language (SQL) with Online Analytical Processing (OLAP) cube data warehouse objects. To obtain or manipulate the data from relational databases, the user must be familiar with SQL and must also write an appropriate and valid SQL statement. However, users who are not familiar with SQL are unable to obtain relevant data through relational databases. To address this, we propose a Natural Language Processing (NLP) model to convert an NLQ into an SQL query. This allows novice users to obtain the required data without having to know any complicated SQL details. The model is also capable of handling complex queries using the OLAP cube technique, which allows data to be pre-calculated and stored in a multi-dimensional and ready-to-use format. A multi-dimensional cube (hypercube) is used to connect with the NLP interface, thereby eliminating long-running data queries and enabling self-service business intelligence. The study demonstrated how the use of hypercube technology helps to increase the system response speed and the ability to process very complex query sentences. The system achieved impressive performance in terms of NLP and the accuracy of generating different query sentences. Using OLAP hypercube technology, the study achieved distinguished results compared to previous studies in terms of the speed of the response of the model to NLQ analysis, the generation of complex SQL statements, and the dynamic display of the results.
As a plan for future work, it is recommended to use infinite-dimension (n-D) cubes instead of 4-D cubes to enable ingesting as much data as possible in a single object and to facilitate the execution of query statements that may be too complex in query interfaces running in a data warehouse.


Introduction
Natural Language Processing (NLP) is an important part of Artificial Intelligence (AI) used to create intelligent models that simulate human thinking. Due to its advanced capabilities, it can reduce the gap between machines and humans [1]. The primary goal of processing NLQs is a model that is able to translate English sentence structures [2] into processable machine code.
NLP has been used in developing systems to translate Natural Language (NL) sentences into SQL [3,4]. The user enters a query in English, and it is then translated into an SQL query [5]. There are many difficulties in converting NLQs to SQL queries. One is ambiguity: a single word may be mapped to several meanings [5,6]. Another is the development of complex SQL queries that are executed on multiple database objects [7].
AI 2021, 2, 721
For users who are not fully familiar with complex database query languages such as SQL, accessing databases via questions in NL and obtaining the query results in an understandable format are not easy tasks [8]. NLI systems are intended for users who work with databases but lack SQL experience. Several studies have examined various solutions using NLP interfaces, and this remains an active research field. The focus of this research is on NLP methods for converting queries into SQL statements, in addition to the challenges associated with big data, which may hinder the interaction between the natural language interface (NLI) and large databases.
The volume of data produced globally each day is increasing tremendously. At the current pace, 2.5 quintillion bytes of data are generated per day, and this pace is accelerating as a result of the growth of the Internet of Things, the spread of social networking sites, and the interaction of people with digital services. The volume of data increased by 90% during the past two years, and this growth is of interest to researchers [9,10]. Organizations cannot function without data and, due to their quantity and rapid rate of generation, data have become the fuel that drives these organizations. These data are accumulated in huge datasets, collectively referred to as "big data", which must be analyzed to enhance decision making. However, organizations face challenges regarding big data, including data quality, data storage, a lack of data science professionals, data validation, and the aggregation of data from various sources. One of the most pressing challenges is storing these huge datasets correctly so that they can be accessed and handled easily. Thus, studies have sought suitable solutions for storing large quantities of data in relational databases (RDBs) in a structured manner that facilitates managing and retrieving data [11].
The main challenge that NLI-RDB developers continue to face is the automatic mapping or conversion of complex NLQs into SQL queries, particularly given the huge amount of data distributed across database tables and the difficulty of dynamically displaying query results.
The systems reviewed in this paper proposed algorithms that take an English language query from a user and build an SQL query using a number of techniques [3,12,13]. The novel NLIDB-OLAP architecture proposed in this paper is built upon three main pillars. Firstly, several methodologies are used to process natural language for the interaction of the model with the user and to provide an effective analysis of the meanings and etymologies of the entered phrases.
Secondly, the model interacts with the user, extracts data in real time, and displays it dynamically. Thirdly, the key novelty of the proposed NLIDB-OLAP is the use of the OLAP hypercube to enable processing of a huge amount of data, which is a challenge in relational databases when using complex SQL query statements. The proposed solution in this study was to develop an interface system in which NLP techniques and data retrieval techniques are used to build and execute SQL statements, in addition to the use of an OLAP hypercube [14,15]. The main contribution of this study is the use of an OLAP hypercube table, which unifies and coordinates data in a single multi-dimensional data warehouse object. The hypercube table is queried directly, rather than querying information distributed across more than one table, which requires complex and joint SQL statements. The model workflow is shown in Figure 1 [16].
The current research focused on providing solutions for developing the natural language interface and relational databases with OLAP hypercube technology. The aim was to enable the processing of large quantities of data and complex SQL statements that are difficult to use, especially by users who have no knowledge of executing SQL query statements. This technology will also help companies with big data provide automated chat services with customers and an integrated query system capable of addressing all database inquiries.
This paper is organized as follows: Section 2 introduces the main concepts of OLAP and the related literature. Section 3 discusses the related studies and other researchers' challenges in the same field. Section 4 outlines the proposed NLIDB-OLAP model workflow, and the algorithm structure is presented in Section 5. Section 6 presents the results and analysis. Section 7 provides the conclusion and discusses the scope of future research.

Online Analytical Processing (OLAP)
The term Online Analytical Processing (OLAP) first appeared in a 1993 whitepaper published by Arbor Software and was coined by Edgar F. Codd [17], the inventor of the relational database model. This product was introduced to the market as an "extended spreadsheet database" [17]. OLAP [18] is an organizational mechanism that enables users to query and extract data easily and quickly to support different forms of analysis. OLAP plays an important role in accounting and financial reports over large data that need a large amount of time to be processed [19]. Querying big data distributed across multiple tables requires complex [18], join-based, and overlapping SQL statements to obtain correct and accurate results [15]. OLAP data are collected from multisource production environment databases, transferred to data warehouses, and then cleaned and organized into data cubes to significantly improve query time. The idea of using the n-dimensional (n-D) OLAP cube began with a study [19] that was confined to data mining technology; however, the author did not address linking a multidimensional OLAP hypercube table with NLP interfaces. The work in this paper explores the use of an OLAP hypercube table to help in developing an interface system that offers better performance in translating NLQ to SQL.
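The effect of organizing data into a pre-computed cube can be illustrated with a minimal sketch. It is not the paper's implementation: the fact rows, the dimension names `department` and `year`, and the `build_cube` helper are all hypothetical. The sketch pre-computes a publication total for every combination of dimension values, so that a later question becomes a dictionary lookup rather than a scan.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (department, year, publication_count).
rows = [
    ("CS", 2019, 4),
    ("CS", 2020, 6),
    ("Math", 2019, 3),
    ("Math", 2020, 5),
]
DIMENSIONS = ("department", "year")


def build_cube(rows):
    """Pre-compute the measure total for every subset of dimensions."""
    cube = defaultdict(int)
    for department, year, count in rows:
        values = {"department": department, "year": year}
        # Add this row to every aggregate it belongs to, from the grand
        # total (no dimensions fixed) to the finest cell (all fixed).
        for size in range(len(DIMENSIONS) + 1):
            for dims in combinations(DIMENSIONS, size):
                key = tuple((d, values[d]) for d in dims)
                cube[key] += count
    return dict(cube)


cube = build_cube(rows)

# Each answer is now a single lookup in the pre-computed cube.
total_all = cube[()]
total_cs = cube[(("department", "CS"),)]
total_2020 = cube[(("year", 2020),)]
total_math_2019 = cube[(("department", "Math"), ("year", 2019))]
```

The trade-off in this sketch is the usual one for cubes: more storage and build time up front in exchange for cheap reads.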

Literature Review
NLP research began in the late 1940s with the investigation of machine translation (MT) in 1946, when Weaver and Booth implemented the first natural language program on machine translation to crack codes during World War II [20]. After that, most of the systems created from this perspective were based on searching a dictionary for the appropriate words for translation and rearranging the words to suit the word-order rules of the target language. This was carried out without considering the lexical ambiguity inherent in natural language, which led to inaccurate results [20]. This prompted researchers to find solutions more appropriate to languages, and the 1960s witnessed significant development in the production of early NLP systems [20][21][22]. By the 1970s, applications of NLP had developed dramatically, as rhetorical structures were used to build text and response generators such as McKeown's discourse planner, TEXT [23], and McDonald's response generator, MUMBLE [24]. By the 1980s, the concept of natural language had expanded, and there was a growing realization of the need to find solutions to the limitations of natural language processing and a general push towards applications that worked with language in a broad real-world context. Natural language processing grew rapidly from that time until it underwent a major transformation in the early 1990s, with the transition to empirical methodologies in place of the introspective generalizations that characterized the Chomsky era and that had an impact on theoretical linguistics [25]. Many modern applications require natural language interfaces to relational databases; research and development in this field continue to progress at a fast pace to provide the best user experience and effective solutions for handling complex queries and large data volumes [26].
A study entitled "Text-to-SQL Generation for Question Answering on Electronic Medical Records" [27] aimed to answer health care questions posed by patients as database queries: the questions are translated into medical queries, and responses are drawn from medical records stored in the databases. Because the questions relate to several tables, the resulting query strings are complex and may produce false results [27]. Their model was built on a large volume of medical data divided across several tables, and correct retrieval of information was complicated by complex medical terminology; the authors processed and cleaned the data stored in the databases to ease query processing and to retrieve correct answers to the questions asked. The study in [28], "Natural language Interface for Database: A Brief review", provided an introduction to natural language interfaces and their processing of databases to provide an intelligent data system; it discussed the shortcomings in natural language understanding that stand in the way of widespread language interfaces and their association with databases. The authors of [29] modeled SQL query logs as a query part diagram to improve the ability of language interfaces to access information within the query log and to match the words and terms entered by the user, in order to enhance the performance of their proposed NLI-DBS system. Its limitations were poor accuracy in converting NLQ to SQL, an adverse effect of user sessions in the SQL query log, and confusion in data matching; in addition, no ways were found to improve existing end-to-end deep learning approaches.
The study in [30] aimed to have a computer take appropriate action by interpreting a sentence in natural language, processing and translating it accurately in order to extract data and summarize information from multiple data sources according to users' requests. SQL statements are passed to databases, and their results are displayed to the user through applications prepared for this purpose. The model handles only simple queries and does not provide solutions for complex queries; the absence of a dictionary leads to inaccurate data extraction, and the fixed layout limits the results to fixed values. The experimental outcomes in [31] present new methods for analyzing a research dataset based on NLP and relational data analysis, and the authors propose implementing relational queries to investigate the possibility of using software tools that allow broad NLP resources to work with relational databases. The study in [31] was limited to lexical analysis and the prediction of query sentences and successive words; these weaknesses in text processing at the natural language level call for the development of additional query tools. In [24], the provision of accurate information to users from a railway database was studied, including seat availability and travel destinations. The paper aimed to develop an NLP model in which a question entered in natural language is analyzed and converted into SQL query sentences, and data are extracted according to the required query, coding, and analysis. Mapping was used to give the user a comprehensive view of the available data, with the limitation that only one statement can be queried, without fetching further data through multiple queries. The enhancement of the detection of criminal organizations in Mexico using machine learning (ML) and NLP was introduced in [32].
The use of NLP helped to extract encoded information from criminal groups on a daily basis in a systematic and reliable manner. The dictionary of actors used in this application was robust and included a comprehensive list of the most relevant criminals and their organizations. The natural language (NL) and NLP applications presented in that study provide strong empirical foundations for advancing objective academic and political analysis to better understand and monitor organized criminal activity. The authors in [33] discussed Aspect-Based Sentiment Analysis using NLP, which is characterized by three classifications, and explored both praise and criticism in posts. The study in [34] proposed a standard to help big data OLAP designers choose the cube design best suited to their goals; it identified the main requirements and trade-offs for designing a big data OLAP system effectively. Cubes take advantage of data pre-clustering techniques. In [35], OLAP was used to tweak the path data that includes storing the sawn data at a specific time and point. The researchers proposed this technique because of the large volume of multi-dimensional data and analytical queries, for which it offers three times the analysis capacity of ordinary data storage. The researchers in [36] presented a study on social robots, specifically in the tourism sector. Using NLP in Python, they developed sentiment detection in text by converting speech to text, which made it possible to study the feelings of visitors and to evaluate artworks in a museum.
The authors in [37] presented a study on the design of a decision support system for an aviation engine control system. The results indicate that fact tables do not provide any information on how to group records when calculating data aggregations; the authors therefore implemented the hypercube as a separate database to support functional dependency, with multiple elements combined into one using a specially selected union function that provides the most efficient access to data in terms of speed. The researchers in [38] concluded that a joint query would also not produce the desired result, since joint queries must have concurrent schemas (groups of attributes after the SELECT operator); thus, it is necessary to specify all relationships with these attributes in all queries that are standardized. The OLAP hypercube solves this problem automatically, without additional user effort, by using a joint query. In [39], a protocol was designed to build and maintain a hypercube structure in a dynamic environment, and it was found that implementing HyperD results in savings, especially in networks where the ripple rate is small relative to the overall network activity. In addition, the network is more resilient to failures and bottlenecks. The study in [40] examined the relative costs of improving queries on databases by considering the structure of databases and its impact on network load. The authors concluded that the hypercube offers distinct cost advantages over stochastic topologies for query optimization. In [41], the hypercube helped in business intelligence fields and was used to create an instance of spatial data cubes that shared common dimensional levels. The study demonstrated that data cube descriptive models can be used for the easy integration of heterogeneous data and for SOLAP navigation in complex towers of data cubes. The study in [42] showed that the combination of OLAP and data mining provides excellent solutions.
OLAP mining databases build data mining and analysis capabilities directly into the database server. The paper also provided a brief introduction to these techniques and compared two data warehouse models, the STAR schema and the Snowflake schema.
The development of modern communication technologies and Internet networks has provided many solutions and facilities that let people work in various fields through the Internet or local networks [19]; this has led to an increase in the volume of interactions with the technological environment and a huge increase in data [34,43]. These technologies have helped people to work through digital interfaces. One of the most important of these is the natural language interface, which allows users to interact with the computer using a human language [29]. The study in [44] discussed the theory of using an OLAP hypercube to deal with data, and the theory was successful in reducing query execution time. That study did not deal with building natural language interfaces, which is why our study aimed to build a query interface through natural language processing and to execute queries through an OLAP hypercube.
The research presented in this paper focuses on providing solutions for the development of the natural language interface along with relational databases with OLAP hypercube technology. This novel integration allows a very complex SQL statement to be processed on a huge amount of data with a very high speed of execution, which was difficult to achieve through natural language query interfaces in previous studies.

Proposed Model
Keeping data inside databases allows us to retrieve the data at any time. Retrieving data is very difficult for a layperson using traditional methods because it requires a lot of effort and a strong knowledge of database architecture. The database management system can handle natural language queries through standard database languages [31].
The goal of our proposed work was to build an interface that enables the user to enter a query in English. The query is processed by several units, using the Python Natural Language Toolkit (NLTK) and other Python libraries, to form an equivalent SQL query that is executed on a multi-dimensional OLAP cube. The query results are displayed dynamically, so that only the data required by the query statement are shown. Figure 1 shows the model workflow.

Multi-Dimensional OLAP 4D-Cube (Hypercube)
OLAP technology is implemented to facilitate the model's work and to speed up the return of results for SQL query statements [34]. Data are first collected from multiple data sources, stored in data warehouses, and then cleaned and organized into a data cube. Our four-dimensional OLAP cube was designed to hold data on university professors, including details such as name, gender, job title, and the department they work in, as well as contact details.
In addition, it contained information regarding their research interests, projects, publications, and Ph.D. students. The dimensions were populated with hierarchically organized data obtained from [45]. OLAP cubes are often pre-summarized across dimensions to significantly improve query time across relational databases.
A four-dimensional cube (hypercube), illustrated in Figure 2B, was built and filled with data. All the relational information regarding professors, publications, departments, and students can then be obtained with one SQL query on one table, without having to query more than one table, which saves a lot of time and avoids writing complex SQL statements [44].
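The single-table idea can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table name `hypercube`, its column names, and the sample rows are hypothetical and are not taken from the paper's schema. The point shown is that professor, department, student, and publication details come back from one `SELECT` on one table, with no joins.

```python
import sqlite3

# In-memory database holding one flattened (denormalized) table.
# Table name, column names, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hypercube (
    department_id INTEGER, department_name TEXT,
    professor_id INTEGER, professor_name TEXT,
    student_name TEXT, publication_title TEXT
);
INSERT INTO hypercube VALUES
    (1, 'Computer Science', 1, 'A. Smith', 'Student X', 'Paper One'),
    (1, 'Computer Science', 1, 'A. Smith', 'Student Y', 'Paper Two'),
    (2, 'Mathematics', 2, 'B. Jones', 'Student Z', 'Paper Three');
""")

# One query on one table returns data from all four dimensions.
rows = conn.execute(
    "SELECT professor_name, department_name, student_name, publication_title "
    "FROM hypercube WHERE professor_id = ? ORDER BY student_name",
    (1,),
).fetchall()

for row in rows:
    print(row)
```

In a normalized schema the same question would typically need joins across separate professor, department, student, and publication tables; the flattened table trades redundancy for a simpler query.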
To implement the hypercube, the sequence theory was applied for the dimensions to describe the members, as shown in Figure 2A; the following values were used according to the sequence theory [37,39,41,44]:
$D$ is the combination of the dimensions. If we consider $d$ to be a member of the dimension combination, then $d \in D$, that is, $D = \{d_1, d_2, d_3, d_4, \ldots, d_n\}$. Here, $n$ is the number of dimensions, and $d_1, d_2, d_3, d_4, \ldots, d_n$ are the members of the dimension combination that are placed on the axes of the hypercube [37,44].
Each member of the dimensional combination consists of internal members according to the coordination axes of the hypercube. So, according to the internal members of the hypercube dimensions, we applied the following values: $\{K_1, K_2, K_3, \ldots, K_n\}$ is the internal combination of dimensions (members), and $Md$ is the combination of values in the dimensions.
Accordingly, every dimension $d_1 = \{M1_{1}, M2_{1}, M3_{1}, \ldots, Mk_{1}\}$, $d_2 = \{M1_{2}, M2_{2}, M3_{2}, \ldots, Mk_{2}\}$, $\ldots$, $d_n = \{M1_{n}, M2_{n}, M3_{n}, \ldots, Mk_{n}\}$ consists of the combination of internal members. Here, $k$ is the value of internal members of every dimension being taken. Taking $M$ as the members of the cube dimensions, all the internal members of dimensions can be shown through the connection ($\cup$), $M = \{Md_1 \cup Md_2 \cup \ldots \cup Md_n\}$ [39]. The members' value of every dimension $d_1$ is the members' combination of dimension $Md_1$, the value of members $d_2$ is the members' combination of dimension $Md_2$, and $k$ represents the sequence number of the member for a dimension, so $K_1 = Md_1$ and the values of members $K_n = Md_n$, as illustrated in Figure 2A.
In Figure 2B, each dimension represents an entity: $d_1$ represents departments, $d_2$ represents professors, $d_3$ represents students, and $d_4$ represents publications. In addition, each dimension member represents an attribute: $M1_{1}$ represents the first attribute (department id) in dimension $d_1$ (departments). The analysis can be performed by four types of OLAP analytic operations against a multidimensional object [46,47].
The hypercube illustrated in Table 1
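The set notation used for the hypercube can be made concrete with a small sketch. It assumes one reading of the definitions: each dimension holds an ordered list of members (attributes), and the set of all members is the union of the per-dimension sets. Apart from "department id" as the first member of the departments dimension, every attribute name below is hypothetical.

```python
# Four dimensions, as in the text: departments, professors, students,
# publications. Attribute names are hypothetical except department_id.
dimensions = {
    "d1_departments": ["department_id", "department_name"],
    "d2_professors": ["professor_id", "professor_name"],
    "d3_students": ["student_id", "student_name"],
    "d4_publications": ["publication_id", "publication_title"],
}

# n is the number of dimensions (4 for the hypercube described here).
n = len(dimensions)

# M is the union of the member sets of all dimensions.
M = set().union(*dimensions.values())


def member(dimension, j):
    """Return the j-th member (1-based) of the given dimension."""
    return dimensions[dimension][j - 1]


# First member of dimension d1, i.e. the department id.
first_member = member("d1_departments", 1)
```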

From NLP to SQL
The following steps, illustrated in Figure 3, were taken to build the model and convert the natural language into SQL statements to be executed on databases:
Step 1: Tokenize Module. At this stage, the system implements tokenization on the entered query sentence by separating it into single words, each word representing a unique symbol ("token"). Then, these words are stored in a separate list and passed to the Lemmatized and Stop-Word module [27]. The NLTK library was used to tokenize the input; further details are given with an example in the walkthrough section.
Step 2: Lemmatized and Stop-Word Module. The system performs lemmatization on the output of the tokenize module. In addition, the stop-word module removes all the unwanted words from the list using the ignore list. The result is stored in a separate list and passed to the lexical module [3,48].
Step 3: Lexical Module. The lemmatized list is mapped with the dictionary. In this step, these words are replaced with words from the database dictionary and passed to syntactic analysis [49].
Step 4: Semantic Module. The system finds words that are considered as symbols and conditions, and then the word is selected from the dictionary. (For example, if there is "less than or equal to" in the query, it is replaced with the symbol "<=") [50,51].
Step 5: POS Tagging Module. Parts of Speech tagging of tokens is carried out here. The tag signifies whether the word is a noun, adjective, or verb [52].
Step 6: Syntactic Module. At this point, the dictionary of attributes, keywords, and table names is preserved. Each encoded word is assigned an attribute in the dictionary [53].
Step 7: SQL Query Generation Module. In this module, the SQL query is generated using the output of the syntactic module. A walkthrough is highlighted with an example in the next section.
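Steps 1 to 4 above can be sketched in plain Python. This is an illustrative sketch, not the authors' code: a regular expression stands in for NLTK's tokenizer and a small lookup table stands in for its WordNet lemmatizer, so that the example runs without downloading NLTK data. The ignore list, the database dictionary, the condition dictionary, and the table name `hypercube` are all hypothetical.

```python
import re

# Hypothetical stand-ins for the model's resources.
IGNORE_LIST = {"all", "the", "of", "who", "have", "is", "to", "number"}
LEMMAS = {"professors": "professor", "departments": "department"}
DB_DICTIONARY = {
    "fetch": "SELECT",
    "information": "*",
    "professor": "hypercube",
    "department": "department_id",
}
CONDITIONS = {"equal": "=", "one": "1"}


def tokenize(query):
    """Step 1: split the query into lower-case word tokens."""
    return re.findall(r"[a-z0-9]+", query.lower())


def lemmatize_and_filter(tokens):
    """Step 2: reduce tokens to lemmas and drop ignore-list words."""
    lemmas = [LEMMAS.get(token, token) for token in tokens]
    return [lemma for lemma in lemmas if lemma not in IGNORE_LIST]


def lexical_map(tokens):
    """Step 3: replace known words with database dictionary terms."""
    return [DB_DICTIONARY.get(token, token) for token in tokens]


def semantic_map(tokens):
    """Step 4: replace condition words with symbols and values."""
    return [CONDITIONS.get(token, token) for token in tokens]


query = ("fetch all the information of the professors "
         "who have department number is equal to one")
tokens = tokenize(query)
filtered = lemmatize_and_filter(tokens)
mapped = semantic_map(lexical_map(filtered))
print(mapped)
```

The condition dictionary here maps single words only; multi-word conditions such as "less than or equal to" would need phrase matching before tokens are filtered.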

Walkthrough Practical Example
The following walkthrough takes the reader through a practical example that creates an SQL query from a natural language query entered in English. Figure 4 displays the workflow of the techniques used to convert NLQ to SQL.
Step 1: Query entered by the user in natural language (text) is stored in a string.
Example. fetch all the information of the professors who have department number is equal to one.
Step 2: Divide the sentence into single words (tokens) by the spaces between the words in the sentence. These tokens are saved in a separate list.
The token for the above query is as follows.
Step 3: We use lemmatization to get the lemma of the tokens generated.
Step 4: Only important tokens are selected for better processing after the tokens are compared with the ignore list. This is carried out through named entity extraction; a dictionary is implemented and consists of data within the database and metadata relating to building SQL statement architecture.
Step 5: Specific tokens are mapped with database hypercube words; depending on the important tokens from step 4, this mapping is carried out through the dictionary that contains all the words in the database hypercube and words within the SQL statement structure. Then, tokens are replaced by their equivalents.
Step 6: Specific values or conditions are determined by the dictionary of conditions; if any conditions are defined, the algorithm applies them and selects them according to the correct SQL statement architecture (e.g., replacing "equal" with "=").
Step 7: Parts of Speech tagging of tokens is carried out using Python NLTK library tools. The tag is a part-of-speech tag, which signifies whether the word is a noun, adjective, verb, or other.
Step 8: Syntax analysis checks the text for meaningfulness against the rules of formal grammar using NLTK.
Step 10: The SQL query is executed on the database, and the output is returned to the user.
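The last stage of the walkthrough, turning mapped tokens into an SQL statement and executing it, might look like the following sketch. It is written under assumptions: the mapped token list, the table name `hypercube`, the column names, and the `build_sql` helper are hypothetical, and SQLite stands in for the actual database. Identifiers are checked against a whitelist and the value is passed as a bound parameter, which is a design choice of this sketch rather than something stated in the paper.

```python
import sqlite3

# Hypothetical schema whitelist for the hypercube table.
ALLOWED_TABLES = {"hypercube"}
ALLOWED_COLUMNS = {"department_id", "professor_name"}
ALLOWED_OPERATORS = {"=", "<", ">", "<=", ">="}


def build_sql(mapped):
    """Build a parameterized SELECT from six mapped tokens:
    [verb, projection, table, column, operator, value]."""
    verb, projection, table, column, operator, value = mapped
    if verb != "SELECT" or projection != "*":
        raise ValueError("only SELECT * is handled in this sketch")
    if table not in ALLOWED_TABLES or column not in ALLOWED_COLUMNS:
        raise ValueError("unknown table or column")
    if operator not in ALLOWED_OPERATORS:
        raise ValueError("unknown operator")
    # Identifiers come from the whitelist; the value is bound separately.
    sql = f"SELECT * FROM {table} WHERE {column} {operator} ?"
    return sql, (int(value),)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hypercube (department_id INTEGER, professor_name TEXT)"
)
conn.executemany(
    "INSERT INTO hypercube VALUES (?, ?)",
    [(1, "A. Smith"), (1, "C. Lee"), (2, "B. Jones")],
)

# Mapped tokens for: "fetch all the information of the professors
# who have department number is equal to one".
sql, params = build_sql(
    ["SELECT", "*", "hypercube", "department_id", "=", "1"]
)
result = conn.execute(sql, params).fetchall()
print(sql, result)
```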

Walkthrough Practical Example
The following walkthrough shows how an SQL query is created from a natural language query written in English. Figure 4 displays the workflow of the techniques used to convert an NLQ to SQL.
Step 1: The query entered by the user in natural language (text) is stored in a string.
Example: "fetch all the information of the professors who have department number is equal to one."
Step 2: The sentence is divided into single words (tokens) at the spaces between the words. These tokens are saved in a separate list.
For the above query, the tokens are its individual words: fetch, all, the, information, of, the, professors, who, have, department, number, is, equal, to, one.
Step 3: Lemmatization is applied to obtain the lemma of each generated token.
Step 4: The tokens are compared with the ignore list, and only the important tokens are kept for further processing. This is carried out through named entity extraction, using a dictionary that consists of data within the database and metadata relating to the structure of SQL statements.
Step 5: The important tokens from Step 4 are mapped to database hypercube words. This mapping is carried out through the dictionary that contains all the words in the database hypercube and the words within the SQL statement structure. The tokens are then replaced by their equivalents.
Step 6: Specific values or conditions are determined using the dictionary of conditions. If any conditions are defined, the algorithm applies them according to the correct SQL statement structure (e.g., replacing 'equal' with "=").
Step 7: Part-of-speech (POS) tagging of the tokens is carried out using the Python NLTK library tools. The tag signifies whether the word is a noun, adjective, verb, or other part of speech.
Step 8: Syntax analysis, also carried out using NLTK, checks the text for meaningfulness against the rules of formal grammar.
Step 10: The SQL query is executed on the database, and the user is provided with the output.
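The steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the ignore list, the lemma table, the mapping dictionaries, and the table and column names (professor_cube, dept_no, and so on) are all hypothetical, a small lookup table stands in for an NLTK lemmatizer, POS tagging and syntax analysis (Steps 7 and 8) are omitted, and an in-memory SQLite table stands in for the Oracle hypercube.

```python
import re
import sqlite3

# Hypothetical dictionaries; the paper's real ignore list and mappings are not shown.
IGNORE = {"all", "the", "of", "who", "have", "is", "to"}
LEMMAS = {"professors": "professor"}           # toy stand-in for a lemmatizer
PHRASE_MAP = {"department number": "dept_no"}  # two-word phrases -> cube columns
TOKEN_MAP = {
    "fetch": "SELECT",
    "information": "*",
    "professor": "professor_cube",
    "equal": "=",   # Step 6: condition words -> SQL operators
    "one": "1",     # Step 6: number words -> literal values
}


def tokenize(text):
    """Step 2: lowercase, drop punctuation, split on whitespace."""
    return re.sub(r"[^\w\s]", "", text.lower()).split()


def lemmatize(tokens):
    """Step 3: replace each token by its lemma when one is known."""
    return [LEMMAS.get(token, token) for token in tokens]


def select_important(tokens):
    """Step 4: drop tokens that appear in the ignore list."""
    return [token for token in tokens if token not in IGNORE]


def map_tokens(tokens):
    """Steps 5-6: replace tokens (or two-token phrases) by SQL equivalents."""
    mapped, i = [], 0
    while i < len(tokens):
        phrase = " ".join(tokens[i:i + 2])
        if phrase in PHRASE_MAP:
            mapped.append(PHRASE_MAP[phrase])
            i += 2
        else:
            mapped.append(TOKEN_MAP.get(tokens[i], tokens[i]))
            i += 1
    return mapped


def build_sql(mapped):
    """Assemble SQL from a fixed pattern: verb, columns, table, column, op, value."""
    verb, columns, table, column, operator, value = mapped
    # Identifiers come only from the dictionaries above; the value is bound as a parameter.
    return f"{verb} {columns} FROM {table} WHERE {column} {operator} ?", (int(value),)


def nl_to_sql(text):
    """Run Steps 2-6 and return (sql, params)."""
    return build_sql(map_tokens(select_important(lemmatize(tokenize(text)))))


query = "fetch all the information of the professors who have department number is equal to one."
sql, params = nl_to_sql(query)

# Step 10 on an in-memory SQLite table (stand-in for the Oracle hypercube).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE professor_cube (prof_id INTEGER, prof_name TEXT, dept_no INTEGER)")
conn.executemany(
    "INSERT INTO professor_cube VALUES (?, ?, ?)",
    [(1, "Prof. A", 1), (2, "Prof. B", 2), (3, "Prof. C", 1)],
)
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

The fixed six-slot pattern in build_sql only covers sentences shaped like the example; a fuller system would presumably rely on the tagging and grammar checks of Steps 7 and 8 to handle other sentence forms.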

Model Screen Shots
The model query interface illustrated in Figure 4 shows how the user enters the query text and how the model processes it and transforms it into an SQL query statement. Table 2 shows how the model displays the query results dynamically. The model showed all information related to the professors according to the query's request.
The results in Figure 5 show a user-specific query about the professor's name, the professor's department, publications, and the affiliated students under the professor's supervision for the professor ID number equal to one. In addition, it shows the model's processing of the compound query statement and its transformation into an SQL query. Table 3 shows how the model displays the query results dynamically. In this table, the model showed only the information specified in the query statement (professor name, professor ID, professor publication, professor title, affiliated students).
Figure 5. NLP Interface query dealing with another example (2).

Results and Analysis
The application of NLP to SQL query generation is still a challenging area [54,55]. In this study, we focused on the high performance, accuracy, and responsiveness of SQL query generation. The proposed model was implemented in Python using the NLTK library and features such as syntactic and semantic analysis, NumPy arrays, tokenization, lemmatization, stop-word removal, lexical analysis, and POS tagging, together with dictionaries and grammars for these analyses. The natural language to SQL query conversion tool was connected to an Oracle 19c database using the cx_Oracle connector. Along with a list of custom query words used to form an SQL query, the model was run on a set of English sentences and generated the corresponding SQL queries.
By adopting OLAP hypercube technology, the query can be executed on a single table, which avoids the considerable effort of executing the query across several tables. The advantages of this model are the speed and the accuracy of the data response. The tabulated results are presented in Table 4.
In Table 4, SQL_EXEC_TIME is the time in milliseconds taken to execute the SQL query on the database, and TOTAL_EXEC_TIME is the time in milliseconds that the model takes to process the NLQ, retrieve the data from the DBMS, and display the results.
The results in Table 4 were obtained by measuring the time taken for each stage of processing, as follows:

1. Calculate the time taken to process the natural language, configure the query, execute it, and display the results;
2. Calculate the time taken to execute SQL statements on the database.
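The two measurements can be sketched as nested timers. The sketch below is only an assumption about how such timings might be taken, not the authors' instrumentation: the function names, the translate callback, and the sample table are hypothetical, and SQLite stands in for the Oracle database.

```python
import sqlite3
import time


def timed_query(conn, sql, params=()):
    """Execute one SQL statement and return (rows, elapsed milliseconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, (time.perf_counter() - start) * 1000.0


def run_pipeline(conn, nlq, translate):
    """Time the whole NLQ-to-result path and the SQL execution inside it."""
    total_start = time.perf_counter()
    sql, params = translate(nlq)                 # NLQ processing (hypothetical callback)
    rows, sql_ms = timed_query(conn, sql, params)
    rendered = [", ".join(map(str, row)) for row in rows]  # stand-in for result display
    total_ms = (time.perf_counter() - total_start) * 1000.0
    return {"rows": rendered, "SQL_EXEC_TIME": sql_ms, "TOTAL_EXEC_TIME": total_ms}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE professor_cube (prof_id INTEGER, prof_name TEXT)")
conn.execute("INSERT INTO professor_cube VALUES (1, 'Prof. A')")

result = run_pipeline(
    conn,
    "fetch the professor with id equal to one",
    # Trivial translator used only for this illustration.
    lambda nlq: ("SELECT prof_id, prof_name FROM professor_cube WHERE prof_id = ?", (1,)),
)
print(result)
```

Because the SQL timer runs inside the total timer, TOTAL_EXEC_TIME is never smaller than SQL_EXEC_TIME, and the difference between them approximates the cost of language processing and display.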
The results showed that the model has high-speed performance compared with current systems and related studies.
The model focused not only on execution speed and the dynamic presentation of results but also on the ability to process complex query sentences. OLAP hypercube technology provided the ability to perform complex queries on big data databases. The model helps organizations with large data volumes by allowing their users to query the data immediately, accurately, and quickly, with minimal infrastructure resources and equipment.
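The single-table idea can be illustrated with a small sketch. It assumes a hypothetical schema (professor, department, publication) and uses SQLite, with a pre-joined table standing in for the hypercube; a real OLAP cube in Oracle would also hold pre-computed aggregates, which this sketch does not model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE professor (prof_id INTEGER PRIMARY KEY, prof_name TEXT, dept_no INTEGER);
CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE publication (pub_id INTEGER PRIMARY KEY, prof_id INTEGER, title TEXT);

INSERT INTO professor VALUES (1, 'Prof. A', 10), (2, 'Prof. B', 20);
INSERT INTO department VALUES (10, 'Computing'), (20, 'Mathematics');
INSERT INTO publication VALUES (1, 1, 'Paper X'), (2, 1, 'Paper Y'), (3, 2, 'Paper Z');

-- Pre-join once, so later queries read a single table.
CREATE TABLE professor_cube AS
SELECT p.prof_id, p.prof_name, d.dept_name, pub.title
FROM professor p
JOIN department d ON d.dept_no = p.dept_no
JOIN publication pub ON pub.prof_id = p.prof_id;
""")

# The same question asked against the base tables (three-way join) ...
join_sql = """
SELECT p.prof_name, d.dept_name, pub.title
FROM professor p
JOIN department d ON d.dept_no = p.dept_no
JOIN publication pub ON pub.prof_id = p.prof_id
WHERE p.prof_id = ?
ORDER BY pub.title
"""

# ... and against the pre-joined table (no joins at query time).
cube_sql = """
SELECT prof_name, dept_name, title
FROM professor_cube
WHERE prof_id = ?
ORDER BY title
"""

join_rows = conn.execute(join_sql, (1,)).fetchall()
cube_rows = conn.execute(cube_sql, (1,)).fetchall()
print(cube_rows)
```

Both queries return the same rows, but the second touches only one table, which also keeps the SQL that the NLP layer has to generate much simpler.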

Conclusions and Future Scope
Natural language interfaces are the most consistent and flexible interfaces for the user. The use of plain English and NLP helps users retrieve and manage data from databases. The user does not need to learn SQL or any other complex query language. User interfaces can be integrated with relational databases built through OLAP, which makes it easier for the user to work with huge databases. In addition, the integrated query system provided the capability of covering all inquiries from big relational databases.
The OLAP hypercube (4-D cube) model presented in this study shows the ability to execute complex query clauses in a single command. It also solves the problem of displaying complex and multiple results by providing a dynamic display interface in which the data displayed is limited to the query results alone.
As a plan for future work, it is recommended to use n-dimensional (n-D) cubes, without a fixed limit on the number of dimensions, instead of 4-D cubes. This would enable the ingestion of big data in a single object and facilitate the execution of query statements that may be too complex for query interfaces running in a data warehouse.
The Abbreviations section summarizes all abbreviations used in the paper.