1. Introduction
Databases are an integral part of modern information infrastructure, providing an efficient and structured data storage solution. They play a crucial role in ensuring the reliability, integrity, and availability of information for applications and systems. Learning about databases is one of the most important stages in the development of a young specialist in information systems [
1].
PostgreSQL is a database management system. In 2022, PostgreSQL ranked fourth in the global ranking of popular database management systems. The system has many advantages, among which the following are highlighted:
Free Access: The database management system can be installed completely free of charge, allowing users to start working with databases. However, additional fees apply for cloud storage.
Cross-Platform Compatibility: The database management system can operate on any operating system, such as Linux, macOS, or Windows.
The object of this research is educational materials for learning database programming. The subject of the research encompasses the methods and tools for creating a con-temporary electronic resource (website) to provide materials for studying PostgreSQL. The development of the electronic resource will utilize technologies such as React, Node.js, and Express, as well as the creation of custom databases—a test database and a data storage database for the website.
The aim of this work is to develop functional test databases suitable for further use, as well as to create the server-side component of a practice-oriented electronic resource designed to provide materials for learning database programming. Within the framework of this work, the following tasks will be accomplished:
A. The technologies React and Node.js will be studied for the creation of the information resource.
B. Materials for learning PostgreSQL will be prepared.
C. A system for testing and evaluating student performance on the website will be developed.
D. The structure of the test database “Boats” will be designed, which will be used in assignments throughout the course.
E. The structure of the database intended for use with the information resource will be developed.
F. Testing of database queries will be conducted, and its functionality was assessed.
In light of all the above, it can be concluded that the development of this project is aimed at creating an interactive and user-friendly tool for learning database programming, as well as facilitating independent exploration of modern technologies in the field of development: Node.js and React.
2. Materials and Methods
There are various methods for validating the results of SQL queries. The simplest method involves comparing the results of two result sets, while the most complex is the semantic approach. The following sections will explore several existing methods for verifying SQL queries from different perspectives, as well as a method for assessing query optimization in PostgreSQL.
2.1. Validation of Query Results by Final Result Set
This is the most common method for verifying SQL queries against a reference standard. The primary objective of this method is to determine whether the rows and columns of the resulting set match the expected output [
2]. This result validation helps compare queries solely based on the outcome.
However, this algorithm has its limitations. The sequence of columns and their names in the resulting sets must strictly adhere to the reference, and the order of rows or the content of cells must not be altered.
One way to address the issue of insensitivity to the order of fields in the SELECT statement is to implement the following algorithm [
3,
4]:
Extract the results of the query without field names and place them into lists for further processing.
Extract only the field names and store them in a separate list.
Create a Key-Value dictionary, where the Key is the field name from the list of names, and the Value is a list of all values for the selected field.
Populate and compare the dictionaries of the solution query and the reference query.
In this algorithm, the result of the SQL query is interpreted as a set of rows, where each row is an ordered collection of values. The result is considered correct if the set of rows from the query is equivalent to the set of rows from the reference answer. To implement the algorithm, it is necessary to check the inclusion of the result arrays from the student’s query in the results provided by the instructor and vice versa, ensuring complete correspondence between the queries. The equivalence check can be realized through the computation of set differences.
This is a straightforward algorithm for validation, which can later be enhanced with additional complexities as described below.
2.2. Syntactic Approach
This method involves comparing the syntactic structure of a student’s SQL query with a reference query based on their textual representation. This type of validation includes determining the presence of special characters within the string. Minor differences, such as case sensitivity, may be disregarded to avoid penalizing the query’s score; however, key differences, such as permutations in object names, are carefully noted and analyzed.
For these tasks, the Levenshtein distance metric can be employed. The Levenshtein distance [
5] is a metric that quantifies the “similarity” between two strings by calculating the minimum number of operations—specifically, the insertion of a single character, the deletion of a single character, and the substitution of one character for another—required to transform one string into the other. While this method is sometimes used to detect plagiarism, in this project, the opposite is true: the greater the similarity between the solution query and the reference query, the better the evaluation. The Levenshtein distance is computed using the recursive formula of the Wagner–Fischer algorithm. The Levenshtein distance represents the number of operations needed to convert one string into another.
Below is the formula for calculating the distance [
4]:
2.3. Semantic Approach
This approach is based on the analysis of the semantic structure of queries using an Abstract Syntax Tree (AST). An Abstract Syntax Tree (AST) [
6] is a data structure used to represent the structure of a program or a fragment of code. Each node of the tree represents a construct found in the text. The syntax is abstract, as it reflects only the structurally significant parts of the algorithm, omitting details that are not essential.
The tree for SQL queries is constructed as follows: the query is broken down into nodes that represent key SQL constructs (such as SELECT, FROM, WHERE, JOIN, etc.). Each node of the tree contains an operation and its arguments, while the child nodes contain table names, conditions, and field names. The construction of the tree occurs in several stages: first, the query is divided into lexemes; next, the main constructs of the SQL query are identified; then, the abstract tree is built with key constructs in the nodes; and finally, the names and conditions are added as child nodes.
For example, consider the following SQL query involving the tables users and orders:
SELECT u.name, o.order_date
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.age > 30;
The Abstract Syntax Tree for this query would be represented as follows (
Figure 1):
If each query is represented as an Abstract Syntax Tree, the trees can be compared using a recursive breadth-first traversal algorithm. By comparing the ratio of correctly identified nodes in the tree to the total number of nodes, it is possible to determine a semantic metric for evaluating the correctness of the query.
3. Web Application PolySQL
3.1. Brief Description of the PolySQL Web Application
The “PolySQL” web application is an interactive educational platform designed to assist students in getting started with the PostgreSQL database management system (DBMS) while enabling educators to make the learning process more dynamic. This platform aims to address several pertinent issues.
The first issue is the lack of a dedicated test database suitable for the educational process of studying databases at Peter the Great St. Petersburg Polytechnic University. The platform will provide its own database script along with assignments tailored to this database.
The second issue is the potential to automate the routine validation of SQL queries, allowing students to test their queries and receive immediate feedback on their work.
User interaction is a primary focus of the website. The functionalities available to four groups of potential users will be examined: the administrator, the instructor, registered users, and unregistered users.
Use cases for unregistered users include the following:
User registration on the platform.
Access to theoretical course materials.
Use cases for registered users include the following:
User authentication on the platform.
Access to course materials, including tests and practical exercises.
Completion and validation of tests and practical assignments related to database programming.
Use cases for instructors include the following:
Use cases for administrators include the following:
3.2. Comparative Analysis of Technologies for the Backend of the Website and Database
For this project, PostgreSQL was selected as the database management system (DBMS). A classification of properties was established: absence (weak property), ± (average capabilities/importance), + (strong property), and ++ (high capabilities/performance). Below is a summary table (
Table 1) of the properties of various DBMSs, including PostgreSQL, MySQL, Oracle, and Red Database Management System [
7].
For the development of the backend portion of the application, various frameworks and platforms were considered, including the following [
7]:
Zend Framework 3—a free PHP framework with open-source code. Its development is directed and sponsored by Zend Technologies, the same company that developed the PHP language.
Django—a free web application framework for Python that employs the Model-View-Controller (MVC) design pattern.
Node.js—a JavaScript runtime environment that allows for server-side code execution for web pages and web applications, as well as command-line programs.
Below is a comparative table of the properties of the aforementioned frameworks (
Table 2).
Node.js excels due to its high performance, excellent real-time support, flexibility, and active community. Additionally, this language offers extensive development capabilities due to its modular structure and a vast number of npm libraries. In contrast, PHP is considerably more limited in terms of development speed.
To provide further clarification, a summary from an article comparing Node.js and PHP based on the execution of well-known sorting algorithms is presented below [
8].
In our study, a performance comparison between PHP and Node.js was conducted. Various complex sorting algorithms and a heap algorithm were implemented to generate all possible permutations with different array sizes. The tests were conducted using Apache JMeter, and the collected data were analyzed using the statistical T-Test method. The results indicated that Node.js outperformed PHP in all tests, with statistically significant differences observed between the two scripting technologies [
8].
It can be concluded that for server-side implementation, using Node.js is more advantageous than PHP.
In this project, the user interface will be developed using the React library, utilizing programming languages such as HTML, CSS, and JavaScript [
9,
10]. React offers several advantages, including the ability to employ a component-based approach, which allows for the reuse of parent elements and the creation of child components, as well as specialized functions such as using State for managing object states and using Effect for executing side effects within functional components.
Hooks are fundamental functions in React that enable interaction with state and lifecycle methods. This technology allows for the development of React applications without the need for classes. Such functions will significantly simplify the development of the frontend portion of the web application, making them highly beneficial in future implementations.
4. Conclusions
The practical value of this project lies in its ability to gather and prepare the necessary theoretical framework (the selection of theory was not addressed within the scope of this article) and potential algorithms for solving various sub-tasks for a future project aimed at developing a web application for studying the theoretical and practical aspects of databases, using PostgreSQL as a case study. This project is unique in its content at Peter the Great St. Petersburg Polytechnic University and provides students with an immersive experience in the study of databases.
The research encompasses algorithms for the automatic validation of SQL queries, as well as application development technologies such as React and Node.js. The algorithms and technologies explored will facilitate the creation of a comprehensive web application and a dedicated database containing a set of test tasks, which will be utilized to automate the educational process related to the study of databases and the execution of select queries.
Author Contributions
Conceptualization, N.S. and E.A.; methodology, E.R.; software, N.S.; validation E.A. and A.S.; formal analysis, E.A.; investigation, E.R.; resources, A.S.; data curation, E.A.; writing—original draft preparation, N.S.; writing—review and editing, E.R.; visualization, E.A.; supervision, A.S.; project administration, E.R.; funding acquisition, E.A. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Dataset available on request from the author.
Acknowledgments
During the preparation of this manuscript, the authors used GPT-4o mini for the purposes of improving the English language, as the authors are not native speakers. This tool assisted in enhancing the clarity, coherence, and overall quality of the text, ensuring that the manuscript meets the standards expected in academic writing. The authors have reviewed and edited the output and take full responsibility for the content of this publication.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Dhavalkumar, P.; Piyush, P.; Shreekant, M.; Naresh, K. Ensuring Data Accuracy in Text-to-SQL Systems: A Comprehensive Validation Framework. Int. J. Comput. Trends Technol. 2024, 72, 17–24. [Google Scholar] [CrossRef]
- Abdullayev, V.; Chauhan, A. SQL Injection Attack: Quick View. Mesopotamian J. Cyber Secur. 2023, 30–34. [Google Scholar] [CrossRef]
- Schukin, A.; Sabinin, O. Plagiarism Detection in SQL Student Assignments. In Teaching and Learning in a Digital World. ICL 2017. Advances in Intelligent Systems and Computing; Auer, M., Guralnick, D., Simonics, I., Eds.; Springer: Cham, Switzerland, 2017; Volume 716. [Google Scholar] [CrossRef]
- Janardhana Rao, P.; Nageswara Rao, K.; Gokuruboyina, S.; Neeraja, K.N. An Efficient Methodology for Identifying the Similarity Between Languages with Levenshtein Distance. In Proceedings of the 6th International Conference on Communications and Cyber Physical Engineering. ICCCE 2024, Hyderabad, India, 28–29 April 2023; Lecture Notes in Electrical Engineering; Kumar, A., Mozar, S., Eds.; Springer: Singapore, 2024; Volume 1096. [Google Scholar] [CrossRef]
- Odim, M.O.; Arekete, S.A.; Oguntunde, B.O.; Lawal, A.O. A Multi-Pass Compiler with Code Optimized Abstract Syntax Tree. Int. J. Sci. High. Res. Eng. 2023, 11, 31–39. [Google Scholar] [CrossRef]
- Jerson, J.; Preethi, N. An Analysis of Levenshtein Distance Using Dynamic Programming Method. In Proceedings of 3rd International Conference on Recent Trends in Machine Learning, IoT, Smart Cities and Applications, Hyderabad, India, 28–29 March 2022; Lecture Notes in Networks and Systems; Gunjan, V.K., Zurada, J.M., Eds.; Springer: Singapore, 2023; Volume 540. [Google Scholar] [CrossRef]
- PostgreSQL Documentation. Available online: https://www.postgresql.org/docs/current/sql-explain.html (accessed on 20 January 2025).
- Odeniran, Q.; Wimmer, H.; Rebman, C. Node.js or PHP? Determining the Better Website Server Backend Scripting Language. School of Business: Faculty Scholarship 2023. Available online: https://digital.sandiego.edu/busnfaculty/18/ (accessed on 2 December 2024).
- Odeniran, Q. Comparative Analysis of Fullstack Development Technologies: Frontend, Backend and Database. Electronic Theses and Dissertations 2023. Available online: https://digitalcommons.georgiasouthern.edu/etd/2663/ (accessed on 2 December 2024).
- Quvvatov, B. Web front-end and back-end technologies in programming. Theor. Asp. Form. Pedagog. Sci. 2024, 3, 208–215. [Google Scholar]
| Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).