Using the Large Language Model ChatGPT to Support Decisions in Sustainable Transport

Ziemba, Paweł; Majewski, Filip

doi:10.3390/su17167520


by Paweł Ziemba ¹,* and Filip Majewski ²

¹ Institute of Management, University of Szczecin, 70-453 Szczecin, Poland

² Faculty of Computer Science and Telecommunications, Maritime University of Szczecin, 70-500 Szczecin, Poland

* Author to whom correspondence should be addressed.

Sustainability 2025, 17(16), 7520; https://doi.org/10.3390/su17167520

Submission received: 21 July 2025 / Revised: 9 August 2025 / Accepted: 18 August 2025 / Published: 20 August 2025

(This article belongs to the Special Issue Sustainable, Intelligent, Resilient, and Connected Mobility and Transport Services)


Abstract

Recently, the popularity of large language models (LLMs) used as artificial intelligence tools supporting humans has been growing. LLMs are applied in many fields, including increasingly for various sustainability-related issues. One of the most popular tools of this type is ChatGPT, which, after being supplied with appropriate knowledge, can act as a domain expert, including in the area of sustainable transport. The article uses this functionality of ChatGPT, feeding it with knowledge about electric vehicles (EVs) available on the Polish market. The aim of the research was to develop a solution based on an LLM, which will act as an advisor when buying an EV. After appropriate modelling of knowledge and feeding it into ChatGPT, an expert system was obtained, which, based on the defined needs of the user, recommends the most suitable EV for them. When answering the system’s questions, the user provides only a description of the decision-making situation at the LLM input (e.g., the locations to which they are travelling, information on the number of family members, etc.). In turn, the appropriately fine-tuned ChatGPT provides a recommendation of vehicles that meet the user’s defined needs. This is a very user-friendly solution because it does not require the user to precisely define the vehicle evaluation criteria or a set of alternatives. This approach also does not require the user to have detailed domain knowledge.

Keywords: sustainable transport; electric vehicles; sustainability; expert system; sustainable decisions; ChatGPT; large language model; fine-tuning

1. Introduction

In the last few years, the data processing capabilities of artificial intelligence (AI) models have increased significantly. This is primarily due to the development of new Large Language Models (LLMs), in particular Chat Generative Pre-trained Transformer (ChatGPT) by OpenAI. The first version of ChatGPT, introduced in November 2022 [1], was based on a fine-tuned GPT-3.5 model [2]. All GPT models are based on transformer networks, which were developed in 2017 [3]. The GPT-1 model was published in June 2018, followed by GPT-2 in February 2019 and GPT-3 in June 2020 [4]. The GPT-3.5 model appeared in March 2022, and a year later, in March 2023, GPT-4 was published, on which the latest versions of ChatGPT are based [5].

The dynamic development of the GPT model is associated with the increase in the number of parameters used in its subsequent versions. For example, GPT-1 was based on 117 million parameters, while GPT-4 is reported to use as many as 100 trillion parameters [6], although OpenAI has not officially disclosed this figure. Thanks to this growth, each subsequent version of the model processes input data better and generates more accurate answers. The very high level of advancement of ChatGPT-4 has raised disputes as to whether it already constitutes artificial general intelligence (AGI) [3] or whether there is still some way to go [7]. Regardless of how this issue is perceived, the enormous potential of this AI model should be appreciated.

In addition to the dynamic development of the GPT model itself, the number of its users is growing equally dynamically. Within 5 days of its publication in November 2022, ChatGPT had as many as 1 million users [8], and after 3 months it was 100 million users [9]. Currently, this number is estimated at around 0.8–1 billion users [10]. The reason for the rapid growth in the number of ChatGPT users is undoubtedly the enormous possibilities and very wide range of applications of this tool. ChatGPT is used in such areas as business and finance, education, healthcare and medicine, content creation, law and legal services, scientific research, etc. [6,11].

In addition to the above-mentioned areas, ChatGPT is also increasingly used in sustainability-related problems. Hussein et al. [12] analyzed the impact of ChatGPT on various sectors, including sustainability, noting that it holds significant potential to advance sustainability efforts through predictive analytics, education, and process optimization. Rane [13] characterized the potential of ChatGPT application in data analysis, modelling, forecasting and optimization problems related to sustainable energy. Menéndez Medina and Heredia Álvaro [14] in their study confirmed that ChatGPT is able to predict electricity price fluctuations in the Spanish market, thus supporting the achievement of sustainability goals. Zhang et al. [15] applied ChatGPT to the sustainable development of the building sector for energy load prediction, fault diagnosis, and anomaly detection. In turn, Hidalgo-Betanzos et al. [16] applied ChatGPT as an expert in the field of energy renovation of buildings, advising on energy efficiency. Pei et al. [17] verified the potential of GPT-like tools to aid in the design of sustainable materials. Zhang et al. [18] investigated the possibility of using ChatGPT as an expert in the field of positive energy districts creating sustainable, self-sufficient communities. Finally, Feng et al. [19] integrated ChatGPT into an electric vehicles (EVs) battery recycling auction management system. In this study, ChatGPT was used as a decision support tool to reduce the risk of loss due to potential participant misunderstanding. The presented applications of ChatGPT in problems related to broadly understood sustainability indicate its great potential in this field. In particular, the use of ChatGPT as an expert or recommendation system supporting decisions in the area of sustainable development is an interesting research area discussed in the cited works.

One of the most important areas requiring expert support for society is certainly customer decision-making in the area of sustainable transport systems. In the European Union (EU), the implementation of sustainable development goals is primarily driven by the Green Deal adopted in 2019 and Fit for 55 implemented from 2021 [20]. According to the assumptions of the Green Deal package, the EU economy should become zero-emission and energy independent by 2050 [21]. In turn, Fit for 55 is a legislative package on climate and energy, containing the climate goals of the Green Deal [22]. It imposes on member states, among others, a 55% reduction in greenhouse gas (GHG) emissions by 2030 [23]. Since transport in the EU is responsible for approximately 25% of GHG emissions [24], it is of great importance for achieving the emission reduction targets adopted under the Green Deal and Fit for 55. The climate policy adopted by the EU therefore supports the development of EVs, which are now becoming an increasingly popular means of transport [25,26]. This is due to the fact that EVs have a much smaller impact on the environment than conventionally powered vehicles (up to 70% smaller than diesel vehicles) [27]. The electrification of transport, along with the increasing use of renewable energy sources, provides an opportunity to significantly reduce GHG emissions, contributing to the sustainability of the transport system [28]. This is particularly important in countries such as Poland, where as many as 73% of cars are older than 10 years [29], which means they do not meet modern emission standards.

The key issue for the successful introduction of EVs is the acceptance of consumers who are potential users of such vehicles [30]. Unfortunately, there are a number of operational factors that limit the acceptance of EVs, including limited range, long battery charging time, higher purchase costs than in the case of conventional vehicles, a smaller number of charging stations, and low consumer awareness and knowledge about EVs [31]. In a situation where the EV charging time is measured in tens of minutes or hours, and the number of charging stations is relatively small, purchasing an expensive EV with too short a range may force the user to wait for hours during a longer journey, first for a charging station to become available and then for the battery to charge. Such situations will negatively affect the consumer’s experience and acceptance of EVs, as well as the perceived quality of life related to transport accessibility [32]. Therefore, it is very important to support decisions regarding the purchase of such vehicles, so that the consumer considering the purchase of an EV can receive assistance in choosing the vehicle best suited to their needs. The large number of available EV models, combined with the possibility of their additional configuration, indicates the need to develop an expert tool for recommending EVs tailored to the individual needs of the consumer. The previously mentioned applications of ChatGPT and its enormous data processing capabilities indicate the usefulness of the GPT model in such tasks. Based on these premises, the aim of the article is to develop an expert EV recommendation system based on ChatGPT, as well as to fine-tune this system for diverse user groups.

Section 2 discusses the state of the art on LLMs and ChatGPT, including its potential and examples of application as a domain expert system. Section 3 describes the adopted research procedure related to the construction of an expert system based on ChatGPT. Section 4 presents the results of operation and fine-tuning of the developed system. The paper ends with conclusions from the conducted research included in Section 5.

2. Literature Review

2.1. The Potential of Using LLM ChatGPT as an Expert System

ChatGPT’s data processing skills are based on the vast amount of text data on which it was pre-trained [33]. Due to the size of the model itself and the large amount of training data, it has reached such a level of development that, according to some researchers, it can be considered an early version of AGI [34]. The term AGI means a model that has achieved human-level intelligence and is capable of performing all tasks that a human can perform [35]. A similar definition states that it is a system capable of performing a wide range of intellectual tasks, such as reasoning, problem-solving, and creativity [36]. Modern LLMs such as GPT-4 have such capabilities. This allows us to consider the latest LLMs to be, if not AGI, then at least a significant step in this direction [37].

LLMs owe their superior abilities in the area of understanding (i.e., text summarization and question answering) largely to their ability in natural language processing (NLP) [38]. NLP focuses on using computational techniques to learn, understand, and generate human language content [39]. This allows for improved human–machine communication by better understanding textual information sources and generating human-readable responses from the data [40]. Thanks to their NLP capabilities, LLMs are capable of mathematical and symbolic reasoning and other forms of inference [41]. Compared to traditional machine learning and deep learning approaches, LLMs have more advanced NLP skills and can understand text in a more complex way [42]. The advantage of LLMs over traditional approaches results from the way they are constructed. Their pre-training is based on large volumes of text data [43]. Therefore, during the pre-training and acquisition of general knowledge, they acquire universal linguistic knowledge and improve their NLP skills.

However, LLMs are not limited to general and universal knowledge, as they can also acquire handcrafted features and domain-specific knowledge [39]. Such matching of LLMs to domain knowledge occurs in the meta-training stage, which is conducted after the pre-training [44]. Meta-training allows the performance of pre-trained LLMs to be adjusted to the needs, preferences and intentions of users [43]. In this step, the LLM parameters are fine-tuned to specific tasks [45]. This is the so-called task-specific fine-tuning [43], or domain-specific fine-tuning [46,47]. Fine-tuning the LLM improves its adaptability, accuracy, and ability to more accurately identify communication indicators related to text [42]. Fine-tuning on domain-specific data improves the ability of LLMs to accurately interpret the nuances of words and semantic meanings, which leads to improved performance [48]. One important fine-tuning method is reinforcement learning from human feedback. This method leverages the knowledge of human evaluators and allows the model to adapt and evolve in response to real-world input data [49]. Human feedback also allows for correcting errors of LLMs. Therefore, it is important to carefully formulate prompts (prompt engineering), which allows LLMs to perform complex tasks more efficiently and generate better and more reliable answers [50]. Prompt engineering involves creating specific text instructions or cues to guide LLMs in performing particular tasks. In contrast to traditional full fine-tuning methods that directly adjust model parameters, prompt engineering focuses on refining prompts to effectively control model output. This approach leverages the inherent capabilities of LLMs, enabling efficient task adaptation without extensive parameter updates [48].

The ability to capture domain knowledge and perform domain-specific fine-tuning makes LLMs useful in many specialized areas. LLMs are used primarily in broadly understood medicine [51] and healthcare [52], including medical education, patient care, communications, hospital operations, data management, etc. [53]. There is also ongoing research into the development of LLM applications in areas such as accounting and finance [54], manufacturing [40], materials science [50], computer-aided design [55] and many others. Domain knowledge determines the application of LLMs as domain experts. LLM researchers note that they can replace human workers in demanding intellectual tasks [54]. What is more, LLMs can identify patterns and trends in data that may be invisible to human analysts [56]. As a result, LLMs can outperform human experts in tasks such as answering questions, generating summaries, and interpreting structured or unstructured data [51].

The above-described features of LLMs:

LLMs are similar to AGI,
ability to perform intellectual tasks such as problem-solving, creativity, reasoning,
high NLP skills, allowing for better communication, understanding source information, interpretation and reasoning,
having general and domain knowledge,

enable the construction of expert systems based on LLMs, with which one can communicate using natural language just as with human experts. Based on the above analyses, in this study we decided to use an approach in which the process of constructing an expert system would be based on the pre-trained LLM ChatGPT. It should be subjected to a domain-specific fine-tuning process to match it to domain knowledge. Reinforcement learning from human feedback, based on prompt engineering, will be used as a fine-tuning method. This approach will ensure efficient use of resources, time efficiency, relatively low complexity, user-friendliness, speed of adaptation, and the possibility of experimenting with different task configurations [48].

2.2. Applications of ChatGPT as a Domain Expert in the Literature

The concept of using ChatGPT as a domain expert is supported in the literature. Since the publication of ChatGPT in November 2022, there has been an increasing number of studies using ChatGPT as a domain expert, decision system, or expert system. The latest studies in this area are summarized in Table 1.

Praveen et al. [48] used 4 LLMs, including GPT-2, to assess sentiment in customer reviews. They validated the performance of LLMs using more than 1 million hotel reviews. Performance metrics known from classifier evaluation, such as accuracy, precision, recall, etc., were used to compare the performance of different LLMs. Liu [57] compared 12 LLMs, including GPT-3.5, as experts in the air transportation domain. LLMs generated answers to 18 domain reference questions covering various facts, reasoning, and explanations in the air transportation domain. Again, the models were compared using classifier performance metrics. Alipour-Vaezi and Tsui [58] used ChatGPT as a movie industry expert to evaluate the recognition (fame) of directors, scriptwriters, and actors. The accuracy of ChatGPT scores was verified by a group of 20 human domain experts. Zheng et al. [47] analyzed the effectiveness of GPT-3.5 Turbo and LLaMa-2 models, as well as 10 classic classifiers, in detecting faults in complex systems and distinguishing between different types of faults. Basic classification performance metrics were used to evaluate LLMs and classifiers. Wu et al. [59] developed an agent-based decision-making system for managing carbon dioxide emissions in manufacturing. In the agent system, 4 different LLMs were used, including GPT-3.5 and GPT-4. The system based on a given LLM was to diagnose the causes of increased CO₂ emissions in textile production, as well as to estimate the degree of certainty of its diagnosis. The performance of different LLMs was verified and assessed by a human domain expert.

Świrski and Błach [60] verified the effectiveness of ChatGPT-4o in selecting the charging and discharging hours of energy storage. The criterion for model verification was the financial income from the operation of energy storage in daily cycles. Similarly, Menéndez Medina and Heredia Álvaro [14] verified the ability of ChatGPT to predict electricity price fluctuations in the market. The study included not only the study of the effectiveness of LLMs but also their prior fine-tuning. Basic classification performance metrics were used to evaluate LLMs and classifiers. Hidalgo-Betanzos et al. [16] applied ChatGPT as a domain expert in the problem of energy renovation of buildings for energy efficiency. LLM recommendations were generated in three variants based on prompts with different levels of detail. Individual energy renovation recommendations were assessed by a human expert. Jurišević et al. [61] investigated the applicability of two versions of ChatGPT as expert systems to support building energy efficiency management. Depending on the scenario, the systems’ performance was compared with expert assessments or actual information and data. Similarly, Zhang et al. [18] used ChatGPT as an expert in the field of positive energy districts. ChatGPT’s responses to questions about the challenges, impacts, and best practices in positive energy districts were compared with the responses of the expert panel and verified by the authors of that study.

All the cited studies confirm the validity and great potential of using LLMs, including ChatGPT, as a domain expert or expert system for decision support purposes, including in the energy industry. It should be noted, however, that during the analysis of the literature on the use of LLMs as experts, we did not identify any study in which LLMs were used as experts in the area related to EVs. The closest to this area is the work of Feng et al. [19]. However, in this case, ChatGPT was an element of the EV battery recycling auction management system, not an autonomous or expert system. Considering the indicated gap, it is fully justified to develop an expert system that will recommend EVs that are best suited to the needs of users.

3. Configuration of ChatGPT as an Expert System for EV Recommendation

During the study, the ChatGPT 4o model based on the GPT-4-turbo algorithm was used. This model, developed by OpenAI [62], is accessed through an API. The choice of this algorithm was motivated by its availability at the time of the study and the stability of its responses. In order to construct an EV recommendation system based on user preferences and needs, a procedure consisting of the following stages was used [63]:

data collection,
data processing,
analysis of user preference classes and needs,
system configuration, testing and fine-tuning.

3.1. Data Collection

The recommendation system is intended to operate in Poland. Therefore, in order to obtain up-to-date data on available EV models, the “Electric Vehicle Catalogue 2023” [64] developed by the Polish New Mobility Association (PNMA) was used. This document contains detailed technical specifications of EVs, such as engine power, range, price, release date and other technical data. This data is necessary to understand the parameters of the available models and to compare them. An additional source of information was the Electric Vehicle Database (EVDB) [65], which provides similar data, with an emphasis on technical specifications, range, energy efficiency and cost of use. This database is particularly useful for analyzing the economic profitability of individual EV models. In order to extend the data set with practical market information, two websites with car advertisements, mobile.de [66] and otomoto.pl [67], were used as an additional source of current data. Data from these websites included EV sales offers, including prices, mileage, locations, and other relevant parameters. Web scraping techniques were used to extract data from these websites using the BeautifulSoup [68] and Selenium [69] libraries. This involved navigating the websites to identify elements containing car data, then extracting the data and saving it in a structured format.
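The extraction step can be illustrated with a minimal sketch. It uses Python's standard-library `html.parser` in place of BeautifulSoup and Selenium so that it runs without dependencies. The HTML snippet, the class names (`offer`, `title`, `price`), the prices and the helper names are hypothetical and do not reflect the real markup or content of mobile.de or otomoto.pl.

```python
from html.parser import HTMLParser

# Hypothetical listing markup; real advert pages are structured differently.
SAMPLE_HTML = """
<div class="offer"><span class="title">Nissan Leaf</span><span class="price">89 900 PLN</span></div>
<div class="offer"><span class="title">Kia EV6</span><span class="price">215 000 PLN</span></div>
"""


class OfferParser(HTMLParser):
    """Collects the title and price of every element marked class="offer"."""

    def __init__(self):
        super().__init__()
        self.offers = []    # structured output: one dict per advert
        self._field = None  # field whose text is expected next

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class == "offer":
            self.offers.append({})
        elif css_class in ("title", "price") and self.offers:
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self.offers[-1][self._field] = data.strip()
            self._field = None


def parse_price(text):
    """Turn a string such as '89 900 PLN' into the integer 89900."""
    return int("".join(ch for ch in text if ch.isdigit()))


parser = OfferParser()
parser.feed(SAMPLE_HTML)
offers = [{"title": o["title"], "price_pln": parse_price(o["price"])}
          for o in parser.offers]
print(offers)
```

On real advert pages, a browser-automation tool such as Selenium would typically be needed first to render dynamically loaded content before any parsing takes place.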

3.2. Data Processing

In order to avoid redundancies, data from PNMA and EVDB were integrated into a single data set. Data collected from different sources required a uniform format and cleaning. Data processing included:

Removing duplicates and incomplete records.
Normalization of values (e.g., currency conversion, unification of units of measurement, etc.).
Supplementing missing data based on information available in other sources where possible, and based on approximation otherwise.
Data formatting for more efficient processing by the LLM. Such formatting included, among others, changing all letters to lowercase, removing punctuation marks, changing the order of attributes, unifying separator characters, separating columns with values from units of measurement.
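The cleaning steps listed above might look roughly as follows. This is a simplified sketch, not the authors' pipeline: the records, the field names and the exchange rate are assumptions, and incomplete records are simply dropped here, whereas the study also supplemented missing values from other sources.

```python
import re

# Hypothetical raw records merged from two sources; values are illustrative only.
raw = [
    {"model": "Kia EV6", "price": "49,000 EUR", "range": "394 km"},
    {"model": "kia ev6", "price": "49,000 EUR", "range": "394 km"},  # duplicate
    {"model": "Nissan Leaf", "price": "150000 PLN", "range": None},  # incomplete
    {"model": "Tesla Model 3", "price": "200000 PLN", "range": "513 km"},
]
EUR_TO_PLN = 4.3  # assumed exchange rate


def split_value_unit(text):
    """Separate a string such as '49,000 EUR' into (49000.0, 'eur')."""
    match = re.fullmatch(r"\s*([\d,.\s]+?)\s*([A-Za-z]+)\s*", text)
    number = float(re.sub(r"[,\s]", "", match.group(1)))
    return number, match.group(2).lower()


def clean(records):
    seen, cleaned = set(), []
    for rec in records:
        if any(value is None for value in rec.values()):
            continue                  # drop incomplete records
        model = rec["model"].lower()  # lowercase for uniform matching
        if model in seen:
            continue                  # drop duplicates
        seen.add(model)
        price, currency = split_value_unit(rec["price"])
        if currency == "eur":
            price *= EUR_TO_PLN       # normalise currency to PLN
        range_km, _ = split_value_unit(rec["range"])
        cleaned.append({"model": model,
                        "price_pln": round(price),
                        "range_km": int(range_km)})
    return cleaned


cleaned = clean(raw)
print(cleaned)
```

Keeping numeric values and units in separate fields, as in `price_pln` and `range_km`, mirrors the column separation described in the last step of the list.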

In order to simplify the process of collecting and processing data, as well as to verify whether ChatGPT can fill in missing data on its own, the number of cars was limited to 18 models representing different car classes.

City class—small, low-priced cars with low performance designed mainly for city driving—5 models: Fiat 500e 3+1 42 kWh, Dacia Spring Electric 45, Opel Corsa Electric 50 kWh, Peugeot e-208 50 kWh, Nissan Leaf.
Middle class—larger cars, the most versatile—6 models: Kia EV6, Skoda Enyaq 60, Volkswagen ID.4 Pro, Hyundai IONIQ 5 Standard Range 2WD, Tesla Model 3, BMW iX3.
High class—premium class cars—4 models: Mercedes-Benz EQE 300, Audi Q8 e-tron 50 quattro, Volvo EX90 Single Motor, Tesla Model S Dual Motor.
Minivan class—cars selected for their greater capacity to transport people or cargo—3 models: Citroen e-Berlingo M 50 kWh, Nissan Townstar EV, Renault Kangoo E-Tech Electric.

One of the convenient options of working with ChatGPT is providing data in the PDF format, so the collected data was downloaded, converted and merged into files in this format. Thanks to this solution, the structure of the document is preserved and the manual document correction process is facilitated. Using external sources (PNMA and EVDB), a file containing selected data was prepared, including detailed information on technical specifications, prices and availability of individual car models. The file was cleaned to remove errors, duplicates, and unnecessary content. Despite the significant convenience of working with the PDF format, LLMs may have difficulty reading complex tables and graphs. Therefore, some information was prepared in the XLSX structured format. The PyMuPDF library [70] was used to extract data from PDF files, which allows for accurate processing of PDF documents. This process included the following steps:

Opening and reading the content of PDF files.
Identifying sections containing important information about cars.
Extracting data and saving it in the XLSX format.
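The three steps can be sketched as below. Only `read_pdf_pages` touches PyMuPDF, which is imported lazily so that the rest runs without it. The `label: value` page layout, the field names and the sample page are assumptions, and CSV stands in for XLSX to avoid a spreadsheet dependency.

```python
import csv
import io
import re


def read_pdf_pages(path):
    """Step 1: return the text of each page (requires PyMuPDF to be installed)."""
    import fitz  # PyMuPDF; imported lazily so the rest of the sketch runs without it
    with fitz.open(path) as doc:
        return [page.get_text() for page in doc]


# Assumed layout: "<label>: <value>" lines; the catalogue's real layout may differ.
FIELD_PATTERN = re.compile(r"^(Model|Range|Battery|Price):\s*(.+)$", re.MULTILINE)


def extract_car(page_text):
    """Step 2: keep only the lines that carry car data."""
    return {label.lower(): value.strip()
            for label, value in FIELD_PATTERN.findall(page_text)}


def to_csv(rows):
    """Step 3: save the records as a table (CSV stands in for XLSX here)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["model", "range", "battery", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical page text, as it might be returned by read_pdf_pages().
sample_page = ("Catalogue page 12\nModel: Example EV\nRange: 400 km\n"
               "Battery: 60 kWh\nPrice: 200000 PLN\nFooter text")
rows = [extract_car(sample_page)]
print(to_csv(rows))
```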

Both the data collected in PDF files and in the XLSX file fed the knowledge base of LLM ChatGPT, used in the expert recommendation system. In addition, ChatGPT was instructed to obtain information from its general knowledge acquired during the pre-training of the model and from the Internet (e.g., EV manufacturers’ websites) when necessary. This was particularly important when no EV included in the knowledge base fully matched the user’s preferences. ChatGPT could then search for an EV other than the 18 cars included in the knowledge base and recommend it as the preferred solution.

3.3. EV Selection Criteria and Recommendation Techniques

A set of criteria was defined to select appropriate EVs, tailored to the needs of specific users. Based on the literature [28,31,71,72,73,74], the following basic EV selection criteria were used:

Preferred budget: What is the maximum budget for purchasing a car?
Expected range: What range on a single charge is preferred? Does the user plan on long trips or mainly city driving?
Body type: What body type is preferred by the user? Are there any specific requirements regarding the vehicle class?
Charging method: Will the car be charged only using chargers? Will the user be able to charge independently, and if so, from which power source?
Vehicle size and capacity: How many people will be transported? Will large luggage be transported?

The recommendation system presents these criteria to the user in the form of questions in order to model user preferences. The answers obtained from the user are processed using NLP algorithms, then classified and compared with the data contained in the knowledge base of the recommendation system. Based on the obtained results, ChatGPT generates recommendations regarding the most relevant EV models.

The ChatGPT system uses advanced recommendation algorithms, including Content-Based Filtering, Collaborative Filtering, Hybrid Recommendation Systems, Matrix Factorization Models, Machine Learning, Rule-Based Systems, and Demographic-Based Personalization [62]. These algorithms analyze the compatibility of available EV models with user preferences and take into account both technical parameters and subjective user expectations. Each algorithm has its specific application and is used to analyze user preferences and match available EV models to them.

Content-Based Filtering [75] involves analyzing the characteristics of items (in this case EVs) and user preferences. The system creates a user preference profile based on the answers provided and compares it with the features of available EV models. This filtering is used to analyze the technical specifications of cars, such as engine power, range, charging time, body type, and to match user preferences to car features. For example, if the user prefers EVs with long range and low operating costs, the system will display EV models that meet these criteria.
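As a generic illustration of content-based filtering (not a description of how ChatGPT works internally), the sketch below scores each car by the fraction of profile requirements it meets. The catalogue entries, field names and equal weighting of criteria are assumptions.

```python
# Illustrative catalogue; figures are placeholders, not data from the study.
CATALOGUE = [
    {"model": "city car A", "price_pln": 120_000, "range_km": 220, "seats": 4, "body": "hatchback"},
    {"model": "family SUV B", "price_pln": 210_000, "range_km": 420, "seats": 5, "body": "suv"},
    {"model": "premium sedan C", "price_pln": 380_000, "range_km": 560, "seats": 5, "body": "sedan"},
]

# Preference profile built from the user's answers (hypothetical values).
profile = {"max_price_pln": 250_000, "min_range_km": 300, "min_seats": 5, "body": "suv"}


def score(car, profile):
    """Fraction of profile requirements that the car satisfies (0.0 to 1.0)."""
    checks = [
        car["price_pln"] <= profile["max_price_pln"],
        car["range_km"] >= profile["min_range_km"],
        car["seats"] >= profile["min_seats"],
        car["body"] == profile["body"],
    ]
    return sum(checks) / len(checks)


def recommend(catalogue, profile, top_n=2):
    """Return the top_n (model, score) pairs, best match first."""
    ranked = sorted(catalogue, key=lambda car: score(car, profile), reverse=True)
    return [(car["model"], score(car, profile)) for car in ranked[:top_n]]


print(recommend(CATALOGUE, profile))
```

A weighted sum would be a natural extension when some criteria, such as budget, matter more than others.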

Collaborative Filtering [75] is based on the analysis of other users’ behaviour. The algorithm compares the user’s choices with other users with similar preferences to provide recommendations. This process is used to analyze the choices and opinions of other users and makes recommendations based on the similarity of preferences among users. Example: if multiple users with similar preferences have selected a particular model, the system can suggest this model to a new user with similar preferences.
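A minimal user-based variant of this idea is sketched below. It is a textbook illustration rather than the mechanism inside ChatGPT. The user names and ratings are invented, and cosine similarity is one of several possible similarity measures.

```python
from math import sqrt

# Hypothetical ratings (1-5) given by earlier users to EV models.
RATINGS = {
    "anna":  {"Kia EV6": 5, "Tesla Model 3": 4, "Nissan Leaf": 1},
    "piotr": {"Kia EV6": 4, "Tesla Model 3": 5, "Skoda Enyaq 60": 5},
    "ewa":   {"Nissan Leaf": 5, "Fiat 500e": 4, "Kia EV6": 1},
}


def cosine(a, b):
    """Cosine similarity between two sparse rating dictionaries."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(new_user, ratings, top_n=1):
    """Rank unseen models by similarity-weighted ratings of other users."""
    scores = {}
    for other in ratings.values():
        sim = cosine(new_user, other)
        for model, rating in other.items():
            if model not in new_user:
                scores[model] = scores.get(model, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


new_user = {"Kia EV6": 5, "Tesla Model 3": 5}
print(recommend(new_user, RATINGS))
```

Here the new user resembles "anna" and "piotr" most closely, so the model that "piotr" rated highly and the new user has not yet seen is ranked first.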

Hybrid Recommendation Systems [76] combine content-based filtering and collaborative filtering to provide more precise recommendations. The hybrid approach minimizes the shortcomings of each single approach and increases the accuracy of the recommendation through complementarity of the methods. Example: the system first identifies EV models that meet the user’s technical requirements, and then applies collaborative filtering to select models that are most popular among users with similar preferences.

Matrix Factorization Models [77] reduce the dimensionality of recommendation data by decomposing the user-item matrix (in this case, users and cars) into lower-dimensional factor matrices. Applied to large sets of recommendation data, this makes it possible to identify hidden patterns of user preferences. Example: a system may identify that users who prefer certain car brands also tend to choose cars with certain technical features.
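The decomposition can be illustrated with a small stochastic-gradient sketch in pure Python. The rating matrix, the number of latent factors `k` and the learning-rate and regularization values are all assumptions chosen for illustration. Production systems would normally rely on a numerical library.

```python
import random

# Hypothetical user x car rating matrix; None marks an unknown rating.
R = [
    [5, 4, None, 1],
    [4, None, 1, 1],
    [1, 1, 5, None],
    [None, 1, 4, 5],
]


def factorize(R, k=2, steps=3000, lr=0.01, reg=0.02, seed=0):
    """Approximate R by P x Q^T using SGD on the known entries only."""
    rng = random.Random(seed)
    n_users, n_items = len(R), len(R[0])
    P = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_items)]
    known = [(u, i) for u in range(n_users) for i in range(n_items)
             if R[u][i] is not None]
    for _ in range(steps):
        for u, i in known:
            err = R[u][i] - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                p, q = P[u][f], Q[i][f]
                P[u][f] += lr * (err * q - reg * p)
                Q[i][f] += lr * (err * p - reg * q)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating of item i by user u."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(R, P, Q):
    """Root-mean-square error over the known ratings."""
    errs = [(R[u][i] - predict(P, Q, u, i)) ** 2
            for u in range(len(R)) for i in range(len(R[0]))
            if R[u][i] is not None]
    return (sum(errs) / len(errs)) ** 0.5


P, Q = factorize(R)
print("training RMSE:", rmse(R, P, Q))
print("predicted rating for user 0, car 2:", predict(P, Q, 0, 2))
```

The rows of `P` and `Q` are the latent factors. Unknown ratings are estimated from their dot product.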

Machine Learning algorithms [78], such as Random Forest [79], Support Vector Machine [80], and Deep Neural Networks [79], are used to analyze complex patterns in data. They can predict users’ preferences and future choices by learning from historical data. For example, a system can learn that users belonging to certain demographic groups tend to choose cars with certain specifications.

Rule-Based Systems [81] use defined rules to generate recommendations. These rules can be based on the analysis of historical data or expert knowledge. The rules are used to generate recommendations based on strictly defined criteria and take into account specific user requirements. For example, if a user indicates that they prefer cars with a high level of safety, the system will apply rules to select models with the best crash test results.
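A rule-based recommender can be sketched as a list of explicit conditions. The rules, thresholds and catalogue values below are assumptions made for illustration. They are not the rules used in the study.

```python
# Hypothetical catalogue; values are placeholders, not data from the study.
CATALOGUE = [
    {"model": "city car A", "range_km": 220, "fast_charge_kw": 50, "seats": 4},
    {"model": "family SUV B", "range_km": 420, "fast_charge_kw": 125, "seats": 5},
    {"model": "minivan C", "range_km": 450, "fast_charge_kw": 80, "seats": 7},
]

# Each rule: (description, test whether it applies to the user, requirement on the car).
RULES = [
    ("long trips need long range",
     lambda user: user["trip_type"] == "long",
     lambda user, car: car["range_km"] >= 400),
    ("no home charging needs fast charging",
     lambda user: not user["home_charging"],
     lambda user, car: car["fast_charge_kw"] >= 100),
    ("every traveller needs a seat",
     lambda user: True,
     lambda user, car: car["seats"] >= user["people"]),
]


def recommend(user, catalogue, rules=RULES):
    """Return models that satisfy every rule applicable to this user."""
    active = [(name, ok) for name, applies, ok in rules if applies(user)]
    return [car["model"] for car in catalogue
            if all(ok(user, car) for _, ok in active)]


commuter = {"trip_type": "city", "home_charging": True, "people": 2}
traveller = {"trip_type": "long", "home_charging": False, "people": 5}
print(recommend(commuter, CATALOGUE))
print(recommend(traveller, CATALOGUE))
```

Because every rule carries a description, the list of active rules can also be used to explain to the user why a model was excluded.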

Demographic-Based Personalization [82], as the name suggests, is based on algorithms that use user demographic data (e.g., age, gender, place of residence). This data is used to provide recommendations that are more tailored to specific user needs, resulting from demographics. For example, the system can identify that users of a certain age and from a certain region prefer certain types of vehicles, which allows for better tailoring recommendations to individual needs.

3.4. System Configuration, Testing, and Fine-Tuning

The process of configuring and testing the ChatGPT-based recommendation system differed from the traditional approach used when working with machine learning models. These differences are due to the design and functionality of LLM ChatGPT, and in particular its high level of natural language understanding. In general, the communication style and quality of responses generated by an LLM outperform those of classical machine learning models in terms of semantics, fluency, and usability. Therefore, configuring and testing a system based on an LLM requires a different approach than in the case of machine learning. The flow chart of the testing process is shown in Figure 1.

The first step in the configuration, testing and fine-tuning process was to validate the data that fed the system’s knowledge base. This validation was performed by asking the expert system basic questions about the EV parameters stored in the knowledge base. Unsatisfactory data validation results forced several iterations of data processing. In order to correctly load all data into the system, modification of the format, method of recording and order of the source data was required. In particular, it was necessary to modify the data in the XLSX file (splitting the columns with data into more detailed ones and changing the order of data in the columns) and to modify the data in the PDF files (cleaning and removing unnecessary data). These actions allowed for obtaining more accurate and precise answers from the expert system to basic questions verifying the content of the system’s knowledge base.

After feeding the expert system with data, prompts were provided to the system using natural language regarding how to obtain the required information from the user. This information was to be obtained by the system using questions regarding, among others, the purchase cost and the way the vehicle was used. The initial (basic) set of questions included the following topics:

Budget—How much do you plan to spend on the car? (e.g., <150,000 PLN, 150–250,000 PLN, >250,000 PLN)
Range—How long routes do you cover most often? (e.g., mainly in the city, up to 150 km per day, long routes >300 km)
Body type—Are you looking for a hatchback, SUV, sedan or maybe a van?
Charging—Do you have access to charging at home or at work?
Space—How many people usually travel with you and do you often carry large luggage?
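The basic question set above can also be written down as plain data. The following is only a minimal sketch of one possible representation; in the study the questions were passed to ChatGPT in natural language, and the `Question` class, the `build_instructions` helper and the wording of the instruction header are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    topic: str
    text: str


# The five basic questions, copied from the list above (answer examples omitted).
BASIC_QUESTIONS = [
    Question("Budget", "How much do you plan to spend on the car?"),
    Question("Range", "How long routes do you cover most often?"),
    Question("Body type", "Are you looking for a hatchback, SUV, sedan or maybe a van?"),
    Question("Charging", "Do you have access to charging at home or at work?"),
    Question("Space", "How many people usually travel with you and do you often carry large luggage?"),
]


def build_instructions(questions):
    """Assemble a numbered, natural-language instruction block (hypothetical wording)."""
    lines = ["Ask the user the following questions, one at a time:"]
    for number, question in enumerate(questions, start=1):
        lines.append(f"{number}. [{question.topic}] {question.text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_instructions(BASIC_QUESTIONS))
```

Keeping the questions as data rather than as free text makes it easier to add or reorder questions between test iterations.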

The importance of individual questions asked by the system to the user and their impact on generating the final answer (recommendation) were also provided and regulated in natural language. The method of presenting the results of the system’s operation was also defined and transferred to ChatGPT using natural language. It should be noted that during the construction of individual prompts in natural language for the system, prompt formulation methods prepared specifically for this language model were used [83].

The essence of the testing process was the user’s interaction with the system. The user defined their needs by answering questions asked by the expert system, and then the domain expert verified the EV recommendations generated by the system. Based on a comparison between the recommendations of the domain expert and the expert system, we calculated the expert system’s error rate, defined as the ratio of incorrect recommendations generated by the system to the total number of recommendations. Similarly, we calculated the user preference error rate, defined as the number of incorrect user choices relative to the expert’s recommendations. The study also utilized a score-based evaluation for both the expert system’s recommendations and user preferences, as well as the opposing metric to the error rate: the expert system’s accuracy [84], defined as the number and rate of recommendations consistent with the expert.
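The error rate and accuracy defined above can be illustrated with a short sketch. It assumes that a recommendation counts as incorrect whenever the vehicle named by the system differs from the vehicle named by the domain expert; the function names and the toy data are hypothetical and are not taken from the study.

```python
def error_rate(system_recs, expert_recs):
    """Share of system recommendations that differ from the expert's."""
    if len(system_recs) != len(expert_recs):
        raise ValueError("both lists must cover the same users")
    if not system_recs:
        raise ValueError("at least one recommendation is required")
    incorrect = sum(1 for s, e in zip(system_recs, expert_recs) if s != e)
    return incorrect / len(system_recs)


def accuracy(system_recs, expert_recs):
    """Share of system recommendations consistent with the expert's."""
    if len(system_recs) != len(expert_recs):
        raise ValueError("both lists must cover the same users")
    if not system_recs:
        raise ValueError("at least one recommendation is required")
    correct = sum(1 for s, e in zip(system_recs, expert_recs) if s == e)
    return correct / len(system_recs)


if __name__ == "__main__":
    # Illustrative data only: four users, two matching recommendations.
    system = ["Volkswagen ID.3", "Volkswagen ID.4", "Tesla Model 3", "Kia EV9"]
    expert = ["Volkswagen ID.3", "Volkswagen ID.3", "Tesla Model 3", "Škoda Enyaq"]
    print(error_rate(system, expert), accuracy(system, expert))
```

The same functions can be applied to user preferences by passing the users' initial choices in place of the system recommendations.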

Due to the inaccurate recommendations occurring in the case of users with less knowledge and lower awareness of their needs, additional modifications were introduced at a later stage of system development. Using natural language, the way the system provides answers (recommendations for users) was modified and a set of additional questions for users was defined in order to specify their needs and preferences. It should be noted that the stage of configuring and setting the system using natural language prompts was carried out in several iterations. Both the importance of questions and their impact on recommendations, as well as the content of additional questions for users, were adjusted and modified in subsequent iterations depending on the previous test results. The aim of these modifications was to obtain the most accurate vehicle recommendations. It is also worth noting that during the testing process, communication with the recommending system was conducted in Polish, English and German. This enabled verification of answers provided in different languages, as well as comparison of the efficiency and consistency of the conversation with the LLM in many languages. In addition, such a multilingual approach ensures greater usability and universality of the expert system.

4. Results

4.1. Testing of the EVs Recommendation System Based on the Initial Set of Questions

At the stage of testing the recommendation system, research was conducted on a group of 34 users. The results of the study are presented in Table 2.

During the research, the users’ technical, economic and automotive knowledge related to the subject of EVs was important. Users were divided into three groups based on their self-assessment of their knowledge of EVs. These were users with advanced knowledge of these areas (user A), intermediate users with standard (average) knowledge (user I), and users with little knowledge (user B). These user groups are distinguished in Table 2 in the “User” column. Each user initially selected their preferred EV model. This was a subjective choice, based on knowledge and individual user preferences (column “User preference before interaction” in Table 2). During the user’s interaction with the expert system, based on the user’s answers to the basic 5 questions, the system identified the vehicle best suited to the user’s needs and requirements. This vehicle was presented in the “System recommendation” column. Then, based on the conversation with the user, a similar vehicle identification was made by the domain expert (the “Expert recommendation” column). The significant difference was that the expert system was limited to a basic set of 5 questions, while the expert could ask the user any number of freely formulated questions. In addition, the expert assessed the accuracy of the user’s choice on a point scale, i.e., whether the vehicle preferred by the user is adapted to their requirements and needs (the “Accuracy of user preferences—assessment” column). The expert also made a similar assessment of the vehicle recommended by the expert system (the “Accuracy of system recommendations—assessment” column). These assessments were supported by additional comments on the differences between the vehicles preferred by the user, those indicated by the expert and those recommended by the expert system. In particular, the column “Accuracy of user preferences—comments” contains encoded comments on the accuracy of the user’s initial preference in relation to the expert’s recommendation. 
In turn, the column “Accuracy of system recommendations—comments” contains a comment on the consistency of the recommendation generated by the system with the vehicle selection made by the expert. All of these comments refer to the expert perspective, e.g., “cheaper option” means that the expert suggested a comparable EV that is cheaper. It should be noted that the domain expert (similarly to ChatGPT) could use information contained on the Internet, both when selecting a vehicle for the user, as well as when assessing the choices made by the user and the expert system. This allowed for the ongoing filling of any information gaps by the expert, ensuring that the expert recommendations and assessments are based on current and complete knowledge of the EV market.

The analysis of Table 2 indicates that users correctly assess their knowledge of EVs. Users who assessed themselves as advanced (A) demonstrated significantly greater knowledge of and familiarity with EVs. In most cases, they correctly selected vehicles adapted to their needs. Intermediate (I) and beginner (B) users coped much worse with this task. They often selected vehicles that were too large or too expensive for their needs. Moreover, sometimes a vehicle offering greater possibilities, e.g., a larger trunk, more space inside or a longer range, could be found at a price similar to that of the preferred vehicle. As for the correctness of the selection (matching) of vehicles to the needs of users by the expert system, the best results were also obtained in the group of advanced users. On the other hand, the recommendations generated by the expert system for the remaining user groups were much worse. In the user groups with a standard (I) and low (B) level of knowledge about EVs, the expert system often suggested purchasing vehicles with similar performance parameters, but more expensive than those indicated by the domain expert. Equally often, the system suggested vehicles that were too large for the user’s needs, or vehicles at a price similar to those recommended by the domain expert, but less functional. Table 3 presents a quantitative summary of user preferences and expert system recommendations in relation to the actual needs diagnosed by the domain expert. These results are presented separately for individual user groups.

To sum up the results presented in Table 2 and Table 3, it should be pointed out that advanced users are highly aware of their needs. Their preferences match the recommendations of the domain expert well, and the average value of the user preference accuracy score is 9.8. This means that advanced users are able to correctly determine what kind of car they need. Thanks to this, the expert system, based on the basic five questions, accurately recommends vehicles. In 8 out of 10 cases, the system recommended exactly what the users needed, and the average recommendation score of the system was 9.8. In the case of this group of users, the expert system worked well. In turn, beginners and intermediate users are people who have basic or even slightly more knowledge, but very often choose vehicles that do not match their actual needs. The average rating of their preferences was 6.69 for intermediate users and 5.27 for beginners. Because these users cannot accurately diagnose their needs, the expert system also fails to select EVs for them and only generates correct recommendations in individual cases. The average recommendation rating of the expert system for these user groups was 7.54 for intermediate users and 6.45 for beginners, respectively. This is better than the accuracy assessment of these users’ preferences, so the expert system’s recommendations are better than their initial vehicle choice. However, it is still not a satisfactory result.

Taking into account the good results of the system in the group of advanced users, we adopted the hypothesis that the incorrect recommendations of the expert system in the groups of less advanced users resulted from these users’ lack of knowledge. In particular, the lack of awareness of their needs and the lack of knowledge about possible technical and infrastructural solutions related to EVs meant that the system did not receive the information necessary for an accurate recommendation. In other words, users without much knowledge are unable to precisely define their needs in relation to the market offer. For this reason, their preferences are often contradictory (e.g., they are looking for a vehicle that is cheap, but at the same time large and with a long range), which causes the expert system to generate incorrect recommendations. The conclusions from the analysis of Table 2 and Table 3 suggest introducing additional auxiliary questions that would supplement the knowledge about the detailed needs of users, especially intermediate users and beginners.

4.2. Optimization of the EVs Recommendation System and Its Impact on the Recommendation Results

The optimization of the expert system consisted in finding the most common problems related to the lack of data on user needs. These gaps were filled by introducing auxiliary questions, which were to allow for obtaining more detailed requirements that EVs should meet. In the improved system, the data gaps were filled using auxiliary questions:

about the range and charging frequency,
about the vehicle size and capacity,
about battery charging,
about weather conditions and available infrastructure.

4.2.1. Optimization of Questions About Range and Charging Frequency

Problem: users were overstating the minimum range they needed, not being aware of actual energy consumption, charging times, and the availability of EV charging infrastructure.

Solution: additional questions were introduced.

How often do you cover distances over 300 km in a single trip?
Between which cities do you travel most often?
Do you have access to a fast charger?
Do you prefer a car with a large battery but longer charging time, or a smaller battery and fast charging?

Impact on recommendations:

If the user rarely exceeds 300 km without breaks, the recommendation included cars with a shorter range (e.g., Hyundai Kona Electric 39 kWh instead of the more expensive 64 kWh model).
Users who often travel on the motorway received recommendations for cars with fast charging support (e.g., Tesla Model 3 LR, BMW i4).
People who preferred a long range and charged their car at home received recommendations for models with a larger battery (e.g., Mercedes EQE 500).
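The three effects listed above can be restated as a small decision function. This is only an illustrative sketch: in the study these rules were expressed to ChatGPT in natural language rather than in code, and the `RangeAnswers` fields, the `range_hints` function and the hint strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RangeAnswers:
    rarely_exceeds_300_km: bool
    often_on_motorway: bool
    prefers_long_range: bool
    charges_at_home: bool


def range_hints(answers):
    """Map answers to the auxiliary range questions onto recommendation hints."""
    hints = []
    if answers.rarely_exceeds_300_km:
        # A shorter range (smaller, cheaper battery) is sufficient.
        hints.append("shorter range is sufficient")
    if answers.often_on_motorway:
        hints.append("fast charging support")
    if answers.prefers_long_range and answers.charges_at_home:
        hints.append("larger battery")
    return hints


if __name__ == "__main__":
    city_driver = RangeAnswers(True, False, False, True)
    print(range_hints(city_driver))
```

The hints are deliberately not mutually exclusive, since a single user may match several of the conditions at once.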

4.2.2. Optimization of the Question About the Size and Load Capacity of the Vehicle

Problem: users often chose oversized EVs, not necessarily needing a larger body. The key information was the boot capacity adjusted to the needs.

Solution: additional questions were added.

How many people will most often travel by car, do you have children, how often do you travel with passengers?
How often and if at all do you transport large luggage (stroller, sports equipment)?
Do you need the increased ground clearance of an SUV for driving in difficult terrain?

Impact on recommendations:

Users with two children were offered family cars with a large boot (e.g., Škoda Enyaq, Kia EV9).
People who chose SUVs but did not need a lot of space were offered spacious hatchbacks (e.g., Volkswagen ID.3 instead of ID.4).
If the user needed an SUV only for ground clearance and traction, but mainly drove in the city, models with AWD drive and a smaller body were suggested (e.g., Volvo XC40 Recharge AWD instead of Volvo EX90).

4.2.3. Optimization of the Question About Charging

Problem: not all users were aware of how important access to a home charger and its type are, especially if the primary use of the EV was to be moving around the city.

Solution: additional questions were introduced.

Do you have access to a 230 V socket or a home charger?
How long does your car usually park in the same place?
Do you use DC chargers on the route or do you stop for longer breaks?

Impact on recommendations:

Users without home charging were given models with shorter DC charging times (e.g., Tesla Model 3, Hyundai Ioniq 5 instead of Renault Megane E-Tech).
If the car was to be parked for more than 8 h per day, models with smaller batteries and economical energy consumption were suggested (e.g., Volkswagen ID.3 58 kWh).
People with access to a home charger could receive recommendations for models with larger batteries but slower public charging (e.g., BMW iX3 instead of BMW i4).

4.2.4. New Questions Related to Weather Conditions and Available Infrastructure

Problem: EVs have different performance related to ambient temperature and parking space. Unfavourable conditions can reduce the range by up to 30%, which affects the comfort of EV use.

Solution: additional questions were introduced.

Do you live in a region where temperatures regularly drop below −10 °C?
Will your car be regularly parked outside in winter?
Do you have the option to heat the battery before driving?

Impact on recommendations:

Users from cold regions were recommended EVs with heat pumps and better thermal insulation (e.g., Hyundai Ioniq 5, Volvo XC40 Recharge).
People without a garage and access to a warm parking space were given models that cope well in winter (e.g., Tesla Model Y AWD instead of Volkswagen ID.4).
Users who park outdoors and frequently use public chargers were informed about higher energy losses in winter, which could have influenced the choice of a car with a larger battery.

4.3. Tests of the EVs Recommendation System Based on an Extended Set of Questions

After 13 additional questions were introduced, the system based its recommendations on a total of 18 questions about user preferences. In the dialogue, it used the 5 basic questions and, if necessary, offered the possibility of refining the user’s preferences using the remaining 13 questions. Using the modified question base, the study was repeated on the same group of users. The results of the repeated study are presented in Table 4, and their analysis in Table 5. The results presented in Table 4 largely confirm the hypothesis that the expert system’s erroneous recommendations result from users’ lack of awareness of their needs. In turn, the introduction of additional questions allowed for the clarification of the user’s specific needs and resulted in an increase in the accuracy of the recommendations generated by the expert system. In effect, this partially reproduced the work of a human domain expert who, when preparing a vehicle recommendation for the user, was not limited in any way in asking questions and could ask about any issues. The quantitative approach to the improvement of the system’s performance in each user group is presented in Table 5.
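The extended question base (5 basic and 13 auxiliary questions) can be sketched as a simple data structure. The question texts are those listed in Section 4.2; how the system decided that refinement was necessary is not specified, so the `unclear_groups` argument and the `dialogue_plan` function are hypothetical.

```python
BASIC_TOPICS = ["Budget", "Range", "Body type", "Charging", "Space"]

AUXILIARY = {
    "range and charging frequency": [
        "How often do you cover distances over 300 km in a single trip?",
        "Between which cities do you travel most often?",
        "Do you have access to a fast charger?",
        "Do you prefer a car with a large battery but longer charging time, "
        "or a smaller battery and fast charging?",
    ],
    "vehicle size and capacity": [
        "How many people will most often travel by car, do you have children, "
        "how often do you travel with passengers?",
        "How often and if at all do you transport large luggage (stroller, sports equipment)?",
        "Do you need the increased ground clearance of an SUV for driving in difficult terrain?",
    ],
    "charging": [
        "Do you have access to a 230 V socket or a home charger?",
        "How long does your car usually park in the same place?",
        "Do you use DC chargers on the route or do you stop for longer breaks?",
    ],
    "weather conditions and infrastructure": [
        "Do you live in a region where temperatures regularly drop below −10 °C?",
        "Will your car be regularly parked outside in winter?",
        "Do you have the option to heat the battery before driving?",
    ],
}


def dialogue_plan(unclear_groups):
    """Basic topics first, then auxiliary questions only for groups flagged as unclear."""
    plan = list(BASIC_TOPICS)
    for group, questions in AUXILIARY.items():
        if group in unclear_groups:
            plan.extend(questions)
    return plan


if __name__ == "__main__":
    print(len(dialogue_plan(set(AUXILIARY))))
```

With every group flagged, the plan contains all 18 questions; with none flagged, it falls back to the 5 basic ones.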

As mentioned earlier, the results presented in Table 5 confirm the increase in the accuracy of recommendations generated by the expert system using the extended set of questions. After introducing additional questions, the number of inaccurate recommendations generated by the expert system decreased in each user group. In the case of advanced users, this decreased from 20% to 10%, for intermediate users—from 92% to 54%, and for beginners—from 82% to 73%. Moreover, in each user group, the ratings of the recommendations generated by the system increased significantly in situations when the recommendation of the expert system differed from the recommendation of the human domain expert. This means that even when the expert system recommended a different vehicle than the human domain expert, the recommendation was better and closer to the human expert’s recommendation than the recommendation generated based on the basic set of questions. As a result, the average recommendation score increased by 0.1 in the advanced user group, by 1.31 in the intermediate user group and by 2.18 in the beginner user group.
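The percentages above can be cross-checked with simple arithmetic. In the sketch below, the group size of 10 advanced users comes from the text; the sizes of 13 intermediate users and 11 beginners, as well as the per-group counts of inaccurate recommendations, are not stated in this section and are assumptions back-calculated so that they add up to 34 users and reproduce the reported rates.

```python
# Group sizes: 10 advanced users is stated in the text; 13 and 11 are ASSUMED.
GROUP_SIZES = {"advanced": 10, "intermediate": 13, "beginner": 11}

# ASSUMED counts of inaccurate recommendations, inferred from the reported rates.
INACCURATE_BEFORE = {"advanced": 2, "intermediate": 12, "beginner": 9}
INACCURATE_AFTER = {"advanced": 1, "intermediate": 7, "beginner": 8}


def error_rate_percent(inaccurate, total):
    """Error rate as a percentage rounded to the nearest integer."""
    return round(100 * inaccurate / total)


if __name__ == "__main__":
    for group, size in GROUP_SIZES.items():
        before = error_rate_percent(INACCURATE_BEFORE[group], size)
        after = error_rate_percent(INACCURATE_AFTER[group], size)
        print(f"{group}: {before}% -> {after}%")
```

Under these assumptions, the largest absolute drop (five fewer inaccurate recommendations) occurs in the intermediate group.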

The increase in the quality of the expert system recommendations after expanding the question database was also confirmed by the number of comments describing errors in the system recommendations. The summary of all comments from the human domain expert on the recommendations proposed by the expert system is presented in Table 6.

According to Table 6, after using additional questions, the number of consistent recommendations increased significantly, while the number of suboptimal recommendations decreased. In particular, there was a decrease in the number of EVs recommended by the expert system which, in relation to the recommendations of the human domain expert, were characterized by:

smaller trunk (−8.82%),
higher price (−5.88%),
smaller range (−2.94%),
worse equipment (−2.94%),
too large in size in relation to the needs (−14.71%).

On the other hand, the number of cases in which the system recommended EVs that were too small compared to the recommendations of a human expert increased by 2.94%. This is because, during fine-tuning, the system was instructed to place greater importance on price than on car size in situations where it is unable to select the optimal vehicle that meets all the consumer’s expectations.
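The percentage changes reported above can be translated into numbers of recommendations. The sketch below assumes that each percentage is expressed relative to all 34 recommendations, which is an interpretation and is not stated explicitly in the text; under that assumption, 2.94% corresponds to exactly one recommendation.

```python
TOTAL_RECOMMENDATIONS = 34  # one recommendation per user in the study

# Changes in percent, as reported in the text.
REPORTED_CHANGES = {
    "smaller trunk": -8.82,
    "higher price": -5.88,
    "smaller range": -2.94,
    "worse equipment": -2.94,
    "too large": -14.71,
    "too small": 2.94,
}


def change_in_cases(percent, total=TOTAL_RECOMMENDATIONS):
    """Convert a percentage change into a whole number of recommendations."""
    return round(percent * total / 100)


if __name__ == "__main__":
    for comment, percent in REPORTED_CHANGES.items():
        print(f"{comment}: {change_in_cases(percent):+d}")
```

Read this way, the largest single improvement is five fewer recommendations of vehicles that were too large for the user's needs.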

5. Conclusions

The aim of the study was to develop an expert EV recommendation system based on ChatGPT and to tune it so that it would effectively recommend vehicles for different user groups. In general, the tests showed that the expert system worked correctly, but the key role was played by the awareness of the user’s needs and knowledge. During testing of the recommendation system on different user groups, it turned out that depending on the level of knowledge of users about motorization and EVs, the system generates more or less accurate recommendations. Users with greater technical knowledge and more aware of their needs and the range of available EV market options achieved very good recommendation results. However, the analysis of the interactions of users with less technical knowledge indicated their difficulties in precisely defining requirements, which led to inaccurate or suboptimal recommendations. Therefore, the system was expanded with additional questions addressed to the user. Their aim was to clarify the user’s expectations and preferences, so that the user could define their needs more precisely. It certainly had a positive impact on consumers’ self-awareness of their needs. These activities allowed for a significant increase in the accuracy of recommendations generated by the system. Therefore, it should be stated that the adopted research procedure, based on pre-trained ChatGPT and domain-specific fine-tuning with the use of reinforcement learning from human feedback and prompt engineering, proved effective. It allowed us to construct an expert system and then improve its effectiveness by expanding the set of questions based on which recommendations were generated. The greatest improvement in effectiveness was achieved through hybrid systems. These proved especially effective for intermediate users. 
Their use increased the accuracy of recommendations and enabled better alignment with user needs, particularly in cases where the user’s actual requirements did not match their initial choices. It should be noted that although the system was developed within the context of the Polish electric vehicle market, its question structure and integration with local data sources make it easily adaptable to other countries. It is sufficient to replace the Polish data sources with their local equivalents and adjust or expand the questions to reflect the infrastructure and user preferences of a different region. An additional advantage is ChatGPT’s built-in multilingual capabilities, which enable the system to be deployed without the additional costs and effort typically required for interface translation.

As for the research limitations, one can certainly include the small research sample. The study involved 34 users of the expert system. Therefore, generalizing the conclusions from the research may seem difficult or doubtful. However, our work should be considered an exploratory study. We aimed to initially investigate the feasibility of using ChatGPT as an expert system, while simultaneously exploring possibilities for improving its performance through optimizations and refinements based on human feedback and prompt engineering. On the other hand, in studies on software usability conducted using field observation, thinking aloud, questionnaires, etc., it is assumed that such a number of users is fully sufficient [85]. Another limitation is the fact that ChatGPT is dynamic. Sometimes a small change in the content of the query can give slightly different results. In addition, ChatGPT is constantly learning, so in subsequent iterations, slightly better or worse answers can be obtained. To minimize the impact of the model’s dynamic nature, consistency control was implemented. All interactions with the system were conducted using the same model version based on GPT-4-turbo. Additionally, double validation was performed—identical input data were entered at different time intervals, and the consistency of responses was verified. Moreover, we tried to isolate the test environment by removing individual user dialogues with the system and disabling the option of saving conversations in memory and using them in subsequent responses during testing. However, in reality, the GPT architecture, like any Deep Neural Network, is a kind of “black box”, so its mechanism of action in a given case is not fully known. The lack of access to the model’s internal parameters was the main issue related to system configuration. Standard configuration limits [86] required data consolidation and proper alignment. 
The use of standardized prompts, structured input questions, and operating on a fixed version of the model made it possible to ensure consistent system behaviour.
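The double validation described above (entering identical input at different times and verifying the consistency of responses) can be sketched as follows. The sketch assumes that two responses are consistent when they name the same vehicle after whitespace and letter case are normalized; the actual comparison criterion used in the study is not specified, and all function names and sample data are hypothetical.

```python
def normalize(answer):
    """Lower-case the answer and collapse runs of whitespace."""
    return " ".join(answer.casefold().split())


def is_consistent(answers):
    """True when all answers to the same input are identical after normalization."""
    if not answers:
        raise ValueError("at least one answer is required")
    return len({normalize(a) for a in answers}) == 1


def inconsistent_inputs(runs):
    """Return the inputs whose repeated runs produced differing answers."""
    return [prompt for prompt, answers in runs.items() if not is_consistent(answers)]


if __name__ == "__main__":
    # Illustrative data only: each input was entered twice at different times.
    runs = {
        "user profile 1": ["Tesla Model 3", "tesla  model 3"],
        "user profile 2": ["Volkswagen ID.3", "Volkswagen ID.4"],
    }
    print(inconsistent_inputs(runs))
```

Inputs flagged by such a check would then be candidates for manual review by the domain expert.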

When considering further research directions, it is important to note that the preliminary results indicate that the direction we have taken in this exploratory study is appropriate. Therefore, to continue the research on improving the performance of GPT-based expert systems, it is worthwhile to involve a larger group of respondents. The enormous potential of LLMs such as GPT should also be emphasized. They can be widely used to support users, and fine-tuning a pre-trained model allows it to become an expert in almost any field. We intend to use this potential in further research, developing an expert system that supports users in even more complex decision-making problems. An example of such a problem is support during the development of energy transformation projects for buildings related to the installation of RES and energy storage facilities.

Author Contributions

Conceptualization, F.M. and P.Z.; methodology, F.M. and P.Z.; software, F.M.; validation, P.Z.; formal analysis, F.M. and P.Z.; investigation, F.M.; resources, F.M.; data curation, F.M.; writing—original draft preparation, F.M. and P.Z.; writing—review and editing, P.Z.; visualization, F.M.; supervision, P.Z.; project administration, P.Z.; funding acquisition, F.M. and P.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Marey, A.; Saad, A.M.; Tanas, Y.; Ghorab, H.; Niemierko, J.; Backer, H.; Umair, M. Evaluating the Accuracy and Reliability of AI Chatbots in Patient Education on Cardiovascular Imaging: A Comparative Study of ChatGPT, Gemini, and Copilot. Egypt. J. Radiol. Nucl. Med. 2025, 56, 37.
Introducing ChatGPT. Available online: https://openai.com/index/chatgpt/ (accessed on 15 April 2025).
Akhtar, Z.B. Unveiling the Evolution of Generative AI (GAI): A Comprehensive and Investigative Analysis toward LLM Models (2021–2024) and Beyond. J. Electr. Syst. Inf. Technol. 2024, 11, 22.
Marr, B. A Short History of ChatGPT: How We Got to Where We Are Today. Available online: https://www.forbes.com/sites/bernardmarr/2023/05/19/a-short-history-of-chatgpt-how-we-got-to-where-we-are-today/ (accessed on 27 May 2025).
Taloni, A.; Borselli, M.; Scarsi, V.; Rossi, C.; Coco, G.; Scorcia, V.; Giannaccare, G. Comparative Performance of Humans versus GPT-4.0 and GPT-3.5 in the Self-Assessment Program of American Academy of Ophthalmology. Sci. Rep. 2023, 13, 18562.
Ray, P.P. ChatGPT: A Comprehensive Review on Background, Applications, Key Challenges, Bias, Ethics, Limitations and Future Scope. Internet Things Cyber-Phys. Syst. 2023, 3, 121–154.
Emmert-Streib, F. Is ChatGPT the Way toward Artificial General Intelligence. Discov. Artif. Intell. 2024, 4, 32.
Pagano, S.; Strumolo, L.; Michalk, K.; Schiegl, J.; Pulido, L.C.; Reinhard, J.; Maderbacher, G.; Renkawitz, T.; Schuster, M. Evaluating ChatGPT, Gemini and Other Large Language Models (LLMs) in Orthopaedic Diagnostics: A Prospective Clinical Study. Comput. Struct. Biotechnol. J. 2025, 28, 9–15.
Dermata, A.; Arhakis, A.; Makrygiannakis, M.A.; Giannakopoulos, K.; Kaklamanos, E.G. Evaluating the Evidence-Based Potential of Six Large Language Models in Paediatric Dentistry: A Comparative Study on Generative Artificial Intelligence. Eur. Arch. Paediatr. Dent. 2025, 26, 527–535.
Paris, M. ChatGPT Hits 1 Billion Users? ‘Doubled In Just Weeks’ Says OpenAI CEO. Available online: https://www.forbes.com/sites/martineparis/2025/04/12/chatgpt-hits-1-billion-users-openai-ceo-says-doubled-in-weeks/ (accessed on 27 May 2025).
Fui-Hoon Nah, F.; Zheng, R.; Cai, J.; Siau, K.; Chen, L. Generative AI and ChatGPT: Applications, Challenges, and AI-Human Collaboration. J. Inf. Technol. Case Appl. Res. 2023, 25, 277–304.
Hussein, H.; Gordon, M.; Hodgkinson, C.; Foreman, R.; Wagad, S. ChatGPT’s Impact Across Sectors: A Systematic Review of Key Themes and Challenges. Big Data Cogn. Comput. 2025, 9, 56.
Rane, N. Contribution of ChatGPT and Other Generative Artificial Intelligence (AI) in Renewable and Sustainable Energy. J. Adv. Artif. Intell. 2024, 2, 1–26.
Menéndez Medina, A.; Heredia Álvaro, J.A. Using Generative Pre-Trained Transformers (GPT) for Electricity Price Trend Forecasting in the Spanish Market. Energies 2024, 17, 2338.
Zhang, C.; Lu, J.; Zhao, Y. Generative Pre-Trained Transformers (GPT)-Based Automated Data Mining for Building Energy Management: Advantages, Limitations and the Future. Energy Built Environ. 2024, 5, 143–169.
Hidalgo-Betanzos, J.M.; Prol-Godoy, I.; Terés-Zubiaga, J.; Briones-Llorente, R.; Martín-Garín, A. Can ChatGPT AI Replace or Contribute to Experts’ Diagnosis for Renovation Measures Identification? Buildings 2025, 15, 421.
Pei, Z.; Yin, J.; Zhang, J. Language Models for Materials Discovery and Sustainability: Progress, Challenges, and Opportunities. Progress Mater. Sci. 2025, 154, 101495.
Zhang, X.; Shah, J.; Han, M. ChatGPT for Fast Learning of Positive Energy District (PED): A Trial Testing and Comparison with Expert Discussion Results. Buildings 2023, 13, 1392.
Feng, J.; Ning, Y.; Wang, Z.; Li, G.; Xu, S.X. ChatGPT-Enabled Two-Stage Auctions for Electric Vehicle Battery Recycling. Transp. Res. Part E Logist. Transp. Rev. 2024, 183, 103453.
Ziemba, P.; Zair, A. Temporal Analysis of Energy Transformation in EU Countries. Energies 2023, 16, 7703.
Kayakuş, M.; Terzioğlu, M.; Erdoğan, D.; Zetter, S.A.; Kabas, O.; Moiceanu, G. European Union 2030 Carbon Emission Target: The Case of Turkey. Sustainability 2023, 15, 13025.
Gawrońska, D.; Mularczyk, A. Analysis of Greenhouse Gas Emissions Drivers in Poland and the EU: Correlation and Regression-Based Assessment. Sustainability 2025, 17, 4345.
Tutak, M.; Brodny, J. Renewable Energy Consumption in Economic Sectors in the EU-27. The Impact on Economics, Environment and Conventional Energy Sources. A 20-Year Perspective. J. Clean. Prod. 2022, 345, 131076.
Sichigea, M.; Cîrciumaru, D.; Brabete, V.; Barbu, C.M. Sustainable Transport in the European Union: Exploring the Net-Zero Transition through Confirmatory Factor Analysis and Gaussian Graphical Modeling. Energies 2024, 17, 2645.
Meegoda, J.N.; Watts, D.; Patil, U. Regulations and Policies on the Management of the End of the Life of Lithium-Ion Batteries in Electrical Vehicles. Energies 2025, 18, 604.
Sikora, R.; Krajewski, Ł.; Popenda, A.; Korzeniewska, E. Feasibility of Electric Vehicle Charging Stations from MV/LV Stations in Small Cities. Energies 2024, 17, 6357.
Ziemba, P. Selection of Electric Vehicles for the Needs of Sustainable Transport under Conditions of Uncertainty—A Comparative Study on Fuzzy MCDA Methods. Energies 2021, 14, 7786.
Ziemba, P. Multi-Criteria Stochastic Selection of Electric Vehicles for the Sustainable Development of Local Government and State Administration Units in Poland. Energies 2020, 13, 6299.
Lis, A.; Szymanowski, R. Greening Polish Transportation? Untangling the Nexus between Electric Mobility and a Carbon-Based Regime. Energy Res. Soc. Sci. 2022, 83, 102336.
Moons, I.; De Pelsmacker, P. Self-Brand Personality Differences and Attitudes towards Electric Cars. Sustainability 2015, 7, 12322–12339.
Ziemba, P. Multi-Criteria Approach to Stochastic and Fuzzy Uncertainty in the Selection of Electric Vehicles with High Social Acceptance. Expert Syst. Appl. 2021, 173, 114686.
Szaja, M.; Ziemba, P. Stochastic Modelling in Multi-Criteria Evaluation of Quality of Life—The Case of the West Pomeranian Voivodeship in Poland. Sustainability 2025, 17, 1966.
Singh, R.; Hamouda, M.; Chamberlin, J.H.; Tóth, A.; Munford, J.; Silbergleit, M.; Baruah, D.; Burt, J.R.; Kabakus, I.M. ChatGPT vs. Gemini: Comparative Accuracy and Efficiency in Lung-RADS Score Assignment from Radiology Reports. Clin. Imaging 2025, 121, 110455.
Bubeck, S.; Chandrasekaran, V.; Eldan, R.; Gehrke, J.; Horvitz, E.; Kamar, E.; Lee, P.; Lee, Y.T.; Li, Y.; Lundberg, S.; et al. Sparks of Artificial General Intelligence: Early Experiments with GPT-4. arXiv 2023.
Haghir Chehreghani, M. The Embeddings World and Artificial General Intelligence. Cogn. Syst. Res. 2024, 84, 101201.
Zhao, L.; Zhang, L.; Wu, Z.; Chen, Y.; Dai, H.; Yu, X.; Liu, Z.; Zhang, T.; Hu, X.; Jiang, X.; et al. When Brain-Inspired AI Meets AGI. Meta-Radiol. 2023, 1, 100005.
Jahani Yekta, M.M. The General Intelligence of GPT–4, Its Knowledge Diffusive and Societal Influences, and Its Governance. Meta-Radiol. 2024, 2, 100078.
Xu, Q.; Wu, Y.; Zheng, H.; Yan, H.; Wu, H.; Qian, Y.; Wu, Y.; Liu, B. Standardization in Artificial General Intelligence Model for Education. Comput. Stand. Interfaces 2025, 94, 104006.
Kastrati, M.; Imran, A.S.; Hashmi, E.; Kastrati, Z.; Daudpota, S.M.; Biba, M. Unlocking Language Barriers: Assessing Pre-Trained Large Language Models across Multilingual Tasks and Unveiling the Black Box with Explainable Artificial Intelligence. Eng. Appl. Artif. Intell. 2025, 149, 110136.
Garcia, C.I.; DiBattista, M.A.; Letelier, T.A.; Halloran, H.D.; Camelio, J.A. Framework for LLM Applications in Manufacturing. Manuf. Lett. 2024, 41, 253–263.
López Espejel, J.; Ettifouri, E.H.; Yahaya Alassan, M.S.; Chouham, E.M.; Dahhane, W. GPT-3.5, GPT-4, or BARD? Evaluating LLMs Reasoning Ability in Zero-Shot Setting and Performance Boosting through Prompts. Nat. Lang. Process. J. 2023, 5, 100032.
Shah, S.M.; Gillani, S.A.; Baig, M.S.A.; Saleem, M.A.; Siddiqui, M.H. Advancing Depression Detection on Social Media Platforms through Fine-Tuned Large Language Models. Online Soc. Netw. Media 2025, 46, 100311. [Google Scholar] [CrossRef]
Kalyan, K.S. A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4. Nat. Lang. Process. J. 2024, 6, 100048. [Google Scholar] [CrossRef]
Jiang, W.; Liu, G.; He, D.; He, K. Boosting Meta-Training with Base Class Information for Robust Few-Shot Learning. Eng. Appl. Artif. Intell. 2025, 152, 110780. [Google Scholar] [CrossRef]
Sinha, S.; Yue, Y.; Soto, V.; Kulkarni, M.; Lu, J.; Zhang, A. MAML-En-LLM: Model Agnostic Meta-Training of LLMs for Improved In-Context Learning. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Barcelona, Spain, 25–29 August 2024; Association for Computing Machinery: New York, NY, USA, 2024; pp. 2711–2720. [Google Scholar]
Kraišniković, C.; Harb, R.; Plass, M.; Zoughbi, W.A.; Holzinger, A.; Müller, H. Fine-Tuning Language Model Embeddings to Reveal Domain Knowledge: An Explainable Artificial Intelligence Perspective on Medical Decision Making. Eng. Appl. Artif. Intell. 2025, 139, 109561. [Google Scholar] [CrossRef]
Zheng, S.; Pan, K.; Liu, J.; Chen, Y. Empirical Study on Fine-Tuning Pre-Trained Large Language Models for Fault Diagnosis of Complex Systems. Reliab. Eng. Syst. Saf. 2024, 252, 110382. [Google Scholar] [CrossRef]
Praveen, S.V.; Gajjar, P.; Ray, R.K.; Dutt, A. Crafting Clarity: Leveraging Large Language Models to Decode Consumer Reviews. J. Retail. Consum. Serv. 2024, 81, 103975. [Google Scholar] [CrossRef]
Anisuzzaman, D.M.; Malins, J.G.; Friedman, P.A.; Attia, Z.I. Fine-Tuning Large Language Models for Specialized Use Cases. Mayo Clin. Proc. Digit. Health 2025, 3, 100184. [Google Scholar] [CrossRef]
Satpute, P.; Tiwari, S.; Gupta, M.; Ghosh, S. Exploring Large Language Models for Microstructure Evolution in Materials. Mater. Today Commun. 2024, 40, 109583. [Google Scholar] [CrossRef]
Zagar, P.; Ravi, V.; Aalami, L.; Krusche, S.; Aalami, O.; Schmiedmayer, P. Dynamic Fog Computing for Enhanced LLM Execution in Medical Applications. Smart Health 2025, 36, 100577. [Google Scholar] [CrossRef]
Zhang, Y.; Pei, H.; Zhen, S.; Li, Q.; Liang, F. Chat Generative Pre-Trained Transformer (ChatGPT) Usage in Healthcare. Gastroenterol. Endosc. 2023, 1, 139–143. [Google Scholar] [CrossRef]
Kunze, K.N.; Nwachukwu, B.U.; Cote, M.P.; Ramkumar, P.N. Large Language Models Applied to Health Care Tasks May Improve Clinical Efficiency, Value of Care Rendered, Research, and Medical Education. Arthrosc. J. Arthrosc. Relat. Surg. 2025, 41, 547–556. [Google Scholar] [CrossRef]
Dong, M.M.; Stratopoulos, T.C.; Wang, V.X. A Scoping Review of ChatGPT Research in Accounting and Finance. Int. J. Account. Inf. Syst. 2024, 55, 100715. [Google Scholar] [CrossRef]
Zhou, J.; Camba, J.D. The Status, Evolution, and Future Challenges of Multimodal Large Language Models (LLMs) in Parametric CAD. Expert Syst. Appl. 2025, 285, 127520. [Google Scholar] [CrossRef]
Ullah, R.; Ismail, H.B.; Islam Khan, M.T.; Zeb, A. Nexus between Chat GPT Usage Dimensions and Investment Decisions Making in Pakistan: Moderating Role of Financial Literacy. Technol. Soc. 2024, 76, 102454. [Google Scholar] [CrossRef]
Liu, Y. Large Language Models for Air Transportation: A Critical Review. J. Air Transp. Res. Soc. 2024, 2, 100024. [Google Scholar] [CrossRef]
Alipour-Vaezi, M.; Tsui, K.-L. Data-Driven Portfolio Management for Motion Pictures Industry: A New Data-Driven Optimization Methodology Using a Large Language Model as the Expert. Comput. Ind. Eng. 2024, 197, 110574. [Google Scholar] [CrossRef]
Wu, T.; Li, J.; Bao, J.; Liu, Q. ProcessCarbonAgent: A Large Language Models-Empowered Autonomous Agent for Decision-Making in Manufacturing Carbon Emission Management. J. Manuf. Syst. 2024, 76, 429–442. [Google Scholar] [CrossRef]
Świrski, K.; Błach, P. Energy Storage Management Using Artificial Intelligence to Maximize Polish Energy Market Profits. Energies 2024, 17, 4855. [Google Scholar] [CrossRef]
Jurišević, N.; Kowalik, R.; Gordic, D.; Novaković, A.; Vukasinovic, V.; Rakić, N.; Nikolic, J.; Vukicevic, A. Large Language Models as Tools for Public Building Energy Management: An Assessment of Possibilities and Barriers. Int. J. Qual. Res. 2025, 19. [Google Scholar] [CrossRef]
OpenAI Platform. Available online: https://platform.openai.com (accessed on 26 January 2025).
Roy, D.; Dutta, M. A Systematic Review and Research Perspective on Recommender Systems. J. Big Data 2022, 9, 59.
Polish New Mobility Association. Katalog Pojazdów Elektrycznych 2023; Polish New Mobility Association: Warszawa, Poland, 2022.
EV Database. Available online: https://ev-database.org/ (accessed on 6 August 2025).
Gebrauchtwagen & Neuwagen. Mobile.De. Available online: https://www.mobile.de/ (accessed on 14 January 2025).
OTOMOTO—Nowe i Używane Samochody i Motocykle Oraz Części Samochodowe. Ogłoszenia Motoryzacyjne. Available online: https://www.otomoto.pl/ (accessed on 14 January 2025).
Beautiful Soup Documentation—Beautiful Soup 4.12.0 Documentation. Available online: https://www.crummy.com/software/BeautifulSoup/bs4/doc/ (accessed on 14 January 2025).
Selenium. Available online: https://www.selenium.dev/ (accessed on 14 January 2025).
PyMuPDF 1.25.1 Documentation. Available online: https://pymupdf.readthedocs.io/en/latest/ (accessed on 14 January 2025).
Pamucar, D.; Ecer, F.; Deveci, M. Assessment of Alternative Fuel Vehicles for Sustainable Road Transportation of United States Using Integrated Fuzzy FUCOM and Neutrosophic Fuzzy MARCOS Methodology. Sci. Total Environ. 2021, 788, 147763.
Biswas, T.K.; Das, M.C. Selection of Commercially Available Electric Vehicle Using Fuzzy AHP-MABAC. J. Inst. Eng. India Ser. C 2019, 100, 531–537.
Sonar, H.C.; Kulkarni, S.D. An Integrated AHP-MABAC Approach for Electric Vehicle Selection. Res. Transp. Bus. Manag. 2021, 41, 100665.
Ziemba, P.; Kannchen, M.; Borawski, M. Selection of the Family Electric Car Based on Objective and Subjective Criteria—Analysis of a Case Study of Polish Consumers. Energies 2024, 17, 1347.
Glauber, R.; Loula, A. Collaborative Filtering vs. Content-Based Filtering: Differences and Similarities. arXiv 2019, arXiv:1912.08932.
Çano, E.; Morisio, M. Hybrid Recommender Systems: A Systematic Literature Review. Intell. Data Anal. 2017, 21, 1487–1524.
Koren, Y.; Bell, R.; Volinsky, C. Matrix Factorization Techniques for Recommender Systems. Computer 2009, 42, 30–37.
Ziemba, P.; Becker, J.; Becker, A.; Radomska-Zalas, A.; Pawluk, M.; Wierzba, D. Credit Decision Support Based on Real Set of Cash Loans Using Integrated Machine Learning Algorithms. Electronics 2021, 10, 2099.
Kumar, V.; Gleyzer, L.; Kahana, A.; Shukla, K.; Karniadakis, G.E. MYCRUNCHGPT: A LLM Assisted Framework for Scientific Machine Learning. J. Mach. Learn. Model. Comput. 2023, 4, 41–72.
El-Hajjami, A.; Fafin, N.; Salinesi, C. Which AI Technique Is Better to Classify Requirements? An Experiment with SVM, LSTM, and ChatGPT. arXiv 2024.
Khennouche, F.; Elmir, Y.; Djebari, N.; Himeur, Y.; Amira, A. Revolutionizing Customer Interactions: Insights and Challenges in Deploying ChatGPT and Generative Chatbots for FAQs. arXiv 2023.
Salinas, A.; Shah, P.; Huang, Y.; McCormack, R.; Morstatter, F. The Unequal Opportunities of Large Language Models: Examining Demographic Biases in Job Recommendations by ChatGPT and LLaMA. In Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, New York, NY, USA, 30 October 2023; Association for Computing Machinery: New York, NY, USA, 2023; pp. 1–15.
Khennouche, F.; Elmir, Y.; Himeur, Y.; Djebari, N.; Amira, A. Revolutionizing Generative Pre-Traineds: Insights and Challenges in Deploying ChatGPT and Generative Chatbots for FAQs. Expert Syst. Appl. 2024, 246, 123224.
Ziemba, P.; Becker, J.; Becker, A.; Radomska-Zalas, A. Framework for Multi-Criteria Assessment of Classification Models for the Purposes of Credit Scoring. J. Big Data 2023, 10, 94.
Holzinger, A. Usability Engineering Methods for Software Developers. Commun. ACM 2005, 48, 71–74.
What Are GPT-4 Turbo Token Limits? Available online: http://anakin.ai/blog/gpt-4-turbo-token-limits/ (accessed on 6 August 2025).

Figure 1. The process of configuring, testing, and fine-tuning the recommendation system.

Table 1. Studies on the use of LLMs as subject experts in various areas.

Research Problem	Research Scenarios	Number of LLMs	Examined LLMs	LLM Evaluation Method	Number of Criteria	LLM Evaluation Criteria	Ref.
Sentiment assessment in consumer reviews	1. Opinions on hotel services	4	GPT-2, Falcon-7B, MPT-7B, BERT	SL	4	Precision, recall, f1-score, accuracy	[48]
Generating information about air transport	1. Fact retrieval, 2. Complex reasoning, 3. Explanation	12	GPT-3.5, Claude-2, Cohere, ERNIE Bot 3.5, Falcon-180B, HunYuan-V1.5.8, LLaMa-2-70b, Mistral-7B-Instruct-v0.2, PaLM 2, Qwen-7B, Vicuna-33B, Yi-34B	SL	8	True positives, false positives, true negatives, false negatives, precision, recall, f1-score, speed of answer generation	[57]
Assessing the recognition of public figures	1. Assessing the recognition of people associated with the film industry	1	ChatGPT	EA	1	Accuracy	[58]
Diagnosing faults in complex systems	1. High-speed train braking system, 2. Tennessee Eastman process simulation	2	GPT-3.5 turbo, LLaMa-2	SL	4	Accuracy, f1-score, G-mean, MCC	[47]
Diagnosing the causes of increased CO₂ emissions and the degree of certainty of the diagnosis	1. Determining the causes of increased CO₂ emissions in textile production	4	GPT-4, GPT-3.5, Vicuna-13b, LLaMa-13b	EA	5	METEOR, BERTScore, NUBIA, BLEURT, ROUGE	[59]
Scheduling production from energy storage	1. Selection of charging and discharging hours of the energy storage	1	ChatGPT-4o	EA	1	Revenue	[60]
Forecasting electricity prices in the short term	1. Electricity prices in the Spanish market	2	ChatGPT, BERT	SL	2	Accuracy, MCC	[14]
Generating recommendations for energy renovation of a building	1. Renovation with energy efficiency in mind	1	ChatGPT	EA	1	Quality	[16]
Identifying and generating information about building construction periods and their energy performance	1. Classifying building construction periods, 2. Recommending legal acts that govern building thermal performance, 3. Comprehending period-specific details of building thermal envelopes	2	GPT-3.5, GPT-4 Turbo	EA	1	Quality	[61]
Generating information about positive energy districts	1. Challenges, impacts and good practices of positive energy districts	1	ChatGPT	EA	1	Quality	[18]

Abbreviations: EA—expert assessment, SL—supervised learning, MCC—Matthews Correlation Coefficient, METEOR—Metric for Evaluation of Translation with Explicit ORdering, BERTScore—Bidirectional Encoder Representations from Transformers Score, NUBIA—NeUral Based Interchangeability Assessor, BLEURT—BERT-Like Score for Evaluating Generation, ROUGE—Recall-Oriented Understudy for Gisting Evaluation.

Table 2. Test results of the EV recommendation system based on the initial set of questions.

User	User Preference Before Interaction	System Recommendation	Expert Recommendation	Accuracy of User Preferences: Assessment	Accuracy of User Preferences: Comments	Accuracy of System Recommendations: Assessment	Accuracy of System Recommendations: Comments
A001	Tesla Model Y (u)	Tesla Model Y (u)	Tesla Model Y (u)	10	CC	10	CR
A002	Volkswagen ID.3 (h)	Volkswagen ID.3 (h)	Nissan Leaf (h)	9	CO	9	CO
A003	Hyundai Ioniq 6 (s)	Hyundai Ioniq 6 (s)	Hyundai Ioniq 6 (s)	10	CC	10	CR
A004	Audi Q4 e-tron (u)	Audi Q4 e-tron (u)	Audi Q4 e-tron (u)	10	CC	10	CR
A005	Tesla Model 3 LR (s)	Tesla Model 3 LR (s)	Tesla Model 3 LR (s)	10	CC	10	CR
A006	BMW i4 (s)	BMW i4 (s)	BMW i4 (s)	10	CC	10	CR
A007	Kia EV6 (c)	Kia EV6 (c)	Hyundai Ioniq 5 (c)	9	LT	9	LT
A008	Renault Megane E-Tech (h)	Renault Megane E-Tech (h)	Renault Megane E-Tech (h)	10	CC	10	CR
A009	Skoda Enyaq iV 80 (u)	Skoda Enyaq iV 80 (u)	Skoda Enyaq iV 80 (u)	10	CC	10	CR
A010	Mercedes EQE 300 (s)	Mercedes EQE 300 (s)	Mercedes EQE 300 (s)	10	CC	10	CR
I001	Tesla Model X (u)	Tesla Model Y (u)	Hyundai Kona 64 kWh (c)	7	CO, US	8	CO, LT, US
I002	Volkswagen ID.4 (u)	Volkswagen ID.3 (h)	Peugeot e-308 (h)	3	US	8	LT
I003	Nissan Leaf (h)	Renault Megane E-Tech (h)	Volkswagen ID.3 (h)	7	GR	8	BE
I004	Hyundai Kona 64 kWh (c)	Hyundai Kona 64 kWh (c)	Kia Niro EV (c)	8	MS, LT	8	MS, LT
I005	Audi Q8 e-tron (u)	Audi Q4 e-tron (u)	Tesla Model Y (u)	6	CO	6	GR, LT, CO
I006	BMW iX3 (u)	BMW i4 (s)	Hyundai Kona 64 kWh (c)	4	US	3	CO
I007	Peugeot e-208 (h)	Opel e-Corsa (h)	Peugeot e-208 (h)	10	CC	9	BE
I008	Kia EV6 (c)	Kia Niro EV (c)	Hyundai Ioniq 5 (c)	7	MS, LT	7	MS, LT, BE
I009	Volkswagen ID.5 (c)	Volkswagen ID.4 (u)	Skoda Enyaq iV 80 (u)	6	CO, LT	8	GR, LT, CO
I010	Ford Mustang Mach-E (u)	Tesla Model Y (u)	Kia EV6 (c)	7	US	8	US, LT
I011	Mercedes EQS (s)	Tesla Model S (s)	BMW i5 (s)	6	US, CO	6	CO, BE
I012	Renault Zoe (h)	Fiat 500e (h)	Fiat 500e (h)	8	CO	10	CR
I013	Volvo XC40 Recharge (u)	Hyundai Ioniq 5 (c)	Skoda Enyaq iV 80 (u)	8	LT, CO	9	CO, LT
B001	Tesla Model S Plaid (s)	BMW i4 (s)	Hyundai Ioniq 6 (s)	4	CO	6	GR, CO, MS
B002	Nissan Ariya (u)	Renault Megane E-Tech (h)	Renault Megane E-Tech (h)	5	US	10	CR
B003	Skoda Enyaq iV 80 (u)	Kia Niro EV (c)	Hyundai Kona 64 kWh (c)	3	US, CO	8	GR, CO
B004	Audi Q8 e-tron (u)	Audi Q4 e-tron (u)	Peugeot e-308 (h)	2	US	1	US
B005	Mercedes EQV (v)	Volkswagen ID.4 (u)	Peugeot e-208 (h)	2	US, LT	2	US
B006	Renault Zoe (h)	Renault Zoe (h)	Renault Zoe (h)	10	CC	10	CR
B007	Hyundai Kona 39 kWh (c)	Hyundai Kona 64 kWh (c)	Peugeot e-208 (h)	8	CO	8	CO
B008	Peugeot e-2008 (u)	Peugeot e-2008 (u)	Nissan Leaf (h)	2	US	2	US
B009	BMW iX3 (u)	BMW i4 (s)	Hyundai Ioniq 6 (s)	7	US	8	GR, CO
B010	Tesla Model X (u)	Tesla Model Y (u)	Skoda Enyaq iV 80 (u)	6	CO, LT	7	CO
B011	Fiat 500e (h)	Fiat 500e (h)	Fiat 500e 3+1 (h)	9	MS	9	MS

Abbreviations: CC—correct choice, CR—consistent recommendation, CO—cheaper option, LT—larger trunk, GR—greater range, BE—better equipment, US—unnecessary SUV/VAN, MS—more space, (h)—hatchback, (c)—crossover, (s)—sedan, (v)—van, (u)—SUV.

Table 3. Quantitative representation of user preferences and expert system recommendations based on domain expert ratings.

Users	Inaccurate User Preference	Average User Preference Accuracy Rating	Inaccurate Expert System Recommendation	Average Expert System Recommendation Rating
Advanced (A)	20.00%	9.8	20.00%	9.8
Intermediate (I)	92.31%	6.69	92.31%	7.54
Beginner (B)	90.91%	5.27	81.82%	6.45
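
The system-recommendation columns of Table 3 can be re-derived from the per-user expert ratings in Table 2. The following is only a minimal sketch of that arithmetic, not the authors' procedure: it assumes that a recommendation counts as "inaccurate" whenever the expert rating is below 10, a reading that matches the reported percentages but is not stated explicitly in the table. The names `system_ratings` and `summarize` are hypothetical.

```python
from statistics import mean

# Expert ratings of the system recommendations, transcribed from Table 2
# (basic question set), grouped by user experience level.
system_ratings = {
    "Advanced": [10, 9, 10, 10, 10, 10, 9, 10, 10, 10],
    "Intermediate": [8, 8, 8, 8, 6, 3, 9, 7, 8, 8, 6, 10, 9],
    "Beginner": [6, 10, 8, 1, 2, 10, 8, 2, 8, 7, 9],
}


def summarize(ratings, full_score=10):
    """Return (share of ratings below full_score in %, mean rating).

    Assumption: a rating below full_score marks an inaccurate recommendation.
    """
    inaccurate = sum(1 for r in ratings if r < full_score)
    share = round(100 * inaccurate / len(ratings), 2)
    return share, round(mean(ratings), 2)


summary = {group: summarize(r) for group, r in system_ratings.items()}
for group, (share, avg) in summary.items():
    print(f"{group}: {share:.2f}% inaccurate, mean rating {avg:.2f}")
```

Under this assumption the sketch reproduces the values reported in Table 3 for the expert system (20.00% and 9.8, 92.31% and 7.54, 81.82% and 6.45).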

Table 4. Test results of the EV recommendation system based on the extended set of questions.

User	User Preference Before Interaction	System Recommendation	Expert Recommendation	Recommendation Accuracy of the System: Evaluation	Recommendation Accuracy of the System: Comments
A001	Tesla Model Y (u)	Tesla Model Y (u)	Tesla Model Y (u)	10	CR
A002	Volkswagen ID.3 (h)	Nissan Leaf (h)	Nissan Leaf (h)	10	CR
A003	Hyundai Ioniq 6 (s)	Hyundai Ioniq 6 (s)	Hyundai Ioniq 6 (s)	10	CR
A004	Audi Q4 e-tron (u)	Audi Q4 e-tron (u)	Audi Q4 e-tron (u)	10	CR
A005	Tesla Model 3 LR (s)	Tesla Model 3 LR (s)	Tesla Model 3 LR (s)	10	CR
A006	BMW i4 (s)	BMW i4 (s)	BMW i4 (s)	10	CR
A007	Kia EV6 (c)	Kia EV6 (c)	Hyundai Ioniq 5 (c)	9	LT
A008	Renault Megane E-Tech (h)	Renault Megane E-Tech (h)	Renault Megane E-Tech (h)	10	CR
A009	Skoda Enyaq iV 80 (u)	Skoda Enyaq iV 80 (u)	Skoda Enyaq iV 80 (u)	10	CR
A010	Mercedes EQE 300 (s)	Mercedes EQE 300 (s)	Mercedes EQE 300 (s)	10	CR
I001	Tesla Model X (u)	Hyundai Kona 64 kWh (c)	Hyundai Kona 64 kWh (c)	10	CR
I002	Volkswagen ID.4 (u)	Peugeot e-308 (h)	Peugeot e-308 (h)	10	CR
I003	Nissan Leaf (h)	Volkswagen ID.3 (h)	Volkswagen ID.3 (h)	10	CR
I004	Hyundai Kona 64 kWh (c)	Hyundai Kona 64 kWh (c)	Kia Niro EV (c)	8	MS, LT
I005	Audi Q8 e-tron (u)	Tesla Model Y (u)	Tesla Model Y (u)	10	CR
I006	BMW iX3 (u)	Kia Niro EV (c)	Hyundai Kona 64 kWh (c)	8	CO
I007	Peugeot e-208 (h)	Opel e-Corsa (h)	Peugeot e-208 (h)	9	BE
I008	Kia EV6 (c)	Kia Niro EV (c)	Hyundai Ioniq 5 (c)	7	MS, LT, BE
I009	Volkswagen ID.5 (c)	Volkswagen ID.4 (u)	Skoda Enyaq iV 80 (u)	8	CO, LT, GR
I010	Ford Mustang Mach-E (u)	Kia EV6 (c)	Kia EV6 (c)	10	CR
I011	Mercedes EQS (s)	Tesla Model S (s)	BMW i5 (s)	6	CO, BE
I012	Renault Zoe (h)	Fiat 500e (h)	Fiat 500e (h)	10	CR
I013	Volvo XC40 Recharge (u)	Hyundai Ioniq 5 (c)	Skoda Enyaq iV 80 (u)	9	CO, LT
B001	Tesla Model S Plaid (s)	BMW i4 (s)	Hyundai Ioniq 6 (s)	6	GR, CO, MS
B002	Nissan Ariya (u)	Renault Megane E-Tech (h)	Renault Megane E-Tech (h)	10	CR
B003	Skoda Enyaq iV 80 (u)	Kia Niro EV (c)	Hyundai Kona 64 kWh (c)	8	GR, CO
B004	Audi Q8 e-tron (u)	Volkswagen ID.3 (h)	Peugeot e-308 (h)	9	CO
B005	Mercedes EQV (v)	Fiat 500e (h)	Peugeot e-208 (h)	8	MS, LT
B006	Renault Zoe (h)	Renault Zoe (h)	Renault Zoe (h)	10	CR
B007	Hyundai Kona 39 kWh (c)	Hyundai Kona 64 kWh (c)	Peugeot e-208 (h)	8	CO
B008	Peugeot e-2008 (u)	Nissan Leaf (h)	Nissan Leaf (h)	10	CR
B009	BMW iX3 (u)	BMW i4 (s)	Hyundai Ioniq 6 (s)	8	GR, CO
B010	Tesla Model X (u)	Kia EV9 (c)	Skoda Enyaq iV 80 (u)	9	CO
B011	Fiat 500e (h)	Fiat 500e (h)	Fiat 500e 3+1 (h)	9	MS

Abbreviations: CR—consistent recommendation, CO—cheaper option, LT—larger trunk, GR—greater range, BE—better equipment, MS—more space, (h)—hatchback, (c)—crossover, (s)—sedan, (v)—van, (u)—SUV.

Table 5. Quantitative summary of expert system recommendations, based on domain expert assessments, for the basic and extended question sets.

Users	Basic Set of Questions: Inaccurate Expert System Recommendation	Basic Set of Questions: Average Expert System Recommendation Rating	Extended Set of Questions: Inaccurate Expert System Recommendation	Extended Set of Questions: Average Expert System Recommendation Rating
Advanced (A)	20.00%	9.8	10.00%	9.9
Intermediate (I)	92.31%	7.54	53.85%	8.85
Beginner (B)	81.82%	6.45	72.73%	8.64

Table 6. Number of individual expert comments on expert system recommendations generated based on the basic and extended question sets.

Commentary on Expert System Recommendations	Basic Set of Questions: %	Basic Set of Questions: Number	Extended Set of Questions: %	Extended Set of Questions: Number
Consistent recommendation	32.35%	11	52.94%	18
Larger trunk	26.47%	9	17.65%	6
Cheaper option	35.29%	12	29.41%	10
Greater range	14.71%	5	11.76%	4
Better equipment	11.76%	4	8.82%	3
Unnecessary SUV/VAN	14.71%	5	0.00%	0
More space	11.76%	4	14.71%	5
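
The extended-set columns of Table 6 follow from tallying the comment codes listed per user in Table 4. The sketch below illustrates that tally under one assumption inferred from the numbers rather than stated in the source: each percentage is taken relative to the 34 tested users, so the percentages need not sum to 100% because one recommendation can carry several comments. The names `comments`, `tally`, and `share` are hypothetical.

```python
from collections import Counter

# Expert comment codes per user, transcribed from Table 4 (extended question set).
comments = [
    # A001-A010
    "CR", "CR", "CR", "CR", "CR", "CR", "LT", "CR", "CR", "CR",
    # I001-I013
    "CR", "CR", "CR", "MS, LT", "CR", "CO", "BE", "MS, LT, BE",
    "CO, LT, GR", "CR", "CO, BE", "CR", "CO, LT",
    # B001-B011
    "GR, CO, MS", "CR", "GR, CO", "CO", "MS, LT", "CR", "CO", "CR",
    "GR, CO", "CO", "MS",
]

# Count every code once per user entry in which it appears.
tally = Counter(code for entry in comments for code in entry.split(", "))

# Assumption: percentages are relative to the number of users, not comments.
share = {code: round(100 * n / len(comments), 2) for code, n in tally.items()}

for code, n in tally.most_common():
    print(f"{code}: {n} ({share[code]:.2f}%)")
```

With the codes as transcribed, the counts agree with the extended-set column of Table 6 (for example, 18 consistent recommendations, or 52.94% of 34 users).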

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
