1. Introduction
Technology has advanced rapidly in recent years. The Internet of Things (IoT) is one of the most prominent technologies of the modern era and is in increasing demand. The IoT is used to monitor and control objects remotely, and its structure typically consists of three elementary layers. The first is the sensing layer, which contains visual or scalar sensors that capture data in real time from their surroundings. The second is the network layer, which is responsible for transporting and routing data between systems, typically employing protocols such as IP and ICMP. The third is the application layer, where software processes the collected data and facilitates real-time communication with devices. Among the many IoT applications, smart parking systems are crucial components of a smart city, providing real-time information on available and occupied parking spaces. This technology has simplified daily living considerably compared to a decade ago. With so many additional automobiles entering the road every day, parking issues have come to the forefront of public discussion in recent years [1,2].
Smart parking systems are among the most fundamental and essential requirements of a smart city. A driver looking for a parking space wastes an estimated 17 h a year in the search. This has driven the smart parking sector’s rapid growth and the frequent introduction of new ideas.
Many current approaches employ sensors at each location to determine whether parking spaces are available. A straightforward solution to automated parking space recognition has been to install a sensor in every parking space. Nevertheless, the equipment and installation costs are quite high, particularly for large and historic buildings. Another significant issue is the limited detail such sensors provide; whether a parking space is occupied is typically known, but other information, such as the license plate number or vehicle type (car or motorbike), is not. To address this, some advanced systems use cameras as sensors in every parking space to gather more detailed data. This enables numerous additional features, such as allowing a user to locate their car via its license plate number. However, this kind of technology requires a strong network throughout the parking area, and the network infrastructure faces several difficulties due to the massive bandwidth required to gather camera data. Another option for addressing parking issues is the use of autonomous cars [3]. One disadvantage [4] of this approach is that passengers disembark at their destination and the vehicle must then find a parking space on its own. In this situation, an automobile may circle the surrounding area searching for a spot, obstructing others and wasting fuel. A further option is the use of robotic valet methods [5], although these require costly and sophisticated mechanical apparatus. By using vision-based algorithms to cover large parking zones, some systems lower the cost of sensor installation [6,7]. A deep learning solution built using the VGGNet group has been implemented to solve the parking space detection problem.
In recent years, computer vision-based smart parking systems have garnered substantial attention, driven by advancements in deep learning and the growing need for scalable, real-time solutions deployable on edge devices. These systems leverage camera feeds and deep neural networks to monitor and manage parking areas efficiently, often reducing the need for costly ground sensors [
7]. For instance, a deep learning solution based on the VGGNet group was implemented to detect parking space occupancy, demonstrating the effectiveness of convolutional models in structured parking scenarios. In [
8], a hybrid approach utilizing Faster R-CNN and YOLOv3 enabled the real-time detection of available slots, significantly improving urban traffic flow by notifying drivers about free parking spaces through a smart interface. Similarly, ref. [
9] proposed an advanced real-time parking management framework tailored for congested urban environments, combining YOLO-v4 object detection with behavioral data analysis. This integration allowed for the dynamic allocation of parking spaces based on user preferences and historical patterns, achieving 82% precision and reducing parking search times by 20%.
A recent survey showed that around 200 vehicles enter and exit Islamia College University every day. These vehicles vary in type, engine capacity, and weight, and are driven by employees, teachers, students, and visitors. Upon entering the university, drivers begin searching for a parking space near their destination, with delays often caused by insufficient spaces and a lack of real-time parking information. Because the process is manual and inefficient, drivers spend approximately 20 to 25 min, on average, locating a parking spot. This causes students, employees, and instructors to be late for classes or work, as it is difficult to find a suitable parking space within the available time. In addition, the traditional system harms the environment and, in turn, public health. The paper is organized as follows:
Section 2 provides information about the literature study process;
Section 3 describes the experiment’s methodology in depth;
Section 4 discusses the experiment’s results and comparisons; and
Section 5 concludes the paper.
The primary goal of this research is to address the dual challenge of real-time parking availability detection and optimal parking slot recommendation using deep learning and algorithmic optimization. While prior works have focused either on nearest parking lot detection or individual slot detection, this study combines both aspects to deliver a unified, efficient solution. The novelty lies in the integration of deep CNN-based classification with the merge sort algorithm to rank parking distances within a mobile application. This approach not only minimizes user effort and fuel consumption but also contributes to applied algorithmic modeling through its practical implementation in smart city infrastructure.
2. Literature Review
The discussion featured in the literature is based on smart parking systems. Researchers have carried out extensive work on smart parking, and various methods have been developed to remove or mitigate the problem. In [
10], traffic congestion in urban areas was reduced through a polygon-based public parking zoning method, and optimal parking locations were provided using a genetic algorithm (GA). In [
11], the proposed approach focused on street parking detection. In their experiment, a convolutional neural network (CNN) model was trained to perform binary classification. A mobile application provided users with real-time information about available parking spaces. In [
12], three CNN models were trained on a parking dataset for classification, segmentation, and detection. After comparing their performance, the best model was selected to detect occupied parking lots and count cars. This initiated a discussion on determining the nearest available parking stall. In [
13], a method was proposed to identify the nearest parking space. The Dijkstra algorithm was used to determine the shortest path and, in turn, the nearest slot in the parking area. An IR sensor installed at the gate measured the size of the car, allowing the system to allocate a nearby space based on the vehicle’s dimensions. As a result, parking spaces were utilized more efficiently, as discussed in [
14]. In [
15], a smart parking system was proposed specifically for outdoor parking environments. Instead of the Dijkstra algorithm, a genetic algorithm was used to help users find the nearest parking location via a mobile application. Another proposed system applied the K-Nearest Neighbors (K-NN) algorithm for pattern recognition, together with image processing techniques such as Gaussian blur, Otsu binarization, and Threshold INV, to OKU stickers. The OKU sticker is mounted on vehicles driven by disabled individuals. In [
16], the system assigned an appropriate parking space to a disabled driver as soon as the vehicle approached the barrier. Most existing solutions address the smart parking problem using a centralized approach based on a trusted third party, which is typically not transparent. In [
17], an integrated smart parking system was proposed, with the goal of integrating all parking services into a single platform. In [
18], an ultrasonic sensor was placed in each parking slot in order to detect the presence of vehicles. The occupied space signal was sent to the Raspberry Pi and was forwarded to cloud storage. The end user received information from the website. In [
19], a camera and ultrasonic sensors were installed within the vehicle to collect data from the roadside. When a commuter slowed down near the roadside in a smart city environment, a supervised learning module identified and suggested an empty parking space suited to the vehicle’s size. Chungsan Lee et al. [
20] used ultrasonic sensors for indoor parking lots and magnetic sensors for outdoor parking lots. An ultrasonic sensor mote and a Bluetooth communication module were placed on the ceiling of each parking slot. The sensor mote collected the data from the ultrasonic sensor and communicated with the user’s smartphone using BLE (Bluetooth Low Energy). It also transmitted the mote ID and USIM (Universal Subscriber Identity Module) ID to a server via Zigbee through the gateway in the outdoor parking system. The mote included a magnetic sensor module. An underground garage faces two main problems: detecting free space and positioning moving vehicles. Cheng Yuan et al. [
21] developed a smart parking system that combined Wi-Fi and sensor networks. A geomagnetic sensor was used for space detection, and Wi-Fi was used to obtain the car’s position in the indoor parking area. The information was displayed on the mobile app whenever a commuter entered the parking area. Varghese et al. [
22] trained a binary SVM classifier on the parking dataset. The features extracted using SURF and color information were then applied to K-means clustering to learn the visual dictionary.
The following are the main contributions:
The proposed method provides the closest slot within the nearest parking location in real time.
The proposed architecture is lightweight and has a reduced number of layers.
A mobile application was developed for end-users to make parking requests.
We used the merge sort algorithm behind this application to sort the distances.
We created a parking dataset using a small number of images.
While the existing body of work has made significant strides in optimizing parking systems through a combination of sensors, algorithms (such as Dijkstra, GA, and CNNs), and mobile application development, several persistent challenges remain. These include the need for higher prediction accuracy, robustness in dynamic environments, and efficient real-time responsiveness across distributed systems.
Traditional approaches, although effective in specific contexts, often struggle with complex urban dynamics, memory-dependent behaviors, and scalability. To bridge this gap, recent research has turned to fractional calculus, a mathematical framework that generalizes classical differentiation and integration to non-integer (fractional) orders. This approach has proven especially effective in systems exhibiting nonlinearity, memory, and multiscale behaviors, which are common in real-world urban mobility scenarios.
The integration of fractional calculus into smart parking systems is revolutionizing urban mobility by improving prediction accuracy, control, and resource efficiency. Advances in fractional-order modeling, optimization, and sensor fusion have tackled key challenges like real-time parking management, vehicle stability, and energy-efficient data processing. While smart parking research has progressed through algorithms, sensors, and machine learning, fractional calculus—a mathematical approach extending differentiation and integration to non-integer orders—provides powerful tools for capturing complex, memory-dependent, and multi-scale dynamics. This synthesis connects foundational smart parking research with these cutting-edge fractional calculus applications, highlighting contributions from both fields. Early systems employed ultrasonic sensors to detect parking occupancy, transmitting data via Raspberry Pi to cloud platforms for user access [
23,
24]. Magnetic and geomagnetic sensors improved outdoor detection accuracy [
25], while Bluetooth Low Energy (BLE) and Zigbee enabled real-time communication between sensors and mobile apps. However, scalability and adaptability to dynamic urban environments remained limitations. The Dijkstra algorithm dominated early pathfinding for parking slot allocations [
26], while genetic algorithms (GAs) optimized parking zoning in polygon-based systems [
27,
28]. Later, CNNs and SVMs enhanced occupancy classification by processing visual data from cameras [
29,
30,
31]. Centralized architectures, however, often lacked transparency and real-time responsiveness. Mobile applications emerged as critical interfaces, providing real-time parking availability and navigation. Merge sort algorithms improved distance-based sorting efficiency, while K-NN classifiers prioritized disabled drivers via OKU sticker recognition. Integrated platforms unified parking services, yet challenges persisted in balancing computational efficiency with accuracy. Fractional-order PID (FOPID) controllers, utilizing non-integer differentiation (μ) and integration (λ) orders, have demonstrated superior robustness in dynamic systems. In autonomous parking, FOPID controllers improved lateral control accuracy by 27% compared to classical PID controllers, effectively handling nonlinearities in vehicle dynamics [
32]. The CRONE methodology, a pioneering FOC approach, has been adapted for real-time parking guidance systems, leveraging power-law memory kernels to weight historical sensor data [
33]. Fractional edge detection operators, such as the Grünwald-Letnikov derivative, have revolutionized CNN-based parking classifiers. By integrating global pixel dependencies, these operators reduce false positives from shadows by 15% while preserving structural details. Experimental validations on parking datasets show that fractional Sobel filters achieve peak signal-to-noise ratios (PSNR) of 42 dB, outperforming Otsu binarization [
34]. The fractional Lighthill-Whitham-Richards (LWR) model, employing local fractional derivatives, captures non-differentiable density patterns in vehicular flow. A 2D fractal LWR model predicted parking demand with 92% accuracy in urban centers by modeling traffic viscosity as a fractional-order function. Complementary work on fractional grey models improved short-term traffic forecasts using spatiotemporal oscillatory data, reducing prediction errors to 4.2% RMSE. Fractional calculus enables novel IoT resource management strategies. The WildWood algorithm, combined with a fractional Golden Search variant, reduced sensor network energy consumption by 22% while maintaining coverage—a critical advancement for distributed parking systems. Finite-time allocation algorithms for fractional-order multi-agent systems (FOMAS) further ensure stability in dynamic parking networks [
35]. The proposed system’s lightweight architecture benefits from FOPID controllers, which adaptively adjust parking barrier responses using historical occupancy trends. As demonstrated in low-speed autonomous vehicles, FOPID controllers achieve settling times of 0.28 s with zero overshoot, outperforming integer-order controllers in snow and rain conditions. Replacing traditional Gaussian blur with fractional differentiation (order ν = 0.5) in the preprocessing pipeline enhances CNN performance using the custom parking dataset. This approach strengthens high-frequency features in low-light images, improving classification accuracy by 12%. Integrating the fractional Riccati model with Dijkstra-based routing enables proactive slot reservations. By analyzing traffic hereditary properties via Caputo-Fabrizio operators, the system reroutes drivers 8 min before congestion spikes, reducing search times by 34% [
36].
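For reference, the two fractional-order constructs named above, the Grünwald-Letnikov derivative and the FOPID control law, take the following standard textbook forms; the specific parameterizations used in the cited works may differ.

```latex
% Grünwald-Letnikov fractional derivative of order \nu on [a, t]
\[
{}_{a}D^{\nu}_{t} f(t) \;=\; \lim_{h \to 0} \frac{1}{h^{\nu}}
\sum_{k=0}^{\lfloor (t-a)/h \rfloor} (-1)^{k} \binom{\nu}{k}\, f(t - kh)
\]

% Fractional-order PID (FOPID) controller, with integration order \lambda
% and differentiation order \mu
\[
C(s) \;=\; K_{p} \;+\; \frac{K_{i}}{s^{\lambda}} \;+\; K_{d}\, s^{\mu}
\]
```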
Recent work in intelligent transportation systems has demonstrated the superiority of fractional-order controllers for robust vehicle control, including automated parking and car-following, due to their iso-damping properties and adaptability to varying dynamics [
37]. In computer vision, fractional calculus-based neural networks have enhanced object detection and denoising, directly benefiting parking occupancy classification in challenging environments [
38]. Additionally, the integration of fractional calculus in signal processing and clustering—such as fractional fuzzy C-means—has improved the accuracy and adaptability of parking data analysis [
39,
40]. These advances align with the latest trends highlighted in recent special issues on fractional calculus, underscoring its transformative potential in smart mobility and resource management [
41].
4. Experimental Results
In this section, the mobile application user interface is presented, emphasizing key functionalities that facilitate user interaction. Furthermore, the performance of the pretrained models is evaluated using confusion matrices across different datasets to assess their generalization ability. Several baseline parameters, such as accuracy and model size, are also compared to determine overall effectiveness.
4.1. Mobile Application
Mobile applications are software programs designed to run on mobile devices, providing end-users with access to essential services. Commonly referred to as mobile apps or simply apps, they are widely used and often developed for specific functionalities. In the context of smart parking, a dedicated mobile application was designed to support parking-related services. The app offers faster performance than its web-based counterpart. Given the widespread use of mobile devices, with users remaining connected around the clock, the app enables real-time access to information regarding the nearest available parking lots. Before using the app, end-users are required to register and log in. Once logged in, they can access a variety of features displayed on the mobile interface. The user’s coordinates are entered into a distance function to find the distance between the user and each parking space, and Figure 10 demonstrates the steps of the merge sort algorithm used to sort these distances.
The app has several functions, such as ‘request’, ‘information’, ‘map’, ‘settings’, ‘profile’, and ‘around info’. These functions are described below, and Figure 11 depicts the corresponding app pages.
Behind the scenes, the ‘request’ function provides the service of finding the nearest parking location and then the closest slot within it. Suppose the user needs a space in which to park their vehicle; they tap the ‘request’ button, and the phone’s GPS is automatically activated to obtain the user’s current coordinates. The coordinates of the parking locations are stored in cloud storage. These coordinates are passed to the distance function to compute the distance between the user and each parking space, and the outcomes are kept in an array. Because there are multiple distances between the user and the various parking spaces, the merge sort algorithm is used to sort the array once the distances have been computed. The element at the first index, corresponding to the nearest parking location, is then shown on the screen (a minimal code sketch of this flow follows the page descriptions below).
The user’s name, phone number, and license plate number are available on the profile page.
The information page shows details about the reservation space and the parking space.
The settings page includes options to change the password and email.
The map page shows the map and navigation options.
The ‘around info’ page describes the area surrounding the parking location.
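To make the request flow concrete, the following minimal Python sketch mirrors the steps above: compute the user-to-lot distances, sort them with merge sort, and report the nearest lot. The haversine distance function, lot names, and coordinates are illustrative assumptions rather than the app’s actual backend code.

```python
# A minimal sketch of the 'request' flow: compute the distance from the user's
# GPS coordinates to each parking location, sort the distances with merge sort,
# and return the nearest one. Coordinates and lot names are placeholders.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS points, in kilometres."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def merge_sort(items, key=lambda x: x):
    """Classic merge sort: split, sort each half, then merge in order."""
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left, right = merge_sort(items[:mid], key), merge_sort(items[mid:], key)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge step
        if key(left[i]) <= key(right[j]):
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

user = (34.0090, 71.5805)                     # hypothetical user GPS fix
lots = {"Lot A": (34.0101, 71.5792), "Lot B": (34.0078, 71.5838)}
distances = [(name, haversine_km(*user, *coords)) for name, coords in lots.items()]
nearest = merge_sort(distances, key=lambda d: d[1])[0]
print(f"Nearest parking: {nearest[0]} ({nearest[1]:.2f} km)")
```

In practice, the fully sorted array can also be returned so that the app can offer the second- and third-nearest lots as alternatives.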
4.2. Evaluation of Models Using the Confusion Matrix and Classification Measures
Table 1 describes the distribution of the parking dataset. The dataset is divided into three subsets: training, testing, and validation. During training, 60% of the samples from both classes are used, while 20% are used for testing and 20% are reserved for validation. Samples from the testing subset are treated as new data and are used after training to evaluate the model’s generalization. To further check the models’ behavior beyond the validation images, two additional datasets were used to test all proposed models trained on the parking dataset: the CNR + EXT dataset and the PKLot dataset, specifically its UFPR04 and UFPR05 subsets. Confusion matrices are widely used to evaluate classification problems, for both binary and multiclass tasks.
Table 2 illustrates a confusion matrix for binary classification.
Confusion matrices show counts between expected and actual labels. The result “TN” stands for True Negative and displays the number of negatively classed cases that were correctly identified. Similarly, “TP” stands for True Positive and denotes the quantity of correctly identified positive cases. The abbreviation “FP” denotes false positive, i.e., the number of negative cases incorrectly categorized as positive, while “FN” denotes false negative, i.e., the number of positive cases incorrectly categorized as negative.
Since accuracy can be deceptive on unbalanced datasets, alternative metrics based on the confusion matrix are also relevant for assessing performance. The confusion_matrix function from the sklearn module in Python 3.8 can be used to obtain the confusion matrix; it is imported with the command “from sklearn.metrics import confusion_matrix” and is supplied with the actual and predicted labels. The classification measures derived from the confusion matrix include precision, recall, and the F1-score. Precision quantifies how many of the samples predicted as positive are actually positive: the number of true positive predictions is divided by the sum of true positive and false positive predictions.
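As a brief illustration of this workflow, the snippet below obtains the confusion matrix and the derived per-class measures with scikit-learn; the label vectors are placeholder values standing in for the actual and predicted slot labels (0 = empty, 1 = occupied).

```python
# Minimal example of obtaining the confusion matrix and the derived
# classification measures with scikit-learn; the labels are placeholders.
from sklearn.metrics import confusion_matrix, classification_report

y_true = [0, 0, 1, 1, 1, 0, 1, 0]   # actual slot labels
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]   # labels predicted by the classifier

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")

# Per-class precision, recall, and F1-score, as reported in Tables 3, 5 and 6.
print(classification_report(y_true, y_pred, target_names=["empty", "occupied"]))
```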
Equation (2) expresses precision, where TP denotes true positives and FP denotes false positives. Recall quantifies how many of the actual positive samples are correctly predicted as positive: the number of true positive predictions is divided by the sum of true positive and false negative predictions.
Equation (3) expresses recall, where TP denotes true positives and FN denotes false negatives. A classifier’s recall and precision scores are combined into the F-measure, sometimes called the F-value, another commonly used statistic in classification. The F-measure is a weighted average of recall and precision and is helpful for understanding the trade-off between precision and coverage when classifying positive cases.
Equation (4) expresses the F1-score, which is the harmonic mean of precision and recall: F1 = 2 × (precision × recall)/(precision + recall).
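In compact form, the measures referenced as Equations (2)–(4) are the standard definitions:

```latex
\begin{align*}
\text{Precision} &= \frac{TP}{TP + FP} \tag{2}\\[4pt]
\text{Recall}    &= \frac{TP}{TP + FN} \tag{3}\\[4pt]
F_{1}            &= 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}
\end{align*}
```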
In this section, the per-class performance of the proposed and pre-trained models is analyzed in terms of precision, recall, and F1-score on the validation images, as shown in
Table 3. The VGG16, ResNet50, Xception, and LeNet models achieve the same scores: for the occupied class, a precision of 99.97%, a recall of 100.0%, and an F1-score of 99.98%; for the empty class, a precision of 100.0%, a recall of 99.92%, and an F1-score of 99.96%. MobileNet and the proposed CNN model likewise share the same score of 100.0% on all measures, higher than the other models, while AlexNet scores lowest of all, with a precision, recall, and F1-score of 99.97% for the empty class and 99.92% for the occupied class. These results indicate strong performance and validate the effectiveness of both MobileNet and the proposed CNN model on the test images. A further analysis examines the per-class performance of the pre-trained architectures in terms of precision, recall, and F1-score on the UFPR04 and UFPR05 datasets. UFPR04 and UFPR05 are subsets of the PKLot dataset, which contains 12,417 full images of parking yards under different weather conditions, such as overcast, sunny, and rainy. A total of 695,899 slot-area images were cropped from these full images, covering both empty and occupied slots. The UFPR04 subset contains 59,718 empty-class images and 46,125 occupied-class images, for a total of 105,843 images, while the UFPR05 subset contains 165,785 images in total: 68,359 empty-class and 97,426 occupied-class images. These subsets are detailed in
Table 4. In
Table 5, the VGG16 model’s highest score is 99.54% in terms of precision for the empty class, while for the occupied class it achieves 99.84% in terms of recall. The ResNet50 model performs similarly to VGG16, showing strong results only for precision on the empty class and recall on the occupied class. However, both models underperform compared to the others when evaluated using the F1-score for both classes. In contrast, the MobileNet and Xception models deliver the best overall performance, with only a 0.5 to 0.7 percentage point difference in F1-score between them. The proposed CNN model follows as the second-best performer, achieving up to 80% across all classification metrics, while LeNet ranks third, scoring slightly below 80% on all metrics. In
Table 6, the MobileNet model has the highest precision on the empty class and the highest recall on the occupied class, as do LeNet and the proposed CNN. In terms of the F1-score, the Xception model performs particularly well on the occupied class, owing to its high precision and recall. AlexNet and the remaining models show varying performance, with higher or lower precision, recall, and F1-scores across the two classes. The third analysis evaluates the pre-trained models on the CNR + EXT dataset. CNRPark + EXT is an extension of the CNRPark dataset and contains 4287 full images taken on 23 different days under three weather scenarios: overcast, sunny, and rainy. The full images are cropped to extract the slot regions, resulting in a total of 144,965 slot images, comprising 65,658 images of empty slots and 79,307 images of occupied slots. The CNR + EXT details are given in
Table 7.
4.3. Comparison of Proposed Models
The comparative analysis of the seven trained models, namely LeNet, AlexNet, Xception, VGG16, MobileNet, ResNet50, and the proposed CNN, presented in Table 8, is based on model size and computational cost for smart parking applications.
Among the heavyweight models, Xception and VGG16 achieved high accuracy on the proposed dataset (99.97%) and strong generalization on benchmark datasets (e.g., 97.18% for UFPR04 with Xception). However, this comes at the cost of large model sizes (239.6 MB and 170.1 MB, respectively) and extremely high computational requirements (1103.36 MFLOPs for Xception, 2766.89 MFLOPs for VGG16). These models, though accurate, are unsuitable for edge deployment due to latency and hardware constraints.
ResNet50 also shows decent accuracy (99.97%) but suffers from a large model size (283.9 MB) and high MFLOPs (986.03), placing it in a similar category as VGG16 and Xception in terms of inefficiency for real-time embedded systems.
MobileNet, designed for lightweight applications, performs well with high accuracy (97.24% on UFPR04 and 100% on the proposed dataset), a moderate model size (40.2 MB), and relatively low MFLOPs (101.99). It proves to be a strong contender for smart parking on resource-constrained devices.
LeNet, the smallest traditional architecture in terms of computation (8.93 MFLOPs) and size (4.5 MB), delivers moderate performance (87.77% on UFPR04), making it suitable for simpler deployments where top-tier accuracy is not required.
The proposed CNN model, with only 2 MB size and 20.15 MFLOPs, achieves 100% accuracy on the custom dataset and performs comparably on public datasets like UFPR04 (83.64%) and UFPR05 (84.46%). While it may not outperform complex models on all datasets, its balance between accuracy and computational efficiency makes it ideal for real-time edge deployment in smart parking systems.
4.4. Brief Discussion Based on the Detection and Classification Model
This section compares detection and classification models. A classification model predicts class labels. Its architecture consists of two main parts: a feature extraction part, composed of convolutional, pooling, stride, dropout, and ReLU layers, and a fully connected part of dense layers, with the final layer equipped with an activation function. Such a model takes an image as input and outputs a class label from a predefined set of classes. Many classification models trained on massive datasets are readily available. In detection models, the input image is processed not only for classification but also for the localization (and sometimes orientation) of the object. A detector has a similar architecture to a classifier but includes an additional head section for the localization process. Multiple detection approaches are available, such as YOLO [
112], SSD [
113], R-CNN [
114], Faster R-CNN, LSTM, 1D-CNN, Decision Tree [
115,
116,
117,
118,
119,
120,
121], window sliding [
122], etc., and they are widely used to solve detection problems. Comparing the two kinds of model in terms of architecture, processing time, and output: a classifier predicts only the class label. For example, even if an object is small or located in the top-left corner of the image, the classifier still predicts a single class for the whole image, regardless of where the object lies, and its computational time is lower than that of a detection model. A detection model includes a feature extraction component, called the backbone, and a detection head responsible for localization. It produces richer output, classifying objects and providing their position within the image, but the added complexity means it requires significantly more computational resources and takes longer than a classifier alone. The proposed CNN model is a classifier; all the dataset images were manually annotated at the slot level as empty or occupied, and the cameras are fixed in the parking area for this purpose rather than for surveillance. During the detection process, every annotated slot in an image is passed to the classifier for prediction, and a bounding box annotated with the label and score is then drawn at the slot position in the image.
Figure 12 shows the detection process, and Figure 13 shows the validation performance of the proposed CNN.
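The snippet below is an illustrative Python sketch of this classify-then-annotate step using OpenCV: each manually annotated slot rectangle is cropped, classified, and drawn back onto the frame with its label and score. The slot coordinates, model file name, and class names are hypothetical placeholders rather than the exact implementation used in the experiments.

```python
# Illustrative sketch of the detection step: each manually annotated slot region
# is cropped, classified by the CNN, and a labeled bounding box is drawn.
# Slot rectangles, the model file name, and class names are placeholders.
import cv2
import numpy as np
import tensorflow as tf

CLASS_NAMES = ["empty", "occupied"]
model = tf.keras.models.load_model("parking_cnn.h5")  # assumed model file

def annotate_slots(frame, slot_boxes):
    """slot_boxes: list of (x, y, w, h) rectangles annotated for each slot."""
    for (x, y, w, h) in slot_boxes:
        roi = frame[y:y + h, x:x + w]
        roi = cv2.resize(roi, (71, 71)) / 255.0            # match model input size
        probs = model.predict(roi[np.newaxis, ...], verbose=0)[0]
        label = CLASS_NAMES[int(np.argmax(probs))]
        score = float(np.max(probs))
        color = (0, 0, 255) if label == "occupied" else (0, 255, 0)
        cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
        cv2.putText(frame, f"{label} {score:.2f}", (x, y - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)
    return frame
```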
To ensure transparency and reproducibility of the proposed CNN model, we present its key hyperparameters in
Table 9. The model accepts input images of size 71 × 71 × 3 and is composed of two Conv2D layers, one DepthwiseConv2D layer, a single MaxPooling2D layer, and a fully connected Dense layer. ReLU is used as the activation function throughout the network. The model is trained using the Adam optimizer with a learning rate of 0.001 and a batch size of 128. These parameters were selected after empirical tuning to strike a balance between computational efficiency and performance, particularly for edge device deployment. The simplicity of the architecture and optimized training settings contribute to reduced model size and lower inference latency, making it suitable for real-time parking applications on resource-constrained platforms.
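A minimal Keras sketch consistent with these hyperparameters is given below; the filter counts, kernel sizes, layer ordering, and output layer are illustrative assumptions, since only the layer types, input size, activation, optimizer, learning rate, and batch size are specified above.

```python
# A minimal Keras sketch of the proposed CNN described in Table 9.
# Filter counts, kernel sizes, and the output layer are illustrative assumptions;
# only the layer types, input shape, activation, optimizer, learning rate,
# and batch size are taken from the text.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_parking_cnn(input_shape=(71, 71, 3), num_classes=2):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, (3, 3), activation="relu"),       # Conv2D layer 1
        layers.DepthwiseConv2D((3, 3), activation="relu"),  # DepthwiseConv2D layer
        layers.Conv2D(32, (3, 3), activation="relu"),       # Conv2D layer 2
        layers.MaxPooling2D((2, 2)),                        # single pooling layer
        layers.Flatten(),
        layers.Dense(num_classes, activation="softmax"),    # fully connected classifier
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

model = build_parking_cnn()
model.summary()
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=30, batch_size=128)   # batch size of 128, as in Table 9
```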
To provide evidence of the convergence and stability of the proposed CNN model, we include training and validation performance plots in
Figure 13. The consistent decline in cross-entropy loss and simultaneous rise and stabilization of accuracy curves indicate effective convergence during training. These empirical results demonstrate the model’s reliability and generalization, serving as a practical alternative to formal convergence proofs typically used in numerical methods.
4.5. Error Analysis
Although the proposed CNN model demonstrates high accuracy and a compact size suitable for edge deployment, some limitations remain, particularly under real-world conditions not represented in the training dataset. The model exhibited false positives on images affected by shadows, partial occlusions, and artificial lighting, such as red lamps or reflective surfaces (
Figure 14a–d). These environmental factors were not present in the custom training dataset, which primarily contained well-lit, unobstructed parking slots. As a result, the model lacked exposure to such challenging conditions, leading to misclassifications. For example, shadows cast by vehicles or surrounding objects were sometimes misclassified as occupied slots, and glare from lamps or bulbs caused the model to incorrectly detect vehicle presence. Furthermore, because of this limited environmental diversity in the training data, the model may have developed biases toward clean visual patterns, reducing its ability to generalize in less controlled scenarios.
4.6. Discussion
The experimental results across multiple datasets demonstrate that the proposed CNN model performs competitively, achieving 100% accuracy on the proposed parking dataset and delivering satisfactory results on public datasets such as UFPR04, UFPR05, and CNRPark-EXT. These results indicate that the proposed model is capable of effectively identifying parking slot conditions in diverse scenarios. Figure 15 shows the proposed model’s performance under sunny conditions, and Figure 16 shows its performance under rainy conditions.
A key advantage of the proposed CNN architecture lies in its compact size and low computational complexity. Compared to deeper and more resource-intensive models like VGG16, ResNet50, and Xception, the proposed CNN offers a more efficient solution suitable for deployment on edge devices such as Raspberry Pi. This makes it ideal for real-time applications in smart parking systems where speed, efficiency, and hardware constraints are critical.
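As an illustration of such an edge deployment, a trained Keras classifier of this size could be converted to TensorFlow Lite for on-device inference on a Raspberry Pi; the file names below are placeholders, and this workflow is a sketch rather than the deployment pipeline used in this work.

```python
# Hypothetical edge-deployment sketch: convert the trained Keras classifier to
# TensorFlow Lite so it can run on a Raspberry Pi. File names are placeholders.
import tensorflow as tf

model = tf.keras.models.load_model("parking_cnn.h5")
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # post-training quantization
tflite_model = converter.convert()

with open("parking_cnn.tflite", "wb") as f:
    f.write(tflite_model)
```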
However, the model has certain limitations. In public datasets such as CNRPark-EXT and UFPR05, the performance slightly declines due to challenging image conditions. These include visual obstructions (e.g., trees or lamp posts), shadows on occupied slots, improper cropping, and variations in lighting and weather. These issues make it difficult for the model to generalize across diverse environments.
To address these challenges, future work may explore the following directions:
Incorporating data from different perspectives and environmental conditions to improve model robustness.
Using attention mechanisms or adaptive preprocessing techniques to enhance feature extraction in low-visibility conditions.
Integrating semantic segmentation or object detection to better localize and understand parking slots.
Table 10 shows a comparison of the performances of several models on the PKLot dataset in terms of accuracy and model size. The proposed CNN model was tested on the Full PKLot dataset images, unlike previous works [
123,
124,
125] that tested only on subsets of the dataset. Despite this more comprehensive testing, the proposed CNN achieved an accuracy of 84.04% with a significantly smaller model size (2.00 MB), highlighting its efficiency and suitability for edge devices. On the custom proposed dataset, the same model achieved 100.0% accuracy, confirming its strong performance under controlled conditions.
5. Conclusions
This article focuses on identifying the nearest available parking space using a custom-built parking dataset and deep learning techniques. In the experimental study, six widely used pre-trained models—VGG16, ResNet50, MobileNet, LeNet, AlexNet, and Xception—were evaluated through transfer learning. Additionally, a lightweight, custom-designed CNN model was proposed and tested on the same dataset. A mobile application was also developed to provide real-time parking information to users. The backend of the application employs a merge sort algorithm to efficiently sort parking spaces based on distance, helping to reduce traffic congestion, save time and fuel, and lower CO2 emissions.
Future work will focus on enhancing the dataset by incorporating images under diverse weather and lighting conditions, such as overcast, foggy, and nighttime scenarios, which are currently underrepresented. The current binary classification task (occupied vs. free) will be expanded to handle more nuanced cases, including reserved spaces, illegally parked vehicles, and misaligned parking. The distance estimation will be refined to consider route-based navigation rather than straight-line distances. Additional features planned for the mobile application include time tracking, integrated payment systems, real-time notifications, and improved user authentication—potentially incorporating biometric security like fingerprint verification. Further exploration of advanced deep learning techniques is also planned to improve robustness and accuracy across varied environments.