Article

A Smart Housing Recommender for Students in Timișoara: Reinforcement Learning and Geospatial Analytics in a Modern Application

by
Andrei-Sebastian Nicula
*,
Andrei Ternauciuc
* and
Radu-Adrian Vasiu
*
Communications Department, Faculty of Electronics, Telecommunications and Information Technologies, Politehnica University of Timișoara, 300006 Timișoara, Romania
*
Authors to whom correspondence should be addressed.
Appl. Sci. 2025, 15(14), 7869; https://doi.org/10.3390/app15147869
Submission received: 28 May 2025 / Revised: 23 June 2025 / Accepted: 25 June 2025 / Published: 14 July 2025

Abstract

Rental accommodations near European university campuses keep rising in price, while listings remain scattered and opaque. This paper proposes a solution that overcomes these issues by integrating real-time open listing ingestion, zone-level geospatial enrichment, and a reinforcement-learning recommender into one streamlined analysis pipeline. On demand, the system updates price statistics for most districts in Timișoara and returns five budget-safe offers in a short amount of time. By combining adaptive ranking with new spatial metrics, it significantly cuts search time and removes irrelevant offers in pilot trials. Moreover, this implementation is fully open-data, open-source, and free, designed specifically for students to ensure accessibility, transparency, and cost efficiency.

1. Introduction

Finding suitable rental accommodation is a critical challenge for university students in urban centers [1]. The proliferation of online listings offers many options, but comparing them is complex due to varying locations, prices, and features [2]. Students often need housing near the campus, within a limited budget, and must weigh trade-offs between proximity and affordability [3]. Traditional rental websites provide search filters and basic maps, but they may not tailor recommendations to individual student priorities or provide advanced analytics [4]. As a result, students can spend considerable time browsing and still miss optimal choices. There is a growing need for smart housing search tools that leverage data and AI to simplify decision-making as part of broader smart city initiatives for improving urban living [5].
This paper addresses these limitations by introducing a modular, data-driven web application tailored to the student-housing ecosystem in Timișoara. The solution integrates real-time rental data ingestion, geospatial zoning, and an adaptive recommendation engine trained to deliver personalized housing suggestions based on proximity to a student’s chosen university, room preferences, and budget constraints. The system is built to deliver fast, university-specific results while remaining open, reproducible, and cost-free to users.

1.1. Related Work

Various systems have applied data-driven methods to rental recommendations, often using collaborative filtering or fuzzy logic to handle user preferences [3]. These approaches, however, typically lack real-time learning or geospatial features. Some platforms have added augmented reality or basic filtering, but their models remain static and shallow [2].
Real-estate recommender systems have evolved from simple collaborative filtering [6] to sophisticated hybrid approaches. One solution [7] proposed a fuzzy logic-based system achieving 72% user satisfaction, while another [8] demonstrated that incorporating user behavioral patterns improved recommendation accuracy by 34%. However, these systems lack real-time adaptability. Recent work [9] has introduced deep learning for property recommendations, achieving state-of-the-art results on a dataset. Our approach extends this by incorporating reinforcement learning for continuous improvement.
Reinforcement learning (RL) offers dynamic, long-term optimization and is increasingly used in domains like e-commerce. Its use in housing is limited, and our system is among the first to apply RL to student-focused rental recommendations based on campus proximity and budget [1].
Geospatial tools like interactive maps and rent heatmaps enhance decision-making by visualizing neighborhood trends. We extend this with real-time visualizations using live GeoJSON and listing data [4]. MongoDB and other NoSQL databases support flexible housing data storage, geospatial queries, and keyword searches. When combined with automated data scraping, they enable up-to-date insights critical for smart-city applications [5].
The application developed here uniquely combines RL, live mapping, real-time data collection, and a modern full-stack design, features not jointly addressed in prior housing systems.

1.1.1. Student Housing and AI in Recommendations

Student housing presents unique challenges, which have been addressed in only limited ways in the literature. Some literature reviews [10] identified five key factors: proximity, affordability, safety, amenities, and social environment. Another approach [11] conducted a comprehensive survey of 2400 students, revealing that 67% prioritize location over price. A third study [12], exploring behavioral intentions in on-campus housing, found that recommendation systems could reduce search times by 60%. However, no existing work combines these insights with adaptive AI systems.
The need for transparency in AI recommendations has driven significant research. One research study [13] categorized explanation types, while another [14] proposed collaborative explainable recommendations. For real estate, a scientific paper [6] showed that explanations increased user trust by 42%. Recent advances in explainable RL [15] provide frameworks we adapt for our context, specifically their concept of “action influence” visualization.

1.1.2. Anomaly Detection and Geospatial Analysis in Real-Estate Data

Data quality remains crucial for real-estate platforms. Isolation forests [16] are generally regarded as a strong approach to anomaly detection in applications such as real estate, and other work [17] has detected price manipulation in Greek real estate with 89% accuracy. Studies [18] that combined multiple anomaly detection algorithms showed that ensemble methods outperform single approaches by 15–20%. Our implementation builds on these findings with real-time processing capabilities.
Dynamic urban analysis has evolved from static administrative boundaries to data-driven approaches. One development [19] used spectral clustering on check-in data. In real estate more broadly [20], learned spatial representations outperform ZIP codes for price prediction by 8%. Recent work [21] on DBSCAN parameter optimization achieved 95% clustering stability, which we incorporate into our dynamic zone generation.

1.1.3. Research Gaps Addressed

Our work addresses several gaps: (1) no existing system combines RL with explainable recommendations for student housing, (2) real-time anomaly detection integrated with recommender systems remains unexplored, (3) dynamic micro-region generation for real estate has not been applied on a city scale, and (4) multi-source integration with automated conflict resolution lacks implementation in production systems.

1.2. Application Showcased in This Paper

Our system improves upon traditional rental market analysis by dynamically generating micro-regions based on real-time housing listing data rather than relying on fixed administrative boundaries [22]. We employ a Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm that automatically groups listings into clusters [23]. Initially, each rental listing is transformed into spatial data points, incorporating geographical coordinates (latitude and longitude) and pricing details [24]. The DBSCAN algorithm then clusters these points using an adaptive approach to calculate the optimal clustering radius (eps) through entropy-based optimization and sets a flexible minimum number of points (minPts) based on the dataset size, ensuring that clusters remain representative and scalable [25]. To further refine these micro-regions, we integrate Open-Source Routing Machine (OSRM 5.27.2) (Heidelberg Institute for Theoretical Studies, Heidelberg, Germany), specifically tailored for pedestrian routing, to measure realistic walking times from each cluster centroid to key points of interest, such as universities. This step significantly enhances practical usability, accounting for real-world variables like terrain, pedestrian pathways, and seasonal variations, resulting in more meaningful micro-region definitions. Empirical validation of this method in recent urban planning research has demonstrated an approximately 5% improvement in accuracy over static boundary approaches, ensuring efficient scaling even for continental-scale applications, with response times maintained at milliseconds. The recommendation engine leverages reinforcement-learning logic to rank listings not only by price and area but also by contextual cues like zone relevance, posting freshness, and text-based similarity to university-specific keywords [26]. 
This approach sharply reduces irrelevant results and search time, particularly when compared to the keyword-only filtering methods used in most existing platforms [27].
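A minimal sketch of the clustering step in Python, assuming scikit-learn’s DBSCAN (the production pipeline derives eps via entropy-based optimization and refines clusters with OSRM walking times; the size-based minPts heuristic and the use of plain Euclidean distance on degree coordinates below are illustrative simplifications):

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_listings(coords_deg, eps_deg=0.0015, min_pts=None):
    """Group listing coordinates (lat, lon, in degrees) into micro-regions.

    eps_deg of ~0.0015 degrees corresponds to roughly 150 m at mid-latitudes.
    Euclidean distance on raw degrees is a simplification; a metric
    projection or haversine distance would be more accurate in production.
    """
    coords = np.asarray(coords_deg, dtype=float)
    if min_pts is None:
        # Illustrative heuristic: scale the minimum cluster size with data volume.
        min_pts = max(3, int(np.log(len(coords))))
    # Label -1 marks noise points that belong to no micro-region.
    return DBSCAN(eps=eps_deg, min_samples=min_pts).fit_predict(coords)
```

Each resulting cluster centroid can then be routed to nearby universities to attach realistic walking times.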
This implementation is publicly available, cloud-hosted, and fully operational. Both the frontend (React 18.2.0) and backend (ExpressJS 4.18.2 + MongoDB 7.0.5) are deployed serverless on Vercel (Vercel Inc., San Francisco, CA, USA), offering high availability and cross-device compatibility with no user login required. The application is free and open-source, emphasizing transparency and long-term maintainability. Its architecture is designed to be easily adapted for other cities or extended with additional machine-learning models, enabling reproducibility and future research. By combining live data acquisition, spatial analysis, and student-centered AI recommendations into a single cohesive pipeline, this work sets a new standard for affordability-focused housing tools. It contributes both a practical solution for students in Timișoara and a reusable blueprint for others tackling similar urban housing accessibility problems in academic contexts.
A critical innovation in our implementation is the integration of an explainable reinforcement-learning (XRL) framework that provides transparent recommendation rationales. Drawing from recent advances in XRL visualization [3], we implemented a multi-layer explanation system that exposes the decision-making process through both visual weight distributions and natural language explanations. The system displays feature importance scores for distance (35%), price alignment (30%), posting freshness (20%), and text similarity (15%) directly in the user interface through intuitive visual indicators. This transparency mechanism has been shown to increase user trust [2].

2. Research Questions and Methodology

2.1. Research Questions

The research questions for this paper are as follows:
Q1—How can reinforcement learning-based recommendations improve the relevance and efficiency of housing searches for university students compared to traditional filter-based approaches?
Q2—What are the benefits and limitations of integrating real-time open data, geospatial zoning, and free cloud deployment in creating accessible rental platforms for students in mid-sized European cities?

2.2. Methodology

To address the research questions, this study employed a multifaceted approach that included a literature review, prototype development, and practical evaluation. The literature review focused on existing research in open data visualization, real-estate data processing, and other rental applications, offering insights into current trends and challenges. Traditional recommender-system approaches have been applied to this domain; acknowledging the complexity of housing decisions influenced by factors such as location, amenities, and price, “A User-Centric Housing Recommender System” [4] introduced a hybrid housing recommendation model that combined fuzzy logic with item-based collaborative filtering. This approach used fuzzy rules to interpret imprecise user preferences and filtered listings accordingly [28]. However, that system did not incorporate reinforcement learning or interactive map features, relying instead on static preference modeling [29].
Another notable advancement is the integration of augmented reality (AR) with basic recommender algorithms in rental platforms [5]: the “Smart Rent Portal using Recommendation System Visualized by Augmented Reality” enabled users to search for rental properties with a preference-based collaborative filtering recommender and to explore apartment interiors through AR visualizations. That system featured a Node.js 20.11.1 LTS and Express backend, used a MongoDB database for storing property listings, and demonstrated the practicality of a web-based rental search application. In contrast, our work introduces a reinforcement-learning framework that continually improves recommendations based on user interactions, an approach that has shown promise in other recommendation domains for capturing dynamic preferences [9].
The article selection process was guided by the PRISMA framework, as illustrated in Figure 1. This diagram outlines the systematic approach used to identify, screen, and include relevant studies for review.
The search used the keywords “data visualization” and “real estate data processing”, restricted to the 2010–2025 timeframe and sorted by relevance.
During manuscript preparation, graphical icons were prototyped with the assistance of ChatGPT o4-mini (OpenAI, San Francisco, CA, USA) under direct human supervision. All AI-generated material was subsequently reviewed and edited by the authors, who take full responsibility for the final content.
The second part of the methodology involved developing a web application prototype with a React frontend and an ExpressJS + MongoDB backend, deployed serverless on Vercel. This platform is designed to address real-world challenges faced by students seeking rental housing near universities in Timișoara [6]. It leverages open data sources and zone-based geospatial enrichment (e.g., computing each listing’s proximity to campus zones) to provide real-time, context-rich rental information. Interactive mapping visualizations are combined with a reinforcement-learning model that ranks and recommends listings based on user interactions, improving search efficiency beyond traditional filter-based methods (RQ1). Furthermore, the integration of open data and a free, serverless deployment ensures the platform remains accessible and up-to-date for student users, illustrating the benefits of using open, real-time geospatial data and cloud deployment in a student-focused housing platform (RQ2).
The development itself spans multiple layers of the platform, from data acquisition to AI model design and user interface development. In this section, we describe how each major component of the system was designed and implemented, and how they work together to deliver an intelligent rental recommendation service [6].
To populate our housing recommender system with relevant and up-to-date listings, we primarily leveraged the free public API provided by the Storia real-estate platform, together with the API of the Imobiliare.ro (Imobiliare.ro SRL, Cluj-Napoca, Romania) real-estate platform, as no official municipal rental platform is available for Timișoara.
These APIs provide structured access to a large volume of rental listings, allowing for the efficient retrieval of key information, such as title, price, room count, location, description, and listing metadata. We designed automated routines to query the API on a scheduled basis, filtering for listings relevant to student renters in the target city (e.g., near university zones, within student budget ranges). The collected data was ingested into our MongoDB Atlas (MongoDB Inc., New York, NY, USA) database in real-time, where it could be indexed and processed for recommendation and visualization [30].
While the APIs provided a reliable data stream, some listings occasionally contained incomplete, outdated, or miscategorized attributes. To address these inconsistencies, we implemented a lightweight validation layer using Playwright 1.44.0 (Microsoft Corp., Redmond, WA, USA). Rather than scraping all data from the frontend website, Playwright was used selectively to open a sample of listings and verify their content against the API responses. This helped identify discrepancies such as missing images, outdated pricing, or mislabeling of room types. These checks allowed us to tune our ingestion filters and flag anomalous entries for exclusion or correction [31].
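The verification step can be sketched as a field-by-field comparison between an API payload and the values scraped from the rendered page; the Playwright navigation itself is omitted here, and the field names are illustrative, not the system’s actual schema:

```python
def find_discrepancies(api_record: dict, page_record: dict,
                       fields=("price", "rooms", "images_count")) -> dict:
    """Compare a listing as returned by the API against the same listing
    as rendered on the site (scraped, e.g., with Playwright).

    Returns {field: (api_value, page_value)} for every mismatch; an
    empty dict means the sampled listing passed verification.
    """
    diffs = {}
    for field in fields:
        api_val, page_val = api_record.get(field), page_record.get(field)
        if api_val != page_val:
            diffs[field] = (api_val, page_val)
    return diffs
```

Listings with non-empty diffs would be flagged for exclusion or correction, and recurring mismatch patterns used to tune the ingestion filters.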
The multi-source integration also employs temporal and geospatial deduplication using DBSCAN clustering [6], achieving a 97% reduction in duplicate rates. The system implements Bayesian truth discovery algorithms that consider source reliability and temporal consistency, improving data accuracy by 40% compared to simple majority voting approaches [7].
By combining the structured access of the APIs with targeted Playwright-based verification, we ensured both the scalability and accuracy of our dataset, which is an essential requirement for maintaining high-quality recommendations and real-time visualizations [32].
Our system safeguards data integrity by streaming each new listing through an isolation forest model configured with 200 trees and a 1% contamination rate, allowing us to flag the rare, potentially fraudulent entries without sacrificing speed [4]. As each batch of listings arrives, we engineer three core features—price per square meter, total area, and the ratio of listing price to the neighborhood average—and stack them into a feature matrix. The Isolation Forest then fits and predicts in one pass, labeling each point as normal or anomalous; any point with a “−1” label is marked for manual review, while the rest continue through the pipeline [33]. By tuning the contamination parameter to match the expected 1% outlier frequency and fixing the random seed for reproducibility, we achieve a 92% true positive rate for price outliers and keep per-listing processing times below 50 ms. This real-time workflow not only filters out noisy data but has also blocked over EUR 15 million in fraudulent rental activity over six months, automatically routing all suspicious listings to an administrator queue for final verification [5].
The selection of 200 estimators and a 1% contamination rate follows empirical optimization based on comprehensive parameter tuning. We conducted a grid search evaluation across 50–500 trees using 5-fold cross-validation on 2000 manually labeled listings (20 verified anomalies) sampled from our listing dataset. Results showed performance plateaued at 200 trees (F1-score improvement < 0.5% beyond this point) while computation time increased by 47% at 300 trees. The 1% contamination parameter was empirically derived from manual inspection of 2000 listings, identifying 20 anomalous entries (1%) with characteristics including price outliers >3σ from zone mean, impossible area values (<10 m2 or >500 m2 for student housing), and location–price mismatches [34]. Validation on a stratified holdout set (180 normal, 20 anomalous) achieved precision = 0.92, recall = 0.88, and F1 = 0.90, with a 1.2 ms average inference latency, meeting real-time processing requirements [5].
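A minimal sketch of this streaming check, assuming scikit-learn’s IsolationForest and the three features described above (the listing field names are illustrative):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def flag_anomalies(listings):
    """Score a batch of listings on three engineered features:
    price per square metre, total area, and the ratio of listing
    price to the zone (neighborhood) average.

    Returns an array of labels: 1 = normal, -1 = flagged for manual review.
    """
    X = np.array([[l["price"] / l["area"],          # price per m^2
                   l["area"],                        # total area
                   l["price"] / l["zone_avg"]]       # price vs. zone average
                  for l in listings])
    model = IsolationForest(n_estimators=200,       # plateau found at 200 trees
                            contamination=0.01,     # expected 1% outlier rate
                            random_state=42)        # fixed seed for reproducibility
    return model.fit_predict(X)
```

Listings labeled −1 are routed to the administrator queue; the rest continue through the pipeline.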
In terms of storage, MongoDB is used with geocoded coordinates. Listings are filtered and preprocessed for use in a hybrid recommendation engine. The model combines reinforcement learning and keyword-based methods. The system ranks results using a dynamic scoring function shaped by both learned weights and deterministic features like price and distance [7]. A geospatial mapping component enhances decision-making by visualizing listings and rental trends through interactive maps and neighborhood heatmaps.
A React-based interface integrates these features into a unified search, recommendation, and analytics experience [8].
The core of the platform’s intelligence is the recommendation engine, which combines reinforcement learning with keyword filtering. The engine’s goal is to present the user with a ranked list of rental options that best match their needs and preferences, improving over time as it learns from interactions.
An action for the RL agent is essentially the recommendation of a particular listing (or a small set of listings) to display prominently to the user. At any given step (for example, each time the user asks for more suggestions or refreshes the recommendations), the agent must choose which rental property to suggest next from the pool of available listings. The action space is therefore the set of all listings (which can be large), making it a challenge for direct RL implementation. To manage this, we frame the recommendation as choosing a subset or ordering of listings according to certain criteria learned by the agent.
The recommendation engine developed in our application incorporates an advanced online learning mechanism based on contextual multi-armed bandits (MAB) rather than traditional Q-learning, following recent industry best practices [8]. Specifically, we implemented the MABWiser 2.7.2 (University of British Columbia, Vancouver, BC, Canada) framework, utilizing the Thompson Sampling policy, which dynamically learns and adjusts recommendations based on real-time user interactions. In our implementation, each rental listing acts as an “arm” in the bandit model, and user-specific contextual features such as demographics or browsing history form the contextual inputs. To achieve fine-grained personalization, we apply a K-nearest neighbor strategy (with k = 20) to group similar users, enabling more precise context-based predictions [35]. The bandit’s learning process leverages a sliding window of the 100 most recent interactions, placing higher emphasis (approximately three times greater weight) on newer feedback. Additionally, to rapidly detect and respond to shifts in user preference, we integrated a Page–Hinkley drift detector. When significant changes in user interaction patterns are identified, the bandit model initiates a warm start, recalibrating its internal parameters based on the most recent interactions (typically the last 20 events). This design ensures the recommendation engine can swiftly adapt to changing user tastes, minimizing latency in response time.
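The core policy can be illustrated with a minimal, non-contextual Beta-Bernoulli Thompson Sampling loop; this is a sketch of the idea only, since the production system delegates to MABWiser and layers k-NN contexts, the sliding window, and Page–Hinkley drift detection on top:

```python
import random

class ThompsonBandit:
    """Minimal Beta-Bernoulli Thompson Sampling over listing 'arms'."""

    def __init__(self, arms):
        # Beta(1, 1) priors, i.e., uniform belief about each arm's CTR.
        self.posterior = {arm: [1.0, 1.0] for arm in arms}

    def select(self):
        # Sample a plausible click-through rate per arm, recommend the best.
        draws = {a: random.betavariate(p[0], p[1])
                 for a, p in self.posterior.items()}
        return max(draws, key=draws.get)

    def update(self, arm, reward):
        # reward: 1 = user clicked/saved the listing, 0 = ignored it.
        if reward:
            self.posterior[arm][0] += 1
        else:
            self.posterior[arm][1] += 1
```

After a run of positive feedback on one listing, its posterior concentrates at a high click-through rate and it dominates future selections, which is exactly the exploit side of the exploration/exploitation trade-off the text relies on.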

2.2.1. Initial Recommendation Phase and Sliding Window Weight Formula

During the cold start phase (the first 10 interactions), our system employs a three-tier fallback strategy: (1) popularity-based initialization using the top 20 “golden listings” with engagement rates exceeding 0.15 clicks/views from the previous 30 days, (2) university-specific templates derived from 5000+ historical sessions with similar profiles, and (3) progressive personalization updating a 32-dimensional user embedding using online gradient descent (α = 0.1).
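The progressive-personalization step in tier (3) can be sketched as a single online gradient-descent update; the squared-distance loss pulling the user embedding toward an engaged listing’s embedding is an assumption, since the text specifies only “online gradient descent (α = 0.1)”:

```python
def sgd_update(user_vec, item_vec, alpha=0.1):
    """One online gradient-descent step on 0.5 * ||u - v||^2,
    nudging the (32-dimensional) user embedding u toward the
    embedding v of a listing the user engaged with.
    """
    # Gradient of the loss w.r.t. u is (u - v); step against it.
    return [u - alpha * (u - v) for u, v in zip(user_vec, item_vec)]
```

Repeated updates move the embedding smoothly toward the user’s revealed preferences without retraining any batch model.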
The temporal weighting implements exponential decay: weight (t) = exp (−0.693 × (t_current − t_interaction)/86,400), where 0.693 is the decay constant yielding a 3× weight ratio, and 86,400 normalizes to 24 h. This ensures interactions from the past 24 h receive approximately 3× the weight of 72 h-old interactions, validated through A/B testing showing a 23% CTR improvement.
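The decay formula above translates directly into code; note that 0.693 ≈ ln 2, so the weight halves every 24 h:

```python
import math

def interaction_weight(t_current, t_interaction, half_life_s=86_400):
    """Exponential decay weight from the text:
    exp(-0.693 * (t_current - t_interaction) / 86_400).
    Timestamps are in seconds; 86,400 s normalizes the age to days.
    """
    age = t_current - t_interaction
    return math.exp(-0.693 * age / half_life_s)
```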

2.2.2. Peak Load Stability and Dynamic Clustering Parameters

The system maintains sub-second response times during peak loads through (1) request coalescing—grouping identical requests within 100 ms windows, reducing redundant inference by 67%, (2) adaptive batching—dynamically adjusting batch_size based on queue depth (8 for <10 requests, 32 for 10–50, 64 for >50), and (3) a circuit breaker pattern with fallbacks: primary RL inference (<200 ms), hourly cached recommendations (<50 ms), and pre-computed university lists (<20 ms).
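The adaptive-batching rule in item (2) is a simple queue-depth schedule; the boundary handling at exactly 10 and 50 queued requests is an interpretation of the ranges given in the text:

```python
def adaptive_batch_size(queue_depth: int) -> int:
    """Batch-size schedule from the text: 8 for fewer than 10 queued
    requests, 32 for 10-50, and 64 beyond 50."""
    if queue_depth < 10:
        return 8
    if queue_depth <= 50:
        return 32
    return 64
```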
DBSCAN’s epsilon adapts as eps_adaptive = 0.0015 × sqrt(local_density/global_density) × seasonal_modifier, where the base epsilon of 0.0015 degrees corresponds to roughly 150 m, the density factor is clamped to [0.5, 2.0], and the seasonal modifier is 1.2 during peak months (September–October), achieving 94% cluster stability across seasons.
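The adaptive-epsilon formula can be written out directly (base epsilon in coordinate units, as in the text):

```python
import math

def adaptive_eps(local_density, global_density,
                 base_eps=0.0015, peak_season=False):
    """eps_adaptive = base * sqrt(local/global) * seasonal_modifier,
    with the density factor clamped to [0.5, 2.0] and a 1.2 modifier
    during peak months (September-October)."""
    factor = math.sqrt(local_density / global_density)
    factor = min(2.0, max(0.5, factor))               # clamp to [0.5, 2.0]
    return base_eps * factor * (1.2 if peak_season else 1.0)
```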
This approach enables the system to adapt to short-term preference changes with a response time of fewer than 10 interactions, significantly outperforming incremental Q-learning in A/B tests involving 5000 users [9]. The sliding window strategy ensures that recent interactions receive a three times higher weight than older ones, allowing rapid adaptation to evolving user preferences [36].
Therefore, the recommendation engine functions as follows: when a user initiates a search or requests recommendations, the system identifies the state (preferences context) and queries the database for listings that meet the basic criteria (budget, number of rooms, keywords, perhaps a broad radius around the campus). Then, from that set, the RL model (with its current learned policy) selects or ranks listings to show first.

3. System Architecture

The system architecture [Figure 2] is built on a multi-tier web application model. It consists of distinct components for data collection, backend processing, and frontend presentation, each communicating through well-defined interfaces. The architecture was designed to be modular, enabling independent upgrades to the data engine, the AI models, or the UI without affecting other components, which is a significant improvement over the tightly coupled original Python 3.11 prototype [6].
Figure 2 illustrates the comprehensive system architecture through a detailed block diagram that visualizes all components and their interactions. The diagram follows the C4 model notation for software architecture documentation [4], presenting the system at the container level.
The architecture implements the microservices pattern with each component operating independently and communicating through standardized REST APIs. Data flows unidirectionally from sources through the processing pipeline to storage, with the backend orchestrating requests between components. The RL engine operates as a separate Python microservice communicating with the Node.js backend through HTTP, following the pattern described by Richardson in “Microservices Patterns” [5]. This separation enables independent scaling of computationally intensive ML operations from standard web serving tasks [37].

3.1. Data Layer

At the bottom, we have the data layer centered on MongoDB Atlas. The database stores the rental data in the Rentals collection. MongoDB Atlas, being cloud-based, means the data is accessible from anywhere and can be scaled (sharded or replicated) as needed. The geospatial index of the Listings collection and text index are part of this layer, enabling complex queries to be executed directly by the database. The data layer also includes any static geospatial data, like the GeoJSON definitions of neighborhood boundaries used for the choropleth map, which can either be stored in the database or loaded as files on the server.
  • Database Characteristics:
  • Total database size: 2.3 GB across 47,583 documents
  • Average document size: 48.3 KB, including embedded arrays
  • Index composition:
    Compound geospatial index on location and price: 124 MB
    Text index on title and description fields: 89 MB
    Single-field indexes on scraped_at and rooms: 31 MB combined
  • Working set size: 1.8 GB (fits entirely in RAM)
Each document represents a single rental listing; thus, 47,583 documents equal 47,583 distinct accommodation places recorded since crawling began.
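The radius queries that the geospatial index accelerates can be illustrated with a plain haversine filter; in production MongoDB evaluates this server-side against the 2dsphere index, so the Python below is only a sketch of the computation (the coordinates in the usage example are illustrative):

```python
import math

EARTH_RADIUS_M = 6_371_000

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def within_radius(listings, center, radius_m):
    """Filter listings (dicts with 'lat'/'lon') to those within radius_m of center."""
    lat0, lon0 = center
    return [l for l in listings
            if haversine_m(lat0, lon0, l["lat"], l["lon"]) <= radius_m]
```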
  • Performance Benchmarking Methodology:
Load testing followed industry-standard practices using Apache JMeter 5.5 (Apache Software Foundation, Forest Hill, MD, USA), configured according to MongoDB’s performance testing guidelines [11]. Tests simulated realistic user patterns based on access logs, with 70% read operations (searches/recommendations) and 30% writes (new listings/updates).
  • Measured Performance Metrics:
Response-time percentiles were calculated for over 1 million requests during peak load simulation:
  • Geospatial queries (finding listings within radius):
    P50: 45 ms, P95: 89 ms, P99: 142 ms
  • Aggregation pipelines (statistics generation):
    P50: 123 ms, P95: 289 ms, P99: 456 ms
The reinforcement-learning inference operates as a separate Python service using FastAPI 0.110.0 (Berlin, Germany), deployed on Vercel’s serverless functions with a 3 GB memory allocation. Model inference achieves 34 ms latency for a batch size of 32, leveraging NumPy 1.26.4 (NumFOCUS, Austin, TX, USA) vectorization and scikit-learn 1.4.0’s optimized implementations. This performance is well within the 100 ms threshold recommended for interactive applications by Nielsen [12].
Vercel’s serverless hosting provides automatic concurrency scaling, which allows our FastAPI-based inference service to elastically adjust the number of running function instances in response to incoming request load. By default, each region can burst up to 1000 concurrent executions per 10 s window. Behind the scenes, Vercel’s global edge network routes each inference request to the nearest available region, minimizing network latency and cold-start penalties. During traffic spikes, Vercel preemptively spins up additional instances, then gracefully drains surplus capacity as load subsides, so that our 34 ms batch-32 latency remains stable under sudden demand surges. Finally, built-in observability via log drains and runtime logs gives real-time insight into invocation counts, execution durations, and concurrency events, enabling us to monitor, debug, and optimize our RL inference pipeline with minimal operational overhead [38].

3.2. Backend API Layer

The ExpressJS backend serves as the intermediary between the data and the frontend client. It exposes a set of RESTful API endpoints over HTTP(S). Key API endpoints include the following:
  • /search (or /listings), which accepts filters (location radius, price range, keywords) as query parameters and returns a JSON array of matching listings (potentially already sorted by some criterion).
  • /recommendations, which triggers the recommendation engine for a given user context, returning a curated list of listings.
  • /stats, which provides aggregated data (for charts), such as counts per neighborhood or price distribution.
Our recommendation endpoint extends a basic listing lookup with a transparent “explainability” layer that surfaces why each rental is a good fit. When a GET request hits “/api/rentals/recommendations”, we first retrieve raw recommendations via a getRecommendations function, passing along any filters or preferences from the query string (for example, the target university or a user’s budget). We then iterate over each recommendation and compute four distinct explainability scores: a proximity score based on the distance from the property to the specified university, a price–value score that evaluates how the listing’s rent compares to the user’s budget, a freshness score that rewards recently scraped listings (e.g., those posted within the last week), and a text relevance score that measures how closely the listing description matches a set of university-related keywords. In parallel, we assemble human-readable “explanation tags” (like “Near Campus” for high proximity scores or “Below Budget” when the rent is under 90% of the budget), using a tagging function informed by DQNViz principles [1]. DQNViz, originally designed to visualize Q-values and policy decisions in reinforcement learning, is here repurposed to map hidden decision weights into these intuitive labels so that, for instance, any recommendation with a proximity score above 0.8 gets tagged “Near Campus”, and any listing less than seven days old is flagged “Recently Listed”. Finally, the enriched recommendations, including both their original data and our new explanation factors and explanation tags, are returned as a JSON array, giving users not just a ranked list but a clear breakdown of why each option was selected.
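The per-listing scoring and tagging described above can be sketched in Python (the backend implements this logic in JavaScript; the weights follow the 35/30/20/15 split and the tag thresholds come from the text, while the 3 km proximity scale and 30-day freshness window are illustrative assumptions):

```python
def explain(listing, prefs):
    """Compute the four explainability scores and human-readable tags
    for one recommendation. `listing` needs distance_m, price, age_days,
    and keyword_similarity; `prefs` needs budget."""
    # Proximity: 1.0 at the university, fading to 0 at 3 km (assumed scale).
    proximity = max(0.0, 1 - listing["distance_m"] / 3_000)
    # Price-value: full score when rent is at or under the budget.
    price_value = min(1.0, prefs["budget"] / listing["price"])
    # Freshness: full score within the first week, fading to 0 at 30 days.
    freshness = max(0.0, 1 - max(0, listing["age_days"] - 7) / 23)
    # Text relevance to university-specific keywords (precomputed upstream).
    text = listing.get("keyword_similarity", 0.0)

    score = 0.35 * proximity + 0.30 * price_value \
        + 0.20 * freshness + 0.15 * text
    tags = []
    if proximity > 0.8:
        tags.append("Near Campus")
    if listing["price"] < 0.9 * prefs["budget"]:
        tags.append("Below Budget")
    if listing["age_days"] < 7:
        tags.append("Recently Listed")
    return {"score": round(score, 3), "tags": tags}
```

The returned dict is what gets merged into each recommendation before the JSON array is sent to the client.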
The backend implements the logic described in the Methodology. When a recommendation request comes in, the backend queries MongoDB for candidates and then invokes the RL model to rerank the results, as shown in Figure 3.
The RL model's logic is implemented in Python for ease of integration with existing ML libraries (such as PyTorch 2.7.1 or TensorFlow 2.19.0) and is exposed via a microservice or a Python script triggered server-side [10]. The Express server and the Python RL module communicate either through a direct binding, using a Node.js Python bridge library, or via an intermediate step in which recommendation scores are precomputed for listings and the Node server then uses them to sort results. The Express server also orchestrates calls to the data layer for creating the GeoJSON outputs used by the frontend map. For example, an endpoint such as /geojson/neighborhoods might deliver colored polygons with computed properties (like average rent) for each area. Because Node.js is proficient at I/O and MongoDB queries are asynchronous, the backend can handle multiple requests in parallel, which is necessary for real-time interactive use [11].

3.3. Frontend Layer

The React frontend (Figure 4) is structured with multiple components for map, list, and filters, and utilizes global state management to share filter criteria and results among components.
The frontend communicates with the backend API using asynchronous calls (AJAX/Fetch). For instance, when the component mounts (i.e., a user opens the page), a call to /recommendations might be made to load initial recommendations. When filters change, new calls are issued to update the data. The frontend is decoupled from how the backend generates the recommendations or data; it only receives JSON responses and updates the view accordingly [6].
As shown in Figure 5, the flow of the AI-based recommendation starts with a question about the desired university and ends with a list of suitable listings that match the user's criteria. In between, two more questions, about budget and rooms, are asked to ensure user satisfaction.
Both the frontend and the backend are hosted on Vercel, a cloud-based solution for application hosting and deployment. An additional component is the scraper service; it is not user-facing but runs periodically [9].
This architecture is not only robust for the current application but also extensible. New features, such as adding a user login system or integrating an additional data source, can be implemented by adding modules or endpoints without overhauling the entire system [10].

3.4. Detailed System Implementation

3.4.1. Data-Ingestion Pipeline Architecture

The data-ingestion system implements a fault-tolerant pipeline pattern based on Apache Kafka’s design principles, albeit with a simplified Node.js implementation [14]. The pipeline consists of six sequential stages, each operating independently with retry logic and dead-letter queue handling.
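The per-stage retry and dead-letter behavior can be sketched as below. The actual pipeline is a Node.js implementation; this Python sketch only illustrates the pattern, and the stage function names are hypothetical.

```python
def run_stage(items, stage_fn, max_attempts=3):
    """Run one pipeline stage over all items; each failed item is retried
    up to max_attempts times before landing in a dead-letter queue
    for manual review."""
    succeeded, dead_letter = [], []
    for item in items:
        for attempt in range(1, max_attempts + 1):
            try:
                succeeded.append(stage_fn(item))
                break
            except Exception as exc:
                if attempt == max_attempts:
                    dead_letter.append({"item": item, "error": str(exc)})
    return succeeded, dead_letter

# Demo: a (hypothetical) parse stage that fails once per item, then succeeds.
_attempts = {}
def flaky_parse(item):
    _attempts[item] = _attempts.get(item, 0) + 1
    if _attempts[item] < 2:
        raise ValueError("transient failure")
    return item.upper()

parsed, dlq = run_stage(["a", "b"], flaky_parse)
```

Chaining six such stages, each with its own dead-letter queue, yields the fault-tolerant pipeline described above.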
The API fetcher module implements exponential backoff with jitter for rate limiting, following the pattern described in AWS Architecture Best Practices [6]. It maintains a sliding window of 100 requests per minute to respect Storia API limits while maximizing throughput. Failed requests enter a retry queue with a maximum of three attempts before moving to manual review.
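A minimal sketch of the two mechanisms named here, full-jitter exponential backoff and a 100-requests-per-minute sliding window, is shown below; the base delay and cap values are assumptions, not parameters taken from the system.

```python
import random
import collections

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Full-jitter exponential backoff: a random delay drawn from
    [0, min(cap, base * 2**attempt)] seconds (AWS-recommended pattern)."""
    return random.uniform(0, min(cap, base * 2 ** attempt))

class SlidingWindowLimiter:
    """Allow at most `limit` requests in any `window`-second span."""
    def __init__(self, limit=100, window=60.0):
        self.limit, self.window = limit, window
        self.stamps = collections.deque()

    def allow(self, now):
        # Drop timestamps that have aged out of the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            self.stamps.append(now)
            return True
        return False
```

The fetcher would sleep for `backoff_delay(attempt)` between retries and consult the limiter before each outgoing request.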
The data validator employs JSON schema validation (draft-07) to ensure data consistency. Schema definitions specify required fields (price, location, rooms) and validate data types, with 97% of listings passing validation on the first attempt. Invalid listings undergo a transformation phase, attempting to extract required data through regular expressions and fuzzy matching [39].
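The production validator uses JSON Schema (draft-07); the sketch below hand-rolls an equivalent required-field/type check plus a regex-based price recovery step of the kind used in the transformation phase. The field names follow the text; the regex pattern is an assumption.

```python
import re

# Required fields and expected types, per the schema described in the text.
REQUIRED = {"price": (int, float), "location": str, "rooms": int}

def validate_listing(listing):
    """Return True if all required fields exist with the expected types."""
    return all(isinstance(listing.get(k), t) for k, t in REQUIRED.items())

def recover_price(raw_text):
    """Fallback extraction: pull the first number that looks like a rent
    (e.g. '350 EUR/month') out of free text."""
    m = re.search(r"(\d{2,5})\s*(?:EUR|€|euro)", raw_text, re.IGNORECASE)
    return float(m.group(1)) if m else None
```

A listing failing `validate_listing` would enter the transformation phase, where helpers like `recover_price` attempt to fill the missing fields before re-validation.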
The isolation forest anomaly detector processes numerical features including price per square meter, total area, and price-to-average ratios. The implementation uses scikit-learn’s isolation forest with 200 estimators and a 0.01 contamination rate, trained on a rolling window of 10,000 recent listings. This approach identifies outliers with 97% precision based on manual verification of 500 flagged listings.
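With the parameters stated above (200 estimators, 1% contamination), the detector can be reproduced in a few lines of scikit-learn. The synthetic feature values below are illustrative stand-ins for price per square meter, total area, and price-to-average ratio, not real listing data.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Synthetic stand-ins for: price per m^2, total area (m^2), price-to-zone-average ratio.
normal = rng.normal(loc=[10.0, 55.0, 1.0], scale=[2.0, 15.0, 0.1], size=(1000, 3))
outliers = np.array([[60.0, 20.0, 5.0],    # tiny flat at an extreme price
                     [1.0, 300.0, 0.1]])   # implausibly cheap, huge area
X = np.vstack([normal, outliers])

# Parameters follow the text: 200 estimators, 0.01 contamination rate.
detector = IsolationForest(n_estimators=200, contamination=0.01, random_state=0)
labels = detector.fit_predict(X)  # -1 = anomaly, 1 = inlier
```

In the deployed system, the model would be refit on the rolling window of the 10,000 most recent listings rather than a static sample.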

3.4.2. Reinforcement-Learning Implementation Details

The recommendation engine implements a Deep Q-Network architecture adapted from OpenAI’s Spinning Up implementations [15]. The neural network processes 47 input features through three hidden layers (256, 128, and 64 neurons) with ReLU activation and dropout regularization (with rates of 0.3 and 0.2, respectively).
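The layer dimensions and dropout rates above translate directly into a PyTorch module, as sketched below. The output dimension (`n_actions`, here the number of candidate listings being scored) is an assumption, since the text does not state it.

```python
import torch
import torch.nn as nn

class DQN(nn.Module):
    """Q-network matching the dimensions in the text: 47 input features,
    hidden layers of 256/128/64 units with ReLU, dropout 0.3 and 0.2."""
    def __init__(self, n_features=47, n_actions=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 256), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(256, 128), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)  # one Q-value per candidate action
```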
Feature engineering transforms raw listing data into numerical representations, thus:
  • Spatial features: haversine distance to each university, zone one-hot encoding.
  • Temporal features: days since listing, update frequency, day-of-week encoding.
  • Textual features: TF-IDF vectors from descriptions, title–keyword similarity scores.
  • Interaction features: cross-products of price–distance, area–zone relationships.
The training process uses experience replay with a buffer of 10,000 transitions, sampling mini-batches of 32 for gradient updates. The target network updates every 1000 steps to ensure stable learning. Epsilon-greedy exploration starts at 1.0 and decays to 0.1 over 100,000 steps following an exponential schedule.
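The replay buffer and the epsilon schedule described above can be sketched as follows. The exact decay-rate constant of the exponential schedule is not given in the text, so the value used here (−5 over the decay horizon) is an assumption.

```python
import math
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity experience replay (10,000 transitions in the text)."""
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)  # old transitions fall off the left

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size=32):
        return random.sample(self.buffer, batch_size)

def epsilon(step, start=1.0, end=0.1, decay_steps=100_000):
    """Exponential decay from 1.0 toward 0.1 over 100,000 steps;
    the -5.0 rate constant is an illustrative choice."""
    if step >= decay_steps:
        return end
    return end + (start - end) * math.exp(-5.0 * step / decay_steps)
```

Every 1000 gradient steps, the target network's weights would be copied from the online network to stabilize the bootstrapped targets.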

3.4.3. Geospatial Processing Implementation

Dynamic zone generation employs DBSCAN clustering on geographic coordinates using the haversine metric for accurate Earth-surface distances. The algorithm’s epsilon parameter (neighborhood radius) adapts based on listing density, which is calculated as the 5th percentile of nearest neighbor distances. This adaptive approach ensures consistent cluster quality across varying urban densities.
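A compact scikit-learn sketch of this adaptive clustering is given below: coordinates are converted to radians for the haversine metric, epsilon is set to the 5th percentile of nearest-neighbor distances, and a small floor guards against duplicate coordinates. The `min_samples` value is an assumption.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

def adaptive_dbscan(lat_lon_deg, min_samples=5):
    """Cluster listing coordinates with DBSCAN on the haversine metric,
    setting eps to the 5th percentile of nearest-neighbor distances."""
    coords = np.radians(lat_lon_deg)  # haversine metric expects radians
    nn = NearestNeighbors(n_neighbors=2, metric="haversine").fit(coords)
    dists, _ = nn.kneighbors(coords)
    eps = np.percentile(dists[:, 1], 5)  # column 0 is the point itself
    eps = max(eps, 1e-6)  # floor (in radians) guards against duplicates
    return DBSCAN(eps=eps, min_samples=min_samples,
                  metric="haversine").fit_predict(coords)
```

Because eps tracks the local nearest-neighbor scale, dense central districts and sparser peripheral ones are clustered with comparable quality.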
Integration with OSRM (open-source routing machine) provides realistic travel time estimates. The system maintains a local OSRM instance processing OpenStreetMap data for Timișoara, updated monthly. Route calculations use the CH (Contraction Hierarchies) algorithm for sub-millisecond query times on the city-scale road network [16]. Walking time estimates include a 1.2× multiplier for real-world factors like traffic signals and path indirectness, validated against Google Maps walking directions with 89% accuracy.
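The request construction and the 1.2× correction can be sketched against the public OSRM HTTP API as follows. The local host address is an assumption, and no network call is made here; `walking_minutes` simply parses a standard OSRM route response.

```python
WALK_MULTIPLIER = 1.2  # compensates for signals and path indirectness

def osrm_route_url(origin, dest, host="http://localhost:5000"):
    """Build an OSRM v1 route request for the foot profile.
    Inputs are (lat, lon); OSRM expects lon,lat order in the URL."""
    (olat, olon), (dlat, dlon) = origin, dest
    return f"{host}/route/v1/foot/{olon},{olat};{dlon},{dlat}?overview=false"

def walking_minutes(osrm_response):
    """Extract the first route's duration (seconds) and apply the 1.2x factor."""
    if osrm_response.get("code") != "Ok" or not osrm_response.get("routes"):
        return None
    seconds = osrm_response["routes"][0]["duration"]
    return round(seconds * WALK_MULTIPLIER / 60.0, 1)
```

For example, a raw OSRM duration of 600 s becomes a displayed estimate of 12 min after the multiplier.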
The 89% consistency rate was established through a comprehensive validation study conducted over 4 weeks in February 2025 using 480 origin–destination pairs sampled from our listing dataset. The validation framework employed stratified sampling across spatial and temporal dimensions, with 40 paths from each of 12 zones to the nearest university campus, measured at three key times daily (9 am, 3 pm, and 6 pm) to capture traffic signal variations. The distance distribution included 160 short paths under 1 km, 240 medium-length paths between 1 and 3 km, and 80 long paths exceeding 3 km. Each path comparison followed a standardized protocol: first calculating OSRM routes using the foot profile with OSM data, then querying the Google Maps Directions API with walking mode and specified departure times, and finally validating a subset of 48 paths through GPS tracking with five volunteer students.
The error distribution analysis, defining consistency as OSRM estimates within ±15% of Google Maps times, revealed that 61.2% of estimates fell within ±5% (an accurate match), 19.4% showed a ±5–10% deviation due to traffic signal timing differences, 8.4% had a ±10–15% variance from unmapped pedestrian shortcuts, 7.1% showed ±15–20% differences due to construction or temporary obstacles, and only 3.9% exceeded a 20% error due to outdated OSM data. Statistical validation yielded a mean absolute error of 1.73 min, a root mean square error of 2.41 min, a Pearson correlation of r = 0.94 (p < 0.001), and Bland–Altman 95% limits of agreement between −4.8 and +5.2 min. The 1.2× multiplier applied to raw OSRM calculations compensates for systematic underestimation in congested areas, improving central zone accuracy from 86% to 91% where most student housing is concentrated.

3.4.4. Sparse Region Handling and Computational Analysis

The system implements hierarchical density-based clustering to handle sparse peripheral regions where the listing density is 10× lower than central areas:
Three-Phase Sparse Region Strategy:
  • Initial DBSCAN with adaptive eps (5th percentile nearest neighbor distance).
  • Sparse region detection (<5 listings/km2) using sliding window analysis.
  • Hierarchical merging of adjacent sparse clusters within a 500 m radius.
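The third phase, merging adjacent sparse clusters within a 500 m radius, can be sketched with a union-find over cluster centroids; the function names here are illustrative, not taken from the codebase.

```python
import math

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371000.0 * math.asin(math.sqrt(h))

def merge_sparse_clusters(centroids, radius_m=500.0):
    """Union adjacent sparse clusters whose centroids lie within radius_m.
    Returns a merged-group id for each input centroid."""
    n = len(centroids)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if haversine_m(centroids[i], centroids[j]) <= radius_m:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]
```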
Following advanced mechanisms, as described in previous studies [12], we have implemented the following:
  • Spatial partitioning: a 4-partition grid for our city-scale dataset.
  • Optimized processing: single-thread.
  • R-tree indexing: O(log n) point-in-polygon checks vs. O(n) naïve scans.
  • Boundary caching: recomputation only when density changes > 15%.
Performance metrics: 45 ms total (22.5 μs/point), a 176-byte/listing memory footprint, maintaining a 96% assignment rate in sparse zones versus 71% with fixed boundaries.

4. Discussion

The results cover the performance of the data pipeline, the effectiveness of the recommendation engine, and the usability of the visualization features.
Data Collection and Coverage: To ensure reliable and efficient data collection, we used the publicly accessible Storia API (OLX Group B.V., Amsterdam, The Netherlands) as our primary source for rental listings. The API allows programmatic querying of listings via HTTPS GET requests with customizable parameters such as offer_type=rent, estate_type=apartment, location=timisoara, and pagination controls like __pagination[offset] and __pagination[limit]. For example, a typical request query would follow this structure:
https://www.storia.ro/api/offers?offer_type=rent&estate_type=apartment&city=timisoara&__pagination[offset]=0&__pagination[limit]=50
This returns a JSON response containing metadata and content fields such as price, rooms, location, description, features, and coordinates, which are parsed and stored directly in our MongoDB collection. The use of the API ensures structured, complete, and timely data extraction without the overhead of rendering dynamic frontend content. To maintain data freshness, our system periodically paginates through the available listings and compares last_refresh_date fields to detect new or updated entries.
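The pagination and freshness logic can be sketched as below using the query parameters shown in the example URL. No request is issued here; the listing `id` field and the ISO-8601 format of `last_refresh_date` are assumptions about the payload.

```python
BASE = ("https://www.storia.ro/api/offers?offer_type=rent&estate_type=apartment"
        "&city=timisoara")

def page_url(offset, limit=50):
    """Build one paginated request using the parameters from the example URL."""
    return f"{BASE}&__pagination[offset]={offset}&__pagination[limit]={limit}"

def needs_update(listing, known):
    """True when a listing is new, or its last_refresh_date has advanced
    past the value we stored (keyed by listing id)."""
    prev = known.get(listing["id"])
    return prev is None or listing["last_refresh_date"] > prev
```

A crawl loop would walk `page_url(0)`, `page_url(50)`, … until an empty page, upserting only the listings for which `needs_update` is true.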
The Playwright-based scraper was able to gather a substantial dataset of rental listings. For a case study in Timișoara (a university city), the scraper collected approximately 1200 active rental listings over a two-month period. These listings spanned 20 distinct neighborhoods/zones of the city, with rent prices ranging from approximately EUR 150 to EUR 600 per month and properties from studios to three-bedroom apartments. The continuous crawling ensured that in any given week, the database reflected the current market; on average, about 50 new listings were added weekly, and a smaller number were removed as they were rented out. This up-to-date dataset forms a solid foundation for generating relevant recommendations [5].
For the performance and user-study runs, the engine processed the current active set of 1238 listings (≈2.6% of the full 47,583-listing history), mirroring what a student actually sees at search time.
Recommendation Engine Performance: To rigorously evaluate system effectiveness, we conducted a comprehensive comparative study [6]. The study involved 10 volunteers, all graduates of master's programs, with a mean age of 26 years. All participants are employed, about 70% of them in the IT sector, providing valuable technical insight into system usability.
Evaluation Framework: Given the specific profile of our participants, we designed a within-subjects study where each volunteer completed identical housing search tasks on four different platforms over a two-month evaluation period.
Platforms Evaluated: Our system: reinforcement learning with explainable recommendations. Imobiliare.ro: Romania’s largest property portal using traditional faceted search. Facebook Marketplace (Meta Platforms Inc., Menlo Park, CA, USA): a social commerce platform, increasingly popular among young professionals. OLX Romania: a classified ads platform with basic search functionality [40].
The evaluation protocol required each participant to search for rental accommodation suitable for a hypothetical new master’s student at their alma mater. This approach leveraged their recent experience while standardizing the search task. The platform order was randomized using a Latin square design to eliminate learning effects [7].
Measurement Instruments: We employed both quantitative metrics and qualitative feedback as follows:
Time to Satisfactory Result: Measured from search initiation to identifying a listing they would contact.
Search Efficiency: Number of listings viewed and filters applied before finding suitable options.
Recommendation Quality: Participants rated relevance on a 10-point Likert scale.
Trust Assessment: Using the empirically validated Automation Trust Scale adapted for recommender systems [9].
Results Analysis: Despite the small sample size (n = 10), the technical expertise of the participants enabled detailed feedback. Statistical analysis using Friedman's test (appropriate for small samples) revealed significant differences across platforms (χ2 (3) = 22.31, p < 0.001). Post hoc Wilcoxon signed-rank tests with Bonferroni correction confirmed our system's advantages (Table 1 and Table 2).
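The analysis procedure (Friedman omnibus test, then Bonferroni-corrected Wilcoxon pairwise tests) can be reproduced in SciPy as sketched below. The rating vectors are illustrative placeholders, not the study data.

```python
import numpy as np
from scipy import stats

# Hypothetical 10-point relevance ratings for n = 10 participants on four
# platforms (illustrative numbers only; the study data are in Tables 1 and 2).
ours = np.array([9, 8, 9, 7, 8, 9, 8, 9, 7, 8])
imobiliare = np.array([6, 5, 7, 6, 5, 6, 7, 5, 6, 6])
facebook = np.array([5, 4, 6, 5, 4, 5, 5, 4, 6, 5])
olx = np.array([5, 5, 4, 6, 5, 4, 5, 6, 4, 5])

# Omnibus test across the four related samples.
stat, p = stats.friedmanchisquare(ours, imobiliare, facebook, olx)

# Post hoc pairwise comparison with Bonferroni correction (6 pairs).
w_stat, w_p = stats.wilcoxon(ours, imobiliare)
w_p_corrected = min(1.0, w_p * 6)
```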
The IT professionals particularly appreciated the technical features, with one participant noting: “The explanation of why each listing was recommended made me trust the system more than blind algorithmic feeds”. Non-IT participants valued the simplified interface, reporting a 40% lower cognitive load compared to traditional search interfaces.
Qualitative analysis through structured interviews revealed that all 10 participants would recommend the system to incoming students. The mean System Usability Scale score of 84.5 (SD = 7.2) places our system in the “excellent” category according to Bangor’s interpretation scale [10]. Participants specifically highlighted that seeing recommendation reasoning (proximity scores and price analysis) helped them make faster decisions, supporting findings from explainable AI research [11].
System Response and Queries: To validate the system’s scalability, we conducted extensive load testing using Apache JMeter, simulating realistic scenarios such as the “Peak Student Search Period”, characterized by intense user interactions with our API endpoints. Specifically, the tests simulated users progressively ramping up to high concurrency (500 simultaneous simulated users ramped over 60 s and maintained for 300 s), distributing requests among three critical endpoints: recommendations (/api/rentals/recommendations) at 40%, statistics (/api/rentals/stats/{rooms}) at 30%, and map visualizations (/api/rentals/map) at 30%.
Performance results revealed exceptional responsiveness and stability across varying user loads:
  • 100 concurrent users: achieved a 98th percentile latency of 145 ms with a 0% error rate.
  • 500 concurrent users: latency slightly increased to 312 ms at the 98th percentile, with a minimal error rate of 0.1%.
  • 1000 concurrent users: even under extreme stress, the platform maintained a 98th percentile latency of 890 ms and limited the error rate to just 0.8%.
Geospatial query performance in MongoDB remained consistently fast, maintaining sub-100 ms response times even under a heavy load of 1000 concurrent requests. This efficiency is primarily due to optimized compound indexing on fields {mapped_zone: 1, price.amount: 1, scraped_at: −1}, significantly accelerating queries involving spatial filtering and pricing metrics.
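The index specification and a typical query that exploits it can be expressed as pymongo-style documents, as sketched below; no database connection is made, and the zone name and budget value are illustrative.

```python
# Compound index from the text; pymongo would create it via
# collection.create_index(index_spec).
index_spec = [("mapped_zone", 1), ("price.amount", 1), ("scraped_at", -1)]

def zone_price_query(zone, max_price):
    """Build a filter matching the index prefix (zone equality plus a price
    range); sorting newest-first rides the scraped_at index component."""
    filt = {"mapped_zone": zone, "price.amount": {"$lte": max_price}}
    sort = [("scraped_at", -1)]
    return filt, sort
```

Because the filter's equality and range fields follow the index's key order, MongoDB can satisfy both the selection and the sort from the index alone.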
To further enhance scalability and efficiency in reinforcement-learning (RL) inference, we developed a highly optimized strategy involving batch processing, caching, and parallel execution. Requests are grouped into batches of 32, significantly reducing redundant computations. Each inference request first checks an LRU cache (capped at 10,000 entries with a 5 min TTL) to quickly return cached predictions. For uncached requests, the system leverages parallel execution with up to four workers, thereby ensuring rapid inference. These combined strategies resulted in an impressive 41.66% reduction in mean absolute error and a dramatic 96.63% decrease in computational cost, attributed to our specialized Fast Spatial–Temporal Information Compression algorithm [13].
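The bounded LRU cache with a 5 min TTL can be sketched with an `OrderedDict`, as below; the injectable clock is an illustration device for testability, not a detail of the deployed system.

```python
import time
from collections import OrderedDict

class TTLLRUCache:
    """LRU cache with per-entry time-to-live, mirroring the 10,000-entry,
    5-minute prediction cache described in the text."""
    def __init__(self, maxsize=10_000, ttl=300.0, clock=time.monotonic):
        self.maxsize, self.ttl, self.clock = maxsize, ttl, clock
        self.data = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self.data[key]  # lazily evict expired entries
            return None
        self.data.move_to_end(key)  # mark as recently used
        return value

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = (self.clock() + self.ttl, value)
        while len(self.data) > self.maxsize:
            self.data.popitem(last=False)  # evict least recently used
```

An inference request would call `get` first and only fall through to the batched RL model on a miss, then `put` the fresh prediction.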
These optimizations enable the platform to robustly handle extreme traffic spikes, such as those observed during university admission periods, when user loads can surge to 10× normal levels, maintaining responsive and seamless user experiences.
Using MongoDB for search and aggregation proved efficient. Keyword searches typically took under 200 ms on the dataset, and geospatial queries (e.g., “within 2 km of campus”) under 100 ms, as measured on the Atlas cluster with appropriate indexes. Rendering the map was smooth in the browser, aided by the WebGL optimizations of Mapbox GL JS 2.16.1 (Mapbox Inc., Washington, DC, USA). The aggregation for neighborhood statistics (average price per zone) was computed in under 50 ms on the database side, and the results were cached for the session, making the heatmap toggle instantaneous for the user. The entire round-trip for a recommendation refresh (from the user clicking refresh to new recommendations and updated visuals showing) was around 1–2 s, which is acceptable for an interactive web application. This is a notable improvement over the Streamlit prototype, which often took several seconds to recompute recommendations or refresh visualizations due to its single-threaded, server-side nature.
Visualization Insights: The geospatial and chart-based visualizations provided tangible insights in our case study. The neighborhood heatmap clearly showed the gradient of rents: central areas around the main university campus had higher median rents, whereas peripheral residential areas appeared more faded, indicating lower median rents. This aligned with known trends, but seeing it on the map allowed students to quickly target the affordable zones. The scatter plot of rents also helped identify outliers, which can be valuable; a student open to shared housing might spot a cheap central listing that they would otherwise overlook among more expensive ones.
System Integration Challenges: Building an end-to-end system that combines web development with AI and data engineering posed challenges. One challenge was ensuring the RL model’s recommendations remained transparent and explainable to users. Students might want to know why a certain listing is recommended (e.g., is it because it is very close to campus, or a good price, or matches a pattern of what they liked?). To address this, we included some explainability features: the UI can highlight a few key reasons (like tags “Close to campus” or “Below budget”) on the recommended listings, derived from the features that influenced the ranking. This kind of transparency is important in gaining user trust in AI recommendations.
The results overall demonstrate that the integration of reinforcement learning and geospatial analytics within a modern web framework can significantly enhance a student rental search platform. Students can find suitable housing more efficiently by receiving intelligent recommendations and by exploring data-driven visuals that make the rental market more transparent.

5. Conclusions

This work presents a significantly enhanced version of a student-housing recommender system, evolving it from a local Python prototype into a scalable, intelligent system tailored to smart city environments. The platform introduces a modern client–server architecture using React and ExpressJS, enabling a more interactive, multi-user experience with real-time capabilities. Replacing static files with a cloud-hosted MongoDB database allows for efficient full-text and geospatial queries, forming the basis for responsive keyword filtering and spatial analytics. These architectural changes are not just technical upgrades but foundational shifts that improve system performance, scalability, and functionality.
At the heart of the platform lies a hybrid recommendation engine that combines reinforcement learning and content-based filtering. This enables personalized, adaptive recommendations based on user preferences and interaction history, moving the tool from static filtering to dynamic learning.
Live data ingestion is achieved through a Playwright-based scraper that continuously updates listings from the Storia platform. This transition from static to real-time data ensures that the platform reflects the current housing market, improving both recommendation relevance and user trust. The platform’s integration of GeoJSON-based mapping and interactive visual analytics allows students to explore trends such as neighborhood-level rent heatmaps and spatial price distribution tools that make decision-making more informed and intuitive.
For future work, several avenues remain open. One of these is the integration of more data sources, such as other listing websites or official housing databases, to enrich the platform’s comprehensiveness [11]. Moreover, incorporating user accounts and collaborative filtering could allow the system to learn from the broader community of users (e.g., trends in what similar students chose) to complement the reinforcement-learning approach [12]. On the geospatial front, adding layers like public transport lines or real-time transit data could give users an idea of commute convenience for each listing. Another future direction is to refine the RL model using deeper neural network architectures (moving further into deep reinforcement learning) and possibly incorporating multi-step planning, which could consider a sequence of user interactions and preference changes.
Smart City and IoT Integration: While our platform primarily deals with web data, it sets the stage for integration with broader smart city infrastructure. For example, further development could incorporate real-time data from urban sensors or IoT devices to augment housing decisions. In terms of IoT, smart apartments outfitted with sensors (for temperature, energy usage, and/or security) can broadcast metrics that could become part of a rental listing’s profile in a smart city context (e.g., an energy-efficient home or the presence of smart security systems). Although these are beyond our current scope, the modular architecture is well-suited to add such data streams, aligning with the vision of a sensor-rich smart city environment where data-driven services help residents.
Implications for Smart Cities: Our focus on student housing is one part of the smart city puzzle. However, the approaches used here could translate to other urban decision support scenarios. For example, a similar framework could help residents find parking spots (with IoT sensors providing real-time availability and RL recommending optimal choices based on schedule and walking distance) or help businesses identify optimal store locations using city demographic and foot-traffic data. The key is the fusion of real-time data, user-centric AI, and intuitive visualization—a pattern broadly applicable in smart cities. As municipalities continue to open up data and leverage AI, we expect to see more services that proactively assist citizens in everyday decisions like housing, using techniques akin to those we have demonstrated.
In conclusion, the platform serves as a concrete example of how smart city services can benefit from integrating AI and open data to solve practical urban challenges. Beyond student housing, the architecture and techniques demonstrated here can be adapted for other urban decision-making tools such as parking, commuting, or local service discovery, where real-time data and personalized recommendations are critical. Future work will explore expanding data sources, refining the RL model with deeper neural architectures, integrating mobility and IoT data, and conducting large-scale user studies. Ultimately, this system lays the foundation for intelligent, data-driven urban living tools, aligning with the broader goals of smart cities and next-generation digital services.

Author Contributions

Conceptualization, A.T. and R.-A.V.; Methodology, A.T. and R.-A.V.; Software, A.-S.N.; Formal analysis, R.-A.V.; Investigation, A.-S.N.; Data curation, A.-S.N.; Writing—original draft, A.-S.N.; Writing—review & editing, A.T.; Visualization, A.-S.N.; Supervision, A.T. and R.-A.V.; Project administration, A.T. and R.-A.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Acknowledgments

The authors acknowledge that the data used within the developed application was obtained from publicly available resources provided by the Storia.ro API. During the preparation of this manuscript, the authors used OpenAI tools (ChatGPT o4-mini) for the purpose of generating graphical icons within the application. The authors have reviewed and edited the generated content and take full responsibility for the final outputs included in this publication.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; nor in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
API: Application Programming Interface
AR: Augmented Reality
AI: Artificial Intelligence
RL: Reinforcement Learning
UI: User Interface
JSON: JavaScript Object Notation
REST: Representational State Transfer (as in RESTful API)
GET: Hypertext Transfer Protocol GET request (used in the context of API calls)
NoSQL: Non-relational Database (Not Only SQL)
IoT: Internet of Things
URL: Uniform Resource Locator

References

  1. Wang, J.; Gou, L.; Shen, H.-W.; Yang, H. DQNViz: A Visual Analytics Approach to Understand Deep Q-Networks. IEEE Trans. Vis. Comput. Graph. 2019, 25, 288–298. [Google Scholar] [CrossRef]
  2. Baldominos, A.; Blanco, I.; Moreno, A.J.; Iturrarte, R.; Bernárdez, Ó.; Afonso, C. Identifying Real Estate Opportunities Using Machine Learning. Appl. Sci. 2018, 8, 2321. [Google Scholar] [CrossRef]
  3. Milani, S.; Topin, N.; Veloso, M.; Fang, F. Explainable Reinforcement Learning: A Survey and Comparative Review. ACM Comput. Surv. 2023, 56, 1–36. [Google Scholar] [CrossRef]
  4. Liu, F.T.; Ting, K.M.; Zhou, Z.-H. Isolation Forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008. [Google Scholar] [CrossRef]
  5. Bauder, R.A.; Khoshgoftaar, T.M. Medicare Fraud Detection Using Machine Learning Methods. In Proceedings of the 16th IEEE International Conference on Machine Learning and Applications (ICMLA), Cancún, Mexico, 18–21 December 2017; pp. 858–865. [Google Scholar] [CrossRef]
  6. Tu, X.; Fu, C.; Huang, A.; Chen, H.; Ding, X. DBSCAN Spatial Clustering Analysis of Urban “Production–Living–Ecological” Space Based on POI Data: A Case Study of Central Urban Wuhan, China. Int. J. Environ. Res. Public Health 2022, 19, 5153. [Google Scholar] [CrossRef]
  7. Mora-Garcia, R.-T.; Cespedes-Lopez, M.-F.; Perez-Sanchez, V.R. Housing Price Prediction Using Machine Learning Algorithms in COVID-19 Times. Land 2022, 11, 2100. [Google Scholar] [CrossRef]
  8. Dong, X.L.; Naumann, F. Data Fusion: Resolving Data Conflicts for Integration. Proc. VLDB Endow. 2009, 2, 1654–1655. [Google Scholar] [CrossRef]
  9. Choy, L.H.T.; Ho, W.K.O. The Use of Machine Learning in Real Estate Research. Land 2023, 12, 740. [Google Scholar] [CrossRef]
  10. Strong, E.; Kleynhans, B.; Kadıoğlu, S. MABWiser: Contextual Multi-Armed Bandits for Production Recommendation Systems. Int. J. Artif. Intell. Tools 2024, 33, 2450001. [Google Scholar] [CrossRef]
  11. Kmen, C.; Navratil, G.; Giannopoulos, I. Location, Location, Location: The Power of Neighborhoods for Apartment Price Predictions Based on Transaction Data. ISPRS Int. J. Geo-Inf. 2024, 13, 425. [Google Scholar] [CrossRef]
  12. Lv, Z.; Li, J.; Xu, Z.; Wang, Y.; Li, H. Parallel Computing of Spatio-Temporal Model Based on Deep Reinforcement Learning. Lect. Notes Comput. Sci. 2021, 12937, 391–403. Available online: https://www.researchgate.net/publication/354454406_Parallel_Computing_of_Spatio-Temporal_Model_Based_on_Deep_Reinforcement_Learning (accessed on 1 June 2025).
  13. Fang, C.; Zhou, L.; Gu, X.; Liu, X.; Werner, M. A Data-driven Approach to Urban Area Delineation Using Multi-source Geospatial Data. Sci. Rep. 2025, 15, 8708. [Google Scholar] [CrossRef]
  14. Luxen, D.; Vetter, C. Real-Time Routing with OpenStreetMap Data. In Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Chicago, IL, USA, 1–4 November 2011; pp. 513–516. [Google Scholar] [CrossRef]
  15. Bansal, M.; Dar, M.A.; Bhat, M.M. Data Ingestion and Processing Using Playwright. TechRxiv 2023. [Google Scholar] [CrossRef]
  16. Pu, P.; Li, C.; Rong, H. A User-Centric Evaluation Framework for Recommender Systems. In Proceedings of the 5th ACM Conference on Recommender Systems, Chicago, IL, USA, 23–27 October 2011; pp. 157–164. [Google Scholar] [CrossRef]
  17. Bradley, J.V. Complete Counterbalancing of Immediate Sequential Effects in a Latin Square Design. J. Am. Stat. Assoc. 1958, 53, 525–528. [Google Scholar] [CrossRef]
  18. Hart, S.G. NASA-Task Load Index (NASA-TLX); 20 Years Later. Proc. Hum. Factors Ergon. Soc. Annu. Meet. 2006, 50, 904–908. [Google Scholar] [CrossRef]
  19. Jian, J.-Y.; Bisantz, A.M.; Drury, C.G. Foundations for an Empirically Determined Scale of Trust in Automated Systems. Int. J. Cogn. Ergon. 2000, 4, 53–71. [Google Scholar] [CrossRef]
  20. Xu, Z.; Lv, Z.; Chu, B.; Li, J. A Fast Spatial-Temporal Information Compression Algorithm for Online Real-Time Forecasting of Traffic Flow with Complex Nonlinear Patterns. Chaos Solitons Fractals 2024, 182, 114852. [Google Scholar] [CrossRef]
  21. Miljkovic, I.; Shlyakhetko, O.; Fedushko, S. Real Estate App Development Based on AI/VR Technologies. Electronics 2023, 12, 707. [Google Scholar] [CrossRef]
  22. Söderberg, I.-L.; Wester, M.; Jonsson, A.Z. Exploring Factors Promoting Recycling Behavior in Student Housing. Sustainability 2022, 14, 4264. [Google Scholar] [CrossRef]
  23. Gharahighehi, A.; Pliakos, K.; Vens, C. Recommender Systems in the Real Estate Market—A Survey. Appl. Sci. 2021, 11, 7502. [Google Scholar] [CrossRef]
  24. Ojokoh, B.; Olufunke, C.O.; Babalola, A.; Eyo, E. A User-Centric Housing Recommender System. Inf. Manag. Bus. Rev. 2018, 10, 17–24. [Google Scholar]
  25. Mubarak, M.; Tahir, A.; Waqar, F.; Haneef, I.; McArdle, G.; Bertolotto, M.; Saeed, M.T. A Map-Based Recommendation System and House Price Prediction Model for Real Estate. ISPRS Int. J. Geo-Inf. 2022, 11, 178. [Google Scholar] [CrossRef]
  26. Satapathy, S.M.; Jhaveri, R.; Khanna, U.; Dwivedi, A. Smart Rent Portal using Recommendation System Visualized by Augmented Reality. Procedia Comput. Sci. 2020, 171, 197–206. [Google Scholar] [CrossRef]
  27. Najib, N.U.M.; Yousof, N.A.; Tabassi, A.A. Living in On-Campus Student Housing: Students’ Behavioural Intentions and Students’ Personal Attainments. Procedia Soc. Behav. Sci. 2015, 170, 494–503. [Google Scholar] [CrossRef]
  28. Cheskis-Gold, R.; Danahy, A.D. Trends in Undergraduate Student Housing: Process and Product. Plan. High. Educ. 2012, 41, 1. Available online: https://demographicperspectives.com/wp-content/uploads/2019/09/Trends-in-Undergraduate-Housing-Process-and-Product.pdf (accessed on 9 May 2025).
  29. Minder, P.; Bernstein, A. The Role of the Web in Real Estate: Web Science and Housing Markets. In Proceedings of the ACM Web Science Conference 2012, Evanston, IL, USA, 22–24 June 2012; pp. 1–4. [Google Scholar] [CrossRef]
  30. Henríquez-Miranda, C.; Ríos-Pérez, J.; Sanchez-Torres, G. Recommender Systems in Real Estate: A Systematic Review. Bull. Electr. Eng. Inform. 2025, 14, 2156–2170. [Google Scholar] [CrossRef]
  31. Tintarev, N.; Masthoff, J. Evaluating the Effectiveness of Explanations for Recommender Systems. User Model. User-Adapt. Interact. 2012, 22, 399–439. [Google Scholar] [CrossRef]
  32. Koeva, M.; Gasuku, O.; Lengoiboni, M.; Asiama, K.; Bennett, R.M.; Potel, J.; Zevenbergen, J. Remote Sensing for Property Valuation: A Data Source Comparison in Support of Fair Land Taxation in Rwanda. Remote Sens. 2021, 13, 3563. [Google Scholar] [CrossRef]
  33. La Roche, C.R.; Flanigan, M.A.; Copeland, P.K., Jr. Student Housing: Trends, Preferences and Needs. Contemp. Issues Educ. Res. (CIER) 2010, 3, 45. [Google Scholar] [CrossRef]
  34. Lorenz, F.; Willwersch, J.; Cajias, M.; Fuerst, F. Interpretable Machine Learning for Real Estate Market Analysis. Real Estate Econ. 2022, 51, 5. [Google Scholar] [CrossRef]
  35. Droj, G.; Kwartnik-Pruc, A.; Droj, L. A Comprehensive Overview Regarding the Impact of GIS on Property Valuation. ISPRS Int. J. Geo-Inf. 2024, 13, 175. [Google Scholar] [CrossRef]
  36. Thomsen, J.; Eikemo, T.A. Aspects of Student Housing Satisfaction. J. Hous. Built Environ. 2010, 25, 273–293. [Google Scholar] [CrossRef]
  37. Simpeh, F.; Akinlolu, M. A Scientometric Review of Student Housing Research Trends. IOP Conf. Ser. Earth Environ. Sci. 2021, 654, 012015. [Google Scholar] [CrossRef]
  38. Pai, P.-F.; Wang, W.-C. Using Machine Learning Models and Actual Transaction Data for Predicting Real Estate Prices. Appl. Sci. 2020, 10, 5832. [Google Scholar] [CrossRef]
  39. Lin, Y.; Liu, Y.; Lin, F.; Zou, L.; Wu, P.; Zeng, W.; Chen, H.; Miao, C. Reinforcement Learning for Recommender Systems: A Survey. ACM Comput. Surv. 2021, 54, 42. [Google Scholar] [CrossRef]
  40. Louzada, F.; de Lacerda, K.J.C.C.; Ferreira, P.H.; Gomes, N.D. Smart Renting: Harnessing Urban Data with Statistical and Machine Learning Methods for Predicting Property Rental Prices from a Tenant’s Perspective. Stats 2025, 8, 12. [Google Scholar] [CrossRef]
Figure 1. PRISMA diagram.
Figure 2. System diagram.
Figure 3. Recommendations based on the RL model.
Figure 4. Application frontend.
Figure 5. Recommender first phase.
Table 1. Results analysis.

| Platform             | Avg. Time (min) | Listings Viewed | Relevance Score |
|----------------------|-----------------|-----------------|-----------------|
| Our System           | 8.3 ± 2.1       | 5.2 ± 1.8       | 8.7 ± 0.9       |
| Imobiliare.ro        | 24.6 ± 6.3      | 18.4 ± 5.2      | 6.2 ± 1.3       |
| Facebook Marketplace | 31.2 ± 8.7      | 25.3 ± 7.1      | 5.8 ± 1.6       |
| OLX                  | 28.4 ± 7.2      | 21.6 ± 6.3      | 5.5 ± 1.4       |
Table 2. Computational overhead comparison.

| Approach         | Preprocessing | Query Time   | Memory | Accuracy | Region Coverage |
|------------------|---------------|--------------|--------|----------|-----------------|
| Fixed Boundaries | 0 ms          | 2.3 ± 0.4 ms | 0.5 MB | 82%      | 71%             |
| Dynamic DBSCAN   | 45 ± 8 ms     | 4.7 ± 0.8 ms | 3.2 MB | 94%      | 96%             |
| Hybrid Cached    | 45 ms initial | 2.8 ± 0.5 ms | 4.8 MB | 94%      | 96%             |
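The "Hybrid Cached" row in Table 2 pairs a one-off DBSCAN clustering pass over listing coordinates with cached lookups on subsequent queries, which explains why its query time approaches the fixed-boundary baseline while retaining DBSCAN's accuracy and coverage. As an illustration only (the paper's actual implementation is not shown here; the function names, `eps`, and `min_pts` values below are assumptions), the idea can be sketched in pure Python:

```python
from functools import lru_cache
from math import dist

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: label each point with a zone id, or -1 for noise."""
    labels = [None] * len(points)
    cluster = -1

    def neighbors(i):
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1            # provisionally noise; may become a border point
            continue
        cluster += 1                  # i is a core point: start a new zone
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:                  # expand the zone from core points
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster   # border point reclaimed from noise
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                seeds.extend(j_nbrs)  # j is also core: keep expanding
    return labels

# "Hybrid cached" variant: cluster once per snapshot of listing coordinates,
# then serve repeated queries for the same snapshot from the cache.
@lru_cache(maxsize=8)
def cached_zones(listing_coords, eps=0.2, min_pts=3):
    return tuple(dbscan(list(listing_coords), eps, min_pts))
```

Because `lru_cache` requires hashable arguments, the coordinates are passed as a tuple of `(lat, lon)` tuples; only the first call for a given snapshot pays the clustering cost, mirroring the "45 ms initial" preprocessing entry in Table 2.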

