Atmosphere
  • Article
  • Open Access

5 November 2023

Research and Application of Intelligent Weather Push Model Based on Travel Forecast and 5G Message

1 Sichuan Meteorological Service Centre, Chengdu 610072, China
2 Shaanxi Meteorological Information Centre, Xi’an 710014, China
3 Huyi District Meteorological Bureau, Xi’an 710399, China
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Deep Learning Algorithms for Weather Forecasting and Climate Prediction

Abstract

In the realm of daily activity planning, precise weather forecasting services hold paramount significance. However, the prevalent dissemination of weather forecasts through conventional channels like radio, television, and the internet often yields only generalized regional predictions. This limitation contributes to diminished forecast reach, inadequate accuracy, and a lack of individualization, thwarting the effective distribution of meteorological insights and inhibiting the fulfillment of personalized forecast demands. Addressing these concerns, our study proposes a personalized weather forecasting approach that harnesses machine learning techniques and leverages the 5G messaging platform. By amalgamating projected user travel data, we augment personalized weather reports and extend user coverage to achieve tailored, timely, and high-quality weather services. Concretely, our research commences with an extensive analysis of large-scale user travel behavior data to extract pertinent travel attributes. Subsequently, we construct a user’s future location prediction model—dubbed the Loc-PredModel—by employing the Extreme Gradient Boosting (XGBoost) algorithm to forecast users’ trip destinations and arrival times. Anchored in the anticipated outcomes of user travel behavior, personalized weather data reports are formulated. Experimental results underscore the Loc-PredModel’s remarkable predictive prowess, demonstrating a root mean squared error (RMSE) value of 0.208 and a coefficient of determination (R2) value of 0.935, affirming its efficacy in prognosticating users’ trip destinations and arrival times. Furthermore, our 5G message-driven platform, rooted in intelligent personalized meteorological services, underwent testing within Chengdu city and garnered positive user feedback. 
Our research effectively surmounts the limitations of conventional weather forecasting platforms by furnishing users with more precise and customized weather information predicated on behavioral analysis and the 5G information ecosystem. This study not only advances the theoretical groundwork of intelligent meteorology but also offers invaluable insights and guidance for future advancement. By providing users with a more personalized and timely intelligent meteorological service experience, our approach exhibits transferability, with the research methodology and model potentially extendable nationwide or even on a larger scale beyond the study’s Chengdu-based scope.

1. Introduction

Weather has a significant impact on people’s daily lives and work [1]. Accurate meteorological information can help individuals organize their work and life, thereby enhancing productivity [2] and improving quality of life. As the pace of life accelerates, demand for accurate weather forecasts is increasing [3], and users’ personalized requirements are becoming more prominent. Thanks to breakthroughs in intelligent and automated meteorological monitoring technology [4], the quality and quantity of meteorological data have improved significantly, and the accuracy and resolution of forecasts continue to improve [5]. However, meteorological services also face the challenge of information overload brought about by the proliferation of the internet [6]. In particular, the dissemination of meteorological data is still dominated by traditional media such as broadcast television and the internet [7]. These channels often provide only coarse forecasts over a large region, failing to meet modern society’s demand for personalized, precise, and timely meteorological information. For instance, different locations within the same administrative region of a city can experience vastly different weather conditions, so services with higher geographical resolution can offer more accurate meteorological insights. The result is low forecast reach, inadequate precision, and a lack of personalized services, which restricts the practical application of meteorological data in everyday life.
Therefore, personalized weather recommendations have emerged as a novel research direction. Leveraging techniques such as machine learning, these studies use historical data to offer more accurate and personalized weather recommendations. While such research helps improve prediction accuracy and user experience, it still does not address the timely delivery of meteorological information.
To address the aforementioned issues, researchers have been striving to enhance the accuracy and personalization of these services [8,9,10]. In this pursuit, there has been accelerated technological innovation to ensure precise monitoring, forecasting, and detailed services [11]. Government reports also emphasize the strengthening of disaster prevention and emergency capabilities through improved meteorological services. One key aspect involves advancing core technologies such as refining weather mechanisms and enhancing numerical forecasting models [12]. Through the combination of these efforts, researchers are committed to improving the quality, timeliness, and personalization of meteorological services, thereby contributing to the overall well-being and safety of society.
However, despite the significant progress made in the field of meteorological services, current research primarily focuses on enhancing meteorological services but lacks an in-depth exploration from the perspective of user behavior analysis to improve the accuracy of meteorological services. Personalized weather forecasting research that is based on user behavior analysis and takes a user-centric approach is currently relatively scarce. This research perspective holds great significance. Leveraging technologies such as machine learning, personalized weather forecasting based on user behavior analysis can tailor weather forecasts to individual users by analyzing their historical behaviors, preferences, and backgrounds, better meeting their unique needs and patterns. This forward-looking research direction holds the potential to open up new avenues for enhancing the accuracy and personalization level of meteorological services. Delving into the correlation between user behavior and meteorology will provide a fresh perspective for personalized weather forecasting, thereby further elevating the precision and personalization level of meteorological services.
Therefore, this study aims to comprehensively consider the strengths and weaknesses of these models and integrate them with the 5G messaging platform to establish a refined and efficient personalized meteorological service system. By deeply analyzing user travel behavior, integrating multi-model predictions, and leveraging the rich media and interactivity features of the 5G messaging platform, personalized, timely, and high-quality weather forecasts can be achieved to meet diverse user demands. The selection of the 5G messaging platform is driven by its superior bandwidth, lower latency, and enhanced capacity compared to 4G networks, enabling the delivery of rich media content and facilitating more interactive user experiences. While the 4G platform can also support similar services, the 5G platform offers distinct advantages in terms of delivering immersive and engaging meteorological information, thereby significantly enhancing the quality and efficiency of meteorological information services.
This research holds significant theoretical and practical value. Theoretically, it explores novel personalized weather recommendation methods by leveraging the rich media and interactivity features of the 5G messaging platform, addressing the limitations of traditional approaches. Practically, it offers new communication channels and interaction methods to improve the reach and user satisfaction of meteorological information, promoting intelligent, personalized, and accurate meteorological information services.

3. Dataset and Data Processing

This research conducts travel pattern analysis based on a shared bicycle dataset. Considering our focus on individual travel patterns, we opt to use transportation data as a proxy to indirectly extract individuals’ spatiotemporal location patterns. In real-world scenarios, various modes of transportation, such as shared bicycles, taxis, buses, and subways, record features like departure time, departure location, arrival time, and arrival location. Notably, shared bicycle datasets represent the travel patterns of a substantial number of urban users. Thus, we regard shared bicycle datasets as significant sources for studying individual travel patterns. Furthermore, 5G-enabled devices can also collect these data attributes. With this in mind, we adopt a transfer learning approach. We utilize publicly available shared bicycle datasets to construct travel prediction models and smoothly transfer these models to other modes of transportation and data collected from 5G-enabled devices for application.
Through the travel pattern prediction model, we can obtain information regarding users’ arrival times and locations, which can be combined with user characteristics to offer personalized weather reports. To achieve this goal, this research necessitates the use of shared bicycle datasets in conjunction with corresponding weather datasets.

3.1. Bike-Sharing Data

The dataset employed in this study is the Mobike shared bicycle open dataset for Shanghai in August 2016. It contains six categories of features: bicycle ID (bike_id), user ID (user_id), departure time (start_time), departure longitude (start_lon) and latitude (start_lat), arrival time (end_time), and arrival longitude (end_lon) and latitude (end_lat). The dataset covers the period from 1 August 2016, 00:00, to 1 September 2016, 00:00, and comprises a total of 1,023,603 records. Both bicycle ID and user ID have been encoded using Label Encoder, while time data have been converted into timestamps. Table 1 lists the shared bicycle dataset fields and their descriptions, and Table 2 provides an illustrative example of the dataset.
Table 1. Shared bicycle dataset fields and their descriptions.
Table 2. Example of the bike-sharing dataset.
The meanings of the fields in the data file are given in Table 1. Note that some user information in the dataset has been de-identified.
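The encoding step described above can be sketched in a few lines. This is a minimal illustration and not the authors' code: the helper names `label_encode` and `to_timestamp`, the time string format, and the sample IDs are assumptions. Codes are assigned in sorted order of the distinct values, which is how a typical label encoder such as scikit-learn's `LabelEncoder` behaves.

```python
from datetime import datetime, timezone

def label_encode(values):
    """Map each distinct value to an integer code, assigned in sorted order."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes

def to_timestamp(text, fmt="%Y-%m-%d %H:%M"):
    """Parse a time string (format assumed) into Unix seconds, treated as UTC."""
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())

# Hypothetical raw records using the field names listed for the dataset.
raw_trips = [
    {"bike_id": "B-77", "user_id": "u-a1",
     "start_time": "2016-08-01 08:05", "end_time": "2016-08-01 08:17"},
    {"bike_id": "B-12", "user_id": "u-a1",
     "start_time": "2016-08-01 18:40", "end_time": "2016-08-01 18:52"},
]

bike_codes, _ = label_encode([t["bike_id"] for t in raw_trips])
user_codes, _ = label_encode([t["user_id"] for t in raw_trips])

encoded_trips = [
    {
        "bike_id": b,
        "user_id": u,
        "start_time": to_timestamp(t["start_time"]),
        "end_time": to_timestamp(t["end_time"]),
    }
    for t, b, u in zip(raw_trips, bike_codes, user_codes)
]
```

Storing times as integer timestamps makes the trip duration a simple subtraction, which is convenient when arrival time is later used as a regression target.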

3.2. Meteorological Observation Data

In this study, we also used meteorological observation data from the same period as the Shanghai shared bicycle data, spanning 1 August 2016, 00:00, to 1 September 2016, 00:00. These data record the weather elements during this period, including temperature, humidity, wind speed, wind direction, and precipitation. They not only describe weather conditions in detail but also reflect the impact of different weather conditions on travel, which makes them key input factors for the intelligent weather push model. Combining the meteorological observation data with the shared bicycle data gives the study a more comprehensive description of weather conditions and supports a more accurate analysis of travel patterns, and thus more precise personalized intelligent weather push services.
Figure 2 shows the temperature chart for Shanghai in August 2016, and Table 3 gives an example of the Shanghai meteorological observation data.
Figure 2. Temperature chart for Shanghai in August 2016.
Table 3. Example of Shanghai meteorological observation data.

3.3. Weather Forecast Data

Table 4 gives an example of the Shanghai meteorological forecast data.
Table 4. Example of Shanghai meteorological forecast data.

3.4. Integration of Dataset Construction

In order to comprehensively analyze travel patterns and provide personalized intelligent weather recommendation services, this study adopts a method of associating shared bike data, meteorological observation data, and weather forecast data to construct an integrated dataset. This integrated dataset uses latitude, longitude, and time as reference points to organically combine data from different sources, providing richer and more accurate information.
Specifically, in the construction process of the integrated dataset, we first use latitude, longitude, and time as the basis for association, matching the departure and arrival locations and times from the shared bike data with corresponding timestamps in the meteorological observation data. This approach allows us to link each trip with the weather conditions at specific times, laying the foundation for subsequent analysis.
Simultaneously, we include weather forecast data in the integrated dataset to further enhance its richness and accuracy. Incorporating weather forecast data with actual observation data contributes to a more comprehensive weather prediction. By comparing forecasted data with actual observed data, we can better understand the changing trends in weather and provide users with more reliable weather information.
Constructing this integrated dataset allows us to draw on multiple data sources and thereby capture travel patterns and weather impact factors more completely. By associating shared bike data, meteorological observation data, and weather forecast data, we can conduct more refined analyses in the spatial and temporal dimensions, providing stronger support for intelligent travel recommendations and personalized weather notifications.
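The association by location and time can be illustrated with a small lookup-based join. The sketch below is only one possible reading of the procedure: it assumes that each record already carries a spatial cell identifier (`grid_id`), that weather records are hourly, and that field names such as `obs_temp` and `fc_temp` are hypothetical.

```python
def hour_key(ts):
    """Truncate a Unix timestamp in seconds to the start of its hour."""
    return ts - ts % 3600

def attach_weather(trips, observations, forecasts):
    """Join each trip with the observation and forecast for its cell and hour."""
    obs_index = {(o["grid_id"], hour_key(o["time"])): o for o in observations}
    fc_index = {(f["grid_id"], hour_key(f["time"])): f for f in forecasts}
    merged = []
    for trip in trips:
        key = (trip["start_grid"], hour_key(trip["start_time"]))
        obs = obs_index.get(key)
        if obs is None:
            continue  # no observation for this cell and hour: drop the trip
        row = dict(trip)
        row["obs_temp"] = obs["temp"]
        fc = fc_index.get(key)
        # Keep the trip even when no forecast matches; mark the gap explicitly.
        row["fc_temp"] = fc["temp"] if fc is not None else None
        merged.append(row)
    return merged
```

Keeping the observed and forecast values side by side in one row is what makes the comparison between forecast and observation mentioned above straightforward.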

3.5. Data Preprocessing

To preprocess the shared bike and meteorological data effectively, this study applies the following steps to ensure data quality and suitability:
  • Data cleaning: In the preliminary stage of data preprocessing, rigorous data cleaning is performed for both shared bike and meteorological data. Firstly, for shared bike data, records containing missing values or anomalies are removed. This includes data with unreasonable timestamps, latitude and longitude values exceeding actual ranges, and negative speed values. Similarly, for meteorological data, records with missing or abnormal values, such as invalid temperature, humidity, and wind speed data, are also eliminated.
  • Data transformation: After data cleaning, data transformation is carried out to suit subsequent analysis and modeling. Timestamps are converted into dates and hours, aiding in associating data with time to explore travel patterns and weather conditions across different time periods. Latitude and longitude values are transformed into grid IDs, mapping spatial information to discrete grids for subsequent spatial analysis. Additionally, wind directions are converted into angle values to better comprehend and compare directional differences.
  • Data normalization: To eliminate scale differences between numeric data, normalization or standardization is applied. This ensures that data distributions are on a consistent scale, balancing the weights between different features. Normalizing data to a standard normal distribution or within the [0, 1] range prevents the model from being influenced by data scales during training and testing.
  • Data matching: In the process of merging shared bike and meteorological data, the key is to match the two types of data based on time and space. Each travel record is associated with weather observation data under a specific time interval and grid ID. This matching ensures that each trip is linked to weather conditions during its specific time period, providing strong support for subsequent analysis.
Through the aforementioned preprocessing steps, we obtain a normalized integrated dataset where each record includes shared bike data features such as user ID, time, latitude, and longitude, as well as meteorological data features like temperature, humidity, wind speed, wind direction, precipitation, and cloud cover.
During the experimental modeling process, data preprocessing plays a crucial role. Rigorous data cleaning, transformation, and matching guarantee the quality and accuracy of data used in model training and testing. Particularly with ample data available, effective data preprocessing is vital for establishing reliable analysis and prediction models. By removing problematic data records, we ensure the accuracy of model inputs, thereby enhancing the efficiency of predicting and analyzing travel patterns.
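The cleaning, transformation, and normalization steps listed above can be sketched as follows. The bounding box, cell size, grid width, and compass-to-angle table are illustrative assumptions, since the paper does not report the values it used; the function names are hypothetical as well.

```python
# Assumed grid origin and resolution (not stated in the paper).
LAT_MIN, LON_MIN = 30.6, 120.8
CELL_DEG = 0.01
N_COLS = 200

# Assumed mapping from compass labels to angles in degrees.
COMPASS_ANGLES = {"N": 0.0, "NE": 45.0, "E": 90.0, "SE": 135.0,
                  "S": 180.0, "SW": 225.0, "W": 270.0, "NW": 315.0}

def is_valid_trip(trip):
    """Cleaning: reject missing values, out-of-range coordinates, bad times."""
    required = ("start_lat", "start_lon", "start_time", "end_time")
    if any(trip.get(k) is None for k in required):
        return False
    if not (-90.0 <= trip["start_lat"] <= 90.0):
        return False
    if not (-180.0 <= trip["start_lon"] <= 180.0):
        return False
    return trip["end_time"] > trip["start_time"]

def grid_id(lat, lon):
    """Transformation: map a coordinate to a discrete grid cell ID."""
    row = int((lat - LAT_MIN) / CELL_DEG)
    col = int((lon - LON_MIN) / CELL_DEG)
    return row * N_COLS + col

def wind_angle(direction):
    """Transformation: convert a compass label to an angle in degrees."""
    return COMPASS_ANGLES[direction]

def min_max(values):
    """Normalization: rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]
```

In a real pipeline the normalization parameters would be estimated on the training data only and then reused for the test data, so that no information from the test set leaks into training.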

4. Methodology and Model Introduction

The methodology employed in this study comprises several key steps. Firstly, we collect and analyze users’ travel behavior data to extract travel features and preferences. Subsequently, using the XGBoost algorithm [27], we construct a predictive model for users’ future locations that forecasts their upcoming destinations. Following this, personalized meteorological data reports are generated based on users’ personal characteristics and the meteorological indicators of greatest interest to them. Finally, these personalized meteorological reports are delivered to users through the 5G messaging platform, achieving an intelligent and personalized weather service.

4.1. Model Structure

Personalized weather push services can be customized based on predictable user travel information, such as where the user is likely to go and when they will arrive. In this paper, based on the Extreme Gradient Boosting (XGBoost) algorithm, we build a user arrival destination prediction model, named Loc-PredModel, to integrate the different kinds of data. The model takes the user’s travel data (departure location, departure time, etc.) and additional data as input and directly infers the user’s destination location and arrival time. The overall structure of the Loc-PredModel is shown in Figure 3.
Figure 3. The structure of Loc-PredModel.
The XGBoost algorithm is an improved algorithm based on the gradient-boosted decision tree, adding features such as column sampling and shrinkage to avoid overfitting and improve the predictive ability of the model. Introducing a regularization term that measures the complexity of the tree model into the objective function further reduces the risk of overfitting. XGBoost can use a decision tree or a linear model as the base learner, with each new learner fitted according to the error remaining after the previous iteration. Finally, the learners are combined into an ensemble model that produces the prediction. In general, the XGBoost algorithm tends to achieve better accuracy and requires less time to build the model owing to its training process, and it also supports parallel computation, built-in cross-validation, and missing values. Therefore, in this paper, the XGBoost algorithm is chosen to build the Loc-PredModel for prediction. As an ensemble tree model, its predicted value is calculated as shown in Equation (1):
$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i) \tag{1}$$
where $\hat{y}_i^{(t)}$ is the predicted value of the $i$-th sample after the $t$-th iteration, $\hat{y}_i^{(t-1)}$ is the predicted value given by the previous $t-1$ base models, and $f_t(x_i)$ is the output of the $t$-th base model. Considering that the prediction of user arrival destination and arrival time is a regression problem, the loss function of the model is set as shown in Equation (2):
$$L = \sum_{i=1}^{m} l(y_i, \hat{y}_i) \tag{2}$$
where $m$ is the number of samples, $l$ is the training loss, $y_i$ is the real value of the user’s arrival destination and arrival time, and $\hat{y}_i$ is the predicted value of the destination and arrival time. After the regularization term is introduced, the objective function of the model is set as shown in Equation (3):
$$Obj = L + \sum_{i=1}^{n} \Omega(f_i) \tag{3}$$
This objective function consists of two parts: the first is the loss function and the second is the regularization term, which penalizes model complexity to prevent overfitting. Here, $n$ is the number of trees. The expression of the regularization term is shown in Equation (4):
$$\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert \omega \rVert^2 \tag{4}$$
where $\gamma$ and $\lambda$ are the penalty coefficients, $T$ is the number of leaf nodes in a given tree, and $\lVert \omega \rVert^2$ is the squared L2 norm of the leaf weights.
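To make Equations (1) to (4) concrete, the following sketch evaluates the additive prediction and the regularized objective for a toy ensemble. It is an illustration under stated assumptions rather than the XGBoost implementation: the training loss l is taken to be the squared error, each tree is represented only by its list of leaf weights, and all function names are hypothetical.

```python
def additive_prediction(x, base_models):
    """Equation (1): sum the outputs f_t(x) of the base models in order."""
    prediction = 0.0
    for f_t in base_models:
        prediction = prediction + f_t(x)
    return prediction

def training_loss(y_true, y_pred):
    """Equation (2), with squared error assumed as the per-sample loss l."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred))

def regularization(leaf_weights, gamma, lam):
    """Equation (4): gamma * T + 0.5 * lambda * sum of squared leaf weights."""
    n_leaves = len(leaf_weights)
    return gamma * n_leaves + 0.5 * lam * sum(w * w for w in leaf_weights)

def objective(y_true, y_pred, trees, gamma, lam):
    """Equation (3): training loss plus the penalty summed over all trees."""
    penalty = sum(regularization(t, gamma, lam) for t in trees)
    return training_loss(y_true, y_pred) + penalty

# Two toy base models: a constant offset and a small linear correction.
base_models = [lambda x: 1.0, lambda x: 0.5 * x]
```

Larger values of gamma discourage trees with many leaves, while larger values of lambda shrink the leaf weights; both make the ensemble more conservative.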

4.2. Model Hyperparameters

In this paper, a gridded parameter search method called GridSearchCV is used to determine the model hyperparameters. In the process of parameter search, the set of hyperparameters with the smallest error is selected as the final hyperparameters of the model. We conducted a grid search for six major hyperparameters of Loc-PredModel, and the search range of each hyperparameter is shown in Table 5.
Table 5. Hyperparameter search range in the GridSearchCV.
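The idea behind an exhaustive grid search can be sketched with the standard library alone. The example below is not GridSearchCV itself: the two hyperparameter ranges are hypothetical stand-ins for those in Table 5, and `toy_error` is a placeholder for the cross-validated error of a model trained with the given parameters.

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Try every combination in the grid and keep the one with lowest error."""
    names = sorted(param_grid)
    best_params, best_error = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        error = evaluate(params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error

# Hypothetical search ranges; the actual ranges are those listed in Table 5.
param_grid = {"n_estimators": [100, 200, 300], "max_depth": [3, 5, 7]}

def toy_error(params):
    """Placeholder for the cross-validated error of a fitted model."""
    return abs(params["n_estimators"] - 200) / 100 + abs(params["max_depth"] - 3)

best_params, best_error = grid_search(param_grid, toy_error)
```

The cost grows multiplicatively with the number of values per hyperparameter, so a six-parameter grid is usually kept to a few candidate values per dimension.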

5. Experiment and Evaluation

5.1. Experimental Environment and the Evaluation Index

The experiments were primarily conducted on a server equipped with a graphics processing unit (GPU), with detailed software and hardware specifications outlined in Table 6.
Table 6. Experimental environment setting.
In this study, the performance of the models was evaluated using four metrics: mean absolute error (MAE), root mean squared error (RMSE), coefficient of determination ($R^2$), and Pearson correlation coefficient (COR) [28]. MAE and RMSE measure the deviation between predicted values and actual values, with smaller values indicating lower deviation. $R^2$ reflects the goodness of fit of the model to the data, ranging from 0 to 1, where values closer to 1 indicate a better fit. COR measures the correlation between predicted values and actual values, with higher values indicating stronger correlation. The formulas for these evaluation metrics are defined as follows, where $n$ is the number of samples, $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $\bar{y}$ is the sample mean:
$$MAE = \frac{\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|}{n}$$
$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
$$COR = \frac{COV(\hat{y}_i, y_i)}{\sqrt{Var[\hat{y}_i] \, Var[y_i]}}$$
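The metric definitions above translate directly into code. The following standard-library sketch is provided for illustration; the function names are our own, and the inputs are assumed to be equal-length sequences of numbers with non-constant values.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

def cor(y_true, y_pred):
    """Pearson correlation between actual and predicted values."""
    mean_y = sum(y_true) / len(y_true)
    mean_p = sum(y_pred) / len(y_pred)
    cov = sum((y - mean_y) * (p - mean_p) for y, p in zip(y_true, y_pred))
    var_y = sum((y - mean_y) ** 2 for y in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_y * var_p)
```

The common factor 1/n cancels in the correlation, so sums of squares are used there in place of variances.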

5.2. Data Analysis Results

We begin by analyzing the features of user riding data in the dataset. Figure 4a presents the distribution of user riding distances. It is evident from the graph that the majority of users have riding distances not exceeding 2 km. Among these, approximately 42.4% of the data consist of distances under 1 km, while distances between 1 km and 2 km make up around 34.8% of the total data. Notably, data points with riding distances exceeding 5 km account for 10.8%, which could introduce higher prediction challenges due to the likelihood that longer distances reflect instances where users might change their riding destinations.
Figure 4. (a) Distribution characteristics of user cycling distance; (b) distribution characteristics of user cycling time.
Figure 4b portrays the distribution of user riding durations. The data show that the highest proportion falls within the range of 5 to 10 min, accounting for approximately 31.2%. Data points with riding durations exceeding 30 min constitute around 11.2%. Similar to riding distances, longer riding durations could also contribute to increased prediction complexities.
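Distributions such as those in Figure 4 can be derived from the raw trip records. The paper does not state how riding distance was computed, so the sketch below assumes the great-circle (haversine) distance between departure and arrival coordinates; the bin edges and function names are also illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two coordinates in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def share_by_bin(values, edges):
    """Fraction of values in each bin [edges[i], edges[i+1]); last bin is open."""
    counts = [0] * len(edges)
    for v in values:
        index = 0
        for i, edge in enumerate(edges):
            if v >= edge:
                index = i
        counts[index] += 1
    return [c / len(values) for c in counts]

# Example: shares for under 1 km, 1 to 2 km, 2 to 5 km, and over 5 km.
distance_shares = share_by_bin([0.5, 0.7, 1.5, 6.0], [0, 1, 2, 5])
```

The same `share_by_bin` helper applies to riding durations in minutes, for example with edges such as 0, 5, 10, and 30.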
Next, we delve into the Pearson correlation analysis of various variables in the dataset. As shown in Figure 5a, the arrival time of users is highly correlated with their departure time, with a correlation coefficient of 0.94. Simultaneously, concerning users’ arrival destinations, there exists a noticeable correlation with their departure locations. The correlation between arrival longitude and departure longitude is 0.68, while the correlation between arrival latitude and departure latitude is 0.88. Departure longitude also exhibits some influence on arrival latitude (Cor = 0.15).
Figure 5. (a) Pearson correlation between variables; (b) feature importance of different variables from XGBoost algorithm.
Subsequently, using the XGBoost algorithm, this study conducts feature importance analysis on all input features. Figure 5b illustrates the results. It is evident that the departure time feature holds the greatest importance in the prediction process of the model, followed by the departure location’s significance. Notably, the user ID and bike ID features have the least importance in the model’s prediction, aligning with the results of Pearson correlation analysis among the variables.

5.3. Model Performance Comparison

The main features for predicting users’ arrival locations and times include the user’s departure location and time, with auxiliary features being the bike ID and user ID. The dataset was divided randomly into a training set (70%) and a test set (30%), which were preprocessed using standardization functions.
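A random 70/30 split followed by standardization can be sketched as below. The paper does not name the specific functions it used, so this is a standard-library illustration with hypothetical helper names; the seed value is arbitrary. The mean and standard deviation are estimated on the training portion and then applied to both portions.

```python
import random
import statistics

def train_test_split(rows, test_ratio=0.3, seed=42):
    """Shuffle a copy of the rows and split it into training and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

def fit_standardizer(column):
    """Estimate mean and standard deviation on the training column."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return mean, (std if std > 0 else 1.0)  # guard against constant columns

def standardize(column, mean, std):
    """Apply z-score scaling with previously fitted parameters."""
    return [(v - mean) / std for v in column]

train_rows, test_rows = train_test_split(range(10))
mean, std = fit_standardizer([float(v) for v in train_rows])
train_scaled = standardize(train_rows, mean, std)
test_scaled = standardize(test_rows, mean, std)
```

Fitting the scaler on the training rows only keeps the test set untouched until evaluation.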
Several different models were developed and tested for predicting users’ destinations and arrival times:
K-Nearest Neighbor (KNN): The KNN algorithm finds the K-Nearest Neighbors of a given sample in the feature space and assigns the average attributes of these neighbors to the sample for prediction. It was developed using the “sklearn” package in Python, with the parameter n_neighbors set to 7.
Deep Neural Network (DNN): The DNN receives predictive variables as inputs at the input layer, and generates predictions in the output layer by training neurons in hidden layers. It was developed using the “Keras” package in Python, comprising one input layer (128 neurons), two hidden layers (with 64 and 32 neurons, respectively), and one output layer, utilizing the “relu” activation function.
Random Forest (RF): This ensemble model is composed of multiple classification or regression trees. Input predictive factors are randomly partitioned into each tree using bootstrapping, training the predictive model using data within each tree to mitigate model overfitting risks. It was developed using the “sklearn” package in Python, with the parameter n_estimators set to 100.
Loc-PredModel: This model, designed for predicting users’ destinations using the XGBoost algorithm, was developed using the “XGBoost” package in Python. Parameters were set as n_estimators = 200 and max_depth = 3.
The performance of each model is presented in Table 7, where “Train_c” denotes the time consumption on the training set, and “Test_c” represents the time consumption on the test set. For location prediction, the units for mean absolute error (MAE) and root mean squared error (RMSE) are degrees, whereas for arrival time prediction, the units for MAE and RMSE are minutes.
Table 7. Model performance comparison.
Overall, the Loc-PredModel outperforms the other models across the evaluation metrics, with the Random Forest model performing comparably. Notably, the Loc-PredModel is also the fastest, with a training time of 2.315 s on the training set and an inference time of 0.055 s on the test set. In contrast, the Random Forest model takes 16.216 s for training and 1.029 s for inference on the test set. The DNN model is the slowest, taking 195.005 s on the training set and 5.603 s on the test set. The parallel tree structure of the Loc-PredModel based on the XGBoost algorithm contributes to its lower time consumption, which is advantageous for practical deployment and application. The experimental results demonstrate the effectiveness of the Loc-PredModel in predicting users’ arrival locations and times: it combines good prediction accuracy with low time consumption.
After confirming the superior performance of the Loc-PredModel, further analysis was conducted, through feature ablation experiments, to explore the impact of different features on prediction performance. As shown in Table 8, the results show that the Loc-PredModel achieves good prediction performance when given the departure time and departure location information as inputs. Adding the bike_id feature has a larger impact on improving prediction performance compared to adding the user_id feature. This aligns with the results from the Pearson correlation analysis and feature importance analysis in Section 5.2. Importantly, when both bike_id and user_id features are included as inputs to the model, prediction performance does not improve further. This suggests that adding more features to the model does not always enhance prediction performance, as it might introduce redundant information.
Table 8. The impact of different predictors on model performance.
Furthermore, we examined how different travel distances and travel times affect the predictive performance of the model. This analysis helps us understand the model’s ability to predict user behavior under various travel conditions. Figure 6a illustrates the predictive performance of the Loc-PredModel across different travel distances. Generally, the model exhibits better predictive performance for shorter travel distances and poorer performance for longer distances. The predictive ability of the model noticeably decreases when the travel distance exceeds 5 km ($R^2$ = 0.758). Figure 6b displays the model’s predictive performance across different travel times. Similar to Figure 6a, longer travel times correspond to poorer predictive performance. However, the decline in performance is more gradual, suggesting that the model is more sensitive to travel distance than to travel time. These results confirm the hypotheses made in Section 5.2, which stated that the user’s travel distance and travel time are significantly correlated with prediction difficulty.
Figure 6. (a) The prediction performance for different cycling distances; (b) the prediction performance for different cycling time.

6. Personalized Weather Report Generation

The primary objective of this research is to deliver personalized weather reports that specifically address individual users’ requirements and preferences. This section aims to provide a comprehensive understanding of how the outcomes from the predictive model are integrated with user preference features to formulate customized weather reports.

6.1. Consideration of User Preference Features

In order to generate personalized weather reports, it is crucial to take into account various user preference features, including their sensitivities towards elements such as temperature, precipitation, and wind speed. These features can be derived from historical preference data, user feedback, and personal information. By integrating these features, a better comprehension of user needs can be attained, thereby facilitating the creation of weather reports that align closely with their preferences.

6.2. Data Integration Process

The integration process involves the combination of three sets of data. Firstly, meteorological data comprise two categories: basic meteorological elements such as precipitation, temperature, wind speed, wind direction, and humidity, and lifestyle weather elements like air quality, UV index, dress index, and car wash index. Secondly, user travel data primarily include the user’s mode of travel, departure time, departure location, arrival time, and destination. Lastly, user preference feature data involve user characteristics such as age group, occupation, and primary commuting method.

6.3. Integration of User Preference Features and Weather Data

Once the user preference features have been acquired, they are integrated with the weather data and the output of the prediction model to generate personalized weather reports. In the process illustrated in Figure 7, the recommendation system gathers weather data, user travel data, and user-specific features. Using the Loc-PredModel discussed previously, the system predicts the user’s destination and arrival time. The weather data, travel data, and user-specific feature data are then transmitted to the application server, where they are combined into data reports covering the weather conditions, destination, and arrival time. These personalized reports are delivered to user terminals through the 5G messaging platform.
Figure 7. Architecture diagram of the intelligent weather push platform.
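The report assembly step on the application server might look like the following minimal sketch. It takes a predicted destination and arrival time, a forecast for that place and time, and the weather elements a user is sensitive to, and returns a report. The function name, the dictionary keys, and the alert thresholds are hypothetical and are not taken from the deployed platform.

```python
from datetime import datetime

# Hypothetical alert rules: element -> (predicate, message). Thresholds are assumptions.
ALERT_RULES = {
    "precipitation_mm": (lambda v: v >= 0.1, "Rain expected; consider carrying rain gear."),
    "temperature_c": (lambda v: v >= 35.0 or v <= 0.0, "Extreme temperature; dress accordingly."),
    "wind_speed_ms": (lambda v: v >= 10.8, "Strong wind; ride with care."),
}


def build_report(destination, arrival_time, forecast, sensitive_elements):
    """Assemble a report for the predicted destination and arrival time.

    Only elements the user is sensitive to can trigger an alert.
    """
    alerts = []
    for element in sorted(sensitive_elements):
        rule = ALERT_RULES.get(element)
        if rule is None or element not in forecast:
            continue
        predicate, message = rule
        if predicate(forecast[element]):
            alerts.append(message)
    return {
        "destination": destination,
        "arrival_time": arrival_time.isoformat(timespec="minutes"),
        "conditions": dict(forecast),
        "alerts": alerts,
    }


# Hypothetical inputs: the destination and arrival time would come from the Loc-PredModel.
report = build_report(
    destination="Chunxi Road",
    arrival_time=datetime(2023, 11, 5, 8, 30),
    forecast={"precipitation_mm": 2.0, "temperature_c": 18.0, "wind_speed_ms": 12.0},
    sensitive_elements={"precipitation_mm", "temperature_c"},
)
```

In this example the wind speed exceeds its threshold, but no wind alert is produced because the user is not marked as sensitive to wind. This is one simple way to let preference features shape the report content.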

6.4. Practical Application Scenarios

As shown in Figure 8, we have implemented a personalized 5G Messaging Cloud Service Platform based on the Loc-PredModel. The platform provides notification management, reply configuration, 5G menu configuration, transmission management, rich media message editing, and data management. Our personalized weather service recommendation system aims to provide users with an intelligent, real-time, and tailored meteorological information experience. To achieve this goal, we have designed a variety of 5G message templates for different scenarios, catering to users’ diverse weather information needs. These templates are combined dynamically and automatically according to the user’s current context and service variations, allowing flexible deployment of 5G message products for different scenarios. Whether users are on a journey, preparing for outdoor activities, or in need of real-time weather information, the system provides weather services relevant to their specific scenarios, so that users receive the latest available weather information.
Figure 8. Deployed 5G Messaging Cloud Service Platform.
In addition, our system implements user text recognition and multimedia weather interaction question–answer services across various scenarios. Users can pose weather-related questions to the system through text input or voice recognition. The system responds in rich multimedia formats, including voice synthesis, images, and videos, based on the user’s question and current context. This enables users to intuitively grasp weather conditions, whether they are inquiring about today’s weather, the temperature trend for the upcoming week, or seeking weather advice for specific activities. Our system caters to personalized needs, enhancing the user’s meteorological service experience.
In summary, our personalized weather service recommendation system, through flexible 5G message templates and multimedia weather interaction question–answer services, aims to provide users with a more intelligent, real-time, and tailored meteorological service. We address users’ diverse meteorological information needs, offering a more convenient and valuable weather forecasting experience.

7. Conclusions and Future Directions

In this research, we explored personalized weather recommendations driven by predictions of user travel behavior. By analyzing features such as departure location, departure time, bike ID, and user ID, together with their correlations with weather information, we predicted each user’s destination and arrival time, providing users with more personalized and practical weather recommendations.
Different model approaches, including K-Nearest Neighbor (KNN), Deep Neural Network (DNN), Random Forest (RF), and Loc-PredModel developed using the XGBoost algorithm, were thoroughly compared and analyzed. The experimental results demonstrated that Loc-PredModel performed exceptionally well across multiple evaluation metrics, particularly in terms of MAE, RMSE, R2, and COR, showcasing its effectiveness in predicting user destination and arrival time. Furthermore, the Loc-PredModel exhibited significant advantages in terms of time efficiency, making it suitable for practical deployment and application.
It is worth noting that the feature ablation experiment further verified the impact of different features on prediction performance. It was observed that even with only departure time and departure location information as input, Loc-PredModel achieved satisfactory prediction performance. Moreover, incorporating the bike ID feature significantly improved prediction performance, while the improvement from the user ID feature was limited. Interestingly, the performance stopped improving when both bike ID and user ID features were simultaneously input, emphasizing the importance of feature selection and model design in personalized weather prediction.
In summary, this research delved into the relationship between user behavior and weather, resulting in a personalized weather prediction model based on user travel patterns. Experimental results illustrated the model’s advantages in accuracy, practicality, and time efficiency. This not only offers a new approach to enhancing the accuracy and personalization of meteorological services but also provides robust support for the further development and application of weather prediction. With the continuous advancement of mobile internet and big data technology, we believe personalized weather prediction will play an increasingly crucial role in areas such as smart cities and intelligent transportation, providing users with more convenient and accurate weather information services.
By integrating 5G messaging platform technology and machine learning algorithms, this research proposed an intelligent weather notification model based on travel prediction and the 5G messaging platform. The model predicted users’ future locations by analyzing their travel behavior patterns, combined user features to extract meteorological information they are most likely to be interested in, and generated personalized weather reports. These reports were then sent to users through the 5G messaging platform, supporting intelligent interaction between users and the server. This model realizes personalized smart weather services, enhancing user satisfaction and reliance on weather information.

8. Discussion and Limitations

The predictive capabilities of our location and time prediction model were initially limited by the constraints of the shared bicycle dataset, particularly in relation to temporal and spatial scales. However, the integration of spatial–temporal data obtained from 5G terminal feedback holds promise in overcoming these limitations. This incorporation enables our personalized weather report generation model to dynamically adapt to corresponding temporal and spatial constraints, facilitating the generation of tailored weather reports for effective dissemination to users.

8.1. Enhanced Predictive Capabilities

The implementation of 5G terminal feedback significantly enhances the acquisition of real-time spatial and temporal location data, thereby augmenting the predictive capabilities of our personalized weather report generation model. This real-time feedback mechanism not only enriches the dataset but also contributes to the refinement of the predictive model, leading to improved accuracy and relevance in the dissemination of weather reports to users.

8.2. Privacy and Generalizability Considerations

While the integration of 5G terminal feedback data provides promising opportunities, it necessitates robust data governance and privacy protection protocols. The ethical and secure utilization of user location data demands careful consideration and implementation of stringent privacy measures, including anonymization, data encryption, and transparent data governance. These measures are crucial in upholding user trust and data security, ensuring responsible and ethical deployment of the proposed model.
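Two of the privacy measures mentioned above can be sketched with the Python standard library: replacing raw user IDs with keyed hashes and reducing the precision of location coordinates. The function names and the chosen precision are assumptions. Note that keyed hashing is pseudonymization rather than full anonymization, so it would complement, not replace, the other governance measures.

```python
import hashlib
import hmac


def pseudonymize_user_id(user_id: str, secret_key: bytes) -> str:
    """Replace a raw user ID with an HMAC-SHA256 token.

    The same ID always maps to the same token, so records can still be
    linked, but the raw ID cannot be recovered without the secret key.
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def coarsen_location(lat: float, lon: float, decimals: int = 2):
    """Round coordinates; 0.01 degrees of latitude is roughly 1.1 km."""
    return (round(lat, decimals), round(lon, decimals))


# Hypothetical key and inputs; a real key would come from a secrets store.
key = b"example-secret-key"
token_a = pseudonymize_user_id("user-001", key)
token_b = pseudonymize_user_id("user-002", key)
coarse = coarsen_location(30.65789, 104.06612)
```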
Moreover, the localized application of our approach within a specific geographical context may pose challenges related to the generalizability and adaptability of the model for diverse urban and rural settings. Variations in user travel behavior, environmental factors, and infrastructural disparities across regions call for comprehensive validation procedures to assess the model’s efficacy in different geographic and climatic conditions. Comprehensive studies across varied geographical regions and user demographics are necessary to ensure the model’s reliability and applicability across diverse settings.

Author Contributions

Conceptualization and validation, Y.Y.; formal analysis, writing—original draft and methodology, F.F.; data processing and revision, Y.L.; writing-review and editing, Y.X.; data curation, L.W.; formal analysis, H.Z.; software design, W.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Heavy Rain and Drought-Flood Disasters in Plateau and Basin Key Laboratory of Sichuan Province support, grant number “SCQXKJYJXZD202204”, “Research on solar radiation identification and prediction model based on artificial intelligence and image”, grant number “2022XXJ-5”, and the Shaanxi Meteorological Information Center’s list of scientific and technological projects.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the meteorological data being real station data and containing location information.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, N.; Fan, J. On the Importance of Meteorology in People’s Lives. Agric. Technol. Serv. 2016, 33, 147–187. [Google Scholar]
  2. Ma, L.; Fu, X. Theory and Application of Agricultural-Energy-Meteorology Coupling in Agricultural Energy Internet. China Electr. Power 2021, 54, 115–124. [Google Scholar]
  3. Cao, R. Design and Implementation of Meteorological Service Products for Travel to and from School Based on WeChat Mini Program. China New Commun. 2021, 23, 30–31. [Google Scholar]
  4. Shao, Y.; Qin, Z.; Li, X. Quality Control of Automatic Station Temperature Observations with High Temporal and Spatial Resolution Based on EOF. J. Atmos. Sci. 2022, 45, 603–615. [Google Scholar] [CrossRef]
  5. Du, T. Discussion on Meteorological Data Mining Based on Cloud Computing and Association Rules Mining Technology. Smart City 2016, 2, 34. [Google Scholar] [CrossRef]
  6. Dai, B.; Xu, Y.; Luo, R. Factors Influencing Information Overload of Social Media Users and Its Consequences. Mod. Inf. 2020, 40, 152–158. [Google Scholar]
  7. Chen, L.; Han, B.; Li, B.; Li, S.; Wang, Y.; Ding, X. Analysis of Agricultural Meteorological Disaster Service Demand in Heilongjiang Province. Disaster Sci. 2019, 34, 78–82. [Google Scholar]
  8. Zhang, X.; Han, H. Exploration of Personalized Service in Meteorological Research Institutions’ Information Room. Mod. Inf. 2010, 30, 109–110. [Google Scholar]
  9. Chen, X. Research on Improving Public Meteorological Service Capability. Shandong Soc. Sci. 2015, S2, 159–160. [Google Scholar] [CrossRef]
  10. Cui, X.; Guo, X.; Tang, J.; Xu, J.; Shen, M. Exploration and Practice of Personalized Agricultural Meteorological Service Based on SMS. Hubei Agric. Sci. 2012, 51, 4506–4509. [Google Scholar] [CrossRef]
  11. Zhuang, G. Better Service Guarantee for the Modernization of a Strong Country through the High-Quality Development of the Meteorological Industry. Study Times August 2021, 4, 1. [Google Scholar]
  12. Circular of the State Council of the People’s Republic of China on Printing and Distributing the Outline of High-Quality Development of Meteorology (2022–2035); Gazette of the State Council of the People’s Republic of China: Beijing, China, 2022; pp. 11–16.
  13. Keller, J.M.; Gray, M.R.; Givens, J.A. A fuzzy k-nearest neighbor algorithm. IEEE Trans. Syst. Man Cybern. 1985, SMC-15, 580–585. [Google Scholar] [CrossRef]
  14. Wen, W.; Wu, C.; Wang, Y.; Chen, Y.; Li, H. Learning structured sparsity in deep neural networks. In Proceedings of the Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; Volume 29. [Google Scholar]
  15. Cutler, A.; Cutler, D.R.; Stevens, J.R. Random forests. In Ensemble Machine Learning: Methods and Applications; Springer: Berlin/Heidelberg, Germany, 2012; pp. 157–175. [Google Scholar]
  16. Liu, C. Analysis and Forecast of Traffic State in Chengdu Based on Taxi GPS Data. Master’s Thesis, China University of Mining and Technology, Jiangsu, China, 2020. [Google Scholar]
  17. Chen, H. Research on Temporal and Spatial Laws of Residents’ Travel Based on Taxi GPS Data. Master’s Thesis, Yunnan University, Yunnan, China, 2016. [Google Scholar]
  18. Du, M. User Behavior Analysis and Demand Forecasting Based on Shared Bicycle Travel Data. Master’s Thesis, Chang’an University, Chang’an, China, 2021. [Google Scholar]
  19. Shi, W.; Gao, T.; Zeng, Y.; Liang, J.; Liu, C.; Cai, Y. Research on Travel Reminder Rules Based on Comprehensive Meteorological Analysis of the Whole Journey. Guangdong Meteorol. 2020, 42, 59–62. [Google Scholar]
  20. Li, H.; Cao, M.; Wen, F. Construction of Intelligent Meteorological Service Model Based on User Behavior Analysis of Shenzhen Weather Mobile Internet Channel. Prog. Meteorol. Sci. Technol. 2019, 9, 222–224. [Google Scholar]
  21. Li, B. Research on Destination Prediction Methods Based on Taxi Mobile Trajectories. Master’s Thesis, Nanjing University of Aeronautics and Astronautics, Nanjing, China, 2021. [Google Scholar]
  22. Abdollahi, M.; Khaleghi, T.; Yang, K. An integrated feature learning approach using deep learning for travel time prediction. Expert Syst. Appl. 2020, 139, 112864. [Google Scholar] [CrossRef]
  23. Zhang, Q.; Fu, F.; Tian, R. A deep learning and image-based model for air quality estimation. Sci. Total Environ. 2020, 724, 138178. [Google Scholar] [CrossRef] [PubMed]
  24. Reddy, D.R. Speech recognition by machine: A review. Proc. IEEE 1976, 64, 501–531. [Google Scholar] [CrossRef]
  25. Chowdhary, K.R.; Chowdhary, K.R. Natural language processing. In Fundamentals of Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2020; pp. 603–649. [Google Scholar]
  26. Zhao, W.; Liu, Y.; Yu, D. Intelligent Recommendation of Meteorological Services Based on Association Rules. Big Data 2018, 4, 72–85. [Google Scholar]
  27. Chen, T.; Guestrin, C. In Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  28. Cohen, I.; Huang, Y.; Chen, J.; Benesty, J. Pearson correlation coefficient. In Noise Reduction in Speech Processing; Springer: Berlin/Heidelberg, Germany, 2009; pp. 1–4. [Google Scholar]