The Power of Data: How Traffic Demand and Data Analytics Are Driving Network Evolution toward 6G Systems

Sabella, Dario; Micheli, Davide; Nardini, Giovanni

doi:10.3390/jsan12040049

by Dario Sabella ¹,*, Davide Micheli ² and Giovanni Nardini ³

¹ Intel Corporation Italia SpA, 20094 Milan, Italy
² Telecom Italia S.p.A., 00198 Roma, Italy
³ Department of Information Engineering, University of Pisa, 56122 Pisa, Italy
* Author to whom correspondence should be addressed.

J. Sens. Actuator Netw. 2023, 12(4), 49; https://doi.org/10.3390/jsan12040049

Submission received: 28 April 2023 / Revised: 14 June 2023 / Accepted: 20 June 2023 / Published: 27 June 2023

(This article belongs to the Special Issue Advancing towards 6G Networks)

Abstract: The evolution of communication systems always follows data traffic evolution and, in turn, influences innovations that unlock new markets and services. While 5G deployment is still ongoing in various countries, data-driven considerations (extracted from forecasts at the macroscopic level, detailed analysis of live network traffic patterns, and specific measures from terminals) can provide insights suitable for many purposes in view of future 6G systems, both B2B (e.g., operator planning and network management) and B2C (e.g., smarter applications and AI-aided services). Moreover, technology trends from standards and research projects (such as Hexa-X) are moving together with industry efforts on this evolution. This paper shows the importance of data-driven insights, first by exploring network evolution across the years from a data point of view, and then by using global traffic forecasts complemented by data traffic extractions from a live 5G operator network (statistical network counters and measures from terminals) to draw some considerations on the possible evolution toward 6G. Finally, it presents a concrete case study showing how data collected from the live network can be exploited to help the design of AI operations and feed QoS predictions.

Keywords: 5G; 6G; data traffic; multi-RAT; data analytics; AI

1. Introduction

Continuously growing traffic demand [1] confirms that society is moving towards a data-driven world, and it is also a natural driver for communication network evolution. This market demand translates into many technical requirements to be addressed by communication networks, which obliges the entire ecosystem (e.g., operators, technology providers, and service providers) to continuously introduce elements of innovation in network infrastructure and terminals. At the same time, it is worth noting that the usage of new networks will be mainly influenced by the market introduction of new terminals. This has been true since the era of the first smartphones, when the actual usage of higher-performance devices acted as a catalyst for the creation of new services, thus further enabling data consumption (which in turn drives a further cycle of network evolution).

In summary, as we have seen in past generations, when it comes to the evolution beyond 5G systems we may rely again on the typical innovation cycle (Figure 1), where the evolution of both networks and terminals is driven by traffic demand but also acts as a driver in turn for the same market evolution for increased data consumption. Quoting Brian Krzanich (former CEO of Intel), “Data is the new oil” [2]. With this scenario in mind, and according to the innovation cycle, the creation of new market opportunities (given by new services) will start from the creation of new network technologies and terminals.

Although, at the time of writing, nobody can yet say exactly what 6G will be, the transitions from previous generations allow some evolutionary considerations to be drawn from the current data traffic demand and from data forecasts present in the literature, complemented by data traffic extractions from a live 5G operator network. This paper thus explores network evolution across the years from a data demand point of view and discusses the possible evolution toward 6G by using data sources at various levels, from global traffic forecasts to data traffic extractions from a live 5G operator network, not only to design strategies that improve current operations but also to derive insights into network evolution. The rest of this paper is organized as follows: Section 2 discusses the data traffic demand and forecasts across the years, including data-driven insights from global forecasts, as well as considerations from mobile data usage and from live networks. Section 3 presents a few selected technology trends and standards supporting network evolution toward 6G. Section 4 shows how data collected from a live network can be exploited to help the design of AI operations, taking a Hexa-X activity as a case study. Finally, Section 5 concludes this paper.

2. Data Traffic and Impact on Networks

Since the advent of smartphones, we have witnessed progressive and exponential growth in global data traffic demand [1]. Across the years, forecasts of this demand were often revised upward, since reality most of the time surpassed reasonable or conservative predictions. For current networks too, it is worth taking a close look at global forecasts, as they can also give an idea of market maturity and help in deriving useful insights toward 6G systems.

2.1. Data-Driven Insights from Global Forecasts

According to [2], during the next two years most of the global mobile traffic (see Figure 2, showing the forecasts in millions of smartphones) will still be supported by the current 4G networks. From next year, however, 5G networks will start to take over: in 2024–2025 the number of 4G and 5G subscriptions will be comparable, and by 2027 5G will serve the majority of mobile subscriptions. Similar trends can also be seen when looking at traffic volumes. Moreover, the phase-out of older networks (e.g., 2G, 3G) will progressively lead to a move to newer radio access technologies (especially in growing markets), which in turn enables higher data consumption (also because the portfolio of terminals is renewed with smartphones having better capabilities). However, the global market is not homogeneous, and the various regions show different levels of maturity. In this perspective, the Asian market will continue to play a key role in the global landscape; for example, the number of smartphones in 2027 for North-East Asia (mainly China) is expected to be comparable with the sum of Europe and the Americas. Moreover, the switching point between 4G and 5G smartphone subscriptions will occur earlier there than in Europe and the Americas, as shown in Figure 2. This difference gives an idea of the maturity of 5G deployment in North-East Asia with respect to other macro-regions and should be taken into account for data-driven considerations (while the reader should not neglect the growing importance of the Indian market, which for both 4G and 5G networks is expected to reach considerable volumes compared to Europe or the USA). In fact, when it comes to future 6G systems, the role of Asian markets as “early adopters” of new technology can be exploited to anticipate some predictions in other regions (e.g., introduction of new devices, impact of new radio features on coverage, QoS, etc.).

Finally, the traffic consumed globally is not expected to be served only by cellular networks and the evolution of 5G. Especially in indoor environments, Wi-Fi connections are continuously improving in reliability and performance, and the majority of user traffic consumption is still related to indoor use cases and scenarios with limited mobility (e.g., home, office, shopping mall, etc.). Moreover, in mixed scenarios where both cellular and Wi-Fi networks are available, 5G/Wi-Fi convergence is an important aspect to consider, since 5G networks alone are not expected to cover all users’ needs, especially in indoor spaces where good cellular coverage is typically more difficult to achieve [4]. To cope with this need, standards already support 5G/Wi-Fi convergence at various levels, even if the maturity of technical solutions and the related terminal support are not yet at the level of massive adoption. Similarly, there is growing interest from industry (e.g., automotive use cases [5]) in complementary coverage from satellite links in order to achieve seamless service continuity. In summary, the expectation here is that network evolution (see Section 3) will likely move toward multi-access convergence, as a natural consequence of the continuous growth of traffic demand.

2.2. Considerations from Mobile Data Usage

From a data consumption point of view, the average monthly usage per smartphone globally is expected to surpass 15 GB in 2022 [2], and is forecast to reach 40 GB per smartphone per month by the end of 2027. According to [6], mobile data traffic will more than triple in most regions over the next six years, driven by increasing smartphone adoption and video usage (in particular, their forecast for the data traffic per smartphone is a move from 11.4 GB/month in 2021 to 41.0 GB/month in 2027). This should not surprise the reader, as in past years video traffic has driven the global traffic increase, and today a large share of data volume consumption (69% globally) is still due to video traffic (a share that is forecast to increase to 79% in 2027). Additional services are likely to appear in future systems, and this trend is confirmed by a clear shift in the inclinations of consumers, who with the advent of 5G are increasingly interested in adding non-connectivity offerings to their mobile subscriptions [6]. In particular, users’ interest in mobile gaming is increasing from 23% to 36%, while interest in or usage of cloud services in 5G is reaching 55%, and there is increasing enthusiasm for wearable devices. According to Ericsson, service providers expect additional traffic growth with the advent of new video services, such as high-definition video and XR services. In this perspective, new services will also likely be coupled with a progressive introduction of new devices (e.g., AR/VR glasses, gloves, displays) for gaming, entertainment, remote collaboration, healthcare, etc. New terminals enabling use cases related to the metaverse are also to be expected, creating an unfolding (r)evolution of the internet experience (see also Section 3). Another important aspect for further elaborating these forecasts is the speed of introduction of 5G terminals (period 2018–2027), compared to what happened with the introduction of 4G (2009–2018). According to [2], 5G subscription uptake is faster than 4G, and it is expected to reach one billion subscriptions 4 years after the first deployment (instead of the 6 years needed in the past for 4G). Analysis of measurements from OpenSignal [7,8] reveals that across all smartphone makers, users with 5G models enjoy faster overall download speeds, typically 1.5–3 times faster than owners of 4G smartphone models from the same brand. In all six markets analysed, all smartphone brands saw a significant jump in the overall download speeds experienced by users with 5G models, compared to owners of 4G models. Current 5G networks mostly use 3GPP Rel.15, but new standards are coming onto the scene. More wireless spectrum will become available, which should boost speeds considerably, even in markets that already offer 5G. Responsiveness will improve with updated 5G technology, e.g., Rel.17. Moreover, future networks are expected to support more devices simultaneously [9].
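To make the per-smartphone forecast quoted from [6] easier to compare with other growth figures, the following minimal sketch converts the two endpoints (11.4 GB/month in 2021 and 41.0 GB/month in 2027) into an overall growth factor and an implied compound annual growth rate. The constant-rate assumption and the function name are ours, introduced only for illustration; the source forecast does not state that growth is uniform across the years.

```python
def compound_annual_growth_rate(start_value: float, end_value: float, years: int) -> float:
    """Return the constant yearly growth rate that links start_value to end_value."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Endpoints quoted in the text from [6]: GB per smartphone per month.
usage_2021_gb = 11.4
usage_2027_gb = 41.0

growth_factor = usage_2027_gb / usage_2021_gb
cagr = compound_annual_growth_rate(usage_2021_gb, usage_2027_gb, years=6)

print(f"Growth factor 2021-2027: {growth_factor:.2f}x")
print(f"Implied compound annual growth rate: {cagr:.1%}")
```

Under this assumption the forecast corresponds to roughly a 3.6-fold increase, i.e., a little under 24% growth per year.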

2.3. Data Analysis from Operator Networks

If on the one hand global forecasts can provide macroscopic insights for network evolution, on the other hand diving into network counters and measures from terminals can offer valuable insights into the specific behaviours of the RAN (Radio Access Network), as well as more general insights for planning and network management purposes. In particular, statistical network counters from base stations (BSs) offer a view on how traffic demand matches the offered network capacity, along with all the performance information that is essential for network planning. The gathered data is averaged within the cell, so counters do not offer georeferenced inputs per UE (User Equipment). However, time series can show the daily evolution of many kinds of KPIs, e.g., average cell load (DL/UL), number of users, average user throughput, traffic volume, radio resource usage, etc.

As an example of network counters gathered from a live TIM network, Figure 3 shows the daily profile (averaged across multiple cells in a cluster related to the city of Parma) of the throughput served for active users. The DL and UL curves show that in current networks the DL/UL ratio is around 1.5, and monitoring it over time (even years) can give a sense of the evolution of traffic demand. As a side note, the reader can notice that measures here start from the morning (8:00 a.m.), when the counter extraction was planned. Of course, the entire day could be captured, but here the intention was to show that dataset completeness is subject to many factors. In other cases, a reset of BSs (or any temporary electricity outage) can cause partial loss of data, and any post-processing then needs to take this phenomenon into account when analysing the data. This is indeed the typical difference between theory and practical cases, where databases need to be carefully managed to avoid biased or inaccurate results.
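The kind of post-processing discussed above can be sketched as follows: a cluster-level daily profile is obtained by averaging each hour across cells while skipping missing counter samples, and the DL/UL ratio is computed only over hours where both directions are available. The input layout (one list of hourly samples per cell, with `None` marking a missing sample), the function names, and the numeric values are hypothetical and purely illustrative; they do not reproduce the TIM dataset or the authors' tooling.

```python
from statistics import mean
from typing import List, Optional

# Assumed layout: one list of hourly throughput samples (Mbit/s) per cell,
# with None where the counter is missing (extraction not started, BS reset, outage).
Series = List[Optional[float]]


def cluster_profile(cells: List[Series], min_cells: int = 1) -> Series:
    """Average each hour across cells, ignoring missing samples.

    An hour is reported as None when fewer than min_cells cells have data,
    so that gaps stay visible instead of being silently treated as zero.
    """
    profile: Series = []
    for samples in zip(*cells):
        valid = [s for s in samples if s is not None]
        profile.append(mean(valid) if len(valid) >= min_cells else None)
    return profile


def dl_ul_ratio(dl: Series, ul: Series) -> Optional[float]:
    """Mean DL/UL ratio over the hours where both directions are available."""
    ratios = [d / u for d, u in zip(dl, ul)
              if d is not None and u is not None and u > 0]
    return mean(ratios) if ratios else None


# Illustrative values only (not measured data): three hours, two cells.
dl_cells = [[None, 30.0, 36.0], [None, 24.0, None]]
ul_cells = [[None, 20.0, 24.0], [None, 16.0, None]]

dl_profile = cluster_profile(dl_cells)
ul_profile = cluster_profile(ul_cells)
ratio = dl_ul_ratio(dl_profile, ul_profile)
print(dl_profile, ul_profile, ratio)
```

Keeping missing hours as `None` rather than zero is a deliberate choice here: a zero would bias the averaged profile downward, which is exactly the kind of inaccurate result the text warns about.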

Another example of elaboration from network counters is provided by Figure 4, showing the radio frame usage, and in particular the statistical distribution functions of used PRBs (Physical Resource Blocks) in the cell (again, by considering the average for a cluster of cells). Here, the level of saturation can be monitored by separating DL and UL, and by identifying some target cells (e.g., in highly demanding areas, or in certain periods of the year, e.g., during concerts or crowded events).
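As a rough illustration of how such a distribution function and a saturation indicator could be derived from PRB usage counters, consider the sketch below. The sample values, the 80% threshold, and the function names are assumptions made for this example only; they are not taken from Figure 4.

```python
from typing import List, Tuple


def empirical_cdf(samples: List[float]) -> List[Tuple[float, float]]:
    """Return (value, P[X <= value]) pairs for each distinct sample value."""
    ordered = sorted(samples)
    n = len(ordered)
    cdf = {}
    for rank, value in enumerate(ordered, start=1):
        # Later duplicates overwrite earlier ones, leaving P[X <= value].
        cdf[value] = rank / n
    return list(cdf.items())


def fraction_above(samples: List[float], threshold: float) -> float:
    """Share of samples strictly above the threshold (a simple saturation indicator)."""
    return sum(1 for s in samples if s > threshold) / len(samples)


# Illustrative DL PRB usage samples for one cell, in percent of available PRBs.
dl_prb_usage = [20.0, 35.0, 35.0, 60.0, 85.0, 90.0, 95.0, 40.0]

dl_cdf = empirical_cdf(dl_prb_usage)
dl_saturation = fraction_above(dl_prb_usage, threshold=80.0)
print(dl_cdf)
print(f"Share of samples above 80% PRB usage: {dl_saturation:.1%}")
```

The same two functions can be applied separately to UL samples, or to a subset of target cells, to compare saturation levels.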

A further element that can be captured (by properly filtering network counters, based on the percentage of urbanized environment) is the difference between urban and suburban areas. Taking proper actions based on this information is also critical for the optimization of a seamless user experience. Finally, a differentiation among the various days of the monitored period (e.g., working week versus weekend) can offer more insights related to average customer habits (always by preserving user privacy, as indeed data here is averaged on the whole cell). Figure 5 shows in fact how the daily profile of the cell DL traffic volume can be significantly different on Saturday (with respect to Thursday and Friday of the same week). This example also shows that data processing is a complex task because, depending on the desired outputs, an averaging of bigger time periods (e.g., weeks or months) should also consider the differences among various time series.

In addition to statistical network counters gathered at the BS level, it is worth mentioning the opportunity to collect even more refined data at the UE level, thanks to MDT (Minimization of Drive Tests), a feature introduced in Rel.10 to allow the collection and reporting of georeferenced measures from UEs. There are two MDT collection methods: Immediate MDT and Logged MDT. The first refers to measures performed by the UE in connected mode, while the latter occurs in idle mode. With Immediate MDT, the UE reports the measures to the network while still connected, whereas with Logged MDT the measures are stored and reported at a later point in time, typically at the first UE connection to the network [9]. MDT is built on three main features: (1) periodic reporting of the GPS location of the UE, if the GPS receiver is enabled and the UE supports GPS reporting over L3 (RRC Measurement Report); (2) periodic reporting of legacy/ordinary L3 and L2 measurements at the UE and eNB, already used for signalling and radio resource management; (3) an MDT data collector and Big Data platform for processing and analysis [10]. In the context of the present paper, we analysed MDT samples from the live TIM network, and we observed that currently about 10% to 20% of UEs in the network report MDT radio measures with GPS position coordinates. This is partly because some UEs are indoors and therefore not always visible to GPS satellites. In other cases, some UE brands report radio measures without the associated GPS position. The reader can notice how these aspects clearly mark the difference between theory (where datasets are full of samples, without any holes) and real-life cases (where datasets can miss some parts, or even contain corrupted or non-comparable samples). 
These cases often require post-processing of MDT samples, in order to retain a sufficient number of samples for reliable statistics and related data-driven considerations. More details on MDT technology and standards can be found in [11]. In the context of this paper, we simply show some exemplary extractions of MDT data from the TIM networks of two medium-sized Italian cities: Parma and Piacenza (the data retrieved from the TIM network refers to 20 September 2022). In Figure 6, the MDT sample densities are shown for Parma for both 4G and 5G traffic (where the legend shows different colours for various levels of traffic density). The figure shows how these MDT samples (taken with a resolution of one square meter) are distributed over the territory during a certain observation period. This distribution of the MDT sample densities thus gives an immediate perception of the 5G market penetration and related terminal density, compared to 4G traffic density (note that both figures for 4G and 5G use the same scale of percentage values in the legend, in terms of the various samples per square meter and related colours).
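A density map of this kind can be approximated by binning georeferenced samples into pixels and counting them per radio access technology, while discarding (and counting) the samples that arrive without a position. The sketch below assumes that positions have already been projected to metres in a local frame; the record layout, field names, and values are hypothetical and do not reflect the actual MDT report format or the processing chain used for Figure 6.

```python
from collections import Counter
from typing import List, NamedTuple, Optional, Tuple


class MdtSample(NamedTuple):
    x_m: Optional[float]  # easting in metres, local projected frame (assumption)
    y_m: Optional[float]  # northing in metres, local projected frame (assumption)
    rat: str              # radio access technology label, e.g. "4G" or "5G"


def density_grid(samples: List[MdtSample], pixel_m: float = 1.0) -> Tuple[Counter, int]:
    """Count georeferenced samples per (rat, pixel); also return how many were dropped."""
    counts: Counter = Counter()
    dropped = 0
    for s in samples:
        if s.x_m is None or s.y_m is None:
            dropped += 1  # report without GPS position (e.g., indoor UE)
            continue
        pixel = (int(s.x_m // pixel_m), int(s.y_m // pixel_m))
        counts[(s.rat, pixel)] += 1
    return counts, dropped


# Illustrative samples only (not real MDT data).
samples = [
    MdtSample(0.2, 0.7, "4G"),
    MdtSample(0.9, 0.1, "4G"),
    MdtSample(1.5, 0.3, "5G"),
    MdtSample(None, None, "5G"),
]

counts, dropped = density_grid(samples, pixel_m=1.0)
print(dict(counts), "dropped:", dropped)
```

Tracking the number of dropped samples alongside the grid makes it possible to judge whether enough georeferenced reports remain for reliable statistics.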

Similar charts can be drawn, e.g., showing the levels of RSRP (Reference Signal Received Power) in an area of Parma with critical cellular quality, the UL SINR (Signal to Interference plus Noise Ratio) of the same area, or the best serving cell in the various pixels of the considered area. MDT is often used in radio planning and radio network optimization. In fact, as clearly displayed in charts (c) and (d) of Figure 6, there is a clear correlation between a low radio coverage level in a specific area of the city and the uplink SINR of the same area. This shows how MDT can be used in automatic tools to analyse critical areas of coverage and/or quality of service. Figure 6 is only a simple example of analysis, and much more can be achieved by taking into account other available MDT reports.

Here, it is worth noting that MDT measures coming from terminals can be used for many purposes. First of all, they are quite important for operators, e.g., in most RAN optimization scenarios and generally in planning operations. In fact, theoretical radio coverage predictions and numerical simulations can be compared and validated with MDT measures, in order to tune theoretical electromagnetic propagation models. Recent research reports the use of MDT in the analysis of interference scenarios [9], and of the influence of humidity and maritime propagation on signal and network coverage [10,11]. Other uses of MDT reports include the analysis of solar radiation and its impact on RAN interference [12], as well as hotspot detection and the identification of network capacity upgrades [13]. The latter can be obtained via the detection of local peaks in traffic (density of MDT reports) in space and time over a given search area, via differentiation of traffic type (indoor, outdoor, mobility), or through the analysis of network performance during peak hours (serving cells, KPIs within the hotspot). This analysis can target possible actions for MNOs (Mobile Network Operators), e.g., accurate small-cell deployment plans (a list of best candidates to ensure high ROI), or the identification of high-traffic areas requiring capacity upgrades. Moreover, MDT reports can be used for other purposes not necessarily related to network planning, e.g., to infer customer behaviour from measures, such as users’ feelings during concerts [14]. MDT is also applied to the analysis of road and pedestrian mobility (these studies are related to smart city and road/vehicle traffic modelling [15]).
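The hotspot detection idea mentioned above (local peaks in the density of MDT reports over a search area) can be illustrated with a deliberately simplified, space-only sketch: a pixel is flagged when its report count reaches a minimum level and no neighbouring pixel has a higher count. The grid representation, the threshold, and the 8-neighbour rule are assumptions chosen for illustration; they are not the method of [13], and the time dimension is omitted.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]


def local_peaks(density: Dict[Pixel, int], min_count: int) -> List[Pixel]:
    """Return pixels with count >= min_count that no 8-neighbour exceeds."""
    peaks = []
    for (px, py), count in density.items():
        if count < min_count:
            continue
        neighbours = [
            density.get((px + dx, py + dy), 0)  # pixels without reports count as 0
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        ]
        if all(count >= n for n in neighbours):
            peaks.append((px, py))
    return sorted(peaks)


# Illustrative report counts per pixel (not real MDT data).
density = {(0, 0): 2, (1, 0): 9, (2, 0): 3, (5, 5): 7, (5, 6): 1, (9, 9): 1}

hotspots = local_peaks(density, min_count=5)
print(hotspots)
```

In practice the resulting candidate list would still need to be cross-checked against serving-cell KPIs during peak hours before deriving any deployment decision.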

In summary, measures from MDT-capable devices (jointly with global/country-level traffic trends and network counters) can offer opportunities for data-driven optimizations and network traffic insights. However, even if in most cases MDT accuracy is satisfactory for the use cases described above, it is important to note that the variability of RSRP measures sometimes makes it difficult to build accurate models able to predict future signal values [16]. In such big-data contexts, we expect MDT performance to advance in the future, with AI playing a role in extracting insights from measures in order to learn and make QoS predictions [17,18]. MDT data is now also improving the analysis of radio interference, including interference originating from outer space. For example, a recent paper reports how MDT data allows a direct evaluation of radio interference phenomena on cellular phone networks caused by solar flares [19].

Moving forward toward 6G systems, we may expect to see the greatest impact of those MDT-enabled data analytics in the domain of SON (Self-Organizing Networks). In future systems, georeferenced data from terminals will permit us to better follow the various traffic sources over time, and thus to design more advanced network optimizations and better network planning, i.e., based not only on the service type, but also on the usage of terminals and the specific position of the users. In summary, we envision an evolution toward systems leveraging more advanced datasets (e.g., georeferenced and including mobility patterns) that will permit us to recognize specific events and adapt network performance accordingly (e.g., in a geo-localized way).

3. Status and Technology Evolution of Mobile Communication Systems

As already said, network evolution is naturally driven by traffic demand. Yet the opposite is also true: according to the typical innovation cycle, new network technologies enable the usage of new terminals and devices, which in turn introduce new services, and so on, completing the cycle. Ultimately, to obtain useful insights from data analysis (as per Section 2), we also need to look at the current status of standards, as this gives a sense of the maturity of the various technologies that are or will be introduced in the market.

3.1. Standardization Trends toward 6G: Devices Evolution

Standardization efforts related to mobile networks involve many 3GPP groups; in addition, ETSI groups are working on some network-related aspects (e.g., NFV, MEC, ZSM). It is therefore not practical to mention all standards potentially impacting 6G. Nonetheless, the aim of this paper is to analyse how the introduction of new terminals can unlock new services and thus stimulate the evolution of traffic demand (as per the typical innovation cycle shown in Section 1). In this context, it is worth mentioning a few technology components as meaningful (although not exhaustive) examples of features that are likely to have an impact on new devices (and thus on data usage/patterns). Some examples:

Low-end devices: in 2024, the first reduced capability (RedCap) devices should be available, introducing relaxed requirements on the receiver in the device, allowing lower costs compared to standard NR. RedCap devices can facilitate the expansion of the NR device ecosystem to cater to the use cases that are not currently best served by NR specifications. This includes wearables, industrial wireless sensors, and video surveillance. The introduction of RedCap devices can further stimulate the market in all sectors of Internet-of-Things, influencing data usage/patterns.
Multi-access integration: it is evident that 5G access networks alone cannot cope with global traffic demand growth in some indoor scenarios (with better Wi-Fi coverage) or outdoor scenarios (where satellites can complement mobile networks). A convergence among different accesses is therefore expected (supported by traffic combination at higher levels, e.g., via MTS APIs [20]), where multi-access devices will experience better and ubiquitous performance. This usage of multiple accesses can also be beneficial at the network level in terms of energy efficiency.
Location and positioning technologies: many advanced use cases (e.g., from 5GAA [21], on coordinated manoeuvres for connected and automated vehicles) will require more stringent precision in order to provide location-based services. Furthermore, the evolution of MDT-enabled devices is moving in this direction, where an improved set of measures from UEs (while preserving user privacy) can offer more opportunities to improve the perceived user experience.
High-end devices (e.g., for AR/VR, metaverse): 3GPP [22] is working on 5G-Advanced standards for metaverse applications by selecting use cases (UCs) and capabilities based on the specification for “Tactile and Multi-modality Services” (see also the LF Edge Akraino Technical Summit Fall 2022). Examples of these UCs: (1) Localized Mobile Metaverse Service UCs, (2) Mobile Metaverse for 5G-enabled Traffic Flow Simulation and Situational Awareness, (3) Collaborative and Concurrent Engineering in Product Design using Metaverse Services. While it is still not clear what the metaverse will be, it is expected that high-end devices will push requirements even higher in future systems.
Edge computing: thanks to better latency for the proximity to the end user, edge computing is commonly considered as a key technology for the evolution of communication systems. Gartner predicts that by 2025, three-quarters of enterprise-generated data will be created and processed at the edge, outside centralized datacentres or clouds. New services at the edge will also provide new revenue opportunities for operators and service providers. This paradigm shift will change radically the way data is processed and consumed, with increased presence of applications at the edge; this process will further transform data traffic patterns, by influencing in turn the evolution of networks and devices. In particular, from a device perspective, the availability of edge servers will create huge opportunities to design edge native applications which can better exploit network capabilities in a low-latency environment, thus enabling new and innovative services at the terminal side. It is also worth noting that one key use case in ETSI MEC [23] is also about application computation offloading, where the MEC host executes compute-intensive functionalities with high performance instead of mobile devices. This use case can help especially for computation-hungry applications such as graphical rendering (high-speed browser, AR/VR, 3D games, etc.), intermediate data-processing (sensor data cleansing, video analysing, etc.), and value-added services (translation, log analytics, etc.).
In-RAN computing: it is expected that 6G will be the first generation to shift from a communication-centric to a communication-, computing- and data-centric system [24], with tight coupling between communication and computing. According to estimations, the growth of data will far outpace the growth of network capacity. Current systems would not be able to transport all the data to datacentres for processing: even with sufficient communication capacity, the cost of transporting data is high. For example, according to [24], with an estimated energy consumption of 10 nJ/bit for transporting data over 500 km, 22 trillion kWh of energy would be needed to transport 1 million zettabytes of data. Computing close to data sources is therefore a way to cater to the exponential growth of data and reduce the energy costs of its transport.
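The transport-energy figure quoted from [24] in the In-RAN computing item can be checked with a few lines of arithmetic. The sketch below assumes decimal zettabytes (10^21 bytes) and 8 bits per byte, which are our assumptions rather than statements from the source; the function and constant names are illustrative.

```python
JOULES_PER_KWH = 3.6e6        # 1 kWh = 3.6 MJ
BITS_PER_BYTE = 8             # assumption: 8-bit bytes
BYTES_PER_ZETTABYTE = 1e21    # assumption: decimal (SI) zettabyte


def transport_energy_kwh(data_zettabytes: float, energy_per_bit_joules: float) -> float:
    """Energy in kWh needed to transport the given data volume at a fixed energy per bit."""
    bits = data_zettabytes * BYTES_PER_ZETTABYTE * BITS_PER_BYTE
    return bits * energy_per_bit_joules / JOULES_PER_KWH


# Figures quoted in the text: 10 nJ/bit over 500 km, 1 million zettabytes of data.
energy_kwh = transport_energy_kwh(data_zettabytes=1e6, energy_per_bit_joules=10e-9)
print(f"Transport energy: {energy_kwh:.2e} kWh")
```

Under these assumptions the result is about 2.2 × 10^13 kWh, consistent with the 22 trillion kWh stated above.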

3.2. Standardization Trends toward 6G: Network Infrastructure Evolution

Referring again to Figure 1 (showing the typical innovation cycle and the key driving role of data), we recall that the improvement of network infrastructure and related operations can be directly influenced by data. In particular, AI/ML algorithms (properly fed by data and measurements) can enable RAN intelligence, i.e., a set of RAN features that help optimize how the RAN operates, in order to maximize certain KPIs and performance parameters. From a standardization point of view, we may emphasize the following activities related to network infrastructure evolution, which are clearly driven and influenced by the presence of data:

RAN intelligence: 3GPP is studying in TR 37.817 how these RAN intelligence features in Release 17 can be enabled by AI/ML, including a functional framework (where the model training function may reside in OAM or RAN nodes, and the model inference function resides in RAN nodes) and a set of input/output parameters for AI/ML-enabled RAN optimization functions (Network Energy Saving, Load Balancing, Mobility Optimization).
O-RAN: in the context of the Radio Access Network, it is also worth mentioning the relevant work performed by the O-RAN Alliance in proposing a new architecture called Open-RAN (O-RAN), which consists of splitting the RAN into various parts based on functionality. This functional split of the RAN not only permits more cost-efficient and flexible networks, but also creates a chance for small vendors and operators to start their own services and to increase their market revenue, compared to the current situation, where RAN vendors typically offer a complete (and often still monolithic) solution to mobile operators. In the O-RAN architecture [25], the various RAN components are disaggregated, with the Distributed Unit (DU) and the multi-RAT Central Unit (CU) separated and running on an NFVI platform. This distribution of RAN functionalities (that in the traditional RAN architecture were aggregated into a single node) is intended to increase network reliability by avoiding a single point of failure. Finally, the O-RAN architecture defined by the O-RAN Alliance permits the traditional RAN functions to be enhanced with AI via the RAN Intelligent Controller (RIC) platform, which implements RAN monitoring and control techniques in the form of rApps and xApps, for the “Non-RT (Real-Time) RIC” and the “Near-RT RIC”, respectively. Suitable uses for the RIC are mainly focused on AI-enabled RAN optimization, but can also include advanced functionalities, such as the integration of the RIC with MEC to enable cross-layer application design (e.g., network- and QoS-aware adaptive MEC applications), or xApps for flexible ML-based spectrum sensing, e.g., to enable cognitive radio concepts [26].
Distributed/Federated Learning over 5G System (5GS): while the Rel-17 5GS plans to support AI/ML training and inference within the 5GC via NWDAF for network automation purposes, for Release 18 it is notable that 3GPP is also working to provide intelligent transmission support for application AI/ML-based services, e.g., AI/ML model distribution, transfer, and training for various applications such as video/speech recognition, robot control, automotive, etc. A relevant example of AI/ML operations is Distributed/Federated Learning over the 5G system (see also 3GPP TR 22.874). Federated Learning (FL) is an increasingly widely used approach for training computer vision and image recognition models. In Federated Learning mode, the cloud server (also called an “aggregator”) trains a global model by aggregating local models partially trained by each of the end devices (“collaborators”), based on an iterative model averaging method.
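The iterative model averaging described in the Federated Learning item can be sketched with a toy example. Here each collaborator performs one gradient step of a least-squares linear model on its own data, and the aggregator averages the resulting local models weighted by the number of local samples, an approach commonly known as federated averaging (FedAvg). The linear model, learning rate, data, and function names are illustrative assumptions; the sketch does not reproduce the procedure of 3GPP TR 22.874 or any specific framework, and no 5G transport is modelled.

```python
from typing import List, Tuple

Dataset = List[Tuple[List[float], float]]  # (features, target) pairs


def local_training(weights: List[float], data: Dataset, lr: float, steps: int = 1) -> List[float]:
    """Collaborator side: a few gradient steps of mean-squared-error linear regression."""
    w = list(weights)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for i, xi in enumerate(x):
                grad[i] += 2.0 * err * xi / len(data)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def federated_average(local_models: List[List[float]], num_samples: List[int]) -> List[float]:
    """Aggregator side: average local models, weighted by local dataset size."""
    total = sum(num_samples)
    dim = len(local_models[0])
    return [
        sum(model[i] * n for model, n in zip(local_models, num_samples)) / total
        for i in range(dim)
    ]


# Toy data following y = 2 * x, split across two collaborators (illustrative only).
collaborators: List[Dataset] = [
    [([1.0], 2.0), ([2.0], 4.0)],
    [([3.0], 6.0)],
]

global_model = [0.0]
for _ in range(20):  # communication rounds
    local_models = [local_training(global_model, d, lr=0.05) for d in collaborators]
    global_model = federated_average(local_models, [len(d) for d in collaborators])

print(global_model)
```

Only model parameters are exchanged between collaborators and aggregator in this scheme; the raw local data never leaves the devices, which is the main motivation for the approach.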

In summary, when it comes to the evolution beyond 5G systems, we may rely on the typical innovation cycle: the evolution of both networks and terminals is driven by traffic demand and, in turn, drives further market evolution through increased data consumption. This holds not only in the domain of standardization, but also for the research community, whose data-driven innovations are described in the following section.

3.3. Research Trends: Data-Driven Innovations

Several works in the literature have collected data from operational mobile networks in order to design more effective and more efficient mobile networks.

Data-driven approaches have been recognized as an effective way to improve network performance, as shown in [27], which presents a comprehensive survey of research papers that use Machine Learning techniques to optimize 5G networks. The survey highlights multiple use cases where data produced by both the users of the network and the network itself may be used to feed proactive optimization techniques within certain contexts, e.g., traffic prediction, load balancing, interference coordination, and edge caching. In [28], the authors identified the need for real-world data traces to enable better capacity planning, and presented a large-scale dataset combining data on users’ traffic demand, infrastructure deployment, and population. In [29], hourly traffic data collected by a large number of base stations in different areas of Milan (Italy) were used to design energy-saving techniques based on sleep modes. Likewise, [30] uses data traffic demands obtained from probes in the mobile network infrastructure serving a large urban area to predict the capacity of network slices using a deep neural network. In [31], MDT data are proposed to train a neural network that automatically detects issues in radio interface quality, in the context of the self-healing functionalities of a mobile network.

Moreover, there exist national and international projects worldwide whose aim is to design the future generation of mobile networks following a data-driven approach. For example, the EU-funded DAWN4IoE project [32] aimed at exploiting new big data analytics techniques to optimize mobile network planning and functions, such as heterogeneous cell planning, radio resources management, and data caching. The CANCAN project [33] funded by the French government follows a data-driven methodology to design orchestration policies within the mobile networks. The project lays its basis on data collection from an operational French network, in order to tailor existing analytics to the specific needs of network resource management.

3.4. The Power of Data: Overview of Key Enablers for Data-Driven Insights

As we have seen, a number of key enablers can be exploited by leveraging data extracted from various sources and combined in different ways to derive AI-powered or human-derived data-driven insights. Table 1 summarizes the main enablers available in current standards, which can also be leveraged in future systems, where data analytics will certainly continue to play a key role. The table shows what data can be collected from terminals and mobile networks, how they can be used to improve performance, and evidence of such improvement in existing works.

4. Exploiting Data toward 6G: The Fed-XAI Case Study

Apart from the standardization efforts discussed in Section 3, many communities (e.g., 6G projects) are influencing the innovation cycle with their findings. When it comes to 6G, there are many active discussions and technology development efforts around the globe, such as the ITU-R WP5D work on future technology trends and vision for 2030, the North American Next G Alliance (NGA), the European Union-funded 6G flagship project Hexa-X [36], and the Chinese IMT-2030 PG, to name a few. Among the many areas investigated by the Hexa-X project for future 6G systems, the activity called Fed-XAI is particularly relevant to this paper. It stems from the collaboration between the University of Pisa, Intel, and TIM, and proposes the integration of Federated Learning (FL) with eXplainable Artificial Intelligence (XAI) models, with the objective of improving the user experience by helping end-users trust the decisions made by in-network AI. In the following, we refer to the integration of these two approaches as Fed-XAI.

In this section, we first introduce the concepts of FL and XAI separately, then we discuss how their integration has been envisioned in the Hexa-X project, strongly supported by data collected from TIM’s mobile network. Finally, we present in detail a practical implementation of the Fed-XAI innovation exploiting a real-time, emulated network testbed, in which the availability of live-RAN data has been key to the development and testing of Fed-XAI in a realistic application scenario. This provides tangible proof of how data drives innovation toward 6G network development.

4.1. Federated Learning of eXplainable Artificial Intelligence Models: The Hexa-X Experience

Fed-XAI blends FL and XAI together by allowing the collaborative learning of transparent models, making it a promising approach to build a data privacy-preserving and trusted AI ecosystem, towards the tight integration between the digital and the human worlds.

On the one hand, FL is a learning paradigm that allows multiple parties (i.e., data owners) to collaboratively learn an AI model without any disclosure of private raw data. This is accomplished by training AI models locally using the private data of the user and sharing the obtained (local) models, rather than users’ data, with a central entity. The latter aggregates the received local AI models in order to produce a global one. In this way, the privacy of the users’ data is preserved, while the accuracy of the global AI model can still take advantage of multiple experiences. On the other hand, XAI focuses on producing AI models that can provide useful and easy-to-understand details about their functioning, as opposed to the so-called opaque AI models. Explainability can be obtained either through inherently transparent models, such as rule-based systems or decision trees, or by applying post-hoc explainability techniques to “black-box” models, such as deep neural networks. The aim of XAI is to improve trust in the results produced by AI techniques. Integrating FL and XAI, i.e., enabling the construction of inherently explainable AI models learnt in a federated fashion, makes it possible to leverage the benefits of the two approaches simultaneously, namely privacy and trustworthiness.

The Fed-XAI innovation proposed in the Hexa-X project has been clearly driven by data. In fact, the proposed Fed-XAI approach has been demonstrated by implementing a prototype [17] composed of real devices/applications and a mobile network emulated with the Simu5G simulator [37]. One of the cornerstones of this prototype is the use of realistic sources of live data from an MNO network. In particular, the network scenario implemented within Simu5G takes as input data from TIM’s live network, such as base station positions and user data volume, extrapolated to produce predictions using AI-based algorithms. The use of real data and live measurements from the MNO network is critical for the reliability of the produced output. In that perspective, the MDT functionality is also applied on TIM’s RAN to acquire geolocated data from the live RAN. The advantage of the data-driven approach proposed by Fed-XAI is twofold: first, it preserves privacy by leveraging FL during the collaborative training of AI models, which is especially suitable in heterogeneous B5G/6G scenarios; second, it ensures an adequate degree of explainability of the models themselves (including the aggregated model obtained as a result of FL), with benefits for industrial customers in terms of high dependability, and for end-users in terms of trustworthiness.

4.2. Combining Live Network Data with Network Simulations to Support Fed-XAI Operations

In this section, we show how data collected from a live network has been key to the realization of the Fed-XAI prototype. In the latter, tele-operated driving (ToD) is considered as a use case: a vehicle streams a real-time video to a remote host, which in turn plays the video and allows a (human or machine) operator to drive the vehicle remotely. Clearly, this is only possible if the quality of the video stream is good enough to allow a smooth driving experience. Thus, AI models can be used to predict the future quality of the video-stream, based on a few QoS metrics detected during the streaming of the video itself. The Fed-XAI prototype is composed of two main phases: an offline training phase, the aim of which is to produce an XAI model using FL; and an online inference phase, the aim of which is to exploit the above model to make real-time predictions of the video quality.

In the offline training phase, a dataset including QoS metrics obtained from a mobile network is required. Thus, we exploited the Simu5G system-level simulator [37] to produce a meaningful dataset that includes a large set of QoS data produced by several video-streaming sessions. We relied on Simu5G because it allows us to generate a wide range of statistics from large-scale, custom network scenarios, and to obtain the required amount of data by running large simulation campaigns that explore a wide range of radio conditions. This makes the dataset a meaningful basis for learning AI models. However, such a dataset can represent a real network scenario only if the network simulated within Simu5G is configured according to a real network configuration, i.e., a real network topology and real traffic conditions. Data extracted from TIM’s live network is therefore exploited to make the scenario more realistic and, in turn, to produce more meaningful datasets. In particular, the positions of the BSs in the simulation are set according to their actual positions in the city of Turin. Moreover, the actual data volume handled by those BSs was used to configure the background traffic in the simulation, i.e., to produce realistic cell workloads. In more detail, we used data-volume values provided by cell-wise network counters from TIM’s network, which provide metrics averaged over a time span of 15 min. Three days of such values were extracted, resulting in 288 values for each BS (3 days × 96 intervals per day). This guided the configuration of our simulation campaign to generate the dataset: we configured 288 instances, each 15 min long, during which the data volume served by each BS (i.e., its workload) corresponds to that provided by the network counters.
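
The mapping from network counters to simulation instances can be sketched as follows. This is a minimal illustration under stated assumptions: the layout of the counter data (a dictionary from a BS identifier to a list of per-interval volumes), the megabyte unit, and all names are hypothetical, since the format of TIM's counters and the actual Simu5G configuration files are not described here.

```python
from dataclasses import dataclass
from typing import Dict, List

COUNTER_PERIOD_MIN = 15  # from the text: counters are averaged over 15 min
DAYS_EXTRACTED = 3       # from the text: three days of values


def values_per_bs(days: int = DAYS_EXTRACTED, period_min: int = COUNTER_PERIOD_MIN) -> int:
    """Number of counter values per BS: days * (minutes per day / counter period)."""
    return days * 24 * 60 // period_min


def mean_rate_mbps(volume_mbytes: float, period_min: int = COUNTER_PERIOD_MIN) -> float:
    """Average offered load that reproduces a given volume within one interval."""
    return volume_mbytes * 8 / (period_min * 60)


@dataclass(frozen=True)
class SimInstance:
    index: int                         # position of the interval in the three-day window
    duration_s: int                    # simulated time of this instance
    bs_volume_mbytes: Dict[str, float]  # background volume each BS must serve (hypothetical unit)


def build_instances(counters: Dict[str, List[float]]) -> List[SimInstance]:
    """One simulation instance per counter interval, with per-BS background volume."""
    lengths = {len(v) for v in counters.values()}
    if len(lengths) != 1:
        raise ValueError("all BSs must report the same number of intervals")
    n_intervals = lengths.pop()
    return [
        SimInstance(i, COUNTER_PERIOD_MIN * 60, {bs: vols[i] for bs, vols in counters.items()})
        for i in range(n_intervals)
    ]


# Toy counters for two hypothetical BSs (values are made up, not TIM data).
TOY_COUNTERS = {
    "bs_a": [100.0 + i for i in range(values_per_bs())],
    "bs_b": [50.0 for _ in range(values_per_bs())],
}
INSTANCES = build_instances(TOY_COUNTERS)
```

With the figures given in the text, `values_per_bs()` evaluates to 3 × 24 × 60 / 15 = 288, which matches the number of simulation instances.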

The simulated network topology is configured as shown in Figure 7, where UEs (i.e., connected cars) move along one main road and three intersecting roads. Intersections are regulated by traffic lights. Such a portion of the urban scenario is served by multiple BSs that provide 5G connectivity to the UEs. The latter locally run the sender side of the video-streaming application, which streams the video to a remote-driving application hosted on a MEC host. To do so, we needed to implement a realistic model of a video streaming application within Simu5G, where UEs send streams to a remote host following a trace-based approach, i.e., rate and size of UL video frames are read from a log file generated from dash-cam videos. This is useful to model video traffic from real-life ToD scenarios.
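
A trace-based sender of this kind can be sketched in a few lines. The log format assumed here (one `time_s,size_bytes` pair per line, with `#` comments) is purely hypothetical, as are the function names; the actual application is implemented as a Simu5G module and its trace format is not specified in this paper.

```python
import csv
import io
from typing import Iterable, Iterator, List, NamedTuple


class Frame(NamedTuple):
    send_time_s: float  # when the frame is handed to the network
    size_bytes: int     # encoded size of the UL video frame


def read_trace(lines: Iterable[str]) -> Iterator[Frame]:
    """Parse a trace with one 'time_s,size_bytes' record per line (assumed format)."""
    for row in csv.reader(lines):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        yield Frame(float(row[0]), int(row[1]))


def replay(trace: List[Frame], start_s: float, duration_s: float) -> Iterator[Frame]:
    """Shift the trace so that it starts at start_s and stop after duration_s."""
    if not trace:
        return
    t0 = trace[0].send_time_s
    for frame in trace:
        t = start_s + (frame.send_time_s - t0)
        if t >= start_s + duration_s:
            break
        yield Frame(t, frame.size_bytes)


# Made-up sample: roughly 30 frames per second, one large frame followed by smaller ones.
SAMPLE_TRACE = "# time_s,size_bytes\n0.000,12000\n0.033,3000\n0.066,3500\n"
FRAMES = list(read_trace(io.StringIO(SAMPLE_TRACE)))
SCHEDULED = list(replay(FRAMES, start_s=10.0, duration_s=0.05))
```

Because both the sending rate and the frame sizes come from the log, the traffic offered to the simulated network inherits the burstiness of the recorded video rather than following a synthetic distribution.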

Since an AI model is more effective when it is trained with a large amount of data, each 15-min simulation instance was repeated five times with different seeds of the pseudo-random number generators. This also has the effect of simulating multiple UEs’ mobility patterns, and, in turn, it increases the variability of the scenarios learned by the training algorithms.
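
As a rough sketch of how such a campaign could be enumerated, the snippet below expands the 288 instances into five repetitions each. The seed-assignment rule and the toy mobility draw are assumptions made for illustration; the paper only states that the repetitions use different seeds.

```python
import itertools
import random
from typing import Dict, List

N_INTERVALS = 288   # from the text: one instance per 15-min counter value
N_REPETITIONS = 5   # from the text: each instance repeated five times


def campaign(n_intervals: int = N_INTERVALS, n_reps: int = N_REPETITIONS,
             base_seed: int = 0) -> List[Dict[str, int]]:
    """List every run of the campaign, giving each one a distinct seed (assumed rule)."""
    runs = []
    for interval, rep in itertools.product(range(n_intervals), range(n_reps)):
        seed = base_seed + interval * n_reps + rep
        runs.append({"interval": interval, "repetition": rep, "seed": seed})
    return runs


def total_simulated_hours(runs: List[Dict[str, int]], minutes_per_run: int = 15) -> float:
    """Simulated (not wall-clock) time covered by the whole campaign."""
    return len(runs) * minutes_per_run / 60


def initial_positions(seed: int, n_ues: int = 3, road_length_m: float = 1000.0) -> List[float]:
    """Toy stand-in for UE mobility: the seed alone determines where the cars start."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, road_length_m) for _ in range(n_ues)]


RUNS = campaign()
```

Under these assumptions the campaign contains 288 × 5 = 1440 runs, i.e., 360 h of simulated time, and re-running a given seed reproduces the same mobility pattern.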

Figure 8 shows an example of a QoS metric that we extracted from the above simulation campaign, i.e., the evolution over time of the end-to-end delay of video segments, where we observe that the metric changes over time due to UE mobility and variable interference produced by the background traffic. In [17], an example of how a dataset including QoS metrics such as the one in Figure 8 can be used to predict the future quality of the video streaming is described in detail. Note that the above approach can be enhanced by considering improved network capacity and increased volume of background traffic following the foreseen evolution toward 6G networks, allowing the design of AI algorithms and models based on datasets representing future network scenarios.

Once the dataset had been generated as described above, we trained an XAI model according to the Takagi–Sugeno–Kang Fuzzy Rule-Based (TSK-FRB) system in an FL setting [38]. The Fed-XAI application that performs the training has been designed and implemented following a fully virtualized architecture by deploying each module inside a Docker container, so as to enable portability regardless of the underlying hardware and software infrastructure, and for easier migration to real-world, edge computing-based environments. The Fed-XAI application exploits the Intel OpenFL library, purposely extended to support FL of inherently interpretable models, such as TSK-FRB.
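
To give an idea of why a TSK-FRB system is considered inherently interpretable, the following sketch implements the textbook inference step of a first-order TSK model: each rule has a fuzzy antecedent and a linear consequent, and the output is the average of the consequents weighted by the rule firing strengths. The Gaussian membership functions, the two input features, and the rule parameters are illustrative assumptions and do not reproduce the model of [38] or the OpenFL extension used in the prototype.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Rule:
    centers: Sequence[float]  # centre of the fuzzy set for each input feature
    sigmas: Sequence[float]   # width of the fuzzy set for each input feature
    coeffs: Sequence[float]   # linear consequent: [bias, c_1, ..., c_n]

    def strength(self, x: Sequence[float]) -> float:
        """Firing strength: product of Gaussian memberships over all features."""
        return math.prod(
            math.exp(-0.5 * ((xi - c) / s) ** 2)
            for xi, c, s in zip(x, self.centers, self.sigmas)
        )

    def consequent(self, x: Sequence[float]) -> float:
        """Rule output: a linear function of the inputs."""
        return self.coeffs[0] + sum(c * xi for c, xi in zip(self.coeffs[1:], x))


def normalized_strengths(rules: Sequence[Rule], x: Sequence[float]) -> List[float]:
    """Share of the prediction attributable to each rule (usable as an explanation)."""
    w = [r.strength(x) for r in rules]
    total = sum(w)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    return [wi / total for wi in w]


def tsk_predict(rules: Sequence[Rule], x: Sequence[float]) -> float:
    """First-order TSK output: firing-strength-weighted average of rule consequents."""
    shares = normalized_strengths(rules, x)
    return sum(s * r.consequent(x) for s, r in zip(shares, rules))


# Hypothetical rules on two normalized features: (cell load, SINR).
RULES = [
    Rule(centers=[0.9, 0.1], sigmas=[0.2, 0.2], coeffs=[0.2, 0.0, 0.0]),  # high load, low SINR
    Rule(centers=[0.1, 0.9], sigmas=[0.2, 0.2], coeffs=[0.9, 0.0, 0.0]),  # low load, high SINR
]
```

Since every prediction decomposes into a handful of readable rules and their normalized firing strengths, the contribution of each rule can be reported alongside the prediction itself.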

For the online inference phase, we implemented the real-time testbed shown in Figure 9. It includes a general-purpose PC running Simu5G, exploiting its network emulation capabilities to emulate the mobile network in real time [39]. This means that we make the mobile network within Simu5G evolve in real time (i.e., synchronized with the wall-clock time), while it processes real video frames generated by external, real applications. The video source and the video player are realized using the VLC software and are hosted on a laptop and a tablet, respectively. Both are connected via Ethernet interfaces to the PC running Simu5G, so that packets generated by the video source traverse the emulated network before reaching the video player. In this setting, the quality of the video at the receiver depends on the network conditions. In particular, the emulated network was configured using the same scenario and parametrization described above, hence using TIM’s live network data as input in order to make the emulation as realistic as possible.

The streamed video shows a scene shot using a dash-cam on a highway. During the execution of the testbed, probes within Simu5G allow us to generate the metrics that are sent in real time to the inference module of the Fed-XAI application, as shown in Figure 9. The inference module uses such data and the pre-trained model to make the quality predictions, which are shown in real time on a dashboard.

The testbed implementation is shown in Figure 10. It is composed of a laptop acting as the video server and a tablet acting as the video receiver, which plays the received video in real time. The video frames traverse a miniPC running the Simu5G simulator, which determines the latency experienced by such frames. This makes the quality of the video played out at the receiver depend on the status of the user in the emulated network. While the video flows across the network, Simu5G extracts real-time statistics and sends them to the Fed-XAI app implemented on a fourth PC. The quality forecast is shown on both the receiving tablet and the screen of the PC hosting the Fed-XAI app (in the top-right corner of Figure 10). The dashboard is also shown in Figure 11. It shows two “semaphores” (traffic-light indicators): the leftmost one shows the predicted quality for the next three seconds, whereas the rightmost one shows the prediction made three seconds earlier and is used for verification purposes. In this example, the Fed-XAI application predicted poor video quality (red light), and indeed Figure 10 shows that the video at the receiving side presents some impairments. The dashboard also shows the explanations for the prediction of poor video quality in the charts at the bottom-left corner of Figure 11. In this case, the prediction is due to high cell load and a low signal-to-interference-plus-noise ratio (SINR) experienced by the car. This is reasonable, since the top-left screen in Figure 10 shows that the considered car (the blue dot) is located at the border between two base stations (the red triangles).
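
The logic behind the two indicators can be sketched as a small buffer of timestamped predictions. This is an illustrative reconstruction under stated assumptions (the class name, the string labels, and the update interface are hypothetical) and is not the code of the actual dashboard.

```python
from collections import deque
from typing import Deque, Optional, Tuple

HORIZON_S = 3.0  # from the text: predictions refer to the next three seconds


class PredictionDashboard:
    """Keeps the current prediction and the one issued one horizon earlier."""

    def __init__(self, horizon_s: float = HORIZON_S) -> None:
        self.horizon_s = horizon_s
        self._pending: Deque[Tuple[float, str]] = deque()  # (time issued, label)
        self._matured: Optional[str] = None                # latest prediction that is now due

    def update(self, now_s: float, predicted_label: str) -> Tuple[str, Optional[str]]:
        """Return (left light, right light) after a new prediction arrives at now_s."""
        self._pending.append((now_s, predicted_label))
        # Move to the right light every prediction issued at least one horizon ago.
        while self._pending and self._pending[0][0] <= now_s - self.horizon_s:
            self._matured = self._pending.popleft()[1]
        return predicted_label, self._matured


def demo() -> list:
    """Feed one prediction per second and collect what the two lights would show."""
    board = PredictionDashboard()
    labels = ["good", "good", "poor", "poor", "poor", "good"]
    return [board.update(float(t), label) for t, label in enumerate(labels)]
```

In this sketch the right light stays empty until the first prediction is three seconds old; afterwards it always shows the forecast that refers to the present moment, so that an observer can compare it with the video actually being played.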

5. Conclusions

This paper showed the power and importance of data-driven elaborations, where macroscopic forecasts, network counters, and MDT measures can feed suitable insights for discussing the possible evolution toward 6G systems, and the related challenges and opportunities for operators and service providers, in terms of both technical and economic aspects. It also presented many examples of the usage of network counters and MDT data, including a practical case study showing how data collected from a live network can be exploited to help the design of AI operations (as part of the Fed-XAI activity in the Hexa-X project). Future work includes the showcasing of a Fed-XAI prototype and its related results.

Author Contributions

Conceptualization, D.S. and D.M.; writing—original draft preparation, D.S.; writing—review and editing, D.S., D.M. and G.N.; resources, D.M.; software, G.N. All authors have read and agreed to the published version of the manuscript.

Funding

Part of this work has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101015956 Hexa-X, as well as from the Italian Ministry of Education and Research (MIUR) in the framework of the FoReLab and CrossLab projects (Departments of Excellence).

Data Availability Statement

Restrictions apply to the availability of the data used during the Fed-XAI activity in the framework of Hexa-X project, as it includes data obtained from TIM.

Acknowledgments

We would like to thank the Fed-XAI team for their hard work with the Simu5G simulator and the Intel OpenFL framework, and the TIM colleagues who helped in gathering the data from TIM’s live network.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Cisco Annual Internet Report (2018–2023). Available online: www.cisco.com/c/en/us/solutions/executive-perspectives/annual-internet-report/index.html (accessed on 19 June 2023).
2. Ericsson Mobility Report. 2022. Available online: www.ericsson.com/4a4be7/assets/local/reports-papers/mobility-report/documents/2022/ericsson-mobility-report-q2-2022.pdf (accessed on 19 June 2023).
3. Sabella, D.; Reznik, A.; Frazao, R. Multi-Access Edge Computing in Action; CRC Press: Boca Raton, FL, USA, 2019.
4. Ericsson Report. Planning In-Building Coverage for 5G: From Rules of Thumb to Statistics and AI. 2021. Available online: https://www.ericsson.com/49ced5/assets/local/reports-papers/mobility-report/documents/2021/planning-in-building-coverage.pdf (accessed on 19 June 2023).
5. 5GAA Position Paper: Secure Space-Based Connectivity Programme and Focus on the European Communication Satellite Constellation. Available online: 5gaa.org/content/uploads/2022/10/5GAA_NTN_Position_Paper.pdf (accessed on 19 June 2023).
6. GSMA Intelligence. The Mobile Economy 2022. Available online: https://www.gsma.com/mobileeconomy/wp-content/uploads/2022/02/280222-The-Mobile-Economy-2022.pdf (accessed on 19 June 2023).
7. OpenSignal. The Smartphone Experience Shift from 4G to 5G. March 2022. Available online: www.opensignal.com (accessed on 19 June 2023).
8. OpenSignal. 5G Impact on the Global Mobile Network Experience. 2022. Available online: www.opensignal.com (accessed on 19 June 2023).
9. Mizzi, C.; Fabbri, A.; Rambaldi, S.; Bertini, F.; Curti, N.; Sinigardi, S.; Luzi, R.; Venturi, G.; Davide, M.; Muratore, G.; et al. Unraveling pedestrian mobility on a road network using ICTs data during great tourist events. EPJ Data Sci. 2018, 7, 44.
10. Scaloni, A.; Cirella, P.; Sgheiz, M.; Diamanti, R.; Micheli, D. Multipath and Doppler Characterization of an Electromagnetic Environment by Massive MDT Measurements from 3G and 4G Mobile Terminals. IEEE Access 2019, 7, 13024–13034.
11. 3GPP TS 37.320 V17.3.0 (2023-03): Radio Measurement Collection for Minimization of Drive Tests (MDT); Overall Description; Stage 2. Available online: https://www.3gpp.org/ftp/Specs/archive/37_series/37.320 (accessed on 19 June 2023).
12. Micheli, D.; Diamanti, R. Statistical analysis of interference in a real LTE access network by massive collection of MDT radio measurement data from smartphones. In Proceedings of the 2019 PhotonIcs & Electromagnetics Research Symposium-Spring (PIERS-Spring), Rome, Italy, 17–20 June 2019.
13. Vannelli, A.; Micheli, D.; Muratore, G. Statistical analysis of smartphone MDT signaling power measurements for Radio Maritime LTE propagation study. In Proceedings of the 2020 International Symposium on Electromagnetic Compatibility-EMC EUROPE, Rome, Italy, 23–25 September 2020.
14. Micheli, D.; Muratore, G.; Vannelli, A.; Scaloni, A.; Sgheiz, M.; Cirella, P. Rain effect on 4G LTE in-car electromagnetic propagation analyzed through MDT radio data measurement reported by mobile phones. IEEE Trans. Antennas Propag. 2021, 69, 8641–8651.
15. Scaloni, A. Minimization of Drive Test (MDT): An Innovative Methodology for Measuring Customer Performance on Mobile Network. In Proceedings of the ITU Workshop on “Benchmarking of Emerging Technologies and Applications. Internet Related Performance Measurements”, Geneva, Switzerland, 11 March 2019.
16. Micheli, D.; Muratore, G. Smartphones reference signal received power MDT radio measurement statistical analysis reveals people feelings during music events. In Proceedings of the 2019 PhotonIcs & Electromagnetics Research Symposium-Spring (PIERS-Spring), Rome, Italy, 17–20 June 2019.
17. Renda, A.; Ducange, P.; Marcelloni, F.; Sabella, D.; Filippou, M.C.; Nardini, G.; Stea, G.; Virdis, A.; Micheli, D.; Rapone, D.; et al. Federated Learning of Explainable AI Models in 6G Systems: Towards Secure and Automated Vehicle Networking. Information 2022, 13, 395.
18. Skocaj, M.; Amorosa, L.M.; Ghinamo, G.; Muratore, G.; Micheli, D.; Zabini, F.; Verdone, R. Cellular network capacity and coverage enhancement with MDT data and Deep Reinforcement Learning. Comput. Commun. 2022, 195, 403–415.
19. Muratore, G.; Giannini, T.; Micheli, D. Solar radio emission as a disturbance of radiomobile networks. Sci. Rep. 2022, 12, 9324.
20. ETSI GS MEC 015 V2.2.1 (2022-12). Multi-Access Edge Computing (MEC); Traffic Management APIs. Available online: https://www.etsi.org/deliver/etsi_gs/MEC/001_099/015/02.02.01_60/gs_MEC015v020201p.pdf (accessed on 19 June 2023).
21. 5GAA. System Architecture and Solution Development; High-Accuracy Positioning for C-V2X. Available online: https://5gaa.org/wp-content/uploads/2021/02/5GAA_A-200118_TR_V2XHAP.pdf (accessed on 19 June 2023).
22. 3GPP TR 22.856 V1.0.0 (2023-03). Feasibility Study on Localized Mobile Metaverse Services (Release 19). Available online: https://www.3gpp.org/ftp/Specs/archive/22_series/22.856/ (accessed on 19 June 2023).
23. ETSI GS MEC 002 V3.1.1 (2023-04). Multi-Access Edge Computing (MEC); Use Cases and Requirements. Available online: https://www.etsi.org/deliver/etsi_gs/MEC/001_099/002/03.01.01_60/gs_MEC002v030101p.pdf (accessed on 19 June 2023).
24. Li, Q.; Ding, Z.; Tong, X.; Wu, G.; Stojanovski, S.; Luetzenkirchen, T.; Kolekar, A.; Bangolae, S.; Palat, S. 6G Cloud-Native System: Vision, Challenges, Architecture Framework and Enabling Technologies. IEEE Access 2022, 10, 96602–96625.
25. Singh, S.K.; Singh, R.; Kumbhani, B. The Evolution of Radio Access Network Towards Open-RAN: Challenges and Opportunities. In Proceedings of the 2020 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), Seoul, Republic of Korea, 6–9 April 2020; pp. 1–6.
26. Rego, I.; Medeiros, L.; Alves, P.; Goldbarg, M.; Lopes, V.; Flor, D.; Barros, W.; Sousa, V.; Aranha, E.; Martins, A.; et al. Prototyping near-real time RIC O-RAN xApps for Flexible ML-based Spectrum Sensing. In Proceedings of the 2022 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN), Phoenix, AZ, USA, 14–16 November 2022; pp. 137–142.
27. Ma, B.; Guo, W.; Zhang, J. A Survey of Online Data-Driven Proactive 5G Network Optimisation Using Machine Learning. IEEE Access 2020, 8, 35606–35637.
28. Francesco, P.D.; Malandrino, F.; DaSilva, L.A. Assembling and Using a Cellular Dataset for Mobile Network Analysis and Planning. IEEE Trans. Big Data 2018, 4, 614–620.
29. Vallero, G.; Renga, D.; Meo, M.; Marsan, M.A. Greener RAN Operation through Machine Learning. IEEE Trans. Netw. Serv. Manag. 2019, 16, 896–908.
30. Bega, D.; Gramaglia, M.; Fiore, M.; Banchs, A.; Costa-Pérez, X. DeepCog: Optimizing Resource Provisioning in Network Slicing With AI-Based Capacity Forecasting. IEEE J. Sel. Areas Commun. 2020, 38, 361–376.
31. Gómez-Andrades, A.; Barco, R.; Muñoz, P.; Serrano, I. Data Analytics for Diagnosing the RF Condition in Self-Organizing Networks. IEEE Trans. Mob. Comput. 2017, 16, 1587–1600.
32. DAWN4IoE Project. Available online: https://www.dawnforioe.com/ (accessed on 19 June 2023).
33. CANCAN Project. Available online: https://cancan.roc.cnam.fr/ (accessed on 19 June 2023).
34. Lohmüller, S.; Schmelz, L.C.; Hahn, S. Adaptive SON management using KPI measurements. In Proceedings of the NOMS 2016–2016 IEEE/IFIP Network Operations and Management Symposium, Istanbul, Turkey, 25–29 April 2016; pp. 625–631.
35. Hahn, S.; Schweins, M.; Kürner, T. Impact of SON function combinations on the KPI behaviour in realistic mobile network scenarios. In Proceedings of the 2018 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), Barcelona, Spain, 15–18 April 2018; pp. 1–6.
36. Uusitalo, M.A.; Ericson, M.; Richerzhagen, B.; Soykan, E.U.; Rugeland, P.; Fettweis, G.; Sabella, D.; Wikström, G.; Boldi, M.; Hamon, M.H.; et al. Hexa-X The European 6G Flagship Project. In Proceedings of the 2021 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit), Porto, Portugal, 8–11 June 2021.
37. Nardini, G.; Sabella, D.; Stea, G.; Thakkar, P.; Virdis, A. Simu5G–An OMNeT++ Library for End-to-End Performance Evaluation of 5G Networks. IEEE Access 2020, 8, 181176–181191.
38. Bárcena, J.L.C.; Ducange, P.; Ercolani, A.; Marcelloni, F.; Renda, A. An Approach to Federated Learning of Explainable Fuzzy Regression Models. In Proceedings of the 2022 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Padua, Italy, 18–23 July 2022; IEEE: Piscataway, NJ, USA, 2022.
39. Nardini, G.; Stea, G.; Virdis, A. Scalable Real-time Emulation of 5G Networks with Simu5G. IEEE Access 2021, 9, 148504–148520.

Figure 1. The typical innovation cycle in ICT [3].

Figure 2. Forecast Mobile Subscriptions [mln] (elaboration from [2]).

Figure 3. Average daily profile of UE throughput (elaboration from live network data).

Figure 4. CDF and PDF of radio resource usage (elaboration from live network data).

Figure 5. Daily traffic profile of cell DL traffic volume (elab. from live network data) (Turin, Italy).

Figure 6. (a) MDT sample density (Parma, IT) for 4G (left) and (b) 5G traffic (right). (c) MDT RSRP dB for a critical area coverage of Parma, (d) UpLink SINR for a critical area coverage of Parma.

Figure 7. Simulation Scenario for Fed-XAI, leveraging data collected from the live network.

Figure 8. Trace of end-to-end delay over time of video segments.

Figure 9. High-level representation of the Fed-XAI real-time testbed.

Figure 10. Prediction of real-time video quality.

Figure 11. Status of the real-time dashboard.

Table 1. Overview of data that can be collected from terminals and networks.

Type of Data	Possible Usages (and Relevant Work)	References (Standards)	Comments
MDT samples	RAN optimization (ref. [12]); analysis of solar radiation and its impact on RAN interference (ref. [19]); inference of pedestrian mobility flows (ref. [9])	3GPP TS 37.320 “Radio measurement collection for Minimization of Drive Tests (MDT); Overall description”	Details of data samples available: Longitude ddd.dddddd [deg]; Altitude [m]; Major/Minor semiaxis of uncertainty ellipse/ellipsoid [m]; Orientation of major semiaxis from geographical North [deg]; Timestamp (date + time) of stored measurements; User’s MCC-MNC; TA distance [m]; eutraCelid of serving cell/of neighbor cells; RSRP of serving cell; RSRQ of serving cell; RSRP of neighbor cells; RSRQ of neighbor cells; Estimated UE speed [km/h]; UL SINR [dB]; Average wideband CQI
Radio Access Network counters	Network dimensioning and planning (ref. [18]); adaptive SON management (ref. [34,35])	Technology specific	Examples of metrics: User and Cell Data and Voice Traffic 4G; Data Traffic 5G; Radio Link Quality; Radio Link Level; Performances (throughput, volume); Antenna beam and MIMO usage
Country-level/region-level data statistics	Network dimensioning and planning	GSMA Intelligence (ref. [6]); OpenSignal (ref. [7,8])	Examples of metrics: Subscribers (market/country level, mobile internet subscribers); Market penetration; Network coverage (3G, 4G, and 5G); Base stations; Data traffic; Cellular IoT connections

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
