Clustered and Distributed Caching Methods for F-RAN-Based mmWave Communications

Fog-radio access networks (F-RANs) alleviate fronthaul delays for cellular networks as compared to their cloud counterparts. This makes them suitable for networks that demand low propagation delays, namely millimeter wave (mmWave) operations that suffer from short propagation distances and a poor scattering environment (low channel ranks). The F-RAN here comprises fog nodes that are collocated with radio remote heads (RRHs) to provide local processing capabilities for mobile station (MS) terminals. These terminals demand various network functions (NFs) that correspond to different service requests. However, provisioning these NFs on the fog nodes also incurs service delays, since the functions must first be migrated from the cloud, i.e., offloaded to the fog nodes. One solution to reduce this service delay is to provide cached copies of popular NFs in advance. Hence, it is critical to study function popularity and allow for content caching at the F-RAN. This is all the more necessary given the limited resources at the fog nodes, which require efficient resource management to enhance network capacity at reduced power and cost penalty. This paper proposes two novel solutions that allocate popular NFs on the fog nodes to accelerate services for the terminals, namely, the clustered and distributed caching methods. The two methods are analyzed and compared against the baseline uncached provisioning schemes in terms of service delay, energy consumption, and cost.


Introduction
Millimeter wave (mmWave) bands support wide-bandwidth transmission without the need for sophisticated channelization techniques such as multi-carrier transmission and carrier aggregation. This is attributed to the wide contiguous bandwidth chunks available at these bands, which has led to mmWave being adopted in the New Radio (NR) standard of 5G as Frequency Range 2 (FR2). A key limitation of mmWave bands is the high path and penetration losses, along with atmospheric attenuation, oxygen absorption, and sensitivity to blockage. This makes mmWave links susceptible to significant degradation in signal quality.
The long separation distances between mobile station (MS) terminals and access points (radio remote heads) add complexity to the design of beamforming and access networks. For beamforming, new designs are required that offer low power consumption, with multi-analog beamformers at the MS terminals and digital or hybrid beamformers at the base station (BS) for spatial multiplexing and multi-user connectivity. For access networks, it is essential to reduce fronthaul traffic to minimize end-to-end delays for the geographically distributed terminals. Solutions here include cloud-radio access networks (C-RANs) that allow for a centralized baseband unit (BBU) pool that processes various network functions (NFs). These NFs are migrated from the radio remote heads (RRHs) that feature limited resources to the

Related Work
Few studies look at the caching problem in F-RAN for mmWave operations. The work in [2] proposes a low-complexity subchannel assignment and power control mechanism for mmWave F-RAN that includes caching, user experience constraints, interference suppression, and energy efficiency. The power problem is formulated as an optimization model and solved with the alternating direction method of multipliers (ADMM). However, the work investigates energy efficiency only and lacks analysis of network delay and cost. Further, the work in [3] accounts for user mobility in transmission scheduling for caching at the fog nodes. The optimal scheduling problem is formulated as a stochastic nonlinear mixed-integer program, and multi-hop relaying caching is proposed. The work leverages device-to-device (D2D) communication in the hopping process to enable simultaneous transmission in the scheduling of caching at the edge nodes. The analysis is limited to the amount of cached data under different hop numbers, locations, time slots, and transmission power.
Another set of caching schemes is proposed for F-RAN, albeit not in the context of mmWave NR operations. For example, the authors in [4] propose a cooperative coded caching method that deploys deep reinforcement learning to search for the optimal coded caching approach. The search is applied to every request by the controller in a high-power node using a deep Q-network model, i.e., to enhance the probability of successful transmission. Furthermore, in [5], an online caching approach is introduced that considers the time-varying nature of popular content with a focus on the long-term normalized delivery delay, i.e., based on the temporal dependency of the coding times aggregated over multiple time slots in the high signal-to-noise ratio (SNR) regime. The authors in [6] deploy a coordinated multi-point (CoMP) transmission approach at fog nodes with cache modules that serve clustered users. They aim to optimize the minimum weighted signal-to-interference-plus-noise ratio (SINR) among all clusters while considering cluster fairness and load balancing in the backhaul. Namely, the work jointly optimizes the cluster formation for increased fairness and the multicast beamforming, with consideration for power consumption. The authors in [7] assume that fog access points (F-APs) form a distributed cache cluster that cooperatively serves MS requests in an effort to reduce fronthaul traffic. Here, cache placement is optimized by concatenating a maximum distance separable (MDS) code with a repetition code. Namely, repeating the same packet at some F-APs allows for multicasting over the fronthaul link, which saves energy and bandwidth.
In [8], fog computing is used as an intermediate communication interface between the underlying and global tiers of information-centric networks (ICNs), where content is processed and stored at the fog nodes. The goals are to reduce the total cached content using content labeling and sharing, lower the content request delay, and enhance the cache hit rate. Another cooperative caching approach in [9] applies graph theory to maximize the offloaded traffic. Specifically, the work formulates the clustering problem while considering cooperative caching and local content popularity. Thereafter, a graph-based approach is proposed to solve the problem.
Reference [10] also aims to enhance the cache hit rate in F-RAN by formulating the edge caching problem as an optimization model that determines the optimal policy for a content popularity prediction algorithm. The algorithm considers the content features and user preferences, and it is complemented by an offline user preference learning algorithm based on online gradient descent and the follow-the-regularized-leader (FTRL) method. The goal is to estimate upcoming content popularity at reduced complexity and to track content popularity as it changes over space and time. In [11], online caching based on time-varying content is analyzed to characterize the long-term normalized delivery time, which captures the temporal dependence of the coding latencies accrued across multiple time slots in the high-SNR regime. Further, the work studies online caching and delivery schemes for serial and pipelined transmission modes across fronthaul and edge network entities.
In [12], the cooperative caching problem is formulated as a probability-triggered combinatorial multi-armed bandit problem. Then, an enhanced multi-agent reinforcement learning algorithm is developed to solve the problem. The solution combines user preference and content popularity prediction based on a real dataset to enhance the cache hit rate. Federated learning is also applied in [13] to develop a content popularity prediction framework for D2D communication in F-RAN in an effort to maximize the cache hit ratio. Note that the work considers individual privacy in the content popularity prediction model. Further note that it only utilizes the user's local model in the training process.
Appl. Sci. 2022, 12, 7111
Furthermore, the work in [14] presents an F-RAN model based on a femto-base station cluster (FBSC) structure to cooperatively serve users on the mmWave bands. However, it still relies on microwave bands between fog stations and central units. Overall, the work focuses on the resource allocation problem for caching placement and lacks analysis of the caching mechanism, including caching delay, network cost, and energy consumption. It rather optimizes the hybrid precoder to reduce the transmission delay as a function of the BS transmit power for different metrics such as the volume of cached content. A joint caching and recommendation policy is proposed in [15] for F-RAN by considering a dynamic request model of various times. The caching problem is formulated as an optimization model, and reinforcement learning is leveraged to maximize the net profit of each F-AP. Furthermore, a double deep Q-network is used to determine the optimal caching policy with content recommendation at reduced complexity. Again, the work lacks the context of mmWave systems, i.e., it does not incorporate beamforming and channel models. Finally, a content caching strategy for F-RAN is developed in [16] using a federated deep reinforcement learning algorithm to enhance the caching performance in terms of content request delay and cache hit rate. Here, the model learns content popularity at multiple cooperative F-APs, and then a deep Q-network is applied to learn the requested content data in each F-AP. However, this work again lacks the context of mmWave operations and an analysis of network cost and energy consumption.
Various outcomes have been proposed from the 5G MiEdge project in [1]. First, stochastic optimization and matching theory are leveraged in [17] to propose an online computation offloading method in mobile edge computing (MEC), while considering offloading requests, channel conditions, user mobility, and computation queues. However, the work does not integrate the mmWave system and its requirements, and it lacks network function virtualization and content popularity analysis. Service migration due to user mobility is investigated in [18], which considers the latency penalty. It proposes a method to resume the established service at a new mobile edge host to maintain service continuity, and it aims to overcome service disruption and resource consumption in the backhaul links. However, it suffers from increased signaling when notifying the MS terminal about a new optimal edge host during path configuration. This adds complexity at the MS, which must transmit service parameters. Proactive computation caching methods are presented in Reference [19], which considers task popularity, size, and cached resources that can vary from the incoming resource demands. However, the methods lack service delay analysis and an implementation of network function virtualization. Reference [20] integrates MEC and mmWave communications to enable a high-level network architecture with an application- and user-centric orchestration that collects various parameters using a liquid RAN C-plane, such as user position, network load, and data popularity. The work in [21] provides an overview and a strengths, weaknesses, opportunities, and threats (SWOT) analysis of the integration of mmWave and MEC from the business and economic perspectives of 5G systems. However, the work is limited to SWOT analysis and does not provide caching schemes or a technical framework. In [22], a prefetching algorithm is proposed for mmWave communications in MEC. It develops mobility and traffic models, reduces system latency, and enhances user data rates. Finally, the work in [23] addresses computation offloading mechanisms that consider blockage effects in mmWave links in cloud-RAN. However, it lacks service request models, network function virtualization, and operating constraints of fog nodes.
Overall, existing schemes on caching in F-RAN focus on content prediction to enhance the hit ratio, while considering only limited user request specifications such as computation and delay bound. Further, the studies lack analysis of the resource constraints at the edge of the network and do not account for network cost in the caching process as compared to online prefetching.

Multi-Layer Network Architecture
A multi-layer architecture is utilized here that comprises MS terminals, fog nodes arranged in clusters, and the cloud core, as depicted in Figure 1.
MS Terminals (Layer I): User terminals that demand various services with different delay and capacity specifications. Terminals can be mobile stations, sensors, vehicles, desktops, laptops, etc., and are distributed across the fog nodes in each cluster, where each cluster comprises multiple fog nodes.
Primary Nodes in Fog-RAN (Layer II): This layer consists of distributed, homogeneous fog nodes that are collocated with RRHs and sit between the MS terminals in Layer I and the Cloud-RAN in Layer IV. They are equipped with beamforming architectures to provide high-bandwidth links with the MSs. This layer is the gateway at the edge of the network that receives traffic requests and thus provides services under stringent delay bounds and with limited resources.
Secondary Nodes in Fog-RAN (Layer III): Another set of fog nodes that possess more resources than the primary nodes, albeit fewer resources than the cloud core. Every secondary node manages a cluster of primary nodes through direct links. This intermediate structure combines the benefits of Layers II and IV, i.e., higher resources at the expense of a slight increase in fronthaul delays.
Cloud-RAN (Layer IV): This layer comprises widely dispersed cloud nodes that possess abundant resources. It acts as the network BBU and contains the NFs that are offloaded to the fog nodes via wireless fronthaul links that operate on sub-6 GHz microwave bands.

Network Model
Consider a set of RRHs distributed over a geographical area, where each RRH is collocated with a fog node $n_i \in N$, and $N$ is the set of fog nodes in the network. A fog node acts as the processing unit that delivers various NFs without relaying to the cloud core that acts as the BBU. Functions include control, communication, storage, and management. The processing of NFs at the edge of the network alleviates latency in the backhaul, as opposed to forwarding traffic to the core, which incurs aggregated delays. Local processing also frees resources that can instead be used to enhance network capacity, i.e., radio resources and processing units at the cloud core. Further, each fog node $n_i$ in the F-RAN possesses processing capacity and memory expressed by $q_{prc}(n_i)$ and $q_{me}(n_i)$, respectively. Note that $q_{prc}(n_i) \le Q_{prc}(n_i)$ and $q_{me}(n_i) \le Q_{me}(n_i)$, where $Q_{prc}(n_i)$ and $Q_{me}(n_i)$ are the maximum processing capacity and memory, respectively. Additionally, the processing delay incurred at node $n_i$ is denoted by $D_{prc}(n_i)$. Further, consider that the fog nodes are interconnected via wireless links, thus forming a set $E = \{e\}$ of links. The instantaneous available link bandwidth is denoted by $b(e)$, with $b(e) \le B(e)$, where $B(e)$ represents the maximum available bandwidth. Here, $D_{prp}(e)$ accounts for the propagation delay on the link.
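The bookkeeping implied by this model can be illustrated with a minimal Python sketch. The class names, the field layout, and the additive delay over a path are assumptions made for illustration only; the model above defines the symbols but does not prescribe a data structure or a path-delay formula.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FogNode:
    """Fog node n_i; resource units are left abstract (assumption)."""
    name: str
    q_prc: float  # available processing capacity, q_prc(n_i)
    q_me: float   # available memory, q_me(n_i)
    Q_prc: float  # maximum processing capacity, Q_prc(n_i)
    Q_me: float   # maximum memory, Q_me(n_i)
    D_prc: float  # processing delay at the node, D_prc(n_i)

    def __post_init__(self) -> None:
        # Enforce q_prc(n_i) <= Q_prc(n_i) and q_me(n_i) <= Q_me(n_i).
        if not (0 <= self.q_prc <= self.Q_prc and 0 <= self.q_me <= self.Q_me):
            raise ValueError("available resources must lie within the maxima")

    def can_host(self, prc: float, me: float) -> bool:
        # A demand fits only if it does not exceed what is still available.
        return prc <= self.q_prc and me <= self.q_me


@dataclass
class Link:
    """Wireless link e between two fog nodes."""
    u: str
    v: str
    b: float      # instantaneous available bandwidth, b(e)
    B: float      # maximum available bandwidth, B(e)
    D_prp: float  # propagation delay on the link, D_prp(e)

    def can_carry(self, demand: float) -> bool:
        return demand <= self.b <= self.B


def path_delay(nodes: Iterable[FogNode], links: Iterable[Link]) -> float:
    # Assumed additive model: processing delays plus propagation delays.
    return sum(n.D_prc for n in nodes) + sum(e.D_prp for e in links)
```

Keeping the available and maximum quantities as separate fields mirrors the pairs $q(\cdot)$ and $Q(\cdot)$ in the text, so the inequality constraints can be checked whenever a node is created.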

Service Request Model from MS
A request $r$ generated by MS $m$ features various NF types $t \in T$ as per the specific service, where $T$ is the set of NF types. Each NF $f_t$ requires computation and memory resources denoted by $q_{prc}(f_t)$ and $q_{me}(f_t)$, respectively. These accumulate to $Q_r$, the total processing and memory requirements for the request, as per Equation (1). Moreover, the variable $b_r$ accounts for the link resources.
An incoming request to the fog nodes in the F-RAN also has specific requirements in terms of delay bound, lifetime, resources, and service types. The request is received by a fog node that is termed the source node, src. Thereafter, it is routed to the destination node, dst, which can be the same MS, another MS, or a fog node. The intermediate nodes that interconnect the src and dst nodes must have sufficient resources to route the data and the NFs. Accordingly, each request $r$ is modeled by a 6-tuple $r = \langle src, dst, F_r, Q_r, b_r, \delta_r \rangle$, where $F_r = \{f_t\}$ is the set of NFs and $\delta_r$ is the delay bound for the request.
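The 6-tuple can be sketched as a small Python record. Since Equation (1) is not reproduced in this text, the aggregation of $Q_r$ below (separate sums of the per-NF processing and memory demands) is an assumption, and the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NetworkFunction:
    """NF f_t of type t with its resource demands."""
    t: str
    q_prc: float  # q_prc(f_t)
    q_me: float   # q_me(f_t)


@dataclass(frozen=True)
class Request:
    """Request r = <src, dst, F_r, Q_r, b_r, delta_r>."""
    src: str
    dst: str
    F_r: Tuple[NetworkFunction, ...]
    b_r: float      # link resources demanded by the request
    delta_r: float  # delay bound of the request

    @property
    def Q_r(self) -> Tuple[float, float]:
        # Assumed form of Equation (1): total processing and total memory
        # obtained by summing the demands of every NF in F_r.
        return (sum(f.q_prc for f in self.F_r),
                sum(f.q_me for f in self.F_r))

    def as_tuple(self):
        return (self.src, self.dst, self.F_r, self.Q_r, self.b_r, self.delta_r)

    def meets_delay_bound(self, delay: float) -> bool:
        return delay <= self.delta_r
```

Deriving $Q_r$ from $F_r$ rather than storing it keeps the tuple internally consistent when the set of requested NFs changes.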

Beamformer Architecture at the MS
Different beamforming models exist for the MS terminal in the context of mmWave bands, such as the analog beamforming models in [24][25][26], which comprise various array geometries. A request $r$ generated from MS $m \in M$ propagates through a wireless link. Each MS is equipped with an RF chain $\psi_{MS}$ that is connected to a uniform linear array (ULA) for directional transmission with the F-RAN, where it radiates a single beam. The array is composed of $A_{MS}$ antennas that are equally spaced at $d_{ant} = \lambda/2$, where $\lambda = \varsigma/\chi$ is the wavelength, $\varsigma$ is the speed of light, and $\chi$ is the carrier frequency. Note that the antennas are fed in parallel by phase shifters to provide continuous beam scanning. Overall, this forms an analog beamformer that radiates a primary beam vector $v_{\Theta_0^{MS}}$ that points towards the direction $\Theta_0^{MS}$. The vector for this direction is part of the vector matrix $V_{MS}$ at the analog stage of the chain, i.e., $v = v_{an}$, where $v_{an}$ is the analog precoder. It is determined by the far-field array factor for the ULA [27,28], in which the variables $a$, $k$, and $\beta$ denote the amplitude of the antenna at the MS, the wave number, i.e., $k = 2\pi/\lambda$, and the progressive phase shift between the elements at the MS, respectively.
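As the array-factor expression itself is not reproduced here, the following Python sketch uses the common textbook form for a uniformly excited ULA, $\sum_{n=0}^{A_{MS}-1} a\, e^{jn(k d_{ant}\cos\Theta + \beta)}$, with the angle measured from the array axis. This form, the 28 GHz carrier used in the usage note, and all function names are assumptions rather than the exact expression of [27,28].

```python
import cmath
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def wavelength(carrier_hz: float) -> float:
    # lambda = (speed of light) / (carrier frequency)
    return SPEED_OF_LIGHT / carrier_hz


def progressive_phase(theta0: float, lam: float) -> float:
    # Phase shift beta that steers the main beam towards theta0
    # for half-wavelength element spacing.
    k = 2.0 * math.pi / lam
    d_ant = lam / 2.0
    return -k * d_ant * math.cos(theta0)


def ula_array_factor(theta: float, n_ant: int, lam: float,
                     beta: float, a: float = 1.0) -> complex:
    # Textbook far-field array factor of a ULA with equal amplitudes a.
    k = 2.0 * math.pi / lam
    d_ant = lam / 2.0
    psi = k * d_ant * math.cos(theta) + beta
    return sum(a * cmath.exp(1j * n * psi) for n in range(n_ant))
```

For instance, with `lam = wavelength(28e9)` and `beta = progressive_phase(math.radians(60), lam)`, the magnitude of `ula_array_factor(math.radians(60), 8, lam, beta)` reaches the number of antennas, since all element contributions add in phase in the steered direction.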

Beamformer Architecture at the BS
Each mmWave F-RAN is equipped with a digital beamformer to allow for communication with multiple MSs and to provide spatial multiplexing for increased link capacity. Hence, each RRH is equipped with a uniform circular array (UCA) composed of $A_{RH}$ antennas that are equally spaced along a circular ring. In contrast to the MS design, each antenna here is connected to its own RF chain, so the total number of antennas is equal to the number of RF chains, i.e., $A_{RH} = \psi_{RH}$. Additionally, the overall radiated pattern from the BS represents a beamforming vector, $p_{RH}$, in the beamforming matrix $P_{RH} = P_{bb} P_{an}$, where $P_{bb}$ and $P_{an}$ represent the baseband and analog beamforming stages, respectively.
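A per-antenna weight vector for a UCA can be sketched as follows. The phase term $k r \sin(el)\cos(az - 2\pi n/A_{RH})$ is the standard textbook expression for a circular array lying in the horizontal plane, with elevation measured from the array normal; the ring radius $r$, the angle convention, and the function names are assumptions and are not taken from the paper.

```python
import cmath
import math


def uca_weights(n_ant: int, radius: float, lam: float,
                az0: float, el0: float) -> list:
    # One complex weight per antenna (one RF chain each), chosen so that
    # all elements add in phase towards the pointing direction (az0, el0).
    k = 2.0 * math.pi / lam
    return [
        cmath.exp(-1j * k * radius * math.sin(el0)
                  * math.cos(az0 - 2.0 * math.pi * n / n_ant))
        for n in range(n_ant)
    ]


def uca_array_factor(weights: list, radius: float, lam: float,
                     az: float, el: float) -> complex:
    # Far-field array factor of the weighted UCA in direction (az, el).
    k = 2.0 * math.pi / lam
    n_ant = len(weights)
    return sum(
        w * cmath.exp(1j * k * radius * math.sin(el)
                      * math.cos(az - 2.0 * math.pi * n / n_ant))
        for n, w in enumerate(weights)
    )
```

Because every antenna has its own RF chain, each entry of the weight vector can be set independently, which is what allows several such vectors to be stacked into a beamforming matrix for multiple MSs.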

Downlink Model
Consider that the set of MSs operates in time-division duplexing (TDD) mode with the F-RANs, with reciprocal channel state information (CSI) knowledge. The downlink received signal in the RF domain, $y_{an}$, at the MS is modeled following [29] in terms of the variables $H$, $z$, and $w$, which represent the complex channel, the control signal, and the additive white Gaussian noise (AWGN), respectively, where $w \sim N(0, \sigma_w^2)$ with variance $\sigma_w^2$. Here, the variable $V_{AP}$ denotes the beamforming matrix at the F-RAN. Furthermore, the received signal $y_{bb}$ at the MS subsequent to the combiner section is expressed in terms of $P_{tr}$ and $C_{MS}$, which denote the transmitted signal power and the MS combiner, respectively, where $C_{MS}$ includes baseband and analog combiners such that $C_{MS} = C_{bb} C_{an}$. Furthermore, the instantaneously received signal $y_{inst}$ at the MS results from the beamforming vector $v$ and the combining vector $c$, where $v \in V_{RH}$ and $c \in C_{MS}$.
In general, the poor scattering nature of propagation at mmWave bands imposes the use of geometric channel models [30,31], in which the variables $\Gamma_{bl}$ and $h_l$ denote, in order, the blockage path loss and the complex gain of the $l$-th path for $L$ paths captured in $K$ clusters. Note that path gains here are also assumed to be Rician-fading, i.e., $h_l \sim \mathcal{R}(0, \zeta)$, where $\zeta$ represents the ratio between the power in the first path and that in the other paths, which possess reduced power levels. Moreover, the variables $V_{RH}$ and $C_{MS}$ represent the response vectors that capture the channel at the F-RAN and MS, respectively. Therefore, the overall response vector for the beamformer at the MS is given by the RF precoding matrix and is computed by the periodic array factor for the UCA at the $\theta_s^{MS}$ azimuth and $\phi_s^{MS}$ elevation pointing directions for each section (likewise for the F-RAN).
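The structure of this signal chain can be illustrated with a small pure-Python sketch. It assumes the commonly used forms $H = \Gamma_{bl}^{-1/2}\sum_l h_l\, a_{rx,l}\, a_{tx,l}^{H}$ for the geometric channel and $y = \sqrt{P_{tr}}\, c^{H} H p\, z + c^{H} w$ for the combined signal. For brevity, half-wavelength ULA responses are used at both ends and the path gains are passed in rather than drawn from a Rician distribution. These forms, the normalization, and all names are assumptions, not the expressions of [29,30,31].

```python
import cmath
import math


def ula_response(n_ant: int, angle: float) -> list:
    # Response vector of a half-wavelength ULA (k * d = pi), unit-modulus entries.
    return [cmath.exp(1j * math.pi * n * math.cos(angle)) for n in range(n_ant)]


def geometric_channel(paths, n_rx: int, n_tx: int, blockage_loss: float = 1.0):
    """paths: iterable of (h_l, arrival angle at MS, departure angle at RRH)."""
    H = [[0j] * n_tx for _ in range(n_rx)]
    for h, aoa, aod in paths:
        a_rx = ula_response(n_rx, aoa)
        a_tx = ula_response(n_tx, aod)
        for i in range(n_rx):
            for j in range(n_tx):
                # Rank-one contribution h_l * a_rx * a_tx^H of path l.
                H[i][j] += h * a_rx[i] * a_tx[j].conjugate()
    scale = 1.0 / math.sqrt(blockage_loss)
    return [[scale * x for x in row] for row in H]


def received_signal(H, p, c, z: complex, p_tr: float, noise=None) -> complex:
    n_rx = len(H)
    if noise is None:
        noise = [0j] * n_rx  # noise-free case
    # Signal at the MS antennas: sqrt(P_tr) * H p z + w.
    r = [
        math.sqrt(p_tr) * sum(H[i][j] * p[j] for j in range(len(p))) * z + noise[i]
        for i in range(n_rx)
    ]
    # Apply the combiner: y = c^H r.
    return sum(c[i].conjugate() * r[i] for i in range(n_rx))
```

With a single unit-gain path and beamforming and combining vectors matched to that path and normalized to unit norm, the noise-free output magnitude equals $\sqrt{n_{rx} n_{tx}}$ times the signal amplitude, which is the array gain expected from this model.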

Cache Model
Assume that the set of locally installed NFs at the cache of the fog nodes in the F-RAN is denoted by $S = \{s\}$, i.e., $S \subset F_r$, where each cached NF $s$ has a processing and memory resource of $q_{prc}(s)$ and $q_{me}(s)$, respectively. Here, the cache status of an NF is represented by a binary variable.
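A minimal Python sketch of the cache bookkeeping is given below. The binary indicator follows the description above, whereas the popularity-ordered greedy fill is only an illustrative placement rule and is neither the clustered nor the distributed caching method proposed in this paper. The request counts and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NF:
    name: str
    q_prc: float  # processing resource q_prc(s)
    q_me: float   # memory resource q_me(s)


def cache_status(nf: NF, cached: set) -> int:
    # Binary cache variable: 1 if the NF is in the cache set S, else 0.
    return 1 if nf in cached else 0


def fits(cached: set, Q_prc: float, Q_me: float) -> bool:
    # The cached NFs must respect the node's maximum processing and memory.
    return (sum(s.q_prc for s in cached) <= Q_prc
            and sum(s.q_me for s in cached) <= Q_me)


def greedy_fill(popularity: dict, Q_prc: float, Q_me: float) -> set:
    """popularity: hypothetical map from NF to its observed request count."""
    cached = set()
    for nf in sorted(popularity, key=popularity.get, reverse=True):
        # Add the next most popular NF only if the node can still hold it.
        if fits(cached | {nf}, Q_prc, Q_me):
            cached.add(nf)
    return cached
```

The sketch makes the interplay explicit: popularity decides the order in which NFs are considered, while the node limits $Q_{prc}$ and $Q_{me}$ decide which of them actually receive a cache status of 1.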

Beamformer Architecture at the BS
Each mmWave F-RAN is equipped with a digital beamformer to allow for communication with multiple MSs and provide spatial multiplexing for an increased link capacity. Hence, the RRH are equipped with a UCA composed of ψ RH antennas that are equally spaced along circular ring. In contrast to the MS design, each antenna here is connected to one RF chain. Note that the total number of antennas is equal to the number of RF chains, i.e., A RH = ψ RH . Additionally, the overall radiated pattern from r BS represents a beamforming vector, p RH , in the beamforming matrix, P RH = p bb p an , where P bb and p an represent the baseband and analog beamforming stages, respectively.

Downlink Model
Consider that the set of MSs that operate in time-division duplexing (TDD) mode with the F-RANs, with reciprocal channel state information (CSI) knowledge. The downlink received signal model in the RF domain ƴ an at the MS is modeled as [29], where the variables H, z, and w represent the complex channel, control signal, and additive white Gaussian noise (AWGN), respectively, where Here, the variable V AP denotes the beamforming matrix at the F-RAN. Furthermore, the received signal ƴ bb at the MS subsequent to the combiner section, C MS , is expressed as, where the variables P tr and C MS denote the transmitted signal power and the MS combiner, respectively, and where C MS includes baseband and analog combiners, such that C MS = C bb C an . Furthermore, the instantaneously received signal ƴ inst at the MS due to p beamforming and c combining vectors, where v ∈ V RH and c ∈ C MS , is written as, In general, the poor scattering propagation nature at mmWave bands imposes the use of geometric channel models, written as [30,31], where the variable Γ bl and h l in order denote the blockage path loss and the complex gain of the l-th path for L number of paths captured in K clusters. Note that path gains here are also assumed to be Rician-fading, i.e., h l ~ Ɍ (0, ζ), where ζ represents the ratio between power in the first path and other paths that possess reduced power levels. Moreover, the variables V RH and C MS represent the response vectors that capture the channel at the F-RAN and MS, respectively. Therefore, the overall response vector for the beamformer at the MS is given by the RF precoding matrix and is computed by the periodic array factor for the UCA at θ

Cache Model
Assume that the set of locally installed NFs at the cache of the fog nodes in the F-RAN is denoted by S = {s}, i.e., S ⊂ F r with a processing and memory resource of q prc (s) and q me (s), respectively. Here, the cache status of a NF is represented by a binary variable, where the variables H, z, and w represent the complex channel, control signal, and additive white Gaussian noise (AWGN), respectively, where w~N(0, σ 2 w , with variance σ 2 w . Here, the variable V AP denotes the beamforming matrix at the F-RAN. Furthermore, the received signal EER REVIEW 7 of 14

Beamformer Architecture at the BS
Each mmWave F-RAN is equipped with a digital beamformer to allow for communication with multiple MSs and provide spatial multiplexing for an increased link capacity. Hence, the RRH are equipped with a UCA composed of ψ RH antennas that are equally spaced along circular ring. In contrast to the MS design, each antenna here is connected to one RF chain. Note that the total number of antennas is equal to the number of RF chains, i.e., A RH = ψ RH . Additionally, the overall radiated pattern from r BS represents a beamforming vector, p RH , in the beamforming matrix, P RH = p bb p an , where P bb and p an represent the baseband and analog beamforming stages, respectively.

Downlink Model
Consider that the set of MSs that operate in time-division duplexing (TDD) mode with the F-RANs, with reciprocal channel state information (CSI) knowledge. The downlink received signal model in the RF domain ƴ an at the MS is modeled as [29], ƴ where the variables H, z, and w represent the complex channel, control signal, and additive white Gaussian noise (AWGN), respectively, where w ~N(0, σ w 2 ), with variance σ w 2 . Here, the variable V AP denotes the beamforming matrix at the F-RAN. Furthermore, the received signal ƴ bb at the MS subsequent to the combiner section, C MS , is expressed as, ƴ where the variables P tr and C MS denote the transmitted signal power and the MS combiner, respectively, and where C MS includes baseband and analog combiners, such that an . Furthermore, the instantaneously received signal ƴ inst at the MS due to p beamforming and c combining vectors, where v ∈ V RH and c ∈ C MS , is written as, In general, the poor scattering propagation nature at mmWave bands imposes the use of geometric channel models, written as [30,31], where the variable Γ bl and h l in order denote the blockage path loss and the complex gain of the l-th path for L number of paths captured in K clusters. Note that path gains here are also assumed to be Rician-fading, i.e., h l ~ Ɍ (0, ζ), where ζ represents the ratio between power in the first path and other paths that possess reduced power levels. Moreover, the variables V RH and C MS represent the response vectors that capture the channel at the F-RAN and MS, respectively. Therefore, the overall response vector for the beamformer at the MS is given by the RF precoding matrix and is computed by the periodic array factor for the UCA at θ s MS azimuth and ϕ s MS elevation pointing directions for each section (likewise for the F-RAN).

Cache Model
Assume that the set of locally installed NFs at the cache of the fog nodes in the F-RAN is denoted by S = {s}, i.e., S ⊂ F r with a processing and memory resource of q prc (s) and q me (s), respectively. Here, the cache status of a NF is represented by a binary variable,

Beamformer Architecture at the BS
Each mmWave F-RAN is equipped with a digital beamformer to allow for communication with multiple MSs and provide spatial multiplexing for an increased link capacity. Hence, the RRH are equipped with a UCA composed of ψ RH antennas that are equally spaced along circular ring. In contrast to the MS design, each antenna here is connected to one RF chain. Note that the total number of antennas is equal to the number of RF chains, i.e., A RH = ψ RH . Additionally, the overall radiated pattern from r BS represents a beamforming vector, p RH , in the beamforming matrix, P RH = p bb p an , where P bb and p an represent the baseband and analog beamforming stages, respectively.

Downlink Model
Consider that the set of MSs that operate in time-division duplexing (TDD) mode with the F-RANs, with reciprocal channel state information (CSI) knowledge. The downlink received signal model in the RF domain ƴ an at the MS is modeled as [29], ƴ where the variables H, z, and w represent the complex channel, control signal, and additive white Gaussian noise (AWGN), respectively, where w ~N(0, σ w 2 ), with variance σ w 2 . Here, the variable V AP denotes the beamforming matrix at the F-RAN. Furthermore, the received signal ƴ bb at the MS subsequent to the combiner section, C MS , is expressed as, ƴ where the variables P tr and C MS denote the transmitted signal power and the MS combiner, respectively, and where C MS includes baseband and analog combiners, such that an . Furthermore, the instantaneously received signal ƴ inst at the MS due to p beamforming and c combining vectors, where v ∈ V RH and c ∈ C MS , is written as, In general, the poor scattering propagation nature at mmWave bands imposes the use of geometric channel models, written as [30,31], where the variable Γ bl and h l in order denote the blockage path loss and the complex gain of the l-th path for L number of paths captured in K clusters. Note that path gains here are also assumed to be Rician-fading, i.e., h l ~ Ɍ (0, ζ), where ζ represents the ratio between power in the first path and other paths that possess reduced power levels. Moreover, the variables V RH and C MS represent the response vectors that capture the channel at the F-RAN and MS, respectively. Therefore, the overall response vector for the beamformer at the MS is given by the RF precoding matrix and is computed by the periodic array factor for the UCA at θ s MS azimuth and ϕ s MS elevation pointing directions for each section (likewise for the F-RAN).

Cache Model
Assume that the set of locally installed NFs at the cache of the fog nodes in the F-RAN is denoted by S = {s}, i.e., S ⊂ F r with a processing and memory resource of q prc (s) and q me (s), respectively. Here, the cache status of a NF is represented by a binary variable, where the variables P tr and C MS denote the transmitted signal power and the MS combiner, respectively, and where C MS includes baseband and analog combiners, such that C MS = C bb C an . Furthermore, the instantaneously received signal

Beamformer Architecture at the BS
Each mmWave F-RAN is equipped with a digital beamformer to allow communication with multiple MSs and to provide spatial multiplexing for increased link capacity. Hence, each RRH is equipped with a UCA composed of $\psi_{RH}$ antennas that are equally spaced along a circular ring. In contrast to the MS design, each antenna here is connected to one RF chain. Note that the total number of antennas is equal to the number of RF chains, i.e., $A_{RH} = \psi_{RH}$. Additionally, the overall radiated pattern from the BS represents a beamforming vector, $p_{RH}$, in the beamforming matrix, $P_{RH} = P_{bb} P_{an}$, where $P_{bb}$ and $P_{an}$ represent the baseband and analog beamforming stages, respectively.
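As a rough illustration of how a UCA response vector and the resulting beam gain can be computed, consider the sketch below. It is not the paper's implementation: the function names, the half-wavelength radius, the $1/\sqrt{\psi_{RH}}$ normalization, and the angle convention (elevation measured from the array normal, array lying in the x-y plane) are all assumptions made for the example.

```python
import cmath
import math

def uca_response(num_antennas, radius_wavelengths, azimuth, elevation):
    """Array response of a uniform circular array lying in the x-y plane.

    Assumed convention: elevation is measured from the array normal (z-axis),
    angles are in radians, and the radius is expressed in wavelengths.
    """
    k_r = 2.0 * math.pi * radius_wavelengths  # wavenumber times radius
    response = []
    for n in range(num_antennas):
        gamma_n = 2.0 * math.pi * n / num_antennas  # angular position of antenna n
        phase = k_r * math.sin(elevation) * math.cos(azimuth - gamma_n)
        response.append(cmath.exp(1j * phase) / math.sqrt(num_antennas))
    return response

def beam_gain(weights, response):
    """Power gain |w^H a|^2 of beamforming weights w toward response a."""
    inner = sum(w.conjugate() * a for w, a in zip(weights, response))
    return abs(inner) ** 2

# Hypothetical setup: 8 antennas, half-wavelength radius, beam steered to 40 degrees azimuth.
steer = uca_response(8, 0.5, math.radians(40), math.radians(90))
on_target = beam_gain(steer, steer)
off_target = beam_gain(steer, uca_response(8, 0.5, math.radians(130), math.radians(90)))
```

With matched weights the normalized gain is one in the steered direction and drops for other pointing directions, which is the spatial selectivity that the beamforming vectors exploit.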

Downlink Model
Consider the set of MSs that operate in time-division duplexing (TDD) mode with the F-RANs, with reciprocal channel state information (CSI) knowledge. The downlink received signal in the RF domain at the MS, $y_{an}$, is modeled as in [29] in terms of the variables $H$, $z$, and $w$, which represent the complex channel, control signal, and additive white Gaussian noise (AWGN), respectively, where $w \sim N(0, \sigma_w^2)$ with variance $\sigma_w^2$. Here, the variable $V_{AP}$ denotes the beamforming matrix at the F-RAN. Furthermore, the received signal at the MS subsequent to the combiner section, $y_{bb}$, is expressed in terms of $P_{tr}$ and $C_{MS}$, which denote the transmitted signal power and the MS combiner, respectively, where $C_{MS}$ includes the baseband and analog combiners such that $C_{MS} = C_{bb} C_{an}$. The instantaneous received signal at the MS, $y_{inst}$, then follows for a given beamforming vector $v \in V_{RH}$ and combining vector $c \in C_{MS}$.
In general, the poor scattering nature of propagation at mmWave bands imposes the use of geometric channel models [30,31], in which the variables $\Gamma_{bl}$ and $h_l$ denote, in order, the blockage path loss and the complex gain of the $l$-th path, for $L$ paths captured in $K$ clusters. Note that the path gains here are also assumed to be Rician-fading, i.e., $h_l \sim \mathcal{R}(0, \zeta)$, where $\zeta$ represents the ratio between the power in the first path and that in the other paths, which possess reduced power levels. Moreover, the variables $V_{RH}$ and $C_{MS}$ represent the response vectors that capture the channel at the F-RAN and MS, respectively. Therefore, the overall response vector for the beamformer at the MS is given by the RF precoding matrix and is computed from the periodic array factor of the UCA at the $\theta_s^{MS}$ azimuth and $\phi_s^{MS}$ elevation pointing directions for each section (likewise for the F-RAN).
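For orientation, one common way to write a signal and channel model of this kind is sketched below. These forms are assumptions based on standard narrowband mmWave models and on the symbol definitions in the text, not the expressions of [29-31]. In particular, the placement of $\sqrt{P_{tr}}$, the Hermitian transposes, and the hypothetical response-vector symbols $a_{MS}$ and $a_{RH}$ (standing in for the MS and F-RAN array responses) are illustrative.

```latex
\begin{align}
y_{an}   &= \sqrt{P_{tr}}\, H\, V_{AP}\, z + w, \\
y_{bb}   &= \sqrt{P_{tr}}\, C_{MS}^{H}\, H\, V_{AP}\, z + C_{MS}^{H}\, w, \\
y_{inst} &= \sqrt{P_{tr}}\, c^{H} H\, v\, z + c^{H} w,
            \qquad v \in V_{RH},\; c \in C_{MS}, \\
H        &= \sum_{l=1}^{L} \Gamma_{bl}\, h_l\,
            a_{MS}\!\left(\theta_l^{MS}, \phi_l^{MS}\right)
            a_{RH}^{H}\!\left(\theta_l^{RH}, \phi_l^{RH}\right).
\end{align}
```

Under these assumptions, the combiner acts on both the signal and the noise, and the channel is a sum of $L$ rank-one terms, which reflects the low channel rank of a poor scattering environment.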

Cache Model
Assume that the set of locally installed NFs at the cache of the fog nodes in the F-RAN is denoted by $S = \{s\}$, i.e., $S \subset F_r$, where each NF $s$ requires processing and memory resources of $q_{prc}(s)$ and $q_{me}(s)$, respectively. Here, the cache status of an NF is represented by a binary variable that indicates whether the NF is cached.
Here, the set of cached NFs is predetermined by content popularity, i.e., the most requested NFs as per the study in [32] and as depicted in Figure 2 (retrieved from the study in [32]). The figure shows the most popular NFs in the network, where $f_1$ and $f_2$ account for call-in and call-out, $f_3$ and $f_4$ account for sms-in and sms-out, and $f_5$ represents internet-access traffic. Note that the set of popular traffic varies with the geographic area and the nature of the traffic; the same algorithm can be applied to other types of NFs and content popularity as well. Along with this, the cached NFs demanded by the MS terminal $m$ can be retrieved directly from the local fog node, whereas requests for uncached content must traverse to the cloud core for processing, i.e., they are conveyed to the cloud BBU via the fronthaul links.
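The cache model can be illustrated with a minimal sketch, assuming a simple in-memory representation. The class and function names, the resource values, and the `video-transcode` NF are hypothetical and serve only to show how the binary cache status splits a request between the local fog node and the cloud BBU.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NetworkFunction:
    name: str
    q_prc: float  # processing demand of the NF (arbitrary units, assumed)
    q_me: float   # memory demand of the NF (arbitrary units, assumed)

def cache_status(nf, cached):
    """Binary cache indicator: 1 if the NF is installed locally, else 0."""
    return 1 if nf in cached else 0

def split_request(requested, cached):
    """Split a request F_r into NFs served at the fog node and NFs sent to the cloud BBU."""
    local = [nf for nf in requested if cache_status(nf, cached) == 1]
    cloud = [nf for nf in requested if cache_status(nf, cached) == 0]
    return local, cloud

f1 = NetworkFunction("call-in", 1.0, 2.0)
f3 = NetworkFunction("sms-in", 0.5, 1.0)
f5 = NetworkFunction("internet-access", 2.0, 4.0)
f9 = NetworkFunction("video-transcode", 4.0, 8.0)  # hypothetical unpopular NF

cached_set = {f1, f3, f5}
local, cloud = split_request([f1, f9, f5], cached_set)
```

Here `local` holds the two cached NFs of the request, while the uncached one ends up in `cloud` and would incur the fronthaul delay.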

Caching Placement in mmWave F-RAN
The traffic received at the fog nodes from the various MS terminals differs in terms of processing capacity, lifetime, and delay bound. The goal is to place popular NFs at the edge of the network, i.e., at the F-RAN, without offloading to the cloud BBU. However, given the limited resources available at the F-RAN, efficient cache placement methods are proposed for the fog nodes.
Two schemes are studied to distribute cached content across the fog nodes, namely, the clustered caching and distributed caching methods. The former groups all the popular NFs on a single fog node in Layer III, given the abundant resources across the secondary fog nodes. Meanwhile, the distributed caching method allocates a single popular NF on every node in each cluster. The interconnection between the cluster nodes yields the complete set of cached NFs, where each host fog node is directly connected to an adjacent node that caches another popular NF. The details of the caching methods are as follows.
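The difference between the two placements can be sketched as follows. This is a simplified illustration under assumed names (`n_sec` for the secondary node, `n1` to `n5` for the primary nodes of one cluster), and it ignores resource constraints.

```python
def clustered_placement(popular_nfs, secondary_node):
    """Group every popular NF on the single secondary (Layer III) node."""
    return {secondary_node: list(popular_nfs)}

def distributed_placement(popular_nfs, primary_nodes):
    """Cache one popular NF on each primary (Layer II) node, in order."""
    if len(popular_nfs) > len(primary_nodes):
        raise ValueError("not enough primary nodes to host one NF each")
    return {node: [nf] for node, nf in zip(primary_nodes, popular_nfs)}

popular = ["f1", "f2", "f3", "f4", "f5"]
clustered = clustered_placement(popular, "n_sec")
distributed = distributed_placement(popular, ["n1", "n2", "n3", "n4", "n5"])
```

In the clustered map a single node hosts all five NFs, whereas in the distributed map a request that needs several NFs must visit several interconnected nodes.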
Clustered Caching Method: Consider an incoming request $r$ generated by an MS terminal $m$, requesting the set of NFs $F_r$. This request is received at the src node, which is the closest fog node to the terminal. The traffic traverses to Layer III, which contains the cached NFs. If any of the NFs is not supported at the node, then mapping is conducted on the direct neighbor in Layer II that lies in the direction of the path toward the src node. Consider the set of popular NFs $S = \{s\}$ for each batch of requests $R = \{r\}$. This set is group mapped on the secondary fog node as

$$\sum_{s \in S} n_j = 1, \quad \forall n_j \in N, \; \forall r \in R.$$
Now the path for request $r$ to the host secondary node is established to minimize the function

$$\mathcal{F} = \min\big(D(src, n_j)\big), \quad Path(r) = \{n_i, e \mid \min(D(n_i, n_j))\}, \quad \forall r \in R,$$

where $D(\cdot, \cdot)$ is the path delay between any two nodes across Layers II and III.
Here, a set of shortest paths results based on the end-to-end delay. These paths are sorted in ascending order, after which the least-delay path is selected. If any of the requested NFs in $F_r$ does not exist in $S$, then mapping is performed on the closest primary fog node in Layer II, which is termed the host primary node $n_i$, i.e.,

$$n_i \leftarrow \{n_i \mid \min D(n_i, n_j)\}, \quad \forall n_i \in Path(r) \mid d(n_i, n_j) = 1. \tag{10}$$

Distributed Caching Method: The popular NFs are cached separately across the fog nodes of each cluster. Then, incoming requests are routed to the sequence of interconnected nodes that host the entire set of cached NFs. The rationale behind this distribution mechanism is to avoid increased traffic directed toward a single node and link, i.e., in an effort to avoid node/link congestion and failure. Accordingly, a single popular NF $s \in S$ is cached on each host node $n_i$ in Layer II. The result is a set of interconnected nodes and links that form the cached path, $Path(S) = \{n_i, e\}$, across which all incoming requests are routed. Along with this, an incoming request demands $F_r$ NFs. If any $f_t \; \exists \; \{n_i \mid n_i \in Path(S)\}$, $\forall f_t \in F_r$, then the method maps $f_t$ on the first node in the path closest to the MS $m$.
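The least-delay path selection and the choice of the host primary node in the clustered method can be sketched as below. Dijkstra's algorithm is used here as a stand-in for the sort-and-select step described in the text. The topology, the delay values, and the function names are hypothetical.

```python
import heapq

def least_delay_path(graph, src, dst):
    """Dijkstra over link delays; returns (total_delay, [nodes from src to dst])."""
    best = {src: 0.0}
    parent = {}
    heap = [(0.0, src)]
    while heap:
        delay, node = heapq.heappop(heap)
        if node == dst:
            break
        if delay > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, link_delay in graph[node].items():
            candidate = delay + link_delay
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if dst not in best:
        raise ValueError("destination unreachable")
    path = [dst]
    while path[-1] != src:
        path.append(parent[path[-1]])
    return best[dst], path[::-1]

def host_primary_node(graph, path, secondary):
    """Node on the path that is one hop from the secondary node with least link delay."""
    neighbors = [n for n in path if n != secondary and secondary in graph[n]]
    return min(neighbors, key=lambda n: graph[n][secondary])

# Hypothetical topology: link delays in ms; "n_sec" is the Layer III node.
graph = {
    "src":   {"n1": 0.5, "n2": 0.4},
    "n1":    {"src": 0.5, "n_sec": 0.6},
    "n2":    {"src": 0.4, "n_sec": 1.0},
    "n_sec": {"n1": 0.6, "n2": 1.0},
}
delay, path = least_delay_path(graph, "src", "n_sec")
fallback = host_primary_node(graph, path, "n_sec")
```

In this toy topology the route through `n1` is selected even though the first hop to `n2` is shorter, since the end-to-end delay decides. The node `n1` then serves as the host primary node for any NF that is missing from the cache.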

Performance Evaluation
The proposed methods are evaluated versus the conventional uncached approaches in terms of service delay, network cost, and energy consumption. The uncached approaches map network functions online based on the least path delays between terminal users and fog nodes in Layers II and III, i.e., without prior knowledge about content popularity as opposed to the proposed caching methods.
The complexity of the proposed methods depends on the network size in terms of nodes and links, and is modeled as $O(|N| \log |E|)$. Further, the use of LSTM results in an additional complexity that scales with the number of network functions to be forecasted. A salient feature of the LSTM algorithm here is its reduced computational complexity versus other deep learning methods. It accepts random weight initializations and a variety of state information without requiring a prior setting of the input states [33]. With backpropagation through time (BPTT), the complexity per weight and time step is $O(1)$ [34], which is scaled by the number of network functions $|F_r|$. Along with this, the overall run-time complexity is formulated as $O(|N| \log |E| + |F_r|)$.

Network Service Delay
The service delay is determined by the total provisioning time required to allocate cached NFs on the nodes, the time required to provide a copy of a cached NF, and the time required to map any uncached NFs, i.e., the instantiation delay. This delay also includes the hopping time over all intermediate nodes and the link propagation delay along the path from the src node to the nodes that host the NFs. Given the stringent delay requirements, it is vital to provide services with the least delay.
Figure 3 shows the service provisioning delay of the proposed caching methods versus the conventional uncached methods for various numbers of incoming MS requests. At low numbers of incoming requests, the network is still underutilized and most of the nodes are unoccupied. Therefore, the delay here is relatively low for the two methods, e.g., ranges of 2.5-3 ms and 3-3.5 ms for the clustered and distributed methods, respectively, for the first 80 requests. In the clustered approach, incoming requests are routed to the secondary nodes that host all the cached NFs. This routing requires hopping over primary nodes that do not provide NFs to the requests, which causes a slight delay (e.g., 3 ms at 80 requests). In the distributed method, the cached NFs are hosted at the primary nodes in Layer II, which requires each node to provide service to the request, thus yielding an increased delay, e.g., 4.4 ms at 80 requests. At high traffic volumes, some requested NFs are uncached, which requires new NFs to be mapped. The distributed method maps these additional NFs on the established caching path, which increases the delay further. Meanwhile, the clustered approach maps the new NFs on the direct neighbors of the secondary node, which yields shorter paths and thus shorter delays that approach 5.5 ms at 200 requests.
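The delay components listed above can be combined in a simple additive sketch. The additive form, the parameter names, and all numeric values are assumptions chosen for illustration. They are not the paper's simulation settings and are not meant to reproduce Figure 3.

```python
def service_delay(link_delays, hop_delay, num_cached, num_uncached,
                  copy_delay, instantiation_delay):
    """Sum path, hop, cached-copy and instantiation delays (all in ms)."""
    propagation = sum(link_delays)
    hopping = hop_delay * max(len(link_delays) - 1, 0)  # intermediate nodes only
    cached = num_cached * copy_delay
    uncached = num_uncached * instantiation_delay
    return propagation + hopping + cached + uncached

# Hypothetical two-link path; three requested NFs per request.
all_cached = service_delay([0.5, 0.6], 0.2, 3, 0, 0.1, 1.5)
one_uncached = service_delay([0.5, 0.6], 0.2, 2, 1, 0.1, 1.5)
```

Even in this toy setting, a single uncached NF dominates the total, which is the effect the caching methods aim to avoid by allocating popular NFs in advance.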
Meanwhile, the uncached methods suffer from noticeable delays attributed to the provisioning of nodes after the terminals initiate service requests. Here, both the distributed and clustered uncached methods search for the best node to host the network functions, i.e., the node that yields the least aggregate path delay. This is in contrast to the proposed caching methods, which allocate the NFs in advance and thus reduce the instantiation time of the functions running on the nodes. The distributed uncached method suffers the highest delay, approaching 9.5 ms at 200 requests, which is attributed to the high number of nodes in the request path hosting the NFs. This delay is reduced in the clustered uncached method by aggregating the NFs on a single node, subject to available resources, which reduces the total number of nodes in the path and minimizes the link delays between the nodes.

Network Cost
The service cost across the network is proportional to the aggregated delays across the paths, i.e., the nodes and links in the request path occupy resources that add to the total cost. Namely, the cost for a single request $r$ over $Path(r)$ is gauged by the costs of the nodes and links along the path, where the variable $\Lambda(n_i)$ represents the cost of the node per usage unit for the cached and uncached NFs demanded by the request. Further, the variable $\Lambda(e)$ denotes the cost per usage unit of the link, which is proportional to the link bandwidth demand $b_r$. For further analysis on cost, the work in [35] presents a comprehensive cost model to "minimize application deployment cost and maximize the cloud owner's revenue in terms of the requested traffic, the backhaul capacity, and the number of MEC servers". The model incorporates the leasing cost of the MEC and cloud resources by end-users considering latency requirements. It also includes the deployment cost of servers (e.g., software running cost), the number of users using the server, and the size of their traffic.
Figure 4 shows that the costs of the two proposed methods are approximately similar, with a slight increase for the clustered approach. This is attributed to the high utilization of the secondary node in each cluster during the entire service time, along with the cost of the primary nodes and links that route the requests to the secondary nodes in Layer III. For example, the cost approaches 78 units at 80 requests. The distributed method requires multiple primary nodes to cache the various NFs, which also increases the cost. However, its cost is reduced because the secondary nodes remain unoccupied, particularly at low traffic volumes. At higher traffic, in particular after 80 requests, the distributed method is compelled to map uncached NFs on the secondary nodes to accommodate the high demand.
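A minimal sketch of a per-request cost of this kind is given below. It assumes that the cost is the sum of the node usage costs $\Lambda(n_i)$ plus the link costs $\Lambda(e)$ scaled by the bandwidth demand $b_r$. This additive form, the function name, and the numbers are assumptions, not the paper's exact expression.

```python
def request_cost(node_costs, link_costs, bandwidth):
    """Cost of one request: node usage costs plus bandwidth-scaled link costs."""
    node_part = sum(node_costs.values())
    link_part = sum(cost * bandwidth for cost in link_costs.values())
    return node_part + link_part

# Hypothetical path src -> n1 -> n_sec with a bandwidth demand of 4 units.
cost = request_cost(
    {"n1": 2.0, "n_sec": 5.0},
    {("src", "n1"): 0.5, ("n1", "n_sec"): 0.5},
    4.0,
)
```

Under this assumption, each additional node or link in the request path adds its own term, which is consistent with the observation that longer paths cost more.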
The uncached provisioning methods demand a higher number of nodes in the request path, given the redundant mapping of all incoming NFs regardless of the traffic popularity. This is most evident in the clustered approach, which favors nodes in the upper fog layer that feature more available resources, at the expense of increased cost. For example, at 200 requests the cost of the clustered uncached method is 10% higher than that of its cached counterpart, with a 35% increase for the distributed uncached method.

Energy Consumption in the Network
Energy consumption is a vital efficiency factor in network operations. Here, the total power consumption in the network is gauged during the caching and provisioning process for the NFs across the network nodes and links. It also includes the beamforming power consumption at the RRHs to communicate with the MSs. Together, the power consumed during the caching and provisioning times across the fog BBU nodes, the links, and the beamforming architectures accounts for the energy consumption in the network. First, for the network infrastructure, the work here adopts the power consumption model in [36], which measures the power usage in the cloud for a certain processing duration. Along with this, the power consumption at a single BBU node, $z(n_i)$, is

$$z(n_i) = \beta(n_i)\, z(n_i)|_{max} + \big(1 - \beta(n_i)\big)\, \sigma(n_i)\, z(n_i)|_{max},$$

where $\beta(n_i)$ accounts for the power consumption of the node in idle status, and $z(n_i)|_{max}$ is the maximum power consumption at the node. Furthermore, the variable $\sigma(n_i)$ denotes the node saturation rate, which depends on the utilization rate. Along these lines, the total power consumption for request $r$ is determined by aggregating $z(n_i)$ over all the nodes in the path. This is added to the power consumption of the switches in the network, where $z(x)$ is the power consumption of switch $x$ for $X$ total switches.
Figure 5 shows the total energy consumption levels. Similar to the analysis for the delay and cost, the clustered method groups cached NFs on the secondary node, which requires higher power consumption as compared to the primary nodes. The power consumption here also includes the utilization rate of the beamforming architectures at the F-RAN. Overall, the clustered method suffers from higher energy consumption, i.e., 11% higher at 200 requests versus the distributed caching counterpart.
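The node power model above can be evaluated with a short sketch. The per-node formula follows the expression in the text. The summation over path nodes and switches, the function names, and the wattage values are assumptions made for illustration.

```python
def node_power(beta, sigma, z_max):
    """z(n) = beta * z_max + (1 - beta) * sigma * z_max."""
    return beta * z_max + (1.0 - beta) * sigma * z_max

def path_power(nodes, switch_powers):
    """Total power: node terms along the path plus all switch terms."""
    return sum(node_power(*n) for n in nodes) + sum(switch_powers)

# Hypothetical values: idle fraction 0.6, maximum node power of 200 W.
idle = node_power(0.6, 0.0, 200.0)  # saturation rate 0: idle power only
full = node_power(0.6, 1.0, 200.0)  # saturation rate 1: maximum power
# Each node is (beta, sigma, z_max); two switches at 15 W each.
total = path_power([(0.6, 0.5, 200.0), (0.6, 0.0, 100.0)], [15.0, 15.0])
```

The two limiting cases show that the model interpolates linearly between the idle and maximum power as the saturation rate grows, so a heavily utilized secondary node draws more power than lightly loaded primary nodes.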
The behavior of the uncached methods follows the trend of the cost analysis, which is attributed to the increased number of nodes and links in the request path. This increased operational cost is accompanied by an increase in power and energy consumption as well.

Conclusions
This paper proposes cache distribution methods for popular network functions in fog-radio access networks to reduce service provisioning and fronthaul delays. This setting is suitable for millimeter wave communications, which demand reduced fronthaul delays to alleviate the effect of path and penetration losses. Results show that the clustered caching method minimizes the total time required for service provisioning as compared to the distributed method. Meanwhile, the latter achieves a slight reduction in network cost and energy. Overall, this supports a tradeoff based on network preference, i.e., the priority metric selected for operation. Future efforts will investigate the two methods in terms of online caching for real-time traffic.