Caching Method for Information-Centric Ad Hoc Networks Based on Content Popularity and Node Centrality

Abstract: In recent years, most internet communications have focused on accessing content such as video, web services, and audio. Conversely, traditional Internet communications are inefficient because they are primarily designed for data transfer between hosts. In response, Information-Centric Networking (ICN) has emerged as a content-oriented networking model. The impact of ICN in reducing the location dependency of data and its high compatibility with ad hoc networks has led to research on realizing Information-Centric ad hoc Networks (ICANET). There has also been extensive research into caching content in the network, which is one of the features of ICN. In static networks, methods have been proposed to cache highly popular content in nodes that are more likely to be used for shortest paths. However, in dynamic networks, content with high popularity should be cached on nodes that are more likely to reach all nodes, as missing nodes need to be taken into account. In this study, we propose a cache control scheme for content caching in ICANET that utilizes both content popularity and the closeness centrality of nodes within the ad hoc network as indicators. To realize the proposed method, a new packet flow based on the Pending Interest Table (PIT) and Content Store (CS) was implemented in the forwarding strategy of ICN. The experiments used ndnSIM, a protocol implementation of NDN based on Network Simulator 3, which is widely used in wireless network research. The experimental results showed that the cache hit rate could be increased by up to 4.5% in situations with low content bias. In the same situation, the response delay was also reduced by up to 28.3%.


Introduction
In recent years, the widespread adoption of smartphones and other mobile devices has led to a surge in communication demands. In addition, the need for high-volume, low-latency data, such as the simultaneous distribution of live video, movies, and music, is increasing and will continue to consume huge amounts of data [1].
The traditional TCP/IP network architecture, known as host-oriented networking, relies on IP addresses to identify the location of hosts. However, this approach necessitates a search for the path from the source to the destination each time the delivery endpoints change, leading to increased communication overhead. Moreover, information is exchanged on a one-to-one basis. Consequently, during periods of high user access, servers providing data experience increased processing load due to one-to-many communication, leading to degraded response times. However, recent internet usage has predominantly focused on accessing content such as video, web services, and audio. As a result, there is a gap between the original internet design concept and current usage, resulting in inefficient communication.
In response, Information-Centric Networking (ICN) has emerged as a proposed solution [2]. ICN is a new network architecture designed to avoid the network congestion and response degradation that are expected to occur when mobile and IoT devices communicate at scale. In addition to ICN, research is being conducted to solve these problems using edge computing and Reconfigurable Intelligent Surfaces (RIS) [3,4].
There is also growing interest in research aimed at realizing Information-Centric ad hoc Networks (ICANET), focusing on the effect of ICN in reducing content location dependency and its high affinity with ad hoc networks such as MANET [5] and WSN [6]. In ad hoc networks, research has also been conducted to improve routing by dynamically learning through Q-learning [7].
One of the features of ICN is that content is cached on routers in the network. By allowing users to access cached content, server load and the amount of network traffic can be reduced. In order to access content efficiently, research has also been conducted into the appropriate placement of content in the network [8-12]. This is a particularly important challenge in ad hoc networks where communication takes place between IoT devices, as content caching needs to be performed under the assumption that cache and battery capacity are very limited [13,14]. In addition, the disappearance of nodes caching content from the network must be considered, as nodes may be missing in dynamic networks.
In this study, we propose a cache control scheme for content caching in ICANET consisting of mobile nodes and agent systems. This scheme utilizes the content popularity and closeness centrality of nodes in the ad hoc network as indicators, aiming to reduce the increased communication load on the network. Based on Named Data Networking (NDN) [15], a type of ICN, we introduce a new packet flow mechanism leveraging the Pending Interest Table (PIT) and Content Store (CS) within the ICN forwarding strategy and evaluate its performance.

ICN Content Caching
In recent years, considerable research has focused on content caching in the field of ICN [16].
Honda et al. [17] conducted an analysis of ICNs, taking into account social network characteristics. Specifically, they focused on influential users, one of the main characteristics of social networks, and investigated how to select them using centrality indices and how their proportion affects ICN content caching from multiple perspectives. Experimental results showed that using betweenness centrality [18] and PageRank [19] improves the average cache hit rate. Furthermore, the study revealed that when the degree distribution of the social network follows a power law, the average cache hit rate can be significantly improved by appropriately determining the proportion of influential users.
However, a limitation of the Honda et al. study is its reliance on user influence as an indicator for content caching. This approach operates separately from the ICN packet forwarding strategy and may not be universally applicable across different ICN environments. Therefore, the content caching process should be integrated into the Interest and Data packet forwarding strategy.

ICN Caching Methods Based on Content Popularity and Node Centrality
Chang et al. [20] proposed and implemented an efficient caching policy on ICN. They introduced an on-path caching policy algorithm called PT-Cache (popularity-topology cache), in which content competes for cache space directly on router nodes using popularity ratings, based on the potential to save forwarding hops for future Interest packets. This strategy not only improves cache hit rates but also attempts to reduce packet forwarding hops by analyzing the network topology so that caching occurs at nodes closer to the user. The top nodes cache content that is not popular, while content of significant interest to users is cached closer to the edge routers. The remaining content is stored in the upstream routers, improving the overall hit rate and cache space efficiency and reducing the number of hops required to forward Interest packets for popular content.
In the Chang et al. study, the assumed network topology is a static topology that does not consider node movement. Accordingly, they proposed a forwarding strategy in which nodes close to users cache content with high popularity, while nodes close to servers cache content with low popularity. Moreover, they observed that more popular content tends to be cached at nodes that have higher betweenness centrality and are more likely to be used for the shortest routing paths. This caching algorithm is not suitable for use in ad hoc networks for several reasons. Firstly, in their study, the values of betweenness centrality are calculated offline in advance. However, in MANET and WSN, the network topology changes over time due to node movement and outages. This requires updating the node centrality values when the routing table is rebuilt, which could be addressed by a regular update mechanism similar to that used for the Interest packet access frequency. Additionally, since the shortest paths between nodes change as the topology changes, betweenness centrality becomes inappropriate as an indicator. In ad hoc networks, it is more sensible to treat nodes with a high probability of reaching all other nodes as nodes with high centrality. Therefore, the proposed method in this study applies closeness centrality [21] as an indicator. Regarding the caching tendencies of nodes, under the previous method there is a possibility of losing end nodes in ad hoc networks, so caches of highly popular content are likely to disappear. Therefore, in this study, which considers operation in ad hoc networks, caches of highly popular content are placed in nodes with high closeness centrality.

Summary
The proposed method is similar to the PT-Cache scheme proposed by Chang et al. (hereinafter referred to as PT-default). However, when applying this method to ad hoc networks, the evaluation of centrality is changed to closeness centrality.
When considering its application to ad hoc networks, it is not appropriate to use the betweenness centrality employed in the Chang et al. study as a measure for caching control. Using betweenness centrality tends to cache content with high popularity at the edge nodes, which are more likely to be used for the shortest routing path. However, in ad hoc networks, the cache of highly popular content is likely to disappear from the network as the edge nodes disappear due to topology changes.
Therefore, in ad hoc networks, it is better to use reachability to all nodes as a measure of node centrality for caching control.
Closeness centrality is a measure of how close a node is to all other nodes. Therefore, the shorter the distance from all other nodes, the higher the centrality. Using this measure, nodes that are more likely to reach all nodes will have higher centrality. As a result, highly popular content is more likely to be cached on nodes that are easier to reach throughout the network. Consequently, highly popular content can remain in the network even if topology changes cause the loss of nodes at the network's edges.

Content Popularity Rating
First, we define an Interest packet access frequency collector as an indicator to ensure that content popularity can reflect the current network content demand status.Each router (Consumer) node periodically updates the access frequency of its Interest packets (Equation (1)).
RF_{C_i}^{j} on the left-hand side is the access frequency of Interest packets at time period j for content C_i. It is calculated from MF_{C_i}^{T_j}, the access frequency of Interest packets collected by each router node in the j-th cycle T, and RF_{C_i}^{j−1}, the access frequency of Interest packets in the previous time period j − 1:

RF_{C_i}^{j} = β · MF_{C_i}^{T_j} + (1 − β) · RF_{C_i}^{j−1}    (1)

β is a weighting factor that adjusts the response speed to changes in the access frequency of Interest packets.
Next, we define how to classify popularity based on access frequency. Assuming that content popularity follows a Zipf distribution, the popularity of the content corresponding to the Interest packet received at each router (Consumer) node is set dynamically. Content popularity is categorized into four levels based on the frequency of Interest packet access (Table 1).
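The collector update and rank classification above can be sketched as follows. The exponentially weighted form of Equation (1) and the numeric rank thresholds are illustrative assumptions; the paper's actual level boundaries are given in Table 1.

```python
def update_access_frequency(rf_prev, mf_current, beta=0.5):
    """EWMA update assumed for Equation (1): RF^j = beta * MF^{T_j} + (1 - beta) * RF^{j-1}."""
    return beta * mf_current + (1.0 - beta) * rf_prev

def popularity_rank(rf, thresholds=(1.0, 5.0, 20.0)):
    """Map an access frequency to one of four popularity levels (1 = lowest).

    The threshold values are hypothetical stand-ins for Table 1.
    """
    rank = 1
    for t in thresholds:
        if rf >= t:
            rank += 1
    return rank

# Example: a router node tracking one content item over three collection cycles.
rf = 0.0
for mf in (2.0, 8.0, 30.0):  # per-cycle Interest counts MF^{T_j}
    rf = update_access_frequency(rf, mf, beta=0.5)
print(rf, popularity_rank(rf))
```

A larger β makes the collector react faster to the latest cycle's count, at the cost of more jitter in the popularity estimate.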

The Difference between Betweenness Centrality and Closeness Centrality
Betweenness centrality describes the extent to which a node in a network is used for the shortest paths between other nodes. The betweenness centrality BC(n) of node n is represented by Equation (2):

BC(n) = Σ_{i≠n≠j} δ_ij(n) / δ_ij    (2)
Here, δ_ij is the total number of shortest paths from node i to node j, and δ_ij(n) is the number of those paths that pass through node n. In dynamic networks, it is important to consider that the edge nodes may be missing due to topology changes. By using betweenness centrality, highly popular content is cached at the edge nodes, which are the nodes closest to the users. Therefore, in dynamic networks such as MANET, there is a higher probability that highly popular content will be missing from the network.
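The pairwise sum in Equation (2) can be illustrated with a small BFS-based sketch (a brute-force version for clarity, not the faster Brandes algorithm used in practice):

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, src):
    """BFS from src: shortest-path distance and shortest-path count to every node."""
    dist, sigma = {src: 0}, {src: 1}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                q.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # every shortest path to u extends to v
    return dist, sigma

def betweenness(adj, n):
    """BC(n): sum over pairs i != n != j of delta_ij(n) / delta_ij."""
    bc = 0.0
    others = [v for v in adj if v != n]
    for i, j in combinations(others, 2):
        dist_i, sigma_i = bfs_counts(adj, i)
        if j not in dist_i:
            continue  # j unreachable from i
        dist_j, sigma_j = bfs_counts(adj, j)
        # n lies on a shortest i-j path iff the two distances add up exactly
        if n in dist_i and n in dist_j and dist_i[n] + dist_j[n] == dist_i[j]:
            bc += sigma_i[n] * sigma_j[n] / sigma_i[j]
    return bc

# Path graph a - b - c: every shortest a-c path passes through b.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(betweenness(adj, "b"))  # 1.0
```

The example makes the drawback above concrete: b's score depends entirely on the pair (a, c), so if an end node such as c leaves the topology, the centrality landscape changes abruptly.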
In contrast, closeness centrality is a measure of how close a node is to the other nodes, obtained from the average distance to the other nodes. Considering the closeness centrality CC(n) of any node n (Equation (3)), the priority of the node in the cache is proportional to this value:

CC(n) = (N − 1) / Σ_{m≠n} d(n, m)    (3)

where N is the number of nodes and d(n, m) is the distance between nodes n and m.
By using closeness centrality, nodes that have an impact on the entire network are evaluated as having high centrality, meaning that upstream nodes will hold content with a high degree of popularity. This allows highly popular content to remain in the network, even in dynamic networks where edge nodes are expected to disappear, and is expected to improve the cache hit rate.
Therefore, this study uses the closeness centrality defined in Equation (3) as the evaluation of centrality.
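A minimal sketch of the closeness computation, assuming the standard (N − 1)/Σd form for Equation (3):

```python
from collections import deque

def closeness(adj, n):
    """CC(n) = (N - 1) / sum of shortest-path distances from n to all other nodes.

    This form is assumed to match Equation (3); a node that cannot reach
    every other node is given centrality 0.
    """
    dist = {n: 0}
    q = deque([n])
    while q:  # plain BFS over the unweighted topology graph
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    if len(dist) < len(adj):
        return 0.0  # n cannot reach every node
    total = sum(dist.values())
    return (len(adj) - 1) / total if total else 0.0

# Star graph: the hub reaches every leaf in one hop, so its centrality is maximal.
adj = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
print(closeness(adj, "hub"))  # 1.0
print(closeness(adj, "x"))    # 0.6
```

Note how the hub scores highest: it is the node most likely to reach the whole network, which is exactly the property the proposed method uses to select caching locations.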
The closeness centrality CC(n) is updated sequentially as the direction and speed of motion of node n change. If any router (Consumer) node n does not cache the data corresponding to the received Interest packet, it appends its own closeness centrality value CC(n) to the Interest packet and forwards it to the upstream node (Figure 1). The upstream node m that receives the Interest packet calculates ∆CC(m) from the downstream node's closeness centrality CC(n), extracted from the Interest packet, and its own CC(m) (Equation (4)):

∆CC(m) = CC(m) − CC(n)    (4)
If ∆CC ≥ 0, the cache priority of the content at that node should be increased by incrementing the number of times the Interest packet is accessed.
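The exchange in Figure 1 can be sketched as follows. The dictionary-based packet and node representations are hypothetical, and the difference form ∆CC(m) = CC(m) − CC(n) is an assumption consistent with the ∆CC ≥ 0 rule above.

```python
def forward_interest(interest, cc_self):
    """Downstream node n: attach its own closeness centrality before forwarding."""
    tagged = dict(interest)
    tagged["cc"] = cc_self
    return tagged

def on_interest_upstream(interest, cc_self, access_counts):
    """Upstream node m: raise the content's access count when DeltaCC(m) >= 0.

    A non-negative DeltaCC means m is at least as central as the sender,
    so the content's cache priority at m is increased.
    """
    delta_cc = cc_self - interest["cc"]
    if delta_cc >= 0:
        name = interest["name"]
        access_counts[name] = access_counts.get(name, 0) + 1
    return delta_cc

counts = {}
pkt = forward_interest({"name": "/video/1"}, cc_self=0.4)             # node n, CC(n) = 0.4
delta = on_interest_upstream(pkt, cc_self=0.6, access_counts=counts)  # node m, CC(m) = 0.6
print(round(delta, 3), counts)
```

Here the more central upstream node bumps the access count, biasing popular content toward nodes that can reach the whole network.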

Packet Format
To implement a caching policy that utilizes node centrality and Interest packet access frequency as indicators, a Signature is introduced.This Signature stores both the closeness centrality of the Interest packet forwarding source and the access frequency from the network for the same Interest packet.This Signature is added as part of the structure of the Interest packet.The structure of the Interest packet in the proposed method is depicted in Figure 2.

Interest Packet Forwarding Strategy
The forwarding strategy for Interest packets in the proposed method is illustrated in Figure 3.
It is a forwarding strategy wherein each router node that receives Interest(C_i), requesting content C_i, as input outputs the popularity of content C_i and the processing state of Interest(C_i). In any branching process, the popularity of the content is updated based on the access frequency collector at the end of the process (Equation (1)).
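A condensed sketch of the Interest-side branches (CS hit, PIT aggregation, FIB forwarding) may help fix the flow; the dictionary-based node model and return labels are illustrative simplifications, not the ndnSIM implementation:

```python
def process_interest(node, interest):
    """Simplified Interest pipeline: CS lookup, then PIT, then FIB."""
    name = interest["name"]
    if name in node["cs"]:                # cache hit: answer from the Content Store
        return "reply-from-cs"
    if name in node["pit"]:               # duplicate request: aggregate its RF in the PIT
        node["pit"][name]["rf"] += interest.get("rf", 0)
        return "aggregated"
    if name in node["fib"]:               # new request: record in PIT and forward upstream
        node["pit"][name] = {"rf": interest.get("rf", 0)}
        return "forwarded"
    return "dropped"                      # no route known

node = {"cs": set(), "pit": {}, "fib": {"/video/1"}}
print(process_interest(node, {"name": "/video/1", "rf": 2}))  # forwarded
print(process_interest(node, {"name": "/video/1", "rf": 3}))  # aggregated
print(node["pit"]["/video/1"]["rf"])                          # 5
```

The aggregation step is where the access frequency carried by later Interests accumulates, feeding the popularity collector of Equation (1).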

Data Packet Forwarding Strategy
The forwarding strategy of Data packets in the proposed method is illustrated in Figure 4. It is a forwarding strategy in which each router node that receives Data(C_i), corresponding to the content C_i, as input outputs the processing state of Data(C_i).
First, it checks whether the Interest(C_i) corresponding to Data(C_i) is registered in the PIT. If it is not registered in the PIT, Data(C_i) would normally be discarded; however, in this study, Data(C_i) is forwarded to downstream nodes because NDN is constructed on an ad hoc network. If it is registered in the PIT, the node refers to rank(C_i) from the popularity distribution of the PIT and decides whether to cache C_i in the CS based on its value.
In the case illustrated as ※1 in Figure 4, we explain the scenario where the popularity of the content at the forwarding node is rank 2. When the sequence number of the content is large and the request frequency is low while the node centrality is high, or when the sequence number is small and the request frequency is high while the node centrality is low, it is determined that the content in question should be rank 1, and no cache is retained. Otherwise, the content is cached in the CS as content with high cache value at the respective node.
In the case denoted as ※2 in Figure 4, we explain the scenario where the popularity of the content at the forwarding node is rank 1. When the sequence number of the content is small and the request frequency is high while the node centrality is high, or when the sequence number is large and the request frequency is low while the node centrality is low, it is determined that the relevant content should be rank 2, and the content is cached in the CS. Otherwise, the cache is not retained, as the content has low cache value at the respective node. In any branching process, Data(C_i) is transferred to the downstream node along the PIT at the end.
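The two branch conditions ※1 and ※2 can be captured in a small predicate. The boolean combinations follow the prose above; what counts as a "high" or "low" sequence number, request frequency, and centrality is left abstract, and the treatment of ranks 3 and above as always cacheable is an assumption based on Table 1.

```python
def should_cache(rank, seq_high, freq_high, centrality_high):
    """Return True when Data(C_i) should be stored in the CS at this node."""
    if rank == 2:
        # *1: demote to rank 1 (do not cache) when demand and centrality disagree.
        demote = (seq_high and not freq_high and centrality_high) or \
                 (not seq_high and freq_high and not centrality_high)
        return not demote
    if rank == 1:
        # *2: promote to rank 2 (cache) when demand and centrality agree.
        promote = (not seq_high and freq_high and centrality_high) or \
                  (seq_high and not freq_high and not centrality_high)
        return promote
    # Assumed: ranks 3-4 are "high cache value" content and are always cached.
    return rank >= 3

print(should_cache(2, seq_high=True, freq_high=False, centrality_high=True))   # False
print(should_cache(1, seq_high=False, freq_high=True, centrality_high=True))   # True
```

In effect, rank-2 content is cached unless the signals conflict, while rank-1 content is cached only when the signals align, matching the asymmetry described above.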

Experiment Environment
The experiments used ndnSIM 2.7 [22], a simulator designed for NDN. ndnSIM is built on Network Simulator 3 (NS-3) [23], a widely adopted platform for wireless network research that is commonly employed in NDN-related studies. The standard parameters for each experiment are outlined in Table 2. The RandomWaypointMobilityModel was chosen as the node mobility model for the following reason: in this experiment, it is assumed that ICN will be deployed in ad hoc networks established in densely populated areas, such as urban and suburban areas. To simulate a topology where there is no predictable pattern in the direction of node movement, Consumer nodes are configured to move randomly. The role of the Producer node is to distribute content; it is placed in the center of the topology as a single fixed node. Additionally, considering that nodes in ad hoc networks often have limited memory capacity, the content cache size of the CS for a particular content type is approximately 1/5.

Topology Configuration
The simulation topology comprises one Producer node and 45 Consumer nodes (Figure 5). Area 1 is a 100 × 100 region centered on the Producer node, where 5 of the 45 Consumer nodes move randomly. Area 2 is a 200 × 200 region centered on the Producer node, where 15 of the 45 Consumer nodes move randomly. Area 3 is a 300 × 300 region centered on the Producer node, where 25 of the 45 Consumer nodes move randomly. The Producer node itself is statically positioned in the center of the area. This setup simplifies the topology to include both densely populated areas and their surrounding suburbs.

Comparison of Cache Hit Rates for Each Caching Policy
We compared the cache hit ratio of PT-Cache, the proposed method in this study adapted to operate on ad hoc networks, with that of other methods. The first method was PT-default, used in the previous study, and the second was LCD (Leave Copy Down), which exhibited the second highest cache hit rate after PT-default in the previous study. The parameter s of the Zipf-Mandelbrot distribution was set to 0.7.
The experimental results are depicted in Figure 6. The cache hit rates for PT-Cache and PT-default, which use content popularity and centrality as indicators, reached up to 31.8% and 33.8%, respectively, around 10 s after the start. In contrast, when LCD was employed, the maximum cache hit rate observed was 22.1%. The cache hit rate was thus higher with PT-Cache and PT-default than with LCD, and the cache hit ratios of PT-Cache and PT-default were almost equal.

Comparison of Network Traffic Volumes
The network traffic for each caching policy is compared in Figure 7. The graphs show the cumulative sum over time of the number of signals transmitted by all Consumer nodes. There are four types of packets: InInterests, OutInterests, InData, and OutData. InInterests is the number of Interest packets forwarded to the router node, OutInterests is the number of Interest packets forwarded from the router node, InData is the number of Data packets forwarded to the router node, and OutData is the number of Data packets forwarded from the router node. The highest volume of Interest packet inflows and outflows was observed when LCD was used. PT-default resulted in the lowest amount of traffic, and PT-Cache resulted in slightly more traffic than PT-default.

Comparison of Cache Hit Rates When Requested Content Is Biased
Experiments were conducted to compare the cache hit ratio when the content requested by Consumer nodes is biased. By varying the value of the parameter s in the Zipf-Mandelbrot distribution, a bias is introduced in the content. Generally, a value of s in the range [0.2, 1.5] is considered appropriate. As the value of s increases, the sequence number bias of the delivered content increases, and consequently, the bias of the requested content also increases.
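How s controls the request bias can be seen directly from the Zipf-Mandelbrot probability mass function. The q offset of 0 used below is an illustrative assumption (ndnSIM exposes q as a separate parameter of its Zipf-Mandelbrot consumer):

```python
def zipf_mandelbrot_pmf(n_items, s, q=0.0):
    """P(k) proportional to 1 / (k + q)^s for content ranks k = 1..n_items."""
    weights = [1.0 / (k + q) ** s for k in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# Share of requests captured by the 10 most popular of 100 content items.
for s in (0.2, 0.7, 1.5):
    pmf = zipf_mandelbrot_pmf(100, s)
    print(f"s = {s}: top-10 contents receive {sum(pmf[:10]):.1%} of requests")
```

At s = 0.2 the request mass is spread across the whole catalog, while at s = 1.5 it concentrates heavily on a few popular items, which is the bias the following results vary.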
The results of the experiment are depicted in Figure 8. For s = 0.7 and 1.2, the cache hit ratio is higher for PT-default, PT-Cache, and LCD, in that order. Conversely, for s = 0.2 and 1.5, the cache hit ratio is higher for PT-Cache, PT-default, and LCD, in that order. In particular, at s = 0.2, where the difference was larger, the content cache hit rates of PT-Cache, PT-default, and LCD reached up to 30.3%, 25.8%, and 23.2%, respectively, representing an increase of 4.5% over the existing study. An increase in the cache hit rate can be observed for all caching policies as the content popularity bias increases.

Comparison of Average Response Delay
The average response delay from sending an Interest packet to receiving the corresponding data packet is compared for each caching policy adopted.
The average response delay for each caching policy when the parameter s is varied is depicted in Figure 9.The experimental results showed that when the parameter s was 0.2, the average response delay of PT-Cache, PT-default, and LCD were 957 ms, 1334 ms, and 1249 ms, respectively, an improvement of 28.3% compared to existing studies.PT-default exhibited the largest average response delay, surpassing even LCD, which is the most common method.In contrast, PT-Cache, the proposed method in this study, demonstrated the smallest average response time.
It can also be observed that the average response delay for all caching policies tended to decrease as the value of the s parameter increased, i.e., as the bias of the content requests increased. From Figure 9, when the parameter s was within the range [0.2, 1.2], the response delay of PT-Cache was the lowest, and at s = 1.5 it became equal to that of LCD.

Comparison of Cache Hit Rates for Each Caching Policy
The experimental results indicate that the cache hit ratio of the proposed method PT-Cache and that of the previous study's caching policy PT-default are nearly identical. Moreover, the performance of the proposed method surpasses that of the existing method LCD. This suggests that employing content popularity and node centrality as evaluation indicators is advantageous as a caching policy for ICANET. However, the advantage of transitioning from betweenness centrality to closeness centrality as the centrality index is not entirely evident.
For all caching policies employed in the experiment, the cache hit rate exhibited a sharp increase immediately after the simulation started. During the initial phase of the simulation, nothing is cached in the CS. When PT-Cache and PT-default are used, the access frequency RF, which is the index for setting content popularity at each node, is 0 for all content, and all content transferred for the first time is cached in the CS with a popularity rank of 3. Consequently, the accessibility of the cache may have been temporarily higher. Similarly, for LCD, it can be assumed that this phenomenon occurred due to the low redundancy of the cache in the network, as each transferred content item was cached evenly across the network. Additionally, after 10 s from the start, the cache hit ratio decreases over time. In this simulation, the content popularity is reflected in the types of content cached as time progresses, resulting in a greater bias in the types of content cached at each node. Consequently, the cache hit rate decreases. Similarly, for LCD, the bias of cached content types increases over time; therefore, it is assumed that the cache hit rate decreases due to higher cache redundancy in the network.

Comparison of Network Traffic Volumes
The experimental results reveal that the largest amount of network traffic is generated when LCD is used as the caching policy, while PT-default exhibits the least amount of traffic, and PT-Cache has slightly more traffic than PT-default.This indicator is positively correlated with redundant traffic in ICN.Analyzing network occupancy helps ensure network quality of service.PT-Cache and PT-default are more effective in eliminating redundancy in the in-network cache, thus reducing the load on network traffic.Moreover, the high prevalence of content that can be cached in the node results in a high hit rate and reduced retransmissions of requests.
Consider why the traffic volume of PT-Cache was larger than that of PT-default. PT-Cache caches highly popular content at nodes that are more likely to be reached, whereas PT-default leaves the cache at nodes that are more likely to be used on the shortest path. It can be inferred that, when PT-default is adopted, the cache can be reached in a relatively small number of hops, resulting in fewer Interest and Data packets being transferred.

Comparison of Cache Hit Rates When Requested Content Is Biased
The parameter s of the Zipf-Mandelbrot distribution enables variation in the popularity bias of the content. Experimental results indicate that for PT-Cache and PT-default, the performance is nearly equal in both cases.
Moreover, an increase in the cache hit rate can be observed for any caching policy by biasing the content popularity.In this scenario, most of the requested content is cached in the CS of many router (Consumer) nodes, enabling the node caching the content data to be reached with a small number of hops.Consequently, it is assumed that the redundancy of the cache in the network is not a concern, and the importance of selecting the nodes to cache is diminished.
The reason for the cache hit ratio reaching its maximum at 10 s, and then, decreasing with time is thought to be the same as described in Experiment 1.

Comparison of Average Response Delay
The experimental results reveal that PT-default exhibits the largest average response delay, surpassing even LCD, which is the most commonly employed method. In contrast, PT-Cache, the method proposed in this study, yields the lowest average response delay. For all caching policies, it is observed that the smaller the bias of the content requests, the larger the average response delay, whereas the larger the bias, the smaller the delay. This finding underscores the influence of content bias on the response delay.
The average response delay serves as a quantification of the time between requesting content and its delivery, offering an intuitive measure of the user's network experience. In real-world scenarios, the significance of reducing the response delay increases as application use increases.
PT-default is believed to elevate the average response delay owing to the loss of terminal nodes resulting from node migration, along with the loss of caches containing crucial content within the network.On the other hand, it can be inferred that PT-Cache mitigates the increase in average response delay since caches of important content persist within the network even in the absence of terminal nodes due to node movement.
The reduction in average response delay for any caching policy as the bias of content requests increases mirrors the increase in cache hit ratio described above.We attribute this trend to the presence of caches containing important content across many nodes.Consequently, even if certain nodes are missing, the cache storing the requested content data remains accessible.Furthermore, when s = 1.5, the average response delay of PT-Cache and LCD are comparable.This suggests that the proposed method has no advantage when the content is concentrated in popularity, but it demonstrates a significant advantage when the content bias is minimal.

Future Works
In this experiment, the topology was configured to simulate scenarios with high population density, such as suburban areas near an urban center. An analogous situation to present MANET usage is during emergencies, such as disaster scenarios, where users equipped with mobile terminals adhere to specific movement protocols [24,25]. In such constrained and time-sensitive data acquisition situations, low-latency communication emerges as a crucial factor, making the method proposed in this study highly effective.
Additionally, while we used Equation (1) from the prior research by Chang et al. to determine content popularity in this study, we will consider testing other calculation formulas that reflect content popularity in future experiments.
Furthermore, for practical reasons, this study used the physical distance between nodes to calculate centrality values. However, in real-world scenarios, the distance between nodes should be measured as the number of hops between them. Thus, future research could explore adjustments to the methodology to align more closely with real-world network dynamics.

Conclusions
Much of today's internet communication is consumed by content retrieval, which significantly burdens communication networks and poses a considerable challenge. ICN, a content-oriented network model, has been proposed in response, fueling active research into content caching control schemes. The effect of ICN in mitigating data location dependency and its high compatibility with ad hoc networks has also led to research on the realization of ICANET.
In this study, we proposed a cache control scheme for content caching in ICANET consisting of mobile nodes and agent systems. We utilized content popularity and the closeness centrality of the nodes in the ad hoc network as indicators. The implementation was based on Named Data Networking (NDN), an ICN framework, in which we introduced a new packet flow using the Pending Interest Table (PIT) and Content Store (CS) in the forwarding strategy. This scheme ensures access to critical content even in ad hoc networks where end nodes may be missing due to topology changes. Experimental results showed that the proposed cache control method increased the cache hit rate by up to 4.5% and reduced the delivery response delay by up to 28.3% compared to existing methods. As future work, we will consider measuring the distance between nodes as the number of hops, mirroring real-world conditions, and testing other calculation formulas that reflect content popularity.

Figure 1 .
Figure 1. Centrality update flow during Interest packet transfer.

Figure 2 .
Figure 2. Message format of the Interest packet.
First, the node checks whether the content C_i requested by Interest(C_i) is cached in the CS of the router node. If a cache exists, it calculates ∆CC(C_i) based on Equation (4), adjusts the MF_{C_i} access count of Interest(C_i) according to that value, and transfers the corresponding Data(C_i) along the PIT. If no cache exists, it initializes with rank(C_i) = 1 and checks whether Interest(C_i) is registered in the PIT. If it is registered in the PIT, it extracts the access frequency RF from Interest(C_i), adds it to RF(C_i) in the PIT, and then discards Interest(C_i). If it is not registered in the PIT, it checks whether Interest(C_i) is registered in the Forwarding Information Base (FIB). If it is registered in the FIB, RF(C_i) is extracted from the PIT, added to the RF of Interest(C_i), and Interest(C_i) is then forwarded to the upstream node along the FIB.

Figure 6 .
Figure 6. Comparison of cache hit rates for each caching policy.

Figure 9 .
Figure 9. Average response time for varying parameter s.

Table 1 .
Popularity of cache content.

Table 2 .
Standard parameters for each experiment.