1. Introduction
A Content Delivery Network (CDN) is an important technology for data delivery in the Internet of Things (IoT). It replicates content across geographically distributed proxy servers so that users can obtain content from servers close to them, thereby reducing transmission delay and loading time [1,2]. Because proxy servers share the load of the origin servers and relieve them of directly handling user requests, CDNs improve overall system performance and stability. By replicating content on multiple servers, a CDN can quickly fetch content from other servers in the event of the requested server’s failure, ensuring continuous service availability. In addition, a CDN effectively reduces network bandwidth consumption and data center operating costs by caching content and optimizing data transmission. It works by caching static content (such as images, videos, CSS, and JavaScript) on CDN nodes distributed globally; when a user requests a website, DNS routes the request to the nearest CDN node [3]. Depending on the request, the node either returns the cached content directly to the user or fetches the latest content from the origin server and returns it. Common CDN service providers include Cloudflare, Akamai, Amazon CloudFront, and Google Cloud CDN [4]. CDNs are widely used in a variety of websites and applications, especially those that require fast loading and high availability, such as video streaming, e-commerce, and social media.
On the other hand, recommender systems play a crucial role in today’s Internet [5]. In contrast to content caching, which focuses on the popularity of user requests, content recommendation emphasizes the personalization of user interests and recommends content to users according to their individual interests, possibly including new content that the users have not requested before. As recommender systems have matured, recommendations have become greatly important for improving service satisfaction in CDNs. For example, Netflix reports that 80% of its total product views are influenced by recommendation systems [6]. Jointly optimizing content caching and recommendation, for both popularity and personalization, is of increasing significance for improving system efficiency and service satisfaction. However, achieving this faces great technical challenges due to the conflicting goals of caching and recommendation: while cache hit-rate maximization focuses on finding commonalities among user requests to improve network efficiency, recommendation emphasizes users’ individual needs to enhance user stickiness. As different users have different needs and access patterns, it is challenging to strike a balance among diverse user needs that ensures both the accuracy of recommended content and the efficiency of caching frequently used content [7].
Most previous studies focus on either cache hit-rate maximization or recommender system optimization, and little is known about jointly optimizing the two, particularly with a provable performance guarantee. Our research is dedicated to solving the problem of jointly optimizing content caching and recommendation to maximize the cache hit rate without overly distorting the recommended content. We formulate this problem as a constrained optimization problem and show its NP-hardness even without considering recommendations. For the more general case where the recommendation is made among the cached contents, we show that the problem has a monotonically increasing and submodular objective function and propose an algorithm with an approximation ratio of $(1-1/e)$.
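For reference, this is the classical guarantee for greedy maximization of a monotone (monotonically increasing) submodular set function $f$ under a cardinality constraint, which underlies the ratio stated above:

$$f(S_{\mathrm{greedy}}) \;\ge\; \Bigl(1 - \tfrac{1}{e}\Bigr)\,\max_{|S| \le k} f(S) \;\approx\; 0.632\,\max_{|S| \le k} f(S).$$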
Our contributions are summarized as follows:
- Define the joint caching and recommendation optimization problem and show its NP-hardness even without considering the recommendation.
- Prove that when a recommendation is made from cached contents, this problem has a monotonically increasing and submodular objective function.
- Propose a greedy algorithm to solve the above problem with a worst-case performance guarantee of $(1-1/e)$ relative to the optimal solution.
- Experimentally validate the performance of the proposed algorithm and demonstrate its variation under different parameter settings.
2. Related Work
As a classical topic in data networks, the main challenge for caching is how to accurately predict user demand and content popularity. Because content demand is temporally localized (i.e., content relevance is time-dependent, and similar content is frequently accessed within a short period of time), demand prediction for user content should take the temporal locality between contents into account. If user content demand is accurately predicted, then the cache hit rate can be improved, which is the main problem CDNs aim to solve. CDNs are deployed by network operators, and in recent years their customers have mainly been content providers such as YouTube and TikTok. One of the core competencies of these content providers is content recommendation, which can enhance user stickiness and thus improve the providers’ revenue. Therefore, the joint optimization of caching and recommender systems in CDNs has become an important research area, which is also the core topic of our paper. To the best of our knowledge, the joint role of caching and recommendation is considered simultaneously in [8,9,10,11,12,13,14]. The first four of them address caching recommendations for video content, while the rest address caching recommendations for generic content.
The work [8] seems to be the first that makes caching decisions based on recommender system decisions. Their recommendation-driven caching scheme outperforms traditional content prediction schemes, although its advantages disappear in real-world scenarios where caching serves a larger number of users. The work [10] proposed to cache the most popular content locally rather than globally. The work [12] proposed a method to identify cached content for CDNs based on a recommender system, but it was not quantitatively evaluated. What refs. [8,10,12] have in common is that caching is based on the recommender system; in other words, they consider post-recommendation user preferences as the actual user preferences. Our work differs from theirs in that our recommender system is not used purely for the prediction of user content demand but as a user preference moderation tool for balancing user preferences with performance metrics. Our proposed algorithm tunes user preferences during the course of their prediction to improve network performance.
In fact, the studies in [9,11,13,14] are closer to ours. The work [9] ranked cached videos higher by rearranging the list of related videos, thus influencing users’ preferences and boosting the cache hit rate. The work [13] studied peer-to-peer (P2P) systems, and its algorithm mainly solves the recommendation problem considering the cost of content distribution. The work [14] proposed a framework for proactive resource and demand allocation and theoretically showed that the delivery cost of proactive caching is lower; its recommender system enhances the certainty of user demand by actively modifying user ratings of content. The work [11] proposed the concept of a “soft cache hit”, i.e., if a user clicks on a relevant, already-cached alternative to unhit content, it is counted as a “soft cache hit”; it also studied the caching-recommendation joint optimization problem.
In addition, the work [15] proposed a method to jointly optimize content caching and recommendation, aiming to improve the cache hit rate and user experience in regional networks. By analyzing users’ request patterns and behaviors and dynamically adjusting the cached and recommended contents, it achieves more efficient content distribution and recommendation. However, its deployment is difficult due to high computational complexity and the need for frequent policy changes. The work [16] proposed a Markov model to handle recommendation-driven user requests and developed an iterative ADMM-based algorithm to minimize the access cost while maintaining the quality of the recommendations, addressing the large fluctuations in user preferences and the complex sequential consumption patterns that make traditional caching methods ineffective in edge environments. The work [17] proposed “soft cache hits” to improve the cache hit rate by caching not only the content directly requested by the user but also content related to the requested content. It developed a model to evaluate the impact of recommending relevant content on cache performance and then proposed an optimization algorithm that analyzes users’ content access patterns and relevance, dynamically adjusting the caching policy to maximize “soft cache hits” and reduce content access latency. The work [18] defined a streaming experience metric that captures the fundamental tradeoff between caching and recommendations and proposed an approximation algorithm under this metric that balances cache capacity constraints and recommendation quality to maximize user satisfaction and cache hit rate. The algorithm combines cache management and recommendation to reduce content access latency through an optimization policy. It also provided the approximation guarantee of the algorithm and experimentally verified its effectiveness in real applications, showing better cache utilization and recommendation results under the given tradeoff metric between caching and recommendations.
Closely related to the above work, Liang et al. [19] studied the problem of cooperative caching among vehicles to reduce energy consumption and delay in vehicular networking. The authors solve the problem with a deep reinforcement learning approach that considers the constraints of tolerable task latency and vehicle dwell time. Su et al. [20] studied the joint caching-recommendation optimization problem and gave a reinforcement learning-based algorithm with prediction and allocation phases: the prediction phase estimates the values of cache choices by considering value policy dependencies, while the allocation phase determines the cache choices for each request while satisfying global budget constraints. Chen et al. [21] studied optimization in large-scale caching and recommender systems and proposed a cache-aware reinforcement learning approach to optimize recommendations through real-time computation and caching. Song et al. [22] studied the wireless edge scenario of device-to-device offloading, which in effect recommends a user’s cached content to encountered users; the strategy takes into account both personalized preferences and relative location without burdening the cellular link. Fu et al. [23] studied the problem of maximizing the cache hit rate in wireless edge offloading scenarios, giving a time-efficient iterative algorithm that takes into account the quality of personalized recommendations, the number of recommendations, and the cache capacity requirements for each user.
Zhang et al. [24] investigated proactive content pushing during off-peak hours to improve spectrum efficiency, revenue prediction based on reinforcement learning, and the joint pushing, pricing, and recommendation problem in cache-enabled wireless access networks solved via linear programming. Fu et al. [25] optimized content encoding by joint caching and recommendation, studied the hit-rate maximization problem in wireless edge caching networks, and obtained the optimal solution from a game-theoretic perspective. Tsigkar et al. [26] studied how recommendation reduces the cost and increases the revenue of CDNs; the authors proposed a model to find the tradeoff between the two and gave an economic mechanism based on Nash bargaining to enhance the final financial returns. Tharakan et al. [27] investigated the joint optimization of caching and recommendation for wireless edge computing, modeling the impact of recommendations on popularity through probabilistic transformation matrices and solving the optimization problem with point estimation and Bayesian estimation methods. Zheng et al. [28] modeled the influence of recommendation on a user’s preference distribution and then proposed a heuristic algorithm with low computational complexity to solve the joint recommendation and cache optimization problem.
Deployment of joint caching and recommendation is becoming increasingly popular and has brought significant performance improvements in various applications, including video streaming [8,9,10] and content delivery [11,12,13,14,15,16,17,18,29].
A classification of major approaches for caching-recommendation joint optimization is shown in Table 1.
Compared with previous work, we do not assume that a recommender system is always “honest”; that is, our recommender system may make recommendations driven by factors such as product promotion rather than purely by user interests, which may make its recommended content deviate from cache hits. We study the joint optimization of caching and recommendation in CDNs in the general case, without assuming the honesty of the recommender system or a particular tradeoff metric between caching and recommendations. Most existing studies either focus only on cache optimization, consider caching-recommendation joint optimization under special assumptions about their relationship, or provide only heuristics and deep reinforcement learning methods without a proof of performance guarantee. In contrast, our study formulates the general-case joint optimization problem of caching and recommendation, shows its NP-hardness even in the simple case of a single user, and provides a solution with a performance guarantee of $(1-1/e)$ relative to the optimal solution for the case where the recommended contents are selected only from the cached contents.
6. Simulation Experiments
We use simulation experiments to validate our algorithm. The request frequency of each user for each content is randomly generated such that each user’s frequencies over all contents sum to 1000; similarly, the size of each content is randomly generated such that the sizes sum to 1000, with a minimum size of one.
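To make the setup concrete, the following is a minimal sketch of this data-generation step; the variable names, the random generator, and the rescaling approach are our own illustrative choices, not the paper’s exact procedure.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def random_sum_to(total, n, minimum=0.0):
    # Draw n random values and rescale them so they sum to `total`,
    # reserving `minimum` per item up front.
    budget = total - minimum * n
    raw = rng.random(n)
    return raw / raw.sum() * budget + minimum

m, n = 10, 100  # numbers of users and contents (illustrative values)
# Each user's request frequencies over all contents sum to 1000.
freq = np.stack([random_sum_to(1000, n) for _ in range(m)])
# Content sizes sum to 1000, each at least one.
sizes = random_sum_to(1000, n, minimum=1.0)
```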
The benchmark that serves as a reference for our algorithm is the Least Frequently Used (LFU) caching algorithm. It caches the items that attract the maximum total demand among all users but does not recommend any content. LFU maximizes the cache hit rate in the absence of recommendations.
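A minimal sketch of this baseline, reusing `freq` and `sizes` from above; packing by total demand per unit size is our reading of “items that attract the maximum total demand” under a capacity limit, and pure total demand would be an equally plausible ordering.

```python
import numpy as np

def lfu_cache(freq, sizes, capacity):
    # Greedily cache the contents with the highest total demand per unit
    # size until the capacity is exhausted; nothing is recommended.
    total_demand = freq.sum(axis=0)
    order = np.argsort(-total_demand / sizes)
    cached, used = [], 0.0
    for i in order:
        if used + sizes[i] <= capacity:
            cached.append(i)
            used += sizes[i]
    return cached

def hit_rate(freq, cached):
    # Fraction of all requests served directly from the cache.
    return freq[:, cached].sum() / freq.sum()
```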
In Figure 2, all parameters other than the cache capacity and $k$ are held fixed. The figure shows how the hit rate varies with capacity for the LFU algorithm and for our algorithm under several values of $k$ (the largest being 40). First, for both the LFU algorithm and our algorithm under different values of $k$, the cache hit rate increases with capacity. The upper limit is reached when the capacity approaches the total size of all contents, at which point the hit rate can approach 100%; however, this means that virtually all content is cached, which is impractical and pointless. Second, regardless of the value of $k$, our algorithm always achieves a higher hit rate than LFU, and the larger the value of $k$, the higher the hit rate. This can be interpreted as an expansion of the selection space of recommended content, which gives our algorithm more room to operate. Finally, we observe that the larger the server cache capacity, the smaller the gap between the algorithms. Conversely, when the cache capacity is small, the difference between LFU and our algorithm is large, which highlights the advantage of our algorithm even more.
Figure 3 and Figure 4 share the same parameter settings, with everything other than the two distortion parameters held fixed. Figure 3 illustrates how the incremental hit rate of our algorithm over the LFU algorithm varies with the two distortion parameters, one of which is $\gamma$; this increment mainly reflects the role of the recommendation algorithm. When the other parameter is held constant and $\gamma$ increases, the hit-rate increment increases, indicating that the larger $\gamma$ is, the more space the recommendation algorithm has to choose from, which brings a larger hit-rate increment. On the other hand, as $\gamma$ increases, the marginal hit-rate increment decreases. This indicates that the effect of $\gamma$ on the hit-rate increase gradually decays and also reflects the submodularity of the problem. In fact, as $\gamma$ increases, the main factor limiting the increase in hit rate becomes the other parameter. When $\gamma$ is held constant, the behavior with respect to the other parameter is observed to be similar. This indicates that the two parameters together determine the incremental hit rate. Theoretically, both parameters should be set as large as possible to ensure larger hit-rate increments. However, they actually limit the magnitude of the distortion of the recommender system with respect to the user’s original preferences: when both are large, the limits are weakened, and the recommender system’s distortion of user preferences can be large. We use the Jensen–Shannon Divergence (JSD) to measure the degree of distortion of user preferences, where the JSD measures the difference between the probability distribution $P$ of a user’s content requests before and the distribution $Q$ after the action of the recommender system, i.e., $\mathrm{JSD}(P \| Q) = \frac{1}{2} D_{\mathrm{KL}}(P \| M) + \frac{1}{2} D_{\mathrm{KL}}(Q \| M)$, where $M = \frac{1}{2}(P + Q)$ and $D_{\mathrm{KL}}(\cdot \| \cdot)$ denotes the Kullback–Leibler divergence. Figure 4 illustrates the variation of the JSD with the two distortion parameters.
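For reference, a minimal sketch of this JSD computation between a user’s pre-recommendation request distribution `p` and post-recommendation distribution `q`; we assume base-2 logarithms, which bound the JSD by 1, consistent with the distortion limit of 1 used below.

```python
import numpy as np

def kl(p, q):
    # Kullback-Leibler divergence D(p || q) in bits; zero-probability
    # entries of p contribute nothing by convention.
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def jsd(p, q):
    # Jensen-Shannon Divergence of two request distributions.
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```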
In addition, when the two distortion parameters tend to infinity, the hit rate approaches the full hit rate, but the difference between the probability distributions becomes huge, no matter how small the cache is, as long as it can store one or more contents. In fact, under one such setting, LFU achieves a hit rate of only 17.929, whereas after executing the algorithm in this paper, the hit rate reaches 45.317372, which is 90.63% of the ideal full hit rate of 50; however, the JSD is then as high as 0.212237. That is to say, the hit rate is extremely high, but the distortion of the user’s preferences is extremely large. The reason is that when both parameters tend to infinity, the constraints on the recommendation score effectively fail. The algorithm then scores the recommendation of some cached content as infinity, and the recommendation score completely sways the user’s requests, allowing the user to access only the cached content. This results in a very poor user experience, because what the user can request is entirely up to the system.
With the remaining experimental parameters held fixed, Figure 5 and Figure 6 show how the hit rate and the JSD, respectively, vary with the upper limit of distortion. It can be seen that the hit rate does not always increase as the limit increases, but the JSD increases monotonically with it. In fact, even when the limit approaches its upper bound of 1, the hit rate does not approach the full hit rate of 50 but only reaches 24.251540, while the JSD is 0.001888. This shows that the upper limit of distortion can only determine the degree of “honesty” of the recommender system; it cannot directly determine the degree to which the system distorts the user’s preferences.
Figure 7 shows the relationship between the cache hit rate and the number of contents $n$ under different server capacities and different user preference distributions, with the remaining parameters held fixed. Here, the “zipf” suffix indicates experimental data whose user preferences follow the Zipf distribution; otherwise, user preference frequencies are completely random, i.e., follow the uniform distribution. Under the Zipf distribution, the content ranked $x$ has a user preference probability that decreases with its rank $x$. The results show that in either case the hit rate always decreases as the number of contents $n$ increases, which can be explained by the fact that as the number of contents grows, preferences vary more among users, making it more difficult to find contents that satisfy the majority of users. At the same time, the hit rate under the Zipf distribution is always much higher than that under the uniform distribution. For example, under one capacity setting, the hit rate under the Zipf distribution is 45.572, while that under the uniform distribution is 37.471, i.e., 21.62% higher for Zipf. This difference further expands as the number of contents $n$ increases: at a larger $n$, the two hit rates are 42.65 and 20.97, respectively, the former being 103.39% higher than the latter. This is because under the uniform distribution the hit rate decreases rapidly as the number of contents $n$ increases, whereas under the Zipf distribution it does not. Under the uniform distribution, all user preferences are completely randomized, and the more content there is, the harder it is to find common preferences. Under the Zipf distribution, on the other hand, the higher a content’s rank, the higher the probability that it is preferred by users, and users’ preferences tend to be similar, so an increase in the number of contents has less impact on finding commonly preferred content.
Figure 8 and Figure 9 show, respectively, how the cache hit rate and the JSD vary with the number of recommended contents $k$ under different settings of the two distortion parameters, with the remaining experimental parameters held fixed. Figure 8 suggests that the hit rate increases with $k$ but stops increasing once $k$ reaches a certain magnitude. The effect of the number of recommended contents on the increase in hit rate is not significant: when the two distortion parameters are small, there is almost no improvement, whereas when they are large, the number of recommended contents $k$ brings a large improvement. The fundamental reason is that small values of the two parameters limit the distortion of user preferences by the recommender system, making the recommender system less useful. Therefore, to ensure that increasing the number of recommendations $k$ is effective, both parameters should be increased. On the other hand, when the two parameters are small, the JSD grows in the same way as the hit rate, both increasing with $k$ until convergence. Notice that when the two parameters are large, the JSD first increases and then decreases with $k$. This indicates that in the first half of the curve, where the hit rate increases rapidly with increasing $k$, the recommender system is limited by the number of recommendations $k$, and increasing $k$ liberates the recommender system’s recommending ability, bringing a rapid increase in both the hit rate and the JSD. In the second half of the curve, where the hit rate stabilizes as $k$ increases, a larger number of recommendations $k$ does not increase the hit rate but reduces the distortion of user preferences: because more content can be recommended, the recommender system does not have to overly distort certain users’ preferences for certain content, which decreases the JSD.
In summary, we experimentally verified that our algorithm greatly outperforms the LFU algorithm and that the hit rate increases as the algorithm’s optional recommendation set expands with an increase in the cache size $c$ or the value of $k$. We observe that an increase in the number of contents $n$ makes it harder to find users’ common preferences and thus reduces the hit rate, which is more pronounced under the uniform distribution with completely randomized user preferences. In addition, the two distortion parameters directly determine the degree of distortion of the recommender system with respect to users’ preferences, as they together determine the constraints on the recommendation score: the larger their values, the larger the algorithm’s hit rate, but also the larger the distortion of users’ preferences. On the other hand, if the two parameters are large enough, increasing the number of recommendations $k$ can simultaneously improve the hit rate and reduce the distortion of user preferences. To ensure the user experience, the two parameters should not be set too large. Finally, the experiments show that the upper limit of distortion can only determine the degree of “honesty” of the recommender system but cannot directly determine the degree to which the system distorts users’ preferences; nevertheless, to ensure the “honesty” of the recommender system, the upper limit of distortion should not be set too large.
7. Conclusion and Future Work
In this paper, we address the problem of jointly optimizing caching and recommendation for content delivery in IoT, which is NP-hard even for a single user and without considering recommendation. We consider the case where a recommendation is made among the cached contents, which is more general than assuming that a recommendation is made only in accordance with users’ preferences, as in the existing work. We solve this problem by splitting it into two subproblems: first, caching to maximize the total cache hit rate determined by user requests for all contents without considering recommendation, and then performing recommendation from the cached contents. We show that Subproblem 1 can be solved optimally by dynamic programming and that Subproblem 2 possesses the submodular property, implying the existence of a solution with a worst-case approximation ratio of $(1-1/e)$ relative to the optimal solution. Taking advantage of the properties of these two subproblems, we present a greedy algorithm for the joint caching-recommendation optimization problem with a provable performance guarantee of $(1-1/e)$. We experimentally validate the performance of our algorithm and demonstrate its hit-rate improvement through simulation experiments and comparison with the popular LFU algorithm. We also show how the hit rate and the user-preference distortion, measured by the Jensen–Shannon Divergence, vary under different settings of the cache capacity, the two distortion parameters, and the distortion limit. The results show that these parameters should be set appropriately to trade off the cache hit rate against the distortion of user preferences.
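To make the two-stage structure concrete, the following is a minimal sketch under simplifying assumptions: integer content sizes, aggregate demand as the caching objective, and an abstract monotone submodular `gain` function for the recommendation step. The actual objective and constraints in the paper are richer than this.

```python
import numpy as np

def cache_by_dp(demand, sizes, capacity):
    # Subproblem 1: 0/1 knapsack solved by dynamic programming -- choose
    # contents maximizing total demand under the capacity constraint.
    n = len(sizes)
    best = np.zeros(capacity + 1)
    keep = np.zeros((n, capacity + 1), dtype=bool)
    for i in range(n):
        for c in range(capacity, sizes[i] - 1, -1):
            if best[c - sizes[i]] + demand[i] > best[c]:
                best[c] = best[c - sizes[i]] + demand[i]
                keep[i, c] = True
    cached, c = [], capacity  # backtrack to recover the chosen set
    for i in range(n - 1, -1, -1):
        if keep[i, c]:
            cached.append(i)
            c -= sizes[i]
    return cached

def recommend_greedy(cached, k, gain):
    # Subproblem 2: pick k recommendations from the cached contents by
    # greedy marginal gain; for a monotone submodular `gain`, the result
    # is within a factor (1 - 1/e) of the optimum.
    chosen = []
    for _ in range(k):
        candidates = [i for i in cached if i not in chosen]
        if not candidates:
            break
        chosen.append(max(candidates,
                          key=lambda i: gain(chosen + [i]) - gain(chosen)))
    return chosen
```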
When recommendations are made only from cached content, the frequency of fetching content from the central server is reduced, thereby alleviating network bandwidth pressure; in environments where IoT device resources are limited, this assumption therefore has practical significance. In addition, we can cache the content that users may like in advance based on their browsing history; the cached content is then closer to users’ actual preferences, and recommendations drawn from it will not deviate too far from those preferences.
In addition, we use staged optimization of caching and recommendation, setting the cache based on users’ initial request preferences. However, recommendations change the probability distribution of user requests, and our cache optimization depends on that distribution, so the cache may no longer be optimal after recommendations are made. If the cache optimization is redone, it will in turn change the recommendation optimization. In practice, recommendation and caching need to be optimized alternately to obtain the best hit rate, and the two phases interact with each other, turning the problem into an online combinatorial optimization problem that evolves over time, where each optimization must adapt to changing inputs caused by previous actions. The research of Robinson Duque et al. [30] provides a class of dynamic combinatorial problems that precisely fits this kind of framework. In future work, we will draw on their framework to extend our two-stage greedy caching and recommendation scheme into an online, recurrent caching-recommendation alternating optimization scheme, enhancing the robustness and applicability of our approach in the dynamic environments of real-world applications.