Article

Efficient Streaming Algorithms for Maximizing Monotone DR-Submodular Function on the Integer Lattice

1 Faculty of Information Technology, Ho Chi Minh City University of Food Industry, 140 Le Trong Tan Street, Ho Chi Minh City 700000, Vietnam
2 Department of Computer Science, Faculty of Electrical Engineering and Computer Science, VŠB-Technical University of Ostrava, 17. listopadu 15/2172, 708 33 Ostrava, Czech Republic
3 Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City 700000, Vietnam
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(20), 3772; https://doi.org/10.3390/math10203772
Submission received: 1 September 2022 / Revised: 6 October 2022 / Accepted: 9 October 2022 / Published: 13 October 2022
(This article belongs to the Special Issue Complex Network Modeling: Theory and Applications)

Abstract

In recent years, the problem of maximizing submodular functions has attracted much interest from the research community. However, most submodular functions are specified as set functions, whereas recent advances have studied the maximization of a diminishing-return submodular (DR-submodular) function on the integer lattice. Many publications show that DR-submodular functions have wide applications in optimization problems such as sensor placement, optimal budget allocation, social networks, and especially machine learning. In this paper, we propose two main streaming algorithms for the problem of maximizing a monotone DR-submodular function under a cardinality constraint. Our two algorithms, called StrDRS1 and StrDRS2, have approximation ratios of (1/2 − ϵ) and (1 − 1/e − ϵ) and query complexities of O((n/ϵ) log(log ‖B‖_∞ / ϵ) log k) and O((n/ϵ) log ‖B‖_∞), respectively. We conducted several experiments to investigate the performance of our algorithms on the budget allocation problem over the bipartite influence model, an instance of the monotone DR-submodular function maximization problem over the integer lattice. The experimental results indicate that our proposed algorithms not only provide solutions with a high value of the objective function, but also outperform the state-of-the-art algorithms in terms of both the number of queries and the running time.
MSC:
03G10; 06C05; 06D99; 30E10; 65K10

1. Introduction

Submodular function maximization problems have recently received great interest in the research community. A satisfactory explanation for this attraction is the prevalence of optimization problems related to submodular functions in many real-world applications [1]. Prominent examples include sensor placement problems [2,3] and facility location [4] in operational improvement, the influence maximization problem in viral marketing [5,6], document summarization [7], experiment design [8], dictionary learning [9] in machine learning, etc. These problems can be modeled with the concept of submodularity, and effective algorithms have been developed that take advantage of the submodular structure [10]. Given a ground set E, a function f : 2^E → ℝ_+ is called submodular if, for all A, B ⊆ E,
f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B)
The submodularity of f is equivalent to the diminishing-return property, i.e., for any A ⊆ B ⊆ E and e ∈ E \ B, it holds that
f(A ∪ {e}) − f(A) ≥ f(B ∪ {e}) − f(B)
and the set function f is called monotone if
f(A) ≤ f(B) for any A ⊆ B ⊆ E
The submodular function maximization problem aims to select a subset A of the ground set E to maximize f ( A ) .
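As a concrete illustration (a toy example, not from the paper), a coverage function satisfies all of the properties above; the sketch below checks submodularity and monotonicity exhaustively on a three-element ground set:

```python
import itertools

# Toy coverage function: f(A) = |union of the sets covered by A|.
# Coverage functions are monotone and submodular.
COVER = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4, 5}}

def f(A):
    covered = set()
    for e in A:
        covered |= COVER[e]
    return len(covered)

ground = set(COVER)
subsets = [set(s) for r in range(len(ground) + 1)
           for s in itertools.combinations(ground, r)]

# Submodularity: f(A) + f(B) >= f(A | B) + f(A & B) for all pairs.
assert all(f(A) + f(B) >= f(A | B) + f(A & B)
           for A in subsets for B in subsets)
# Monotonicity: f(A) <= f(B) whenever A is a subset of B.
assert all(f(A) <= f(B) for A in subsets for B in subsets if A <= B)
```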
Most existing studies of the submodular function maximization problem consider submodular functions defined as set functions: the problem takes a subset of the ground set as input and returns a real value. However, there are many real-world situations in which it is crucial to know not only whether an element e ∈ E is selected but also how many copies of that element should be chosen. In other words, the problem considers submodular functions over a multiset, known as submodular functions on the integer lattice [11]. Submodularity defined on the integer lattice differs from that of set functions because it does not equate to the diminishing-return property. Some notable examples include the optimal budget allocation problem [12], document summarization and sensor placement [13], the submodular welfare problem [14], and maximization of influence spread with partial incentives [15]. The definitions of a submodular function and a diminishing-return submodular function on the integer lattice are as follows:
A function f : Z_+^E → ℝ is a submodular function on the integer lattice if, for all x, y ∈ Z_+^E,
f(x) + f(y) ≥ f(x ∨ y) + f(x ∧ y)
where x ∨ y and x ∧ y denote the coordinate-wise maximum and minimum operations, respectively.
A function f : Z_+^E → ℝ is called diminishing-return submodular (DR-submodular) if, for all x, y ∈ Z_+^E with x ≤ y,
f(x + χ_e) − f(x) ≥ f(y + χ_e) − f(y)
where e ∈ E and χ_e denotes the unit vector whose coordinate e is 1 and whose other coordinates are 0.
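To make the lattice definition concrete, here is a small illustrative check (our own example, not from the paper): any function of the form f(x) = g(‖x‖_1) with g concave is DR-submodular, and the sketch verifies the diminishing-return inequality exhaustively on a small box:

```python
import itertools
import math

E = ["a", "b"]

def f(x):
    # g(||x||_1) with g(t) = sqrt(t) concave, hence DR-submodular on Z_+^E.
    return math.sqrt(sum(x.values()))

def add_copy(x, e):
    y = dict(x)
    y[e] += 1                 # y = x + chi_e
    return y

# Check f(x + chi_e) - f(x) >= f(y + chi_e) - f(y) whenever x <= y.
B = 4
box = [dict(zip(E, v)) for v in itertools.product(range(B + 1), repeat=len(E))]
for x in box:
    for y in box:
        if all(x[e] <= y[e] for e in E):
            for e in E:
                assert f(add_copy(x, e)) - f(x) >= f(add_copy(y, e)) - f(y) - 1e-12
```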
Lattice submodularity is weaker than DR-submodularity: a lattice submodular function may not be a DR-submodular function, but any DR-submodular function is a lattice submodular one [11]. For this reason, developing approximation algorithms is challenging; even for a single cardinality constraint, a more complicated method is needed, such as partial enumeration [12,13]. Nevertheless, the diminishing-return property of the DR-submodular function maximization problem often plays a fundamental role in practical problems, such as optimizing budget allocation among channels and influencers [12], optimal budget allocation [13], and online submodular welfare maximization [14].
There have been many approaches to solving the problem of maximizing a submodular function under different constraints and contexts in the last decade. Two notable approaches are greedy algorithms [16,17,18,19] and streaming algorithms [20,21,22]. Plenty of studies show that the greedy method is often used for this optimization problem because it outputs a better result than other methods thanks to its “greedy” operation [16,23,24]: the greedy method scans the data many times to find the best element at each step. However, this gives greedy algorithms long runtimes and makes them impractical for big data. Contrary to the greedy method, the streaming method scans the data once. As each element in the dataset arrives in order, the streaming algorithm must decide whether that element is selected before the next element arrives. Thus, the result of this method may not be as good as that of greedy, because the elements it selects are not the best but merely meet the selection condition. However, the outstanding advantage of the streaming method is that it runs much faster than the greedy method [25]. Many studies have used the streaming method to solve the submodular function maximization problem and have shown its advantages over the greedy method. Some prominent studies include using a streaming algorithm for maximizing k-submodular functions under budget constraints [26], optimizing a submodular function under noise by streaming algorithms [27], maximizing a monotone submodular function by multi-pass streaming algorithms [20], and using fast streaming for the problem of submodular maximization [22].
Attracted by the usefulness of maximizing a DR-submodular function on the integer lattice in many practical problems, numerous studies on this problem have recently been published. These publications consider the problem under many different constraints and use greedy or streaming methods as the standard approach. Some prominent examples include the use of a fast double greedy algorithm to maximize a non-monotone DR-submodular function [24], using a threshold greedy algorithm to maximize a monotone DR-submodular function under a knapsack constraint over the integer lattice [28], combining the threshold greedy algorithm with a partial element enumeration technique to maximize a monotone DR-submodular function under a knapsack constraint over the integer lattice [11], using a streaming method to maximize DR-submodular functions with d-knapsack constraints [29], using a one-pass streaming algorithm for DR-submodular maximization with a knapsack constraint over the integer lattice [30], and using streaming algorithms for maximizing a monotone DR-submodular function with a cardinality constraint on the integer lattice [31].
Our contribution. In this paper, we focus on the maximization of a monotone DR-submodular function under a cardinality constraint on the integer lattice (the MDRSCa problem in Definition 2). In the literature, there are two notable methods for this problem. First, Soma et al. [11] proposed the Cardinality constraint/DR-submodular algorithm (called CaDRS), which interpolates between the classical greedy algorithm and a truly continuous algorithm. This algorithm achieves an approximation ratio of (1 − 1/e − ϵ) with O((n/ϵ) log ‖B‖_∞ log(k/ϵ)) complexity. Second, Zhang et al. [31] first devised a streaming algorithm based on Sieve streaming [32]. Zhang's method achieves an approximation ratio of (1/2 − ϵ) with O(k/ϵ) memory and O((k/ϵ) log^2 k) query complexity. Inspired by Zhang's method [31], our study devises two improved streaming algorithms for the problem and obtains positive results compared to state-of-the-art algorithms. Specifically, our main contributions are as follows.
  • To resolve the MDRSCa problem, we first devise an algorithm (called StrOpt) that handles each element in one scan of the data under the assumption that the optimal value (OPT) is known. We prove that StrOpt guarantees a theoretical approximation ratio of (1/2). Next, we devise a (1/4)-approximation streaming algorithm (called the Stepping-Stone algorithm), which serves as a subroutine to calculate the threshold for the main algorithm. Then, based on the StrOpt and Stepping-Stone algorithms, we provide two main streaming algorithms to solve this problem, named StrDRS1 and StrDRS2. Because OPT cannot be determined in actual situations, we estimate OPT by a conventional method, observing that OPT ∈ [m, 2km] where m = max_{e∈E} f(χ_e). Based on the estimated OPT, StrDRS1, a one-pass streaming algorithm, has an approximation ratio of (1/2 − ϵ) and takes O((n/ϵ) log(log ‖B‖_∞ / ϵ) log k) queries. For StrDRS2, we first find a temporary result that satisfies the cardinality constraint by the Stepping-Stone algorithm. Subsequently, we increase the approximation ratio in StrDRS2 by finding elements that satisfy the threshold derived from the above temporary result. StrDRS2 is a multi-pass streaming algorithm that scans O(1/ϵ) passes, takes O((n/ϵ) log ‖B‖_∞) queries, and returns an approximation ratio of (1 − 1/e − ϵ).
  • We further investigate the performance of our algorithms through experiments on datasets from practical applications. We run four algorithms, StrDRS1, StrDRS2, CaDRS [11], and SieveStr++ [31], to compare their performance. The results indicate that our algorithms provide solutions with a theoretically guaranteed value of the objective function and outperform the state-of-the-art algorithms in both the number of queries and the runtime.
Table 1 compares the theoretical properties of our algorithms with those of the current state-of-the-art algorithms for the problem of maximizing a monotone DR-submodular function with a cardinality constraint on the integer lattice.
Organization. The structure of our paper is as follows: Section 1 introduces the development of submodular function maximization on sets and multisets; in particular, we focus on maximizing the monotone DR-submodular function on the integer lattice under a cardinality constraint and present the main contributions of our study. Section 2 reviews the related work. The definition of the problem and some notation are introduced in Section 3. Section 4 contains our proposed algorithms and theoretical analysis. Section 5 shows the experimental results and evaluation. Finally, Section 6 concludes the paper and discusses future work.

2. Related Work

A considerable amount of literature has been published on the maximization of monotone submodular functions under many different constraints over many decades. Nemhauser et al. [33] are pioneers in studying approximations for maximizing submodular set functions in combinatorial optimization and machine learning. They proved that the standard greedy algorithm gives a (1/2)-approximation under a matroid constraint and a (1 − 1/e)-approximation under a cardinality constraint. Their method served as a model for further development. Later, Sviridenko [34] developed an improved greedy algorithm for maximizing a submodular set function subject to a knapsack constraint. This algorithm achieves a (1 − 1/e)-approximation with O(n^5) time complexity for a knapsack constraint. Subsequently, Calinescu et al. [35] first devised a (1 − 1/e)-approximation algorithm for maximizing a monotone submodular function subject to a matroid constraint. This method combines a continuous greedy algorithm with pipage rounding, which rounds the approximate fractional solution of the continuous greedy approach to obtain an integral feasible solution. Recently, Badanidiyuru et al. [36] designed a (1 − 1/e − ϵ)-approximation algorithm, for any fixed constant ϵ > 0, for maximizing submodular functions. This algorithm takes O((n/ϵ) log(n/ϵ)) time for the cardinality constraint.
Several studies have recently begun investigating the maximization of DR-submodular functions on the integer lattice under various constraints. Soma et al. (2014) [13] studied monotone DR-submodular function maximization over the integer lattice under a knapsack constraint. They proposed a simple greedy algorithm with an approximation ratio of (1 − 1/e) and pseudo-polynomial time complexity. Next, Soma et al. (2018) [11] continued by developing polynomial-time approximation algorithms for DR-submodular function maximization under a cardinality constraint, a knapsack constraint, and a polymatroid constraint on the integer lattice, respectively. For the cardinality constraint, they devised an algorithm based on the decreasing-threshold greedy framework. For the polymatroid constraint, they developed an algorithm based on an extension of continuous greedy algorithms. For the knapsack constraint, they used the same decreasing-threshold greedy framework as for the cardinality constraint; however, this algorithm takes its initial solution as an input, whereas the algorithm for the cardinality constraint always uses the zero vector as the initial solution. All three algorithms run in polynomial time and achieve a (1 − 1/e − ϵ)-approximation ratio. Besides, Soma et al. (2017) [37] also studied the problem of non-monotone DR-submodular function maximization. They proposed a double greedy algorithm with a 1/(2 + ϵ)-approximation ratio and O((n/ϵ) log^2 ‖B‖_∞) complexity. Subsequently, Gu et al. (2020) [24] studied the problem of maximizing a non-monotone DR-submodular function on the bounded integer lattice. They proposed a fast double greedy algorithm that improves the runtime, achieving a 1/2-approximation with O(n log ‖B‖_∞) time complexity. Liu et al. (2021) [29] developed two streaming algorithms for maximizing DR-submodular functions under d-knapsack constraints.
The first is a one-pass streaming algorithm that achieves a ((1 − θ)/(1 + d))-approximation with O(log(dβ^{−1})/(βϵ)) memory complexity and O((log(dβ^{−1})/ϵ) log ‖B‖_∞) update time per element, where θ = min(α + ϵ, 0.5 + ϵ) and α, β are the upper and lower bounds on the cost of each item in the stream. The second is an improved streaming algorithm that reduces the memory complexity to O(d/(βϵ)) with an unchanged approximation ratio and query complexity. Zhang et al. (2021) [31] built on the Sieve streaming method to develop a streaming algorithm for the problem of maximizing a monotone DR-submodular function under a cardinality constraint on the integer lattice. Their algorithm achieves an approximation ratio of (1/2 − ϵ) and takes O((k/ϵ) log^2 k) complexity. This is the problem that we study in this paper. Most recently, Tan et al. (2022) [30] designed a one-pass streaming algorithm for DR-submodular maximization with a knapsack constraint over the integer lattice, called DynamicMRT, which achieves a (1/3 − ϵ)-approximation ratio, O((K log K)/ϵ) memory complexity, and O((log^2 K)/ϵ) query complexity per element for knapsack capacity K. Meanwhile, Gong et al. (2022) [28] considered the problem of non-negative monotone DR-submodular function maximization over a bounded integer lattice. They presented a deterministic algorithm and theoretically reduced its runtime to a new record, O((1/ϵ)^{O(1/ϵ^5)} · n log(1/c_min) log ‖B‖_∞) (where c_min = min_{e∈E} c(e) and c(·) is a cost function defined on E), with an approximation ratio of (1 − 1/e − O(ϵ)).
All the studies mentioned above consider maximizing a submodular set function or maximizing a DR-submodular function on the integer lattice under different constraints. Only the studies of Soma et al. [11] and Zhang et al. [31] consider the MDRSCa problem, as mentioned in the contribution section. Motivated by these studies, we propose two improved streaming algorithms for the MDRSCa problem. Our algorithms outperform the state-of-the-art methods in both theoretical analysis and experimental results.

3. Preliminaries

This section introduces the definitions of the monotone DR-submodular function and the MDRSCa problem, together with the associated notation. Table 2 summarizes the frequently used notation in this paper.

3.1. Notation

For a positive integer k ∈ ℕ, [k] denotes the set {1, …, k}. Given a ground set E = {e_1, …, e_n}, we denote the i-th entry of a vector x ∈ Z_+^E by x(i), and for each e ∈ E we define the e-th unit vector χ_e by χ_e(t) = 1 if t = e and χ_e(t) = 0 if t ≠ e.
For x ∈ Z_+^E, {x} denotes the multiset in which the element e appears x(e) times; for a subset A ⊆ E, x(A) = Σ_{e∈A} x(e), and supp^+(x) = {e ∈ E | x(e) > 0}. Following the definition of vector norms, we have ‖x‖_∞ := max_{e∈E} x(e) and ‖x‖_1 := Σ_{e∈E} x(e).
For two vectors x, y ∈ Z_+^E, x ≤ y signifies that x(e) ≤ y(e) for every e ∈ E. Furthermore, given x, y ∈ Z_+^E, x ∨ y and x ∧ y denote the coordinate-wise maximum and minimum, respectively; that is, (x ∨ y)(e) := max{x(e), y(e)} and (x ∧ y)(e) := min{x(e), y(e)}. In addition, x + y denotes the multiset {x + y} in which the element e appears x(e) + y(e) times. Thus, we can write x − y = x + (−y).
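The coordinate-wise operations and norms above translate directly into code; a minimal illustrative sketch using dictionaries keyed by E:

```python
x = {"a": 3, "b": 0, "c": 2}
y = {"a": 1, "b": 4, "c": 2}
E = list(x)

join = {e: max(x[e], y[e]) for e in E}   # x v y: coordinate-wise maximum
meet = {e: min(x[e], y[e]) for e in E}   # x ^ y: coordinate-wise minimum

# Lattice identity: (x v y) + (x ^ y) = x + y, coordinate by coordinate.
assert all(join[e] + meet[e] == x[e] + y[e] for e in E)

norm_inf = max(x.values())               # ||x||_inf = max_e x(e)
norm_1 = sum(x.values())                 # ||x||_1  = sum_e x(e)
supp_plus = {e for e in E if x[e] > 0}   # supp+(x)
```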

3.2. Definition

For a function f : Z_+^E → ℝ_+, we define f(x | y) = f(x + y) − f(y).
Definition 1
(Monotone DR-submodular function). A function f : Z_+^E → ℝ_+ is monotone if f(x) ≤ f(y) for all x, y ∈ Z_+^E with x ≤ y, and f is said to be diminishing-return submodular (DR-submodular) if, for all x ≤ y and every e ∈ E,
f(x + χ_e) − f(x) ≥ f(y + χ_e) − f(y)
Definition 2
(Maximization of a monotone DR-submodular function under a cardinality constraint on the integer lattice—MDRSCa problem). Let B ∈ Z_+^E and let k > 0 be an integer; we consider the maximization of a DR-submodular function under a cardinality constraint as follows:
maximize: f(x)    subject to: 0 ≤ x ≤ B, x(E) ≤ k
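On a tiny instance, the feasible region of MDRSCa can be enumerated directly; the sketch below (with an illustrative concave-of-linear objective, which is monotone DR-submodular) brute-forces the optimum for reference:

```python
import itertools
import math

E = ["a", "b"]
B = {"a": 2, "b": 3}     # per-element capacities: 0 <= x <= B
k = 3                    # cardinality budget: x(E) <= k

def f(x):
    # Illustrative monotone DR-submodular objective (concave of a linear form).
    return math.sqrt(3 * x["a"] + x["b"])

feasible = [dict(zip(E, v))
            for v in itertools.product(*(range(B[e] + 1) for e in E))
            if sum(v) <= k]
best = max(feasible, key=f)   # brute-force optimum on this tiny instance
```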

4. Proposed Algorithm

This section presents the descriptions and theoretical analysis of the algorithms we propose for the MDRSCa problem, including a streaming algorithm that assumes the optimal value is known (StrOpt), the Stepping-Stone algorithm, and the two main streaming algorithms (StrDRS1, StrDRS2).

4.1. Streaming Algorithm with Approximation Ratio of ( 1 / 2 ϵ )

First, we propose StrOpt , a single-pass streaming algorithm for the MDRSCa problem under the assumption that the optimal value of the objective function is known. Afterwards, we use the traditional method to estimate the optimal value and devise the main one-pass streaming algorithm called StrDRS 1 .

4.1.1. Algorithm with Knowing Optimal Value— StrOpt

Algorithm description. The detail of StrOpt is fully presented in Algorithm 1.
Algorithm 1: StrOpt(f, B, k, ϵ, v)
Input: f : Z_+^E → ℝ_+, B, k, ϵ, a guess v of the optimal value
Output: A vector x
(Pseudocode provided as an image in the original article.)
We assume that the optimal value OPT of the objective function of MDRSCa is already known; StrOpt uses this OPT to find the vector x. Given is a known value v that satisfies (1 − ϵ)OPT ≤ v ≤ OPT for any ϵ ∈ (0, 1/2). When each element e arrives, we build a set I of positive integers that are candidates for the number of copies of e. Then, we use binary search with the threshold v/(2k) to find the minimum k_e that satisfies f(k_e χ_e | x)/k_e < v/(2k). We denote by k′ the number of copies of e added to the result vector x; k′ is the minimum of k_e and the remaining room k − ‖x‖_1 in the cardinality budget. If k′ is equal to 0, then e is not selected in x. Otherwise, e is selected in x with k′ copies.
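The selection rule can be sketched as follows. This is a simplified, illustrative variant (the function name and the linear scan over copy counts are ours): it adds the largest number of copies whose average marginal gain still meets v/(2k), whereas the paper accelerates the search with binary search over the geometric grid I:

```python
import math

def str_opt(f, E, B, k, v):
    """Single-pass sketch, assuming a guess v with (1 - eps)OPT <= v <= OPT.

    For each arriving element e, add the largest number of copies whose
    average marginal gain is at least v / (2k), capped by the remaining
    cardinality budget."""
    x = {e: 0 for e in E}
    used = 0
    for e in E:                        # one pass over the stream
        base = f(x)
        k_e = 0
        for c in range(1, B[e] + 1):   # linear scan stands in for binary search
            y = dict(x)
            y[e] += c
            if (f(y) - base) / c >= v / (2 * k):
                k_e = c
        c = min(k_e, k - used)         # respect the cardinality budget
        x[e] += c
        used += c
        if used == k:
            break
    return x

# Usage on a toy instance; f is concave in a weighted total, hence DR-submodular.
f = lambda x: math.sqrt(3 * x["a"] + x["b"])
sol = str_opt(f, ["a", "b"], {"a": 2, "b": 3}, k=3, v=math.sqrt(7))
```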
Theoretical analysis. Lemma 1, Theorem 1, and their proofs demonstrate the theoretical solution guarantee of StrOpt . On the basis of that, we devise the first main streaming algorithm for the MDRSCa problem.
Lemma 1.
We have f(k_e χ_e | x) ≥ (1 − ϵ) k_e · v/(2k).
Proof. 
Assume that k_e = i_j = ⌈x⌉, where x = B(e)(1 − ϵ)^i for some i. We have i_{j−1} = ⌈(1 − ϵ)x⌉ and
k_e − i_{j−1} = ⌈x⌉ − ⌈(1 − ϵ)x⌉ ≤ ⌈x − (1 − ϵ)x⌉ = ⌈ϵx⌉ ≤ ϵ⌈x⌉ = ϵ k_e
Therefore, i_{j−1} ≥ (1 − ϵ)k_e. Combining the selection of k_e with the monotonicity of f, we have the following:
f(k_e χ_e | x) ≥ f(i_{j−1} χ_e | x) ≥ i_{j−1} · v/(2k) ≥ (1 − ϵ) k_e · v/(2k)
The proof is completed.    □
Theorem 1.
For any ϵ ∈ (0, 1/2) and (1 − ϵ)OPT ≤ v ≤ OPT, Algorithm 1 takes O(n log((1/ϵ) log ‖B‖_∞)) queries and returns a solution x satisfying f(x) ≥ (1 − ϵ)v/2.
Proof. 
Algorithm 1 scans E only once, and for each incoming element e it takes log |I| = O(log((1/ϵ) log ‖B‖_∞)) queries to find k_e. The total number of required queries of the algorithm is O(n log((1/ϵ) log ‖B‖_∞)).
Denote by x_i and k_i χ_{e_i} the solution at the beginning of iteration i and the vector added to the current solution at iteration i, respectively. We consider the two following cases:
Case 1. If ‖x‖_1 = k, we have k_1 + k_2 + … + k_n = k, and thus:
f(x) = Σ_{i=1}^n f(k_i χ_{e_i} | x_i) ≥ Σ_{i=1}^n (1 − ϵ) k_i · v/(2k) = (1 − ϵ)v/2
Case 2. If ‖x‖_1 < k, after the main loop ends we have f(χ_e | x) < v/(2k) for all e ∈ {B − x}. Therefore:
f(o) − f(x) ≤ f(o ∨ x) − f(x) ≤ Σ_{e∈{o∨x−x}} f(χ_e | x) = Σ_{e∈{o−o∧x}} f(χ_e | x) < Σ_{e∈{o−o∧x}} v/(2k) ≤ v/2
where the equality follows from the lattice identity x ∨ y − y = x − x ∧ y for x, y ∈ Z_+^E. We thus have f(x) ≥ OPT − v/2 ≥ v/2. The proof is completed.    □

4.1.2. ( 1 / 2 ϵ ) -Approximation Streaming Algorithm— StrDRS 1 Algorithm

Algorithm description. The detail of this algorithm is fully presented in Algorithm 2.
Algorithm 2: Streaming-I algorithm (StrDRS1)
Input: f : Z_+^E → ℝ_+, B, k, ϵ
Output: A (1/2 − ϵ)-approximation solution x
(Pseudocode provided as an image in the original article.)
Based on the analysis of StrOpt and the framework of the Sieve streaming algorithm [38], we design the StrDRS1 algorithm for the MDRSCa problem with the following main idea. We maintain a set of candidate solutions x_v, one for each guess v ∈ O of OPT, where O is a set of values that changes according to the maximum value of f over the unit vectors of the arriving elements. Besides, we build a set I containing the positive integers that are candidates for the number of copies of each element e if e is selected in x_v. For each solution x_v, v ∈ O, the algorithm finds by binary search the value k_e, the smallest value in I such that the current element e satisfies the condition in line 8. Then, we choose k′ = min{k_e, k − ‖x_v‖_1}. If k′ is not equal to 0, then e is selected in x_v with k′ copies; otherwise, e is not selected in x_v. In the end, the result x is the x_v that maximizes f(x_v).
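The guess set O can be sketched as a geometric grid covering [m, 2km] (a simplified, one-shot construction with an illustrative name; the paper maintains O lazily as elements arrive):

```python
import math

def guess_grid(m, k, eps):
    """Candidate guesses of OPT: powers of (1 + eps) covering [m, 2km],
    where m = max_e f(chi_e).  Since the analysis gives m <= OPT <= 2km,
    consecutive grid points differ by a factor (1 + eps), so some guess v
    satisfies OPT / (1 + eps) <= v <= OPT."""
    i = math.floor(math.log(m, 1 + eps))
    grid = []
    while (1 + eps) ** i <= 2 * k * m:
        grid.append((1 + eps) ** i)
        i += 1
    return grid

grid = guess_grid(1.0, 2, 0.5)   # covers [1, 4] in ratio-1.5 steps
```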
Theoretical analysis. We analyze the complexity of StrDRS 1 , stated in Theorem 2.
Theorem 2.
StrDRS1 is a single-pass streaming algorithm that has an approximation ratio of (1/2 − ϵ) and takes O((n/ϵ) log(log ‖B‖_∞ / ϵ) log k) queries.
Proof. 
By the definition of O, there exists an integer i such that
(1 − ϵ)OPT ≤ OPT/(1 + ϵ) ≤ v = (1 + ϵ)^i ≤ OPT
By applying the proof of Theorem 1 and the framework of the Sieve streaming algorithm in [38], we obtain:
f(x_v) ≥ (1 − ϵ)v/2 ≥ ((1 − ϵ)^2/2) OPT ≥ (1/2 − ϵ) OPT
The proof is completed.    □

4.2. Streaming Algorithm with Approximation Ratio of ( 1 1 / e ϵ )

In this section, we introduce two more algorithms for the MDRSCa problem: one that plays the role of a stepping stone (called the Stepping-Stone algorithm) and the second main algorithm of our study (called the StrDRS2 algorithm).

4.2.1. ( 1 / 4 ) -Approximation Streaming Algorithm—Stepping-Stone Algorithm

Algorithm description. The detail of this algorithm is fully presented in Algorithm 3.
Algorithm 3: (1/4)-approximation algorithm (Stepping-Stone algorithm)
Input: f : Z_+^E → ℝ_+, B, k, ϵ
Output: A vector x
(Pseudocode provided as an image in the original article.)
We design the Stepping-Stone algorithm, a (1/4)-approximation streaming algorithm. It differs from StrDRS1 and StrDRS2 in that it selects elements for exactly one solution and achieves a constant approximation ratio, whereas the other two algorithms build multiple candidate solutions and choose the best one.
In more detail, the main idea differs from that of StrDRS1 in that the Stepping-Stone algorithm is a single-pass streaming algorithm that finds k_e without relying on a given guess v. After building the set I as in StrDRS1, for each element e ∈ E, k_e is the largest grid point i_{t−1}, with i_t ∈ I, that meets the condition in line 3. Finally, the output consists of the last elements added to x, with ‖x′‖_1 = k.
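A simplified sketch of this idea (our own illustrative code, using a copy-by-copy scan instead of the grid I): add copies of each arriving element while the marginal gain of one more copy is at least the data-dependent threshold f(x)/k, then keep the last k copies added:

```python
import math

def stepping_stone(f, E, B, k):
    """Single-pass sketch: grow element e copy by copy while the marginal
    gain of one more copy is at least f(x)/k, then keep only the last k
    copies added (the suffix x' with ||x'||_1 <= k)."""
    x = {e: 0 for e in E}
    order = []                        # copies, in the order they were added
    for e in E:
        while x[e] < B[e]:
            y = dict(x)
            y[e] += 1
            if f(y) - f(x) >= f(x) / k:
                x = y
                order.append(e)
            else:
                break
    xp = {e: 0 for e in E}            # keep the last k added copies
    for e in order[-k:]:
        xp[e] += 1
    return xp

f = lambda x: math.sqrt(3 * x["a"] + x["b"])
xp = stepping_stone(f, ["a", "b"], {"a": 2, "b": 3}, k=3)
```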
Theoretical analysis. Lemmas 2–4, and Theorem 3 clarify the theoretical analysis of the Stepping-Stone algorithm.
Lemma 2.
After each iteration of the Stepping-Stone algorithm, we have f(k_e χ_e | x) ≥ (1 − ϵ) k_e f(x)/k.
Proof. 
Due to the definition of I, after each iteration of the main loop, we have i_t − i_{t−1} ≤ ϵ i_t. Similarly to the proof of Lemma 1, we have k_e − i_{t−1} = i_t − i_{t−1} ≤ ϵ k_e. By the selection of the algorithm, for 1 ≤ j < t we have
f(i_j χ_e | x + i_{j−1} χ_e) = Σ_{l=i_{j−1}+1}^{i_j} f(χ_e | x + (l − 1) χ_e) ≥ Σ_{l=i_{j−1}+1}^{i_j} f(x + i_{j−1} χ_e)/k
≥ (i_j − i_{j−1}) f(x + i_{j−1} χ_e)/k
Therefore:
f(k_e χ_e | x) ≥ f(i_{t−1} χ_e | x)
≥ Σ_{j=1}^{t−1} (i_j − i_{j−1}) f(x + i_{j−1} · χ_e)/k ≥ Σ_{j=1}^{t−1} (i_j − i_{j−1}) f(x)/k
= i_{t−1} f(x)/k ≥ (1 − ϵ) k_e f(x)/k
The proof is completed.    □
Lemma 3.
After the main loop of the Stepping-Stone algorithm, we have 2f(x) ≥ OPT.
Proof. 
Denote by x^(e) the vector x right before the element e is processed. We have the following:
f(o) − f(x) ≤ f(o ∨ x) − f(x)
≤ Σ_{e∈{o∨x−x}} f(χ_e | x)
= Σ_{e∈{o−o∧x}} f(χ_e | x)
≤ Σ_{e∈{o−o∧x}} f(χ_e | x^(e))
< Σ_{e∈{o−o∧x}} f(x^(e))/k ≤ f(x)
which implies the proof.    □
Lemma 4.
At the end of the Stepping-Stone algorithm, we have f(x′) ≥ ((1 − 3ϵ)/(2 − 3ϵ)) f(x).
Proof. 
If ‖x‖_1 < k, then x′ = x and Lemma 4 holds. We consider the case ‖x‖_1 ≥ k. Assume that supp(x) = {e_1, e_2, …, e_l}, x_i = Σ_{j=1}^i x(e_j) · χ_{e_j}, supp(x′) = {e_p, e_{p+1}, …, e_l}, and x^1 = x − x′, where e_j is added to x immediately after e_{j−1} and 1 ≤ p < l.
We further consider two cases.    
Case 1. If {x^1} ∩ {x′} = ∅, we have k = Σ_{i=p}^l k_{e_i} and
f(x) − f(x^1) = Σ_{i=p}^l f(k_{e_i} χ_{e_i} | x_i) ≥ Σ_{i=p}^l (1 − ϵ) k_{e_i} f(x_i)/k (Lemma 2)
≥ (1 − ϵ) Σ_{i=p}^l k_{e_i} f(x^1)/k = (1 − ϵ) f(x^1)
Case 2. If {x^1} ∩ {x′} = {e_p}, denote q = k_{e_p} − x^1(e_p) and c = min{i_j ∈ I : i_j ≥ x^1(e_p)}. We have k = q + Σ_{i=p+1}^l k_{e_i} and i_{j−1} < x^1(e_p) ≤ c = i_j. Similarly to the proof of Lemma 2, we have c − x^1(e_p) ≤ ϵ i_j ≤ ϵ k_{e_p} and thus c ≤ ϵ k_{e_p} + x^1(e_p). Let x^1_l = x^1 + l χ_{e_p}; then
f(q χ_{e_p} | x^1) ≥ Σ_{l=c+1}^{k_{e_p}} f(χ_{e_p} | x^1 + (l − 1)χ_{e_p})/k
≥ (k_{e_p} − (1 − ϵ)c) f(x^1_c)/k ≥ (k_{e_p} − (1 − ϵ)(x^1(e_p) + ϵ k_{e_p})) f(x^1)/k
≥ (q − 2ϵ k_{e_p}) f(x^1)/k
implying that f(q χ_{e_p} | x^1) ≥ (q − 2ϵ k_{e_p}) f(x^1)/k. Therefore:
f(x) − f(x^1) = f(q χ_{e_p} | x^1) + Σ_{i=p+1}^l f(k_{e_i} χ_{e_i} | x_i)
≥ (q − 2ϵ k_{e_p}) f(x^1)/k + Σ_{i=p+1}^l (1 − ϵ) k_{e_i} f(x_i)/k (Lemma 2)
≥ (q + Σ_{i=p+1}^l k_{e_i} − 2ϵ k_{e_p} − ϵ Σ_{i=p+1}^l k_{e_i}) f(x^1)/k
≥ (k − 3ϵk) f(x^1)/k = (1 − 3ϵ) f(x^1)
Hence, f(x) ≥ (2 − 3ϵ) f(x^1). Combined with the fact that f(x) ≤ f(x′) + f(x^1), we have f(x′) ≥ ((1 − 3ϵ)/(2 − 3ϵ)) f(x), which completes the proof.    □
Theorem 3.
The Stepping-Stone algorithm is a single-pass streaming algorithm that takes O((n/ϵ) log ‖B‖_∞) queries and provides an approximation ratio of (1/4 − 3ϵ/4).
Proof. 
The algorithm scans the ground set E only once, and for each element e it calculates f(χ_e | x + (i_j − 1)χ_e) for all i_j ∈ I to find k_e. This task takes at most (1/ϵ) log(B(e)) = O((1/ϵ) log ‖B‖_∞) queries per element. Thus, the total number of required queries is O((n/ϵ) log ‖B‖_∞). For the proof of the approximation ratio, by using Lemmas 3 and 4, we have:
f(x′) ≥ ((1 − 3ϵ)/(2 − 3ϵ)) f(x) ≥ ((1 − 3ϵ)/(2(2 − 3ϵ))) OPT ≥ (1/4 − (3/4)ϵ) OPT
The proof is completed.    □

4.2.2. ( 1 1 / e ϵ ) -Approximation Streaming Algorithm— StrDRS 2 Algorithm

Algorithm description. The detail of StrDRS 2 is fully presented in Algorithm 4.
Algorithm 4: (1 − 1/e − ϵ)-approximation algorithm (StrDRS2)
Input: f : Z_+^E → ℝ_+, B, k, ϵ
Output: A vector x
(Pseudocode provided as an image in the original article.)
We propose a (1 − 1/e − ϵ)-approximation algorithm called StrDRS2. It is a multi-pass streaming algorithm that uses the output of the Stepping-Stone algorithm (Algorithm 3) to compute the threshold θ for f(k_e χ_e | x) of each element e. The k_e of each e is the minimal value i ∈ {1, 2, …, B(e)} such that f(i χ_e | x)/i < θ. The threshold θ decreases by a factor of (1 − ϵ) after each iteration.
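A simplified sketch of the decreasing-threshold scheme (illustrative code of our own; the initial and final threshold values below are assumptions, with gamma standing for the objective value of the Stepping-Stone solution):

```python
import math

def strdrs2(f, E, B, k, eps, gamma):
    """Multi-pass decreasing-threshold sketch.  gamma is the value returned
    by the Stepping-Stone routine (a constant-factor estimate of OPT); the
    starting and stopping thresholds are illustrative choices, and theta
    shrinks by (1 - eps) per pass, giving O(1/eps) passes."""
    x = {e: 0 for e in E}
    used = 0
    theta = 4 * gamma / k                    # assumed starting threshold
    theta_min = (1 - eps) * gamma / (4 * k)  # assumed stopping threshold
    while theta >= theta_min and used < k:
        for e in E:                          # one pass per threshold value
            while used < k and x[e] < B[e]:
                y = dict(x)
                y[e] += 1
                if f(y) - f(x) >= theta:     # add a copy while gain >= theta
                    x = y
                    used += 1
                else:
                    break
        theta *= 1 - eps
    return x

f = lambda x: math.sqrt(3 * x["a"] + x["b"])
res = strdrs2(f, ["a", "b"], {"a": 2, "b": 3}, k=3, eps=0.2, gamma=math.sqrt(6))
```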
Theoretical analysis. Lemma 5 and Theorem 4 clearly demonstrate the theoretical solution-ability guarantee of the StrDRS 2 algorithm.
Lemma 5.
In the StrDRS2 algorithm, at any iteration of the outer loop, we have:
f(k_e χ_e | x) ≥ ((1 − ϵ) k_e / k)(OPT − f(x))
Proof. 
For the first iteration of the outer loop, we have x = 0, and thus
f(o) − f(x) = OPT = k · OPT/k ≤ k(4 − 3ϵ)Γ/((1 − 3ϵ)k) ≤ kθ/(1 − ϵ) ≤ k f(k_e χ_e | x)/((1 − ϵ)k_e)
where Γ denotes the objective value of the Stepping-Stone solution. Thus, f(k_e χ_e | x) ≥ ((1 − ϵ)k_e/k)(OPT − f(x)), and Lemma 5 is valid. For the later iterations, the marginal gain of any element e with the current vector x is less than the threshold of the previous iteration of the outer loop, i.e., f(χ_e | x) ≤ θ/(1 − ϵ) for e ∈ {B − x}. Then,
f(o) − f(x) ≤ f(o ∨ x) − f(x)
≤ Σ_{e∈{o∨x−x}} f(χ_e | x)
= Σ_{e∈{o−o∧x}} f(χ_e | x)
≤ kθ/(1 − ϵ) ≤ k f(k_e χ_e | x)/((1 − ϵ)k_e)
The proof is completed. □
Theorem 4.
The StrDRS 2 algorithm is a multi-pass streaming algorithm that makes $O\left(\frac{1}{\epsilon}\right)$ passes over the ground set, takes $O\left(\frac{n}{\epsilon}\log B\right)$ queries, and returns an approximation ratio of $(1 - 1/e - \epsilon)$.
Proof. 
We consider the following cases:
Case 1. If $\|\mathbf{x}\|_1 < k$, then after the last iteration of the outer loop we have:
$$\begin{aligned} f(\mathbf{o}) - f(\mathbf{x}) &\le f(\mathbf{o} \vee \mathbf{x}) - f(\mathbf{x}) \\ &\le \sum_{e \in \{\mathbf{o} \vee \mathbf{x} - \mathbf{x}\}} f(\chi_e \mid \mathbf{x}) \\ &= \sum_{e \in \{\mathbf{o} - \mathbf{o} \wedge \mathbf{x}\}} f(\chi_e \mid \mathbf{x}) \\ &\le k\,\theta_{\min} \le k \cdot \frac{(1-\epsilon)\Gamma}{4k} \le \frac{(1-\epsilon)\,\mathrm{OPT}}{4} \end{aligned}$$
Hence, $f(\mathbf{x}) \ge \frac{3+\epsilon}{4}\,\mathrm{OPT}$.
Case 2. If $\|\mathbf{x}\|_1 = k$, denote by $\mathbf{x}_i$ the vector $\mathbf{x}$ after the i-th update and by $k_{e_i}\chi_{e_i}$ the vector added to $\mathbf{x}$ at the i-th update, and let the final solution be $\mathbf{x} = \mathbf{x}_l$. Lemma 5 gives
$$f(\mathbf{x}_{i+1}) - f(\mathbf{x}_i) = f(k_{e_{i+1}}\chi_{e_{i+1}} \mid \mathbf{x}_i) \ge \frac{(1-\epsilon)k_{e_{i+1}}}{k}\left(\mathrm{OPT} - f(\mathbf{x}_i)\right)$$
Rearranging the above inequality and unrolling it from $i + 1 = l$ down to the first update, we have:
$$\begin{aligned} \mathrm{OPT} - f(\mathbf{x}_l) &\le \left(1 - \frac{(1-\epsilon)k_{e_l}}{k}\right)\left(\mathrm{OPT} - f(\mathbf{x}_{l-1})\right) \\ &\le e^{-\frac{(1-\epsilon)k_{e_l}}{k}}\left(\mathrm{OPT} - f(\mathbf{x}_{l-1})\right) \\ &\le e^{-\sum_{j=1}^{l}\frac{(1-\epsilon)k_{e_j}}{k}}\,\mathrm{OPT} \\ &\le e^{-(1-\epsilon)}\,\mathrm{OPT} \le \left(\frac{1}{e} + \epsilon\right)\mathrm{OPT} \end{aligned}$$
Therefore, $f(\mathbf{x}) \ge (1 - 1/e - \epsilon)\,\mathrm{OPT}$, and the proof is completed. □
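The final steps combine $1 - t \le e^{-t}$ with the fact that $\sum_{j=1}^{l} k_{e_j} = \|\mathbf{x}\|_1 = k$, plus a standard bound on $e^{\epsilon - 1}$; explicitly:

```latex
e^{-\sum_{j=1}^{l}\frac{(1-\epsilon)k_{e_j}}{k}}
  = e^{-(1-\epsilon)\frac{1}{k}\sum_{j=1}^{l}k_{e_j}}
  = e^{-(1-\epsilon)}
  = \frac{e^{\epsilon}}{e}
  \le \frac{1+\epsilon e}{e}
  = \frac{1}{e}+\epsilon ,
```

where $e^{\epsilon} \le 1 + \epsilon e$ holds for $\epsilon \in (0,1)$ by convexity of $e^{t}$ on $[0,1]$.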

5. Experiments

We conducted experiments based on the budget allocation problem over the bipartite influence model [39]. This problem is an instance of the monotone submodular function maximization problem over the integer lattice under a constraint [3]. As mentioned above, we consider the problem under a cardinality constraint.
Consider the context of an algorithmic marketing approach; the budget allocation problem can be explained as follows. In a marketing strategy, one of the crucial choices is deciding how much of a given budget to spend on different media, including television, websites, newspapers, and social media, to reach as many potential customers as possible. Formally, we are given a bipartite graph $G(V; E)$, where $V$ has a bipartition $(V_1; V_2)$ of the vertex set: $V_1$ denotes the set of source nodes (such as ad sources), $V_2$ denotes the set of target nodes (such as people/customers), and $E \subseteq V_1 \times V_2$ is the edge set. Each source node $v_1$ has a capacity $B_{v_1} \in \mathbb{Z}_+$, which represents the available budget (number of ad slots) of the ad source corresponding to $v_1$. Each edge $v_1v_2 \in E$ is associated with a probability $p(v_1v_2) \in [0; 1]$: placing an advertisement in a slot of $v_1$ activates customer $v_2$ with probability $p(v_1v_2)$. Each source node $v_1$ is allocated a budget $x(v_1) \in \{0, 1, \ldots, B_{v_1}\}$ such that $\sum_{v_1 \in V_1} x(v_1) \le k$, where $k \in \mathbb{Z}_+$ denotes the total budget capacity. The objective function f, the expected number of target vertices activated by $\mathbf{x}$, is defined as follows [3].
$$f: \mathbb{Z}_+^{V_1} \to \mathbb{R}_+, \qquad f(\mathbf{x}) = \sum_{v_2 \in V_2}\left[1 - \prod_{v_1v_2 \in E}\left(1 - p(v_1v_2)\right)^{x(v_1)}\right]$$
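For concreteness, this objective can be evaluated directly from its definition. The following is a minimal sketch with a made-up toy instance; the function name `influence` and the data layout (`edges`, `p`) are our own, not from the paper.

```python
def influence(x, edges, p):
    """Expected number of activated target nodes for budget vector x.

    x:     dict mapping source v1 -> allocated budget x(v1)
    edges: dict mapping target v2 -> list of incident source nodes v1
    p:     dict mapping (v1, v2) -> probability that one slot of v1
           activates v2
    """
    total = 0.0
    for v2, sources in edges.items():
        # probability that v2 is NOT activated by any allocated slot
        fail = 1.0
        for v1 in sources:
            fail *= (1.0 - p[(v1, v2)]) ** x.get(v1, 0)
        total += 1.0 - fail
    return total
```

With two sources, two customers, and all probabilities 0.5, allocating one budget unit to each source activates the customer reached by one source with probability 0.5 and the customer reached by both with probability 0.75.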
All experiments compare the performance of StrDRS 1, StrDRS 2, CaDRS, and SieveStr++. We evaluated each algorithm based on the number of oracle queries, the runtime, and the influence $f(\mathbf{x})$.

5.1. Experimental Setting

Datasets. For an exhaustive experiment, we chose two datasets of different sizes in terms of the number of nodes and edges. Both are real bipartite, undirected, weighted networks from the KONECT (http://konect.cc) (accessed on 1 September 2022) project [40]: FilmTrust, the rating network of the FilmTrust project, and NIPS, a document-word dataset built from NIPS full papers. The edge weight of the rating dataset is the rating value, and that of the document-word dataset is the number of occurrences of the word in the document. The description of the datasets is presented in Table 3.
Environment. We conducted our experiments on a Linux machine with Intel Xeon Gold 6154 (720) @ 3.700 GHz CPUs and 3TB RAM. Our implementation is written in Python.
Parameter Setting. We set the parameters as follows: ϵ = 0.1 and B = 5 for all experiments. Because FilmTrust has a small set of nodes, k ∈ {60, 70, 80, 90, 100}; NIPS has a large set of nodes and edges, so k ∈ {120, 140, 160, 180, 200}. In addition, we apply a simple preprocessing step that converts the edge weights of the datasets into the probabilities $p(v_1v_2)$. For FilmTrust, $p(v_1v_2)$ is the ratio of the rating value to the maximum rating value; for NIPS, it is the ratio of the number of occurrences of the word in the document to the total number of words in the document.
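The two normalizations can be sketched as follows. This is a hypothetical rendering with our own function names; the paper does not give code for this step.

```python
def normalize_ratings(edge_weights):
    """FilmTrust: p(v1v2) = rating value / maximum rating value."""
    m = max(edge_weights.values())
    return {e: w / m for e, w in edge_weights.items()}

def normalize_occurrences(edge_weights):
    """NIPS: p(v1v2) = word occurrences in doc / total words in doc.

    Edge keys are (doc, word) pairs; totals are computed per document.
    """
    totals = {}
    for (doc, _), w in edge_weights.items():
        totals[doc] = totals.get(doc, 0) + w
    return {(doc, word): w / totals[doc]
            for (doc, word), w in edge_weights.items()}
```

Both functions keep the bipartite edge set unchanged and only rescale the weights into [0, 1], as required for activation probabilities.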

5.2. Experimental Results

This section discusses the experimental results to clarify the benefits and drawbacks of the algorithms through three metrics: the number of oracle queries, the runtime, and the influence. Two observations stand out when comparing our algorithms with CaDRS and SieveStr++: (1) the runtime and the number of oracle queries of our algorithms are many times smaller than those of CaDRS and SieveStr++; (2) the influence of our algorithms is often slightly smaller than that of SieveStr++ and CaDRS. However, for some datasets, the influence of StrDRS 1 and StrDRS 2 can be equal to or greater than that of SieveStr++ and CaDRS if the parameters B and k are set suitably for the dataset. Figure 1 shows the results achieved.
Oracle queries and Runtime. Because most of the execution time of each algorithm is spent on oracle queries to the function f, the runtime is roughly proportional to the number of oracle queries. In detail, the number of oracle queries of StrDRS 1 is 1.2 to 24.4 times smaller than that of SieveStr++, and its runtime is 1.1 to 102.5 times faster. The number of oracle queries of StrDRS 2 is 5.1 to 6.5 times smaller than that of CaDRS, and its runtime is 2.0 to 4.8 times faster. Notably, even when k increases many times, the number of queries and the runtime of StrDRS 2 grow only very slightly compared to the other algorithms; as a result, its curves appear almost constant in the charts. Table 4 shows the variation in the number of queries in detail.
Influence. The differences in the influence values of the algorithms are as follows. The influence of StrDRS 1 is 1.1 to 1.2 times smaller than that of SieveStr++. The influence of StrDRS 2 is 1.0 to 1.3 times smaller than that of CaDRS on the FilmTrust dataset; however, on the NIPS dataset, the influence of StrDRS 2 is 1.4 to 1.7 times greater than that of CaDRS with this parameter setting. Generally, because CaDRS uses a greedy technique, it achieves the highest influence, and this value improves further as k increases. The remaining three algorithms use streaming techniques, so it is difficult for them to match the influence of CaDRS. However, the gap between the streaming and greedy algorithms is not large, and it narrows as k increases. Thus, the time savings of our algorithms are a significant strength that outweighs this small disparity in influence.
For the convenience of the readers, we summarize the experimental results in Table 5.

6. Conclusions and Future Work

This paper studies the maximization of monotone DR-submodular functions under a cardinality constraint on the integer lattice. We propose two streaming algorithms with proven approximation ratios that significantly reduce the query and time complexity compared to state-of-the-art algorithms. We conducted experiments to evaluate the efficiency of our algorithms against these state-of-the-art algorithms. The results indicate that our algorithms are highly scalable and outperform the compared algorithms in terms of both runtime and number of queries, while their influence is only slightly smaller.
For future work, one direction is to study the monotone DR-submodular function maximization problem under polymatroid and knapsack constraints. Another direction is to consider maximizing a non-monotone DR-submodular function under a cardinality constraint.

Author Contributions

Conceptualization, B.-N.T.N. and V.S.; formal analysis, B.-N.T.N.; investigation, B.-N.T.N. and P.N.H.P.; methodology, B.-N.T.N., P.N.H.P. and V.-V.L.; project administration, B.-N.T.N.; resources, B.-N.T.N.; software, B.-N.T.N. and P.N.H.P.; supervision, V.S.; validation, V.S.; writing—original draft, B.-N.T.N.; Writing—review and editing, B.-N.T.N., P.N.H.P., V.-V.L. and V.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by Ho Chi Minh City University of Food Industry (HUFI), Ton Duc Thang University (TDTU), and VŠB-Technical University of Ostrava (VŠB-TUO).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All real-world network datasets used in the experiment can be downloaded at http://konect.cc/ (accessed on 1 September 2022).

Acknowledgments

The authors would like to give thanks for the support of Ho Chi Minh City University of Food Industry (HUFI), Ton Duc Thang University (TDTU), and VŠB-Technical University of Ostrava (VŠB-TUO).

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

  1. Tohidi, E.; Amiri, R.; Coutino, M.; Gesbert, D.; Leus, G.; Karbasi, A. Submodularity in action: From machine learning to signal processing applications. IEEE Signal Process. Mag. 2020, 37, 120–133. [Google Scholar] [CrossRef]
  2. Krause, A.; Guestrin, C.; Gupta, A.; Kleinberg, J. Near-optimal sensor placements: Maximizing information while minimizing communication cost. In Proceedings of the 5th International Conference on Information Processing in Sensor Networks, Nashville, TN, USA, 19–21 April 2006; pp. 2–10. [Google Scholar]
  3. Soma, T.; Yoshida, Y. A generalization of submodular cover via the diminishing return property on the integer lattice. In Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; pp. 847–855. [Google Scholar]
  4. Cornuejols, G.; Fisher, M.; Nemhauser, G.L. On the uncapacitated location problem. Ann. Discret. Math. 1977, 1, 163–177. [Google Scholar]
  5. Nguyen, B.-N.T.; Pham, P.N.; Tran, L.H.; Pham, C.V.; Snášel, V. Fairness budget distribution for influence maximization in online social networks. In Proceedings of the International Conference on Artificial Intelligence and Big Data in Digital Era, Ho Chi Minh City, Vietnam, 18–19 December 2021; pp. 225–237. [Google Scholar]
  6. Pham, C.V.; Thai, M.T.; Ha, D.; Ngo, D.Q.; Hoang, H.X. Time-critical viral marketing strategy with the competition on online social networks. In Proceedings of the International Conference on Computational Social Networks, Ho Chi Minh City, Vietnam, 2–4 August 2016; pp. 111–122. [Google Scholar]
  7. Lin, H.; Bilmes, J. A class of submodular functions for document summarization. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, OR, USA, 19–24 June 2011; pp. 510–520. [Google Scholar]
  8. Agrawal, R.; Squires, C.; Yang, K.; Shanmugam, K.; Uhler, C. Abcd-strategy: Budgeted experimental design for targeted causal structure discovery. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, PMLR, Naha, Japan, 16–18 April 2019; pp. 3400–3409. [Google Scholar]
  9. Das, A.; Kempe, D. Submodular meets spectral: Greedy algorithms for subset selection, sparse approximation and dictionary selection. In Proceedings of the 28th International Conference on Machine Learning, ICML, Bellevue, WA, USA, 28 June–2 July 2011; pp. 1057–1064. [Google Scholar]
  10. Liu, S. A review for submodular optimization on machine scheduling problems. Complex. Approx. 2020, 12000, 252–267. [Google Scholar]
  11. Soma, T.; Yoshida, Y. Maximizing monotone submodular functions over the integer lattice. Math. Program. 2018, 172, 539–563. [Google Scholar] [CrossRef] [Green Version]
  12. Alon, N.; Gamzu, I.; Tennenholtz, M. Optimizing budget allocation among channels and influencers. In Proceedings of the 21st International Conference on World Wide Web, Lyon, France, 16–20 April 2012; pp. 381–388. [Google Scholar]
  13. Soma, T.; Kakimura, N.; Inaba, K.; Kawarabayashi, K.-I. Optimal budget allocation: Theoretical guarantee and efficient algorithm. In Proceedings of the International Conference on Machine Learning, PMLR, Beijing, China, 21–26 June 2014; pp. 351–359. [Google Scholar]
  14. Kapralov, M.; Post, I.; Vondrák, J. Online submodular welfare maximization: Greedy is optimal. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, New Orleans, LA, USA, 6–8 January 2013; pp. 1216–1225. [Google Scholar]
  15. Demaine, E.D.; Hajiaghayi, M.; Mahini, H.; Malec, D.L.; Raghavan, S.; Sawant, A.; Zadimoghadam, M. How to influence people with partial incentives. In Proceedings of the 23rd International Conference on World Wide Web, Seoul, Korea, 7–11 April 2014; pp. 937–948. [Google Scholar]
  16. Bian, A.; Buhmann, J.; Krause, A.; Tschiatschek, S. Guarantees for greedy maximization of non-submodular functions with applications. In Proceedings of the 34th International Conference on Machine Learning, PMLR, Sydney, Australia, 6–11 August 2017; pp. 498–507. [Google Scholar]
  17. Feldman, M.; Naor, J.; Schwartz, R. A unified continuous greedy algorithm for submodular maximization. In Proceedings of the 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, Palm Springs, CA, USA, 22–25 October 2011; pp. 570–579. [Google Scholar]
  18. Korula, N.; Mirrokni, V.; Zadimoghaddam, M. Online submodular welfare maximization: Greedy beats 1/2 in random order. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, Portland, OR, USA, 14–17 June 2015; pp. 889–898. [Google Scholar]
  19. Ha, D.T.; Pham, C.V.; Hoang, H.X. Submodular Maximization Subject to a Knapsack Constraint Under Noise Models. Asia-Pac. J. Oper. Res. 2022, 2250013. [Google Scholar] [CrossRef]
  20. Huang, C.; Kakimura, N. Multi-pass streaming algorithms for monotone submodular function maximization. Theory Comput. Syst. 2022, 66, 354–394. [Google Scholar] [CrossRef]
  21. Chekuri, C.; Gupta, S.; Quanrud, K. Streaming algorithms for submodular function maximization. Int. Colloq. Autom. Lang. Program. 2015, 9134, 318–330. [Google Scholar]
  22. Buschjäger, S.; Honysz, P.; Pfahler, L.; Morik, K. Very fast streaming submodular function maximization. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Bilbao, Spain, 13–17 September 2021; pp. 151–166. [Google Scholar]
  23. Pham, C.; Pham, D.; Bui, B.; Nguyen, A. Minimum budget for misinformation detection in online social networks with provable guarantees. Optim. Lett. 2022, 16, 515–544. [Google Scholar] [CrossRef]
  24. Gu, S.; Shi, G.; Wu, W.; Lu, C. A fast double greedy algorithm for non-monotone dr-submodular function maximization. Discret. Math. Algorithms Appl. 2020, 12, 2050007. [Google Scholar] [CrossRef]
  25. Mitrovic, S.; Bogunovic, I.; Norouzi-Fard, A.; Tarnawski, J.; Cevher, V. Streaming robust submodular maximization: A partitioned thresholding approach. In Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 4557–4566. [Google Scholar]
  26. Pham, C.; Vu, Q.; Ha, D.; Nguyen, T.; Le, N. Maximizing k-submodular functions under budget constraint: Applications and streaming algorithms. J. Comb. Optim. 2022, 44, 723–751. [Google Scholar] [CrossRef]
  27. Nguyen, B.; Pham, P.; Pham, C.; Su, A.; Snášel, V. Streaming Algorithm for Submodular Cover Problem Under Noise. In Proceedings of the 2021 RIVF International Conference on Computing and Communication Technologies (RIVF), Hanoi, Vietnam, 19–21 August 2021; pp. 1–6. [Google Scholar]
  28. Gong, S.; Nong, Q.; Bao, S.; Fang, Q.; Du, D.-Z. A fast and deterministic algorithm for knapsack-constrained monotone dr-submodular maximization over an integer lattice. J. Glob. Optim. 2022, 1–24. [Google Scholar] [CrossRef]
  29. Liu, B.; Chen, Z.; Du, H.W. Streaming algorithms for maximizing dr-submodular functions with d-knapsack constraints. In Proceedings of the Algorithmic Aspects in Information and Management—15th International Conference, AAIM, Virtual Event, 20–22 December 2021; Volume 13153, pp. 159–169. [Google Scholar]
  30. Tan, J.; Zhang, D.; Zhang, H.; Zhang, Z. One-pass streaming algorithm for dr-submodular maximization with a knapsack constraint over the integer lattice. Comput. Electr. Eng. 2022, 99, 107766. [Google Scholar] [CrossRef]
  31. Zhang, Z.; Guo, L.; Wang, Y.; Xu, D.; Zhang, D. Streaming algorithms for maximizing monotone dr-submodular functions with a cardinality constraint on the integer lattice. Asia Pac. J. Oper. Res. 2021, 38, 2140004:1–2140004:14. [Google Scholar] [CrossRef]
  32. Kazemi, E.; Mitrovic, M.; Zadimoghaddam, M.; Lattanzi, S.; Karbasi, A. Submodular streaming in all its glory: Tight approximation, minimum memory and low adaptive complexity. In Proceedings of the 36th International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; Volume 97, pp. 3311–3320. [Google Scholar]
  33. Nemhauser, G.L.; Wolsey, L.A.; Fisher, M.L. An analysis of approximations for maximizing submodular set functions—I. Math. Program. 1978, 14, 265–294. [Google Scholar] [CrossRef]
  34. Sviridenko, M. A note on maximizing a submodular set function subject to a knapsack constraint. Oper. Res. Lett. 2004, 32, 41–43. [Google Scholar] [CrossRef]
  35. Cálinescu, G.; Chekuri, C.; Pxaxl, M.; Vondrxaxk, J. Maximizing a monotone submodular function subject to a matroid constraint. SIAM J.Comput. 2011, 40, 1740–1766. [Google Scholar] [CrossRef]
  36. Badanidiyuru, A.; Vondrák, J. Fast algorithms for maximizing submodular functions. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, SIAM, Portland, OR, USA, 5–7 January 2014; pp. 1497–1514. [Google Scholar]
  37. Soma, T.; Yoshida, Y. Non-Monotone DR-Submodular Function Maximization. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; pp. 898–904. [Google Scholar]
  38. Badanidiyuru, A.; Mirzasoleiman, B.; Karbasi, A.; Krause, A. Streaming submodular maximization: Massive data summarization on the fly. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’14, Association for Computing Machinery, New York, NY, USA, 24–27 August 2014; pp. 671–680. [Google Scholar]
  39. Hatano, D.; Fukunaga, T.; Maehara, T.; Kawarabayashi, K. Lagrangian decomposition algorithm for allocating marketing channels. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015; pp. 1144–1150. [Google Scholar]
  40. Kunegis, J. KONECT: The koblenz network collection. In Proceedings of the 22nd International World Wide Web Conference, WWW’13, ACM, Rio de Janeiro, Brazil, 13–17 May 2013; pp. 1343–1350. [Google Scholar]
Figure 1. The results of the experimental comparison of algorithms on the datasets.
Table 1. State-of-the-art algorithms for the problem of monotone DR-submodular function maximization with a cardinality constraint on the integer lattice in terms of time complexity.
| Reference | Pass | Ratio | Query Complexity |
| --- | --- | --- | --- |
| CaDRS | $O(\frac{1}{\epsilon}\log\frac{1}{\epsilon})$ | $1 - 1/e - \epsilon$ | $O(\frac{n}{\epsilon}\log B \log\frac{k}{\epsilon})$ |
| SieveStr++ | 1 | $1/2 - \epsilon$ | $O(\frac{k}{\epsilon}\log^2 k)$ |
| StrDRS 1 | 1 | $1/2 - \epsilon$ | $O(\frac{n}{\epsilon}\log(\frac{\log B}{\epsilon})\log k)$ |
| StrDRS 2 | $O(\frac{1}{\epsilon})$ | $1 - 1/e - \epsilon$ | $O(\frac{n}{\epsilon}\log B)$ |
Table 2. Table of the usually used notations in this paper.
| Notation | Description |
| --- | --- |
| $E$ | the ground set, $E = \{e_1, \ldots, e_n\}$ |
| $n$ | the number of elements in the ground set $E$ |
| $2^E$ | the family of subsets of $E$ |
| $A, B$ | arbitrary subsets of $E$ |
| $\mathbf{x}, \mathbf{y}$ | arbitrary vectors in $\mathbb{Z}_+^E$ |
| $\chi_e$ | the unit vector with coordinate $e$, $e \in E$ |
| $\{\mathbf{x}\}$ | the multiset containing the elements of vector $\mathbf{x}$, where each element $e \in E$ can appear many times |
| $\mathbf{x}(e), \mathbf{y}(e)$ | the coordinate value of entry $e$ in vector $\mathbf{x}$, $\mathbf{y}$, where $e \in E$ |
| $\|\mathbf{x}\|_\infty$ | the infinity norm of vector $\mathbf{x}$, $\|\mathbf{x}\|_\infty := \max_{e \in E} \mathbf{x}(e)$ |
| $\|\mathbf{x}\|_1$ | the taxicab norm of vector $\mathbf{x}$, $\|\mathbf{x}\|_1 := \sum_{e \in E} \mathbf{x}(e)$ |
| $\mathbf{0}$ | the zero vector, $\mathbf{0}(e) = 0$, $\forall e \in E$ |
| $\mathbf{B}$ | the upper bound vector of $\mathbf{x}$, $\mathbf{0} \le \mathbf{x} \le \mathbf{B}$ |
| $B$ | $B := \|\mathbf{B}\|_\infty$ |
| $k$ | the upper bound on the total number of elements in vector $\mathbf{x}$ on the integer lattice $\mathbb{Z}_+^E$, $\mathbf{x}(E) \le k$ |
| $k_e$ | the number of copies of $e$ to be considered for addition to $\mathbf{x}$ |
| $k'$ | the number of copies of $e$ added to $\mathbf{x}$ |
| $[k]$ | the set $\{1, \ldots, k\}$ |
| $v$ | an approximation of the optimal value of the objective function, $(1-\epsilon)\mathrm{OPT} \le v \le \mathrm{OPT}$, with $\epsilon \in (0, 1/2)$ |
| $\mathbf{x} \vee \mathbf{y}$ | the coordinate-wise maximum of $\mathbf{x}$ and $\mathbf{y}$ |
| $(\mathbf{x} \vee \mathbf{y})(e)$ | $(\mathbf{x} \vee \mathbf{y})(e) := \max\{\mathbf{x}(e), \mathbf{y}(e)\}$ |
| $\mathbf{x} \wedge \mathbf{y}$ | the coordinate-wise minimum of $\mathbf{x}$ and $\mathbf{y}$ |
| $(\mathbf{x} \wedge \mathbf{y})(e)$ | $(\mathbf{x} \wedge \mathbf{y})(e) := \min\{\mathbf{x}(e), \mathbf{y}(e)\}$ |
| $\mathbf{x} + \mathbf{y}$ | the sum of the vectors $\mathbf{x}$ and $\mathbf{y}$; in the multiset $\{\mathbf{x} + \mathbf{y}\}$, $e$ appears $(\mathbf{x}(e) + \mathbf{y}(e))$ times |
| $\mathbf{x} - \mathbf{y}$ | $\mathbf{x} - \mathbf{y} = \mathbf{x} + (-\mathbf{y})$ |
| $f(\mathbf{x})$ | the objective function value of $\mathbf{x}$ |
| $f(\mathbf{x} \mid \mathbf{y})$ | $f(\mathbf{x} \mid \mathbf{y}) = f(\mathbf{x} + \mathbf{y}) - f(\mathbf{y})$ |
Table 3. Statistics of datasets. All datasets have the type of bipartite and undirected.
| Dataset | #Nodes | #Edges | Node Meaning ($n_1$; $n_2$) | Edge Meaning |
| --- | --- | --- | --- | --- |
| FilmTrust | 3579 | 35,494 | (user, film) (1508; 2071) | rating |
| NIPS | 13,875 | 1,932,365 | (doc, word) (1500; 12,375) | occurrence |
Table 4. Statistics of the number of queries.
| k | StrDRS 1 | StrDRS 2 | CaDRS | SieveStr++ |
| --- | --- | --- | --- | --- |
| *FilmTrust (ratings)* | | | | |
| 60 | 7601 | 40,895 | 240,202 | 183,453 |
| 70 | 8126 | 40,917 | 247,335 | 198,063 |
| 80 | 8582 | 40,939 | 252,978 | 202,716 |
| 90 | 8933 | 40,957 | 261,258 | 217,947 |
| 100 | 9613 | 40,979 | 264,730 | 225,569 |
| *NIPS (full papers)* | | | | |
| 120 | 10,934 | 40,807 | 206,599 | 13,595 |
| 140 | 11,040 | 40,847 | 209,606 | 15,106 |
| 160 | 11,108 | 40,887 | 215,624 | 17,690 |
| 180 | 13,313 | 40,927 | 218,639 | 19,362 |
| 200 | 14,446 | 40,963 | 221,672 | 22,958 |
Table 5. Statistical comparison of experimental results.
| Comparison | Metric | Result |
| --- | --- | --- |
| StrDRS 1 vs. SieveStr++ | Oracle queries | StrDRS 1 is 1.2 to 24.4 times smaller than SieveStr++ |
| | Time | StrDRS 1 is 1.1 to 102.5 times faster than SieveStr++ |
| | Influence | StrDRS 1 is 1.1 to 1.2 times smaller than SieveStr++ |
| StrDRS 2 vs. CaDRS | Oracle queries | StrDRS 2 is 5.1 to 6.5 times smaller than CaDRS |
| | Time | StrDRS 2 is 2.0 to 4.8 times faster than CaDRS |
| | Influence | For FilmTrust, StrDRS 2 is 1.0 to 1.3 times smaller than CaDRS, but 1.4 to 1.7 times greater for NIPS |

