A Joint Constraint Incentive Mechanism Algorithm Utilizing Coverage and Reputation for Mobile Crowdsensing

Selecting the optimal users to maximize the quality of the collected sensing data within a given budget is a crucial issue that affects the effectiveness of mobile crowdsensing (MCS). The coverage of mobile users (MUs) in a target area is relevant to the accuracy of the sensing data. Furthermore, the historical reputation of MUs reflects their previous behavior. Therefore, this study proposes a coverage and reputation joint constraint incentive mechanism algorithm (CRJC-IMA) based on Stackelberg game theory for MCS. First, the location information and the historical reputation of mobile users are used to select the optimal users, thereby satisfying the information quality requirement. Second, a two-stage Stackelberg game is applied to analyze the sensing level of the mobile users and obtain the optimal incentive mechanism of the server center (SC). The existence of the Nash equilibrium is analyzed and verified on the basis of the optimal response strategy of the mobile users. In addition, mobile users adjust the priority of their tasks in time series so that the total utility of all their tasks reaches a maximum. Finally, the EM algorithm is used to evaluate the data quality of the task, and the historical reputation of each user is updated accordingly. Simulation experiments show that the coverage of the CRJC-IMA is higher than that of the CTSIA, the utility of the mobile users and the SC is higher than that of the STD algorithm, and the utility of mobile users with adjusted task priority is greater than that without a priority order.


Introduction
Mobile crowdsensing (MCS), as a new and rapidly emerging information collection paradigm, has aroused extensive concerns for solving complex sensing problems [1,2]. Complex sensors (e.g., cameras, GPS, and microphones) in mobile smart devices provide superior tools for acquiring sensing data in MCS network systems. At present, MCS has been widely used in water pollution [3], environmental monitoring [4], health services [5], intelligent transportation [6], and other fields. In MCS, mobile users (MUs) consume limited resources to complete sensing tasks in their spare time, and they may face potential threats of privacy disclosure (e.g., geographic location). Therefore, designing a reasonable incentive mechanism to encourage more MUs to participate in the sensing task and maximize the quality of the sensing data in MCS is a hot issue.
To collect high-quality sensing data, scholars have proposed numerous incentive mechanisms for MCS network systems [7][8][9][10]. The incentive mechanism is the core of the MCS network [11]. Existing

• An optimal MU selecting algorithm (OMUS) is proposed to select the optimal MUs according to the location information and historical reputation of MUs. Thus, the collected sensing data will be more accurate and credible;
• A two-stage Stackelberg game model is proposed to solve the balance problem between the lowest rewards of the SC and the optimal strategy of the MUs in the MCS system, and the existence of the Nash equilibrium is proven in the Stackelberg game;
• A task priority time series method is proposed to maximize the total utility of the MUs' tasks;
• A reputation update and reward allocation method for the MUs is proposed. After the MUs upload the sensing data, the EM algorithm is used to evaluate the quality of the sensing data, and the SC evaluates the reputation of the MUs according to the quality of the sensing data and updates the historical reputation of each MU. Then, the reward is allocated to the MUs who have completed the tasks according to the selected optimal strategy.

System Model and Game Formulation
The MCS network system in this paper includes the TP, the SC, and the MUs. The MUs are equipped with various mobile smart devices (e.g., smartphones, tablets). This work describes a procedure in which MUs receive and accomplish the sensing task when the TP publishes a task to the SC. In Figure 1, the SC will broadcast the task to MUs located in the target area when the TP uploads the sensing task to the SC. The set of MUs that have signed up for the sensing task is U = {u_1, u_2, ..., u_n}, and the SC selects the optimal MUs to participate in the task. The MUs perform the sensing task and upload the sensing data to the SC. Finally, the EM algorithm is used to evaluate the quality of the sensing data, and the SC updates the reputation of the MUs. Meanwhile, the SC allocates the reward to the MUs for the task and sends the sensing data to the TP.
The detailed process is presented as follows.

(1) The TP publishes a sensing task and the target area to the SC;
(2) If the MUs with a mobile smart device sensor are interested in the sensing task, then they sign up to participate in it. The MU set is U = {u_1, u_2, ..., u_n};
(3) The SC uses the OMUS algorithm to select the optimal users W = {w_1, w_2, ..., w_m} (m ≤ n);
(4) The SC and the selected MUs choose their optimal strategies by using the coverage and reputation joint constraint incentive mechanism algorithm (CRJC-IMA). When the SC determines the total reward R, the MUs choose the optimal bandwidth strategy to maximize the utility of the SC and the MUs;
(5) Each MU sorts its tasks in time series according to the allocated reward to maximize its total utility;
(6) The MUs upload the sensing data to the SC and receive the reward allocated by the SC;
(7) The SC evaluates the quality of the sensing data and updates the reputation of the MUs.
The main parameter definitions in this work are shown in Table 1. The TP publishes a sensing task and allocates the total reward R to the MUs who perform the sensing task, where R > 0. When the MUs participate in the task, the SC will acquire the payoff, which can be expressed as a function ρ(·). The payoff function ρ(·) increases with the MUs' sensing level and decreases with the increase in the sensing rate.
where λ_1 and λ_2 are system parameters, and B_i and E_i are the bandwidth strategy selected by MU_i and the energy consumed by MU_i for transmitting data, respectively. The utility of the SC is then its payoff minus the total reward R paid to the MUs.

The MUs will not participate in sensing tasks unless the incentives are sufficient to maximize their utility. The payoff and cost functions of the MUs in this work are determined by the energy consumed and the bandwidth used by the MUs to perform the task. The utility of each MU_i is composed of a payoff function f_i and a cost function g_i. The payoff function f_i is formed by the bandwidth and energy consumed by MU_i: bandwidth B_i is the strategy selected by MU_i, and energy E_i is determined by the distance from the SC to MU_i when the sensing data are transmitted between them. The cost function g_i is determined by the energy and bandwidth costs, where α_i > 0 and β_i > 0 are the unit costs of the energy and bandwidth of MU_i, respectively.
The energy consumption of MU_i mainly comprises the energy consumed in sending and receiving data during the sensing task; other energy consumption of the MUs can be ignored [29]. Equation (6) represents the energy consumption of transmitting and receiving data,
where E_elect represents the circuit energy consumption of sending and receiving k bits of data, d is the transmission distance between the SC and each MU, d_0 is the distance threshold, equal to 87 m, and ε_fs and ε_amp are the amplifier energy parameters of the free-space model and the multipath attenuation model, respectively. When the distance between MU_i and the SC is less than d_0, the free-space model is adopted, and the transmission power attenuates as d^2. Otherwise, the multipath attenuation model is adopted, and the transmission power attenuates as d^4.
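The d^2/d^4 switch above is the standard first-order radio model. The sketch below assumes illustrative constants commonly used with this model (E_elect = 50 nJ/bit, ε_fs = 10 pJ/bit/m², ε_amp = 0.0013 pJ/bit/m⁴); only the threshold d_0 = 87 m is taken from the text.

```python
def transmission_energy(k_bits, d, e_elect=50e-9, eps_fs=10e-12,
                        eps_amp=0.0013e-12, d0=87.0):
    """Energy (J) for MU_i to transmit k bits over distance d (m)."""
    if d < d0:
        # free-space model: transmission power attenuates as d^2
        return e_elect * k_bits + eps_fs * k_bits * d ** 2
    # multipath attenuation model: transmission power attenuates as d^4
    return e_elect * k_bits + eps_amp * k_bits * d ** 4

def reception_energy(k_bits, e_elect=50e-9):
    """Energy (J) to receive k bits (circuit energy only)."""
    return e_elect * k_bits
```

Note that with these constants, d_0 = sqrt(ε_fs / ε_amp) ≈ 87 m, so the two branches agree at the threshold.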
In this work, the number of sensing tasks performed by each MU is greater than or equal to 1. When the published tasks and the rewards determined by the SC vary, the payoff and utility that an MU acquires after performing the tasks are distinct. Suppose h tasks need to be completed by MU_i, and a device can only perform one task at a time. The required time for each task is {t_i1, t_i2, ..., t_ih}, the corresponding utilities of the h tasks are {u_i1, u_i2, ..., u_ih}, and the priority of the h tasks performed by MU_i is {p_i1, p_i2, ..., p_ih}. Then, the total time required for MU_i to finish task l is

t_il = Σ_{h=1}^{l−1} t_ih + t_l.

To maximize the total utility of MU_i, the priority of task l performed by MU_i is chosen so that higher-utility tasks are executed first.
The utility of MU_i with task priority considered can be defined as in Equation (8), where γ_i is the unit time cost for MU_i to accomplish the sensing tasks. Each MU_i spends a different amount of time on distinct tasks; thus, the utility of each MU_i varies across tasks, as defined in Equation (8). The tasks performed by MU_i are sorted in descending order of utility, so the task with the maximum utility is executed first.
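The cumulative finish time and the descending-utility ordering described above can be sketched as follows; the task times and utilities here are illustrative values, not from the paper.

```python
def cumulative_finish_time(times, l):
    """t_il = sum_{h=1}^{l-1} t_ih + t_l: time at which the l-th task
    (1-indexed) in the chosen order finishes."""
    return sum(times[:l])

def prioritize_tasks(utilities):
    """Sort task indices in descending order of utility, so the task with
    the maximum utility is executed first."""
    return sorted(range(len(utilities)), key=lambda j: utilities[j], reverse=True)

# Example: the task with utility 4.0 is scheduled first.
order = prioritize_tasks([2.5, 4.0, 1.2])
```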

CRJC-IMA
In this work, the whole CRJC-IMA consists of four phases. The first phase selects the optimal MUs. In the second phase, the follower and leader games are applied to analyze the sensing level of the MUs and the optimal incentive mechanism of the SC, respectively. The third phase comprises the quality evaluation of the uploaded sensing data and the update of the historical reputation values. The fourth phase performs the incentive allocation process. The game process is described in the next section.

OMUS
The selection of optimal MUs for completing the task is a challenge for the MCS network system. MUs with a higher reputation and wider coverage can collect higher-quality sensing data for the TP than other MUs. The number m of selected MUs is related to the total reward R of the TP and the cost of the registered MUs, i.e., the MUs that signed up to participate in the sensing task. The number of users m can be determined as follows,
where α_sum is the sum of the energy unit costs of the registered MUs, β_sum is the sum of the bandwidth unit costs of the registered MUs, E is the mean of the maximum and minimum energy consumed by the registered MUs in the process of data transmission, B is the mean of the maximum and minimum bandwidth selected by the registered MUs, and µ_1 and µ_2 are system parameters.
After the SC publishes a sensing task, the MUs in set U = {u 1 , u 2 , ..., u n } sign up to complete the published task in the MCS network. First, the SC selects the optimal virtual points in the target area. Then, the SC determines the optimal MUs based on geographic location information and the historical reputation of the MUs according to virtual points. The selected virtual points and MUs are shown in Figure 2.
The PSO [30] has the characteristics of fast search speed and high efficiency. Therefore, the PSO is applied to randomly select m virtual points with maximum coverage in the target area.

Definition 1. Area coverage: the ratio of the sensing area s to the target area S perceived by the MUs.
The fitness of the PSO is the coverage rate f = s_1/S, where s_1 is the area covered by the virtual points and S is the target area.
The specific process of PSO is presented as follows.
Step 4: Calculate the coverage of the m virtual points;
Step 5: Update the individual extremum and group extremum of the m virtual points;
Step 6: If the maximum number of iterations has been reached, then determine the global optimal position; otherwise, return to Step 2.
According to the above steps, m virtual points are randomly selected.
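The virtual point selection loop above can be sketched as a standard PSO. This is a minimal sketch under stated assumptions: coverage is approximated on a discrete grid, and the inertia and acceleration coefficients (0.7, 1.5, 1.5) are conventional PSO choices, not values given in the text.

```python
import numpy as np

def coverage(points, radius=60.0, side=1000.0, grid=50):
    """Fitness f = s_1/S: fraction of grid cells within `radius` of any virtual point."""
    xs = np.linspace(0, side, grid)
    gx, gy = np.meshgrid(xs, xs)
    cells = np.stack([gx.ravel(), gy.ravel()], axis=1)          # (grid^2, 2)
    d = np.linalg.norm(cells[:, None, :] - points[None, :, :], axis=2)
    return float(np.mean(d.min(axis=1) <= radius))

def pso_virtual_points(m, iters=40, swarm=20, side=1000.0, seed=0):
    """Select m virtual points maximizing coverage via PSO (Steps 4-6 above)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0, side, (swarm, m, 2))                     # particle positions
    v = np.zeros_like(x)
    pbest = x.copy()                                            # individual extrema
    pbest_f = np.array([coverage(p) for p in x])
    g = pbest[pbest_f.argmax()].copy()                          # group extremum
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
        x = np.clip(x + v, 0, side)
        f = np.array([coverage(p) for p in x])                  # Step 4
        improved = f > pbest_f                                  # Step 5
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        g = pbest[pbest_f.argmax()].copy()
    return g, float(pbest_f.max())                              # Step 6
```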

MU Selection Process
Through the above virtual point selection (VPS) process, m virtual points are selected in the target area of the MCS. Then, m MUs are selected from the n MUs that signed up for the task, one per virtual point. The objective function between virtual point j and each MU_i is defined such that a and b are weighting factors with a + b = 1, Cre_i0 is the historical reputation of MU_i, and d_ij is the Euclidean distance between MU_i and virtual point j. The SC selects the MU with the largest objective function value for each virtual point; thus, the SC selects m optimal MUs according to the m virtual points to perform the sensing task, and the quality of the collected sensing data can be guaranteed. If a new MU participates in the sensing task, then its reputation is initialized to the highest value, which is set to 5 in this work. Thus, a new MU with the highest reputation can be selected preferentially to perform the sensing task, and its reputation value accumulates as it participates in further sensing tasks.
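The exact form of the objective function in Equation (11) is not reproduced in this excerpt; the sketch below therefore assumes a weighted combination of normalized reputation and proximity with a + b = 1, which matches the stated ingredients but is a hypothetical form, not the paper's equation.

```python
import math

def objective(reputation, dist, a=0.2, b=0.8, cre_max=5.0, d_max=1000.0):
    """Assumed objective: higher reputation Cre_i0 and shorter distance d_ij
    give a larger score; a + b = 1."""
    return a * (reputation / cre_max) + b * (1.0 - dist / d_max)

def select_mus(virtual_points, users):
    """For each virtual point, pick the registered MU with the largest
    objective value. `users` is a list of (x, y, reputation) tuples;
    new MUs start with reputation 5."""
    chosen = []
    for vx, vy in virtual_points:
        best = max(range(len(users)),
                   key=lambda i: objective(users[i][2],
                                           math.hypot(users[i][0] - vx,
                                                      users[i][1] - vy)))
        chosen.append(best)
    return chosen
```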

Update Reputation of MUs
Reputation is an important index to select optimal MUs in the CRJC-IMA. MUs with high reputation will be selected preferentially by the SC to collect sensing data. In this section, the EM algorithm [31] is applied to evaluate the quality of sensing data in a task, and then the SC updates the historical reputation of each MU according to the completion quality of the task.

Quality Evaluation
The quality of the MUs' sensing data reflects the quality of the tasks completed by the MUs; for example, the MUs collect urban noise sensing data [32]. Each MU w_i is associated with a quality evaluation matrix e^{w_i}, an m × m matrix with elements e^{w_i}_{rg}; the estimation of e^{w_i} is given in detail in the following specific steps. The quality evaluation matrix is mapped to the quality of the sensing data by the function q_i = g(e^{w_i}). The span of the sensing data is divided into {d_1, d_2, ..., d_m} intervals, which represent the sensing levels of the collected data. The EM algorithm is applied to estimate the quality evaluation matrix e^{w_i} of each MU: the converged estimate of an MU's evaluation matrix indicates the quality of its sensing data, whereas the noise interval distribution suggests the urban noise pollution level. The specific steps are as follows:

Step 1. For each task t ∈ T, the index function I(d_t^k = d_j) = 1 when the MU's sensing data d_t^k fall into the real interval d_j, and the probability distribution of the real noise intervals is initialized from these counts.

Step 2. Estimate the likelihood function of the perception probability matrix, where ê^{w_i}_{rg} represents the value after t iterations; the true noise interval distribution is then estimated.

Step 3. Estimate the real noise intervals. Given the sensing data G, the quality evaluation matrix E, and the noise interval distribution Π, Bayesian inference is used to estimate the true noise intervals P, i.e., the true noise interval distribution is calculated accordingly.

Step 4. Convergence. Steps 2-3 are iterated until the two estimates converge (i.e., |Ê_{t+1} − Ê_t| < ε and |P̂_{t+1} − P̂_t| < η, with ε > 0, η > 0). According to the estimate of the quality evaluation matrix e^{w_i}, the mapping function g(·) gives the quality of w_i's sensing data as q_i.
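Steps 1-4 above follow the classic Dawid-Skene EM pattern for confusion matrices. The sketch below is a minimal version under that assumption (majority-vote initialization, a small smoothing constant, and the convergence threshold are implementation choices, not values from the paper).

```python
import numpy as np

def em_quality(labels, m, iters=50):
    """EM sketch (Steps 1-4): `labels[u][t]` is MU u's reported interval
    (0..m-1) for task t. Returns per-MU evaluation matrices E and the
    posterior P over the true noise intervals."""
    labels = np.asarray(labels)
    n_users, n_tasks = labels.shape
    # Step 1: initialize the true-interval posterior from index-function counts
    P = np.zeros((n_tasks, m))
    for t in range(n_tasks):
        for u in range(n_users):
            P[t, labels[u, t]] += 1
    P /= P.sum(axis=1, keepdims=True)
    for _ in range(iters):
        # Step 2: re-estimate perception probability matrices and interval prior
        E = np.full((n_users, m, m), 1e-6)          # smoothing to avoid zeros
        for u in range(n_users):
            for t in range(n_tasks):
                E[u, :, labels[u, t]] += P[t]
        E /= E.sum(axis=2, keepdims=True)
        pi = P.mean(axis=0)
        # Step 3: Bayesian inference of the true noise intervals
        newP = np.tile(pi, (n_tasks, 1))
        for u in range(n_users):
            for t in range(n_tasks):
                newP[t] *= E[u, :, labels[u, t]]
        newP /= newP.sum(axis=1, keepdims=True)
        # Step 4: convergence check
        done = np.abs(newP - P).max() < 1e-6
        P = newP
        if done:
            break
    return E, P
```

With two consistent reporters and one noisy one, the posterior recovers the consensus intervals, and a diagonal-heavy E indicates high data quality for a mapping g(·).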

Reputation Update
Through the above quality evaluation process, the sensing data quality of MU_i is q_i. The reputation value of MU_i is then normalized and converted to [0, 5], where q_max is the highest data quality value among the MUs that participated in the task. After the task reputation is estimated, the SC updates the historical reputation of each MU, where o is the number of historical tasks in which MU_i participated and Cre_i0 is the historical reputation value of MU_i.
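The normalization and update equations are not reproduced in this excerpt; the sketch below assumes the natural forms implied by the definitions (a linear scaling of q_i by q_max into [0, 5], and a running average over the o historical tasks). Both formulas are assumptions, not the paper's exact equations.

```python
def task_reputation(q_i, q_max):
    """Assumed normalization: scale sensing-data quality q_i into [0, 5]
    using the best quality q_max in the task."""
    return 5.0 * q_i / q_max if q_max > 0 else 0.0

def update_history(cre_i0, cre_task, o):
    """Assumed running-average update of the historical reputation Cre_i0
    over o previous tasks plus the new task reputation."""
    return (o * cre_i0 + cre_task) / (o + 1)
```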

Incentive Allocation
When the SC determines the total reward R, the MUs select the optimal bandwidth strategy to perform the task according to the incentive mechanism algorithm based on the Stackelberg game. After the MUs upload the sensing data, the SC allocates the reward to the MUs based on the optimal strategies they selected, which yields the final reward for each MU_i. If the MUs participate in the sensing task, then the SC allocates the reward to each MU according to the payoff function. When MU_i selects the optimal bandwidth B*_i = 0, MU_i does not participate in completing the task; thus, reward_i = f_i = 0.

Designing the Incentive Mechanism Based on Stackelberg Game
This study aims to design an incentive mechanism that maximizes the utility of the MUs and the SC, where MUs with a high reputation are encouraged to complete the task. Thus, the relationship between the SC and the MUs is modeled as a Stackelberg game [33]. The SC is the leader and provides the total reward R for the task. The MUs are the followers, and all MUs select their optimal bandwidth to maximize their utility according to the SC's strategy; the SC then adjusts its strategy accordingly. The selection of the optimal bandwidth by the MUs in the follower game can be considered a non-cooperative game [15], named the optimal bandwidth determination (OBD) game. However, the number of sensing tasks received by an MU is greater than or equal to one. Therefore, according to the utility of the sensing tasks, the MUs should determine the priority of the tasks to maximize the total utility.

Follower Game
The SC allocates rewards to each MU based on its energy consumption and bandwidth strategy, and the MUs' energy and bandwidth payoffs each account for half of the total reward given by the SC. Once the MUs participate in the sensing task, half of the reward R is allocated to the MUs according to the energy they consume: the more energy an MU consumes, the more reward it obtains. From the remaining half, the MUs can obtain more payoff by adjusting their bandwidth. All of the MUs participating in the sensing task are W = {w_1, w_2, ..., w_m}, the bandwidth strategy set of the MUs is B = (B_1, B_2, ..., B_m), and B_{−i} = (B_1, B_2, ..., B_{i−1}, B_{i+1}, ..., B_m) represents the strategy profile excluding MU_i. Therefore, B can be expressed as B = (B_i, B_{−i}).
MU_i will choose the optimal bandwidth strategy to maximize the utility of the task and then choose the priority of the tasks to maximize the total utility of all tasks. The number of each MU_i's tasks is greater than or equal to one, and the unit time cost of MU_i is a fixed value; the utility function of MU_i follows accordingly. Assume that MU_i has received l−1 tasks and the task published by the TP is l, which forms the set T_i = {T_1, T_2, ..., T_l} with the other l−1 tasks. The corresponding priority set of the tasks is P = {p_i1, p_i2, ..., p_il}. MU_i selects the priority of the tasks to maximize the total utility of all received tasks, as in Equation (20).

Definition 3. Optimal response strategy. Given B_{−i}, a strategy B*_i is MU_i's optimal response strategy if it maximizes the utility u_i(B_i, B_{−i}) over all feasible bandwidths B_i. The Nash equilibrium ensures the stability of the OBD game algorithm, in which each MU chooses its optimal bandwidth strategy given the strategies of the other MUs.
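The best-response dynamics of Definition 3 can be illustrated with iterated best responses. Since Equation (20) is not reproduced here, the sketch uses an illustrative proportional-share utility u_i = R·B_i/(B_i + B_{−i}) − β_i·B_i, a common stand-in for this class of concave games, not the paper's exact utility.

```python
import math

def best_response(R, B_others, beta_i):
    """Closed-form best response for the illustrative utility
    u_i = R * B_i / (B_i + B_others) - beta_i * B_i.
    Setting du_i/dB_i = 0 gives B_i* = sqrt(R * B_others / beta_i) - B_others;
    a negative value means MU_i opts out (B_i* = 0)."""
    return max(math.sqrt(R * B_others / beta_i) - B_others, 0.0)

def iterate_to_equilibrium(R, betas, iters=200):
    """Iterated best responses: each MU repeatedly plays its optimal response
    to the others; in this concave game the profile converges to the
    unique Nash equilibrium."""
    B = [1.0] * len(betas)
    for _ in range(iters):
        for i, beta in enumerate(betas):
            B_others = sum(B) - B[i]
            B[i] = best_response(R, B_others, beta)
    return B
```

For two symmetric MUs (β_i = 1) and R = 100, the known equilibrium of this utility is B_i = R(n−1)/(n²β) = 25 for each, which the iteration reaches quickly.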

Proofs for Properties
Property 1: The follower game of the OBD has a unique Nash equilibrium given the total reward R published by the SC.

Proof.
To certify the optimal bandwidth strategy of MU_i, we take the first and second derivatives of the utility u_i of MU_i with respect to its bandwidth strategy B_i, starting from Equation (20).
The utility function in Equation (23) is concave with respect to the bandwidth strategy of MUs. Therefore, the optimal strategy is unique when the total reward is fixed.
Setting the first derivative to zero and solving Equation (24) yields the optimal bandwidth B*_i in Equation (25). When B*_i is positive, this bandwidth is the optimal strategy for MU_i. If B*_i is negative, then MU_i will not participate in the sensing task, and B*_i = 0.

Leader Game
All participating MUs have a unique Stackelberg equilibrium bandwidth strategy when the SC gives the total reward R in the above section. In the leader game, the SC can maximize its utility by adjusting R.
Property 2: A unique Stackelberg equilibrium (R*, B*) exists in the leader game.

Proof. From Equation (24), we can obtain Equation (26). Summing over all participating MUs in W in Equation (26) gives Equation (28). Substituting Equation (28) into Equation (26) yields Equation (29), and substituting Equation (29) into the SC utility function yields Equation (30). Taking the second derivative of the SC utility function gives Equations (31) and (32). Therefore, the SC utility obtained from Equation (32) is strictly concave, and a unique Stackelberg equilibrium (R*, B*) exists in the leader game. The unique maximizer R* can be computed efficiently using Newton's method [34].
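Computing R* with Newton's method amounts to solving the stationarity condition U'(R) = 0 of the strictly concave SC utility. The SC utility in Equation (30) is not reproduced here, so the sketch below uses an illustrative strictly concave utility U(R) = λ·ln(1 + R) − c·R (an assumed form, chosen only so the maximizer λ/c − 1 is known in closed form).

```python
def newton_maximize(dU, d2U, R0=1.0, tol=1e-9, iters=100):
    """Newton's method on dU(R) = 0 for a strictly concave utility U:
    each step is R <- R - dU(R) / d2U(R)."""
    R = R0
    for _ in range(iters):
        step = dU(R) / d2U(R)
        R -= step
        if abs(step) < tol:
            break
    return R

# Illustrative strictly concave SC utility U(R) = lam*ln(1+R) - c*R,
# with maximizer R* = lam/c - 1 (assumed form, not Equation (30)):
lam, c = 10.0, 0.005
R_star = newton_maximize(lambda R: lam / (1 + R) - c,
                         lambda R: -lam / (1 + R) ** 2)
```

Because U is strictly concave, the stationary point found by Newton's method is the unique maximizer, mirroring the uniqueness argument in the proof above.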

Simulation Results and Analysis
Simulation experiments are conducted using MATLAB R2016a to evaluate the effectiveness of the CRJC-IMA. A total of 1 TP, 1 SC, and 1000 MUs are distributed randomly in a 1 km × 1 km target area, and each MU's sensing range is 60 m. The TP can successfully publish tasks to the SC. The parameters and experimental values of this paper are listed in Table 2.

Performance Evaluation of Selecting the Optimal Users
This section evaluates the optimal number of MUs, the coverage rate, and the reputation in the OMUS. The CRJC-IMA is proposed for the MCS scenario, and the reputation and coverage of the MUs are important indicators for selecting the optimal MUs. The CTSIA [27] is an incentive mechanism based on reverse auctions under total reward constraints, which has a good coverage ratio in the target area. Therefore, the coverage ratio of the CRJC-IMA is compared with that of the CTSIA under a certain reward.

1. The number of optimal users: Figure 3 shows the relationship between the optimal number of MUs m and the total reward R given by the SC in the CRJC-IMA. As the reward increases to 2000, an increasing number of MUs are selected as collectors to sense the data, and the number of selected MUs rises linearly. This phenomenon indicates that the CRJC-IMA scales well with the reward.

2. Coverage: Figure 4 shows the relationship between the total reward R given by the SC and the coverage rates of the CRJC-IMA and CTSIA algorithms. The coverage rate in the target area increases in both algorithms as the reward R rises. When the total reward R is greater than 800, the coverage of the CRJC-IMA reaches 90%. When R is greater than 1600, the coverage rates of both algorithms no longer increase significantly, and both reach 90%. However, more reward increases the cost of the SC and causes excessive data redundancy. The experimental results of the two algorithms show that the coverage rate of the CRJC-IMA is superior to that of the CTSIA.

3. Reputation: Figure 5 shows the influence of different weights on reputation and coverage when selecting the optimal users using Equation (11). The horizontal axis represents a and b, whose domains are between 0 and 1. When the horizontal axis represents the reputation weight a, the left vertical axis represents the reputation value; when the horizontal axis represents the coverage weight b, the right vertical axis represents the coverage value. The sum of the reputation weight a and the coverage weight b is 1; for example, the coverage weight b is 0.9 when the reputation weight a is 0.1. The experimental results show that the reputation value of the selected MUs follows a non-decreasing trend as the reputation weight increases. When the reputation weight is greater than 0.1, the average reputation value of the users reaches 4. The coverage of the selected MUs is related to the number of users; thus, the coverage ratio and the weight have a weak coupling relationship.

Performance Evaluation of Incentive Mechanism
The SC will select optimal MUs in the target area when the SC publishes the total reward R to the MUs for completing the task. Figure 5 shows that the weights of the optimal MUs' reputation and coverage are set as 0.2 and 0.8, respectively. Figures 6-11 analyze the MUs' energy payoff and bandwidth payoff, the utility of MUs selection priority, the bandwidth strategy selected by MUs, the utility and payoff of MUs and the SC, and evaluation of the reputation of MUs, respectively. 4. Energy and bandwidth payoff: Figure 6 analyzes the mean square deviation of the energy payoff and bandwidth payoff obtained by the MUs. The total reward R given by the SC to MUs will be divided into two parts: each MU is rewarded on the basis of the energy consumed and bandwidth used. As shown in Figure 6, the mean square deviation of the energy payoff and

Performance Evaluation of Incentive Mechanism
The SC will select optimal MUs in the target area when the SC publishes the total reward R to the MUs for completing the task. Figure 5 shows that the weights of the optimal MUs' reputation and coverage are set as 0.2 and 0.8, respectively. Figures 6-11 analyze the MUs' energy payoff and bandwidth payoff, the utility of MUs selection priority, the bandwidth strategy selected by MUs, the utility and payoff of MUs and the SC, and evaluation of the reputation of MUs, respectively.

4. Energy and bandwidth payoff: Figure 6 analyzes the mean square deviation of the energy payoff and the bandwidth payoff obtained by the MUs. The total reward R given by the SC is divided into two parts: each MU is rewarded on the basis of the energy it consumes and the bandwidth it uses. As shown in Figure 6, the mean square deviations of both payoffs increase with the total reward R, which indicates that the gaps among the MUs' energy payoffs, bandwidth payoffs, and the average payoff widen as R grows. For a fixed R, the mean square deviation of the energy payoff is greater than that of the bandwidth payoff, because the energy is determined by each MU's distance, whereas the bandwidth is selected between zero and five. Thus, each MU's energy payoff is less stable than its bandwidth payoff.
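The spread statistic used above can be sketched as follows; the payoff values are hypothetical, chosen only to mimic the wide (distance-driven) energy payoffs versus the tight (bandwidth in [0, 5]) bandwidth payoffs.

```python
import math

# Sketch of the statistic discussed above: the mean square deviation
# (population standard deviation) of per-MU payoffs. The payoff values
# below are hypothetical examples.

def mean_square_deviation(payoffs):
    mean = sum(payoffs) / len(payoffs)
    return math.sqrt(sum((p - mean) ** 2 for p in payoffs) / len(payoffs))

# Energy payoffs depend on each MU's distance, so they spread widely;
# bandwidth payoffs come from a bounded choice, so they cluster.
energy_payoffs = [12.0, 30.0, 55.0, 80.0]
bandwidth_payoffs = [40.0, 44.0, 46.0, 47.0]

assert mean_square_deviation(energy_payoffs) > mean_square_deviation(bandwidth_payoffs)
```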

5. Utility of priority: Figure 7 shows the relationship between the total reward R of a task and the utility of MUs with and without priority selection. A task yields utility when the MU chooses the optimal bandwidth strategy. However, if an MU must perform other tasks, the utility of each task varies because the total reward of each of the MU's tasks differs. Figure 7 illustrates that the total utility obtained by MUs rises with the total reward R, and that the total utility after priority ranking exceeds that without prioritization. This finding shows that the total utility increases once each MU orders its tasks by priority, allowing the MU to perform tasks while avoiding time conflicts.
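The benefit of priority ordering can be sketched as a toy scheduling example: when time conflicts mean only some tasks fit, performing them in decreasing order of utility dominates an arbitrary arrival order. The task utilities and the two-slot time budget are hypothetical.

```python
# Sketch of the priority ordering discussed above: an MU with conflicting
# tasks keeps the highest-utility ones. Task utilities and the capacity
# are hypothetical illustration values.

def total_utility(task_utilities, capacity, prioritize):
    """Sum the utility of the tasks that fit in `capacity` slots."""
    order = sorted(task_utilities, reverse=True) if prioritize else list(task_utilities)
    return sum(order[:capacity])

tasks = [3.0, 9.0, 1.0, 7.0]   # per-task utility, in arrival order
slots = 2                      # only two tasks fit in the time budget

with_priority = total_utility(tasks, slots, prioritize=True)      # keeps 9 and 7
without_priority = total_utility(tasks, slots, prioritize=False)  # keeps 3 and 9
assert with_priority >= without_priority
```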
6. Bandwidth strategy: Figure 8 analyzes the relationship between the bandwidth selected by MUs and the total reward R of the task. The average bandwidth selected by MUs increases with R: when the total reward is less than 1000, the average selected bandwidth is below 2.5, whereas it exceeds 4 when the total reward is greater than 3000.
7. Utility and payoff: Figure 9 shows the utility and payoff of the selected MUs in the follower game and those of the SC in the leader game; the horizontal axis represents the total reward. In Figure 9a, the utility of the SC declines as the total reward R increases from 1000 to 2000, while the payoff of the SC remains unchanged. Because the set of selected MUs is fixed, the utility of the SC decreases as the total reward it pays increases. In Figure 9b, the average utility and payoff of MUs grow with R, and the total reward paid by the SC is linearly related to both. This indicates that, for a fixed number of users, the more reward the SC pays, the lower its own utility and the higher the users' utility.

8. Comparison with the STD algorithm: Figure 10 compares the utility of the SC and MUs under the CRJC-IMA and the STD algorithm [15]. The STD algorithm is a non-cooperative game based on the Stackelberg game in which the total reward and the sensing time of a task serve as the parameters of the utility function. As the total reward R paid by the SC to MUs increases, the utility of the SC declines in both algorithms (Figure 10a). As R grows, the optimal number of MUs increases, and the average utility in the two algorithms no longer rises significantly (Figure 10b). However, when the total reward R and the optimal number of users are fixed, MUs obtain more average utility under the CRJC-IMA, because the cost of an MU is determined by its bandwidth when performing the task and is therefore lower than in the STD algorithm.
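The leader/follower accounting behind Figure 9 can be sketched with simple utility forms: the SC's utility falls as it pays more for a fixed set of MUs, while each MU's payoff and utility grow. The functional forms (sensing value minus reward; equal-share reward split minus a fixed cost) are illustrative assumptions, not the paper's exact utility functions.

```python
# Illustrative two-stage accounting consistent with the Figure 9 trend:
# for a fixed set of selected MUs, raising the total reward R lowers the
# SC's utility while raising each MU's payoff and utility. All functional
# forms and numbers here are assumptions for illustration.

def sc_utility(sensing_value, total_reward):
    """Leader (SC) utility: value of collected data minus reward paid."""
    return sensing_value - total_reward

def mu_utility(total_reward, n_mus, cost_per_mu):
    """Follower (MU) utility: equal-share payoff minus sensing cost."""
    payoff = total_reward / n_mus
    return payoff - cost_per_mu

value, n, cost = 5000.0, 10, 20.0
for R in (1000.0, 2000.0):
    print(R, sc_utility(value, R), mu_utility(R, n, cost))
```

Running this shows the opposite monotonic trends reported for Figure 9a and 9b: the SC's utility falls from R = 1000 to R = 2000 while the per-MU utility rises.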
9. Reputation evaluation: Figure 11 shows the relationship between the reputation of MUs and the quality evaluation matrix (effort level e_ii) of the sensing data collected by MU w_i. The effort levels e_ii of the sensing data uploaded by the majority of MUs follow the same normal distribution as in [32], with μ = 0.75 and σ = 0.125. Figure 11 shows a linear relationship between the effort level e_ii and the reputation: an MU's reputation is lower when it uploads sensing data with a small evaluation value and higher when the evaluation value is larger.
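A minimal sketch of a linear effort-to-reputation update is given below. The affine blending rule and the 0.5 weight are assumptions made for illustration; the paper only states that reputation is updated from the EM-estimated effort level and that the relationship is linear.

```python
# Sketch of a linear reputation update (illustrative assumption): the new
# reputation blends the old value with a score that is linear in the
# evaluated effort level e_ii, clamped to [0, 1]. The blend weight 0.5 is
# a hypothetical parameter, not taken from the paper.

def update_reputation(old_rep, effort, weight=0.5):
    """Blend old reputation with a reputation score linear in effort."""
    score = min(max(effort, 0.0), 1.0)  # clamp effort to [0, 1]
    return (1 - weight) * old_rep + weight * score

rep = 0.6
print(update_reputation(rep, 0.9))  # high-effort upload raises reputation
print(update_reputation(rep, 0.3))  # low-effort upload lowers reputation
```

Under any such monotone linear rule, uploads with a small evaluation value pull the reputation down and uploads with a large one push it up, matching the trend described for Figure 11.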

Conclusions
In this work, an incentive mechanism based on the Stackelberg game that jointly considers coverage and reputation is proposed for MCS. The overall mechanism includes the OMUS algorithm, the two-stage Stackelberg game model, the task priority time-series method, and the reputation update and reward allocation method. The accuracy of the collected sensing data improves noticeably because the optimal MUs are selected under the joint constraint of coverage and reputation. In addition, each user orders its sensing tasks by priority so that the total utility of its tasks reaches the maximum. Compared with the CTSIA, the proposed model achieves a higher coverage rate, and the utility of the SC and MUs is greater than that in the STD algorithm. Meanwhile, MUs with a priority order obtain more utility than those without one. However, this work does not consider the task selection problem that arises when multiple tasks are released simultaneously; an incentive mechanism considering multiple TPs will therefore be investigated in future work. Moreover, we will design an incentive mechanism that combines non-monetary incentives to reduce the total reward paid by the SC in MCS.