An Enhanced Dynamic Spectrum Allocation Algorithm Based on Cournot Game in Maritime Cognitive Radio Communication System

The recent growth of maritime transport has intensified the demand for wider communication bandwidth. Cognitive radios can dynamically manage spectrum resources, so building a new type of maritime cognitive radio communication system (MCRCS) is an effective solution. In this paper, an enhanced dynamic spectrum allocation algorithm (EDSAA) based on the Cournot game model is proposed. In the EDSAA, the decision-making center (DC) sets weights according to the detection capability of each secondary user (SU) and adds these weighting coefficients to the price function. Furthermore, an elastic model is added to the SU's revenue function, so that once an SU's basic communication needs are met, its willingness to lease additional spectrum decreases. On this basis, the profit function is established. The simulation results show that the EDSAA has a Nash equilibrium and conforms to the actual situation, and that the resulting spectrum allocation is fair, efficient and reasonable.


Introduction
With the development of the shipping industry and the improved informatization of shipping business, the maritime wireless communication field urgently requires an increase in the data transmission bandwidth. At the same time, the spectrum has become an indispensable and precious resource in the field of public communications [1][2][3]. However, the fixed spectrum allocation system of wireless networks can no longer keep up with the growing demand for spectrum. The resulting spectrum shortage has prompted the use of cognitive radio (CR), which can significantly improve spectrum utilization by dynamically managing spectrum resources.
Relevant scholars and institutions are actively studying whether CR technology is suitable for maritime communication systems. Reference [4] proposes the concept of cognitive marine wireless networks (CMWN) and studies the cognitive protocol of the media access control (MAC) layer under a mesh network architecture. Reference [5] proposes the concept of the marine cognitive radio network (MCRN) and studies cooperative spectrum-sensing methods based on the best entropy. Reference [6] proposes the concept of a marine cognitive radio communications system (MCRCS) and studies the architectural model of the system.
In general cognitive radio networks, primary users (PUs) and secondary users (SUs) seek to maximize their profits through competition and cooperation, which makes game theory an effective method for studying the relationship between PUs and SUs [7]. The spectrum sharing model based on the Cournot game was originally proposed in previous studies [8,9]. Reference [10] examines the effect of prioritizing SUs on spectrum allocation, but the model may lead to a situation in which the PUs cannot find SUs intending to transmit data for them. Reference [11] provides a game method that uses the terminal power of SUs to establish a trust mechanism, but this method is mainly applicable to handheld devices, which require low power consumption.
Marine wireless channels change quickly and suffer frequent disturbances owing to the typically complex electromagnetic environment [12]. Consequently, a single SU's detection of a PU carries a certain probability of misjudgment. To address this, we propose the enhanced dynamic spectrum allocation algorithm (EDSAA) based on the Cournot game.
Considering that the maritime wireless communication system is a short-range regional communication system, we first construct a regional multi-user cooperative spectrum allocation system model based on the decision-making center (DC). The DC evaluates the accuracy of each SU through a new mechanism in order to adaptively adjust the weight of the SU in the game process. In addition, the SU's income is usually non-linear in the game process. Thus, we use the elastic model to construct the corresponding profit function so that it better matches the actual situation. Finally, the simulation results show that the EDSAA has a Nash equilibrium. Furthermore, the algorithm can filter out the SUs with relatively low detection capabilities in the whole region, according to the priority coefficients obtained by the adaptive adjustment of the algorithm, which provides the basis for system optimization.
The rest of this paper is organized as follows. The system model is introduced in Section 2. In Section 3, the game theory model is introduced, the new profit function is established from the modified revenue and price functions, and the Nash equilibrium is solved. Section 4 presents the simulation results, and conclusions are given in Section 5.

System Components
The International Telecommunication Union (ITU) has designated 156.025-162.025 MHz in the Very High Frequency (VHF) band as the dedicated maritime frequency band. The vast majority of these spectrum resources are used for analog voice intercom. At present, the only digital communication bands approved by the International Maritime Organization (IMO) are the two channels used by the Automatic Identification System (AIS), each with a bandwidth of only 25 kHz. The next-generation maritime digital communication systems, which are still evolving, extend the bandwidth only to 400 kHz and will remain unable to meet the higher data rate requirements of the future. Since the VHF band has a long history and a large number of legacy systems, it is difficult to re-divide the spectrum resources. Therefore, the MCRCS is the best choice for overcoming the limitations of maritime broadband wireless communications.
In the MCRCS, the PUs are the analog voice intercoms in the VHF maritime frequency bands and the legacy system equipment outside the maritime band. The SUs are the communication devices with cognitive radio technology installed on the ships.
VHF wireless signals propagate in a straight line. Considering the curvature of the Earth in the maritime environment, the transmission range of VHF signals is limited. Within a region, each SU monitors the working status of the PUs and uses the spectrum resources of the PUs to exchange data with other SUs. In order to improve the accuracy of detection and resolve the competition between SUs, we build the MCRCS using a centralized spectrum allocation scheme. In this scheme, the SU first sends its spectrum detection results, followed by its spectrum lease request, to the DC through an additional signaling channel. The DC then allocates the spectrum using the EDSAA. Finally, the allocation results are transmitted to the SU through the signaling channel.
According to the distance between the network region and the DC, which is built on land, the system is divided into two typical regions: close to shore and deep sea. In the region close to the shore, the SUs can communicate directly with the DC through the signaling channels. In the deep-sea region, the SUs exchange data with the DC through satellite links. The physical composition of the system is shown in Figure 1. An important consideration of the MCRCS is that the PU must be well protected. The PU bands are usually used for the transmission of important information, such as disaster rescue, vessel navigation, weather broadcasting and so on. Therefore, the overall communication efficiency in the region is the most important consideration for the MCRCS.

Algorithm Model
The EDSAA model for the MCRCS proposed in this paper is shown in Figure 2. There are $N$ SUs and one PU system, which includes $N$ PUs and one DC. The PUs have authorized spectra, which are used to form multiple complexes with SUs. Each SU performs local energy detection and submits its own detection results to the DC through a unique signaling channel. The DC examines and compares the submitted results to determine the final available free spectrum and allocates the free spectrum to the SUs. Once the free spectrum is allocated, the SUs can use adaptive modulation techniques to transmit data in the allocated spectrum. As shown in Figure 2, the gray portion represents the spectrum used by the PUs as detected by the DC; the black portion represents a small, fixed frequency band between the SUs; and the white portion represents the free spectrum determined by the DC. We use $b_i$ ($i = 1, 2, \ldots, N$) to represent the allocated spectrum size of $\mathrm{SU}_i$, which is the strategy of the game participant. The DC does not need to obtain the terminal information of the SUs at every moment. When the PUs need to use the spectrum, the DC checks the terminal information only of the SUs that do not return the spectrum in time. As a result, channel resources are saved as far as possible.

Game Theory Model
Since the MCRCS operates in a regional communication mode, we pay more attention to the overall communication capability in the region than to the communication efficiency of a single SU. Therefore, we use game theory for spectrum allocation.
Game theory is often used to analyze situations in which the users of a cognitive radio system compete with each other to maximize their own benefits. Each user participates in the competition and makes decisions based on the information known to it. The spectrum allocation model based on game theory is expressed as:
$$G = \{N, \{S_i\}, \{U_i\}\} \tag{1}$$
where $N$ represents the set of game participants; $S_i$ represents the strategy set of participant $i$; and $U_i$ represents the profit of participant $i$, which is a function of its own strategy and the strategies of the other participants (i.e., $U_i = U_i(S_i, S_{-i})$). In this paper, the SUs are the game participants, and their strategies are the amounts of free spectrum they apply to lease. The SUs gain benefits $U_i$ ($i \in N$) in the spectrum competition through their own strategies. Furthermore, an SU's earnings depend not only on its own strategy but also on the strategies of the other SUs.
In the game $G$ [13], we use $s^* = (s_1^*, s_2^*, \ldots, s_N^*)$ to represent a strategy combination containing one strategy per participant. If, for each participant $i$,
$$U_i(s_i^*, s_{-i}^*) \ge U_i(s_i, s_{-i}^*)$$
holds for all strategies $s_i \in S_i$, the strategy combination $s^*$ is a Nash equilibrium of the game, where $s_i$ denotes any strategy of participant $i$ other than $s_i^*$, and $s_{-i}^*$ denotes the strategy combination of all participants other than participant $i$.
The Nash equilibrium can be understood as the standard behavior of rational participants in fragmented situations. In this paper, a Nash equilibrium is reached when no SU can increase its own benefit by changing its strategy, and the resulting strategy combination of the users is the final spectrum allocation scheme. As a result, the overall communication efficiency in the region of the MCRCS is improved to the greatest extent.
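The equilibrium condition can be illustrated with a brute-force check over discretized strategy sets. The following is a minimal sketch, not the paper's model: the function name `is_nash_equilibrium`, the toy payoff $U_i = b_i (10 - \sum_j b_j)$ and the strategy grid are all assumptions chosen only to make the definition concrete.

```python
from typing import Callable, Sequence


def is_nash_equilibrium(
    profile: Sequence[float],
    strategy_sets: Sequence[Sequence[float]],
    payoff: Callable[[int, Sequence[float]], float],
    tol: float = 1e-9,
) -> bool:
    """Return True if no participant gains more than `tol` by deviating alone."""
    for i, candidates in enumerate(strategy_sets):
        current = payoff(i, profile)
        for s in candidates:
            deviated = list(profile)
            deviated[i] = s  # only participant i changes its strategy
            if payoff(i, deviated) > current + tol:
                return False
    return True


def toy_payoff(i: int, b: Sequence[float]) -> float:
    """Hypothetical Cournot-style payoff: U_i = b_i * (10 - sum of all b)."""
    return b[i] * (10.0 - sum(b))


# Discretized strategy set 0, 1/3, ..., 10, shared by both participants.
grid = [k / 3.0 for k in range(31)]

print(is_nash_equilibrium([10 / 3, 10 / 3], [grid, grid], toy_payoff))  # True
print(is_nash_equilibrium([2.0, 2.0], [grid, grid], toy_payoff))        # False
```

For this toy payoff the best response of each participant is $b_i = (10 - b_j)/2$, so the profile $(10/3, 10/3)$ passes the check, while $(2, 2)$ fails because either participant can profit by requesting more bandwidth.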

Establishment of Profit Function
SUs earn profits by using the free spectrum to transmit their traffic. The profit function is represented as:
$$U_i = P_i - b_i c_i \tag{2}$$
where $P_i$ represents the revenue function of $\mathrm{SU}_i$; $b_i$ represents the allocated bandwidth of $\mathrm{SU}_i$; and $c_i$ represents the price function of $\mathrm{SU}_i$ per unit spectrum bandwidth. After the unit bandwidth price is determined, the total bandwidth provided by the DC to the SUs remains essentially unchanged. In this case, the bandwidth allocated to one SU is affected by the other users. We use $v \in [0, 1]$ to represent the degree of influence between SUs. For example, $v = 0$ means that there is no influence between SUs, i.e., the bandwidth an SU applies for is not affected by the other SUs. In contrast, $v = 1$ represents the greatest degree of influence between SUs [9].

Revenue Function Improvement
The traditional revenue function $P_i$ of an SU is expressed as:
$$P_i = r_i k_i b_i \tag{3}$$
where $r_i$ represents the revenue per unit transmission rate of $\mathrm{SU}_i$; and $k_i$ represents the spectral efficiency of the transmissions of $\mathrm{SU}_i$. According to Shannon's theorem, $k_i$ is given by:
$$k_i = \log_2 (1 + K \gamma_i) \tag{4}$$
In Equation (4), $\gamma_i$ represents the signal-to-noise ratio (SNR) of $\mathrm{SU}_i$; and $K$ represents a constant determined by the receiver's bit error rate (BER) threshold:
$$K = \frac{1.5}{\ln (0.2 / B_i^{tar})} \tag{5}$$
where $B_i^{tar}$ represents the target BER of $\mathrm{SU}_i$. With the adaptive modulation technique, the transmission rate can be dynamically adjusted according to the channel quality [14].
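As a numerical illustration of the traditional revenue model, the sketch below evaluates the spectral efficiency and the linear revenue. It assumes the form commonly used for adaptive modulation, $k = \log_2(1 + K\gamma)$ with $K = 1.5 / \ln(0.2 / B^{tar})$, and a revenue proportional to bandwidth; the function names and the sample values are illustrative choices, not taken from the paper's code.

```python
import math


def spectral_efficiency(snr_db: float, target_ber: float) -> float:
    """Spectral efficiency k = log2(1 + K * snr), with K = 1.5 / ln(0.2 / BER).

    Assumes target_ber < 0.2 so that the logarithm is positive.
    """
    snr_linear = 10.0 ** (snr_db / 10.0)        # convert dB to a linear ratio
    k_const = 1.5 / math.log(0.2 / target_ber)  # BER-dependent constant K
    return math.log2(1.0 + k_const * snr_linear)


def linear_revenue(r: float, k: float, b: float) -> float:
    """Traditional revenue: proportional to the allocated bandwidth b."""
    return r * k * b


k = spectral_efficiency(snr_db=10.0, target_ber=1e-4)
print(f"spectral efficiency k = {k:.3f}")
for b in (1.0, 2.0, 4.0):
    # Doubling the bandwidth doubles the revenue under the linear model.
    print(b, round(linear_revenue(r=10.0, k=k, b=b), 2))
```

Under this model every additional unit of bandwidth is worth the same amount, which is the property the elastic model is meant to relax.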
In Equation (3), the traditional revenue function is determined by the allocated bandwidth of the SU and increases with the bandwidth as a monotone linear function. In fact, the revenue obtained by the SU is non-linear in the allocated spectrum. According to the elastic business model [15], an SU receives a higher unit revenue when it is allocated only the small amount of spectrum needed to meet its basic communication requirements. As the allocated spectrum increases, however, the unit revenue of the SU decreases and so does its willingness to continue leasing spectrum. Essentially, the SU's revenue should be a logarithmic function of the allocated spectrum. Therefore, a new revenue function is proposed as Equation (6), in which $\sigma_i$ represents the spectrum demand factor of the SU; its value reflects the SU's degree of demand for spectrum. In addition, the total spectrum allocated to the SUs should not exceed the amount of spectrum made available by the PU at each stage.
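The diminishing-returns behavior described above can be sketched with a simple logarithmic revenue. The form $P_i = r_i k_i \sigma_i \ln(1 + b_i)$ used here is an assumption made for illustration only and should not be read as the paper's Equation (6); the function names and parameter values are likewise hypothetical.

```python
import math


def elastic_revenue(b: float, r: float, k: float, sigma: float) -> float:
    """Assumed logarithmic revenue: r * k * sigma * ln(1 + b)."""
    return r * k * sigma * math.log1p(b)


def marginal_revenue(b: float, r: float, k: float, sigma: float) -> float:
    """Derivative of elastic_revenue with respect to b."""
    return r * k * sigma / (1.0 + b)


for b in (0.5, 1.0, 2.0, 4.0, 8.0):
    total = elastic_revenue(b, r=10.0, k=1.5, sigma=1.0)
    extra = marginal_revenue(b, r=10.0, k=1.5, sigma=1.0)
    # The marginal revenue shrinks as the allocated bandwidth grows.
    print(f"b={b:4.1f}  revenue={total:6.2f}  marginal={extra:5.2f}")
```

A larger demand factor `sigma` scales the whole curve upward, so an SU with greater demand still values each extra unit of spectrum more than one with lower demand.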

Price Function Improvement
In the process of allocating spectrum to SUs, the allocated spectrum is equivalent to goods in a competitive market, so the SUs must pay an appropriate price for it. The traditional price function [8], Equation (7), depends entirely on the total spectrum allocated to the SUs and does not reflect the willingness of the DC to share the spectrum with SUs of different priorities. In cognitive radio systems, SUs determine whether data can be transmitted by detecting the relevant spectrum. Once the PU needs to use the spectrum, the SU must vacate it immediately to avoid interfering with the PU's communication. However, the traditional price function does not take this into consideration. Because SUs with different detection capabilities are not given different priorities, the SUs have less incentive to detect accurately and their detection capabilities weaken. Consequently, the detection performance of the whole system is greatly reduced.
To differentiate the SUs, a new price function, Equation (8), is proposed. It is based on the sigmoid function [16]. In Equation (8), $w_i \in [0, 1]$ represents the detection capability of $\mathrm{SU}_i$, i.e., how reliably the detection results it submits to the DC are correct. For example, $w_i = 1$ means that the SU has the best detection capability and its detection results are always correct, so it has the highest priority. In this case, the price function reduces to the traditional price function. As the detection capability decreases, the priority of the SU decreases, which increases the price of the allocated spectrum. When the detection capability of $\mathrm{SU}_i$ falls below a certain level, the DC marks it as a malicious user and the price function reaches its maximum. Finally, $\alpha$ represents the evaluation criterion for the SU's price and credibility.
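The qualitative behavior of such a price function can be sketched as follows. The multiplier $2 / (1 + e^{-\alpha (1 - w_i)})$ and the base price $x + y (\sum_j b_j)^{\tau}$ are hypothetical stand-ins chosen so that the price equals the base price at $w_i = 1$ and rises as $w_i$ falls; they are not the paper's Equations (7) and (8), and the parameters `x`, `y` and `tau` are assumed names.

```python
import math


def price_weight(w: float, alpha: float) -> float:
    """Hypothetical sigmoid-shaped multiplier: 1 at w = 1, larger as w falls."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return 2.0 / (1.0 + math.exp(-alpha * (1.0 - w)))


def base_price(total_bandwidth: float, x: float = 0.0,
               y: float = 1.0, tau: float = 1.0) -> float:
    """Assumed traditional price: grows with the total allocated bandwidth."""
    return x + y * total_bandwidth ** tau


def weighted_price(w: float, alpha: float, total_bandwidth: float) -> float:
    """Unit price charged to an SU with detection capability w."""
    return price_weight(w, alpha) * base_price(total_bandwidth)


for w in (1.0, 0.8, 0.5, 0.2):
    # Lower detection capability leads to a higher unit price.
    print(w, round(weighted_price(w, alpha=2.0, total_bandwidth=4.0), 3))
```

In this sketch a larger `alpha` makes the price rise more steeply as the detection capability drops, which mirrors the role of the evaluation criterion as a sensitivity control.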
Substituting Equations (6) and (8) into Equation (2) yields the new profit function, Equation (9). Its first term represents the revenue that $\mathrm{SU}_i$ generates by transmitting data traffic over the allocated bandwidth. The second term represents the cost to $\mathrm{SU}_i$ of using the allocated spectrum. The third term reflects the degree of interaction between the SUs.

Nash Equilibrium Solution
The main task of the static game is to obtain the best-response spectrum for each user. The Nash equilibrium is reached when no user can increase its own revenue by adopting another strategy. Differentiating Equation (9) with respect to $b_i$ gives the marginal profit, Equation (10). Setting Equation (10) to zero yields the best-response condition, Equation (11), from which the spectrum allocation curve of each SU can be obtained. The intersection points of these curves are the Nash equilibria of the static game.
At the start of the dynamic game, each SU tentatively changes the size of its requested bandwidth at moment $t$. At the same time, the DC constantly adjusts the unit price of the spectrum. We use $b_i[t]$ to represent the allocated bandwidth of $\mathrm{SU}_i$ at moment $t$, and $U_i[t]$ to represent the revenue of $\mathrm{SU}_i$ at moment $t$. The SU adjusts its strategy according to the changes in revenue and bandwidth. The allocated bandwidth of the SU at moment $t + 1$ is given by Equation (12), in which $\beta_i$ represents the learning factor of $\mathrm{SU}_i$, reflecting the SU's capability to observe the bandwidth and adjust the bandwidth it leases [8]. Through repeated iteration, $b_i$ converges and the game reaches the Nash equilibrium, at which the SU profit is maximized.
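The iterative adjustment can be sketched in a few lines. The code below assumes the gradient-type update $b_i[t+1] = b_i[t] + \beta_i b_i[t] \, \partial U_i / \partial b_i$ that is common in Cournot spectrum-sharing models, together with a toy profit $U_i = a \ln(1 + b_i) - c_i b_i \sum_j b_j$; both are assumptions for illustration and not the paper's Equations (9) and (12). The names `a`, `c` and `iterate_game` are hypothetical.

```python
from typing import List, Sequence


def marginal_profit(i: int, b: Sequence[float], a: float,
                    c: Sequence[float]) -> float:
    """dU_i/db_i for the toy profit U_i = a*ln(1 + b_i) - c_i * b_i * sum(b)."""
    return a / (1.0 + b[i]) - c[i] * (b[i] + sum(b))


def iterate_game(b0: Sequence[float], a: float, c: Sequence[float],
                 beta: Sequence[float], steps: int = 500) -> List[float]:
    """Run the assumed update b_i <- b_i + beta_i * b_i * dU_i/db_i."""
    b = list(b0)
    for _ in range(steps):
        # All SUs update simultaneously from the same previous state.
        b = [b[i] + beta[i] * b[i] * marginal_profit(i, b, a, c)
             for i in range(len(b))]
    return b


# Two SUs with equal unit-price factors converge to equal bandwidths.
print([round(x, 4) for x in iterate_game([1.0, 1.0], 18.0, [1.0, 1.0], [0.05, 0.05])])
# A higher price factor for the second SU leaves it with less bandwidth.
print([round(x, 4) for x in iterate_game([1.0, 1.0], 18.0, [1.0, 1.5], [0.05, 0.05])])
```

With these toy values the symmetric game settles at $b_1 = b_2 = 2$, where the marginal profit of both SUs is zero; raising the price factor of one SU shifts the fixed point so that this SU ends up with the smaller share.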
The stability of the dynamic game is analyzed by establishing the Jacobian matrix of the iteration in Equation (12). The game process is stable when the absolute values of all eigenvalues of the Jacobian matrix are less than 1, which yields a constraint on the learning factors. This relation guarantees the stability of the game process [9].
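The eigenvalue criterion can be checked numerically for a two-SU game. The sketch below is an illustration under stated assumptions: it uses a toy update map $b_i \leftarrow b_i + \beta_i b_i (a / (1 + b_i) - c_i (b_i + \sum_j b_j))$ with $a = 18$ and $c_i = 1$, whose fixed point is $(2, 2)$, rather than the paper's own profit function. The Jacobian is estimated by central differences, and all function names are hypothetical.

```python
import cmath
from typing import Callable, List, Sequence


def update_map(b: Sequence[float], a: float, c: Sequence[float],
               beta: Sequence[float]) -> List[float]:
    """One step of the assumed update b_i <- b_i + beta_i * b_i * dU_i/db_i."""
    total = sum(b)
    return [
        b[i] + beta[i] * b[i] * (a / (1.0 + b[i]) - c[i] * (b[i] + total))
        for i in range(len(b))
    ]


def jacobian_2x2(f: Callable[[Sequence[float]], List[float]],
                 b: Sequence[float], h: float = 1e-6) -> List[List[float]]:
    """Central-difference Jacobian J[i][j] = d f_i / d b_j of a 2-variable map."""
    cols = []
    for j in range(2):
        plus, minus = list(b), list(b)
        plus[j] += h
        minus[j] -= h
        fp, fm = f(plus), f(minus)
        cols.append([(fp[i] - fm[i]) / (2.0 * h) for i in range(2)])
    return [[cols[j][i] for j in range(2)] for i in range(2)]


def spectral_radius_2x2(J: List[List[float]]) -> float:
    """Largest eigenvalue magnitude of a 2x2 matrix (closed form)."""
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + disc) / 2.0), abs((tr - disc) / 2.0))


def is_stable(beta: Sequence[float], b_eq: Sequence[float] = (2.0, 2.0),
              a: float = 18.0, c: Sequence[float] = (1.0, 1.0)) -> bool:
    """True if all Jacobian eigenvalues at b_eq lie inside the unit circle."""
    J = jacobian_2x2(lambda b: update_map(b, a, c, beta), b_eq)
    return spectral_radius_2x2(J) < 1.0


print(is_stable((0.05, 0.05)))  # small learning factors
print(is_stable((0.25, 0.25)))  # learning factors that are too large
```

For this toy model with equal learning factors, the eigenvalues at the fixed point are $1 - 6\beta$ and $1 - 10\beta$, so the iteration is locally stable only for $\beta < 0.2$. This reproduces, in miniature, the upper bound on the learning factors described in the text.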

Algorithm
We now give the detailed procedure as Algorithm 1, based on the analysis in Sections 3.1 to 3.3.

Algorithm 1:
Given one PU system (including one DC and some PUs) and some SUs ($\mathrm{SU}_i$, $i = 1$ to $N$) in a region grid of the MCRCS. If the PUs have free spectra, then for $i = 1$ to $N$, do steps (a)-(f):
(a) $\mathrm{SU}_i$ requests to lease spectrum;
(b) The DC obtains the detection capability of $\mathrm{SU}_i$, denoted by $w_i$;
(c) If $w_i = 1$, use (8) to set the lowest price for $\mathrm{SU}_i$; otherwise, use (8) to set the price based on the detection capability;
(d) Substitute (6) and (8) into the profit function of $\mathrm{SU}_i$;
(e) Use (9) to calculate the profit of $\mathrm{SU}_i$;
(f) Use (12) to allocate spectrum to $\mathrm{SU}_i$.
The algorithm is executed at the DC in the MCRCS.In this algorithm, we have developed the new revenue function (6) and price function (8).Furthermore, we use these two functions to develop the profit function (9) and allocate the free spectra within the region effectively.

Simulation Results and Analysis
To analyze the performance of the EDSAA, it is assumed that there is only one PU system (including one DC and one PU) and two SUs in a region grid of the MCRCS. First, the static game is simulated with the following parameters: (1) the maximum available spectrum bandwidth of all SUs is 15 MHz; (2) the revenue per unit transmission rate of each SU is $r_i = 10$; (3) the SNR of each SU is $\gamma = 10$ dB; (4) the price evaluation criterion is $\alpha = 2$; (5) the target BER is $B_i^{tar} = 10^{-4}$.
The best spectrum allocation curve of each SU is shown in Figure 3. To facilitate the comparative analysis, we set the parameter $v$ to 0 to minimize the influence between SUs.
As shown in Figure 3, the intersection points of the curves are the Nash equilibrium points in the different cases. If there are three SUs in the region, the best spectrum allocation curve for each user becomes a surface, and the Nash equilibrium point is the intersection of the three surfaces [8]. It can be seen that the position of the Nash equilibrium is affected by the SU's detection capability ($w_i$) and demand factor ($\sigma_i$). A stronger detection capability and a greater spectrum demand of the SU result in a higher best spectrum allocation curve. Conversely, when the SU's detection capability or spectrum demand weakens, the best spectrum allocation curve is lower.
Second, in order to determine the relationship between the learning factors and the convergence of the game, we carry out the corresponding dynamic game simulation, with the results shown in Figure 4. Once the detection capabilities of both sides of the game are fixed, each learning factor has an upper bound that must be respected for the game process to converge to the stable state. As shown in Figure 4, when the learning factors of both sides take values in the lower-left region, the game eventually converges. However, if the learning factors lie outside the corresponding region, the game process does not converge. When the detection capability of the users is weaker, the stable region of the learning factors shrinks accordingly.
... to the SU is also different. The bandwidth of the SU with a strong detection capability is wider, while the bandwidth of the SU with a weak detection capability is narrower (represented by the green curve).
It can be seen from the above analysis that the spectrum allocated by the EDSAA based on the Cournot model is positively correlated with the detection capability and the demand factor of the SUs. The resulting spectrum distribution is therefore more consistent with each user's actual situation and reduces spectrum waste, which significantly improves the system's spectrum utilization.
Finally, we examine the price function of the EDSAA based on the Cournot model. When the detection capabilities of the SUs are different, the convergence results are also different. This is because the EDSAA dynamically adjusts the price function of each SU according to the detection capability fed back by the system. As shown in Figure 6, the detection capability of $\mathrm{SU}_1$ is held constant ($w_1 = 1.0$), so the figure shows how the detection capability of $\mathrm{SU}_2$ ($w_2$) and the evaluation criterion ($\alpha$) influence the price functions of $\mathrm{SU}_1$ and $\mathrm{SU}_2$ in the spectrum allocation process. (Similar results are obtained when the detection capability of $\mathrm{SU}_2$ ($w_2$) is held constant; these are not shown in this paper.) It can be seen from Figure 6 that the price set by the proposed method is negatively correlated with the detection capability of the SUs: an SU with a higher detection capability has a lower price function. At the same time, the EDSAA introduces the system evaluation criterion ($\alpha$) to adjust how quickly the user price function changes, which is used to balance the sensitivity and the filtering ability of the EDSAA.

Conclusions
In this paper, we analyze the EDSAA from an economic point of view. On the basis of the Cournot game, the revenue function and the price function are improved, and the existence and effectiveness of the Nash equilibrium are verified. In addition, the elastic business model is added to the revenue function, which is more consistent with the actual situation because the unit revenue of an SU is higher when it is allocated a smaller proportion of the spectrum. Once the basic communication quality is met, the SUs will not seek a greater proportion of the spectrum, because the revenue will be reduced. In the price function, parameters are added that reflect the SU's capability to detect the PU, which are used to determine the priority of the SUs. Users with lower priorities are assigned fewer spectrum resources, which is the basis of system optimization. The simulation results show that the EDSAA settings in this paper are more reasonable and more consistent with practical applications, which can effectively protect the rights and profits of the PUs in addition to making full use of spectrum resources.

Algorithms 2017, 10, 103

Figure 3. Best spectrum allocation curve and Nash equilibrium point under the static game.

Figure 4. The influence of learning factors on game stability.

Figure 5. Iteration curve of the dynamic game of secondary user (SU).

Figure 6. SU 1 and SU 2 price function with different detection capabilities and different evaluation criteria.