A Joint Communication and Computation Design for Probabilistic Semantic Communications

In this paper, the problem of joint transmission and computation resource allocation for a multi-user probabilistic semantic communication (PSC) network is investigated. In the considered model, users employ semantic information extraction techniques to compress their large-sized data before transmitting them to a multi-antenna base station (BS). Our model represents large-sized data through substantial knowledge graphs, utilizing shared probability graphs between the users and the BS for efficient semantic compression. The resource allocation problem is formulated as an optimization problem with the objective of maximizing the sum equivalent rate of all users, considering the total power budget and semantic resource limit constraints. The computation load considered in the PSC network is formulated as a non-smooth piecewise function with respect to the semantic compression ratio. To tackle this non-convex non-smooth optimization challenge, a three-stage algorithm is proposed, where the solutions for the receive beamforming matrix of the BS, the transmit power of each user, and the semantic compression ratio of each user are obtained stage by stage. The numerical results validate the effectiveness of our proposed scheme.


I. INTRODUCTION
The rapid development of wireless communication technology has initiated an era of unprecedented connectivity [1] that brings with it a growing complexity of data transmission. Moreover, the principles of information theory have undeniably shaped modern communication systems. While this framework has been invaluable, it inherently falls short in capturing the richer semantic dimension of the information being exchanged [2]. In response to the limitations of traditional information theory, the concept of semantic communication has emerged as a compelling technology [3] to handle the growing complexity of data transmission. Semantic communication transcends the mere exchange of abstract symbols, instead placing an emphasis on the meaning and purpose of a message [4]. Unlike conventional communication, which focuses on data rate maximization, semantic communication prioritizes the transmission of meaning.
The advent of semantic communication has gained significant attention in the realm of communication research, representing a departure from established paradigms [5]. However, despite its growing importance, the concept of semantic communication remains in a state of ongoing evolution [6] characterized by the lack of a universally accepted definition, a comprehensive theoretical framework, and a unified understanding [7]. Research in this field is exploratory, reflecting the challenges and opportunities of semantic communication in modern communication systems.
To achieve the advantages of semantic communication, one of the intriguing challenges is how to effectively obtain key performance indicators (KPIs) for performance evaluation. These KPIs include various aspects such as semantic computation consumption, quality of semantic information extraction, and semantic capacity. Current research mainly employs two methodologies to derive KPIs in semantic communication. The first approach relies on simulation, where semantic-related metrics, such as semantic rate, are obtained utilizing functions derived from simulation results [8]-[11]. The second approach involves analysis, where expressions related to semantic communication, such as semantic computation consumption, are derived through theoretical analysis [12]-[15]. In simulation-based studies, Yan et al. achieved maximum spectral efficiency by optimizing channel assignment and the number of semantic symbols [8], [16]. Addressing energy efficiency, the authors in [17] conducted optimization for total energy consumption under latency constraints. Cang et al. integrated semantic communication with mobile edge computing (MEC), minimizing energy consumption by optimizing semantic-aware division factors and managing communication and computation resources [18]. In analysis-based studies, the authors in [12] optimized the total energy of the entire system through strategic semantic level selections.
In addition to characterizing the KPIs of semantic communication, the representation of semantic information is also a challenging aspect of semantic communication [19]. Although many approaches use auto-encoders for semantic compression [20]-[22], resulting in data of small size that is considered to be semantic information, this output often lacks interpretability and cannot be directly validated against human understanding. To address this limitation, some works [23], [24] proposed the use of knowledge graphs as a representation method aligned with human logic. A knowledge graph generally consists of a set of nodes connected by edges [25]. Each node represents an entity, which can be a real-world object, a concept, a temporal reference, etc. The edges represent the semantic relationship between these entities. An illustrative example of a knowledge graph is shown in Fig. 1. Notably, knowledge graphs efficiently encapsulate substantial information within a compact data size, making them an ideal candidate for semantic information representation.
Recently, there has been significant research investigating semantic communication over wireless networks. The authors in [26] introduced deep learning techniques to joint source-channel coding of text, which laid the foundation of a semantic communication system for text transmission. This research offered novel perspectives and methods for effectively encoding and transmitting textual information. Building upon this, Yao et al. further explored the design of text transmission by proposing an iterative semantic coding approach [27]. The objective of this approach was to accurately capture and transmit the semantic content of text, thereby enhancing the efficiency and accuracy of transmission. Further, semantic triples and knowledge graphs have been employed to enable semantic communication. Liu et al. investigated a task-oriented semantic communication approach based on semantic triples [28]. This approach focused on effectively encoding and transmitting key semantic information based on specific task requirements. Additionally, the work in [29] proposed a cognitive semantic communication framework with knowledge graphs. This work presented a simple, general, and interpretable solution for detecting semantic information by utilizing triples as semantic symbols. Considering the unique property of semantic communication, resource allocation and performance optimization are crucial factors in the development of semantic communication systems. Wang et al. employed deep reinforcement learning to address the resource allocation problem in semantic communication [30]. This study introduced new strategies to effectively allocate communication resources to ensure efficient transmission of semantic information. However, the aforementioned works [26]-[30] did not take into account the computational power requirements of semantic communication systems, which are important for energy-constrained wireless networks [31].
In this paper, we develop a multi-user probabilistic semantic communication (PSC) framework that jointly considers transmission and computation consumption. The key contributions of this work are summarized as follows:
• We consider a PSC network in which multiple users employ semantic information extraction techniques to compress their original large-sized data and transmit the extracted information to a multi-antenna base station (BS). In our model, users' large-sized data is represented by extensive knowledge graphs and is compressed based on the shared probability graph between the users and the BS.
• We formulate an optimization problem that aims to maximize the sum equivalent rate of all users while considering total power and semantic resource limit constraints. This joint optimization problem takes into account the trade-off between transmission efficiency and computation complexity.
• To solve this non-convex non-smooth problem, a low-complexity three-stage algorithm is proposed. In stage 1, the receive beamforming matrix is optimized using the minimum mean square error (MMSE) strategy. In stage 2, we substitute the transmit power with the semantic compression ratio and develop an alternating optimization (AO) method to perform a rough search for the semantic compression ratio. In stage 3, gradient ascent is used to refine the semantic compression ratio. Numerical results show the effectiveness of the proposed algorithm.
The remainder of this paper is organized as follows. The system model and problem formulation are described in Section II. The algorithm design is presented in Section III. Simulation results are analyzed in Section IV. Conclusions are drawn in Section V.

II. SYSTEM MODEL AND PROBLEM FORMULATION
Consider an uplink wireless PSC network with one multi-antenna BS and $N$ single-antenna users, as shown in Fig. 2. The BS is equipped with $M$ antennas, and the set of users is denoted by $\mathcal{N} = \{1, 2, \cdots, N\}$.

A. Semantic Communication Model
We employ probability graphs as the knowledge base between the semantic transmitter (each user) and the semantic receiver (BS). A probability graph integrates information from multiple knowledge graphs, extending the conventional knowledge graph by introducing the dimension of relational probability. An illustrative example of a probability graph is depicted in Fig. 3. A traditional knowledge graph comprises numerous triples, and each triple can be represented by
$$(h, r, t), \tag{1}$$
where $h$ is the head entity, $t$ denotes the tail entity, and $r$ represents the relation between $h$ and $t$. In a traditional knowledge graph, the relations are typically fixed. In contrast, in a probability graph, each relation is associated with a specific probability, representing the likelihood of that particular relation occurring for a fixed head entity and tail entity. We assume that each user needs to transmit several knowledge graphs. These knowledge graphs are generated from extensive textual data (picture/audio/video data can also be used) after undergoing named entity recognition (NER) [32] and relation extraction (RE) [33], resulting in abstracted information. Using the shared probability graph between a user and the BS, one can further compress the transmitted knowledge graphs.
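As a rough illustration of how relational probabilities could be collected from knowledge graph samples, the following Python sketch counts the relations observed for each (head, tail) pair and normalizes the counts. The sample triples and the function name `build_probability_graph` are hypothetical, and only first-order (unconditional) probabilities are covered; the multidimensional conditional probabilities of [12] are not reproduced here.

```python
from collections import Counter, defaultdict


def build_probability_graph(knowledge_graphs):
    """Map each (head, tail) pair to the empirical probability of its relations."""
    counts = defaultdict(Counter)
    for graph in knowledge_graphs:
        for head, relation, tail in graph:
            counts[(head, tail)][relation] += 1
    prob_graph = {}
    for pair, rel_counts in counts.items():
        total = sum(rel_counts.values())
        prob_graph[pair] = {rel: c / total for rel, c in rel_counts.items()}
    return prob_graph


# Hypothetical knowledge graph samples, each a list of (h, r, t) triples.
samples = [
    [("Alice", "works_at", "LabX"), ("LabX", "located_in", "London")],
    [("Alice", "works_at", "LabX"), ("LabX", "located_in", "London")],
    [("Alice", "visits", "LabX"), ("LabX", "located_in", "London")],
    [("Alice", "works_at", "LabX")],
]
prob_graph = build_probability_graph(samples)
print(prob_graph[("Alice", "LabX")])
```

In this toy data set, the relation between "Alice" and "LabX" is "works_at" in three of four samples, so a receiver holding the same graph could guess it with probability 0.75 when the relation is omitted.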
The probability graph extends the dimensionality of relations by statistically enumerating the occurrences of various relations associated with the same head and tail entities across diverse knowledge graph samples. Leveraging the statistical information from the probability graph, a multidimensional conditional probability matrix can be constructed. This matrix reflects the likelihood of a specific triple being valid under the condition that certain other triples are valid. This enables the omission of relations in the knowledge graph before transmission, resulting in data compression. However, it is crucial to note that achieving a smaller data size necessitates a lower semantic compression ratio, which demands higher-dimensional conditional probabilities. This decrease in semantic compression ratio comes at the cost of increased computational load, thus presenting a trade-off between communication and computation for the considered PSC network. The specific implementation details of the probability graph can be found in [12].
Within the framework of the considered PSC network, each user possesses a personalized local probability graph that stores statistical information about their historical data. Each user $n$ individually performs semantic information extraction, compressing the original large-sized data $D_n$ based on its stored probability graph with the semantic compression ratio denoted by $\rho_n$. Subsequently, the obtained compressed data, $C_n$, is transmitted to the BS with transmit power $p_n^t$. Meanwhile, the BS maintains identical probability graphs corresponding to all $N$ users. Once the BS receives the semantic data from user $n$, it conducts semantic inference to recover the compressed semantic information using the shared probability graph of user $n$. The overall framework of the considered PSC network is depicted in Fig. 4.
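The extract-transmit-infer loop described above can be sketched as follows. This is a deliberately simplified illustration, not the scheme of [12]: a relation is omitted only when it is the most likely one for its (head, tail) pair, the data size is measured in symbols rather than bits, and the graph contents and function names are hypothetical.

```python
def likeliest_relation(pair, prob_graph):
    """Most probable relation for a (head, tail) pair, or None if the pair is unknown."""
    candidates = prob_graph.get(pair, {})
    return max(candidates, key=candidates.get) if candidates else None


def compress(graph, prob_graph):
    """User side: drop relations that the receiver can infer from the shared graph."""
    return [
        (h, None, t) if r == likeliest_relation((h, t), prob_graph) else (h, r, t)
        for h, r, t in graph
    ]


def recover(compressed, prob_graph):
    """BS side: fill omitted relations with the most probable one."""
    return [
        (h, likeliest_relation((h, t), prob_graph) if r is None else r, t)
        for h, r, t in compressed
    ]


def size(graph):
    """Toy size measure: number of transmitted symbols (an assumption, not bits)."""
    return sum(1 for triple in graph for item in triple if item is not None)


# Hypothetical shared probability graph and knowledge graph to transmit.
PROB_GRAPH = {
    ("Alice", "LabX"): {"works_at": 0.75, "visits": 0.25},
    ("LabX", "London"): {"located_in": 1.0},
}
graph = [
    ("Alice", "works_at", "LabX"),
    ("LabX", "located_in", "London"),
    ("Alice", "visits", "London"),
]
compressed = compress(graph, PROB_GRAPH)
ratio = size(compressed) / size(graph)
print(compressed, ratio)
```

Here two of the three relations are omitted, so the toy compression ratio is 7/9, and the receiver recovers the original graph exactly because both omitted relations are the most probable ones.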

B. Transmission Model
As mentioned above, the BS is equipped with $M$ antennas to serve $N$ single-antenna users. We assume that the number of users is not greater than the number of antennas at the BS, that is, $N \leq M$. Therefore, space-division multiple access (SDMA) can be employed.
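We consider the uplink transmission from all users to the BS, and the received signal at the BS can be mathematically represented by (2), where $\mathbf{W} = [\mathbf{w}_1, \mathbf{w}_2, \cdots, \mathbf{w}_N] \in \mathbb{C}^{M \times N}$ represents the receive beamforming matrix at the BS, with $\mathbf{w}_n \in \mathbb{C}^{M \times 1}$ being the receive beamforming vector for user $n$. The matrix $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \cdots, \mathbf{h}_N] \in \mathbb{C}^{M \times N}$ denotes the multiple access channel matrix from all $N$ users to the antenna array of the BS. Each vector $\mathbf{h}_n \in \mathbb{C}^{M \times 1}$ represents the channel vector between the BS and user $n$, and is determined by the specific propagation environment. Here, we assume $[\mathbf{H}]_{i,j} \sim \mathcal{CN}(0, \beta)$, where $[\cdot]_{i,j}$ denotes an element in a matrix and $\beta$ signifies the long-term channel power gain. The vector $\mathbf{p} = [p_1^t, p_2^t, \cdots, p_N^t]^T$ collects the transmit powers, where the transmit power of user $n$ is denoted by $p_n^t$. The vector $\mathbf{n} \in \mathbb{C}^{M \times 1}$ represents additive white Gaussian noise (AWGN) at the BS. We assume that $[\mathbf{n}]_i \sim \mathcal{CN}(0, \sigma^2)$, where $[\cdot]_i$ denotes an element in a vector, and $\sigma^2$ denotes the average noise power.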
For the uplink transmission that utilizes linear combining at the BS, the received signal-to-interference-plus-noise ratio (SINR) for the signal from user $n$ can be given by (3), and the achievable rate of user $n$ can be expressed as in (4).

In the considered PSC network, the original large-sized data $D_n$ is compressed into small-sized data $C_n$ with a semantic compression ratio prior to transmission. The semantic compression ratio for user $n$ is defined as
$$\rho_n = \frac{\mathrm{size}(C_n)}{\mathrm{size}(D_n)}, \tag{5}$$
where the function $\mathrm{size}(\cdot)$ quantifies the data size in terms of bits.
Hence, we can calculate an equivalent rate for user $n$ by scaling the achievable rate in (4) by the factor $1/\rho_n$, as given in (6). This equivalent rate represents the transmission rate perceived by the receiver after decoding. Because one bit in the compressed data $C_n$ can represent $1/\rho_n$ bits in the original data $D_n$, the factor $1/\rho_n$ appears in the equivalent rate expression (6).
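The chain from SINR to achievable rate to equivalent rate can be sketched in a few lines of Python. The sketch assumes the textbook SINR expression for linear combining (desired power over interference plus noise scaled by the combiner norm) and a unit-bandwidth rate $\log_2(1 + \mathrm{SINR})$; the exact forms of (3) and (4) may differ, for example by a bandwidth factor. All numerical values are hypothetical.

```python
import math


def inner(a, b):
    """Hermitian inner product a^H b of two complex vectors given as lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def sinr(n, W, H, p, sigma2):
    """SINR of user n; W[k] and H[k] are the combiner and channel vectors of user k."""
    signal = p[n] * abs(inner(W[n], H[n])) ** 2
    interference = sum(
        p[k] * abs(inner(W[n], H[k])) ** 2 for k in range(len(p)) if k != n
    )
    noise = sigma2 * inner(W[n], W[n]).real
    return signal / (interference + noise)


def equivalent_rate(n, W, H, p, sigma2, rho):
    """Achievable rate (unit bandwidth assumed) divided by the compression ratio."""
    rate = math.log2(1.0 + sinr(n, W, H, p, sigma2))
    return rate / rho[n]


# Toy setup: M = 2 antennas, N = 2 users with orthogonal channels, W = H.
H = [[1 + 0j, 0j], [0j, 1 + 0j]]
W = H
p = [3.0, 1.0]
rho = [0.5, 1.0]
rates = [equivalent_rate(n, W, H, p, sigma2=1.0, rho=rho) for n in range(2)]
print(rates)
```

In this toy case user 0 has SINR 3 and thus a rate of 2, which the compression ratio 0.5 doubles to an equivalent rate of 4, while the uncompressed user 1 keeps an equivalent rate of 1.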

C. Computation Model
Each user $n$ needs to perform semantic information extraction based on their local probability graph to compress the original data $D_n$ into smaller-sized data $C_n$. This operation relies on computational resources, and it is important to note that the lower the semantic compression ratio $\rho_n$, the higher the computation load becomes.
According to equation (19) in [12], the computation load for the considered probability graph-based PSC network can be expressed as the piecewise linear function in (7), which takes the form $A_s \rho + B_s$ on segment $s$, where $A_s < 0$ represents the slope, $B_s > 0$ stands for the constant term, and $L_s$ is the boundary for each segment $s = 1, 2, \cdots, S$. These parameters are system-specific and are determined by the characteristics of the probability graphs. From (7), the computation load expression is a piecewise function, which is due to the fact that the semantic inference involves multiple levels of conditional probability functions and each level of conditional probability function results in one linear computation load expression.
Based on (7), the computation load, denoted by $g(\rho)$, exhibits a segmented structure with $S$ levels, and the magnitude of the slope increases segment by segment as the compression ratio decreases, as depicted in Fig. 5. This is because when the compression ratio is high, only low-dimensional conditional probabilities are employed, resulting in lower computational demands. However, as the compression ratio decreases, the need for higher-dimensional information arises. With higher information dimensions, the computation load becomes more intensive. Each transition in the segmented function $g(\rho)$ represents a switch to higher-dimensional probabilistic information for semantic information extraction.
Given the piecewise property of the computation load function, the computation power of user $n$ can be written as
$$p_n^c = g_n(\rho_n)\, p_0, \tag{8}$$
where $p_0$ represents a positive constant denoting the computation power coefficient. In this paper, our primary focus is on the computation load at the user side, as we are specifically addressing the uplink transmission scenario. In this context, each user needs to perform an information transmission task, and the computational overhead associated with semantic decoding at the BS is ignored since the BS always has a high power budget.
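The piecewise computation load and the resulting computation power can be illustrated with a short Python sketch. The three segments below (boundaries, slopes $A_s$ and constants $B_s$) are hypothetical values chosen only so that the function is continuous, has negative slopes and positive constants, and becomes steeper as $\rho$ decreases; they are not the parameters of [12].

```python
# Hypothetical segments: (lower bound, upper bound, slope A_s, constant B_s).
SEGMENTS = [
    (0.8, 1.0, -1.0, 1.0),
    (0.6, 0.8, -3.0, 2.6),
    (0.4, 0.6, -8.0, 5.6),
]


def computation_load(rho, segments=SEGMENTS):
    """Piecewise linear load g(rho) = A_s * rho + B_s on the segment containing rho."""
    for lower, upper, slope, const in segments:
        if lower <= rho <= upper:
            return slope * rho + const
    raise ValueError("rho lies outside the supported compression range")


def computation_power(rho, p0, segments=SEGMENTS):
    """Computation power: load multiplied by the computation power coefficient p0."""
    return computation_load(rho, segments) * p0


for rho in (1.0, 0.9, 0.7, 0.5):
    print(rho, computation_load(rho), computation_power(rho, p0=2.0))
```

With these toy parameters the load is zero when nothing is compressed ($\rho = 1$) and grows increasingly fast toward small $\rho$, which mirrors the trade-off described above.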

D. Problem Formulation
Given the considered system model, our objective is to maximize the sum equivalent rate of all users by jointly optimizing the semantic compression ratio of each user, the transmit power of each user, and the receive beamforming matrix of the BS, subject to the maximum total power of each user. The sum rate maximization problem can be formulated as problem (9), where $\boldsymbol{\rho} = [\rho_1, \rho_2, \cdots, \rho_N]^T$ and $\rho_n^{\min}$ is the semantic compression limit for user $n$. Constraint (9a) reflects a limit on the sum of transmit power and computation power for user $n$, ensuring it remains within the overall power limit $p_n^{\max}$. Constraint (9b) enforces the non-negativity of each user's transmit power. Lastly, constraint (9c) bounds the semantic compression ratio for each user.
It is essential to recognize that semantic compression ratio and transmit power are tightly coupled in problem (9). Smaller compression ratios scale up the equivalent rate, but they also require more computation power, which under constraint (9a) leaves less transmit power and consequently reduces the objective function. Therefore, achieving the right balance between the effects of semantic compression ratio and transmit power is the key to the solution of problem (9). Another important aspect of problem (9) is the inclusion of the segmented function $g_n(\rho_n)$ in constraint (9a), which introduces a distinct challenge to the optimization process. Since the objective function is highly non-convex and constraint (9a) is non-smooth, it is generally hard to obtain the optimal solution of problem (9) with existing optimization tools in polynomial time. Thus, we develop a suboptimal solution in the next section.

III. ALGORITHM DESIGN
In this section, a three-stage algorithm is proposed to solve problem (9), consisting of MMSE for the receive beamforming matrix, a rough search for the semantic compression ratio, and a refined search for the semantic compression ratio. These three stages are explained in detail below.

A. Stage 1: MMSE for Receive Beamforming Matrix
With the advancement of multiple-input multiple-output (MIMO) technology, various beamforming methods, including maximum ratio combining (MRC), zero-forcing (ZF), and MMSE, have been developed to deal with multi-user interference. In this stage, we employ the MMSE strategy to identify the receive beamforming matrix $\mathbf{W}$, which is effective in high-noise-power situations. Based on the MMSE technique, the closed-form solution of the receive beamforming matrix $\mathbf{W}$ is given in the following lemma.
Lemma 1 For any given transmit power of each user, i.e., $\mathbf{p}$, the optimal linear receive beamforming matrix $\mathbf{W}$ of the BS under the MMSE strategy can be written in the closed form (10), where $\mathbf{P} = \mathrm{diag}\{\mathbf{p}\}$ represents a diagonal matrix with $[\mathbf{P}]_{i,i} = [\mathbf{p}]_i$, and $\mathbf{I}_M$ is an identity matrix of size $M \times M$.
Proof See Appendix A. □
According to Lemma 1, the optimal MMSE receive beamforming is obtained as a closed-form solution, which is a function of the transmit power of all users. Based on the obtained $\mathbf{W}(\mathbf{P})$, we have (11). For notational convenience, we define $U_{nk}$ and $v_n$ as in (12) and (13), respectively. Thus, by substituting (11) into (3), the received SINR for the signal from user $n$ can be rewritten as (14). With the above variable substitution, problem (9) can be reformulated as problem (15).
In this stage, the receive beamforming matrix $\mathbf{W}$ is optimized using the MMSE strategy with a closed-form solution. Hence, the variables that require optimization in problem (9) are reduced, and the problem we need to solve becomes problem (15).
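To make stage 1 concrete, the sketch below computes an MMSE combiner in pure Python and compares its per-user SINR with that of MRC. It assumes the standard form $\mathbf{w}_n = (\mathbf{H}\mathbf{P}\mathbf{H}^H + \sigma^2 \mathbf{I}_M)^{-1}\mathbf{h}_n$; the closed form in (10) may scale the columns differently, which does not change the SINR. The channel values and the small `solve` helper are hypothetical.

```python
def inner(a, b):
    """Hermitian inner product a^H b of two complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def mmse_combiner(H_cols, p, sigma2):
    """Return the vectors w_n = (H P H^H + sigma2 I)^{-1} h_n, one per user."""
    M = len(H_cols[0])
    R = [[sum(pk * h[i] * h[j].conjugate() for pk, h in zip(p, H_cols))
          + (sigma2 if i == j else 0.0)
          for j in range(M)] for i in range(M)]
    H_rows = [[h[i] for h in H_cols] for i in range(M)]  # M x N layout
    W_rows = solve(R, H_rows)
    return [[W_rows[i][n] for i in range(M)] for n in range(len(H_cols))]


def sinr(n, W, H_cols, p, sigma2):
    """Standard linear-combining SINR of user n (assumed form)."""
    signal = p[n] * abs(inner(W[n], H_cols[n])) ** 2
    interference = sum(p[k] * abs(inner(W[n], H_cols[k])) ** 2
                       for k in range(len(p)) if k != n)
    return signal / (interference + sigma2 * inner(W[n], W[n]).real)


# Hypothetical channel vectors h_1, h_2 for M = 3 antennas and N = 2 users.
H_cols = [[1 + 1j, 0.5 - 0.2j, -0.3 + 0.8j],
          [0.9 - 0.4j, -0.7 + 0.1j, 0.2 + 0.6j]]
p, sigma2 = [2.0, 1.0], 0.1
W_mmse = mmse_combiner(H_cols, p, sigma2)
sinr_mmse = [sinr(n, W_mmse, H_cols, p, sigma2) for n in range(2)]
sinr_mrc = [sinr(n, H_cols, H_cols, p, sigma2) for n in range(2)]
print(sinr_mmse, sinr_mrc)
```

Because the MMSE combiner maximizes the SINR among all linear combiners, each entry of `sinr_mmse` is at least as large as the corresponding MRC value.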

B. Stage 2: Rough Search for Semantic Compression Ratio
In stage 2, we roughly determine the semantic compression ratio $\rho_n$ for each user by identifying the segment of the piecewise function $g_n(\rho_n)$ in which $\rho_n$ falls.
Without loss of generality, it is assumed that when the semantic compression ratio is equal to $\rho_n^{\min}$, the computation power $p_n^c$ exceeds the total power limit $p_n^{\max}$, i.e.,
$$g_n(\rho_n^{\min})\, p_0 > p_n^{\max}. \tag{16}$$
This is because as the semantic compression ratio tends to $\rho_n^{\min}$, the computation load rises dramatically as the probability dimension of the computation becomes very high.
With the above assumption, the following theorem can be derived.
Theorem 1 The optimal semantic compression ratio $\rho_n^*$ and transmit power $(p_n^t)^*$ of problem (15) must satisfy
$$(p_n^t)^* = p_n^{\max} - g_n(\rho_n^*)\, p_0. \tag{17}$$
Proof See Appendix B. □
Theorem 1 implies that constraint (15a) always holds with equality at the optimum of problem (15). Based on Theorem 1, we can substitute $p_n^t = p_n^{\max} - g_n(\rho_n) p_0$ into problem (15). Thus, problem (15) can be rewritten as problem (18). Note that $U_{nk}$ and $v_n$ are variables associated with the transmit power $\mathbf{p}$ according to equations (12) and (13). Since the transmit power $p_n^t$ is also a function of the semantic compression ratio $\rho_n$, $U_{nk}$ and $v_n$ become variables associated only with the semantic compression ratio $\boldsymbol{\rho}$. Therefore, problem (18) depends solely on the semantic compression ratio.
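The substitution suggested by Theorem 1 can be sketched as follows: once the compression ratio is fixed, the transmit power is whatever remains of the power budget after paying for computation. The load segments, $p^{\max}$ and $p_0$ below are hypothetical, and the bisection helper `min_feasible_rho` is an illustrative addition (not part of the proposed algorithm) that locates the smallest ratio leaving a non-negative transmit power.

```python
# Hypothetical load segments: (lower bound, upper bound, slope, constant).
SEGMENTS = [(0.8, 1.0, -1.0, 1.0), (0.6, 0.8, -3.0, 2.6), (0.4, 0.6, -8.0, 5.6)]


def load(rho):
    """Piecewise linear computation load g(rho) with toy parameters."""
    for lower, upper, slope, const in SEGMENTS:
        if lower <= rho <= upper:
            return slope * rho + const
    raise ValueError("rho lies outside the supported compression range")


def transmit_power(rho, p_max, p0):
    """Power budget left for transmission when the power constraint is tight."""
    return p_max - load(rho) * p0


def min_feasible_rho(p_max, p0, rho_min, iters=60):
    """Bisection for the smallest rho with non-negative transmit power.

    Relies on g being decreasing in rho, on the assumed infeasibility at
    rho_min, and on feasibility at rho = 1.
    """
    lo, hi = rho_min, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if load(mid) * p0 > p_max:
            lo = mid
        else:
            hi = mid
    return hi


p_max, p0, rho_min = 1.0, 1.0, 0.4
print(transmit_power(0.9, p_max, p0))      # feasible: most power left for transmission
print(transmit_power(0.5, p_max, p0))      # negative: computation alone exceeds the budget
print(min_feasible_rho(p_max, p0, rho_min))
```

In this toy setting, compression below roughly $\rho = 0.575$ is infeasible because the computation power alone would exhaust the budget.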
However, the difficulty in solving problem (18) still exists due to the non-convexity of the objective function and the non-smoothness of the computation load function $g_n(\rho_n)$. To handle the non-smoothness of $g_n(\rho_n)$, it can be reformulated as (19), where $S$ is the number of segments of the piecewise function $g_n(\rho_n)$, and the indicator $\theta_{ns}$ identifies the specific segment within which $\rho_n$ falls.
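Therefore, problem (18) can be rewritten as problem (20), where the binary matrix $\boldsymbol{\Theta}$ collects the segment indicators $\theta_{ns}$ of all users. In problem (20), both the binary integer matrix $\boldsymbol{\Theta}$ and the continuous variable $\boldsymbol{\rho}$ are involved. Thus, problem (20) becomes a challenging mixed-integer programming problem.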
It is important to note that $\boldsymbol{\Theta}$ and $\boldsymbol{\rho}$ are highly coupled in objective function (20) and constraint (20a). If $\boldsymbol{\rho}$ is determined, then so is $\boldsymbol{\Theta}$. However, a determined $\boldsymbol{\Theta}$ does not determine $\boldsymbol{\rho}$; it only narrows down the possible range of $\boldsymbol{\rho}$ by specifying the particular segment in which each $\rho_n$ lies.
Therefore, we obtain an approximate estimation of the semantic compression ratio ρ by determining Θ as follows.
For convenience, we define $\rho_{ns}$ as in (21), which represents the middle value of the semantic compression ratio in segment $s$ for user $n$.
We can see that $\rho_{ns}$ is a fixed value denoting the midpoint of segment $s$ in $g_n(\rho_n)$. Therefore, we use $\rho_{ns}$ to approximate the value of $\rho_n$ in every segment $s$. With this approximation, problem (20) can be simplified as problem (22). Problem (22) is an integer programming problem with respect to the Boolean matrix $\boldsymbol{\Theta}$.
Since the objective function of problem (22) remains intractable and challenging to convert into a convex function, we present an AO method to iteratively determine the integer matrix $\boldsymbol{\Theta}$.

With the semantic compression ratio level indicating vectors of the other users given, problem (22) reduces to a subproblem in the indicating vector $\boldsymbol{\theta}_n$ of user $n$ alone, stated as problem (23).
Since $\boldsymbol{\theta}_n$ is a one-hot vector of size $S \times 1$, we can simply iterate through all the possible locations where '1' could occur, which has $S$ possibilities. The $\boldsymbol{\theta}_n$ corresponding to the maximum objective function value is saved for subsequent iterations.
The iteration terminates when the objective function value of problem (23) converges or the iteration count reaches the maximum limit of $I_{\max}$. Algorithm 1 summarizes the AO method for solving the integer programming problem (22).

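Algorithm 1 AO Method for Solving the Integer Programming Problem (22) (concluding steps)
14: Set $i = i + 1$.
15: until the objective value of problem (9) converges or $i > I_{\max}$.
16: Output: The optimized Boolean matrix $\boldsymbol{\Theta}$.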
In this stage, the transmit power $\mathbf{p}$ is substituted with the semantic compression ratio $\boldsymbol{\rho}$ according to Theorem 1. Furthermore, the matrix $\boldsymbol{\Theta}$, which determines the range of $\rho_n$ for each user, is optimized employing the AO method. Next, we need to perform a refined search for the semantic compression ratio $\boldsymbol{\rho}$.
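The user-by-user search over one-hot segment indicators in stage 2 can be sketched as below. The objective is a toy surrogate with scalar channel gains and a fixed interference leakage factor, not the MMSE-based objective of problem (22); the midpoints, loads, gains, and constants are all hypothetical. Each user's segment is represented by its index rather than by a one-hot vector.

```python
import itertools
import math

MIDPOINTS = [0.9, 0.7, 0.5]   # hypothetical segment midpoints
LOADS = [0.1, 0.5, 1.6]       # hypothetical computation load at each midpoint
GAINS = [4.0, 2.0, 1.0]       # toy effective channel gains, one per user
P_MAX, P0, NOISE, LEAK = 1.0, 0.5, 0.1, 0.05


def objective(choice):
    """Toy sum equivalent rate when user n uses the midpoint of segment choice[n]."""
    powers = [P_MAX - LOADS[s] * P0 for s in choice]  # tight power constraint
    if min(powers) < 0:
        return float("-inf")
    total = 0.0
    for n, s in enumerate(choice):
        interference = LEAK * (sum(powers) - powers[n])
        total += math.log2(1 + GAINS[n] * powers[n] / (interference + NOISE)) / MIDPOINTS[s]
    return total


def alternating_optimization(init, num_segments, max_iters=20, tol=1e-9):
    """Update one user's segment at a time while the others are held fixed."""
    choice, best = list(init), objective(init)
    for _ in range(max_iters):
        previous = best
        for n in range(len(choice)):
            for s in range(num_segments):
                trial = choice[:n] + [s] + choice[n + 1:]
                value = objective(trial)
                if value > best:
                    best, choice = value, trial
        if best - previous <= tol:
            break
    return choice, best


init = [0, 0, 0]
choice, best = alternating_optimization(init, num_segments=3)
exhaustive = max(objective(list(c)) for c in itertools.product(range(3), repeat=3))
print(choice, best, exhaustive)
```

The AO result never falls below the initial objective value and never exceeds the exhaustive optimum; the two may coincide, but this is not guaranteed in general.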

C. Stage 3: Refined Search for Semantic Compression Ratio
To achieve an accurate value for the semantic compression ratio, a refined search is required in stage 3. This is because the result obtained in stage 2 is only an approximate estimate of the semantic compression ratio.
Based on the Boolean matrix $\boldsymbol{\Theta}$ obtained in stage 2, we can determine the segment in which $\boldsymbol{\rho}$ falls. Denote the selected segment for user $n$ by $S_n$, which means that $\rho_n$ is confined to segment $S_n$, as specified in (24). Once the segment of $\rho_n$ is determined, the computation load function $g_n(\rho_n)$ becomes a linear function instead of a non-smooth piecewise function.
Therefore, the problem to be solved in stage 3 can be reformulated as problem (25). Problem (25) is no longer non-smooth, as the piecewise function $g_n(\rho_n)$ has been reduced to a linear function. However, problem (25) remains non-convex, as the objective function is highly non-convex with respect to $\boldsymbol{\rho}$. Thus, it is generally hard to obtain the globally optimal solution of problem (25). Next, we employ the gradient ascent method to obtain a suboptimal solution.
For convenience, we define $f(\boldsymbol{\rho})$ in (26) as the objective function of problem (25). Note that it depends only on the semantic compression ratio $\boldsymbol{\rho}$. Thus, problem (25) can be rewritten as problem (27). To begin, the initial semantic compression ratio $\boldsymbol{\rho}^{(0)}$ is set as in (28). Let $\boldsymbol{\rho}^{(t-1)}$ denote the semantic compression ratio obtained in the $(t-1)$-th iteration. Subsequently, we can calculate the gradient of the objective function $f(\boldsymbol{\rho})$ at $\boldsymbol{\rho}^{(t-1)}$ according to the definition, as given in (29). Then, we can update $\boldsymbol{\rho}^{(t)}$ in the $t$-th iteration along the gradient ascent direction for a higher $f(\boldsymbol{\rho})$. The update takes a step of size $\tau^{(t)}$ along the gradient and then applies a boundary function $\mathcal{B}\{\boldsymbol{\rho}\}$, where $\tau^{(t)}$ represents the step size in the $t$-th iteration, and $\mathcal{B}\{\boldsymbol{\rho}\}$ ensures that the semantic compression ratio stays within the range determined by constraints (27a) and (27b).
Both the convergence rate and the final outcome of the gradient ascent algorithm are highly sensitive to the chosen step size. Oversized step sizes may speed up progress but risk non-convergence. Conversely, overly small step sizes converge more reliably, and typically to a more accurate solution, but require more iterations. Consequently, this paper employs the backtracking line search method to determine a suitable step size. Concretely, within the $t$-th iteration, the step size starts from a sizeable positive value, i.e., $\tau^{(t)} = \tau$, and is gradually reduced until the Armijo-Goldstein condition is satisfied, where $\xi \in (0, 1)$ serves as a hyper-parameter regulating the step size magnitude.
The algorithm terminates when the increase in $f(\boldsymbol{\rho})$ between the two most recent iterations is less than a very small positive number, denoted by $\epsilon$, or when the algorithm reaches the maximum iteration limit of $T_{\max}$. Algorithm 2 provides a summary of the gradient ascent algorithm.
In this stage, the non-smooth computation load function $g_n(\rho_n)$ degenerates to a linear function according to the Boolean matrix $\boldsymbol{\Theta}$ obtained in stage 2.

Algorithm 2 Gradient Ascent Algorithm for Refined Search of Semantic Compression Ratio
1: Initialize $\boldsymbol{\rho}^{(0)}$. Set iteration index $t = 0$.
2: Obtain $f(\boldsymbol{\rho})$ according to (26).
4: Calculate $\nabla_{\boldsymbol{\rho}} f(\boldsymbol{\rho}^{(t-1)})$ according to (29).
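The refined search can be illustrated by a projected gradient ascent with backtracking line search. The sketch below is generic rather than specific to problem (27): the objective is a hypothetical concave toy function, the gradient is approximated by central differences, the boundary function is a simple clipping to per-user segment limits, and the default values of `tau0`, `xi`, `shrink`, and `eps` are arbitrary choices.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def project(x, lower, upper):
    """Boundary function: clip every entry to its segment limits."""
    return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]


def gradient_ascent(f, rho0, lower, upper, tau0=1.0, xi=0.3, shrink=0.5,
                    eps=1e-10, max_iters=200):
    rho = project(rho0, lower, upper)
    value = f(rho)
    for _ in range(max_iters):
        grad = numerical_gradient(f, rho)
        tau = tau0
        while True:
            candidate = project([r + tau * g for r, g in zip(rho, grad)], lower, upper)
            # Armijo-type test: require a sufficient increase along the projected step.
            gain = sum(g * (c - r) for g, c, r in zip(grad, candidate, rho))
            if f(candidate) >= value + xi * gain or tau < 1e-12:
                break
            tau *= shrink
        improvement = f(candidate) - value
        if improvement > 0:
            rho, value = candidate, value + improvement
        if improvement < eps:
            break
    return rho, value


# Hypothetical concave objective whose unconstrained maximizer is (0.7, 0.95).
TARGET = [0.7, 0.95]


def toy_objective(rho):
    return -sum((r - c) ** 2 for r, c in zip(rho, TARGET))


rho_opt, f_opt = gradient_ascent(toy_objective, [0.62, 0.62],
                                 lower=[0.6, 0.6], upper=[0.8, 0.8])
print(rho_opt, f_opt)
```

For this toy problem the search settles near (0.7, 0.8): the first coordinate reaches its unconstrained optimum, while the second is held at the upper limit of its segment by the boundary function.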

D. Algorithm Analysis
The overall joint transmission and computation resource allocation algorithm for the multi-user PSC network is presented in Algorithm 3. Algorithm 3 consists of three stages that are executed sequentially. Therefore, the overall complexity of Algorithm 3 can be calculated as $\mathcal{O}(\text{Stage 1}) + \mathcal{O}(\text{Stage 2}) + \mathcal{O}(\text{Stage 3})$, where $\mathcal{O}(\text{Stage } i)$ denotes the computation complexity of stage $i$. The complexity of these three stages is analyzed as follows.
In stage 1, we derive the closed-form solution of the receive beamforming matrix $\mathbf{W}$ using the MMSE strategy. Therefore, the computation complexity of stage 1 lies in computing $\mathbf{W}$.
To compute $\mathbf{W}$, we need to perform four matrix multiplications and one matrix inversion. Hence, the computation complexity of stage 1 can be expressed as $\mathcal{O}(MN^2 + M^2N + M^3)$.
In stage 2, we employ the AO method to obtain the Boolean matrix $\boldsymbol{\Theta}$. If we exhaustively searched all possibilities of $\boldsymbol{\Theta}$, the number of candidates would be $\mathcal{O}(S^N)$, which is infeasible. Although the result obtained by the AO method may not be the globally optimal solution, it significantly reduces the number of candidate evaluations to $\mathcal{O}(I_{\max} S N)$. In Algorithm 1, the computation complexity for calculating the objective value in line 6 is $\mathcal{O}(N^2)$. Therefore, the computation complexity of stage 2 is $\mathcal{O}(I_{\max} S N^3)$.
In stage 3, we utilize the gradient ascent algorithm to search for the refined semantic compression ratio $\boldsymbol{\rho}$. In Algorithm 2, the computation complexity for calculating the gradient in line 4 is $\mathcal{O}(N^3)$. Let $B_{\max}$ denote the maximum number of iterations of the backtracking line search in lines 7 to 10 of Algorithm 2. Thus, the complexity of the backtracking line search is $\mathcal{O}(B_{\max} N)$. Consequently, the computation complexity of stage 3 is $\mathcal{O}(T_{\max}(N^3 + B_{\max} N))$.
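As a result, the total complexity of Algorithm 3 can be expressed as
$$\mathcal{O}\left(MN^2 + M^2N + M^3 + I_{\max} S N^3 + T_{\max}(N^3 + B_{\max} N)\right).$$

Algorithm 3 Joint Transmission and Computation Resource Allocation Algorithm (recoverable steps)
3: Update the receive beamforming matrix $\mathbf{W}$ according to (10).
4: Stage 2:
5: Substitute the transmit power $\mathbf{p}$ with the semantic compression ratio $\boldsymbol{\rho}$ according to Theorem 1.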
Since deducing the optimality of problem (9) is challenging in theory, obtaining the globally optimal solution would generally require exponential computation complexity, which is unrealistic. Therefore, we propose Algorithm 3 to provide a suboptimal solution for problem (9) with polynomial computation complexity.
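The gap between exhaustive search and the AO method in stage 2 can be made tangible with a quick count of candidate evaluations. The number of users matches the 8 users used in the simulations, whereas the number of segments and the AO iteration limit below are hypothetical values chosen purely for illustration.

```python
def candidate_counts(num_users, num_segments, max_iters):
    """Candidate evaluations: exhaustive search (S^N) versus the AO bound (I_max * S * N)."""
    exhaustive = num_segments ** num_users
    alternating = max_iters * num_segments * num_users
    return exhaustive, alternating


# N = 8 users as in the simulations; S = 5 and I_max = 10 are assumed values.
exhaustive, alternating = candidate_counts(num_users=8, num_segments=5, max_iters=10)
print(exhaustive, alternating)
```

Under these assumptions the exhaustive search would evaluate 390625 candidates, against at most 400 for the AO method.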

IV. SIMULATION RESULTS
In the simulations, the considered PSC network comprises 8 users, while the BS is equipped with 16 antennas. The multiple access channel matrix $\mathbf{H}$ is configured with a long-term channel power gain $\beta$ set to -90 dB, and the noise power is set to -10 dBm. Furthermore, we set the computation power coefficient to 1 and the maximum power limit to 30 dBm. For the semantic information extraction task based on the probability graph, we adopt the same parameters as in [14]. A summary of the main system parameters is provided in Table I. The proposed multi-user PSC system, enhanced by the probability graph with joint transmission and computation optimization, is labeled as the 'PSC' scheme. For comparison, we incorporate several benchmark schemes as follows.
• 'Non-semantic': This benchmark scheme represents a conventional communication approach where the original data is directly transmitted without employing semantic compression. In this scheme, all users' power is allocated solely to transmission, without any optimization for joint transmission and computation.
• 'PSC-S2': This scheme is a simplified version of the 'PSC' scheme, where the optimization process is performed only up to stage 2. The final result is the roughly estimated semantic compression ratio obtained from this stage.
• 'PSC-ZF': In this scheme, the ZF strategy is employed at stage 1. This means that the receive beamforming matrix $\mathbf{W}$ is calculated as $\mathbf{W} = \mathbf{H}(\mathbf{H}^H \mathbf{H})^{-1}$. The remaining stages are the same as in the 'PSC' scheme.
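The ZF combiner used in the 'PSC-ZF' benchmark can be checked numerically with a small pure-Python sketch. It computes $\mathbf{W}^H = (\mathbf{H}^H\mathbf{H})^{-1}\mathbf{H}^H$, which follows from $\mathbf{W} = \mathbf{H}(\mathbf{H}^H\mathbf{H})^{-1}$ because the Gram matrix is Hermitian, and verifies that $\mathbf{W}^H\mathbf{H}$ is the identity, i.e., that inter-user interference is removed. The channel matrix and the helper functions are hypothetical.

```python
def hermitian(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]


def matmul(A, B):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def zf_combiner_hermitian(H):
    """Return W^H = (H^H H)^{-1} H^H for an M x N channel matrix H."""
    H_h = hermitian(H)
    gram = matmul(H_h, H)
    return solve(gram, H_h)


# Hypothetical M = 3 by N = 2 channel matrix (rows are antennas, columns are users).
H = [[1 + 1j, 0.9 - 0.4j],
     [0.5 - 0.2j, -0.7 + 0.1j],
     [-0.3 + 0.8j, 0.2 + 0.6j]]
W_h = zf_combiner_hermitian(H)
product = matmul(W_h, H)
print(product)
```

The product is numerically the 2 x 2 identity: each user's combiner passes its own channel with unit gain and nulls the other user's channel, at the price of ignoring the noise level.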
In Fig. 6, we assess the convergence of the proposed 'PSC' scheme. Two convergence plateaus are discernible: the first pertains to the AO algorithm, while the second corresponds to the gradient ascent algorithm. During stage 2, the objective value exhibits rapid ascent and subsequent convergence. This can be attributed to the fact that, in this stage, the AO algorithm addresses an integer programming problem with a discrete and relatively small variable space. Upon the convergence of the AO algorithm, the 'PSC' scheme progresses to stage 3, wherein the gradient ascent algorithm is activated. In stage 3, the objective function converges to a value higher than that achieved in stage 2. This observation serves as validation for the effectiveness of the gradient ascent algorithm. Throughout the iterative process, the objective value steadily increases, eventually reaching a highly stable value. This outcome substantiates the efficacy of the overall algorithm design.
In Fig. 7, the relationship between the sum of equivalent rate and the number of users is depicted. The sum of equivalent rate increases for all schemes as the number of users increases. However, this increase is not proportional to the number of users. Specifically, when N = 8, the sum of equivalent rate is less than twice that when N = 4 within the same scheme. This phenomenon is attributed to inter-user interference at the receiver. Furthermore, the growth rate of the 'PSC' scheme surpasses that of the 'PSC-ZF' scheme, indicating that the MMSE strategy outperforms the ZF strategy in the examined scenario. The 'PSC' scheme consistently achieves the highest performance, while the sum rate of the 'Non-semantic' scheme consistently remains the lowest.
In Fig. 8, the variation of the sum of equivalent rate with the noise power is illustrated. The sum of equivalent rate decreases for all schemes as the noise power increases. When the noise power is small, the performance of the 'PSC' scheme and the 'PSC-ZF' scheme is comparable, suggesting that the ZF strategy is more effective in low-noise environments than in high-noise ones. Theoretically, when the noise power is zero, the formulas for the MMSE and ZF strategies yield identical results. However, in real-world scenarios, a complete absence of noise is implausible. Consequently, the superiority of the MMSE strategy over the ZF strategy becomes evident as the noise power increases. This is demonstrated in Fig. 8, where the 'PSC' scheme outperforms the 'PSC-ZF' scheme across the tested noise power levels.
In Fig. 9, the relationship between the sum of equivalent rate and the computation power coefficient is depicted. Notably, the 'Non-semantic' scheme maintains a constant sum of equivalent rate across different $p_0$ values because it does not use semantic communication techniques, and it consistently exhibits the lowest performance among the considered schemes. As the computation power coefficient decreases, the sum of equivalent rate of the other three schemes increases. This trend is attributed to the improved computation efficiency at lower $p_0$, which permits a lower semantic compression ratio and hence a higher sum of equivalent rate. The gap between the 'PSC-S2' scheme and the 'PSC' scheme varies with $p_0$. A small gap indicates that the solution of the 'PSC' scheme is close to the midpoint solution of the 'PSC-S2' scheme. Moreover, the sum of equivalent rate of the 'PSC-S2' scheme is a piecewise function of the computation power coefficient $p_0$. This behavior arises because the solution of the 'PSC-S2' scheme jumps to the midpoint of another segment of the computation load function $g_n(\rho_n)$ only when $p_0$ changes significantly.
In Fig. 10, the sum of equivalent rate is shown for varying maximum power limits. The sum of equivalent rate increases for all schemes as the maximum power limit increases, because a larger power budget raises the achievable rate of every user. Compared with the 'Non-semantic' scheme, the advantage of the 'PSC' scheme becomes more pronounced at higher maximum power limits $p_n^{\max}$. This can be attributed to the 'PSC' scheme's ability to allocate more power to semantic compression as the maximum power limit increases. The reduction in data size achieved through semantic compression contributes significantly to the overall sum of equivalent rate. Conversely, the 'Non-semantic' scheme can only allocate all power to transmission, which does not contribute as significantly to the sum of equivalent rate. Consequently, the proposed 'PSC' scheme exhibits substantial superiority when there is sufficient power.
To depict the allocation of computation power and transmission power within the considered network, Fig. 11 illustrates the distribution in both the 'PSC' and 'PSC-S2' schemes across various computation power coefficients. It can be seen that the sum of computation power and transmission power consistently equals the predefined maximum power limit $p_n^{\max}$, set at 30 dBm. The figure reveals no discernible pattern in the variation of computation power with respect to $p_0$, and the computation power of the 'PSC-S2' scheme fluctuates, at times surpassing and at other times falling below that of the 'PSC' scheme. This variability underscores the inherent challenge in achieving a balance between transmission and computation within the considered PSC network.
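The power split discussed above follows from the per-user budget, in which the transmit power and the computation power $g_n(\rho_n)p_0$ together use the whole limit $p_n^{\max}$. The sketch below shows this bookkeeping for a single user. The exact form and parameters of $g_n(\rho_n)$ are defined elsewhere in the paper and in [14]; the piecewise-linear segments used here are placeholder values chosen only for illustration, and the function names are hypothetical.

```python
def computation_load(rho, segments):
    """Piecewise-linear load g(rho) = A * rho + B on the segment containing rho.

    `segments` is a list of (rho_low, rho_high, A, B) tuples (placeholder values).
    """
    for rho_low, rho_high, a, b in segments:
        if rho_low <= rho <= rho_high:
            return a * rho + b
    raise ValueError("rho lies outside the modeled range")


def transmit_power(rho, p0, p_max, segments):
    """Power left for transmission when the budget is tight: p_t = p_max - g(rho) * p0."""
    p_t = p_max - computation_load(rho, segments) * p0
    if p_t < 0:
        raise ValueError("computation alone exceeds the power budget")
    return p_t


# Placeholder segments: the load grows as rho shrinks (stronger compression),
# and the two pieces meet at rho = 0.8.
SEGMENTS = [
    (0.8, 1.0, -1.0, 1.0),  # g(1.0) = 0.0, g(0.8) = 0.2
    (0.6, 0.8, -2.0, 1.8),  # g(0.8) = 0.2, g(0.6) = 0.6
]

p_max = 1.0  # 30 dBm expressed in watts
for p0 in (0.5, 1.0):
    p_c = computation_load(0.7, SEGMENTS) * p0       # computation power
    p_t = transmit_power(0.7, p0, p_max, SEGMENTS)   # transmission power
    print(p0, p_c, p_t, p_c + p_t)
```

For a fixed compression ratio, a larger $p_0$ shifts power from transmission to computation while the total stays at the limit, which is the constraint the figure reflects.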

V. CONCLUSION
This paper has introduced the PSC network, a paradigm in which multiple users employ semantic information extraction techniques to compress extensive original data before transmission to a multi-antenna BS. Our model represents large-sized data through comprehensive knowledge graphs, utilizing a shared probability graph between the users and the BS to facilitate efficient semantic compression. We formulated an optimization problem aimed at maximizing the sum of equivalent rate for all users, while considering total power constraints and semantic requirements. To tackle the non-convex and non-smooth nature of the optimization problem, we proposed a three-stage algorithm. This algorithm determines the receive beamforming matrix of the BS, the transmit power, and the semantic compression ratio for each user stage by stage. Numerical results demonstrate the effectiveness of the proposed scheme and its ability to balance transmission and computation.
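In future research, we plan to extend our exploration of resource management in the PSC network to diverse scenarios, such as unmanned aerial vehicle (UAV) networks, near-field communications, and other relevant domains. Additionally, since this study assumes a uniform computation power coefficient for every user, it is worth investigating the performance of the PSC network among computing-heterogeneous devices. These avenues present interesting directions for future research in the PSC network.

APPENDIX A PROOF OF LEMMA 1
The received signals at the BS without beamforming can be expressed as
$$\hat{\mathbf{y}} = \mathbf{H}\mathbf{x} + \mathbf{n}, \tag{36}$$
which means $\mathbf{y} = \mathbf{W}^H\hat{\mathbf{y}}$ based on (2) and (36). The goal of the MMSE strategy is to minimize the mean square error (MSE) between the transmitted signals $\mathbf{x}$ and the received signals $\mathbf{y}$. The error between $\mathbf{x}$ and $\mathbf{y}$ is
$$\mathbf{e} = \mathbf{y} - \mathbf{x} = \mathbf{W}^H\hat{\mathbf{y}} - \mathbf{x}. \tag{37}$$
To minimize the MSE between $\mathbf{x}$ and $\mathbf{y}$, represented by $\mathbb{E}\{\mathbf{e}^H\mathbf{e}\}$, where $\mathbb{E}\{\cdot\}$ denotes the expected value of a random variable, the following condition must be satisfied
$$\mathbb{E}\{\mathbf{e}\hat{\mathbf{y}}^H\} = \mathbf{0}, \tag{38}$$
which means there is no correlation between $\hat{\mathbf{y}}$ and $\mathbf{e}$. Condition (38) is equivalent to the condition that minimizes $\mathbb{E}\{\mathbf{e}^H\mathbf{e}\}$, because if the correlation between $\hat{\mathbf{y}}$ and $\mathbf{e}$ is nonzero, it can still be used to decrease $\mathbb{E}\{\mathbf{e}^H\mathbf{e}\}$. Substituting (37) into (38), we have
$$\mathbb{E}\{(\mathbf{W}^H\hat{\mathbf{y}} - \mathbf{x})\hat{\mathbf{y}}^H\} = \mathbf{0}, \tag{39}$$
which is equivalent to
$$\mathbf{W}^H\mathbb{E}\{\hat{\mathbf{y}}\hat{\mathbf{y}}^H\} = \mathbb{E}\{\mathbf{x}\hat{\mathbf{y}}^H\}. \tag{40}$$
According to (40), we have
$$\mathbf{W}^H = \mathbb{E}\{\mathbf{x}\hat{\mathbf{y}}^H\}\left(\mathbb{E}\{\hat{\mathbf{y}}\hat{\mathbf{y}}^H\}\right)^{-1}. \tag{41}$$
Let us deal with $\mathbb{E}\{\mathbf{x}\hat{\mathbf{y}}^H\}$ first. Substituting (36) into $\mathbb{E}\{\mathbf{x}\hat{\mathbf{y}}^H\}$, we obtain
$$\mathbb{E}\{\mathbf{x}\hat{\mathbf{y}}^H\} = \mathbb{E}\{\mathbf{x}(\mathbf{H}\mathbf{x} + \mathbf{n})^H\} = \mathbb{E}\{\mathbf{x}\mathbf{x}^H\mathbf{H}^H + \mathbf{x}\mathbf{n}^H\}. \tag{42}$$
Since there is no correlation between the transmitted signals $\mathbf{x}$ and the noise $\mathbf{n}$, i.e., $\mathbb{E}\{\mathbf{x}\mathbf{n}^H\} = \mathbf{0}$, we have
$$\mathbb{E}\{\mathbf{x}\hat{\mathbf{y}}^H\} = \mathbb{E}\{\mathbf{x}\mathbf{x}^H\}\mathbf{H}^H = \mathbf{P}\mathbf{H}^H. \tag{43}$$
Following a similar procedure, we can obtain
$$\mathbb{E}\{\hat{\mathbf{y}}\hat{\mathbf{y}}^H\} = \mathbf{H}\mathbb{E}\{\mathbf{x}\mathbf{x}^H\}\mathbf{H}^H + \mathbb{E}\{\mathbf{n}\mathbf{n}^H\} = \mathbf{H}\mathbf{P}\mathbf{H}^H + \sigma^2\mathbf{I}_M. \tag{44}$$
Substituting (43) and (44) into (41), we have
$$\mathbf{W}^H = \mathbf{P}\mathbf{H}^H\left(\mathbf{H}\mathbf{P}\mathbf{H}^H + \sigma^2\mathbf{I}_M\right)^{-1}, \tag{45}$$
which is equivalent to
$$\mathbf{W} = \left(\mathbf{H}\mathbf{P}\mathbf{H}^H + \sigma^2\mathbf{I}_M\right)^{-1}\mathbf{H}\mathbf{P}. \tag{46}$$
From (46), the obtained receive beamforming matrix is associated with the transmit power $\mathbf{P}$. □

APPENDIX B PROOF OF THEOREM 1
Theorem 1 can be proved by contradiction. Suppose that there exists a user $n$ such that
$$p_n^t + g_n(\rho_n)p_0 < p_n^{\max}. \tag{47}$$
Then, for user $n$, we can always decrease its semantic compression ratio $\rho_n$ due to (16) and constraint (15b). The objective function of problem (15) decreases monotonically with $\rho_n$, so a lower semantic compression ratio $\rho_n$ produces a higher value of the objective function in problem (15). Therefore, when the objective function of problem (15) reaches its maximum, the semantic compression ratio $\rho_n$ and transmit power $p_n^t$ of each user must satisfy
$$p_n^t + g_n(\rho_n)p_0 = p_n^{\max}. \tag{48}$$
Hence, Theorem 1 is proved. □

Fig. 2. An illustration of the considered PSC network.

Fig. 3. Illustration of the probability graph considered in the PSC system.

N − 1 θθ,
users, we need to determine the optimal θ n for the current user n. Then, we can have the following problem max ns (A ns ρ ns + B ns ) ks (A ks ρ ks + B ks ) + v n

Fig. 7. Sum of equivalent rate vs. number of users.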
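As a numerical sanity check on the MMSE derivation of Appendix A, the sketch below takes the receive beamformer of Lemma 1 to be $\mathbf{W} = (\mathbf{H}\mathbf{P}\mathbf{H}^H + \sigma^2\mathbf{I}_M)^{-1}\mathbf{H}\mathbf{P}$ and verifies two properties: the orthogonality condition $\mathbf{W}^H\mathbb{E}\{\hat{\mathbf{y}}\hat{\mathbf{y}}^H\} = \mathbb{E}\{\mathbf{x}\hat{\mathbf{y}}^H\}$, and the convergence of MMSE to ZF as the noise power vanishes. It assumes NumPy, a randomly drawn channel with unit-variance entries, and a diagonal transmit-power matrix $\mathbf{P}$ with illustrative values; none of these numbers come from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 16, 8  # BS antennas and users

# Assumed test inputs: unit-variance complex Gaussian channel, diagonal P.
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2.0)
P = np.diag(rng.uniform(0.1, 1.0, size=N))


def mmse_beamformer(H, P, sigma2):
    """W = (H P H^H + sigma2 * I_M)^{-1} H P, computed with a linear solve."""
    R_yy = H @ P @ H.conj().T + sigma2 * np.eye(H.shape[0])
    return np.linalg.solve(R_yy, H @ P)


sigma2 = 0.1
W = mmse_beamformer(H, P, sigma2)

# Second-order statistics used in the proof.
R_yy = H @ P @ H.conj().T + sigma2 * np.eye(M)  # E{y_hat y_hat^H}
R_xy = P @ H.conj().T                           # E{x y_hat^H}

# Orthogonality: E{e y_hat^H} = W^H R_yy - R_xy should be numerically zero.
orth_residual = np.linalg.norm(W.conj().T @ R_yy - R_xy)

# Low-noise limit: MMSE should approach the ZF beamformer H (H^H H)^{-1}.
W_zf = H @ np.linalg.inv(H.conj().T @ H)
gap_high_noise = np.linalg.norm(W - W_zf)
gap_low_noise = np.linalg.norm(mmse_beamformer(H, P, 1e-6) - W_zf)

print(orth_residual, gap_high_noise, gap_low_noise)
```

The shrinking gap between the two beamformers at low noise is consistent with the observation that the 'PSC' and 'PSC-ZF' schemes perform comparably when the noise power is small.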