Efficient Precoding and Power Allocation Techniques for Maximizing Spectral Efficiency in Beamspace MIMO-NOMA Systems

Beamspace MIMO-NOMA is an effective way to improve spectral efficiency. This paper focuses on a downlink non-orthogonal multiple access (NOMA) transmission scheme for a beamspace multiple-input multiple-output (MIMO) system. To increase the sum rate, we jointly optimize precoding and power allocation, which is a non-convex problem. To address this difficulty, we employ an alternating algorithm that optimizes the precoding and the power allocation in turn. For the precoding subproblem, we demonstrate that the original optimization problem can be transformed into an unconstrained optimization problem. Drawing inspiration from fractional programming (FP), we reformulate the problem and derive a closed-form expression for the optimization variable. In addition, we reduce the complexity of precoding by utilizing the Neumann series expansion (NSE). For the power allocation subproblem, we adopt a dynamic power allocation scheme that considers both intra-beam and inter-beam power optimization. Simulation results show that the energy efficiency of the proposed beamspace MIMO-NOMA is significantly better than that of other conventional schemes.


Introduction
As the coverage of mobile connections expands, wireless communication systems face an escalating demand for data traffic, which poses challenges in terms of spectral efficiency and energy efficiency. Non-orthogonal multiple access (NOMA) technology has emerged as a key solution for improving spectral efficiency and supporting massive connectivity, as it enables multiple users to share the same spectrum resource simultaneously. The application of NOMA in conventional terrestrial communication systems, which benefits from its superior spectral efficiency and its capacity to accommodate massive connectivity, has been thoroughly investigated in many aspects [1]. Additionally, beamspace multiple-input multiple-output (beamspace MIMO), another key technology, also has several advantages. It leverages the abundant spectrum resources in the millimeter wave (mmWave) band, enabling terminal equipment to achieve high-rate data transmission. Furthermore, by employing large-scale MIMO, beamspace MIMO forms directional beams with high gain, effectively mitigating the substantial path loss inherent in millimeter wave communication. Consequently, beamspace MIMO is recognized as a promising technology for future wireless communications [2]. Does this mean we can combine the NOMA and beamspace MIMO technologies to leverage their advantages in the power and spatial domains, leading to improved spectral efficiency? The answer is affirmative. Specifically, considering the characteristics of these two technologies, beamspace MIMO requires a large number of radio frequency (RF) chains, which leads to high energy consumption and renders the all-digital structure unsuitable for direct application [3]. Moreover, in beamspace MIMO, the number of supported users cannot exceed the number of RF chains, which limits the system's capacity to accommodate users. NOMA, however, excels at increasing the number of users that can access the system. Consequently, the integration of NOMA with mmWave massive MIMO, known as beamspace MIMO-NOMA, has emerged as a promising solution for significantly increasing the number of connections and further enhancing spectral efficiency. This approach has garnered growing research interest [4].

Prior Works
Typically, the optimization of precoding and power allocation designs is considered a means to improve the performance of beamspace MIMO-NOMA systems. These problems have been investigated jointly or partially. However, the presence of inter-beam and intra-beam interference makes these problems non-convex and challenging to solve [5]. Fortunately, researchers have developed efficient algorithms to tackle these challenges.
Some works have focused on separately designing precoding or power allocation to improve performance. In [6], the authors propose a zero-forcing (ZF) precoding scheme to mitigate interference between users and employ the Karush-Kuhn-Tucker (KKT) conditions to investigate the power allocation problem for maximizing the sum rate. Furthermore, [7] explores energy efficiency maximization through power allocation and presents a two-layer iterative algorithm to tackle the non-convex optimization problem. The outer layer converts the original fractional objective function by using the Dinkelbach method, while the inner layer utilizes alternating optimization to solve the transformed problem. In [8], the authors introduce a low-complexity iterative algorithm called the mean square error-based dynamic power allocation algorithm (MSE-DPA), which achieves near-perfect performance. Ref. [9] proposes a correlation-based criterion for user pairing to reduce inter-user interference, with ZF precoding applied to the paired users. The results demonstrate that the proposed scheme achieves higher spectral efficiency than the conventional scheme. In [10], the main objective is to design a low-complexity hybrid precoder (HP), for which the authors propose a symmetric successive over-relaxation (SSOR) algorithm combined with complex regularized zero-forcing (CRZF) linear precoding.
In addition, a significant body of work designs joint optimization schemes for precoding and power allocation to enhance system performance. In [11], the authors adopt a ZF-based precoding scheme to mitigate inter-beam interference and propose a dynamic power allocation method based on minimum mean square error (MMSE) to maximize the achievable sum rate in beamspace MIMO-NOMA systems. Ref. [12] addresses the limitations of complicated successive interference cancellation (SIC) that were disregarded in [11]. Based on the ZF beamforming technique, the power allocation optimization problem is represented as a fractional programming (FP) problem, which is transformed into a convex optimization problem using sequential convex approximation (SCA) and second-order cone (SOC) transformation. In [13], the authors formulate a joint hybrid beamforming and power allocation problem to maximize the sum rate. They employ the approximate ZF method to design the digital beamforming for minimizing inter-group interference and solve the analog beamforming problem with a constant-modulus constraint using a proposed boundary-compressed particle swarm optimization algorithm. In [14], the authors design ZF precoding matrices and evaluate power allocation coefficients based on optimal spectral efficiency to mitigate intra-beam interference. Additionally, they derive a tight closed-form formula for optimal spectral efficiency using KKT analysis. In [15], from the perspective of spectral efficiency, the authors propose a joint optimization framework and employ the quadratic transformation (QT) method to convert the non-convex power allocation problem into a convex problem. They also design an iterative approach to obtain the optimal power allocation and digital beamforming. In [16], the authors propose a hybrid precoder that combines user channel alignment and the ZF algorithm to enhance the SINR. Furthermore, they address the non-convex optimization problem by transforming it into a convex optimization problem for inter-cluster power allocation, which can be solved by using the KKT conditions.

Motivations and Contributions
While the aforementioned research contributions have established a strong foundation for beamspace MIMO-NOMA, further investigation and improvements are still necessary to address practical considerations. Firstly, there is scope for enhancing the optimization of key performance indicators that impact spectral efficiency through various methodologies. Secondly, reducing computational complexity while simultaneously improving spectral efficiency remains an open area of exploration. These observations have inspired our primary research objectives in this study. In this work, our main goal is to maximize the sum rate of beamspace MIMO-NOMA in downlink communications and to propose an optimal design scheme for joint precoding and power allocation, building upon the previous research. Against this backdrop, we emphasize the following four aspects that constitute the contributions of our paper:

• Firstly, we employ block optimization to solve the joint problem of precoding and power allocation in beamspace MIMO-NOMA systems. In the precoding optimization part, we demonstrate that the original constrained problem can be transformed into an unconstrained problem. Moreover, we elucidate the quantitative relationship between the solutions of the original problem and the equivalent unconstrained problem. For the power allocation part, we adopt a dynamic power allocation method based on a joint power optimization problem, taking into account power optimization within and between beams.

• Secondly, we devise a precoding scheme based on FP to decouple the optimization variables, effectively transforming the unconstrained problem into three equivalent subproblems. Subsequently, we derive closed-form expressions for the optimization variables.

• Thirdly, as the number of antennas at the BS and the number of users accessing the system increase, the hardware and signal processing complexity also escalates. Since the precoding optimization algorithm involves complex matrix inversion operations, its computational complexity is $O(N_{RF}^3)$, which grows cubically with the number of RF chains. To mitigate this complexity, we utilize the Neumann series expansion (NSE) method to approximate the exact matrix inverse while retaining only the lower-order terms, thereby reducing the complexity of the matrix inversion operation to $O(N_{RF}^2)$.

• Finally, we validate the performance of the proposed scheme through simulation. The results demonstrate that the algorithm significantly improves spectral efficiency. Furthermore, the simulation results confirm that the proposed precoding and power allocation scheme outperforms the benchmark methods.

Organization and Notations
The remainder of the paper is organized as follows. Section 2 outlines the system model of beamspace MIMO-NOMA. Based on this model, Section 3 formulates the maximum sum rate problem and introduces the proposed algorithm. Section 4 presents the simulation results to evaluate the performance. Finally, Section 5 concludes the paper.
Notation: $\mathbb{C}$ denotes the set of complex numbers, and $\mathrm{Re}\{\cdot\}$ denotes the real part. An overline denotes the complex conjugate. Bold lower-case letters denote vectors, bold upper-case letters denote matrices, and calligraphic upper-case letters denote sets. $\mathbf{I}_n$ denotes the identity matrix of dimension $n$. $(\cdot)^T$, $(\cdot)^H$, $(\cdot)^{-1}$ and $\|\cdot\|_F$ denote the transpose, Hermitian transpose, inversion, and Frobenius norm operations, respectively.

System Model and Problem Formulation
In this section, we first review the beamspace MIMO system model, followed by a detailed description of the beamspace MIMO-NOMA system model.

System Model of Beamspace MIMO
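As illustrated in Figure 1, we consider a single-cell downlink mmWave MIMO communication system. The BS is equipped with $N$ antennas and $N_{RF}$ RF chains, and simultaneously serves $K$ randomly distributed single-antenna users [17]. The BS employs the usual uniform linear array (ULA) structure together with a well-designed lens antenna array. The received signal vector $\mathbf{y} = [y_1, y_2, \cdots, y_K]^T$ is represented as:
$$\mathbf{y} = \mathbf{H}^H \mathbf{W} \mathbf{P} \mathbf{s} + \mathbf{n}, \quad (1)$$
where $\mathbf{s} = [s_1, s_2, \cdots, s_K]^T \in \mathbb{C}^{K \times 1}$ represents the transmitted signal vector for all $K$ users, satisfying $E(\mathbf{s}\mathbf{s}^H) = \mathbf{I}_K$, $\mathbf{P} = \mathrm{diag}\{\sqrt{p_1}, \sqrt{p_2}, \cdots, \sqrt{p_K}\}$ is the diagonal power allocation matrix, $\mathbf{W} = [\mathbf{w}_1, \mathbf{w}_2, \cdots, \mathbf{w}_K]$ is the precoding matrix, and $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \cdots, \mathbf{h}_K]$ is the Rayleigh fading channel matrix, where $\mathbf{h}_k \in \mathbb{C}^{N \times 1}$ denotes the channel vector between the BS and the $k$th user. In addition, $\mathbf{n}$ is the noise vector that follows the distribution $\mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I}_K)$. We consider the widely used Saleh-Valenzuela channel model for mmWave communications, in which $\mathbf{h}_k$ can be represented as
$$\mathbf{h}_k = \beta_{k,0}\,\mathbf{a}(\theta_{k,0}) + \sum_{l=1}^{L} \beta_{k,l}\,\mathbf{a}(\theta_{k,l}), \quad (2)$$
where $\beta_{k,0}$ denotes the LoS complex path gain and $\mathbf{a}(\theta_{k,0})$ represents the array steering vector for the LoS path. Similarly, $\beta_{k,l}$ and $\mathbf{a}(\theta_{k,l})$ denote the complex gain and steering vector for the $l$th NLoS path, respectively. Furthermore, $L$ denotes the number of NLoS paths.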

For the typical ULA, the array steering vector $\mathbf{a}(\phi)$ can be expressed as follows [18]:
$$\mathbf{a}(\phi) = \frac{1}{\sqrt{N}} \left[ e^{-j 2 \pi \phi m} \right]_{m \in \mathcal{J}(N)}, \quad (3)$$
where $\mathcal{J}(N) = \{ i - (N-1)/2,\ i = 0, 1, \cdots, N-1 \}$ is a symmetric index set centered at zero. The lens antenna array acts as an $N \times N$ spatial discrete Fourier transform matrix
$$\mathbf{U} = [\mathbf{a}(\phi_1), \mathbf{a}(\phi_2), \cdots, \mathbf{a}(\phi_N)]^H, \quad (4)$$
where $\phi_n = \frac{1}{N}\left(n - \frac{N+1}{2}\right)$ for $n = 1, 2, \cdots, N$ are the predefined spatial directions. Then, the received signal vector $\tilde{\mathbf{y}}$ in the beamspace MIMO system is given by
$$\tilde{\mathbf{y}} = \mathbf{H}^H \mathbf{U}^H \mathbf{W} \mathbf{P} \mathbf{s} + \mathbf{n} = \tilde{\mathbf{H}}^H \mathbf{W} \mathbf{P} \mathbf{s} + \mathbf{n}, \quad (5)$$
where $\tilde{\mathbf{H}} = \mathbf{U}\mathbf{H}$ is the beamspace channel matrix. We employ the classic maximum-magnitude-based beam selection method to choose a subset of the $N$ orthogonal beams to serve all $K$ users without obvious performance loss [19]. Consequently, the number of RF chains is reduced from $N$ to $N_{RF}$. Thus, the received signal can be written as
$$\tilde{\mathbf{y}} = \tilde{\mathbf{H}}_r^H \mathbf{W}_r \mathbf{P} \mathbf{s} + \mathbf{n}, \quad (6)$$
where $\tilde{\mathbf{H}}_r = \tilde{\mathbf{H}}(m, :)_{m \in \mathcal{M}}$ is the dimension-reduced beamspace channel matrix with size $|\mathcal{M}| \times K$, and $\mathcal{M}$ is the index set of selected beams. It is important to note that in this system one RF chain generates one beam, so the number of selected beams $|\mathcal{M}|$ equals the number of RF chains $N_{RF}$ [11]. In addition, the dimension-reduced digital precoding matrix $\mathbf{W}_r$ has a size of $|\mathcal{M}| \times K$. Since $\mathbf{W}_r$ has a smaller row dimension than the original precoding matrix $\mathbf{W}$, the number of required RF chains can be significantly reduced [20]. However, reducing the number of RF chains also limits the number of connections. To overcome this fundamental limit, a novel transmission scheme known as beamspace MIMO-NOMA, which combines the concept of NOMA with beamspace MIMO, has been proposed. By incorporating NOMA into beamspace MIMO systems, both spectral efficiency and connection density can be further enhanced [6].
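The beamspace transformation and maximum-magnitude beam selection can be illustrated with a small sketch. It assumes the common normalized ULA steering vector with a symmetric index set and predefined directions $\phi_n = (n - (N+1)/2)/N$; the function names, the array size and the target beam index are hypothetical and chosen only for illustration.

```python
import cmath
import math

def steering_vector(phi, n_ant):
    """ULA steering vector at spatial direction phi (assumed normalized form)."""
    scale = 1.0 / math.sqrt(n_ant)
    # Symmetric index set centred at zero: i - (N - 1) / 2 for i = 0..N-1.
    return [scale * cmath.exp(-2j * math.pi * phi * (i - (n_ant - 1) / 2))
            for i in range(n_ant)]

def beamspace_channel(h, n_ant):
    """Project an antenna-domain channel vector onto the N predefined beams (U h)."""
    out = []
    for n in range(1, n_ant + 1):
        phi_n = (n - (n_ant + 1) / 2) / n_ant      # predefined spatial direction
        a = steering_vector(phi_n, n_ant)
        # Row n of U is a(phi_n)^H, so the entry is a(phi_n)^H h.
        out.append(sum(x.conjugate() * y for x, y in zip(a, h)))
    return out

def strongest_beam(h_beam):
    """Maximum-magnitude selection: index of the beam with the largest |entry|."""
    return max(range(len(h_beam)), key=lambda m: abs(h_beam[m]))

N = 16
target = 5                                 # hypothetical beam index (0-based)
phi = (target + 1 - (N + 1) / 2) / N       # a user located exactly on that beam
h = steering_vector(phi, N)                # pure LoS channel with unit gain
h_beam = beamspace_channel(h, N)
selected = strongest_beam(h_beam)
```

Because the predefined directions are spaced by $1/N$, the rows of $\mathbf{U}$ are orthonormal, so an on-grid LoS user concentrates all of its energy in a single beam. Off-grid users and NLoS paths leak power into neighboring beams, which is why only a few dominant beams need to be kept.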

System Model of Beamspace MIMO-NOMA
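As shown in Figure 2, we consider a typical beamspace MIMO-NOMA wireless communication system. The users are divided into $N_{RF}$ groups, one per beam, and we denote by $\mathcal{S}_m$ the set of users served by the $m$th beam, with $\mathcal{S}_m \cap \mathcal{S}_n = \emptyset$ for $m \neq n$ and $\sum_{m=1}^{N_{RF}} |\mathcal{S}_m| = K$. The received signal $\hat{y}_{m,n}$ of the $n$th user in the $m$th beam can be expressed as follows:
$$\hat{y}_{m,n} = \tilde{\mathbf{h}}_{m,n}^H \sum_{i=1}^{N_{RF}} \sum_{j=1}^{|\mathcal{S}_i|} \mathbf{w}_i \sqrt{p_{i,j}}\, s_{i,j} + v_{m,n}, \quad (7)$$
where $\tilde{\mathbf{h}}_{m,n}$ is the dimension-reduced beamspace channel vector of the $n$th user in the $m$th beam, $s_{m,n}$ is the transmitted signal for the $n$th user in the $m$th beam with normalized power, $p_{m,n}$ is the corresponding transmitted power, $\mathbf{w}_m = \mathbf{W}_r(:, m)$ represents the $m$th beam digital precoding vector, and $v_{m,n} \sim \mathcal{CN}(0, \sigma^2)$ refers to the noise. Based on the principle of NOMA, intra-beam interference can be mitigated by utilizing SIC. Within the same beam, the $i$th user can sequentially detect the signal of the $j$th user (for all $j > i$) and remove the detected signals from its received signal [21]. In the $m$th beam, after employing SIC at the $n$th user, the remaining received signal can be expressed as follows:
$$y_{m,n} = \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m \sqrt{p_{m,n}}\, s_{m,n} + \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m \sum_{j=1}^{n-1} \sqrt{p_{m,j}}\, s_{m,j} + \tilde{\mathbf{h}}_{m,n}^H \sum_{i \neq m} \sum_{j=1}^{|\mathcal{S}_i|} \mathbf{w}_i \sqrt{p_{i,j}}\, s_{i,j} + v_{m,n}. \quad (8)$$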

Therefore, the signal-to-interference-plus-noise ratio (SINR) at the $n$th user in the $m$th beam can be expressed as follows:
$$\gamma_{m,n} = \frac{\left| \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m \right|^2 p_{m,n}}{\xi_{m,n}}, \quad (9)$$
where $\tilde{\mathbf{h}}_{m,n}$ denotes the beamspace channel vector of the $n$th user in the $m$th beam and
$$\xi_{m,n} = \left| \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m \right|^2 \sum_{j=1}^{n-1} p_{m,j} + \sum_{i \neq m} \left| \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_i \right|^2 \sum_{j=1}^{|\mathcal{S}_i|} p_{i,j} + \sigma^2.$$
Hence, the corresponding achievable rate can be expressed as follows:
$$R_{m,n} = \log_2 (1 + \gamma_{m,n}). \quad (10)$$
Consequently, the overall achievable sum rate of the beamspace MIMO-NOMA scheme is:
$$R_{sum} = \sum_{m=1}^{N_{RF}} \sum_{n=1}^{|\mathcal{S}_m|} R_{m,n}. \quad (11)$$
Precoding optimization helps mitigate inter-beam interference, but intra-beam interference persists within beamspace MIMO-NOMA systems. Power allocation mitigates this intra-beam interference, thus enhancing overall system performance. Expressions (9)-(11) illustrate the substantial influence of the power allocation parameters $\{p_{m,n}\}$ and the precoding vectors $\{\mathbf{w}_m\}$ on the sum rate. Jointly optimizing precoding and power allocation is therefore pivotal for maximizing overall system performance. While this may add complexity, thoughtful design and analysis allow for performance improvements without imposing significantly higher computational demands. In the following section, we explore these ideas in greater detail.
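As a concrete illustration of the SINR, rate and sum-rate expressions (9)-(11), the following sketch evaluates them for a toy configuration. It assumes that the users of each beam are indexed in SIC order, so that user $n$ only suffers intra-beam interference from users $j < n$; the nested-list data layout, the function names and all numerical values are hypothetical.

```python
import math

def inner(h, w):
    """Return h^H w for two complex vectors given as lists."""
    return sum(a.conjugate() * b for a, b in zip(h, w))

def sinr(h, w, p, sigma2, m, n):
    """SINR of user n in beam m after SIC (users indexed in SIC order)."""
    g_own = abs(inner(h[m][n], w[m])) ** 2
    intra = g_own * sum(p[m][:n])                        # residual intra-beam term
    inter = sum(abs(inner(h[m][n], w[i])) ** 2 * sum(p[i])
                for i in range(len(w)) if i != m)        # inter-beam term
    return g_own * p[m][n] / (intra + inter + sigma2)

def sum_rate(h, w, p, sigma2):
    """Achievable sum rate in bit/s/Hz over all beams and users."""
    return sum(math.log2(1.0 + sinr(h, w, p, sigma2, m, n))
               for m in range(len(w)) for n in range(len(p[m])))

# Toy setup: two beams with orthogonal precoders; beam 0 serves two users.
w = [[1 + 0j, 0j], [0j, 1 + 0j]]
h = [[[2 + 0j, 0j], [1 + 0j, 0j]],      # beamspace channels of users in beam 0
     [[0j, 1 + 0j]]]                    # beamspace channel of the user in beam 1
p = [[0.25, 0.75], [1.0]]               # weaker user in beam 0 gets more power
rate = sum_rate(h, w, p, sigma2=1.0)
```

In this toy case the inter-beam terms vanish because the precoders are orthogonal to the other beam's channels, so the only interference left is the intra-beam term seen by the second user of beam 0.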

Alternating Optimization of Beam-Specific Digital Precoding and Power Allocation
In this section, we begin by formulating the optimization problem. Next, we present an alternating optimization method to obtain the solution for beam-specific digital precoding. Finally, we maximize the achievable sum rate by solving the joint power optimization problem using a dynamic power allocation scheme.

Problem Formulation
Our objective is to maximize the achievable sum rate by jointly optimizing the beam-specific digital precoding $\{\mathbf{w}_m\}$ and the power allocation $\{p_{m,n}\}$, while adhering to the maximum transmit power constraint of the BS. We denote this optimization problem by $\mathcal{P}_1$:
$$\mathcal{P}_1: \max_{\{p_{m,n}\}, \{\mathbf{w}_m\}} R_{sum}, \quad (12)$$
subject to the maximum transmit power constraint. Three difficulties arise in solving $\mathcal{P}_1$. First, as shown in (9), the presence of both intra-beam and inter-beam interference means that the optimization variables $\{p_{m,n}\}$ and $\{\mathbf{w}_m\}$ appear in both the numerator and the denominator of $\gamma_{m,n}$. Consequently, the problem is non-convex and difficult to solve directly. Second, the objective is highly nonlinear. Third, the optimization of the precoding $\{\mathbf{w}_m\}$ is performed at the beam level, while the optimization of the power allocation $\{p_{m,n}\}$ is carried out at the user level, so the two are difficult to optimize simultaneously.
To tackle the complexity of the original problem $\mathcal{P}_1$, we decompose it into two subproblems, $\mathcal{P}_{beam}$ and $\mathcal{P}_{power}$. For the subproblem $\mathcal{P}_{beam}$, we first convert the constrained optimization problem into an unconstrained optimization problem. Then, we employ the FP algorithm to handle the NP-hard problem, leading to a closed-form expression for the precoding $\mathbf{W}$. Additionally, we leverage the NSE to reduce the complexity of the precoding process. For the power allocation subproblem, we utilize a dynamic power allocation scheme to obtain a closed-form expression for the power distribution, ensuring lower complexity.

The Proposed Beam-Specific Digital Precoding Optimization
In this subsection, we focus on optimizing the beam-specific digital precoding vectors $\{\mathbf{w}_m\}$ for a given set of power allocation parameters $\{p_{m,n}\}$. The resulting precoding problem, denoted $\mathcal{P}_{beam}$, is the problem $\mathcal{P}_1$ with $\{p_{m,n}\}$ held fixed. To solve it, we transform this non-convex precoding optimization problem into an unconstrained optimization problem. Specifically, inspired by [22], we establish a definition and Proposition 1, from which the problem $\mathcal{P}_{beam}$ can be transformed into an unconstrained form, denoted $\tilde{\mathcal{P}}_{beam}$, with objective function $f_{beam}(\mathbf{w})$. The following proposition establishes the relationship between $\mathcal{P}_{beam}$ and $\tilde{\mathcal{P}}_{beam}$.
Proposition 2. A quantitative relationship exists between the optimal solution $\mathbf{w}^o$ of the problem $\mathcal{P}_{beam}$ and the optimal solution $\tilde{\mathbf{w}}^o$ of the new unconstrained optimization problem $\tilde{\mathcal{P}}_{beam}$.
This implies that once the solution $\tilde{\mathbf{w}}^o$ is found, the solution $\mathbf{w}^o$ can be obtained according to Proposition 2.
Obviously, the objective function $f_{beam}(\mathbf{w})$ remains non-convex, making it difficult to solve in polynomial time. To address this, we employ the Lagrangian dual transform to reframe the unconstrained problem $\tilde{\mathcal{P}}_{beam}$ [23]. This introduces a set of auxiliary variables $\mathbf{u} = \{u_{m,n}\}$, and the objective function of the transformed problem is given in (21). When $\mathbf{w}_m$ is held fixed, the optimal $u_{m,n}^o$ in (22) is obtained by solving the corresponding first-order optimality condition. Incorporating $u_{m,n}^o$ into (21) yields (23), in which $\mathrm{const}(\mathbf{u}) = \log_2(1 + u_{m,n}) - u_{m,n}$ is a constant term. Applying the multidimensional quadratic transform to (23) leads to the expression (24), where $\mathbf{v}$ is the collection $\{v_{m,n}\}$ of additional auxiliary variables. With $u_{m,n}$ fixed, the optimal value $v_{m,n}^o$ in (25) is determined by setting the derivative of the objective with respect to $v_{m,n}$ to zero. Likewise, with the other variables fixed, the optimal $\mathbf{w}_m^o$ satisfies the closed-form expression (26). The proposed algorithm is summarized in Algorithm 1.
Unfortunately, although $N_{RF}$ is much smaller than $K$, the matrix inversion in the expression of $\mathbf{w}_m^o$ remains high-dimensional, resulting in a computational complexity of $O(N_{RF}^3)$ in each iteration, which may cause significant processing delays. To address this issue, the NSE has been explored as an alternative for approximating matrix inversion [24]. We leverage the NSE to simplify the matrix inversion in $\mathbf{w}_m^o$ as follows. Let $\mathbf{A}$ denote the matrix to be inverted in (26), as defined in (27). We observe that the matrix $\mathbf{A}$ exhibits diagonal dominance. In such cases, the inverse of $\mathbf{A}$ can be equivalently expressed as follows [25]:
$$\mathbf{A}^{-1} = \sum_{n=0}^{\infty} \left( \mathbf{I} - \mathbf{P}^{-1} \mathbf{A} \right)^n \mathbf{P}^{-1}, \quad (28)$$
where $\mathbf{P}$ is an invertible matrix for which $\lim_{n \to \infty} (\mathbf{I} - \mathbf{P}^{-1}\mathbf{A})^n = \mathbf{0}$. We decompose the matrix $\mathbf{A}$ as $\mathbf{A} = \mathbf{D} + \mathbf{E}$, where $\mathbf{D}$ is a diagonal matrix consisting of the main diagonal elements of $\mathbf{A}$, and $\mathbf{E}$ is a hollow matrix consisting of the remaining elements. Replacing $\mathbf{P}$ in (28) with $\mathbf{D}$ gives
$$\mathbf{A}^{-1} = \sum_{n=0}^{\infty} \left( -\mathbf{D}^{-1} \mathbf{E} \right)^n \mathbf{D}^{-1}. \quad (29)$$
Due to the high complexity of the full NSE algorithm, the truncated NSE, which retains only the first $k$ orders ($k + 1$ terms) of the Neumann series, is a more commonly used approach. The specific formula can be expressed as follows:
$$\mathbf{A}^{-1} \approx \sum_{n=0}^{k} \left( -\mathbf{D}^{-1} \mathbf{E} \right)^n \mathbf{D}^{-1}. \quad (30)$$
It should be noted that as the expansion order increases ($k > 1$), the computational complexity of the NSE-based algorithm may exceed $O(N_{RF}^3)$. Therefore, to strike a balance between closely approximating the original precoding and reducing complexity, we choose $k = 1$, which gives
$$\mathbf{A}^{-1} \approx \mathbf{D}^{-1} - \mathbf{D}^{-1} \mathbf{E} \mathbf{D}^{-1}. \quad (31)$$
Based on this approximation, the NSE-based algorithm reduces the computational complexity from $O(N_{RF}^3)$ to $O(N_{RF}^2)$. By combining the aforementioned updates, Algorithm 1 provides a detailed description of the proposed precoding optimization algorithm.
In Algorithm 1, the computational complexity is primarily determined by line 5. Within each iteration, the optimal values $u_{m,n}^{(t)}$ in (22) and $v_{m,n}^{(t)}$ in (25) are obtained in closed form. Additionally, the complexity of finding the optimal value $\mathbf{w}_m^{(t)}$ in (31) is $O(N_{RF}^2)$, owing to the use of the NSE. Consequently, the computational complexity is significantly lower than the $O(N_{RF}^3)$ complexity of the exact update in (26).
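The first-order truncation of the Neumann series, $\mathbf{A}^{-1} \approx \mathbf{D}^{-1} - \mathbf{D}^{-1}\mathbf{E}\mathbf{D}^{-1}$, can be sketched in a few lines. The example below is only an illustration on a small, real, diagonally dominant matrix chosen for this purpose (not a matrix from the paper); the function names are hypothetical, and the same code works unchanged for complex entries.

```python
def neumann_inverse_first_order(a):
    """Approximate inv(A) by D^-1 - D^-1 E D^-1, with A = D + E and D = diag(A)."""
    n = len(a)
    d_inv = [1.0 / a[i][i] for i in range(n)]     # inverting D costs O(n)
    # Diagonal entries come from D^-1; off-diagonal entries come from
    # -D^-1 E D^-1, i.e. each element of E is scaled once per row and column.
    return [[d_inv[i] if i == j else -d_inv[i] * a[i][j] * d_inv[j]
             for j in range(n)] for i in range(n)]

def matmul(x, y):
    """Plain matrix product, used here only to check the approximation."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

# Hypothetical diagonally dominant test matrix.
A = [[4.0, 0.2, 0.1],
     [0.2, 5.0, 0.3],
     [0.1, 0.3, 6.0]]
A_inv_approx = neumann_inverse_first_order(A)
product = matmul(A, A_inv_approx)
# Largest deviation of A * approx from the identity matrix.
residual = max(abs(product[i][j] - (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))
```

The approximation touches each matrix entry once, which is where the quadratic cost comes from. The product $\mathbf{A}(\mathbf{D}^{-1} - \mathbf{D}^{-1}\mathbf{E}\mathbf{D}^{-1})$ equals $\mathbf{I} - (\mathbf{E}\mathbf{D}^{-1})^2$, so the error is second order in the off-diagonal-to-diagonal ratio and shrinks as the matrix becomes more diagonally dominant.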

The Adopted Optimization Power Allocation
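When $\{\mathbf{w}_m\}$ is fixed, the initial optimization problem $\mathcal{P}_1$ reduces to the power allocation problem $\mathcal{P}_{power}$ in (32), in which $R_{sum}$ is maximized over $\{p_{m,n}\}$ only. Note that the problem remains challenging. To address this difficulty, we introduce Lemma 1 to simplify problem $\mathcal{P}_{power}$.
Lemma 1. Let $f(a) = -\frac{ab}{\ln 2} + \log_2 a + \frac{1}{\ln 2}$, where $a \in \mathbb{R}^{1 \times 1}$ is a positive scalar and $b > 0$. Then we have
$$\max_{a > 0} f(a) = -\log_2 b, \quad (33)$$
where the optimal solution of $a$ is $a^o = \frac{1}{b}$.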
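Proof. Since $f(a)$ is a concave function of $a$, its maximizer can be obtained by setting
$$\frac{\partial f(a)}{\partial a} = -\frac{b}{\ln 2} + \frac{1}{a \ln 2} = 0, \quad (34)$$
which gives $a^o = \frac{1}{b}$. Substituting $a^o$ into $f(a)$ shows that the maximum value of $f(a)$ is $-\log_2 b$.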
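Lemma 1 is easy to check numerically. The sketch below evaluates $f(a)$ on a grid around the claimed maximizer $a^o = 1/b$; the value of $b$ and the grid are arbitrary choices made only for illustration.

```python
import math

def f(a, b):
    """Objective of Lemma 1: -a*b/ln(2) + log2(a) + 1/ln(2)."""
    return -a * b / math.log(2) + math.log2(a) + 1.0 / math.log(2)

b = 0.4                       # arbitrary positive constant
a_opt = 1.0 / b               # maximizer claimed by Lemma 1
f_max = f(a_opt, b)           # should equal -log2(b)

# Evaluate f on a grid of positive values surrounding a_opt.
grid = [a_opt * (0.2 + 0.01 * i) for i in range(400)]
best_on_grid = max(grid, key=lambda a: f(a, b))
```

The lemma is what allows $\log_2(1 + \gamma_{m,n})$ to be rewritten as a maximization over an auxiliary scalar: taking $b = (1 + \gamma_{m,n})^{-1}$ gives $-\log_2 b = \log_2(1 + \gamma_{m,n})$.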
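Moreover, if we use the minimum mean square error (MMSE) criterion to estimate $s_{m,n}$, the mean square error is
$$e_{m,n} = E\left\{ \left| s_{m,n} - c_{m,n} y_{m,n} \right|^2 \right\}, \quad (35)$$
where $c_{m,n} \in \mathbb{C}^{1 \times 1}$ denotes the channel equalization coefficient and $y_{m,n}$ is defined previously in (8). Substituting (8) into (35), we obtain:
$$e_{m,n} = 1 - 2\,\mathrm{Re}\left\{ c_{m,n} \sqrt{p_{m,n}}\, \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m \right\} + |c_{m,n}|^2 \left( p_{m,n} \left| \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m \right|^2 + \xi_{m,n} \right), \quad (36)$$
where $\tilde{\mathbf{h}}_{m,n}$ is the beamspace channel vector of the $n$th user in the $m$th beam and $\xi_{m,n}$ is the interference-plus-noise power, i.e., the denominator of $\gamma_{m,n}$ in (9). According to [11], the optimal equalization coefficient is
$$c_{m,n}^o = \arg\min_{c_{m,n}} e_{m,n}, \quad (37)$$
and $c_{m,n}^o$ can be calculated by setting $\frac{\partial e_{m,n}}{\partial c_{m,n}} = 0$, which gives
$$c_{m,n}^o = \frac{\left( \sqrt{p_{m,n}}\, \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m \right)^*}{p_{m,n} \left| \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m \right|^2 + \xi_{m,n}}, \quad (38)$$
where $(\cdot)^*$ denotes the complex conjugate. Substituting (38) into (36), we obtain the optimal MMSE expression as follows:
$$e_{m,n}^o = 1 - \frac{p_{m,n} \left| \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m \right|^2}{p_{m,n} \left| \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m \right|^2 + \xi_{m,n}}. \quad (39)$$
According to the extension of the Sherman-Morrison-Woodbury formula [27],
$$(\mathbf{A} + \mathbf{B}\mathbf{C}\mathbf{D})^{-1} = \mathbf{A}^{-1} - \mathbf{A}^{-1} \mathbf{B} \left( \mathbf{I} + \mathbf{C}\mathbf{D}\mathbf{A}^{-1}\mathbf{B} \right)^{-1} \mathbf{C}\mathbf{D}\mathbf{A}^{-1}. \quad (40)$$
Thus, $(1 + \gamma_{m,n})^{-1}$ can be reformulated as
$$(1 + \gamma_{m,n})^{-1} = 1 - p_{m,n} \left| \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m \right|^2 \left( p_{m,n} \left| \tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m \right|^2 + \xi_{m,n} \right)^{-1}. \quad (41)$$
We observe that the expression (41) has the same form as the MMSE expression (39), i.e., we have
$$(1 + \gamma_{m,n})^{-1} = \min_{c_{m,n}} e_{m,n}. \quad (42)$$
Using Lemma 1 with $b = e_{m,n}$, we can equivalently rewrite $\mathcal{P}_{power}$ as the problem $\tilde{\mathcal{P}}_{power}$ in (43), whose objective is $\sum_{m} \sum_{n} \left( -\frac{a_{m,n} e_{m,n}}{\ln 2} + \log_2 a_{m,n} + \frac{1}{\ln 2} \right)$, maximized over $\{p_{m,n}\}$, $\{c_{m,n}\}$ and $\{a_{m,n}\}$, where $a_{m,n} > 0$ is an introduced slack variable. We propose to iteratively optimize $\{p_{m,n}\}$, $\{c_{m,n}\}$ and $\{a_{m,n}\}$ by using the alternating optimization algorithm. For fixed $\{p_{m,n}\}$, the optimal solutions in (44) are $c_{m,n}^o$ as given in (38) and, by Lemma 1, $a_{m,n}^o = 1 / e_{m,n}^o$. After obtaining the optimal values $c_{m,n}^o$ and $a_{m,n}^o$ in the iteration, the optimal value $p_{m,n}^o$ can be obtained by solving the problem (45). We observe that (45) is a convex optimization problem, which can be solved by using the Lagrange function $L(\mathbf{p}, \lambda)$ in (46), where $\lambda \geq 0$ is the Lagrange multiplier. Then, the Karush-Kuhn-Tucker (KKT) conditions of problem (45), given in (47), are obtained by setting $\frac{\partial L(\mathbf{p}, \lambda)}{\partial p_{m,n}} = 0$ together with the complementary slackness condition.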
Finally, the optimal solution $p_{m,n}^o$ of (45) is obtained in closed form in (48). The values of $c_{m,n}^o$, $a_{m,n}^o$ and $p_{m,n}^o$ obtained in each iteration are closed-form optimal solutions because (37), (33) and (45) are all convex after a sequence of transformations. The iterative update of $c_{m,n}^o$, $a_{m,n}^o$ and $p_{m,n}^o$ will therefore only increase or maintain the objective function in (43), so the iterations produce a monotonically non-decreasing sequence of objective function values. Since this sequence is bounded above because of the transmit power constraint, the proposed iterative optimization algorithm for power allocation converges to a stationary solution of problem $\tilde{\mathcal{P}}_{power}$. The power allocation optimization technique is described in detail in Algorithm 2, and the overall proposed algorithm is summarized in Algorithm 3.
The computational complexity of the proposed algorithm mainly arises from the iteration part. In each iteration, the complexity of obtaining the optimal values $c_{m,n}^o$ in (38) and $a_{m,n}^o$ in (44) is linear in the number of users, i.e., $O(K)$. The multiplier $\lambda$ in (48) can be obtained by using the Newton or bisection methods, both of which have a complexity of $O(K^2 \log_2 \delta)$, where $\delta$ represents the desired accuracy. The overall complexity of the proposed power allocation algorithm is therefore $O(T_{max} K^2 \log_2 \delta)$, where $T_{max}$ is the maximum number of iterations. Hence, the complexity of the proposed joint precoding design and power allocation optimization algorithm is $O(T_{max} K^2 \log_2 \delta + T_{max} N_{RF}^3)$. By contrast, the computational complexity of the algorithm without NSE processing is $O(T_{max} K^2 \log_2 \delta + T_{max} N_{RF}^4)$.
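The identity $(1 + \gamma_{m,n})^{-1} = \min_{c_{m,n}} e_{m,n}$ that underpins the power allocation step can be verified numerically for a single user. The sketch below assumes the standard scalar MMSE expressions, with $g$ standing for the effective gain $\tilde{\mathbf{h}}_{m,n}^H \mathbf{w}_m$ and $\xi$ for the interference-plus-noise power; the function names and all numerical values are hypothetical.

```python
import math

def mmse_equalizer(g, p, xi):
    """Equalizer minimizing E|s - c*y|^2 for y = sqrt(p)*g*s + interference."""
    return (math.sqrt(p) * g).conjugate() / (p * abs(g) ** 2 + xi)

def mse(c, g, p, xi):
    """Mean square error for equalizer c, unit-power symbol, interference xi."""
    return (1.0 - 2.0 * (c * math.sqrt(p) * g).real
            + abs(c) ** 2 * (p * abs(g) ** 2 + xi))

g = 0.8 - 0.6j                # hypothetical effective channel gain h^H w
p = 3.0                       # transmit power of the user
xi = 1.5                      # interference-plus-noise power
gamma = p * abs(g) ** 2 / xi  # SINR of the user

c_opt = mmse_equalizer(g, p, xi)
e_opt = mse(c_opt, g, p, xi)  # minimum MSE, expected to equal 1 / (1 + gamma)
a_opt = 1.0 / e_opt           # slack variable update from Lemma 1

# With c and a at their optimal values the surrogate objective equals the rate.
rate_from_mse = (-a_opt * e_opt / math.log(2)
                 + math.log2(a_opt) + 1.0 / math.log(2))
```

One pass of the alternating scheme therefore amounts to computing `c_opt` and `a_opt` for every user from the current powers, and then updating the powers with those values held fixed.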

Simulation Result
The performance of the proposed joint optimization algorithm for the mmWave beamspace MIMO-NOMA scheme is evaluated by using numerical simulations in this section.

Simulation Setup
In this paper, we consider a typical single-cell downlink mmWave massive MIMO system. The BS is equipped with a ULA of $N = 256$ transmit antennas and communicates with $K$ users simultaneously. The system bandwidth is assumed to be 1 Hz, and the total transmit power is set to $P = 32$ mW (15 dBm) [11]. For all users' channels, we assume one LoS component and $L = 2$ NLoS components, where $\beta_{k,l}$ follows a uniform distribution within $\left[-\frac{1}{2}, \frac{1}{2}\right]$ for $1 \leq l \leq L$. The SNR is defined as $\frac{P}{\sigma^2}$, and the maximum number of iterations is $T_{max} = 50$. We consider the following four typical mmWave massive MIMO solutions for comparison, using the same system configuration in all of them to ensure a fair comparison: "traditional fully digital MIMO" (FDM), "traditional beamspace MIMO" (BM), "traditional MIMO-OMA" (MO), and the "beamspace MIMO-NOMA" (BMN) system of [11]. The last of these is a classic and highly effective method, which we use as the main benchmark.
We evaluated the energy efficiency and spectral efficiency of the proposed scheme and of each of the four baseline systems mentioned above. According to [20], energy efficiency can be expressed as:
$$\varepsilon = \frac{R_{sum}}{P_t + N_{RF} P_{RF} + N_{RF} P_{SW} + P_{BB}} \ \text{(bps/Hz/W)}, \quad (49)$$
where $P_t$ represents the total transmit power, $P_{RF}$ represents the power consumed by each RF chain, $P_{SW}$ represents the power consumed by each switch, and $P_{BB}$ represents the power consumed at the baseband. For these parameters, we adopt the following common values: $P_{RF} = 300$ mW, $P_{SW} = 5$ mW and $P_{BB} = 200$ mW.
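To make the energy efficiency metric concrete, the sketch below evaluates it with the parameter values listed above. It assumes the common form in which the sum rate is divided by the transmit power plus the power of $N_{RF}$ RF chains, $N_{RF}$ switches and the baseband; the function name, the sum-rate value and the number of RF chains are hypothetical and are not taken from the simulation results.

```python
def energy_efficiency(r_sum, n_rf, p_t=0.032, p_rf=0.3, p_sw=0.005, p_bb=0.2):
    """Energy efficiency in bps/Hz/W; all powers are given in watts.

    Assumed form: R_sum / (P_t + N_RF * P_RF + N_RF * P_SW + P_BB).
    """
    total_power = p_t + n_rf * p_rf + n_rf * p_sw + p_bb
    return r_sum / total_power

# Hypothetical operating point: 50 bps/Hz delivered with 32 RF chains.
ee_32_chains = energy_efficiency(r_sum=50.0, n_rf=32)

# Same rate with 256 RF chains, mimicking one chain per antenna.
ee_256_chains = energy_efficiency(r_sum=50.0, n_rf=256)
```

With these parameter values the RF chains dominate the power budget, which is why reducing $N_{RF}$ through beam selection improves energy efficiency even when the achievable rate is unchanged.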

Simulation Results
The performance evaluation of the proposed MIMO-NOMA system was carried out in three different cases: performance comparison at different SNRs, performance comparison at different numbers of users, and performance comparison at different numbers of antennas.
x Comparison of performance with different SNRs Figure 3 depicts the comparison of spectral efficiency versus SNRs with K = 32 and K = 128.As the SNR increases, both sets of curves demonstrate an increase in spectral efficiency.The proposed optimization structures namely proposed 'BMN' and proposed 'beamspace MIMO-NOMA with NSE' (BMNN), exhibited very similar results in terms of spectral efficiency growth.This indicates that our precoding scheme, approximated by the NSE, not only reduced the complexity of the original algorithm but also achieved comparable performance.These findings highlight the effectiveness of the NSE approximation algorithm.
Figure 4 presents a comparison of spectral efficiency versus SNR for the proposed system and the baseline systems. In particular, we compared the spectral efficiency of the proposed algorithm in the beamspace MIMO-NOMA system with the classical BMN [11] algorithm, for both 128 users and 32 users. The results indicate that in both scenarios, BMNN outperformed BMN [11], with the advantage becoming more pronounced as the number of users increased. With 32 users, the proposed BMNN scheme outperformed the BMN [11], BM, and MO schemes in terms of spectral efficiency. In particular, compared to BMN [11], the performance gain of BMNN came mainly from the optimization of precoding for different beams in the first stage. Moreover, the proposed BMNN performed significantly better than BM, benefiting from the integration of beamspace MIMO and NOMA technologies, which enabled simultaneous service to multiple users within each beam and effectively improved spectral efficiency. Since NOMA can achieve higher spectral efficiency than OMA, the proposed BMNN also outperformed MO in terms of spectral efficiency.
Figure 5 illustrates the comparison of energy efficiency versus SNR with K = 32 and K = 128 users for the proposed system and the baseline systems. It can be clearly seen that increasing SNR leads to a substantial growth in energy efficiency, and within the same system, for both 32 users and 128 users, our algorithm outperformed BMN [11]. Furthermore, across the different systems with 32 users, the energy efficiency of the proposed BMNN was higher than that of the other four baseline systems. Specifically, compared to BM, our proposed BMNN achieved higher energy efficiency by integrating NOMA and beamspace MIMO, allowing each beam to serve multiple users.
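As a rough illustration of how an energy efficiency figure of this kind is commonly computed, the sketch below divides the sum rate by a simple total-power model. The power model and every numeric value in it (per-RF-chain power, switch power, baseband power) are assumptions made for illustration; they are not taken from this section.

```python
def energy_efficiency(rates_bps_hz, p_tx_w, n_rf,
                      p_rf_w=0.3, p_sw_w=0.005, p_bb_w=0.2):
    """Energy efficiency as sum rate over total consumed power.

    rates_bps_hz : per-user achievable rates in bit/s/Hz
    p_tx_w       : transmit power in watts
    n_rf         : number of RF chains
    p_rf_w, p_sw_w, p_bb_w : assumed per-RF-chain, per-switch and baseband
                             power in watts (illustrative values only)
    """
    total_power = p_tx_w + n_rf * (p_rf_w + p_sw_w) + p_bb_w
    return sum(rates_bps_hz) / total_power

# Same sum rate, different numbers of RF chains (hypothetical numbers).
rates = [2.0, 3.0, 1.5, 2.5]
ee_few_chains = energy_efficiency(rates, p_tx_w=1.0, n_rf=4)
ee_many_chains = energy_efficiency(rates, p_tx_w=1.0, n_rf=64)
print(ee_few_chains, ee_many_chains)
```

Under this model, an architecture that reaches a similar sum rate with fewer RF chains has a higher energy efficiency, which is the qualitative effect the comparison in Figure 5 relies on.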

② Comparison of performance with different users
The aforementioned results were obtained for varying SNR. However, in real communication systems, especially in massive MIMO systems, the number of accessed users plays a significant role. Therefore, we further investigated the spectrum efficiency performance of the two proposed solutions under different user scenarios.
Figure 6 depicts how spectrum efficiency varies with the number of users. Both curves exhibit an upward trend with increasing user count, and the spectrum efficiency growth curves of the two proposed optimization structures yield similar results.
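The benefit of letting several users share one beam can be illustrated with a two-user toy example that compares NOMA with successive interference cancellation (SIC) against an orthogonal baseline. This is a textbook-style sketch under assumed values; the gains, the power split `alpha`, and the function names are hypothetical and do not reproduce the simulation setup of this paper.

```python
import math

def noma_rates(p_total, gain_strong, gain_weak, alpha):
    """Two-user downlink NOMA rates (bit/s/Hz) with SIC at the strong user.

    gain_* are channel gains normalized by the noise power; alpha is the
    fraction of p_total assigned to the strong user (the weak user gets
    1 - alpha).
    """
    # The strong user removes the weak user's signal via SIC, then decodes its own.
    r_strong = math.log2(1 + alpha * p_total * gain_strong)
    # The weak user treats the strong user's signal as interference.
    r_weak = math.log2(1 + (1 - alpha) * p_total * gain_weak
                       / (alpha * p_total * gain_weak + 1))
    return r_strong, r_weak

def oma_rates(p_total, gain_strong, gain_weak):
    """Orthogonal baseline: each user gets half of the resource at full power."""
    return (0.5 * math.log2(1 + p_total * gain_strong),
            0.5 * math.log2(1 + p_total * gain_weak))

noma = noma_rates(p_total=1.0, gain_strong=10.0, gain_weak=1.0, alpha=0.2)
oma = oma_rates(p_total=1.0, gain_strong=10.0, gain_weak=1.0)
print("NOMA sum rate:", sum(noma))
print("OMA sum rate: ", sum(oma))
```

For these particular values the NOMA sum rate exceeds the orthogonal one; the size of the gap depends on the channel disparity and on the power split, so the example should not be read as a general bound.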
Figure 7 illustrates a comparison of the spectrum efficiency of the four schemes under different user scenarios at an SNR of 0 dB. The BMNN scheme outperformed the BMN [11], BM, and MO schemes. Moreover, compared to the traditional BM schemes, the BMNN optimization scheme proposed in this study further improved spectrum efficiency.
Figure 8 displays the energy efficiency performance of all considered schemes as the number of users increases. The proposed algorithm remained superior among the five schemes, which demonstrates the effectiveness of the proposed scheme. Another noteworthy observation is that our proposed BMNN algorithm surpassed BMN [11] in terms of energy efficiency. This is mainly attributed to the fact that BMN [11] uses the ZF algorithm, commonly employed in many studies, for the precoding part, whereas our proposed algorithm optimizes the precoding parameters, thereby validating the necessity of optimizing the precoding design parameters in our algorithm.
Figure 9 shows how spectral efficiency varies with an increasing number of users at SNR levels of −5 dB, 0 dB, and 5 dB. Across all these SNR conditions, the proposed BMNN algorithm consistently outperformed the other schemes, and its superiority became more pronounced as the SNR increased.
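For reference, the zero-forcing (ZF) precoder mentioned as the baseline choice can be sketched generically as follows. This is the standard pseudo-inverse form with column normalization, written under assumed dimensions and a random channel; it is not claimed to match the exact implementation in [11], and `zf_precoder` is a hypothetical name.

```python
import numpy as np

def zf_precoder(H):
    """Zero-forcing precoder for a (K, N) channel matrix H.

    Row k of H is the effective channel seen by beam/user k, so the received
    signal vector is H @ W @ s. Returns W of shape (N, K) with unit-norm columns.
    """
    # Right pseudo-inverse of H: H @ W_raw equals the K x K identity matrix.
    W_raw = H.conj().T @ np.linalg.inv(H @ H.conj().T)
    # Normalize each column so that every beam uses unit precoding power.
    return W_raw / np.linalg.norm(W_raw, axis=0, keepdims=True)

# Hypothetical example: 4 beams, 16 retained beamspace dimensions.
rng = np.random.default_rng(1)
H = (rng.standard_normal((4, 16)) + 1j * rng.standard_normal((4, 16))) / np.sqrt(2)
W = zf_precoder(H)

effective = H @ W
leakage = effective - np.diag(np.diag(effective))
print("max inter-beam leakage:", np.abs(leakage).max())
```

ZF removes inter-beam interference by construction but does not account for noise or for the power allocation, which is one reason an optimized precoder can outperform it.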

③ Comparison of performance with different antennas
From Figure 10, it can be observed that BMNN exhibited a clear advantage over the other algorithms until the number of antennas reached 200. Beyond this point, the spectrum efficiency of the FDM algorithm surpassed that of the others. This is primarily attributed to the increase in the number of antennas in the FDM algorithm. With more antennas, precise beamforming becomes possible, so signals can be aimed more accurately at the receivers, reducing signal scattering and interference and ultimately improving spectral efficiency. However, it is worth noting that the FDM algorithm typically requires more hardware and signal processing resources, which can lead to higher power consumption. As Figure 11 corroborates, the energy efficiency of the FDM tends to be lower. Nevertheless, as seen in the graph, our proposed BMNN algorithm achieved the highest energy efficiency among all algorithms, highlighting its potential to enhance system performance with a clear advantage.
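The statement that more antennas allow more accurate signal focusing can be checked with a small beam-pattern sketch for a uniform linear array. The steering vector below uses antenna indices 0 to N−1 and a spatial direction θ in the sense of θ = (d/λ) sin(ϕ); the indexing convention, the function names, and the chosen offsets are assumptions made for illustration.

```python
import cmath
import math

def steering_vector(n, theta):
    """Unit-norm steering vector of an n-element uniform linear array
    for spatial direction theta (assumed indexing 0..n-1)."""
    return [cmath.exp(-2j * math.pi * theta * i) / math.sqrt(n) for i in range(n)]

def beam_gain(n, theta_target, theta):
    """Normalized gain |a(theta_target)^H a(theta)|^2 of a beam steered to
    theta_target, observed at theta. Equals 1 when the directions coincide."""
    a_t = steering_vector(n, theta_target)
    a = steering_vector(n, theta)
    return abs(sum(x.conjugate() * y for x, y in zip(a_t, a))) ** 2

# Leakage toward a direction offset by 0.01 from the target, for two array sizes.
for n in (64, 256):
    print(n, beam_gain(n, 0.0, 0.01))
```

The leakage toward the nearby direction drops as the array grows, which reflects the narrower main lobe; the price, as noted above, is additional hardware when every antenna needs its own RF chain.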

Conclusions
In this research, we addressed the joint optimization problem of precoding and power allocation in massive MIMO-NOMA networks, aiming to maximize the sum rate for all devices. To tackle this challenge, we transformed the original optimization problem into an unconstrained problem for the precoding subproblem. We employed the FP approach to handle the non-convex problem, resulting in three equivalent problems and a closed-form expression for precoding. For the power allocation subproblem, which remains non-convex, we used the MMSE-based dynamic power allocation scheme. Simulation results demonstrated that the proposed beamspace MIMO-NOMA system outperforms the baseline in terms of both spectrum and energy efficiency. In future work, we intend to extend the proposed optimization framework for precoding from beam-based optimization to user-based optimization, aiming to further improve system performance.

Specifically, the gradient of the rate of the nth user in the mth beam with respect to the variable $\mathbf{w}_m$ can be expressed as:
This equation implies that $\mathbf{w}_m^{o}$ satisfies the first-order optimality condition with respect to precoding, where $\lambda$ is the Lagrange multiplier. Moreover, since $\mathbf{w}_m^{o}$ satisfies the power constraint and also satisfies the complementary slackness condition, it further satisfies the KKT conditions of the original problem $P_{\mathrm{beam}}$. Thus, $\mathbf{w}_m^{o}$ is a nontrivial stationary point of $P_{\mathrm{beam}}$. The sufficiency proof is complete.
The necessity of the proposition can be demonstrated by reversing the steps of the sufficiency proof.

Figure 1. The system model of beamspace MIMO architecture.
is a symmetric set of indices centered around zero. The spatial direction of the channel is defined as $\theta = \frac{d}{\lambda}\sin(\phi)$, where $\lambda$ represents the wavelength, $d = \frac{\lambda}{2}$ denotes the antenna spacing, and $\phi$ denotes the physical direction of the corresponding path, satisfying $-\frac{\pi}{2} \le \phi \le \frac{\pi}{2}$. The lens antenna array serves as a discrete Fourier transformation matrix $\mathbf{U}$, defined as
, the problem becomes a non-convex optimization problem that is difficult to solve directly. Furthermore, it is highly nonlinear. Additionally, the optimization of precoding $\{\mathbf{w}_m\}$ is performed at the beam level, while the optimization of power allocation $\{p_{m,n}\}$
(t) m,n in (25) is linear in the number of RF chains, i.e., $O(N_{\mathrm{RF}})$.

Figure 3. Spectrum efficiency comparison versus SNRs of the two schemes with different users.

Figure 4. Spectrum efficiency comparison versus SNRs with different users.

Figure 5. Energy efficiency comparison versus SNRs with different users.

Figure 6. Spectrum efficiency comparison versus users of the two schemes at SNR = 0 dB.

Figure 9. Spectral efficiency comparison versus users with different SNRs.

gradient of the rate of the nth user in the xth (x = m) beam with respect to the variable w m can be expressed as: ∇ w m R x,y = m , ∀m, (A8) and (A9) can be simplified, and the expression after substituting them into (A4) can be specifically expressed as: m R sum ( w o m ) − λ( w o m ) λ > 0. (A11)
Definition 1 (Trivial Stationary Point). If a point X satisfies HX = 0, which results in a zero sum rate, we say that it is a trivial stationary point of the original problem P 1.
Proof. See Appendix A.

Algorithm 1
Proposed Precoding Framework. Beamspace channel vectors: $\mathbf{h}_{m,n}$ for ∀m, n; power allocation parameters: $p_{m,n}$ for ∀m, n; noise variance: $\sigma^2$; maximum number of iterations: $T_{\max}$.

Algorithm 2
Proposed Power Allocation Framework.
m,n for ∀m, n.

Algorithm 3
Proposed Joint Precoding and Power Allocation Framework.
m,n for ∀m, n.