A Survey on Non-Orthogonal Multiple Access: From the Perspective of Spectral Efficiency and Energy Efficiency

Abstract: Non-orthogonal multiple access (NOMA) is a promising technology for next-generation wireless networks with emerging demands on low latency, high throughput, and massive connectivity. Unlike orthogonal multiple access, NOMA allows multiple users to share the same radio resources, which significantly improves spectral efficiency (SE). To achieve green wireless communications for numerous networked devices, NOMA helps reduce energy consumption while satisfying rate fairness and quality-of-experience requirements. The goal of this paper is to introduce the innovative approaches for NOMA in terms of the SE and energy efficiency, and discuss emerging technologies involved with NOMA. Further, its challenges and future research directions are highlighted.


Introduction
The rapid growth in massive connectivity, where billions of devices are connected to a dense network (i.e., Internet of Things (IoT)) with a high demand for quality-of-experience (QoE) [1], requires innovative technologies to overcome the scarcity of radio resources. Recently, many novel approaches have been introduced as potential solutions for future wireless networks. As a standard for forthcoming wireless networks, a massive multiple-input multiple-output (mMIMO) system utilizes the properties of favorable propagation and channel hardening to improve network throughput. The mMIMO system consists of a central base station (BS) equipped with numerous antennas that serve multiple user equipments (UEs). It has been proved to support multiple access (MA) efficiently when the number of antennas at the BS is higher than the number of UEs [2]. Additionally, non-orthogonal multiple access (NOMA), which enables multiple UEs to share the same time-frequency resources, has been proposed to enhance the throughput and rate fairness, even with a limited ratio of the number of antennas to that of the UEs.
NOMA transmission is distinguished from conventional orthogonal multiple access (OMA) in that UEs can be allocated to the same time/frequency resources. In OMA transmission, UEs are assigned exclusive time slots, frequency resource blocks, or spreading codes, depending on the MA method. These methods include frequency division multiple access (FDMA), time division multiple access (TDMA), code division multiple access (CDMA), and orthogonal frequency division multiple access (OFDMA). The common goal of the preceding OMA schemes is to avoid mutual interference among the UEs, thereby guaranteeing a certain spectral efficiency (SE) for the UEs. However, applying [...] that can be integrated into NOMA to further boost the SE and energy efficiency (EE), including millimeter wave (mmWave), full-duplex (FD) radio, and simultaneous wireless information and power transfer (SWIPT). Furthermore, we discuss the challenges and future research directions to further enhance the SE and EE. The main content of this paper is organized as follows.

• First, we briefly describe the basic operation of NOMA systems. For power-domain NOMA, the principle and conditions of successive interference cancellation (SIC) are presented in a general case of UE clustering. For code-domain NOMA, we introduce the comprehensive operation of sparse code multiple access (SCMA).
• We present the recent advances in improving the SE and EE of NOMA systems, and then discuss UE/device association and cooperative NOMA, which are based on cooperation among devices.
• To improve the SE and EE further, NOMA schemes can be integrated into emerging technologies. This paper provides an overview of such enhancements when NOMA is applied to networks based on mmWave, FD, and SWIPT.
• Finally, the challenges of forthcoming networks are discussed, and then we introduce potential research directions, including terahertz (THz) wave, intelligent reflecting surface (IRS), and learning-based methods, which may help to further improve the SE and EE.

Principles of NOMA Transmission
This section presents the principles of the power-domain and code-domain NOMA systems. The two types of NOMA lead to different decoding methods and different SE and EE formulations. Thus, the power- and code-domain NOMA are described separately. For the rest of the paper, the term NOMA is used to indicate power-domain NOMA, while SCMA is used to denote a specific scheme of code-domain NOMA.

NOMA
NOMA has been examined in a single-carrier scenario where multiple downlink (DL)/uplink (UL) UEs share the same frequency resources to trade off the SE and EE in the power domain. Specifically, a typical example of a two-user scenario was provided to prove the advantages of NOMA [6]. In this work, we further describe the principle of NOMA when it is applied to UE groups (also referred to as clusters). We consider a NOMA-based system where a cell-centered BS equipped with $N$ antennas serves $G_d$ DL groups and $G_u$ UL groups.

DL Transmission
Consider the $g$-th DL group that consists of $K_g$ DL UEs. As illustrated in Figure 1, the UEs adopt an SIC technique to decode the messages. Particularly, the DL UEs in the $g$-th DL group can be sorted in descending order of their distances to the BS, i.e., the farthest and nearest DL UEs are denoted by $D_{g,1}$ and $D_{g,K_g}$, respectively. A certain DL UE, $D_{g,k}$, $k \in \{1, 2, \cdots, K_g\}$, can remove the messages intended for farther DL UEs, $D_{g,k'}$, $k' \in \{1, 2, \cdots, k-1\}$, while suffering from the interference caused by the signals of nearer DL UEs. We assume $\mathbf{h}^d_{g,k} \in \mathbb{C}^{1 \times N}$ to be the channel vector of the $k$-th DL UE in the $g$-th group. The signal intended for $D_{g,k}$ is represented by $x^d_{g,k}$ with $\mathbb{E}\{|x^d_{g,k}|^2\} = 1$. Therefore, the post-SIC signal received at $D_{g,k}$ can be expressed as in (1), where $\mathbf{w}^d_{g,k} \in \mathbb{C}^{N \times 1}$ and $n^d_{g,k} \sim \mathcal{CN}(0, \sigma^2_{g,k})$ denote the beamforming vector and additive white Gaussian noise (AWGN), respectively. Accordingly, the signal-to-interference-plus-noise ratio (SINR) at $D_{g,k}$ is computed as in (2).
The DL NOMA transmission should be distinguished from the DL transmission scheme based on dirty paper coding (DPC), where the BS sends the encoded signals (called dirty messages) to the DL UEs. In fact, the messages sent to DL UEs are jointly encoded so that the cumulative signal received at a certain DL UE becomes a purely desired message of that UE [12][13][14]. The key idea of the DPC is based on the encoding technique at the BS, while the DL NOMA uses the SIC technique at the DL UEs. However, the DPC encounters many challenges for practical implementation.
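As an illustration of the post-SIC SINR described above, the following minimal sketch evaluates the SINR of every DL UE in a single group. It assumes the common form $\gamma_k = |\mathbf{h}_k \mathbf{w}_k|^2 / (\sum_{j>k} |\mathbf{h}_k \mathbf{w}_j|^2 + \sigma^2)$, ignores inter-group interference, and uses the same noise power for all UEs; these simplifications, as well as the function names `inner` and `dl_post_sic_sinr`, are assumptions made for the example and are not taken from the paper.

```python
from typing import List, Sequence

Vector = Sequence[complex]


def inner(h: Vector, w: Vector) -> complex:
    """Product of a 1xN channel row vector h and an Nx1 beamformer w."""
    return sum(hi * wi for hi, wi in zip(h, w))


def dl_post_sic_sinr(h: List[Vector], w: List[Vector],
                     noise_power: float) -> List[float]:
    """Post-SIC SINR of each DL UE in one group (assumed single-group model).

    UEs are indexed farthest-first: index 0 is the farthest UE. UE k has
    cancelled the signals of UEs 0..k-1 and treats the signals of the
    nearer UEs k+1.. as interference.
    """
    sinrs = []
    for k in range(len(h)):
        desired = abs(inner(h[k], w[k])) ** 2
        interference = sum(abs(inner(h[k], w[j])) ** 2
                           for j in range(k + 1, len(h)))
        sinrs.append(desired / (interference + noise_power))
    return sinrs


# Hypothetical two-UE, single-antenna example: the far UE has a weak
# channel and receives the larger share of the transmit power.
channels = [[0.5], [2.0]]        # far UE first, near UE second
beamformers = [[0.8], [0.6]]     # powers 0.64 (far) and 0.36 (near)
sinr_far, sinr_near = dl_post_sic_sinr(channels, beamformers, 0.01)
print(sinr_far, sinr_near)
```

The near UE sees no residual intra-group interference because it has removed the far UE's signal, whereas the far UE treats the near UE's signal as noise.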

UL Transmission
Generally, we consider the $i$-th UL group, which consists of $L_i$ UL UEs. After applying the SIC to the $i$-th UL group, the signal for decoding the message of the $\ell$-th UE in the $i$-th group, $U_{i,\ell}$, at the BS can be formulated as in (3), where $p_{i,\ell}$ and $\mathbf{h}^u_{i,\ell}$ represent the power coefficient and channel vector of $U_{i,\ell}$, respectively, while the signal and AWGN are $x^u_{i,\ell}$ with $\mathbb{E}\{|x^u_{i,\ell}|^2\} = 1$ and $n_{bs} \sim \mathcal{CN}(0, \sigma^2_{bs})$, respectively. Therefore, the SINR for decoding the message of $U_{i,\ell}$ is given as in (4), where $\mathbf{a}_{i,\ell}$ is the receive vector for $U_{i,\ell}$ obtained using maximum ratio combining (MRC), zero-forcing (ZF), or minimum mean square error (MMSE) at the multiple-antenna receiver. The signals sent from the UL UEs are centrally processed at the BS, rather than separately processed at the individual UEs as in the DL transmission. For efficiency and simplicity, the BS usually applies the SIC to all UL UEs in a single group, i.e., $G_u = 1$.
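The UL case can be sketched in the same spirit. The example below assumes a single UL group, MRC reception ($\mathbf{a}_\ell = \mathbf{h}_\ell$), and that the UEs are listed in decoding order, so that the BS has already cancelled the signals of all earlier UEs when it decodes UE $\ell$. The SINR form $p_\ell |\mathbf{a}_\ell^H \mathbf{h}_\ell|^2 / (\sum_{j>\ell} p_j |\mathbf{a}_\ell^H \mathbf{h}_j|^2 + \sigma^2 \|\mathbf{a}_\ell\|^2)$ and the name `ul_sic_sinr_mrc` are illustrative assumptions, not the paper's exact expressions.

```python
from typing import List, Sequence

Vector = Sequence[complex]


def herm_inner(a: Vector, v: Vector) -> complex:
    """Hermitian inner product a^H v."""
    return sum(complex(ai).conjugate() * vi for ai, vi in zip(a, v))


def ul_sic_sinr_mrc(h: List[Vector], p: List[float],
                    noise_power: float) -> List[float]:
    """SINR per UL UE at the BS with MRC and SIC (assumed model).

    h[l] is the channel vector of UE l and p[l] its transmit power.
    UEs are listed in decoding order: when UE l is decoded, the signals
    of UEs 0..l-1 have been cancelled and UEs l+1.. still interfere.
    """
    sinrs = []
    for l in range(len(h)):
        a = h[l]  # MRC: the receive vector matches the UE's own channel
        desired = p[l] * abs(herm_inner(a, h[l])) ** 2
        interference = sum(p[j] * abs(herm_inner(a, h[j])) ** 2
                           for j in range(l + 1, len(h)))
        noise = noise_power * herm_inner(a, a).real
        sinrs.append(desired / (interference + noise))
    return sinrs


# Hypothetical example: two UEs, a BS with two antennas.
print(ul_sic_sinr_mrc([[1, 1], [1, 0]], [2.0, 1.0], 1.0))
```

Replacing the line that sets `a` with a ZF or MMSE receive vector would change only the combining step; the SIC ordering logic stays the same.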

SCMA
SCMA is among the most popular code-domain NOMA schemes. According to the principle of CDMA, the information is spread across multiple frequency resources, i.e., subcarriers in orthogonal frequency division multiplexing (OFDM). Specifically, SCMA uses a sparse codebook, where the codewords are mapped into multidimensional constellations. Consequently, multiple UEs are enabled to share the same time-frequency resources, while they are distinguished by different codewords. In this subsection, we introduce the system model of SCMA and the codebook mapping procedure. A general SCMA system uses a codebook, expressed by a factor graph matrix $\mathbf{F}$ of $D \times J$ binary elements. Accordingly, the codebook contains $J$ codewords, and each codeword is a $D$-dimensional complex vector (called the sparse vector) with $M$ ($M < D$) nonzero entries. Therefore, the number of codewords becomes $J = \binom{D}{M}$, and there are $d_f = \binom{D-1}{M-1}$ codewords multiplexed per OFDM subcarrier. Figure 2 illustrates a system consisting of six active UEs and $D = 4$ OFDM subcarriers. By establishing an SCMA layer, each UE is allocated to $M = 2$ of the $D = 4$ subcarriers, as assigned to a codeword via an SCMA encoder. The transmit symbol of each UE is appropriately spread over $M = 2$ subcarriers. The sparsity of codewords, i.e., $M$ nonzero entries among the $D$ entries of a codeword, allows the SCMA decoder at the receiver to recover the messages of all the UEs under controllable interference. The interference is usually handled by joint decoding based on a message passing algorithm (MPA). SCMA allows the BS to serve a number of UEs exceeding the number of OFDM subcarriers, which leads to enhanced SE.
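The counting argument above can be checked with a short sketch that enumerates every choice of $M$ out of $D$ subcarriers and builds the corresponding binary factor graph matrix. The function name `scma_factor_graph` and the column ordering are assumptions made for illustration; a practical codebook additionally assigns complex constellation points to the nonzero positions, which is not modeled here.

```python
from itertools import combinations
from math import comb
from typing import List


def scma_factor_graph(D: int, M: int) -> List[List[int]]:
    """D x J binary matrix with one column per M-subset of the D subcarriers.

    Entry F[d][j] is 1 when layer (codeword) j occupies subcarrier d.
    """
    supports = list(combinations(range(D), M))
    return [[1 if d in s else 0 for s in supports] for d in range(D)]


D, M = 4, 2
F = scma_factor_graph(D, M)
J = len(F[0])                 # number of layers, equals C(D, M)
d_f = sum(F[0])               # layers colliding on one subcarrier
overloading = J / D           # ratio of layers to resources

for row in F:
    print(row)
print(J == comb(D, M), d_f == comb(D - 1, M - 1), overloading)
```

For $D = 4$ and $M = 2$ this gives six layers, three layers per subcarrier, and a layer-to-resource ratio of 1.5, matching the six-UE example in the text.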

SIC Conditions for NOMA-Based Design
The following two conditions are important for SIC operation in the DL NOMA transmission.

• Minimum SINR: The effective SINR for decoding the message of a certain UE should be taken as the minimum of the SINRs for decoding that message at the UE itself and at all the nearer UEs.
• SIC power constraint: A series of SIC conditions can be established for each UE. Specifically, for a message to be decoded first, its received power portion must exceed that of the remaining portions in the SIC process.
The two conditions for the SIC operation pursue the same behavior: the power levels of the interfering signals that a UE decodes and cancels are greater than that of its own message. The latter condition explicitly reveals this behavior. For the former condition, we examine an example of a two-user scenario ($g = 1$) with the SINRs at the far and near UEs following from (2), as given in (5) and (6), respectively. Here, the SINR of the near UE in (6) is derived using the form in (2), $\gamma^d_{1,2}$, with the inter- and intra-group interference vanishing. The SINR of the far UE becomes $\bar{\gamma}^d_{1,1}$ in (5), which is given as the minimum of the SINR for decoding its own message, $\gamma^d_{1,1}$ as in (2), and the SINR for decoding the far UE message at the near UE. When increasing the power for delivering the far UE message (through the beamforming vector $\mathbf{w}^d_{1,1}$), the SINR expression (6) for decoding the near UE message is not affected. Therefore, the BS tends to allocate more power to $\mathbf{w}^d_{1,1}$, which helps improve the UE fairness and SE. Aiming at the SE optimization, we let $P_{th}$ be the minimum power difference for successful decoding, and then the objective function and SIC constraint for the two conditions in the two-user scenario can be described as in Table 2.

Table 2. Objective functions and constraints for SIC design in NOMA systems.

Design Approaches | Objective Function, SE (bits/s/Hz) | Constraint for SIC
Minimum SINR | $\log_2(\cdots)$ | Implicit SIC constraint in the objective function
SIC power constraint | $\log_2(\cdots)$ | [...]

In the UL transmission, similar conditions can be derived. To further boost the SE for UL UEs, all the UL UEs are organized into one group; therefore, the SIC is applied to the sole group for decoding all the messages at the BS with high computational capability. In the DL and UL NOMA transmissions, the SIC technique supports flexible and efficient allocation of the power levels to the UEs. This is important to the advanced management of power consumption for enhancing the EE, which can be defined as [15] $\mathrm{EE} = B \cdot \mathrm{SE} / P_{total}$ (bits/Joule), where $B$ and $P_{total}$ denote the system bandwidth and total power consumption, respectively. The numerator of EE represents the throughput as a product of system bandwidth and SE. Therefore, considering throughput or data rate (including sum rate and max-min rate) is directly reflected in the SE.
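To make the minimum-SINR condition and the EE definition concrete, the following sketch evaluates a single-antenna two-user DL example. It assumes scalar channel gains in place of beamforming vectors, and a total power consumption modeled as transmit power plus a fixed circuit power `p_circuit`; both the power model and the function name `two_user_dl_noma` are assumptions introduced for illustration rather than the formulation of the paper.

```python
from math import log2
from typing import Dict


def two_user_dl_noma(g_far: float, g_near: float, p_far: float,
                     p_near: float, noise: float, bandwidth_hz: float,
                     p_circuit: float) -> Dict[str, float]:
    """SE and EE of a two-user DL NOMA pair (assumed scalar-channel model).

    g_far and g_near are channel power gains, p_far and p_near the
    transmit powers assigned to the far and near UE messages.
    """
    # Far UE decodes its own message, treating the near UE signal as noise.
    far_at_far = p_far * g_far / (p_near * g_far + noise)
    # Near UE must first decode the far UE message in order to cancel it.
    far_at_near = p_far * g_near / (p_near * g_near + noise)
    # Minimum-SINR condition: the far UE rate is limited by the weaker link.
    far_eff = min(far_at_far, far_at_near)
    # After SIC the near UE sees only noise.
    near = p_near * g_near / noise

    se = log2(1 + far_eff) + log2(1 + near)          # bits/s/Hz
    p_total = p_far + p_near + p_circuit             # assumed power model
    ee = bandwidth_hz * se / p_total                 # bits/Joule
    return {"far_sinr": far_eff, "near_sinr": near, "se": se, "ee": ee}


result = two_user_dl_noma(g_far=2.0, g_near=12.0, p_far=0.75, p_near=0.25,
                          noise=1.0, bandwidth_hz=1e6, p_circuit=1.0)
print(result)
```

In this scalar setting the minimum is always attained at the far UE whenever the near UE has the larger gain, because $x \mapsto p_{far} x / (p_{near} x + \sigma^2)$ is increasing in $x$; with multiple antennas and beamforming this ordering is no longer automatic, which is why the minimum is kept explicitly.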

NOMA-Assisted Spectral-Energy Efficiency
In earlier works, MA techniques with successive decoding were considered [16][17][18][19]. The authors in [16] examined an uplink multi-user single-input single-output (MU-SISO) system (UL NOMA), where a BS must determine the order for decoding the messages sent from the UEs. By introducing a polymatroid structure, the optimal solution for the ordering was devised using the convex hull of a successive decoding set. In [18], a broadcast system in which a single-antenna BS serves multiple DL UEs was investigated. To maximize the SE, the authors proposed a power allocation method for three duplexing schemes: time-division duplex (TDD), frequency-division duplex (FDD), and code-division duplex (CDD). Although the OMA was applied to TDD and FDD, the power allocation with successive decoding was utilized in the CDD scheme, as multiple UEs share the same time-frequency resources. However, the approaches for the SE improvement were developed within the OMA domain, i.e., time, frequency, or code.
The concept of NOMA was introduced in [20], where a NOMA-based scheme is based on CDMA, and thus the MA is executed non-orthogonally in the time-frequency domain. This work also indicated that a trade-off between the SE and EE of NOMA follows the principle of OMA, which is realized by two approaches: (i) to maximize the SE with a given power budget, and (ii) to minimize the power consumption while providing a minimum data rate. These are two alternative approaches to SE and EE optimization in the OMA and NOMA. To enhance the SE, a NOMA with SIC was proposed in [21,22], where a superposition coding in the power domain efficiently supports the MA. In contrast to the CDMA, NOMA using an SIC receiver can address the UE fairness through power allocation without fast transmit power control for resolving the near-far problem.
To utilize NOMA efficiently, many works have recently focused on enhancing the SE in terms of the sum rate and max-min rate. The SE improvement for DL transmission was analytically and numerically proved in [23], where the asymptotic analysis is conducted for a high signal-to-noise ratio (SNR) and an infinite number of UEs. Then, the authors in [24] investigated NOMA in two special cases: fixed power allocation (F-NOMA) and cognitive-radio-inspired NOMA (CR-NOMA), where the SIC is performed at the secondary UE under the QoS requirement for primary UE. Further, the sum and max-min SE were considered to understand the impact of UE pairing on per-UE SE via asymptotic performance analysis. In addition, in a two-UE scenario, Choi et al. reported on proportional fairness scheduling for a DL system to maximize the sum SE and minimum SE [25]. Then, a general scenario was considered in [26], where resource allocation, matching, and particle swarm optimization were investigated to maximize the SE for a DL NOMA system with multiple UEs. Meanwhile, a DL power allocation for NOMA multicell networks was proposed in [27], while the authors in [28] investigated a resource and power allocation for a DL multicarrier system to maximize the minimum SE among UEs. The aforementioned works developed the NOMA-based schemes for MU-SISO systems with the BS and UEs equipped with a single antenna. Therefore, the earlier NOMA-based systems might not fully exploit the channel capacity and advantages of NOMA, which depend largely on disparities in the channel conditions among UEs.
Multiple-antenna techniques were considered in NOMA systems to utilize the spatial degrees of freedom for further enhancing the channel capacity [29,30]. In [31], the authors examined a cellular DL MIMO system, where an intra-beam superposition coding at the BS and an intra-beam SIC at UEs were proposed to simultaneously provide higher sum SE and per-UE throughput. Then, the optimal and low-complexity suboptimal power allocation for the DL MIMO systems were devised in [32] to maximize the ergodic capacity. This study further introduced the minimum rate to improve the UE fairness among near and far UEs, and per-UE QoS. Under the consideration of layered transmission, the lower and upper bounds on the average sum rate were derived in [33], which shows that the sum rate linearly increases with an increase in the number of antennas. In [34], algorithms for clustering, beamforming, and power allocation were developed to maximize the SE in a general MIMO system with multiple UEs. To improve the SE for a DL heterogeneous network, the authors in [35] proposed a coordinated beamforming design to mitigate the interference. Considering the multiple-input single-output (MISO) systems, the authors in [36] utilized the order of signal power for beamforming design and approximated the DL sum rate maximization problem by a sequence of second-order-cone (SOC) programs. In [37], the beamforming design and power allocation were jointly optimized to improve the SE for secure transmissions.
Although DL NOMA has attracted a lot of interest due to the challenge of performing SIC at different nodes, i.e., UEs, UL NOMA has also been investigated in many recent works. The UL NOMA may be more straightforward as the SIC is performed at one node, i.e., the BS. To maximize the SE for the UL UEs, an MMSE with SIC (MMSE-SIC) was proposed in [38], while the authors in [39] applied ZF with SIC (ZF-SIC) to two sets of UL UEs, associated with strong and weak channel gains, for decoding the UEs' messages. To utilize the current network, Al-Imari et al. developed a UL NOMA scheme with simultaneous resource and power allocation for OFDM [40]. A new UL NOMA framework was proposed in [41], where a general model with the SIC applied to a pair of UL UEs was developed. The authors in [42] considered a power allocation to enhance the per-UE SE, while UE pairing and power control for SE maximization were developed in [43]. Several other works explored UL NOMA for SE improvement from the viewpoint of associations among UEs, which will be detailed in the following section.
To enhance the EE, many efforts have been made to design power consumption minimization and EE maximization schemes for single and multiple subcarriers. Chen et al. developed new methods to minimize the power consumption of MISO-NOMA DL transmission, i.e., an optimal precoding under QoS constraints [44], and a beamforming design based on the quasi-degradation criterion [45]. The work in [45] also revealed that when an SIC is performed at a UE with a strong channel condition, the performance of NOMA is similar to that of the DPC. In [46], the EE was improved using the first-order Taylor approximation for beamforming design with a CR-NOMA adopted as in [24]. To further boost the EE, the authors in [47] investigated a resource allocation problem for a NOMA-based system, where a suboptimal subcarrier assignment and power allocation were proposed to maximize the EE. Then, the EE for a UL multicarrier system was reported in [48], in which the UE fairness was guaranteed by power allocation and decoding order.
In summary, a key technique for power-domain NOMA is SIC, in which higher-power signals destined for poorer-channel UEs are decoded and removed at better-channel UEs. This principle not only boosts the data rates of poor-channel UEs but also maintains those of good-channel UEs, improving the SE and EE as well as UE fairness. However, more power should be allocated to far UEs than to near UEs in a UE group. Consequently, the number of UEs in a group as well as the total number of UEs is restricted due to a limitation in power budget. To accommodate a large number of UEs in a network, a resource overloading technique will be introduced in the following section.

Spectral and Energy Efficiencies for SCMA-Based Systems
SCMA based on the information spreading technique allows MA where the number of UEs can exceed the number of subcarriers without causing excessive mutual interference; therefore, the SE is significantly increased. In contrast to the NOMA with the optimization techniques developed for power allocation, the advancement in SCMA is usually focused on codebook and detector designs [49]. Moreover, the encoders of an SCMA system directly map data onto subcarriers following the complex multidimensional codewords in the codebook. Then, the detector at the receiver recovers data information from the received signals. Therefore, the optimization design for the codebook and detector is of crucial importance for SE improvement in SCMA systems, and the topic attracted a lot of interest, especially for UL transmissions. Generally, the codebook design is classified into two categories: geometric shaping (GS) and probabilistic shaping (PS). Accordingly, the constellations for modulation of the GS scheme are organized as a non-evenly spaced topology with a uniform distribution [50], while those of the PS are based on an evenly-spaced topology following a nonuniform distribution [51][52][53]. The combination of the GS and PS schemes was recently considered in [54].
As opposed to the OMA, the SCMA codebook design can provide resource overloading [49]. The overloading rate is defined as the ratio of the number of superposed layers, i.e., the number of codewords, to the number of resources. The approaches for codebook design have a limitation of overloading rate at 150%. To overcome the overloading limitation, certain recent works considered a grant-free system called multi-codebook SCMA (McSCMA), where every UE uses a different number of codebooks to encode data into multiple codewords. In particular, M-ary data symbols are mapped to the sparse codewords of a codebook, while the active UE detection and codebook combinations are executed at the BS through pilot detection [55][56][57]. The authors in [57] demonstrated the effect of the number of pilots on pilot detection, and consequently a high-overloading codebook design with codebook grouping was proposed to reduce the number of pilots. Therefore, the McSCMA provides more room to improve the SE owing to the extension of the overloading rate to 200%.
The design of a detector is crucial to the design of an SCMA receiver. Owing to the sparse spreading structure of the SCMA, the receiver requires a detector that combines mapping and spreading operations. The SCMA detector has been gradually developed over the last decade. For example, an MPA detector was used in [58], which provides nearly optimal bit error rate (BER) and low complexity, as compared with the optimal maximum a posteriori (MAP) detector. However, the authors of [59] showed that heavy overloading can cause a high computational complexity in the original MPA due to an increase in the codebook size and number of UEs. Accordingly, many works were conducted to improve the detector algorithm based on the MPA [59][60][61][62] and other advanced methods [63][64][65].
Although the codebook and detector designs are considered at the BS with high power supply and computational capability, certain advancements at the side of the UEs were reported to enhance the SE in DL transmission. In [66], a novel multi-stage MPA algorithm was proposed to determine the detection order via the sorted SNR levels, while a modified factor graph to reduce the complexity was introduced to efficiently operate at DL UEs. Concurrently, Yu et al. [67] investigated an irregular SCMA, where the rotated angles and extrinsic information transfer chart are designated to generate the UEs' codebook. To employ the advantage of the SCMA, the authors of [68] proposed a low-complexity detector, where a region restriction combined with an improved logarithm MPA is employed for searching the superimposed constellations. To further boost the sum and per-UE SE of the DL systems, several studies focused on other design aspects, i.e., UE pairing, power sharing and scheduling [69], and resource and power allocation [70][71][72].
Considering EE improvement, the SCMA system design has focused on power minimization and the trade-off between SE and power consumption. An earlier work in [73] introduced a unified framework and low-complexity decoding algorithm, which facilitates a prototype for improving the EE of an SCMA-based system. In [74], the authors considered an EE maximization problem under the QoS requirement, where a two-step strategy for codebook assignment and power allocation was proposed to devise a low-complexity algorithm. A new concept of high-dimensional SCMA codebook design was examined in [75] to increase the EE. Accordingly, the authors introduced a decimal signature matrix of a codebook, rather than a binary matrix, which plays the role of the parity-check matrix in the MPA. Recently, advanced techniques have been explored for EE maximization in various scenarios of SCMA, including a joint optimization of codebook assignment and power allocation in a single cell [76] and in cloud radio access networks (C-RANs) [77].
The aforementioned advancements in SCMA are categorized according to several design aspects in Table 3.

Improving Spectral-Energy Efficiency using UE Association and Cooperation
The analysis in the aforementioned works demonstrates that the system performance of both NOMA and SCMA schemes is highly influenced by UE grouping and/or decoding order, depending on the channel state information of UEs. To further enhance the SE and EE, many efforts have been made to find a good grouping and decoding order of UEs. Therefore, this section introduces the extensions of NOMA and SCMA to dense networks through UE associations and cooperation among UEs.

UE Associations
From previous works, UE pairing and the order for SIC in NOMA systems have been proved essential for performance improvement [24,36]. Consequently, efforts were taken to further utilize the SIC operation. The authors in [78] considered a general case for UE clustering in the DL and UL transmissions, where an SIC is applied to groups of UEs to maximize the throughput. Then, a simple UE clustering method based on the channel gains was devised for an MU-SISO system. In a multiple-antenna scenario, the optimization of clustering and ordering for SIC are more challenging owing to beamforming/receiver design under power control. Moreover, the UEs in significantly different channel conditions should be grouped together for efficient SIC. Therefore, many recent works applied the SIC to UEs using various strategies of pairing, clustering, and ordering to further improve the SE [79][80][81][82][83][84][85][86] and EE [87,88].
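The observation that UEs with significantly different channel conditions should share a group can be illustrated with a simple pairing heuristic: sort the UEs by channel gain and pair the weakest remaining UE with the strongest remaining UE. This is only a minimal sketch of one commonly used rule, assuming scalar channel gains are known; it is not the specific clustering method of any of the works cited above, and the function name `pair_by_gain_disparity` is hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def pair_by_gain_disparity(
        gains: Dict[str, float]) -> Tuple[List[Tuple[str, str]], Optional[str]]:
    """Pair UEs so that each pair has a large channel-gain disparity.

    Returns a list of (weak_ue, strong_ue) pairs and, when the number of
    UEs is odd, the single UE left unpaired (otherwise None).
    """
    ordered = sorted(gains, key=gains.get)   # weakest channel first
    pairs = []
    lo, hi = 0, len(ordered) - 1
    while lo < hi:
        pairs.append((ordered[lo], ordered[hi]))
        lo += 1
        hi -= 1
    leftover = ordered[lo] if lo == hi else None
    return pairs, leftover


# Hypothetical channel gains for four UEs.
pairs, leftover = pair_by_gain_disparity(
    {"ue_a": 0.1, "ue_b": 2.0, "ue_c": 0.5, "ue_d": 1.2})
print(pairs, leftover)
```

In each pair, the first UE plays the role of the far (weak-channel) UE and the second the near UE that performs SIC. With multiple antennas, the pairing would also have to account for the beamforming design, which this sketch does not attempt.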
The goal of SCMA is to provide more codewords than subcarriers, which helps serve a greater number of UEs. Besides the codebook and detector designs, the SE of SCMA-based systems can be improved using resource sharing and UE association in dense-device networks [89,90]. First, the SCMA was employed to enhance the SE in device-to-device (D2D) communication systems, where D2D pairs are allowed to share the same codebook with cellular UEs, and thus the resource allocation and interference management are considered [91][92][93][94]. Kim et al. [91] proposed a two-stage method based on the combination of graph theory and inner approximation. The method provides a fast allocation of the codewords to cellular UEs and efficient power allocation to the D2D UEs. Further, improving the EE was recently addressed; for example, a unified resource management method with OFDM resources assigned to D2D pairs was proposed in [95] to minimize the power consumption.

Cooperation among Devices
Cooperation among devices, where the relays are used to support cell-edge UEs in a single- or multi-hop network, was considered as a potential solution to increase the coverage area [96]. In terms of the SE, the combination of NOMA and relays, called cooperative NOMA, provides certain advantages: (i) utilizing the antenna diversity to boost the SE of UEs with weak channel conditions and reducing the outage probability, and (ii) adopting NOMA with an SIC at the relays to further extend the coverage area. Many works investigated the NOMA relaying, including the UE and dedicated relays, in numerous scenarios targeting the SE and EE enhancement [97][98][99][100][101][102][103]. The SCMA was recently considered in cooperation with relaying [104][105][106]. In particular, the authors of [105] proposed a joint optimization of power allocation, codebook assignment, and subcarrier pairing to maximize the sum rate in an SCMA-based UL network, while the outage probability analysis for UEs in an SCMA relaying system was reported in [106].

Improving Spectral-Energy Efficiency Using Emerging Technologies
This section presents up-to-date system-level approaches to improve the SE and EE. Apart from the processing advancements of UE association and cooperation, one or more system-level technologies can simultaneously be integrated into NOMA-assisted wireless networks. In what follows, we introduce developments of NOMA systems in combination with some emerging technologies that are either considered as a standard for the upcoming wireless network or widely discussed in the literature.

Millimeter Wave
The mmWave, where the shorter wavelengths enable more antennas to be employed in the same physical space, has been considered as a key technology for 5G new radio (5G-NR) systems and wireless local area networks, i.e., mmWave IEEE 802.11ay [107,108]. An earlier work in [108] showed that NOMA with a beamspace MIMO can achieve a spectral and energy-efficient mmWave transmission. Then, a multi-beam NOMA for mmWave was proposed in [109], where the coalition formation game theory and inner approximation were utilized to enhance the SE. In addition, another work also aiming at the SE improvement was recently reported in [110], using the Stackelberg game for UE clustering and power allocation. Considering the EE improvement, the mmWave-NOMA scheme was utilized for cellular heterogeneous networks [111], and for secure transmission in the presence of an eavesdropper [112]. The use of SCMA for mmWave MIMO systems was reported in [113] to develop an algorithm for sum rate maximization based on the quasi-orthogonal beamspace.

Full-Duplex Radio
FD radio, which allows the DL and UL transmissions to utilize the same time-frequency resources, is theoretically expected to double the SE of wireless networks compared to half-duplex (HD) [114]. Despite the challenges of self-interference (SI, from transmit to receive antennas at the FD nodes) and co-channel interference (CCI, from UL HD-UEs to DL HD-UEs), many efforts have been made to suppress the SI below the noise floor so that the SE and EE of FD-based systems can be guaranteed. In combination with NOMA, FD systems were proved to provide a significant SE/EE gain and robustness against the SI and channel uncertainty owing to a joint utilization of power allocation and time-frequency resource sharing for the DL and UL transmissions [115][116][117][118][119]. Particularly, an SIC was utilized for the UL transmission with random decoding order as in [116], while a general optimization design for DL clustering and UL decoding order was recently considered in an FD NOMA system [117]. In addition, FD and relaying were simultaneously developed for a NOMA-based multi-pair two-way relay in [120] and an SCMA-based FD MIMO relay in [121].

Simultaneous Wireless Information and Power Transfer
Numerous IoT devices with high energy consumption have become a challenge to forthcoming networks. To address this issue, the wireless power transfer technique through radio frequency signals was considered in [122]. Then, a new paradigm of SWIPT was proposed for green wireless communications to simultaneously consider the SE and EE [123]. Accordingly, NOMA-based systems assisted by SWIPT were investigated to further improve the system performance [124,125], particularly under the framework of power consumption minimization with minimum data rate requirement. The authors in [126] proposed a SWIPT NOMA system by modifying the power splitting scheme to improve the EE, while the optimization of resource and power allocation under limited energy supply was developed in [127] for heterogeneous networks. To boost the performance of NOMA-based systems, the aforementioned technologies have been efficiently utilized in combination with SWIPT [128][129][130][131].

Challenges, Opportunities, and Research Directions
With the aim of further enhancing the SE and EE, this section introduces state-of-the-art approaches that can potentially be applied to NOMA systems. The following approaches, which involve both novel system-level and processing techniques, have been investigated in only a few recent works. Nevertheless, they have been shown to be compatible with NOMA in several scenarios and are therefore promising candidates for various NOMA-based networks.

Terahertz Wave
THz communications, which use higher frequency bands than mmWave, are considered a potential candidate for achieving ultra-high reliability and low latency in future wireless communications. A THz NOMA-based system has been shown to provide not only a high data rate but also a strong anti-interference capability owing to its narrow directional beams [132]. Moreover, a MIMO-NOMA system with a large number of antennas can exploit beamforming techniques in the THz bands [133,134]. In particular, the authors in [133] proposed a beamforming scheme to maximize the throughput of a DL NOMA system, in which UE clustering and sub-band and power allocation are jointly considered. To improve the EE, a THz MIMO-NOMA system was investigated in [134], where UE clustering and hybrid precoding were developed using an enhanced K-means algorithm and a distributed alternating direction method of multipliers, respectively. Although only a few works on THz NOMA exist, this new radio band is promising for ultra-dense networks that demand high data rates.

Intelligent Reflecting Surface
With the enormous growth in the number of devices, the exponentially increasing demands on the SE and EE call for innovations in cost-effective data transmission technologies. IRS technology, in which a large array of scattering elements can be configured to strengthen the transmitted signals at the receiver, has been considered a promising solution towards 6G wireless networks, e.g., mMIMO 2.0 [135,136]. Recently, the integration of NOMA into IRS-based networks was considered [137]. Although the authors in [138] proposed a simple IRS-NOMA design for spatial division multiple access, the numerous potential advantages of the IRS in combination with NOMA remain to be explored for future wireless communications.
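
The benefit of configuring the scattering elements can be illustrated with a minimal single-antenna sketch. It assumes a passive IRS whose elements apply unit-modulus phase shifts, with hypothetical channel coefficients h_d (direct BS-UE link), g[n] (BS to element n), and f[n] (element n to UE). Neither these symbols nor the co-phasing rule below are taken from [135,136], [137], or [138]; they are a common textbook-style simplification. Selecting each phase so that the reflected path aligns with the direct path makes all terms add coherently.

```python
import cmath
import random


def aligned_phases(h_d, g, f):
    """Phase shift per element that co-phases each reflected path with h_d."""
    ref = cmath.phase(h_d)
    return [ref - cmath.phase(gn * fn) for gn, fn in zip(g, f)]


def effective_channel(h_d, g, f, phases):
    """Direct path plus the sum of the phase-shifted reflected paths."""
    reflected = sum(
        gn * cmath.exp(1j * th) * fn for gn, th, fn in zip(g, phases, f)
    )
    return h_d + reflected


def random_channel():
    """Hypothetical complex Gaussian channel coefficient."""
    return complex(random.gauss(0, 1), random.gauss(0, 1))


if __name__ == "__main__":
    random.seed(1)
    n = 32  # assumed number of IRS elements
    h_d = random_channel()
    g = [random_channel() for _ in range(n)]
    f = [random_channel() for _ in range(n)]
    unconfigured = [0.0] * n
    print("gain without phase control:", abs(effective_channel(h_d, g, f, unconfigured)))
    print("gain with co-phasing:", abs(effective_channel(h_d, g, f, aligned_phases(h_d, g, f))))
```

With co-phasing, the magnitude of the effective channel equals the sum of the magnitudes of the individual paths, which by the triangle inequality is the largest value any set of unit-modulus phase shifts can achieve in this model.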

Learning-Based Approaches
The density and mobility of devices in future networks are likely to entail high-complexity processing and rapid variation of the channel responses, to which conventional computing models adapt poorly. In addition, the processing must be completed within a coherence time block, which lasts only several hundred microseconds. To overcome these issues and improve the SE with scarce resources, learning-based NOMA was investigated in some recent works [139], while advanced deep learning algorithms were further developed for NOMA-based systems [140][141][142][143][144][145]. In particular, a deep learning algorithm integrated into the NOMA system was proposed in [140], where the training and testing models are built for data encoding, decoding, and channel detection to enhance the SE. The authors in [141] used deep reinforcement learning for UL NOMA to maximize the EE. To further boost the SE and EE, learning-based methods have been employed in integrated networks, such as NOMA mmWave [139] and NOMA SWIPT [145]. The effectiveness of learning-based schemes has been shown to depend on various factors, such as the structure of the learning network and the training and testing frameworks, which have to be explored further. Therefore, future research on learning-based NOMA may provide potential advantages, particularly in combination with other emerging technologies.

Conclusions
This paper has introduced the latest works that utilize NOMA to improve the SE and EE. First, we have described the principles of power-domain and code-domain NOMA. Based on these principles, we have discussed numerous efforts that apply NOMA to improve the SE and EE of existing networks and have identified critical factors that affect the performance of NOMA-based systems. Then, we have introduced advanced methods and algorithms, such as UE association and cooperation among UEs, which can enhance the system performance while meeting high QoS requirements. Many works that incorporate emerging technologies, i.e., mmWave, FD, and SWIPT, into NOMA have also been discussed, because such combinations can further improve the SE and EE. Finally, we have presented challenges and future research directions to meet the increasing demands of future wireless networks.