1. Introduction
The emergence of Sixth Generation (6G) wireless networks, with advanced intelligence and connectivity capabilities to support diverse applications, is transforming communication around the world. 6G systems are expected to offer terabit-per-second data rates, support more than 10 million connections per square kilometer, and achieve an end-to-end latency of less than 0.1 ms. Meeting such demanding performance targets necessitates breakthroughs in Multiple-Input Multiple-Output (MIMO) technology, particularly in massive MIMO (m-MIMO) systems [1].
One of the key issues in this field is developing a research agenda for a MIMO detection system that is both scalable and computationally efficient. Two fundamental applications, smart manufacturing and industrial automation, are the drivers of these requirements. In particular, the interaction between autonomous systems, robotic production lines, and real-time predictive maintenance platforms, made possible by the Industry 4.0 framework, is increasingly based on ultra-reliable and low-latency communication to ensure optimal performance [
2]. In [
2], the weaknesses of conventional detection systems in such environments are emphasized, and advanced iterative detection methods for industrial Internet of Things applications are presented that operate better in controlled environments. However, the findings also showed a decrease in performance in real-world fading channels with spatial correlation. The hybrid approach employed there to reduce complexity combined sphere decoding with successive interference cancellation, leading to a computational demand approximately 40% lower than Maximum Likelihood (ML) detection. However, its reliance on matrix inversion limits its applicability to the very large antenna arrays anticipated in 6G, as the computational cost scales cubically with antenna dimensions [
1,
2,
3]. The combination of reconfigurable intelligent surfaces, three-dimensional beamforming, and terahertz band communication exacerbates the difficulties of 6G detection. The authors in [
3] investigated the computational viability of detection strategies, such as iterative over-relaxation, Zero-Forcing (ZF), and Minimum Mean Square Error (MMSE), in such high-dimensional system settings. The authors in [
3] demonstrated that computing overhead increases dramatically with antenna expansion, making traditional techniques infeasible. To reduce the complexity of the detection process, ref. [
4] proposed a hybrid analog-digital beamforming architecture as a potential solution. To overcome the computational complexity of m-MIMO detection, sophisticated signal processing methods have been widely examined. Specifically, ref. [
4] proposed an Approximate Message Passing (AMP)-based detection system, which has near-optimal performance and a linear computational cost. To verify performance, theoretical performance limits were developed, and spatially correlated channels that are characteristic of dense antenna arrays were considered. Nevertheless, the authors in [
5] suggested an adaptive damping mechanism that can guarantee convergence stability and increase robustness in unfavorable propagation conditions. Experimental results on a 128 MIMO testbed (64 × 64 MIMO) showed that the Iterative Message-Passing (IMP) detector required three orders of magnitude less complexity than ML detection while staying within merely 2 dB of its Bit Error Rate (BER) performance. Learning-assisted techniques, including optimization-based ones such as DetNet introduced by [
5], can also possibly be used as an alternative to conventional detection mechanisms because of the scalability of m-MIMO. Using Quasi-Newton (QN) methods, the authors in [
6] focused on the application of the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm. This approach provides curvature information, which is necessary to approximate the MMSE solution with high accuracy, yet at a very low computational cost. When the MIMO detection problem is posed as a quadratic optimization problem that satisfies certain regularity constraints, BFGS and other QN algorithms can also reach near-optimal solutions with an exponential convergence rate. It was reported in [
6] that BFGS-based detectors needed far fewer floating-point operations while performing comparably to other MMSE systems. Building on these results, learning-augmented iterative detection algorithms have arisen. In [
7], the authors proposed the Learned Approximate Message Passing (LAMP) architecture, which unfolds the iterations of the AMP algorithm into a neural network architecture with trainable parameters. In Rayleigh, Rician, and spatially correlated fading channels, LAMP was shown to outperform conventional AMP detectors, improving both detection performance and computational efficiency. Studies of QN algorithms for high-dimensional detection are ongoing. In addition, the authors in [
8] have made a comparative study of limited-memory (L-BFGS) and BFGS algorithms to determine theory-based convergence guarantees and implementation trade-offs in m-MIMO cases. The simulation findings in [
8] show that BFGS-type algorithms converge robustly even in noisy environments with imperfect channel estimation, especially when an adaptive step-size control mechanism is used. The authors in [9] suggested a hybrid design that builds on a traditional modular architecture and incorporates neural elements strategically in the various stages of MIMO detection to deliver good performance-complexity trade-offs at manageable memory consumption. They demonstrated the strength of this method in [
9], where a software-defined radio platform was designed and tested under realistic channel conditions. Moreover, the authors in [
10] examined parallel training for m-MIMO detection networks, using both curriculum-based training and transfer learning approaches to enhance flexibility under different operating conditions. These methods were found to decrease training overhead significantly while maintaining competitive detection accuracy. Hardware limitations and energy conservation remain a concern in the implementation of sophisticated detection algorithms in wireless networks. In [
11], the authors analyzed the trade-offs among neural network depth, energy consumption, and numerical accuracy when implementing deep learning (DL)-based MIMO detectors. According to [12], scalability is essential for very large MIMO arrays and distributed detection architectures. Specifically, the authors found that optimized hardware accelerators could reduce power consumption compared to general-purpose Central Processing Units (CPUs), which is critical for incorporating AI-based detection in 6G base stations.
A distributed BFGS-based optimization framework, combined with effective load balancing, achieves near-linear scalability with detection accuracy comparable to that of centralized implementations. Nevertheless, existing techniques have three basic limitations. First, most QN methods, such as BFGS, converge poorly over highly correlated channels, a typical scenario in large-scale MIMO communication systems. Second, DL-based detectors require large training datasets and high computational power, limiting their use in resource-constrained settings. Third, although methods based on the Iterative Signal Detection (ISD) approach scale to large antenna arrays, little has been done to integrate their design with DL. This study addresses these limitations by incorporating the concepts of QN optimization into a DL framework, which reduces the high memory, processing, and offline training requirements of traditional DL-based detectors in m-MIMO systems [
1,
12]. To facilitate efficient online adaptation with lower storage requirements, the proposed approach uses QN principles in the design of a neural network. The framework introduces three significant modifications over previous methods. First, it does not require any inversion of high-dimensional matrices, relying on a QN approximation instead; a Recurrent Neural Network (RNN) implements one explicit-Q nested optimization step, and every parameter in the RNN undergoes a certain degree of sparsity pattern inversion. Second, it employs Deep Q-Network (DQN) modules, which increase stability and speed up convergence. Third, it uses a dedicated architecture that takes advantage of the natural geometry of massive MIMO channels to enhance detection accuracy and computational efficiency.
Related Works
Signal detection in emerging 6G systems faces severe computational challenges as the antenna array size of m-MIMO systems continues to increase. The design of efficient detection algorithms is becoming a core prerequisite for 6G wireless communications, as it directly affects the energy consumption, spectral performance, and reliability of the network. Whereas existing detectors worked reasonably well for the moderate array sizes of 5G networks, 6G will demand detection methods that can accommodate very large system models under very tight power and latency requirements.
To ensure the scalability and resilience of large-scale deployments, considerable research has been conducted, including optimization-based strategies, machine learning-based detectors, and more sophisticated schemes [
9,
12]. As a baseline, linear detection algorithms are widely used in multi-antenna signal detection. Among them, ZF, as noted earlier, is prevalent both in theory and in practice because it is formulated as an easily solvable algebraic expression. Nevertheless, ZF detectors are seriously impaired when the channel matrices are ill-conditioned, a common phenomenon in large-scale MIMO systems, where noise amplification becomes apparent. The MMSE detector mitigates this limitation by incorporating the noise variance in the estimation process and detects more reliably than ZF. The computational complexity of both ZF and MMSE is, however, approximately $O(N^3)$ with respect to the number of antennas, as both involve explicit matrix inversion. The complexity of optimal detection schemes is combinatorial, making them impractical for the ultra-dense topologies predicted in 6G networks. The authors in [
13] explored approximations of MMSE detection using matrix-inversion-free approaches based on the Preconditioned Conjugate Gradient (PCG) method. Their implementation, with adaptive stopping criteria, yields an 80% reduction in computational cost, with BER performance within 0.5 dB of exact MMSE solutions. Despite these gains, the convergence rate of PCG is highly sensitive to channel conditioning, and ill-conditioned channels require many more iterations to achieve stable detection. AMP has become another iterative detection approach applicable to m-MIMO systems. In [
14], the authors developed a rigorous theoretical foundation for the dynamics of iterative AMP in high-dimensional regimes and drew parallels to the replica method of statistical physics. Their analysis yielded a state evolution that makes it simple to optimize the parameters and to predict the convergence behavior of AMP accurately. With empirical validation on systems of all sizes and formal convergence guarantees, AMP has become established as a computationally scalable and analytically tractable algorithm for large-scale MIMO detection. Nevertheless, spatial correlation in m-MIMO channels remains a serious performance problem that continues to prompt considerable research.
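The linear baselines and the inversion-free idea discussed above can be illustrated with a minimal NumPy sketch. It assumes unit-energy symbols and an uncorrelated complex Gaussian channel; the function names (`zf_detect`, `mmse_detect`, `mmse_detect_cg`, `demo`) are hypothetical, and the plain conjugate gradient loop (without preconditioning) is only a stand-in for the PCG scheme of [13], not a reproduction of it.

```python
import numpy as np

def zf_detect(H, y):
    """Zero-forcing estimate: least-squares solution of H x = y."""
    return np.linalg.lstsq(H, y, rcond=None)[0]

def mmse_detect(H, y, noise_var):
    """MMSE estimate via a direct O(N^3) solve of (H^H H + noise_var*I) x = H^H y."""
    n_tx = H.shape[1]
    A = H.conj().T @ H + noise_var * np.eye(n_tx)
    return np.linalg.solve(A, H.conj().T @ y)

def mmse_detect_cg(H, y, noise_var, n_iter=50, tol=1e-10):
    """Same MMSE system solved by conjugate gradient, using only matrix-vector products."""
    def matvec(v):
        # Applies (H^H H + noise_var*I) without ever forming or inverting it.
        return H.conj().T @ (H @ v) + noise_var * v

    b = H.conj().T @ y
    x = np.zeros(H.shape[1], dtype=complex)
    r = b - matvec(x)          # residual
    p = r.copy()               # search direction
    rs = np.vdot(r, r).real
    for _ in range(n_iter):
        if np.sqrt(rs) < tol:  # stop once the residual is negligible
            break
        Ap = matvec(p)
        alpha = rs / np.vdot(p, Ap).real
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = np.vdot(r, r).real
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def demo(seed=0, n_rx=64, n_tx=16, noise_var=0.1):
    """Draw one illustrative Rayleigh channel, QPSK vector, and noisy observation."""
    rng = np.random.default_rng(seed)
    H = (rng.standard_normal((n_rx, n_tx))
         + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)
    x = (rng.choice([-1.0, 1.0], n_tx) + 1j * rng.choice([-1.0, 1.0], n_tx)) / np.sqrt(2)
    noise = np.sqrt(noise_var / 2) * (rng.standard_normal(n_rx)
                                      + 1j * rng.standard_normal(n_rx))
    return H, x, H @ x + noise
```

Each conjugate gradient iteration costs two matrix-vector products, so the cubic cost of the direct solve is avoided whenever the number of iterations stays well below the number of antennas; as noted above, that number grows when the channel is ill-conditioned.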
The authors in [
15] demonstrated that traditional AMP algorithms fail to converge in strongly correlated MIMO channels, resulting in a deterioration of the detection performance. To address this, they introduced a Correlation-Aware AMP (CA-AMP) design, which uses adaptive damping and statistical channel knowledge to attain robust convergence and high accuracy across a variety of correlation models. DL has received much attention for MIMO detection owing to its capability to capture complicated nonlinear interactions. The authors in [
16] thoroughly studied feedforward, recurrent, and convolutional neural networks and demonstrated that properly designed models can be highly efficient, surpassing traditional approaches. Nevertheless, such models face several challenges, including the need for large volumes of training data, limited generalization across diverse environments, and the difficulty of interpreting network decisions. The deep unfolding paradigm attempts to bridge the gap between iterative optimization and learning-based detection. In contrast to entirely data-driven architectures, unfolding maps the iterations of an algorithm onto the layers of a neural network with learnable parameters, without harming the structure of the algorithm. This class of models, as illustrated in [
17], represents a trade-off between mathematical rigor and computational efficiency, preserving the convergence properties of the underlying approaches while retaining the ability to adapt to data. Simultaneously, QN methods have been investigated as useful alternatives to second-order optimization methods that avoid expensive Hessian calculations. The authors in [
18] highlighted the numerical stability, convergence, and suitability of various QN methods for high-dimensional optimization problems, including MIMO detection. Building on this, subsequent studies have examined specific algorithms such as BFGS, L-BFGS, and Broyden-type updates in terms of memory efficiency and convergence under noisy channel conditions. BFGS-based methods are highly advantageous in large-scale wireless systems, as they can achieve superlinear convergence using only first-order gradient information. Symbol detection has been modeled as a quadratic optimization problem and solved with QN methods, where convergence is accelerated by BFGS updates [
19]. These methods have been applied to large-scale MIMO systems in a range of variants that exploit MIMO channel structure to simplify computation further and accelerate convergence. Their robustness in the presence of realistic noise has been confirmed by both theoretical analyses and experimental results. Since they converge quickly and have low storage needs, limited-memory QN methods like L-BFGS [
20] are particularly well-suited to large-scale MIMO detection. Adaptive memory allocation schemes further benefit resource-constrained base stations by adapting the stored correction pairs to the channel conditions. The authors in [
20] have also explored the possibility of combining machine learning with QN frameworks to achieve detection performance superior to that of established optimization techniques. The authors in [21] developed hybrid neural QN models by combining parameter adaptability with the resilience of BFGS updates. While maintaining the secant condition, this implementation provides better detection accuracy at reduced processing cost and allows the optimization process to be adjusted using data. Effective training strategies for communication-centric neural networks were discussed in [
22], with emphasis on data design, architecture choice, and optimization methods to improve generalization and reduce training complexity under various fading conditions. Our findings, together with those of related works, demonstrate that scalable, high-performance MIMO detection can be achieved by integrating learning-based approaches with QN optimization. Meanwhile, the enormous cost of retraining DL-based schemes can be mitigated by transfer learning. As shown in [
23], pre-trained models can be adapted to new MIMO setups without retraining, preserving good detection under varying channel conditions while saving training data and computation costs. The proposed QN-DQN method demonstrates substantial improvements across multiple performance dimensions compared to existing state-of-the-art approaches. Traditional methods, such as Linear Minimum Mean Square Error (LMMSE) [
3] and ZF [
13], achieve excellent generalization with single-shot solutions but suffer from high computational complexity
, and moderate inference times between 2.8 and 2.5 ms. Learning-based approaches, including DetNet [
5], GNN-MIMO [
9], and DRL-Precoding [
22,
23] offer improved performance but require extensive training of around 3.2–6.3 h and exhibit
or
memory complexity with slower convergence (5–20 iterations). Recent hybrid methods like L-BFGS [
8,
20] and Hybrid Neural-QN [
21] achieve better memory efficiency
but still require 6.8–5.9 ms inference time. In contrast, the proposed QN-DQN achieves the best BER performance
at 15 dB with significantly reduced training time around 2.1 h, fastest inference, 4.2 ms, lowest memory complexity
, and rapid convergence (4–7 iterations) as shown in
Table 1. Critically, it maintains excellent generalization through QN guarantees, addressing the key limitations of pure DL methods that require retraining for new channel conditions. This comprehensive advantage across training efficiency, inference speed, memory footprint, and adaptability positions QN-DQN as a practical solution for real-time wireless communication systems where computational resources and latency are constrained.
Finally, to bridge the gap between algorithm design and practical deployment, challenges in hardware implementation have also been investigated. The authors in [
24] proposed resource-aware optimization techniques that evaluate trade-offs across CPUs, graphics processing units, and dedicated accelerators to balance hardware efficiency with detection performance. This research aims to develop a hybrid ISD framework for massive MIMO in 6G systems by integrating DQN with QN methods to overcome the limitations of conventional linear MMSE detectors. The proposed approach reduces computational complexity by avoiding explicit high-dimensional matrix inversions, improves detection robustness by combining the stable search directions of QN with the adaptability of DQN, and enables memory-efficient near-optimal detection suitable for practical 6G base station architectures. This study is structured around three main goals that have not received sufficient attention and that build on the limitations noted in previous studies: avoiding the high-dimensional matrix inversions of linear MMSE detectors, combining the stability of QN with the flexibility of DQN, and developing memory-efficient detectors that exceed the performance ceiling of linear methods.
Minimal-complexity ISD: traditional linear methods exhibit intrinsic performance saturation, while linear MMSE detectors require computationally expensive high-dimensional matrix inversions [
1,
4]. To address these challenges, we provide an updated ISD paradigm that avoids explicit matrix inversion by using algebraic restructuring. Furthermore, the iterative technique achieves higher accuracy than its conventional linear counterparts by directly integrating nonlinear components into the detection process.
In wireless communications, QN methods and DQN have been studied separately in [3,6], but their combined use remains largely unexplored. To fill this gap, we create a novel framework, called the QN-Method Network, that embeds QN update rules inside a DQN. By combining the stable search direction of QN methods with the flexibility of DQN, the proposed design accelerates convergence and improves robustness across diverse channel conditions, particularly in massive MIMO scenarios.
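To make the QN side of this combination concrete, the following is a minimal sketch of a plain BFGS iteration applied to the regularized least-squares (MMSE) objective. It is not the QN-Method Network itself: there is no DQN component, a real-valued system model is assumed for brevity, the step size comes from an exact line search that is only available because the objective is quadratic, and the function name `bfgs_mmse_detect` is hypothetical.

```python
import numpy as np

def bfgs_mmse_detect(H, y, noise_var, n_iter=30, tol=1e-9):
    """Minimize 0.5*x^T A x - b^T x with A = H^T H + noise_var*I and b = H^T y.

    Assumption: real-valued H and y. The inverse-Hessian approximation B is
    built from gradient differences only, so A is never inverted explicitly.
    """
    n = H.shape[1]
    b = H.T @ y

    def matvec(v):
        return H.T @ (H @ v) + noise_var * v

    x = np.zeros(n)
    g = matvec(x) - b            # gradient of the quadratic objective
    B = np.eye(n)                # initial inverse-Hessian approximation
    for _ in range(n_iter):
        if np.linalg.norm(g) <= tol * np.linalg.norm(b):
            break
        p = -B @ g               # quasi-Newton search direction
        Ap = matvec(p)
        alpha = -(g @ p) / (p @ Ap)   # exact line search for a quadratic
        s = alpha * p            # step taken
        yk = alpha * Ap          # gradient change, equal to A @ s here
        rho = 1.0 / (yk @ s)
        V = np.eye(n) - rho * np.outer(s, yk)
        # BFGS update; afterwards B @ yk == s (the secant condition).
        B = V @ B @ V.T + rho * np.outer(s, s)
        x = x + s
        g = g + yk
    return x
```

In a learning-augmented variant, quantities such as the step size or a regularization term could be chosen by an agent instead of being computed in closed form; that part is outside this sketch.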
Adaptive detection over spatially correlated channels is a challenging problem, as spatial correlation significantly deteriorates the performance of traditional detectors in m-MIMO systems [5]. To overcome this problem, we propose adaptive detection schemes that dynamically adjust to the correlation structure through a hybrid QN-DQN framework. The proposed strategy attains near-optimal signal detection performance by elevating the performance ceiling of linear detectors while remaining memory-efficient and computationally efficient. Specifically, the DQN agent learns optimal regularization parameters that reduce the effective condition number, thereby enabling memory-efficient implementation under the performance limits of linear detectors.
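The effect of a regularization parameter on the effective condition number can be checked numerically. The sketch below assumes a Kronecker channel model with exponential correlation matrices, which may differ from the correlation model used in this work, and it sweeps the regularization value by hand rather than learning it with a DQN agent. All function names are hypothetical.

```python
import numpy as np

def exp_corr(n, rho):
    """Exponential correlation matrix with entries rho**|i-j| (an assumed model)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def correlated_channel(n_rx, n_tx, rho, rng):
    """Kronecker-correlated Rayleigh channel built from Cholesky factors."""
    W = (rng.standard_normal((n_rx, n_tx))
         + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)
    L_rx = np.linalg.cholesky(exp_corr(n_rx, rho))
    L_tx = np.linalg.cholesky(exp_corr(n_tx, rho))
    return L_rx @ W @ L_tx.conj().T

def effective_cond(H, lam):
    """Condition number of the regularized Gram matrix H^H H + lam*I."""
    G = H.conj().T @ H + lam * np.eye(H.shape[1])
    return np.linalg.cond(G)

rng = np.random.default_rng(0)
H_corr = correlated_channel(32, 16, 0.9, rng)
# Larger regularization shrinks (lambda_max + lam) / (lambda_min + lam).
conds = [effective_cond(H_corr, lam) for lam in (1e-3, 1e-1, 10.0)]
```

Because the condition number of the regularized matrix equals (λ_max + λ)/(λ_min + λ), it decreases monotonically in λ; the price is a growing bias in the estimate, which is the trade-off a learned regularization policy would have to balance.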
4. Simulation Results
In this section, simulations are conducted to verify the performance of the LMMSE baseline and the proposed ISD and DQN detection algorithms in m-MIMO scenarios. The experiments used various m-MIMO channels and antenna configurations across an SNR range from 0 to 20 dB, in 2 dB increments. The simulations employed synthetic channel data generated using the Rician fading model specified in (2), with Rayleigh components following spatial correlation matrices defined in (3). The noise model utilized AWGN with a covariance matrix
, as stated in (1). For conciseness, only two antenna configurations (8 × 8 and 128 × 48) are used. From
Table 3, the critical trade-off between training overhead and runtime performance in deep learning systems is clearly demonstrated, highlighting how larger antenna configurations offer improved detection accuracy but require substantially higher computational and training costs. While the DQN-enhanced QN requires a substantial upfront investment in offline training—10,000 episodes and 100,000 channel realizations—it delivers superior inference efficiency, achieving the fastest average inference time of 4.2 ms and the lowest symbol detection latency of 5.8 ms. In contrast, LMMSE and ISD require no training but suffer from significantly slower inference times, 12.3 ms and 8.7 ms, respectively. The rapid convergence of DQN-enhanced QN further justifies the initial training cost, making it ideal for deployment scenarios where real-time performance is paramount and the training overhead can be amortized across numerous inference operations.
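The simulation setup described at the start of this section can be approximated with the following sketch. Since (1)–(3) are not reproduced here, several choices are assumptions: a rank-one line-of-sight component built from uniform-linear-array steering vectors, exponential spatial correlation for the Rayleigh part, a Rician K-factor of 3, and a per-antenna SNR definition with unit signal power. The function names are hypothetical.

```python
import numpy as np

def exp_corr(n, rho):
    """Exponential correlation matrix rho**|i-j| (assumed correlation model)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def steering(n, angle_rad):
    """Half-wavelength uniform-linear-array steering vector (assumed geometry)."""
    return np.exp(1j * np.pi * np.arange(n) * np.sin(angle_rad))

def rician_channel(n_rx, n_tx, k_factor, rho, rng):
    """Rician channel: deterministic LOS part plus spatially correlated Rayleigh part."""
    los = np.outer(steering(n_rx, 0.3), steering(n_tx, -0.2).conj())
    W = (rng.standard_normal((n_rx, n_tx))
         + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)
    L_rx = np.linalg.cholesky(exp_corr(n_rx, rho))
    L_tx = np.linalg.cholesky(exp_corr(n_tx, rho))
    nlos = L_rx @ W @ L_tx.conj().T
    return (np.sqrt(k_factor / (k_factor + 1)) * los
            + np.sqrt(1 / (k_factor + 1)) * nlos)

def awgn(n_rx, noise_var, rng):
    """Circularly symmetric complex Gaussian noise with variance noise_var per entry."""
    return np.sqrt(noise_var / 2) * (rng.standard_normal(n_rx)
                                     + 1j * rng.standard_normal(n_rx))

def noise_var_from_snr(snr_db):
    """Noise variance for unit signal power (assumed SNR convention)."""
    return 10.0 ** (-snr_db / 10.0)

snr_grid_db = np.arange(0, 21, 2)          # 0, 2, ..., 20 dB
rng = np.random.default_rng(0)
H_large = rician_channel(128, 48, k_factor=3.0, rho=0.5, rng=rng)
```

With these conventions each channel entry has unit average power, so the same SNR grid can be reused for both the 8 × 8 and the 128 × 48 configurations.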
Figure 2 depicts a comparison of the BER performance of the suggested Broyden-Net hybrid approach, which combines limited-memory BFGS (L-BFGS) optimization with a DQN-driven adaptation technique, against a traditional linear detector. The traditional linear detector refers to the LMMSE detector, a standard linear detection scheme widely used in m-MIMO systems. These detectors estimate transmitted signals through linear matrix inversion techniques based on the channel matrix. The MIMO configuration used in
Figure 2 represents m-MIMO systems with two antenna setups: 8 × 8 representing a small-scale MIMO scenario and 128 × 48 representing a large-scale or m-MIMO configuration. The experiments were conducted over a range of SNR.
The conventional linear detector consistently yields higher BER values across the entire SNR range, owing to its sensitivity to channel condition numbers and its limited ability to mitigate multiuser interference and channel non-idealities; the gap is particularly noticeable in the low-to-medium SNR region [
17]. On the other hand, the Broyden-Net hybrid exhibits a more pronounced improvement in the medium-to-high SNR regime (5–15 dB) while maintaining a significantly lower BER. This finding implies that while the learning component adaptively corrects for interference and channel distortions, the QN structure improves convergence stability.
The BER performance shown in
Figure 3 corresponds to a 128 × 48 m-MIMO system operating under Rayleigh fading with QPSK modulation. The LMMSE detector consistently outperforms the other approaches across the entire SNR range, exhibiting a steep decline in BER as SNR increases, reaching values close to
at high SNR. The ISD scheme demonstrates improved performance relative to DQN at higher SNR values beyond 1 dB, but its gains are moderate at low-to-medium SNR, where it remains less effective than LMMSE. Both the proposed DQN-based and ISD-based schemes were evaluated under non-ideal channel estimation and strong inter-user interference, where the linear MMSE detector, LMMSE, provides a performance baseline. The proposed approaches maintain consistent learning stability across the entire SNR range, though their convergence speed and long-term optimization rely on the reward policy adaptation process.
The BER values above
in
Figure 3 reflect the short adaptation budget and the use of adversarial test channels for a fair cross-SNR evaluation, rather than indicating an algorithmic failure. The LMMSE detector is expected to deliver superior immediate performance, as it is a closed-form linear estimator. In contrast, the DQN-based detector achieves stable BER performance in the low-to-medium SNR regime; however, it fails to improve significantly at higher SNR levels, exhibiting performance saturation around
. The superior BER reduction in LMMSE in
Figure 3 can be attributed to its ability to exploit full channel state information through direct matrix inversion, which ensures near-optimal linear signal recovery in Rayleigh fading channels. This allows the LMMSE detector to achieve significantly lower BER at higher SNR values, as noise and interference can be effectively suppressed. On the other hand, ISD and DQN are designed to reduce this complexity by relying on iterative updates or a learned policy rather than exact inversion. While these methods trade some BER performance at high SNR, they offer more accurate and stable results in the low-to-medium SNR regime, where linear detectors often struggle with residual interference. The ISD further refines its estimates with QN updates, giving it better high-SNR adaptability than DQN, which saturates due to the limited generalization of its learned policy. Thus, LMMSE achieves the lowest BER but at the expense of complexity, while ISD and DQN strike a balance between accuracy and computational efficiency.
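For reference, a BER curve for the LMMSE baseline can be estimated by Monte Carlo simulation along the following lines. This is a minimal sketch, not the evaluation code behind Figure 3: it assumes uncorrelated Rayleigh fading, perfect channel knowledge, Gray-mapped QPSK, and an SNR defined as total transmit power over per-antenna noise variance. The function names are hypothetical.

```python
import numpy as np

def qpsk_mod(bits):
    """Map bit pairs to unit-energy QPSK symbols (bit 0 -> +, bit 1 -> -)."""
    b = bits.reshape(-1, 2)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)

def qpsk_demod(symbols):
    """Hard decisions: the sign of the real and imaginary parts gives the bit pair."""
    out = np.empty((symbols.size, 2), dtype=int)
    out[:, 0] = symbols.real < 0
    out[:, 1] = symbols.imag < 0
    return out.reshape(-1)

def lmmse_ber(n_rx, n_tx, snr_db, n_trials=200, seed=0):
    """Monte Carlo BER of the LMMSE detector over i.i.d. Rayleigh channels."""
    rng = np.random.default_rng(seed)
    noise_var = n_tx / 10.0 ** (snr_db / 10.0)   # assumed SNR convention
    errors, total = 0, 0
    for _ in range(n_trials):
        H = (rng.standard_normal((n_rx, n_tx))
             + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)
        bits = rng.integers(0, 2, size=2 * n_tx)
        x = qpsk_mod(bits)
        noise = np.sqrt(noise_var / 2) * (rng.standard_normal(n_rx)
                                          + 1j * rng.standard_normal(n_rx))
        y = H @ x + noise
        A = H.conj().T @ H + noise_var * np.eye(n_tx)
        x_hat = np.linalg.solve(A, H.conj().T @ y)   # LMMSE estimate
        errors += np.count_nonzero(qpsk_demod(x_hat) != bits)
        total += bits.size
    return errors / total
```

Calling `lmmse_ber(8, 8, snr_db)` for each point of the SNR grid yields one curve; many more trials than the default are needed to resolve low BER values reliably.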
Figure 4a corresponds to the BER performance of an 8 × 8 Rayleigh fading MIMO system under QPSK modulation. The DQN-based detector achieves superior performance across the full SNR range when compared to both LMMSE and ISD. At low-to-moderate SNR values, the DQN curve demonstrates a consistently lower BER, indicating that the learning framework can effectively exploit the statistical structure of the channel without incurring excessive complexity. The ISD detector outperforms LMMSE in mid-to-high SNR regimes, but its iterative refinement process introduces a slight performance gap relative to the DQN-enhanced scheme. At high SNR, both DQN and ISD converge to near-optimal BER values, whereas LMMSE exhibits an error floor due to its linear approximation limits. These results highlight the significant advantage of leveraging QN optimization integrated with DQN to mitigate nonlinear interference effects, thereby providing robust detection in m-MIMO scenarios.
Figure 4b extends the analysis to 16-QAM modulation in the same 8 × 8 Rayleigh fading channel, where the higher-order modulation introduces increased symbol density and greater susceptibility to noise and interference. In this case, the performance gap between LMMSE, ISD, and DQN narrows, as all three detectors face greater difficulty in resolving closely spaced constellation points. Nonetheless, the DQN approach maintains a marginally better BER across the SNR range, confirming its adaptability to modulation complexity. ISD offers competitive performance relative to LMMSE, demonstrating that iterative refinement provides benefits even under dense modulation schemes. However, the resilience of DQN in maintaining lower BER emphasizes its potential as a scalable and generalizable solution. Together,
Figure 4a,b confirm that the proposed DQN-enhanced detection framework achieves substantial performance gains in low-order modulations and remains robust under higher-order constellations, positioning it as a key enabler of reliable and efficient m-MIMO systems. In
Figure 4, the DQN-based detector achieves consistently lower BER across all SNR ranges for both QPSK and 16-QAM modulations in the 8 × 8 configuration, outperforming traditional LMMSE and ISD methods. This performance aligns with that of deep unfolding approaches [
16,
17] and GNN-MIMO [
9], while maintaining significantly reduced computational complexity. The average reward performance of LMMSE, ISD, and the proposed DQN-based scheme is illustrated in
Figure 5, where all methods exhibit rapid early gains (
) before stabilizing beyond
iterations. Although LMMSE converges quickly, its reward saturates due to the limitation of linear processing. ISD attains higher steady-state rewards through iterative refinements consistent with the adaptive step-size rule. The DQN-based framework, however, achieves ISD-level rewards with significantly lower computational overhead by integrating QN optimization and reinforcement learning. Specifically, the state representation, the Q-function in (14), and the interference mitigation in (18) collectively reduce the convergence time
by up to 60%, while the parallel update structure and stability loss function in (20) ensure scalability and resilience against correlated fading.
Figure 5 confirms that the proposed DQN-based approach achieves near-ISD steady-state performance with markedly reduced complexity, making it a practical and robust solution for real-time m-MIMO systems.
Figure 6 illustrates the loss convergence of the considered approaches, where all algorithms exhibit a sharp decline in loss within the first 50 iterations, with the inset further emphasizing their rapid convergence behavior. Although the LMMSE detector provides stable performance, its reliance on computationally expensive matrix inversion fundamentally limits scalability. In contrast, both ISD and the proposed DQN-based framework achieve consistently lower steady-state losses, with DQN demonstrating enhanced robustness in mitigating residual errors. This improvement can be attributed to the adaptive state representation and the step-size adjustment rule in (13), which enable the DQN agent to regulate convergence behavior under varying channel conditions dynamically. Furthermore, the Broyden-Net approximation of the Q-function, combined with the reward function design in (15), ensures a balance between convergence speed and detection accuracy, effectively reducing real-time training overhead.
Figure 5 reveals that the proposed method achieves ISD-level steady-state rewards with a 60% reduction in convergence time through adaptive step-size control, comparable to the hybrid neural-QN approach [
21], but with faster inference (4.2 ms vs. 5.9 ms).
The robustness of the hybrid QN–DQN approach is supported by the conditioning inequality, which confirms improved stability against correlated fading through a reduction in the effective condition number of
. Memory-efficient updates via limited-memory BFGS in (17) maintain computational feasibility, while the adaptive interference mitigation in (18) demonstrates the framework’s capability to elevate the performance ceiling of linear detectors. Finally, the loss formulation in (20) integrates reconstruction accuracy with condition-number regularization, further stabilizing detection in large-scale scenarios and reducing convergence time by a factor of
. Collectively, these mechanisms confirm that the integration of QN techniques with DQN not only achieves nearly identical low-loss steady states as ISD but also enhances convergence adaptability, reduces latency, and preserves memory efficiency, thereby meeting the stringent requirements of m-MIMO detection in 6G systems.
Figure 7 illustrates the convergence behavior of the ISD algorithm under different step-size parameters
evaluated in terms of both training and testing accuracy across optimization steps. Larger values of
facilitate faster convergence, allowing the training accuracy to surpass 90% within the first 400 iterations, while also achieving high testing accuracy above 85%. Conversely, smaller step sizes
result in slower learning, where both training and testing accuracies increase gradually and saturate at comparatively lower values. This demonstrates the critical role of parameter selection in balancing convergence speed and generalization in ISD-based MIMO detection. Furthermore, the performance gap between training and testing curves across all settings remains consistently small, which indicates that the algorithm generalizes effectively without significant overfitting. The steady improvement of test accuracy across iterations underlines the robustness of the DQN-assisted optimization in handling channel variations. The proposed ISD framework successfully balances scalability, detection accuracy, and computational efficiency, which are essential for addressing the stringent requirements of 6G with ultra-large antenna arrays.
Figure 8 illustrates the computational complexity of three different detection algorithms, LMMSE, ISD, and DQN, as a function of the number of base station antennas
. It can be observed that the LMMSE detector, although widely used in conventional MIMO systems, exhibits the steepest growth in complexity, increasing from approximately
to beyond
operations as
increases. This trend underscores the impracticality of LMMSE in large-scale m-MIMO deployments, where the dimensionality and matrix inversion operations become computationally prohibitive. The ISD algorithm achieves noticeable complexity reduction compared to LMMSE, maintaining lower growth across the entire antenna range, owing to its iterative structure that avoids full matrix inversion. However, its scaling behavior still suggests that it may not be optimal for extremely large antenna arrays envisioned in 6G scenarios.
From
Figure 8, the DQN-based approach demonstrates a significantly flatter complexity curve, operating at least an order of magnitude lower than both LMMSE and ISD across all tested antenna sizes. This efficiency arises from the integration of QN optimization with DL, allowing the system to adaptively refine detection without the heavy computational overhead of classical algorithms. The figure thus confirms that DQN not only ensures scalability with increasing antennas but also bridges the gap between accuracy and real-time implementation. Computational complexity in Floating-Point Operations (FLOPs) indicates the processing effort required, while memory usage reflects the storage needed during execution [
28]. The proposed QN-DQN achieves significantly lower complexity
(
FLOPs) compared to LMMSE’s
FLOPs, demonstrating superior efficiency and suitability for real-world deployment, as shown in
Table 4.
The computational complexity of the hybrid algorithm scales as , where is the average convergence time. The DQN agent reduces by a factor of compared to fixed step-size methods, while the L-BFGS updates maintain memory requirements at ) instead of . The DQN-enhanced scheme emerges as a viable solution, combining low complexity with robustness in high-dimensional detection tasks.
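The memory argument for limited-memory updates can be illustrated with the standard L-BFGS two-loop recursion. The sketch below is a generic textbook version applied to the MMSE quadratic, not the update in (17): it assumes a real-valued system model, a fixed memory of a few correction pairs, and an exact line search in place of a learned step size. The function names are hypothetical.

```python
import numpy as np
from collections import deque

def lbfgs_direction(g, pairs):
    """Two-loop recursion: applies the implicit inverse-Hessian estimate to g.

    `pairs` holds (s, y) correction pairs, oldest first. Only these vectors
    are stored, never an n-by-n matrix.
    """
    q = g.copy()
    alphas = []
    for s, yv in reversed(pairs):            # newest to oldest
        a = (s @ q) / (yv @ s)
        q -= a * yv
        alphas.append(a)
    if pairs:                                # initial scaling from the newest pair
        s, yv = pairs[-1]
        q *= (s @ yv) / (yv @ yv)
    for (s, yv), a in zip(pairs, reversed(alphas)):   # oldest to newest
        b = (yv @ q) / (yv @ s)
        q += (a - b) * s
    return -q

def lbfgs_mmse_detect(H, y, noise_var, memory=5, n_iter=200, tol=1e-9):
    """L-BFGS on 0.5*x^T A x - b^T x with A = H^T H + noise_var*I, b = H^T y."""
    def matvec(v):
        return H.T @ (H @ v) + noise_var * v

    b = H.T @ y
    x = np.zeros(H.shape[1])
    g = -b.copy()                            # gradient at x = 0
    pairs = deque(maxlen=memory)             # oldest pairs are dropped automatically
    for _ in range(n_iter):
        if np.linalg.norm(g) <= tol * np.linalg.norm(b):
            break
        p = lbfgs_direction(g, list(pairs))
        Ap = matvec(p)
        alpha = -(g @ p) / (p @ Ap)          # exact line search for a quadratic
        s, yk = alpha * p, alpha * Ap        # step and gradient change (A @ s)
        x, g = x + s, g + yk
        pairs.append((s, yk))
    return x

def stored_floats(n, memory):
    """Numbers kept by L-BFGS (2*memory*n) versus a dense BFGS matrix (n*n)."""
    return 2 * memory * n, n * n
```

With `memory` correction pairs of length `n`, the storage grows linearly in the number of antennas, whereas a dense inverse-Hessian approximation grows quadratically; for example, `stored_floats(48, 5)` compares 480 stored values against 2304.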