Design and VLSI Implementation of a Reduced-Complexity Sorted QR Decomposition for High-Speed MIMO Systems
Abstract
1. Introduction
- We propose a reduced-complexity sorted QR decomposition (SQRD) algorithm based on a novel modified real-value decomposition (RVD). Compared with the latest related study, the proposed SQRD algorithm reduces the computational complexity by more than 44.7%, while keeping a competitive BER performance and remaining implementation-friendly;
- We design a deeply pipelined SQRD hardware architecture with a time-sharing Givens rotation (GR) structure for 4 × 4 MIMO systems. The time-sharing GR structure reuses CORDIC modules that would otherwise sit idle to process rotation operations that would otherwise require additional CORDIC modules. This saves hardware and improves the hardware efficiency of the proposed SQRD design.
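To make the two CORDIC operating modes mentioned above concrete, the following is a minimal floating-point sketch of a textbook CORDIC, not the authors' hardware design. The function names, the 16-iteration default, and the idea of replaying the recorded micro-rotation directions in rotation mode are assumptions made for illustration; a real implementation would use fixed-point shifts and adds.

```python
import math


def cordic_gain(n_iter):
    """Accumulated scaling factor of n_iter CORDIC micro-rotations."""
    gain = 1.0
    for i in range(n_iter):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return gain


def cordic_vectoring(x, y, n_iter=16):
    """Vectoring mode: rotate (x, y) onto the positive x-axis.

    Returns (magnitude, flipped, dirs). `flipped` and `dirs` record the
    rotation so that it can be replayed on other operands.
    """
    flipped = x < 0  # pre-rotate by 180 degrees so the iteration converges
    if flipped:
        x, y = -x, -y
    dirs = []
    for i in range(n_iter):
        d = -1 if y > 0 else 1  # always rotate toward y = 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        dirs.append(d)
    return x / cordic_gain(n_iter), flipped, dirs


def cordic_rotation(x, y, flipped, dirs):
    """Rotation mode: apply a previously recorded rotation to (x, y)."""
    if flipped:
        x, y = -x, -y
    for i, d in enumerate(dirs):
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
    gain = cordic_gain(len(dirs))
    return x / gain, y / gain
```

In a Givens rotation, vectoring mode is applied to the pair of elements that defines the rotation, and rotation mode applies the same rotation to the remaining pairs in the two rows.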
2. Background
2.1. MIMO Detection Model
2.2. Related Studies
3. Proposed SQRD Algorithm with a Novel Modified RVD
3.1. Proposed Modified RVD
3.2. The Proposed SQRD Algorithm
Algorithm 1 Proposed SQRD algorithm
INPUT: , ,
OUTPUT: , ,
Step 1: Sorted complex Givens rotation (SCGR)
1:
2: for
3:
4: end
5: for
6:
7: Swap , , and with , , and , respectively
8: Perform complex Givens rotation in vectoring mode on the elements of in pairs
9: for
10: Perform complex Givens rotation in rotation mode on the elements of in pairs
11: end
12: for
13:
14: end
15: end
Step 2: Proposed modified RVD
16: for
17: for
18:
19:
20: end
21:
22:
23: end
Step 3: Real Givens rotation
24: for
25: Perform real Givens rotation in vectoring mode on
26: for
27: Perform real Givens rotation in rotation mode on
28: end
29: end
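Because the symbols of Algorithm 1 did not survive extraction, the following sketch illustrates only the general idea behind Step 1: a conventional sorted QR decomposition that, at each stage, moves the remaining column with the smallest residual energy to the front and then annihilates the sub-diagonal entries with complex Givens rotations. It is a generic baseline written in plain Python, not the proposed reduced-complexity algorithm; it omits the modified RVD and the real Givens rotation steps, and the function name and return convention are assumptions.

```python
def sorted_qr_givens(H):
    """Sorted QR of a complex matrix given as a list of rows.

    Returns (Q, R, perm) such that column perm[j] of H equals column j
    of the product Q @ R, with R upper triangular.
    """
    m, n = len(H), len(H[0])
    R = [[complex(v) for v in row] for row in H]
    # Qh accumulates Q^H, starting from the identity matrix.
    Qh = [[complex(i == j) for j in range(m)] for i in range(m)]
    perm = list(range(n))
    for k in range(n):
        # Sorting: bring the remaining column with the smallest
        # residual energy (rows k and below) to position k.
        energy = [sum(abs(R[i][j]) ** 2 for i in range(k, m))
                  for j in range(k, n)]
        j = k + energy.index(min(energy))
        for row in R:
            row[k], row[j] = row[j], row[k]
        perm[k], perm[j] = perm[j], perm[k]
        # Vectoring: zero R[i][k] for every row i below the pivot row k.
        for i in range(m - 1, k, -1):
            a, b = R[k][k], R[i][k]
            r = (abs(a) ** 2 + abs(b) ** 2) ** 0.5
            if r == 0.0:
                continue
            c, s = a / r, b / r
            # Rotation: apply the same 2x2 unitary to rows k and i
            # of both R and the accumulated Q^H.
            for M in (R, Qh):
                for col in range(len(M[0])):
                    top, bot = M[k][col], M[i][col]
                    M[k][col] = c.conjugate() * top + s.conjugate() * bot
                    M[i][col] = -s * top + c * bot
    Q = [[Qh[j][i].conjugate() for j in range(m)] for i in range(m)]
    return Q, R, perm
```

The returned `perm` records the detection order implied by the sorting, so a detector built on `R` has to undo the permutation when mapping detected symbols back to antennas.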
3.3. Performance Evaluation of the Proposed SQRD Algorithm
4. Proposed SQRD VLSI Architecture
4.1. Overview of the Proposed SQRD Hardware Architecture
4.2. Processing Engines
4.3. Time-Sharing Givens Rotation Structure
4.4. CORDIC Architecture
4.5. Sorting
5. Implementation Results and Comparisons
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
Appendix A
Appendix B
References
- IEEE Draft Standard for Information Technology—Telecommunications and Information Exchange between Systems Local and Metropolitan Area Networks—Specific Requirements Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications Amendment Enhancements for High Efficiency WLAN. Available online: https://ieeexplore.ieee.org/document/8424259 (accessed on 31 July 2018).
- Tsai, P.Y.; Lo, P.C.; Shih, F.J.; Jau, W.J.; Huang, M.Y.; Huang, Z.Y. A 4 × 4 MIMO-OFDM Baseband Receiver with 160 MHz Bandwidth for Indoor Gigabit Wireless Communications. IEEE Trans. Circuits Syst. I Regul. Pap. 2015, 62, 2929–2939.
- Yan, Z.T.; He, G.H.; Ren, Y.F.; He, W.F.; Jiang, J.F.; Mao, Z.G. Design and Implementation of Flexible Dual-Mode Soft-Output MIMO Detector with Channel Preprocessing. IEEE Trans. Circuits Syst. I Regul. Pap. 2015, 62, 2706–2717.
- Wubben, D.; Bohnke, R.; Rinas, J.; Kuhn, V.; Kammeyer, K.D. Efficient algorithm for decoding layered space-time codes. Electron. Lett. 2001, 37, 1348–1350.
- Rakesh, G.; Ove, E.; Liu, L. An Adaptive QR Decomposition Processor for Carrier-Aggregated LTE-A in 28-nm FD-SOI. IEEE Trans. Circuits Syst. I Regul. Pap. 2017, 64, 1914–1926.
- Dongyeob, S.; Jongsun, P. A Low-Latency and Area-Efficient Gram–Schmidt-Based QRD Architecture for MIMO Receiver. IEEE Trans. Circuits Syst. I Regul. Pap. 2018, 65, 2606–2616.
- Patel, D.; Shabany, M.; Gulak, P.G. A low-complexity high-speed QR decomposition implementation for MIMO receivers. In Proceedings of the IEEE International Symposium Circuits and Systems, Taipei, Taiwan, 24–27 May 2009; pp. 33–36.
- Ren, Y.F.; He, G.H.; Ma, J. High-throughput sorted MMSE QR decomposition for MIMO detection. In Proceedings of the IEEE International Symposium on Circuits and Systems, Seoul, Korea, 20–23 May 2012; pp. 2845–2848.
- Lee, H.; Oh, K.; Jang, M.C.Y. Efficient Low-Latency Implementation of CORDIC-Based Sorted QR Decomposition for Multi-Gbps MIMO Systems. IEEE Trans. Circuits Syst. II Express Briefs 2018, 65, 1375–1379.
- Wieringen, W.N. Lecture Notes on Ridge Regression. Available online: https://arxiv.org/abs/1509.09169 (accessed on 30 September 2015).
- Martino, L.; Read, J. Joint Introduction to Gaussian Processes and Relevance Vector Machines with Connections to Kalman Filtering and other Kernel Smoothers. Available online: https://arxiv.org/abs/2009.09217 (accessed on 19 September 2020).
- Burg, A. VLSI Circuits for MIMO Communication Systems. Ph.D. Thesis, ETH, Zürich, Switzerland, 2006.
- Krishnamoorthy, A.; Menon, D. Matrix inversion using Cholesky decomposition. In Proceedings of the 2013 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), Poznan, Poland, 26–28 September 2013; pp. 70–72.
- Chen, Y.J.; Halbauer, H.; Jeschke, M.; Richter, R. An efficient Cholesky Decomposition based multiuser MIMO detection algorithm. In Proceedings of the 21st Annual IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, Poznan, Poland, 26–30 September 2010; pp. 499–503.
- Peter, L.; Andreas, B.; Haene, S.; Perels, D.; Felber, N.; Fichtner, W. VLSI Implementation of a High-Speed Iterative Sorted MMSE QR Decomposition. In Proceedings of the 2007 IEEE International Symposium on Circuits and Systems, New Orleans, LA, USA, 27–30 May 2007; pp. 1421–1424.
- Peter, L.; Studer, C.; Duetsch, S.; Zgraggen, E.; Kaeslin, H.; Felber, N. Gram-Schmidt-based QR decomposition for MIMO detection: VLSI implementation and comparison. In Proceedings of the 2008 IEEE Asia Pacific Conference on Circuits and Systems, Macao, China, 30 November–3 December 2008; pp. 830–833.
- Miyaoka, Y.; Nagao, Y.; Kurosaki, M.; Ochi, H. Sorted QR decomposition for high-speed MMSE MIMO detection based wireless communication systems. In Proceedings of the 2012 IEEE International Symposium on Circuits and Systems, Seoul, Korea, 20–23 May 2012; pp. 2857–2860.
- Liao, C.; Wang, J.; Huang, Y. A 3.1 Gb/s 8×8 Sorting Reduced K-Best Detector with Lattice Reduction and QR Decomposition. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2014, 22, 2675–2688.
- Zhang, C.; Prabhu, H.; Liu, Y.; Liu, L.; Edfors, O.; Öwall, V. Energy Efficient Group-Sort QRD Processor With On-Line Update for MIMO Channel Pre-Processing. IEEE Trans. Circuits Syst. I Regul. Pap. 2015, 62, 1220–1229.
- Tseng, T.; Shen, C. Design and implementation of a high-throughput configurable pre-processor for MIMO detections. Microelectron. J. 2018, 72, 14–23.
- Chen, W.; Guenther, D.; Shen, C.; Ascheid, G. Design and implementation of a low-latency, high-throughput sorted QR decomposition circuit for MIMO communications. In Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, Jeju, Korea, 25–28 October 2016; pp. 277–280.
- Lin, J.S.; Hwang, Y.T.; Fang, S.H.; Chu, P.H.; Shieh, M.D. Low-Complexity High-Throughput QR Decomposition Design for MIMO Systems. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2015, 23, 2342–2346.
- Guo, Z.; Nilson, P.A. 53.3 Mb/s 4×4 16-QAM MIMO decoder in 0.35µm CMOS. In Proceedings of the IEEE International Symposium Circuits and Systems, Kobe, Japan, 23–26 May 2005; pp. 4947–4950.
- Huang, Z.Y.; Tsai, P.Y. Efficient Implementation of QR Decomposition for Gigabit MIMO-OFDM Systems. IEEE Trans. Circuits Syst. I Regul. Pap. 2011, 58, 2531–2542.
- Alexander, M.; Vladimir, P.; Roman, M.; Alexey, K. Triangular systolic array with reduced latency for QR-decomposition of complex matrices. In Proceedings of the IEEE International Symposium on Circuits and Systems, Island of Kos, Greece, 21–24 May 2006; pp. 385–388.
Algorithm | Number of CORDIC Operations |
---|---|
[9] | 134 |
The proposed | 74 |
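As a quick check, the relative saving implied by the two CORDIC operation counts in this table can be worked out directly; it agrees with the reduction of more than 44.7% stated in the contributions.

```latex
\frac{134 - 74}{134} = \frac{60}{134} \approx 0.4478 \quad (\text{about } 44.8\%)
```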
Items | This Study | [9] | [20] | [3] | [8] | [19] |
---|---|---|---|---|---|---|
Antennas | 4 × 4 | 4 × 4 | 4 × 4 | 4 × 4 | 4 × 4 | 4 × 4 |
Algorithm | GR | GR | GR | GR | GR | GR |
Technology | 55 nm | 65 nm | 0.18 µm | 65 nm | 0.13 µm | 90 nm |
Clock frequency (Hz) | 250 M | 243.9 M | 116.3 M | 550 M | 200 M | 220 M |
Gate count | 176.5 K | 278 K | 437.5 K | 468 K | 299 K | 375.1 K |
Processing latency (ns) | 236 | 266.5 | - | 625.5 | 685-775 | 654.5 |
SQRD Processing cycles | 4 | 5 | 4 | 8 | 8 | 5 |
SQRD throughput (SQRD/s) | 62.5 M | 48.8 M | 29 M | 68.75 M | 25 M | 44 M |
SQRD NHE 1 | 0.354 | 0.207 | 0.217 | 0.177 | 0.198 | 0.192 |
Processing cycles | 1 | 1.25 | 4 | - | - | 5 |
throughput (/s) | 250 M | 195 M | 29 M | - | - | 44 M |
NHE | 1.416 | 0.829 | 0.217 | - | - | 0.192 |
MIMO data throughput 2 | 6 Gbps | 4.7 Gbps | 696 Mbps | - | - | 1 Gbps |
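The SQRD throughput row of this table appears to follow from dividing the frequency given in Hz by the number of SQRD processing cycles. The short script below is a consistency check under that assumption, with the table values copied in by hand; the names `DESIGNS`, `sqrd_throughput`, and `ASSUMED_BITS_PER_VECTOR` are hypothetical. The factor of 24 bits per processed vector is likewise an assumption (for example, 4 streams × 6 bits per symbol), since the footnote that defines the MIMO data throughput is not reproduced here.

```python
# (frequency in Hz, SQRD processing cycles, reported SQRD throughput in SQRD/s)
DESIGNS = {
    "This study": (250e6, 4, 62.5e6),
    "[9]": (243.9e6, 5, 48.8e6),
    "[20]": (116.3e6, 4, 29e6),
    "[3]": (550e6, 8, 68.75e6),
    "[8]": (200e6, 8, 25e6),
    "[19]": (220e6, 5, 44e6),
}

# Assumption: bits carried by one processed vector (e.g. 4 streams x 6 bits).
ASSUMED_BITS_PER_VECTOR = 24


def sqrd_throughput(clock_hz, cycles):
    """Decompositions per second if one SQRD completes every `cycles` clocks."""
    return clock_hz / cycles


for name, (clock_hz, cycles, reported) in DESIGNS.items():
    computed = sqrd_throughput(clock_hz, cycles)
    print(f"{name}: computed {computed / 1e6:.2f} M, reported {reported / 1e6:.2f} M")

# Data throughput of this study from its second throughput row (250 M per second).
print(f"This study: {250e6 * ASSUMED_BITS_PER_VECTOR / 1e9:.1f} Gbps")
```

Under these assumptions the computed values match the reported ones to within rounding.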
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Sun, L.; Wu, B.; Ye, T. Design and VLSI Implementation of a Reduced-Complexity Sorted QR Decomposition for High-Speed MIMO Systems. Electronics 2020, 9, 1657. https://doi.org/10.3390/electronics9101657