Dual-Ascent-Inspired Transformer for Compressed Sensing
Abstract
1. Introduction
2. Related Work
2.1. Deep Unfolding Networks
2.2. Vision Transformer
2.3. Dual-Ascent Method
3. Proposed Method
3.1. Inertial-Dual-Ascent Form of AADMM
3.2. Dual-Ascent-Inspired Transformer
3.2.1. Overall Architecture
- Compared to Equation (20), Equation (24) replaces the fixed coefficient with a learnable convolutional layer, which encodes the residual of the dual variables between adjacent iteration layers. Additionally, it replaces the linear addition operation with a Cross Attention module, which integrates the result of the gradient descent operator with the residual of the dual variables. This design more effectively incorporates inertial information from the dual space into the primal variable updates, thereby facilitating accelerated convergence while ensuring global stability;
- Equation (25) represents a proximal mapping step, which essentially functions as a high-pass filter in the sparse space. Following widely adopted practices in Deep Unfolding Networks (DUNs) [14,15,20], we replace the fixed matrix mappings in Equation (25) with convolutional layers containing learnable parameters, allowing for a more accurate transformation between the intermediate variable space and the sparse domain. Additionally, we employ the GELU activation function [31], which shares a similar role with the soft-thresholding function, to perform filtering. This process is encapsulated in the High-pass Filter (HPF) module;
- In Equations (26) and (27), we introduce as a fixed input to the right side of the Cross Attention module, forming a unified encoder that integrates the residual term into the dual variable update through the Dual Ascent (DA) module. Notably, this encoder not only ensures dimension consistency between the primal and dual variables but also reuses the parameters of the Cross Attention module, avoiding two independent CA modules. This design reduces computational complexity and memory consumption while providing better encoding performance than the fixed-coefficient approach used in Equation (22). A schematic sketch of one such unrolled iteration stage is given after this list.
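To make the wiring of these modules concrete, the following PyTorch sketch assembles one unrolled stage in the spirit described above. It is a minimal sketch, not the authors' implementation: the module names (`CrossAttention`, `HighPassFilter`, `DATStage`), the single-head attention, the 3×3 convolution sizes, the learnable step size, and the choice of which tensor feeds the query versus the key/value side are all assumptions, since Equations (20)–(27) are not reproduced in this excerpt.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossAttention(nn.Module):
    """Single-head cross attention over flattened feature maps: the query comes
    from one stream and the key/value pair from the other, so information from
    the two streams can be fused."""

    def __init__(self, channels: int):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x_q: torch.Tensor, x_kv: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x_q.shape
        q = self.q(x_q).flatten(2).transpose(1, 2)    # (B, HW, C)
        k = self.k(x_kv).flatten(2).transpose(1, 2)   # (B, HW, C)
        v = self.v(x_kv).flatten(2).transpose(1, 2)   # (B, HW, C)
        attn = torch.softmax(q @ k.transpose(1, 2) / c ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return self.proj(out)


class HighPassFilter(nn.Module):
    """HPF module: learnable convolutions around a GELU nonlinearity, standing
    in for the soft-thresholding/filtering role of Equation (25)."""

    def __init__(self, channels: int):
        super().__init__()
        self.analysis = nn.Conv2d(channels, channels, 3, padding=1)
        self.synthesis = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.synthesis(F.gelu(self.analysis(x)))


class DATStage(nn.Module):
    """One unrolled iteration: a primal update (Cross Attention + HPF) followed
    by a dual update (DA module) that reuses the same Cross Attention weights."""

    def __init__(self, channels: int):
        super().__init__()
        # Learnable convolution replacing the fixed inertial coefficient.
        self.residual_encoder = nn.Conv2d(channels, channels, 3, padding=1)
        self.ca = CrossAttention(channels)   # shared by primal and dual updates
        self.hpf = HighPassFilter(channels)
        self.step = nn.Parameter(torch.tensor(0.1))  # dual-ascent step size

    def forward(self, grad_result, dual, dual_prev):
        # Primal update: fuse the gradient-descent result with the encoded
        # residual of the dual variables via cross attention (cf. Eq. (24)),
        # then apply the high-pass filtering step (cf. Eq. (25)).
        dual_residual = self.residual_encoder(dual - dual_prev)
        x = self.ca(grad_result, dual_residual)
        x = x + self.hpf(x)

        # Dual update (cf. Eqs. (26) and (27)): the same CA module acts as the
        # encoder that carries primal-space information into the dual ascent.
        dual_next = dual + self.step * self.ca(dual, x)
        return x, dual_next
```

Note that `DATStage` instantiates a single `CrossAttention` and calls it twice, mirroring the parameter reuse between the primal update and the DA module described above; a full network would stack several such stages inside the unrolled loop.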
3.2.2. Sampling and Initial Reconstruction
3.2.3. Cross Attention
3.2.4. High-Pass Filter
3.2.5. Dual Ascent
- Since the dimensions of the primal and dual spaces differ, to integrate information from the primal space into the dual space, we require an encoder with a dimension-matching mechanism. The dimensions of and are always the same, so using as a fixed input naturally matches the dimensions of the primal variable with those of the dual space via the preceding CA module. Additionally, this approach leverages the information lost during down-sampling via the operator, resulting in better performance than using fixed coefficients as the encoder;
- We note that, in Equation (22), the dual-ascent term requires the fixed coefficients and to be positive-definite. Our DA module employs an attention mechanism with identical K and V inputs. To guarantee strict positive definiteness when the encoder operates on the dual-ascent term, one option is the linear attention mechanism [34], which produces the linear attention variable . The process is outlined as follows: assume that for feature maps , it follows that , for ; the resulting coefficient term is clearly positive-definite. Since prior research [34] shows that the standard attention mechanism can achieve performance equal or even superior to that of linear attention, we argue that the encoder derived from the standard attention mechanism can approximately maintain positive definiteness, thereby supporting the preservation of the dual-ascent method’s acceleration property in DAT. A numerical illustration of the underlying positive-definiteness argument is given after this list.
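The positive-definiteness claim can be checked numerically. The snippet below is only an illustration under stated assumptions: it uses the elu(x) + 1 feature map chosen in [34], and it assumes that, with identical K and V inputs, the coefficient in question takes the Gram-matrix form phi(X) phi(X)^T; the exact symbols of the paper's derivation are not reproduced in this excerpt, and the token and feature dimensions below are arbitrary.

```python
import torch
import torch.nn.functional as F


def phi(x: torch.Tensor) -> torch.Tensor:
    """Positive feature map used by linear attention in [34]: elu(x) + 1 > 0."""
    return F.elu(x) + 1.0


torch.manual_seed(0)
tokens = torch.randn(16, 64)   # 16 tokens with 64-dimensional features (assumed shapes)

# When the K and V inputs are identical, the linear-attention coefficient acting
# on the value stream reduces to the Gram matrix phi(X) phi(X)^T, which is
# symmetric positive semi-definite by construction and positive definite
# whenever phi(X) has full row rank.
feat = phi(tokens)             # (16, 64), strictly positive entries
coeff = feat @ feat.T          # (16, 16) Gram matrix

eigvals = torch.linalg.eigvalsh(coeff)
print(f"smallest eigenvalue: {eigvals.min().item():.4e}")   # strictly positive here
assert torch.allclose(coeff, coeff.T)                        # symmetric
assert (eigvals > 0).all()                                   # positive definite for this draw
```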
3.2.6. Loss Function
4. Experimental Results
4.1. Details of Implementation
4.2. Early-Stage Performance
4.2.1. Training Process Analysis
4.2.2. Comparison of Reconstructions
4.2.3. Comparison of Convergences and Time Complexities
4.2.4. Comparison of Model Sizes
4.3. Influence of Initial Learning Rate
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Donoho, D.L. Compressed sensing. IEEE Trans. Inf. Theory 2006, 52, 1289–1306.
2. Candes, E.J.; Tao, T. Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Trans. Inf. Theory 2006, 52, 5406–5425.
3. Candès, E.J.; Romberg, J.; Tao, T. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 2006, 52, 489–509.
4. Lustig, M.; Donoho, D.; Pauly, J.M. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magn. Reson. Med. 2007, 58, 1182–1195.
5. Duarte, M.F.; Davenport, M.A.; Takhar, D.; Laska, J.N.; Sun, T.; Kelly, K.F.; Baraniuk, R.G. Single-pixel imaging via compressive sampling. IEEE Signal Process. Mag. 2008, 25, 83–91.
6. Ma, J.; Zhou, H.; Zhao, J.; Gao, Y.; Jiang, J.; Tian, J. Robust feature matching for remote sensing image registration via locally linear transforming. IEEE Trans. Geosci. Remote Sens. 2015, 53, 6469–6481.
7. Mallat, S. A Wavelet Tour of Signal Processing; Elsevier: Amsterdam, The Netherlands, 1999.
8. Tibshirani, R.J. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 1996, 58, 267–288.
9. Kulkarni, K.; Lohit, S.; Turaga, P.; Kerviche, R.; Ashok, A. ReconNet: Non-iterative reconstruction of images from compressively sensed measurements. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 449–458.
10. Sun, Y.; Chen, J.; Liu, Q.; Liu, B.; Guo, G. Dual-path attention network for compressed sensing image reconstruction. IEEE Trans. Image Process. 2020, 29, 9482–9495.
11. Zhang, J.; Ghanem, B. ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1828–1837.
12. Chen, B.; Song, J.; Xie, J.; Zhang, J. Deep physics-guided unrolling generalization for compressed sensing. Int. J. Comput. Vis. 2023, 131, 2864–2887.
13. Chen, B.; Zhang, Z.; Li, W.; Zhao, C.; Yu, J.; Zhao, S.; Chen, J.; Zhang, J. Invertible diffusion models for compressed sensing. IEEE Trans. Pattern Anal. Mach. Intell. 2024.
14. Song, J.; Mou, C.; Wang, S.; Ma, S.; Zhang, J. Optimization-inspired cross-attention transformer for compressive sensing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 6174–6184.
15. Shen, M.; Gan, H.; Ning, C.; Hua, Y.; Zhang, T. TransCS: A transformer-based hybrid architecture for image compressed sensing. IEEE Trans. Image Process. 2022, 31, 6991–7005.
16. Ye, D.; Ni, Z.; Wang, H.; Zhang, J.; Wang, S.; Kwong, S. CSformer: Bridging convolution and transformer for compressive sensing. IEEE Trans. Image Process. 2023, 32, 2827–2842.
17. Beck, A.; Teboulle, M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2009, 2, 183–202.
18. Yang, Y.; Sun, J.; Li, H.; Xu, Z. ADMM-CSNet: A deep learning approach for image compressive sensing. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 42, 521–538.
19. Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 2011, 3, 1–122.
20. You, D.; Xie, J.; Zhang, J. ISTA-Net++: Flexible deep unfolding network for compressive sensing. In Proceedings of the 2021 IEEE International Conference on Multimedia and Expo (ICME), Shenzhen, China, 5–9 July 2021; pp. 1–6.
21. Chen, W.; Yang, C.; Yang, X. FSOINET: Feature-space optimization-inspired network for image compressive sensing. In Proceedings of the ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23–27 May 2022; pp. 2460–2464.
22. Zhang, Z.; Liu, Y.; Liu, J.; Wen, F.; Zhu, C. AMP-Net: Denoising-based deep unfolding for compressive image sensing. IEEE Trans. Image Process. 2020, 30, 1487–1500.
23. Ochs, P.; Chen, Y.; Brox, T.; Pock, T. iPiano: Inertial proximal algorithm for nonconvex optimization. SIAM J. Imaging Sci. 2014, 7, 1388–1419.
24. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 261–272.
25. Dosovitskiy, A. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
26. Tibshirani, R.J. The lasso problem and uniqueness. Electron. J. Stat. 2013, 7, 1456–1490.
27. Donoho, D.L. De-noising by soft-thresholding. IEEE Trans. Inf. Theory 1995, 41, 613–627.
28. Lin, R.; Hayashi, K. An approximated ADMM-based algorithm for the ℓ1-ℓ2 optimization problem. In Proceedings of the 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Chiang Mai, Thailand, 7–10 November 2022; pp. 1720–1724.
29. Clarkson, J.A.; Adams, C.R. On definitions of bounded variation for functions of two variables. Trans. Am. Math. Soc. 1933, 35, 824–854.
30. Boţ, R.I.; Csetnek, E.R.; László, S.C. An inertial forward–backward algorithm for the minimization of the sum of two nonconvex functions. EURO J. Comput. Optim. 2016, 4, 3–25.
31. Hendrycks, D.; Gimpel, K. Gaussian error linear units (GELUs). arXiv 2016, arXiv:1606.08415.
32. Agarap, A. Deep learning using rectified linear units (ReLU). arXiv 2018, arXiv:1803.08375.
33. Mitchell, D.P.; Netravali, A.N. Reconstruction filters in computer graphics. ACM SIGGRAPH Comput. Graph. 1988, 22, 221–228.
34. Katharopoulos, A.; Vyas, A.; Pappas, N.; Fleuret, F. Transformers are RNNs: Fast autoregressive transformers with linear attention. In Proceedings of the International Conference on Machine Learning, Virtual, 13–18 July 2020; pp. 5156–5165.
35. Arbelaez, P.; Maire, M.; Fowlkes, C.; Malik, J. Contour detection and hierarchical image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 898–916.
36. Shi, W.; Jiang, F.; Liu, S.; Zhao, D. Image compressed sensing using convolutional neural network. IEEE Trans. Image Process. 2019, 29, 375–388.
37. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
38. Loshchilov, I.; Hutter, F. SGDR: Stochastic gradient descent with warm restarts. arXiv 2016, arXiv:1608.03983.
39. Hore, A.; Ziou, D. Image quality metrics: PSNR vs. SSIM. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 2366–2369.
PSNR/SSIM at CS ratios of 10%, 30%, and 50%, evaluated after Epoch 1 and Epoch 10:

| Method | Epoch 1, 10% | Epoch 1, 30% | Epoch 1, 50% | Epoch 1, Avg | Epoch 10, 10% | Epoch 10, 30% | Epoch 10, 50% | Epoch 10, Avg |
|---|---|---|---|---|---|---|---|---|
| Octuf | 25.51/0.7983 | 30.45/0.8842 | 33.44/0.9525 | 29.80/0.8783 | 30.18/0.8977 | 36.76/0.9650 | 40.76/0.9826 | 35.90/0.9485 |
| TransCS | 21.62/0.5815 | 24.33/0.7071 | 26.24/0.7752 | 24.06/0.6879 | 29.03/0.8776 | 35.71/0.9587 | 39.85/0.9794 | 34.86/0.9386 |
| CSformer | - | - | 22.67/0.6529 | 22.67/0.6529 | - | - | 24.22/0.7590 | 24.22/0.7590 |
| DAT (Ours) | 27.15/0.8351 | 33.93/0.9515 | 38.82/0.9782 | 33.33/0.9216 | 29.99/0.8928 | 36.77/0.9651 | 40.95/0.9830 | 35.90/0.9469 |
| Method | CS Ratio 10% | CS Ratio 30% | CS Ratio 50% | Run Time | TEP | Param |
|---|---|---|---|---|---|---|
| Octuf | 2/27.72 | 3/34.79 | 4/38.63 | 0.046 s | 0.184 | 0.82M |
| TransCS | 4/27.63 | 4/34.49 | 5/38.44 | 0.039 s | 0.195 | 2.28M |
| CSformer | - | - | 10+ | 0.021 s | 0.210+ | 1.76M |
| DAT (Ours) | 1/27.15 | 1/33.93 | 1/38.82 | 0.060 s | 0.060 | 0.76M |
| ISTA-Net+ | 20/26.64 | 20/33.82 | 20/38.07 | 0.016 s | 0.320 | 1.70M |
| Method | Indicator | Initial Learning Rate |  |  |  |  |  | Avg | Var |
|---|---|---|---|---|---|---|---|---|---|
| Octuf | MaxMSE | 64.82 | 3663.50 | 309.35 | 827.26 | 267.35 | 5077.60 | 1701.65 | 4,537,075.99 |
|  | MinMSE | 2.90 | 3.03 | 2.86 | 6.90 | 3.35 | 7.06 | 4.35 | 4.18 |
|  | FinalMSE | 2.96 | 3.03 | 2.86 | 9.60 | 3.58 | 7.86 | 4.98 | 8.80 |
| DAT (Ours) | MaxMSE | 24.71 | 20.33 | 20.90 | 95.44 | 15.12 | 25.30 | 33.63 | 930.20 |
|  | MinMSE | 2.94 | 2.95 | 3.25 | 2.89 | 2.91 | 3.07 | 3.00 | 0.02 |
|  | FinalMSE | 2.94 | 2.95 | 3.42 | 2.89 | 2.91 | 3.07 | 3.03 | 0.04 |