Search Results (2,630)

Search Parameters:
Keywords = transformer block

22 pages, 13987 KB  
Article
SDTformer: Scale-Adaptive Differential Transformer Network for Remote Sensing Image Dehazing
by Boyu Liu and Qi Zhang
Remote Sens. 2026, 18(8), 1136; https://doi.org/10.3390/rs18081136 (registering DOI) - 11 Apr 2026
Abstract
In Transformer-based image restoration models, the self-attention mechanism often introduces attention noise from irrelevant contextual features, hindering the recovery of the underlying clear content. Although many methods have been proposed to suppress attention noise, most existing approaches are developed for general vision tasks and fail to generalize to remote sensing image dehazing, where large-scale spatial structures pose additional challenges for attention modeling. Effectively modeling scale-aware attention to suppress redundant activations is therefore crucial for remote sensing image dehazing. In this paper, we propose a scale-adaptive differential Transformer (SDTformer), an architecture designed to suppress attention noise through a differential attention mechanism, thereby improving reconstruction fidelity. Specifically, the model incorporates a scale-adaptive differential self-attention module, which models contextual dependencies across different spatial scales and reduces redundant contextual interference by computing differential attention maps. Additionally, a dynamic differential feed-forward network is proposed to adaptively select informative spatial features, strengthening feature aggregation. To further enhance feature representation, a gated fusion module is introduced to aggregate multi-scale features generated by different encoder blocks, which facilitates the learning process of each decoder block and improves the final reconstruction performance. Extensive experimental results on commonly used benchmarks show that our method achieves favorable performance against state-of-the-art approaches.
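The differential attention idea named in this abstract can be illustrated with a minimal sketch. It assumes the general formulation in which a second softmax attention map, scaled by a factor, is subtracted from a first one so that attention mass shared by both maps cancels. The function names, the fixed `lam` value, and the list-based toy representation are hypothetical, and the scale-adaptive module described by the authors is not reproduced here.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_map(queries, keys):
    """Scaled dot-product attention weights, one softmax row per query."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    return [
        softmax([scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys])
        for q in queries
    ]


def differential_attention(q1, k1, q2, k2, values, lam=0.5):
    """Subtract a second attention map (scaled by lam) from the first.

    lam is a fixed hypothetical constant here; in a trained model it
    would typically be learned.
    """
    map_a = attention_map(q1, k1)
    map_b = attention_map(q2, k2)
    diff = [
        [a - lam * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(map_a, map_b)
    ]
    width = len(values[0])
    out = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(width)]
        for row in diff
    ]
    return diff, out
```

Because each softmax row sums to 1, every row of the differential map sums to `1 - lam`, and weights that appear in both maps are reduced relative to weights that appear only in the first.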

28 pages, 3048 KB  
Article
Mathematical Decision Layers for Technical Proposal Generation in Industrial Electrical Houses Using Generative AI
by Juan Pérez, Ignacio González, Nabeel Imam and Juan Carvajal
Mathematics 2026, 14(8), 1263; https://doi.org/10.3390/math14081263 - 10 Apr 2026
Abstract
Industrial electrical houses are engineered systems that transform and control electrical power to supply industrial loads. Preparing technical proposals for these rooms requires consistent engineering choices across multiple artifacts while drawing from heterogeneous client documents, historical projects, and supplier catalogs. This paper reports an industrial prototype that integrates generative AI, system modeling, and mathematical decision methods to support that workflow. We represent requested outputs as ordered sequences of functions and link those functions to candidate equipment blocks through functional and physical graphs that enable traceable retrieval and reuse. Using this representation, we compute a minimal internal-cost baseline by solving a mixed-integer assignment model with sizing constraints, and we rank technically feasible alternatives using fuzzy DEMATEL to derive criterion weights and TOPSIS to obtain an overall ordering under multiple criteria. The workflow is illustrated with an example and the prototype tool used in a company operating in Chile, Peru, Ecuador, and Bolivia, where document ingestion and equipment-list extraction are integrated with human validation. The results illustrate how structured representations, optimization, and multi-criteria ranking can support auditable configurations for engineering review and commercial selection.
(This article belongs to the Special Issue Applications of Operations Research and Decision Making)
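The TOPSIS ranking step mentioned in this abstract can be sketched in a few lines. This is a generic textbook version, assuming vector normalization and Euclidean distances; the criterion weights are passed in as plain numbers (in the paper they are derived with fuzzy DEMATEL, which is not shown), and the function name and sample values are hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each alternative (row).

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False for cost
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix
    ]
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti_ideal = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, anti_ideal)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical example: column 0 is a benefit criterion, column 1 is a cost.
alternatives = [[10.0, 2.0], [5.0, 4.0], [8.0, 3.0]]
scores = topsis(alternatives, weights=[0.6, 0.4], benefit=[True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Here `ranking` lists alternative indices from most to least preferred; an alternative that is best on every criterion scores 1, and one that is worst on every criterion scores 0.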

23 pages, 12247 KB  
Article
A Lightweight and Real-Time Dual-Polarization Fusion Framework for SAR Ship Classification
by Enrico Gărăiman and Anamaria Radoi
Remote Sens. 2026, 18(8), 1129; https://doi.org/10.3390/rs18081129 - 10 Apr 2026
Abstract
Synthetic Aperture Radar (SAR) ship classification plays a critical role in maritime surveillance but faces challenges such as the similarity between ship categories, the scarcity of annotated datasets, and data imbalance. In this paper, a lightweight, real-time dual-branch architecture is proposed to address the SAR ship classification task. The proposed approach integrates dual-polarization data within a hybrid convolution-transformer framework to improve classification performance. The model fuses dual-polarization modes, combining convolutional layers for local feature extraction with transformer blocks for global contextual understanding. Evaluations on the OpenSARShip 2.0 dataset show that the proposed model achieves 97.50% accuracy in the 3-class configuration and 93.28% in the 6-class configuration. On the FUSAR-Ship dataset, which does not provide dual-polarization data for the same ship target, the single-branch model achieves an accuracy of 94.92% in the 7-class configuration. Despite its dual-branch design, the model maintains computational efficiency, making it suitable for real-time maritime monitoring applications. The results demonstrate the effectiveness of polarization-aware hybrid models for scalable and robust SAR ship classification.
(This article belongs to the Special Issue Ship Imaging, Detection and Recognition for High-Resolution SAR)

21 pages, 3708 KB  
Article
Directional Presplitting Roof Cutting for Surface Subsidence Control in Extra-Thick Longwall Top-Coal Caving Under Thick Unconsolidated Overburden
by Hongsheng Wang and Wenrui Zhao
Processes 2026, 14(8), 1218; https://doi.org/10.3390/pr14081218 - 10 Apr 2026
Abstract
Large-scale surface subsidence induced by extra-thick seam longwall top-coal caving (LTCC) is strongly amplified by thick unconsolidated overburden, posing serious serviceability risks to overlying linear infrastructure. Taking the S103 Provincial Highway above Panel 6118 in Inner Mongolia, China, as the engineering background, this study integrates theoretical analysis, numerical simulation, and in situ monitoring to investigate the subsidence-control mechanism of directional presplitting roof cutting. The results show that roof cutting mitigates surface subsidence by reconstructing the overburden structural system and weakening the stress-transfer chain, thereby transforming key-stratum deformation from integral bending to segmented block movement and narrowing the subsidence-affected zone. An equivalent mining-depth model for subsidence-boundary convergence is proposed to characterize the inward migration of the subsidence-basin boundary under thick unconsolidated cover, and a segmented probability-integral model is developed to explain the kink-like high-gradient feature in the post-cut subsidence profile. Parametric simulations of roof-cutting positions (p = 0, 2, 4, …, 32 m) show that the most effective mitigation occurs in the range p = 4–12 m; using minimum–maximum highway subsidence together with profile flattening as the optimization criteria, the representative optimum is identified at p ≈ 10 m, for which the maximum highway subsidence is approximately 57 mm, about 76% lower than that in the non-cutting case. The results further indicate that, although roof cutting significantly reduces subsidence and deformation gradients, fissure localization and possible discontinuous deformation near the pre-split weak plane still require careful field monitoring.

21 pages, 968 KB  
Article
ViTUNet: Vision Transformer U-Net Hybrid Model for Carious Lesions Segmentation on Bitewing Dental Images
by Vincent Majanga, Ernest Mnkandla, Ekundayo Olufisayo Sunday, Bosun Ajala and Thottempundi Sree
Appl. Sci. 2026, 16(8), 3693; https://doi.org/10.3390/app16083693 - 9 Apr 2026
Abstract
Precise segmentation of medical images requires capturing detailed local and global spatial information. The conventional U-Net model excels at local spatial feature extraction through residual convolutional blocks but struggles to capture global features. To resolve this issue, we propose the vision transformer U-Net (ViTUNet) framework, which uses the self-attention mechanism of the vision transformer (ViT) to capture global information while retaining the local feature extraction of U-Net. The proposed architecture adds vision transformers to the existing residual convolution blocks in the U-Net encoder path, thereby capturing both local and global features. The decoder path then rebuilds this information into high-quality segmentation maps with accurately highlighted boundaries/edges. The model is used to segment carious lesions in bitewing dental radiographs. These images are pre-processed using augmentation, morphological operations, and segmentation to identify the boundaries/edges of the regions of interest (caries/cavity). The proposed method is evaluated on an augmented dataset containing 3000 image–watershed mask pairs, with 2400 images used for training and 600 for testing. The experimental results show significant improvements in segmentation performance, with 98.45% validation accuracy, a 97.88% validation Dice coefficient, and a 95.87% validation intersection over union (IoU) score. These results are superior to those of other conventional and state-of-the-art U-Net models, highlighting the impact of transformer-based hybrid architectures on medical image segmentation tasks.
(This article belongs to the Special Issue Advances in Medical Physics and Quantitative Imaging)
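The Dice coefficient and IoU reported in this abstract are standard overlap measures between a predicted mask and a reference mask. The sketch below shows the usual definitions for binary masks flattened into lists; the function name, the flat-list representation, and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details taken from the paper.

```python
def dice_and_iou(pred, target):
    """Compute (Dice, IoU) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty; treated as a perfect match by assumption.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Hypothetical 2x2 masks, flattened row by row.
dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 0, 0])
```

The two scores are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU for the same pair of masks.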
22 pages, 6746 KB  
Article
Bidirectional T1–T2 Brain MRI Synthesis Using a Fusion U-Net Transformer for Real-World Clinical Data
by Zeynep Cantemir, Hacer Karacan, Emetullah Cindil and Burak Kalafat
Appl. Sci. 2026, 16(8), 3674; https://doi.org/10.3390/app16083674 - 9 Apr 2026
Abstract
Obtaining multiple MRI contrasts for each patient prolongs scan acquisition time, increases healthcare costs, and may not always be feasible due to patient-specific constraints. Deep learning-based MRI contrast synthesis offers a potential solution, yet most existing approaches are evaluated on preprocessed public benchmarks that do not reflect real-world clinical variability. In this study, we propose a fusion U-Net transformer framework for bidirectional T1-weighted ↔ T2-weighted brain MRI synthesis trained and evaluated exclusively on retrospectively acquired clinical data. The proposed architecture integrates multiscale convolutional feature extraction with axial attention mechanisms and a transformer bottleneck for efficient global context modeling. A fusion refinement block is incorporated to mitigate skip connection artifacts. An adversarial training strategy with the least squares GAN objective and a hybrid loss combining L1 reconstruction and structural similarity (SSIM) is employed to promote both pixel-level accuracy and perceptual fidelity. The model is evaluated using SSIM and PSNR metrics alongside qualitative expert assessment conducted by two board-certified radiologists. For both synthesis directions, the framework achieves competitive quantitative performance against baseline models under the challenging conditions of clinical data. Expert evaluation confirms high anatomical fidelity and clinically acceptable image quality across both synthesis directions. These results indicate that the proposed framework represents a promising approach for multi-contrast MRI synthesis in clinically heterogeneous data environments.
(This article belongs to the Section Computing and Artificial Intelligence)

29 pages, 16920 KB  
Article
Towards Character-Based Zoning: Managing Historic Urban Landscapes and Integrating a Dynamic Integrity Framework in Jingdezhen, China
by Ding He, Yameng Zhang and Liqiong Wu
Land 2026, 15(4), 615; https://doi.org/10.3390/land15040615 - 9 Apr 2026
Abstract
The Historic Urban Landscape (HUL) approach provides a vital and extensive framework for heritage conservation. However, local practices often struggle to spatially translate qualitative assessments into quantitative controls at the urban block level, the most effective basic scale for administrative implementation, thereby limiting effective responses to the Management of Change. By integrating HUL with the theory of Dynamic Integrity, this study constructs a multi-dimensional evaluation index system and proposes a HUL evaluation method based on Character-Based Zoning. Taking the 125 urban block units of the historic urban area of Jingdezhen as a case study, this research integrates historical mapping, GIS spatial analysis, and Co-occurrence Network Analysis to reveal the internal structural logic of the heritage system. The study finds that the HUL of Jingdezhen is a multi-nodal dynamic system driven by four core elements: ritual beliefs, administrative management, production activities, and commercial guilds. Critically, modern visual intrusions severely impact the core heritage components within this system, specifically the Dubang and ritual culture. Based on the three dimensions of Heritage Richness, Landscape Sensitivity and Value Centrality, the study systematically identifies a total of 11 types of urban block units within the plots that characterize distinct historic landscape features and transformation patterns. This research not only deepens the localized application of HUL theory but also provides a scientific basis and methodological support for the Management of Change and periodic assessment in dynamic heritage environments.
23 pages, 3355 KB  
Article
Fracture Pressure Prediction for Tight Conglomerate Reservoirs with Analysis of Acid Pretreatment Influence
by Yue Wang, Qinghua Cheng, Jianchao Li, Yunwei Kang, Hui Liu, Qian Wei, Dali Guo and Zixi Guo
Processes 2026, 14(8), 1192; https://doi.org/10.3390/pr14081192 - 8 Apr 2026
Abstract
Tight conglomerate reservoirs are characterized by strong heterogeneity, significant in-situ stress differences, and unbalanced fracturing stimulation, which make fracture pressure prediction challenging and severely restrict the effectiveness of reservoir stimulation and ultimate recovery. Although acid pretreatment is an effective means to reduce fracture pressure, its quantitative relationship with fracture pressure remains unclear. There is an urgent need to establish a systematic method that integrates reservoir heterogeneity characterization, data augmentation, and intelligent prediction. Aiming at the tight conglomerate reservoir in the MH Block, this study proposes an intelligent fracture pressure prediction and acid pretreatment optimization method that integrates Self-Organizing Maps (SOMs), Generative Adversarial Networks (GANs), and Transformer models. First, SOM is used to perform unsupervised clustering of logging parameters to identify different geological feature categories and achieve fine-scale characterization of reservoir heterogeneity. Second, to address the issue of limited samples within each cluster, GAN is employed for high-quality data augmentation to expand the training sample set. Finally, a fracture pressure prediction model is constructed based on the Transformer architecture, and the influence of acid treatment parameters on fracture pressure is quantitatively analyzed using the SHAP method and laboratory experiments. The results show that the proposed model achieves a coefficient of determination (R²) of 0.93, a root mean square error (RMSE) of 2.38 MPa, and a mean absolute percentage error (MAPE) of 2.02% on the test set, with prediction accuracy significantly outperforming benchmark models such as BPNN, XGBoost, and LSTM. Ablation experiments verify that both the SOM clustering and GAN augmentation modules effectively enhance model performance. Analysis of acid treatment parameters indicates that hydrofluoric acid (HF) concentration is the dominant factor influencing fracture pressure reduction, and the mud acid system exhibits a stronger synergistic effect compared to the single hydrochloric acid system. Reasonable optimization of acid concentration and dosage can significantly reduce fracture pressure (3.14–5.28 MPa). This method provides a theoretical basis and engineering guidance for accurate fracture pressure prediction and optimal design of acid pretreatment in tight conglomerate reservoirs.
(This article belongs to the Section Petroleum and Low-Carbon Energy Process Engineering)
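The three test-set metrics quoted in this abstract (coefficient of determination, RMSE, and MAPE) follow standard definitions, sketched below. The function name and the sample pressures are hypothetical and are not taken from the paper's data; the sketch assumes that no true value is zero and that the true values are not all identical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAPE in percent) for paired observations."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares for the coefficient of determination.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mape = 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mape


# Hypothetical fracture pressures in MPa (illustrative values only).
r2, rmse, mape = regression_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
```

RMSE carries the units of the target (MPa here), while R² and MAPE are dimensionless, which is why the abstract reports all three together.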

21 pages, 4667 KB  
Article
Vibration Suppression and Dynamic Optimization of Multi-Layer Motors for Direct-Drive VICTS Antennas
by Xinlu Yu, Aojun Li, Pingfa Feng and Jianghong Yu
Aerospace 2026, 13(4), 346; https://doi.org/10.3390/aerospace13040346 - 8 Apr 2026
Abstract
Weight reduction and dynamic performance optimization are critical for airborne direct-drive VICTS satellite communication antennas, which require lightweight, high-speed, and high-precision rotation. Traditional vibration suppression methods, such as uniform support layout and added damping, rely heavily on empirical trial and error, lack targeted modal control, and cannot balance lightweight design with dynamic stiffness. To address these issues, this paper proposes a wave-theory-based dynamic modeling and rapid optimization method for multi-layer rotating components in direct-drive VICTS antennas. The kinematic model of the rotating ring and the ball revolution excitation are derived using the annular wave equation and bearing kinematics. A Modal Blocking Mechanism is established: placing support balls at positions satisfying the half-wavelength constraint suppresses target mode shapes via wave interference, achieving vibration attenuation at the source. A homogenization equivalent method based on a representative volume element (RVE) is developed for irregular cross-section rings, yielding analytical expressions for the in-plane equivalent elastic modulus and the out-of-plane equivalent shear modulus. These parameters are integrated into the wave equation to analytically solve vibration modes, avoiding iterative finite element computations. A rapid multi-objective optimization framework is then constructed, minimizing the structural weight and maximizing the modal separation interval under dynamic stiffness and excitation frequency constraints. Numerical simulations, FE analysis, and prototype tests validate the method: the maximum analytical error is only 3.1%. Compared with uniform support designs, the optimized structure achieves a 40% weight reduction, a 40% increase in minimum modal separation, and a 65% reduction in the RMS tracking error. This work provides an efficient, deterministic dynamic design method for large-diameter ring structures, transforming vibration control from empirical adjustment into a precise, physics-informed optimization.
(This article belongs to the Section Astronautics & Space Science)

32 pages, 11105 KB  
Article
Identification of Heritage Landscape Genes and Micro-Regeneration Pathways in Historic Districts: A Case Study of the Chinese Baroque Block
by Songtao Wu and Jianqiao Sun
Land 2026, 15(4), 606; https://doi.org/10.3390/land15040606 - 7 Apr 2026
Abstract
In the era of urban stock development, the regeneration of historic districts must abandon large-scale, sweeping transformations and shift toward a micro-regeneration model characterized by small-scale, precise, and incremental interventions. However, as urban renewal enters this stock-based phase, the problems of “physical dissonance” and “cultural discontinuity” in the heritage landscapes of historic districts are becoming increasingly pronounced. To address them, this paper identifies the heritage landscape genes of historic districts and explores the districts' characteristics in order to provide operational targets for micro-renewal, guide the implementation of micro-regeneration policies, and ultimately improve the quality of heritage landscapes in historic districts. Taking the Chinese Baroque Block in Harbin as an example, the paper proposes a gene identification method for the heritage landscape of historic districts based on the spatial translation of historical information, spatial topology analysis, an improved U-Net deep learning model, and text-mining theme analysis. It then proposes a micro-regeneration path for historic blocks of “gene identification-feature mining-targeted operation-quality improvement” and formulates the micro-regeneration countermeasures of “gene replacement and texture repair in open space, gene repair and targeted acupuncture in streets and alleys, gene embedding and catalyst adjustment in courtyard layout, gene recombination and embroidery treatment of architectural style, and retrospective and contextual narrative of intangible genes”. Heritage landscape genes support refined control of the character and quality of historic districts and offer new ideas for their micro-regeneration in the stock era.
(This article belongs to the Special Issue Young Researchers in Land Planning and Landscape Architecture)

25 pages, 11063 KB  
Article
Tac-Mamba: A Pose-Guided Cross-Modal State Space Model with Trust-Aware Gating for mmWave Radar Human Activity Recognition
by Haiyi Wu, Kai Zhao, Wei Yao and Yong Xiong
Electronics 2026, 15(7), 1535; https://doi.org/10.3390/electronics15071535 - 7 Apr 2026
Abstract
Millimeter-wave (mmWave) radar point clouds offer a privacy-preserving solution for Human Activity Recognition (HAR), but their inherent sparsity and noise limit single-modal performance. While multimodal fusion mitigates this issue, existing methods often suffer from severe negative transfer during visual degradation and incur high computational costs that make them unsuitable for edge devices. To address these challenges, we propose Tac-Mamba, a lightweight cross-modal state space model. First, we introduce a topology-guided distillation scheme that uses a Spatial Mamba teacher to extract structural priors from visual skeletons. These priors are then explicitly distilled into a Point Transformer v3 (PTv3) radar student with a modality dropout strategy. We also develop a Trust-Aware Cross-Modal Attention (TACMA) module to prevent negative transfer. It evaluates the reliability of visual features through a SiLU-activated cross-modal bilinear interaction, smoothly degrading to a pure radar-driven fallback projection when visual inputs are corrupted. Finally, a Lightweight Temporal Mamba Block (LTMB) with a Zero-Parameter Cross-Gating (ZPCG) mechanism captures long-range kinematic dependencies with linear complexity. Experiments on the public MM-Fi dataset under strict cross-environment protocols demonstrate that Tac-Mamba achieves competitive accuracies of 95.37% (multimodal) and 87.54% (radar-only) with only 0.86M parameters and 1.89 ms inference latency. These results highlight the model’s robustness to missing modalities and its feasibility for edge deployment.
(This article belongs to the Section Artificial Intelligence)

17 pages, 1612 KB  
Article
AutoMamba: Efficient Autonomous Driving Segmentation Model with Mamba
by Haoran Sun, Zhensong Li and Shiliang Zhu
Sensors 2026, 26(7), 2227; https://doi.org/10.3390/s26072227 - 3 Apr 2026
Abstract
Semantic segmentation for autonomous driving demands balancing high-fidelity perception with real-time latency. While Transformers achieve state-of-the-art results, their quadratic complexity bottlenecks high-resolution processing. State Space Models (SSMs) like Mamba offer linear complexity but often suffer from local detail loss and inefficient scanning strategies. We introduce AutoMamba, a tailored Hybrid-SSM architecture. We propose a Hybrid-SSM block incorporating Depthwise Convolutions to inject local spatial priors and a Stage-Adaptive Mixed-Scanning strategy. This strategy prioritizes horizontal context in early stages for road layouts while only activating vertical scanning in deep layers to preserve anisotropic structures like poles. Furthermore, we reveal that unlike Transformers, Mamba architectures require Auxiliary Supervision and Online Hard Example Mining (OHEM) to address “long-tail forgetting.” Experiments on Cityscapes and BDD100K under a training-from-scratch setting demonstrate AutoMamba’s superiority. Notably, AutoMamba-B0 achieves 67.79% mIoU on Cityscapes with 31.3% fewer FLOPs than SegFormer-B0. Moreover, while the larger SegFormer-B2 fails with Out-Of-Memory errors at 2048×2048 resolution, AutoMamba-B2 scales efficiently, validating its linear complexity advantage for next-generation perception systems.
(This article belongs to the Section Vehicular Sensing)

23 pages, 1017 KB  
Article
Interval-Based Tropical Cyclone Intensity Forecasting with Spatiotemporal Transformers
by Tao Guo, Hua Zhang, Tao Song and Shiqiu Peng
Remote Sens. 2026, 18(7), 1069; https://doi.org/10.3390/rs18071069 - 2 Apr 2026
Abstract
Accurate tropical cyclone (TC) intensity forecasting remains challenging due to the strong nonlinearity of intensity evolution and the rapid structural changes associated with storm development. In this work, we propose TC-QFormer, an interval-based probabilistic framework for 24 h TC intensity forecasting that combines transformer-based spatiotemporal modeling with scalar conditioning. Specifically, we adapt the PredFormer video prediction model for multi-horizon scalar regression and introduce a lightweight Scalar–Image Fusion Block to incorporate historical intensity information into the visual representations. A two-stage training strategy is adopted, in which the model is first pretrained for deterministic median prediction and subsequently fine-tuned to directly predict multiple conditional quantiles using the pinball loss. Experiments are conducted on the TCIR dataset using geostationary infrared and water vapor satellite imagery together with aligned historical intensity records. The proposed method is evaluated against representative recurrent and non-recurrent baselines, including ConvLSTM, PredRNN, and SimVP. Results indicate that the proposed framework achieves improved deterministic accuracy and produces well-calibrated 80% prediction intervals, particularly at longer forecast lead times and during rapidly evolving intensity regimes. These findings suggest that combining transformer-based spatiotemporal modeling with scalar–image conditioning provides an effective and interpretable approach for probabilistic TC intensity forecasting.
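The pinball loss used for quantile prediction in this abstract has a compact standard form, sketched below together with an empirical coverage check for a prediction interval. The function names are hypothetical, and pairing the 0.1 and 0.9 quantiles to form an 80% interval is an assumption made for illustration; the abstract does not state which quantile levels the authors predict.

```python
def pinball_loss(y_true, y_pred, q):
    """Pinball (quantile) loss for one observation at quantile level q in (0, 1).

    Under-prediction is weighted by q and over-prediction by (1 - q), so
    minimizing the average loss pushes y_pred toward the q-th quantile.
    """
    diff = y_true - y_pred
    return max(q * diff, (q - 1.0) * diff)


def mean_pinball_loss(targets, preds, q):
    """Average pinball loss over paired targets and predictions."""
    return sum(pinball_loss(t, p, q) for t, p in zip(targets, preds)) / len(targets)


def interval_coverage(targets, lower, upper):
    """Fraction of targets that fall inside [lower, upper]."""
    hits = sum(1 for y, lo, hi in zip(targets, lower, upper) if lo <= y <= hi)
    return hits / len(targets)
```

For an interval built from the 0.1 and 0.9 quantiles, a well-calibrated model should give an `interval_coverage` close to 0.8 on held-out data, which is the sense in which an 80% interval is called calibrated.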

24 pages, 30338 KB  
Article
On the Dynamics and Stability of Envelope Rossby Solitary Waves Under the Topographic Geostrophic Approximation
by Guohua Cao, Quansheng Liu, Liangui Yang and Ruigang Zhang
Mathematics 2026, 14(7), 1189; https://doi.org/10.3390/math14071189 - 2 Apr 2026
Viewed by 154
Abstract
Nonlinear Rossby waves have attracted wide research interest because of their importance in understanding geophysical fluid dynamics. This paper discusses the effects of different topographies on the propagation of barotropic Rossby waves. Starting from the classical shallow water equations for a uniformly rotating fluid with bottom topography, a new Schrödinger model equation for the nonlinear Rossby wave amplitude is obtained by multi-scale spatial-temporal transformations and the perturbation expansion method, which is advantageous for characterizing the propagation of atmospheric blocking. The evolutionary dynamics of dipole blocking are discussed analytically and simulated numerically by varying the terrain parameters for sinusoidal, sloped, and rough topography, respectively. The results show that increasing the amplitude of the sinusoidal bottom topography makes the dipole blocking move faster and significantly enhances its intensity. For sloped topography, the intensity of dipole blocking decreases slowly with increasing topographic slope. At the same time, the effect of the frequency of rough topography agrees with the slope effect on the dynamics of nonlinear envelope solitary Rossby waves. This theoretical attempt gives a new explanation of topographic Rossby waves.

18 pages, 2329 KB  
Article
Multi-Scale Optimal Transport Transformer for Efficient Exemplar-Based Image Translation
by Jinsong Zhang, Xiongzheng Li and Yuqin Lin
Big Data Cogn. Comput. 2026, 10(4), 107; https://doi.org/10.3390/bdcc10040107 - 1 Apr 2026
Viewed by 327
Abstract
Exemplar-based image translation generates an output image by transferring appearance from a reference exemplar to a content image. Existing works consider only the local correspondences between the two modalities and ignore the global distribution within each modality, so they struggle to obtain fine-grained details with efficient computation. In this paper, we propose OTFormer, a multi-scale Optimal Transport transformer for exemplar-based image translation. We formulate cross-modal alignment as a multi-scale optimal transport problem, which progressively provides a globally coherent matching. In addition, we design a lightweight multi-scale fusion block to extract and fuse features efficiently. Experiments on CelebA-HQ and DeepFashion demonstrate that OTFormer improves both image fidelity and style adherence, while reducing model parameters by 62% and achieving faster inference compared with strong baselines. These results highlight OT-guided global alignment as an effective and deployable solution for high-fidelity exemplar-based image translation.
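To make the idea of alignment as optimal transport concrete, here is a minimal single-scale sketch. It is an assumption throughout: the abstract does not say which solver OTFormer uses or how the scales are coupled, so entropy-regularized transport solved with Sinkhorn iterations is swapped in as a common stand-in, and the function name `sinkhorn` and its parameters `eps` and `n_iters` are hypothetical.

```python
import math
from typing import List, Sequence


def sinkhorn(
    cost: Sequence[Sequence[float]],
    a: Sequence[float],
    b: Sequence[float],
    eps: float = 0.1,
    n_iters: int = 200,
) -> List[List[float]]:
    """Entropy-regularized optimal transport plan via Sinkhorn iterations.

    cost[i][j] is the cost of matching source item i to target item j;
    a and b are the source and target marginals (each sums to 1).
    """
    n, m = len(cost), len(cost[0])
    # Gibbs kernel: low-cost pairs get large weights.
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy example: two content features and two exemplar features,
# where matching feature i to feature i is cheapest.
toy_cost = [[0.0, 1.0], [1.0, 0.0]]
toy_plan = sinkhorn(toy_cost, a=[0.5, 0.5], b=[0.5, 0.5])
```

Unlike a purely local nearest-neighbor match, the marginal constraints force every source item to send out its full mass and every target item to receive its full mass, which is one way a matching can be made globally coherent. Very small values of `eps` can underflow in this plain formulation; a log-domain variant would be needed in that case.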