Search Results (61)

Search Parameters:
Keywords = image color translation

10 pages, 8944 KiB  
Article
High-Speed Full-Color Polarized Light Imaging of Collagen Using a Polarization Camera
by Bin Yang, Neil Nayyar and Billy Sanchez
Bioengineering 2025, 12(7), 720; https://doi.org/10.3390/bioengineering12070720 - 30 Jun 2025
Viewed by 358
Abstract
Polarized light imaging (PLI) has been effective in visualizing and quantifying collagen content. Collagen-specific data are often overlaid over the tissue image for visualization. However, such contextual tissue images are typically in grayscale and lack important color information, limiting the usefulness of PLI in imaging the stained histology slides and for surgical guidance. The objective of this study was to develop a robust and easy-to-implement PLI technique to capture both true color and birefringent collagen data, and we call it ColorPOL. ColorPOL uses only one polarization-sensitive camera to capture information at 75 frames per second. The true color images were synthesized from individual RGB images, and collagen-specific information (fiber orientation and retardance) was derived from the green channel image. We implemented ColorPOL in transmission mode on an upright microscope and in reflection mode for wide-field thick tissue imaging. The color images in both implementations provided valuable color tissue context that facilitated the identification and localization of collagen content. Additionally, we demonstrated that in reflection mode, the high imaging speed enabled us to record and visualize continuous deformations of the collagenous tissues (tendons, sciatic nerves, and blood vessels) overlaid on the processed collagen-specific information. Robust performance and flexible configuration will make ColorPOL a valuable tool in basic research and translational applications. Full article
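
As a rough illustration of how the four analyzer angles of a division-of-focal-plane polarization camera (0°, 45°, 90°, 135°) can be reduced to an orientation map and a polarization-amplitude map, the sketch below computes the linear Stokes components per pixel from the green channel, as the abstract describes. This is a generic PLI reduction, not the authors' ColorPOL pipeline, and the degree of linear polarization is only a stand-in for the retardance quantity reported in the paper.

```python
import numpy as np

def pli_orientation_and_amplitude(i0, i45, i90, i135, eps=1e-6):
    """Generic polarized-light reduction from four analyzer angles.

    i0..i135: 2D arrays (e.g., the green channel of each polarization
    sub-image). Returns the fiber-axis orientation in radians [0, pi)
    and the degree of linear polarization (a retardance-like contrast).
    """
    s0 = 0.5 * (i0 + i45 + i90 + i135)   # total intensity
    s1 = i0 - i90                        # linear Stokes component
    s2 = i45 - i135                      # linear Stokes component at 45 deg
    orientation = (0.5 * np.arctan2(s2, s1)) % np.pi
    dolp = np.sqrt(s1**2 + s2**2) / (s0 + eps)
    return orientation, dolp

# Random frames stand in for the four green-channel polarization images
rng = np.random.default_rng(0)
frames = [rng.random((256, 256)) for _ in range(4)]
theta, amp = pli_orientation_and_amplitude(*frames)
```
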
36 pages, 26652 KiB  
Article
Low-Light Image Enhancement for Driving Condition Recognition Through Multi-Band Images Fusion and Translation
by Dong-Min Son and Sung-Hak Lee
Mathematics 2025, 13(9), 1418; https://doi.org/10.3390/math13091418 - 25 Apr 2025
Viewed by 538
Abstract
When objects are obscured by shadows or dim surroundings, image quality is improved by fusing near-infrared and visible-light images. At night, when visible and NIR lights are insufficient, long-wave infrared (LWIR) imaging can be utilized, necessitating the attachment of a visible-light sensor to an LWIR camera to simultaneously capture both LWIR and visible-light images. This camera configuration enables the acquisition of infrared images at various wavelengths depending on the time of day. To effectively fuse clear visible regions from the visible-light spectrum with those from the LWIR spectrum, a multi-band fusion method is proposed. The proposed fusion process subsequently combines detailed information from infrared and visible-light images, enhancing object visibility. Additionally, this process compensates for color differences in visible-light images, resulting in a natural and visually consistent output. The fused images are further enhanced using a night-to-day image translation module, which improves overall brightness and reduces noise. This night-to-day translation module is a trained CycleGAN-based module that adjusts object brightness in nighttime images to levels comparable to daytime images. The effectiveness and superiority of the proposed method are validated using image quality metrics. The proposed method significantly contributes to image enhancement, achieving the best average scores compared to other methods, with a BRISQUE of 30.426 and a PIQE of 22.186. This study improves the accuracy of human and object recognition in CCTV systems and provides a potential image-processing tool for autonomous vehicles. Full article
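
The abstract describes pulling detail from an infrared band into the visible image while preserving the visible colors. A minimal base/detail fusion toy in that spirit is shown below: a Gaussian base layer is computed for each band and the infrared detail is merged into the visible luminance. It is an assumption-laden illustration, not the paper's multi-band fusion or its CycleGAN night-to-day module.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fuse_detail(vis_rgb, ir, sigma=5.0, ir_weight=0.6):
    """Toy luminance-detail fusion of a visible RGB image and an IR band.

    vis_rgb: float array (H, W, 3) in [0, 1]; ir: float array (H, W) in [0, 1].
    The IR detail layer (image minus its Gaussian base) is added to the
    visible luminance, and the new luminance is re-applied to keep color.
    """
    lum = vis_rgb.mean(axis=2)                       # crude luminance
    base = gaussian_filter(lum, sigma)
    vis_detail = lum - base
    ir_detail = ir - gaussian_filter(ir, sigma)
    fused_lum = base + vis_detail + ir_weight * ir_detail
    scale = np.clip(fused_lum, 0, 1) / np.clip(lum, 1e-6, None)
    return np.clip(vis_rgb * scale[..., None], 0, 1)

vis = np.random.rand(120, 160, 3)   # stand-in for a visible-light frame
nir = np.random.rand(120, 160)      # stand-in for an NIR or LWIR frame
out = fuse_detail(vis, nir)
```
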
22 pages, 26135 KiB  
Article
New Approach for Mapping Land Cover from Archive Grayscale Satellite Imagery
by Mohamed Rabii Simou, Mohamed Maanan, Safia Loulad, Mehdi Maanan and Hassan Rhinane
Technologies 2025, 13(4), 158; https://doi.org/10.3390/technologies13040158 - 14 Apr 2025
Viewed by 656
Abstract
This paper examines the use of image-to-image translation models to colorize grayscale satellite images for improved built-up segmentation of Agadir, Morocco, in 1967 and Les Sables-d’Olonne, France, in 1975. The proposed method applies advanced colorization techniques to historical remote sensing data, enhancing the segmentation process compared to using the original grayscale images. In this study, spatial data such as Landsat 5TM satellite images and declassified satellite images were collected and prepared for analysis. The models were trained and validated using Landsat 5TM RGB images and their corresponding grayscale versions. Once trained, these models were applied to colorize the declassified grayscale satellite images. To train the segmentation models, colorized Landsat images were paired with built-up-area masks, allowing the models to learn the relationship between colorized features and built-up regions. The best-performing segmentation model was then used to segment the colorized declassified images into built-up areas. The results demonstrate that the Attention Pix2Pix model successfully learned to colorize grayscale satellite images accurately, improving the PSNR by up to 27.72 and SSIM by 0.96. Furthermore, the results of segmentation were highly satisfactory, with UNet++ identified as the best-performing model with an mIoU of 96.95% in Greater Agadir and 95.42% in Vendée. These findings indicate that the application of the developed method can achieve accurate and reliable results that can be utilized for future LULC change studies. The innovative approach of the study has significant implications for land planning and management, providing accurate LULC information to inform decisions related to zoning, environmental protection, and disaster management. Full article
(This article belongs to the Section Environmental Technology)
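
The study reports PSNR/SSIM for colorization quality and mIoU for built-up segmentation. For readers who want to compute the same metrics on their own image pairs, a small sketch is given below (scikit-image for PSNR/SSIM, a plain confusion-matrix mIoU for binary masks); the paper's exact evaluation protocol is not reproduced here.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def miou_binary(pred, target):
    """Mean IoU over the two classes of a binary built-up mask."""
    ious = []
    for cls in (0, 1):
        inter = np.logical_and(pred == cls, target == cls).sum()
        union = np.logical_or(pred == cls, target == cls).sum()
        ious.append(inter / union if union else 1.0)
    return float(np.mean(ious))

ref = np.random.rand(128, 128, 3)                                # reference RGB image
col = np.clip(ref + 0.05 * np.random.randn(*ref.shape), 0, 1)    # colorized result
psnr = peak_signal_noise_ratio(ref, col, data_range=1.0)
ssim = structural_similarity(ref, col, channel_axis=2, data_range=1.0)
m = miou_binary((col.mean(2) > 0.5).astype(int), (ref.mean(2) > 0.5).astype(int))
print(f"PSNR={psnr:.2f} dB, SSIM={ssim:.3f}, mIoU={m:.3f}")
```
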
18 pages, 4664 KiB  
Article
Local Binary Pattern–Cycle Generative Adversarial Network Transfer: Transforming Image Style from Day to Night
by Abeer Almohamade, Salma Kammoun and Fawaz Alsolami
J. Imaging 2025, 11(4), 108; https://doi.org/10.3390/jimaging11040108 - 31 Mar 2025
Viewed by 687
Abstract
Transforming images from day style to night style is crucial for enhancing perception in autonomous driving and smart surveillance. However, existing CycleGAN-based approaches struggle with texture loss, structural inconsistencies, and high computational costs. In our attempt to overcome these challenges, we produced LBP-CycleGAN, a new modification of CycleGAN that benefits from the advantages of a Local Binary Pattern (LBP) that extracts details of texture, unlike traditional CycleGAN, which relies heavily on color transformations. Our model leverages LBP-based single-channel inputs, ensuring sharper, more consistent night-time textures. We evaluated three model variations: (1) LBP-CycleGAN with a self-attention mechanism in both the generator and discriminator, (2) LBP-CycleGAN with a self-attention mechanism in the discriminator only, and (3) LBP-CycleGAN without a self-attention mechanism. Our results demonstrate that the LBP-CycleGAN model without self-attention outperformed the other models, achieving a superior texture quality while significantly reducing the training time and computational overhead. This work opens up new possibilities for efficient, high-fidelity night-time image translation in real-world applications, including autonomous driving and low-light vision systems. Full article
(This article belongs to the Special Issue Image Processing and Computer Vision: Algorithms and Applications)
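
A quick illustration of the LBP single-channel input the abstract describes: scikit-image's local_binary_pattern turns a grayscale frame into a texture-code map that can replace the color channels fed to the generator. This shows only the preprocessing step, under the assumption of a standard uniform LBP; the CycleGAN itself is not sketched.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import local_binary_pattern

def lbp_channel(rgb, points=8, radius=1):
    """Single-channel LBP texture map in [0, 1] from an RGB frame."""
    gray = (rgb2gray(rgb) * 255).astype(np.uint8)
    lbp = local_binary_pattern(gray, P=points, R=radius, method="uniform")
    return lbp / max(lbp.max(), 1.0)

frame = np.random.rand(256, 256, 3)     # stand-in for a daytime frame
lbp_input = lbp_channel(frame)          # texture map fed to the generator
print(lbp_input.shape, lbp_input.min(), lbp_input.max())
```
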
25 pages, 2503 KiB  
Article
Compatibility Between OLCI Marine Remote-Sensing Reflectance from Sentinel-3A and -3B in European Waters
by Frédéric Mélin, Ilaria Cazzaniga and Pietro Sciuto
Remote Sens. 2025, 17(7), 1132; https://doi.org/10.3390/rs17071132 - 22 Mar 2025
Viewed by 563
Abstract
There has been an uninterrupted suite of ocean-color missions with global coverage since 1997, a continuity now supported by programs ensuring the launch of a series of platforms such as the Sentinel-3 missions hosting the Ocean and Land Color Imager (OLCI). The products derived from these missions should be consistent and allow the analysis of long-term multi-mission data records, particularly for climate science. In metrological terms, this agreement is expressed by compatibility, by which data from different sources agree within their stated uncertainties. The current study investigates the compatibility of remote-sensing reflectance products RRS derived from standard atmospheric correction algorithms applied to Sentinel-3A and -3B (S-3A and S-3B, respectively) data. For the atmospheric correction l2gen, validation results obtained with field data from the ocean-color component of the Aerosol Robotic Network (AERONET-OC) and uncertainty estimates appear consistent between S-3A and S-3B as well as with other missions processed with the same algorithm. Estimates of the error correlation between S-3A and S-3B RRS, required to evaluate their compatibility, are computed based on common matchups and indicate varying levels of correlation for the various bands and sites in the interval 0.33–0.60 between 412 and 665 nm considering matchups of all sites put together. On average, validation data associated with Camera 1 of OLCI show lower systematic differences with respect to field data. In direct comparisons between S-3A and S-3B, RRS data from S-3B appear lower than S-3A values, which is explained by the fact that a large share of these comparisons relies on S-3B data collected by Camera 1 and S-3A data collected by Cameras 3 to 5. These differences are translated into a rather low level of metrological compatibility between S-3A and S-3B RRS data when compared daily. These results suggest that the creation of OLCI climate data records is challenging, but they do not preclude the consistency of time (e.g., monthly) composites, which still needs to be evaluated. Full article
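
The metrological compatibility criterion the abstract refers to can be written compactly: two reflectance values agree within their stated uncertainties when their difference is smaller than the combined uncertainty of that difference, which must include the error correlation between the two missions. The helper below is a generic formulation of that test, not the exact statistic used in the paper.

```python
import math

def compatible(x1, u1, x2, u2, rho=0.0, k=2.0):
    """Generic metrological compatibility test for two measurements.

    x1, x2: values (e.g., S-3A and S-3B RRS at one band and site)
    u1, u2: their standard uncertainties
    rho:    error correlation between the two estimates
    k:      coverage factor (k=2 is roughly 95% for Gaussian errors)
    """
    # A positive error correlation shrinks the uncertainty of the
    # difference, so the compatibility test becomes tighter.
    u_diff = math.sqrt(u1**2 + u2**2 - 2.0 * rho * u1 * u2)
    return abs(x1 - x2) <= k * u_diff

# Toy numbers in the 0.33-0.60 correlation range quoted by the abstract
print(compatible(0.0042, 0.0004, 0.0036, 0.0004, rho=0.45))
```
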
23 pages, 1716 KiB  
Article
Knowledge Translator: Cross-Lingual Course Video Text Style Transform via Imposed Sequential Attention Networks
by Jingyi Zhang, Bocheng Zhao, Wenxing Zhang and Qiguang Miao
Electronics 2025, 14(6), 1213; https://doi.org/10.3390/electronics14061213 - 19 Mar 2025
Cited by 1 | Viewed by 481
Abstract
Massive Online Open Courses (MOOCs) have been growing rapidly in the past few years. Video content is an important carrier for cultural exchange and education popularization, and needs to be translated into multiple language versions to meet the needs of learners from different countries and regions. However, current MOOC video processing solutions rely excessively on manual operations, resulting in low efficiency and difficulty in meeting the urgent requirement for large-scale content translation. Key technical challenges include the accurate localization of embedded text in complex video frames, maintaining style consistency across languages, and preserving text readability and visual quality during translation. Existing methods often struggle with handling diverse text styles, background interference, and language-specific typographic variations. In view of this, this paper proposes an innovative cross-language style transfer algorithm that integrates advanced techniques such as attention mechanisms, latent space mapping, and adaptive instance normalization. Specifically, the algorithm first utilizes attention mechanisms to accurately locate the position of each text in the image, ensuring that subsequent processing can be targeted at specific text areas. Subsequently, by extracting features corresponding to this location information, the algorithm can ensure accurate matching of styles and text features, achieving an effective style transfer. Additionally, this paper introduces a new color loss function aimed at ensuring the consistency of text colors before and after style transfer, further enhancing the visual quality of edited images. Through extensive experimental verification, the algorithm proposed in this paper demonstrated excellent performance on both synthetic and real-world datasets. Compared with existing methods, the algorithm exhibited significant advantages in multiple image evaluation metrics, and the proposed method achieved a 2% improvement in the FID metric and a 20% improvement in the IS metric on relevant datasets compared to SOTA methods. Additionally, both the proposed method and the introduced dataset, PTTEXT, will be made publicly available upon the acceptance of the paper. For additional details, please refer to the project URL, which will be made public after the paper has been accepted. Full article
(This article belongs to the Special Issue Applications of Computational Intelligence, 3rd Edition)
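
One building block the abstract names, adaptive instance normalization, has a compact definition: the content features are whitened per channel and re-colored with the style features' channel statistics. The stand-alone sketch below is the textbook AdaIN operation, not the paper's full attention-plus-latent-mapping pipeline.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization on (C, H, W) feature maps.

    Each content channel is normalized to zero mean / unit std and then
    rescaled to the corresponding style channel's mean and std.
    """
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True) + eps
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean

content_feat = np.random.randn(64, 32, 32)   # e.g., features of the target text region
style_feat = np.random.randn(64, 32, 32)     # e.g., features of the source-style text
stylized = adain(content_feat, style_feat)
```
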
19 pages, 6028 KiB  
Article
DCLTV: An Improved Dual-Condition Diffusion Model for Laser-Visible Image Translation
by Xiaoyu Zhang, Laixian Zhang, Huichao Guo, Haijing Zheng, Houpeng Sun, Yingchun Li, Rong Li, Chenglong Luan and Xiaoyun Tong
Sensors 2025, 25(3), 697; https://doi.org/10.3390/s25030697 - 24 Jan 2025
Viewed by 1404
Abstract
Laser active imaging systems can remedy the shortcomings of visible light imaging systems in difficult imaging circumstances, thereby attaining clear images. However, laser images exhibit significant modal discrepancy in contrast to the visible image, impeding human perception and computer processing. Consequently, it is necessary to translate laser images to visible images across modalities. Existing cross-modal image translation algorithms are plagued with issues, including difficult training and color bleeding. In recent studies, diffusion models have demonstrated superior image generation and translation abilities and been shown to be capable of generating high-quality images. To achieve more accurate laser-visible image translation, we designed an improved diffusion model, called DCLTV, which limits the randomness of diffusion models by means of dual-condition control. We incorporated the Brownian bridge strategy to serve as the first condition control and employed interpolation-based conditional injection to function as the second condition control. We also established a dataset comprising 665 pairs of laser-visible images to compensate for the data deficiency in the field of laser-visible image translation. Compared to five representative baseline models, namely Pix2pix, BigColor, CT2, ColorFormer, and DDColor, the proposed DCLTV achieved the best performance in terms of both qualitative and quantitative comparisons, realizing at least a 15.89% reduction in FID and at least a 22.02% reduction in LPIPS. We further validated the effectiveness of the dual conditions in DCLTV through ablation experiments, achieving the best results with an FID of 154.74 and an LPIPS of 0.379. Full article
(This article belongs to the Section Sensing and Imaging)
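
The Brownian bridge strategy mentioned in the abstract pins the diffusion trajectory to a known endpoint instead of pure noise, which is what makes it attractive for paired translation such as laser-to-visible. The sketch below shows only the bridge's forward marginal; the schedule and variance scaling are assumptions, not the paper's formulation.

```python
import numpy as np

def brownian_bridge_sample(x0, xT, t, T=1.0, sigma=1.0, rng=None):
    """Sample x_t from a Brownian bridge between x0 (at t=0) and xT (at t=T).

    Marginal: x_t = (1 - t/T) * x0 + (t/T) * xT + sigma * sqrt(t*(T - t)/T) * z,
    so the trajectory starts at x0, ends exactly at xT, and is noisiest midway.
    """
    rng = rng or np.random.default_rng()
    s = t / T
    mean = (1.0 - s) * x0 + s * xT
    std = sigma * np.sqrt(t * (T - t) / T)
    return mean + std * rng.standard_normal(x0.shape)

visible = np.random.rand(64, 64)   # stand-in for the visible-light target
laser = np.random.rand(64, 64)     # stand-in for the laser-image endpoint
x_mid = brownian_bridge_sample(visible, laser, t=0.5)
```
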
18 pages, 7292 KiB  
Article
Concurrent Viewing of H&E and Multiplex Immunohistochemistry in Clinical Specimens
by Larry E. Morrison, Tania M. Larrinaga, Brian D. Kelly, Mark R. Lefever, Rachel C. Beck and Daniel R. Bauer
Diagnostics 2025, 15(2), 164; https://doi.org/10.3390/diagnostics15020164 - 13 Jan 2025
Viewed by 1303
Abstract
Background/Objectives: Performing hematoxylin and eosin (H&E) staining and immunohistochemistry (IHC) on the same specimen slide provides advantages that include specimen conservation and the ability to combine the H&E context with biomarker expression at the individual cell level. We previously used invisible deposited chromogens and dual-camera imaging, including monochrome and color cameras, to implement simultaneous H&E and IHC. Using this approach, conventional H&E staining could be simultaneously viewed in color on a computer monitor alongside a monochrome video of the invisible IHC staining, while manually scanning the specimen. Methods: We have now simplified the microscope system to a single camera and increased the IHC multiplexing to four biomarkers using translational assays. The color camera used in this approach also enabled multispectral imaging, similar to monochrome cameras. Results: Application is made to several clinically relevant specimens, including breast cancer (HER2, ER, and PR), prostate cancer (PSMA, P504S, basal cell, and CD8), Hodgkin’s lymphoma (CD15 and CD30), and melanoma (LAG3). Additionally, invisible chromogenic IHC was combined with conventional DAB IHC to present a multiplex IHC assay with unobscured DAB staining, suitable for visual interrogation. Conclusions: Simultaneous staining and detection, as described here, provides the pathologist a means to evaluate complex multiplexed assays, while seated at the microscope, with the added multispectral imaging capability to support digital pathology and artificial intelligence workflows of the future. Full article
(This article belongs to the Special Issue New Promising Diagnostic Signatures in Histopathological Diagnosis)
21 pages, 9210 KiB  
Article
sRrsR-Net: A New Low-Light Image Enhancement Network via Raw Image Reconstruction
by Zhiyong Hong, Dexin Zhen, Liping Xiong, Xuechen Li and Yuhan Lin
Appl. Sci. 2025, 15(1), 361; https://doi.org/10.3390/app15010361 - 2 Jan 2025
Viewed by 1594
Abstract
Most existing low-light image enhancement (LIE) methods are primarily designed for human-vision-friendly image formats, such as sRGB, due to their convenient storage and smaller file sizes. In addition, raw images provide greater detail and a wider dynamic range, which makes them more suitable for LIE tasks. Despite these advantages, raw images, the original format captured by cameras, are larger and less accessible and are hard to use in methods of LIE with mobile devices. In order to leverage both the advantages of sRGB and raw domains while avoiding the direct use of raw images as training data, this paper introduces sRrsR-Net, a novel framework with the image translation process of sRGB–raw–sRGB for LIE task. In our approach, firstly, the RGB-to-iRGB module is designed to convert sRGB images into intermediate RGB feature maps. Then, with these intermediate feature maps, to bridge the domain gap between sRGB and raw pixels, the RAWFormer module is proposed to employ global attention to effectively align features between the two domains to generate reconstructed raw images. For enhancing the raw images and restoring them back to normal-light sRGB, unlike traditional Image Signal Processing (ISP) pipelines, which are often bulky and integrate numerous processing steps, we propose the RRAW-to-sRGB module. This module simplifies the process by focusing only on color correction and white balance, while still delivering competitive results. Extensive experiments on four benchmark datasets referring to both domains demonstrate the effectiveness of our approach. Full article
(This article belongs to the Special Issue Advances in Image Enhancement and Restoration Technology)
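
The RRAW-to-sRGB module described in the abstract keeps only two classic ISP steps, white balance and color correction. For reference, a bare-bones version of those two steps on a demosaiced linear-RGB image is sketched below; the gains, matrix, and gamma are illustrative placeholders, not values from the paper.

```python
import numpy as np

def simple_isp(raw_rgb, wb_gains=(2.0, 1.0, 1.6), ccm=None, gamma=2.2):
    """White balance + 3x3 color correction + gamma on linear RGB in [0, 1]."""
    if ccm is None:
        ccm = np.array([[ 1.6, -0.4, -0.2],
                        [-0.3,  1.5, -0.2],
                        [-0.1, -0.6,  1.7]])
    img = raw_rgb * np.asarray(wb_gains)[None, None, :]   # per-channel gains
    img = np.clip(img @ ccm.T, 0.0, 1.0)                  # color correction matrix
    return img ** (1.0 / gamma)                           # display gamma

demosaiced = np.random.rand(128, 128, 3) * 0.5            # stand-in for a dark raw frame
srgb_like = simple_isp(demosaiced)
```
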
21 pages, 1062 KiB  
Review
An Overview of Quantum Circuit Design Focusing on Compression and Representation
by Ershadul Haque, Manoranjan Paul, Faranak Tohidi and Anwaar Ulhaq
Electronics 2025, 14(1), 72; https://doi.org/10.3390/electronics14010072 - 27 Dec 2024
Viewed by 1637
Abstract
Quantum image computing has attracted attention due to its vast storage capacity and faster image data processing, leveraging unique properties such as parallelism, superposition, and entanglement, surpassing classical computers. Although classical computing power has grown substantially over the last decade, its rate of improvement has slowed, struggling to meet the demands of massive datasets. Several approaches have emerged for encoding and compressing classical images on quantum processors. However, a significant limitation is the complexity of preparing the quantum state, which translates pixel coordinates into corresponding quantum circuits. Current approaches for representing large-scale images require higher quantum resources, such as qubits and connection gates, presenting significant hurdles. This article aims to overview the pixel intensity and state preparation circuits requiring fewer quantum resources and explore effective compression techniques for medium and high-resolution images. It also conducts a comprehensive study of quantum image representation and compression techniques, categorizing methods by grayscale and color image types and evaluating their strengths and weaknesses. Moreover, the efficacy of each model’s compression can guide future research toward efficient circuit designs for medium- to high-resolution images. Furthermore, it is a valuable reference for advancing quantum image processing research by providing a systematic framework for evaluating quantum image compression and representation algorithms. Full article
(This article belongs to the Special Issue Image Fusion and Image Processing)
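
Two widely cited encodings that surveys of this area typically contrast (an assumption here, since the abstract does not name them) are FRQI, which stores the intensity of each pixel in a rotation angle, and NEQR, which stores it in a basis-state register:

```latex
% FRQI: a 2^n x 2^n grayscale image, intensity encoded as one angle per pixel
\lvert I(\theta)\rangle \;=\; \frac{1}{2^{n}} \sum_{i=0}^{2^{2n}-1}
  \bigl(\cos\theta_i \lvert 0\rangle + \sin\theta_i \lvert 1\rangle\bigr)\otimes\lvert i\rangle,
  \qquad \theta_i \in \left[0, \tfrac{\pi}{2}\right]

% NEQR: the q-bit gray value f(Y, X) is stored directly in a basis state
\lvert I\rangle \;=\; \frac{1}{2^{n}} \sum_{Y=0}^{2^{n}-1} \sum_{X=0}^{2^{n}-1}
  \lvert f(Y,X)\rangle \otimes \lvert Y X\rangle
```

FRQI is compact (one intensity qubit) but recovers pixel values only statistically, whereas NEQR spends q qubits on intensity and allows exact retrieval; trade-offs of this kind are what the compression comparison in such a review turns on.
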
26 pages, 13748 KiB  
Article
An Automatic Solution for Registration Between Single-Image and Point Cloud in Manhattan World Using Line Primitives
by Yifeng He, Jingui Zou, Ruoming Zhai, Liyuan Meng, Yinzhi Zhao, Dingliang Yang and Na Wang
Remote Sens. 2024, 16(23), 4382; https://doi.org/10.3390/rs16234382 - 23 Nov 2024
Viewed by 1315
Abstract
2D-3D registration is increasingly being applied in various scientific and engineering scenarios. However, due to appearance differences and cross-modal discrepancies, it is demanding for image and point cloud registration methods to establish correspondences, making 2D-3D registration highly challenging. To handle these problems, we propose a novel and automatic solution for 2D-3D registration in Manhattan world based on line primitives, which we denote as VPPnL. Firstly, we derive the rotation matrix candidates by establishing the vanishing point coordinate system as the link of point cloud principal directions to camera coordinate system. Subsequently, the RANSAC algorithm, which accounts for the clustering of parallel lines, is employed in conjunction with the least-squares method for translation vectors estimation and optimization. Finally, a nonlinear least-squares graph optimization method is carried out to optimize the camera pose and realize the 2D-3D registration and point colorization. Experiments on synthetic data and real-world data illustrate that our proposed algorithm can address the problem of 2D-3D direct registration in the case of Manhattan scenes where images are limited and sparse. Full article
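
The first step the abstract describes, deriving rotation candidates by aligning camera-frame vanishing-point directions with the point cloud's principal directions, reduces to a product of two orthonormal frames once both triads are known. The sketch below illustrates that geometry only, including the axis-order and sign ambiguity that makes them candidates; it is not the VPPnL implementation.

```python
import numpy as np
from itertools import permutations, product

def rotation_candidates(vp_dirs, pc_dirs):
    """Rotation candidates aligning point-cloud Manhattan axes to camera axes.

    vp_dirs, pc_dirs: 3x3 matrices whose columns are unit direction vectors
    (vanishing-point directions in the camera frame, principal directions of
    the point cloud). Axis order and sign are ambiguous, so many candidates
    result; only proper rotations (det = +1) are kept.
    """
    candidates = []
    for perm in permutations(range(3)):
        for signs in product((1.0, -1.0), repeat=3):
            v = vp_dirs[:, list(perm)] * np.array(signs)
            r = v @ pc_dirs.T           # maps point-cloud axes onto camera axes
            if np.isclose(np.linalg.det(r), 1.0, atol=1e-6):
                candidates.append(r)
    return candidates

axes = np.eye(3)
print(len(rotation_candidates(axes, axes)))   # 24 proper rotations of the cube
```
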
23 pages, 16837 KiB  
Article
MapGen-Diff: An End-to-End Remote Sensing Image to Map Generator via Denoising Diffusion Bridge Model
by Jilong Tian, Jiangjiang Wu, Hao Chen and Mengyu Ma
Remote Sens. 2024, 16(19), 3716; https://doi.org/10.3390/rs16193716 - 6 Oct 2024
Cited by 2 | Viewed by 1633
Abstract
Online maps are of great importance in modern life, especially in commuting, traveling and urban planning. The accessibility of remote sensing (RS) images has contributed to the widespread practice of generating online maps based on RS images. The previous works leverage an idea of domain mapping to achieve end-to-end remote sensing image-to-map translation (RSMT). Although existing methods are effective and efficient for online map generation, generated online maps still suffer from ground features distortion and boundary inaccuracy to a certain extent. Recently, the emergence of diffusion models has signaled a significant advance in high-fidelity image synthesis. Based on rigorous mathematical theories, denoising diffusion models can offer controllable generation in sampling process, which are very suitable for end-to-end RSMT. Therefore, we design a novel end-to-end diffusion model to generate online maps directly from remote sensing images, called MapGen-Diff. We leverage a strategy inspired by Brownian motion to make a trade-off between the diversity and the accuracy of generation process. Meanwhile, an image compression module is proposed to map the raw images into the latent space for capturing more perception features. In order to enhance the geometric accuracy of ground features, a consistency regularization is designed, which allows the model to generate maps with clearer boundaries and colorization. Compared to several state-of-the-art methods, the proposed MapGen-Diff achieves outstanding performance, especially a 5% RMSE and 7% SSIM improvement on Los Angeles and Toronto datasets. The visualization results also demonstrate more accurate local details and higher quality. Full article
21 pages, 6746 KiB  
Article
A LiDAR-Camera Joint Calibration Algorithm Based on Deep Learning
by Fujie Ren, Haibin Liu and Huanjie Wang
Sensors 2024, 24(18), 6033; https://doi.org/10.3390/s24186033 - 18 Sep 2024
Cited by 3 | Viewed by 3390
Abstract
Multisensor (MS) data fusion is important for improving the stability of vehicle environmental perception systems. MS joint calibration is a prerequisite for the fusion of multimodality sensors. Traditional calibration methods based on calibration boards require the manual extraction of many features and manual registration, resulting in a cumbersome calibration process and significant errors. A joint calibration algorithm for a Light Laser Detection and Ranging (LiDAR) and camera is proposed based on deep learning without the need for other special calibration objects. A network model constructed based on deep learning can automatically capture object features in the environment and complete the calibration by matching and calculating object features. A mathematical model was constructed for joint LiDAR-camera calibration, and the process of sensor joint calibration was analyzed in detail. By constructing a deep-learning-based network model to determine the parameters of the rotation matrix and translation matrix, the relative spatial positions of the two sensors were determined to complete the joint calibration. The network model consists of three parts: a feature extraction module, a feature-matching module, and a feature aggregation module. The feature extraction module extracts the image features of color and depth images, the feature-matching module calculates the correlation between the two, and the feature aggregation module determines the calibration matrix parameters. The proposed algorithm was validated and tested on the KITTI-odometry dataset and compared with other advanced algorithms. The experimental results show that the average translation error of the calibration algorithm is 0.26 cm, and the average rotation error is 0.02°. The calibration error is lower than those of other advanced algorithms. Full article
(This article belongs to the Section Sensing and Imaging)
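
What the estimated rotation matrix and translation vector are ultimately used for is projecting LiDAR points into the camera image, which is also how calibration error shows up as misalignment. The standard pinhole projection below uses an assumed, illustrative intrinsic matrix; it is the downstream use of the calibration, not the deep network that estimates it.

```python
import numpy as np

def project_lidar_to_image(points, R, t, K):
    """Project Nx3 LiDAR points into pixel coordinates.

    R (3x3) and t (3,) are the LiDAR-to-camera extrinsics the calibration
    estimates; K (3x3) is the camera intrinsic matrix. Returns Nx2 pixel
    coordinates for the points in front of the camera.
    """
    cam = points @ R.T + t                # LiDAR frame -> camera frame
    cam = cam[cam[:, 2] > 0.1]            # keep points in front of the camera
    pix = cam @ K.T                       # pinhole projection
    return pix[:, :2] / pix[:, 2:3]

K = np.array([[718.9, 0.0, 607.2],        # KITTI-like intrinsics (illustrative)
              [0.0, 718.9, 185.2],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([0.0, -0.08, -0.27])
pts = np.random.uniform([-10, -2, 2], [10, 2, 40], size=(1000, 3))
uv = project_lidar_to_image(pts, R, t, K)
```
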
26 pages, 12522 KiB  
Article
A Vision–Language Model-Based Traffic Sign Detection Method for High-Resolution Drone Images: A Case Study in Guyuan, China
by Jianqun Yao, Jinming Li, Yuxuan Li, Mingzhu Zhang, Chen Zuo, Shi Dong and Zhe Dai
Sensors 2024, 24(17), 5800; https://doi.org/10.3390/s24175800 - 6 Sep 2024
Cited by 5 | Viewed by 2241
Abstract
As a fundamental element of the transportation system, traffic signs are widely used to guide traffic behaviors. In recent years, drones have emerged as an important tool for monitoring the conditions of traffic signs. However, the existing image processing technique is heavily reliant on image annotations. It is time consuming to build a high-quality dataset with diverse training images and human annotations. In this paper, we introduce the utilization of Vision–language Models (VLMs) in the traffic sign detection task. Without the need for discrete image labels, the rapid deployment is fulfilled by the multi-modal learning and large-scale pretrained networks. First, we compile a keyword dictionary to explain traffic signs. The Chinese national standard is used to suggest the shape and color information. Our program conducts Bootstrapping Language-image Pretraining v2 (BLIPv2) to translate representative images into text descriptions. Second, a Contrastive Language-image Pretraining (CLIP) framework is applied to characterize not only drone images but also text descriptions. Our method utilizes the pretrained encoder network to create visual features and word embeddings. Third, the category of each traffic sign is predicted according to the similarity between drone images and keywords. Cosine distance and softmax function are performed to calculate the class probability distribution. To evaluate the performance, we apply the proposed method in a practical application. The drone images captured from Guyuan, China, are employed to record the conditions of traffic signs. Further experiments include two widely used public datasets. The calculation results indicate that our vision–language model-based method has an acceptable prediction accuracy and low training cost. Full article
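
The classification step described in the abstract, cosine similarity between an image embedding and keyword embeddings followed by a softmax, is easy to show in isolation. In the sketch below the embeddings are assumed to come from pretrained CLIP-style encoders (random stand-ins here); only the scoring step is implemented.

```python
import numpy as np

def zero_shot_probs(image_emb, text_embs, temperature=0.01):
    """Class probabilities from cosine similarity + softmax.

    image_emb: (D,) embedding of one drone image crop.
    text_embs: (K, D) embeddings of K traffic-sign keyword descriptions.
    """
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = (txt @ img) / temperature                # scaled cosine similarities
    exp = np.exp(logits - logits.max())               # numerically stable softmax
    return exp / exp.sum()

keywords = ["red circular speed limit sign", "blue rectangular guide sign",
            "yellow triangular warning sign"]
rng = np.random.default_rng(1)
probs = zero_shot_probs(rng.standard_normal(512), rng.standard_normal((3, 512)))
print(dict(zip(keywords, probs.round(3))))
```
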
23 pages, 6515 KiB  
Review
Clinical Applications of Cardiac Magnetic Resonance Parametric Mapping
by Daniele Muser, Anwar A. Chahal, Joseph B. Selvanayagam and Gaetano Nucifora
Diagnostics 2024, 14(16), 1816; https://doi.org/10.3390/diagnostics14161816 - 20 Aug 2024
Cited by 1 | Viewed by 2123
Abstract
Cardiovascular magnetic resonance (CMR) imaging is widely regarded as the gold-standard technique for myocardial tissue characterization, allowing for the detection of structural abnormalities such as myocardial fatty replacement, myocardial edema, myocardial necrosis, and/or fibrosis. Historically, the identification of abnormal myocardial regions relied on variations in tissue signal intensity, often necessitating the use of exogenous contrast agents. However, over the past two decades, innovative parametric mapping techniques have emerged, enabling the direct quantitative assessment of tissue magnetic resonance (MR) properties on a voxel-by-voxel basis. These mapping techniques offer significant advantages by providing comprehensive and precise information that can be translated into color-coded maps, facilitating the identification of subtle or diffuse myocardial abnormalities. As unlikely conventional methods, these techniques do not require a substantial amount of structurally altered tissue to be visually identifiable as an area of abnormal signal intensity, eliminating the reliance on contrast agents. Moreover, these parametric mapping techniques, such as T1, T2, and T2* mapping, have transitioned from being primarily research tools to becoming valuable assets in the clinical diagnosis and risk stratification of various cardiac disorders. In this review, we aim to elucidate the underlying physical principles of CMR parametric mapping, explore its current clinical applications, address potential pitfalls, and outline future directions for research and development in this field. Full article
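
As a concrete example of what parametric mapping computes, T2 mapping fits a mono-exponential decay to the signal measured at several echo times in every voxel, and the fitted T2 values become the color-coded map. The per-voxel curve fit below is only a sketch in that spirit; the echo times and noise level are illustrative, and clinical pipelines add motion correction and quality control not shown here.

```python
import numpy as np
from scipy.optimize import curve_fit

def t2_decay(te, s0, t2):
    """Mono-exponential signal decay S(TE) = S0 * exp(-TE / T2)."""
    return s0 * np.exp(-te / t2)

def fit_t2_map(images, echo_times):
    """Fit T2 (ms) voxel-by-voxel from a stack of T2-weighted images.

    images: (n_echoes, H, W) signal intensities; echo_times: (n_echoes,) in ms.
    """
    _, h, w = images.shape
    t2_map = np.full((h, w), np.nan)
    for idx in np.ndindex(h, w):
        signal = images[(slice(None), *idx)]
        try:
            (_, t2), _ = curve_fit(t2_decay, echo_times, signal,
                                   p0=(signal[0], 50.0), maxfev=2000)
            t2_map[idx] = t2
        except RuntimeError:
            pass                                   # leave unfittable voxels as NaN
    return t2_map

tes = np.array([10.0, 25.0, 40.0, 55.0, 70.0])     # echo times in ms
stack = 1000.0 * np.exp(-tes[:, None, None] / 45.0) * np.ones((5, 8, 8))
stack += np.random.default_rng(2).normal(0, 5.0, stack.shape)
print(np.nanmedian(fit_t2_map(stack, tes)))        # close to the true T2 of 45 ms
```
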