1. Introduction
In imaging devices, the optical characteristics of image acquisition, processing, and generation must be taken into account when capturing, developing, and forming images. This is essential to ensure the clear and accurate communication of the optical information of the real object. Although significant progress has been made in optical design, considering aspects such as light scattering and aberrations [1,2], challenges persist when dealing with optical information in consumer imaging devices subject to design restrictions on cost and miniaturization. Aberrations, including distortion and blur, become inevitable in such scenarios, leading to issues such as reduced image clarity, false colors, and artifacts. Furthermore, the appearance and perceived quality of a reproduced image may differ from those of the real object [3]. Therefore, when images acquired by imaging devices such as cameras and scanners are generated for display on a screen or projector, or printed on paper, image quality is generally improved by applying image-processing techniques such as sharpening filters.
The modulation transfer function (MTF) has long been used as an index for evaluating the performance of optical systems such as lenses and cameras. The MTF indicates the contrast modulation characteristics of images in the frequency domain, and its measurement is specified in ISO 12233 [4]. Characteristics of camera lenses and sensors, such as light scattering and aberrations, can change the MTF and cause image blurring. Conventional studies have focused on the accuracy [5,6] and simplicity [7] of the MTF calculation method. Several previous studies have attempted to increase the MTF by improving optical systems, such as lenses and circuits, for applications that require high-resolution images, such as remote sensing and aerospace applications that handle satellite images [8,9,10]. Developing techniques to maintain a high MTF is important, particularly in fields requiring high-quality images (e.g., medical imaging, security cameras, and remote sensing). To correct image blurring caused by spatially varying image formation characteristics across the image plane of the captured image, it is common to use the image formation characteristics of the image sensor (point spread function and MTF) or to convert pixel values by filtering with known image formation characteristics based on the optical shooting conditions. Such corrections are difficult to implement without the cooperation of the camera or image-sensor manufacturer [11,12] and cannot be performed by end users.
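As an illustration of the kind of PSF-based correction referred to above (and not of the method proposed in this paper), the following sketch applies a frequency-domain Wiener filter to deblur an image when the point spread function is known; the Gaussian PSF and the noise-to-signal constant are placeholder assumptions.

```python
import numpy as np

def wiener_deblur(image, psf, nsr=0.01):
    """Deblur `image` with a known PSF using a Wiener filter in the frequency domain.

    image : 2-D grayscale array (float)
    psf   : 2-D point spread function (smaller than the image)
    nsr   : assumed noise-to-signal power ratio used as regularization
    """
    # Zero-pad the PSF to the image size and shift its centre to the origin
    psf_pad = np.zeros_like(image, dtype=float)
    ph, pw = psf.shape
    psf_pad[:ph, :pw] = psf
    psf_pad = np.roll(psf_pad, (-(ph // 2), -(pw // 2)), axis=(0, 1))

    H = np.fft.fft2(psf_pad)          # optical transfer function of the assumed PSF
    G = np.fft.fft2(image)            # spectrum of the blurred image
    F_hat = np.conj(H) / (np.abs(H) ** 2 + nsr) * G
    return np.real(np.fft.ifft2(F_hat))

# Placeholder Gaussian PSF standing in for measured sensor/lens data
x, y = np.meshgrid(np.arange(-7, 8), np.arange(-7, 8))
psf = np.exp(-(x ** 2 + y ** 2) / (2 * 1.5 ** 2))
psf /= psf.sum()
```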
A software-based image-sharpening process is a possible method for improving image blurring even when the optical information is unknown. Typical methods include sharpening filters, unsharp masking, edge detection algorithms, high-pass filters, and deep-learning-based methods, all of which are important in image processing [13,14,15,16,17,18]. Recent trends have focused on machine-learning-based methods, such as deep neural networks, for image quality evaluation and improvement [19,20]. These image-processing approaches have enabled high-quality sharpening and have improved the visual appearance of detailed texture and contour information in images. However, image sharpening by image processing, such as unsharp masking, enhances the converted image visually and arbitrarily; it can therefore produce an optically unnatural image, increasing noise and creating artifacts owing to excessive sharpening and local edge enhancement. In addition, appropriate parameter settings are required to extract fine details from the image. Another issue is the development of generic algorithms that can handle different image types and qualities. It is also difficult to find a generic approach, because the optimal sharpening method may vary depending on the quality and blurring level of the original image.
To the best of our knowledge, no studies have been conducted to improve the MTF for end users. In an earlier trial, we attempted to control the MTF by applying sharpening filters to the three channels of a color camera [21]. However, the MTF conversion between channels using sharpening filters deviates considerably from the MTF of common camera lenses, and the optical validity of the resulting MTF is questionable. We have also worked on managing the total appearance of digital images generated by different imaging devices [22,23]. Although we converted the perceptual glossiness, transparency, and other qualities perceived from the reproduced image, we were not able to control frequency information such as the MTF.
Generally, widely used imaging devices have different image characteristics owing to the different designs of each device. Therefore, when images acquired by multiple imaging devices are compared, different image information is acquired and generated even though the target object is the same. In this study, we propose a method that converts the MTFs of imaging devices toward a target MTF in order to match the optical characteristics that differ between images. Specifically, the MTF, a spatial modulation characteristic that represents the image formation characteristics, is matched to the target MTF for multiple image channels with different image formation characteristics within or between imaging devices. Here, conversion means not only the improvement of image blur caused by a low MTF (conversion to boost the MTF) but also the generation of image blur (conversion to lower the MTF); it adjusts the MTF of the actual imaging device, which varies depending on optical factors such as wavelength, filters, and other physical device characteristics. The derived conversion relationship can be applied to a variety of images and outputs images whose MTFs are matched between channels.
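As a conceptual illustration only, and not the algorithm proposed in this paper, the following sketch shows the basic idea of matching one channel's MTF to a target by applying a radially symmetric per-frequency gain; the clamping constant and interpolation choices are assumptions.

```python
import numpy as np

def match_mtf_radial(channel, mtf_source, mtf_target, freqs):
    """Scale each spatial frequency of `channel` by MTF_target / MTF_source.

    channel                : 2-D image channel (float)
    mtf_source, mtf_target : 1-D MTFs sampled at `freqs` (0 ... 0.5 cycles/pixel)
    """
    h, w = channel.shape
    fy = np.fft.fftfreq(h)
    fx = np.fft.fftfreq(w)
    radial = np.sqrt(fx[None, :] ** 2 + fy[:, None] ** 2)

    # Per-frequency gain; keep the denominator away from zero to limit noise boost
    gain_1d = mtf_target / np.maximum(mtf_source, 1e-3)
    gain_2d = np.interp(radial.ravel(), freqs, gain_1d,
                        left=gain_1d[0], right=gain_1d[-1]).reshape(h, w)

    spectrum = np.fft.fft2(channel)
    return np.real(np.fft.ifft2(spectrum * gain_2d))
```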
Our motivation for this study is to develop an appearance management technology that reduces differences in resolution characteristics among imaging devices, in the same way that color management technology controls colors to reduce color differences among imaging devices and to improve the appearance of images. The approach of converting MTFs between image channels is novel, is not found in existing studies, and has the potential to contribute to the development of various industrial technologies, such as reducing prototype costs through the application of image simulation technology in the design of imaging equipment. The contributions and novelty of this study are as follows: (1) to reduce differences in the appearance of generated images caused by characteristic differences between imaging devices; (2) to reduce the impact of aberrations in the design of imaging devices on the appearance of images, which is a long-standing issue in imaging; and (3) to make MTF-based image simulation possible by generating the spatial resolution of image variables through MTF conversion.
3. Experiments
Examples of experimental results based on the proposed method are presented in this section. MTF conversion was performed under two conditions: Condition A, in which the MTF characteristics differed among multiple channels within an image acquisition device; and Condition B, in which the MTF characteristics of each channel differed between different image acquisition devices. In Condition A, the MTF of another channel in the same imaging device is used as the target MTF; in Condition B, the MTF of a different imaging device is used as the target MTF. In this experiment, the MTF was calculated for each channel using slanted edges based on ISO 12233 [4]. To verify the effectiveness of the proposed method, the MTF conversion results obtained by applying an unsharp masking filter [18], which is widely used for image-blur sharpening, are shown alongside the results of the proposed method and compared with them.
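For reference, a much-simplified slanted-edge MTF estimate is sketched below; it omits the edge-angle estimation and 4x oversampled projection that ISO 12233 specifies, so it is only an approximation of the standard procedure used in the experiments.

```python
import numpy as np

def simple_edge_mtf(edge_roi):
    """Rough slanted-edge MTF estimate from a region containing a near-vertical edge.

    Returns (frequencies in cycles/pixel, MTF normalized to 1 at DC).
    """
    # Edge spread function: average the rows of the region of interest
    esf = edge_roi.astype(float).mean(axis=0)
    # Line spread function: finite-difference derivative of the ESF
    lsf = np.diff(esf)
    # Window to reduce truncation leakage before the Fourier transform
    lsf = lsf * np.hamming(lsf.size)
    mtf = np.abs(np.fft.rfft(lsf))
    freqs = np.fft.rfftfreq(lsf.size, d=1.0)  # cycles per pixel, up to 0.5
    return freqs, mtf / mtf[0]
```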
Here, we explain how these filters were applied. The MTF conversion using the unsharp masking filter is performed by convolving a 3 × 3 unsharp masking kernel whose sharpening strength is controlled by a single parameter k. The filter parameter k was obtained for each input image before the conversion was performed. To make all processing other than the filtering itself equivalent to the proposed method, the MTF conversion is performed by varying k from 0.01 to 9.0 in 0.01 increments so that the converted MTF approaches the target MTF; the parameter is chosen to minimize the MSE between the target MTF and the converted MTF over the interval from 0 to the Nyquist frequency of 0.5 cpp, and the MTF conversion is then performed with that parameter.
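The parameter search can be pictured as follows; this is a sketch under assumptions, since the exact 3 × 3 kernel is not reproduced here (an identity-plus-8-neighbour-Laplacian form is assumed) and `mtf_of` stands for whatever MTF estimator is used (for example, the slanted-edge sketch above).

```python
import numpy as np
from scipy.ndimage import convolve

def unsharp_kernel(k):
    # Assumed 3 x 3 unsharp masking kernel with gain k (identity + k * Laplacian)
    return np.array([[-k, -k, -k],
                     [-k, 1 + 8 * k, -k],
                     [-k, -k, -k]])

def search_k(image, target_mtf, mtf_of, ks=np.arange(0.01, 9.01, 0.01)):
    """Sweep k in 0.01 steps and keep the value whose converted MTF has the
    smallest MSE against the target MTF over 0-0.5 cpp."""
    best_k, best_mse = None, np.inf
    for k in ks:
        converted = convolve(image.astype(float), unsharp_kernel(k), mode='nearest')
        mse = np.mean((mtf_of(converted) - target_mtf) ** 2)
        if mse < best_mse:
            best_k, best_mse = k, mse
    return best_k, best_mse
```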
3.1. Condition A: Among Multichannel Images with Different MTFs within an Imaging Device
In ordinary digital color cameras, color images are generated from RGB or CIEXYZ three-channel images, RGBW four-channel images, and so on. In spectral cameras, color images are generated from multiple channels corresponding to different wavelengths. In other words, there are multichannel images within the imaging equipment, and their imaging characteristics differ owing to the different wavelength and spatial dependencies among the channels. Under Condition A, the MTF of a given channel within the same imaging device was used as the target MTF. For example, if the X image has the highest MTF value, the MTF of the X image is used as the target MTF, and the MTF conversion is performed to improve image blurring by increasing the MTFs of the Y and Z images, i.e., the channels with low MTFs.
In this experiment, targeting the different MTFs of the CIEXYZ channels of the SR-5100 used as a digital camera, we set the Y image, which had the highest MTF value, as the target MTF and performed MTF conversion to improve image blurring by increasing the MTFs of the X and Z images, the channels with low MTFs. Although general consumer cameras use three channels in the RGB space to produce images, the RGB space is a device-dependent color space affected by the color-rendering technology inherent in the camera for color reproduction. Therefore, we used an imaging device capable of generating color images in the device-independent CIEXYZ space.
Figure 6a shows the input images, including an image of colorful glass tiles containing glitter and a slanted black edge backlit by a plane light source; Figure 6b shows the results of blur improvement between image channels by the sharpening filter of the conventional method; and Figure 6c shows the conversion results of the proposed method (each including the MTF, the output image, and a partially enlarged image (15 × 20 pixels)). In Figure 6a, the MTF difference between the image channels and the false color of the purple fringes along the image contours are observed. This phenomenon is caused by the wavelength dependency of the image-forming characteristics, such as aberrations in the imaging system, and is an artifact that does not exist in the real object. In the conventional image-blur improvement method using an unsharp masking filter, shown in Figure 6b, the filter parameter k was varied from 0.01 to 9.0 (in 0.01 increments) to find the k with the lowest MSE; the results were k = 0.17 for the X channel and k = 1.94 for the Z channel. Because the sharpening process is applied to sharpen the image appearance without considering the optical phenomena, the low-frequency MTF, particularly for the X channel, is excessively and unnaturally increased, whereas the MTF in the high-frequency range is significantly reduced. In addition, unnaturally distinct reddish contour artifacts are generated near the edges of the image. In contrast, as shown in Figure 6c for the proposed method, no such unnatural contour enhancement occurred, and the false colors near the edges that were present in the input image were also reduced, confirming that the blurring of the entire image was reduced. In the image shown in the middle row, colorful glass tiles containing glitter are the photographic target. The proposed method showed a slight improvement in image sharpness, although the difference was not as apparent as in the edge images.
Figure 7 shows the convergence process of the MSE over the iterations of the MTF conversion and the resulting coefficient arrays. The convergence condition was defined using the ratio ε between the MSE of the current iteration and that of the previous iteration; when ε ≤ 1.1 was satisfied, the iteration was terminated and convergence was considered to have been reached. In the figure, the iteration at which convergence occurred is indicated by a square; convergence was achieved after four and six iterations for the X and Z channels, respectively. The coefficient array was visualized in grayscale, with the maximum and minimum values within each channel shown in white and black, respectively. The maximum and minimum values were 1.33 and 1.00 for the X channel and 5.75 and 1.00 for the Z channel. Enhancement was achieved particularly in the high-frequency band, and the Z channel of the input, which had a low MTF, was significantly improved. It was confirmed that the proposed method could approximate the target MTF in the MTF conversion for multiple channels with different MTFs within the imaging equipment.
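A minimal sketch of this stopping rule is given below, interpreting ε as the ratio by which the MSE improves between successive iterations (the exact orientation of the ratio is an assumption here); `conversion_step` and `mse_to_target` are hypothetical stand-ins for one update of the conversion and for the MSE evaluation against the target MTF.

```python
def iterate_mtf_conversion(conversion_step, mse_to_target, eps=1.1, max_iter=50):
    """Repeat the MTF conversion until the improvement ratio of successive
    MSE values falls within the threshold eps."""
    prev_mse = None
    image = None
    for it in range(1, max_iter + 1):
        image = conversion_step()      # one update of the coefficient array / image
        mse = mse_to_target(image)     # MSE between converted MTF and target MTF
        if prev_mse is not None and prev_mse / max(mse, 1e-12) <= eps:
            return image, it           # improvement within eps -> considered converged
        prev_mse = mse
    return image, max_iter
```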
3.2. Condition B: Among Image Channels with Different MTFs between Different Imaging Devices
Widely used imaging devices generally have different image formation characteristics owing to their different designs, such as optical systems and circuits. Therefore, when image formation characteristics are compared between multiple imaging devices, images with different appearances, such as different degrees of blurring, are acquired and generated. In Condition B, the MTF conversion is performed using the MTF of a different imaging device as the target MTF. In the experiment, the same SR-5100 camera as in Condition A was used in an out-of-focus state to simulate an imaging device with inferior image formation characteristics, so that different image formation characteristics could be compared under otherwise identical conditions. The MTF of each CIEXYZ channel is then converted to be closer to the target MTF. For example, an experiment is conducted in which the target MTF is set to the higher MTF of a different imaging device, and the MTFs of the channels of the imaging device with the low MTF are converted to approach the target MTF.
Figure 8 shows an example of MTF conversion between imaging devices with different image-forming characteristics, for images of the same objects as in Condition A, namely the colored glitter-glass image and the edge image. As in Condition A, the three image channels (X, Y, and Z) were used to avoid issues such as those caused by tone mapping from CIEXYZ to the RGB space. From Figure 8a, the MTF difference between the image channels and the resulting green false color along the image contours were confirmed. This artifact did not exist in the real object, as shown in Figure 7a. The k values for the unsharp masking filter shown in Figure 8b were 0.31, 0.84, and 0.30 for the X, Y, and Z channels, respectively. Green artifacts are clearly visible near the black edges of the edge image. A comparison of Figure 8b,c confirms that the proposed method is closer to the target MTF and produces a natural and clear appearance of the edge contours in the image. In addition, in the enlarged image of the glitter in the glass tile, the highlights of the glitter appear more prominent with the proposed method than in the blurred input image.
Figure 9 shows the results of the MSE iterations and the coefficient arrays. As in Condition A, iterations were performed until the convergence condition ε ≤ 1.1 was satisfied, and the numbers of iterations to convergence were 3, 2, and 3 for the X, Y, and Z channels, respectively. The maximum and minimum values of the coefficient array were 1.30 and 1.00 for the X channel, 1.83 and 1.00 for the Y channel, and 1.98 and 1.00 for the Z channel. The number of iterations required for convergence was small, confirming that the MTF conversion results accurately approximated the target MTF.
3.3. Discussion and Issues
Compared with the conventional unsharp masking method for improving image blurring related to the spatial resolution of the image, the MTF conversion of the proposed method achieves natural image processing without false colors or edge-enhancement artifacts, while keeping the converted MTF within the range allowed by the camera MTF.
Table 1 summarizes the SSIM and PSNR values obtained for the Y channel based on the following images: (i) an image captured by an actual camera with the target MTF; (ii) an image that differs from (i) only in its MTF (all conditions other than focus are the same); (iii) an image obtained by taking (ii) as the input image and converting it with the proposed method to match the MTF of (i); and (iv) an image obtained by taking (ii) as the input image and performing MTF conversion using the sharpening filter. Each image shows a close-up view of an edge image. The SSIM and PSNR values between each of the images (ii)–(iv) and image (i) were calculated; improvements in the values were confirmed for both the proposed method and the sharpening filter, and the proposed method was more effective than the sharpening filter.
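The two metrics can be computed, for instance, with scikit-image; the sketch below assumes the Y-channel images have been cropped to the same edge region and scaled to [0, 1].

```python
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def evaluate_against_reference(reference_y, candidate_y):
    """SSIM and PSNR of a candidate Y-channel crop against the reference image (i)."""
    ssim = structural_similarity(reference_y, candidate_y, data_range=1.0)
    psnr = peak_signal_noise_ratio(reference_y, candidate_y, data_range=1.0)
    return ssim, psnr

# Usage (hypothetical variable names): compare (ii), (iii), and (iv) against (i)
# for name, img in {"input (ii)": img_ii, "proposed (iii)": img_iii, "unsharp (iv)": img_iv}.items():
#     print(name, evaluate_against_reference(img_i, img))
```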
However, several issues remain to be resolved. Our proposed method targets the conversion of interchannel or interdevice characteristics. Therefore, the amount of conversion is small compared with the restoration of a degraded image toward an ideal image, as performed by deconvolution algorithms. Our method performs frequency transformations, which in principle raises the problem of noise enhancement; however, the effect is small enough to be negligible because the targeted amount of conversion is small, and we do not include regularization or any other smoothing in image space as a constraint. Therefore, the MSE error has the advantage of being extremely small, because the adjustment is based purely on the MTF difference. For conversions with large MTF differences, the proposed method may not be applicable because of noise amplification, which is a limitation of the method. For the purposes of this study, however, this influence is negligible owing to the small amount of conversion targeted. Because the MTF conversion affects the frequency characteristics of the image, a careful approach to the conversion is required.
To address the issue of improving color reproduction, it is necessary to perform the MTF conversion in the CIEXYZ space, which is independent of the imaging device, and to reconsider the correspondence between the device-dependent RGB space and the device-independent CIEXYZ space, because the rendering method inside the imaging device is a black box. For future development, it is necessary to explore image reproduction methods that consider the relationship with color reproduction. As a preliminary step to the experiments described in this paper, we validated our results using images with different numbers of channels (CIEXYZ images and spectral images at 1 nm intervals). Although the accuracy could be improved by dividing the visible wavelength range into more groups, under our experimental conditions we concluded that there was no need to increase the number of channels, for the following three reasons: (1) the human eye could not recognize any difference between the results of the converted spectral and CIEXYZ images; (2) the expected improvement in accuracy was not worth the computational workload; and (3) because the spectral images were converted to CIEXYZ images in the subsequent step, using three channels was more efficient, and the MSE accuracy for the three CIEXYZ channels was sufficiently small.
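As one concrete reference point for that correspondence, the standardized sRGB/CIEXYZ relationship can be used when the device's internal rendering is unavailable; the sketch below uses the IEC sRGB matrix and transfer function, which is only an approximation of any real camera's device-dependent RGB.

```python
import numpy as np

# Linear-sRGB (D65) matrix as a standardized stand-in for the camera's
# black-box rendering; actual device RGB generally differs from sRGB.
XYZ_TO_LINEAR_SRGB = np.array([[ 3.2406, -1.5372, -0.4986],
                               [-0.9689,  1.8758,  0.0415],
                               [ 0.0557, -0.2040,  1.0570]])

def xyz_to_srgb(xyz):
    """Convert an H x W x 3 CIEXYZ image (Y normalized to [0, 1]) to sRGB."""
    linear = np.clip(xyz @ XYZ_TO_LINEAR_SRGB.T, 0.0, 1.0)
    # sRGB transfer function (gamma encoding)
    return np.where(linear <= 0.0031308,
                    12.92 * linear,
                    1.055 * np.power(linear, 1 / 2.4) - 0.055)
```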
Regarding the computational load, measures such as adjusting the number of parameters in steps according to the desired accuracy must be considered, based on the tradeoff between the cost of the coefficient calculation and the desired MTF conversion accuracy. In this study, we improved the approximation accuracy by sampling at 5% steps with respect to the sum of the change ratios, dividing the frequency band from low to high frequencies into 20 bands, and obtaining effective coefficients. The more quantization points are added, the greater the MSE reduction and MTF approximation accuracy that can be achieved. Therefore, there is still room for further study, together with the acceleration of the MTF conversion algorithm. In addition, it is necessary to address the side effects of MTF conversion, such as the increase in noise and in the amount of computation that results from MTF improvement.
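A hypothetical illustration of such a band-wise coefficient search (not the exact procedure used in this study) could look like the following: the frequency range 0-0.5 cpp is split into 20 bands and, for each band, a gain is chosen from candidates spaced in 5% steps so as to minimize the MSE against the target MTF; the candidate range and the per-band independence are assumptions.

```python
import numpy as np

def band_coefficients(mtf_in, mtf_target, freqs, n_bands=20, step=0.05, max_gain=6.0):
    """Choose one multiplicative coefficient per frequency band (0-0.5 cpp)."""
    edges = np.linspace(0.0, 0.5, n_bands + 1)
    candidates = np.arange(1.0, max_gain + step, step)   # gains in 5% steps
    gains = np.ones(n_bands)
    for b in range(n_bands):
        in_band = (freqs >= edges[b]) & (freqs < edges[b + 1])
        if not np.any(in_band):
            continue  # no sampled frequencies fall in this band
        errors = [np.mean((g * mtf_in[in_band] - mtf_target[in_band]) ** 2)
                  for g in candidates]
        gains[b] = candidates[int(np.argmin(errors))]
    return gains
```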
In this study, the MTF was calculated using the ISO 12233 edge method. However, many studies have discussed accelerating MTF calculation, and future research should introduce these methods as appropriate to speed up the MTF calculation and improve its accuracy. In the proposed method, an iterative process that converges the MSE value was adopted as a starting point. However, other measures, such as the goodness-of-fit coefficient, should also be considered for the MTF approximation in order to explore better approaches in terms of accuracy and speed. These are important research questions for improving image-processing and optical technologies.
It is expected that the results of this study can be used to solve various image-forming characteristic issues by implementing an MTF conversion that considers the differences in image-forming characteristics among common imaging devices. In addition, because the MTF conversion target can be measured or simulated using existing methods, it has the potential to contribute to the development of various industrial technologies such as reducing the cost of prototypes by applying image simulation technology to the design of imaging equipment. This is applicable not only to image-capturing devices such as cameras, as shown in this study, but is also expected to be applicable to image-generating devices such as displays.
4. Conclusions
In this study, we proposed a method to address the issue of the differing characteristics that inevitably arise within or between imaging devices. We focused on the MTF, an optical performance index, and enabled it to be converted automatically among image channels within or between imaging devices. The results of the MTF conversion for multichannel images with different image-forming characteristics within an imaging device showed that it is possible to generate sharper images by approximating the target MTF. The physical resolution characteristics can be controlled by the MTF conversion enabled in this study, and as a future step, we will begin to develop control techniques that also consider perceptual resolution.
We believe that this study is important because the following technical and academic progress can be expected from converting the MTF between image channels using the proposed method, which has not been addressed previously. (1) Even if the imaging characteristics of the hardware are unknown, the MTF can be converted to the target MTF using the image after it has been captured. (2) Because any MTF can be set as the target, image simulation of conversion to a different MTF is possible. (3) It is possible to generate high-definition images, thereby meeting the requirements of various industrial and research fields in which such images are required. This study is expected to address various image-forming characteristic issues, potentially leading to the advancement of various industrial technologies. For example, it has the potential to reduce prototype costs through the application of image simulation technology in the design of imaging equipment. Our method may be applied not only to imaging devices such as cameras but also to image-producing devices such as displays. Consequently, this study is expected to serve as a foundational technology for comprehensively improving the image reproduction capabilities of imaging devices.