1. Introduction
The continuous development of microscopy technology has provided new tools and methods for bionics research. With the help of high-resolution biological microscopes, researchers can observe and measure the microstructure and surface features of biological samples, providing an important basis for their application in engineering and technology [1,2]. However, the depth of field of optical microscopy is always limited: regions within the depth of field are imaged sharply, while regions beyond it are blurred, which restricts complete observation of the object. Another significant drawback of traditional monocular biological microscopy is that it is confined to two dimensions, which is less comprehensive and intuitive than three-dimensional (3D) observation. To date, numerous approaches have been proposed for reconstructing 3D information from out-of-focus planes using a microscope [3,4,5]. Among them, there are two main optical-imaging-based 3D reconstruction methods: stereo vision imaging and structured light. The stereo vision imaging method employs two or more cameras to simultaneously capture images of biological samples from different viewpoints. By matching features of the same scene across these captured images, a 3D reconstruction of the object can be achieved [6]. The main advantage of this method is that it can provide highly accurate 3D reconstruction results [7]. However, it is sensitive to ambient light, and the simultaneous use of multiple cameras is both costly and bulky; it also increases the time consumed by data processing [8]. The structured light method projects a structured pattern onto the surface of the sample to be measured; the deformation of this pattern on the object surface is then observed by a camera to infer the depth information of the sample [9]. The method has advantages in terms of efficiency and field of view [10]. However, it places high demands on the light source because secondary reflections of light often occur, so the problem of light interaction produced by multiple light sources must be considered [11]. With the rapid development of optical imaging, a 3D reconstruction method for the monocular biological microscope based on the shape-from-focus (SFF) approach has been developed [12,13]. By analyzing the relationship between object distance, focal length, and image sharpness using a combination of depth-of-field measurement and vertical scanning technology, the 3D information of the object can be recovered. The method has the advantages of low computational cost, high accuracy, and easy miniaturization [14,15]. The limited depth of field of the monocular biological microscope can be extended by switching to a low-magnification objective or by employing a translation stage to scan the sample along the optical axis and acquire multi-focus images [16,17]. Unfortunately, switching the objective or moving the translation stage requires manual or mechanical operation, which makes the microscope complicated, bulky, heavy, and expensive [18]. Moreover, manual or mechanical operation inevitably causes sample vibration, which degrades the 3D reconstruction performance [19].
The electrically tunable lens (ETL) is a new type of lens based on the principle of bionics: it mimics the structure of the human eye to achieve fast and precise focus adjustment. Compared with a translation stage, the ETL realizes vibration-free axial scanning and is suitable for compact, fast-response, low-power microscopy [20,21,22]. In 2015, Jun Jiang et al. proposed a 3D temporal focusing microscope that uses an ETL to extend the depth of field; the ETL provided a fast and compact way to perform non-mechanical z-direction scanning [23]. However, because the temporal focusing microscope is not a telecentric optical structure, the magnification changes when the focal length of the ETL changes, which degrades the imaging performance. In 2018, Yufu Qu et al. proposed a monocular wide-field microscope with extended depth of field to enable accurate 3D reconstruction [24]. However, the acquisition process and reconstruction algorithms are time-consuming because images from multiple views of the sample are required to acquire the 3D point clouds used for reconstruction. In 2021, Gyanendra Sheoran et al. carried out a simulation and analysis of a variable-numerical-aperture wide-field microscope objective combined with an ETL for axial scanning with a telecentric image space [25]. However, it is difficult to place the ETL at the back focal plane of the objective for precise axial scanning with continuous resolution. Therefore, there is an urgent need for a microscope with extended depth of field and 3D reconstruction that can rapidly acquire and process images and deliver accurate 3D reconstruction results for bionics research.
In this paper, we propose a biological microscope with colorful 3D reconstruction and extended depth of field using an ETL. To obtain high imaging performance, the magnification of the proposed microscope is kept invariant during depth-of-field extension, i.e., it is not appreciably affected by the focal length change of the ETL. This is realized by employing a telecentric 4f structure consisting of two identical relay lenses. The ETL is placed at the confocal plane of the 4f optical system and performs continuous axial scanning of the sample without mechanical movement. By adjusting the focal length of the ETL, images of the sample with different sharpness are obtained. Conventional biological microscopes face a trade-off between high resolution and large depth of field; this optical limitation is overcome by using image fusion techniques to achieve both goals simultaneously. We propose an improved Laplace pyramid image fusion method to expand the depth of field and thus present the key features of the sample. The 3D structure of the sample is then reconstructed using the SFF algorithm. During axial scanning, the state of each pixel in the image changes from defocus to focus to defocus. The sharpness of the pixel blocks in the image area is evaluated by a focus evaluation operator, and Gaussian curve fitting is performed on the evaluated values to obtain the depth of each point and form a 3D depth map. We developed a monocular biological microscope prototype and carried out imaging experiments to verify its feasibility. Under a 10× objective, depths of field of 120 µm, 240 µm, and 1440 µm are obtained for the shrimp larva, bee antenna, and gauge block samples, respectively. The maximum absolute errors for the two standard gauge blocks are −39.9 μm and −30.6 μm, corresponding to 3D reconstruction deviations of 0.78% and 1.52%.
2. Optical Simulation of the 4f Optical System with an ETL
If the monocular biological microscope using an ETL is not telecentric, the change in the focal length of the ETL affects the magnification, which in turn affects the resolution and quality of the image. To achieve a large axial scanning range at constant magnification, the ETL should be placed at the back focal plane of the objective. However, the back focal plane of the objective usually lies within the objective barrel, so it is difficult to place the ETL exactly there. Fortunately, if the back focal plane of the objective is relayed out by two identical relay lenses forming a 4f configuration, the ETL can be placed at the conjugate plane of the back focal plane [26]. The 4f optical system is based on the Abbe imaging principle. It consists of two relay lenses (L1 and L2) with the same focal length, arranged so that the back focal plane of L1 coincides with the front focal plane of L2, as shown in Figure 1. Via the 4f optical system, an ETL can easily be placed at the confocal plane.
According to Fourier optics, a light field f(x, y) can be expanded into a superposition of countless complex exponential functions:

f(x, y) = ∬ F(fx, fy) exp[j2π(fx x + fy y)] dfx dfy,   (1)

where fx and fy are the spatial frequencies in the x and y directions, respectively, and F(fx, fy) is the spatial frequency spectrum of f(x, y). The Fourier transform of f(x, y) can be expressed as

F(fx, fy) = ∬ f(x, y) exp[−j2π(fx x + fy y)] dx dy.   (2)
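As a quick numerical check, the discrete analogue of this transform pair can be evaluated with NumPy's FFT. The sketch below uses illustrative field values (not data from the paper) to verify that the forward and inverse transforms are mutually consistent, and previews the 4f result that two successive forward transforms invert the field through the origin.

```python
import numpy as np

# Discrete analogue of the Fourier pair above: fft2 computes the spectrum
# F(fx, fy) of a sampled field f(x, y); ifft2 rebuilds f as a superposition
# of complex exponentials. The field values here are illustrative only.
rng = np.random.default_rng(0)
f = rng.standard_normal((64, 64))            # sampled light-field amplitude

F = np.fft.fft2(f)                           # forward transform
f_rec = np.fft.ifft2(F).real                 # inverse transform
print(np.allclose(f_rec, f))                 # True: the pair is consistent

# Two successive forward transforms (as performed by the two relay lenses of
# a 4f system) return the field inverted through the origin, up to scaling.
f2 = np.fft.fft2(F).real / f.size
inverted = np.roll(np.roll(f[::-1, ::-1], 1, axis=0), 1, axis=1)
print(np.allclose(f2, inverted))             # True: central symmetry
```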
In the 4f system, an object with light field distribution M(x1, y1) is placed on the object plane and passes through relay lens L1, which performs a Fourier transform to yield the spectral function of the object. The light field distribution M(fx1, fy1) on the spectrum plane can be expressed as

M(fx1, fy1) = ∬ M(x1, y1) exp[−j2π(fx1 x1 + fy1 y1)] dx1 dy1.   (3)

The spatial spectrum of object M is obtained on the spectrum plane, where the ETL is placed at the confocal plane. After the transformation of relay lens L2, the light field distribution M(x2, y2) on the image plane is as follows:

M(x2, y2) = ∬ M(fx1, fy1) exp[−j2π(fx1 x2 + fy1 y2)] dfx1 dfy1 = M(−x1, −y1).   (4)

Thus, the image on the image plane is centrally symmetric with the image on the object plane. When the 4f optical system is added to an infinity-corrected microscope, the axial scanning range Δz of the objective is as follows [5]:

Δz = fr′² / (M0² fe′),   (5)

where Δz is the axial scanning range of the objective from the initial front focal plane, fr′ is the focal length of the relay lens, M0 is the magnification of the objective, and fe′ is the focal length of the ETL. From Equation (5), it can be seen that the axial scanning range Δz is proportional to the square of the focal length fr′ of the relay lens and inversely proportional to the focal length fe′ of the ETL and to the square of the magnification M0 of the objective.
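The scaling in Equation (5) can be evaluated directly. The short sketch below takes the relay focal length from the simulation (100 mm) and a 10× objective as assumed inputs, with the ETL optical power given in diopters; the specific power values are illustrative.

```python
# Axial scanning range of Eq. (5): dz = fr'^2 / (M0^2 * fe').
# Relay focal length (100 mm) and 10x objective follow the text; the ETL
# powers are illustrative values within a typical tuning range.
def scan_shift_mm(fr_mm: float, m0: float, etl_power_dpt: float) -> float:
    """Axial focal-plane shift (mm) for an ETL of the given optical power."""
    fe_mm = 1000.0 / etl_power_dpt           # diopters -> focal length in mm
    return fr_mm ** 2 / (m0 ** 2 * fe_mm)

fr, m0 = 100.0, 10.0
for power in (-3.5, -1.0, 1.0, 3.5):
    print(f"{power:+.1f} dpt -> dz = {scan_shift_mm(fr, m0, power):+.3f} mm")
```

Doubling the relay focal length quadruples the scanning range, while a higher-magnification objective shrinks it quadratically, consistent with Equation (5).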
Since the images on the object plane and the image plane are centrosymmetric, chief rays passing through the back focal point of the objective also pass through the center of the ETL, because the position of the ETL is conjugate to the back focal plane of the objective. While the focus is scanned axially by changing the focal length of the ETL, the ETL does not change the propagation directions of the chief rays, so the image points corresponding to the chief rays are maintained. Hence, the magnification of the 4f optical system remains constant when the focal length of the ETL is altered [27,28]. To further verify the axial scanning function of the 4f optical system with an ETL, we performed an optical simulation using Zemax 19.4 software. To keep the magnification constant during axial scanning, the choice of the ETL position is key to maintaining the telecentricity of the 4f optical system. In this simulation, the 4f optical system is constructed from two commercial relay lenses: two achromatic lenses (#49-360, Edmund, NJ, USA) with a focal length of 100 mm are chosen to minimize chromatic aberrations. A commercial ETL (EL-10-40-TC, Optotune, Dietikon, Switzerland) is placed at the confocal plane of the 4f optical system. To match the 10× objective with an NA of 0.25 used in the experimental section, the image-space NA is set to 0.025 in the simulation. The ray tracing of the 4f optical system with an ETL under five configurations is shown in Figure 2.
From Figure 2, we can see that by changing the focal length of the ETL, axial scanning of the object can be achieved without moving the image plane. The relationship between the object distance and the optical power of the ETL is shown in Figure 3: by adjusting the optical power of the ETL from negative to positive, the object distance shifts from 130 mm to 60 mm, corresponding to an axial scanning range of 70 mm. Figure 3 also shows the relationship between the magnification of the 4f optical system and the optical power of the ETL. It demonstrates that the object plane can be shifted widely by changing the focal length of the ETL without appreciably affecting the magnification of the 4f optical system; the maximum magnification error is 3.5%. Figure 3 further shows that when the optical powers of the ETL are −3.5 dpt and 3.5 dpt, the magnification of the simulated letter-'F' object is invariant, i.e., not appreciably affected by the change in the optical power of the ETL. Thus, when the ETL is located at the confocal plane, the 4f optical system becomes approximately telecentric and the magnification remains constant.
When broad-spectrum light passes through an optical system, different wavelengths propagate along slightly different optical paths, resulting in wavelength-dependent imaging differences known as chromatic aberration [29]. The ETL is a refractive optical element, and the dispersion of its material inevitably causes chromatic aberration. Therefore, in the proposed 4f optical system, we choose two cemented achromatic doublets to reduce the chromatic aberration of the system. Figure 4 and Figure 5 show the optical path, axial chromatic aberration, and lateral chromatic aberration for the ETL alone and for the 4f system with an ETL, respectively.
3. Principle of the Laplace Pyramid Image Fusion
Limited by the depth of field, the microscope can only obtain sharp images of sample heights within the depth of field; surfaces beyond the depth of field appear blurred. The depth of field is calculated as

dDOF = λn/NA² + n·e/(M·NA),   (6)

where dDOF is the depth of field, λ is the wavelength of the illumination light, n is the refractive index of the medium between the sample and the objective, NA is the numerical aperture, M is the magnification, and e is the minimum resolvable distance. According to Equation (6), the depth of field decreases as the magnification of the microscope increases. In our experiment, the depth of field of the 10× objective is 4.4 μm. To extend the depth of field, axial scanning of the object is performed by adjusting the focal length of the ETL, and images of the sample at different focus positions are acquired. Via multi-focus image fusion technology, a fully focused image can be achieved and accurate, complete image information can be obtained [30,31]. This technology compensates for the shortcomings of any single source image and makes the details of the object clearer.
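Equation (6) is easy to evaluate numerically. The sketch below plugs in a 10×/NA 0.25 objective in air with an assumed green-light wavelength and an assumed minimum resolvable distance e, so the resulting figure is illustrative rather than the paper's calibrated value.

```python
# Depth of field of Eq. (6): d_DOF = lam*n/NA**2 + n*e/(M*NA).
# Wavelength (0.55 um) and minimum resolvable distance e (3.45 um) are
# assumed values; the paper's calibrated parameters may differ.
def depth_of_field_um(lam_um: float, n: float, na: float,
                      mag: float, e_um: float) -> float:
    return lam_um * n / na ** 2 + n * e_um / (mag * na)

# 10x / NA 0.25 objective in air, green light, assumed detector resolution
d = depth_of_field_um(lam_um=0.55, n=1.0, na=0.25, mag=10.0, e_um=3.45)
print(f"depth of field ~ {d:.2f} um")        # diffraction term dominates
```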
Microscopic imaging requires preserving as much detail as possible from the original image for analysis and processing. Laplace pyramid image fusion produces a series of images at different scales, which can be used to extract detailed information from the image. However, conventional fusion rules do not always produce optimal results in the focus region. Therefore, an improved Laplace pyramid fusion method is proposed in this paper. After pyramid decomposition, the image forms a multi-scale, pyramid-shaped set of maps with different resolutions [32]. By comparing the source images at corresponding scales, the image details that are prominent in each source image can be extracted into the fused image, enriching it as much as possible in terms of information and achieving a better fusion effect. The flowchart of the improved Laplace pyramid image fusion method is shown in Figure 6.
Assuming the original image is A, we set G0(i, j) = A(i, j) (where 1 ≤ i ≤ R0 and 1 ≤ j ≤ C0) as the initial (0th) layer of the Gaussian pyramid. The next layer of the Gaussian pyramid is generated by

Gk+1(i, j) = Σ (m = −2 to 2) Σ (n = −2 to 2) s(m, n) Gk(2i + m, 2j + n),   (7)

where Gk(i, j) and Gk+1(i, j) represent the image of the current layer and the image of the next layer, respectively, Rk and Ck represent the height and width of the kth-layer image of the Gaussian pyramid, and s(m, n) is the image mask used to filter out the high-frequency part of the image.

After the Gaussian pyramid is generated, the (k+1)th-layer image Gk+1 is upsampled by convolution and interpolation to obtain G′k+1, whose size matches the kth-layer image Gk. Subtracting G′k+1 from Gk pixel by pixel gives the difference between two adjacent layers of the pyramid; this difference usually corresponds to image detail. The Laplacian pyramid is constructed from this difference information:

LPk = Gk − G′k+1 = Gk − UP(Gk+1) ⊗ g5×5,   (8)

where LPk is the kth layer of the Laplacian pyramid. LPk, as the difference between Gk and G′k+1, represents the information lost between two adjacent pyramid layers. UP represents upsampling of the image, and g5×5 is the Gaussian convolution kernel with a 5 × 5 window, represented as follows:

g5×5 = (1/256) ·
[ 1  4  6  4  1
  4 16 24 16  4
  6 24 36 24  6
  4 16 24 16  4
  1  4  6  4  1 ].   (9)
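The pyramid construction above can be sketched compactly in NumPy. The code below is an illustrative sketch (not the paper's implementation): it builds one Gaussian/Laplacian level using the separable 5 × 5 Gaussian mask.

```python
import numpy as np

# One Gaussian/Laplacian pyramid level: REDUCE filters with the 5x5 Gaussian
# mask and downsamples by 2; EXPAND upsamples and re-filters; LP_k is the
# detail lost between the two layers. Illustrative sketch only.
g = np.array([1.0, 4.0, 6.0, 4.0, 1.0])
g5x5 = np.outer(g, g) / 256.0                # separable 5x5 Gaussian mask

def conv2_same(img, ker):
    """2-D 'same' convolution with edge padding (kernel is symmetric)."""
    ph, pw = ker.shape[0] // 2, ker.shape[1] // 2
    pad = np.pad(img, ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(ker.shape[0]):
        for j in range(ker.shape[1]):
            out += ker[i, j] * pad[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def reduce_level(gk):
    """G_{k+1}: low-pass filter, then keep every second row and column."""
    return conv2_same(gk, g5x5)[::2, ::2]

def expand_level(gk1, shape):
    """G'_{k+1}: zero-interleave upsampling, then filter (x4 restores gain)."""
    up = np.zeros(shape)
    up[::2, ::2] = gk1
    return conv2_same(up, 4.0 * g5x5)

g0 = np.random.default_rng(1).random((64, 64))   # layer 0 of the pyramid
g1 = reduce_level(g0)                            # layer 1
lp0 = g0 - expand_level(g1, g0.shape)            # LP_0: high-frequency detail
```

For a flat image the Laplacian level is (away from the border) exactly zero, confirming that LP_k carries only the detail lost between layers.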
Multi-source images have different features and details. Laplacian pyramid image fusion filters these features and removes the blurred parts of the image using appropriate fusion rules to obtain a fully focused image [33]. The Laplacian pyramids of the source images are fused level by level; a Gaussian pyramid is then recovered by the inverse Laplacian transformation, and the bottom image of that pyramid is the fused image. The traditional sum-modified-Laplacian rule used for image fusion is

ML(i, j) = |2I(i, j) − I(i − 1, j) − I(i + 1, j)| + |2I(i, j) − I(i, j − 1) − I(i, j + 1)|,   (10)

SML(i, j) = Σ (p, q) ∈ wn×n ML(p, q), for ML(p, q) ≥ T,   (11)

where wn×n is the size of the selected window and T is a threshold.
We perform image fusion according to the fusion rules of a modified Laplace operator (MML) and a multi-scale SML (MSML), which replace the traditional Laplace operator, as shown in the following:

MML(i, j) = |2I(i, j) − I(i − step, j) − I(i + step, j)| + |2I(i, j) − I(i, j − step) − I(i, j + step)| + |2I(i, j) − I(i − step, j − step) − I(i + step, j + step)| + |2I(i, j) − I(i − step, j + step) − I(i + step, j − step)|,   (12)

MSML(i, j) = Σ (p, q) ∈ w1 MML(p, q) + Σ (p, q) ∈ w2 MML(p, q) + Σ (p, q) ∈ w3 MML(p, q),   (13)

where step represents the window spacing of the ML operator, and w1, w2, and w3 are three different sizes of the selected windows.

Compared with the traditional Laplace operator, the modified Laplace operator considers the change in sharpness in the diagonal directions around the pixels in the selected region; meanwhile, a variable spacing step is added to accommodate possible variations in the size of texture elements, so the judgment of the in-focus region is more robust, further improving the image fusion. Because a single window only considers a neighborhood at one scale, a relatively small window is sensitive to noise, while a relatively large window leads to overly smooth fusion results. Therefore, a new multi-scale SML is used to take full advantage of different neighborhoods. By combining neighborhood information at different scales, the features and details of the image are captured more comprehensively.
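A hedged sketch of these rules is given below. The diagonal terms, the spacing step, and the summation over three window sizes follow the description above, but the exact weighting of the paper's MSML may differ; all names are illustrative.

```python
import numpy as np

# Sketch of a modified Laplacian (horizontal, vertical, and diagonal terms
# with spacing `step`) summed over three window sizes. One plausible reading
# of the MML/MSML rule described above; the paper's exact weights may differ.
def modified_laplacian(img, step=1):
    p = np.pad(img.astype(float), step, mode="edge")
    c = p[step:-step, step:-step]
    up, dn = p[:-2 * step, step:-step], p[2 * step:, step:-step]
    lf, rt = p[step:-step, :-2 * step], p[step:-step, 2 * step:]
    ul, dr = p[:-2 * step, :-2 * step], p[2 * step:, 2 * step:]
    ur, dl = p[:-2 * step, 2 * step:], p[2 * step:, :-2 * step]
    return (np.abs(2 * c - up - dn) + np.abs(2 * c - lf - rt)
            + np.abs(2 * c - ul - dr) + np.abs(2 * c - ur - dl))

def window_sum(m, w):
    """Sum over a (2w+1) x (2w+1) neighbourhood around each pixel."""
    p = np.pad(m, w, mode="edge")
    out = np.zeros_like(m)
    for i in range(2 * w + 1):
        for j in range(2 * w + 1):
            out += p[i:i + m.shape[0], j:j + m.shape[1]]
    return out

def msml(img, windows=(1, 2, 3), step=1):
    """Multi-scale sum-modified-Laplacian focus measure per pixel."""
    ml = modified_laplacian(img, step)
    return sum(window_sum(ml, w) for w in windows)
```

A textured (in-focus) region scores far higher than a flat (defocused) one, which is what the fusion rule exploits when selecting pyramid coefficients.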
4. Principle of the Colorful 3D Reconstruction
The 3D information of a biological sample can be obtained by mechanically moving the sample to capture images at different depths. Unfortunately, this causes sample vibration, blurring the captured images, which reduces the image resolution and affects the reconstruction accuracy [34]. In this paper, multi-focused images are instead obtained by changing the focal length of the ETL. The SFF algorithm is used to achieve 3D reconstruction from these multi-focused images, exploiting the relationship between the degree of focus of the tested surface and its depth distribution. A focus measure (FM) function is used to extract depth information from the multi-focused image sequence, and the 3D morphology of the tested surface is reconstructed from this height information [35,36]. The schematic of the 3D reconstruction based on the SFF algorithm using the ETL is shown in Figure 7. Firstly, by varying the focal length of the ETL for axial scanning, sequential multi-focused images of the sample at different depths are obtained. Each image contains both sharp and blurred regions, and each pixel undergoes a defocus-focus-defocus process through the image sequence, as shown in Figure 7a. Secondly, by defining a suitable window size and evaluating the sharpness of the pixels in each image, we find, for each pixel, the image in the sequence where its sharpness is maximal, as shown in Figure 7b. Thirdly, based on the calibrated height at which that image was taken, the depth of the surface point corresponding to the pixel is obtained, as shown in Figure 7c. Lastly, the focus measurement operator evaluates the sharpness of the selected pixel block in the acquired image sequence. The position of the maximum of the focus evaluation curve could be selected directly as the depth of a pixel; although this yields a reconstructed surface, it introduces measurement inaccuracies because the image sequence is discrete, whereas the actual depth of the sample is continuous. Therefore, an interpolation and fitting operation is required to obtain continuous depth information. In this paper, Gaussian curve fitting is used to obtain height values close to the real surface microtopography. The height of each point in the window gives discrete depth information, which is interpolated and smoothed to obtain the 3D depth map, as shown in Figure 7d,e. The color information of the pixels in the image obtained by Laplace pyramid fusion is then mapped to the corresponding positions in the depth map, as shown in Figure 7f, completing the 3D reconstruction of the image sequence.
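The per-pixel depth-refinement step can be sketched as a three-point Gaussian (log-parabolic) fit around the focus-measure peak. The frame spacing and the synthetic focus curve below are illustrative assumptions, not measured data.

```python
import numpy as np

# Depth from focus with sub-frame refinement: take the frame with the largest
# focus measure, then fit a Gaussian (a parabola in log space) through that
# sample and its two neighbours to recover a continuous depth value.
def gaussian_peak(z, fm):
    """Continuous peak position of a per-pixel focus-measure curve fm(z)."""
    m = int(np.argmax(fm))
    if m == 0 or m == len(fm) - 1:
        return z[m]                          # peak at stack edge: no refinement
    y0, y1, y2 = np.log(fm[m - 1]), np.log(fm[m]), np.log(fm[m + 1])
    offset = (y0 - y2) / (2.0 * (y0 - 2.0 * y1 + y2))
    return z[m] + offset * (z[1] - z[0])

# Synthetic focus curve peaking between two sampled frames (illustrative).
z = np.arange(10, dtype=float)               # calibrated frame heights
fm = np.exp(-(z - 4.3) ** 2 / 2.0)           # Gaussian focus-measure curve
print(gaussian_peak(z, fm))                  # recovers ~4.3, between frames
```

Because the synthetic curve is exactly Gaussian, the fit lands between the sampled frames, illustrating why the continuous fit outperforms simply taking the discrete maximum.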
Focus measurement is an important step in the 3D reconstruction process and directly affects the accuracy of the 3D model. In this paper, the Tenengrad function is selected for focus measurement. The Tenengrad function calculates the horizontal and vertical gradient values by convolving each pixel of the image with the Sobel operator. The two convolution kernels of the Sobel gradient operator are

Gx = [−1 0 1; −2 0 2; −1 0 1],  Gy = [−1 −2 −1; 0 0 0; 1 2 1].   (14)

The Tenengrad function based on the Sobel operator is calculated as

FTenengrad = Σ (x = 1 to M) Σ (y = 1 to N) [∇G(x, y)]², for ∇G(x, y) > t,   (15)

where M × N is the window size and t is a threshold introduced to modulate the sensitivity of the evaluation function. The Sobel gradient operator ∇G(x, y) can be expressed as

∇G(x, y) = √(Gx(x, y)² + Gy(x, y)²).   (16)

The Tenengrad function is rotation invariant and isotropic, highlighting edges and lines in all directions, so it can serve as a criterion for the degree of image focus. In addition, because the Tenengrad function evaluates sharpness from edge intensity, it has high accuracy and a certain anti-noise capability [37,38].
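As a self-contained illustration of the measure described above (a NumPy sketch with assumed names, not the paper's implementation), the Tenengrad value of an image window can be computed as follows.

```python
import numpy as np

# Tenengrad focus measure: Sobel gradients, squared gradient magnitude,
# summed where it exceeds the threshold t. Illustrative sketch only.
SOBEL_X = np.array([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T

def conv3(img, ker):
    """3x3 'same' correlation with edge padding."""
    p = np.pad(img.astype(float), 1, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(3):
        for j in range(3):
            out += ker[i, j] * p[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def tenengrad(img, t=0.0):
    """Sum of squared Sobel gradient magnitudes above the threshold t."""
    gx, gy = conv3(img, SOBEL_X), conv3(img, SOBEL_Y)
    g2 = gx ** 2 + gy ** 2                   # squared gradient magnitude
    return float(g2[g2 > t ** 2].sum())

# A sharp edge scores higher than a smooth ramp of the same contrast.
edge = np.zeros((32, 32)); edge[:, 16:] = 1.0
ramp = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))
print(tenengrad(edge) > tenengrad(ramp))     # True
```

Squaring the gradient magnitude makes concentrated (sharp) edges dominate over the same contrast spread across many pixels, which is why the measure peaks at best focus.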