# Fast Multi-Focus Fusion Based on Deep Learning for Early-Stage Embryo Image Enhancement

## Abstract

## 1. Introduction

## 2. Related Works

## 3. Multi-Focus Image Fusion Framework

#### 3.1. Hardware Setup for the Acquisition of Multi-Focus Images

#### 3.2. Data Preparation

#### 3.3. Multi-Focus Image Fusion Approach Using U-Net Architecture

#### 3.4. Alternative Image Fusion Approaches

#### 3.4.1. Inverse Laplacian Pyramid Transform

#### 3.4.2. Enhanced Correlation Coefficient Maximization

#### 3.5. Image Similarity Metrics

- Root Mean Squared Error (RMSE) is commonly used to estimate the difference between two images by directly computing the variation in pixel values. A smaller RMSE indicates greater similarity [26]. Its value is defined as $$\mathrm{RMSE}=\sqrt{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}{\left({I}_{X}(i,j)-{I}_{Y}(i,j)\right)}^{2}};$$
- Spectral Angle Mapper (SAM) determines the spectral similarity between two spectra by treating them as vectors in a space with dimensionality equal to the number of bands and calculating the angle between them. A small angle between the two spectra indicates high similarity, and the ideal value of zero indicates the best spectral quality [27]. It is calculated using the following formula $$\mathrm{SAM}=\mathrm{acos}\left(\frac{\sum_{i=1}^{L}{u}_{i}{v}_{i}}{\sqrt{\sum_{i=1}^{L}{u}_{i}^{2}}\sqrt{\sum_{i=1}^{L}{v}_{i}^{2}}}\right);$$
- Peak Signal-to-Noise Ratio (PSNR) is calculated from the RMSE, taking into account the maximum possible pixel value of the image. For 8-bit images, values of around 20 dB to 25 dB are considered acceptable for wireless transmission quality loss, while values for lossy images typically range between 30 dB and 50 dB; higher is better [28]. The value of PSNR is obtained using $$\mathrm{PSNR}=20\,{\mathrm{log}}_{10}\frac{\mathrm{max}}{\mathrm{RMSE}};$$
- Universal Quality Index (UQI) represents brightness distortion, contrast distortion and correlation difference between two images. The best value is 1 if the images are equal [29]. The mathematical form of UQI is $$\mathrm{UQI}=\frac{4{\sigma}_{{I}_{X}{I}_{Y}}{\mu}_{{I}_{X}}{\mu}_{{I}_{Y}}}{\left({\sigma}_{{I}_{X}}^{2}+{\sigma}_{{I}_{Y}}^{2}\right)\left({\mu}_{{I}_{X}}^{2}+{\mu}_{{I}_{Y}}^{2}\right)};$$
- Structural Similarity Index Method (SSIM) compares the local patterns of pixel intensities between two images using three estimates: luminance, contrast, and structure [30]. The value ranges between $-1$ and 1, where the ideal value is 1. The value of SSIM is given by $$\mathrm{SSIM}=\frac{\left(2{\mu}_{{I}_{X}}{\mu}_{{I}_{Y}}+{C}_{1}\right)\left(2{\sigma}_{{I}_{X}{I}_{Y}}+{C}_{2}\right)}{\left({\mu}_{{I}_{X}}^{2}+{\mu}_{{I}_{Y}}^{2}+{C}_{1}\right)\left({\sigma}_{{I}_{X}}^{2}+{\sigma}_{{I}_{Y}}^{2}+{C}_{2}\right)},$$ where ${C}_{1}$ and ${C}_{2}$ are small constants that stabilize the division when the denominators are close to zero;
- Multi-Scale Structural Similarity Index Method (MS-SSIM), a more advanced form of SSIM, assesses quality in terms of image luminance, contrast and structure at multiple scales. The ideal value is 1. The computations are typically performed in a sliding N × N (by default 11 × 11) Gaussian-weighted window [31].
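
The formulas listed above can be sketched in NumPy as follows. This is a minimal illustration, not the implementation used in the paper: the function names are hypothetical, the UQI and SSIM variants use global image statistics in a single window (practical implementations usually average the index over sliding windows), and the SSIM constants assume the conventional choices $C_1=(0.01\cdot\mathrm{max})^2$ and $C_2=(0.03\cdot\mathrm{max})^2$. MS-SSIM is omitted because it additionally requires a multi-scale decomposition.

```python
import numpy as np


def _as_float(img):
    # Convert to float64 first so uint8 subtraction cannot wrap around.
    return np.asarray(img, dtype=np.float64)


def rmse(x, y):
    """Root mean squared error between two equally sized images."""
    x, y = _as_float(x), _as_float(y)
    return float(np.sqrt(np.mean((x - y) ** 2)))


def psnr(x, y, max_value=255.0):
    """PSNR in dB; max_value=255 assumes 8-bit images."""
    err = rmse(x, y)
    if err == 0.0:
        return float("inf")  # identical images
    return float(20.0 * np.log10(max_value / err))


def sam(u, v):
    """Spectral angle (radians) between two spectra treated as vectors."""
    u, v = _as_float(u).ravel(), _as_float(v).ravel()
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    if denom == 0.0:
        return float("nan")  # angle is undefined for a zero vector
    cos_angle = np.dot(u, v) / denom
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))


def _global_stats(x, y):
    # Means, population variances and covariance over the whole image.
    x, y = _as_float(x), _as_float(y)
    mu_x, mu_y = x.mean(), y.mean()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    return mu_x, mu_y, x.var(), y.var(), cov_xy


def uqi_global(x, y):
    """UQI from global statistics (assumption: one window = whole image)."""
    mu_x, mu_y, var_x, var_y, cov_xy = _global_stats(x, y)
    denom = (var_x + var_y) * (mu_x ** 2 + mu_y ** 2)
    if denom == 0.0:
        return float("nan")  # undefined for constant or all-zero images
    return float(4.0 * cov_xy * mu_x * mu_y / denom)


def ssim_global(x, y, max_value=255.0, k1=0.01, k2=0.03):
    """SSIM from global statistics; k1 and k2 are assumed defaults."""
    mu_x, mu_y, var_x, var_y, cov_xy = _global_stats(x, y)
    c1, c2 = (k1 * max_value) ** 2, (k2 * max_value) ** 2
    num = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)
```

For a fused image and a reference image stored as 8-bit arrays, `rmse(fused, reference)` and `psnr(fused, reference)` can be called directly. The global UQI and SSIM values are only a rough proxy for the windowed versions, which are more sensitive to local distortions.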

## 4. Experimental Results

## 5. Discussion

## 6. Conclusions

## Abbreviations

| Abbreviation | Meaning |
|---|---|
| CNNs | Convolutional Neural Networks |
| ECC | Enhanced Correlation Coefficient |
| IVF | In-Vitro Fertilization |
| LP | Laplacian pyramid |
| MS-SSIM | Multi-Scale Structural Similarity Index Method |
| PSNR | Peak Signal-to-Noise Ratio |
| SAM | Spectral Angle Mapper |
| SSIM | Structural Similarity Index Method |
| TL | Time-Lapse |
| UQI | Universal Quality Index |

**Figure 7.** Fused embryo images using U-Net, Laplacian pyramids (LP) and Enhanced Correlation Coefficient (ECC) approaches: (**a**) one cell, (**b**) two cells, (**c**) four cells, (**d**) eight cells.

**Figure 10.** Comparison of fusion times for images of different resolutions using the proposed U-Net-based multi-focus image fusion approach and its two alternatives, the Laplacian pyramid transform and the ECC method.

**Figure 11.** Comparison of the fused images using the proposed multi-focus image fusion approach using U-Net architecture, LP transform and ECC method for 2-cell and 4-cell embryos.

**Figure 12.** (**A**) Early-stage embryo taken at the first focal plane (FP1), (**B**) early-stage embryo taken at the seventh focal plane (FP7), (**C**) fused image using the proposed algorithm (FS).

