On the Importance of Non-Gaussianity in Chlorophyll Fluorescence Imaging

Angelina El Ghaziri; Nizar Bouhlel; Natalia Sapoukhina; David Rousseau

doi:10.3390/rs15020528

¹ Institut Agro, Université d’Angers, INRAE, IRHS, SFR QuaSaV, 49000 Angers, France

² LARIS, UMR INRAE IRHS, Université d’Angers, 62 Avenue Notre Dame du Lac, 49000 Angers, France

* Author to whom correspondence should be addressed.

Remote Sens. 2023, 15(2), 528; https://doi.org/10.3390/rs15020528

This article belongs to the Special Issue Mathematical Modelling and Simulation Algorithms for Plant Growth (Above and Belowground) from Contactless Images

Abstract

We propose a mathematical study of the statistics of chlorophyll fluorescence indices. While most of the literature assumes Gaussian distributions for these indices, we demonstrate their fundamentally non-Gaussian nature. Indeed, while the noise in the raw fluorescence images can be assumed to be additive and Gaussian, the deterministic ratio between them produces nonlinear, non-Gaussian distributions. We investigate the conditions under which this non-Gaussianity can affect statistical estimation when wrongly approached with linear estimators. We provide an expectation–maximization estimator adapted to the non-Gaussian distributions. We illustrate the interest of this estimator with simulations from images of chlorophyll fluorescence indices. We demonstrate the benefits of our approach by comparison with the standard Gaussian assumption. For the most pronounced deviations from Gaussianity, the estimation error of our expectation–maximization estimator remains low, reaching seven percent, whereas estimators based on Gaussian assumptions rise to more than 70 percent estimation error. These results show the importance of considering rigorous mathematical estimation approaches for chlorophyll fluorescence indices. The application of this work could be extended to various vegetation indices that are also built from a ratio of Gaussian distributions.

Keywords:

Arabidopsis; Bayesian inference; Expectation–Maximization (EM) algorithm; parameter estimation; plant imaging; statistics; vegetation indices

1. Introduction

Chlorophyll fluorescence imaging is a well-established imaging technique for plant phenotyping [1,2,3,4,5,6]. In this technique, flashes of light are sent onto leaves and the resulting emitted fluorescence is captured as grayscale images. Images acquired during the illumination protocols are then combined to provide chlorophyll fluorescence indices. These indices are directly related to the availability of electrons in the tissue and are therefore related to its chemical content and, indirectly, to the physiology of the tissue at the time of the acquisition. Chlorophyll fluorescence imaging has been widely reported to monitor plant growth and response to stress [5]. Although the technique is already in use, investigations on chlorophyll fluorescence continue to be extended in various directions, including the search for new sequences of illumination protocols [7], the physiological interpretation of image signatures [8], the genetic determinism associated with chlorophyll fluorescence signals [9,10], and the fundamental biomolecular mechanisms at work [9,11]. We position this article in this trend of further investigations of chlorophyll fluorescence, here at the level of the mathematical modeling of the statistics observed in chlorophyll fluorescence indices.

Chlorophyll fluorescence indices are mainly built from differences and ratios of images, which are corrupted with various amounts of additive Gaussian noise. The nonlinear deterministic combination of these images necessarily produces images with non-Gaussian noise. An estimation of the distribution of gray levels in the resulting indices is then performed. Surprisingly, the non-Gaussianity of the chlorophyll fluorescence indices has only recently been highlighted empirically [12]. In most of the literature, Gaussianity is assumed, and therefore one resorts to the associated linear estimators of average and standard deviation to characterize the chlorophyll fluorescence indices [13,14,15,16,17,18,19,20,21,22,23,24,25,26]. This assumption may not be an issue for the phenotyping situations considered in the literature, where the aim is not to measure a biomarker but rather to detect a difference between a reference (genotype or control conditions) and another plant (other genotype or various stress conditions). From a methodological and mathematical point of view, however, it is not rigorous to systematically make this Gaussian assumption, since its possible negative impact on the estimation of chlorophyll fluorescence distribution parameters is not known.

In this article, we further investigate this non-Gaussianity mathematically. We identify the conditions under which Gaussianity assumptions can be made, and we design appropriate statistical estimators of generic value in the Gaussian and non-Gaussian cases. We illustrate the advantages of our approach using simulations and images of chlorophyll fluorescence indices.

The paper is organized as follows. We describe the empirical data sets of chlorophyll fluorescence images of diseased plants (Section 2). From the statistical analysis of these data sets, we then propose a statistical model for chlorophyll fluorescence indices (Section 3). We derive two estimators of the resulting non-Gaussian distributions, one Bayesian and one based on the expectation–maximization algorithm. The performance of these estimators is compared with the standard Gaussian approximation on synthetic data simulating the empirical data sets (Section 4). We conclude with the importance of considering the non-Gaussianity of chlorophyll fluorescence indices. We provide mathematical proofs of the properties related to the various expressions in Appendix A.

2. Material

2.1. Arabidopsis thaliana Inoculated by a Bacteria

We consider chlorophyll fluorescence imaging on rosettes of Arabidopsis thaliana ecotype Col0. The experiment consisted of 36 pots of four plants each. Half of the pots were inoculated with water, and the other half with the virulent DC300, a tomato bacterium. We label the pots inoculated with water as ‘Healthy’ and those inoculated with the bacteria as ‘Diseased’. Chlorophyll fluorescence imaging was performed on six days of the experiment: D0, D2, D5, D6, D7, and D8. The same data set was used for automated disease segmentation [13,27], and a full description of the experiment is given in these two papers. We focus on $F_{m}$, the fluorescence after a saturating actinic flash, and $F_{0}$, the basal fluorescence before the flash. To build a statistical model of $F_{0}$ and $F_{m}$, we manually selected areas located in the limb of the leaves, as illustrated in Figure 1. The physical interpretation of the observed distribution is the thermal noise of the camera, which is expected to be Gaussian with an additive coupling.

Figure 1. Example of chlorophyll fluorescence images of Arabidopsis thaliana inoculated by a bacteria, at day 2, pot 19, a healthy pot (inoculated with water): (a) $F_{m}$ maximum fluorescence and (b) $F_{0}$ minimum fluorescence. The histograms (c,d) are the associated frequency distributions of pixel counts inside the region of interest drawn in a solid yellow line in (a,b), respectively. The dashed blue line in the histograms is the fit with a normal probability density function (pdf).

Therefore, we verified the adequacy of $F_{0}$ and $F_{m}$ to a normal distribution with the D’Agostino test [28]. We chose the D’Agostino test among other normality tests since it is the recommended test when ties (ex aequo values) are present in the variable, which is the case with our pixel count data. We sampled four pots from Healthy and Diseased for each day of the experiment, and we selected areas located in the limb of the leaves as illustrated in Figure 1. Table 1 shows the mean and the standard deviation of these four p-values associated with the D’Agostino test of normality on the resulting pixel counts of the selections. From Table 1, all p-values are higher than 0.05. Consequently, we do not reject the null hypothesis that $F_{0}$ and $F_{m}$ follow a normal distribution in either healthy or diseased tissues. We extracted the mean and the standard deviation of healthy and diseased pixels from the $F_{0}$ and $F_{m}$ parameters for the six days of the experiment. The results are in Table 2. It is noticeable that the standard deviation of the Gaussian distribution is relatively stable for healthy and diseased plants over the experiment. This observation is compatible with the interpretation of randomness due mainly to the stationary thermal noise of the camera.

Table 1. Mean ± the standard deviation of four p-values associated with the D’Agostino test of normality in the limb of the Arabidopsis thaliana inoculated by a bacteria for $F_{m}$ maximum fluorescence and $F_{0}$ minimum fluorescence. D0, …, and D8 are the six days of the acquisition of chlorophyll fluorescence images.

Table 2. Mean, $μ$, and standard deviation, $σ$, values on Healthy and Diseased tissues of chlorophyll fluorescence parameters $F_{0}$ (minimum fluorescence) and $F_{m}$ (maximum fluorescence) for images of plants inoculated with bacteria. D0, …, and D8 are the six days of the acquisition of chlorophyll fluorescence images.

2.2. Arabidopsis thaliana Infected with a Fungal Pathogen

We considered a second study aiming to score the development of symptoms of a fungal pathogen (Botrytis cinerea) on Arabidopsis thaliana plants [12]. This is currently the only public data set on chlorophyll fluorescence imaging of diseased plants that we found. The data set can be found in [29]. It is composed of chlorophyll fluorescence images and RGB images acquired during 96 h post-infection at 0 h, 24 h, 72 h, and 96 h. After successfully checking (not shown) the Gaussianity of the distribution of gray levels in limbs of the $F_{0}$ and $F_{m}$ images, we computed, as for the previous data set, the associated mean and standard deviation. The values obtained are provided in Table 3. Here again, one can notice that the standard deviation of the Gaussian distribution is relatively stable over the experiment, but only for the healthy plants. For this fungal disease, spores progressively appear at the surface of the leaves. These spores act as a multiplicative filter that absorbs the incident light. The emergence of these spores may add another source of randomness here.

Table 3. Mean, $μ$, and standard deviation, $σ$, values on Healthy and Diseased tissues of chlorophyll fluorescence parameters $F_{0}$ (minimum fluorescence) and $F_{m}$ (maximum fluorescence) for the data set of plants infected with a fungal pathogen. 0 h, …, 96 h are the five times of the acquisition of chlorophyll fluorescence images.

In the following sections, we refer to the chlorophyll fluorescence images of Arabidopsis thaliana inoculated by a bacteria as the bacteria data set, and to the chlorophyll fluorescence images of Arabidopsis thaliana infected with a fungal pathogen as the fungal pathogen data set.

3. Methods

3.1. Statistical Model of Fv/Fm

In the chlorophyll fluorescence imaging technique, the raw images $F_{m}$ and $F_{0}$ are not directly used [30,31]. Instead, they are combined to produce indices, which serve as biomarkers. We focus on the most common of these indices, known as the maximum quantum yield of photosystem II (PSII) [32]:

\frac{F_{v}}{F_{m}} = \frac{(F_{m} - F_{0})}{F_{m}} .

(1)

This ratio is an indicator of plant stress and is among the most used chlorophyll fluorescence parameters. It serves as a biomarker to assess the normal or abnormal photosynthetic activity of plant tissue with a threshold applied to the distributions. The choice of this parameter $F_{v} / F_{m}$ is made without loss of generality, as all chlorophyll fluorescence indices are based on ratios of images, with variations concerning the timing of the flash of light and the wavelength.
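As a small illustration of Equation (1), the sketch below computes the index pixel by pixel on synthetic gray levels. The function name `fv_fm` and the numerical values (means and standard deviations of the simulated pixels) are assumptions made for this example; they are not taken from the tables of the paper.

```python
import random

def fv_fm(f0, fm):
    """Pixelwise maximum quantum yield, Equation (1): (Fm - F0) / Fm."""
    return [(m - o) / m for o, m in zip(f0, fm)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical gray levels with additive Gaussian noise (illustrative values only).
f0 = [rng.gauss(200.0, 20.0) for _ in range(80)]
fm = [rng.gauss(1000.0, 100.0) for _ in range(80)]

index = fv_fm(f0, fm)
print(min(index), max(index))
```

Each value of `index` equals one minus the ratio of the corresponding `f0` and `fm` pixels, which is why the statistics of the index are governed by a ratio of two noisy images.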

We have shown in the previous section that both $F_{m}$ and $F_{0}$ can be modeled as Gaussian distributions.

Since the Gaussian distribution is alpha-stable, the difference between two Gaussian distributions is known to be a Gaussian distribution. Consequently, the distribution of $F_{v} / F_{m}$ can be modeled in the following way. Let us consider the variables X and Y as $F_{0}$ and $F_{m}$, respectively, where X and Y are two independent normally distributed variables:

X \sim N (μ_{x}, σ_{x}) and Y \sim N (μ_{y}, σ_{y}) .

Note that $F_{v} / F_{m} = 1 - Z$ with $Z = X / Y$, so the distribution of the index is that of Z up to a reflection and a unit shift. The probability density function of the ratio $Z = X / Y$, denoted PZ, is given by [33,34]:

p_{Z} (z) = \frac{ρ}{π (1 + ρ^{2} z^{2})} exp (- \frac{ρ^{2} β^{2} + 1}{2 δ_{y}^{2}}) \times [1 + \sqrt{\frac{π}{2}} q erf (\frac{q}{\sqrt{2}}) exp (\frac{q^{2}}{2})],

(2)

with $β = \frac{μ_{x}}{μ_{y}}$; $ρ = \frac{σ_{y}}{σ_{x}}$; $δ_{y} = \frac{σ_{y}}{μ_{y}}$; $q = \frac{1 + β ρ^{2} z}{δ_{y} \sqrt{1 + ρ^{2} z^{2}}}$, and $erf (\frac{q}{\sqrt{2}}) = \frac{2}{\sqrt{π}} \int_{0}^{\frac{q}{\sqrt{2}}} exp (- t^{2}) d t$.
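Equation (2) can be checked numerically. The sketch below, written under the parameter definitions above, evaluates the density and verifies that it integrates to one over the real line. The change of variable $z = tan(θ) / ρ$ and the midpoint rule are choices made for this illustration (they handle the heavy tails of the ratio), not a procedure described in the paper; the function names are hypothetical. The parameter values are those reported for the Healthy plants in Figure 3 and for the bimodal case of Figure 2.

```python
import math

def ratio_pdf(z, beta, rho, delta_y):
    """Density of Z = X / Y written as in Equation (2)."""
    s = 1.0 + rho**2 * z**2
    q = (1.0 + beta * rho**2 * z) / (delta_y * math.sqrt(s))
    lead = rho / (math.pi * s) * math.exp(-(rho**2 * beta**2 + 1.0) / (2.0 * delta_y**2))
    bracket = 1.0 + math.sqrt(math.pi / 2.0) * q * math.erf(q / math.sqrt(2.0)) * math.exp(q**2 / 2.0)
    return lead * bracket

def total_mass(beta, rho, delta_y, n=4000):
    """Integrate the density over the real line using z = tan(theta) / rho."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2.0 + (k + 0.5) * h  # midpoints avoid theta = +-pi/2
        z = math.tan(theta) / rho
        dz = h / (rho * math.cos(theta) ** 2)   # dz = d(theta) / (rho * cos^2(theta))
        total += ratio_pdf(z, beta, rho, delta_y) * dz
    return total

mass_healthy = total_mass(0.15, 5.79, 0.22)  # Healthy, bacteria data set (Figure 3a)
mass_bimodal = total_mass(2.0, 2.0, 2.0)     # bimodal case of Figure 2c
print(mass_healthy, mass_bimodal)
```

Both integrals should be close to one, which is a quick sanity check before using the density in an estimator.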

The PZ can also be written using the confluent hypergeometric function, $_{1} F_{1} (.)$ [35]:

p_{Z} (z) = \frac{ρ}{π (1 + ρ^{2} z^{2})} exp (- \frac{β^{2} ρ^{2} + 1}{2 δ_{y}^{2}})_{1} F_{1} (1, \frac{1}{2}; \frac{1}{2 δ_{y}^{2}} \frac{{(1 + β ρ^{2} z)}^{2}}{1 + ρ^{2} z^{2}}),

(3)

with $_{1} F_{1} (.)$ the confluent hypergeometric function, also known as Kummer’s function and defined as follows:

\begin{matrix} _{1} F_{1} (a; c; z) = \sum_{n = 0}^{+ \infty} \frac{{(a)}_{n}}{{(c)}_{n}} \frac{z^{n}}{n!}, \end{matrix}

(4)

where the Pochhammer symbol ${(a)}_{n}$ indicates the nth rising factorial of a, i.e.,

\begin{matrix} {(a)}_{n} = a (a + 1) \dots (a + n - 1) = \frac{Γ (a + n)}{Γ (a)} if n = 1, 2, \dots \end{matrix}

If $n = 0$, ${(a)}_{n} = 1$. The derivation of the second form of the PZ given by Equation (3) is presented in Appendix A.1.
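The equivalence between Equations (2) and (3) rests on the identity between the bracketed erf term and $_{1} F_{1} (1; \frac{1}{2}; \frac{q^{2}}{2})$. The following sketch sums the series of Equation (4) directly and compares the two expressions for a few values of q. The helper names `hyp1f1` and `bracket_erf`, the tolerance, and the truncation rule are assumptions of this example; a library routine would normally be preferred over a hand-written series.

```python
import math

def hyp1f1(a, c, x, tol=1e-16, max_terms=5000):
    """Kummer's function summed term by term from the series in Equation (4)."""
    term = 1.0   # n = 0 term, since (a)_0 = (c)_0 = 1
    total = 1.0
    for n in range(max_terms):
        # ratio of consecutive terms: (a + n) / (c + n) * x / (n + 1)
        term *= (a + n) / (c + n) * x / (n + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

def bracket_erf(q):
    """Bracketed factor of Equation (2)."""
    return 1.0 + math.sqrt(math.pi / 2.0) * q * math.erf(q / math.sqrt(2.0)) * math.exp(q * q / 2.0)

for q in (-3.0, -0.5, 0.0, 0.7, 2.0, 5.0):
    # Equation (3) uses 1F1(1; 1/2; q^2 / 2) in place of the bracket.
    print(q, bracket_erf(q), hyp1f1(1.0, 0.5, q * q / 2.0))
```

The two columns should agree to numerical precision, for negative as well as positive q, because both expressions depend on q only through $q^{2}$.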

An approximation of the distribution of Z by a normal distribution has been proposed in [34]. Most authors defined conditions on the parameters $β$, $ρ$ and $δ_{y}$, resulting from empirical or simulation works, and showed the switch from a normal distribution to a bimodal distribution under certain values [36,37,38]. This is illustrated with simulations for different values of $β$, $ρ$ and $δ_{y}$ in Figure 2. We draw the distribution of the ratio for three different sets of parameter values and we add the normal approximation proposed in [34].

Figure 2. The black curve with circle points is the distribution PZ of the ratio $F_{v} / F_{m}$ ($F_{m}$ maximum fluorescence and $F_{v} = F_{m} - F_{0}$, $F_{0}$ minimum fluorescence) for increasing values of the parameters $β$, $ρ$, and $δ_{y}$, and the red curve with crossed points is the normal approximation of the distribution in each of these cases. (a) A case with perfect fit of the ratio density and the normal approximation; (b) a deviation from normal distribution; (c) a case where the ratio density is bimodal and the normal approximation is not appropriate.

In the first case ($β = 0.1$, $ρ = 0.05$, $δ_{y} = 0.1$), we see a perfect fit of the normal approximation. In the second case ($β = 2$, $ρ = 0.5$, $δ_{y} = 0.6$), we start to see the deviation from normality, whereas in the third case ($β = 2$, $ρ = 2$, $δ_{y} = 2$), the distribution of the ratio is bimodal. Therefore, supposing a normal distribution of $F_{v} / F_{m}$ will be more or less wrong depending on the values of $β$, $ρ$, and $δ_{y}$.

There are not enough public sets of chlorophyll fluorescence images to determine whether the normal assumption can always be made when estimating statistical parameters. Therefore, it is necessary to design estimators adapted to the non-Gaussianity of Z in general.

3.2. Estimation of the PZ Parameters

We now present two estimators of the non-Gaussian distribution of Z modeling the maximum quantum yield $F_{v} / F_{m}$. The first estimator is based on Bayesian estimation and the second one follows the expectation maximization (EM) algorithm. The Bayesian estimation is chosen here with the hypothesis that two parameters are known from prior information, so that only one parameter needs to be estimated. For the EM algorithm, all the parameters are unknown and need to be estimated. We consider the latter to be a more general approach than the former.

3.2.1. Bayesian Estimation

We suppose that $β$ and $ρ$ are known, and we aim to estimate $δ_{y}$ using Bayesian inference. We consider the results for the first day of the experiment as our observed data to get the $β$ and $ρ$ values. The next step is to define the prior probability for $δ_{y}$. We recall that $δ_{y} = σ_{y} / μ_{y}$. Since only the parameters of the first day are known ($μ_{x}; μ_{y}; σ_{x}; σ_{y}$), we simulate N samples of Y associated with the first day: $Y_{1}, \dots, Y_{N} \sim N (μ_{y}, σ_{y})$. The ratio of the standard deviation to the mean value of these N samples leads to one $δ_{y}$ value. We repeat this simulation S times and we define $δ_{y} \sim N (μ_{δ_{y}}, σ_{δ_{y}})$, with $μ_{δ_{y}}$ and $σ_{δ_{y}}$ the mean and standard deviation of the $δ_{y}$ values obtained with the S simulations. The posterior distribution is therefore given by:

\begin{matrix} p (δ_{y} | z_{1}, \dots, z_{N}) & = \frac{p (δ_{y}) p (z_{1}, \dots, z_{N} | δ_{y})}{\int_{- \infty}^{+ \infty} p (δ_{y}) p (z_{1}, \dots, z_{N} | δ_{y}) d δ_{y}}, \end{matrix}

where

\begin{matrix} p (z_{1}, \dots, z_{N} | δ_{y}) = exp (\frac{- N (ρ^{2} β^{2} + 1)}{2 δ_{y}^{2}}) \prod_{i = 1}^{N} \frac{ρ}{π (1 + ρ^{2} z_{i}^{2})}_{1} F_{1} (1, \frac{1}{2}; \frac{1}{2 δ_{y}^{2}} \frac{{(1 + β ρ^{2} z_{i})}^{2}}{1 + ρ^{2} z_{i}^{2}}), \end{matrix}

(5)

and

\begin{matrix} \int_{- \infty}^{+ \infty} p (δ_{y}) p (z_{1}, \dots, z_{N} | δ_{y}) d δ_{y} = (\prod_{i = 1}^{N} \frac{ρ}{π (1 + ρ^{2} z_{i}^{2})}) \times \\ \int_{- \infty}^{+ \infty} exp (\frac{- N (ρ^{2} β^{2} + 1)}{2 δ_{y}^{2}} - \frac{{(δ_{y} - μ_{δ_{y}})}^{2}}{2 σ_{δ_{y}}^{2}}) \prod_{i = 1}^{N}_{1} F_{1} (1, \frac{1}{2}; \frac{1}{2 δ_{y}^{2}} \frac{{(1 + β ρ^{2} z_{i})}^{2}}{1 + ρ^{2} z_{i}^{2}}) d δ_{y}, \end{matrix}

(6)

with $z_{1} = \frac{X_{1}}{Y_{1}}, \dots, z_{N} = \frac{X_{N}}{Y_{N}}$, N ratios of N samples of two normal distributions $X \sim N (μ_{x}, σ_{x})$ and $Y \sim N (μ_{y}, σ_{y})$. The last step is to determine the posterior mean estimator of $δ_{y}$, which is given by [39]:

\begin{matrix} \hat{δ_{y}} & = E (δ_{y} | z_{1}, \dots, z_{N}) \\ & = \int_{- \infty}^{+ \infty} δ_{y} p (δ_{y} | z_{1}, \dots, z_{N}) d δ_{y} . \end{matrix}

(7)

The above integral is not straightforward to compute since it contains the confluent hypergeometric function. We therefore use the Monte Carlo (MC) method to approximate this integral numerically. The approximation of $δ_{y}$ using Monte Carlo is detailed in Algorithm 1. The resulting quantity is an unbiased and consistent estimator of $δ_{y}$. The only drawback of the MC method is that it is time-consuming: more iterations lead to better consistency but also to a longer simulation time.
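The posterior mean of Equation (7) can also be approximated without sampling. The sketch below is not Algorithm 1: it swaps the Monte Carlo step for a deterministic quadrature on a regular grid of $δ_{y}$ values, which is a choice made for this illustration. It evaluates the logarithm of Equation (5), adds the log of the Gaussian prior, and forms the normalized weighted average. The function names, the grid width, the prior parameters, and the unit scale $μ_{y} = 1$ are assumptions; $β$ and $ρ$ are the Diseased values reported in Figure 3.

```python
import math
import random

def hyp1f1(a, c, x, tol=1e-16, max_terms=5000):
    """Kummer's function from the series of Equation (4)."""
    term = total = 1.0
    for n in range(max_terms):
        term *= (a + n) / (c + n) * x / (n + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

def log_likelihood(z, beta, rho, delta_y):
    """Logarithm of Equation (5) for the observations z."""
    total = -len(z) * (rho**2 * beta**2 + 1.0) / (2.0 * delta_y**2)
    for zi in z:
        s = 1.0 + rho**2 * zi**2
        x = (1.0 + beta * rho**2 * zi) ** 2 / (2.0 * delta_y**2 * s)
        total += math.log(rho / (math.pi * s)) + math.log(hyp1f1(1.0, 0.5, x))
    return total

def posterior_mean(z, beta, rho, prior_mu, prior_sd, n_grid=101, width=5.0):
    """Posterior mean of delta_y (Equation (7)) by quadrature on a regular grid."""
    lo = max(prior_mu - width * prior_sd, 1e-3)  # delta_y must stay positive
    hi = prior_mu + width * prior_sd
    grid = [lo + (hi - lo) * k / (n_grid - 1) for k in range(n_grid)]
    # log prior (up to a constant) + log likelihood at each grid point
    logw = [log_likelihood(z, beta, rho, d) - 0.5 * ((d - prior_mu) / prior_sd) ** 2
            for d in grid]
    top = max(logw)                              # subtract the maximum for stability
    w = [math.exp(v - top) for v in logw]
    return sum(d * wi for d, wi in zip(grid, w)) / sum(w)

rng = random.Random(1)
beta, rho, true_delta = 0.24, 4.21, 0.25         # Diseased, bacteria data set (Figure 3b)
mu_y, sigma_y = 1.0, true_delta                  # arbitrary unit scale (assumption)
mu_x, sigma_x = beta * mu_y, sigma_y / rho
z = [rng.gauss(mu_x, sigma_x) / rng.gauss(mu_y, sigma_y) for _ in range(80)]
estimate = posterior_mean(z, beta, rho, prior_mu=0.25, prior_sd=0.02)
print(estimate)
```

Working with logarithms avoids underflow of the product in Equation (5); the constant factors of the prior and of the grid spacing cancel in the ratio.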

3.2.2. EM Estimation of the Parameters

Knowing that Z is the ratio of two independent normal variables distributed as $N (μ_{x}, σ_{x})$ and $N (μ_{y}, σ_{y})$, respectively, we first estimate the parameters $(μ_{x}, σ_{x}, μ_{y}, σ_{y})$ with the expectation maximization (EM) algorithm. Thereafter, the parameters $(β, ρ, δ_{y})$ can be deduced from the following relations:

\begin{matrix} \hat{β} = \frac{{\hat{μ}}_{x}}{{\hat{μ}}_{y}}, \hat{ρ} = \frac{{\hat{σ}}_{y}}{{\hat{σ}}_{x}}, {\hat{δ}}_{y} = \frac{{\hat{σ}}_{y}}{{\hat{μ}}_{y}} \end{matrix}

(8)

Consider N independent and identically distributed realizations $z_{i}, i = {1, \dots, N}$ of a random variable Z distributed according to the PZ written with the confluent hypergeometric function (3). This form of the PZ is better suited to the parameter estimation procedure using the EM algorithm.

The pdf associated with the ratio Z depends on the set of unknown parameters $θ = (μ_{x}, μ_{y}, σ_{x}^{2}, σ_{y}^{2})$. The maximum likelihood estimator $\hat{θ}$ of the parameter set $θ$ is given by:

\begin{matrix} {\hat{θ}}_{M L} & = arg max_{θ} p_{Z} (z | θ) = arg max_{θ} \prod_{i = 1}^{N} p_{Z} (z_{i} | θ) . \end{matrix}

(9)

Algorithm 1: MC estimation of $δ_{y}$.

Input: $N, S, K$
Output: $\hat{δ_{y}}$
Initialization:
  Set the parameters $(μ_{x}, σ_{x}, μ_{y}, σ_{y})$ of the first day of the experiment
Repeat for k in $1 \dots K$:
  Repeat for s in $1 \dots S$:
    generate N samples $y_{1}, \dots, y_{N} \sim N (μ_{y}, σ_{y})$
    calculate the associated mean $μ_{y}^{s}$ and standard deviation $σ_{y}^{s}$ :
    $μ_{y}^{s} = \frac{1}{N} \sum_{i = 1}^{N} y_{i}$ and $σ_{y}^{s} = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {(y_{i} - μ_{y}^{s})}^{2}}$
    deduce $δ_{y}^{s} = \frac{σ_{y}^{s}}{μ_{y}^{s}}$
    generate N samples $x_{1}, \dots, x_{N} \sim N (μ_{x}, σ_{x})$
    compute N samples $z_{1} = \frac{x_{1}}{y_{1}}, \dots, z_{N} = \frac{x_{N}}{y_{N}}$
    deduce $p (z_{1}, \dots, z_{N} | δ_{y}^{s})$ using Equation (5)
  End loop
  calculate $μ_{δ_{y}} = \frac{1}{S} \sum_{s = 1}^{S} δ_{y}^{s}$ and $σ_{δ_{y}} = \sqrt{\frac{1}{S} \sum_{s = 1}^{S} {(δ_{y}^{s} - μ_{δ_{y}})}^{2}}$
  deduce $p (δ_{y}^{s})$ the pdf of $δ_{y}^{s} \sim N (μ_{δ_{y}}, σ_{δ_{y}})$ , for $s = 1, \dots, S$
  compute and save $\hat{δ_{y}^{k}} = \frac{\sum_{s = 1}^{S} δ_{y}^{s} \times p (δ_{y}^{s}) \times p (z_{1}, \dots, z_{N} | δ_{y}^{s})}{\sum_{s = 1}^{S} p (δ_{y}^{s}) \times p (z_{1}, \dots, z_{N} | δ_{y}^{s})}$
End loop
Return $\hat{δ_{y}} = \frac{1}{K} \sum_{k = 1}^{K} \hat{δ_{y}^{k}}$

In the absence of an explicit solution of the maximum likelihood Equation (9), the Expectation–Maximization (EM) algorithm is used to find an estimate $\hat{θ}$ given a current estimate $θ^{'}$ of $θ$. We suppose that each observed $z_{i}$ is associated with an unobserved (hidden) value $y_{i}$. The sequence ${y_{i}, i = 1, \dots, N}$ is also supposed to be independent and identically distributed. The EM update is then:

\begin{matrix} \hat{θ} & = & arg max_{θ} E_{Y | Z} {ln f_{Y, Z} (Z, Y | θ) | z, θ^{'}} \\ & = & arg max_{θ} E_{Y | Z} {ln \prod_{i = 1}^{N} f_{Y, Z} (Z_{i}, Y_{i} | θ) | z_{i}, θ^{'}} \\ & = & arg max_{θ} E_{Y | Z} {\sum_{i = 1}^{N} ln f_{Y, Z} (Z_{i}, Y_{i} | θ) | z_{i}, θ^{'}} \\ & = & arg max_{θ} \sum_{i = 1}^{N} E_{Y | Z} {ln f_{Y, Z} (Z_{i}, Y_{i} | θ) | z_{i}, θ^{'}} . \end{matrix}

(10)

Let $θ_{x} = (μ_{x}, σ_{x})$ and $θ_{y} = (μ_{y}, σ_{y})$. By using the fact that:

f_{Y, Z} (z_{i}, y_{i} | θ) = f_{Z | Y} (z_{i} | y_{i}, θ_{x}) f_{Y} (y_{i} | θ_{y}),

(11)

then (10) can be maximized separately with respect to the sets of parameters $θ_{x}$ and $θ_{y}$ as follows:

\begin{matrix} {\hat{θ}}_{x} = arg max_{θ_{x}} \sum_{i = 1}^{N} E_{Y | Z} {ln f_{Z | Y} (Z_{i} | Y_{i}, θ_{x}) | z_{i}, θ^{'}} \end{matrix}

(12)

\begin{matrix} {\hat{θ}}_{y} = arg max_{θ_{y}} \sum_{i = 1}^{N} E_{Y | Z} {ln f_{Y} (Y_{i} | θ_{y}) | z_{i}, θ^{'}} . \end{matrix}

(13)

By developing these two equations and differentiating with respect to $μ_{x}$, $σ_{x}$, $μ_{y}$ and $σ_{y}$ (Appendix A.2), we obtain the estimates ${\hat{θ}}_{x} = ({\hat{μ}}_{x}, {\hat{σ}}_{x})$ and ${\hat{θ}}_{y} = ({\hat{μ}}_{y}, {\hat{σ}}_{y})$:

\begin{matrix} {\hat{μ}}_{x} & = \frac{1}{N} \sum_{i = 1}^{N} z_{i} E_{Y | Z} {Y_{i} | z_{i}, θ^{'}} \end{matrix}

(14)

\begin{matrix} {\hat{σ}}_{x}^{2} & = \frac{1}{N} \sum_{i = 1}^{N} z_{i}^{2} E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}} - {\hat{μ}}_{x}^{2} \end{matrix}

(15)

\begin{matrix} {\hat{μ}}_{y} & = \frac{1}{N} \sum_{i = 1}^{N} E_{Y | Z} {Y_{i} | z_{i}, θ^{'}} \end{matrix}

(16)

\begin{matrix} {\hat{σ}}_{y}^{2} & = \frac{1}{N} \sum_{i = 1}^{N} E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}} - {\hat{μ}}_{y}^{2} . \end{matrix}

(17)

where $E_{Y | Z} {Y_{i} | z_{i}, θ^{'}}$ and $E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}}$ are the posterior expectations, which depend on the distribution of Y and are given by:

\begin{matrix} E_{Y | Z} {Y_{i} | z_{i}, θ^{'}} & = \frac{γ}{μ} \frac{_{1} F_{1} (2, \frac{3}{2}; \frac{γ^{2}}{4 μ})}{_{1} F_{1} (1, \frac{1}{2}; \frac{γ^{2}}{4 μ})}, \end{matrix}

(18)

and

\begin{matrix} E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}} & = \frac{1}{μ} \frac{_{1} F_{1} (2, \frac{1}{2}; \frac{γ^{2}}{4 μ})}{_{1} F_{1} (1, \frac{1}{2}; \frac{γ^{2}}{4 μ})}, \end{matrix}

(19)

with $μ = \frac{1}{2} (\frac{z_{i}^{2}}{σ_{x}^{2}} + \frac{1}{σ_{y}^{2}})$ and $γ = \frac{μ_{y}}{σ_{y}^{2}} + \frac{μ_{x}}{σ_{x}^{2}} z_{i}$, both of which depend on $z_{i}$ and are evaluated at the current estimate $θ^{'}$.

The proof of both these expressions is also in Appendix A.2. This leads to the iterative Algorithm 2 for solving the maximum likelihood problem.

3.3. Comparison with Normal Assumptions Baseline

To show the benefit of using the Bayesian and EM estimations with the PZ distribution, we compare them with a standard normal distribution assumption and the normal approximation proposed in [34].

3.4. Numerical Experiments

We now present the metrics used to establish the value of the proposed non-Gaussian estimators for PZ. We then describe the numerical simulations undertaken with these metrics.

Algorithm 2: EM algorithm.

Input: $N, z_{i}, ϵ$
Output: $\hat{θ} = ({\hat{μ}}_{x}, {\hat{σ}}_{x}, {\hat{μ}}_{y}, {\hat{σ}}_{y})$ and consequently $(\hat{β}, \hat{ρ}, {\hat{δ}}_{y})$
Initialization:
  Set the parameters $θ^{'} = (μ_{x}^{'}, σ_{x}^{'}, μ_{y}^{'}, σ_{y}^{'})$
Repeat
  For $i = 1, \dots, N$, calculate $μ_{i}^{'} = \frac{1}{2} (\frac{z_{i}^{2}}{σ_{x}^{' 2}} + \frac{1}{σ_{y}^{' 2}})$ and $γ_{i}^{'} = \frac{μ_{y}^{'}}{σ_{y}^{' 2}} + \frac{μ_{x}^{'}}{σ_{x}^{' 2}} z_{i}$
  Estimate ${\hat{μ}}_{x} = \frac{1}{N} \sum_{i = 1}^{N} z_{i} \frac{γ_{i}^{'}}{μ_{i}^{'}} \frac{_{1} F_{1} (2, \frac{3}{2}; \frac{γ_{i}^{' 2}}{4 μ_{i}^{'}})}{_{1} F_{1} (1, \frac{1}{2}; \frac{γ_{i}^{' 2}}{4 μ_{i}^{'}})}$
  Estimate ${\hat{σ}}_{x}^{2} = \frac{1}{N} \sum_{i = 1}^{N} z_{i}^{2} \frac{1}{μ_{i}^{'}} \frac{_{1} F_{1} (2, \frac{1}{2}; \frac{γ_{i}^{' 2}}{4 μ_{i}^{'}})}{_{1} F_{1} (1, \frac{1}{2}; \frac{γ_{i}^{' 2}}{4 μ_{i}^{'}})} - {\hat{μ}}_{x}^{2}$
  Estimate ${\hat{μ}}_{y} = \frac{1}{N} \sum_{i = 1}^{N} \frac{γ_{i}^{'}}{μ_{i}^{'}} \frac{_{1} F_{1} (2, \frac{3}{2}; \frac{γ_{i}^{' 2}}{4 μ_{i}^{'}})}{_{1} F_{1} (1, \frac{1}{2}; \frac{γ_{i}^{' 2}}{4 μ_{i}^{'}})}$
  Estimate ${\hat{σ}}_{y}^{2} = \frac{1}{N} \sum_{i = 1}^{N} \frac{1}{μ_{i}^{'}} \frac{_{1} F_{1} (2, \frac{1}{2}; \frac{γ_{i}^{' 2}}{4 μ_{i}^{'}})}{_{1} F_{1} (1, \frac{1}{2}; \frac{γ_{i}^{' 2}}{4 μ_{i}^{'}})} - {\hat{μ}}_{y}^{2}$
  Calculate the stop criterion: $D \leftarrow ∥ \hat{θ} - θ^{'} ∥$
  Define the next iteration initialization: $μ_{x}^{'}, σ_{x}^{'}, μ_{y}^{'}, σ_{y}^{'} \leftarrow {\hat{μ}}_{x}, {\hat{σ}}_{x}, {\hat{μ}}_{y}, {\hat{σ}}_{y}$
Until $D < ϵ$
Return $\hat{β} = \frac{{\hat{μ}}_{x}}{{\hat{μ}}_{y}}$ ; $\hat{ρ} = \frac{{\hat{σ}}_{y}}{{\hat{σ}}_{x}}$ ; ${\hat{δ}}_{y} = \frac{{\hat{σ}}_{y}}{{\hat{μ}}_{y}}$
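The steps of Algorithm 2 can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: one EM iteration follows Equations (14) to (19), and the log-likelihood of Equation (3) is tracked to observe that it does not decrease from one iteration to the next, as expected for EM. The function names, the fixed number of iterations (instead of the stop criterion D), the start at the simulation parameters, and the unit scale $μ_{y} = 1$ are assumptions of this example; $β$, $ρ$ and $δ_{y}$ are the Healthy values of Figure 3.

```python
import math
import random

def hyp1f1(a, c, x, tol=1e-16, max_terms=5000):
    """Kummer's function from the series of Equation (4)."""
    term = total = 1.0
    for n in range(max_terms):
        term *= (a + n) / (c + n) * x / (n + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

def em_step(z, theta):
    """One EM iteration; theta = (mu_x, sigma_x, mu_y, sigma_y)."""
    mu_x, sigma_x, mu_y, sigma_y = theta
    n = len(z)
    s_zy = s_z2y2 = s_y = s_y2 = 0.0
    for zi in z:
        mu = 0.5 * (zi**2 / sigma_x**2 + 1.0 / sigma_y**2)
        gamma = mu_y / sigma_y**2 + mu_x / sigma_x**2 * zi
        t = gamma**2 / (4.0 * mu)
        denom = hyp1f1(1.0, 0.5, t)
        e_y = gamma / mu * hyp1f1(2.0, 1.5, t) / denom   # Equation (18)
        e_y2 = 1.0 / mu * hyp1f1(2.0, 0.5, t) / denom    # Equation (19)
        s_zy += zi * e_y
        s_z2y2 += zi**2 * e_y2
        s_y += e_y
        s_y2 += e_y2
    new_mu_x, new_mu_y = s_zy / n, s_y / n               # Equations (14) and (16)
    new_sigma_x = math.sqrt(s_z2y2 / n - new_mu_x**2)    # Equation (15)
    new_sigma_y = math.sqrt(s_y2 / n - new_mu_y**2)      # Equation (17)
    return (new_mu_x, new_sigma_x, new_mu_y, new_sigma_y)

def log_likelihood(z, theta):
    """Sum of the logarithms of Equation (3) over the observations."""
    mu_x, sigma_x, mu_y, sigma_y = theta
    beta, rho, delta_y = mu_x / mu_y, sigma_y / sigma_x, sigma_y / mu_y
    total = 0.0
    for zi in z:
        s = 1.0 + rho**2 * zi**2
        x = (1.0 + beta * rho**2 * zi) ** 2 / (2.0 * delta_y**2 * s)
        total += (math.log(rho / (math.pi * s))
                  - (beta**2 * rho**2 + 1.0) / (2.0 * delta_y**2)
                  + math.log(hyp1f1(1.0, 0.5, x)))
    return total

rng = random.Random(0)
theta = (0.15, 0.22 / 5.79, 1.0, 0.22)  # beta=0.15, rho=5.79, delta_y=0.22 with mu_y=1
z = [rng.gauss(theta[0], theta[1]) / rng.gauss(theta[2], theta[3]) for _ in range(80)]
history = [log_likelihood(z, theta)]
for _ in range(10):                     # fixed iteration count for the demo
    theta = em_step(z, theta)
    history.append(log_likelihood(z, theta))
mu_x, sigma_x, mu_y, sigma_y = theta
print(mu_x / mu_y, sigma_y / sigma_x, sigma_y / mu_y)  # beta, rho, delta_y estimates
```

Note that the density of Z depends on the four parameters only through $β$, $ρ$ and $δ_{y}$, so a common rescaling of X and Y leaves the likelihood unchanged; only the three derived quantities printed at the end should be interpreted.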

3.4.1. Fractional Moments

Dividing two normal variables leads to a high variability of the mean value. This problem has been raised in agricultural research [40]. For a coefficient of variation of Y ($C V_{Y}$) strictly lower than 0.2, the mean value of the ratio is stable. The coefficient of variation of Y is equal to $δ_{y}$, the ratio of the standard deviation to the mean of Y. Thus, we use $δ_{y}$ and $C V_{Y}$ interchangeably. For the bacteria data set, $δ_{y}$ values are between 0.2 and 0.3 (see Table 4), and for the fungal pathogen data set, $δ_{y}$ values are between 0.8 and 1.3 (see Table 5). Therefore, we are not in a situation with a stable mean ratio. As an alternative to the mean value in these cases, we suggest using the mean of the fractional moment.

Table 4. The values of $β$, $ρ$ and $δ_{y}$ associated with the fluorescence data on Healthy and Diseased plants of the bacteria data set. D0, …, and D8 are the six days of the acquisition of chlorophyll fluorescence images.

Table 5. The values of $β$, $ρ$ and $δ_{y}$ associated with the fluorescence data on Healthy and Diseased plants of the fungal pathogen data set. 0 h, …, 96 h are the five times of the acquisition of chlorophyll fluorescence images.

We give the expression of these moments, which remain stable regardless of the variability of the denominator. The fractional moments are given by:

\begin{matrix} E {| Z |^{s}} & = {(\frac{σ_{x}}{σ_{y}})}^{s} \frac{Γ (1 - s) Γ (1 + s)}{Γ (1 - \frac{s}{2}) Γ (1 + \frac{s}{2})}_{1} F_{1} (\frac{s}{2}, \frac{1}{2}, - \frac{μ_{y}^{2}}{2 σ_{y}^{2}})_{1} F_{1} (\frac{- s}{2}, \frac{1}{2}, - \frac{μ_{x}^{2}}{2 σ_{x}^{2}}) \end{matrix}

(20)

\begin{matrix} = ρ^{- s} \frac{Γ (1 - s) Γ (1 + s)}{Γ (1 - \frac{s}{2}) Γ (1 + \frac{s}{2})}_{1} F_{1} (\frac{s}{2}, \frac{1}{2}, - \frac{1}{2 δ_{y}^{2}})_{1} F_{1} (\frac{- s}{2}, \frac{1}{2}, - \frac{1}{2 δ_{x}^{2}}) \end{matrix}

(21)

with $β = \frac{μ_{x}}{μ_{y}}, ρ = \frac{σ_{y}}{σ_{x}}, δ_{x} = \frac{σ_{x}}{μ_{x}}, δ_{y} = \frac{σ_{y}}{μ_{y}}$. The proof of this expression is given in Appendix A.4 for $0 < s < 1$.
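Equation (20) can be compared with a direct simulation. The sketch below evaluates the closed form and a Monte Carlo average of $| Z |^{s}$. Several elements are assumptions made for this illustration: the order $s = 0.5$, the unit scale $μ_{y} = 1$, the number of draws, and the helper names. The confluent hypergeometric functions with negative argument are evaluated through Kummer's transformation $_{1} F_{1} (a; c; - x) = exp (- x) _{1} F_{1} (c - a; c; x)$, a standard identity used here to avoid cancellation in the alternating series. The values of $β$, $ρ$ and $δ_{y}$ are the Healthy values of Figure 3.

```python
import math
import random

def hyp1f1(a, c, x, tol=1e-16, max_terms=5000):
    """Kummer's function from the series of Equation (4)."""
    term = total = 1.0
    for n in range(max_terms):
        term *= (a + n) / (c + n) * x / (n + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

def hyp1f1_neg(a, c, x):
    """1F1(a; c; -x) for x >= 0, via Kummer's transformation."""
    return math.exp(-x) * hyp1f1(c - a, c, x)

def fractional_moment(s, mu_x, sigma_x, mu_y, sigma_y):
    """E|Z|^s from Equation (20), for 0 < s < 1."""
    gammas = (math.gamma(1.0 - s) * math.gamma(1.0 + s)
              / (math.gamma(1.0 - s / 2.0) * math.gamma(1.0 + s / 2.0)))
    f_y = hyp1f1_neg(s / 2.0, 0.5, mu_y**2 / (2.0 * sigma_y**2))
    f_x = hyp1f1_neg(-s / 2.0, 0.5, mu_x**2 / (2.0 * sigma_x**2))
    return (sigma_x / sigma_y) ** s * gammas * f_y * f_x

beta, rho, delta_y = 0.15, 5.79, 0.22   # Healthy, bacteria data set (Figure 3a)
mu_y, sigma_y = 1.0, delta_y            # arbitrary unit scale (assumption)
mu_x, sigma_x = beta * mu_y, sigma_y / rho
s = 0.5                                 # illustrative order (assumption)

exact = fractional_moment(s, mu_x, sigma_x, mu_y, sigma_y)

rng = random.Random(0)
draws = 20000
mc = sum(abs(rng.gauss(mu_x, sigma_x) / rng.gauss(mu_y, sigma_y)) ** s
         for _ in range(draws)) / draws
print(exact, mc)
```

The two printed values should be close to each other, which illustrates that the fractional moment is well defined even when the plain mean of the ratio is unstable.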

3.4.2. Monte Carlo Experiments

To have a ground truth for evaluating the estimators of PZ, we resorted to simulations of the two empirical data sets, for both healthy and diseased plants. We considered the size of the smallest and largest leaves in our two data sets: 10 pixels for the smallest leaves and 80 for the largest ones. We generated Gaussian distributions for $F_{m}$ and $F_{0}$ mimicking the experimental observations of our two data sets given in Table 2 and Table 3. The resulting observations of $F_{v} / F_{m}$ were computed. The two proposed non-Gaussian estimators and the Gaussian baseline estimators described in the previous section were computed. Simulations were repeated 5000 times to compute the average performance and associated standard deviations. The end point of our experiments is the relative error:

RE = \frac{\mathrm{Measured} - \mathrm{Estimated}}{\mathrm{Estimated}} .

The measured value is the exact value of the fractional moment, and the estimated value is obtained with one of the compared estimators. This comparison is made at all dates of the experimental data used for the simulation. The prior values used in our proposed methods are initialized with the values of the first day of the experiment.
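The structure of such a Monte Carlo experiment can be sketched as follows. This is only an outline under stated assumptions: the estimator used here is the plain sample average of $| z |^{s}$, a stand-in that is not one of the estimators compared in the paper; the order $s = 0.5$, the unit scale, and the 1000 repetitions (instead of 5000, to keep the example fast) are also choices made for this illustration. The relative error follows the definition given in the text.

```python
import random
import statistics

def relative_error(measured, estimated):
    """Relative error as defined in the text: (Measured - Estimated) / Estimated."""
    return (measured - estimated) / estimated

def empirical_moment(z, s):
    """Sample average of |z|^s (stand-in estimator, not from the paper)."""
    return sum(abs(v) ** s for v in z) / len(z)

def run_experiment(n_pixels, n_rep, s, theta, rng):
    """Repeat the simulation n_rep times; return mean and SD of the estimates."""
    mu_x, sigma_x, mu_y, sigma_y = theta
    estimates = []
    for _ in range(n_rep):
        z = [rng.gauss(mu_x, sigma_x) / rng.gauss(mu_y, sigma_y)
             for _ in range(n_pixels)]
        estimates.append(empirical_moment(z, s))
    return statistics.mean(estimates), statistics.stdev(estimates)

rng = random.Random(0)
theta = (0.15, 0.22 / 5.79, 1.0, 0.22)  # Healthy values of Figure 3, with mu_y = 1
mean_10, sd_10 = run_experiment(10, 1000, 0.5, theta, rng)   # smallest leaves
mean_80, sd_80 = run_experiment(80, 1000, 0.5, theta, rng)   # largest leaves
print(mean_10, sd_10)
print(mean_80, sd_80)
```

With more pixels per leaf, the spread of the estimates across repetitions shrinks, which is why both sample sizes are reported.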

4. Results

We are now ready to assess the importance of non-Gaussianity in chlorophyll fluorescence images via the comparison of the relative error of our two proposed statistical estimators against the standard estimator under Gaussian assumptions.

4.1. Arabidopsis thaliana Inoculated by a Bacteria

To illustrate the non-Gaussianity of the bacteria data set, we provide in Figure 3 the distribution PZ for the mean values of $β$, $ρ$, and $δ_{y}$ over the six days of our real experimental data set, for both Healthy and Diseased plants. The normal approximation is added. One can observe a small deviation from Gaussianity.

Figure 3. The distribution PZ of the ratio $F_{v} / F_{m}$ ($F_{m}$ maximum fluorescence and $F_{v} = F_{m} - F_{0}$, $F_{0}$ minimum fluorescence) is the black curve with circle points, and the normal approximation is the red curve with crossed points. The parameters $β$, $ρ$, and $δ_{y}$ of the distribution PZ are the mean values of these parameters over the six days of the acquisition of chlorophyll fluorescence images for (a) Healthy: $β = 0.15$, $ρ = 5.79$ and $δ_{y} = 0.22$ and (b) Diseased: $β = 0.24$, $ρ = 4.21$ and $δ_{y} = 0.25$ plants of the bacteria data set.

We considered the second-order fractional moment of PZ for this first data set since it led to stable results of the mean of fractional moments for the values of $δ_{y}$ between 0.3 and 1 (see Appendix A.5). We calculated these second-order fractional moments of PZ for each day of the experiment (Table 6). These are the measured values to which we compare the estimated values obtained with the Monte Carlo simulations with 10 and 80 samples. The results of the Monte Carlo simulation are given in Table 7. The normal distribution and the normal approximation are the methods of estimation that assume a normal distribution of $F_{v} / F_{m}$. Bayesian and EM estimation are the two proposed non-Gaussian estimators with the PZ distribution.

Table 6. Second-order fractional moment of the $P Z$ distribution of the ratio for the healthy and diseased leaves of the bacteria data set. D0, …, and D8 are the six days of the acquisition of chlorophyll fluorescence images.

Table 7. The mean value ($μ$) and the associated standard deviation ($σ$) of the second-order fractional moment of the Monte Carlo simulation for sample sizes of 10 and 80, with the first day of the experiment as a reference value, under the assumption of a Gaussian probability density function, with the Gaussian approximation proposed in [34], and with the two non-Gaussian estimators, Bayesian and EM.

With both measured (Table 6) and estimated values (Table 7), we calculate the relative error for each day of the experiment. The mean value of the relative error over the six days of the experiment is given in Table 8, for Healthy and Diseased plants, and per sample size.

Table 8. Mean value of the relative error (%) for the five days of the experiment per method of estimation and per sample size, N, for Healthy and Diseased plants of the bacteria data set.

The maximum value of the relative error was not high: 6% with normal assumptions. The relative error is lower for Healthy compared to Diseased plants. Overall, we have a lower error with Bayesian and EM estimations compared to the normal distribution and normal approximation.

4.2. Arabidopsis thaliana Infected with a Fungal Pathogen

We apply the same analysis to the fluorescence images of Arabidopsis thaliana infected with a fungal pathogen. We start with a representation of the distribution $P Z$ associated with the mean values of $β$, $ρ$, and $δ_{y}$ over the five dates of the empirical data set for both Healthy and Diseased plants (Figure 4).

Figure 4. The distribution $P Z$ of the ratio $F_{v} / F_{m}$ ($F_{m}$: maximum fluorescence; $F_{v} = F_{m} - F_{0}$; $F_{0}$: minimum fluorescence) is the black curve with circle points, and the normal approximation is the red curve with crossed points. The parameters $β$, $ρ$, and $δ_{y}$ of the distribution $P Z$ are set to their mean values over the five acquisition times of the chlorophyll fluorescence images for (a) Healthy plants ($β = 0.192$, $ρ = 5.377$, and $δ_{y} = 0.461$) and (b) Diseased plants ($β = 0.358$, $ρ = 3.359$, and $δ_{y} = 0.976$) of the fungal pathogen data set.

The deviation from normality is more pronounced with this data set. We calculate the measured values of the fourth-order fractional moment of $F_{v} / F_{m}$ for each date of the experiment (Table 9). We chose the fourth order because of the higher values of $δ_{y}$ in this data set, for which a lower value of the fractional moment was more appropriate (cf. simulation of $C V_{y}$ in Appendix A.5). We then calculate the mean value of the fourth-order fractional moment with the Monte Carlo simulations for sample sizes of 10 and 80. The simulation results are in Table 10. The mean values represent the fourth-order fractional moment estimated with the four methods of estimation: the normal distribution assumption, the normal approximation, and the two non-Gaussian estimators, Bayesian and EM.

Table 9. Fourth-order fractional moment of the $P Z$ distribution of the ratio for the Healthy and Diseased plants of the fungal pathogen data set. 0 h, …, 96 h are the five times of the acquisition of chlorophyll fluorescence images.

Table 10. The mean value ($μ$) and the associated standard deviation ($σ$) of the fourth-order fractional moment of the Monte Carlo simulation with 24 h as a reference value.

We calculate the relative error per time of experiment with the measured value of the fourth-order fractional moment (Table 9) and the estimated values (Table 10). We give in Table 11 the mean value of the relative error over all the experiment acquisition times.

Table 11. Mean value of the relative error (%) for the five times of the experiment per method of estimation and per number of observations for Healthy and Diseased plants of the fungal pathogen data set.

We see clearly on the fungal data set that the relative error is much higher if one assumes a normal distribution of the ratio: 26% for healthy ($N = 10$) and 82% for diseased ($N = 10$) plants vs. around 6% and 7% for Bayesian and EM estimations. Thus, these results show that using a normal distribution or normal approximation of the ratio $F_{v} / F_{m}$ leads to errors whose size depends on how pronounced the deviation from Gaussianity is.

5. Discussion

We have quantified the importance of considering the non-Gaussianity in the maximum quantum yield of photosystem II. This non-Gaussianity was only recently highlighted empirically in [12]. In most of the recent studies using the maximum quantum yield of photosystem II as a phenotyping characteristic [14,15,17,18,19,20,21,22,24,25,26], the results are presented with the average and standard deviation. Our mathematical study shows that this may not be the most appropriate approach, especially when the distribution of the gray levels in $F_{v} / F_{m}$ strongly deviates from Gaussianity. This advocates for the use of box plots, as in [23], to easily visualize the non-Gaussianity. The exclusive use of the average and standard deviation in [14,15,17,18,19,20,21,22,24,25,26] was not problematic, as the maximum quantum yield of photosystem II is not used as a biomarker for its absolute value but rather to differentiate phenotypes. However, our mathematical models show that the distribution can be nonsymmetric with heavy tails. This may explain why advanced models, such as decision trees [41,42] or Gaussian mixtures [12,13], have been used in the literature for decision making. We focus in this article on estimation, which is distinct from detection. It could be an interesting perspective to compare the advanced approaches [12,13,41,42] with a simple single threshold applied to the non-Gaussian distribution considered in our work.

6. Conclusions

In this article, we have demonstrated the importance of mathematically considering the non-Gaussianity of vegetation indices composed of a ratio of images corrupted by Gaussian noise. This was illustrated and detailed with chlorophyll fluorescence images. We have designed estimators adapted to this non-Gaussianity under the hypothesis of independence of the distributions of the images used in the ratio. Despite the simplicity of this model, the benefits of this approach in comparison with the usual Gaussian assumption were demonstrated.

In this work, we focused on estimating the distribution parameters of the ratio $F_{v} / F_{m}$ used in chlorophyll fluorescence indices. In the two chlorophyll fluorescence data sets of infected plants that we considered, the distribution of this ratio did not follow a normal distribution. If more chlorophyll fluorescence data sets become available, it could be interesting to establish a range of the parameters of the density distribution of the ratio and to identify the intervals of these parameters where normality holds.

This work could be extended in various directions. Many vegetation indices include a ratio of images and therefore could benefit from the approach proposed in the article. With the two empirical data sets considered, the deviation from Gaussianity was limited, but it was enough to show the importance of using adapted non-Gaussian models. Theoretically, the deviation from Gaussianity can be severe. Therefore, it is fundamentally important to have mathematically grounded estimators. The assumption of independence of the images is a current limitation of our approach, which produced estimators with lower error that are nevertheless biased, especially for small leaves (with a small number of observations). It would be interesting to design models of the covariance between the images used in the vegetation index ratio in order to propose possibly unbiased estimators independently of the size of the leaves.

Author Contributions

Conceptualization, D.R., A.E.G. and N.B.; methodology, D.R., A.E.G. and N.B.; software, A.E.G. and N.B.; validation, A.E.G. and N.B.; formal analysis, A.E.G. and N.B.; investigation, A.E.G.; resources, D.R.; data curation, A.E.G., N.B. and N.S.; writing—original draft preparation, A.E.G., D.R. and N.B.; writing—review and editing, A.E.G., N.B., N.S. and D.R.; visualization, A.E.G.; supervision, D.R.; project administration, D.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are available upon reasonable request.

Acknowledgments

Authors gratefully acknowledge the PHENOTIC platform node of the French national infrastructure on plant phenotyping ANR PHENOME 11-INBS-0012 and thank Tristan Boureau from Platform PHENOTIC for the fluorescence images acquisition of Arabidopsis plants infected by a bacteria.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

pdf	Probability distribution function
$P Z$	Probability density function of the ratio

Appendix A

Appendix A.1. Ratio Distribution

The aim of this appendix is to show the passage from the distribution of the ratio written as Equation (2) to the distribution of the ratio written with the hypergeometric function Equation (3). We start by recalling two equalities that will be used hereinafter in the proof.

Prerequisite A1 ([43], Section 3.462, Equation (5)).

\begin{matrix} \int_{0}^{\infty} x e^{- μ x^{2} - 2 λ x} d x & = \frac{1}{2 μ} [1 - λ \sqrt{\frac{π}{μ}} e^{\frac{λ^{2}}{μ}} (1 - \operatorname{erf} (\frac{λ}{\sqrt{μ}}))], \quad \operatorname{Re} (μ) > 0 \end{matrix}

(A1)

Since $\operatorname{erf} (\cdot)$ is an odd function, it follows that:

\begin{matrix} \int_{0}^{\infty} x e^{- μ x^{2} - 2 λ x} d x + \int_{0}^{\infty} x e^{- μ x^{2} + 2 λ x} d x & = \frac{1}{μ} [1 + \sqrt{π \frac{λ^{2}}{μ}} e^{\frac{λ^{2}}{μ}} \operatorname{erf} (\frac{λ}{\sqrt{μ}})] \end{matrix}

(A2)

Prerequisite A2 ([43], Section 3.462, Equation (1)).

\begin{matrix} \int_{0}^{\infty} x^{ν - 1} e^{- μ x^{2} - γ x} d x & = {(\frac{1}{2 μ})}^{\frac{ν}{2}} Γ (ν) e^{\frac{γ^{2}}{8 μ}} D_{- ν} (\frac{γ}{\sqrt{2 μ}}), Real (μ) > 0, real (ν) > 0 \end{matrix}

(A3)

with $D_{- ν} (z)$ the parabolic cylinder function defined as follows:

\begin{matrix} D_{- ν} (z) & = 2^{- \frac{ν}{2}} e^{- \frac{z^{2}}{4}} [\frac{\sqrt{π}}{Γ (\frac{1 + ν}{2})} {}_{1} F_{1} (\frac{ν}{2}, \frac{1}{2}; \frac{z^{2}}{2}) - \frac{\sqrt{2 π}}{Γ (\frac{ν}{2})} z \, {}_{1} F_{1} (\frac{1 + ν}{2}, \frac{3}{2}; \frac{z^{2}}{2})], \end{matrix}

where $Γ (ν)$ is the gamma function and ${}_{1} F_{1} (\cdot)$ is the confluent hypergeometric function.

Proof.

Set $X \sim N (μ_{x}, σ_{x})$ and $Y \sim N (μ_{y}, σ_{y})$. We first derive the probability distribution of the ratio $Z = X / Y$ in its error-function form, since the switch to the hypergeometric function is made at a later step of the proof. We denote by $g (z, y)$ the joint pdf of the two random variables $(Z = X / Y, Y = Y)$. It is given as follows:

\begin{matrix} g (z, y) = f (x, y) | \frac{\partial (z, y)}{\partial (x, y)} |^{- 1}, \end{matrix}

(A4)

where the Jacobian determinant of the change of variables, $| J | = | \frac{\partial (z, y)}{\partial (x, y)} |$, is calculated as follows:

| J | = | \begin{matrix} \frac{\partial z}{\partial x} & \frac{\partial z}{\partial y} \\ \frac{\partial y}{\partial x} & \frac{\partial y}{\partial y} \end{matrix} | = | \begin{matrix} \frac{1}{y} & - \frac{x}{y^{2}} \\ 0 & 1 \end{matrix} | = | \frac{1}{y} | .

Therefore,

\begin{matrix} g (z, y) = | y | f (x, y) . \end{matrix}

(A5)

The probability distribution of the ratio

Z = X / Y

, denoted

p_{Z} (z)

, is given by:

\begin{matrix} p_{Z} (z) = \int_{- \infty}^{\infty} g (z, y) d y = \int_{- \infty}^{\infty} | y | f (x, y) d y . \end{matrix}

(A6)

Since the two random variables X and Y are independent, $f (x, y) = f_{X} (x) f_{Y} (y)$. The expression of $p_{Z} (z)$ is then given by

\begin{matrix} p_{Z} (z) & = \int_{- \infty}^{\infty} | y | f_{X} (x) f_{Y} (y) d y \\ & = \int_{- \infty}^{\infty} | y | f_{X} (z y) f_{Y} (y) d y, \quad \text{since } x = z y \\ & = \frac{1}{2 π σ_{x} σ_{y}} \int_{- \infty}^{\infty} | y | e^{- \frac{{(z y - μ_{x})}^{2}}{2 σ_{x}^{2}}} e^{- \frac{{(y - μ_{y})}^{2}}{2 σ_{y}^{2}}} d y \\ & = \frac{1}{2 π σ_{x} σ_{y}} e^{- \frac{1}{2} (\frac{μ_{x}^{2}}{σ_{x}^{2}} + \frac{μ_{y}^{2}}{σ_{y}^{2}})} \int_{- \infty}^{\infty} | y | e^{- \frac{1}{2} (\frac{z^{2}}{σ_{x}^{2}} + \frac{1}{σ_{y}^{2}}) y^{2} + (\frac{μ_{x}}{σ_{x}^{2}} z + \frac{μ_{y}}{σ_{y}^{2}}) y} d y \\ & = \frac{1}{2 π σ_{x} σ_{y}} e^{- \frac{1}{2} (\frac{μ_{x}^{2}}{σ_{x}^{2}} + \frac{μ_{y}^{2}}{σ_{y}^{2}})} [\int_{0}^{\infty} y e^{- \frac{1}{2} (\frac{z^{2}}{σ_{x}^{2}} + \frac{1}{σ_{y}^{2}}) y^{2} + (\frac{μ_{x}}{σ_{x}^{2}} z + \frac{μ_{y}}{σ_{y}^{2}}) y} d y - \int_{- \infty}^{0} y e^{- \frac{1}{2} (\frac{z^{2}}{σ_{x}^{2}} + \frac{1}{σ_{y}^{2}}) y^{2} + (\frac{μ_{x}}{σ_{x}^{2}} z + \frac{μ_{y}}{σ_{y}^{2}}) y} d y] \\ & = \frac{1}{2 π σ_{x} σ_{y}} e^{- \frac{1}{2} (\frac{μ_{x}^{2}}{σ_{x}^{2}} + \frac{μ_{y}^{2}}{σ_{y}^{2}})} [\int_{0}^{\infty} y e^{- \frac{1}{2} (\frac{z^{2}}{σ_{x}^{2}} + \frac{1}{σ_{y}^{2}}) y^{2} + (\frac{μ_{x}}{σ_{x}^{2}} z + \frac{μ_{y}}{σ_{y}^{2}}) y} d y + \int_{0}^{\infty} t e^{- \frac{1}{2} (\frac{z^{2}}{σ_{x}^{2}} + \frac{1}{σ_{y}^{2}}) t^{2} - (\frac{μ_{x}}{σ_{x}^{2}} z + \frac{μ_{y}}{σ_{y}^{2}}) t} d t] . \end{matrix}

By a direct application of Prerequisite A1, in the form of Equation (A2), with $μ = \frac{1}{2} (\frac{z^{2}}{σ_{x}^{2}} + \frac{1}{σ_{y}^{2}})$ and $λ = \frac{1}{2} (\frac{μ_{x}}{σ_{x}^{2}} z + \frac{μ_{y}}{σ_{y}^{2}})$, we obtain $\frac{λ^{2}}{μ} = \frac{1}{2} \frac{{(μ_{y} σ_{x}^{2} + μ_{x} σ_{y}^{2} z)}^{2}}{σ_{x}^{2} σ_{y}^{2}} \frac{1}{σ_{x}^{2} + σ_{y}^{2} z^{2}}$.

Setting $β = \frac{μ_{x}}{μ_{y}}$, $ρ = \frac{σ_{y}}{σ_{x}}$, and $δ_{y} = \frac{σ_{y}}{μ_{y}}$, we have the three following equations:

\begin{matrix} \frac{λ^{2}}{μ} & = \frac{1}{2} \frac{1}{δ_{y}^{2}} \frac{{(1 + β ρ^{2} z)}^{2}}{1 + ρ^{2} z^{2}} = \frac{1}{2} q^{2}, \quad \text{with } q = \frac{1}{δ_{y}} \frac{1 + β ρ^{2} z}{\sqrt{1 + ρ^{2} z^{2}}} \end{matrix}

(A7)

\begin{matrix} e^{- \frac{1}{2} (\frac{μ_{x}^{2}}{σ_{x}^{2}} + \frac{μ_{y}^{2}}{σ_{y}^{2}})} & = e^{- \frac{1}{2} \frac{1 + β^{2} ρ^{2}}{δ_{y}^{2}}} \end{matrix}

(A8)

\begin{matrix} \frac{1}{2 π σ_{x} σ_{y}} \frac{1}{μ} & = \frac{ρ}{π (1 + ρ^{2} z^{2})} . \end{matrix}

(A9)

The probability distribution of the ratio, $p_{Z} (z)$, is then given by:

\begin{matrix} p_{Z} (z) & = \frac{ρ}{π (1 + ρ^{2} z^{2})} e^{- \frac{1}{2} \frac{1 + β^{2} ρ^{2}}{δ_{y}^{2}}} [1 + \sqrt{π \frac{q^{2}}{2}} e^{\frac{q^{2}}{2}} \operatorname{erf} (\frac{q}{\sqrt{2}})] . \end{matrix}

(A10)

This probability distribution can be written differently using Prerequisite A2. In fact, the integral in (A1) is a particular case of the integral in (A3). Taking $ν = 2$, we can then write Equation (A2) as follows:

\begin{matrix} \int_{0}^{\infty} x e^{- μ x^{2} - γ x} d x + \int_{0}^{\infty} x e^{- μ x^{2} + γ x} d x & = \frac{1}{2 μ} e^{\frac{γ^{2}}{8 μ}} [D_{- 2} (\frac{γ}{\sqrt{2 μ}}) + D_{- 2} (- \frac{γ}{\sqrt{2 μ}})] . \end{matrix}

(A11)

Since, for all $\operatorname{Re} (ν) > 0$, we have:

\begin{matrix} D_{- ν} (z) + D_{- ν} (- z) & = 2 \times 2^{- \frac{ν}{2}} e^{- \frac{z^{2}}{4}} \frac{\sqrt{π}}{Γ (\frac{1 + ν}{2})} {}_{1} F_{1} (\frac{ν}{2}, \frac{1}{2}; \frac{z^{2}}{2}) . \end{matrix}

(A12)

Therefore

\begin{matrix} D_{- 2} (\frac{γ}{\sqrt{2 μ}}) + D_{- 2} (- \frac{γ}{\sqrt{2 μ}}) & = e^{- \frac{γ^{2}}{8 μ}} \frac{\sqrt{π}}{Γ (\frac{3}{2})} {}_{1} F_{1} (1, \frac{1}{2}; \frac{γ^{2}}{4 μ}) \\ & = 2 e^{- \frac{γ^{2}}{8 μ}} {}_{1} F_{1} (1, \frac{1}{2}; \frac{γ^{2}}{4 μ}) . \end{matrix}

(A13)

and

\begin{matrix} \int_{0}^{\infty} x e^{- μ x^{2} - γ x} d x + \int_{0}^{\infty} x e^{- μ x^{2} + γ x} d x & = \frac{1}{μ} {}_{1} F_{1} (1, \frac{1}{2}; \frac{γ^{2}}{4 μ}) . \end{matrix}

(A14)

Since $γ = 2 λ = \frac{μ_{y}}{σ_{y}^{2}} + \frac{μ_{x}}{σ_{x}^{2}} z$, it follows from Equation (A7) that $\frac{γ^{2}}{4 μ} = \frac{λ^{2}}{μ} = \frac{1}{2} q^{2}$.

In summary, the final expression of

p_{Z} (z)

is given by:

\begin{matrix} p_{Z} (z) & = \frac{ρ}{π (1 + ρ^{2} z^{2})} e^{- \frac{1}{2} \frac{1 + β^{2} ρ^{2}}{δ_{y}^{2}}} {}_{1} F_{1} (1, \frac{1}{2}; \frac{1}{2 δ_{y}^{2}} \frac{{(1 + β ρ^{2} z)}^{2}}{1 + ρ^{2} z^{2}}) \end{matrix}

(A15)

□
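As a numerical cross-check of the proof above, the following Python sketch evaluates $p_{Z} (z)$ in both its error-function form (A10) and its hypergeometric form (A15), and integrates the density numerically. The function names (`hyp1f1`, `pz_erf`, `pz_hyp`, `integrate_pdf`) are hypothetical, and the power-series evaluation of ${}_{1} F_{1}$ and the midpoint rule are choices made for this illustration only, not the implementation used in the article. The sketch evaluates the error function at $| q | / \sqrt{2}$, which coincides with (A10) whenever $q \geq 0$.

```python
import math


def hyp1f1(a, b, x, tol=1e-15, max_terms=5000):
    """Kummer power series for 1F1(a; b; x); intended for moderate x >= 0."""
    term = total = 1.0
    for n in range(max_terms):
        term *= (a + n) * x / ((b + n) * (n + 1))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def _prefactor_and_arg(z, beta, rho, delta_y):
    """Common factor of (A10) and (A15), and the argument q^2 / 2 from (A7)."""
    denom = 1.0 + rho**2 * z**2
    pref = rho / (math.pi * denom)
    pref *= math.exp(-0.5 * (1.0 + beta**2 * rho**2) / delta_y**2)
    half_q2 = 0.5 * (1.0 + beta * rho**2 * z) ** 2 / (delta_y**2 * denom)
    return pref, half_q2


def pz_erf(z, beta, rho, delta_y):
    """Error-function form (A10), written with |q| so it is even in q."""
    pref, w = _prefactor_and_arg(z, beta, rho, delta_y)
    bracket = 1.0 + math.sqrt(math.pi * w) * math.exp(w) * math.erf(math.sqrt(w))
    return pref * bracket


def pz_hyp(z, beta, rho, delta_y):
    """Hypergeometric form (A15)."""
    pref, w = _prefactor_and_arg(z, beta, rho, delta_y)
    return pref * hyp1f1(1.0, 0.5, w)


def integrate_pdf(pdf, lo, hi, n, **params):
    """Midpoint-rule integral of pdf over [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    return h * sum(pdf(lo + (k + 0.5) * h, **params) for k in range(n))


if __name__ == "__main__":
    # Mean parameters of the Healthy plants of the bacteria data set (Figure 3).
    healthy = dict(beta=0.15, rho=5.79, delta_y=0.22)
    print(pz_erf(0.15, **healthy), pz_hyp(0.15, **healthy))
    print(integrate_pdf(pz_hyp, -2.0, 2.0, 4000, **healthy))
```

For very small $δ_{y}$ the argument of the exponential grows quickly, so a production implementation would probably need a scaled evaluation of ${}_{1} F_{1}$; this is left out of the sketch.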

Appendix A.2. EM Algorithm Estimations

We recall that the aim of this appendix is to find the estimates of $θ_{x} = (μ_{x}, σ_{x})$ and $θ_{y} = (μ_{y}, σ_{y})$ associated with the maximisation problems:

\begin{matrix} {\hat{θ}}_{x} = \arg \max_{θ_{x}} \sum_{i = 1}^{N} E_{Y | Z} \{ \ln f_{Z | Y} (Z_{i} | Y_{i}, θ_{x}) | z_{i}, θ^{'} \} \end{matrix}

(A16)

\begin{matrix} {\hat{θ}}_{y} = \arg \max_{θ_{y}} \sum_{i = 1}^{N} E_{Y | Z} \{ \ln f_{Y} (Y_{i} | θ_{y}) | z_{i}, θ^{'} \} . \end{matrix}

(A17)

Estimation of $θ_{x} = (μ_{x}, σ_{x})$ :

Since

\begin{matrix} ln f_{Z | Y} (z_{i} | y_{i}, θ_{x}) & = ln (| y_{i} | f_{X} (z_{i} y_{i})) \end{matrix}

(A18)

\begin{matrix} = ln | y_{i} | - ln \sqrt{2 π} σ_{x} - \frac{{(z_{i} y_{i} - μ_{x})}^{2}}{2 σ_{x}^{2}}, \end{matrix}

(A19)

substituting (A18) and (A19) into (A16), differentiating with respect to $μ_{x}$ and $σ_{x}$, and setting the result to zero gives:

\begin{matrix} {\hat{μ}}_{x} = \frac{1}{N} \sum_{i = 1}^{N} z_{i} E_{Y | Z} {Y_{i} | z_{i}, θ^{'}} \end{matrix}

(A20)

\begin{matrix} {\hat{σ}}_{x}^{2} = \frac{1}{N} \sum_{i = 1}^{N} E_{Y | Z} {{(z_{i} Y_{i} - {\hat{μ}}_{x})}^{2} | z_{i}, θ^{'}} . \end{matrix}

(A21)

However,

\begin{matrix} E_{Y | Z} {{(z_{i} Y_{i} - {\hat{μ}}_{x})}^{2} | z_{i}, θ^{'}} & = E_{Y | Z} {z_{i}^{2} Y_{i}^{2} + {\hat{μ}}_{x}^{2} - 2 z_{i} Y_{i} {\hat{μ}}_{x} | z_{i}, θ^{'}} \\ = E_{Y | Z} {z_{i}^{2} Y_{i}^{2} | z_{i}, θ^{'}} + {\hat{μ}}_{x}^{2} - 2 z_{i} {\hat{μ}}_{x} E_{Y | Z} {Y_{i} | z_{i}, θ^{'}} . \end{matrix}

(A22)

Thus

\begin{matrix} \frac{1}{N} \sum_{i = 1}^{N} E_{Y | Z} {{(z_{i} Y_{i} - {\hat{μ}}_{x})}^{2} | z_{i}, θ^{'}} & = \frac{1}{N} \sum_{i = 1}^{N} z_{i}^{2} E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}} + {\hat{μ}}_{x}^{2} - 2 {\hat{μ}}_{x} \frac{1}{N} \sum_{i = 1}^{N} z_{i} E_{Y | Z} {Y_{i} | z_{i}, θ^{'}} \\ = \frac{1}{N} \sum_{i = 1}^{N} z_{i}^{2} E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}} - {\hat{μ}}_{x}^{2} . \end{matrix}

(A23)

In summary

\begin{matrix} {\hat{μ}}_{x} = \frac{1}{N} \sum_{i = 1}^{N} z_{i} E_{Y | Z} {Y_{i} | z_{i}, θ^{'}} \end{matrix}

(A24)

\begin{matrix} {\hat{σ}}_{x}^{2} & = \frac{1}{N} \sum_{i = 1}^{N} z_{i}^{2} E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}} - {\hat{μ}}_{x}^{2}, \end{matrix}

(A25)

where $E_{Y | Z} \{ Y_{i} | z_{i}, θ^{'} \}$ and $E_{Y | Z} \{ Y_{i}^{2} | z_{i}, θ^{'} \}$ are the posterior expectations, which depend on the distribution of Y.

Estimation of $θ_{y} = (μ_{y}, σ_{y})$ :

Since

\begin{matrix} ln f_{Y} (Y_{i} | θ_{y}) & = - ln \sqrt{2 π} σ_{y} - \frac{{(y_{i} - μ_{y})}^{2}}{2 σ_{y}^{2}}, \end{matrix}

(A26)

substituting (A26) into (A17), differentiating with respect to $μ_{y}$ and $σ_{y}$, and setting the result to zero gives:

\begin{matrix} {\hat{μ}}_{y} = \frac{1}{N} \sum_{i = 1}^{N} E_{Y | Z} {Y_{i} | z_{i}, θ^{'}} \end{matrix}

(A27)

\begin{matrix} {\hat{σ}}_{y}^{2} = \frac{1}{N} \sum_{i = 1}^{N} E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}} - {\hat{μ}}_{y}^{2} . \end{matrix}

(A28)

Determination of $E_{Y | Z} {Y_{i} | z_{i}, θ^{'}}$ and $E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}}$ :

The posterior expectation

E_{Y | Z} {Y_{i} | z_{i}, θ^{'}}

and

E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}}

are given by:

\begin{matrix} E_{Y | Z} {Y_{i} | z_{i}, θ^{'}} = \int_{- \infty}^{\infty} y_{i} f_{Y | Z} (y_{i} | z_{i}) d y_{i} \end{matrix}

(A29)

\begin{matrix} E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}} = \int_{- \infty}^{\infty} y_{i}^{2} f_{Y | Z} (y_{i} | z_{i}) d y_{i} . \end{matrix}

(A30)

Both equations depend on the posterior distribution $f_{Y | Z} (y_{i} | z_{i})$ given by:

\begin{matrix} f_{Y | Z} (y_{i} | z_{i}) & = \frac{| y_{i} | f_{X} (y_{i} z_{i}) f_{Y} (y_{i})}{g_{Z} (z_{i})} \\ = \frac{| y_{i} | e^{- \frac{{(y_{i} z_{i} - μ_{x})}^{2}}{2 σ_{x}^{2}}} e^{- \frac{{(y_{i} - μ_{y})}^{2}}{2 σ_{y}^{2}}}}{e^{- \frac{1}{2} (\frac{μ_{x}^{2}}{σ_{x}^{2}} + \frac{μ_{y}^{2}}{σ_{y}^{2}})} \frac{1}{μ}_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} \\ = \frac{| y_{i} | e^{- μ y_{i}^{2} + γ y_{i}}}{\frac{1}{μ}_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})}, \end{matrix}

(A31)

with

μ = \frac{1}{2} (\frac{z_{i}^{2}}{σ_{x}^{2}} + \frac{1}{σ_{y}^{2}})

and

γ = \frac{μ_{y}}{σ_{y}^{2}} + \frac{μ_{x}}{σ_{x}^{2}} z_{i}

.

Therefore, the posterior expectation

E_{Y | Z} {Y_{i} | z_{i}, θ^{'}}

is equal to:

\begin{matrix} E_{Y | Z} {Y_{i} | z_{i}, θ^{'}} & = \int_{- \infty}^{\infty} y_{i} f_{Y | Z} (y_{i} | z_{i}) d y_{i} \\ = \frac{μ}{_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} \int_{- \infty}^{\infty} y_{i} | y_{i} | e^{- μ y_{i}^{2} + γ y_{i}} d y_{i} \\ = \frac{μ}{_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} [\int_{0}^{\infty} y_{i}^{2} e^{- μ y_{i}^{2} + γ y_{i}} d y_{i} - \int_{- \infty}^{0} y_{i}^{2} e^{- μ y_{i}^{2} + γ y_{i}} d y_{i}] \\ = \frac{μ}{_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} [\int_{0}^{\infty} y_{i}^{2} e^{- μ y_{i}^{2} + γ y_{i}} d y_{i} - \int_{0}^{\infty} t_{i}^{2} e^{- μ t_{i}^{2} - γ t_{i}} d t_{i}] \\ = \frac{μ}{_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} {(\frac{1}{2 μ})}^{\frac{3}{2}} Γ (3) e^{\frac{γ^{2}}{8 μ}} [D_{- 3} (- \frac{γ}{\sqrt{2 μ}}) - D_{- 3} (\frac{γ}{\sqrt{2 μ}})] . \end{matrix}

Since, for all $\operatorname{Re} (ν) > 0$, we have

\begin{matrix} D_{- ν} (- z) - D_{- ν} (z) & = 2 \times 2^{- \frac{ν}{2}} e^{- \frac{z^{2}}{4}} \frac{\sqrt{2 π}}{Γ (\frac{ν}{2})} z \, {}_{1} F_{1} (\frac{1 + ν}{2}, \frac{3}{2}; \frac{z^{2}}{2}), \end{matrix}

(A32)

and

Γ (3 / 2) = \sqrt{π} / 2

, the new expression of the posterior expectation is then given by

\begin{matrix} E_{Y | Z} {Y_{i} | z_{i}, θ^{'}} & = \frac{γ}{μ} \frac{_{1} F_{1} (2, \frac{3}{2}; \frac{γ^{2}}{4 μ})}{_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} . \end{matrix}

(A33)

As for the posterior expectation,

E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}}

, we have:

\begin{matrix} E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}} & = \int_{- \infty}^{\infty} y_{i}^{2} f_{Y | Z} (y_{i} | z_{i}) d y_{i} \\ = \frac{μ}{_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} \int_{- \infty}^{\infty} y_{i}^{2} | y_{i} | e^{- μ y_{i}^{2} + γ y_{i}} d y_{i} \\ = \frac{μ}{_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} [\int_{0}^{\infty} y_{i}^{3} e^{- μ y_{i}^{2} + γ y_{i}} d y_{i} - \int_{- \infty}^{0} y_{i}^{3} e^{- μ y_{i}^{2} + γ y_{i}} d y_{i}] \\ = \frac{μ}{_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} [\int_{0}^{\infty} y_{i}^{3} e^{- μ y_{i}^{2} + γ y_{i}} d y_{i} + \int_{0}^{\infty} t_{i}^{3} e^{- μ t_{i}^{2} - γ t_{i}} d t_{i}] \\ = \frac{μ}{_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} {(\frac{1}{2 μ})}^{2} Γ (4) e^{\frac{γ^{2}}{8 μ}} [D_{- 4} (- \frac{γ}{\sqrt{2 μ}}) + D_{- 4} (\frac{γ}{\sqrt{2 μ}})] \\ = \frac{μ}{_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} {(\frac{1}{2 μ})}^{2} Γ (4) e^{\frac{γ^{2}}{8 μ}} [2 \times 2^{- 2} e^{- \frac{γ^{2}}{8 μ}} \frac{\sqrt{π}}{Γ (\frac{5}{2})}_{1} F_{1} (2, \frac{1}{2}; \frac{γ^{2}}{4 μ})] \\ = \frac{1}{μ} \frac{_{1} F_{1} (2, \frac{1}{2}; \frac{γ^{2}}{4 μ})}{_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} . \end{matrix}

In summary

\begin{matrix} E_{Y | Z} {Y_{i}^{2} | z_{i}, θ^{'}} & = \frac{1}{μ} \frac{_{1} F_{1} (2, \frac{1}{2}; \frac{γ^{2}}{4 μ})}{_{1} F_{1} (1, \frac{1}{2}, \frac{γ^{2}}{4 μ})} . \end{matrix}

(A34)
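The update rules of this appendix can be sketched in a few lines of Python. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the names `hyp1f1`, `posterior_moments`, and `em_step` are hypothetical, ${}_{1} F_{1}$ is evaluated with a plain power series, and the simulated parameter values are illustrative. The E-step uses the closed forms (A33) and (A34), and the M-step uses (A24), (A25), (A27), and (A28).

```python
import math
import random


def hyp1f1(a, b, x, tol=1e-15, max_terms=5000):
    """Kummer power series for 1F1(a; b; x); intended for moderate x >= 0."""
    term = total = 1.0
    for n in range(max_terms):
        term *= (a + n) * x / ((b + n) * (n + 1))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def posterior_moments(z, mu_x, sigma_x, mu_y, sigma_y):
    """E-step: E[Y | z] from (A33) and E[Y^2 | z] from (A34)."""
    mu = 0.5 * (z * z / sigma_x**2 + 1.0 / sigma_y**2)
    gamma = mu_y / sigma_y**2 + mu_x * z / sigma_x**2
    w = gamma * gamma / (4.0 * mu)
    denom = hyp1f1(1.0, 0.5, w)
    e_y = (gamma / mu) * hyp1f1(2.0, 1.5, w) / denom
    e_y2 = (1.0 / mu) * hyp1f1(2.0, 0.5, w) / denom
    return e_y, e_y2


def em_step(zs, theta):
    """One EM iteration; theta = (mu_x, sigma_x, mu_y, sigma_y) plays the role of theta'."""
    moments = [posterior_moments(z, *theta) for z in zs]
    n = len(zs)
    # M-step for theta_x, Equations (A24) and (A25).
    mu_x = sum(z * m1 for z, (m1, _) in zip(zs, moments)) / n
    var_x = sum(z * z * m2 for z, (_, m2) in zip(zs, moments)) / n - mu_x**2
    # M-step for theta_y, Equations (A27) and (A28).
    mu_y = sum(m1 for m1, _ in moments) / n
    var_y = sum(m2 for _, m2 in moments) / n - mu_y**2
    return mu_x, math.sqrt(var_x), mu_y, math.sqrt(var_y)


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative ratios simulated from independent Gaussians (assumed values).
    zs = [rng.gauss(0.24, 0.06) / rng.gauss(1.0, 0.25) for _ in range(80)]
    theta = (0.2, 0.05, 1.0, 0.2)  # assumed prior values theta'
    for _ in range(20):
        theta = em_step(zs, theta)
    mu_x, sigma_x, mu_y, sigma_y = theta
    print("beta =", mu_x / mu_y, "rho =", sigma_y / sigma_x, "delta_y =", sigma_y / mu_y)
```

The usage example reports $β$, $ρ$, and $δ_{y}$ rather than the four raw parameters because the ratio $Z = X / Y$ is unchanged when X and Y are multiplied by the same constant, so only scale-free combinations can be recovered from the ratios alone.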

Appendix A.3. Mean Value of the Ratio

Set

X \sim N (μ_{x}, σ_{x})

and

Y \sim N (μ_{y}, σ_{y})

. We show in this appendix that the expectation of the ratio,

E {Z}

where

Z = X / Y

is given by

E {Z} = \frac{β}{δ_{y}^{2}}_{1} F_{1} (1, \frac{3}{2}; - \frac{1}{2 δ_{y}^{2}})

(A35)

with

β = \frac{μ_{x}}{μ_{y}}

and

δ_{y} = \frac{σ_{y}}{μ_{y}}

.

We start by developing the mathematical expectation of the ratio,

E {Z}

:

\begin{matrix} E {Z} = E {X} E {\frac{1}{Y}} = μ_{x} E {\frac{1}{Y}} . \end{matrix}

(A36)

Since $Y \sim N (μ_{y}, σ_{y})$, the new variable $T = 1 / Y$ follows the reciprocal normal distribution with a pdf $f (t)$ defined as follows

\begin{matrix} f (t) = \frac{1}{\sqrt{2 π} σ_{y} t^{2}} exp (- \frac{{(1 / t - μ_{y})}^{2}}{2 σ_{y}^{2}}) . \end{matrix}

(A37)

The mean of T does not exist since $t f (t)$ is not Lebesgue integrable. We therefore define the expectation of the variable T as the Cauchy principal value of the integral of $t f (t)$, given as follows:

\begin{matrix} E \{ T \} & = \mathrm{P.V.} \int_{- \infty}^{+ \infty} t f (t) d t = \mathrm{P.V.} \frac{1}{\sqrt{2 π} σ_{y}} \int_{- \infty}^{+ \infty} \frac{1}{t} \exp (- \frac{{(1 / t - μ_{y})}^{2}}{2 σ_{y}^{2}}) d t \\ & = \frac{1}{\sqrt{2 π} σ_{y}} (\lim_{ϵ \to 0} \int_{- \infty}^{- ϵ} \frac{1}{x} \exp (- \frac{{(x - μ_{y})}^{2}}{2 σ_{y}^{2}}) d x + \lim_{ϵ \to 0} \int_{ϵ}^{+ \infty} \frac{1}{x} \exp (- \frac{{(x - μ_{y})}^{2}}{2 σ_{y}^{2}}) d x) \\ & = \frac{1}{\sqrt{2 π} σ_{y}} (\lim_{ϵ \to 0} \int_{ϵ}^{+ \infty} \frac{1}{x} \exp (- \frac{{(x - μ_{y})}^{2}}{2 σ_{y}^{2}}) d x - \lim_{ϵ \to 0} \int_{ϵ}^{+ \infty} \frac{1}{x} \exp (- \frac{{(- x - μ_{y})}^{2}}{2 σ_{y}^{2}}) d x) \\ & = \frac{1}{\sqrt{2 π} σ_{y}} \lim_{ϵ \to 0} \int_{ϵ}^{+ \infty} \frac{1}{x} [\exp (- \frac{{(x - μ_{y})}^{2}}{2 σ_{y}^{2}}) - \exp (- \frac{{(- x - μ_{y})}^{2}}{2 σ_{y}^{2}})] d x \\ & = \frac{2}{\sqrt{2 π} σ_{y}} \exp (- \frac{μ_{y}^{2}}{2 σ_{y}^{2}}) \int_{0}^{+ \infty} \frac{1}{x} \sinh (\frac{x μ_{y}}{σ_{y}^{2}}) \exp (- \frac{x^{2}}{2 σ_{y}^{2}}) d x . \end{matrix}

Using the following property (see [43], Section 3.562)

\begin{matrix} \int_{0}^{+ \infty} x^{2 μ - 1} exp (- β x^{2}) sinh (γ x) d x & = \frac{1}{2} Γ (2 μ) {(2 β)}^{- μ} exp (\frac{γ^{2}}{8 β}) \times \\ [D_{- 2 μ} (- \frac{γ}{\sqrt{2 β}}) - D_{- 2 μ} (\frac{γ}{\sqrt{2 β}})], \end{matrix}

(A38)

under the conditions $\operatorname{Re} (μ) > - 1 / 2$ and $\operatorname{Re} (β) > 0$, and using Equation (A32), we can deduce:

\begin{matrix} D_{- 2 μ} (- \frac{γ}{\sqrt{2 β}}) - D_{- 2 μ} (\frac{γ}{\sqrt{2 β}}) = 2 \times 2^{- μ} exp (- \frac{γ^{2}}{8 β}) \frac{\sqrt{2 π}}{Γ (μ)} \frac{γ}{\sqrt{2 β}}_{1} F_{1} (\frac{1 + 2 μ}{2}, \frac{3}{2}, \frac{γ^{2}}{4 β}) . \end{matrix}

(A39)

Thus,

\begin{matrix} \int_{0}^{+ \infty} x^{2 μ - 1} exp (- β x^{2}) sinh (γ x) d x = 2^{- μ} {(2 β)}^{- μ} \frac{Γ (2 μ)}{Γ (μ)} \sqrt{2 π} \frac{γ}{\sqrt{2 β}}_{1} F_{1} (\frac{1 + 2 μ}{2}, \frac{3}{2}, \frac{γ^{2}}{4 β}) . \end{matrix}

(A40)

In the case

μ = 0

, the above equation becomes:

\begin{matrix} \int_{0}^{+ \infty} x^{- 1} exp (- β x^{2}) sinh (γ x) d x = lim_{μ \to 0} \frac{Γ (2 μ)}{Γ (μ)} \sqrt{2 π} \frac{γ}{\sqrt{2 β}}_{1} F_{1} (\frac{1}{2}, \frac{3}{2}, \frac{γ^{2}}{4 β}) . \end{matrix}

(A41)

Knowing that

lim_{μ \to 0} μ Γ (μ) = 1

then

lim_{μ \to 0} \frac{2 μ Γ (2 μ)}{2 μ Γ (μ)} = 1 / 2

. As a consequence,

\begin{matrix} \int_{0}^{+ \infty} x^{- 1} exp (- β x^{2}) sinh (γ x) d x = \sqrt{\frac{π}{2}} \frac{γ}{\sqrt{2 β}}_{1} F_{1} (\frac{1}{2}, \frac{3}{2}, \frac{γ^{2}}{4 β}) . \end{matrix}

(A42)

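By applying this result to the expression of $E \{ T \}$ obtained above, with $β = \frac{1}{2 σ_{y}^{2}}$ and $γ = \frac{μ_{y}}{σ_{y}^{2}}$ in Equation (A42), we get after identification: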

\begin{matrix} E {T} & = \frac{μ_{y}}{σ_{y}^{2}} exp (- \frac{μ_{y}^{2}}{2 σ_{y}^{2}})_{1} F_{1} (\frac{1}{2}, \frac{3}{2}, \frac{μ_{y}^{2}}{2 σ_{y}^{2}}) \\ = \frac{μ_{y}}{σ_{y}^{2}}_{1} F_{1} (1, \frac{3}{2}, - \frac{μ_{y}^{2}}{2 σ_{y}^{2}}) . \end{matrix}

(A43)

The last equation is obtained using the property (see [43], Section 9.212):

\begin{matrix} _{1} F_{1} (α, γ, z) = e^{z}_{1} F_{1} (γ - α, γ, - z) . \end{matrix}

(A44)

We have therefore proven that:

\begin{matrix} E {Z} & = μ_{x} \frac{μ_{y}}{σ_{y}^{2}}_{1} F_{1} (1, \frac{3}{2}; - \frac{μ_{y}^{2}}{2 σ_{y}^{2}}) \\ = \frac{β}{δ_{y}^{2}}_{1} F_{1} (1, \frac{3}{2}; - \frac{1}{2 δ_{y}^{2}}), with β = \frac{μ_{x}}{μ_{y}} and δ_{y} = \frac{σ_{y}}{μ_{y}} . \end{matrix}

(A45)
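Equation (A45) is straightforward to evaluate numerically. The Python sketch below is an illustration under stated assumptions: the names `hyp1f1` and `mean_ratio` are hypothetical, and ${}_{1} F_{1}$ is computed with a power series after applying Kummer's transformation (A44) to negative arguments. For small $δ_{y}$, the result should be close to the classical second-order approximation $β (1 + δ_{y}^{2})$ of the mean of a ratio, which gives a simple sanity check.

```python
import math


def hyp1f1(a, b, x, tol=1e-15, max_terms=5000):
    """1F1(a; b; x) by power series; negative x is mapped to positive x with (A44)."""
    if x < 0:
        # Kummer's transformation: 1F1(a; b; x) = exp(x) * 1F1(b - a; b; -x).
        return math.exp(x) * hyp1f1(b - a, b, -x, tol, max_terms)
    term = total = 1.0
    for n in range(max_terms):
        term *= (a + n) * x / ((b + n) * (n + 1))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def mean_ratio(beta, delta_y):
    """Principal-value mean of Z = X / Y from Equations (A35) and (A45)."""
    return beta / delta_y**2 * hyp1f1(1.0, 1.5, -0.5 / delta_y**2)


if __name__ == "__main__":
    # Mean parameters of the Healthy plants of the bacteria data set (Figure 3).
    beta, delta_y = 0.15, 0.22
    print("E{Z} from (A45):        ", mean_ratio(beta, delta_y))
    print("beta * (1 + delta_y^2): ", beta * (1.0 + delta_y**2))
```

Only $β$ and $δ_{y}$ enter the formula, so the mean of the ratio does not depend on $ρ$ under the independence assumption.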

Appendix A.4. Fractional Moments of the Absolute Value of the Ratio

Set $X \sim N (μ_{x}, σ_{x})$ and $Y \sim N (μ_{y}, σ_{y})$. We provide in this appendix the fractional absolute moments given by

\begin{matrix} E \{ | Z |^{s} \} = E \{ | X |^{s} \} E \{ | 1 / Y |^{s} \}, \quad Z = X / Y, \quad \forall s, \; 0 < s < 1 \end{matrix}

(A46)

Set

T = 1 / Y

. We calculate then

E {| T |^{s}}

:

\begin{matrix} E {| T |^{s}} & = \frac{1}{\sqrt{2 π} σ_{y}} \int_{- \infty}^{+ \infty} {| t |}^{s - 2} exp (- \frac{{(\frac{1}{t} - μ_{y})}^{2}}{2 σ_{y}^{2}}) d t \\ = \frac{1}{\sqrt{2 π} σ_{y}} (\int_{- \infty}^{0} {(- t)}^{s - 2} exp - \frac{{(\frac{1}{t} - μ_{y})}^{2}}{2 σ_{y}^{2}} d t + \int_{0}^{\infty} t^{s - 2} exp - \frac{{(\frac{1}{t} - μ_{y})}^{2}}{2 σ_{y}^{2}} d t) \\ = \frac{1}{\sqrt{2 π} σ_{y}} (\int_{0}^{+ \infty} x^{- s} exp - \frac{{(- x - μ_{y})}^{2}}{2 σ_{y}^{2}} d x + \int_{0}^{+ \infty} x^{- s} exp - \frac{{(x - μ_{y})}^{2}}{2 σ_{y}^{2}} d x) \\ = \frac{1}{\sqrt{2 π} σ_{y}} exp (- \frac{μ_{y}^{2}}{2 σ_{y}^{2}}) \int_{0}^{+ \infty} x^{- s} exp (- \frac{x^{2}}{2 σ_{y}^{2}}) [exp (\frac{μ_{y} x}{σ_{y}^{2}}) + exp (- \frac{μ_{y} x}{σ_{y}^{2}})] d x \\ = \frac{2}{\sqrt{2 π} σ_{y}} exp (- \frac{μ_{y}^{2}}{2 σ_{y}^{2}}) \int_{0}^{+ \infty} x^{- s} exp (- \frac{x^{2}}{2 σ_{y}^{2}}) cosh (\frac{μ_{y} x}{σ_{y}^{2}}) d x . \end{matrix}

(A47)

Using the following property (see [43], Section 3.562):

\begin{matrix} \int_{0}^{+ \infty} x^{2 μ - 1} \exp (- β x^{2}) \cosh (γ x) d x & = \frac{1}{2} Γ (2 μ) {(2 β)}^{- μ} \exp (\frac{γ^{2}}{8 β}) \times \\ & [D_{- 2 μ} (- \frac{γ}{\sqrt{2 β}}) + D_{- 2 μ} (\frac{γ}{\sqrt{2 β}})], \\ & \operatorname{Re} (μ) > 0, \; \operatorname{Re} (β) > 0 . \end{matrix}

(A48)

And knowing that

\begin{matrix} D_{- 2 μ} (- \frac{γ}{\sqrt{2 β}}) + D_{- 2 μ} (\frac{γ}{\sqrt{2 β}}) & = 2 \times 2^{- μ} exp (\frac{- γ^{2}}{8 β}) \frac{\sqrt{π}}{Γ (\frac{1 + 2 μ}{2})}_{1} F_{1} (μ, \frac{1}{2}; \frac{γ^{2}}{4 β}) \end{matrix}

(A49)

We deduce:

\begin{matrix} \int_{0}^{+ \infty} x^{2 μ - 1} \exp (- β x^{2}) \cosh (γ x) d x = Γ (2 μ) {(4 β)}^{- μ} \frac{\sqrt{π}}{Γ (\frac{1 + 2 μ}{2})} {}_{1} F_{1} (μ, \frac{1}{2}, \frac{γ^{2}}{4 β}) \end{matrix}

(A50)

Substituting this last property in Equation (A47) leads to:

\begin{matrix} E {| T |^{s}} & = \frac{2}{\sqrt{2 π} σ_{y}} exp (- \frac{μ_{y}^{2}}{2 σ_{y}^{2}}) Γ (1 - s) σ_{y}^{1 - s} 2^{\frac{s - 1}{2}} \frac{\sqrt{π}}{Γ (\frac{2 - s}{2})}_{1} F_{1} (\frac{1 - s}{2}, \frac{1}{2}, \frac{μ_{y}^{2}}{2 σ_{y}^{2}}) \end{matrix}

(A51)

After simplification, we have:

\begin{matrix} E {| T |^{s}} & = {(\frac{\sqrt{2}}{σ_{y}})}^{s} exp (- \frac{μ_{y}^{2}}{2 σ_{y}^{2}}) \frac{Γ (1 - s)}{Γ (1 - s / 2)}_{1} F_{1} (\frac{1 - s}{2}, \frac{1}{2}, \frac{μ_{y}^{2}}{2 σ_{y}^{2}}) \\ = {(\frac{\sqrt{2}}{σ_{y}})}^{s} \frac{Γ (1 - s)}{Γ (1 - s / 2)}_{1} F_{1} (\frac{s}{2}, \frac{1}{2}, - \frac{μ_{y}^{2}}{2 σ_{y}^{2}}) \end{matrix}

(A52)

We deduce the case of X:

\begin{matrix} E {| X |^{s}} & = {(\frac{σ_{x}}{\sqrt{2}})}^{s} \frac{Γ (1 + s)}{Γ (1 + s / 2)}_{1} F_{1} (\frac{- s}{2}, \frac{1}{2}, - \frac{μ_{x}^{2}}{2 σ_{x}^{2}}) \end{matrix}

(A53)

To conclude:

\begin{matrix} E {| Z |^{s}} & = {(\frac{σ_{x}}{σ_{y}})}^{s} \frac{Γ (1 - s) Γ (1 + s)}{Γ (1 - \frac{s}{2}) Γ (1 + \frac{s}{2})}_{1} F_{1} (\frac{s}{2}, \frac{1}{2}, - \frac{μ_{y}^{2}}{2 σ_{y}^{2}})_{1} F_{1} (\frac{- s}{2}, \frac{1}{2}, - \frac{μ_{x}^{2}}{2 σ_{x}^{2}}) \\ = ρ^{- s} \frac{Γ (1 - s) Γ (1 + s)}{Γ (1 - \frac{s}{2}) Γ (1 + \frac{s}{2})}_{1} F_{1} (\frac{s}{2}, \frac{1}{2}, - \frac{1}{2 δ_{y}^{2}})_{1} F_{1} (\frac{- s}{2}, \frac{1}{2}, - \frac{1}{2 δ_{x}^{2}}) \end{matrix}

(A54)

with

β = \frac{μ_{x}}{μ_{y}}, ρ = \frac{σ_{y}}{σ_{x}}, δ_{x} = \frac{σ_{x}}{μ_{x}}, δ_{y} = \frac{σ_{y}}{μ_{y}}

.
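The closed forms (A52) to (A54) can be checked against direct numerical integration. The Python sketch below is an illustration under stated assumptions: all function names are hypothetical, ${}_{1} F_{1}$ is evaluated by power series with Kummer's transformation (A44), the relation $δ_{x} = δ_{y} / (ρ β)$ is derived from the definitions of $β$, $ρ$, $δ_{x}$, and $δ_{y}$ above, and the midpoint-rule helper `quad_abs_power` exists only to verify the formulas.

```python
import math


def hyp1f1(a, b, x, tol=1e-15, max_terms=5000):
    """1F1(a; b; x) by power series; negative x is mapped to positive x with (A44)."""
    if x < 0:
        return math.exp(x) * hyp1f1(b - a, b, -x, tol, max_terms)
    term = total = 1.0
    for n in range(max_terms):
        term *= (a + n) * x / ((b + n) * (n + 1))
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def abs_moment_x(s, mu_x, sigma_x):
    """E{|X|^s} from Equation (A53)."""
    scale = (sigma_x / math.sqrt(2.0)) ** s
    gammas = math.gamma(1.0 + s) / math.gamma(1.0 + s / 2.0)
    return scale * gammas * hyp1f1(-s / 2.0, 0.5, -mu_x**2 / (2.0 * sigma_x**2))


def abs_moment_inv_y(s, mu_y, sigma_y):
    """E{|1/Y|^s} from Equation (A52), valid for 0 < s < 1."""
    scale = (math.sqrt(2.0) / sigma_y) ** s
    gammas = math.gamma(1.0 - s) / math.gamma(1.0 - s / 2.0)
    return scale * gammas * hyp1f1(s / 2.0, 0.5, -mu_y**2 / (2.0 * sigma_y**2))


def frac_moment_ratio(s, beta, rho, delta_y):
    """E{|Z|^s} from Equation (A54), with delta_x = delta_y / (rho * beta)."""
    delta_x = delta_y / (rho * beta)
    gammas = math.gamma(1.0 - s) * math.gamma(1.0 + s)
    gammas /= math.gamma(1.0 - s / 2.0) * math.gamma(1.0 + s / 2.0)
    f_y = hyp1f1(s / 2.0, 0.5, -0.5 / delta_y**2)
    f_x = hyp1f1(-s / 2.0, 0.5, -0.5 / delta_x**2)
    return rho ** (-s) * gammas * f_y * f_x


def quad_abs_power(power, mu, sigma, n=40000, width=10.0):
    """Midpoint-rule value of E|V|^power for V ~ N(mu, sigma); check only."""
    lo = mu - width * sigma
    h = 2.0 * width * sigma / n
    total = 0.0
    for k in range(n):
        v = lo + (k + 0.5) * h
        total += abs(v) ** power * math.exp(-0.5 * ((v - mu) / sigma) ** 2)
    return total * h / (sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    # Healthy plants of the bacteria data set (Figure 3); mu_y = 1 is an assumed scale.
    s, beta, rho, delta_y = 0.5, 0.15, 5.79, 0.22
    closed = frac_moment_ratio(s, beta, rho, delta_y)
    numeric = quad_abs_power(s, beta, delta_y / rho) * quad_abs_power(-s, 1.0, delta_y)
    print(closed, numeric)
```

The value $s = 1 / 2$ is used here on the assumption that it corresponds to what the article calls the second-order fractional moment.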

Appendix A.5. Simulation of the CV_Y

We first investigate the variation of the mean value of the fractional moments for $δ_{y}$ values between 0.2 and 0.3. These values correspond to $δ_{y}$ of the bacteria data set. The simulation proceeds as follows: we repeat 5000 times the calculation of the mean of the second-order fractional moments for ten pairs of observations $X_{i} / Y_{i}$, where $X_{i} \sim N (μ_{X}, σ_{X})$ and $Y_{i} \sim N (μ_{Y}, σ_{Y})$, under varying coefficients of variation, with $μ_{X} / μ_{Y} = 0.15$. This corresponds to $\sqrt{μ_{X} / μ_{Y}} = 0.39$. We then deduce the standard deviation associated with these 5000 values of the fractional means. The choice of $μ_{X} / μ_{Y} = 0.15$ does not impact the simulation, as pointed out in [40]. Nevertheless, we chose a value close to the bacteria data for this first simulation. The simulation results of $C V_{y}$ are in Table A1.

Table A1. Simulation of the ratio distribution: mean values of the second-order fractional moment and standard deviations (in brackets) for 5000 values of 10 pairs of observations $X_{i} / Y_{i}$, where $X_{i} \sim N(μ_{X}, σ_{X})$ and $Y_{i} \sim N(μ_{Y}, σ_{Y})$, under varying coefficients of variation of Y and X, with $\sqrt{μ_{X} / μ_{Y}} = 0.39$.

${CV}_{X}$	${CV}_{Y}$
${CV}_{X}$	0.20	0.22	0.23	0.24	0.25	0.26	0.27	0.28	0.29	0.30
0.20	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.41	0.41	0.41
	(0.02)	(0.02)	(0.02)	(0.02)	(0.02)	(0.03)	(0.03)	(0.03)	(0.03)	(0.05)
0.21	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.41	0.41	0.41
	(0.02)	(0.02)	(0.02)	(0.02)	(0.04)	(0.02)	(0.03)	(0.04)	(0.03)	(0.05)
0.22	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.41	0.41
	(0.02)	(0.02)	(0.02)	(0.02)	(0.04)	(0.03)	(0.03)	(0.04)	(0.03)	(0.04)
0.23	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.41	0.41
	(0.02)	(0.02)	(0.02)	(0.02)	(0.02)	(0.03)	(0.03)	(0.03)	(0.04)	(0.04)
0.24	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.41	0.41
	(0.02)	(0.02)	(0.02)	(0.02)	(0.03)	(0.03)	(0.03)	(0.03)	(0.03)	(0.04)
0.25	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.41	0.41
	(0.02)	(0.02)	(0.02)	(0.02)	(0.02)	(0.03)	(0.03)	(0.04)	(0.03)	(0.08)
0.26	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.41	0.41
	(0.02)	(0.02)	(0.02)	(0.03)	(0.03)	(0.03)	(0.03)	(0.03)	(0.04)	(0.04)
0.27	0.39	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.41	0.41
	(0.02)	(0.02)	(0.02)	(0.03)	(0.03)	(0.06)	(0.03)	(0.03)	(0.04)	(0.08)
0.28	0.39	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.41
	(0.02)	(0.02)	(0.02)	(0.03)	(0.03)	(0.03)	(0.03)	(0.03)	(0.03)	(0.05)
0.29	0.39	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.41
	(0.02)	(0.02)	(0.03)	(0.03)	(0.03)	(0.03)	(0.03)	(0.04)	(0.04)	(0.04)
0.30	0.39	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.41
	(0.02)	(0.02)	(0.02)	(0.03)	(0.03)	(0.06)	(0.03)	(0.06)	(0.04)	(0.04)

The estimated values of $\sqrt{μ_{X} / μ_{Y}}$ are close to 0.39 (between 0.39 and 0.41 across Table A1), and the standard deviations (in brackets) associated with the mean values of the second-order fractional moments remain low (at most 0.08). Thus, we use the second-order fractional moment for the bacteria data set.
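The simulation protocol behind one cell of Table A1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the seed, and the choice $μ_{Y} = 1$ are hypothetical (only the ratio $μ_{X} / μ_{Y}$ matters, since the ratio is scale invariant), and the second argument of $N(\cdot, \cdot)$ is assumed to be a standard deviation.

```python
import numpy as np


def simulate_fractional_moment(cv_x, cv_y, ratio=0.15, s=0.5,
                               n_pairs=10, n_rep=5000, seed=0):
    """Mean and standard deviation, over n_rep repetitions, of the
    sample mean of |X_i / Y_i|^s computed on n_pairs simulated pairs."""
    rng = np.random.default_rng(seed)
    mu_y = 1.0                # arbitrary scale (assumption)
    mu_x = ratio * mu_y       # enforces mu_X / mu_Y = ratio
    x = rng.normal(mu_x, cv_x * mu_x, size=(n_rep, n_pairs))
    y = rng.normal(mu_y, cv_y * mu_y, size=(n_rep, n_pairs))
    per_repetition = np.mean(np.abs(x / y) ** s, axis=1)
    return per_repetition.mean(), per_repetition.std(ddof=1)


mean_est, sd_est = simulate_fractional_moment(cv_x=0.20, cv_y=0.20)
print(f"mean = {mean_est:.2f}, sd = {sd_est:.2f}")
```

Looping the same call over a grid of `cv_x` and `cv_y` values would produce a table with the layout of Table A1; exact figures depend on the random seed.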

Second, we investigated the mean value of the second-order fractional moment for $CV_{y}$ between 0.8 and 1.3, values that correspond to the $δ_{y}$ of the fungal pathogen data set. This moment showed a high variability over this range of $CV_{y}$ and a poor estimation of the mean value (we do not show the results of this simulation for the sake of readability). We then simulated with a fourth-order fractional moment ($s = 1/4$), which corresponds to $(μ_{X} / μ_{Y})^{1/4} = 0.65$. The results of this simulation are in Table A2.

Table A2. Simulation of the ratio distribution: mean values of the fourth-order fractional moment and standard deviations (in brackets) for 5000 values of 10 pairs of observations $X_{i} / Y_{i}$, where $X_{i} \sim N(μ_{X}, σ_{X})$ and $Y_{i} \sim N(μ_{Y}, σ_{Y})$, under varying coefficients of variation (CV), with $(μ_{X} / μ_{Y})^{1/4} = 0.65$.

${CV}_{X}$	${CV}_{Y}$
${CV}_{X}$	0.4	0.5	0.6	0.7	0.8	0.9	1	1.1	1.2	1.3
0.4	0.67	0.68	0.69	0.70	0.71	0.71	0.70	0.70	0.69	0.69
	(0.05)	(0.06)	(0.07)	(0.08)	(0.08)	(0.08)	(0.09)	(0.09)	(0.09)	(0.09)
0.5	0.66	0.68	0.69	0.70	0.70	0.70	0.70	0.69	0.69	0.68
	(0.06)	(0.07)	(0.08)	(0.08)	(0.09)	(0.09)	(0.09)	(0.09)	(0.09)	(0.09)
0.6	0.66	0.67	0.69	0.69	0.70	0.70	0.69	0.69	0.68	0.68
	(0.06)	(0.07)	(0.08)	(0.08)	(0.09)	(0.09)	(0.09)	(0.09)	(0.09)	(0.09)
0.7	0.65	0.67	0.69	0.69	0.69	0.69	0.69	0.69	0.68	0.67
	(0.06)	(0.07)	(0.08)	(0.08)	(0.09)	(0.09)	(0.09)	(0.09)	(0.09)	(0.09)
0.8	0.66	0.67	0.69	0.70	0.70	0.70	0.69	0.69	0.68	0.68
	(0.06)	(0.07)	(0.08)	(0.09)	(0.09)	(0.09)	(0.09)	(0.09)	(0.09)	(0.09)
0.9	0.66	0.67	0.69	0.70	0.70	0.70	0.69	0.69	0.69	0.68
	(0.06)	(0.07)	(0.09)	(0.09)	(0.10)	(0.09)	(0.10)	(0.09)	(0.09)	(0.09)
1	0.66	0.68	0.70	0.70	0.71	0.70	0.70	0.70	0.69	0.68
	(0.06)	(0.08)	(0.09)	(0.09)	(0.09)	(0.10)	(0.10)	(0.10)	(0.10)	(0.09)
1.1	0.67	0.69	0.70	0.71	0.71	0.71	0.71	0.71	0.70	0.69
	(0.06)	(0.08)	(0.09)	(0.09)	(0.09)	(0.10)	(0.10)	(0.10)	(0.10)	(0.10)
1.2	0.68	0.69	0.71	0.72	0.72	0.72	0.71	0.71	0.70	0.70
	(0.06)	(0.08)	(0.09)	(0.09)	(0.09)	(0.10)	(0.10)	(0.10)	(0.10)	(0.10)
1.3	0.68	0.70	0.71	0.72	0.72	0.72	0.72	0.72	0.71	0.71
	(0.06)	(0.08)	(0.09)	(0.09)	(0.09)	(0.10)	(0.10)	(0.10)	(0.10)	(0.11)

The standard deviations (in brackets) associated with the mean values of the fourth-order fractional moments show a low variation (between 0.05 and 0.11), and the exact value of 0.65 is reasonably approximated (estimates between 0.65 and 0.72). Therefore, we propose to use higher-order fractional moments (that is, smaller values of $s$) for higher values of $δ_{y}$. In our data sets, we use $s = 1/2$ for the bacteria data set and $s = 1/4$ for the fungal pathogen data set.
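One possible reading of why a smaller $s$ behaves better at large $δ_{y}$, offered here as an interpretation rather than as a statement from the original analysis, follows from applying Equation (A54) at order $2s$, assuming independent normal X and Y:

```latex
\operatorname{Var}\{|Z|^{s}\} = E\{|Z|^{2s}\} - \left(E\{|Z|^{s}\}\right)^{2},
\qquad
E\{|Z|^{2s}\} \propto Γ(1 - 2s) \, Γ(1 + 2s) < \infty \iff s < \tfrac{1}{2}
```

For $s = 1/2$ the factor $Γ(1 - 2s) = Γ(0)$ diverges, so the sample mean of $|Z|^{1/2}$ has no finite variance, whereas for $s = 1/4$ the variance exists. When $δ_{y}$ is small, the probability mass of Y near zero is negligible and this divergence is hardly visible in finite simulations, which is consistent with the stable values in Table A1.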

References

1. Maxwell, K.; Johnson, G.N. Chlorophyll fluorescence—A practical guide. J. Exp. Bot. 2000, 51, 659–668.
2. Gorbe, E.; Calatayud, A. Applications of chlorophyll fluorescence imaging technique in horticultural research: A review. Sci. Hortic. 2012, 138, 24–35.
3. Kalaji, H.M.; Schansker, G.; Ladle, R.J.; Goltsev, V.; Bosa, K.; Allakhverdiev, S.I.; Brestic, M.; Bussotti, F.; Calatayud, A.; Dąbrowski, P.; et al. Frequently asked questions about in vivo chlorophyll fluorescence: Practical issues. Photosynth. Res. 2014, 122, 121–158.
4. Kalaji, H.M.; Schansker, G.; Brestic, M.; Bussotti, F.; Calatayud, A.; Ferroni, L.; Goltsev, V.; Guidi, L.; Jajoo, A.; Li, P.; et al. Frequently asked questions about chlorophyll fluorescence, the sequel. Photosynth. Res. 2017, 132, 13–66.
5. Pérez-Bueno, M.L.; Pineda, M.; Barón, M. Phenotyping plant responses to biotic stress by chlorophyll fluorescence imaging. Front. Plant Sci. 2019, 10, 1135.
6. Valcke, R. Can chlorophyll fluorescence imaging make the invisible visible? Photosynthetica 2021, 59, 381–398.
7. Küpper, H.; Benedikty, Z.; Morina, F.; Andresen, E.; Mishra, A.; Trtílek, M. Analysis of OJIP chlorophyll fluorescence kinetics and QA reoxidation kinetics by direct fast imaging. Plant Physiol. 2019, 179, 369–381.
8. McAusland, L.; Atkinson, J.A.; Lawson, T.; Murchie, E.H. High throughput procedure utilising chlorophyll fluorescence imaging to phenotype dynamic photosynthesis and photoprotection in leaves under controlled gaseous conditions. Plant Methods 2019, 15, 1–15.
9. Harbinson, J.; Croce, R.; van Grondelle, R.; van Amerongen, H.; van Stokkum, I. Chlorophyll fluorescence as a tool for describing the operation and regulation of photosynthesis in vivo. In Light Harvesting in Photosynthesis; CRC Press: Boca Raton, FL, USA, 2018; pp. 539–571.
10. Schmierer, M.; Knopf, O.; Asch, F. Growth and photosynthesis responses of a super dwarf rice genotype to shade and nitrogen supply. Rice Sci. 2021, 28, 178–190.
11. Pleban, J.R.; Guadagno, C.R.; Mackay, D.S.; Weinig, C.; Ewers, B.E. Rapid chlorophyll a fluorescence light response curves mechanistically inform photosynthesis modeling. Plant Physiol. 2020, 183, 602–619.
12. Pavicic, M.; Overmyer, K.; Rehman, A.u.; Jones, P.; Jacobson, D.; Himanen, K. Image-Based Methods to Score Fungal Pathogen Symptom Progression and Severity in Excised Arabidopsis Leaves. Plants 2021, 10, 158.
13. Rousseau, C.; Belin, E.; Bove, E.; Rousseau, D.; Fabre, F.; Berruyer, R.; Guillaumès, J.; Manceau, C.; Jacques, M.A.; Boureau, T. High throughput quantitative phenotyping of plant resistance using chlorophyll fluorescence image analysis. Plant Methods 2013, 9, 17.
14. Leufen, G.; Noga, G.; Hunsche, M. Proximal sensing of plant-pathogen interactions in spring barley with three fluorescence techniques. Sensors 2014, 14, 11135–11152.
15. Su, L.; Dai, Z.; Li, S.; Xin, H. A novel system for evaluating drought–cold tolerance of grapevines using chlorophyll fluorescence. BMC Plant Biol. 2015, 15, 82.
16. Bresson, J.; Vasseur, F.; Dauzat, M.; Koch, G.; Granier, C.; Vile, D. Quantifying spatial heterogeneity of chlorophyll fluorescence during plant growth and in response to water stress. Plant Methods 2015, 11, 23.
17. Tatagiba, S.D.; DaMatta, F.M.; Rodrigues, F.Á. Leaf gas exchange and chlorophyll a fluorescence imaging of rice leaves infected with Monographella albescens. Phytopathology 2015, 105, 180–188.
18. Ajigboye, O.O.; Bousquet, L.; Murchie, E.H.; Ray, R.V. Chlorophyll fluorescence parameters allow the rapid detection and differentiation of plant responses in three different wheat pathosystems. Funct. Plant Biol. 2016, 43, 356–369.
19. Dias, C.S.; Araujo, L.; Chaves, J.A.A.; DaMatta, F.M.; Rodrigues, F.A. Water relation, leaf gas exchange and chlorophyll a fluorescence imaging of soybean leaves infected with Colletotrichum truncatum. Plant Physiol. Biochem. 2018, 127, 119–128.
20. Wen, Z.; Raffaello, T.; Zeng, Z.; Pavicic, M.; Asiegbu, F.O. Chlorophyll fluorescence imaging for monitoring effects of Heterobasidion parviporum small secreted protein induced cell death and in planta defense gene expression. Fungal Genet. Biol. 2019, 126, 37–49.
21. Polonio, Á.; Pineda, M.; Bautista, R.; Martínez-Cruz, J.; Pérez-Bueno, M.L.; Barón, M.; Pérez-García, A. RNA-seq analysis and fluorescence imaging of melon powdery mildew disease reveal an orchestrated reprogramming of host physiology. Sci. Rep. 2019, 9, 7978.
22. Kim, J.H.; Bhandari, S.R.; Chae, S.Y.; Cho, M.C.; Lee, J.G. Application of maximum quantum yield, a parameter of chlorophyll fluorescence, for early determination of bacterial wilt in tomato seedlings. Hortic. Environ. Biotechnol. 2019, 60, 821–829.
23. Wang, S.; Leus, L.; Lootens, P.; Van Huylenbroeck, J.; Van Labeke, M.C. Germination Kinetics and Chlorophyll Fluorescence Imaging Allow for Early Detection of Alkalinity Stress in Rhododendron Species. Horticulturae 2022, 8, 823.
24. Suárez, J.C.; Vanegas, J.I.; Contreras, A.T.; Anzola, J.A.; Urban, M.O.; Beebe, S.E.; Rao, I.M. Chlorophyll Fluorescence Imaging as a Tool for Evaluating Disease Resistance of Common Bean Lines in the Western Amazon Region of Colombia. Plants 2022, 11, 1371.
25. Schlie, T.P.; Dierend, W.; Köpcke, D.; Rath, T. Detecting low-oxygen stress of stored apples using chlorophyll fluorescence imaging and histogram division. Postharvest Biol. Technol. 2022, 189, 111901.
26. Brigmon, R.L.; McLeod, K.W.; Doman, E.; Seaman, J.C. The impact of tritium phytoremediation on plant health as measured by fluorescence. J. Environ. Radioact. 2022, 255, 107018.
27. Sapoukhina, N.; Boureau, T.; Rousseau, D. Plant disease symptom segmentation in chlorophyll fluorescence imaging with a synthetic dataset. Front. Plant Sci. 2022, 13, 969205.
28. D’Agostino, R.; Pearson, E.S. Tests for Departure from Normality. Empirical Results for the Distributions of b2 and √b1. Biometrika 1973, 60, 613–622.
29. Pavicic, M. MDPI_leaf_infection. Available online: https://github.com/mipavici/MDPI_leaf_infection (accessed on 15 November 2022).
30. Berger, S.; Benediktyová, Z.; Matouš, K.; Bonfig, K.; Mueller, M.J.; Nedbal, L.; Roitsch, T. Visualization of dynamics of plant–pathogen interaction by novel combination of chlorophyll fluorescence imaging and statistical analysis: Differential effects of virulent and avirulent strains of P. syringae and of oxylipins on A. thaliana. J. Exp. Bot. 2006, 58, 797–806.
31. Sánchez-Moreiras, A.M.; Graña, E.; Reigosa, M.J.; Araniti, F. Imaging of Chlorophyll a Fluorescence in Natural Compound-Induced Stress Detection. Front. Plant Sci. 2020, 11, 583590.
32. Genty, B.; Meyer, S. Quantitative Mapping of Leaf Photosynthesis Using Chlorophyll Fluorescence Imaging. Aust. J. Plant Physiol. 1995, 22, 277–284.
33. Marsaglia, G. Ratios of Normal Variables and Ratios of Sums of Uniform Variables. J. Am. Stat. Assoc. 1965, 60, 193–204.
34. Díaz-Francés, E.; Rubio, F. On the existence of a normal approximation to the distribution of the ratio of two independent normal random variables. Stat. Pap. 2013, 54, 309–323.
35. Pham-Gia, T.; Thanh, D.N. Hypergeometric Functions: From One Scalar Variable to Several Matrix Arguments, in Statistics and Beyond. Open J. Stat. 2016, 6, 951–994.
36. Hayya, J.; Armstrong, D.; Gressis, N. A Note on the Ratio of Two Normally Distributed Variables. Manag. Sci. 1975, 21, 1338–1341.
37. Kuethe, D.O.; Caprihan, A.; Gach, H.M.; Lowe, I.J.; Fukushima, E. Imaging obstructed ventilation with NMR using inert fluorinated gases. J. Appl. Physiol. 2000, 88, 2279–2286.
38. Marsaglia, G. Ratios of Normal Variables. J. Stat. Softw. 2006, 16, 1–10.
39. Marin, J.M.; Robert, C.P. Bayesian Essentials with R, 2nd ed.; Springer Texts in Statistics; Springer: Berlin/Heidelberg, Germany, 2014.
40. Qiao, C.G.; Wood, G.R.; Lai, C.D.; Luo, D.W. Comparison of two common estimators of the ratio of the means of independent normal variables in agricultural research. J. Appl. Math. Decis. Sci. 2006, 2006, 78375.
41. Foucher, J.; Ruh, M.; Briand, M.; Préveaux, A.; Barbazange, F.; Boureau, T.; Jacques, M.A.; Chen, N.W. Improving Common Bacterial Blight Phenotyping by Using Rub Inoculation and Machine Learning: Cheaper, Better, Faster, Stronger. Phytopathology® 2022, 112, 691–699.
42. Meline, V.; Delage, W.; Brin, C.; Li-Marchetti, C.; Sochard, D.; Arlat, M.; Rousseau, C.; Darrasse, A.; Briand, M.; Lebreton, G.; et al. Role of the acquisition of a type 3 secretion system in the emergence of novel pathogenic strains of Xanthomonas. Mol. Plant Pathol. 2019, 20, 33–50.
43. Zwillinger, D.; Moll, V. Preface to the Eighth Edition. In Table of Integrals, Series, and Products, 8th ed.; Zwillinger, D., Moll, V., Gradshteyn, I.S., Ryzhik, I.M., Eds.; Academic Press: Cambridge, MA, USA, 2014; p. xvii.

Figure 1. Example of chlorophyll fluorescence images of Arabidopsis thaliana inoculated with a bacterium, at day 2, pot 19, a healthy pot (inoculated with water): (a) $F_{m}$, maximum fluorescence, and (b) $F_{0}$, minimum fluorescence. The histograms (c,d) are the associated frequency distributions of pixel counts inside the region of interest drawn as a solid yellow line in (a,b), respectively. The dashed blue line in the histograms is the fit with a normal probability density function (pdf).

Figure 2. The black curve with circle points is the distribution $P_{Z}$ of the ratio $F_{v} / F_{m}$ ($F_{m}$, maximum fluorescence; $F_{v} = F_{m} - F_{0}$; $F_{0}$, minimum fluorescence) for increasing values of the parameters $β$, $ρ$, and $δ_{y}$, and the red curve with crossed points is the normal approximation of the distribution in each of these cases. (a) A case with a perfect fit between the ratio density and the normal approximation; (b) a deviation from the normal distribution; (c) a case where the ratio density is bimodal and the normal approximation is not appropriate.

Figure 3. The distribution $P_{Z}$ of the ratio $F_{v} / F_{m}$ ($F_{m}$, maximum fluorescence; $F_{v} = F_{m} - F_{0}$; $F_{0}$, minimum fluorescence) is the black curve with circle points, and the normal approximation is the red curve with crossed points. The parameters $β$, $ρ$, and $δ_{y}$ of the distribution $P_{Z}$ are set to the mean values of these parameters over the six days of acquisition of chlorophyll fluorescence images for (a) Healthy: $β = 0.15$, $ρ = 5.79$, and $δ_{y} = 0.22$ and (b) Diseased: $β = 0.24$, $ρ = 4.21$, and $δ_{y} = 0.25$ plants of the bacteria data set.

Figure 4. The distribution $P_{Z}$ of the ratio $F_{v} / F_{m}$ ($F_{m}$, maximum fluorescence; $F_{v} = F_{m} - F_{0}$; $F_{0}$, minimum fluorescence) is the black curve with circle points, and the normal approximation is the red curve with crossed points. The parameters $β$, $ρ$, and $δ_{y}$ of the distribution $P_{Z}$ are set to the mean values of these parameters over the acquisition times from 24 h to 96 h of the chlorophyll fluorescence images for (a) Healthy: $β = 0.192$, $ρ = 5.377$, and $δ_{y} = 0.461$ and (b) Diseased: $β = 0.358$, $ρ = 3.359$, and $δ_{y} = 0.976$ plants of the fungal pathogen data set.

Table 1. Mean ± standard deviation of four p-values associated with the D’Agostino test of normality in the limb of Arabidopsis thaliana inoculated with a bacterium, for $F_{m}$ (maximum fluorescence) and $F_{0}$ (minimum fluorescence). D0, …, and D8 are the six days of acquisition of chlorophyll fluorescence images.

Time	$F_{m}$		$F_{0}$
Time	Healthy	Diseased	Healthy	Diseased
D0	0.53 ± 0.33	0.55 ± 0.25	0.59 ± 0.34	0.60 ± 0.33
D2	0.39 ± 0.25	0.59 ± 0.35	0.31 ± 0.25	0.57 ± 0.38
D5	0.60 ± 0.38	0.45 ± 0.31	0.33 ± 0.19	0.31 ± 0.18
D6	0.30 ± 0.22	0.46 ± 0.23	0.47 ± 0.33	0.44 ± 0.26
D7	0.27 ± 0.10	0.24 ± 0.19	0.43 ± 0.31	0.44 ± 0.10
D8	0.26 ± 0.16	0.25 ± 0.10	0.25 ± 0.14	0.11 ± 0.04

Table 2. Mean, $μ$, and standard deviation, $σ$, values on Healthy and Diseased tissues of the chlorophyll fluorescence parameters $F_{0}$ (minimum fluorescence) and $F_{m}$ (maximum fluorescence) for images of plants inoculated with bacteria. D0, …, and D8 are the six days of acquisition of chlorophyll fluorescence images.

Time	$μ_{F_{0}}$		$σ_{F_{0}}$		$μ_{F_{m}}$		$σ_{F_{m}}$
Time	Healthy	Diseased	Healthy	Diseased	Healthy	Diseased	Healthy	Diseased
D0	62.845	42.158	15.261	6.870	418.173	182.367	89.129	33.985
D2	64.756	78.609	16.343	15.380	432.239	338.601	95.681	63.571
D5	67.168	76.984	16.796	21.338	438.586	301.224	96.591	87.018
D6	67.402	77.424	17.150	21.545	444.460	306.174	99.267	86.616
D7	68.781	77.363	17.407	21.546	447.470	306.662	100.153	87.427
D8	67.256	74.562	16.961	21.719	441.044	305.809	98.580	86.968

Table 3. Mean, $μ$, and standard deviation, $σ$, values on Healthy and Diseased tissues of the chlorophyll fluorescence parameters $F_{0}$ (minimum fluorescence) and $F_{m}$ (maximum fluorescence) for the fungal pathogen data set. 0 h, …, 96 h are the five acquisition times of the chlorophyll fluorescence images.

Time	$μ_{F_{0}}$		$σ_{F_{0}}$		$μ_{F_{m}}$		$σ_{F_{m}}$
Time	Healthy	Diseased	Healthy	Diseased	Healthy	Diseased	Healthy	Diseased
0 h	142.747	-	31.056	-	810.626	-	192.474	-
24 h	122.533	123.685	52.404	77.543	676.331	329.290	275.724	277.567
48 h	105.450	82.616	56.436	66.731	537.056	222.493	295.606	203.232
72 h	121.525	73.177	40.216	42.067	618.474	172.596	225.188	150.189
96 h	103.579	43.691	50.808	41.099	526.299	103.953	274.768	133.111

Table 4. The values of $β$, $ρ$, and $δ_{y}$ associated with the fluorescence data on Healthy and Diseased plants of the bacteria data set. D0, …, and D8 are the six days of acquisition of chlorophyll fluorescence images.

Time	$β$		$ρ$		$δ_{y}$
Time	Healthy	Diseased	Healthy	Diseased	Healthy	Diseased
D0	0.150	0.231	5.840	4.947	0.213	0.186
D2	0.150	0.232	5.855	4.133	0.221	0.188
D5	0.153	0.256	5.751	4.078	0.220	0.289
D6	0.152	0.253	5.788	4.020	0.223	0.283
D7	0.154	0.252	5.753	4.058	0.224	0.285
D8	0.152	0.244	5.812	4.004	0.224	0.284
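As a worked check, the D0 row of Table 4 can be recomputed from the D0 row of Table 2 using the definitions $β = μ_{x} / μ_{y}$, $ρ = σ_{y} / σ_{x}$, and $δ_{y} = σ_{y} / μ_{y}$ given in Appendix A.4. The sketch assumes that X corresponds to $F_{0}$ and Y to $F_{m}$, which is inferred here from the tabulated values; the helper name is hypothetical.

```python
def ratio_parameters(mu_x, sigma_x, mu_y, sigma_y):
    """Return beta, rho and delta_y from the means and standard deviations of X and Y."""
    return {
        "beta": mu_x / mu_y,        # ratio of the means
        "rho": sigma_y / sigma_x,   # ratio of the standard deviations
        "delta_y": sigma_y / mu_y,  # coefficient of variation of Y
    }


# D0 values from Table 2, with X = F0 and Y = Fm (assumption).
healthy_d0 = ratio_parameters(mu_x=62.845, sigma_x=15.261, mu_y=418.173, sigma_y=89.129)
diseased_d0 = ratio_parameters(mu_x=42.158, sigma_x=6.870, mu_y=182.367, sigma_y=33.985)

for label, values in (("Healthy", healthy_d0), ("Diseased", diseased_d0)):
    print(label, {name: round(value, 3) for name, value in values.items()})
```

Under this assumption, the rounded results agree with the D0 row of Table 4 to three decimals.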

Table 5. The values of $β$, $ρ$, and $δ_{y}$ associated with the fluorescence data on Healthy and Diseased plants of the fungal pathogen data set. 0 h, …, 96 h are the five acquisition times of the chlorophyll fluorescence images.

Time	$β$		$ρ$		$δ_{y}$
Time	Healthy	Diseased	Healthy	Diseased	Healthy	Diseased
0 h	0.176		6.198		0.237
24 h	0.181	0.376	5.262	3.580	0.408	0.843
48 h	0.196	0.371	5.238	3.046	0.550	0.913
72 h	0.196	0.424	5.599	3.570	0.364	0.870
96 h	0.197	0.420	5.408	3.239	0.522	1.280

Table 6. Second-order fractional moment of the $P_{Z}$ distribution of the ratio for the healthy and diseased leaves of the bacteria data set. D0, …, and D8 are the six days of acquisition of chlorophyll fluorescence images.

Plants	Time
Plants	D0	D2	D5	D6	D7	D8
Healthy	0.608	0.608	0.604	0.606	0.603	0.605
Diseased	0.514	0.514	0.478	0.482	0.483	0.492

Table 7. The mean value ($μ$) and the associated standard deviation ($σ$) of the second-order fractional moment in the Monte Carlo simulation for sample sizes of 10 and 80, with the first day of the experiment as a reference value, under the assumption of a Gaussian probability density function, with the Gaussian approximation proposed in [34], and with the two non-Gaussian estimators, Bayesian and EM.

Method of Estimation	Plants, Sample Size (N)
	Healthy,		Healthy,		Diseased,		Diseased,
	$N = 10$		$N = 80$		$N = 10$		$N = 80$
	$μ$	$σ$	$μ$	$σ$	$μ$	$σ$	$μ$	$σ$
Normal distribution	0.591	0.037	0.613	0.009	0.524	0.016	0.523	0.007
Normal approximation	0.618	0.021	0.618	0.007	0.523	0.019	0.523	0.007
Bayesian estimation	0.608	0.022	0.608	0.008	0.515	0.020	0.514	0.007
EM estimation	0.612	0.017	0.608	0.008	0.516	0.021	0.513	0.008

Table 8. Mean value of the relative error (%) for the five days of the experiment per method of estimation and per sample size, N, for Healthy and Diseased plants of the bacteria data set.

Method of Estimation	Plants, Sample Size (N)
	Healthy,	Healthy,	Diseased,	Diseased,
	$N = 10$	$N = 80$	$N = 10$	$N = 80$
Normal distribution	2	1	6	6
Normal approximation	2	2	6	6
Bayesian estimation	0	0	4	4
EM estimation	1	0	5	4

Table 9. Fourth-order fractional moment of the $P_{Z}$ distribution of the ratio for the Healthy and Diseased plants of the fungal pathogen data set. 0 h, …, 96 h are the five acquisition times of the chlorophyll fluorescence images.

Plants	Time
Plants	0 h	24 h	48 h	72 h	96 h
Healthy	0.349	0.335	0.304	0.322	0.306
Diseased	-	0.165	0.168	0.138	0.157

Table 10. The mean value ($μ$) and the associated standard deviation ($σ$) of the fourth-order fractional moment in the Monte Carlo simulation, with 24 h as a reference value.

Method of Estimation	Plants, Sample Size (N)
	Healthy,		Healthy,		Diseased,		Diseased,
	$N = 10$		$N = 80$		$N = 10$		$N = 80$
	$μ$	$σ$	$μ$	$σ$	$μ$	$σ$	$μ$	$σ$
Normal distribution	0.399	0.037	0.277	0.018	0.284	0.053	0.055	0.025
Normal approximation	0.370	0.035	0.369	0.012	0.231	0.055	0.231	0.019
Bayesian estimation	0.337	0.047	0.334	0.018	0.167	0.107	0.165	0.038
EM estimation	0.327	0.070	0.334	0.017	0.163	0.106	0.165	0.037

Table 11. Mean value of the relative error (%) for the five times of the experiment per method of estimation and per number of observations for Healthy and Diseased plants of the fungal pathogen data set.

Method of Estimation	Plants, Sample Size (N)
	Healthy,	Healthy,	Diseased,	Diseased,
	$N = 10$	$N = 80$	$N = 10$	$N = 80$
Normal distribution	26	12	82	65
Normal approximation	17	17	48	48
Bayesian estimation	7	6	7	7
EM estimation	5	6	6	7

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
