The method uses the cross-validation technique, which is based on dividing the dataset into test data (also known as validation data) and training data. The nearest-neighbour algorithm is applied to classify the data; then the maximum likelihood estimate and the error risk calculation are performed to find the optimal parameters. Finally, the information obtained from the training data is applied to analyze the test data.
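As a minimal sketch, the first step of this procedure (randomly partitioning the dataset into training and test parts) might look as follows in Python; the function name and the 50/50 split are illustrative choices, not taken from the paper:

```python
import random

def train_test_split(data, test_fraction=0.5, seed=0):
    """Randomly split a dataset into a training part and a test
    (validation) part: the first step of the procedure above."""
    rng = random.Random(seed)
    indices = list(range(len(data)))
    rng.shuffle(indices)
    cut = int(len(data) * (1 - test_fraction))
    train = [data[i] for i in indices[:cut]]
    test = [data[i] for i in indices[cut:]]
    return train, test
```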

#### 5.2. Numerical Illustrations

We first examine a simple numerical example for quantum data, then a case of data with errors, and finally a case of misreported claims. In all cases, we take $\lambda =1$ and $\Delta t=1$.

**Numerical example**. Consider the following dataset, which is divided into two subsets. The successive steps of the procedure described in Section 5.1 are applied from the initial estimate $({u}_{0},{d}_{0})=(40,25)$.

(1) Maxwell-Boltzmann likelihood (11) with risk function (14). We obtain the following results (Table 1).

Choosing $M=0.01$, we see that $(u,d)=(15,9)$ and $(p,q)=(0.1,0.9)$. The associated minimum risk is $2.987181$ and the maximum likelihood value is $2.198608\times 10^{-7}$. The loop takes 8 steps, i.e., it works very fast for a small dataset.
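The search loop itself is not spelled out in this section, so the sketch below shows one plausible tolerance-controlled descent over the integer $(u,d)$ grid; the neighbourhood update and stopping rule are illustrative stand-ins, and the actual likelihoods (11)–(16) and risk functions of the paper are not reproduced (a `risk` callable is assumed instead):

```python
def minimize_risk(risk, u0, d0, M=0.01, max_steps=200):
    """Greedy descent on the integer (u, d) grid, stopping once the
    improvement in the risk function falls below the tolerance M."""
    u, d = u0, d0
    for step in range(1, max_steps + 1):
        # examine the 8 neighbouring grid points (grid size Delta t = 1)
        nbrs = [(u + du, d + dd) for du in (-1, 0, 1) for dd in (-1, 0, 1)
                if (du, dd) != (0, 0)]
        best_u, best_d = min(nbrs, key=lambda p: risk(*p))
        if risk(best_u, best_d) >= risk(u, d) - M:
            return u, d, step          # improvement below M: stop
        u, d = best_u, best_d
    return u, d, max_steps
```

With a toy quadratic risk centred at $(15,9)$, this loop converges from $(40,25)$ to the minimizer; the paper's own update rule may reach it in fewer steps.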

To reduce overfitting, we apply the $k$-fold cross-validation method with $k=2$. This gives the results below (Table 2).

When ${V}_{1}$ is the training set, we get $(\overline{u},\overline{d})=(1/2)({u}_{1}+{u}_{2},{d}_{1}+{d}_{2})=(13,8.5)$ and $(\overline{p},\overline{q})=(1/2)({p}_{1}+{p}_{2},{q}_{1}+{q}_{2})=(0.3,0.7)$, with $F(13,8.5)=1.9884$. Thus, there is a significant reduction in the risk function, while $(u,d)$ remains close to the previous estimate.
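The fold-averaging step above can be sketched as follows; here `fit` is a hypothetical per-fold estimator standing in for the maximum-likelihood step, and the data values are illustrative only:

```python
def two_fold_estimates(data, fit):
    """2-fold cross-validation: split the data into two halves, estimate
    the parameters (u, d) on each half, and average the two estimates."""
    half = len(data) // 2
    folds = [data[:half], data[half:]]
    estimates = [fit(fold) for fold in folds]   # one (u_i, d_i) per fold
    u_bar = sum(u for u, _ in estimates) / len(estimates)
    d_bar = sum(d for _, d in estimates) / len(estimates)
    return u_bar, d_bar
```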

(2) Bose-Einstein likelihood (12) with risk function (15). Here are the numerical results (Table 3).

Observe that we obtain the same $(u,d)=(15,9)$ but with probabilities $(p,q)=(0.13,0.87)$. Again it takes 8 steps to reach the level $M=0.01$.

A 2-fold cross-validation improves the results as follows (Table 4).

With ${V}_{1}$ as training set, we get $(\overline{u},\overline{d})=(13,8.5)$ and $(\overline{p},\overline{q})=(0.33,0.67)$, with $F(13,8.5)=2.0047$ instead of $2.963947$ obtained before.

(3) Bose-Einstein likelihood (13) with risk function (16). The results are given in Table 5, again for $M=0.01$.

The results here are somewhat different, since $(u,d)=(17,9)$ and $(p,q)=(0.57,0.43)$. The loop now takes only six steps. For this dataset, the best-fitting model, i.e., the one with the smallest risk function, is the one based on Bose-Einstein statistics.

We also performed several numerical experiments with simulated data. In examples (4)–(7) below, the simulations yield datasets of size $n=100$ (we also used $n=1000$), and the calculations are made with $M=0.1$.

(4) Uniform random data (Table 6). As in examples (1) and (2), we apply the usual Maxwell-Boltzmann and Bose-Einstein statistics.

We notice that the best fit is not always given by the Maxwell-Boltzmann statistics.

**Data with errors**. We wish to examine a dataset perturbed by errors. For that, we start with a set $\{{j}_{1},{j}_{2},\dots ,{j}_{n}\}$ of true observables taking values in $\{0,u,d,u+d,2u,2d\}$. Then, we add a special random error $\{{e}_{1},{e}_{2},\dots ,{e}_{n}\}$, so that the generated dataset is $\{{j}_{1}+{e}_{1},{j}_{2}+{e}_{2},\dots ,{j}_{n}+{e}_{n}\}$. Below, we choose ${e}_{i}\in \{\mu ,0,-\mu \}$, where $\mu $ takes one of the three values $1,2,10$.
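A sketch of this perturbation scheme, assuming the true observables are sampled uniformly (as in example (5)); the function name and seeding are illustrative:

```python
import random

def perturbed_sample(n, u, d, mu, seed=0):
    """Generate n true observables uniformly from {0, u, d, u+d, 2u, 2d}
    and add an error e_i drawn from {mu, 0, -mu} to each one."""
    rng = random.Random(seed)
    values = [0, u, d, u + d, 2 * u, 2 * d]
    return [rng.choice(values) + rng.choice([mu, 0, -mu]) for _ in range(n)]
```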

(5) Random data with errors (Table 7). The non-perturbed data $\{{j}_{1},{j}_{2},\dots ,{j}_{n}\}$ come from a uniform sampling in $\{0,u,d,u+d,2u,2d\}$.

We see that, as before, the best model depends on the dataset. In the case of a small error $\mu $, the results are of course very close.

(6) Adjusted random data with errors (Table 8). The non-perturbed dataset $\{{j}_{1},{j}_{2},\dots ,{j}_{n}\}$ is obtained by simulation according to way 2.

The results are close when $\mu $ is small and slightly different when $\mu $ increases.

**Misreported data**. Data samples may fail to report or may misreport claims, either by mistake or deliberately. This can also occur because of a change of risk. Let $V$ be a dataset with $n$ reported claims and $m$ misreported claims, where $m$ is known but the true claim amounts are unknown. To handle the missing data, we apply a nearest-neighbour approach and approximate each missing quantity by the average of its $k$ closest neighbours. Below, we choose $k=\sqrt{n}$.
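A minimal sketch of this imputation rule, assuming "nearest" is measured by the distance between claim amounts (the paper's exact metric is not restated in this section):

```python
import math

def knn_impute(reported, misreported, k=None):
    """Approximate each misreported claim by the average of its k nearest
    neighbours among the reported claims (k = sqrt(n) by default)."""
    n = len(reported)
    if k is None:
        k = max(1, round(math.sqrt(n)))
    imputed = []
    for v in misreported:
        # the k reported values closest in amount to the misreported one
        neighbours = sorted(reported, key=lambda r: abs(r - v))[:k]
        imputed.append(sum(neighbours) / k)
    return imputed
```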

(7) Random data with misreports (Table 9). First, data $\{{v}_{1},{v}_{2},\dots ,{v}_{n}\}$ are generated according to the Maxwell-Boltzmann model perturbed by errors via (17). Then, random errors $\{{e}_{1},{e}_{2},\dots ,{e}_{m}\}$ are generated to replace the missing data, where $m$ takes the values $0,5,20$ ($m=0$ meaning no missing data). Finally, the two datasets are combined by placing the errors at random positions.
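The final combination step can be sketched as follows, assuming the error values overwrite the claims at $m$ uniformly chosen positions (one reading of "at random positions"); the helper name is illustrative:

```python
import random

def place_misreports(reported, errors, seed=0):
    """Overwrite the claims at m randomly chosen positions with the
    substituted error values, giving the combined dataset of example (7)."""
    rng = random.Random(seed)
    data = list(reported)
    for pos, e in zip(rng.sample(range(len(data)), len(errors)), errors):
        data[pos] = e
    return data
```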

As expected, a small value of $m$ does not affect the results very much. What is a little surprising is that, for a relatively large value $m=20$ (i.e., $20\%$), the probability estimates change only slightly (about $2\%$), while the claim amount estimates are significantly modified (about $20\%$).

In practice, the algorithm works well and quickly in most situations. We also performed numerical calculations with a grid size of $\Delta t=0.1$; essentially, only the value of the risk function is affected.