# Beyond Henssge’s Formula: Using Regression Trees and a Support Vector Machine for Time of Death Estimation in Forensic Medicine

## Abstract

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Data-Driven Model

#### 2.1.1. The Generation of Data and Test Data

- Time (h): 1–18, in steps of 0.5 h.
- Ambient temperature ($^{\circ}\mathrm{C}$): −10 to 35, in steps of 0.5 $^{\circ}\mathrm{C}$.
- Correction factor: 0.7, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, based on Table 5 in [49].
- Body weight (kg): 50–100 kg, with a precision of 0.5 kg, drawn from a normal distribution with post-selection (mean of 70 kg, with $\sigma$ large enough to generate an appropriate quantity of test data close to the upper limit).
- Rectal temperature ($^{\circ}\mathrm{C}$): calculated from the randomly selected values above using the Henssge formula, according to Algorithm 1, which uses Algorithm 2.
- The number of desired data points. This is only an approximate value, since weights drawn from the normal distribution that fell outside the desired range were discarded and appear in neither the training nor the test data set.
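The parameter ranges listed above can be sketched as a small sampler. This is a minimal illustration, not the authors' code: the function names, the uniform draws for time, ambient temperature, and correction factor, and the value `weight_sigma = 15.0` are assumptions (the text only says that $\sigma$ is "large enough").

```python
import random

# Correction factors listed in the text (Table 5 in [49]).
CORRECTION_FACTORS = [0.7, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4]


def round_to_half(value):
    """Round a number to the nearest multiple of 0.5."""
    return round(value * 2) / 2


def sample_parameter_set(rng, weight_sigma=15.0):
    """Draw one (time, ambient, factor, weight) tuple.

    Returns None when the drawn weight falls outside 50-100 kg
    (post-selection). weight_sigma is an assumed value.
    """
    time_h = rng.randrange(2, 37) / 2        # 1.0 ... 18.0 h, step 0.5
    ambient_c = rng.randrange(-20, 71) / 2   # -10.0 ... 35.0 deg C, step 0.5
    factor = rng.choice(CORRECTION_FACTORS)
    weight_kg = round_to_half(rng.gauss(70.0, weight_sigma))
    if not 50.0 <= weight_kg <= 100.0:
        return None
    return time_h, ambient_c, factor, weight_kg


def sample_many(n_desired, seed=0):
    """Draw n_desired candidates and keep only the accepted ones."""
    rng = random.Random(seed)
    draws = (sample_parameter_set(rng) for _ in range(n_desired))
    return [d for d in draws if d is not None]
```

Because rejected weights are dropped rather than redrawn, `sample_many(n)` returns somewhat fewer than `n` tuples, which mirrors the remark that the desired number of data points is only approximate.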

- (1) The ambient temperature must not be higher than the measured rectal temperature. Since the rectal temperature was calculated during data generation, this case could not occur.
- (2) Correction factors up to 1.4 do not need to be adjusted for body weight. Larger values must be adjusted, but our model is currently not set up for this (see Table 5 in [49]).
- (3)

- (1) Randomly select one parameter set (weight, correction factor, ambient temperature) from the required sets of parameters for the Henssge formula.
- (2) Determine the rectal temperature by evaluating the Henssge formula.

Algorithm 1: Calculating rectal temperature.

Algorithm 2: Body weight adjusted by correction factor.

Algorithm 3: Generating training data and test data.
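Since the algorithm listings themselves are not reproduced here, the following is a hedged sketch of how the forward evaluation (elapsed time to rectal temperature) could look. It uses the commonly published form of the Henssge formula, with a starting temperature of 37.2 $^{\circ}\mathrm{C}$ and the constant $A$ switching between 1.25 and 1.11 at the ambient-temperature threshold. The function name is hypothetical, and treating the corrected weight as a simple product of correction factor and body weight is an assumption that only holds for factors up to 1.4.

```python
import math

# Assumed rectal temperature at the time of death in the Henssge model.
BODY_TEMP_AT_DEATH_C = 37.2


def henssge_rectal_temperature(time_h, ambient_c, weight_kg,
                               correction_factor=1.0):
    """Evaluate the Henssge cooling curve for a given elapsed time.

    Assumption: the corrected weight is correction_factor * weight_kg,
    with no further weight-dependent adjustment of the factor.
    """
    corrected_weight = correction_factor * weight_kg
    b = -1.2815 * corrected_weight ** -0.625 + 0.0284
    if ambient_c <= 23.2:
        a, k = 1.25, 5.0
    else:
        a, k = 1.11, 10.0
    # Standardized temperature Q = (T_r - T_a) / (37.2 - T_a).
    q = a * math.exp(b * time_h) - (a - 1.0) * math.exp(k * b * time_h)
    return ambient_c + (BODY_TEMP_AT_DEATH_C - ambient_c) * q
```

For example, `henssge_rectal_temperature(10, 20, 70)` gives a value between 31 and 32 $^{\circ}\mathrm{C}$, and at `time_h = 0` the function returns the starting temperature.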

#### 2.1.2. Training

#### 2.1.3. Testing

#### 2.1.4. Error Calculation

#### Sum of Squared Residuals (SSR)

#### Mean Squared Error (MSE)

#### Mean Absolute Error (MAE)

#### Coefficient of Determination (${R}^{2}$)
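The four error measures named in this subsection can be computed as in the sketch below. It uses the standard textbook definitions, with $R^2 = 1 - \mathrm{SSR}/\mathrm{TSS}$; the function name and the dictionary return format are illustrative choices, not taken from the paper.

```python
def regression_errors(y_true, y_pred):
    """Return SSR, MSE, MAE and R^2 for paired true/predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ssr = sum(r * r for r in residuals)             # sum of squared residuals
    mse = ssr / n                                   # mean squared error
    mae = sum(abs(r) for r in residuals) / n        # mean absolute error
    mean_true = sum(y_true) / n
    tss = sum((t - mean_true) ** 2 for t in y_true) # total sum of squares
    r2 = 1.0 - ssr / tss                            # coefficient of determination
    return {"SSR": ssr, "MSE": mse, "MAE": mae, "R2": r2}
```

With times measured in hours, MAE is in hours and MSE in squared hours, which is worth keeping in mind when comparing the two columns in the result tables.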

## 3. Results

- Regression tree;
- Random forests;
- Extremely randomized trees;
- Tree modified with the bagging method;
- SVR with an RBF kernel;
- SVR improved with adaptive boosting.
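As a rough sketch, the six model families listed above could be instantiated with scikit-learn as follows. The parameter names in the Appendix A tables (`criterion`, `max_features`, `n_estimators`, `loss`) suggest scikit-learn, but that is an inference; the specific hyperparameter values below are picked from those tables only for illustration, and `build_models` is a hypothetical helper.

```python
from sklearn.ensemble import (AdaBoostRegressor, BaggingRegressor,
                              ExtraTreesRegressor, RandomForestRegressor)
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


def build_models(random_state=10):
    """Return one untrained regressor per model family (illustrative settings)."""
    return {
        "Regression tree": DecisionTreeRegressor(random_state=random_state),
        "Random forests": RandomForestRegressor(
            n_estimators=200, max_features=None, random_state=random_state),
        "Extremely randomized trees": ExtraTreesRegressor(
            n_estimators=200, max_features=None, random_state=random_state),
        # The base estimator is passed positionally so the call works
        # regardless of the keyword name used by the installed version.
        "Tree with bagging": BaggingRegressor(
            DecisionTreeRegressor(), n_estimators=200,
            random_state=random_state),
        "SVR with RBF kernel": SVR(kernel="rbf", C=5, gamma=1, epsilon=0.01),
        "AdaBoost + SVR": AdaBoostRegressor(
            SVR(kernel="rbf", C=5, gamma=1, epsilon=0.01),
            n_estimators=20, loss="exponential", random_state=random_state),
    }
```

Each model would then be fitted on feature rows such as (rectal temperature, ambient temperature, body weight, correction factor) with the elapsed time as the target; this feature layout is an assumption based on the data generation described in Section 2.1.1.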

#### Results of Training

Since the `C` parameter represents the trade-off between minimizing false classification errors and maximizing the decision margin (the higher the value of `C`, the fewer the false classifications and the stricter the decision margin), we performed four additional control runs with higher `C` values (10, 20, 50, 100) to check the accuracy of the SVR estimate, also when improved by adaptive boosting, for these four cases. The results are divided into two parts: first for SVR alone, and then for SVR improved with the adaptive boosting method.

The results for `C` = 50 and `C` = 100 at a sample size of approximately 11,000 were as follows. Based on Table 2 and Table 3, it can be concluded that increasing the value of `C` further improved the achieved results. By breaking down the average error on the 25% of the data used for testing by correction factor, with a 5 kg binning of body weight, we determined for the cases of `C` = 5 and `C` = 100 (see Figure 2) that in the former case the error was approximately $\pm 0.3$ h (roughly ±20 min), and in the latter case the two worst results were approximately $\pm 0.16$ h = ±9.6 min, while the average errors were below 4 min.
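The breakdown described above (average test error per correction factor and 5 kg weight bin) can be sketched as a simple grouping step. The record layout `(weight_kg, correction_factor, error_h)` and both function names are assumptions made for this example.

```python
from collections import defaultdict


def mean_error_by_bin(records, bin_width_kg=5.0):
    """Average the error per (correction factor, weight bin start).

    records: iterable of (weight_kg, correction_factor, error_h) tuples,
    where error_h is predicted minus true time in hours.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for weight_kg, factor, error_h in records:
        bin_start = int(weight_kg // bin_width_kg) * bin_width_kg
        acc = sums[(factor, bin_start)]
        acc[0] += error_h
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def hours_to_minutes(hours):
    """Convert an error in hours to minutes."""
    return hours * 60.0
```

For instance, `hours_to_minutes(0.16)` reproduces the 9.6 min quoted for the two worst bins in the `C` = 100 case.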

## 4. Discussion

**Table 5.** Comparison of our SVM and AdaBoost + SVM results with the result from the multilayer feedforward network (neural method) by Zerdazi et al. [30] in the case of Scenario 1.

Name | MAE | MSE |
---|---|---|
Neural method | 1.85 | 5.69 |
SVR | 0.17 | 0.14 |
AdaBoost + SVR | 0.17 | 0.12 |

**Table 6.** Comparison of our SVM and AdaBoost + SVM results with the result from the multilayer feedforward network (neural method) by Zerdazi et al. [30] in the case of Scenario 2.

Name | MAE | MSE |
---|---|---|
Neural method | 0.86 | 1.21 |
SVR | 0.18 | 0.08 |
AdaBoost + SVR | 0.14 | 0.05 |

## 5. Conclusions

## Abbreviations

Abbreviation | Meaning |
---|---|
MAE | Mean absolute error |
MSE | Mean squared error |
PMI | Post-mortem interval |
${\mathrm{R}}^{2}$ | Coefficient of determination |
RBF | Radial basis function |
SSR | Sum of squared residuals |
SVM | Support vector machine |
SVR | Support vector regression |
TSS | Total sum of squares |

## Appendix A

(#) | Number of Cases | Training Time (s) | Prediction Time (s) | MAE | MSE | ${R}^{2}$ | Best Parameters |
---|---|---|---|---|---|---|---|
1 | 986 | 3.210 | 0.028 | 1.4494 | 4.1032 | 0.8285 | random_state = 10 |
2 | 1955 | 4.079 | 0.035 | 1.1186 | 2.7904 | 0.8814 | criterion = 'friedman_mse' |
3 | 2925 | 4.773 | 0.085 | 1.0143 | 2.2565 | 0.9114 | criterion = 'friedman_mse', random_state = 10 |
4 | 3909 | 3.410 | 0.066 | 1.0092 | 2.2439 | 0.9048 | criterion = 'poisson', random_state = 10 |
5 | 4881 | 3.994 | 0.084 | 0.9533 | 1.8468 | 0.9199 | criterion = 'friedman_mse', random_state = 50 |
6 | 5864 | 3.757 | 0.094 | 0.9168 | 1.7343 | 0.9275 | criterion = 'friedman_mse', random_state = 10 |
7 | 6806 | 5.299 | 0.106 | 0.8515 | 1.4768 | 0.9386 | default |
8 | 7824 | 5.210 | 0.122 | 0.8436 | 1.5394 | 0.9380 | criterion = 'friedman_mse', random_state = 100 |
9 | 8785 | 4.753 | 0.139 | 0.7854 | 1.2311 | 0.9476 | random_state = 50 |
10 | 9759 | 5.269 | 0.150 | 0.7842 | 1.2724 | 0.9480 | criterion = 'friedman_mse', random_state = 50 |
11 | 10755 | 4.995 | 0.166 | 0.7522 | 1.2489 | 0.9505 | criterion = 'poisson', random_state = 25 |
12 | 11708 | 6.811 | 0.184 | 0.7371 | 1.1654 | 0.9523 | criterion = 'poisson', random_state = 10 |

(#) | Number of Cases | Training Time (s) | Prediction Time (s) | MAE | MSE | ${R}^{2}$ | Best Parameters |
---|---|---|---|---|---|---|---|
1 | 986 | 39.188 | 2.749 | 0.9662 | 1.7966 | 0.9249 | criterion = 'poisson', max_features = None, n_estimators = 200, random_state = 10 |
2 | 1955 | 61.648 | 2.630 | 0.7092 | 1.0064 | 0.9572 | max_features = None, random_state = 25 |
3 | 2925 | 106.174 | 8.661 | 0.6797 | 0.9713 | 0.9619 | criterion = 'poisson', max_features = None, n_estimators = 200, random_state = 25 |
4 | 3909 | 120.701 | 9.031 | 0.6110 | 0.8032 | 0.9659 | criterion = 'poisson', max_features = None, n_estimators = 150, random_state = 25 |
5 | 4881 | 148.617 | 14.466 | 0.5761 | 0.6445 | 0.9720 | criterion = 'poisson', max_features = None, n_estimators = 200, random_state = 25 |
6 | 5864 | 154.820 | 15.039 | 0.5370 | 0.5759 | 0.9759 | max_features = None, n_estimators = 200, random_state = 50 |
7 | 6806 | 180.887 | 15.659 | 0.4801 | 0.4774 | 0.9802 | criterion = 'friedman_mse', max_features = None, n_estimators = 150 |
8 | 7824 | 207.772 | 15.076 | 0.4967 | 0.5081 | 0.9795 | criterion = 'poisson', max_features = None, n_estimators = 150 |
9 | 8785 | 266.933 | 24.548 | 0.4510 | 0.4271 | 0.9818 | criterion = 'poisson', max_features = None, n_estimators = 200 |
10 | 9759 | 342.604 | 31.534 | 0.4608 | 0.4486 | 0.9817 | criterion = 'poisson', max_features = None, n_estimators = 200, random_state = 10 |
11 | 10755 | 401.349 | 31.706 | 0.4252 | 0.3816 | 0.9849 | criterion = 'friedman_mse', max_features = None, n_estimators = 200, random_state = 10 |
12 | 11708 | 344.101 | 29.897 | 0.4134 | 0.3625 | 0.9852 | criterion = 'poisson', max_features = None, n_estimators = 200, random_state = 10 |

(#) | Number of Cases | Training Time (s) | Prediction Time (s) | MAE | MSE | ${R}^{2}$ | Best Parameters |
---|---|---|---|---|---|---|---|
1 | 986 | 12.355 | 0.883 | 0.8422 | 1.3935 | 0.9417 | max_features = None, n_estimators = 50 |
2 | 1955 | 24.224 | 5.954 | 0.5849 | 0.7086 | 0.9699 | criterion = 'friedman_mse', max_features = None, n_estimators = 200 |
3 | 2925 | 29.064 | 9.108 | 0.5909 | 0.7653 | 0.9700 | criterion = 'friedman_mse', max_features = None, n_estimators = 200 |
4 | 3909 | 40.331 | 13.009 | 0.5108 | 0.5983 | 0.9746 | criterion = 'friedman_mse', max_features = None, n_estimators = 200 |
5 | 4881 | 48.345 | 10.611 | 0.4896 | 0.5011 | 0.9783 | criterion = 'poisson', max_features = None, n_estimators = 125 |
6 | 5864 | 60.539 | 17.568 | 0.4568 | 0.4608 | 0.9807 | criterion = 'poisson', max_features = None, n_estimators = 200 |
7 | 6806 | 67.474 | 11.464 | 0.4143 | 0.3983 | 0.9834 | criterion = 'poisson', max_features = None, n_estimators = 125 |
8 | 7824 | 81.665 | 23.157 | 0.4052 | 0.3798 | 0.9847 | max_features = None, n_estimators = 200 |
9 | 8785 | 103.041 | 15.619 | 0.3745 | 0.3372 | 0.9856 | criterion = 'poisson', max_features = None, n_estimators = 125 |
10 | 9759 | 132.059 | 32.412 | 0.3645 | 0.3218 | 0.9869 | criterion = 'friedman_mse', max_features = None, n_estimators = 200 |
11 | 10755 | 144.701 | 33.263 | 0.3419 | 0.2745 | 0.9891 | max_features = None, n_estimators = 200 |
12 | 11708 | 139.604 | 30.451 | 0.3437 | 0.2850 | 0.9883 | max_features = None, n_estimators = 200 |

(#) | Number of Cases | Training Time (s) | Prediction Time (s) | MAE | MSE | ${R}^{2}$ | Best Parameters |
---|---|---|---|---|---|---|---|
1 | 986 | 2.176 | 1.593 | 0.9907 | 1.8638 | 0.9221 | n_estimators = 100 |
2 | 1955 | 3.515 | 4.360 | 0.6796 | 0.9420 | 0.9600 | n_estimators = 125 |
3 | 2925 | 4.198 | 4.913 | 0.6961 | 1.0027 | 0.9606 | n_estimators = 100 |
4 | 3909 | 6.989 | 13.231 | 0.6193 | 0.8344 | 0.9646 | n_estimators = 200 |
5 | 4881 | 7.111 | 10.565 | 0.5726 | 0.6419 | 0.9722 | n_estimators = 125 |
6 | 5864 | 7.989 | 18.109 | 0.5387 | 0.5820 | 0.9757 | n_estimators = 200 |
7 | 6806 | 6.822 | 10.743 | 0.4857 | 0.4862 | 0.9798 | n_estimators = 100 |
8 | 7824 | 9.390 | 24.057 | 0.4859 | 0.4912 | 0.9802 | n_estimators = 200 |
9 | 8785 | 9.695 | 19.855 | 0.4592 | 0.4441 | 0.9811 | n_estimators = 150 |
10 | 9759 | 9.830 | 17.622 | 0.4616 | 0.4524 | 0.9815 | n_estimators = 150 |
11 | 10755 | 11.064 | 24.376 | 0.4313 | 0.3870 | 0.9847 | n_estimators = 150 |
12 | 11708 | 11.208 | 24.229 | 0.4153 | 0.3629 | 0.9851 | n_estimators = 200 |

(#) | Number of Cases | Training Time (s) | Prediction Time (s) | MAE | MSE | ${R}^{2}$ | Best Parameters |
---|---|---|---|---|---|---|---|
1 | 986 | 0.749 | 0.070 | 0.9406 | 2.5332 | 0.8941 | C = 5, epsilon = 0.005, gamma = 1 |
2 | 1955 | 3.184 | 0.163 | 0.5779 | 1.2309 | 0.9477 | C = 5, epsilon = 0.01, gamma = 1 |
3 | 2925 | 5.612 | 0.333 | 0.4774 | 0.8372 | 0.9671 | C = 5, epsilon = 0.01, gamma = 1 |
4 | 3909 | 9.611 | 0.360 | 0.4257 | 0.9012 | 0.9618 | C = 5, epsilon = 0.05, gamma = 1 |
5 | 4881 | 13.404 | 0.626 | 0.3987 | 0.7310 | 0.9683 | C = 5, epsilon = 0.05, gamma = 1 |
6 | 5864 | 18.982 | 0.612 | 0.3873 | 0.6430 | 0.9731 | C = 5, epsilon = 0.05, gamma = 1 |
7 | 6806 | 28.362 | 1.058 | 0.3181 | 0.4621 | 0.9808 | C = 5, epsilon = 0.01, gamma = 1 |
8 | 7824 | 28.212 | 0.824 | 0.3372 | 0.4831 | 0.9805 | C = 5, gamma = 1 |
9 | 8785 | 36.948 | 1.110 | 0.2882 | 0.3819 | 0.9837 | C = 5, epsilon = 0.05, gamma = 1 |
10 | 9759 | 50.228 | 1.318 | 0.3093 | 0.4550 | 0.9814 | C = 5, epsilon = 0.05, gamma = 1 |
11 | 10755 | 58.958 | 1.651 | 0.2763 | 0.3306 | 0.9869 | C = 5, epsilon = 0.01, gamma = 1 |
12 | 11708 | 70.528 | 2.318 | 0.2753 | 0.3388 | 0.9861 | C = 5, epsilon = 0.01, gamma = 2 |

(#) | Number of Cases | Training Time (s) | Prediction Time (s) | MAE | MSE | ${R}^{2}$ | Best Parameters |
---|---|---|---|---|---|---|---|
1 | 986 | 30.903 | 1.401 | 0.9026 | 2.1245 | 0.9112 | loss = 'exponential', n_estimators = 20, random_state = 15 |
2 | 1955 | 132.368 | 2.591 | 0.5437 | 0.8068 | 0.9657 | loss = 'exponential', n_estimators = 20, random_state = 15 |
3 | 2925 | 262.288 | 3.775 | 0.4243 | 0.4630 | 0.9818 | loss = 'square', n_estimators = 20 |
4 | 3909 | 474.196 | 7.863 | 0.3575 | 0.4367 | 0.9815 | loss = 'exponential', n_estimators = 20 |
5 | 4881 | 737.671 | 9.702 | 0.3358 | 0.3134 | 0.9864 | loss = 'exponential', n_estimators = 20, random_state = 15 |
6 | 5864 | 1163.730 | 13.094 | 0.3190 | 0.2559 | 0.9893 | loss = 'exponential', n_estimators = 20, random_state = 25 |
7 | 6806 | 1507.816 | 16.351 | 0.2619 | 0.1962 | 0.9918 | loss = 'exponential', n_estimators = 20, random_state = 10 |
8 | 7824 | 1778.717 | 19.738 | 0.2667 | 0.1764 | 0.9929 | loss = 'exponential', n_estimators = 20 |
9 | 8785 | 2378.614 | 25.049 | 0.2325 | 0.1434 | 0.9939 | loss = 'exponential', n_estimators = 20, random_state = 25 |
10 | 9759 | 3199.345 | 31.656 | 0.2607 | 0.1788 | 0.9927 | n_estimators = 20, random_state = 25 |
11 | 10755 | 3350.830 | 33.012 | 0.2102 | 0.1115 | 0.9956 | loss = 'exponential', n_estimators = 20, random_state = 25 |
12 | 11708 | 4460.481 | 44.018 | 0.2177 | 0.1323 | 0.9946 | loss = 'square', n_estimators = 20, random_state = 2 |

**Figure 2.** Average error with a 5 kg windowing as a function of the correction factor for the `C` = 5 and `C` = 100 cases with the AdaBoost + SVR model.

${\mathit{T}}_{\mathit{a}}$ | A |
---|---|
$\le 23.2\,^{\circ}\mathrm{C}$ | 1.25 |
$\ge 23.3\,^{\circ}\mathrm{C}$ | 1.11 |

Parameter | MAE | MSE | ${\mathbf{R}}^{2}$ |
---|---|---|---|
C = 10 | 0.2578 | 0.2746 | 0.9886 |
C = 20 | 0.2255 | 0.2252 | 0.9906 |
C = 50 | 0.1979 | 0.1828 | 0.9924 |
C = 100 | 0.1683 | 0.1290 | 0.9947 |

Parameter | MAE | MSE | ${\mathbf{R}}^{2}$ |
---|---|---|---|
C = 10 | 0.2177 | 0.1340 | 0.9944 |
C = 20 | 0.1875 | 0.0987 | 0.9959 |
C = 50 | 0.1820 | 0.1109 | 0.9954 |
C = 100 | 0.1606 | 0.0762 | 0.9969 |

Name | $1\sigma$ Range | $2\sigma$ Range | Count within $1\sigma$ (%) | Count within $2\sigma$ (%) |
---|---|---|---|---|
Decision tree | −1.1751 to 1.0571 | −2.2912 to 2.1732 | 2161 (80.36%) | 2546 (94.68%) |
Bagging | −0.64297 to 0.60864 | −1.2688 to 1.2344 | 2004 (74.53%) | 2506 (93.19%) |
Random forests | −0.64995 to 0.59144 | −1.2706 to 1.2121 | 2034 (75.64%) | 2504 (93.12%) |
Extra trees | −0.5316 to 0.51464 | −1.0601 to 1.0395 | 2064 (76.76%) | 2514 (93.49%) |
SVR | −0.60545 to 0.54442 | −1.1804 to 1.1194 | 2292 (85.24%) | 2569 (95.54%) |
AdaBoost + SVR | −0.3423 to 0.32924 | −0.67807 to 0.66501 | 2076 (77.2%) | 2552 (94.91%) |
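The ranges in the table above are not centred on zero, which is consistent with intervals of the form mean ± $k\sigma$ of the prediction errors. Under that reading, the ranges and the counts inside them could be obtained as sketched below. The function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics


def sigma_coverage(errors, k=1):
    """Return (low, high, count, percent) for the mean +/- k*sigma range.

    errors: prediction errors (predicted minus true time).
    Assumption: sigma is the population standard deviation.
    """
    mean = statistics.fmean(errors)
    sigma = statistics.pstdev(errors)
    low, high = mean - k * sigma, mean + k * sigma
    count = sum(low <= e <= high for e in errors)
    return low, high, count, 100.0 * count / len(errors)
```

Calling `sigma_coverage(errors, k=1)` and `sigma_coverage(errors, k=2)` on the test-set errors of one model would yield one row of such a table.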

