# Predictive Factors for Neutralizing Antibody Levels Nine Months after Full Vaccination with BNT162b2: Results of a Machine Learning Analysis

## Abstract

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Clinical Study

#### 2.2. Antibodies Measurement

#### 2.3. Data Collection

#### 2.4. Machine Learning

#### 2.5. Principal Component Analysis

#### 2.6. Factor Analysis of Mixed Data

#### 2.7. K-Means Cluster Analysis

#### 2.8. Random Forest

## 3. Results

#### 3.1. Demographics

#### 3.2. Neutralizing Antibody Levels

#### 3.3. Principal Component Analysis and K-Means Clustering

#### 3.4. Factor Analysis of Mixed Data

#### 3.5. Random Forest

## 4. Discussion

## 5. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Appendix A

**Figure A1.** Scree (elbow) plots to determine the number of principal components or clusters in principal component analysis (**A**), k-means clustering (**B**), and factor analysis of mixed data (**C**). The x-axis refers to the number of principal components or clusters, while the y-axis corresponds to eigenvalues (for **A**, **C**) and the Within-Cluster Sum of Squares (WCSS) (for plot **B**).
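The WCSS plotted in panel **B** can be illustrated with a minimal sketch. The function name `wcss` and the toy points below are hypothetical and are not taken from the study; the sketch only shows how the quantity drops as the number of clusters increases, which is what the elbow plot visualizes.

```python
def wcss(points, labels):
    """Within-Cluster Sum of Squares: total squared distance of each
    point to the centroid of the cluster it is assigned to."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid = coordinate-wise mean of the cluster members.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        for p in members:
            total += sum((p[d] - centroid[d]) ** 2 for d in range(dim))
    return total


# Hypothetical toy data (not from the study): two tight pairs of 2-D points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
one_cluster = wcss(points, [0, 0, 0, 0])   # all points in a single cluster
two_clusters = wcss(points, [0, 0, 1, 1])  # one cluster per pair
print(one_cluster, two_clusters)
```

In an elbow plot, this value is computed for each candidate number of clusters, and the number is chosen where further increases yield only small reductions.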

**Figure A2.** Principal component analysis of percent inhibition of SARS-CoV-2 binding at day 36 and at the third and ninth months after vaccination. The coordinates of the two-dimensional plot refer to the “scores” of the variables in the dimensionality-reduced space (two principal components are shown). Key: D36, neutralizing antibody levels two weeks after second vaccination; M3, neutralizing antibody levels three months after second vaccination; M9, neutralizing antibody levels nine months after second vaccination; BMI, body mass index.

**Figure A3.** K-means cluster analysis plot of all features used in the study (neutralizing antibody levels and demographics). The two axes are labelled as x[,1] and x[,2]. Each cluster is shown in a different color, and one feature predominates in each.

**Figure A4.** Correlation matrix showing the correlation coefficients (Spearman’s) between the numerical variables used in the analysis.
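As a reminder of what each cell of such a matrix contains, the following is a minimal sketch of Spearman’s coefficient: the Pearson correlation of the ranks, with tied values receiving the average of their ranks. The helper names and the toy values are hypothetical and are not study data.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values (not study data).
age = [25, 40, 55, 60]
m9 = [90.0, 70.0, 75.0, 40.0]
rho = spearman(age, m9)
print(round(rho, 3))
```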

**Figure 1.** Mean percent inhibition (±standard deviation) of SARS-CoV-2 binding to the human host receptor angiotensin-converting enzyme 2 after vaccination with the BNT162b2 mRNA vaccine in 302 subjects. Neutralizing antibody levels were measured on days 1 (first vaccination day), 8, and 22 (second vaccination day), and then two weeks, one month, three months, six months, and nine months (i.e., 295 days from the initiation of the study) after the second vaccination.

**Figure 2.** Principal component analysis of the features used in the study. The color of the dots comes from the k-means cluster analysis, which divides the subjects into five groups, each with one distinguishing feature. The coordinates of the two-dimensional plot refer to the “scores” of the variables in the dimensionality-reduced space (two principal components are shown in the plot). Key: D36, neutralizing antibody levels two weeks after second vaccination; M3, neutralizing antibody levels three months after second vaccination; M9, neutralizing antibody levels nine months after second vaccination; BMI, body mass index.

**Figure 3.** Biplot for factor analysis of mixed data. The coordinates of the two-dimensional plot refer to the “scores” of the variables in the dimensionality-reduced space (two principal components are shown in the plot). The lines represent the vectors of the variables in the 2D space. Key: D36, neutralizing antibody levels two weeks after second vaccination; M9, neutralizing antibody levels nine months after second vaccination; BMI, body mass index; MedHist_1, subjects with autoimmune disorders; MedHist_0, subjects without autoimmune disorders; Sex_M, men; Sex_F, women; PCR+, subjects with previous COVID-19 infection; PCR-, subjects without previous COVID-19 infection.

**Figure 4.** Importance scores of the subjects’ features with regard to their contribution to predicting neutralizing antibody levels nine months after the second vaccination. Key: D36, neutralizing antibody levels two weeks after second vaccination; M3, neutralizing antibody levels three months after second vaccination; BMI, body mass index; MedHist_1, subjects with autoimmune disorders; MedHist_0, subjects without autoimmune disorders; Sex_M, men; Sex_F, women; PCR+, subjects with previous COVID-19 infection; PCR-, subjects without previous COVID-19 infection.

**Figure 5.** Confusion matrix of the random forest classifier, indicating the percentage of correct and incorrect predictions. The labels “negative”, “moderate”, “high”, and “very high” refer to neutralizing antibody levels of 0–30%, 30–50%, 50–75%, and >75%, respectively.
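The four labels and the percentage-based confusion matrix can be sketched as follows. This is an illustration under stated assumptions, not the authors’ implementation: the caption does not say which category the boundary values 30, 50, and 75 fall into, so the sketch assumes upper bounds are inclusive (consistent with “>75%” for “very high”), and it assumes the percentages are normalized per true class (row-wise). The function names and the toy values are hypothetical.

```python
LABELS = ["negative", "moderate", "high", "very high"]


def nab_category(percent_inhibition):
    """Map a percent inhibition value to one of the four labels.
    Assumption: upper bounds are inclusive (30 -> negative, 75 -> high)."""
    if percent_inhibition <= 30:
        return "negative"
    if percent_inhibition <= 50:
        return "moderate"
    if percent_inhibition <= 75:
        return "high"
    return "very high"


def confusion_percentages(actual, predicted):
    """Row-normalized confusion matrix: for each true label, the percentage
    of subjects assigned to each predicted label (assumed normalization)."""
    counts = {a: {p: 0 for p in LABELS} for a in LABELS}
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    result = {}
    for a in LABELS:
        row_total = sum(counts[a].values())
        result[a] = {
            p: (100.0 * counts[a][p] / row_total if row_total else 0.0)
            for p in LABELS
        }
    return result


# Hypothetical measured and predicted M9 values (not study data).
measured = [12.0, 45.0, 62.0, 80.0, 91.0]
predicted = [20.0, 55.0, 70.0, 85.0, 60.0]
matrix = confusion_percentages(
    [nab_category(v) for v in measured],
    [nab_category(v) for v in predicted],
)
for label in LABELS:
    print(label, matrix[label])
```

The diagonal entries of `matrix` correspond to correct predictions; off-diagonal entries show which neighboring category a misclassified subject was assigned to.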

**Figure 6.** Flowchart presenting the overall route of analysis. The association between neutralizing antibody levels (NAbs) and demographics/medical history was investigated using four machine learning methods: principal component analysis, k-means cluster analysis, factor analysis of mixed data, and random forest.

Participant Characteristic | Value (Median, Range) |
---|---|
Sample size | 302 |
Gender | |
Men | 102 (33.8%) |
Women | 200 (66.2%) |
Age | 48 (49) |
Age, men | 49 (48) |
Age, women | 48 (46) |
BMI | 24.8 (27) |
BMI, men | 26.8 (19.5) |
BMI, women | 23.3 (27.2) |

Variable | Principal Component 1 | Principal Component 2 |
---|---|---|
Age | 0.26 | −0.60 |
BMI | 0.09 | −0.75 |
D36 | −0.44 | −0.21 |
M3 | −0.57 | −0.13 |
M9 | −0.64 | −0.09 |
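To show how coefficients like those tabulated above translate into the “scores” plotted in the PCA figures, here is a minimal sketch. It assumes the values are eigenvector coefficients applied to standardized (z-scored) variables; the squared coefficients in each column sum to approximately 1, which is consistent with that reading but does not confirm it. The function name `pc_scores` and the example subject are hypothetical.

```python
# Coefficients copied from the table above: (component 1, component 2).
LOADINGS = {
    "Age": (0.26, -0.60),
    "BMI": (0.09, -0.75),
    "D36": (-0.44, -0.21),
    "M3": (-0.57, -0.13),
    "M9": (-0.64, -0.09),
}


def pc_scores(z):
    """Project one subject's standardized values onto the two components.
    Assumption: z holds z-scores (mean 0, SD 1) for every variable."""
    pc1 = sum(LOADINGS[name][0] * z[name] for name in LOADINGS)
    pc2 = sum(LOADINGS[name][1] * z[name] for name in LOADINGS)
    return pc1, pc2


# Hypothetical subject: one SD older than average, one SD below average
# for M9, and average on everything else.
subject = {"Age": 1.0, "BMI": 0.0, "D36": 0.0, "M3": 0.0, "M9": -1.0}
pc1, pc2 = pc_scores(subject)
print(round(pc1, 2), round(pc2, 2))
```

Under this reading, the first component mainly reflects the three antibody measurements (all with negative coefficients), while the second mainly reflects age and BMI.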

**Table 3.** Characteristics of each group formed by k-means cluster analysis. Values refer to the mean estimates for each category.

Group | Group Label | D36 | M3 | M9 | Age | BMI |
---|---|---|---|---|---|---|
1 | Low NAbs D36 | 69.46 | 79.84 | 69.39 | 47.50 | 23.40 |
2 | Low NAbs M3/M9 | 95.09 | 76.26 | 41.13 | 49.51 | 24.30 |
3 | Obese | 97.21 | 90.73 | 58.89 | 50.03 | 32.64 |
4 | Young Individuals | 97.32 | 94.03 | 75.72 | 32.25 | 22.90 |
5 | Typical subjects | 97.10 | 93.51 | 78.65 | 55.99 | 24.83 |

