Identification of Clinically Relevant HIV Vif Protein Motif Mutations through Machine Learning and Undersampling
Abstract
1. Introduction
2. Dataset
Undersampling
3. Methods
3.1. Decision Trees
3.2. Multinomial Naïve Bayes
3.3. Multi Layer Perceptron (MLP)
3.4. Support Vector Machine (SVM)
3.5. Methods for Assessing the Relevance of Each Vif Variable
- Generating p balanced datasets through undersampling (see Section 2);
- Constructing combinations of input variables of size k, with k less than 10;
- Identifying, for each balanced dataset, the variable combinations that provide the best classification performance;
- Calculating the relevance of each variable through a probabilistic approach; and
- Optimizing the selection of the most relevant variables by applying a threshold value.
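
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the MAREV-1 or MAREV-2 procedure itself: the helper names (`undersample`, `best_combination`, `relevance`, `select`), the toy scoring function `lookup_accuracy`, the toy data, and in particular the relevance measure (the share of balanced datasets whose best combination contains the variable) are all assumptions made for the example.

```python
import random
from collections import Counter
from itertools import combinations


def undersample(rows, labels, rng):
    """Step 1: randomly keep as many rows per class as the rarest class has."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(members) for members in by_class.values())
    return [(row, label)
            for label, members in by_class.items()
            for row in rng.sample(members, n_min)]


def lookup_accuracy(combo, dataset):
    """Hypothetical score: training accuracy of a lookup table keyed on the
    chosen variables, minus a small penalty per variable (assumed tie-breaker)."""
    votes = {}
    for row, label in dataset:
        votes.setdefault(tuple(row[v] for v in combo), Counter())[label] += 1
    correct = sum(max(counter.values()) for counter in votes.values())
    return correct / len(dataset) - 0.01 * len(combo)


def best_combination(variables, dataset, score, max_k):
    """Steps 2-3: score every combination of 1..max_k variables, keep the best."""
    candidates = (combo
                  for k in range(1, max_k + 1)
                  for combo in combinations(variables, k))
    return max(candidates, key=lambda combo: score(combo, dataset))


def relevance(variables, best_combos):
    """Step 4 (assumed measure): share of best combinations containing each variable."""
    counts = Counter(v for combo in best_combos for v in combo)
    return {v: counts[v] / len(best_combos) for v in variables}


def select(relevances, threshold):
    """Step 5: keep the variables whose relevance reaches the threshold."""
    return sorted(v for v, r in relevances.items() if r >= threshold)


# Toy data: variable "A" determines the class; "B" and "C" carry no signal.
variables = ["A", "B", "C"]
rows = [{"A": 0, "B": b, "C": c} for b in (0, 1) for c in (0, 1)] * 3
rows += [{"A": 1, "B": b, "C": c} for b in (0, 1) for c in (0, 1)]
labels = [row["A"] for row in rows]

rng = random.Random(0)
p = 10  # number of balanced datasets
balanced_sets = [undersample(rows, labels, rng) for _ in range(p)]
best_combos = [best_combination(variables, d, lookup_accuracy, max_k=3)
               for d in balanced_sets]
relevances = relevance(variables, best_combos)
selected = select(relevances, threshold=0.5)
print(relevances, selected)
```

On this toy data the single variable "A" wins in every balanced dataset, so only "A" passes the threshold. With the 16 Vif variables, the toy score would be replaced by the cross-validated performance of the chosen classifier; the exhaustive search stays feasible only because k is bounded.
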
3.5.1. MAREV-1
3.5.2. MAREV-2
3.5.3. Hypothesis Evaluation on the MAREV-1 and MAREV-2 Approaches
4. Results
4.1. Classification on the Balanced Datasets
4.2. Results Using the MAREV-1
4.3. Results Using the MAREV-2
4.4. Decision Trees and the Most Relevant Variable Combinations from MAREV-1 and MAREV-2
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
HIV | Human Immunodeficiency Virus |
Vif | Viral Infectivity Factor |
CD4 | Cluster of Differentiation 4 |
APOBEC3 | APOlipoprotein B messenger RNA Editing enzyme, Catalytic polypeptide-like 3 |
NLIS | Nuclear Localisation Inhibitory Signal |
SVMs | Support Vector Machines |
ANNs | Artificial Neural Networks |
NB | Naïve Bayes |
MLP | Multi-Layer Perceptron |
RBF | Radial Basis Function |
Appendix A. Variable Assessment
CD4Ini: (a) CART | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | BCbox-3 | 6 | 1.355 | 0.545 | 0 | 0.188 | 0 | 0 | 0.107 | 0.087 | 0 | 8.282 |
2 | APOBEC-3 | 0 | 2.71 | 1.818 | 1.793 | 0.469 | 0.192 | 0 | 0 | 0 | 0 | 6.982 |
3 | APOBEC-5 | 0 | 0 | 2 | 1.11 | 1.875 | 0.673 | 0.186 | 0.107 | 0.087 | 0 | 6.038 |
4 | APOBEC-2 | 0 | 2.613 | 1.364 | 0.768 | 0.562 | 0.288 | 0.186 | 0 | 0 | 0 | 5.782 |
5 | BCbox-2 | 2.4 | 1.161 | 0.273 | 0.939 | 0.094 | 0.096 | 0.093 | 0 | 0.348 | 0.1 | 5.504 |
6 | APOBEC-6 | 0 | 0 | 0 | 0.939 | 0.469 | 1.346 | 0.744 | 0.321 | 0.435 | 0 | 4.254 |
7 | APOBEC-4 | 1.6 | 0.097 | 0.455 | 0.427 | 0.656 | 0.096 | 0.093 | 0 | 0.087 | 0 | 3.511 |
8 | Cul5-3 | 0 | 0 | 1.091 | 0.427 | 0.562 | 0.385 | 0.279 | 0 | 0.087 | 0.2 | 3.031 |
9 | CBFb-1 | 0 | 0 | 0 | 0.085 | 0.281 | 0.673 | 0.465 | 0.964 | 0.348 | 0.1 | 2.917 |
10 | APOBEC-7 | 0 | 0.097 | 0.091 | 0.341 | 0.375 | 0.288 | 0.837 | 0 | 0 | 0.3 | 2.33 |
11 | APOBEC-8 | 0 | 0 | 0.091 | 0.171 | 0.094 | 0.288 | 0.279 | 0.536 | 0.174 | 0 | 1.633 |
12 | NLIS | 0 | 0.968 | 0.182 | 0 | 0 | 0.192 | 0 | 0.107 | 0 | 0 | 1.449 |
13 | CBFb-2 | 0 | 0 | 0.091 | 0 | 0.188 | 0.096 | 0.279 | 0.214 | 0 | 0.1 | 0.968 |
14 | Cul5-1 | 0 | 0 | 0 | 0 | 0.094 | 0.096 | 0.372 | 0.214 | 0.087 | 0.1 | 0.963 |
15 | Cul5-2 | 0 | 0 | 0 | 0 | 0.094 | 0.096 | 0.093 | 0.321 | 0.087 | 0.1 | 0.791 |
16 | BCbox-1 | 0 | 0 | 0 | 0 | 0 | 0.192 | 0.093 | 0.107 | 0.174 | 0 | 0.566 |
(b) MLP | ||||||||||||
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | BCbox-3 | 7.3 | 0.09 | 0.33 | 0.151 | 0 | 0.07 | 0.07 | 0 | 0 | 0.125 | 8.136 |
2 | APOBEC-3 | 0 | 2.79 | 2.062 | 0.527 | 0.714 | 0.211 | 0.14 | 0.071 | 0 | 0 | 6.516 |
3 | APOBEC-5 | 0 | 0 | 1.732 | 1.355 | 0.643 | 0.915 | 0.211 | 0.071 | 0.069 | 0.062 | 5.059 |
4 | BCbox-2 | 1.2 | 1.44 | 0.495 | 0.527 | 0.357 | 0.211 | 0.211 | 0 | 0.069 | 0.062 | 4.572 |
5 | APOBEC-2 | 0 | 2.07 | 0.825 | 0.527 | 0.357 | 0.282 | 0.14 | 0.143 | 0 | 0.062 | 4.406 |
6 | APOBEC-4 | 1.3 | 0.09 | 0.577 | 0.903 | 0.286 | 0.282 | 0.211 | 0 | 0.069 | 0.062 | 3.78 |
7 | Cul5-2 | 0 | 0 | 0.082 | 0.376 | 1.714 | 0.352 | 0.281 | 0.286 | 0.138 | 0.062 | 3.292 |
8 | APOBEC-7 | 0.1 | 1.35 | 0.412 | 0.301 | 0.143 | 0.423 | 0.351 | 0.071 | 0.069 | 0.062 | 3.283 |
9 | Cul5-3 | 0 | 0.09 | 0.66 | 0.602 | 0.357 | 0.282 | 0.421 | 0.357 | 0.345 | 0 | 3.114 |
10 | APOBEC-6 | 0 | 0.09 | 0.33 | 0.828 | 0.357 | 0.563 | 0.281 | 0.286 | 0.138 | 0.188 | 3.06 |
11 | APOBEC-8 | 0 | 0.09 | 0 | 0.226 | 0.5 | 0.352 | 0.421 | 0.143 | 0.207 | 0 | 1.939 |
12 | CBFb-1 | 0 | 0 | 0 | 0.075 | 0.214 | 0.423 | 0.632 | 0.357 | 0 | 0.125 | 1.826 |
13 | NLIS | 0.1 | 0.72 | 0.165 | 0.151 | 0.071 | 0.211 | 0.14 | 0.071 | 0.069 | 0.062 | 1.761 |
14 | CBFb-2 | 0 | 0.09 | 0.165 | 0.301 | 0 | 0.141 | 0.14 | 0.357 | 0.276 | 0.062 | 1.533 |
15 | BCbox-1 | 0 | 0.09 | 0.165 | 0.075 | 0.214 | 0.211 | 0.211 | 0.286 | 0.138 | 0 | 1.39 |
16 | Cul5-1 | 0 | 0 | 0 | 0.075 | 0.071 | 0.07 | 0.14 | 0.5 | 0.414 | 0.062 | 1.334 |
(c) NB | ||||||||||||
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | APOBEC-2 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
2 | BCbox-3 | 0 | 6.48 | 0.08 | 0.298 | 0.468 | 0.317 | 0.186 | 0.077 | 0 | 0 | 7.906 |
3 | APOBEC-3 | 0 | 0.27 | 1.92 | 0.968 | 1.091 | 0.952 | 0.372 | 0 | 0 | 0.1 | 5.673 |
4 | APOBEC-4 | 0 | 0.09 | 2.08 | 0.968 | 0.545 | 0.397 | 0.186 | 0.308 | 0.143 | 0 | 4.717 |
5 | BCbox-2 | 0 | 0.09 | 1.12 | 1.117 | 0.156 | 0.317 | 0.279 | 0.308 | 0.143 | 0 | 3.53 |
6 | APOBEC-6 | 0 | 0 | 0.64 | 0.521 | 0.779 | 0.556 | 0.186 | 0.308 | 0.286 | 0 | 3.276 |
7 | CBFb-2 | 0 | 0.63 | 0.72 | 0.298 | 0.312 | 0.159 | 0.186 | 0.231 | 0.143 | 0.1 | 2.778 |
8 | APOBEC-5 | 0 | 0 | 0.08 | 0.223 | 0.701 | 0.476 | 0.651 | 0.231 | 0.286 | 0.1 | 2.749 |
9 | APOBEC-7 | 0 | 1.08 | 0.08 | 0.447 | 0.312 | 0.238 | 0.093 | 0.308 | 0.143 | 0 | 2.7 |
10 | Cul5-3 | 0 | 0 | 0.64 | 0.521 | 0.156 | 0.238 | 0.744 | 0.154 | 0.143 | 0 | 2.596 |
11 | NLIS | 0 | 0.27 | 0.08 | 0.521 | 0.234 | 0.397 | 0.279 | 0.154 | 0.143 | 0.2 | 2.278 |
12 | BCbox-1 | 0 | 0 | 0.4 | 0.223 | 0.623 | 0.397 | 0.093 | 0.231 | 0.143 | 0.1 | 2.21 |
13 | CBFb-1 | 0 | 0 | 0 | 0.223 | 0.39 | 0.238 | 0.279 | 0.385 | 0 | 0.2 | 1.715 |
14 | APOBEC-8 | 0 | 0.09 | 0.16 | 0.372 | 0.156 | 0.159 | 0.279 | 0.154 | 0.143 | 0 | 1.513 |
15 | Cul5-2 | 0 | 0 | 0 | 0.223 | 0.078 | 0 | 0.186 | 0.077 | 0 | 0.2 | 0.764 |
16 | Cul5-1 | 0 | 0 | 0 | 0.074 | 0 | 0.159 | 0 | 0.077 | 0.286 | 0 | 0.596 |
(d) SVMs | ||||||||||||
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | BCbox-3 | 5.5 | 0.827 | 0.774 | 0.402 | 0.676 | 0.323 | 0 | 0.083 | 0 | 0 | 8.585 |
2 | BCbox-2 | 1.5 | 2.02 | 1.29 | 0.805 | 0.423 | 0.081 | 0.226 | 0 | 0.286 | 0 | 6.631 |
3 | APOBEC-4 | 2.6 | 0.276 | 1.118 | 0.885 | 0.761 | 0.323 | 0.226 | 0.167 | 0 | 0.1 | 6.455 |
4 | APOBEC-2 | 0.1 | 1.745 | 1.376 | 0.563 | 0.423 | 0.726 | 0 | 0.417 | 0.19 | 0.1 | 5.64 |
5 | APOBEC-3 | 0 | 0.276 | 1.032 | 2.011 | 0.676 | 1.048 | 0.302 | 0.167 | 0.095 | 0 | 5.607 |
6 | APOBEC-7 | 0 | 2.663 | 0.688 | 0.483 | 0.507 | 0.242 | 0.226 | 0.083 | 0.19 | 0 | 5.083 |
7 | NLIS | 0.1 | 0.643 | 1.032 | 0.483 | 0.169 | 0.242 | 0.528 | 0.083 | 0.19 | 0 | 3.471 |
8 | APOBEC-5 | 0 | 0.092 | 0.086 | 0.241 | 0.592 | 0.645 | 0.906 | 0.583 | 0.19 | 0 | 3.335 |
9 | Cul5-3 | 0 | 0.092 | 0.258 | 0.483 | 0.676 | 0.565 | 0.528 | 0.083 | 0.19 | 0 | 2.875 |
10 | APOBEC-6 | 0 | 0 | 0 | 0.161 | 0.423 | 0.242 | 0.679 | 0.583 | 0.19 | 0.2 | 2.478 |
11 | CBFb-2 | 0.1 | 0.367 | 0.258 | 0.322 | 0 | 0.242 | 0.075 | 0 | 0.19 | 0 | 1.555 |
12 | APOBEC-8 | 0.1 | 0 | 0.086 | 0.08 | 0.338 | 0.161 | 0.075 | 0.25 | 0.095 | 0.2 | 1.387 |
13 | BCbox-1 | 0 | 0 | 0 | 0.08 | 0.169 | 0.081 | 0.075 | 0.167 | 0 | 0.1 | 0.672 |
14 | Cul5-2 | 0 | 0 | 0 | 0 | 0.169 | 0.081 | 0.075 | 0.167 | 0.095 | 0 | 0.587 |
15 | CBFb-1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.167 | 0.095 | 0.3 | 0.562 |
16 | Cul5-1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.075 | 0 | 0 | 0 | 0.075 |
CD4Hist: (a) CART | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | APOBEC-2 | 6.8 | 0.2 | 0.494 | 0.467 | 0.338 | 0.094 | 0 | 0 | 0 | 0 | 8.393 |
2 | APOBEC-3 | 0 | 4.8 | 0.691 | 0.467 | 0.676 | 0 | 0.133 | 0 | 0 | 0 | 6.767 |
3 | APOBEC-5 | 0 | 0.4 | 1.481 | 2.24 | 0.845 | 0.377 | 0.4 | 0.214 | 0 | 0 | 5.958 |
4 | BCbox-3 | 2.1 | 1.3 | 0.889 | 0.653 | 0.169 | 0.189 | 0 | 0 | 0.167 | 0 | 5.467 |
5 | APOBEC-4 | 0 | 0.4 | 2.765 | 0.28 | 0.761 | 0.472 | 0.133 | 0.214 | 0.333 | 0 | 5.359 |
6 | Cul5-3 | 0.5 | 0.8 | 0.198 | 0.653 | 0.254 | 0.66 | 0.4 | 0.536 | 0.167 | 0 | 4.167 |
7 | APOBEC-6 | 0 | 0.3 | 0.099 | 1.12 | 1.183 | 0.472 | 0.4 | 0.321 | 0 | 0.111 | 4.006 |
8 | APOBEC-7 | 0.1 | 0 | 0.395 | 0.093 | 0.93 | 0.377 | 0.533 | 0.214 | 0 | 0.222 | 2.865 |
9 | CBFb-1 | 0 | 0 | 0 | 0.093 | 0.254 | 1.038 | 0.933 | 0.214 | 0.167 | 0 | 2.699 |
10 | BCbox-2 | 0.5 | 0.6 | 0.395 | 0.467 | 0.169 | 0.189 | 0 | 0.107 | 0 | 0.111 | 2.538 |
11 | CBFb-2 | 0 | 0.1 | 0.296 | 0.28 | 0.085 | 0.094 | 0.667 | 0.321 | 0.333 | 0 | 2.177 |
12 | APOBEC-8 | 0 | 0.1 | 0.099 | 0 | 0 | 0.66 | 0 | 0.536 | 0 | 0.111 | 1.506 |
13 | Cul5-1 | 0 | 0 | 0 | 0 | 0 | 0.189 | 0.4 | 0 | 0.5 | 0.111 | 1.2 |
14 | NLIS | 0 | 0 | 0.099 | 0.187 | 0.338 | 0.189 | 0 | 0.107 | 0 | 0.222 | 1.142 |
15 | Cul5-2 | 0 | 0 | 0.099 | 0 | 0 | 0 | 0 | 0.214 | 0 | 0.111 | 0.424 |
16 | BCbox-1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.333 | 0 | 0.333 |
(b) MLP | ||||||||||||
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | APOBEC-2 | 6.7 | 0.36 | 0.495 | 0 | 0.207 | 0.074 | 0 | 0.075 | 0 | 0.136 | 8.047 |
2 | APOBEC-3 | 0 | 4.05 | 0.825 | 0.692 | 0.276 | 0.147 | 0.17 | 0 | 0.148 | 0 | 6.308 |
3 | BCbox-3 | 2.1 | 0.54 | 0.247 | 0.385 | 0.345 | 0.147 | 0.255 | 0.375 | 0.222 | 0.045 | 4.662 |
4 | APOBEC-5 | 0 | 0.63 | 1.567 | 1.385 | 0.345 | 0.588 | 0.085 | 0 | 0 | 0.045 | 4.645 |
5 | APOBEC-6 | 0 | 0.81 | 1.155 | 0.769 | 0.828 | 0.294 | 0.255 | 0.225 | 0.074 | 0.045 | 4.455 |
6 | APOBEC-4 | 0 | 0.36 | 1.897 | 0.231 | 0.621 | 0.294 | 0.34 | 0.225 | 0.148 | 0 | 4.116 |
7 | Cul5-3 | 0.5 | 0.9 | 0.495 | 0.846 | 0 | 0.368 | 0 | 0.525 | 0 | 0.045 | 3.679 |
8 | CBFb-1 | 0 | 0 | 0.33 | 0.385 | 0.552 | 0.735 | 0.596 | 0.15 | 0.148 | 0 | 2.895 |
9 | APOBEC-8 | 0 | 0.27 | 0.082 | 0.846 | 0.276 | 0.368 | 0.511 | 0.15 | 0.074 | 0.045 | 2.622 |
10 | Cul5-2 | 0 | 0.54 | 0.082 | 0.154 | 0.621 | 0.294 | 0.255 | 0 | 0.444 | 0.227 | 2.618 |
11 | APOBEC-7 | 0.1 | 0 | 0.165 | 0.462 | 0.69 | 0.515 | 0.085 | 0.15 | 0.222 | 0.045 | 2.434 |
12 | BCbox-2 | 0.5 | 0.27 | 0.165 | 0.385 | 0.276 | 0.147 | 0.255 | 0.225 | 0.074 | 0 | 2.297 |
13 | NLIS | 0 | 0.09 | 0.165 | 0.154 | 0.345 | 0.441 | 0.34 | 0.075 | 0 | 0.136 | 1.747 |
14 | CBFb-2 | 0.1 | 0.18 | 0.165 | 0.077 | 0.138 | 0.221 | 0.426 | 0.075 | 0.148 | 0.136 | 1.665 |
15 | Cul5-1 | 0 | 0 | 0.165 | 0.077 | 0.138 | 0.294 | 0.34 | 0.3 | 0 | 0.091 | 1.405 |
16 | BCbox-1 | 0 | 0 | 0 | 0.154 | 0.345 | 0.074 | 0.085 | 0.45 | 0.296 | 0 | 1.404 |
(c) NB | ||||||||||||
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | APOBEC-2 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
2 | APOBEC-4 | 0 | 1.8 | 0.851 | 0.612 | 1 | 0.317 | 0.651 | 0.143 | 0.065 | 0.111 | 5.551 |
3 | APOBEC-3 | 0 | 0.09 | 2.043 | 1.4 | 0.923 | 0.397 | 0.186 | 0.071 | 0.065 | 0 | 5.174 |
4 | APOBEC-6 | 0 | 0 | 0.596 | 0.875 | 0.538 | 0.794 | 0.837 | 0.143 | 0.323 | 0.111 | 4.217 |
5 | APOBEC-8 | 0 | 1.89 | 0.426 | 0.525 | 0.231 | 0.238 | 0.279 | 0.286 | 0.129 | 0 | 4.003 |
6 | BCbox-2 | 0 | 2.52 | 0.426 | 0.088 | 0.154 | 0 | 0.093 | 0.286 | 0.065 | 0.222 | 3.852 |
7 | APOBEC-5 | 0 | 0 | 0.255 | 1.05 | 0.538 | 0.952 | 0.465 | 0.286 | 0.129 | 0.111 | 3.787 |
8 | BCbox-3 | 0 | 0.54 | 0.17 | 0.7 | 0.846 | 0.873 | 0.093 | 0.286 | 0.129 | 0 | 3.637 |
9 | Cul5-3 | 0 | 1.35 | 0.596 | 0.175 | 0.231 | 0.317 | 0.093 | 0.214 | 0.065 | 0 | 3.041 |
10 | APOBEC-7 | 0 | 0 | 0.085 | 0.525 | 0.538 | 0.397 | 0.465 | 0.214 | 0.129 | 0.222 | 2.576 |
11 | CBFb-2 | 0 | 0 | 0.681 | 0.438 | 0.385 | 0.159 | 0 | 0.143 | 0.387 | 0 | 2.192 |
12 | NLIS | 0 | 0.81 | 0.17 | 0.438 | 0.231 | 0.238 | 0 | 0.071 | 0.129 | 0 | 2.087 |
13 | CBFb-1 | 0 | 0 | 0.17 | 0.175 | 0.231 | 0 | 0.372 | 0.357 | 0.323 | 0.111 | 1.739 |
14 | Cul5-2 | 0 | 0 | 1.362 | 0 | 0.077 | 0.079 | 0 | 0.214 | 0 | 0 | 1.732 |
15 | Cul5-1 | 0 | 0 | 0.085 | 0 | 0.077 | 0.159 | 0.372 | 0.143 | 0.065 | 0 | 0.9 |
16 | BCbox-1 | 0 | 0 | 0.085 | 0 | 0 | 0.079 | 0.093 | 0.143 | 0 | 0.111 | 0.511 |
(d) SVMs | ||||||||||||
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | APOBEC-2 | 8.8 | 0.329 | 0.107 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9.236 |
2 | BCbox-3 | 0.1 | 2.854 | 0.96 | 1.242 | 0.941 | 0.645 | 0.19 | 0.4 | 0 | 0 | 7.332 |
3 | APOBEC-3 | 0 | 2.854 | 0.853 | 1.242 | 0.471 | 0.484 | 0 | 0 | 0.25 | 0.167 | 6.32 |
4 | APOBEC-5 | 0 | 0 | 0.427 | 1.016 | 1.529 | 0.806 | 0.571 | 0.6 | 0.25 | 0.167 | 5.367 |
5 | APOBEC-4 | 0 | 0.659 | 2.133 | 0.565 | 0.824 | 0.323 | 0.571 | 0 | 0 | 0 | 5.074 |
6 | APOBEC-7 | 0 | 0.659 | 1.067 | 0.452 | 0.118 | 0 | 0.762 | 0.4 | 0.25 | 0.167 | 3.873 |
7 | APOBEC-6 | 0.1 | 0.11 | 0.64 | 0.339 | 0.706 | 0.484 | 0.19 | 0.4 | 0.5 | 0 | 3.469 |
8 | BCbox-2 | 0.4 | 0.439 | 0.64 | 0.677 | 0.235 | 0.161 | 0.19 | 0.2 | 0 | 0.167 | 3.11 |
9 | Cul5-3 | 0.4 | 0.439 | 0.64 | 0.226 | 0.471 | 0.323 | 0 | 0 | 0.5 | 0 | 2.998 |
10 | APOBEC-8 | 0.1 | 0.22 | 0.32 | 0.339 | 0.118 | 0.645 | 0.571 | 0.2 | 0 | 0 | 2.512 |
11 | NLIS | 0.1 | 0.22 | 0.107 | 0.226 | 0.118 | 0.484 | 0.571 | 0.2 | 0 | 0 | 2.025 |
12 | CBFb-2 | 0 | 0.11 | 0.107 | 0.452 | 0.353 | 0.323 | 0.19 | 0.2 | 0 | 0 | 1.734 |
13 | CBFb-1 | 0 | 0 | 0 | 0.113 | 0.118 | 0.323 | 0.19 | 0.4 | 0.25 | 0.333 | 1.727 |
14 | Cul5-2 | 0 | 0 | 0 | 0.113 | 0 | 0 | 0 | 0 | 0 | 0 | 0.113 |
15 | BCbox-1 | 0 | 0.11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.11 |
16 | Cul5-1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
VLIni: (a) CART | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | APOBEC-2 | 6.3 | 0.276 | 0.556 | 0 | 0 | 0 | 0.444 | 0 | 0 | 0 | 7.576 |
2 | BCbox-1 | 0 | 5.235 | 0.667 | 0.75 | 0.13 | 0 | 0 | 0 | 0 | 0.25 | 7.032 |
3 | APOBEC-3 | 0 | 0.827 | 1.778 | 1.625 | 0.783 | 0.167 | 0.148 | 0.2 | 0 | 0 | 5.527 |
4 | APOBEC-5 | 0 | 0.918 | 0.889 | 0.875 | 1.435 | 0.333 | 0.593 | 0 | 0 | 0 | 5.043 |
5 | CBFb-2 | 1 | 0.551 | 1.333 | 0.75 | 0.13 | 0.167 | 0.296 | 0 | 0 | 0 | 4.228 |
6 | APOBEC-4 | 0 | 0.643 | 1.556 | 0.5 | 0.522 | 0.667 | 0.296 | 0 | 0 | 0 | 4.183 |
7 | APOBEC-7 | 0 | 0.276 | 0.667 | 0.375 | 0.652 | 0.5 | 0.296 | 0.2 | 0 | 0.25 | 3.216 |
8 | APOBEC-6 | 0 | 0.092 | 0.222 | 0.375 | 0.783 | 0.833 | 0.296 | 0.4 | 0.2 | 0 | 3.201 |
9 | BCbox-2 | 2.7 | 0 | 0.111 | 0.125 | 0 | 0 | 0.148 | 0 | 0 | 0 | 3.084 |
10 | CBFb-1 | 0 | 0 | 0 | 0.125 | 0.522 | 1.167 | 0.296 | 0.2 | 0.2 | 0.25 | 2.76 |
11 | Cul5-1 | 0 | 0 | 0 | 0 | 0.13 | 0.333 | 0.296 | 1 | 0.4 | 0 | 2.16 |
12 | NLIS | 0 | 0.092 | 0 | 0.375 | 0.261 | 0.5 | 0.148 | 0.6 | 0 | 0 | 1.976 |
13 | Cul5-3 | 0 | 0.092 | 0.111 | 0.625 | 0.261 | 0 | 0.148 | 0.2 | 0.2 | 0 | 1.637 |
14 | Cul5-2 | 0 | 0 | 0 | 0.125 | 0 | 0 | 0.296 | 0.2 | 0.6 | 0 | 1.221 |
15 | APOBEC-8 | 0 | 0 | 0.111 | 0.125 | 0.261 | 0.167 | 0.148 | 0 | 0.4 | 0 | 1.212 |
16 | BCbox-3 | 0 | 0 | 0 | 0.25 | 0.13 | 0.167 | 0.148 | 0 | 0 | 0.25 | 0.945 |
(b) MLP | ||||||||||||
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | APOBEC-2 | 6.3 | 0.273 | 0.4 | 0.092 | 0 | 0.357 | 0.118 | 0.13 | 0.154 | 0.125 | 7.949 |
2 | BCbox-1 | 0 | 5.182 | 0.5 | 0.276 | 0.689 | 0.476 | 0.118 | 0 | 0.154 | 0 | 7.394 |
3 | APOBEC-5 | 0 | 0.545 | 1 | 0.276 | 1.082 | 0.952 | 0.471 | 0.261 | 0.154 | 0 | 4.741 |
4 | CBFb-2 | 1 | 0.818 | 1.7 | 0.368 | 0.098 | 0.357 | 0.118 | 0 | 0 | 0 | 4.46 |
5 | APOBEC-3 | 0 | 1.182 | 0.8 | 1.105 | 0.59 | 0.238 | 0 | 0.13 | 0.308 | 0 | 4.353 |
6 | CBFb-1 | 0 | 0 | 0.1 | 1.289 | 0.492 | 0.238 | 0.471 | 0.652 | 0.154 | 0.125 | 3.521 |
7 | BCbox-2 | 2.7 | 0 | 0.3 | 0.092 | 0 | 0 | 0 | 0 | 0 | 0 | 3.092 |
8 | Cul5-1 | 0 | 0 | 0.1 | 0.553 | 0.59 | 0.595 | 0.588 | 0.652 | 0 | 0 | 3.078 |
9 | APOBEC-4 | 0 | 0.273 | 0.8 | 0.645 | 0.393 | 0 | 0.353 | 0.13 | 0.154 | 0 | 2.748 |
10 | APOBEC-7 | 0 | 0.182 | 0.8 | 0.553 | 0.59 | 0.238 | 0.118 | 0.13 | 0 | 0 | 2.611 |
11 | Cul5-2 | 0 | 0 | 0.1 | 0.092 | 0.295 | 0.833 | 0.471 | 0.261 | 0.308 | 0.125 | 2.485 |
12 | NLIS | 0 | 0 | 0.6 | 0.184 | 0.197 | 0.119 | 0.353 | 0.13 | 0.154 | 0.25 | 1.987 |
13 | BCbox-3 | 0 | 0.091 | 0.1 | 0.737 | 0.197 | 0.238 | 0 | 0.13 | 0.308 | 0 | 1.801 |
14 | APOBEC-8 | 0 | 0 | 0.3 | 0.092 | 0.492 | 0.119 | 0.353 | 0.13 | 0 | 0.25 | 1.736 |
15 | APOBEC-6 | 0 | 0.182 | 0.3 | 0.368 | 0.098 | 0.119 | 0.235 | 0.13 | 0 | 0.125 | 1.558 |
16 | Cul5-3 | 0 | 0.273 | 0.1 | 0.276 | 0.197 | 0.119 | 0.235 | 0.13 | 0.154 | 0 | 1.484 |
(c) NB | ||||||||||||
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | APOBEC-2 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
2 | Cul5-3 | 0 | 5.22 | 1.495 | 0.236 | 0.651 | 0.069 | 0 | 0.073 | 0 | 0.143 | 7.887 |
3 | APOBEC-4 | 0 | 3.24 | 0.703 | 0.393 | 0.651 | 0.347 | 0.082 | 0.146 | 0 | 0.143 | 5.705 |
4 | CBFb-2 | 0 | 0 | 1.495 | 0.708 | 0.94 | 0.347 | 0.49 | 0.366 | 0.133 | 0 | 4.478 |
5 | APOBEC-3 | 0 | 0 | 0.967 | 1.416 | 0.217 | 0.833 | 0.408 | 0.366 | 0.067 | 0 | 4.274 |
6 | BCbox-1 | 0 | 0.36 | 1.319 | 0.787 | 0.578 | 0.208 | 0.327 | 0 | 0.067 | 0.143 | 3.788 |
7 | BCbox-2 | 0 | 0 | 0.44 | 1.573 | 0.434 | 0.139 | 0.408 | 0.073 | 0.067 | 0.143 | 3.276 |
8 | APOBEC-5 | 0 | 0 | 0.791 | 0.157 | 0.361 | 0.278 | 0.571 | 0.22 | 0.133 | 0.143 | 2.655 |
9 | APOBEC-7 | 0 | 0 | 0.176 | 0.236 | 0.578 | 0.625 | 0.327 | 0.512 | 0.133 | 0 | 2.587 |
10 | CBFb-1 | 0 | 0 | 0.088 | 0.157 | 0.578 | 0.833 | 0.163 | 0.293 | 0.4 | 0 | 2.513 |
11 | NLIS | 0 | 0.18 | 0.088 | 0.393 | 0.361 | 0.486 | 0.163 | 0.073 | 0.133 | 0.143 | 2.021 |
12 | APOBEC-6 | 0 | 0 | 0.352 | 0.236 | 0.361 | 0.208 | 0.245 | 0.439 | 0 | 0 | 1.841 |
13 | BCbox-3 | 0 | 0 | 0.088 | 0.157 | 0 | 0.347 | 0.245 | 0.293 | 0.2 | 0 | 1.33 |
14 | APOBEC-8 | 0 | 0 | 0 | 0.236 | 0.072 | 0.069 | 0.408 | 0.073 | 0.333 | 0 | 1.192 |
15 | Cul5-1 | 0 | 0 | 0 | 0.157 | 0.145 | 0.139 | 0.082 | 0.073 | 0.267 | 0 | 0.862 |
16 | Cul5-2 | 0 | 0 | 0 | 0.157 | 0.072 | 0.069 | 0.082 | 0 | 0.067 | 0.143 | 0.59 |
(d) SVMs | ||||||||||||
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | APOBEC-2 | 5.8 | 1.299 | 1.129 | 0.339 | 0 | 0.179 | 0 | 0 | 0 | 0 | 8.746 |
2 | BCbox-2 | 2.1 | 1.485 | 1.6 | 0.339 | 0.15 | 0.179 | 0 | 0.167 | 0 | 0 | 6.018 |
3 | APOBEC-3 | 0.2 | 1.206 | 1.318 | 1.581 | 0.6 | 0.357 | 0.462 | 0 | 0 | 0 | 5.723 |
4 | CBFb-2 | 0.3 | 2.876 | 1.035 | 0.565 | 0.15 | 0.179 | 0 | 0 | 0 | 0 | 5.105 |
5 | APOBEC-5 | 0 | 0.093 | 0.188 | 0.903 | 1.05 | 0.357 | 0.615 | 0.667 | 0.364 | 0 | 4.237 |
6 | APOBEC-7 | 0.4 | 0.186 | 0.565 | 0.226 | 1.8 | 0.179 | 0 | 0.333 | 0.364 | 0 | 4.052 |
7 | BCbox-3 | 0.3 | 0.742 | 0.282 | 0.339 | 0 | 0.893 | 0.308 | 0 | 0.182 | 0.5 | 3.546 |
8 | APOBEC-4 | 0.6 | 0.371 | 0.471 | 0.903 | 0.15 | 0.536 | 0.308 | 0 | 0.182 | 0 | 3.52 |
9 | NLIS | 0.1 | 0.186 | 0.376 | 0.113 | 0.45 | 0.536 | 0.923 | 0.5 | 0 | 0 | 3.184 |
10 | Cul5-3 | 0.2 | 0.278 | 0.659 | 0.452 | 0.45 | 0.179 | 0.154 | 0.167 | 0.182 | 0 | 2.72 |
11 | APOBEC-8 | 0 | 0.186 | 0 | 0.339 | 0.6 | 0.714 | 0.154 | 0.333 | 0.364 | 0 | 2.689 |
12 | APOBEC-6 | 0 | 0 | 0.282 | 0.452 | 0.45 | 0.357 | 0.308 | 0 | 0.182 | 0 | 2.031 |
13 | CBFb-1 | 0 | 0 | 0 | 0.113 | 0.15 | 0.357 | 0.154 | 0.5 | 0.182 | 0 | 1.456 |
14 | BCbox-1 | 0 | 0.093 | 0.094 | 0.226 | 0 | 0 | 0.615 | 0.333 | 0 | 0 | 1.361 |
15 | Cul5-1 | 0 | 0 | 0 | 0.113 | 0 | 0 | 0 | 0 | 0 | 0.5 | 0.613 |
16 | Cul5-2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
VLHist: (a) CART | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | NLIS | 9.9 | 0 | 0 | 0 | 0.158 | 0 | 0 | 0 | 0 | 0 | 10.058 |
2 | APOBEC-3 | 0 | 5.091 | 0.427 | 0.28 | 0 | 0 | 0.2 | 0 | 0 | 0 | 5.998 |
3 | APOBEC-5 | 0 | 0.909 | 4.16 | 0.28 | 0 | 0 | 0.2 | 0 | 0 | 0 | 5.549 |
4 | APOBEC-2 | 0.1 | 2 | 0.747 | 0.7 | 0.474 | 0 | 0.4 | 0.5 | 0.4 | 0 | 5.32 |
5 | CBFb-1 | 0 | 0 | 0.64 | 1.12 | 1.263 | 0.833 | 0.4 | 0 | 0 | 0 | 4.256 |
6 | BCbox-1 | 0 | 0.636 | 0.853 | 0.42 | 0.632 | 0.333 | 0.4 | 0.5 | 0.4 | 0 | 4.175 |
7 | APOBEC-8 | 0 | 0 | 0.213 | 1.82 | 0.632 | 0.5 | 0.8 | 0 | 0 | 0 | 3.965 |
8 | APOBEC-7 | 0 | 0.182 | 0.107 | 0.42 | 0.632 | 0.5 | 0.8 | 0.25 | 0.4 | 0 | 3.29 |
9 | Cul5-1 | 0 | 0 | 0.213 | 0.42 | 0.474 | 0.833 | 0.4 | 0.75 | 0 | 0 | 3.09 |
10 | APOBEC-6 | 0 | 0 | 0.107 | 1.12 | 0.789 | 0 | 0 | 0.5 | 0 | 0 | 2.516 |
11 | APOBEC-4 | 0 | 0.091 | 0.107 | 0.28 | 0 | 0.333 | 0.2 | 0.25 | 0 | 0.5 | 1.761 |
12 | BCbox-3 | 0 | 0.091 | 0.32 | 0 | 0.158 | 1 | 0 | 0 | 0 | 0 | 1.569 |
13 | CBFb-2 | 0 | 0 | 0.107 | 0.14 | 0.158 | 0.333 | 0 | 0 | 0 | 0.5 | 1.238 |
14 | Cul5-2 | 0 | 0 | 0 | 0 | 0.158 | 0.167 | 0.2 | 0.25 | 0.4 | 0 | 1.175 |
15 | BCbox-2 | 0 | 0 | 0 | 0 | 0.158 | 0.167 | 0 | 0 | 0.4 | 0 | 0.725 |
16 | Cul5-3 | 0 | 0 | 0 | 0 | 0.316 | 0 | 0 | 0 | 0 | 0 | 0.316 |
(b) MLP | ||||||||||||
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | NLIS | 9.9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9.9 |
2 | BCbox-1 | 0 | 3.33 | 0.791 | 1.28 | 0.312 | 0.137 | 0.123 | 0.45 | 0.16 | 0.053 | 6.636 |
3 | CBFb-1 | 0 | 0 | 1.67 | 0.768 | 1.169 | 0.753 | 0.308 | 0.225 | 0 | 0 | 4.894 |
4 | APOBEC-5 | 0 | 0.81 | 1.319 | 1.11 | 0.39 | 0.479 | 0.185 | 0.075 | 0.4 | 0.053 | 4.82 |
5 | APOBEC-3 | 0 | 2.43 | 0.44 | 0.683 | 0.234 | 0 | 0.246 | 0.075 | 0.16 | 0.053 | 4.32 |
6 | Cul5-1 | 0 | 0 | 0.264 | 0.427 | 1.169 | 1.096 | 0.8 | 0.375 | 0.08 | 0.105 | 4.316 |
7 | APOBEC-2 | 0.1 | 1.62 | 0.615 | 0.341 | 0.156 | 0.205 | 0.431 | 0.15 | 0.16 | 0.158 | 3.937 |
8 | Cul5-2 | 0 | 0 | 0 | 0.256 | 1.325 | 0.753 | 0.862 | 0.375 | 0.16 | 0.105 | 3.836 |
9 | APOBEC-8 | 0 | 0 | 0.44 | 0.854 | 0.39 | 0.137 | 0.492 | 0.45 | 0.32 | 0.053 | 3.135 |
10 | APOBEC-7 | 0 | 0.09 | 1.319 | 0.427 | 0.078 | 0.205 | 0.185 | 0.225 | 0.16 | 0.053 | 2.741 |
11 | APOBEC-4 | 0 | 0.36 | 0.527 | 0 | 0 | 0.274 | 0 | 0 | 0 | 0.158 | 1.319 |
12 | APOBEC-6 | 0 | 0 | 0.352 | 0.085 | 0.312 | 0.205 | 0.062 | 0.15 | 0.08 | 0.053 | 1.298 |
13 | BCbox-3 | 0 | 0.36 | 0 | 0.171 | 0.078 | 0.274 | 0.123 | 0.15 | 0.08 | 0.053 | 1.288 |
14 | CBFb-2 | 0 | 0 | 0.264 | 0.171 | 0.234 | 0.137 | 0.062 | 0.15 | 0.08 | 0.053 | 1.149 |
15 | Cul5-3 | 0 | 0 | 0 | 0.256 | 0.156 | 0.137 | 0.123 | 0.075 | 0.08 | 0.053 | 0.88 |
16 | BCbox-2 | 0 | 0 | 0 | 0.171 | 0 | 0.205 | 0 | 0.075 | 0.08 | 0 | 0.531 |
(c) NB | ||||||||||||
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | APOBEC-2 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
2 | Cul5-3 | 0 | 1.71 | 3.2 | 0.794 | 0.174 | 0.574 | 0.073 | 0.077 | 0 | 0 | 6.601 |
3 | NLIS | 0 | 3.78 | 1.04 | 1.01 | 0.087 | 0.082 | 0.145 | 0.154 | 0.125 | 0 | 6.424 |
4 | BCbox-1 | 0 | 0.54 | 1.12 | 1.155 | 1.13 | 0.656 | 0.218 | 0.231 | 0.125 | 0 | 5.175 |
5 | APOBEC-8 | 0 | 2.16 | 0.32 | 0.361 | 0.435 | 0.41 | 0.582 | 0.231 | 0 | 0 | 4.498 |
6 | APOBEC-3 | 0 | 0.09 | 0.24 | 1.227 | 0.783 | 0.41 | 0.727 | 0.308 | 0.125 | 0.222 | 4.131 |
7 | APOBEC-5 | 0 | 0 | 0.32 | 0.289 | 0.87 | 0.656 | 0.291 | 0.077 | 0.375 | 0 | 2.877 |
8 | CBFb-1 | 0 | 0 | 0.16 | 0.433 | 0.435 | 0.656 | 0.364 | 0.308 | 0.125 | 0.111 | 2.591 |
9 | APOBEC-7 | 0 | 0.09 | 0.08 | 0.505 | 0.261 | 0.41 | 0.218 | 0.385 | 0.25 | 0.222 | 2.421 |
10 | BCbox-3 | 0 | 0 | 0.24 | 0.433 | 0.696 | 0.164 | 0.291 | 0 | 0.25 | 0 | 2.073 |
11 | CBFb-2 | 0 | 0 | 0.56 | 0.289 | 0.174 | 0.41 | 0.145 | 0.231 | 0.125 | 0.111 | 2.045 |
12 | BCbox-2 | 0 | 0.45 | 0.16 | 0.361 | 0.087 | 0.246 | 0.218 | 0.231 | 0 | 0 | 1.753 |
13 | APOBEC-4 | 0 | 0.18 | 0.32 | 0 | 0.174 | 0.246 | 0.436 | 0 | 0.125 | 0 | 1.481 |
14 | APOBEC-6 | 0 | 0 | 0.16 | 0.144 | 0.348 | 0 | 0.145 | 0.308 | 0.125 | 0.111 | 1.341 |
15 | Cul5-1 | 0 | 0 | 0.08 | 0 | 0 | 0 | 0.145 | 0.231 | 0.25 | 0.111 | 0.817 |
16 | Cul5-2 | 0 | 0 | 0 | 0 | 0.348 | 0.082 | 0 | 0.231 | 0 | 0.111 | 0.772 |
(d) SVMs | ||||||||||||
Rank | Variable | pos1 | pos2 | pos3 | pos4 | pos5 | pos6 | pos7 | pos8 | pos9 | pos10 | Total |
1 | NLIS | 9.2 | 0.45 | 0.188 | 0.106 | 0 | 0 | 0 | 0 | 0 | 0 | 9.944 |
2 | BCbox-3 | 0.1 | 3.51 | 0.941 | 0.636 | 0.8 | 0.568 | 0.2 | 0.097 | 0.2 | 0 | 7.052 |
3 | APOBEC-2 | 0.3 | 1.8 | 1.976 | 0.955 | 0.533 | 0.227 | 0.1 | 0 | 0 | 0 | 5.892 |
4 | APOBEC-4 | 0 | 0.9 | 1.6 | 1.379 | 0.4 | 0.568 | 0.1 | 0.194 | 0.2 | 0 | 5.341 |
5 | APOBEC-3 | 0 | 0.27 | 1.035 | 1.273 | 0.533 | 0.455 | 0.6 | 0.484 | 0 | 0 | 4.65 |
6 | BCbox-1 | 0.2 | 0.27 | 0.376 | 0.742 | 0.133 | 1.023 | 0.7 | 0.387 | 0.6 | 0 | 4.432 |
7 | APOBEC-8 | 0 | 0.09 | 0.471 | 0.424 | 1.333 | 0.227 | 0.3 | 0.194 | 0 | 0.556 | 3.595 |
8 | APOBEC-5 | 0 | 0 | 0.282 | 0.636 | 0.667 | 0.455 | 0.5 | 0.29 | 0.2 | 0.222 | 3.252 |
9 | BCbox-2 | 0 | 0.9 | 0.094 | 0.424 | 0.267 | 0.341 | 0.4 | 0.194 | 0 | 0 | 2.619 |
10 | APOBEC-6 | 0 | 0 | 0.094 | 0.106 | 0.267 | 0.455 | 0.3 | 0.387 | 0.4 | 0 | 2.008 |
11 | CBFb-2 | 0.2 | 0.63 | 0.376 | 0.106 | 0.133 | 0 | 0.1 | 0.097 | 0.2 | 0 | 1.843 |
12 | Cul5-3 | 0 | 0 | 0.565 | 0.106 | 0.133 | 0.114 | 0.4 | 0.29 | 0 | 0 | 1.608 |
13 | APOBEC-7 | 0 | 0.18 | 0 | 0.106 | 0.533 | 0.227 | 0.2 | 0.097 | 0 | 0 | 1.343 |
14 | CBFb-1 | 0 | 0 | 0 | 0 | 0.267 | 0.114 | 0.1 | 0.097 | 0.2 | 0.111 | 0.888 |
15 | Cul5-2 | 0 | 0 | 0 | 0 | 0 | 0.114 | 0 | 0.194 | 0 | 0 | 0.307 |
16 | Cul5-1 | 0 | 0 | 0 | 0 | 0 | 0.114 | 0 | 0 | 0 | 0.111 | 0.225 |
CD4Ini | ||||||
---|---|---|---|---|---|---|
Rank | Variable | CART | MLP | NB | SVMs | Total |
1 | BCbox-3 | 8.282 | 8.136 | 7.906 | 8.585 | 32.909 |
2 | APOBEC-3 | 5.782 | 4.406 | 10 | 5.64 | 25.828 |
3 | APOBEC-5 | 6.982 | 6.516 | 5.673 | 5.607 | 24.778 |
4 | APOBEC-2 | 5.504 | 4.572 | 3.53 | 6.631 | 20.237 |
5 | BCbox-2 | 3.511 | 3.78 | 4.717 | 6.455 | 18.463 |
6 | APOBEC-6 | 6.038 | 5.059 | 2.749 | 3.335 | 17.181 |
7 | APOBEC-4 | 2.33 | 3.283 | 2.7 | 5.083 | 13.396 |
8 | Cul5-3 | 4.254 | 3.06 | 3.276 | 2.478 | 13.068 |
9 | CBFb-1 | 3.031 | 3.114 | 2.596 | 2.875 | 11.616 |
10 | APOBEC-7 | 1.449 | 1.761 | 2.278 | 3.471 | 8.959 |
11 | APOBEC-8 | 2.917 | 1.826 | 1.715 | 0.562 | 7.02 |
12 | NLIS | 0.968 | 1.533 | 2.778 | 1.555 | 6.834 |
13 | CBFb-2 | 1.633 | 1.939 | 1.513 | 1.387 | 6.472 |
14 | Cul5-1 | 0.791 | 3.292 | 0.764 | 0.587 | 5.434 |
15 | Cul5-2 | 0.566 | 1.39 | 2.21 | 0.672 | 4.838 |
16 | BCbox-1 | 0.963 | 1.334 | 0.596 | 0.075 | 2.968 |
CD4Hist | ||||||
Rank | Variable | CART | MLP | NB | SVMs | Total |
1 | APOBEC-2 | 8.393 | 8.047 | 10 | 9.236 | 35.676 |
2 | APOBEC-3 | 6.767 | 6.308 | 5.174 | 6.32 | 24.569 |
3 | APOBEC-5 | 5.467 | 4.662 | 3.637 | 7.332 | 21.098 |
4 | BCbox-3 | 5.359 | 4.116 | 5.551 | 5.074 | 20.1 |
5 | APOBEC-4 | 5.958 | 4.645 | 3.787 | 5.367 | 19.757 |
6 | Cul5-3 | 4.006 | 4.455 | 4.217 | 3.469 | 16.147 |
7 | APOBEC-6 | 4.167 | 3.679 | 3.041 | 2.998 | 13.885 |
8 | APOBEC-7 | 2.538 | 2.297 | 3.852 | 3.11 | 11.797 |
9 | CBFb-1 | 2.865 | 2.434 | 2.576 | 3.873 | 11.748 |
10 | BCbox-2 | 1.506 | 2.622 | 4.003 | 2.512 | 10.643 |
11 | CBFb-2 | 2.699 | 2.895 | 1.739 | 1.727 | 9.06 |
12 | APOBEC-8 | 2.177 | 1.665 | 2.192 | 1.734 | 7.768 |
13 | Cul5-1 | 1.142 | 1.747 | 2.087 | 2.025 | 7.001 |
14 | NLIS | 0.424 | 2.618 | 1.732 | 0.113 | 4.887 |
15 | Cul5-2 | 1.2 | 1.405 | 0.9 | 0 | 3.505 |
16 | BCbox-1 | 0.333 | 1.404 | 0.511 | 0.11 | 2.358 |
VLIni | ||||||
Rank | Variable | CART | MLP | NB | SVMs | Total |
1 | APOBEC-2 | 7.576 | 7.949 | 10 | 8.746 | 34.271 |
2 | BCbox-1 | 5.527 | 4.353 | 4.274 | 5.723 | 19.877 |
3 | APOBEC-3 | 7.032 | 7.394 | 3.788 | 1.361 | 19.575 |
4 | APOBEC-5 | 4.228 | 4.46 | 4.478 | 5.105 | 18.271 |
5 | CBFb-2 | 5.043 | 4.741 | 2.655 | 4.237 | 16.676 |
6 | APOBEC-4 | 4.183 | 2.748 | 5.705 | 3.52 | 16.156 |
7 | APOBEC-7 | 3.084 | 3.092 | 3.276 | 6.018 | 15.47 |
8 | APOBEC-6 | 1.637 | 1.484 | 7.887 | 2.72 | 13.728 |
9 | BCbox-2 | 3.216 | 2.611 | 2.587 | 4.052 | 12.466 |
10 | CBFb-1 | 2.76 | 3.521 | 2.513 | 1.456 | 10.25 |
11 | Cul5-1 | 1.976 | 1.987 | 2.021 | 3.184 | 9.168 |
12 | NLIS | 3.201 | 1.558 | 1.841 | 2.031 | 8.631 |
13 | Cul5-3 | 0.945 | 1.801 | 1.33 | 3.546 | 7.622 |
14 | Cul5-2 | 1.212 | 1.736 | 1.192 | 2.689 | 6.829 |
15 | APOBEC-8 | 2.16 | 3.078 | 0.862 | 0.613 | 6.713 |
16 | BCbox-3 | 1.221 | 2.485 | 0.59 | 0 | 4.296 |
VLHist | ||||||
Rank | Variable | CART | MLP | NB | SVMs | Total |
1 | NLIS | 10.058 | 9.9 | 6.424 | 9.944 | 36.326 |
2 | APOBEC-3 | 5.32 | 3.937 | 10 | 5.892 | 25.149 |
3 | APOBEC-5 | 4.175 | 6.636 | 5.175 | 4.432 | 20.418 |
4 | APOBEC-2 | 5.998 | 4.32 | 4.131 | 4.65 | 19.099 |
5 | CBFb-1 | 5.549 | 4.82 | 2.877 | 3.252 | 16.498 |
6 | BCbox-1 | 3.965 | 3.135 | 4.498 | 3.595 | 15.193 |
7 | APOBEC-8 | 4.256 | 4.894 | 2.591 | 0.888 | 12.629 |
8 | APOBEC-7 | 1.569 | 1.288 | 2.073 | 7.052 | 11.982 |
9 | Cul5-1 | 1.761 | 1.319 | 1.481 | 5.341 | 9.902 |
10 | APOBEC-6 | 3.29 | 2.741 | 2.421 | 1.343 | 9.795 |
11 | APOBEC-4 | 0.316 | 0.88 | 6.601 | 1.608 | 9.405 |
12 | BCbox-3 | 3.09 | 4.316 | 0.817 | 0.225 | 8.448 |
13 | CBFb-2 | 2.516 | 1.298 | 1.341 | 2.008 | 7.163 |
14 | Cul5-2 | 1.238 | 1.149 | 2.045 | 1.843 | 6.275 |
15 | BCbox-2 | 1.175 | 3.836 | 0.772 | 0.307 | 6.09 |
16 | Cul5-3 | 0.725 | 0.531 | 1.753 | 2.619 | 5.628 |
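
The Total column in the table above appears to be the plain sum of the four classifier scores in each row. The short check below recomputes it for the first three CD4Ini rows, with values and row labels copied as printed; the dictionary layout and the rounding tolerance are choices made for this illustration.

```python
import math

# (CART, MLP, NB, SVMs) scores and the printed Total, copied from the first
# three CD4Ini rows of the table above.
cd4ini_rows = {
    "BCbox-3": ((8.282, 8.136, 7.906, 8.585), 32.909),
    "APOBEC-3": ((5.782, 4.406, 10.0, 5.64), 25.828),
    "APOBEC-5": ((6.982, 6.516, 5.673, 5.607), 24.778),
}

# Sum the four classifier scores of each row and round to three decimals.
recomputed = {name: round(sum(scores), 3)
              for name, (scores, _) in cd4ini_rows.items()}

for name, (_, printed_total) in cd4ini_rows.items():
    # Small tolerance for rounding in the printed three-decimal values.
    assert math.isclose(recomputed[name], printed_total, abs_tol=0.002)

print(recomputed)
```
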
Variable | (a) MAREV-1, CD4Ini | (a) MAREV-1, CD4Hist | (a) MAREV-1, VLIni | (a) MAREV-1, VLHist | (b) MAREV-2, CD4Ini | (b) MAREV-2, CD4Hist | (b) MAREV-2, VLIni | (b) MAREV-2, VLHist | |
---|---|---|---|---|---|---|---|---|---|
APOBEC-2 | 20.237 | 35.676 | 34.271 | 19.099 | 6.5 | 10.0 | 10.0 | 8.75 | |
APOBEC-3 | 25.828 | 24.569 | 19.575 | 25.149 | 7.75 | 2.25 | 5.083 | 3.75 | |
APOBEC-4 | 13.396 | 19.757 | 16.156 | 9.405 | 1.75 | 8.0 | 8.0 | 2.25 | |
APOBEC-5 | 24.778 | 21.098 | 18.271 | 20.418 | 2.5 | 3.25 | 5.167 | 1.5 | |
APOBEC-6 | 17.181 | 13.885 | 13.728 | 9.795 | 1.25 | 1.5 | 1.667 | 3.667 | |
APOBEC-7 | 8.959 | 11.797 | 15.47 | 11.982 | 0 | 5.167 | 2.25 | 1.0 | |
APOBEC-8 | 7.02 | 7.768 | 6.713 | 12.629 | 0 | 1.667 | 3.333 | 4.833 | |
BCbox-1 | 2.968 | 2.358 | 19.877 | 15.193 | 3.0 | 1.5 | 6.5 | 7.167 | |
BCbox-2 | 18.463 | 10.643 | 12.466 | 6.09 | 8.5 | 3.667 | 5.0 | 1.75 | |
BCbox-3 | 32.909 | 20.1 | 4.296 | 8.448 | 8.5 | 7.0 | 0 | 3.583 | |
CBFb-1 | 11.616 | 11.748 | 10.25 | 16.498 | 0 | 0 | 2.0 | 1.667 | |
CBFb-2 | 6.472 | 9.06 | 16.676 | 7.163 | 0 | 1.75 | 0 | 1.5 | |
Cul5-1 | 5.434 | 7.001 | 9.168 | 9.902 | 0 | 0 | 0 | 1.333 | |
Cul5-2 | 4.838 | 3.505 | 6.829 | 6.275 | 0 | 0 | 1.333 | 1.5 | |
Cul5-3 | 13.068 | 16.147 | 7.622 | 5.628 | 5.25 | 4.25 | 1.667 | 2.25 | |
NLIS | 6.834 | 4.887 | 8.631 | 36.326 | 0 | 2.0 | 2.0 | 8.5 | |
Threshold | 20.2 | 20.25 | 19.18 | 19.85 | 8.187 | 6.328 | 6.454 | 5.326 |
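
The thresholding step can be illustrated with the CD4Ini columns of the table above. The sketch below takes the printed thresholds (20.2 for MAREV-1, 8.187 for MAREV-2) as given and keeps every variable whose score reaches them. The function name `above_threshold` and the use of "greater than or equal" are assumptions; no score in these columns equals its threshold, so the choice of comparison does not affect the outcome here.

```python
# Scores copied from the CD4Ini columns of the table above: (MAREV-1, MAREV-2).
cd4ini = {
    "APOBEC-2": (20.237, 6.5),
    "APOBEC-3": (25.828, 7.75),
    "APOBEC-4": (13.396, 1.75),
    "APOBEC-5": (24.778, 2.5),
    "APOBEC-6": (17.181, 1.25),
    "APOBEC-7": (8.959, 0.0),
    "APOBEC-8": (7.02, 0.0),
    "BCbox-1": (2.968, 3.0),
    "BCbox-2": (18.463, 8.5),
    "BCbox-3": (32.909, 8.5),
    "CBFb-1": (11.616, 0.0),
    "CBFb-2": (6.472, 0.0),
    "Cul5-1": (5.434, 0.0),
    "Cul5-2": (4.838, 0.0),
    "Cul5-3": (13.068, 5.25),
    "NLIS": (6.834, 0.0),
}
thresholds = (20.2, 8.187)  # printed Threshold row, CD4Ini columns


def above_threshold(scores, threshold):
    """Return the variables scoring at or above the threshold, highest first."""
    kept = [(name, score) for name, score in scores.items() if score >= threshold]
    return [name for name, _ in sorted(kept, key=lambda item: item[1], reverse=True)]


marev1 = above_threshold({k: v[0] for k, v in cd4ini.items()}, thresholds[0])
marev2 = above_threshold({k: v[1] for k, v in cd4ini.items()}, thresholds[1])
print(marev1)
print(marev2)
```

For MAREV-1 this keeps BCbox-3, APOBEC-3, APOBEC-5 and APOBEC-2, in that order; for MAREV-2 it keeps BCbox-2 and BCbox-3, which tie at 8.5.
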
a Previous Results | | | b MAREV-1 | | | c MAREV-2 | | |
---|---|---|---|---|---|---|---|---|
Clinical Endpoint | Variable | Rank | | Rank | Variable | | Rank | Variable |
CD4Ini | BCbox-3 | 1 | = | 1 | BCbox-3 | = | 1 | BCbox-3 |
CD4Ini | APOBEC-4 | 2 | - | | APOBEC-3 | - | | BCbox-2 |
CD4Ini | Cul-5 | 3 | - | | APOBEC-5 | | | |
CD4Ini | | | - | | APOBEC-2 | | | |
CD4Hist | APOBEC-2 | 1 | = | 1 | APOBEC-2 | = | 1 | APOBEC-2 |
CD4Hist | APOBEC-3 | 2 | = | 2 | APOBEC-3 | - | | APOBEC-4 |
CD4Hist | | | - | | APOBEC-5 | - | | BCbox-3 |
VLIni | APOBEC-2 | 1 | = | 1 | APOBEC-2 | = | 1 | APOBEC-2 |
VLIni | BCbox-1 | 2 | = | 2 | BCbox-1 | - | | APOBEC-4 |
VLIni | BCbox-2 | 3 | - | | APOBEC-3 | - | | BCbox-1 |
VLHist | NLIS | 1 | = | 1 | NLIS | - | | APOBEC-2 |
VLHist | BCbox-1 | 2 | - | | APOBEC-3 | - | | NLIS |
VLHist | APOBEC-2 | 3 | - | | APOBEC-5 | - | | BCbox-1 |
| Clinical Endpoint | Algorithm | Mean ± S.D. | Range | Clinical Endpoint | Algorithm | Mean ± S.D. | Range |
|---|---|---|---|---|---|---|---|
| CD4Ini | MLP | 79.6 ± 5.7 | 68.6–93.8 | VLIni | MLP | 68.5 ± 3.2 | 61.1–75.2 |
| | CART | 77.8 ± 6.0 | 65.7–91.0 | | CART | 68.0 ± 3.7 | 59.1–80.2 |
| | SVMs | 76.2 ± 5.6 | 61.0–88.1 | | NB | 66.5 ± 3.4 | 57.4–75.0 |
| | NB | 74.9 ± 5.9 | 59.5–90.5 | | SVMs | 62.0 ± 4.0 | 51.7–71.5 |
| CD4Hist | MLP | 76.0 ± 5.4 | 63.3–91.0 | VLHist | MLP | 66.3 ± 2.7 | 60.9–73.8 |
| | CART | 74.0 ± 6.2 | 62.9–88.1 | | CART | 64.2 ± 2.5 | 59.1–71.4 |
| | NB | 72.6 ± 5.8 | 60.0–87.6 | | NB | 64.1 ± 2.9 | 57.5–71.1 |
| | SVMs | 66.7 ± 6.4 | 53.8–81.9 | | SVMs | 63.2 ± 3.0 | 51.3–68.5 |
| Clinical Endpoint | Algorithm | Combination | Accuracy |
|---|---|---|---|
| CD4Ini | MLP | BCbox-3, APOBEC-3, BCbox-2, Cul5-3, BCbox-1, APOBEC-5 | 93.8 |
| | CART | BCbox-3, BCbox-2, Cul5-3, APOBEC-2, APOBEC-3, APOBEC-5 | 91.0 |
| | NB | APOBEC-2, BCbox-3, APOBEC-3, BCbox-2, Cul5-3, APOBEC-6 | 90.5 |
| | SVMs | BCbox-2, APOBEC-2, APOBEC-3, APOBEC-4, BCbox-1, BCbox-3 | 88.1 |
| CD4Hist | MLP | APOBEC-2, Cul5-3, APOBEC-4, BCbox-3, APOBEC-7, BCbox-2, NLIS, BCbox-1 | 91.0 |
| | CART | APOBEC-2, BCbox-3, BCbox-2, APOBEC-4, APOBEC-5 | 88.1 |
| | NB | APOBEC-2, APOBEC-4, Cul5-3, CBFb-2, BCbox-3, APOBEC-7 | 87.6 |
| | SVMs | APOBEC-2, APOBEC-3, APOBEC-4, APOBEC-5, APOBEC-6, APOBEC-8, APOBEC-7, BCbox-3 | 81.9 |
| VLIni | CART | APOBEC-2, BCbox-1, APOBEC-4, BCbox-2 | 80.2 |
| | MLP | APOBEC-2, BCbox-1, APOBEC-8, APOBEC-3, APOBEC-4, APOBEC-5, Cul5-2 | 75.2 |
| | NB | APOBEC-2, APOBEC-4, BCbox-1, BCbox-2, NLIS, Cul5-3, APOBEC-3, APOBEC-5, CBFb-1 | 75.0 |
| | SVMs | APOBEC-2, APOBEC-7, APOBEC-3, APOBEC-4, APOBEC-5, APOBEC-6, APOBEC-8, BCbox-2 | 71.5 |
| VLHist | MLP | NLIS, APOBEC-3, APOBEC-2, APOBEC-8, BCbox-1, CBFb-1, Cul5-1, Cul5-2 | 73.8 |
| | CART | APOBEC-2, BCbox-3, BCbox-1, APOBEC-8, NLIS total | 71.4 |
| | NB | APOBEC-2, Cul5-3, NLIS, BCbox-2, APOBEC-3, BCbox-1, APOBEC-8, CBFb-2, APOBEC-6, APOBEC-7 | 71.1 |
| | SVMs | NLIS, APOBEC-4, BCbox-1, APOBEC-2, APOBEC-5, APOBEC-6, BCbox-3 | 68.5 |
| | | | Contingency Tables | | | Classification | | |
|---|---|---|---|---|---|---|---|---|
| Approach | Output | Vif Variable Combination | Status | ≥500 cells/μL | <500 cells/μL | Accuracy | Error | p-Value (effect) |
| (a) MAREV-1 | Initial CD4 | BCbox-3, APOBEC-3 | absent | 8 | 53 | 81.3% | 18.7% | 0.0011 |
| | | | present | 8 | 6 | (61/75) | (14/75) | |
| | Historic CD4 | APOBEC-2, APOBEC-3, APOBEC-5 | absent | 14 | 35 | 52.0% | 48.0% | 0.0136 |
| | | | present | 1 | 25 | (39/75) | (36/75) | |
| | | APOBEC-2, APOBEC-3 | absent | 2 | 29 | 56.0% | 44.0% | 0.0182 |
| | | | present | 13 | 31 | (42/75) | (33/75) | |
| | | | | <10,000 cp/mL | ≥10,000 cp/mL | | | |
| | Initial VL | APOBEC-2, BCbox-1, APOBEC-3 | absent | 22 | 28 | 57.3% | 42.7% | 0.0207 |
| | | | present | 4 | 21 | (43/75) | (32/75) | |
| | Historic VL | — | — | — | — | — | — | — |
| (b) MAREV-2 | Initial CD4 | BCbox-3, BCbox-2 | absent | 15 | 33 | 54.7% | 45.3% | 0.0068 |
| | | | present | 1 | 26 | (41/75) | (34/75) | |
| | | BCbox-3, BCbox-2 | absent | 10 | 55 | 81.3% | 18.7% | 0.0049 |
| | | | present | 6 | 4 | (61/75) | (14/75) | |
| | Historic CD4 | APOBEC-2, BCbox-3 | absent | 15 | 40 | 53.3% | 46.7% | 0.0077 |
| | | | present | 0 | 20 | (40/75) | (35/75) | |
| | | | | <10,000 cp/mL | ≥10,000 cp/mL | | | |
| | Initial VL | APOBEC-2, BCbox-1, APOBEC-4 | absent | 25 | 38 | 52.0% | 48.0% | 0.0477 |
| | | | present | 1 | 11 | (39/75) | (36/75) | |
| | Historic VL | NLIS, BCbox-1, APOBEC-2 | absent | 41 | 27 | 62.7% | 37.3% | 0.0392 |
| | | | present | 1 | 6 | (47/75) | (28/75) | |
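
The Accuracy and Error columns of the contingency tables above can be checked by hand. In every row, the reported accuracy matches the larger of the two diagonal sums of the 2×2 table divided by the 75 patients, and the error matches the smaller one. The sketch below reproduces that arithmetic in Python. It reflects an observation about the tabulated counts, not a procedure stated by the authors, and the function name `diagonal_accuracy` is hypothetical. The p-values are not recomputed because the statistical test is not named in this section.

```python
def diagonal_accuracy(absent, present):
    """Return (correct, total, accuracy_percent) for one 2x2 table.

    `absent` and `present` are (first_column, second_column) counts for
    patients without / with the Vif variable combination. The larger
    diagonal is taken as the number of correctly classified patients
    (an assumption that matches the tabulated values).
    """
    total = sum(absent) + sum(present)
    diagonal_a = absent[0] + present[1]
    diagonal_b = absent[1] + present[0]
    correct = max(diagonal_a, diagonal_b)
    return correct, total, round(100 * correct / total, 1)


# Counts copied from the contingency tables above.
tables = {
    "MAREV-1, Initial CD4 (BCbox-3, APOBEC-3)": ((8, 53), (8, 6)),
    "MAREV-1, Historic CD4 (APOBEC-2, APOBEC-3, APOBEC-5)": ((14, 35), (1, 25)),
    "MAREV-1, Initial VL (APOBEC-2, BCbox-1, APOBEC-3)": ((22, 28), (4, 21)),
    "MAREV-2, Historic VL (NLIS, BCbox-1, APOBEC-2)": ((41, 27), (1, 6)),
}

for label, (absent, present) in tables.items():
    correct, total, percent = diagonal_accuracy(absent, present)
    print(f"{label}: {correct}/{total} = {percent}%")
```

For the first entry this gives 61/75 = 81.3%, which agrees with the tabulated accuracy; the remaining error of 14/75 = 18.7% is the other diagonal.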
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Altamirano-Flores, J.S.; Alvarado-Hernández, L.Á.; Cuevas-Tello, J.C.; Tino, P.; Guerra-Palomares, S.E.; Garcia-Sepulveda, C.A. Identification of Clinically Relevant HIV Vif Protein Motif Mutations through Machine Learning and Undersampling. Cells 2023, 12, 772. https://doi.org/10.3390/cells12050772