## 1. Introduction

## 2. Results

#### 2.1. Random Forest Classifier

#### 2.2. Position Analysis of the MYC-MAX Protein

## 3. Materials and Methods

#### 3.1. Materials

#### 3.2. Methods

## 4. Discussion

## 5. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## Appendix A

#### Appendix A.1. Performance Measures with Standard Error

**Table A1.** Prediction performance of the Random Forest (RF) classifier on different features using a cut-off of 3.5 Å. The prediction system was evaluated by five-fold cross validation.

Feature | Sensitivity ± SE (%) | Specificity ± SE (%) | MCC ± SE (%) |
---|---|---|---|
${\mathrm{f}}_{\mathrm{PSSM}}$ | 29.2 ± 2.20 | 96.3 ± 0.46 | 30.7 ± 0.95 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 38.5 ± 3.04 | 94.9 ± 0.57 | 34.9 ± 1.7 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 41.0 ± 3.23 | 93.9 ± 0.57 | 35.0 ± 1.85 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 41.4 ± 3.42 | 94.0 ± 0.51 | 34.8 ± 2.07 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 33.9 ± 2.32 | 95.8 ± 0.37 | 33.4 ± 1.36 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 41.6 ± 3.05 | 95.0 ± 0.46 | 37.8 ± 2.19 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 44.1 ± 3.12 | 94.0 ± 0.43 | 37.2 ± 2.37 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 43.9 ± 3.14 | 94.0 ± 0.40 | 37.0 ± 2.25 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 36.7 ± 2.07 | 96.8 ± 0.27 | 39.8 ± 1.58 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 42.2 ± 2.70 | 95.8 ± 0.42 | 40.9 ± 1.95 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 44.7 ± 3.05 | 95.0 ± 0.38 | 40.3 ± 1.98 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 44.4 ± 3.12 | 94.7 ± 0.39 | 39.3 ± 2.02 |
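A "mean ± SE" entry of this kind can be obtained from per-fold scores as sketched below. This is a minimal illustration, assuming the SE is the standard error of the mean over the five cross-validation folds, which the table caption does not state explicitly. The function name `mean_and_standard_error` and the per-fold values are hypothetical and are not taken from the study.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for a list of per-fold scores."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two folds to estimate a standard error")
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 in the denominator) divided by sqrt(n).
    se = statistics.stdev(values) / math.sqrt(n)
    return mean, se


# Hypothetical per-fold sensitivities in percent (illustrative only).
fold_sensitivity = [25.0, 27.0, 29.0, 31.0, 33.0]
mean, se = mean_and_standard_error(fold_sensitivity)
print(f"{mean:.1f} ± {se:.2f}")
```

With only five folds the SE estimate is itself noisy, so small differences between rows whose intervals overlap should be read with caution.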

**Table A2.** Prediction performance of the Random Forest (RF) classifier on different features using a cut-off of 5.0 Å. The prediction system was evaluated by five-fold cross validation.

Feature | Sensitivity ± SE (%) | Specificity ± SE (%) | MCC ± SE (%) |
---|---|---|---|
${\mathrm{f}}_{\mathrm{PSSM}}$ | 28.6 ± 2.56 | 96.6 ± 0.47 | 35.0 ± 1.43 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 39.5 ± 2.89 | 95.0 ± 0.55 | 40.7 ± 1.99 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 41.8 ± 3.02 | 94.3 ± 0.62 | 41.1 ± 2.05 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 42.6 ± 3.25 | 94.2 ± 0.54 | 41.4 ± 2.37 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 33.4 ± 2.34 | 96.3 ± 0.38 | 38.6 ± 1.90 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 42.4 ± 2.97 | 95.1 ± 0.61 | 43.6 ± 2.43 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 44.8 ± 2.99 | 94.4 ± 0.56 | 43.8 ± 2.45 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 44.5 ± 3.04 | 94.4 ± 0.50 | 43.4 ± 2.35 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 33.7 ± 2.48 | 97.5 ± 0.35 | 43.1 ± 2.05 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 41.9 ± 2.89 | 95.8 ± 0.55 | 45.0 ± 2.39 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 43.9 ± 2.89 | 95.2 ± 0.48 | 45.3 ± 2.32 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 44.2 ± 2.91 | 94.9 ± 0.54 | 44.5 ± 2.24 |

#### Appendix A.2. RBscore Dataset Analysis

**Table A3.** Detailed prediction performance of the Random Forest (RF) classifier on different features using a cut-off of 3.5 Å.

Feature | Sensitivity | Specificity | MCC | AUC-ROC | AUC-PR |
---|---|---|---|---|---|
${\mathrm{f}}_{\mathrm{PSSM}}$ | 0.458 | 0.974 | 0.476 | 0.866 | 0.460 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.56 | 0.965 | 0.514 | 0.894 | 0.518 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.597 | 0.957 | 0.511 | 0.899 | 0.523 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.591 | 0.958 | 0.511 | 0.90 | 0.526 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.512 | 0.97 | 0.501 | 0.878 | 0.476 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.581 | 0.96 | 0.511 | 0.899 | 0.520 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.611 | 0.953 | 0.508 | 0.903 | 0.526 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.613 | 0.953 | 0.509 | 0.902 | 0.528 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.517 | 0.976 | 0.534 | 0.896 | 0.528 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.58 | 0.967 | 0.54 | 0.907 | 0.543 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.612 | 0.963 | 0.546 | 0.910 | 0.551 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.601 | 0.962 | 0.531 | 0.909 | 0.546 |
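The two threshold-free columns, AUC-ROC and AUC-PR, summarize how well the classifier's scores rank binding residues above non-binding ones. The sketch below shows one common way to compute them from per-residue scores; it is an illustration only. The source does not say which estimator was used for AUC-PR, so average precision is an assumption here, and the labels and scores are hypothetical.

```python
def auc_roc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision measured at the rank of each positive.

    Assumes distinct scores; tied scores are ranked in input order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("at least one positive is required")
    return total / hits


# Hypothetical residue labels (1 = binding) and classifier scores.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
roc = auc_roc(labels, scores)
pr = average_precision(labels, scores)
print(round(roc, 3), round(pr, 3))
```

Because binding residues are the minority class, AUC-PR is usually the more demanding of the two measures, which is consistent with the AUC-PR values in the table being much lower than the AUC-ROC values.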

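**Table A4.** Detailed prediction performance of the Random Forest (RF) classifier on different features using a cut-off of 5.0 Å.

Feature | Sensitivity | Specificity | MCC | AUC-ROC | AUC-PR |
---|---|---|---|---|---|
${\mathrm{f}}_{\mathrm{PSSM}}$ | 0.445 | 0.977 | 0.528 | 0.873 | 0.589 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.553 | 0.968 | 0.579 | 0.899 | 0.643 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.57 | 0.962 | 0.572 | 0.900 | 0.642 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.569 | 0.963 | 0.574 | 0.895 | 0.642 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.49 | 0.973 | 0.547 | 0.880 | 0.602 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.578 | 0.963 | 0.583 | 0.902 | 0.648 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.605 | 0.958 | 0.587 | 0.904 | 0.652 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.603 | 0.959 | 0.587 | 0.902 | 0.653 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.499 | 0.98 | 0.584 | 0.895 | 0.641 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.57 | 0.968 | 0.595 | 0.908 | 0.661 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.592 | 0.965 | 0.60 | 0.908 | 0.665 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.594 | 0.964 | 0.597 | 0.907 | 0.663 |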

#### Appendix A.3. PreDNA Dataset Analysis

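**Table A5.** Detailed prediction performance of the Random Forest (RF) classifier on different features using a cut-off of 3.5 Å.

Feature | Sensitivity | Specificity | MCC | AUC-ROC | AUC-PR |
---|---|---|---|---|---|
${\mathrm{f}}_{\mathrm{PSSM}}$ | 0.378 | 0.977 | 0.41 | 0.840 | 0.391 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.498 | 0.963 | 0.448 | 0.865 | 0.453 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.543 | 0.953 | 0.445 | 0.869 | 0.451 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.538 | 0.956 | 0.453 | 0.869 | 0.455 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.393 | 0.975 | 0.417 | 0.847 | 0.402 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.501 | 0.966 | 0.461 | 0.872 | 0.463 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.545 | 0.959 | 0.465 | 0.876 | 0.468 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.523 | 0.958 | 0.449 | 0.875 | 0.465 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.428 | 0.977 | 0.458 | 0.867 | 0.451 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.511 | 0.97 | 0.488 | 0.885 | 0.488 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.539 | 0.962 | 0.475 | 0.888 | 0.488 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.539 | 0.961 | 0.47 | 0.886 | 0.488 |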

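**Table A6.** Detailed prediction performance of the Random Forest (RF) classifier on different features using a cut-off of 5.0 Å.

Feature | Sensitivity | Specificity | MCC | AUC-ROC | AUC-PR |
---|---|---|---|---|---|
${\mathrm{f}}_{\mathrm{PSSM}}$ | 0.373 | 0.979 | 0.463 | 0.833 | 0.496 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.485 | 0.962 | 0.495 | 0.858 | 0.540 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.496 | 0.953 | 0.475 | 0.858 | 0.534 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.495 | 0.955 | 0.479 | 0.857 | 0.535 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.389 | 0.977 | 0.47 | 0.839 | 0.501 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.49 | 0.963 | 0.501 | 0.863 | 0.550 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.503 | 0.957 | 0.492 | 0.865 | 0.547 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.504 | 0.958 | 0.497 | 0.865 | 0.550 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.395 | 0.98 | 0.488 | 0.858 | 0.530 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.48 | 0.968 | 0.511 | 0.874 | 0.563 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.506 | 0.962 | 0.51 | 0.873 | 0.560 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.499 | 0.96 | 0.498 | 0.871 | 0.555 |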

**Figure 1.** DNA-binding sites in the proto-oncogenic transcription factor MYC-MAX protein complex (PDB-Entry 1NKP). Green spheres denote the positions of DNA-binding sites in both proteins that are detected by the RF classifier both when it uses the existing features (${\mathrm{f}}_{\mathrm{PSSM}}$, ${\mathrm{f}}_{\mathrm{OBV}}$, and ${\mathrm{f}}_{\mathrm{SS}}$) alone and when it combines our new features with them. Purple spheres show the locations of additional binding sites that were found only when the RF classifier used our new features together with the existing ones. Yellow spheres mark the binding sites that the classifier could not identify: three in the MYC protein and one in the MAX protein.

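**Table 1.** Prediction performance of the Random Forest (RF) classifier on different features using a cut-off of 3.5 Å. The prediction system was evaluated by five-fold cross validation.

Feature | Sensitivity | Specificity | MCC | AUC-ROC | AUC-PR |
---|---|---|---|---|---|
${\mathrm{f}}_{\mathrm{PSSM}}$ | 0.292 | 0.963 | 0.307 | 0.777 | 0.313 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.385 | 0.949 | 0.349 | 0.795 | 0.369 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.41 | 0.939 | 0.35 | 0.802 | 0.377 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.414 | 0.94 | 0.348 | 0.800 | 0.376 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.339 | 0.958 | 0.334 | 0.794 | 0.338 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.416 | 0.95 | 0.378 | 0.808 | 0.390 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.441 | 0.94 | 0.372 | 0.817 | 0.401 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.439 | 0.94 | 0.37 | 0.814 | 0.399 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.367 | 0.968 | 0.398 | 0.838 | 0.413 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.422 | 0.958 | 0.409 | 0.837 | 0.425 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.447 | 0.95 | 0.403 | 0.841 | 0.431 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.444 | 0.947 | 0.393 | 0.835 | 0.423 |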

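**Table 2.** Prediction performance of the Random Forest (RF) classifier on different features using a cut-off of 5.0 Å. The prediction system was evaluated by five-fold cross validation.

Feature | Sensitivity | Specificity | MCC | AUC-ROC | AUC-PR |
---|---|---|---|---|---|
${\mathrm{f}}_{\mathrm{PSSM}}$ | 0.286 | 0.966 | 0.350 | 0.778 | 0.425 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.395 | 0.95 | 0.407 | 0.801 | 0.487 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.418 | 0.943 | 0.411 | 0.807 | 0.494 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.426 | 0.942 | 0.414 | 0.807 | 0.497 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.334 | 0.963 | 0.386 | 0.796 | 0.455 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.424 | 0.951 | 0.436 | 0.814 | 0.513 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.448 | 0.944 | 0.438 | 0.820 | 0.520 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.445 | 0.944 | 0.434 | 0.819 | 0.521 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.337 | 0.975 | 0.431 | 0.830 | 0.517 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.419 | 0.958 | 0.450 | 0.832 | 0.535 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.439 | 0.952 | 0.453 | 0.836 | 0.539 |
${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.442 | 0.949 | 0.445 | 0.832 | 0.535 |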

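**Table 3.** Prediction performance of the Random Forest (RF) classifier on the RBscore dataset using different distance cut-offs.

Cut-Off | Feature | Sensitivity | Specificity | MCC | AUC-ROC | AUC-PR |
---|---|---|---|---|---|---|
3.5 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.517 | 0.976 | 0.534 | 0.896 | 0.528 |
3.5 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.58 | 0.967 | 0.54 | 0.907 | 0.543 |
3.5 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.612 | 0.963 | 0.546 | 0.910 | 0.551 |
3.5 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.601 | 0.962 | 0.531 | 0.909 | 0.546 |
5.0 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.499 | 0.98 | 0.584 | 0.895 | 0.641 |
5.0 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.57 | 0.968 | 0.595 | 0.908 | 0.661 |
5.0 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.592 | 0.965 | 0.60 | 0.908 | 0.665 |
5.0 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.594 | 0.964 | 0.597 | 0.907 | 0.663 |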
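The distance cut-off determines which residues count as binding sites in the first place. The sketch below illustrates one common convention, under which a residue is labeled as binding if any of its atoms lies within the cut-off of any DNA atom. This convention is an assumption for illustration; the exact definition used in the study is not given in this section. The function name `binding_residues`, the residue identifiers, and all coordinates are hypothetical.

```python
import math


def binding_residues(protein_atoms, dna_atoms, cutoff):
    """Label each residue True if any of its atoms is within `cutoff` Å of a DNA atom.

    protein_atoms: dict mapping residue id -> list of (x, y, z) coordinates
    dna_atoms: list of (x, y, z) coordinates
    Assumption: the comparison is inclusive (distance <= cutoff).
    """
    labels = {}
    for residue_id, atoms in protein_atoms.items():
        labels[residue_id] = any(
            math.dist(atom, dna_atom) <= cutoff
            for atom in atoms
            for dna_atom in dna_atoms
        )
    return labels


# Hypothetical toy structure: one DNA atom at the origin, three residues.
dna = [(0.0, 0.0, 0.0)]
protein = {
    "A1": [(3.0, 0.0, 0.0)],  # 3.0 Å away
    "A2": [(4.0, 0.0, 0.0)],  # 4.0 Å away
    "A3": [(6.0, 0.0, 0.0)],  # 6.0 Å away
}
labels_35 = binding_residues(protein, dna, cutoff=3.5)
labels_50 = binding_residues(protein, dna, cutoff=5.0)
print(labels_35)
print(labels_50)
```

Under this convention the 5.0 Å cut-off can only add positives relative to 3.5 Å, so the two cut-offs define different class balances. That is one reason why AUC-PR values are not directly comparable between the 3.5 Å and 5.0 Å rows.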

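**Table 4.** Prediction performance of the RF classifier on the PreDNA dataset using different distance cut-offs.

Cut-Off | Feature | Sensitivity | Specificity | MCC | AUC-ROC | AUC-PR |
---|---|---|---|---|---|---|
3.5 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.428 | 0.977 | 0.458 | 0.867 | 0.451 |
3.5 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.511 | 0.97 | 0.488 | 0.885 | 0.488 |
3.5 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.539 | 0.962 | 0.475 | 0.888 | 0.488 |
3.5 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.539 | 0.961 | 0.47 | 0.886 | 0.488 |
5.0 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.395 | 0.98 | 0.488 | 0.858 | 0.530 |
5.0 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.48 | 0.968 | 0.511 | 0.874 | 0.563 |
5.0 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.506 | 0.962 | 0.51 | 0.873 | 0.560 |
5.0 Å | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.499 | 0.96 | 0.498 | 0.871 | 0.555 |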

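**Table 5.** Prediction performance of the RF classifier on different features using a cut-off of 3.5 Å for the MYC-MAX protein complex (Protein Data Bank (PDB)-Entry 1NKP).

Protein | Feature | Sensitivity | Specificity | MCC |
---|---|---|---|---|
MYC | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.30 | 0.941 | 0.282 |
MYC | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.70 | 0.853 | 0.448 |
MYC | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.70 | 0.853 | 0.448 |
MYC | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.70 | 0.868 | 0.470 |
MAX | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ | 0.222 | 1.0 | 0.447 |
MAX | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ | 0.888 | 0.906 | 0.664 |
MAX | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.888 | 0.922 | 0.697 |
MAX | ${\mathrm{f}}_{\mathrm{PSSM}}$ + ${\mathrm{f}}_{\mathrm{OBV}}$ + ${\mathrm{f}}_{\mathrm{SS}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}}$ + ${\mathrm{f}}_{\mathbb{J}\mathbb{S}\mathbb{D}-\mathrm{t}}$ | 0.889 | 0.922 | 0.697 |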
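Sensitivity, specificity, and the Matthews correlation coefficient (MCC) all follow from the four confusion-matrix counts. The sketch below uses the standard definitions. The residue counts are not given in the source: they are inferred here as one set of counts that reproduces the first two MYC rows of Table 5 after rounding, and should be treated as an assumption. The function name `confusion_metrics` is hypothetical.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return sensitivity, specificity, mcc


# Assumed counts (inferred, not stated in the source): 10 binding and
# 68 non-binding residues in MYC.
baseline = confusion_metrics(tp=3, fn=7, tn=64, fp=4)    # first MYC row
extended = confusion_metrics(tp=7, fn=3, tn=58, fp=10)   # second MYC row

for name, (sens, spec, mcc) in [("baseline", baseline), ("extended", extended)]:
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.3f} MCC={mcc:.3f}")
```

On these assumed counts, the added features recover four more binding residues at the cost of six more false positives. MCC still rises because it weighs both error types against the strong class imbalance.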

