This user guide describes the rationale behind, and the modus operandi of, a Unix script-driven package for the evolutionary search of optimal Support Vector Machine (SVM) model parameters, as computed by the libsvm package, leading to SVM models of maximal predictive power and robustness. Unlike common libsvm parameterizing engines, the current distribution includes the key choice of the best-suited sets of attributes/descriptors in addition to the classical libsvm operational parameters (kernel choice, kernel parameters, cost, and so forth), allowing a unified search in an enlarged problem space. It relies on an aggressive, repeated cross-validation scheme to ensure a rigorous assessment of model quality. Primarily designed for chemoinformatics applications, it also supports the inclusion of decoy instances, for which the explained property (bioactivity) is, strictly speaking, unknown but presumably “inactive”, thus additionally testing the robustness of a model to noise. The package was developed with parallel computing in mind and supports execution on both multi-core workstations and compute cluster environments. It can be downloaded from http://infochim.u-strasbg.fr/spip.php?rubrique178.
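As an illustration of what a repeated cross-validation scheme involves, the following minimal Python sketch reshuffles the data set before each k-fold pass and summarizes the resulting scores. It is not the package's own implementation: the function names, the `evaluate` callback (which would train and test a model on the given index lists), and the choice of reporting the mean and standard deviation over repeats are all assumptions made for this example.

```python
import random
import statistics


def k_fold_indices(n_items, k, rng):
    """Shuffle item indices and split them into k nearly equal folds."""
    indices = list(range(n_items))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_cv(n_items, k, repeats, evaluate, seed=0):
    """Run k-fold cross-validation `repeats` times, reshuffling each time.

    `evaluate(train_idx, test_idx)` is a hypothetical callback that fits a
    model on the training indices and returns a score on the test indices.
    Returns (mean, population standard deviation) of the per-repeat scores.
    """
    rng = random.Random(seed)
    per_repeat = []
    for _ in range(repeats):
        folds = k_fold_indices(n_items, k, rng)
        fold_scores = []
        for i, test_idx in enumerate(folds):
            # Training set: every fold except the one held out for testing.
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            fold_scores.append(evaluate(train_idx, test_idx))
        per_repeat.append(statistics.mean(fold_scores))
    return statistics.mean(per_repeat), statistics.pstdev(per_repeat)
```

A low spread of scores across repeats suggests that the measured model quality does not depend on one fortunate split of the data.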
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.