{"abstracts":[{"sha1":"fcabcdeeacb1f31079af5280139a0f616e35cbed","content":"SVM with an RBF kernel is usually one of the best classification algorithms\nfor most data sets, but it is important to tune the two hyperparameters C and\nγ to the data itself. In general, the selection of the hyperparameters\nis a non-convex optimization problem and thus many algorithms have been\nproposed to solve it, among them: grid search, random search, Bayesian\noptimization, simulated annealing, particle swarm optimization, Nelder-Mead,\nand others. There have also been proposals to decouple the selection of\nγ and C. We empirically compare 18 of these proposed search algorithms\n(with different parameterizations for a total of 47 combinations) on 115\nreal-life binary data sets. We find (among other things) that trees of Parzen\nestimators and particle swarm optimization select better hyperparameters with\nonly a slight increase in computation time with respect to a grid search with\nthe same number of evaluations. We also find that spending too much\ncomputational effort searching the hyperparameters is unlikely to result in\nbetter performance on future data and that there are no significant\ndifferences among the different procedures to select the best set of\nhyperparameters when more than one is found by the search algorithms.","mimetype":"text/plain","lang":"en"},{"sha1":"368b5e76ba687a57f9f399b44c09c6debb1e3b3b","content":"SVM with an RBF kernel is usually one of the best classification algorithms\nfor most data sets, but it is important to tune the two hyperparameters $C$ and\n$\gamma$ to the data itself. In general, the selection of the hyperparameters\nis a non-convex optimization problem and thus many algorithms have been\nproposed to solve it, among them: grid search, random search, Bayesian\noptimization, simulated annealing, particle swarm optimization, Nelder-Mead,\nand others. There have also been proposals to decouple the selection of\n$\gamma$ and $C$. 
We empirically compare 18 of these proposed search algorithms\n(with different parameterizations for a total of 47 combinations) on 115\nreal-life binary data sets. We find (among other things) that trees of Parzen\nestimators and particle swarm optimization select better hyperparameters with\nonly a slight increase in computation time with respect to a grid search with\nthe same number of evaluations. We also find that spending too much\ncomputational effort searching the hyperparameters is unlikely to result in\nbetter performance on future data and that there are no significant\ndifferences among the different procedures to select the best set of\nhyperparameters when more than one is found by the search algorithms.","mimetype":"application/x-latex","lang":"en"}],"refs":[],"contribs":[{"index":0,"raw_name":"Jacques Wainer","role":"author"},{"index":1,"raw_name":"Pablo Fonseca","role":"author"}],"license_slug":"ARXIV-1.0","language":"en","version":"v1","ext_ids":{"arxiv":"2008.11655v1"},"release_year":2020,"release_date":"2020-08-26","release_stage":"submitted","release_type":"article","webcaptures":[],"filesets":[],"files":[{"release_ids":["cdlgg5hdlfakddjxqgcaad3lou"],"mimetype":"application/pdf","urls":[{"url":"https://arxiv.org/pdf/2008.11655v1.pdf","rel":"repository"},{"url":"https://web.archive.org/web/20200829193618/https://arxiv.org/pdf/2008.11655v1.pdf","rel":"webarchive"}],"sha256":"a9ed6d7775d1740f8cdf8964e755e3950ded799c4d83849eb3c6fc536ddf217c","sha1":"c66c7b232dc776908aad1abaed9ddfe452f4a96e","md5":"717e415669121c5451ae892c35861974","size":494295,"revision":"7489cff4-aea3-4e85-96c2-9618cd961638","ident":"2euripzpergxpbfwjq4tvzkeci","state":"active"}],"work_id":"ceyvuuaornf4hp6sgqpx2bhkgy","title":"How to tune the RBF SVM hyperparameters?: An empirical evaluation of 18 search 
algorithms","state":"active","ident":"cdlgg5hdlfakddjxqgcaad3lou","revision":"bc9d74d8-b084-4d67-af9f-4d608ab062c9","extra":{"arxiv":{"base_id":"2008.11655","categories":["cs.LG","stat.ML"]}}}