Abstract
Machine learning algorithms are broadly applicable, but finding a good configuration for a given task generally requires considerable effort. Tuning free parameters, for example, directly affects an algorithm's performance but is often carried out ad hoc. An alternative is to frame tuning as a search over the parameter space, which can be computationally expensive and slow. Furthermore, when the algorithm is applied to a different problem, all of this work must be redone from scratch. Transfer learning can be used to avoid this rework. In this paper we propose an approach to parameter tuning by means of transfer learning. The idea is to use data complexity characterization measures to evaluate the similarity among datasets and to assess whether similar datasets share good parameter configurations. To evaluate our approach, four performance measures were used: area under the ROC curve (AUC), accuracy, F1, and area under the precision-recall curve. Results show that the proposed approach may reduce the search space for parameter tuning by exploiting parameter recommendations from similar datasets, and that it provides competitive performance compared to other widely used techniques.
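The core idea of recommending parameters from similar datasets can be illustrated with a minimal sketch. It is not the authors' implementation: the dataset names, the three-value complexity vectors, the SVM parameters `C` and `gamma`, and the helper names `zscore_columns` and `recommend_parameters` are all hypothetical, and Euclidean distance over standardized measures is assumed as the similarity criterion.

```python
import math


def zscore_columns(rows):
    """Standardize each column so that no single measure dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def recommend_parameters(known, new_features, k=2):
    """Return the parameter settings of the k known datasets closest to the new one.

    known: list of (name, complexity_vector, params) tuples.
    """
    vectors = [features for _, features, _ in known] + [new_features]
    scaled = zscore_columns(vectors)
    target = scaled[-1]
    ranked = sorted(
        zip(known, scaled[:-1]),
        key=lambda pair: math.dist(pair[1], target),
    )
    return [(name, params) for (name, _, params), _ in ranked[:k]]


# Hypothetical complexity vectors and previously tuned parameters (illustrative only).
KNOWN = [
    ("dataset_a", [0.90, 0.10, 0.05], {"C": 1.0, "gamma": 0.01}),
    ("dataset_b", [0.20, 0.60, 0.40], {"C": 100.0, "gamma": 1.0}),
    ("dataset_c", [0.85, 0.15, 0.10], {"C": 10.0, "gamma": 0.1}),
]

candidates = recommend_parameters(KNOWN, [0.88, 0.12, 0.07], k=2)
for name, params in candidates:
    print(name, params)
```

Under these assumptions, only the settings recommended by the nearest datasets would need to be evaluated on the new dataset, which is one way the search space could shrink compared with an exhaustive search.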
Cite this article
Biondi, G.O., Prati, R.C. Setting Parameters for Support Vector Machines using Transfer Learning. J Intell Robot Syst 80 (Suppl 1), 295–311 (2015). https://doi.org/10.1007/s10846-014-0159-x
DOI: https://doi.org/10.1007/s10846-014-0159-x