7 - Massive SVM Parallelization Using Hardware Accelerators

from Part Two - Supervised and Unsupervised Learning Algorithms

Published online by Cambridge University Press:  05 February 2012

Igor Durdanovic
Affiliation:
NEC Labs America, Princeton, NJ, USA
Eric Cosatto
Affiliation:
NEC Labs America, Princeton, NJ, USA
Hans Peter Graf
Affiliation:
NEC Labs America, Princeton, NJ, USA
Srihari Cadambi
Affiliation:
NEC Labs America, Princeton, NJ, USA
Venkata Jakkula
Affiliation:
NEC Labs America, Princeton, NJ, USA
Srimat Chakradhar
Affiliation:
NEC Labs America, Princeton, NJ, USA
Abhinandan Majumdar
Affiliation:
NEC Labs America, Princeton, NJ, USA
Ron Bekkerman
Affiliation:
LinkedIn Corporation, Mountain View, California
Mikhail Bilenko
Affiliation:
Microsoft Research, Redmond, Washington
John Langford
Affiliation:
Yahoo! Research, New York

Summary

Support Vector Machines (SVMs) are among the most widely used classification and regression algorithms for data analysis, pattern recognition, or cognitive tasks. Yet the size of the learning problems that SVMs can solve is limited by high computational cost and excessive storage requirements. Many variations of the original SVM algorithm that scale better to large problems have been introduced. They change the SVM framework quite drastically, for example by applying optimization criteria other than the maximum margin or by introducing different error metrics for the cost function. Such algorithms may work for some applications, but they lack the robustness and universality that make SVMs so popular.

The approach taken here is to maintain the SVM algorithm in its original form and scale it to large problems through parallelization. Computer performance can no longer be improved at the pace of the last few decades by increasing clock frequencies. Today, significant acceleration is achieved mostly through parallel architectures, and multicore processors are commonplace. Mapping the SVM algorithm to multicore processors with shared-memory architectures is straightforward, yet this approach does not scale to a large number of processors. Here we investigate parallelization concepts that scale to hundreds and thousands of cores, where, for example, cache coherence can no longer be maintained.

A number of SVM implementations on clusters or graphics processors (GPUs) have been proposed recently. A parallel optimization algorithm based on gradient projections has been demonstrated (see Zanghirati and Zanni, 2003; Zanni, Serafini, and Zanghirati, 2006) that uses a spectral gradient method for fast convergence while maintaining the Karush-Kuhn-Tucker (KKT) constraints.
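
To make the idea of gradient projection with a spectral step length concrete, here is a minimal serial sketch. It is not the solver of Zanghirati and Zanni: it assumes an SVM dual without a bias term, so the only constraints are the box constraints 0 <= alpha_i <= C and the projection reduces to clipping, and it omits the line search and the parallel decomposition that a practical solver would need. The function names (`rbf_kernel`, `train_box_svm`, `decision`) and the RBF kernel choice are illustrative assumptions.

```python
import math


def rbf_kernel(u, v, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def train_box_svm(X, y, C=10.0, gamma=1.0, iters=200, tol=1e-12):
    """Minimize 0.5*a'Qa - sum(a) subject to 0 <= a_i <= C.

    Assumption: no bias term, hence no equality constraint on a.
    Uses projected gradient steps with a Barzilai-Borwein (spectral)
    step length and no line search.
    """
    n = len(X)
    # Q[i][j] = y_i * y_j * K(x_i, x_j)
    Q = [[y[i] * y[j] * rbf_kernel(X[i], X[j], gamma) for j in range(n)]
         for i in range(n)]

    def gradient(alpha):
        return [sum(Q[i][j] * alpha[j] for j in range(n)) - 1.0
                for i in range(n)]

    def project(alpha):
        # Projection onto the box [0, C]^n is element-wise clipping.
        return [min(C, max(0.0, a)) for a in alpha]

    alpha = [0.0] * n
    grad = gradient(alpha)
    step = 1.0
    for _ in range(iters):
        new_alpha = project([a - step * g for a, g in zip(alpha, grad)])
        new_grad = gradient(new_alpha)
        s = [na - a for na, a in zip(new_alpha, alpha)]   # change in alpha
        d = [ng - g for ng, g in zip(new_grad, grad)]     # change in gradient
        alpha, grad = new_alpha, new_grad
        ss = sum(v * v for v in s)
        if ss < tol:  # the iterate has stopped moving
            break
        sd = sum(a * b for a, b in zip(s, d))
        # Spectral step length; fall back to 1.0 if curvature is not positive.
        step = ss / sd if sd > 0.0 else 1.0
    return alpha


def decision(X, y, alpha, x, gamma=1.0):
    """Bias-free decision value for a new point x."""
    return sum(a * yi * rbf_kernel(xi, x, gamma)
               for a, yi, xi in zip(alpha, y, X))
```

The gradient evaluation dominates the cost, since it touches every row of the kernel matrix, which is why this step is the natural target for parallelization.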

Type: Chapter
Information: Scaling up Machine Learning: Parallel and Distributed Approaches, pp. 127–147
Publisher: Cambridge University Press
Print publication year: 2011

References

Cadambi, S., Durdanovic, I., Jakkula, V., Sankaradass, M., Cosatto, E., Chakradhar, S., and Graf, H. P. 2009. A Massively Parallel FPGA-Based Coprocessor for Support Vector Machines. Field-Programmable Custom Computing Machines, Annual IEEE Symposium on, 0, 115–122.
Cadambi, S., Majumdar, A., Becchi, M., Chakradhar, S. T., and Graf, H. P. 2010. A Programmable Parallel Accelerator for Learning and Classification.
Catanzaro, B., Sundaram, N., and Keutzer, K. 2008. Fast Support Vector Machine Training and Classification on Graphics Processors. Pages 104–111 of: Proceedings of the 25th International Conference on Machine Learning (ICML 2008).
Chatterji, S., Narayanan, M., Duell, J., and Oliker, L. 2003. Performance Evaluation of Two Emerging Media Processors: VIRAM and Imagine. Page 229 of: IPDPS.
D'Apuzzo, M., and Marino, M. 2003. Parallel computational issues of an interior point method for solving large bound-constrained quadratic programming problems. Parallel Computing, 29(4), 467–483.
Diamond, J. R., Robatmili, B., Keckler, S. W., van de Geijn, R. A., Goto, K., and Burger, D. 2008. High Performance Dense Linear Algebra on a Spatially Distributed Processor. Pages 63–72 of: PPOPP.
Durdanovic, I., Cosatto, E., and Graf, H. P. 2007. Large Scale Parallel SVM Implementation. In: Bottou, L., Chapelle, O., DeCoste, D., and Weston, J. (eds), Large Scale Kernel Machines. Cambridge, MA: MIT Press.
Fan, R.-E., Chen, P.-H., and Lin, C.-J. 2005. Working Set Selection Using Second Order Information for Training Support Vector Machines. Journal of Machine Learning Research, 6, 1889–1918.
Graf, H. P., Cosatto, E., Bottou, L., Durdanovic, I., and Vapnik, V. 2005. Parallel Support Vector Machines: The Cascade SVM. Pages 521–528 of: Saul, L. K., Weiss, Y., and Bottou, L. (eds), Advances in Neural Information Processing Systems 17. Cambridge, MA: MIT Press.
Kapasi, U. J., Rixner, S., Dally, W. J., Khailany, B., Ahn, J. H., Mattson, P. R., and Owens, J. D. 2003. Programmable Stream Processors. IEEE Computer, 36(8), 54–62.
Kelm, J. H., Johnson, D. R., Johnson, M. R., Crago, N. C., Tuohy, W., Mahesri, A., Lumetta, S. S., Frank, M. I., and Patel, S. J. 2009. Rigel: An Architecture and Scalable Programming Interface for a 1000-Core Accelerator. Pages 140–151 of: ISCA.
Owens, J. D., Luebke, D., Govindaraju, N., Harris, M., Krueger, J., Lefohn, A. E., and Purcell, T. J. 2007. A Survey of General-Purpose Computation on Graphics Hardware. Computer Graphics Forum, 26(1), 80–113.
Platt, J. 1999. Fast Training of Support Vector Machines Using Sequential Minimal Optimization. Pages 185–208 of: Schölkopf, B., Burges, C. J. C., and Smola, A. J. (eds), Advances in Kernel Methods – Support Vector Learning. Cambridge, MA: MIT Press.
Rousseaux, S., Hubaux, D., Guisset, P., and Legat, J. 2007. A High Performance FPGA-Based Accelerator for BLAS Library Implementation. In: Proceedings of the Third Annual Reconfigurable Systems Summer Institute (RSSI'07).
Seiler, L., Carmean, D., Sprangle, E., Forsyth, T., Abrash, M., Dubey, P., Junkins, S., Lake, A., Sugerman, J., Cavin, R., Espasa, R., Grochowski, E., Juan, T., and Hanrahan, P. 2008. Larrabee: A Many-Core x86 Architecture for Visual Computing. ACM Transactions on Graphics, 27(3).
Taylor, M. B., Kim, J. S., Miller, J. E., Wentzlaff, D., Ghodrat, F., Greenwald, B., Hoffmann, H., Johnson, P., Lee, J.-W., Lee, W., Ma, A., Saraf, A., Seneski, M., Shnidman, N., Strumpen, V., Frank, M., Amarasinghe, S. P., and Agarwal, A. 2002. The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs. IEEE Micro, 22(2), 25–35.
Zanghirati, G., and Zanni, L. 2003. A Parallel Solver for Large Quadratic Programs in Training Support Vector Machines. Parallel Computing, 29(4), 535–551.
Zanni, L., Serafini, T., and Zanghirati, G. 2006. Parallel Software for Training Large Scale Support Vector Machines on Multiprocessor Systems. Journal of Machine Learning Research, 1467–1492.
Zhuo, L., and Prasanna, V. K. 2005. High Performance Linear Algebra Operations on Reconfigurable Systems. Page 2 of: SC.