Cambridge University Press

Scaling up Machine Learning
Parallel and Distributed Approaches

$90.00 (Z)

Ron Bekkerman, Mikhail Bilenko, John Langford, Biswanath Panda, Joshua S. Herbach, Sugato Basu, Roberto J. Bayardo, Mihai Budiu, Dennis Fetterly, Michael Isard, Frank McSherry, Yuan Yu, Edwin Pednault, Elad Yom-Tov, Amol Ghoting, Meichun Hsu, Ren Wu, Bin Zhang, Edward Y. Chang, Hongjie Bai, Kaihua Zhu, Hao Wang, Jian Li, Zhihuan Qiu, Igor Durdanovic, Eric Cosatto, Hans Peter Graf, Srihari Cadambi, Venkata Jakkula, Srimat Chakradhar, Abhinandan Majumdar, Krysta M. Svore, Christopher J. C. Burges, Ramesh Natarajan, Joseph Gonzalez, Yucheng Low, Carlos Guestrin, Arthur Asuncion, Padhraic Smyth, Max Welling, David Newman, Ian Porteous, Scott Triglia, Wen-Yen Chen, Yangqiu Song, Chih-Jen Lin, Martin Scholz, Daniel Hsu, Nikos Karampatziakis, Alex J. Smola, Jeff Bilmes, Amarnag Subramanya, Evan Xiang, Nathan Liu, Qiang Yang, Jeremy Kubica, Sameer Singh, Daria Sorokina, Adam Coates, Rajat Raina, Andrew Y. Ng, Clement Farabet, Yann LeCun, Koray Kavukcuoglu, Berin Martini, Polina Akselrod, Selcuk Talay, Eugenio Culurciello, Shirish Tatikonda, Srinivasan Parthasarathy, Jike Chong, Ekaterina Gonina, Kisun You, Kurt Keutzer
  • Date Published: December 2011
  • availability: In stock
  • format: Hardback
  • isbn: 9780521192248

Other available formats:
eBook


Looking for an examination copy?

This title is not currently available for examination. However, if you are interested in the title for your course we can consider offering an examination copy. To register your interest please contact collegesales@cambridge.org providing details of the course you are teaching.

Description
  • This book presents an integrated collection of representative approaches for scaling up machine learning and data mining methods on parallel and distributed computing platforms. Demand for parallelizing learning algorithms is highly task-specific: in some settings it is driven by the enormous dataset sizes, in others by model complexity or by real-time performance requirements. Making task-appropriate algorithm and platform choices for large-scale machine learning requires understanding the benefits, trade-offs, and constraints of the available options. Solutions presented in the book cover a range of parallelization platforms from FPGAs and GPUs to multi-core systems and commodity clusters, concurrent programming frameworks including CUDA, MPI, MapReduce, and DryadLINQ, and learning settings (supervised, unsupervised, semi-supervised, and online learning). Extensive coverage of parallelization of boosted trees, SVMs, spectral clustering, belief propagation and other popular learning algorithms and deep dives into several applications make the book equally useful for researchers, students, and practitioners.

    • Comprehensive view of modern machine learning, covering most of the contemporary research on large-scale problems
    • Presents methods for scaling up a wide array of learning tasks, including classification, clustering, regression and feature selection
    • Shows how to run state-of-the-art machine learning algorithms, such as boosted decision trees and SVMs, on multiple parallel-computing platforms

    Reviews & endorsements

    "One of the landmark achievements of our time is the ability to extract value from large volumes of data. Engineering and algorithmic developments on this front have gelled substantially in recent years, and are quickly being reduced to practice in widely-available, reusable forms. This book provides a broad and timely snapshot of the state of developments in scalable machine learning, which should be of interest to anyone who wishes to understand and extend the state of the art in analyzing data."
    Joseph M. Hellerstein, University of California, Berkeley

    "This is a book that every machine learning practitioner should keep in their library."
    Yoram Singer, Google Inc.

    "This unique, timely book provides a 360-degree view and understanding of both conceptual and practical issues that arise when implementing leading machine learning algorithms on a wide range of parallel and high-performance computing platforms. It will serve as an indispensable handbook for the practitioner of large-scale data analytics and a guide to dealing with BIG data and making sound choices for efficiently applying learning algorithms to them. It can also serve as the basis for an attractive graduate course on Parallel/Distributed Machine Learning and Data Mining."
    Joydeep Ghosh, University of Texas

    "The contributions in this book run the gamut from frameworks for large-scale learning to parallel algorithms to applications, and contributors include many of the top people in this burgeoning subfield. Overall this book is an invaluable resource for anyone interested in the problem of learning from and working with big datasets."
    William W. Cohen, Carnegie Mellon University

    "... an excellent resource for researchers in the field."
    J. Arul for Computing Reviews

    Product details

    • Date Published: December 2011
    • format: Hardback
    • isbn: 9780521192248
    • length: 492 pages
    • dimensions: 263 x 186 x 32 mm
    • weight: 1 kg
    • contains: 144 b/w illus.
    • availability: In stock
  • Table of Contents

    1. Scaling up machine learning: introduction Ron Bekkerman, Mikhail Bilenko and John Langford
    Part I. Frameworks for Scaling Up Machine Learning:
    2. MapReduce and its application to massively parallel learning of decision tree ensembles Biswanath Panda, Joshua S. Herbach, Sugato Basu and Roberto J. Bayardo
    3. Large-scale machine learning using DryadLINQ Mihai Budiu, Dennis Fetterly, Michael Isard, Frank McSherry and Yuan Yu
    4. IBM parallel machine learning toolbox Edwin Pednault, Elad Yom-Tov and Amol Ghoting
    5. Uniformly fine-grained data parallel computing for machine learning algorithms Meichun Hsu, Ren Wu and Bin Zhang
    Part II. Supervised and Unsupervised Learning Algorithms:
    6. PSVM: parallel support vector machines with incomplete Cholesky factorization Edward Y. Chang, Hongjie Bai, Kaihua Zhu, Hao Wang, Jian Li and Zhihuan Qiu
    7. Massive SVM parallelization using hardware accelerators Igor Durdanovic, Eric Cosatto, Hans Peter Graf, Srihari Cadambi, Venkata Jakkula, Srimat Chakradhar and Abhinandan Majumdar
    8. Large-scale learning to rank using boosted decision trees Krysta M. Svore and Christopher J. C. Burges
    9. The transform regression algorithm Ramesh Natarajan and Edwin Pednault
    10. Parallel belief propagation in factor graphs Joseph Gonzalez, Yucheng Low and Carlos Guestrin
    11. Distributed Gibbs sampling for latent variable models Arthur Asuncion, Padhraic Smyth, Max Welling, David Newman, Ian Porteous and Scott Triglia
    12. Large-scale spectral clustering with MapReduce and MPI Wen-Yen Chen, Yangqiu Song, Hongjie Bai, Chih-Jen Lin and Edward Y. Chang
    13. Parallelizing information-theoretic clustering methods Ron Bekkerman and Martin Scholz
    Part III. Alternative Learning Settings:
    14. Parallel online learning Daniel Hsu, Nikos Karampatziakis, John Langford and Alex J. Smola
    15. Parallel graph-based semi-supervised learning Jeff Bilmes and Amarnag Subramanya
    16. Distributed transfer learning via cooperative matrix factorization Evan Xiang, Nathan Liu and Qiang Yang
    17. Parallel large-scale feature selection Jeremy Kubica, Sameer Singh and Daria Sorokina
    Part IV. Applications:
    18. Large-scale learning for vision with GPUs Adam Coates, Rajat Raina and Andrew Y. Ng
    19. Large-scale FPGA-based convolutional networks Clement Farabet, Yann LeCun, Koray Kavukcuoglu, Berin Martini, Polina Akselrod, Selcuk Talay and Eugenio Culurciello
    20. Mining tree-structured data on multicore systems Shirish Tatikonda and Srinivasan Parthasarathy
    21. Scalable parallelization of automatic speech recognition Jike Chong, Ekaterina Gonina, Kisun You and Kurt Keutzer.

  • Editors

    Ron Bekkerman, LinkedIn Corporation, Mountain View, California
    Dr Ron Bekkerman is a computer engineer and scientist whose experience spans disciplines from video processing to business intelligence. Currently a senior research scientist at LinkedIn, he previously worked for a number of major companies including Hewlett-Packard and Motorola. Bekkerman's research interests lie primarily in the area of large-scale unsupervised learning. He is the corresponding author of several publications in top-tier venues, such as ICML, KDD, SIGIR, WWW, IJCAI, CVPR, EMNLP and JMLR.

    Mikhail Bilenko, Microsoft Research, Redmond, Washington
    Dr Mikhail Bilenko is a researcher in the Machine Learning and Intelligence group at Microsoft Research. His research interests center on machine learning and data mining tasks that arise in the context of large behavioral and textual datasets. Bilenko's recent work has focused on learning algorithms that leverage user behavior to improve online advertising. His papers have been published at KDD, ICML, SIGIR, and WWW among other venues, and he has received best paper awards from SIGIR and KDD.

    John Langford, Yahoo! Research, New York
    Dr John Langford is a computer scientist working as a senior researcher at Yahoo! Research. Previously, he was affiliated with the Toyota Technological Institute and IBM T. J. Watson Research Center. Langford's work has been published at conferences and in journals including ICML, COLT, NIPS, UAI, KDD, JMLR and MLJ. He received the Pat Goldberg Memorial Best Paper Award, as well as best paper awards from ACM EC and WSDM. He is also the author of the popular machine learning weblog, hunch.net.
