While previous chapters discussed deep learning recommender systems from a theoretical and algorithmic perspective, this chapter shifts focus to the engineering platform that supports their implementation. Recommender systems are divided into two key components: data and model. The data aspect involves the engineering of the data pipeline, while the model aspect is split between offline training and online serving. This chapter is structured into three parts: (1) the data pipeline framework and big data platform technologies; (2) popular platforms for offline training of recommendation models like Spark MLlib, TensorFlow, and PyTorch; and (3) online deployment and serving of deep learning recommendation models. Additionally, the chapter covers the trade-offs between engineering execution and theoretical considerations, offering insights into how algorithm engineers can balance these aspects in practice.
A graduate-level introduction to advanced topics in Markov chain Monte Carlo (MCMC), as applied broadly in the Bayesian computational context. The topics covered have emerged as recently as the last decade and include stochastic gradient MCMC, non-reversible MCMC, continuous time MCMC, and new techniques for convergence assessment. A particular focus is on cutting-edge methods that are scalable with respect to either the amount of data, or the data dimension, motivated by the emerging high-priority application areas in machine learning and AI. Examples are woven throughout the text to demonstrate how scalable Bayesian learning methods can be implemented. This text could form the basis for a course and is sure to be an invaluable resource for researchers in the field.
Recommender systems are ubiquitous in modern life and are one of the main monetization channels for Internet technology giants. This book helps graduate students, researchers and practitioners to get to grips with this cutting-edge field and build the thorough understanding and practical skills needed to progress in the area. It not only introduces the applications of deep learning and generative AI for recommendation models, but also focuses on the industry architecture of the recommender systems. The authors include a detailed discussion of the implementation solutions used by companies such as YouTube, Alibaba, Airbnb and Netflix, as well as the related machine learning framework including model serving, model training, feature storage and data stream processing.
Machine learning has become a dominant problem-solving technique in the modern world, with applications ranging from search engines and social media to self-driving cars and artificial intelligence. This lucid textbook presents the theoretical foundations of machine learning algorithms, and then illustrates each concept with its detailed implementation in Python to allow beginners to effectively implement the principles in real-world applications. All major techniques, such as regression, classification, clustering, deep learning, and association mining, have been illustrated using step-by-step coding instructions to help inculcate a 'learning by doing' approach. The book has no prerequisites, and covers the subject from the ground up, including a detailed introductory chapter on the Python language. As such, it is going to be a valuable resource not only for students of computer science, but also for anyone looking for a foundation in the subject, as well as professionals looking for a ready reckoner.
Chapter 12 is the conclusion. It presents a discussion of how the components of performance evaluation for learning algorithms discussed throughout the book unify into an overall framework for in-laboratory evaluation. This is followed by a discussion of how to move from a laboratory setting to a deployment setting based on the material covered in the last part of the book. We then discuss the potential social consequences of machine learning technology deployment together with their causes, and advocate for the consideration of these consequences as part of the evaluation framework. We follow this discussion with a few concluding remarks.
Chapter 4 reviews frequently used machine learning evaluation procedures. In particular, it presents popular evaluation metrics for binary and multi-class classification (e.g., accuracy, precision/recall, ROC analysis), regression analysis (e.g., mean squared error, root mean squared error, R-squared), and clustering (e.g., the Davies–Bouldin index). It then reviews popular resampling approaches (e.g., holdout, cross-validation) and statistical tests (e.g., the t-test and the sign test). It concludes with an explanation of why it is important to go beyond these well-known methods in order to achieve reliable evaluation results in all cases.
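As a minimal illustration of the binary-classification metrics the chapter names, the sketch below computes accuracy, precision, and recall from a confusion-matrix count. The function name and label encoding (1 = positive, 0 = negative) are assumptions for this example, not notation from the book.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # guard against no positive predictions
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # guard against no positive labels
    return accuracy, precision, recall

# Example: 2 true positives, 1 false positive, 1 false negative, 2 true negatives
acc, prec, rec = binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
```

The same counts also feed ROC analysis (true-positive rate vs. false-positive rate across thresholds), which the chapter covers in more depth.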
Chapter 6 addresses the problem of error estimation and resampling in both a theoretical and practical manner. The holdout method is reviewed and cast into the bias/variance framework. Simple resampling approaches such as cross-validation are also reviewed and important variations such as stratified cross-validation and leave-one-out are introduced. Multiple resampling approaches such as bootstrapping, randomization, and multiple trials of simple resampling approaches are then introduced and discussed.
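The k-fold cross-validation procedure discussed here can be sketched in a few lines: partition the indices into k folds, then train on k-1 folds and score on the held-out fold, averaging the scores. The `train_and_score` callback is a hypothetical name introduced for this sketch, standing in for whatever model-fitting-and-evaluation routine is used.

```python
def k_fold_indices(n, k):
    """Partition indices 0..n-1 into k contiguous folds of near-equal size."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(data, k, train_and_score):
    """Average the score of train_and_score(train, test) over k held-out folds."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for test_idx in folds:
        train_idx = [j for f in folds if f is not test_idx for j in f]
        scores.append(train_and_score([data[j] for j in train_idx],
                                      [data[j] for j in test_idx]))
    return sum(scores) / len(folds)

# Trivial usage: with 10 points and k=5, each training split holds 8 points.
avg = cross_validate(list(range(10)), 5, lambda train, test: len(train))
```

Variants such as stratified cross-validation (balancing class proportions per fold) and leave-one-out (k = n) change only how the folds are constructed.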
Chapter 2 reviews the principles of statistics that are necessary for the discussion of machine learning evaluation methods, especially the statistical analysis discussion of Chapter 7. In particular, it reviews the notions of random variables, distributions, confidence intervals, and hypothesis testing.
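As a concrete instance of one reviewed notion, the sketch below computes a normal-approximation confidence interval for a sample mean. The function name and the choice of z = 1.96 (a 95% interval under the normal approximation) are assumptions for this example; the chapter itself treats these ideas in full generality.

```python
import math

def mean_confidence_interval(xs, z=1.96):
    """Normal-approximation confidence interval for the mean of xs (z=1.96 ~ 95%)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)  # unbiased sample variance
    half_width = z * math.sqrt(var / n)               # standard error scaled by z
    return mean - half_width, mean + half_width

lo, hi = mean_confidence_interval([1, 2, 3, 4, 5])
```

For small samples one would typically replace z with the appropriate t-quantile, which connects this review to the hypothesis-testing material used later in the book.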
In Chapter 10, the book turns to practical considerations. In particular, it surveys the software engineering discipline with its rigorous software testing methods, and asks how these techniques can be adapted to the subfield of machine learning. The adaptation is not straightforward, as machine learning algorithms behave in non-deterministic ways aggravated by data, algorithm, and platform imperfections. These issues are discussed and some of the steps taken to handle them are reviewed. The chapter then turns to the practice of online testing and addresses the ethics of machine learning deployment. The chapter concludes with a discussion of current industry practice along with suggestions on how to improve the safety of industrial deployment in the future.