Machine Learning with Python: Principles and Practical Techniques

Parteek Bhatia

doi:10.1017/9781009170239

Chapter Objectives

• To understand the differences between classification and regression techniques.
• To define classification and understand the types of classification.
• To understand the working principles of various classification techniques.
• To comprehend the decision tree classifier.
• To know the importance of information gain and Gini index in decision tree classifier.
• To comprehend the random forest algorithm.
• To discuss the working of the naive Bayes classification.
• To comprehend the working principle of the k-NN classifier.
• To comprehend the working of logistic regression classifier.
• To understand different quality metrics of the classifier like confusion matrix, precision, recall, and F-measure.

9.1 Introduction to Classification

We rely on machine learning (ML) to make critical decisions or predictions in the modern world. It is very important to understand how computers by using ML make these predictions. Usually, the predictions made by ML models are classified into two types, i.e., classification and regression. The ML models use various techniques to predict the outcome of an event by analyzing already available data. As machines learn from data, the type of training or input data plays a crucial role in deciding the machine's capability to make accurate decisions and predictions. Usually, this data is available in two forms, i.e., labeled and unlabeled. In label data, we know the value of the output attribute for the sample input attributes, while in unlabeled data, we do not have the output attribute value.

For analyzing labeled data, supervised learning is used. Classification and regression are the two types of supervised learning techniques used to predict the outcome of an unknown instance by analyzing the available labeled input instances. Classification is applied when the outcome is finite or discrete, while the regression model is applied when the outcome is infinite or continuous. For example, a classification model is used to predict whether a customer will buy a product or not. Here the outcome is finite, i.e., buying the product or not buying. In this case, the regression model predicts the number of products that the customer may buy. Here the outcome is infinite, i.e., all possible numbers, since the term quantity refers to a set of continuous numbers.

Machine Learning with Python Principles and Practical Techniques

Chapter 9: Classification

Extract

About the book

Access options

Purchase options

Have an access code?