
Chapter 9: Classification

pp. 359-494

Authors

, Washington State University, USA

Extract

Chapter Objectives

  • To understand the differences between classification and regression techniques.

  • To define classification and understand the types of classification.

  • To understand the working principles of various classification techniques.

  • To comprehend the decision tree classifier.

  • To know the importance of information gain and the Gini index in the decision tree classifier.

  • To comprehend the random forest algorithm.

  • To discuss the working of naive Bayes classification.

  • To comprehend the working principle of the k-NN classifier.

  • To comprehend the working of the logistic regression classifier.

  • To understand different quality metrics of a classifier, such as the confusion matrix, precision, recall, and F-measure.

9.1 Introduction to Classification

We rely on machine learning (ML) to make critical decisions and predictions in the modern world, so it is important to understand how ML models arrive at them. The prediction tasks handled by ML models are usually grouped into two types: classification and regression. In both, the model predicts the outcome of an event by analyzing already available data. Because machines learn from data, the kind of training data plays a crucial role in determining how accurate the resulting decisions and predictions can be. This data usually comes in two forms: labeled and unlabeled. In labeled data, we know the value of the output attribute for each sample of input attributes, while in unlabeled data the output attribute value is not available.
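The distinction between labeled and unlabeled data can be illustrated with a minimal Python sketch. The record type `Sample` and its attributes (`age`, `monthly_visits`, `bought`) are hypothetical names chosen for illustration; they do not come from the chapter.

```python
from typing import NamedTuple, Optional


class Sample(NamedTuple):
    age: int                       # input attribute (hypothetical)
    monthly_visits: int            # input attribute (hypothetical)
    bought: Optional[bool] = None  # output attribute; None means "not known"


# Toy records invented for this sketch.
samples = [
    Sample(25, 4, True),
    Sample(40, 1, False),
    Sample(31, 3),   # output attribute missing
    Sample(52, 0),   # output attribute missing
]

# Labeled data carries the output attribute; unlabeled data does not.
labeled = [s for s in samples if s.bought is not None]
unlabeled = [s for s in samples if s.bought is None]

print(len(labeled), "labeled;", len(unlabeled), "unlabeled")
```

Only the `labeled` records could be used to train a supervised model, because only they pair the input attributes with a known output.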

Supervised learning is used to analyze labeled data. Classification and regression are the two types of supervised learning techniques; both predict the outcome for an unknown instance by analyzing the available labeled instances. Classification is applied when the outcome takes one of a finite set of discrete values, while regression is applied when the outcome is a numeric value drawn from a continuous (and therefore infinite) range. For example, a classification model can predict whether a customer will buy a product or not. Here the outcome is finite: buy or not buy. A regression model, in contrast, can predict the number of products that the customer may buy. Here the outcome is treated as infinite, i.e., any possible number, since the quantity is regarded as ranging over a continuous set of numbers.
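The customer example can be sketched in a few lines of Python. The sketch below uses a simple k-nearest-neighbor rule (k = 3) purely as an illustrative stand-in, not as the method this section prescribes: the classifier takes a majority vote over discrete labels, while the regressor averages numeric targets. The training data and the names `X`, `will_buy`, `quantity`, `nearest`, `classify`, and `regress` are all assumptions made up for this illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical labeled training data: (age, monthly_visits) per customer.
X = [(22, 8), (25, 6), (47, 1), (52, 2), (33, 5), (60, 0)]
will_buy = ["yes", "yes", "no", "no", "yes", "no"]  # discrete outcome
quantity = [3.0, 2.0, 0.0, 0.0, 1.0, 0.0]           # numeric outcome


def nearest(query, k=3):
    """Return indices of the k training points closest to query."""
    def squared_distance(i):
        return sum((a - b) ** 2 for a, b in zip(X[i], query))
    return sorted(range(len(X)), key=squared_distance)[:k]


def classify(query, k=3):
    """Classification: majority vote over the neighbors' discrete labels."""
    votes = Counter(will_buy[i] for i in nearest(query, k))
    return votes.most_common(1)[0][0]


def regress(query, k=3):
    """Regression: average of the neighbors' numeric targets."""
    return mean(quantity[i] for i in nearest(query, k))


customer = (24, 7)
print(classify(customer))  # a label from the finite set {"yes", "no"}
print(regress(customer))   # a number from a continuous range
```

The same labeled inputs serve both tasks; only the type of the output attribute changes, and with it the way the neighbors' outcomes are combined.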
