Chapter Objectives
• To understand the need for simple linear regression.
• To comprehend the concept of hypothesis and parameters of simple linear regression.
• To understand mathematical modeling of cost function and its minimization.
• To understand the importance and different steps of the gradient descent algorithm.
• To comprehend the mathematical modeling of the gradient descent algorithm.
• To understand the role of learning rate α.
5.1 Introduction to Simple Linear Regression
As discussed in earlier chapters, regression predicts a continuous value or real-valued output. This chapter will discuss how regression works (from a mathematical aspect) to predict the continuous value for the given dataset. Our first learning algorithm is simple linear regression. In this section, we will discuss the fundamental concepts and mathematical modeling of simple linear regression.
We usually have a dependent variable having a continuous value whose value we wish to predict based on one or more independent variables. If we have only one independent or input variable, this situation is known as simple linear regression (also called univariate regression). If we have multiple independent or input variables, it is known as multiple linear regression or multivariate regression.
Linear regression could be used for studying patterns in different real-life scenarios. Consider a research lab where a researcher wants to understand how the stipend is effected by the years of experience, or, in simple words, we wish to predict the stipend based on the years of experience of the researcher. Machine learning (ML) is about learning from past experiences or data. Thus, to predict the researcher's stipend, we have to collect some data about past researchers, specifically their stipend and experience.
In the supervised learning models, we need a dataset called a training set. We will use the dataset as given in Table 5.1 for training the model, and our job will be to build the ML model that learns from this data and hence predicts the stipend of a researcher based on his experience. Here, the stipend will be considered the dependent or output variable because it depends on the researcher's years of experience. Thus, years of experience will be considered an independent or input variable. So, we will use simple linear regression to build the ML model. For proceeding with this problem, we will use a dataset of researchers’ stipends with their corresponding years of experience, as shown in Table 5.1.
Review the options below to login to check your access.
Log in with your Cambridge Aspire website account to check access.
If you believe you should have access to this content, please contact your institutional librarian or consult our FAQ page for further information about accessing our content.