
Chapter 6: Implementing Simple Linear Regression

pp. 241-266

Authors

, Washington State University, USA

Extract

Chapter Objectives

  • To import the necessary libraries and load the dataset.

  • To split the dataset into training and testing datasets.

  • To build the simple linear model and make predictions.

  • To visualize the training set and testing set results.

  • To calculate mean absolute error, mean squared error, and root mean squared error.
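
The three error measures named in the last objective can be sketched in plain Python as follows. This is only an illustrative sketch: the function names and the sample values are assumptions made here, and the chapter's own implementation may rely on a library instead.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean squared error, in the same units as the target."""
    return math.sqrt(mean_squared_error(y_true, y_pred))


# Made-up values for illustration only (not taken from the chapter's dataset).
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 10.0]

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print("MAE:", mae, "MSE:", mse, "RMSE:", rmse)
```

Because the errors are squared before averaging, MSE and RMSE penalize large individual errors more heavily than MAE does.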

In this chapter, we implement simple linear regression in Python. To illustrate the concept, we analyze how the stipend of a researcher is related to their years of research experience. Our aim is to predict a researcher's stipend from their research experience.

Problem Statement and Dataset

To perform this task, we consider a dataset with two attributes: ResearchExperience and Stipend. The dataset contains 30 observations, each pairing a researcher's years of research experience with the corresponding stipend. A research institute wants to quantify the relationship between research experience and stipend. This will help the management offer an appropriate stipend to new research scholars based on their years of research experience, rather than deciding arbitrarily. We expect the stipend to increase with research experience: the greater the experience, the higher the stipend. We will use a simple linear regression model to solve this problem.
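
As a rough sketch of what such a dataset and its training/testing split might look like, the snippet below builds 30 synthetic rows with the two attribute names used above. The values are generated placeholders and are not the chapter's data; the 80/20 split ratio, the seed, and the variable names are likewise assumptions made for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Synthetic placeholder rows: NOT the chapter's dataset.
rows = []
for _ in range(30):
    experience = round(random.uniform(1.0, 10.0), 1)
    # Assumed upward trend plus random noise, purely for illustration.
    stipend = round(20000 + 5000 * experience + random.gauss(0, 2000), 2)
    rows.append({"ResearchExperience": experience, "Stipend": stipend})

# Shuffle, then hold out one fifth of the rows for testing (assumed ratio).
random.shuffle(rows)
test_size = len(rows) // 5
test_rows = rows[:test_size]
train_rows = rows[test_size:]

# Separate the independent variable (X) from the dependent variable (y).
X_train = [r["ResearchExperience"] for r in train_rows]
y_train = [r["Stipend"] for r in train_rows]
X_test = [r["ResearchExperience"] for r in test_rows]
y_test = [r["Stipend"] for r in test_rows]

print(len(train_rows), "training rows,", len(test_rows), "testing rows")
```

Shuffling before the split keeps the held-out rows from depending on the order in which the observations were recorded.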

Let us quickly refresh the idea of a simple linear regression model. Simple linear regression fits the straight line that best describes the relationship between research experience and stipend. Although the dataset is quite simple, it has considerable business value, because the resulting model helps the institute predict a researcher's stipend from their experience. The stipend of a new researcher can therefore be predicted easily, which also promotes transparency in the management's decisions. Here, ResearchExperience (the independent variable) is our X and is plotted on the horizontal axis. Stipend (the dependent variable, which we want to predict) is our Y and is plotted on the vertical axis, as shown in Figure 6.1.
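
To make the idea of a best-fit line concrete, the sketch below fits Y = b0 + b1 * X by ordinary least squares using the closed-form estimates b1 = Sxy / Sxx and b0 = mean(Y) - b1 * mean(X). It is a minimal illustration in plain Python: the function names and the five sample points are made up here, and the chapter itself may use a library model instead.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sxy: co-variation of x and y; Sxx: variation of x.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predict Y for a new X using the fitted line."""
    return intercept + slope * x_new


# Made-up points for illustration only (arbitrary units, not the chapter's data).
experience = [1.0, 2.0, 3.0, 4.0, 5.0]  # X: ResearchExperience
stipend = [30.0, 35.0, 41.0, 44.0, 50.0]  # Y: Stipend

intercept, slope = fit_simple_linear_regression(experience, stipend)
predicted = predict(intercept, slope, 6.0)
print("intercept:", intercept, "slope:", slope, "prediction at X=6:", predicted)
```

The slope estimates how much the stipend changes for each additional year of experience, and the intercept is the fitted stipend at zero years of experience.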
