Machine Learning with Python: Principles and Practical Techniques

Parteek Bhatia

doi:10.1017/9781009170239

Chapter 6: Implementing Simple Linear Regression

pp. 241-266

Parteek Bhatia, Washington State University, USA


Chapter Objectives

• To import the necessary libraries and load the dataset.
• To split the dataset into training and testing datasets.
• To build the simple linear model and make predictions.
• To visualize the training set and testing set results.
• To calculate mean absolute error, mean squared error, and root mean squared error.

In this chapter, we implement simple linear regression in Python. To do so, we analyze how the stipend of a researcher is related to their years of research experience. Our aim is to predict a researcher's stipend from their research experience.

Problem Statement and Dataset

To perform this task, we will consider a dataset consisting of two attributes: ResearchExperience and Stipend. The dataset contains 30 observations, each pairing a researcher's years of research experience with the corresponding stipend. A research institute aims to find the relationship between research experience and stipend. This will help the management offer an appropriate stipend to new research scholars based on their years of research experience, rather than deciding arbitrarily. We expect the stipend to increase with research experience: the more experience, the higher the stipend. We will use a simple linear regression model to solve this problem.
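Before fitting a model, the strength of the linear relationship between the two attributes can be checked with the Pearson correlation coefficient. The following is a minimal sketch that uses only the Python standard library. The 30 observations are synthetic stand-ins generated for illustration, not the chapter's dataset, and the variable names, the stipend formula, and the helper pearson_r are assumptions.

```python
import math
import random

# Hypothetical stand-in data: the chapter's actual 30 observations are not
# reproduced here, so we generate 30 synthetic (experience, stipend) pairs.
rng = random.Random(0)
research_experience = [round(rng.uniform(1, 10), 1) for _ in range(30)]  # years
stipend = [20000 + 3000 * x + rng.gauss(0, 2000) for x in research_experience]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(research_experience, stipend)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 indicates a strong positive linear relationship, which is the situation in which a straight-line model is a reasonable choice.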

Let us quickly refresh the concept of a simple linear regression model. Simple linear regression fits the straight line that best describes the relationship between research experience and stipend. Though the dataset is quite simple, it has considerable business value, as the resulting model will help the institute predict a researcher's stipend from their experience. Using this model, the stipend of a new researcher can be predicted easily, which also promotes transparency in management decisions. Here, ResearchExperience, the independent variable, is our X and is plotted on the horizontal axis. Stipend, the dependent variable to be predicted, is our Y and is plotted on the vertical axis, as shown in Figure 6.1.
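The steps named in the chapter objectives (splitting the data, fitting the line, predicting, and computing the error metrics) can be sketched as follows. This is an illustrative sketch, not the chapter's own code: it deliberately uses only the Python standard library so that each step is visible, it runs on synthetic data rather than the chapter's dataset, and the function names make_data, split_rows, fit_line, and evaluate are hypothetical. Plotting is omitted.

```python
import math
import random


def make_data(n=30, seed=42):
    """Generate hypothetical (experience, stipend) pairs; not the chapter's data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = round(rng.uniform(1, 10), 1)           # years of research experience
        y = 20000 + 3000 * x + rng.gauss(0, 2000)  # assumed stipend with noise
        rows.append((x, y))
    return rows


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(rows):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def evaluate(rows, intercept, slope):
    """Return (MAE, MSE, RMSE) of the fitted line on the given rows."""
    errors = [y - (intercept + slope * x) for x, y in rows]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return mae, mse, math.sqrt(mse)


rows = make_data()
train, test = split_rows(rows)
intercept, slope = fit_line(train)
mae, mse, rmse = evaluate(test, intercept, slope)

print(f"Stipend = {intercept:.2f} + {slope:.2f} * ResearchExperience")
print(f"MAE = {mae:.2f}, MSE = {mse:.2f}, RMSE = {rmse:.2f}")
```

The model is fitted on the training rows only and evaluated on the held-out test rows, so the reported errors reflect how well the line predicts stipends it has not seen.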

About the book

Book DOI https://doi.org/10.1017/9781009170239
Subjects: Communications and Signal Processing, Computer Science, Engineering, Machine Learning and Pattern Recognition
Format: Paperback
- Publication date: 26 March 2026
- ISBN: 9781009170246
Format: Digital
- Publication date: 22 February 2025
- ISBN: 9781009170239