Dev Genius

Coding, Tutorials, News, UX, UI and much more related to development


Machine Learning

Linear Regression with Python

Ahmed Yasin
Published in Dev Genius
7 min read · Mar 26, 2021


We will predict a student’s attitude score based on the number of exam questions he or she answered correctly.

Linear Regression

Before learning something new, we should ask one question: “Why do we use it?”

Linear Regression is used when we want to predict the value of a variable based on the value of another variable. The variable we want to predict is called the dependent variable (or sometimes, the outcome variable).

A computer is a dumb machine: it cannot do anything on its own unless we give it instructions. In machine learning we study linear regression because it is a technique for predicting a value with a few simple formulas.

So, let’s jump in and make some predictions.

Linear Regression Formula:

Y = α + βx

Fig:1 (Linear Regression Formula)

In the second paragraph I mentioned the term dependent variable. Here Y is that dependent variable, and we will predict its value.

If you’re wondering why it is called the dependent variable, the answer is simple. Y depends on the y-intercept α, on the slope β, and on x, which is the independent variable here. If any of these values changes, the value of Y changes too.
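To make that dependence concrete, here is a minimal sketch of the line formula Y = α + βx as a Python function. The name `predict` and the numbers passed to it are illustrative assumptions, not values fitted to the article’s data.

```python
def predict(alpha: float, beta: float, x: float) -> float:
    """Return Y = alpha + beta * x for a single input x."""
    return alpha + beta * x


# Illustrative values only: intercept 2.0 and slope 0.5.
print(predict(2.0, 0.5, 10.0))  # 2.0 + 0.5 * 10.0
print(predict(2.0, 0.5, 12.0))  # raising x by 2 raises Y by beta * 2
```

Changing any of the three arguments changes the result, which is exactly why Y is called the dependent variable.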

Formulas We Use

We can implement it quite easily now that we have the formula. But right now we don’t have values for α, β and x.

β (Slope Formula) :

Let’s start with β, which is the slope. Its formula is given below.

β = r · (Sy / Sx)

fig:2 (Slope Regression Formula)

We won’t dive deep into the concept of slope. In brief, r is the correlation coefficient, Sy is the standard deviation of the Y values, and Sx is the standard deviation of the X values.

Correlation coefficient (r) Formula :

r = Σ(x - x̄)(y - ȳ) / √( Σ(x - x̄)² · Σ(y - ȳ)² )

fig:3 (Correlation coefficient)

The symbol Σ (sigma) denotes a sum over all data points. x̄ is the mean of x and ȳ is the mean of y.
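As a rough sketch, the correlation coefficient can be written as a small standalone function: the sum of the products of the deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `correlation` is a hypothetical helper, not something from the article’s file.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient r of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of products of deviations from the means.
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    # Denominator: square root of the product of the summed squared deviations.
    den = math.sqrt(sum((a - mean_x) ** 2 for a in xs)
                    * sum((b - mean_y) ** 2 for b in ys))
    return num / den


print(correlation([1, 2, 3], [2, 4, 6]))  # points on a rising line
print(correlation([1, 2, 3], [6, 4, 2]))  # points on a falling line
```

The result always lies between -1 and 1. This sketch does not guard against constant input, where the denominator would be zero.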

“Standard deviation(Sy or Sx)” Formula :

Sx = √( Σ(x - x̄)² / (n - 1) )

fig:4 ( Standard deviation formula)

This is the formula for the sample standard deviation, where n is the number of values. For Sx, use the formula as given; for Sy, replace x with y.
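Here is a small sketch of that formula, checked against Python’s built-in `statistics.stdev`, which also divides by n - 1. The helper name `sample_std` is an assumption for illustration; the two lists are the data used later in this article.

```python
import math
import statistics


def sample_std(values):
    """Sample standard deviation: sqrt(sum((v - mean)^2) / (n - 1))."""
    mean = sum(values) / len(values)
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (len(values) - 1))


x = [17, 13, 12, 15, 16, 14, 16, 16, 18, 19]
y = [94, 73, 59, 80, 93, 85, 66, 79, 77, 91]
Sx = sample_std(x)
Sy = sample_std(y)  # same formula with x replaced by y

# statistics.stdev also divides by n - 1, so each pair should agree.
print(Sx, statistics.stdev(x))
print(Sy, statistics.stdev(y))
```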

y-intercept (α) formula:

α = ȳ - β · x̄

fig:5 (y-intercept (α) formula)

The average value of Y is also known as the mean of Y (ȳ), and the average value of X is also known as the mean of X (x̄).
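As a side note, substituting the formulas for r, Sx and Sy into the slope formula makes the (n - 1) terms cancel. That gives an equivalent form of the slope which is handy for checking results by hand. A short derivation, using the same symbols as the formulas in this section:

```latex
\begin{aligned}
\beta &= r \cdot \frac{S_y}{S_x}
       = \frac{\sum (x - \bar{x})(y - \bar{y})}
              {\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}
         \cdot
         \frac{\sqrt{\sum (y - \bar{y})^2 / (n - 1)}}
              {\sqrt{\sum (x - \bar{x})^2 / (n - 1)}} \\
      &= \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2},
\qquad
\alpha = \bar{y} - \beta \, \bar{x}
\end{aligned}
```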

Set of Steps

So far we have discussed all the formulas we are going to use to compute a linear regression.

Let’s make the path clearer by laying out a map.

Step one :

We shall find the means of x and y, which the formulas need.

Step two :

We shall find the value of Correlation coefficient (r).

Step three :

We shall find the value of Standard deviation (Sy and Sx).

Step Four :

We shall put the step two and step three values into the slope (β) formula.

Step Five :

We shall find the value of y intercept which is α.

Step Six :

We shall put Step four and step five values in the linear regression formula.
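The six steps can be sketched as one compact function. This is only an outline under the assumption that both inputs are equal-length lists of numbers; the name `fit_line` and the toy data are made up for illustration and are not part of the article’s file.

```python
import math


def fit_line(xs, ys):
    """Follow steps one to five above; return (alpha, beta)."""
    n = len(xs)
    # Step one: means, plus the sums of deviations the formulas need.
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    # Step two: correlation coefficient r.
    r = sxy / math.sqrt(sxx * syy)
    # Step three: sample standard deviations.
    s_x, s_y = math.sqrt(sxx / (n - 1)), math.sqrt(syy / (n - 1))
    # Step four: slope.
    beta = r * s_y / s_x
    # Step five: y-intercept.
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Step six: plug alpha and beta into the line formula.
alpha, beta = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # points on y = 1 + 2x
print(alpha + beta * 5)
```

Because the toy points lie exactly on y = 1 + 2x, the function should recover an intercept close to 1 and a slope close to 2.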

Implementation

First create a Python file with any name. In my case I am naming it “ManualURL.py”.

We need numpy and scikit-learn (imported as sklearn) installed on our computer or in a virtual environment. We can install them with the following commands.

pip install numpy
pip install scikit-learn

Then we need to import them at the top of the file.

import numpy as np
import math
from sklearn.linear_model import LinearRegression

Then create two arrays. We shall predict a student’s attitude based on the number of questions he or she answered correctly. At the end we shall compare our output with the LinearRegression API that we imported from sklearn.

x = correct_questions_answers = [17,13,12,15,16,14,16,16,18,19]
y = attitude = [94,73,59,80,93,85,66,79,77,91]
print("Hello! I'll predict how happy you are based on your correct answers")
print("Please enter the number of correct answers")
print("NOTE *This prediction is made without any API*")

I have created two arrays, x and y, and to make their meaning clearer I gave each one a second, descriptive name. I also added some print statements for user interaction, so that when the program runs the user knows what to do.

Implementing Step One :

# Finding mean of x.
mean_of_X = float(sum(x))/(float(len(x)))
# Finding mean of y.
mean_of_Y = float(sum(y))/(float(len(y)))

Here we find the means (average values) of x and y. We convert to float because a mean can be a decimal value. sum() and len() are Python built-in functions: sum() adds up all the entries of a list and len() counts them.
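In Python 3 the `/` operator already performs floating-point division, so the explicit `float()` calls are optional. A quick sketch of the same step, checked against `statistics.fmean` from the standard library (Python 3.8 or newer):

```python
import statistics

x = [17, 13, 12, 15, 16, 14, 16, 16, 18, 19]
y = [94, 73, 59, 80, 93, 85, 66, 79, 77, 91]

# Manual means, as in the article, without the float() conversions.
mean_of_X = sum(x) / len(x)
mean_of_Y = sum(y) / len(y)

# statistics.fmean always returns a float, so it makes a convenient check.
print(mean_of_X, statistics.fmean(x))
print(mean_of_Y, statistics.fmean(y))
```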

# Finding (x - mean_of_X)
x_Minus_mean_of_X = []
for a in x:
    # Deviation of this x value from the mean of x.
    append_value = a - mean_of_X
    x_Minus_mean_of_X.append(append_value)

Here I set up another array named x_Minus_mean_of_X, which stores the results of (x - mean_of_X). I simply subtract the mean of x from each value of x (correct_questions_answers). The for loop visits every value in the array, the result is stored temporarily in the append_value variable, and .append() adds that result to the x_Minus_mean_of_X array.

# Finding (y - mean_of_Y)
y_Minus_mean_of_Y = []
for b in y:
    # Deviation of this y value from the mean of y.
    append_value = b - mean_of_Y
    y_Minus_mean_of_Y.append(append_value)

These are exactly the same steps for the y array, which is the attitude array.
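The same two loops can also be written as list comprehensions. The sketch below is an alternative phrasing that reuses the article’s variable names; it is not a change to the method.

```python
x = [17, 13, 12, 15, 16, 14, 16, 16, 18, 19]
y = [94, 73, 59, 80, 93, 85, 66, 79, 77, 91]
mean_of_X = sum(x) / len(x)
mean_of_Y = sum(y) / len(y)

# One comprehension per array replaces the explicit loop and append.
x_Minus_mean_of_X = [a - mean_of_X for a in x]
y_Minus_mean_of_Y = [b - mean_of_Y for b in y]

# Deviations from the mean sum to (approximately) zero,
# which makes a quick sanity check.
print(sum(x_Minus_mean_of_X), sum(y_Minus_mean_of_Y))
```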

# Finding (x - mean_of_X) * (y - mean_of_Y)
# using the multiply function to multiply two arrays element by element
product_of_xy_from_their_mean = np.multiply(x_Minus_mean_of_X, y_Minus_mean_of_Y)

I have used a rather long variable name, and added comments, to make clear what is what. As you can see, I imported numpy as np at the very top of this file. Now I’m using its multiply() function to multiply the two arrays element by element.
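To see what element-wise multiplication means, here is a tiny sketch with made-up numbers (the lists `a` and `b` are illustrative, not the article’s data):

```python
import numpy as np

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]

# np.multiply accepts plain lists and returns a NumPy array of pairwise products.
products = np.multiply(a, b)
print(products)  # element-wise: 1*4, 2*5, 3*6

# The * operator on NumPy arrays does the same thing.
same = np.array(a) * np.array(b)
print(np.array_equal(products, same))
```

Note that multiplying two plain Python lists with `*` raises a TypeError, which is why the article reaches for NumPy here.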

# Finding square of (x - mean_of_X)
Sqaure_of__x__mean_of_X = []
for c in x_Minus_mean_of_X:
    # Square each deviation of x from its mean.
    append_value = c**2
    Sqaure_of__x__mean_of_X.append(append_value)

Now we square each value of the x_Minus_mean_of_X array. That is why we use a for loop here and append the resulting values to another newly declared array named Sqaure_of__x__mean_of_X.

# Finding square of (y - mean_of_Y)
Sqaure_of__y__mean_of_Y = []
for d in y_Minus_mean_of_Y:
    # Square each deviation of y from its mean.
    append_value = d**2
    Sqaure_of__y__mean_of_Y.append(append_value)

These are again exactly the same steps, this time applied to the y_Minus_mean_of_Y array.

Implementing Step Two :

Find the value of the correlation coefficient (r). First we find the numerator and the denominator, then the correlation coefficient (r) itself.

# Finding the numerator of the correlation coefficient (r)
numirator = sum(product_of_xy_from_their_mean)

I’m adding up the values of the product_of_xy_from_their_mean array. I suggest keeping the formula in front of you to follow along.

# Finding the denominator of the correlation coefficient (r)
denominator = sum(Sqaure_of__x__mean_of_X) * sum(Sqaure_of__y__mean_of_Y)

I’m adding up the values of the Sqaure_of__x__mean_of_X array and of the Sqaure_of__y__mean_of_Y array, then multiplying the two sums with the * operator.

r = numirator/math.sqrt(denominator)

In the denominator I’m using math.sqrt(), the square-root function from the math module, to take the square root of denominator.
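One way to check the manual value of r is to compare it with NumPy’s `np.corrcoef`. The sketch below recomputes r in a condensed form (the short names `numerator`, `mean_x` and `mean_y` are used only in this example) and compares the two results.

```python
import math

import numpy as np

x = [17, 13, 12, 15, 16, 14, 16, 16, 18, 19]
y = [94, 73, 59, 80, 93, 85, 66, 79, 77, 91]

# Manual r from the three sums used in the article.
mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
denominator = sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
r = numerator / math.sqrt(denominator)

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r for x and y.
r_numpy = np.corrcoef(x, y)[0, 1]
print(r, r_numpy)
```

For this data r comes out at roughly 0.6, a moderate positive correlation between correct answers and attitude.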

Implementing Step Three :

To find the slope we first need the standard deviations (Sy and Sx). Let’s find them:

# Finding Sy
Sy_numirator = sum(Sqaure_of__y__mean_of_Y)
Sy_denomirator = len(y) - 1
Sy = math.sqrt(Sy_numirator/Sy_denomirator)

First we compute the numerator of Sy and store it in the Sy_numirator variable, then we do the same for the denominator.

# Finding Sx
Sx_numirator = sum(Sqaure_of__x__mean_of_X)
Sx_denomirator = len(x) - 1
Sx = math.sqrt(Sx_numirator/Sx_denomirator)

Again we compute the numerator of Sx and store it in the Sx_numirator variable, then we do the same for the denominator.

Implementing Step Four :

Now we can find the value of the slope. We found the correlation coefficient (r) in the second step and the standard deviations (Sy and Sx) in the third step.

# Finding Sy/Sx
_S_ = Sy/Sx
slope = r * _S_

We simply store the value of (Sy/Sx) in the single variable _S_, multiply it by the correlation coefficient (r), and store the result in the slope variable.

Implementing Step Five :

# Finding the y-intercept
_a = mean_of_Y - slope*mean_of_X

We can now find the y-intercept by plugging in the values we have so far.

Implementing Step Six :

Now we have everything we need for the linear regression. But first we need to ask the student how many questions he or she answered correctly. We can do that simply by reading an input value.

question_input = float(input())

After this you can use linear regression formula.

_y = _a + slope*question_input

_y is the dependent variable, and it now holds the predicted value.
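Note that `float(input())` raises a ValueError if the user types something that is not a number. A hedged sketch of a small validation helper follows; the name `parse_correct_answers` and the upper limit of 20 questions are assumptions made for illustration, not details from the article.

```python
def parse_correct_answers(text, max_questions=20):
    """Convert user input to a float, rejecting values outside 0..max_questions.

    max_questions=20 is an assumed exam length, not a figure from the article.
    """
    try:
        value = float(text)
    except ValueError:
        raise ValueError(f"not a number: {text!r}") from None
    if not 0 <= value <= max_questions:
        raise ValueError(f"expected a value between 0 and {max_questions}")
    return value


print(parse_correct_answers("15"))
```

In the script, the input line could then read `question_input = parse_correct_answers(input())`.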

Compare with Linear Regression API

You can compare your prediction with scikit-learn’s by putting the code below at the bottom of your file.

print("We predict that you'll be happy", _y, "this much!")
print("")
print("Let's double-check the prediction with the experts' program")
print("NOTE *This time the prediction is made with the API*")
# scikit-learn expects the inputs as a 2-D array: one row per sample.
s = np.array([17,13,12,15,16,14,16,16,18,19]).reshape((-1,1))
t = np.array([94,73,59,80,93,85,66,79,77,91])
print("")
print("Type X to double-check the prediction, or press any other key to quit")
ans = input()
# Compare strings with ==, not "is".
if ans != "X":
    print("Thank You!")
    quit()
model = LinearRegression().fit(s,t)
# Predict for the same number of correct answers entered earlier.
api_prediction = model.predict(np.array([[question_input]]))
print(api_prediction[0])

Thank you!


Written by Ahmed Yasin

Youtuber / Content Writer / C++ /Learning Java/ https://www.youtube.com/channel/UC-gDcN99rYbeyH1oKsoPM3A Instagram : belongs_to_mars
