Optimization Algorithms
Examples: Gradient Descent (GD), Stochastic Gradient Descent (SGD), Adam, RMSprop, Momentum, Adadelta
Gradient Descent is an optimization algorithm that finds a local minimum of a differentiable function. For a given training data set, it is simply used to find the best-fit line, i.e. the coefficients (weights) and intercept (bias) that minimize a cost function as much as possible.
There are three types of gradient descent techniques:
- Batch Gradient Descent (regular GD) - steadily descends the curve along one path towards a minimum; every step computes the cost function over the entire training data set. If the training data is large, you should not use this.
- Stochastic Gradient Descent (SGD) - computes the cost function for only one (randomly selected) training example per step; it tends to jump all over the place because of the randomness, which also lets it jump across local minima.
- Mini-Batch Gradient Descent - somewhere midway between the above two; each step does the calculation for a small random batch of data points. A short sketch contrasting the three variants follows this list.
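The three variants differ only in how many examples feed each update. Here is a minimal sketch of that difference; the toy data, learning rate, batch size, and the plain MSE loss are placeholders I chose for illustration, not taken from the post:

import numpy as np

# Toy data (assumed for illustration): y = 2x + 3 with 100 points
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2 * x + 3

def mse_grad(xb, yb, w, b):
    # MSE gradients for whatever batch of points is passed in
    err = yb - (w * xb + b)
    return -2 * np.mean(xb * err), -2 * np.mean(err)

w = b = 0.0
lr = 0.05
for step in range(200):
    # Batch GD      : batch = np.arange(len(x))             (all points, every step)
    # Stochastic GD : batch = rng.integers(len(x), size=1)  (one random point per step)
    # Mini-batch GD : a small random subset, e.g. 16 points per step
    batch = rng.choice(len(x), size=16, replace=False)
    dw, db = mse_grad(x[batch], y[batch], w, b)
    w -= lr * dw
    b -= lr * db

print(round(w, 2), round(b, 2))  # approaches 2 and 3

Only the choice of `batch` changes between the three variants; the update rule itself is identical.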
Here I've coded a gradient descent algorithm from scratch, using the root mean squared error (RMSE) as the cost function.
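Because the cost below is the RMSE rather than the plain MSE, the gradients pick up a chain-rule factor of 1/(2·RMSE). In my notation, with the prediction ŷᵢ = m·xᵢ + b:

$$\mathrm{RMSE} = \sqrt{\tfrac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \hat{y}_i\bigr)^2},\qquad
\frac{\partial\,\mathrm{RMSE}}{\partial m} = -\frac{1}{n\cdot\mathrm{RMSE}}\sum_{i=1}^{n} x_i\bigl(y_i - \hat{y}_i\bigr),\qquad
\frac{\partial\,\mathrm{RMSE}}{\partial b} = -\frac{1}{n\cdot\mathrm{RMSE}}\sum_{i=1}^{n}\bigl(y_i - \hat{y}_i\bigr)$$

These are exactly the d_m and d_b expressions computed inside the loop below.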
import numpy as np
import matplotlib.pyplot as plt
# %%
x = np.array([1,2,3,4,5])
y = np.array([5,7,9,11,13])
# %%
def gradient_descent(x, y):
    m = b = 0                  # start with zero weight and bias
    n = len(x)
    learning_rate = 0.06       # this value you have to tune manually
    iterations = 100
    for i in range(iterations):
        y_pred = (m * x) + b
        plt.plot(x, y, color='b')        # actual data
        plt.plot(x, y_pred, color='g')   # current fitted line
        # plt.show()  # uncomment to display each intermediate fit
        diff_ = y - y_pred
        mse = (1 / n) * sum(val ** 2 for val in diff_)
        rmse = mse ** 0.5
        # chain rule on sqrt(MSE): each gradient is the MSE gradient divided by 2*RMSE
        d_m = -(1 / n) * (mse ** -0.5) * sum(x * diff_)   # d(RMSE)/dm
        d_b = -(1 / n) * (mse ** -0.5) * sum(diff_)       # d(RMSE)/db
        m = m - learning_rate * d_m
        b = b - learning_rate * d_b
        print('iteration: {}, weight: {}, bias: {}, cost: {}'.format(
            i + 1, round(m, 2), round(b, 2), '%0.2f' % rmse))
gradient_descent(x, y)
plt.show()   # render the fitted lines accumulated during training
# %%
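As a quick sanity check (not part of the original walkthrough), the same fit can be obtained in closed form with NumPy's np.polyfit. For this data the exact line is y = 2x + 3, so with enough iterations the loop above should approach weight 2 and bias 3:

# %%
# sanity check: closed-form least-squares fit of the same data
m_ref, b_ref = np.polyfit(x, y, 1)
print('closed-form fit -> weight:', round(m_ref, 2), 'bias:', round(b_ref, 2))  # 2.0, 3.0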