Gradient Descent Algorithm via Julia

This tutorial illustrates how to implement the gradient descent algorithm in Julia.
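Concretely, for a simple linear regression $\hat{y} = \alpha + \beta x$, gradient descent minimizes the mean squared error

$$\mathrm{MSE}(\alpha, \beta) = \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - (\alpha + \beta x_i)\bigr)^2$$

by repeatedly stepping both coefficients against its gradient with learning rate $\eta$ (the implementation further below folds the derivative's constant factor 2 into the learning rate):

$$\alpha \leftarrow \alpha - \eta \cdot \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i), \qquad \beta \leftarrow \beta - \eta \cdot \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)\, x_i$$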

Load Julia modules

using RDatasets
using DataFrames

# Load the classic mtcars dataset bundled with RDatasets
mtcars = dataset("datasets", "mtcars")

32 rows × 12 columns (omitted printing of 4 columns)

 Row │ Model                MPG      Cyl    Disp     HP     DRat     WT       QSec
     │ String               Float64  Int64  Float64  Int64  Float64  Float64  Float64
─────┼─────────────────────────────────────────────────────────────────────────────
   1 │ Mazda RX4            21.0     6      160.0    110    3.9      2.62     16.46
   2 │ Mazda RX4 Wag        21.0     6      160.0    110    3.9      2.875    17.02
   3 │ Datsun 710           22.8     4      108.0    93     3.85     2.32     18.61
   4 │ Hornet 4 Drive       21.4     6      258.0    110    3.08     3.215    19.44
   5 │ Hornet Sportabout    18.7     8      360.0    175    3.15     3.44     17.02
   6 │ Valiant              18.1     6      225.0    105    2.76     3.46     20.22
   7 │ Duster 360           14.3     8      360.0    245    3.21     3.57     15.84
   8 │ Merc 240D            24.4     4      146.7    62     3.69     3.19     20.0
   9 │ Merc 230             22.8     4      140.8    95     3.92     3.15     22.9
  10 │ Merc 280             19.2     6      167.6    123    3.92     3.44     18.3
  11 │ Merc 280C            17.8     6      167.6    123    3.92     3.44     18.9
  12 │ Merc 450SE           16.4     8      275.8    180    3.07     4.07     17.4
  13 │ Merc 450SL           17.3     8      275.8    180    3.07     3.73     17.6
  14 │ Merc 450SLC          15.2     8      275.8    180    3.07     3.78     18.0
  15 │ Cadillac Fleetwood   10.4     8      472.0    205    2.93     5.25     17.98
  16 │ Lincoln Continental  10.4     8      460.0    215    3.0      5.424    17.82
  17 │ Chrysler Imperial    14.7     8      440.0    230    3.23     5.345    17.42
  18 │ Fiat 128             32.4     4      78.7     66     4.08     2.2      19.47
  19 │ Honda Civic          30.4     4      75.7     52     4.93     1.615    18.52
  20 │ Toyota Corolla       33.9     4      71.1     65     4.22     1.835    19.9
  21 │ Toyota Corona        21.5     4      120.1    97     3.7      2.465    20.01
  22 │ Dodge Challenger     15.5     8      318.0    150    2.76     3.52     16.87
  23 │ AMC Javelin          15.2     8      304.0    150    3.15     3.435    17.3
  24 │ Camaro Z28           13.3     8      350.0    245    3.73     3.84     15.41
  25 │ Pontiac Firebird     19.2     8      400.0    175    3.08     3.845    17.05
  26 │ Fiat X1-9            27.3     4      79.0     66     4.08     1.935    18.9
  27 │ Porsche 914-2        26.0     4      120.3    91     4.43     2.14     16.7
  28 │ Lotus Europa         30.4     4      95.1     113    3.77     1.513    16.9
  29 │ Ford Pantera L       15.8     8      351.0    264    4.22     3.17     14.5
  30 │ Ferrari Dino         19.7     6      145.0    175    3.62     2.77     15.5
   ⋮ │ ⋮                    ⋮        ⋮      ⋮        ⋮      ⋮        ⋮        ⋮
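The regression below uses engine displacement (Disp, in cubic inches) to predict fuel efficiency (MPG). If you prefer named variables to inline indexing, the two columns can first be pulled out as plain vectors; a small convenience sketch, not part of the original calls:

x = mtcars[:, :Disp]   # predictor: engine displacement
y = mtcars[:, :MPG]    # response: miles per gallon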

Julia Function for Gradient Descent

  • x, y: the predictor and response vectors
  • learn_rate: the size of the steps the algorithm takes along the slope of the MSE function
  • conv_threshold: the convergence threshold; iteration stops once the per-step decrease in MSE falls below it
  • n: the number of observations (the sample size used to average the squared errors, not an iteration count)
  • max_iter: the maximum number of iterations before the algorithm stops

Because Disp is left unscaled, the call further below uses a very small learning rate (2.93e-5) to keep the updates stable.
function gradientDesc(x, y, learn_rate, conv_threshold, n, max_iter)
    # Start from random values for the intercept (α) and slope (β)
    β = rand()
    α = rand()
    ŷ = α .+ β .* x
    MSE = sum((y .- ŷ).^2) / n
    converged = false
    iterations = 0

    while !converged
        # Step both coefficients against the gradient of the MSE
        # (the constant factor 2 is absorbed into learn_rate)
        β_new = β - learn_rate * ((1/n) * sum((ŷ .- y) .* x))
        α_new = α - learn_rate * ((1/n) * sum(ŷ .- y))
        α = α_new
        β = β_new
        ŷ = β .* x .+ α
        MSE_new = sum((y .- ŷ).^2) / n
        iterations += 1
        # Stop once the per-step improvement in MSE falls below the
        # threshold or the iteration budget is exhausted
        if (MSE - MSE_new) <= conv_threshold || iterations > max_iter
            converged = true
            println("Optimal intercept: $α; Optimal slope: $β")
        end
        MSE = MSE_new   # carry the current MSE into the next iteration
    end
end
gradientDesc (generic function with 1 method)
gradientDesc(mtcars[:,:Disp], mtcars[:,:MPG], 0.0000293, 0.001, 32, 2500000)

Optimal intercept: 29.599851506041713; Optimal slope: -0.0412151089535404
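With a single predictor, the least-squares solution also has a closed form, which gives a quick, independent check on the gradient descent output. A sketch using Julia's Statistics standard library (not part of the original tutorial):

using Statistics

x = mtcars[:, :Disp]
y = mtcars[:, :MPG]
slope = cov(x, y) / var(x)             # β̂ = cov(x, y) / var(x)
intercept = mean(y) - slope * mean(x)  # α̂ = ȳ - β̂·x̄
println("Closed form: intercept = $intercept; slope = $slope")

Both estimates should agree closely with the values printed above.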

Comparison with linear regression

To verify the result, fit the same model by ordinary least squares with GLM; the coefficients should match the gradient descent estimates.

using GLM
linearRegressor = lm(@formula(MPG ~ Disp), mtcars)
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}}}}, Matrix{Float64}}

MPG ~ 1 + Disp

Coefficients:
───────────────────────────────────────────────────────────────────────────
                  Coef.  Std. Error      t  Pr(>|t|)  Lower 95%   Upper 95%
───────────────────────────────────────────────────────────────────────────
(Intercept)  29.5999     1.22972     24.07    <1e-20  27.0884    32.1113
Disp         -0.0412151  0.00471183  -8.75    <1e-09  -0.050838  -0.0315923
───────────────────────────────────────────────────────────────────────────
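The fitted coefficients can also be extracted programmatically with the standard coef accessor that GLM re-exports; a usage sketch:

coef(linearRegressor)   # ≈ [29.5999, -0.0412151], matching gradient descent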
