Deep Learning with PyTorch: Autograd

October 28, 2020
data science machine learning deep learning book

This is a post in the Deep learning with Pytorch series

This is based on code from the following book

The following blog post walks through what PyTorch’s Autograd is.

Link to Jupyter Notebook that this blog post was made from

%matplotlib inline
import numpy as np
import torch

Taking our input from the previous notebook and applying our scaling

t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0,
                    3.0, -4.0, 6.0, 13.0, 21.0])
t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9,
                    33.9, 21.8, 48.4, 60.4, 68.4])
t_un = 0.1 * t_u

Same model and loss function as before.

def model(t_u, w, b):
    return w * t_u + b
def loss_fn(t_p, t_c):
    squared_diffs = (t_p - t_c)**2
    return squared_diffs.mean()

This time, instead of deriving the gradient with respect to the parameters by hand and applying it ourselves, we’ll leverage PyTorch’s automatic gradient (autograd) feature.

params = torch.tensor([1.0, 0.0], requires_grad=True)
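
To see what autograd is replacing, here is a small sketch that compares a hand-derived gradient of the mean squared error with the one autograd computes. The `manual_grad` helper is a hypothetical name written for this illustration (it is not from the book or the notebook), and the unscaled `t_u` is used so the numbers line up with the single backward pass shown later in this post.

```python
import torch

t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0,
                    3.0, -4.0, 6.0, 13.0, 21.0])
t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9,
                    33.9, 21.8, 48.4, 60.4, 68.4])

def model(t_u, w, b):
    return w * t_u + b

def loss_fn(t_p, t_c):
    return ((t_p - t_c) ** 2).mean()

def manual_grad(t_u, t_c, w, b):
    # Hypothetical helper: chain rule applied by hand.
    # d(loss)/d(t_p) = 2 * (t_p - t_c) / N, d(t_p)/dw = t_u, d(t_p)/db = 1
    t_p = model(t_u, w, b)
    dloss_dtp = 2 * (t_p - t_c) / t_p.numel()
    return torch.stack([(dloss_dtp * t_u).sum(), dloss_dtp.sum()])

# Autograd version: mark the parameters as tracked and call backward()
params = torch.tensor([1.0, 0.0], requires_grad=True)
loss_fn(model(t_u, *params), t_c).backward()

by_hand = manual_grad(t_u, t_c, 1.0, 0.0)
print("autograd:", params.grad)
print("by hand: ", by_hand)
```

Both approaches should agree up to floating point error. The difference is that the autograd version keeps working unchanged if we swap in a different differentiable model or loss.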

How does requires_grad work?

Internally, autograd represents this graph as a graph of Function objects (really expressions), which can be apply()ed to compute the result of evaluating the graph. When computing the forwards pass, autograd simultaneously performs the requested computations and builds up a graph representing the function that computes the gradient (the .grad_fn attribute of each torch.Tensor is an entry point into this graph). When the forwards pass is completed, we evaluate this graph in the backwards pass to compute the gradients. [1]

This can be done as long as our model is differentiable.

Torch will track a graph of operations used to compute our current tensor.
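
A quick way to see this tracking in action is to inspect `requires_grad`, `is_leaf` and `grad_fn` on a few tensors. This is a minimal sketch on toy data (the tensor `x` and the names `y`, `loss` and `untracked` are made up for the illustration and are not part of the notebook).

```python
import torch

params = torch.tensor([1.0, 0.0], requires_grad=True)
x = torch.tensor([1.0, 2.0, 3.0])  # plain data, not tracked

w, b = params
y = w * x + b
loss = (y ** 2).mean()

# A tensor we created ourselves is a leaf: it has no grad_fn
print(params.is_leaf, params.grad_fn)

# Plain data does not require gradients
print(x.requires_grad)

# Anything computed from params is tracked and points back into the graph
print(y.requires_grad, y.grad_fn)
print(loss.grad_fn)

# Inside torch.no_grad() the same computation is not recorded
with torch.no_grad():
    untracked = w * x + b
print(untracked.requires_grad, untracked.grad_fn)
```

The `grad_fn` of `loss` is the entry point that `loss.backward()` walks from, all the way back to the leaf `params`.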

params.grad is None

We apply a single forward and backward pass and can print out the resulting gradient.

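loss = loss_fn(model(t_u, *params), t_c)
loss.backward()  # backward pass: populates params.grad
params.grad

tensor([4517.2969,   82.6000])

if params.grad is not None:
    params.grad.zero_()  # reset the accumulated gradient before the next pass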
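
The reason for checking and zeroing `params.grad` is that `backward()` accumulates into `.grad` rather than overwriting it. Here is a minimal sketch of that behaviour on toy data (`toy_loss` and `x` are hypothetical names used only for this illustration).

```python
import torch

params = torch.tensor([1.0, 0.0], requires_grad=True)
x = torch.tensor([1.0, 2.0, 3.0])

def toy_loss(params):
    w, b = params
    return ((w * x + b) ** 2).mean()

# First backward pass: params.grad goes from None to the gradient
toy_loss(params).backward()
first = params.grad.clone()

# Second backward pass without zeroing: the new gradient is added on top
toy_loss(params).backward()
second = params.grad.clone()

# Zero in place, then a third pass gives the plain gradient again
params.grad.zero_()
toy_loss(params).backward()
third = params.grad.clone()

print(first, second, third)
```

Since the parameters never change here, `second` is twice `first`, while `third` matches `first`. In a training loop, forgetting to zero the gradient would silently mix in gradients from earlier steps.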

Notice that we are now ready to perform our training_loop, and we only had to define our model and loss_fn.

def training_loop(n_epochs, learning_rate, params, t_u, t_c):
    for epoch in range(1, n_epochs + 1):
        # clears out the accumulated derivatives at the leaf nodes
        if params.grad is not None:
            params.grad.zero_()

        # forward pass
        t_p = model(t_u, *params)
        # computes the loss
        loss = loss_fn(t_p, t_c)
        # accumulates the derivatives at the leaf nodes
        loss.backward()

        # in-place update of params, which autograd does not like;
        # inside torch.no_grad() the update is not tracked, which avoids issues
        with torch.no_grad():
            params -= learning_rate * params.grad

        if epoch % 500 == 0:
            print(f"params.grad {params.grad}")
            print('Epoch %d, Loss %f' % (epoch, float(loss)))
    return params

training_loop(
    n_epochs = 5000,
    learning_rate = 1e-2,
    params = torch.tensor([1.0, 0.0], requires_grad=True),
    t_u = t_un,  # the scaled input
    t_c = t_c)
params.grad tensor([-0.2252,  1.2748])
Epoch 500, Loss 7.860116
params.grad tensor([-0.0962,  0.5448])
Epoch 1000, Loss 3.828538
params.grad tensor([-0.0411,  0.2328])
Epoch 1500, Loss 3.092191
params.grad tensor([-0.0176,  0.0995])
Epoch 2000, Loss 2.957697
params.grad tensor([-0.0075,  0.0425])
Epoch 2500, Loss 2.933134
params.grad tensor([-0.0032,  0.0182])
Epoch 3000, Loss 2.928648
params.grad tensor([-0.0014,  0.0078])
Epoch 3500, Loss 2.927830
params.grad tensor([-0.0006,  0.0033])
Epoch 4000, Loss 2.927679
params.grad tensor([-0.0003,  0.0014])
Epoch 4500, Loss 2.927652
params.grad tensor([-9.7513e-05,  6.1291e-04])
Epoch 5000, Loss 2.927647

tensor([  5.3671, -17.3012], requires_grad=True)

This is the same loss as in the previous notebook.
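
As a side note on the `torch.no_grad()` block in the training loop, the sketch below shows what happens if we try the in-place update without it. It uses a toy loss rather than the notebook's data, and the `raised` flag is just a name introduced for this illustration.

```python
import torch

params = torch.tensor([1.0, 0.0], requires_grad=True)
loss = (params ** 2).sum()
loss.backward()

# Updating a tracked leaf tensor in place is rejected by autograd
try:
    params -= 0.1 * params.grad
    raised = False
except RuntimeError as err:
    raised = True
    print("RuntimeError:", err)

# Inside torch.no_grad() the same update is allowed and is not recorded
with torch.no_grad():
    params -= 0.1 * params.grad

print(params, params.is_leaf)
```

After the update, `params` is still a leaf tensor with `requires_grad=True`, so the next forward pass builds a fresh graph from the updated values.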

References

