mygrad.Tensor.grad

property Tensor.grad: Optional[ndarray]

Returns the derivative of ℒ with respect to this tensor.

ℒ is the terminal node in the computational graph from which ℒ.backward() was invoked.

If this tensor is a view of another tensor then their gradients will exhibit the same memory-sharing relationship as their data.

Returns
dℒ/dx: numpy.ndarray

The gradient of the terminal node in a computational graph with respect to this tensor. The shape of this numpy array matches self.shape

Examples

>>> import mygrad as mg
>>> x = mg.Tensor([1.0, 2.0])

Prior to backpropagation tensors have None set for their gradients.

>>> x.grad is None
True

Now we trigger backpropagation…

>>> ℒ = x ** 2
>>> ℒ.backward()

and we see that x.grad stores dℒ/dx

>>> x.grad  # dℒ/dx
array([2., 4.])
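
Because the gradient's shape matches self.shape, the following check should hold:

>>> x.grad.shape == x.shape
True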

Now we will demonstrate the relationship between the gradient of a view tensor and that of its base.

>>> base = mg.Tensor([1.0, 2.0, 3.0])
>>> view = base[:2]; view
Tensor([1., 2.])
>>> ℒ = base ** 2
>>> ℒ.backward()

Although view is not directly involved in the computation of ℒ, and thus would not typically have a gradient set by ℒ.backward(), it shares memory with base and thus it stores a gradient in correspondence to this “view relationship”. I.e. because view == base[:2], we expect to find that view.grad == base.grad[:2].

>>> base.grad
array([2., 4., 6.])
>>> view.grad
array([2., 4.])
>>> view.grad.base is base.grad
True
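
Equivalently, NumPy's np.shares_memory can confirm this memory-sharing relationship directly (a supplementary check using plain NumPy):

>>> import numpy as np
>>> np.shares_memory(view.grad, base.grad)
True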

The reasoning here is that, because a base tensor and its view share the same array data, varying an element of that data changes both the base tensor and the view (assuming the variation occurs within the shared region). It follows that the base tensor’s gradient must share this same relationship with the view tensor’s gradient, since gradients are measures of the “cause and effect” associated with varying elements of that data (albeit infinitesimally).
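
A minimal plain-NumPy sketch of this memory-sharing behavior (the array names here are illustrative and not part of MyGrad's API):

>>> import numpy as np
>>> data = np.array([1.0, 2.0, 3.0])
>>> window = data[:2]  # a view onto the first two elements
>>> data[0] = -5.0     # varying an element in the shared region...
>>> window             # ...is reflected in the view as well
array([-5.,  2.])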