MyGrad’s Tensor

Tensor is the most critical piece of MyGrad. It is a numpy-array-like object capable of serving as a node in a computational graph that supports back-propagation of derivatives via the chain rule.

For all basic mathematical operations, a Tensor can effectively serve as a drop-in replacement for a numpy array. This includes basic and advanced indexing, broadcasting, sums over axes, etc.; these behave just as they do in numpy.

>>> import mygrad as mg  # note that we replace numpy with mygrad here
>>> x = mg.arange(9).reshape(3, 3)
>>> x
Tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])
>>> y = x[x == 4] ** 2
>>> y
Tensor([16])

Thus MyGrad users can spend their time mastering numpy and their skills will transfer seamlessly when using this autograd library.

Creating a Tensor

Tensor can be passed any “array-like” object of numerical data. This includes numbers, sequences (e.g. lists), nested sequences, numpy arrays, and other mygrad tensors. MyGrad also provides familiar numpy-style tensor-creation functions (e.g. arange(), linspace(), etc.).

>>> import numpy as np
>>> import mygrad as mg
>>> mg.tensor(2.3)  # creating a 0-dimensional tensor
Tensor(2.3)
>>> mg.tensor(np.array([1.2, 3.0]))  # casting a numpy-array to a tensor
Tensor([1.2, 3. ])
>>> mg.tensor([[1, 2], [3, 4]])  # creating a 2-dimensional tensor from lists
Tensor([[1, 2],
        [3, 4]])
>>> mg.arange(4)    # using numpy-style tensor creation functions
Tensor([0, 1, 2, 3])

Integer-valued tensors are treated as constants:

>>> mg.astensor(1, dtype=np.int8).constant
True

By default, float-valued tensors are not treated as constants:

>>> mg.astensor(1, dtype=np.float32).constant
False

Forward and Back-Propagation

Let’s construct a computational graph consisting of two zero-dimensional tensors, x and y, which are used to compute an output tensor, ℒ. This is a “forward pass imperative” style for creating a computational graph: the graph is constructed as we carry out the forward-pass computation.

>>> x = mg.tensor(3.0)
>>> y = mg.tensor(2.0)
>>> ℒ = 2 * x + y ** 2
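
To make the idea of building a graph during the forward pass concrete, here is a minimal sketch in plain Python. The `Node` class and its attributes are hypothetical, invented for illustration; this is not how MyGrad implements Tensor, and it handles only scalars and the three operations used above. Each arithmetic operation returns a new node that remembers its parents together with the local derivative with respect to each parent, and `backward` applies the chain rule along those recorded links.

```python
class Node:
    """Hypothetical scalar graph node; an illustration, not MyGrad's Tensor."""

    def __init__(self, value, parents=()):
        self.value = float(value)
        self.parents = parents  # pairs of (parent_node, local_derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        # d(a + b)/da = 1 and d(a + b)/db = 1
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __rmul__(self, scalar):
        # handles `scalar * node`; the scalar is treated as a constant
        return Node(scalar * self.value, ((self, float(scalar)),))

    def __pow__(self, exponent):
        # d(a ** n)/da = n * a ** (n - 1)
        local = exponent * self.value ** (exponent - 1)
        return Node(self.value ** exponent, ((self, local),))

    def backward(self, upstream=1.0):
        # chain rule: pass (upstream * local derivative) on to each parent
        self.grad += upstream
        for parent, local in self.parents:
            parent.backward(upstream * local)


x = Node(3.0)
y = Node(2.0)
loss = 2 * x + y ** 2  # the graph is recorded while this line executes
loss.backward()
print(loss.value, x.grad, y.grad)  # 10.0 2.0 4.0
```

The recursive `backward` keeps the sketch short; a practical library would instead visit nodes in a sorted order so that each node is processed once.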

Invoking ℒ.backward() signals the computational graph to compute the total derivative of ℒ with respect to each of the variables that it depends on. I.e. x.grad will store dℒ/dx and y.grad will store dℒ/dy. Thus we have back-propagated a gradient from ℒ through our graph.

Each tensor of derivatives is computed elementwise. That is, if x = Tensor([x0, x1, x2]), then dℒ/dx represents [dℒ/d(x0), dℒ/d(x1), dℒ/d(x2)].

>>> ℒ.backward()  # computes dℒ/dx and dℒ/dy
>>> x.grad  # dℒ/dx = 2
array(2.)
>>> y.grad  # dℒ/dy = 2y
array(4.)
>>> ℒ.grad  # dℒ/dℒ
array(1.)
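
For ℒ = 2x + y², the analytic derivatives are dℒ/dx = 2 and dℒ/dy = 2y, which is 4 at y = 2. As a sanity check that does not depend on MyGrad, these can be approximated with a central finite difference. The helper names below (`loss_fn`, `central_difference`) are made up for this sketch.

```python
def loss_fn(x, y):
    """The same function as in the example: L = 2x + y**2."""
    return 2 * x + y ** 2


def central_difference(f, args, index, h=1e-6):
    """Approximate the partial derivative of f with respect to args[index]."""
    upper = list(args)
    lower = list(args)
    upper[index] += h
    lower[index] -= h
    return (f(*upper) - f(*lower)) / (2 * h)


point = (3.0, 2.0)  # x = 3.0, y = 2.0
dL_dx = central_difference(loss_fn, point, index=0)
dL_dy = central_difference(loss_fn, point, index=1)
print(round(dL_dx, 4), round(dL_dy, 4))  # 2.0 4.0
```

A numerical check of this kind is a common way to test gradients produced by back-propagation, since it relies only on evaluating the function itself.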

Once the gradients are computed, the computational graph containing x, y, and ℒ is cleared automatically. Additionally, involving any of these tensors in a new computational graph will automatically null their gradients.

>>> 2 * x  # x participates in a new graph, so its gradient is cleared
Tensor(6.)
>>> x.grad is None
True

Alternatively, you can use the null_grad() method to manually clear a tensor’s gradient:

>>> y.null_grad()
Tensor(2.)
>>> y.grad is None
True

Accessing the Underlying NumPy Array

Tensor is a thin wrapper on numpy.ndarray. A tensor’s underlying numpy-array can be accessed via .data. This returns a direct reference to the numpy array. A tensor can also be passed to numpy.asarray to obtain a numpy array.

>>> x = mg.tensor([1, 2])
>>> x.data
array([1, 2])

>>> import numpy as np
>>> np.asarray(x)
array([1, 2])

Producing a “View” of a Tensor

MyGrad’s tensors exhibit the same view semantics and memory-sharing relationships as NumPy arrays. I.e. any (non-scalar) tensor produced via basic indexing will share memory with its parent.

>>> x = mg.tensor([1., 2., 3., 4.])
>>> y = x[:2]  # the view: Tensor([1., 2.])
>>> y.base is x
True
>>> np.shares_memory(x, y)
True

Mutating shared data will propagate through views:

>>> y *= -1
>>> x
Tensor([-1., -2.,  3.,  4.])
>>> y
Tensor([-1., -2.])

This view relationship also manifests between the tensors’ gradients:

>>> (x ** 2).backward()
>>> x.grad
array([-2., -4.,  6.,  8.])
>>> y.grad
array([-2., -4.])
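
Because these semantics are inherited from NumPy, the difference between a view and a copy can be seen with plain arrays. The sketch below contrasts basic indexing (a slice), which shares memory with its parent, with advanced indexing (an index list), which allocates new memory. It uses NumPy only; that MyGrad tensors behave identically in every case is an assumption based on the description above, not something verified here.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
view = x[:2]        # basic indexing (a slice): shares memory with x
picked = x[[0, 1]]  # advanced indexing (an index list): new memory

print(np.shares_memory(x, view))    # True
print(np.shares_memory(x, picked))  # False

view *= -1    # in-place update is visible through x
picked *= 10  # x is unaffected by changes to the copy
print(x.tolist())  # [-1.0, -2.0, 3.0, 4.0]
```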

Documentation for mygrad.Tensor

`Tensor.astype`(dtype[, casting, copy, constant])	Copy of the tensor with the specified dtype.
`Tensor.backward`([grad])	Trigger backpropagation and compute the derivatives of this tensor.
`Tensor.base`	A reference to the base tensor that the present tensor is a view of.
`Tensor.clear_graph`()	Removes the current tensor – and tensors above it – from their shared computational graph.
`Tensor.constant`	If `True`, this tensor is a constant; it will not propagate any gradient.
`Tensor.copy`(*[, constant])	Produces a copy of `self` with `copy.creator=None`.
`Tensor.creator`	The `Operation` instance that produced `self`.
`Tensor.dtype`	Data-type of the tensor's elements.
`Tensor.grad`	Returns the derivative of `ℒ` with respect to this tensor.
`Tensor.item`()	Copy an element of a tensor to a standard Python scalar and return it.
`Tensor.ndim`	Number of tensor dimensions.
`Tensor.null_grad`(*[, _clear_view_info])	Sets this tensor's gradient to be `None`.
`Tensor.null_gradients`([clear_graph])	Deprecated: Tensors will automatically have their computational graphs cleared during backprop.
`Tensor.shape`	Tuple of tensor dimension-sizes.
`Tensor.size`	Number of elements in the tensor.
`Tensor.T`	Same as `self.transpose()`, except that `self` is returned if `self.ndim < 2` and a view of the underlying data is utilized whenever possible.