MyGrad’s Tensor

Tensor is the most critical piece of MyGrad. It is a numpy-array-like object capable of serving as a node in a computational graph that supports back-propagation of derivatives via the chain rule.

You can effectively use a Tensor as a drop-in replacement for a numpy array in all basic mathematical operations. This includes basic and advanced indexing, broadcasting, sums over axes, etc.; it simply works.

>>> import mygrad as mg  # note that we replace numpy with mygrad here
>>> x = mg.arange(9).reshape(3, 3)
>>> x
Tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])
>>> y = x[x == 4] ** 2
>>> y
Tensor([16], dtype=int32)

Thus MyGrad users can spend their time mastering numpy and their skills will transfer seamlessly when using this autograd library.

Creating a Tensor

Tensor can be passed any “array-like” object of numerical data. This includes numbers, sequences (e.g. lists), nested sequences, numpy-ndarrays, and other mygrad-tensors. MyGrad also provides familiar numpy-style tensor-creation functions (e.g. arange(), linspace(), etc.).

>>> import numpy as np
>>> import mygrad as mg
>>> mg.tensor(2.3)  # creating a 0-dimensional tensor
Tensor(2.3)
>>> mg.tensor(np.array([1.2, 3.0]))  # casting a numpy-array to a tensor
Tensor([1.2, 3. ])
>>> mg.tensor([[1, 2], [3, 4]])  # creating a 2-dimensional tensor from lists
Tensor([[1, 2],
        [3, 4]])
>>> mg.arange(4)    # using numpy-style tensor creation functions
Tensor([0, 1, 2, 3])

Integer-valued tensors are treated as constants:

>>> mg.astensor(1, dtype=np.int8).constant
True

By default, float-valued tensors are not treated as constants:

>>> mg.astensor(1, dtype=np.float32).constant
False

Forward and Back-Propagation

Let’s construct a computational graph consisting of two zero-dimensional tensors, x and y, which are used to compute an output tensor, ℒ. This is a “forward pass imperative” style of creating a computational graph: the graph is constructed as we carry out the forward-pass computation.

>>> x = mg.tensor(3.0)
>>> y = mg.tensor(2.0)
>>> ℒ = 2 * x + y ** 2

Invoking ℒ.backward() signals the computational graph to compute the total derivative of ℒ with respect to each of the variables that it depends on. That is, x.grad will store dℒ/dx and y.grad will store dℒ/dy. Thus we have back-propagated a gradient from ℒ through our graph.

Each tensor of derivatives is computed elementwise. That is, if x = Tensor([x0, x1, x2]), then dℒ/dx represents [dℒ/d(x0), dℒ/d(x1), dℒ/d(x2)].

>>> ℒ.backward()  # computes dℒ/dx and dℒ/dy
>>> x.grad  # dℒ/dx
array(2.)
>>> y.grad  # dℒ/dy
array(4.)
>>> ℒ.grad  # dℒ/dℒ
array(1.)
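To make the chain-rule bookkeeping concrete, here is a small pure-Python sketch of reverse-mode differentiation for the same expression, 2 * x + y ** 2. The Scalar class is hypothetical and written only for illustration; it is not how MyGrad is implemented, and it supports just the three operations this expression needs. The variable L stands in for ℒ.

```python
class Scalar:
    """A hypothetical, minimal stand-in for a zero-dimensional tensor."""

    def __init__(self, value, parents=()):
        self.value = float(value)
        self.grad = None
        # Each entry pairs a parent node with the local derivative
        # d(self)/d(parent), recorded during the forward pass.
        self.parents = parents

    def __add__(self, other):
        return Scalar(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __rmul__(self, const):
        # Handles `const * self`, where const is a plain Python number.
        return Scalar(const * self.value, ((self, float(const)),))

    def __pow__(self, n):
        return Scalar(self.value ** n, ((self, n * self.value ** (n - 1)),))

    def backward(self, upstream=1.0):
        # Accumulate dL/d(self), then pass upstream * local derivative
        # to each parent (the chain rule).
        self.grad = upstream if self.grad is None else self.grad + upstream
        for parent, local in self.parents:
            parent.backward(upstream * local)


x = Scalar(3.0)
y = Scalar(2.0)
L = 2 * x + y ** 2   # the graph is built while the forward pass runs

L.backward()
print(x.grad, y.grad, L.grad)  # 2.0 4.0 1.0
```

The printed values agree with the analytic derivatives dℒ/dx = 2 and dℒ/dy = 2y = 4.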

Once the gradients are computed, the computational graph containing x, y, and ℒ is cleared automatically. Additionally, involving any of these tensors in a new computational graph will automatically null their gradients.

>>> 2 * x
Tensor(6.)
>>> x.grad is None
True

Alternatively, you can use the null_grad() method to manually clear a tensor’s gradient:

>>> y.null_grad()
>>> y.grad is None
True

Accessing the Underlying NumPy Array

Tensor is a thin wrapper on numpy.ndarray. A tensor’s underlying numpy-array can be accessed via .data. This returns a direct reference to the numpy array.

>>> x = mg.tensor([1, 2])
>>> x.data
array([1, 2])
>>> import numpy as np
>>> np.asarray(x)
array([1, 2])

Producing a “View” of a Tensor

MyGrad’s tensors exhibit the same view semantics and memory-sharing relationships as NumPy arrays. I.e. any (non-scalar) tensor produced via basic indexing will share memory with its parent.

>>> x = mg.tensor([1., 2., 3., 4.])
>>> y = x[:2]  # the view: Tensor([1., 2.])
>>> y.base is x
True
>>> np.shares_memory(x, y)
True

Mutating shared data will propagate through views:

>>> y *= -1
>>> x
Tensor([-1., -2.,  3.,  4.])
>>> y
Tensor([-1., -2.])

This view relationship also holds between the tensors’ gradients:

>>> (x ** 2).backward()
>>> x.grad
array([-2., -4.,  6.,  8.])
>>> y.grad
array([-2., -4.])
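The view semantics described in this section are NumPy's own, so they can be explored with plain arrays. The sketch below contrasts basic indexing, which yields a view, with advanced indexing, which yields a copy; the names view and picked are illustrative.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])

view = x[:2]        # basic indexing (a slice) produces a view of x
picked = x[[0, 1]]  # advanced indexing (an index list) produces a copy

view *= -1          # in-place mutation writes through to x
picked *= 10        # mutating the copy leaves x untouched

print(x)                             # [-1. -2.  3.  4.]
print(view.base is x)                # True
print(np.shares_memory(x, picked))   # False
```

Only the slice shares memory with its parent, which is why the in-place update of the view shows up in x.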

Documentation for mygrad.Tensor


Tensor.astype(dtype[, casting, copy, constant])

Copy of the tensor with the specified dtype.


Tensor.backward([grad])

Trigger backpropagation and compute the derivatives of this tensor.


Tensor.base

A reference to the base tensor that the present tensor is a view of.


Tensor.clear_graph()

Removes the current tensor, and tensors above it, from their shared computational graph.


Tensor.constant

If True, this tensor is a constant; it will not propagate any gradient.

Tensor.copy(*[, constant])

Produces a copy of self with copy.creator=None.


Tensor.creator

The Operation instance that produced self.


Tensor.dtype

Data-type of the tensor’s elements.


Tensor.grad

Returns the derivative of ℒ with respect to this tensor.


Tensor.item()

Copy an element of a tensor to a standard Python scalar and return it.


Tensor.ndim

Number of tensor dimensions.

Tensor.null_grad(*[, _clear_view_info])

Sets this tensor’s gradient to be None.


Tensor.null_gradients([clear_graph])

Deprecated: Tensors will automatically have their computational graphs cleared during backprop.


Tensor.shape

Tuple of tensor dimension-sizes.


Tensor.size

Number of elements in the tensor.


Tensor.T

Same as self.transpose(), except that self is returned if self.ndim < 2 and a view of the underlying data is utilized whenever possible.