MyGrad#

MyGrad is a lightweight library that adds automatic differentiation to NumPy – its only dependency is NumPy. Simply “drop in” a MyGrad tensor into your NumPy-based code, and start differentiating!

>>> import mygrad as mg
>>> import numpy as np

>>> x = mg.tensor([1., 2., 3.])  # like numpy.array, but supports backprop
>>> f = np.sum(x * x)  # tensors can be passed directly to native numpy functions!
>>> f.backward() # triggers automatic differentiation
>>> x.grad  # stores [df/dx0, df/dx1, df/dx2]
array([2., 4., 6.])

MyGrad’s primary goal is to make automatic differentiation accessible and easy to use across the Python/NumPy ecosystem. As such, it strives to behave and feel exactly like NumPy so that users need not learn yet another array-based math library.

Of the various modes and flavors of auto-diff, MyGrad currently only supports back-propagation from a scalar quantity.

“Drop in” automatic differentiation?#

What we mean by drop in automatic differentiation is that you can take a third party function, which is written in NumPy, and pass MyGrad tensors as its inputs – this will coerce it into using MyGrad functions internally so that we can differentiate the function.

What we mean by drop in autodiff#

from third_party_lib import some_numpy_func

import mygrad as mg

arr1 = mg.tensor(...) # some MyGrad Tensor (instead of a NumPy array)
arr2 = mg.tensor(...) # some MyGrad Tensor (instead of a NumPy array)

output = some_numpy_func(arr1, arr2)  # "drop in" the MyGrad tensors

output.backward()  # output is a MyGrad tensor, not a NumPy array!

arr1.grad  # stores d(some_numpy_func) / d(arr1)
arr2.grad  # stores d(some_numpy_func) / d(arr2)

MyGrad aims for parity with NumPy’s major features#

NumPy’s ufuncs are richly supported. We can even differentiate through an operation that occur in-place on a tensor and applies a boolean mask to the results:

>>> x = mg.tensor([1., 2., 3.])
>>> y = mg.zeros_like(x)
>>> np.multiply(x, x, where=[True, False, True], out=y)
>>> y.backward()
>>> x.grad
array([2., 0., 6.])

NumPy’s view semantics are also mirrored to a high fidelity: performing basic indexing and similar operations on tensors will produce a “view” of that tensor’s data, thus a tensor and its view share memory. This relationship will also manifest between the derivatives stored by a tensor and its views!

>>> x = mg.arange(9.).reshape(3, 3)
>>> diag_view = np.einsum("ii->i", x)  # returns a view of the diagonal elements of `x`
>>> x, diag_view
(Tensor([[0., 1., 2.],
[3., 4., 5.],
[6., 7., 8.]]),
Tensor([0., 4., 8.]))

# views share memory
>>> np.shares_memory(x, diag_view)
True

# mutating a view affects its base (and all other views)
>>> diag_view *= -1  # mutates x in-place
>>> x
Tensor([[-0.,  1.,  2.],
        [ 3., -4.,  5.],
        [ 6.,  7., -8.]])

>>> (x ** 2).backward()
>>> x.grad, diag_view.grad
(array([[ -0.,   2.,   4.],
        [  6.,  -8.,  10.],
        [ 12.,  14., -16.]]),
 array([ -0.,  -8., -16.]))

# the gradients have the same view relationship!
>>> np.shares_memory(x.grad, diag_view.grad)
True

Basic and advanced indexing is fully supported

>>> (x[x < 4] ** 2).backward()
>>> x.grad
array([[0., 2., 4.],
       [6., 0., 0.],
       [0., 0., 0.]])

NumPy arrays and other array-likes play nicely with MyGrad’s tensor. These behave like constants during automatic differentiation

>>> x = mg.tensor([1., 2., 3.])
>>> constant = [-1., 0., 10]  # can be a numpy array, list, or any other array-like
>>> (x * constant).backward()  # all array-likes are treated as constants
>>> x.grad
array([-1.,  0., 10.])

What About JAX?#

Doesn’t JAX already provide drop in automatic differentiation? Not quite; JAX provides swap-out automatic differentiation: you must swap out the version of NumPy you are using before you write your code. Thus you cannot simply differentiate some third party function by passing it a JAX array.

“Is MyGrad a competitor to JAX? Should I stop using JAX and start using MyGrad?”

Goodness gracious, no! MyGrad is not meant to compete with the likes of JAX, which offers far more functionality in the way of computing higher-order derivatives, Jacobian vector projects, in terms of providing a jit… this list goes on. MyGrad is meant to be a simple and highly accessible way to provide basic automatic differentiation capabilities to the NumPy ecosystem. Anyone who knows how to use NumPy can very easily learn to use MyGrad. It is especially great for teaching. But once your auto-diff needs extend beyond derivatives of scalars, it is time to graduate to JAX.

Contents: