mygrad.linalg.norm(x: ArrayLike, ord: Optional[Union[int, float]] = None, axis: Optional[Union[int, Tuple[int]]] = None, keepdims: bool = False, *, nan_to_num: bool = True, constant: Optional[bool] = None) -> Tensor

Vector norm.

This function can return any one of an infinite number of vector norms (described below), depending on the value of the ord parameter.

In contrast to numpy.linalg.norm, matrix norms are not supported.

This docstring was adapted from that of numpy.linalg.norm [1].


xArrayLike

Input tensor. If axis is None, then x must be 1-D unless ord is None. If both axis and ord are None, the 2-norm of x.ravel() will be returned.

ordOptional[Union[int, float]]

Order of the norm (see table under Notes). inf means numpy’s inf object. The default is None.

axisOptional[Union[int, Tuple[int]]]

If axis is an integer, it specifies the axis of x along which to compute the vector norms. The default is None.

keepdimsbool, optional (default=False)

If this is set to True, the axes which are normed over are left in the result as dimensions with size one. With this option the result will broadcast correctly against the original x.

nan_to_numbool, optional (default=True)

If True then gradients that would store nans due to the presence of zeros in x will instead store zeros in those places.


constantOptional[bool]

If True, the returned tensor is treated as a constant, and thus does not facilitate back propagation (i.e. constant.grad will always return None).

Defaults to False for float-type data. Defaults to True for integer-type data.

Integer-type tensors must be constant.


Norm(s) of the vector(s).


For values of ord < 1, the result is, strictly speaking, not a mathematical ‘norm’, but it may still be useful for various numerical purposes.

The following norms can be calculated:


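ord    norm for vectors
-----  --------------------------
inf    max(abs(x))
-inf   min(abs(x))
0      sum(x != 0)
1      as below
-1     as below
2      as below
-2     as below
other  sum(abs(x)**ord)**(1./ord)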


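The entries in the table of norms can be written out in plain Python. The helper below, vector_norm, is a hypothetical reference function for a single 1-D sequence of numbers; it is not part of mygrad and only illustrates the formulas, with no gradient tracking or axis handling.

```python
import math


def vector_norm(x, ord=2):
    """Illustrative p-"norm" of a flat sequence, following the table of norms.

    The parameter is named `ord` to mirror the documented signature; it
    shadows the Python builtin of the same name inside this function only.
    """
    mags = [abs(v) for v in x]
    if ord == math.inf:
        return max(mags)                      # largest magnitude
    if ord == -math.inf:
        return min(mags)                      # smallest magnitude
    if ord == 0:
        return float(sum(v != 0 for v in x))  # count of nonzero entries
    # General case: sum(abs(x)**ord)**(1./ord).
    # Note: a negative ord with a zero entry raises ZeroDivisionError here.
    return sum(m ** ord for m in mags) ** (1.0 / ord)


row = [1.0, 2.0, 3.0]
l1 = vector_norm(row, 1)            # sum of magnitudes
l2 = vector_norm(row, 2)            # Euclidean length
l_inf = vector_norm(row, math.inf)  # largest magnitude
```

For ord < 1 the general formula still evaluates, but the result does not satisfy the triangle inequality, which is why it is not a norm in the strict sense.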

The Frobenius norm is given by [1]:

\(\|A\|_F = \left[\sum_{i,j} |a_{i,j}|^2\right]^{1/2}\)

The nuclear norm is the sum of the singular values.

Both the Frobenius and nuclear norm orders are only defined for matrices and raise a ValueError when x.ndim != 2.



Retrieved from:


G. H. Golub and C. F. Van Loan, Matrix Computations, Baltimore, MD, Johns Hopkins University Press, 1985, pg. 15


>>> import mygrad as mg
>>> x = mg.tensor([[1.0, 2.0, 3.0],
...                [1.0, 0.0, 0.0]])
>>> l2_norms = mg.linalg.norm(x, axis=1, ord=2)
>>> l2_norms
Tensor([3.74165739, 1.        ])

The presence of the elementwise absolute values in the norm operation means that zero-valued entries in any of the input vectors have an undefined derivative. When nan_to_num=False is specified, these derivatives are reported as nan; otherwise they are set to 0.0.

>>> l2_norms = mg.linalg.norm(x, axis=1, ord=2, nan_to_num=False)
>>> l2_norms.backward()
>>> x.grad
array([[0.26726124, 0.53452248, 0.80178373],
       [1.        ,        nan,        nan]])

This is rigorously true, but is often not the desired behavior in autodiff applications. Rather, it can be preferable to use 0.0 to fill these undefined derivatives. This is the default behavior (nan_to_num=True), which applies when nan_to_num is not specified.

>>> l2_norms = mg.linalg.norm(x, axis=1, ord=2, nan_to_num=True)  # default setting: `nan_to_num=True`
>>> l2_norms.backward()
>>> x.grad
array([[0.26726124, 0.53452248, 0.80178373],
       [1.        , 0.        , 0.        ]])
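
The difference between the two gradient arrays above amounts to replacing each nan with 0.0. The sketch below reproduces that replacement on a plain NumPy array with numpy.nan_to_num. This only illustrates the end result; it is an assumption, not a statement of how mygrad implements nan_to_num internally.

```python
import numpy as np

# Gradient as reported with nan_to_num=False (values copied from the example above)
grad_with_nans = np.array([[0.26726124, 0.53452248, 0.80178373],
                           [1.0, np.nan, np.nan]])

# Replace every nan with 0.0; finite entries are left unchanged
grad_filled = np.nan_to_num(grad_with_nans)
```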

L1 norms along each of the three columns:

>>> mg.linalg.norm(x, axis=0, ord=1)
Tensor([2., 2., 3.])
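
As a further illustration of keepdims, the sketch below rescales each row to unit length. It uses numpy.linalg.norm, from whose docstring this one was adapted, on the assumption that mygrad.linalg.norm handles keepdims and broadcasting in the same way. Substituting mg.linalg.norm and a mygrad tensor is expected to behave analogously, but that has not been verified here.

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0],
              [1.0, 0.0, 0.0]])

# keepdims=True leaves the normed axis in place with size one: shape (2, 1)
row_norms = np.linalg.norm(x, ord=2, axis=1, keepdims=True)

# (2, 3) / (2, 1) broadcasts row-wise, so each row is divided by its own norm
unit_rows = x / row_norms
```

Without keepdims=True the norms would have shape (2,), which does not broadcast against the (2, 3) array along the rows.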