mygrad.matmul

class mygrad.matmul(x1: ArrayLike, x2: ArrayLike, out: Optional[Union[Tensor, ndarray]] = None, *, dtype: DTypeLikeReals = None, constant: Optional[bool] = None)

Matrix product of two tensors:

matmul(x, y) is equivalent to x @ y.

This documentation was adapted from numpy.matmul.

The behavior depends on the arguments in the following way.

  • If both arguments are 2-D they are multiplied like conventional matrices.

  • If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.

  • If the first argument is 1-D, it is promoted to a matrix by prepending a 1 to its dimensions. After matrix multiplication the prepended 1 is removed.

  • If the second argument is 1-D, it is promoted to a matrix by appending a 1 to its dimensions. After matrix multiplication the appended 1 is removed.
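The four rules above can be sketched as a small shape calculator. This is only an illustration: `matmul_shape` is a hypothetical helper, not part of mygrad, and it works on shape tuples alone rather than on real tensors.

```python
from itertools import zip_longest


def matmul_shape(shape1, shape2):
    """Return the result shape of a matmul between operands of the given shapes.

    Hypothetical helper for illustration only; it is not part of mygrad.
    """
    if len(shape1) == 0 or len(shape2) == 0:
        raise ValueError("Scalar operands are not allowed, use '*' instead")

    # Promote 1-D operands: prepend a 1 to the first, append a 1 to the second.
    first_was_1d = len(shape1) == 1
    second_was_1d = len(shape2) == 1
    s1 = (1,) + tuple(shape1) if first_was_1d else tuple(shape1)
    s2 = tuple(shape2) + (1,) if second_was_1d else tuple(shape2)

    # The contracted dimensions must agree.
    if s1[-1] != s2[-2]:
        raise ValueError(f"mismatched core dimensions: {s1[-1]} vs {s2[-2]}")

    # Broadcast the leading "stack" dimensions, aligning them from the right.
    batch = []
    for d1, d2 in zip_longest(reversed(s1[:-2]), reversed(s2[:-2]), fillvalue=1):
        if d1 != d2 and 1 not in (d1, d2):
            raise ValueError(f"stack dimensions do not broadcast: {d1} vs {d2}")
        batch.append(d2 if d1 == 1 else d1)
    batch.reverse()

    # Keep only the core dimensions that were not added by promotion.
    core = []
    if not first_was_1d:
        core.append(s1[-2])
    if not second_was_1d:
        core.append(s2[-1])
    return tuple(batch) + tuple(core)


print(matmul_shape((2, 3), (3, 4)))     # 2-D with 2-D
print(matmul_shape((3, 5, 6), (6, 4)))  # stack of matrices with a matrix
print(matmul_shape((6,), (6, 4)))       # 1-D first argument
print(matmul_shape((5, 6), (6,)))       # 1-D second argument
```

Under these rules, a stack of matrices multiplied by a 1-D operand, such as shapes (3, 5, 6) and (6,), yields shape (3, 5).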

Multiplication by a scalar is not allowed; use * instead. Note that multiplying a stack of matrices with a vector will result in a stack of vectors, but matmul will not recognize it as such.

matmul differs from numpy.dot in two important ways.

  • Multiplication by scalars is not allowed.

  • Stacks of matrices are broadcast together as if the matrices were elements.
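The two differences can be seen by comparing result shapes. The sketch below uses NumPy directly, since numpy.dot is the point of comparison; it assumes mygrad.matmul follows the shape semantics of numpy.matmul, from which this documentation was adapted.

```python
import numpy as np

a = np.ones((3, 5, 6))
b = np.ones((3, 6, 4))

# matmul pairs up the matrices in the two stacks, one product per stack index.
print(np.matmul(a, b).shape)  # (3, 5, 4)

# dot combines every matrix in `a` with every matrix in `b`.
print(np.dot(a, b).shape)  # (3, 5, 3, 4)

# dot accepts a scalar operand, whereas matmul rejects it.
print(np.dot(a, 2.0).shape)  # (3, 5, 6)
try:
    np.matmul(a, 2.0)
except ValueError as err:
    print("ValueError:", err)
```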

Parameters
x1 : ArrayLike
x2 : ArrayLike
constant : Optional[bool]

If True, the resulting tensor is treated as a constant, and thus does not facilitate back propagation (i.e. its .grad attribute will always return None).

Defaults to False for float-type data. Defaults to True for integer-type data.

Integer-type tensors must be constant.

dtype : Optional[DTypeLikeReals]

The dtype of the resulting tensor.

out : Optional[Union[ndarray, Tensor]]

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated tensor is returned.

Returns
output : mygrad.Tensor

Returns the matrix product of x1 and x2.

Raises
ValueError
If:
  • The last dimension of x1 is not the same size as the second-to-last dimension of x2.

  • A scalar value is passed.

See also

einsum

Einstein summation convention.

Notes

The matmul function implements the semantics of the @ operator introduced in Python 3.5 following PEP 465.

Examples

For two 2D tensors, matmul(a, b) is the matrix product \(\sum_{j}{A_{ij} B_{jk}} = F_{ik}\):

>>> import mygrad as mg
>>> a = [[1, 0], [0, 1]]
>>> b = [[4, 1], [2, 2]]
>>> mg.matmul(a, b)
Tensor([[4, 1],
        [2, 2]])

For 2-D mixed with 1-D, the result is the matrix-vector product, \(\sum_{j}{A_{ij} B_{j}} = F_{i}\):

>>> a = [[1, 0], [0, 1]]
>>> b = [1, 2]
>>> mg.matmul(a, b)
Tensor([1, 2])

Broadcasting is conventional for stacks of arrays. Here a is treated like a stack of three 5x6 matrices, and the 6x4 matrix b is broadcast matrix-multiplied against each one. This produces a shape-(3, 5, 4) tensor as a result.

>>> a = mg.arange(3*5*6).reshape((3,5,6))
>>> b = mg.arange(6*4).reshape((6,4))
>>> mg.matmul(a,b).shape
(3, 5, 4)

Scalar multiplication raises an error.

>>> mg.matmul(a, 3)
Traceback (most recent call last):
...
ValueError: Scalar operands are not allowed, use '*' instead

Attributes

identity

Methods

accumulate([axis, dtype, out, constant])

Not implemented

at(indices[, b, constant])

Not implemented

outer(b, *[, dtype, out])

Not implemented

reduce([axis, dtype, out, keepdims, ...])

Not implemented

reduceat(indices[, axis, dtype, out])

Not implemented


Attributes

identity

nargs

nin

nout

ntypes

signature

types