mygrad.matmul

class mygrad.matmul(x1: ArrayLike, x2: ArrayLike, out: Optional[Union[Tensor, ndarray]] = None, *, dtype: DTypeLikeReals = None, constant: Optional[bool] = None)

Matrix product of two tensors:

matmul(x, y) is equivalent to x @ y.

This documentation was adapted from numpy.matmul

The behavior depends on the arguments in the following way.

If both arguments are 2-D they are multiplied like conventional matrices.
If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.
If the first argument is 1-D, it is promoted to a matrix by prepending a 1 to its dimensions. After matrix multiplication the prepended 1 is removed.
If the second argument is 1-D, it is promoted to a matrix by appending a 1 to its dimensions. After matrix multiplication the appended 1 is removed.

Multiplication by a scalar is not allowed; use * instead. Note that multiplying a stack of matrices with a vector will result in a stack of vectors, but matmul will not recognize it as such.

matmul differs from numpy.dot in two important ways.

Multiplication by scalars is not allowed.
Stacks of matrices are broadcast together as if the matrices were elements.
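The second difference is easiest to see from output shapes. This sketch uses NumPy arrays directly, on the assumption stated above that mygrad.matmul follows numpy.matmul's shape semantics; the array names are illustrative.

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

# matmul pairs a[i] with b[i]: the leading "stack" axis is broadcast
matmul_shape = np.matmul(a, b).shape

# dot contracts the last axis of a with the second-to-last axis of b
# for every combination of the remaining axes
dot_shape = np.dot(a, b).shape

print(matmul_shape, dot_shape)
```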

Parameters

x1 : ArrayLike

x2 : ArrayLike

constant : Optional[bool]

If True, the returned tensor is treated as a constant, and thus does not facilitate backpropagation (i.e. its grad attribute will always be None).

Defaults to False for float-type data. Defaults to True for integer-type data.

Integer-type tensors must be constant.

dtype : Optional[DTypeLikeReals]

The dtype of the resulting tensor.

out : Optional[Union[ndarray, Tensor]]

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated tensor is returned.

Returns

output : mygrad.Tensor

The matrix product of x1 and x2.

Raises

ValueError

If:

The last dimension of x1 is not the same size as the second-to-last dimension of x2.
A scalar value is passed.

See also

einsum: Einstein summation convention.

Notes

The matmul function implements the semantics of the @ operator, introduced in Python 3.5 following PEP 465.

Examples

For two 2D tensors, matmul(a, b) is the matrix product \(\sum_{j}{A_{ij} B_{jk}} = F_{ik}\):

>>> import mygrad as mg
>>> a = [[1, 0], [0, 1]]
>>> b = [[4, 1], [2, 2]]
>>> mg.matmul(a, b)
Tensor([[4, 1],
        [2, 2]])

For 2-D mixed with 1-D, the result is the matrix-vector product, \(\sum_{j}{A_{ij} B_{j}} = F_{i}\):

>>> a = [[1, 0], [0, 1]]
>>> b = [1, 2]
>>> mg.matmul(a, b)
Tensor([1, 2])

Broadcasting is conventional for stacks of arrays. Here a is treated like a stack of three 5x6 matrices, and the 6x4 matrix b is broadcast matrix-multiplied against each one. This produces a shape-(3, 5, 4) tensor as a result.

>>> a = mg.arange(3*5*6).reshape((3,5,6))
>>> b = mg.arange(6*4).reshape((6,4))
>>> mg.matmul(a,b).shape
(3, 5, 4)

Scalar multiplication raises an error.

>>> mg.matmul(a, 3)
Traceback (most recent call last):
...
ValueError: Scalar operands are not allowed, use '*' instead

Attributes

identity

Methods

`accumulate`([axis, dtype, out, constant])	Not implemented
`at`(indices[, b, constant])	Not implemented
`outer`(b, *[, dtype, out])	Not implemented
`reduce`([axis, dtype, out, keepdims, ...])	Not implemented
`reduceat`(indices[, axis, dtype, out])	Not implemented

__init__(*args, **kwargs)

Methods

`__init__`(*args, **kwargs)

Attributes

`identity`
`nargs`
`nin`
`nout`
`ntypes`
`signature`
`types`