mygrad.nnet.losses.focal_loss

mygrad.nnet.losses.focal_loss(class_probs: ArrayLike, targets: ArrayLike, *, alpha: float = 1, gamma: float = 0, constant: Optional[bool] = None) -> Tensor

Return the per-datum focal loss.

Parameters
class_probs : ArrayLike, shape=(N, C)

The C class probabilities for each of the N pieces of data. Each value is expected to lie in (0, 1].

targets : ArrayLike, shape=(N,)

The correct class indices, in [0, C), for each datum.

alpha : Real, optional (default=1)

The α weighting factor in the loss formulation.

gamma : Real, optional (default=0)

The γ focusing parameter; must be a non-negative value. Note that for γ=0 and α=1 this reduces to the cross-entropy loss (demonstrated in the Examples below).

constant : Optional[bool]

If True, the returned tensor is a constant (it does not back-propagate a gradient)

Returns
mygrad.Tensor, shape=(N,)

The per-datum focal loss.

Notes

This is the focal loss introduced in https://arxiv.org/abs/1708.02002; it is given by \(-\alpha(1-p)^\gamma\log(p)\).

The focal loss for datum-\(i\) is given by

\[-\alpha \hat{y}_i(1-p_i)^\gamma\log(p_i)\]

where \(\hat{y}_i\) is 1 at the index corresponding to the datum's label and 0 elsewhere. That is, if the label \(y_k\) is 2 and there are four possible label values, then \(\hat{y}_k = (0, 0, 1, 0)\).
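As a rough illustration, this formula can be evaluated directly with NumPy. The sketch below is independent of MyGrad's implementation, and the function name is hypothetical:

    import numpy as np

    def focal_loss_sketch(class_probs, targets, alpha=1.0, gamma=0.0):
        # p: the probability assigned to each datum's true class
        p = class_probs[np.arange(len(targets)), targets]
        # per-datum focal loss: -alpha * (1 - p)**gamma * log(p)
        return -alpha * (1 - p) ** gamma * np.log(p)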

The paper recommends normalizing the total loss by the number of foreground samples.
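Examples

A minimal usage sketch, with probabilities chosen purely for illustration. With γ=0 and α=1, the result should coincide with the plain per-datum cross-entropy, \(-\log(p)\), evaluated at each true class:

>>> import numpy as np
>>> from mygrad.nnet.losses import focal_loss
>>> class_probs = np.array([[0.7, 0.2, 0.1],
...                         [0.1, 0.8, 0.1]])
>>> targets = np.array([0, 1])
>>> loss = focal_loss(class_probs, targets, alpha=1, gamma=0)
>>> np.allclose(loss.data, -np.log(class_probs[np.arange(2), targets]))
True
>>> # gamma > 0 down-weights data that are already classified confidently
>>> focused = focal_loss(class_probs, targets, alpha=1, gamma=2)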