loss_fns #

Loss functions.

Classes:

Name Description
BCELoss
BCEWithLogitsLoss
BCEWithLogitsLossVR

Binary Cross Entropy Loss with reduced variance for models with stochastic parameters.

CrossEntropyLoss
CrossEntropyLossVR

Cross Entropy Loss with reduced variance for models with stochastic parameters.

FocalLoss

The focal loss rescales the cross entropy loss with a factor that induces a regularizer on the output class probabilities.

L1Loss
MSELoss
MSELossVR

Mean-Squared Error Loss with reduced variance for models with stochastic parameters.

NLLLoss
VariationalFreeEnergy

Variational Free Energy Loss.

Attributes:

Name Type Description
NegativeELBO

NegativeELBO #

NegativeELBO = VariationalFreeEnergy

BCELoss #

Bases: MultipleBatchDimensionsLossMixin, BCELoss

Methods:

Name Description
forward

forward #

forward(pred: Tensor, target: Tensor) -> Tensor

BCEWithLogitsLoss #

BCEWithLogitsLoss(
    weight: Tensor | None = None,
    reduction: Literal["mean", "sum", "none"] = "mean",
    pos_weight: Tensor | None = None,
)

Bases: MultipleBatchDimensionsLossMixin, BCEWithLogitsLoss

Methods:

Name Description
forward

forward #

forward(pred: Tensor, target: Tensor) -> Tensor

BCEWithLogitsLossVR #

BCEWithLogitsLossVR(
    reduction: Literal["none", "sum", "mean"] = "mean",
)

Bases: _Loss

Binary Cross Entropy Loss with reduced variance for models with stochastic parameters.

The loss on a single datapoint is given by

\[ \begin{align*} \ell_n &= \mathbb{E}_{w}[-\log p(y_n \mid f_w(x_n))]\\ &= -\mathbb{E}_{w}[y_n \log \sigma(f_w(x_n)) + (1 - y_n) \log \sigma(-f_w(x_n))]\\ &\leq \mathbb{E}_{w_{1:L-1}}\big[y_n \log(1+\mathbb{E}_{w_L \mid w_{1:L-1}}[\exp(-f_w(x_n))])\\ &\qquad+ (1 - y_n) \log(1+\mathbb{E}_{w_L \mid w_{1:L-1}}[\exp(f_w(x_n))])\big] \end{align*} \]

which for a linear Gaussian output layer equals

\[ \begin{align*} \ell_n &\leq \mathbb{E}_{w_{1:L-1}}\big[ y_n \log\big(1 + \exp\big(\mathbb{E}_{w_L \mid w_{1:L-1}}[-f_w(x_n)] + \frac{1}{2}\operatorname{Var}_{w_L \mid w_{1:L-1}}[f_w(x_n)]\big)\big) \\ &\qquad+ (1-y_n) \log\big(1 + \exp\big(\mathbb{E}_{w_L \mid w_{1:L-1}}[f_w(x_n)] + \frac{1}{2}\operatorname{Var}_{w_L \mid w_{1:L-1}}[f_w(x_n)]\big)\big)\big]. \end{align*} \]

This defines an upper bound on the expected binary cross entropy loss. For models with stochastic parameters, this loss trades bias for lower variance compared to inferno.loss_fns.BCEWithLogitsLoss, which directly computes a Monte-Carlo approximation of the expected loss.

The reduction is applied over all sample and batch dimensions.
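
The closed-form bound for a linear Gaussian output layer can be illustrated with a short sketch. The helper below only illustrates the formula above and is not the inferno implementation; the names mean_logit and var_logit are assumptions for the conditional mean and variance of the logit.

import torch
import torch.nn.functional as F

def bce_with_logits_vr_bound(
    mean_logit: torch.Tensor,  # E_{w_L | w_{1:L-1}}[f_w(x_n)], shape (*sample, batch)
    var_logit: torch.Tensor,   # Var_{w_L | w_{1:L-1}}[f_w(x_n)], shape (*sample, batch)
    target: torch.Tensor,      # binary targets in {0, 1}, broadcastable to the logits
) -> torch.Tensor:
    # softplus(x) = log(1 + exp(x)) gives the two terms of the upper bound.
    pos_term = F.softplus(-mean_logit + 0.5 * var_logit)
    neg_term = F.softplus(mean_logit + 0.5 * var_logit)
    loss = target * pos_term + (1.0 - target) * neg_term
    return loss.mean()  # 'mean' reduction over all sample and batch dimensions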

Parameters:

Name Type Description Default
reduction Literal['none', 'sum', 'mean']

Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the weighted mean of the output is taken, 'sum': the output will be summed.

'mean'

Methods:

Name Description
forward

Runs the forward pass.

forward #

forward(
    input_representation: Float[
        Tensor, "*sample batch *feature"
    ],
    output_layer: BNNMixin,
    target: Float[Tensor, "batch *out_feature"],
)

Runs the forward pass.

Parameters:

Name Type Description Default
input_representation Float[Tensor, '*sample batch *feature']

(Penultimate layer) representation of input tensor. This is the representation produced by a forward pass through all hidden layers, which will be fed as inputs to the output layer in a forward pass.

required
output_layer BNNMixin

Output layer of the model.

required
target Float[Tensor, 'batch *out_feature']

Target tensor.

required

CrossEntropyLoss #

Bases: MultipleBatchDimensionsLossMixin, CrossEntropyLoss

Methods:

Name Description
forward

forward #

forward(pred: Tensor, target: Tensor) -> Tensor

CrossEntropyLossVR #

CrossEntropyLossVR(
    reduction: Literal["none", "sum", "mean"] = "mean",
)

Bases: _Loss

Cross Entropy Loss with reduced variance for models with stochastic parameters.

The loss on a single datapoint is given by

\[ \begin{align*} \ell_n &= \mathbb{E}_{w}[-\log p(y_n \mid f_w(x_n))]\\ &= \mathbb{E}_{w}[-\log \operatorname{softmax}(f_w(x_n))_{y_n}]\\ &\leq \mathbb{E}_{w_{1:L-1}}\bigg[\mathbb{E}_{w_L \mid w_{1:L-1}}[-f_w(x_n)_{y_n}] + \log \sum_{c=1}^C \mathbb{E}_{w_L \mid w_{1:L-1}}[\exp(f_w(x_n)_{c})]\bigg], \end{align*} \]

which for a linear Gaussian output layer equals

\[ \begin{equation*} \ell_n \leq \mathbb{E}_{w_{1:L-1}}\bigg[\mathbb{E}_{w_L \mid w_{1:L-1}}[-f_w(x_n)_{y_n}] + \operatorname{logsumexp}\big(\mathbb{E}_{w_L \mid w_{1:L-1}}[f_w(x_n)_{c}] + \frac{1}{2}\operatorname{Var}_{w_L \mid w_{1:L-1}}[f_w(x_n)_c]\big)\bigg]. \end{equation*} \]

This defines an upper bound on the expected cross entropy loss. For models with stochastic parameters, this loss trades bias for lower variance compared to inferno.loss_fns.CrossEntropyLoss, which directly computes a Monte-Carlo approximation of the expected loss.

The reduction is applied over all sample and batch dimensions.
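
As an illustration only (not the inferno implementation), the bound for a linear Gaussian output layer can be written in a few lines of PyTorch, assuming the conditional mean and variance of the class logits (mean_logits, var_logits) are given:

import torch

def cross_entropy_vr_bound(
    mean_logits: torch.Tensor,  # E_{w_L | w_{1:L-1}}[f_w(x_n)_c], shape (batch, classes)
    var_logits: torch.Tensor,   # Var_{w_L | w_{1:L-1}}[f_w(x_n)_c], shape (batch, classes)
    target: torch.Tensor,       # integer class labels, shape (batch,)
) -> torch.Tensor:
    nll_term = -mean_logits.gather(-1, target.unsqueeze(-1)).squeeze(-1)  # -E[f_{y_n}]
    lse_term = torch.logsumexp(mean_logits + 0.5 * var_logits, dim=-1)    # logsumexp_c(E[f_c] + Var[f_c]/2)
    return (nll_term + lse_term).mean()  # 'mean' reduction over the batch dimension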

Parameters:

Name Type Description Default
reduction Literal['none', 'sum', 'mean']

Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the weighted mean of the output is taken, 'sum': the output will be summed.

'mean'

Methods:

Name Description
forward

Runs the forward pass.

forward #

forward(
    input_representation: Float[
        Tensor, "*sample batch *feature"
    ],
    output_layer: BNNMixin,
    target: Float[Tensor, "batch *out_feature"],
)

Runs the forward pass.

Parameters:

Name Type Description Default
input_representation Float[Tensor, '*sample batch *feature']

(Penultimate layer) representation of input tensor. This is the representation produced by a forward pass through all hidden layers, which will be fed as inputs to the output layer in a forward pass.

required
output_layer BNNMixin

Output layer of the model.

required
target Float[Tensor, 'batch *out_feature']

Target tensor.

required

FocalLoss #

FocalLoss(
    task: Literal["binary", "multiclass"],
    gamma: float = 2.0,
    num_classes: int | None = None,
    weight: Tensor | None = None,
    reduction: Literal["none", "sum", "mean"] = "mean",
)

Bases: _WeightedLoss

The focal loss rescales the cross entropy loss with a factor that induces a regularizer on the output class probabilities.

The focal loss is useful to address class imbalance (Lin et al. 2017) and to improve calibration (Mukhoti et al. 2020). The loss on a single datapoint is given by

\[ \begin{equation*} \ell_n = -(1-\hat{p}_{y_n})^\gamma\log \hat{p}_{y_n}. \end{equation*} \]

For \(\gamma=1\) the focal loss equals the cross entropy loss with an entropic regularizer on the predicted class probabilities.
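
A minimal multiclass sketch of this formula (illustrative only, not the inferno implementation):

import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, target: torch.Tensor, gamma: float = 2.0) -> torch.Tensor:
    ce = F.cross_entropy(logits, target, reduction="none")  # -log p_hat_{y_n}
    p_hat = torch.exp(-ce)                                  # p_hat_{y_n}
    return ((1.0 - p_hat) ** gamma * ce).mean()             # modulated cross entropy, 'mean' reduction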

Parameters:

Name Type Description Default
task Literal['binary', 'multiclass']

Specifies the type of task: 'binary' or 'multiclass'.

required
gamma float

Focusing parameter, controls the strength of the modulating factor \((1-\hat{p}_{y_n})^\gamma\).

2.0
num_classes int | None

Number of classes (only required for multi-class classification)

None
weight Tensor | None

A manual rescaling weight given to each class. If given, has to be a Tensor of size C.

None
reduction Literal['none', 'sum', 'mean']

Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the weighted mean of the output is taken, 'sum': the output will be summed.

'mean'

Methods:

Name Description
forward

Attributes:

Name Type Description
ce_loss_fn
gamma
num_classes
task

ce_loss_fn #

ce_loss_fn = BCEWithLogitsLoss(
    weight=weight, reduction="none"
)

gamma #

gamma = gamma

num_classes #

num_classes = num_classes

task #

task = task

forward #

forward(pred: Tensor, target: Tensor)

L1Loss #

Bases: MultipleBatchDimensionsLossMixin, L1Loss

Methods:

Name Description
forward

forward #

forward(pred: Tensor, target: Tensor) -> Tensor

MSELoss #

Bases: MultipleBatchDimensionsLossMixin, MSELoss

Methods:

Name Description
forward

forward #

forward(pred: Tensor, target: Tensor) -> Tensor

MSELossVR #

MSELossVR(
    reduction: Literal["none", "sum", "mean"] = "mean",
)

Bases: _Loss

Mean-Squared Error Loss with reduced variance for models with stochastic parameters.

The loss on a single datapoint is given by

\[ \begin{align*} \ell_n &= \mathbb{E}_{w}[(f_w(x_n) - y_n)^2]\\ &= \mathbb{E}_{w_{1:L-1}}\big[\mathbb{E}_{w_L \mid w_{1:L-1}}[(f_w(x_n) - y_n)^2]\big]\\ &= \mathbb{E}_{w_{1:L-1}}\big[(\mathbb{E}_{w_L \mid w_{1:L-1}}[f_w(x_n)] - y_n)^2 + \operatorname{Var}_{w_L \mid w_{1:L-1}}[f_w(x_n)]\big]. \end{align*} \]

For models with stochastic parameters, the conditional Monte-Carlo estimate results in variance reduction compared to inferno.loss_fns.MSELoss, which directly computes a Monte-Carlo approximation of the expected loss.

The reduction is applied over all sample and batch dimensions.
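
The closed-form inner expectation can be sketched as follows, assuming the conditional mean and variance of the prediction (mean_pred, var_pred) are given; this is an illustration of the formula, not the inferno API:

import torch

def mse_vr(mean_pred: torch.Tensor, var_pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # (E[f_w(x_n)] - y_n)^2 + Var[f_w(x_n)] replaces the inner Monte-Carlo estimate.
    loss = (mean_pred - target) ** 2 + var_pred
    return loss.mean()  # 'mean' reduction over all sample and batch dimensions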

Parameters:

Name Type Description Default
reduction Literal['none', 'sum', 'mean']

Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the weighted mean of the output is taken, 'sum': the output will be summed.

'mean'

Methods:

Name Description
forward

Runs the forward pass.

forward #

forward(
    input_representation: Float[
        Tensor, "*sample batch *feature"
    ],
    output_layer: BNNMixin,
    target: Float[Tensor, "batch *out_feature"],
)

Runs the forward pass.

Parameters:

Name Type Description Default
input_representation Float[Tensor, '*sample batch *feature']

(Penultimate layer) representation of input tensor. This is the representation produced by a forward pass through all hidden layers, which will be fed as inputs to the output layer in a forward pass.

required
output_layer BNNMixin

Output layer of the model.

required
target Float[Tensor, 'batch *out_feature']

Target tensor.

required

NLLLoss #

Bases: MultipleBatchDimensionsLossMixin, NLLLoss

Methods:

Name Description
forward

forward #

forward(pred: Tensor, target: Tensor) -> Tensor

VariationalFreeEnergy #

VariationalFreeEnergy(
    nll: _Loss,
    model: BNNMixin,
    prior_loc: Float[Tensor, "parameter"] | None = None,
    prior_scale: Float[Tensor, "parameter"] | None = None,
    kl_weight: float | None = 1.0,
    reduction: str = "mean",
)

Bases: Module

Variational Free Energy Loss.

Computes the variational free energy loss for variational inference with the Kullback-Leibler regularization term computed in weight space. This is also known as the negative evidence lower bound (ELBO).
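
Conceptually, the loss is the expected negative log-likelihood plus a weighted weight-space KL term. The sketch below shows the KL term for a diagonal Gaussian variational posterior and prior; the function and argument names are illustrative assumptions, not the inferno API:

import torch

def gaussian_kl(loc, scale, prior_loc, prior_scale):
    # KL( N(loc, scale^2) || N(prior_loc, prior_scale^2) ), summed over parameters
    var_ratio = (scale / prior_scale) ** 2
    sq_mean_diff = ((loc - prior_loc) / prior_scale) ** 2
    return 0.5 * (var_ratio + sq_mean_diff - 1.0 - torch.log(var_ratio)).sum()

# variational free energy = expected NLL + kl_weight * KL(q(w) || p(w))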

Parameters:

Name Type Description Default
nll _Loss

Loss function defining the negative log-likelihood.

required
model BNNMixin

The probabilistic model.

required
prior_loc Float[Tensor, 'parameter'] | None

Location(s) of the prior Gaussian distribution.

None
prior_scale Float[Tensor, 'parameter'] | None

Scale(s) of the prior Gaussian distribution.

None
kl_weight float | None

Weight for the KL divergence term. If None, chooses the weight inversely proportional to the number of mean parameters.

1.0
reduction str

Specifies the reduction to apply to the output: 'mean' | 'sum'. 'mean': the weighted mean of the output is taken, 'sum': the output will be summed.

'mean'

Methods:

Name Description
forward

Attributes:

Name Type Description
kl_weight
model
nll
numel_mean_parameters
prior_loc
prior_scale
reduction

kl_weight #

kl_weight = kl_weight

model #

model = model

nll #

nll = nll

numel_mean_parameters #

numel_mean_parameters = sum(
    param.numel()  # number of variational mean parameters (covariance parameters excluded)
    for name, param in model.named_parameters()
    if param.requires_grad
    and "params." in name
    and "cov." not in name
)

prior_loc #

prior_loc = prior_loc

prior_scale #

prior_scale = prior_scale

reduction #

reduction = reduction

forward #

forward(
    input: Float[Tensor, "*sample batch in_feature"],
    target: Float[Tensor, "batch out_feature"],
) -> Float[Tensor, ""]