loss_fns #

Loss functions.

Classes:

Name Description
BCELoss
BCEWithLogitsLoss
BCEWithLogitsLossVR

Binary Cross Entropy Loss with reduced variance for models with stochastic parameters.

CrossEntropyLoss
CrossEntropyLossVR

Cross Entropy Loss with reduced variance for models with stochastic parameters.

FocalLoss

The focal loss rescales the cross entropy loss with a factor that induces a regularizer on the output class probabilities.

L1Loss
MSELoss
MSELossVR

Mean-Squared Error Loss with reduced variance for models with stochastic parameters.

NLLLoss
VariationalFreeEnergy

Variational Free Energy Loss.

Attributes:

Name Type Description
NegativeELBO

NegativeELBO #

NegativeELBO = VariationalFreeEnergy

BCELoss #

Bases: MultipleBatchDimensionsLossMixin, BCELoss

Methods:

Name Description
forward

forward #

forward(pred: Tensor, target: Tensor) -> Tensor

BCEWithLogitsLoss #

BCEWithLogitsLoss(
    weight: Tensor | None = None,
    reduction: Literal["mean", "sum", "none"] = "mean",
    pos_weight: Tensor | None = None,
)

Bases: MultipleBatchDimensionsLossMixin, BCEWithLogitsLoss

Methods:

Name Description
forward

forward #

forward(pred: Tensor, target: Tensor) -> Tensor

BCEWithLogitsLossVR #

BCEWithLogitsLossVR(
    reduction: Literal["none", "sum", "mean"] = "mean",
)

Bases: _Loss

Binary Cross Entropy Loss with reduced variance for models with stochastic parameters.

The loss on a single datapoint is given by

\[ \begin{align*} \ell_n &= \mathbb{E}_{w}[-\log p(y_n \mid f_w(x_n))]\\ &= -\mathbb{E}_{w}[y_n \log \sigma(f_w(x_n)) + (1 - y_n) \log \sigma(-f_w(x_n))]\\ &\leq \mathbb{E}_{w_{1:L-1}}\big[y_n \log(1+\mathbb{E}_{w_L \mid w_{1:L-1}}[\exp(-f_w(x_n))])\\ &\qquad+ (1 - y_n) \log(1+\mathbb{E}_{w_L \mid w_{1:L-1}}[\exp(f_w(x_n))])\big] \end{align*} \]

which for a linear Gaussian output layer equals

\[ \begin{align*} \ell_n &\leq \mathbb{E}_{w_{1:L-1}}\big[ y_n \log\big(1 + \exp\big(\mathbb{E}_{w_L \mid w_{1:L-1}}[-f_w(x_n)] + \frac{1}{2}\operatorname{Var}_{w_L \mid w_{1:L-1}}[f_w(x_n)]\big)\big) \\ &\qquad+ (1-y_n) \log\big(1 + \exp\big(\mathbb{E}_{w_L \mid w_{1:L-1}}[f_w(x_n)] + \frac{1}{2}\operatorname{Var}_{w_L \mid w_{1:L-1}}[f_w(x_n)]\big)\big)\big]. \end{align*} \]

This defines an upper bound on the expected binary cross entropy loss. For models with stochastic parameters, this loss trades bias for lower variance compared to inferno.loss_fns.BCEWithLogitsLoss, which directly computes a Monte-Carlo approximation of the expected loss.

The reduction is applied over all sample and batch dimensions.
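
The closed-form bound for a linear Gaussian output layer can be illustrated with a short sketch. The helper below only illustrates the formula above and is not the inferno implementation; the names mean_logit and var_logit are assumptions for the conditional mean and variance of the logit.

import torch
import torch.nn.functional as F

def bce_with_logits_vr_bound(
    mean_logit: torch.Tensor,  # E_{w_L | w_{1:L-1}}[f_w(x_n)], shape (*sample, batch)
    var_logit: torch.Tensor,   # Var_{w_L | w_{1:L-1}}[f_w(x_n)], shape (*sample, batch)
    target: torch.Tensor,      # binary targets in {0, 1}, broadcastable to the logits
) -> torch.Tensor:
    # softplus(x) = log(1 + exp(x)) gives the two terms of the upper bound.
    pos_term = F.softplus(-mean_logit + 0.5 * var_logit)
    neg_term = F.softplus(mean_logit + 0.5 * var_logit)
    loss = target * pos_term + (1.0 - target) * neg_term
    return loss.mean()  # 'mean' reduction over all sample and batch dimensions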

Parameters:

Name Type Description Default
reduction Literal['none', 'sum', 'mean']

Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the weighted mean of the output is taken, 'sum': the output will be summed.

'mean'

Methods:

Name Description
forward

Runs the forward pass.

forward #

forward(
    input_representation: Float[
        Tensor, "*sample batch *feature"
    ],
    output_layer: BNNMixin,
    target: Float[Tensor, "batch *out_feature"],
)

Runs the forward pass.

Parameters:

Name Type Description Default
input_representation Float[Tensor, '*sample batch *feature']

(Penultimate layer) representation of input tensor. This is the representation produced by a forward pass through all hidden layers, which will be fed as inputs to the output layer in a forward pass.

required
output_layer BNNMixin

Output layer of the model.

required
target Float[Tensor, 'batch *out_feature']

Target tensor.

required

CrossEntropyLoss #

Bases: MultipleBatchDimensionsLossMixin, CrossEntropyLoss

Methods:

Name Description
forward

forward #

forward(pred: Tensor, target: Tensor) -> Tensor

CrossEntropyLossVR #

CrossEntropyLossVR(
    reduction: Literal["none", "sum", "mean"] = "mean",
)

Bases: _Loss

Cross Entropy Loss with reduced variance for models with stochastic parameters.

The loss on a single datapoint is given by

\[ \begin{align*} \ell_n &= \mathbb{E}_{w}[-\log p(y_n \mid f_w(x_n))]\\ &= \mathbb{E}_{w}[-\log \operatorname{softmax}(f_w(x_n))_{y_n}]\\ &\leq \mathbb{E}_{w_{1:L-1}}\bigg[\mathbb{E}_{w_L \mid w_{1:L-1}}[-f_w(x_n)_{y_n}] + \log \sum_{c=1}^C \mathbb{E}_{w_L \mid w_{1:L-1}}[\exp(f_w(x_n)_{c})]\bigg], \end{align*} \]

which for a linear Gaussian output layer equals

\[ \begin{equation*} \ell_n \leq \mathbb{E}_{w_{1:L-1}}\bigg[\mathbb{E}_{w_L \mid w_{1:L-1}}[-f_w(x_n)_{y_n}] + \operatorname{logsumexp}\big(\mathbb{E}_{w_L \mid w_{1:L-1}}[f_w(x_n)_{c}] + \frac{1}{2}\operatorname{Var}_{w_L \mid w_{1:L-1}}[f_w(x_n)_c]\big)\bigg]. \end{equation*} \]

This defines an upper bound on the expected cross entropy loss. For models with stochastic parameters, this loss trades bias for lower variance compared to inferno.loss_fns.CrossEntropyLoss, which directly computes a Monte-Carlo approximation of the expected loss.

The reduction is applied over all sample and batch dimensions.
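
As an illustration only (not the inferno implementation), the bound for a linear Gaussian output layer can be written in a few lines of PyTorch, assuming the conditional mean and variance of the class logits (mean_logits, var_logits) are given:

import torch

def cross_entropy_vr_bound(
    mean_logits: torch.Tensor,  # E_{w_L | w_{1:L-1}}[f_w(x_n)_c], shape (batch, classes)
    var_logits: torch.Tensor,   # Var_{w_L | w_{1:L-1}}[f_w(x_n)_c], shape (batch, classes)
    target: torch.Tensor,       # integer class labels, shape (batch,)
) -> torch.Tensor:
    nll_term = -mean_logits.gather(-1, target.unsqueeze(-1)).squeeze(-1)  # -E[f_{y_n}]
    lse_term = torch.logsumexp(mean_logits + 0.5 * var_logits, dim=-1)    # logsumexp_c(E[f_c] + Var[f_c]/2)
    return (nll_term + lse_term).mean()  # 'mean' reduction over the batch dimension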

Parameters:

Name Type Description Default
reduction Literal['none', 'sum', 'mean']

Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the weighted mean of the output is taken, 'sum': the output will be summed.

'mean'

Methods:

Name Description
forward

Runs the forward pass.

forward #

forward(
    input_representation: Float[
        Tensor, "*sample batch *feature"
    ],
    output_layer: BNNMixin,
    target: Float[Tensor, "batch *out_feature"],
)

Runs the forward pass.

Parameters:

Name Type Description Default
input_representation Float[Tensor, '*sample batch *feature']

(Penultimate layer) representation of input tensor. This is the representation produced by a forward pass through all hidden layers, which will be fed as inputs to the output layer in a forward pass.

required
output_layer BNNMixin

Output layer of the model.

required
target Float[Tensor, 'batch *out_feature']

Target tensor.

required

FocalLoss #

FocalLoss(
    task: Literal["binary", "multiclass"],
    gamma: float = 2.0,
    num_classes: int | None = None,
    weight: Tensor | None = None,
    reduction: Literal["none", "sum", "mean"] = "mean",
)

Bases: _WeightedLoss

The focal loss rescales the cross entropy loss with a factor that induces a regularizer on the output class probabilities.

The focal loss is useful to address class imbalance (Lin et al. 2017) and to improve calibration (Mukhoti et al. 2020). The loss on a single datapoint is given by

\[ \begin{equation*} \ell_n = -(1-\hat{p}_{y_n})^\gamma\log \hat{p}_{y_n}. \end{equation*} \]

For \(\gamma=1\) the focal loss equals the cross entropy loss with an entropic regularizer on the predicted class probabilities.
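
A minimal multiclass sketch of this formula (illustrative only, not the inferno implementation):

import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, target: torch.Tensor, gamma: float = 2.0) -> torch.Tensor:
    ce = F.cross_entropy(logits, target, reduction="none")  # -log p_hat_{y_n}
    p_hat = torch.exp(-ce)                                  # p_hat_{y_n}
    return ((1.0 - p_hat) ** gamma * ce).mean()             # modulated cross entropy, 'mean' reduction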

Parameters:

Name Type Description Default
task Literal['binary', 'multiclass']

Specifies the type of task: 'binary' or 'multiclass'.

required
gamma float

Focusing parameter, controls the strength of the modulating factor \((1-\hat{p}_{y_n})^\gamma\).

2.0
num_classes int | None

Number of classes (only required for multi-class classification)

None
weight Tensor | None

A manual rescaling weight given to each class. If given, has to be a Tensor of size C.

None
reduction Literal['none', 'sum', 'mean']

Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the weighted mean of the output is taken, 'sum': the output will be summed.

'mean'

Methods:

Name Description
forward

Attributes:

Name Type Description
ce_loss_fn
gamma
num_classes
task

ce_loss_fn #

ce_loss_fn = BCEWithLogitsLoss(
    weight=weight, reduction="none"
)

gamma #

gamma = gamma

num_classes #

num_classes = num_classes

task #

task = task

forward #

forward(pred: Tensor, target: Tensor)

L1Loss #

Bases: MultipleBatchDimensionsLossMixin, L1Loss

Methods:

Name Description
forward

forward #

forward(pred: Tensor, target: Tensor) -> Tensor

MSELoss #

Bases: MultipleBatchDimensionsLossMixin, MSELoss

Methods:

Name Description
forward

forward #

forward(pred: Tensor, target: Tensor) -> Tensor

MSELossVR #

MSELossVR(
    reduction: Literal["none", "sum", "mean"] = "mean",
)

Bases: _Loss

Mean-Squared Error Loss with reduced variance for models with stochastic parameters.

The loss on a single datapoint is given by

\[ \begin{align*} \ell_n &= \mathbb{E}_{w}[(f_w(x_n) - y_n)^2]\\ &= \mathbb{E}_{w_{1:L-1}}\big[\mathbb{E}_{w_L \mid w_{1:L-1}}[(f_w(x_n) - y_n)^2]\big]\\ &= \mathbb{E}_{w_{1:L-1}}\big[(\mathbb{E}_{w_L \mid w_{1:L-1}}[f_w(x_n)] - y_n)^2 + \operatorname{Var}_{w_L \mid w_{1:L-1}}[f_w(x_n)]\big]. \end{align*} \]

For models with stochastic parameters, the conditional Monte-Carlo estimate results in variance reduction compared to inferno.loss_fns.MSELoss, which directly computes a Monte-Carlo approximation of the expected loss.

The reduction is applied over all sample and batch dimensions.
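
The closed-form inner expectation can be sketched as follows, assuming the conditional mean and variance of the prediction (mean_pred, var_pred) are given; this is an illustration of the formula, not the inferno API:

import torch

def mse_vr(mean_pred: torch.Tensor, var_pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # (E[f_w(x_n)] - y_n)^2 + Var[f_w(x_n)] replaces the inner Monte-Carlo estimate.
    loss = (mean_pred - target) ** 2 + var_pred
    return loss.mean()  # 'mean' reduction over all sample and batch dimensions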

Parameters:

Name Type Description Default
reduction Literal['none', 'sum', 'mean']

Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the weighted mean of the output is taken, 'sum': the output will be summed.

'mean'

Methods:

Name Description
forward

Runs the forward pass.

forward #

forward(
    input_representation: Float[
        Tensor, "*sample batch *feature"
    ],
    output_layer: BNNMixin,
    target: Float[Tensor, "batch *out_feature"],
)

Runs the forward pass.

Parameters:

Name Type Description Default
input_representation Float[Tensor, '*sample batch *feature']

(Penultimate layer) representation of input tensor. This is the representation produced by a forward pass through all hidden layers, which will be fed as inputs to the output layer in a forward pass.

required
output_layer BNNMixin

Output layer of the model.

required
target Float[Tensor, 'batch *out_feature']

Target tensor.

required

NLLLoss #

Bases: MultipleBatchDimensionsLossMixin, NLLLoss

Methods:

Name Description
forward

forward #

forward(pred: Tensor, target: Tensor) -> Tensor

VariationalFreeEnergy #

VariationalFreeEnergy(
    nll: _Loss,
    model: BNNMixin,
    prior_loc: Float[Tensor, "parameter"] | None = None,
    prior_scale: Float[Tensor, "parameter"] | None = None,
    kl_weight: float | None = 1.0,
    reduction: str = "mean",
)

Bases: Module

Variational Free Energy Loss.

Computes the variational free energy loss for variational inference with the Kullback-Leibler regularization term computed in weight space. This is also known as the negative evidence lower bound (ELBO).
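
Conceptually, the loss is the expected negative log-likelihood plus a weighted weight-space KL term. The sketch below shows the KL term for a diagonal Gaussian variational posterior and prior; the function and argument names are illustrative assumptions, not the inferno API:

import torch

def gaussian_kl(loc, scale, prior_loc, prior_scale):
    # KL( N(loc, scale^2) || N(prior_loc, prior_scale^2) ), summed over parameters
    var_ratio = (scale / prior_scale) ** 2
    sq_mean_diff = ((loc - prior_loc) / prior_scale) ** 2
    return 0.5 * (var_ratio + sq_mean_diff - 1.0 - torch.log(var_ratio)).sum()

# variational free energy = expected NLL + kl_weight * KL(q(w) || p(w))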

Parameters:

Name Type Description Default
nll _Loss

Loss function defining the negative log-likelihood.

required
model BNNMixin

The probabilistic model.

required
prior_loc Float[Tensor, 'parameter'] | None

Location(s) of the prior Gaussian distribution.

None
prior_scale Float[Tensor, 'parameter'] | None

Scale(s) of the prior Gaussian distribution.

None
kl_weight float | None

Weight for the KL divergence term. If None, chooses the weight inversely proportional to the number of mean parameters.

1.0
reduction str

Specifies the reduction to apply to the output: 'mean' | 'sum'. 'mean': the weighted mean of the output is taken, 'sum': the output will be summed.

'mean'

Methods:

Name Description
forward

Attributes:

Name Type Description
kl_weight
model
nll
numel_mean_parameters
prior_loc
prior_scale
reduction

kl_weight #

kl_weight = kl_weight

model #

model = model

nll #

nll = nll

numel_mean_parameters #

numel_mean_parameters = sum(
    param.numel()  # number of variational mean parameters (covariance parameters excluded)
    for name, param in model.named_parameters()
    if param.requires_grad
    and "params." in name
    and "cov." not in name
)

prior_loc #

prior_loc = prior_loc

prior_scale #

prior_scale = prior_scale

reduction #

reduction = reduction

forward #

forward(
    input: Float[Tensor, "*sample batch in_feature"],
    target: Float[Tensor, "batch out_feature"],
) -> Float[Tensor, ""]