models #

Pre-defined models.

Classes:

Name Description
Ensemble

An ensemble of models.

LeNet5

A simple convolutional neural network for image classification of 28x28 grayscale images.

MLP

A fully-connected feedforward neural network with the same activation function in each layer.

ResNeXt101_32X8D

ResNeXt-101 (32x8d)

ResNeXt101_64X4D

ResNeXt-101 (64x4d)

ResNeXt50_32X4D

ResNeXt-50 (32x4d)

ResNet

A residual neural network for image classification.

ResNet101

ResNet-101

ResNet18

ResNet-18

ResNet34

ResNet-34

ResNet50

ResNet-50

ViT_B_16

ViT_B_16

ViT_B_32

ViT_B_32

ViT_H_14

ViT_H_14

ViT_L_16

ViT_L_16

ViT_L_32

ViT_L_32

VisionTransformer

Vision Transformer as per https://arxiv.org/abs/2010.11929.

WideResNet101

WideResNet-101-2

WideResNet50

WideResNet-50-2

Attributes:

Name Type Description
__all__

__all__ #

__all__ = [
    "Ensemble",
    "LeNet5",
    "MLP",
    "ResNet",
    "ResNet18",
    "ResNet34",
    "ResNet50",
    "ResNet101",
    "ResNeXt50_32X4D",
    "ResNeXt101_32X8D",
    "ResNeXt101_64X4D",
    "WideResNet50",
    "WideResNet101",
    "as_torch_model",
]

Ensemble #

Ensemble(members: Iterable[Module])

Bases: BNNMixin, Module

An ensemble of models.

This class ensembles multiple models with the same architecture by averaging their predictions.

Parameters:

Name Type Description Default
members Iterable[Module]

List of models to ensemble.

required
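
A minimal usage sketch (the inferno import path is an assumption based on the inferno.bnn.params reference in the VisionTransformer docstring below): five independently initialized MLP classifiers are wrapped into an Ensemble, whose forward pass averages the member predictions.

import torch

from inferno import models

members = [
    models.MLP(in_size=784, hidden_sizes=[128, 128], out_size=10)
    for _ in range(5)
]
ensemble = models.Ensemble(members=members)

x = torch.randn(32, 784)
logits = ensemble(x)  # member predictions are averaged, per the class description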

Methods:

Name Description
forward
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
base_module
members
parametrization Parametrization

Parametrization of the module.

base_module #

base_module = [to(device='meta')]

members #

members = ModuleList(members)

parametrization #

parametrization: Parametrization

Parametrization of the module.

forward #

forward(
    input: Float[Tensor, "*batch in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch out_feature"]
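
The forward signature accepts a sample_shape and a random generator; a hedged sketch of drawing several samples per input (ensemble refers to the model constructed above, and how meaningful extra samples are depends on whether the members are probabilistic):

generator = torch.Generator().manual_seed(0)

# The leading dimensions of the output correspond to sample_shape,
# per the "*sample *batch out_feature" return annotation.
out = ensemble(
    torch.randn(32, 784),
    sample_shape=torch.Size([16]),
    generator=generator,
)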

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required
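
Since the return value is a list of parameter-group dictionaries, it can plausibly be passed to a torch optimizer as its parameter groups (a sketch under the assumption that the dictionaries carry the params and lr keys torch.optim expects):

param_groups = ensemble.parameters_and_lrs(lr=0.1, optimizer="SGD")
optimizer = torch.optim.SGD(param_groups, lr=0.1)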

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.

LeNet5 #

LeNet5(
    out_size: int = 10,
    parametrization: Parametrization = MaximalUpdate(),
    cov: FactorizedCovariance | None = None,
    activation_layer: Callable[..., Module] | None = ReLU,
)

Bases: Sequential

A simple convolutional neural network for image classification of 28x28 grayscale images.

Parameters:

Name Type Description Default
out_size int

Size of the output (i.e. number of classes).

10
parametrization Parametrization

The parametrization to use. Defines the initialization and learning rate scaling for the parameters of the module.

MaximalUpdate()
cov FactorizedCovariance | None

Covariance structure of the weights.

None
activation_layer Callable[..., Module] | None

Activation function following a linear layer.

ReLU
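
A minimal construction sketch (import path assumed as in the Ensemble example above): a LeNet-5 classifier for 10 classes applied to a batch of 28x28 grayscale images.

import torch

from inferno import models

lenet = models.LeNet5(out_size=10)
images = torch.randn(8, 1, 28, 28)  # batch of single-channel 28x28 images
logits = lenet(images)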

Methods:

Name Description
forward
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
out_size
parametrization

Parametrization of the module.

out_size #

out_size = out_size

parametrization #

parametrization = parametrization

Parametrization of the module.

forward #

forward(
    input: Float[Tensor, "*batch in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch out_feature"]

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.

MLP #

MLP(
    in_size: int,
    hidden_sizes: list[int],
    out_size: int | Size,
    norm_layer: Callable[..., Module] | None = None,
    activation_layer: Callable[..., Module] | None = ReLU,
    inplace: bool | None = None,
    bias: bool = True,
    dropout: float | None = None,
    parametrization: Parametrization = MaximalUpdate(),
    cov: (
        FactorizedCovariance
        | list[FactorizedCovariance]
        | None
    ) = None,
)

Bases: Sequential

A fully-connected feedforward neural network with the same activation function in each layer.

Parameters:

Name Type Description Default
in_size int

Size of the input.

required
hidden_sizes list[int]

List of hidden layer sizes.

required
out_size int | Size

Size of the output (e.g. number of classes).

required
norm_layer Callable[..., Module] | None

Normalization layer which will be stacked on top of the linear layer.

None
activation_layer Callable[..., Module] | None

Activation function following a linear layer.

ReLU
inplace bool | None

Whether to apply the activation function and dropout inplace. Default is None, which uses the respective default values.

None
bias bool

Whether to use bias in the linear layers.

True
dropout float | None

The probability for the dropout layer.

None
parametrization Parametrization

The parametrization to use. Defines the initialization and learning rate scaling for the parameters of the module.

MaximalUpdate()
cov FactorizedCovariance | list[FactorizedCovariance] | None

Covariance structure of the weights.

None
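
A construction sketch for a small MLP, overriding the default activation and enabling dropout (import path assumed as above):

from torch import nn

from inferno import models

mlp = models.MLP(
    in_size=16,
    hidden_sizes=[64, 64],
    out_size=1,
    activation_layer=nn.Tanh,  # replaces the default ReLU
    dropout=0.1,
)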

Methods:

Name Description
forward
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
hidden_sizes
in_size
out_size
parametrization

Parametrization of the module.

hidden_sizes #

hidden_sizes = hidden_sizes

in_size #

in_size = in_size

out_size #

out_size = out_size

parametrization #

parametrization = parametrization

Parametrization of the module.

forward #

forward(
    input: Float[Tensor, "*batch in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch out_feature"]

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.

ResNeXt101_32X8D #

ResNeXt101_32X8D(*args, **kwargs)

Bases: ResNet

ResNeXt-101 (32x8d)

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to ResNet.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
avgpool
base_width
bn1
conv1
dilation
fc
groups
inplanes
layer1
layer2
layer3
layer4
optional_pool
parametrization Parametrization

Parametrization of the module.

relu

avgpool #

avgpool = AdaptiveAvgPool2d((1, 1))

base_width #

base_width = width_per_group

bn1 #

bn1 = norm_layer(inplanes)

conv1 #

conv1 = Conv2d(
    3,
    inplanes,
    kernel_size=3,
    stride=1,
    padding=1,
    bias=False,
    cov=deepcopy(cov),
    parametrization=parametrization,
    layer_type="input",
)

dilation #

dilation = 1

fc #

fc = Linear(
    512 * expansion,
    out_size,
    parametrization=parametrization,
    cov=deepcopy(cov),
    layer_type="output",
)

groups #

groups = groups

inplanes #

inplanes = 64

layer1 #

layer1 = _make_layer(
    block,
    64,
    num_blocks_per_layer[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer2 #

layer2 = _make_layer(
    block,
    128,
    num_blocks_per_layer[1],
    stride=2,
    dilate=replace_stride_with_dilation[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer3 #

layer3 = _make_layer(
    block,
    256,
    num_blocks_per_layer[2],
    stride=2,
    dilate=replace_stride_with_dilation[1],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer4 #

layer4 = _make_layer(
    block,
    512,
    num_blocks_per_layer[3],
    stride=2,
    dilate=replace_stride_with_dilation[2],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

optional_pool #

optional_pool = MaxPool2d(
    kernel_size=3, stride=2, padding=1
)

parametrization #

parametrization: Parametrization

Parametrization of the module.

relu #

relu = ReLU(inplace=True)

forward #

forward(
    input: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    out_size: int,
    architecture: Literal["imagenet", "cifar"] = "imagenet",
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.

ResNeXt101_64X4D #

ResNeXt101_64X4D(*args, **kwargs)

Bases: ResNet

ResNeXt-101 (64x4d)

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to ResNet.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
avgpool
base_width
bn1
conv1
dilation
fc
groups
inplanes
layer1
layer2
layer3
layer4
optional_pool
parametrization Parametrization

Parametrization of the module.

relu

avgpool #

avgpool = AdaptiveAvgPool2d((1, 1))

base_width #

base_width = width_per_group

bn1 #

bn1 = norm_layer(inplanes)

conv1 #

conv1 = Conv2d(
    3,
    inplanes,
    kernel_size=3,
    stride=1,
    padding=1,
    bias=False,
    cov=deepcopy(cov),
    parametrization=parametrization,
    layer_type="input",
)

dilation #

dilation = 1

fc #

fc = Linear(
    512 * expansion,
    out_size,
    parametrization=parametrization,
    cov=deepcopy(cov),
    layer_type="output",
)

groups #

groups = groups

inplanes #

inplanes = 64

layer1 #

layer1 = _make_layer(
    block,
    64,
    num_blocks_per_layer[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer2 #

layer2 = _make_layer(
    block,
    128,
    num_blocks_per_layer[1],
    stride=2,
    dilate=replace_stride_with_dilation[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer3 #

layer3 = _make_layer(
    block,
    256,
    num_blocks_per_layer[2],
    stride=2,
    dilate=replace_stride_with_dilation[1],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer4 #

layer4 = _make_layer(
    block,
    512,
    num_blocks_per_layer[3],
    stride=2,
    dilate=replace_stride_with_dilation[2],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

optional_pool #

optional_pool = MaxPool2d(
    kernel_size=3, stride=2, padding=1
)

parametrization #

parametrization: Parametrization

Parametrization of the module.

relu #

relu = ReLU(inplace=True)

forward #

forward(
    input: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    out_size: int,
    architecture: Literal["imagenet", "cifar"] = "imagenet",
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.

ResNeXt50_32X4D #

ResNeXt50_32X4D(*args, **kwargs)

Bases: ResNet

ResNeXt-50 (32x4d)

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to ResNet.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
avgpool
base_width
bn1
conv1
dilation
fc
groups
inplanes
layer1
layer2
layer3
layer4
optional_pool
parametrization Parametrization

Parametrization of the module.

relu

avgpool #

avgpool = AdaptiveAvgPool2d((1, 1))

base_width #

base_width = width_per_group

bn1 #

bn1 = norm_layer(inplanes)

conv1 #

conv1 = Conv2d(
    3,
    inplanes,
    kernel_size=3,
    stride=1,
    padding=1,
    bias=False,
    cov=deepcopy(cov),
    parametrization=parametrization,
    layer_type="input",
)

dilation #

dilation = 1

fc #

fc = Linear(
    512 * expansion,
    out_size,
    parametrization=parametrization,
    cov=deepcopy(cov),
    layer_type="output",
)

groups #

groups = groups

inplanes #

inplanes = 64

layer1 #

layer1 = _make_layer(
    block,
    64,
    num_blocks_per_layer[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer2 #

layer2 = _make_layer(
    block,
    128,
    num_blocks_per_layer[1],
    stride=2,
    dilate=replace_stride_with_dilation[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer3 #

layer3 = _make_layer(
    block,
    256,
    num_blocks_per_layer[2],
    stride=2,
    dilate=replace_stride_with_dilation[1],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer4 #

layer4 = _make_layer(
    block,
    512,
    num_blocks_per_layer[3],
    stride=2,
    dilate=replace_stride_with_dilation[2],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

optional_pool #

optional_pool = MaxPool2d(
    kernel_size=3, stride=2, padding=1
)

parametrization #

parametrization: Parametrization

Parametrization of the module.

relu #

relu = ReLU(inplace=True)

forward #

forward(
    input: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    out_size: int,
    architecture: Literal["imagenet", "cifar"] = "imagenet",
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.

ResNet #

ResNet(
    out_size: int,
    block: type["BasicBlock"] | type["Bottleneck"],
    num_blocks_per_layer: Sequence[int],
    zero_init_residual: bool = False,
    groups: int = 1,
    width_per_group: int = 64,
    replace_stride_with_dilation: Sequence[bool] = (
        False,
        False,
        False,
    ),
    norm_layer: Callable[..., Module] = lambda c: GroupNorm(
        num_groups=32, num_channels=c
    ),
    architecture: Literal["imagenet", "cifar"] = "imagenet",
    parametrization: Parametrization = MaximalUpdate(),
    cov: FactorizedCovariance | None = None,
)

Bases: BNNMixin, Module

A residual neural network for image classification.

Parameters:

Name Type Description Default
out_size int

Size of the output (i.e. number of classes).

required
block type['BasicBlock'] | type['Bottleneck']

Block type to use.

required
num_blocks_per_layer Sequence[int]

Number of blocks per layer.

required
zero_init_residual bool

Whether to zero-initialize the last normalization layer in each residual block, so that each block initially acts as an identity mapping.

False
groups int

Number of groups for the convolutional layers.

1
width_per_group int

Width per group for the convolutional layers.

64
replace_stride_with_dilation Sequence[bool]

Whether to replace the 2x2 stride with a dilated convolution. Must be a tuple of length 3.

(False, False, False)
norm_layer Callable[..., Module]

Normalization layer to use.

lambda c: GroupNorm(num_groups=32, num_channels=c)
architecture Literal['imagenet', 'cifar']

Type of ResNet architecture. Either "imagenet" or "cifar".

'imagenet'
parametrization Parametrization

The parametrization to use. Defines the initialization and learning rate scaling for the parameters of the module.

MaximalUpdate()
cov FactorizedCovariance | None

Covariance structure of the probabilistic layers.

None
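
Rather than passing block and num_blocks_per_layer by hand, the subclasses below (ResNet18, ResNet50, ...) fix them and forward any remaining keyword arguments to ResNet; a sketch constructing a CIFAR-style ResNet-18 (import path assumed as above):

from inferno import models

# out_size and architecture are forwarded to the ResNet constructor via **kwargs.
resnet = models.ResNet18(out_size=10, architecture="cifar")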

Methods:

Name Description
forward
from_pretrained_weights

Load a ResNet model with pretrained weights.

parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
avgpool
base_width
bn1
conv1
dilation
fc
groups
inplanes
layer1
layer2
layer3
layer4
optional_pool
parametrization Parametrization

Parametrization of the module.

relu

avgpool #

avgpool = AdaptiveAvgPool2d((1, 1))

base_width #

base_width = width_per_group

bn1 #

bn1 = norm_layer(inplanes)

conv1 #

conv1 = Conv2d(
    3,
    inplanes,
    kernel_size=3,
    stride=1,
    padding=1,
    bias=False,
    cov=deepcopy(cov),
    parametrization=parametrization,
    layer_type="input",
)

dilation #

dilation = 1

fc #

fc = Linear(
    512 * expansion,
    out_size,
    parametrization=parametrization,
    cov=deepcopy(cov),
    layer_type="output",
)

groups #

groups = groups

inplanes #

inplanes = 64

layer1 #

layer1 = _make_layer(
    block,
    64,
    num_blocks_per_layer[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer2 #

layer2 = _make_layer(
    block,
    128,
    num_blocks_per_layer[1],
    stride=2,
    dilate=replace_stride_with_dilation[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer3 #

layer3 = _make_layer(
    block,
    256,
    num_blocks_per_layer[2],
    stride=2,
    dilate=replace_stride_with_dilation[1],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer4 #

layer4 = _make_layer(
    block,
    512,
    num_blocks_per_layer[3],
    stride=2,
    dilate=replace_stride_with_dilation[2],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

optional_pool #

optional_pool = MaxPool2d(
    kernel_size=3, stride=2, padding=1
)

parametrization #

parametrization: Parametrization

Parametrization of the module.

relu #

relu = ReLU(inplace=True)

forward #

forward(
    input: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    out_size: int,
    weights: Weights,
    freeze: bool = False,
    architecture: Literal["imagenet", "cifar"] = "imagenet",
    *args,
    **kwargs
)

Load a ResNet model with pretrained weights.

Depending on the out_size and architecture parameters, the first and last layers of the model may not be initialized with the pretrained weights.

Parameters:

Name Type Description Default
out_size int

Size of the output (i.e. number of classes).

required
weights Weights

Pretrained weights to use.

required
freeze bool

Whether to freeze the pretrained weights.

False
architecture Literal['imagenet', 'cifar']

Type of ResNet architecture. Either "imagenet" or "cifar".

'imagenet'
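
A hedged sketch of loading pretrained ImageNet weights into a ResNet-50 and attaching a fresh 10-class output head; using the torchvision weights enum here is an assumption about the expected Weights type.

from torchvision.models import ResNet50_Weights

from inferno import models

model = models.ResNet50.from_pretrained_weights(
    out_size=10,
    weights=ResNet50_Weights.IMAGENET1K_V2,
    freeze=True,  # keep the pretrained backbone parameters fixed
)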

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.

ResNet101 #

ResNet101(*args, **kwargs)

Bases: ResNet

ResNet-101

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to ResNet.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
avgpool
base_width
bn1
conv1
dilation
fc
groups
inplanes
layer1
layer2
layer3
layer4
optional_pool
parametrization Parametrization

Parametrization of the module.

relu

avgpool #

avgpool = AdaptiveAvgPool2d((1, 1))

base_width #

base_width = width_per_group

bn1 #

bn1 = norm_layer(inplanes)

conv1 #

conv1 = Conv2d(
    3,
    inplanes,
    kernel_size=3,
    stride=1,
    padding=1,
    bias=False,
    cov=deepcopy(cov),
    parametrization=parametrization,
    layer_type="input",
)

dilation #

dilation = 1

fc #

fc = Linear(
    512 * expansion,
    out_size,
    parametrization=parametrization,
    cov=deepcopy(cov),
    layer_type="output",
)

groups #

groups = groups

inplanes #

inplanes = 64

layer1 #

layer1 = _make_layer(
    block,
    64,
    num_blocks_per_layer[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer2 #

layer2 = _make_layer(
    block,
    128,
    num_blocks_per_layer[1],
    stride=2,
    dilate=replace_stride_with_dilation[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer3 #

layer3 = _make_layer(
    block,
    256,
    num_blocks_per_layer[2],
    stride=2,
    dilate=replace_stride_with_dilation[1],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer4 #

layer4 = _make_layer(
    block,
    512,
    num_blocks_per_layer[3],
    stride=2,
    dilate=replace_stride_with_dilation[2],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

optional_pool #

optional_pool = MaxPool2d(
    kernel_size=3, stride=2, padding=1
)

parametrization #

parametrization: Parametrization

Parametrization of the module.

relu #

relu = ReLU(inplace=True)

forward #

forward(
    input: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    out_size: int,
    architecture: Literal["imagenet", "cifar"] = "imagenet",
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.

ResNet18 #

ResNet18(*args, **kwargs)

Bases: ResNet

ResNet-18

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to ResNet.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
avgpool
base_width
bn1
conv1
dilation
fc
groups
inplanes
layer1
layer2
layer3
layer4
optional_pool
parametrization Parametrization

Parametrization of the module.

relu

avgpool #

avgpool = AdaptiveAvgPool2d((1, 1))

base_width #

base_width = width_per_group

bn1 #

bn1 = norm_layer(inplanes)

conv1 #

conv1 = Conv2d(
    3,
    inplanes,
    kernel_size=3,
    stride=1,
    padding=1,
    bias=False,
    cov=deepcopy(cov),
    parametrization=parametrization,
    layer_type="input",
)

dilation #

dilation = 1

fc #

fc = Linear(
    512 * expansion,
    out_size,
    parametrization=parametrization,
    cov=deepcopy(cov),
    layer_type="output",
)

groups #

groups = groups

inplanes #

inplanes = 64

layer1 #

layer1 = _make_layer(
    block,
    64,
    num_blocks_per_layer[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer2 #

layer2 = _make_layer(
    block,
    128,
    num_blocks_per_layer[1],
    stride=2,
    dilate=replace_stride_with_dilation[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer3 #

layer3 = _make_layer(
    block,
    256,
    num_blocks_per_layer[2],
    stride=2,
    dilate=replace_stride_with_dilation[1],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer4 #

layer4 = _make_layer(
    block,
    512,
    num_blocks_per_layer[3],
    stride=2,
    dilate=replace_stride_with_dilation[2],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

optional_pool #

optional_pool = MaxPool2d(
    kernel_size=3, stride=2, padding=1
)

parametrization #

parametrization: Parametrization

Parametrization of the module.

relu #

relu = ReLU(inplace=True)

forward #

forward(
    input: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    out_size: int,
    architecture: Literal["imagenet", "cifar"] = "imagenet",
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.

ResNet34 #

ResNet34(*args, **kwargs)

Bases: ResNet

ResNet-34

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to ResNet.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
avgpool
base_width
bn1
conv1
dilation
fc
groups
inplanes
layer1
layer2
layer3
layer4
optional_pool
parametrization Parametrization

Parametrization of the module.

relu

avgpool #

avgpool = AdaptiveAvgPool2d((1, 1))

base_width #

base_width = width_per_group

bn1 #

bn1 = norm_layer(inplanes)

conv1 #

conv1 = Conv2d(
    3,
    inplanes,
    kernel_size=3,
    stride=1,
    padding=1,
    bias=False,
    cov=deepcopy(cov),
    parametrization=parametrization,
    layer_type="input",
)

dilation #

dilation = 1

fc #

fc = Linear(
    512 * expansion,
    out_size,
    parametrization=parametrization,
    cov=deepcopy(cov),
    layer_type="output",
)

groups #

groups = groups

inplanes #

inplanes = 64

layer1 #

layer1 = _make_layer(
    block,
    64,
    num_blocks_per_layer[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer2 #

layer2 = _make_layer(
    block,
    128,
    num_blocks_per_layer[1],
    stride=2,
    dilate=replace_stride_with_dilation[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer3 #

layer3 = _make_layer(
    block,
    256,
    num_blocks_per_layer[2],
    stride=2,
    dilate=replace_stride_with_dilation[1],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer4 #

layer4 = _make_layer(
    block,
    512,
    num_blocks_per_layer[3],
    stride=2,
    dilate=replace_stride_with_dilation[2],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

optional_pool #

optional_pool = MaxPool2d(
    kernel_size=3, stride=2, padding=1
)

parametrization #

parametrization: Parametrization

Parametrization of the module.

relu #

relu = ReLU(inplace=True)

forward #

forward(
    input: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    out_size: int,
    architecture: Literal["imagenet", "cifar"] = "imagenet",
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.

ResNet50 #

ResNet50(*args, **kwargs)

Bases: ResNet

ResNet-50

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to ResNet.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
avgpool
base_width
bn1
conv1
dilation
fc
groups
inplanes
layer1
layer2
layer3
layer4
optional_pool
parametrization Parametrization

Parametrization of the module.

relu

avgpool #

avgpool = AdaptiveAvgPool2d((1, 1))

base_width #

base_width = width_per_group

bn1 #

bn1 = norm_layer(inplanes)

conv1 #

conv1 = Conv2d(
    3,
    inplanes,
    kernel_size=3,
    stride=1,
    padding=1,
    bias=False,
    cov=deepcopy(cov),
    parametrization=parametrization,
    layer_type="input",
)

dilation #

dilation = 1

fc #

fc = Linear(
    512 * expansion,
    out_size,
    parametrization=parametrization,
    cov=deepcopy(cov),
    layer_type="output",
)

groups #

groups = groups

inplanes #

inplanes = 64

layer1 #

layer1 = _make_layer(
    block,
    64,
    num_blocks_per_layer[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer2 #

layer2 = _make_layer(
    block,
    128,
    num_blocks_per_layer[1],
    stride=2,
    dilate=replace_stride_with_dilation[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer3 #

layer3 = _make_layer(
    block,
    256,
    num_blocks_per_layer[2],
    stride=2,
    dilate=replace_stride_with_dilation[1],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer4 #

layer4 = _make_layer(
    block,
    512,
    num_blocks_per_layer[3],
    stride=2,
    dilate=replace_stride_with_dilation[2],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

optional_pool #

optional_pool = MaxPool2d(
    kernel_size=3, stride=2, padding=1
)

parametrization #

parametrization: Parametrization

Parametrization of the module.

relu #

relu = ReLU(inplace=True)

forward #

forward(
    input: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    out_size: int,
    architecture: Literal["imagenet", "cifar"] = "imagenet",
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.

ViT_B_16 #

ViT_B_16(*args, **kwargs)

Bases: VisionTransformer

ViT_B_16

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to VisionTransformer.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
attention_dropout
class_token
conv_proj
dropout
encoder
heads
hidden_dim
in_size
mlp_dim
norm_layer
out_size
parametrization Parametrization

Parametrization of the module.

patch_size
representation_size
seq_length

attention_dropout #

attention_dropout = attention_dropout

class_token #

class_token = Parameter(zeros(1, 1, hidden_dim))

conv_proj #

conv_proj = Conv2d(
    in_channels=3,
    out_channels=hidden_dim,
    kernel_size=patch_size,
    stride=patch_size,
    cov=cov["conv_proj"],
    parametrization=parametrization,
    layer_type="input",
)

dropout #

dropout = dropout

encoder #

encoder = Encoder(
    seq_length,
    num_layers,
    num_heads,
    hidden_dim,
    mlp_dim,
    dropout,
    attention_dropout,
    norm_layer,
    parametrization=parametrization,
    cov=cov["encoder"],
)

heads #

heads = Sequential(heads_layers)

hidden_dim #

hidden_dim = hidden_dim

in_size #

in_size = in_size

mlp_dim #

mlp_dim = mlp_dim

norm_layer #

norm_layer = norm_layer

out_size #

out_size = out_size

parametrization #

parametrization: Parametrization

Parametrization of the module.

patch_size #

patch_size = patch_size

representation_size #

representation_size = representation_size

seq_length #

seq_length = seq_length

forward #

forward(
    x: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    in_size: int,
    out_size: int,
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)
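
A sketch of loading the default pretrained ViT-B/16 weights for a new 10-class head; the 224-pixel input size and the import path are assumptions.

from inferno import models

vit = models.ViT_B_16.from_pretrained_weights(in_size=224, out_size=10)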

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Needs to be implemented because VisionTransformer has direct parameters.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

Needs to be implemented because VisionTransformer has direct parameters.

ViT_B_32 #

ViT_B_32(*args, **kwargs)

Bases: VisionTransformer

ViT_B_32

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to VisionTransformer.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
attention_dropout
class_token
conv_proj
dropout
encoder
heads
hidden_dim
in_size
mlp_dim
norm_layer
out_size
parametrization Parametrization

Parametrization of the module.

patch_size
representation_size
seq_length

attention_dropout #

attention_dropout = attention_dropout

class_token #

class_token = Parameter(zeros(1, 1, hidden_dim))

conv_proj #

conv_proj = Conv2d(
    in_channels=3,
    out_channels=hidden_dim,
    kernel_size=patch_size,
    stride=patch_size,
    cov=cov["conv_proj"],
    parametrization=parametrization,
    layer_type="input",
)

dropout #

dropout = dropout

encoder #

encoder = Encoder(
    seq_length,
    num_layers,
    num_heads,
    hidden_dim,
    mlp_dim,
    dropout,
    attention_dropout,
    norm_layer,
    parametrization=parametrization,
    cov=cov["encoder"],
)

heads #

heads = Sequential(heads_layers)

hidden_dim #

hidden_dim = hidden_dim

in_size #

in_size = in_size

mlp_dim #

mlp_dim = mlp_dim

norm_layer #

norm_layer = norm_layer

out_size #

out_size = out_size

parametrization #

parametrization: Parametrization

Parametrization of the module.

patch_size #

patch_size = patch_size

representation_size #

representation_size = representation_size

seq_length #

seq_length = seq_length

forward #

forward(
    x: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    in_size: int,
    out_size: int,
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Needs to be implemented because VisionTransformer has direct parameters.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

Needs to be implemented because VisionTransformer has direct parameters.

ViT_H_14 #

ViT_H_14(*args, **kwargs)

Bases: VisionTransformer

ViT_H_14

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to VisionTransformer.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
attention_dropout
class_token
conv_proj
dropout
encoder
heads
hidden_dim
in_size
mlp_dim
norm_layer
out_size
parametrization Parametrization

Parametrization of the module.

patch_size
representation_size
seq_length

attention_dropout #

attention_dropout = attention_dropout

class_token #

class_token = Parameter(zeros(1, 1, hidden_dim))

conv_proj #

conv_proj = Conv2d(
    in_channels=3,
    out_channels=hidden_dim,
    kernel_size=patch_size,
    stride=patch_size,
    cov=cov["conv_proj"],
    parametrization=parametrization,
    layer_type="input",
)

dropout #

dropout = dropout

encoder #

encoder = Encoder(
    seq_length,
    num_layers,
    num_heads,
    hidden_dim,
    mlp_dim,
    dropout,
    attention_dropout,
    norm_layer,
    parametrization=parametrization,
    cov=cov["encoder"],
)

heads #

heads = Sequential(heads_layers)

hidden_dim #

hidden_dim = hidden_dim

in_size #

in_size = in_size

mlp_dim #

mlp_dim = mlp_dim

norm_layer #

norm_layer = norm_layer

out_size #

out_size = out_size

parametrization #

parametrization: Parametrization

Parametrization of the module.

patch_size #

patch_size = patch_size

representation_size #

representation_size = representation_size

seq_length #

seq_length = seq_length

forward #

forward(
    x: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    in_size: int,
    out_size: int,
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Needs to be implemented because VisionTransformer has direct parameters.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

Needs to be implemented because VisionTransformer has direct parameters.

ViT_L_16 #

ViT_L_16(*args, **kwargs)

Bases: VisionTransformer

ViT_L_16

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to VisionTransformer.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
attention_dropout
class_token
conv_proj
dropout
encoder
heads
hidden_dim
in_size
mlp_dim
norm_layer
out_size
parametrization Parametrization

Parametrization of the module.

patch_size
representation_size
seq_length

attention_dropout #

attention_dropout = attention_dropout

class_token #

class_token = Parameter(zeros(1, 1, hidden_dim))

conv_proj #

conv_proj = Conv2d(
    in_channels=3,
    out_channels=hidden_dim,
    kernel_size=patch_size,
    stride=patch_size,
    cov=cov["conv_proj"],
    parametrization=parametrization,
    layer_type="input",
)

dropout #

dropout = dropout

encoder #

encoder = Encoder(
    seq_length,
    num_layers,
    num_heads,
    hidden_dim,
    mlp_dim,
    dropout,
    attention_dropout,
    norm_layer,
    parametrization=parametrization,
    cov=cov["encoder"],
)

heads #

heads = Sequential(heads_layers)

hidden_dim #

hidden_dim = hidden_dim

in_size #

in_size = in_size

mlp_dim #

mlp_dim = mlp_dim

norm_layer #

norm_layer = norm_layer

out_size #

out_size = out_size

parametrization #

parametrization: Parametrization

Parametrization of the module.

patch_size #

patch_size = patch_size

representation_size #

representation_size = representation_size

seq_length #

seq_length = seq_length

forward #

forward(
    x: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    in_size: int,
    out_size: int,
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Needs to be implemented because VisionTransformer has direct parameters.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

Needs to be implemented because VisionTransformer has direct parameters.

ViT_L_32 #

ViT_L_32(*args, **kwargs)

Bases: VisionTransformer

ViT_L_32

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to VisionTransformer.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
attention_dropout
class_token
conv_proj
dropout
encoder
heads
hidden_dim
in_size
mlp_dim
norm_layer
out_size
parametrization Parametrization

Parametrization of the module.

patch_size
representation_size
seq_length

attention_dropout #

attention_dropout = attention_dropout

class_token #

class_token = Parameter(zeros(1, 1, hidden_dim))

conv_proj #

conv_proj = Conv2d(
    in_channels=3,
    out_channels=hidden_dim,
    kernel_size=patch_size,
    stride=patch_size,
    cov=cov["conv_proj"],
    parametrization=parametrization,
    layer_type="input",
)

dropout #

dropout = dropout

encoder #

encoder = Encoder(
    seq_length,
    num_layers,
    num_heads,
    hidden_dim,
    mlp_dim,
    dropout,
    attention_dropout,
    norm_layer,
    parametrization=parametrization,
    cov=cov["encoder"],
)

heads #

heads = Sequential(heads_layers)

hidden_dim #

hidden_dim = hidden_dim

in_size #

in_size = in_size

mlp_dim #

mlp_dim = mlp_dim

norm_layer #

norm_layer = norm_layer

out_size #

out_size = out_size

parametrization #

parametrization: Parametrization

Parametrization of the module.

patch_size #

patch_size = patch_size

representation_size #

representation_size = representation_size

seq_length #

seq_length = seq_length

forward #

forward(
    x: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    in_size: int,
    out_size: int,
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Needs to be implemented because VisionTransformer has direct parameters.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

Needs to be implemented because VisionTransformer has direct parameters.

VisionTransformer #

VisionTransformer(
    in_size: int,
    patch_size: int,
    num_layers: int,
    num_heads: int,
    hidden_dim: int,
    mlp_dim: int,
    dropout: float = 0.0,
    attention_dropout: float = 0.0,
    out_size: int = 1000,
    representation_size: int | None = None,
    norm_layer: Callable[..., Module] = partial(
        LayerNorm, eps=1e-06
    ),
    conv_stem_configs: list[NamedTuple] | None = None,
    parametrization: Parametrization = MaximalUpdate(),
    cov: (
        FactorizedCovariance
        | dict[FactorizedCovariance]
        | dict[dict[FactorizedCovariance]]
        | None
    ) = None,
)

Bases: BNNMixin, Module

Vision Transformer as per https://arxiv.org/abs/2010.11929.

The covariance can be specified as None (resulting in a non-stochastic model), as an instance of inferno.bnn.params.FactorizedCovariance (resulting in the same covariance in all layers), or as a nested dictionary with the keys indicating the module. For example, the following will place a low rank covariance in the conv_proj, the last layer of the encoder, and the output head:

import copy

from inferno.bnn import params  # location of the covariance classes referenced above

cov = params.LowRankCovariance(rank=2)
last_layer_cov = {
    "conv_proj": copy.deepcopy(cov),
    "encoder": {
        "layers.encoder_layer_1": {
            "self_attention": {
                "q": copy.deepcopy(cov),
                "k": copy.deepcopy(cov),
                "v": copy.deepcopy(cov),
                "out": copy.deepcopy(cov),
            },
            "mlp": copy.deepcopy(cov)
        },
    },
    "heads.head": copy.deepcopy(cov),
}
model = VisionTransformer(
    in_size=32,
    patch_size=16,
    num_layers=2,
    num_heads=2,
    hidden_dim=10,
    mlp_dim=10,
    cov=last_layer_cov,
)
Note that any module omitted from the covariance specification defaults to None (in this example, every submodule of the encoder's first layer, layers.encoder_layer_0, which is not listed in last_layer_cov).
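
Continuing the example above, the resulting model can be evaluated with parameter samples drawn from the specified covariances. A rough sketch; the exact output shape follows the forward signature documented below:

import torch

x = torch.randn(8, 3, 32, 32)  # batch of 8 RGB images matching in_size=32
logits = model(x, sample_shape=torch.Size([16]))  # 16 parameter samples
# Expected shape: (16, 8, 1000) for the default out_size of 1000.
mean_probs = logits.softmax(dim=-1).mean(dim=0)  # average over the samples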

Parameters:

Name Type Description Default
in_size int

Size of the input (i.e. image size).

required
patch_size int

Size of the patch.

required
num_layers int

Number of layers in the encoder.

required
num_heads int

Number of heads.

required
hidden_dim int

Hidden size in encoder.

required
mlp_dim int

Dimension of MLP block.

required
dropout float

Dropout probability.

0.0
attention_dropout float

Attention dropout probability.

0.0
out_size int

Size of the output (i.e. number of classes).

1000
representation_size int | None

Size of pre-logits layer before output head.

None
norm_layer Callable[..., Module]

Normalization layer to use.

partial(LayerNorm, eps=1e-06)
conv_stem_configs list[NamedTuple] | None

Currently not supported.

None
parametrization Parametrization

The parametrization to use. Defines the initialization and learning rate scaling for the parameters of the module.

MaximalUpdate()
cov FactorizedCovariance | dict[FactorizedCovariance] | dict[dict[FactorizedCovariance]] | None

Covariance structure of the probabilistic layers.

None

Methods:

Name Description
forward
from_pretrained_weights

Load a VisionTransformer model with pretrained weights.

parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
attention_dropout
class_token
conv_proj
dropout
encoder
heads
hidden_dim
in_size
mlp_dim
norm_layer
out_size
parametrization Parametrization

Parametrization of the module.

patch_size
representation_size
seq_length

attention_dropout #

attention_dropout = attention_dropout

class_token #

class_token = Parameter(zeros(1, 1, hidden_dim))

conv_proj #

conv_proj = Conv2d(
    in_channels=3,
    out_channels=hidden_dim,
    kernel_size=patch_size,
    stride=patch_size,
    cov=cov["conv_proj"],
    parametrization=parametrization,
    layer_type="input",
)

dropout #

dropout = dropout

encoder #

encoder = Encoder(
    seq_length,
    num_layers,
    num_heads,
    hidden_dim,
    mlp_dim,
    dropout,
    attention_dropout,
    norm_layer,
    parametrization=parametrization,
    cov=cov["encoder"],
)

heads #

heads = Sequential(heads_layers)

hidden_dim #

hidden_dim = hidden_dim

in_size #

in_size = in_size

mlp_dim #

mlp_dim = mlp_dim

norm_layer #

norm_layer = norm_layer

out_size #

out_size = out_size

parametrization #

parametrization: Parametrization

Parametrization of the module.

patch_size #

patch_size = patch_size

representation_size #

representation_size = representation_size

seq_length #

seq_length = seq_length

forward #

forward(
    x: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    in_size: int,
    out_size: int,
    weights: Weights,
    freeze: bool = False,
    *args,
    **kwargs
)

Load a VisionTransformer model with pretrained weights.

Depending on the in_size and out_size parameters, the first and/or last layer of the model is not initialized with the pretrained weights.

Parameters:

Name Type Description Default
in_size int

Size of the input (i.e. image size).

required
out_size int

Size of the output (i.e. number of classes).

required
weights Weights

Pretrained weights to use.

required
freeze bool

Whether to freeze the pretrained weights.

False

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

This method needs to be overridden because VisionTransformer has parameters defined directly on the module (e.g. the class_token parameter).

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required
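
The returned parameter groups are meant to be handed to the chosen torch optimizer, so that each group is trained with its parametrization-scaled learning rate. A minimal sketch, assuming the returned dicts follow torch's parameter-group convention (a params entry plus a per-group lr) and that model is a VisionTransformer instance as constructed above:

import torch

param_groups = model.parameters_and_lrs(lr=0.1, optimizer="SGD")
optimizer = torch.optim.SGD(param_groups, lr=0.1, momentum=0.9)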

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method needs to be overridden because VisionTransformer has parameters defined directly on the module (e.g. the class_token parameter).

WideResNet101 #

WideResNet101(*args, **kwargs)

Bases: ResNet

WideResNet-101-2

Architecture described in Wide Residual Networks. The model is the same as a ResNet, except that the number of channels in each bottleneck block is twice as large.

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to ResNet.

{}
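
Because all arguments are forwarded to ResNet, a WideResNet101 is configured like any other ResNet in this module. A minimal sketch; the import paths and the argument-free DiagonalCovariance() constructor are assumptions:

from inferno.bnn import params             # assumed import path
from inferno.models import WideResNet101   # assumed import path

# Deterministic wide ResNet for 10 classes (cov=None is the default).
model = WideResNet101(out_size=10)

# Mean-field Bayesian variant with a diagonal covariance in every layer.
bnn = WideResNet101(out_size=10, cov=params.DiagonalCovariance())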

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
avgpool
base_width
bn1
conv1
dilation
fc
groups
inplanes
layer1
layer2
layer3
layer4
optional_pool
parametrization Parametrization

Parametrization of the module.

relu

avgpool #

avgpool = AdaptiveAvgPool2d((1, 1))

base_width #

base_width = width_per_group

bn1 #

bn1 = norm_layer(inplanes)

conv1 #

conv1 = Conv2d(
    3,
    inplanes,
    kernel_size=3,
    stride=1,
    padding=1,
    bias=False,
    cov=deepcopy(cov),
    parametrization=parametrization,
    layer_type="input",
)

dilation #

dilation = 1

fc #

fc = Linear(
    512 * expansion,
    out_size,
    parametrization=parametrization,
    cov=deepcopy(cov),
    layer_type="output",
)

groups #

groups = groups

inplanes #

inplanes = 64

layer1 #

layer1 = _make_layer(
    block,
    64,
    num_blocks_per_layer[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer2 #

layer2 = _make_layer(
    block,
    128,
    num_blocks_per_layer[1],
    stride=2,
    dilate=replace_stride_with_dilation[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer3 #

layer3 = _make_layer(
    block,
    256,
    num_blocks_per_layer[2],
    stride=2,
    dilate=replace_stride_with_dilation[1],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer4 #

layer4 = _make_layer(
    block,
    512,
    num_blocks_per_layer[3],
    stride=2,
    dilate=replace_stride_with_dilation[2],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

optional_pool #

optional_pool = MaxPool2d(
    kernel_size=3, stride=2, padding=1
)

parametrization #

parametrization: Parametrization

Parametrization of the module.

relu #

relu = ReLU(inplace=True)

forward #

forward(
    input: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    out_size: int,
    architecture: Literal["imagenet", "cifar"] = "imagenet",
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)
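
A minimal usage sketch for loading pretrained weights (weights defaults to DEFAULT, so only the new head size needs to be supplied; treat the exact call as an assumption rather than a verbatim recipe):

model = WideResNet101.from_pretrained_weights(
    out_size=10,   # replaces the output head for the downstream task
    freeze=True,   # keep the pretrained backbone weights fixed
    # architecture="cifar" would instead build the 32x32-input stem (assumption)
)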

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.

WideResNet50 #

WideResNet50(*args, **kwargs)

Bases: ResNet

WideResNet-50-2

Architecture described in Wide Residual Networks. The model is the same as a ResNet, except that the number of channels in each bottleneck block is twice as large.

Parameters:

Name Type Description Default
**kwargs

Additional keyword arguments passed on to ResNet.

{}

Methods:

Name Description
forward
from_pretrained_weights
parameters_and_lrs

Get the parameters of the module and their learning rates for the chosen optimizer

reset_parameters

Reset the parameters of the module and set the parametrization of all children

Attributes:

Name Type Description
avgpool
base_width
bn1
conv1
dilation
fc
groups
inplanes
layer1
layer2
layer3
layer4
optional_pool
parametrization Parametrization

Parametrization of the module.

relu

avgpool #

avgpool = AdaptiveAvgPool2d((1, 1))

base_width #

base_width = width_per_group

bn1 #

bn1 = norm_layer(inplanes)

conv1 #

conv1 = Conv2d(
    3,
    inplanes,
    kernel_size=3,
    stride=1,
    padding=1,
    bias=False,
    cov=deepcopy(cov),
    parametrization=parametrization,
    layer_type="input",
)

dilation #

dilation = 1

fc #

fc = Linear(
    512 * expansion,
    out_size,
    parametrization=parametrization,
    cov=deepcopy(cov),
    layer_type="output",
)

groups #

groups = groups

inplanes #

inplanes = 64

layer1 #

layer1 = _make_layer(
    block,
    64,
    num_blocks_per_layer[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer2 #

layer2 = _make_layer(
    block,
    128,
    num_blocks_per_layer[1],
    stride=2,
    dilate=replace_stride_with_dilation[0],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer3 #

layer3 = _make_layer(
    block,
    256,
    num_blocks_per_layer[2],
    stride=2,
    dilate=replace_stride_with_dilation[1],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

layer4 #

layer4 = _make_layer(
    block,
    512,
    num_blocks_per_layer[3],
    stride=2,
    dilate=replace_stride_with_dilation[2],
    parametrization=parametrization,
    cov=(
        deepcopy(cov)
        if isinstance(cov, DiagonalCovariance)
        else None
    ),
    layer_type="hidden",
)

optional_pool #

optional_pool = MaxPool2d(
    kernel_size=3, stride=2, padding=1
)

parametrization #

parametrization: Parametrization

Parametrization of the module.

relu #

relu = ReLU(inplace=True)

forward #

forward(
    input: Float[Tensor, "*sample batch *in_feature"],
    /,
    sample_shape: Size | None = Size([]),
    generator: Generator | None = None,
    input_contains_samples: bool = False,
    parameter_samples: (
        dict[str, Float[Tensor, "*sample parameter"]] | None
    ) = None,
) -> Float[Tensor, "*sample *batch *out_feature"]

from_pretrained_weights #

from_pretrained_weights(
    out_size: int,
    architecture: Literal["imagenet", "cifar"] = "imagenet",
    weights: Weights = DEFAULT,
    freeze: bool = False,
    *args,
    **kwargs
)

parameters_and_lrs #

parameters_and_lrs(
    lr: float, optimizer: Literal["SGD", "Adam"]
) -> list[dict[str, Tensor | float]]

Get the parameters of the module and their learning rates for the chosen optimizer and the parametrization of the module.

Parameters:

Name Type Description Default
lr float

The global learning rate.

required
optimizer Literal['SGD', 'Adam']

The optimizer being used.

required

reset_parameters #

reset_parameters() -> None

Reset the parameters of the module and set the parametrization of all children to the parametrization of the module.

This method should be implemented by subclasses to reset the parameters of the module.