deeplabv3

class mfai.pytorch.models.deeplabv3.ASPP(in_channels, out_channels, atrous_rates, separable=False)[source]

Bases: Module

Parameters:
  • in_channels (int)

  • out_channels (int)

  • atrous_rates (Sequence[int])

  • separable (bool)

forward(x)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
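
A minimal pure-Python sketch of why this matters (illustrative only, not the actual torch.nn.Module internals): calling the module instance runs registered hooks around forward(), while calling forward() directly silently skips them.

```python
class TinyModule:
    """Toy stand-in for torch.nn.Module's hook behavior."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        out = self.forward(x)
        for hook in self._forward_hooks:
            hook(self, x, out)  # hooks fire only when the instance is called
        return out

calls = []
m = TinyModule()
m.register_forward_hook(lambda mod, inp, out: calls.append(out))
y1 = m(3)          # hook fires
y2 = m.forward(3)  # same result, but the hook is silently skipped
```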

Parameters:

x (Tensor)

Return type:

Tensor
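
ASPP runs several 3x3 convolutions in parallel at different dilation (atrous) rates and merges the results, so each branch sees a different amount of context. The spatial extent of a dilated kernel follows the standard formula k_eff = k + (k - 1)(d - 1):

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a dilated (atrous) convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

# With the default decoder atrous_rates (12, 24, 36), the parallel 3x3
# branches cover very different context sizes on the same feature map.
extents = [effective_kernel_size(3, d) for d in (12, 24, 36)]
```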

class mfai.pytorch.models.deeplabv3.ASPPConv(in_channels, out_channels, dilation)[source]

Bases: Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

  • dilation (int)

class mfai.pytorch.models.deeplabv3.ASPPPooling(in_channels, out_channels)[source]

Bases: Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

forward(x)[source]

Runs the forward pass.

Return type:

Tensor

Parameters:

x (Tensor)

class mfai.pytorch.models.deeplabv3.ASPPSeparableConv(in_channels, out_channels, dilation)[source]

Bases: Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

  • dilation (int)

class mfai.pytorch.models.deeplabv3.Activation(name, **params)[source]

Bases: Module

Parameters:
  • name (str | None)

  • params (Any)

forward(x)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters:

x (Tensor)

Return type:

Tensor

class mfai.pytorch.models.deeplabv3.DeepLabV3(in_channels, out_channels, input_shape, settings=DeepLabV3Settings(encoder_name='resnet18', encoder_depth=5, encoder_weights=True, decoder_channels=256, activation=None, upsampling=8, aux_params=None, autopad_enabled=False))[source]

Bases: BaseModel, AutoPaddingModel

DeepLabV3 implementation from “Rethinking Atrous Convolution for Semantic Image Segmentation”.

Parameters:
  • in_channels (int) – Number of input channels for the model (e.g. 3 for RGB images)

  • out_channels (int) – Number of channels in the output mask (equivalently, the number of output classes)

  • settings (DeepLabV3Settings) – Model configuration, see DeepLabV3Settings

  • input_shape (tuple[int, int])

Returns:

DeepLabV3

Return type:

torch.nn.Module
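
The shape arithmetic implied by the default settings can be sketched as follows. Assumption: with upsampling=8, the decoder operates on features downsampled by a factor of 8 (feature_stride below is an illustrative parameter, not part of the mfai API), so input and output spatial shapes match when the input dimensions are divisible by the encoder's overall stride:

```python
def segmentation_output_shape(batch, out_channels, height, width,
                              feature_stride=8, upsampling=8):
    # Decoder features live at 1/feature_stride resolution; the segmentation
    # head then upsamples them by `upsampling`.
    fh, fw = height // feature_stride, width // feature_stride
    return (batch, out_channels, fh * upsampling, fw * upsampling)

shape = segmentation_output_shape(2, 4, 256, 256)
# For inputs not divisible by the stride, the output is slightly smaller,
# which is what the autopadding machinery below works around.
odd = segmentation_output_shape(1, 2, 250, 250)
```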

check_input_shape(x)[source]
Return type:

None

Parameters:

x (Tensor)

features_last: bool = False
forward(x)[source]

Sequentially pass x through the model's encoder, decoder and heads.

Return type:

Tensor | tuple[Tensor, Tensor]

Parameters:

x (Tensor)

get_classification_head(in_channels, out_channels, pooling='avg', dropout=0.2, activation_name=None)[source]
Return type:

Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

  • pooling (str)

  • dropout (float)

  • activation_name (str | None)

get_decoder(in_channels, out_channels)[source]
Return type:

DeepLabV3Decoder

Parameters:
  • in_channels (int)

  • out_channels (int)

get_segmentation_head(in_channels, out_channels, kernel_size=3, activation_name=None, upsampling=1)[source]
Return type:

Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

  • kernel_size (int)

  • activation_name (str | None)

  • upsampling (int)

initialize()[source]
Return type:

None

initialize_decoder(module)[source]
Return type:

None

Parameters:

module (Module)

initialize_head(module)[source]
Return type:

None

Parameters:

module (Module)

model_type: ModelType = 2
num_spatial_dims: int = 2
onnx_supported: bool = False
predict(x)[source]

Inference method. Switches the model to eval mode and calls .forward(x) under torch.no_grad().

Parameters:

x (Tensor) – 4D torch tensor with shape (batch_size, in_channels, height, width)

Returns:

4D torch tensor with shape (batch_size, out_channels, height, width)

Return type:

Tensor

register: bool = True
property settings: DeepLabV3Settings

Returns the settings instance used to configure the model.

settings_kls

alias of DeepLabV3Settings

supported_num_spatial_dims = (2,)
validate_input_shape(input_shape)[source]
Given an input shape, verifies whether the inputs fit the calling model's specifications.

Parameters:

input_shape (Size) – The shape of the input data, excluding the batch and channel dimensions. For example, for a batch of 2D tensors of shape [B, C, W, H], pass [W, H]; for 3D data of shape [B, C, W, H, D], pass [W, H, D].

Returns:

Returns a tuple where the first element is a boolean signaling whether the given input shape already fits the model's requirements. If that value is False, the second element contains the closest shape that fits the model; otherwise it is None.

Return type:

tuple[bool, Size]
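
A hedged pure-Python sketch of the check described above. Assumption: each spatial dimension must be divisible by a model-specific factor (here 2**encoder_depth = 32 for the default encoder_depth=5; the real factor comes from the model and is not stated in this docstring):

```python
def validate_input_shape(input_shape, factor=32):
    """Return (fits, closest_fitting_shape_or_None), rounding up per dimension."""
    fits = all(dim % factor == 0 for dim in input_shape)
    if fits:
        return True, None
    # Ceiling division without math.ceil: -(-dim // factor) rounds up.
    closest = tuple(-(-dim // factor) * factor for dim in input_shape)
    return False, closest
```

With autopad_enabled=True one would expect inputs to be padded up to this closest shape before the forward pass.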

class mfai.pytorch.models.deeplabv3.DeepLabV3Decoder(in_channels, out_channels=256, atrous_rates=(12, 24, 36))[source]

Bases: Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

  • atrous_rates (tuple)

forward(*features)[source]

Runs the forward pass.

Return type:

Tensor

Parameters:

features (tuple[Tensor])

class mfai.pytorch.models.deeplabv3.DeepLabV3Plus(in_channels, out_channels, input_shape, settings=DeepLabV3PlusSettings(encoder_name='resnet18', encoder_depth=5, encoder_weights=True, decoder_channels=256, activation=None, upsampling=4, aux_params=None, autopad_enabled=False, encoder_output_stride=16, decoder_atrous_rates=(12, 24, 36)))[source]

Bases: DeepLabV3

DeepLabV3+ implementation from “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation”.

Parameters:
  • in_channels (int) – Number of input channels for the model (e.g. 3 for RGB images)

  • out_channels (int) – Number of channels in the output mask (equivalently, the number of output classes)

  • settings (DeepLabV3PlusSettings) – Model configuration, see DeepLabV3PlusSettings

  • input_shape (tuple[int, int])

Returns:

DeepLabV3Plus

Return type:

torch.nn.Module

Reference:

https://arxiv.org/abs/1802.02611v3

property settings: DeepLabV3Settings

Returns the settings instance used to configure the model.

settings_kls

alias of DeepLabV3PlusSettings

class mfai.pytorch.models.deeplabv3.DeepLabV3PlusDecoder(encoder_channels, out_channels=256, atrous_rates=(12, 24, 36), output_stride=16)[source]

Bases: Module

Parameters:
  • encoder_channels (Sequence[int])

  • out_channels (int)

  • atrous_rates (tuple)

  • output_stride (int)

forward(*features)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters:

features (tuple[Tensor])

Return type:

Tensor

class mfai.pytorch.models.deeplabv3.DeepLabV3PlusSettings(encoder_name='resnet18', encoder_depth=5, encoder_weights=True, decoder_channels=256, activation=None, upsampling=4, aux_params=None, autopad_enabled=False, encoder_output_stride=16, decoder_atrous_rates=(12, 24, 36))[source]

Bases: DeepLabV3Settings

encoder_output_stride: Downsampling factor for the last encoder features (see the original paper for an explanation)

decoder_atrous_rates: Dilation rates for the ASPP module (should be a tuple of 3 integer values)

upsampling: Final upsampling factor. Default is 4 to preserve input-output spatial shape identity

Parameters:
  • encoder_name (Literal['resnet18', 'resnet34', 'resnet50'])

  • encoder_depth (int)

  • encoder_weights (bool)

  • decoder_channels (int)

  • activation (str | None)

  • upsampling (int)

  • aux_params (dict | None)

  • autopad_enabled (bool)

  • encoder_output_stride (int)

  • decoder_atrous_rates (tuple)

activation: str | None
autopad_enabled: bool
aux_params: Optional[dict]
decoder_atrous_rates: tuple
decoder_channels: int
encoder_depth: int
encoder_name: Literal['resnet18', 'resnet34', 'resnet50']
encoder_output_stride: int
encoder_weights: bool
classmethod from_dict(kvs, *, infer_missing=False)
Return type:

TypeVar(A, bound= DataClassJsonMixin)

Parameters:

kvs (dict | list | str | int | float | bool | None)

classmethod from_json(s, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw)
Return type:

TypeVar(A, bound= DataClassJsonMixin)

Parameters:

s (str | bytes | bytearray)

classmethod schema(*, infer_missing=False, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)
Return type:

SchemaF[TypeVar(A, bound= DataClassJsonMixin)]

to_dict(encode_json=False)
Return type:

Dict[str, Union[dict, list, str, int, float, bool, None]]

to_json(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, indent=None, separators=None, default=None, sort_keys=False, **kw)
Return type:

str

upsampling: int
class mfai.pytorch.models.deeplabv3.DeepLabV3Settings(encoder_name='resnet18', encoder_depth=5, encoder_weights=True, decoder_channels=256, activation=None, upsampling=8, aux_params=None, autopad_enabled=False)[source]

Bases: object

encoder_name: Name of the classification model used as the encoder (a.k.a. backbone) to extract features at different spatial resolutions

encoder_depth: Number of stages used in the encoder, in range [3, 5]. Each stage generates features two times smaller in spatial dimensions than the previous one (e.g. for depth 0 we will have features with shapes [(N, C, H, W)], for depth 1 - [(N, C, H, W), (N, C, H // 2, W // 2)], and so on). Default is 5

encoder_weights: One of None (random initialization), “imagenet” (pre-training on ImageNet), or other pretrained weights (see the table of available weights for each encoder_name)

decoder_channels: Number of convolution filters in the ASPP module. Default is 256

activation: An activation function to apply after the final convolution layer. Available options are “sigmoid”, “softmax”, “logsoftmax”, “tanh”, “identity”, a callable, and None. Default is None

upsampling: Final upsampling factor. Default is 8 to preserve input-output spatial shape identity

aux_params: Dictionary with parameters of the auxiliary output (classification head). The auxiliary output is built on top of the encoder if aux_params is not None (default is None). Supported params:
  • classes (int): A number of classes

  • pooling (str): One of “max”, “avg”. Default is “avg”

  • dropout (float): Dropout factor in [0, 1)

  • activation (str): An activation function to apply, “sigmoid”/“softmax” (could be None to return logits)
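
A hedged sketch of the aux_params contract listed above: a plain-dict value obeying the documented keys and ranges, plus a small validator. The keys and constraints come from this docstring; the validator function itself is illustrative, not part of the mfai API.

```python
def check_aux_params(aux_params):
    """Check an aux_params dict against the documented constraints."""
    assert isinstance(aux_params["classes"], int) and aux_params["classes"] > 0
    assert aux_params.get("pooling", "avg") in ("max", "avg")
    dropout = aux_params.get("dropout", 0.0)
    assert 0.0 <= dropout < 1.0  # documented range is [0, 1)
    assert aux_params.get("activation") in ("sigmoid", "softmax", None)
    return True

# Example value: a 10-class auxiliary classification head returning logits.
aux_params = {"classes": 10, "pooling": "avg", "dropout": 0.2, "activation": None}
ok = check_aux_params(aux_params)
```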

Parameters:
  • encoder_name (Literal['resnet18', 'resnet34', 'resnet50'])

  • encoder_depth (int)

  • encoder_weights (bool)

  • decoder_channels (int)

  • activation (str | None)

  • upsampling (int)

  • aux_params (dict | None)

  • autopad_enabled (bool)

activation: str | None
autopad_enabled: bool
aux_params: Optional[dict]
decoder_channels: int
encoder_depth: int
encoder_name: Literal['resnet18', 'resnet34', 'resnet50']
encoder_weights: bool
classmethod from_dict(kvs, *, infer_missing=False)
Return type:

TypeVar(A, bound= DataClassJsonMixin)

Parameters:

kvs (dict | list | str | int | float | bool | None)

classmethod from_json(s, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw)
Return type:

TypeVar(A, bound= DataClassJsonMixin)

Parameters:

s (str | bytes | bytearray)

classmethod schema(*, infer_missing=False, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)
Return type:

SchemaF[TypeVar(A, bound= DataClassJsonMixin)]

to_dict(encode_json=False)
Return type:

Dict[str, Union[dict, list, str, int, float, bool, None]]

to_json(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, indent=None, separators=None, default=None, sort_keys=False, **kw)
Return type:

str

upsampling: int
class mfai.pytorch.models.deeplabv3.SeparableConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, bias=True)[source]

Bases: Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

  • kernel_size (int)

  • stride (int)

  • padding (int)

  • dilation (int)

  • bias (bool)