deeplabv3

class mfai.pytorch.models.deeplabv3.ASPP(in_channels, out_channels, atrous_rates, separable=False)[source]

Bases: Module

Parameters:
  • in_channels (int)

  • out_channels (int)

  • atrous_rates (Sequence[int])

  • separable (bool)

forward(x)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
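
A minimal pure-Python sketch of why this matters (illustrative only, not the actual torch.nn.Module internals): calling the module instance runs registered hooks around forward(), while calling forward() directly silently skips them.

```python
class TinyModule:
    """Toy stand-in for torch.nn.Module's hook behavior."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        out = self.forward(x)
        for hook in self._forward_hooks:
            hook(self, x, out)  # hooks fire only when the instance is called
        return out

calls = []
m = TinyModule()
m.register_forward_hook(lambda mod, inp, out: calls.append(out))
y1 = m(3)          # hook fires
y2 = m.forward(3)  # same result, but the hook is silently skipped
```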

Parameters:

x (Tensor)

Return type:

Tensor
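
ASPP runs several 3x3 convolutions in parallel at different dilation (atrous) rates and merges the results, so each branch sees a different amount of context. The spatial extent of a dilated kernel follows the standard formula k_eff = k + (k - 1)(d - 1):

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a dilated (atrous) convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

# With the default decoder atrous_rates (12, 24, 36), the parallel 3x3
# branches cover very different context sizes on the same feature map.
extents = [effective_kernel_size(3, d) for d in (12, 24, 36)]
```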

class mfai.pytorch.models.deeplabv3.ASPPConv(in_channels, out_channels, dilation)[source]

Bases: Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

  • dilation (int)

class mfai.pytorch.models.deeplabv3.ASPPPooling(in_channels, out_channels)[source]

Bases: Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

forward(x)[source]

Runs the forward pass.

Return type:

Tensor

Parameters:

x (Tensor)

class mfai.pytorch.models.deeplabv3.ASPPSeparableConv(in_channels, out_channels, dilation)[source]

Bases: Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

  • dilation (int)

class mfai.pytorch.models.deeplabv3.Activation(name, **params)[source]

Bases: Module

Parameters:
  • name (str | None)

  • params (Any)

forward(x)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters:

x (Tensor)

Return type:

Tensor

class mfai.pytorch.models.deeplabv3.DeepLabV3(in_channels, out_channels, input_shape, settings=DeepLabV3Settings(encoder_name='resnet18', encoder_depth=5, encoder_weights=True, decoder_channels=256, activation=None, upsampling=8, aux_params=None, autopad_enabled=False))[source]

Bases: BaseModel, AutoPaddingModel

DeepLabV3 implementation from “Rethinking Atrous Convolution for Semantic Image Segmentation”.

Parameters:
  • in_channels (int) – Number of input channels for the model (e.g. 3 for RGB images)

  • out_channels (int) – Number of channels in the output mask (equivalently, the number of output classes)

  • settings (DeepLabV3Settings) – Model configuration, see DeepLabV3Settings

  • input_shape (tuple[int, int])

Returns:

DeepLabV3

Return type:

torch.nn.Module
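
The shape arithmetic implied by the default settings can be sketched as follows. Assumption: with upsampling=8, the decoder operates on features downsampled by a factor of 8 (feature_stride below is an illustrative parameter, not part of the mfai API), so input and output spatial shapes match when the input dimensions are divisible by the encoder's overall stride:

```python
def segmentation_output_shape(batch, out_channels, height, width,
                              feature_stride=8, upsampling=8):
    # Decoder features live at 1/feature_stride resolution; the segmentation
    # head then upsamples them by `upsampling`.
    fh, fw = height // feature_stride, width // feature_stride
    return (batch, out_channels, fh * upsampling, fw * upsampling)

shape = segmentation_output_shape(2, 4, 256, 256)
# For inputs not divisible by the stride, the output is slightly smaller,
# which is what the autopadding machinery below works around.
odd = segmentation_output_shape(1, 2, 250, 250)
```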

check_input_shape(x)[source]
Return type:

None

Parameters:

x (Tensor)

features_last: bool = False
forward(x)[source]

Sequentially pass x through the model's encoder, decoder and heads.

Return type:

Tensor | tuple[Tensor, Tensor]

Parameters:

x (Tensor)

get_classification_head(in_channels, out_channels, pooling='avg', dropout=0.2, activation_name=None)[source]
Return type:

Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

  • pooling (str)

  • dropout (float)

  • activation_name (str | None)

get_decoder(in_channels, out_channels)[source]
Return type:

DeepLabV3Decoder

Parameters:
  • in_channels (int)

  • out_channels (int)

get_segmentation_head(in_channels, out_channels, kernel_size=3, activation_name=None, upsampling=1)[source]
Return type:

Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

  • kernel_size (int)

  • activation_name (str | None)

  • upsampling (int)

initialize()[source]
Return type:

None

initialize_decoder(module)[source]
Return type:

None

Parameters:

module (Module)

initialize_head(module)[source]
Return type:

None

Parameters:

module (Module)

model_type: ModelType = 2
num_spatial_dims: int = 2
onnx_supported: bool = False
predict(x)[source]

Inference method. Switches the model to eval mode and calls .forward(x) under torch.no_grad().

Parameters:

x (Tensor) – 4D torch tensor with shape (batch_size, in_channels, height, width)

Returns:

4D torch tensor with shape (batch_size, out_channels, height, width)

Return type:

Tensor

register: bool = True
property settings: DeepLabV3Settings

Returns the settings instance used to configure the model.

settings_kls

alias of DeepLabV3Settings

supported_num_spatial_dims = (2,)
validate_input_shape(input_shape)[source]
Given an input shape, verifies whether the inputs fit the calling model's specifications.

Parameters:

input_shape (Size) – The shape of the input data, excluding the batch and channel dimensions. For example, for a batch of 2D tensors of shape [B, C, W, H], pass [W, H]; for 3D data of shape [B, C, W, H, D], pass [W, H, D].

Returns:

Returns a tuple where the first element is a boolean signaling whether the given input shape already fits the model's requirements. If that value is False, the second element contains the closest shape that fits the model; otherwise it is None.

Return type:

tuple[bool, Size]
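
A hedged pure-Python sketch of the check described above. Assumption: each spatial dimension must be divisible by a model-specific factor (here 2**encoder_depth = 32 for the default encoder_depth=5; the real factor comes from the model and is not stated in this docstring):

```python
def validate_input_shape(input_shape, factor=32):
    """Return (fits, closest_fitting_shape_or_None), rounding up per dimension."""
    fits = all(dim % factor == 0 for dim in input_shape)
    if fits:
        return True, None
    # Ceiling division without math.ceil: -(-dim // factor) rounds up.
    closest = tuple(-(-dim // factor) * factor for dim in input_shape)
    return False, closest
```

With autopad_enabled=True one would expect inputs to be padded up to this closest shape before the forward pass.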

class mfai.pytorch.models.deeplabv3.DeepLabV3Decoder(in_channels, out_channels=256, atrous_rates=(12, 24, 36))[source]

Bases: Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

  • atrous_rates (tuple)

forward(*features)[source]

Runs the forward pass.

Return type:

Tensor

Parameters:

features (tuple[Tensor])

class mfai.pytorch.models.deeplabv3.DeepLabV3Plus(in_channels, out_channels, input_shape, settings=DeepLabV3PlusSettings(encoder_name='resnet18', encoder_depth=5, encoder_weights=True, decoder_channels=256, activation=None, upsampling=4, aux_params=None, autopad_enabled=False, encoder_output_stride=16, decoder_atrous_rates=(12, 24, 36)))[source]

Bases: DeepLabV3

DeepLabV3+ implementation from “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation”.

Parameters:
  • in_channels (int) – Number of input channels for the model (e.g. 3 for RGB images)

  • out_channels (int) – Number of channels in the output mask (equivalently, the number of output classes)

  • settings (DeepLabV3PlusSettings) – Model configuration, see DeepLabV3PlusSettings

  • input_shape (tuple[int, int])

Returns:

DeepLabV3Plus

Return type:

torch.nn.Module

Reference:

https://arxiv.org/abs/1802.02611v3

property settings: DeepLabV3Settings

Returns the settings instance used to configure the model.

settings_kls

alias of DeepLabV3PlusSettings

class mfai.pytorch.models.deeplabv3.DeepLabV3PlusDecoder(encoder_channels, out_channels=256, atrous_rates=(12, 24, 36), output_stride=16)[source]

Bases: Module

Parameters:
  • encoder_channels (Sequence[int])

  • out_channels (int)

  • atrous_rates (tuple)

  • output_stride (int)

forward(*features)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters:

features (tuple[Tensor])

Return type:

Tensor

class mfai.pytorch.models.deeplabv3.DeepLabV3PlusSettings(encoder_name='resnet18', encoder_depth=5, encoder_weights=True, decoder_channels=256, activation=None, upsampling=4, aux_params=None, autopad_enabled=False, encoder_output_stride=16, decoder_atrous_rates=(12, 24, 36))[source]

Bases: DeepLabV3Settings

encoder_output_stride: Downsampling factor for the last encoder features (see the original paper for an explanation)

decoder_atrous_rates: Dilation rates for the ASPP module (should be a tuple of 3 integer values)

upsampling: Final upsampling factor. Default is 4 to preserve input-output spatial shape identity

Parameters:
  • encoder_name (Literal['resnet18', 'resnet34', 'resnet50'])

  • encoder_depth (int)

  • encoder_weights (bool)

  • decoder_channels (int)

  • activation (str | None)

  • upsampling (int)

  • aux_params (dict | None)

  • autopad_enabled (bool)

  • encoder_output_stride (int)

  • decoder_atrous_rates (tuple)

activation: str | None
autopad_enabled: bool
aux_params: Optional[dict]
decoder_atrous_rates: tuple
decoder_channels: int
encoder_depth: int
encoder_name: Literal['resnet18', 'resnet34', 'resnet50']
encoder_output_stride: int
encoder_weights: bool
classmethod from_dict(kvs, *, infer_missing=False)
Return type:

TypeVar(A, bound= DataClassJsonMixin)

Parameters:

kvs (dict | list | str | int | float | bool | None)

classmethod from_json(s, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw)
Return type:

TypeVar(A, bound= DataClassJsonMixin)

Parameters:

s (str | bytes | bytearray)

classmethod schema(*, infer_missing=False, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)
Return type:

SchemaF[TypeVar(A, bound= DataClassJsonMixin)]

to_dict(encode_json=False)
Return type:

Dict[str, Union[dict, list, str, int, float, bool, None]]

to_json(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, indent=None, separators=None, default=None, sort_keys=False, **kw)
Return type:

str

upsampling: int
class mfai.pytorch.models.deeplabv3.DeepLabV3Settings(encoder_name='resnet18', encoder_depth=5, encoder_weights=True, decoder_channels=256, activation=None, upsampling=8, aux_params=None, autopad_enabled=False)[source]

Bases: object

encoder_name: Name of the classification model used as the encoder (a.k.a. backbone) to extract features at different spatial resolutions

encoder_depth: Number of stages used in the encoder, in range [3, 5]. Each stage generates features two times smaller in spatial dimensions than the previous one (e.g. for depth 0 we will have features with shapes [(N, C, H, W)], for depth 1 - [(N, C, H, W), (N, C, H // 2, W // 2)], and so on). Default is 5

encoder_weights: One of None (random initialization), “imagenet” (pre-training on ImageNet), or other pretrained weights (see the table of available weights for each encoder_name)

decoder_channels: Number of convolution filters in the ASPP module. Default is 256

activation: An activation function to apply after the final convolution layer. Available options are “sigmoid”, “softmax”, “logsoftmax”, “tanh”, “identity”, a callable, and None. Default is None

upsampling: Final upsampling factor. Default is 8 to preserve input-output spatial shape identity

aux_params: Dictionary with parameters of the auxiliary output (classification head). The auxiliary output is built on top of the encoder if aux_params is not None (default is None). Supported params:
  • classes (int): A number of classes

  • pooling (str): One of “max”, “avg”. Default is “avg”

  • dropout (float): Dropout factor in [0, 1)

  • activation (str): An activation function to apply, “sigmoid”/“softmax” (could be None to return logits)
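
A hedged sketch of the aux_params contract listed above: a plain-dict value obeying the documented keys and ranges, plus a small validator. The keys and constraints come from this docstring; the validator function itself is illustrative, not part of the mfai API.

```python
def check_aux_params(aux_params):
    """Check an aux_params dict against the documented constraints."""
    assert isinstance(aux_params["classes"], int) and aux_params["classes"] > 0
    assert aux_params.get("pooling", "avg") in ("max", "avg")
    dropout = aux_params.get("dropout", 0.0)
    assert 0.0 <= dropout < 1.0  # documented range is [0, 1)
    assert aux_params.get("activation") in ("sigmoid", "softmax", None)
    return True

# Example value: a 10-class auxiliary classification head returning logits.
aux_params = {"classes": 10, "pooling": "avg", "dropout": 0.2, "activation": None}
ok = check_aux_params(aux_params)
```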

Parameters:
  • encoder_name (Literal['resnet18', 'resnet34', 'resnet50'])

  • encoder_depth (int)

  • encoder_weights (bool)

  • decoder_channels (int)

  • activation (str | None)

  • upsampling (int)

  • aux_params (dict | None)

  • autopad_enabled (bool)

activation: str | None
autopad_enabled: bool
aux_params: Optional[dict]
decoder_channels: int
encoder_depth: int
encoder_name: Literal['resnet18', 'resnet34', 'resnet50']
encoder_weights: bool
classmethod from_dict(kvs, *, infer_missing=False)
Return type:

TypeVar(A, bound= DataClassJsonMixin)

Parameters:

kvs (dict | list | str | int | float | bool | None)

classmethod from_json(s, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw)
Return type:

TypeVar(A, bound= DataClassJsonMixin)

Parameters:

s (str | bytes | bytearray)

classmethod schema(*, infer_missing=False, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)
Return type:

SchemaF[TypeVar(A, bound= DataClassJsonMixin)]

to_dict(encode_json=False)
Return type:

Dict[str, Union[dict, list, str, int, float, bool, None]]

to_json(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, indent=None, separators=None, default=None, sort_keys=False, **kw)
Return type:

str

upsampling: int
class mfai.pytorch.models.deeplabv3.SeparableConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, bias=True)[source]

Bases: Sequential

Parameters:
  • in_channels (int)

  • out_channels (int)

  • kernel_size (int)

  • stride (int)

  • padding (int)

  • dilation (int)

  • bias (bool)