NamedTensor
class mfai.NamedTensor(tensor: Tensor, names: List[str], feature_names: List[str], feature_dim_name: str = "features")
The NamedTensor class is a wrapper around a PyTorch tensor with additional attributes and methods. It lets us pass around a single, consistent object that links data and metadata, and it provides extra utility methods (concatenation along the features dimension, in-place flattening, …).
Parameters
tensor (torch.Tensor): The tensor to wrap.
names (list of str): Names of the tensor’s dimensions.
feature_names (list of str): Names of the features along the ‘feature_dim_name’ of the tensor.
feature_dim_name (str, optional): Name of the feature dimension. Defaults to "features".
Attributes
device (torch.device): Device where the Tensor is stored.
feature_dim_name (str): Name of the feature dimension.
feature_names (list of str): Names of the features along the ‘feature_dim_name’ of the tensor.
feature_names_to_idx (dict): Dictionary mapping feature names to their indices.
names (list of str): Names of the tensor’s dimensions.
ndims (int): Number of dimensions of the tensor.
num_spatial_dims (int): Number of spatial dimensions of the tensor.
spatial_dim_idx (list of int): Indices of the spatial dimensions in the tensor.
tensor (torch.Tensor): The wrapped tensor.
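As a rough illustration of how two of these attributes relate to the constructor arguments, the pure-Python sketch below derives a name-to-index mapping and a dimension count from `names` and `feature_names`. It is not mfai's implementation: the helper `describe_metadata` is hypothetical, and it assumes that each feature's index is simply its position in `feature_names`.

```python
from typing import Dict, List


def describe_metadata(names: List[str], feature_names: List[str]) -> Dict[str, object]:
    """Hypothetical helper: derive metadata comparable to two NamedTensor attributes."""
    return {
        # Assumption: a feature's index is its position in feature_names.
        "feature_names_to_idx": {name: idx for idx, name in enumerate(feature_names)},
        # One name per tensor dimension, so the number of names is the rank.
        "ndims": len(names),
    }


meta = describe_metadata(
    names=["batch", "lat", "lon", "features"],
    feature_names=["u", "v", "t2m"],
)
print(meta["feature_names_to_idx"])
print(meta["ndims"])
```

A mapping like this is what makes lookups by feature name cheap: the name is resolved to an integer once, and the integer is then used to index the feature dimension.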
Methods
| Method | Description |
|---|---|
| | Clone with a deepcopy. |
| `collate_fn` | Collates a batch of NamedTensor instances into a single NamedTensor. |
| | Concatenates a list of |
| | Return the index of a dimension given its name. |
| | Returns the size of a dimension given its name. |
| | Creates a new |
| `flatten_` | Flattens the underlying tensor in place from start_dim to end_dim. Deletes the flattened dimension names and inserts the new one. |
| | Returns the tensor indexed along the dimension |
| | Same as |
| | Creates a new |
| `iter_dim` | Iterates over the specified dimension, yielding NamedTensor instances. |
| `iter_tensor_dim` | Iterates over the specified dimension, yielding Tensor instances. |
| | In place operation to pin the underlying tensor to memory. |
| `rearrange_` | Rearranges the tensor in place using einops-like syntax. |
| | Returns the |
| | Return the Tensor indexed along the dimension dim_name with the index index. Allows the selection of the feature dimension. |
| | Squeeze the underlying tensor along the dimension(s) given its/their name(s). |
| | Stacks a list of |
| | In place operation to call torch’s ‘to’ method on the underlying tensor. |
| | Modifies the type of the underlying torch tensor by calling torch’s |
| | Unflatten the dimension dim of the underlying tensor. Insert unflattened_size dimension instead. |
| | Insert a new dimension dim_name of size 1 at dim_index. |
| | Unsqueeze and expand the tensor to have the same number of spatial dimensions as another |
Special Methods
| Method | Description |
|---|---|
| `__getitem__` | Get one feature from the features dimension of the tensor by name. The returned tensor has the same number of dimensions as the original tensor. |
| `__or__` | Concatenate two NamedTensor instances along the features dimension. |
| | Returns a string representation of the NamedTensor. |
Example Usage
Instantiation
In the following example, we create a NamedTensor from a PyTorch tensor with four dimensions: batch, lat, lon, and features. The names argument gives the names of the dimensions, and the feature_names argument gives the names of the features.
import torch
from mfai.pytorch.namedtensor import NamedTensor

# Wrap a (batch, lat, lon, features) tensor whose last dimension holds 3 features
tensor = torch.rand(4, 256, 256, 3)
nt = NamedTensor(
    tensor,
    names=["batch", "lat", "lon", "features"],
    feature_names=["u", "v", "t2m"],
)
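For the metadata to describe the data correctly, the arguments need to agree with the tensor's shape: one entry in `names` per dimension, and one entry in `feature_names` per slot along the feature dimension. The pure-Python sketch below spells out those two conditions on a plain shape tuple. The function `check_named_shape` is hypothetical, and whether NamedTensor itself performs such checks (and which exception it would raise) is an assumption rather than something this page states.

```python
from typing import List, Sequence


def check_named_shape(
    shape: Sequence[int],
    names: List[str],
    feature_names: List[str],
    feature_dim_name: str = "features",
) -> None:
    """Hypothetical check that names and feature_names match a tensor shape."""
    if len(names) != len(shape):
        raise ValueError(f"expected {len(shape)} dimension names, got {len(names)}")
    if feature_dim_name not in names:
        raise ValueError(f"{feature_dim_name!r} is not one of the dimension names")
    n_features = shape[names.index(feature_dim_name)]
    if len(feature_names) != n_features:
        raise ValueError(f"expected {n_features} feature names, got {len(feature_names)}")


dims = ["batch", "lat", "lon", "features"]

# Matches the instantiation example: 4 dimensions, 3 features.
check_named_shape((4, 256, 256, 3), dims, ["u", "v", "t2m"])

# A mismatch is reported instead of silently mislabelling the data.
try:
    check_named_shape((4, 256, 256, 3), dims, ["u", "v"])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)
```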
Concatenation, Indexing, New Like, Flatten, Rearrange
import torch
from mfai.pytorch.namedtensor import NamedTensor

nt1 = NamedTensor(
    torch.rand(4, 256, 256, 3),
    names=["batch", "lat", "lon", "features"],
    feature_names=["u", "v", "t2m"],
)
nt2 = NamedTensor(
    torch.rand(4, 256, 256, 1),
    names=["batch", "lat", "lon", "features"],
    feature_names=["q"],
)

# Concatenate along the features dimension
nt3 = nt1 | nt2

# Index by feature name
u_feature = nt3["u"]

# Create a new NamedTensor with the same names but different data
nt4 = NamedTensor.new_like(torch.rand(4, 256, 256, 4), nt3)

# Flatten in place the lat and lon dimensions and rename the new dim to 'ngrid'
nt3.flatten_("ngrid", 1, 2)

# String representation of the NamedTensor yields useful statistics
print(nt3)

# Rearrange in place using einops-like syntax
nt3.rearrange_("batch ngrid features -> batch features ngrid")
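To make the bookkeeping in the steps above easier to follow, the sketch below traces only the dimension names and the shape through concatenation, flattening, and rearranging. It is a pure-Python model, not mfai's code: the helpers `concat_features`, `flatten`, and `rearrange` are hypothetical stand-ins, and the `rearrange` stand-in handles only plain permutations rather than full einops syntax.

```python
from math import prod
from typing import List, Tuple

Meta = Tuple[List[str], Tuple[int, ...]]  # (dimension names, shape)


def concat_features(a: Meta, b: Meta, dim: str = "features") -> Meta:
    """Sizes add up along `dim`; every other dimension must match."""
    (names_a, shape_a), (names_b, shape_b) = a, b
    i = names_a.index(dim)
    assert names_a == names_b
    assert shape_a[:i] + shape_a[i + 1:] == shape_b[:i] + shape_b[i + 1:]
    return names_a, shape_a[:i] + (shape_a[i] + shape_b[i],) + shape_a[i + 1:]


def flatten(meta: Meta, new_name: str, start: int, end: int) -> Meta:
    """Merge dimensions start..end (inclusive) into one dimension called new_name."""
    names, shape = meta
    merged = prod(shape[start:end + 1])
    return (
        names[:start] + [new_name] + names[end + 1:],
        shape[:start] + (merged,) + shape[end + 1:],
    )


def rearrange(meta: Meta, pattern: str) -> Meta:
    """Handle only pure permutations such as 'a b c -> a c b'."""
    names, shape = meta
    src, dst = (side.split() for side in pattern.split("->"))
    assert src == names and sorted(src) == sorted(dst)
    return dst, tuple(shape[src.index(name)] for name in dst)


dims = ["batch", "lat", "lon", "features"]
after_concat = concat_features((dims, (4, 256, 256, 3)), (dims, (4, 256, 256, 1)))
after_flatten = flatten(after_concat, "ngrid", 1, 2)
after_rearrange = rearrange(after_flatten, "batch ngrid features -> batch features ngrid")

for step in (after_concat, after_flatten, after_rearrange):
    print(step)
```

Tracking names alongside shapes is what lets the later `rearrange_` call refer to `ngrid`, a dimension that only exists after `flatten_` has run.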
Selection and Index Selection
import torch
from mfai.pytorch.namedtensor import NamedTensor

nt = NamedTensor(
    torch.rand(4, 256, 256, 3),
    names=["batch", "lat", "lon", "features"],
    feature_names=["u", "v", "t2m"],
)

# Select index 0 along the 'lat' dimension. The selected dimension is removed from the result.
# select_dim returns a NamedTensor, select_tensor_dim returns a plain torch.Tensor.
selected_named_tensor = nt.select_dim("lat", 0)
selected_tensor = nt.select_tensor_dim("lat", 0)
# Both hold the same values (torch.equal compares whole tensors and returns a bool)
assert torch.equal(selected_named_tensor.tensor, selected_tensor)

# Select several indices along a dimension using an index tensor.
# The result has the same number of dimensions as the input: the dim_name dimension
# has size len(indices), and every other dimension keeps its original size.
indexed_tensor = nt.index_select_dim("features", torch.tensor([0, 2]))
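The difference between the two kinds of selection is easiest to see in the resulting shapes. The sketch below computes them in pure Python from the rules stated in the comments above: selecting a single index drops the dimension, while selecting with an index tensor keeps it with a size equal to the number of indices. The helpers `select_shape` and `index_select_shape` are hypothetical and only model names and shapes, not the data.

```python
from typing import List, Sequence, Tuple

Meta = Tuple[List[str], Tuple[int, ...]]  # (dimension names, shape)


def select_shape(shape: Sequence[int], names: List[str], dim_name: str) -> Meta:
    """Single-index selection: the selected dimension disappears."""
    i = names.index(dim_name)
    return names[:i] + names[i + 1:], tuple(shape[:i]) + tuple(shape[i + 1:])


def index_select_shape(
    shape: Sequence[int], names: List[str], dim_name: str, indices: Sequence[int]
) -> Meta:
    """Multi-index selection: the dimension stays, sized by the number of indices."""
    i = names.index(dim_name)
    return list(names), tuple(shape[:i]) + (len(indices),) + tuple(shape[i + 1:])


dims = ["batch", "lat", "lon", "features"]
shape = (4, 256, 256, 3)

selected = select_shape(shape, dims, "lat")
index_selected = index_select_shape(shape, dims, "features", [0, 2])
print(selected)
print(index_selected)
```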
Iteration
import torch
from mfai.pytorch.namedtensor import NamedTensor

nt = NamedTensor(
    torch.rand(4, 256, 256, 3),
    names=["batch", "lat", "lon", "features"],
    feature_names=["u", "v", "t2m"],
)

# Iterate over the 'batch' dimension, yielding NamedTensor instances
for named_tensor in nt.iter_dim("batch"):
    print(named_tensor)

# Iterate over the 'batch' dimension, yielding Tensor instances
for tensor in nt.iter_tensor_dim("batch"):
    print(tensor.shape)
Collation
import torch
from mfai.pytorch.namedtensor import NamedTensor

nt1 = NamedTensor(
    torch.rand(256, 256, 3),
    names=["lat", "lon", "features"],
    feature_names=["u", "v", "t2m"],
)
nt2 = NamedTensor(
    torch.rand(256, 256, 3),
    names=["lat", "lon", "features"],
    feature_names=["u", "v", "t2m"],
)

# Collate a batch of NamedTensor instances into a single NamedTensor
collated_nt = NamedTensor.collate_fn([nt1, nt2])
print(collated_nt)

# Collate a batch with zero padding on the lat and lon dimensions,
# for inputs whose spatial sizes differ
nt1 = NamedTensor(
    torch.rand(128, 128, 3),
    names=["lat", "lon", "features"],
    feature_names=["u", "v", "t2m"],
)
nt2 = NamedTensor(
    torch.rand(256, 256, 3),
    names=["lat", "lon", "features"],
    feature_names=["u", "v", "t2m"],
)
collated_nt_padded = NamedTensor.collate_fn(
    [nt1, nt2],
    pad_dims=("lat", "lon"),
    pad_value=0.0,
)
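The idea behind padded collation can be sketched without any tensor library: pad every sample to the largest size found in the batch along the padded dimensions, then stack the results. The pure-Python example below does this for small 2D (lat, lon) grids stored as nested lists. The helpers `pad_grid` and `collate_with_padding` are hypothetical, and appending the padding at the end of each dimension is an assumption; this page does not say where `collate_fn` places it.

```python
from typing import List

Grid = List[List[float]]  # a 2D (lat, lon) field stored as nested lists


def pad_grid(grid: Grid, n_lat: int, n_lon: int, pad_value: float = 0.0) -> Grid:
    """Pad a grid to (n_lat, n_lon), appending pad_value at the end of each dimension."""
    padded = [row + [pad_value] * (n_lon - len(row)) for row in grid]
    padded += [[pad_value] * n_lon for _ in range(n_lat - len(grid))]
    return padded


def collate_with_padding(grids: List[Grid], pad_value: float = 0.0) -> List[Grid]:
    """Pad every grid to the largest lat/lon size in the batch, then stack them."""
    n_lat = max(len(grid) for grid in grids)
    n_lon = max(len(grid[0]) for grid in grids)
    return [pad_grid(grid, n_lat, n_lon, pad_value) for grid in grids]


small = [[1.0, 2.0]]  # 1 row, 2 columns
large = [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]  # 2 rows, 3 columns

batch = collate_with_padding([small, large])
for grid in batch:
    print(grid)
```

After padding, every sample has the same size, so the batch can be stored as one regular array with a leading batch dimension.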
For more details, refer to the NamedTensor class in mfai/pytorch/namedtensor.py.