clip¶
LightningModule used to train a CLIP model.
- class mfai.pytorch.lightning_modules.clip.CLIPAccuracySkillScore(top_k, batch_size)[source]¶
Bases: Metric
CLIP Accuracy Skill Score. The accuracy is computed from the probabilities matrix returned by CLIP; a uniformly random model is then used as the reference for the skill score:
- 0 or negative = no better than a random model
- 1 = perfect model
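The skill-score idea can be sketched as follows. This is a hypothetical illustration, not the library's actual code: for CLIP pair matching over a batch, a uniformly random model picks the right text for each image with probability 1 / batch_size, and the accuracy is rescaled against that baseline.

```python
# Hypothetical sketch of an accuracy skill score against a random baseline.
# The real CLIPAccuracySkillScore is a torchmetrics Metric; this pure-Python
# function only shows the rescaling idea.
def accuracy_skill_score(accuracy: float, batch_size: int) -> float:
    random_accuracy = 1.0 / batch_size  # random model's expected accuracy
    return (accuracy - random_accuracy) / (1.0 - random_accuracy)

print(accuracy_skill_score(1.0, 32))       # perfect model -> 1.0
print(accuracy_skill_score(1.0 / 32, 32))  # random model  -> 0.0
```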
- class mfai.pytorch.lightning_modules.clip.CLIPLightningModule(settings, learning_rate=0.0005, min_learning_rate=0.0001, lr_scheduler_interval='step')[source]¶
Bases: LightningModule
- Parameters:
settings (ClipSettings)
learning_rate (float)
min_learning_rate (float)
lr_scheduler_interval (Literal['step', 'epoch', None])
- configure_optimizers()[source]¶
Lightning method to define optimizers and learning-rate schedulers used for optimization. For more details about this method, please see: https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.core.LightningModule.html#lightning.pytorch.core.LightningModule.configure_optimizers.
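A minimal sketch of the shape such a configure_optimizers can return, assuming an AdamW optimizer with cosine annealing down to min_learning_rate; the optimizer, scheduler, and T_max value here are illustrative assumptions, not necessarily what CLIPLightningModule actually uses.

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

class OptimSketch:
    """Stand-in for a LightningModule, showing only configure_optimizers."""

    def __init__(self, learning_rate=5e-4, min_learning_rate=1e-4,
                 lr_scheduler_interval="step"):
        self.learning_rate = learning_rate
        self.min_learning_rate = min_learning_rate
        self.lr_scheduler_interval = lr_scheduler_interval
        self._params = [torch.nn.Parameter(torch.zeros(1))]

    def parameters(self):
        return self._params

    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(self.parameters(), lr=self.learning_rate)
        scheduler = CosineAnnealingLR(optimizer, T_max=1000,
                                      eta_min=self.min_learning_rate)
        # Lightning steps the scheduler at the configured interval
        # ("step" or "epoch"), matching lr_scheduler_interval above.
        return {
            "optimizer": optimizer,
            "lr_scheduler": {
                "scheduler": scheduler,
                "interval": self.lr_scheduler_interval,
            },
        }
```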
- forward(images, texts)[source]¶
Same as torch.nn.Module.forward().
- Parameters:
images (NamedTensor)
texts (Tensor)
- Returns:
Your model’s output
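The computation a CLIP-style forward performs can be sketched as below, using numpy in place of torch tensors: L2-normalised image and text embeddings, then a scaled matrix of pairwise similarities (logits). The function name and logit_scale value are illustrative assumptions, not the module's actual API.

```python
import numpy as np

# Hypothetical sketch of a CLIP-style forward pass (numpy stands in for
# torch; the real forward takes a NamedTensor of images and a Tensor of
# tokenized texts).
def clip_style_forward(image_emb, text_emb, logit_scale=100.0):
    # project embeddings onto the unit sphere
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    # pairwise cosine similarities, scaled to logits
    logits_per_image = logit_scale * image_emb @ text_emb.T
    return logits_per_image, logits_per_image.T

rng = np.random.default_rng(0)
logits, logits_t = clip_style_forward(rng.normal(size=(4, 8)),
                                      rng.normal(size=(4, 8)))
```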
- plot_probabilities_matrix(sim_matrix)[source]¶
Plot the CLIP pair probabilities matrix.
- Return type:
Figure
- Parameters:
sim_matrix (Tensor)
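A minimal sketch of such a plot, assuming a row-wise softmax over the similarity matrix and matplotlib's imshow; numpy stands in for torch tensors and the styling is illustrative only, not the library's actual figure.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; no display needed
import matplotlib.pyplot as plt

# Hypothetical sketch: turn a similarity matrix into per-row match
# probabilities and render them as a heatmap.
def plot_probabilities_sketch(sim_matrix: np.ndarray) -> plt.Figure:
    # numerically stable softmax over each row:
    # probability that image i matches text j
    shifted = sim_matrix - sim_matrix.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)

    fig, ax = plt.subplots()
    im = ax.imshow(probs, vmin=0.0, vmax=1.0, cmap="viridis")
    ax.set_xlabel("text index")
    ax.set_ylabel("image index")
    fig.colorbar(im, ax=ax, label="probability")
    return fig

fig = plot_probabilities_sketch(np.eye(4) * 5.0)
```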
- training_step(batch, batch_idx)[source]¶
Here you compute and return the training loss and some additional metrics, e.g. for the progress bar or logger.
- Parameters:
batch (Tuple[NamedTensor, Tensor, Tensor]) – The output of your data iterable, normally a DataLoader.
batch_idx (int) – The index of this batch.
dataloader_idx (int) – The index of the dataloader that produced this batch (only if multiple dataloaders are used).
- Returns:
Tensor – The loss tensor.
dict – A dictionary which can include any keys, but must include the key 'loss' in the case of automatic optimization.
None – In automatic optimization, this will skip to the next batch (but is not supported for multi-GPU, TPU, or DeepSpeed). For manual optimization, this has no special meaning, as returning the loss is not required.
In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.
Example:
def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss
To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:
def __init__(self):
    super().__init__()
    self.automatic_optimization = False


# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()

    # do training_step with decoder
    ...
    opt2.step()
Note
When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.
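For a CLIP module specifically, the training loss is typically the symmetric contrastive (InfoNCE) loss over the logits matrix, with matching image/text pairs on the diagonal. The following is a hedged sketch of that standard loss in numpy; the actual loss computed by CLIPLightningModule.training_step may differ in its details.

```python
import numpy as np

# Hypothetical sketch of the symmetric contrastive loss commonly used to
# train CLIP: cross-entropy toward the diagonal, averaged over the
# image->text and text->image directions.
def clip_contrastive_loss(logits_per_image: np.ndarray) -> float:
    n = logits_per_image.shape[0]
    targets = np.arange(n)  # pair i matches pair i

    def cross_entropy(logits):
        # numerically stable log-softmax over each row
        logits = logits - logits.max(axis=1, keepdims=True)
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(n), targets].mean()

    return 0.5 * (cross_entropy(logits_per_image)
                  + cross_entropy(logits_per_image.T))
```

With all-zero logits the model is indifferent and the loss equals log(n); strongly diagonal logits drive it toward zero.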
- validation_step(batch, batch_idx)[source]¶
Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest, such as accuracy.
- Parameters:
batch (Tuple[NamedTensor, Tensor, Tensor]) – The output of your data iterable, normally a DataLoader.
batch_idx (int) – The index of this batch.
dataloader_idx (int) – The index of the dataloader that produced this batch (only if multiple dataloaders are used).
- Returns:
Tensor – The loss tensor.
dict – A dictionary. Can include any keys, but must include the key 'loss'.
None – Skip to the next batch.
# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...


# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...
Examples:
# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})
If you pass in multiple val dataloaders, validation_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.
# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    x, y = batch

    # implement your own
    out = self(x)
    if dataloader_idx == 0:
        loss = self.loss0(out, y)
    else:
        loss = self.loss1(out, y)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs separately for each dataloader
    self.log_dict({f"val_loss_{dataloader_idx}": loss,
                   f"val_acc_{dataloader_idx}": acc})
Note
If you don’t need to validate, you don’t need to implement this method.
Note
When validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.
- class mfai.pytorch.lightning_modules.clip.SaveCLIPVisualEncoderWeights[source]¶
Bases: Callback
Callback to save the weights of the visual encoder during training.
- on_validation_epoch_end(trainer, pl_module)[source]¶
Called at the end of the validation epoch. Saves the visual encoder weights of CLIP if the validation loss has improved.
- Return type:
None
- Parameters:
trainer (Trainer)
pl_module (LightningModule)
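The "save on improvement" pattern this callback follows can be sketched in plain Python. This is a hypothetical stand-in (names and the string weights are illustrative); the real class hooks into Lightning's Callback API and would persist the visual encoder's state dict, e.g. via torch.save.

```python
# Hypothetical sketch of a best-so-far checkpoint policy: keep weights
# only when the validation loss improves on the best value seen so far.
class SaveOnImprovement:
    def __init__(self):
        self.best_loss = float("inf")
        self.saved = []  # stand-in for files written by torch.save(...)

    def on_validation_epoch_end(self, val_loss: float, weights) -> None:
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.saved.append(weights)

cb = SaveOnImprovement()
cb.on_validation_epoch_end(0.9, "w1")  # improves: saved
cb.on_validation_epoch_end(1.2, "w2")  # worse: skipped
cb.on_validation_epoch_end(0.5, "w3")  # improves: saved
```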