engine.generate

The generate function is used both for generating prototypes directly from the model and for performing feature isolation on your input samples.

leap_ie.vision.engine.generate(
    project_name,
    model,
    class_list,
    config,
    target_classes=None,
    preprocessing=None,
    samples=None,
    device=None,
    mode="pt",
)

Arguments

  • project_name (str): Name of your project. Used for logging.

    • Required: Yes

    • Default: None

  • model (object): Model for interpretation. Currently we support image classification models only. We expect the model to take a batch of images as input, and return a batch of logits (NOT probabilities). If using pytorch, we expect images in channels-first format, e.g. of shape [1, channels, height, width]. If tensorflow, channels-last, e.g. [1, height, width, channels].

    • Required: Yes

    • Default: None

  • class_list (list): List of class names corresponding to your model's output classes, e.g. ['hotdog', 'not hotdog', ...].

    • Required: Yes

    • Default: None

  • config (dict or str): Configuration dictionary, or path to a JSON file containing your configuration. At minimum, this must contain {"leap_api_key": "YOUR_LEAP_API_KEY"}. See #config.

    • Required: Yes

    • Default: None

  • target_classes (list, optional): List of target class indices to generate prototypes or isolations for, e.g. [0,1]. If None, prototypes will be generated for the class at output index 0 only, e.g. 'hotdog', and feature isolations will be generated for the top 3 predicted classes.

    • Required: No

    • Default: None

  • preprocessing (function, optional): Preprocessing function to be used for generation. This can be None, but for best results, use the preprocessing function used on inputs for inference.

    • Required: No

    • Default: None

  • samples (array, optional): None, or a batch of images to perform feature isolation on. If provided, only feature isolation is performed (not prototype generation). We expect samples to be of shape [num_images, height, width, channels] if using tensorflow, or [num_images, channels, height, width] if using pytorch.

    • Required: No

    • Default: None

  • device (str, optional): Device to be used for generation. If None, we will try to find a device.

    • Required: No

    • Default: None

  • mode (str, optional): Framework to use, either 'pt' for pytorch or 'tf' for tensorflow. Default is 'pt'.

    • Required: No

    • Default: pt
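For orientation, here is a sketch of a typical call. This is a hypothetical PyTorch setup: the project name, class names, API key, and the `run` wrapper are placeholders, not part of the library.

```python
# Hypothetical usage sketch -- names and values are placeholders.
class_list = ["hotdog", "not hotdog"]        # one name per model output index

config = {
    "leap_api_key": "YOUR_LEAP_API_KEY",     # the only required key
    "hf_weight": 0,                          # increase if prototypes are noisy
}

def run(model, preprocess):
    # Imported inside the function so this sketch stays self-contained.
    from leap_ie.vision import engine

    # Prototype generation for both classes; pass samples=... instead to
    # run feature isolation on a batch of your own images.
    return engine.generate(
        project_name="hotdog-demo",
        model=model,                         # must return logits, not probabilities
        class_list=class_list,
        config=config,
        target_classes=[0, 1],
        preprocessing=preprocess,            # same preprocessing as at inference
        mode="pt",                           # "tf" for TensorFlow
    )
```

The call returns the (results_df, results_dict) pair described under Returns below.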

Config

Leap provides a number of configuration options to fine-tune the interpretability engine's performance with your models. You can provide it as a dictionary or a path to a .json file.

  • hf_weight (float): How much to penalise high-frequency patterns in the input. If you are generating very blurry and indistinct prototypes, decrease this. If you are getting very noisy prototypes, increase it. This depends on your model architecture and is hard for us to predict, so you might want to experiment. It's a bit like focussing a microscope. Best practice is to start with zero, and gradually increase.

    • Default: 0

  • input_dim (list): The dimensions of the input that your model expects.

    • Default: [1, 224, 224, 3] if mode is "tf" else [1, 3, 224, 224]

  • isolation (bool): Whether to isolate features for entangled classes. Set to False if you want prototypes only.

    • Default: True

  • find_lr_steps (int): How many steps to tune the learning rate over at the start of the generation process. We do this automatically for you, but if you want to tune the learning rate manually, set this to zero and provide a learning rate with lr.

    • Default: 300

  • max_steps (int): How many steps to run the prototype generation/feature isolation process for. If you get indistinct prototypes or isolations, try increasing this number.

    • Default: 1000
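Both forms of the config argument look like this in practice (the file name here is illustrative):

```python
import json

# Minimal configuration: only the API key is required.
config = {"leap_api_key": "YOUR_LEAP_API_KEY"}

# Equivalently, save it as JSON and pass the path instead of the dict:
with open("leap_config.json", "w") as f:
    json.dump(config, f)
# engine.generate(..., config="leap_config.json")
```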

Here are all of the config options currently available:

alpha_mask: bool = False
alpha_only: bool = False
alpha_weight: float = 0
baseline_init: float = 0
channel_ix: int = None
diversity_weight: float = 0
find_lr_steps: int = 300
hf_weight: float = 0
input_dim: tuple = (0, 0, 0, 0)
isolate_classes: list = None
isolation: bool = True
isolation_alpha_weight: float = 1
isolation_hf_weight: float = 1
isolation_lr: float = 0.05
log_freq: int = 100
logit_clamp: bool = True
lr: float = 0.05
max_isolate_classes: int = 3
max_lr: float = 2.0
max_steps: int = 1000
min_lr: float = 0.001
objective_weight: float = 1
samples: list = None
seed: int = 0
target_classes: tuple = ((0),)
transform: str = "shift_scale"
use_alpha: bool = False
use_baseline: bool = False

  • alpha_mask (bool): If True, applies a mask during prototype generation which encourages the resulting prototypes to be minimal, centered and concentrated. Experimental.

    • Default: False

  • alpha_only (bool): If True, during the prototype generation process, only an alpha channel is optimised. This results in the generation of prototypical shapes and textures only, with no colour information.

    • Default: False

  • alpha_weight (float): How much to encourage generated prototypes to be minimal. Experimental.

    • Default: 0

  • baseline_init (float): Value used to initialise the input. A sensible choice is the mean of your expected input data, if you know it.

    • Default: 0

  • channel_ix (int): Index of the channel dimension of your input, e.g. 1 for channels-first, 3 for channels-last. We try to find this automatically for you, but you can override it here.

    • Default: None

  • diversity_weight (float): When generating multiple prototypes for the same class, we can apply a diversity objective to push for more varied inputs. The higher this number, the harder the optimisation process will push for different inputs. Experimental.

    • Default: 0

  • find_lr_steps (int): How many steps to tune the learning rate over at the start of the generation process. We do this automatically for you, but if you want to tune the learning rate manually, set this to zero and provide a learning rate with lr.

    • Default: 300

  • hf_weight (float): How much to penalise high-frequency patterns in the input. If you are generating very blurry and indistinct prototypes, decrease this. If you are getting very noisy prototypes, increase it. This depends on your model architecture and is hard for us to predict, so you might want to experiment. It's a bit like focussing binoculars. Best practice is to start with zero, and gradually increase. (Note, there is no maximum value for this.)

    • Default: 0

  • input_dim (list): The dimensions of the input that your model expects.

    • Default: [1, 224, 224, 3] if mode is "tf" else [1, 3, 224, 224]

  • isolate_classes (list): If you'd like to isolate features for specific classes, rather than the top n, specify their indices here for EACH target, e.g. [[2,7,8], [2,3]].

    • Default: None

  • isolation (bool): Whether to isolate features for entangled classes. Set to False if you want prototypes only.

    • Default: True

  • isolation_alpha_weight (float): How much to optimise for high feature mask coverage during feature isolation.

    • Default: 1

  • isolation_hf_weight (float): How much to penalise high-frequency patterns during feature isolation. See hf_weight.

    • Default: 1

  • isolation_lr (float): How much to update the isolation mask at each step during the feature isolation process.

    • Default: 0.05

  • log_freq (int): Interval at which to log images.

    • Default: 100

  • lr (float): How much to update the prototype at each step during the prototype generation process. We find this for you automatically between max_lr and min_lr, but if you would like to tune it manually, set find_lr_steps to zero and provide it here.

    • Default: 0.05

  • max_isolate_classes (int): How many classes to isolate features for, if isolate_classes is not provided.

    • Default: min(3, len(class_list))

  • max_lr (float): Maximum learning rate for learning rate finder.

    • Default: 2.0

  • max_steps (int): How many steps to run the prototype generation/feature isolation process for. If you get indistinct prototypes or isolations, try increasing this number.

    • Default: 1000

  • min_lr (float): Minimum learning rate for learning rate finder.

    • Default: 0.001

  • objective_weight (float): How much to weight the main objective, i.e. maximising the target classes. Set to -1 to minimise the target classes instead (use this to optimise the 0 class in binary classification).

    • Default: 1

  • seed (int): Random seed for initialisation.

    • Default: 0

  • transform (str): Random affine transformation applied during generation to guard against adversarial noise. You can experiment with the following options: ['s', 'm', 'l', 'xl', 'shift_scale', 'roll']. Alternatively, set this to None and provide your own transformation via `engine.generate(preprocessing=your_transformation)`.

    • Default: shift_scale

  • use_alpha (bool): If True, adds an alpha channel to the prototype. This results in the prototype generation process returning semi-transparent prototypes, which allow it to express ambivalence about the values of pixels that don't change the model prediction.

    • Default: False

  • use_baseline (bool): Whether to generate an equidistant baseline input prior to the prototype generation process. It takes a bit longer, but setting this to True will ensure that all prototypes generated for a model are not biased by input initialisation.

    • Default: False

  • wandb_api_key (str): Provide your Weights & Biases API key here to enable logging results directly to your WandB dashboard.

    • Default: None

  • wandb_entity (str): If logging to WandB, make sure to provide your WandB entity name here.

    • Default: None
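To illustrate how a few of these options combine (all values here are examples, not recommendations):

```python
# For a binary classifier, minimise the objective to generate a
# prototype for the 0 class (see objective_weight above).
binary_config = {
    "leap_api_key": "YOUR_LEAP_API_KEY",
    "objective_weight": -1,
}

# Isolate specific classes per target rather than the top
# max_isolate_classes: here classes 2, 7, 8 for the first target
# and classes 2, 3 for the second.
isolation_config = {
    "leap_api_key": "YOUR_LEAP_API_KEY",
    "isolate_classes": [[2, 7, 8], [2, 3]],
}
```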

Returns

  • results_df (DataFrame)

    • Results of the prototype, isolation and entanglement analyses, including relative paths to image files in the local filesystem. For more information on the different types of results see Concepts.

  • results_dict (Dictionary)

    • Probability values at each stage of the generation process. Only required for advanced debugging use cases.