Adding New Attacks
This guide will walk you through the process of adding an attack similar to BadNet, an image-based backdoor attack, which poisons image data by overlaying a patch trigger. To create a similar class, you will define a new backdoor method or enhance an existing one while following the same structure.
Here we show the badnet attack as an example to illustrate how to add a new attack. The steps are as follows:
class BadNet(ImageBase):
def __init__(self, dataset, args=None, mode="train", pop=True) -> None:
super().__init__(dataset, args, mode, pop)
self.attack_type = "image"
self.attack_name = "badnet"
# define poison image transformer
trans = transforms.Compose(
[
transforms.Resize(self.args.input_size[:2], antialias=True),
transforms.ToTensor(),
]
)
trigger_path = Path(self.args.patch_mask_path)
self.bd_transform = AddMaskPatchTrigger(
trans(Image.open(BASE_DIR / trigger_path))
)
def make_poison_data(self, data):
# poison the image data
x, y = data
x_poison = self.bd_transform(x)
# set mislabel
y_poison = self.args.attack_target
is_poison = 1
y_original = y
return (x_poison, y_poison, is_poison, y_original)
Define the Attack Class
Inherit from the base class
ImageBase, and it’s aLightningModule.
- Add relevant parameters:
dataset: the clean dataset
mode: indicate you are training or testing
pop: whether to pop the original
attack_targeclass from the datasetInside the
__init__method, set up key attributes likeattack_typeandattack_name, which describe the attack.Example code:
class BadNet(ImageBase): def __init__(self, dataset, args=None, mode="train", pop=True) -> None: super().__init__(dataset, args, mode, pop) self.attack_type = "image" # Set the attack type self.attack_name = "badnet" # Name your custom attack
Define the Poison Data Transform Function
The poison data transformation is a critical step in generating poisoned data. The transform function involves applying a trigger to the input image to create a poisoned data for BadNets.
However for more sophisticated attack method, the transform function can be more complex. For example, Label-Consistent backdoor attacks using GAN to generate the poisoned data. In this case, the transform function can be GAN model to generate the poisoned data.
Here we using BadNet as example to illustrate how to define the poison data transform function:
Load the custom trigger (e.g., a patch or any other backdoor trigger) using
Image.open()and apply it to the transformation.The
AddMaskPatchTriggeris a placeholder for the specific backdoor technique you are implementing. You may need to define or modify this function for your needs.Example code:
# Define poison image transformer trigger_path = Path(self.args.patch_mask_path) self.bd_transform = AddMaskPatchTrigger( trans(Image.open(BASE_DIR / trigger_path)) )
Implement the Poison Data Method
Write the
make_poison_data()method to modify the input data and generate poisoned versions. This function will be called bymake_and_save_poisoned_data()in the base class.Use the backdoor transformer (
self.bd_transform) to apply the trigger to the input imagex.Assign the poisoned label to
y_poisonand track whether the data has been poisoned with a flagis_poison.The method should return a tuple following the format: (poisoned_image, target_label, poisoning_flag, original_label).
Example code:
def make_poison_data(self, data): # Poison the image data x, y = data x_poison = self.bd_transform(x) # Apply the backdoor trigger y_poison = self.args.attack_target # Assign the target label is_poison = 1 # Mark the data as poisoned y_original = y # Keep the original label return (x_poison, y_poison, is_poison, y_original)
Testing
Customize the trigger application or transformation based on the type of attack you are developing.
Test your new attack by passing different datasets and configurations to ensure the poison data is created correctly.