.. _getting-started:

Getting Started
===============

This section is a quick guide to getting started with BackdoorMBTI, including downloading data, running backdoor attacks, and running defense experiments.

Backdoor Attack
---------------

The following example runs an attack experiment and reproduces the BadNets backdoor attack.

.. code-block:: python

    import argparse
    from pathlib import Path

    from torchvision import transforms
    from torchvision.datasets import CIFAR10

    # if you installed from PyPI
    from backdoormbti.attacks.image import BadNet

    # if you cloned the repository, use this import instead
    # from attacks.image import BadNet

    # prepare dataset
    transform = transforms.Compose([transforms.ToTensor()])
    trainset = CIFAR10(
        root="./data/cifar10", download=True, train=True, transform=transform
    )
    testset = CIFAR10(
        root="./data/cifar10", download=True, train=False, transform=transform
    )

    # load args
    parser = argparse.ArgumentParser()
    args = parser.parse_args()
    args.data_type = "image"
    args.dataset = "cifar10"
    args.attack_name = "badnet"
    args.pratio = 0.1
    args.attack_target = 0
    args.random_seed = 0
    args.input_size = (32, 32, 3)
    args.patch_mask_path = "resources/badnet/trigger_image.png"

    # create attack instances
    poison_trainset = BadNet(trainset, args=args, mode="train", pop=False)
    poison_testset = BadNet(testset, args=args, mode="test", pop=False)

    # make and save poison data
    poison_trainset.make_and_save_dataset(save_dir=Path("./"))
    poison_testset.make_and_save_dataset(save_dir=Path("./"))

Running the code above executes the backdoor attack and saves the poisoned datasets (for example, ``image_badnet_poison_train_set.pt``) in the current directory.

The images below show a benign sample (left) and the poisoned sample generated by the BadNets attack (right). The poisoned image has a trigger pattern added to the bottom-right corner.

.. list-table::
   :widths: 50 50
   :header-rows: 0

   * - .. image:: ./getting_started_benign_image.png
          :width: 200px
     - .. image:: ./getting_started_poison_image.png
          :width: 200px

Backdoor Training via Custom Training Pipeline
----------------------------------------------

Once the poisoned dataset has been generated, you can use it to train a backdoor model in your own code. To customize the training pipeline, start from the following snippet:

.. code-block:: python

    import torch
    import torch.nn as nn
    import torch.optim as optim
    from torchvision import models
    from tqdm import tqdm

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    poisonset = torch.load("image_badnet_poison_train_set.pt")
    backdoor_trainloader = torch.utils.data.DataLoader(
        poisonset, batch_size=32, shuffle=True
    )

    # define your model (randomly initialized ResNet-18 with a 10-class head)
    model = models.resnet18(weights=None)
    model.fc = nn.Linear(model.fc.in_features, 10)
    model.to(device)

    # define your criterion and optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)

    # train the model
    num_epochs = 10
    for epoch in range(num_epochs):
        model.train()
        running_loss = 0.0
        # the data format in the poison dataset: (inputs, labels, is_poison, pre_labels)
        for inputs, labels, is_poison, pre_labels in tqdm(
            backdoor_trainloader, desc="training"
        ):
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        print(
            f"Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(backdoor_trainloader):.4f}"
        )

    torch.save(model.state_dict(), "backdoor_model.pth")

Backdoor Attack Evaluation
--------------------------

After training the backdoor model, you can evaluate its attack success rate (ASR) and robustness accuracy (RA) on the poisoned test set.

.. code-block:: python

    import torch
    from torchvision import models

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # load the backdoor model
    state_dict = torch.load("backdoor_model.pth", map_location=device)
    backdoor_model = models.resnet18(weights=None)
    backdoor_model.fc = torch.nn.Linear(backdoor_model.fc.in_features, 10)
    backdoor_model.load_state_dict(state_dict)
    backdoor_model.to(device)

    # load poison test set
    poison_testset = torch.load("image_badnet_poison_test_set.pt")
    testloader = torch.utils.data.DataLoader(poison_testset, batch_size=32, shuffle=False)

    backdoor_model.eval()
    robustness = 0
    success = 0
    total = 0
    with torch.no_grad():
        for inputs, labels, is_poison, pre_labels in testloader:
            inputs, labels, pre_labels = (
                inputs.to(device),
                labels.to(device),
                pre_labels.to(device),
            )
            outputs = backdoor_model(inputs)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            # RA: the prediction matches the original (pre-attack) label
            robustness += (predicted == pre_labels).sum().item()
            # ASR: the prediction matches the label stored in the poison set
            success += (predicted == labels).sum().item()

    print(
        f"Robust Accuracy of the model on the test images: {100 * robustness / total:.2f}%"
    )
    print(
        f"Attack Success Rate of the model on the test images: {100 * success / total:.2f}%"
    )

Backdoor Defense
----------------

After evaluating the backdoor attack, you can start the defense experiment. The following example runs a defense experiment and reproduces the STRIP defense.

.. code-block:: python

    import argparse

    import torch
    from torchvision import models, transforms
    from torchvision.datasets import CIFAR10

    from defenses.image import STRIP
    from models.wrapper import ImageModelWrapper as ModelWrapper
    from utils.data import CleanDatasetWrapper as DatasetWrapper
    from utils.eval import eval_def_acc
    from utils.io import save_results

    # init args
    parser = argparse.ArgumentParser()
    args = parser.parse_args()
    args.fast_dev = False
    args.random_seed = 0
    args.batch_size = 32
    args.num_workers = 4
    args.num_devices = 1
    args.num_classes = 10
    args.collate_fn = None

    # defense args
    args.repeat = 5
    args.pertub_ratio = 0.8
    args.frr = 0.05
    args.use_oppsite_set = False

    # training args
    args.client_optimizer = "sgd"
    args.lr = 0.01
    args.lr_scheduler = "CosineAnnealingLR"
    args.weight_decay = 0.0005
    args.freqency_save = 10

    # set device
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # prepare dataset
    transform = transforms.Compose([transforms.ToTensor()])
    trainset = CIFAR10(
        root="./data/cifar10", download=True, train=True, transform=transform
    )
    args.train_set = DatasetWrapper(trainset)
    poison_trainset = torch.load("image_badnet_poison_train_set.pt")
    testset = CIFAR10(
        root="./data/cifar10", download=True, train=False, transform=transform
    )
    poison_testset = torch.load("image_badnet_poison_test_set.pt")

    # load backdoor model
    state_dict = torch.load("backdoor_model.pth", map_location=device)
    backdoor_model = models.resnet18(weights=None)
    backdoor_model.fc = torch.nn.Linear(backdoor_model.fc.in_features, 10)
    backdoor_model.load_state_dict(state_dict)
    backdoor_model.to(device)
    bkd_lit_model = ModelWrapper(backdoor_model, args)

    # switch the backdoor model to evaluation mode
    backdoor_model.eval()

    # initialize the defense
    defense = STRIP(args)
    defense.setup(
        clean_train_set=DatasetWrapper(trainset),
        clean_test_set=DatasetWrapper(testset),
        poison_train_set=DatasetWrapper(poison_trainset),
        poison_test_set=DatasetWrapper(poison_testset),
        model=bkd_lit_model,
        collate_fn=None,
    )

    # flag each training sample as clean or poisoned, then score the detection
    is_clean_lst = defense.get_sanitized_lst(poison_trainset)
    results = eval_def_acc(is_clean_lst, poison_trainset)
    save_results("results.json", results)

STRIP is a sample-detection defense. Running the code above collects the detection accuracy of the defense, and the sanitized dataset can then be used to retrain the model. After retraining, the ACC, ASR, and RA metrics are collected for further evaluation.

Backdoor Attack Training via Command Line
-----------------------------------------

We use ResNet-18 as the default model and a poison ratio of 0.1. If you installed from PyPI, you can run the following commands directly in the terminal:

.. code-block:: bash

    atk_train --data_type image --dataset cifar10 --attack_name badnet --model_name resnet18 --pratio 0.1 --num_workers 4 --epochs 100
    atk_train --data_type audio --dataset speechcommands --attack_name blend --model_name audiocnn --pratio 0.1 --num_workers 4 --epochs 100 --add_noise true
    atk_train --data_type text --dataset sst2 --attack_name addsent --model_name bert --pratio 0.1 --num_workers 4 --epochs 100 --mislabel true

If you installed from source, use the following commands instead:

.. code-block:: bash

    cd backdoormbti
    cd training_pipeline
    python atk_train.py --data_type image --dataset cifar10 --attack_name badnet --model_name resnet18 --pratio 0.1 --num_workers 4 --epochs 100
    python atk_train.py --data_type audio --dataset speechcommands --attack_name blend --model_name audiocnn --pratio 0.1 --num_workers 4 --epochs 100 --add_noise true
    python atk_train.py --data_type text --dataset sst2 --attack_name addsent --model_name bert --pratio 0.1 --num_workers 4 --epochs 100 --mislabel true

To introduce noise or label mislabeling, add the ``--add_noise true`` or ``--mislabel true`` argument.
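
Because the three attack commands above share most of their flags, it can be convenient to generate them from a short script when running several experiments. The sketch below only assembles and prints the command lines. The helper name ``build_atk_train_cmd`` is hypothetical and not part of BackdoorMBTI, and it uses only the flags that appear in the commands above.

.. code-block:: python

    import shlex


    def build_atk_train_cmd(data_type, dataset, attack_name, model_name,
                            pratio=0.1, num_workers=4, epochs=100, extra=()):
        """Assemble an ``atk_train`` argument list (hypothetical helper)."""
        cmd = [
            "atk_train",
            "--data_type", data_type,
            "--dataset", dataset,
            "--attack_name", attack_name,
            "--model_name", model_name,
            "--pratio", str(pratio),
            "--num_workers", str(num_workers),
            "--epochs", str(epochs),
        ]
        cmd.extend(extra)  # optional flags such as --add_noise or --mislabel
        return cmd


    # the three configurations shown in the commands above
    configs = [
        ("image", "cifar10", "badnet", "resnet18", ()),
        ("audio", "speechcommands", "blend", "audiocnn", ("--add_noise", "true")),
        ("text", "sst2", "addsent", "bert", ("--mislabel", "true")),
    ]
    commands = [
        build_atk_train_cmd(data_type, dataset, attack, model, extra=extra)
        for data_type, dataset, attack, model, extra in configs
    ]
    for cmd in commands:
        print(shlex.join(cmd))

Each entry in ``commands`` is a plain argument list, so it can be passed to ``subprocess.run(cmd, check=True)`` to launch the experiment. For a source install, replace the leading ``"atk_train"`` with ``"python", "atk_train.py"`` and run from the ``training_pipeline`` directory.
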
After the experiment, metrics such as ACC (accuracy), ASR (attack success rate), and RA (robustness accuracy) are collected for the attack phase.

For the full list of command options, run:

.. code-block:: bash

    # PyPI install
    atk_train -h
    # source install
    python atk_train.py -h

Backdoor Defense Training via Command Line
------------------------------------------

Defense experiments depend on the backdoor model generated in the attack phase, so complete the corresponding attack experiment before running a defense. If you installed from PyPI, use the following commands:

.. code-block:: bash

    def_train --data_type image --dataset cifar10 --attack_name badnet --pratio 0.1 --defense_name finetune --num_workers 4 --epochs 10
    def_train --data_type audio --dataset speechcommands --attack_name blend --model_name audiocnn --pratio 0.1 --defense_name fineprune --num_workers 4 --epochs 1 --add_noise true
    def_train --data_type text --dataset sst2 --attack_name addsent --model_name bert --pratio 0.1 --defense_name strip --num_workers 4 --epochs 1 --mislabel true

If you installed from source, use the following commands instead:

.. code-block:: bash

    cd backdoormbti
    cd training_pipeline
    python def_train.py --data_type image --dataset cifar10 --attack_name badnet --pratio 0.1 --defense_name finetune --num_workers 4 --epochs 10
    python def_train.py --data_type audio --dataset speechcommands --attack_name blend --model_name audiocnn --pratio 0.1 --defense_name fineprune --num_workers 4 --epochs 1 --add_noise true
    python def_train.py --data_type text --dataset sst2 --attack_name addsent --model_name bert --pratio 0.1 --defense_name strip --num_workers 4 --epochs 1 --mislabel true

For the full list of defense command options, run:

.. code-block:: bash

    # PyPI install
    def_train -h
    # source install
    python def_train.py -h

In the defense phase, detection accuracy is collected if the defense is a detection method, and the sanitized dataset is then used to retrain the model. After retraining, the ACC, ASR, and RA metrics are collected for further evaluation.
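
As a rough illustration of how the three metrics relate, the sketch below computes them from plain Python lists. The function name ``backdoor_metrics`` and the record layout are assumptions made for this example and are not part of the BackdoorMBTI API. ASR and RA follow the definitions used in the evaluation snippet earlier in this guide: on poisoned samples, ASR counts predictions equal to the poisoned label and RA counts predictions equal to the original label. ACC is taken here as plain accuracy on clean samples.

.. code-block:: python

    def backdoor_metrics(records):
        """Return ACC, ASR, and RA as percentages (illustrative helper).

        Each record is (predicted, label, is_poison, pre_label), mirroring the
        (inputs, labels, is_poison, pre_labels) layout of the poison datasets
        with the model prediction in place of the input.
        """
        clean = [r for r in records if not r[2]]
        poison = [r for r in records if r[2]]

        def pct(hits, total):
            # avoid division by zero when a subset is empty
            return 100.0 * hits / total if total else float("nan")

        acc = pct(sum(p == y for p, y, _, _ in clean), len(clean))
        asr = pct(sum(p == y for p, y, _, _ in poison), len(poison))
        ra = pct(sum(p == y0 for p, _, _, y0 in poison), len(poison))
        return {"ACC": acc, "ASR": asr, "RA": ra}


    # toy predictions: two clean samples and four poisoned ones (target label 0)
    records = [
        (3, 3, 0, 3),  # clean, classified correctly
        (5, 2, 0, 2),  # clean, misclassified
        (0, 0, 1, 7),  # poisoned, attack succeeded
        (7, 0, 1, 7),  # poisoned, original label recovered
        (0, 0, 1, 4),  # poisoned, attack succeeded
        (1, 0, 1, 9),  # poisoned, neither target nor original label
    ]
    metrics = backdoor_metrics(records)
    print(metrics)

A defense that works well should lower ASR while keeping ACC high and raising RA, which is why the three values are reported together.
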