.. _getting-started:

Getting Started
===============

This section provides a quick guide to help you get started with BackdoorMBTI, including downloading data, running backdoor attacks, and executing defense experiments.


Backdoor Attack
---------------

Here is an example to quickly start an attack experiment and reproduce the BadNets backdoor attack results.

.. code-block:: python

   import argparse
   from pathlib import Path

   from torchvision import transforms
   from torchvision.datasets import CIFAR10

   
   try:
      # installed from PyPI
      from backdoormbti.attacks.image import BadNet
   except ImportError:
      # running from a cloned copy of the repository
      from attacks.image import BadNet

   # prepare dataset
   transform = transforms.Compose([transforms.ToTensor()])
   trainset = CIFAR10(
      root="./data/cifar10", download=True, train=True, transform=transform
   )
   testset = CIFAR10(
      root="./data/cifar10", download=True, train=False, transform=transform
   )

   # load args
   parser = argparse.ArgumentParser()
   args = parser.parse_args()
   args.data_type = "image"
   args.dataset = "cifar10"
   args.attack_name = "badnet"
   args.pratio = 0.1
   args.attack_target = 0
   args.random_seed = 0
   args.input_size = (32, 32, 3)
   args.patch_mask_path = "resources/badnet/trigger_image.png"

   # create attack instance
   poison_trainset = BadNet(trainset, args=args, mode="train", pop=False)
   poison_testset = BadNet(testset, args=args, mode="test", pop=False)

   # make and save poison data
   poison_trainset.make_and_save_dataset(save_dir=Path("./"))
   poison_testset.make_and_save_dataset(save_dir=Path("./"))


After running the above code, the poison datasets are generated and saved in the current directory as ``image_badnet_poison_train_set.pt`` and ``image_badnet_poison_test_set.pt``. The following images show a benign image and its poisoned counterpart generated by the BadNets attack. The right image is the poisoned one, with a trigger pattern added to the bottom-right corner.

.. list-table::
   :widths: 50 50
   :header-rows: 0

   * - .. image:: ./getting_started_benign_image.png
          :width: 200px
     - .. image:: ./getting_started_poison_image.png
          :width: 200px
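
To make the trigger concrete, the following minimal sketch stamps a small square patch onto the bottom-right corner of an image tensor and relabels the sample to the target class. It only illustrates the idea: the helper ``stamp_patch``, the 3x3 white patch, and the random stand-in image are assumptions, not BackdoorMBTI's implementation, whose trigger is configured through ``args.patch_mask_path`` in the example above.

.. code-block:: python

   import torch


   def stamp_patch(image, patch_size=3, value=1.0):
       """Return a copy of a (C, H, W) image with a square patch in the bottom-right corner."""
       poisoned = image.clone()
       poisoned[:, -patch_size:, -patch_size:] = value
       return poisoned


   torch.manual_seed(0)
   # stand-in for a CIFAR-10 image tensor, with all values below 0.5
   benign = torch.rand(3, 32, 32) * 0.5
   target_label = 0  # same value as args.attack_target above

   poisoned = stamp_patch(benign)
   poisoned_label = target_label

   # pixels that differ between the benign and the poisoned image
   changed = (poisoned != benign).any(dim=0)
   print(f"changed pixels: {changed.sum().item()}")

Only the pixels under the patch differ, which is why the two images above look identical apart from the corner.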

Backdoor Training via a Custom Training Pipeline
------------------------------------------------

Once the poison dataset is generated, you can use it to train a backdoor model in your own code.
The following snippet shows a custom training pipeline:

.. code-block:: python

   import torch
   import torch.nn as nn
   import torch.optim as optim
   from torchvision import models
   from tqdm import tqdm

   device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
   poisonset = torch.load("image_badnet_poison_train_set.pt")
   backdoor_trainloader = torch.utils.data.DataLoader(
      poisonset, batch_size=32, shuffle=True
   )

   # define your model
   model = models.resnet18(weights=None)
   model.fc = nn.Linear(model.fc.in_features, 10)
   model.to(device)

   # define your criterion and optimizer
   criterion = nn.CrossEntropyLoss()
   optimizer = optim.Adam(model.parameters(), lr=0.001)

   # train the model
   num_epochs = 10
   for epoch in range(num_epochs):
      model.train()
      running_loss = 0.0
      # the data format in the poison dataset: (inputs, labels, is_poison, pre_labels)
      for inputs, labels, is_poison, pre_labels in tqdm(
         backdoor_trainloader, desc="training"
      ):
         inputs, labels = inputs.to(device), labels.to(device)

         optimizer.zero_grad()

         outputs = model(inputs)
         loss = criterion(outputs, labels)
         loss.backward()
         optimizer.step()

         running_loss += loss.item()

      print(
         f"Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(backdoor_trainloader):.4f}"
      )

   torch.save(model.state_dict(), "backdoor_model.pth")
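
Because each sample carries an ``is_poison`` flag and its original label, it can be useful to sanity-check a poison dataset before training on it. The sketch below counts the poisoned samples and checks that each of them carries the target label. It runs on a hypothetical stand-in list of tuples so that it is self-contained; the helper ``summarize_poison_set`` is not part of BackdoorMBTI, and treating ``is_poison`` as a 0/1 flag is an assumption.

.. code-block:: python

   import torch


   def summarize_poison_set(dataset, target_label):
       """Count poisoned samples and check that each one has the target label."""
       n_poison = 0
       n_on_target = 0
       for _inputs, label, is_poison, _pre_label in dataset:
           if int(is_poison) == 1:
               n_poison += 1
               n_on_target += int(int(label) == target_label)
       return {
           "size": len(dataset),
           "poison_ratio": n_poison / len(dataset),
           "all_poison_on_target": n_poison == n_on_target,
       }


   # stand-in data: 10 samples, the first one poisoned and relabeled to class 0
   fake_set = [(torch.zeros(3, 32, 32), 0, 1, 7)] + [
       (torch.zeros(3, 32, 32), i % 10, 0, i % 10) for i in range(1, 10)
   ]
   summary = summarize_poison_set(fake_set, target_label=0)
   print(summary)

With real data, ``fake_set`` would be replaced by the object loaded from ``image_badnet_poison_train_set.pt``, and the reported ratio can be compared with ``args.pratio``.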

Backdoor Attack Evaluation
---------------------------

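After training the backdoor model, you can evaluate its attack success rate (ASR) and robust accuracy (RA) on the poison test set. ASR is the share of samples classified as the attack target, and RA is the share still classified as their original label.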

.. code-block:: python

   import torch
   from torchvision import models

   device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

   # load the backdoor model
   state_dict = torch.load("backdoor_model.pth", map_location=device)
   backdoor_model = models.resnet18(weights=None)
   backdoor_model.fc = torch.nn.Linear(backdoor_model.fc.in_features, 10)
   backdoor_model.load_state_dict(state_dict)
   backdoor_model.to(device)

   # load poison test set
   poison_testset = torch.load("image_badnet_poison_test_set.pt")
   testloader = torch.utils.data.DataLoader(poison_testset, batch_size=32, shuffle=False)

   backdoor_model.eval()
   robustness = 0
   success = 0
   total = 0
   with torch.no_grad():
      for inputs, labels, is_poison, pre_labels in testloader:
         inputs, labels, pre_labels = (
            inputs.to(device),
            labels.to(device),
            pre_labels.to(device),
         )
         outputs = backdoor_model(inputs)
         predicted = outputs.argmax(dim=1)
         total += labels.size(0)
         # RA counts predictions that still match the original label
         robustness += (predicted == pre_labels).sum().item()
         # ASR counts predictions that match the poisoned (target) label
         success += (predicted == labels).sum().item()

   print(
      f"Robust Accuracy of the model on the test images: {100 * robustness / total:.2f}%"
   )
   print(
      f"Attack Success Rate of the model on the test images: {100 * success / total:.2f}%"
   )
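
The loop above averages over every sample in the loader. If a test set mixes clean and poisoned samples, one option is to restrict both metrics to the poisoned ones. The following sketch does this with the ``is_poison`` flag; the helper ``backdoor_metrics`` and the toy tensors are hypothetical, and the sketch assumes ``is_poison`` is a 0/1 tensor.

.. code-block:: python

   import torch


   def backdoor_metrics(predicted, labels, pre_labels, is_poison):
       """Return (ASR, RA) in percent, computed over poisoned samples only."""
       mask = is_poison.bool()
       n = int(mask.sum())
       if n == 0:
           return 0.0, 0.0
       asr = 100.0 * (predicted[mask] == labels[mask]).sum().item() / n
       ra = 100.0 * (predicted[mask] == pre_labels[mask]).sum().item() / n
       return asr, ra


   # toy batch: the first three samples are poisoned with target label 0
   predicted = torch.tensor([0, 0, 3, 5, 2])
   labels = torch.tensor([0, 0, 0, 5, 2])
   pre_labels = torch.tensor([1, 4, 3, 5, 2])
   is_poison = torch.tensor([1, 1, 1, 0, 0])

   asr, ra = backdoor_metrics(predicted, labels, pre_labels, is_poison)
   print(f"ASR: {asr:.2f}%, RA: {ra:.2f}%")

In this toy batch two of the three poisoned samples are classified as the target and one keeps its original label, so ASR and RA sum to 100%. In general they need not, because a poisoned sample can also be classified as a third class.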


Backdoor Defense
----------------

After evaluating the backdoor attack, you can start the defense experiment. Here is an example to quickly start a defense experiment and reproduce the STRIP defense results.

.. code-block:: python

   import argparse

   import torch
   from torchvision import models
   from torchvision.datasets import CIFAR10
   from torchvision import transforms

   # import paths below assume a cloned copy of the repository
   from defenses.image import STRIP
   from models.wrapper import ImageModelWrapper as ModelWrapper
   from utils.data import CleanDatasetWrapper as DatasetWrapper
   from utils.eval import eval_def_acc
   from utils.io import save_results

   # init args
   parser = argparse.ArgumentParser()
   args = parser.parse_args()
   args.fast_dev = False
   args.random_seed = 0
   args.batch_size = 32
   args.num_workers = 4
   args.num_devices = 1
   args.num_classes = 10
   args.collate_fn = None
   # defense args
   args.repeat = 5
   args.pertub_ratio = 0.8
   args.frr = 0.05
   args.use_oppsite_set = False
   # training args
   args.client_optimizer = "sgd"
   args.lr = 0.01
   args.lr_scheduler = "CosineAnnealingLR"
   args.weight_decay = 0.0005
   args.freqency_save = 10


   # set device
   device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

   # prepare dataset
   transform = transforms.Compose([transforms.ToTensor()])
   trainset = CIFAR10(
      root="./data/cifar10", download=True, train=True, transform=transform
   )
   args.train_set = DatasetWrapper(trainset)

   poison_trainset = torch.load("image_badnet_poison_train_set.pt")
   testset = CIFAR10(
      root="./data/cifar10", download=True, train=False, transform=transform
   )
   poison_testset = torch.load("image_badnet_poison_test_set.pt")

   # load backdoor model
   state_dict = torch.load("backdoor_model.pth", map_location=device)
   backdoor_model = models.resnet18(weights=None)
   backdoor_model.fc = torch.nn.Linear(backdoor_model.fc.in_features, 10)
   backdoor_model.load_state_dict(state_dict)
   backdoor_model.to(device)
   bkd_lit_model = ModelWrapper(backdoor_model, args)

   # switch the backdoor model to evaluation mode
   backdoor_model.eval()

   # initialize the defense
   defense = STRIP(args)
   defense.setup(
      clean_train_set=DatasetWrapper(trainset),
      clean_test_set=DatasetWrapper(testset),
      poison_train_set=DatasetWrapper(poison_trainset),
      poison_test_set=DatasetWrapper(poison_testset),
      model=bkd_lit_model,
      collate_fn=None,
   )

   is_clean_lst = defense.get_sanitized_lst(poison_trainset)
   results = eval_def_acc(is_clean_lst, poison_trainset)
   save_results("results.json", results)

STRIP is a sample-level detection defense: it flags individual inputs as poisoned instead of modifying the model. After running the above code, the detection accuracy of the defense is collected and written to ``results.json``. The sanitized dataset can then be used to retrain the model, after which the ACC, ASR, and RA metrics can be collected for further evaluation.
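
For intuition, the core test behind STRIP can be sketched in a few lines: blend the input with several clean images, measure the entropy of the model's predictions, and flag inputs whose entropy stays unusually low. This is a simplified sketch on a toy model, not the ``STRIP`` class used above. The helper ``strip_entropy``, its parameters ``n_overlays`` and ``alpha``, and the use of an empirical quantile for the threshold are assumptions made for illustration.

.. code-block:: python

   import math

   import torch
   import torch.nn as nn
   import torch.nn.functional as F


   def strip_entropy(model, x, clean_pool, n_overlays=5, alpha=0.5):
       """Average prediction entropy of ``x`` blended with random clean images."""
       idx = torch.randint(0, clean_pool.size(0), (n_overlays,))
       blended = alpha * x.unsqueeze(0) + (1 - alpha) * clean_pool[idx]
       with torch.no_grad():
           probs = F.softmax(model(blended), dim=1)
       entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
       return entropy.mean().item()


   torch.manual_seed(0)
   # toy stand-ins: an untrained linear classifier and random "clean" images
   model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()
   clean_pool = torch.rand(64, 3, 32, 32)

   # the entropy of known-clean samples defines the decision boundary
   clean_entropy = torch.tensor(
       [strip_entropy(model, img, clean_pool) for img in clean_pool[:32]]
   )
   frr = 0.05  # tolerated false rejection rate on clean inputs
   threshold = torch.quantile(clean_entropy, frr).item()

   suspect = torch.rand(3, 32, 32)
   is_flagged = strip_entropy(model, suspect, clean_pool) < threshold
   print(f"threshold={threshold:.4f}, flagged={is_flagged}")

An input carrying a strong trigger tends to keep a confident prediction regardless of the overlay, which is why low entropy is treated as suspicious.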

Backdoor Attack Training via Command Line
-----------------------------------------

We use ResNet-18 as the default model and a poison ratio of 0.1. If you installed BackdoorMBTI from PyPI, you can run the following commands directly in the terminal:

.. code-block:: bash

   atk_train --data_type image --dataset cifar10 --attack_name badnet --model_name resnet18 --pratio 0.1 --num_workers 4 --epochs 100
   atk_train --data_type audio --dataset speechcommands --attack_name blend --model_name audiocnn --pratio 0.1 --num_workers 4 --epochs 100 --add_noise true
   atk_train --data_type text --dataset sst2 --attack_name addsent --model_name bert --pratio 0.1 --num_workers 4 --epochs 100 --mislabel true

If you installed from source, use the following commands:

.. code-block:: bash

   cd backdoormbti
   cd training_pipeline
   python atk_train.py --data_type image --dataset cifar10 --attack_name badnet --model_name resnet18 --pratio 0.1 --num_workers 4 --epochs 100
   python atk_train.py --data_type audio --dataset speechcommands --attack_name blend --model_name audiocnn --pratio 0.1 --num_workers 4 --epochs 100 --add_noise true
   python atk_train.py --data_type text --dataset sst2 --attack_name addsent --model_name bert --pratio 0.1 --num_workers 4 --epochs 100 --mislabel true

To introduce noise or mislabeled samples, add the ``--add_noise true`` or ``--mislabel true`` argument. After the experiment, the attack phase reports ACC (accuracy), ASR (attack success rate), and RA (robust accuracy).

For more detailed command options, run:

.. code-block:: bash

   atk_train -h
   python atk_train.py -h
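
When several runs differ only in a few flags, it can help to assemble the command in Python. The sketch below builds the ``atk_train`` invocation shown above from keyword arguments. The helper ``build_atk_cmd`` is hypothetical and uses only the flags that appear in this section; it prints the commands rather than running them.

.. code-block:: python

   import shlex


   def build_atk_cmd(data_type, dataset, attack_name, model_name,
                     pratio=0.1, num_workers=4, epochs=100, extra=()):
       """Assemble an ``atk_train`` command line as a list of arguments."""
       cmd = [
           "atk_train",
           "--data_type", data_type,
           "--dataset", dataset,
           "--attack_name", attack_name,
           "--model_name", model_name,
           "--pratio", str(pratio),
           "--num_workers", str(num_workers),
           "--epochs", str(epochs),
       ]
       cmd.extend(extra)
       return cmd


   image_cmd = build_atk_cmd("image", "cifar10", "badnet", "resnet18")
   audio_cmd = build_atk_cmd(
       "audio", "speechcommands", "blend", "audiocnn", extra=("--add_noise", "true")
   )
   print(shlex.join(image_cmd))
   print(shlex.join(audio_cmd))

Passing one of these lists to ``subprocess.run(image_cmd, check=True)`` would launch the corresponding run.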

Backdoor Defense Training via Command Line
------------------------------------------

Defense experiments depend on the backdoor model generated in the attack phase, so complete the corresponding attack experiment before running a defense.

If you installed from PyPI, use the following commands:

.. code-block:: bash

   def_train --data_type image --dataset cifar10 --attack_name badnet --pratio 0.1 --defense_name finetune --num_workers 4 --epochs 10
   def_train --data_type audio --dataset speechcommands --attack_name blend --model_name audiocnn --pratio 0.1 --defense_name fineprune --num_workers 4 --epochs 1 --add_noise true
   def_train --data_type text --dataset sst2 --attack_name addsent --model_name bert --pratio 0.1 --defense_name strip --num_workers 4 --epochs 1 --mislabel true

If you installed from source, use the following commands:

.. code-block:: bash

   cd backdoormbti
   cd training_pipeline
   python def_train.py --data_type image --dataset cifar10 --attack_name badnet --pratio 0.1 --defense_name finetune --num_workers 4 --epochs 10
   python def_train.py --data_type audio --dataset speechcommands --attack_name blend --model_name audiocnn --pratio 0.1 --defense_name fineprune --num_workers 4 --epochs 1 --add_noise true
   python def_train.py --data_type text --dataset sst2 --attack_name addsent --model_name bert --pratio 0.1 --defense_name strip --num_workers 4 --epochs 1 --mislabel true

For more details on defense commands, run:

.. code-block:: bash

   def_train -h
   python def_train.py -h

In the defense phase, detection accuracy will be collected if the defense is a detection method, and the sanitized dataset will be used to retrain the model. After retraining, ACC, ASR, and RA metrics will be collected for further evaluation.
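
The two steps described above, scoring a detection defense and building the sanitized dataset, can be sketched as follows. The helpers ``detection_accuracy`` and ``sanitize`` are hypothetical and are not the library's ``eval_def_acc``. The sketch assumes the defense returns one 0/1 "is clean" verdict per sample, which may differ from the actual return format.

.. code-block:: python

   import torch
   from torch.utils.data import Subset, TensorDataset


   def detection_accuracy(is_clean_lst, is_poison_lst):
       """Fraction of samples whose clean/poison verdict matches the ground truth."""
       correct = sum(
           int(bool(pred_clean) != bool(is_poison))
           for pred_clean, is_poison in zip(is_clean_lst, is_poison_lst)
       )
       return correct / len(is_clean_lst)


   def sanitize(dataset, is_clean_lst):
       """Keep only the samples the defense judged to be clean."""
       keep = [i for i, is_clean in enumerate(is_clean_lst) if is_clean]
       return Subset(dataset, keep)


   # stand-in data: six samples, the last two poisoned
   dataset = TensorDataset(torch.rand(6, 3, 32, 32), torch.arange(6))
   is_poison_lst = [0, 0, 0, 0, 1, 1]
   is_clean_lst = [1, 1, 1, 0, 0, 1]  # hypothetical defense verdicts

   acc = detection_accuracy(is_clean_lst, is_poison_lst)
   sanitized = sanitize(dataset, is_clean_lst)
   print(f"detection accuracy: {acc:.2f}, kept {len(sanitized)} of {len(dataset)}")

The resulting ``Subset`` can be passed to a ``DataLoader`` and used for retraining in the same way as the original training set.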