This is the core implementation of VRifle, "Inaudible Adversarial Perturbation: Manipulating the Recognition of User Speech in Real Time", in Proceedings of the Network and Distributed System Security Symposium 2024 (NDSS 2024).
We would like to thank the author of deepspeech2-pytorch-adversarial-attack for providing an excellent foundation for our code, which targets the DeepSpeech2 model.
We also extend our gratitude to the contributors of deepspeech.pytorch for developing an easy-to-use DeepSpeech framework.
If you find this repo helpful, please consider citing it in the following format.
@inproceedings{li2024vrifle,
title={Inaudible Adversarial Perturbation: Manipulating the Recognition of User Speech in Real Time},
author={Li, Xinfeng and Yan, Chen and Lu, Xuancun and Zeng, Zihan and Ji, Xiaoyu and Xu, Wenyuan},
booktitle={In the 31st Annual Network and Distributed System Security Symposium (NDSS)},
year={2024}
}
Several dependencies must be installed first. Please follow the instructions in DeepSpeech 2 PyTorch to set up the environment.
We recommend arranging this repo and DeepSpeech 2 PyTorch in the following folder structure.
ROOT_FOLDER/
├── this_repo/
│ ├──main_vrifle.py
│ └──...
├──deepspeech.pytorch/
│ ├──models/
│ │ └──librispeech/
│ │ └──librispeech_pretrained_v2.pth
│ └──...
Then download the pretrained DeepSpeech model from the link provided by DeepSpeech 2 PyTorch and place it at the path shown in the structure above.
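A quick way to confirm the layout before running anything is a small path check. The sketch below only tests for the two files named in the structure above; `this_repo` is the placeholder name used there, and the helper name `missing_paths` is hypothetical, not part of this repo.

```python
from pathlib import Path


def missing_paths(root, repo_name="this_repo"):
    """Return the expected files (relative to root) that do not exist yet."""
    expected = [
        Path(repo_name) / "main_vrifle.py",
        Path("deepspeech.pytorch") / "models" / "librispeech" / "librispeech_pretrained_v2.pth",
    ]
    return [p.as_posix() for p in expected if not (Path(root) / p).is_file()]


if __name__ == "__main__":
    # Assumption: the script is run from inside the repo folder,
    # so ROOT_FOLDER is the parent of the current directory.
    here = Path.cwd()
    missing = missing_paths(here.parent, repo_name=here.name)
    print("layout OK" if not missing else f"missing: {missing}")
```

If the checkpoint is reported missing, the pretrained model has probably been saved somewhere other than `deepspeech.pytorch/models/librispeech/`.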
Deep Speech 2[1] is a state-of-the-art Automatic Speech Recognition (ASR) system, notable for its end-to-end training capability where spectrograms are directly utilized to generate predicted sentences.
In this work, we implement a first attempt at completely inaudible (ultrasonic) adversarial perturbation attacks against this ASR system. In this setting, the classical PGD (Projected Gradient Descent) algorithm can also optimize the perturbation efficiently.
[1] Amodei, D., Ananthanarayanan, S., Anubhai, R., Bai, J., Battenberg, E., Case, C., ... & Zhu, Z. (2016, June). Deep speech 2: End-to-end speech recognition in english and mandarin. In International conference on machine learning (pp. 173-182).
- Download the Fluent Speech Commands Dataset
- If you want to speed up the optimization on a 3090 GPU, turn to Support DeepSpeech on 3090 GPUs (NVIDIA)
With main_vrifle.py, it is easy to perturb the original raw wave file to generate the desired sentence.
python main_vrifle.py --attack_type Mute_robust --device 0
python main_vrifle.py --attack_type Universal_robust --device 0
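If you want to launch both attack types in one go, a small wrapper can build the same two commands. This is only a convenience sketch: the flags `--attack_type` and `--device` and their values are the ones shown above, while the helper names `build_command` and `run_all` are hypothetical. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# The two attack types shown in the commands above.
ATTACK_TYPES = ["Mute_robust", "Universal_robust"]


def build_command(attack_type, device=0):
    """Build the argument list for one main_vrifle.py run."""
    return [sys.executable, "main_vrifle.py",
            "--attack_type", attack_type,
            "--device", str(device)]


def run_all(device=0, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for attack_type in ATTACK_TYPES:
        cmd = build_command(attack_type, device)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(device=0, dry_run=True)
```

Call `run_all(dry_run=False)` from the repo folder to actually start the runs one after another.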
Several parameters are available to make your adversarial attack better. You may tune hyperparameters such as epsilon, alpha, and PGD_iter for better results. For details, please refer to main_vrifle.py and vrifle_attack.py.
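To give a feel for how these three hyperparameters interact, here is a toy PGD loop in plain Python. It is a sketch under assumptions, not the code in vrifle_attack.py: it assumes the common L-infinity formulation (signed-gradient steps of size alpha, projection into [-epsilon, epsilon]) and uses a toy quadratic loss in place of the ASR loss. The names `pgd` and `toy_grad` are hypothetical.

```python
def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def pgd(x, grad_fn, epsilon, alpha, pgd_iter):
    """Toy L-infinity PGD that minimizes a loss over the perturbation delta."""
    delta = [0.0] * len(x)
    for _ in range(pgd_iter):
        adv = [xi + di for xi, di in zip(x, delta)]
        grad = grad_fn(adv)
        # Signed-gradient descent step of size alpha.
        delta = [di - alpha * sign(gi) for di, gi in zip(delta, grad)]
        # Projection: keep every sample of delta within [-epsilon, epsilon].
        delta = [max(-epsilon, min(epsilon, di)) for di in delta]
    return delta


# Toy stand-in for the ASR loss: squared distance to a target signal.
target = [0.5, -0.5, 0.0]


def toy_grad(adv):
    return [2.0 * (a - t) for a, t in zip(adv, target)]


x = [0.0, 0.0, 0.0]
delta = pgd(x, toy_grad, epsilon=0.1, alpha=0.02, pgd_iter=20)
print(delta)
```

In this toy setting, epsilon caps how large the perturbation can get, alpha sets how fast it moves per step, and PGD_iter must be large enough for alpha-sized steps to reach the epsilon bound.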
Through our numerous attempts and extensive research, we have established the following setup details :)
- Download deepspeech.pytorch
- cd into the folder and then
pip install -r requirements.txt
pip install -e . # Dev install
pip install adversarial-robustness-toolbox[pytorch]
pip install torchaudio
git clone https://github.com/SeanNaren/warp-ctc.git
- You should replace the #include <THC/THC.h> and extern THCState* state lines in binding.cpp; for reference, see the "Modify binding.cpp" part of https://blog.csdn.net/weixin_41868417/article/details/123819183
6. Install Warp-CTC
- edit the CMakeLists.txt
# Before replacement
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_30,code=sm_30 -O2")
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_35,code=sm_35")
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_50,code=sm_50")
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_52,code=sm_52")
# After
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_86,code=sm_86")
- Compilation
cd warp-ctc
mkdir build
cd build
cmake ..
make
cd ../pytorch_binding
- Modifying binding.cpp
## replace
#include <THC/THC.h>
extern THCState* state;
void* gpu_workspace = THCudaMalloc(state, gpu_size_bytes);
## into
void* gpu_workspace = c10::cuda::CUDACachingAllocator::raw_alloc(gpu_size_bytes);
## replace
THCudaFree(state, (void *) gpu_workspace);
## into
c10::cuda::CUDACachingAllocator::raw_delete((void *) gpu_workspace);
- the last step
python setup.py install
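The CMakeLists.txt edit above hard-codes sm_86, which matches the RTX 3090. If you build on a different GPU, the following sketch prints the corresponding line for your device. It assumes PyTorch is installed with CUDA support and falls back to 8.6 otherwise; the helper names `gencode_flag` and `cmake_line` are hypothetical.

```python
def gencode_flag(major, minor):
    """Build the nvcc -gencode flag for a given compute capability."""
    arch = f"{major}{minor}"
    return f"-gencode arch=compute_{arch},code=sm_{arch}"


def cmake_line(major, minor):
    """Build the CMakeLists.txt line in the same form as the edit above."""
    return f'set(CUDA_NVCC_FLAGS "${{CUDA_NVCC_FLAGS}} {gencode_flag(major, minor)}")'


if __name__ == "__main__":
    try:
        import torch
        # Returns a (major, minor) tuple for the given CUDA device.
        major, minor = torch.cuda.get_device_capability(0)
    except Exception:
        # No PyTorch or no visible GPU: fall back to the RTX 3090 value.
        major, minor = 8, 6
    print(cmake_line(major, minor))
```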
- Note that --recursive is required for a workable CTCdecode dependency
git clone --recursive git@github.com:parlance/ctcdecode.git
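Once all steps are done, a short import check can confirm that the pieces are visible to Python. The module names below are assumptions based on how the upstream projects are usually imported (for example, adversarial-robustness-toolbox as `art` and the Warp-CTC binding as `warpctc_pytorch`); adjust them if your install differs. The helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Assumed top-level module names for the packages installed above.
REQUIRED = ["torch", "torchaudio", "art",
            "deepspeech_pytorch", "warpctc_pytorch", "ctcdecode"]


def missing_modules(names):
    """Return the module names that cannot be found on the import path."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("all modules found" if not missing else f"missing: {missing}")
```

This only checks that the modules can be located; it does not verify that the compiled extensions match your CUDA version.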