Self-Supervised Video Forensics by Audio-Visual Anomaly Detection

Chao Feng, Ziyang Chen, Andrew Owens
University of Michigan, Ann Arbor

CVPR 2023 (Highlight)

This repository contains the code for audio-visual video forensics.

Steps to run the Python code directly:

pip install -r requirements.txt

# 1. Test a single sample fake video (the video path must be a full path)
CUDA_VISIBLE_DEVICES=8 python detect.py --test_video_path /home/xxxx/test_video.mp4 --device cuda:0 --max-len 50 --n_workers 4 --bs 1 --lam 0 --output_dir /home/xxx/save
# 2. Test a list of fake videos (the .txt path must be a full path, and the file must contain the full paths of the test videos)
CUDA_VISIBLE_DEVICES=8 python detect.py --test_video_path /home/xxxx/fake_videos.txt --device cuda:0 --max-len 50 --n_workers 4 --bs 1 --lam 0 --output_dir /home/xxx/save
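The list file for the second command can be generated with a few lines of Python. This is a minimal sketch, not part of the repository: the helper name write_video_list, the "*.mp4" pattern, and the one-path-per-line layout are assumptions, so check them against how detect.py reads the list.

```python
from pathlib import Path


def write_video_list(video_dir, list_path, pattern="*.mp4"):
    """Write the full path of every matching video to list_path.

    Hypothetical helper: assumes the list file holds one full path per line.
    """
    video_dir = Path(video_dir).resolve()
    # Sort so the order of videos (and of their scores) is reproducible.
    paths = sorted(str(p) for p in video_dir.glob(pattern))
    list_path = Path(list_path).resolve()
    list_path.write_text("\n".join(paths) + ("\n" if paths else ""))
    return list_path, paths
```

Calling write_video_list on a directory of test videos returns the full path of the list file, which can then be passed to --test_video_path.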

(lam is a hyperparameter that weights the combination of the two scores described in the method section of the paper: one from the distributions over delays and one from the audio-visual network activations. The default, lam=0, uses the distributions over delays only.)
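To make the role of lam concrete, here is an illustrative sketch of a weighted combination. The additive form and the function name combine_scores are assumptions chosen only to match the stated behavior that lam=0 leaves the delay-based score alone; the exact formula is the one in the paper and in detect.py.

```python
def combine_scores(delay_scores, activation_scores, lam=0.0):
    """Combine two per-video score lists with weight lam.

    Assumed form: delay + lam * activation. With lam=0 the result is the
    delay-based score only, as described for the default setting.
    """
    if len(delay_scores) != len(activation_scores):
        raise ValueError("both score lists must have one entry per video")
    return [d + lam * a for d, a in zip(delay_scores, activation_scores)]
```

Under this assumption, raising lam gives the activation-based score more influence on the final ranking of videos.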

The audio-visual synchronization model checkpoint, sync_model.pth, can be downloaded from this link. Note that the AV synchronization model consists of a video branch, an audio branch, and an audio-visual feature fusion transformer.

When the run finishes, an output.log file and a testing_score.npy file are written to output_dir, recording the scores for all the test videos.
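The saved scores can be inspected with NumPy. This is a rough sketch under assumptions: it treats testing_score.npy as a plain numeric array with one score per test video, which this README does not confirm, and the helper names load_scores and summarize are hypothetical.

```python
from pathlib import Path

import numpy as np


def load_scores(output_dir, filename="testing_score.npy"):
    """Load the saved scores as a flat float array.

    Assumes the file holds a numeric array (one score per video).
    """
    scores = np.load(Path(output_dir) / filename)
    return np.asarray(scores, dtype=float).ravel()


def summarize(scores):
    """Return basic statistics for a non-empty score array."""
    return {
        "count": int(scores.size),
        "min": float(scores.min()),
        "max": float(scores.max()),
        "mean": float(scores.mean()),
    }
```

For example, summarize(load_scores(output_dir)) gives a quick check that the number of scores matches the number of videos in the list file.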

The audio-visual synchronization model code is based on vit-pytorch.

The decoder-only autoregressive model is partially based on memory-compressed-attention.

The visual encoder is heavily borrowed from action classification.

For any questions, please contact [email protected]. I will try to respond as soon as possible, and I apologize for any earlier delays.

@inproceedings{feng2023self,
  title={Self-supervised video forensics by audio-visual anomaly detection},
  author={Feng, Chao and Chen, Ziyang and Owens, Andrew},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={10491--10503},
  year={2023}
}
