ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
Implementation of "EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition, ICCV, 2019" in PyTorch
Source code for "Synchformer: Efficient Synchronization from Sparse Cues" (ICASSP 2024)
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors" (Spotlight at BMVC 2022)
[CVPR 2023] Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception
Audio-visual corruption modeling code for the paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring" (CVPR 2023)
Official code for the WACV 2024 paper "Annotation-free Audio-Visual Segmentation"
Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral].
[ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
Transformer-based online speech recognition system with TensorFlow 2
Code for the CVPR 2021 paper "Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing"
Audio-visual diarization pipeline used to create the VoxConverse dataset
Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model" (AVLIT)
Audio-Visual Generalized Zero-Shot Learning using Large Pre-Trained Models
Code for the INTERSPEECH 2024 paper "LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition"
Towards Audio-Visual Saliency Prediction for Omnidirectional Video with Spatial Audio (accepted by TMM 2022)
Code and datasets for 'Move2Hear: Active Audio-Visual Source Separation' (ICCV 2021)