MMAudio generates synchronized audio given video and/or text inputs.
Paper: "Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis".
Related content: