3 papers - avg viability 6.7
A probabilistic few-shot acoustic synthesis method that renders spatially continuous sound for immersive environments from minimal data.
FoleyFlow generates synchronized audio from video by aligning audio-visual encoders with masked modeling and dynamic conditional flows, outperforming prior methods on existing benchmarks.
Tutti performs multi-singer synthesis with structure-level timbre control for more realistic choral music generation.