TL;DR: We propose acoustic volume rendering for impulse response synthesis. Wave propagation principles are incorporated into the acoustic volume rendering to ensure multi-pose consistency of the acoustic signal, yielding a more faithful model of acoustic propagation.
As technologies like Virtual Reality continue to advance, realistic synthesis of spatial sound has become crucial for creating immersive experiences. Synthesizing the sound received at any position relies on estimating the impulse response (IR), which describes how sound propagates within a scene. While various learning-based approaches have been proposed, their generalization to unseen poses remains unsatisfactory. In this paper, we present Acoustic Volume Rendering (AVR) for impulse response rendering. AVR constructs an impulse response field that follows wave propagation principles and achieves state-of-the-art performance in synthesizing impulse responses for unseen poses. We introduce frequency-domain signal rendering and spherical integration specifically for acoustic volume rendering. Experiments show that AVR surpasses the current leading methods by a substantial margin. Additionally, we develop an acoustic simulation platform, AcoustiX, which produces more realistic impulse responses than existing simulators.
Rendered impulse response from our method on the Real Acoustic Field dataset. Headphones are strongly recommended.
Our model can render binaural audio by simply synthesizing the sound heard at the left and right ears separately.
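The per-ear idea above can be sketched in a few lines: query an impulse response at each ear position and convolve it with the source signal. This is a minimal illustration, not the paper's implementation; `render_ir` is a hypothetical stand-in for the trained model, and the 9 cm ear offset is an assumed head radius.

```python
import numpy as np

def render_ir(pos, orientation):
    """Hypothetical stand-in for the model's IR prediction at a pose.
    Here it returns a toy impulse delayed by 10 samples."""
    ir = np.zeros(256)
    ir[10] = 1.0
    return ir

def binaural(source, head_pos, head_right, ear_offset=0.09):
    """Render binaural audio by synthesizing each ear independently:
    predict the IR at the left/right ear positions, then convolve
    each with the mono source signal."""
    left_ir = render_ir(head_pos - ear_offset * head_right, head_right)
    right_ir = render_ir(head_pos + ear_offset * head_right, head_right)
    return np.convolve(source, left_ir), np.convolve(source, right_ir)
```

Because each ear is just another listener position, no special binaural machinery is needed beyond two IR queries.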
Left: Task illustration. From observations of the sound emitted by a speaker, our model constructs an impulse response field that can synthesize observations at any listener position.
Right: Waveform visualization. We transform the signal into the frequency domain and visualize the phase and amplitude distributions at a specific wavelength. Our method predicts the correct spatial distribution of the signal.
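The visualization described above can be reproduced for an idealized point source: at a fixed frequency, the complex field over a grid of listener positions has a phase that wraps with distance (concentric rings) and an amplitude that decays with distance. This is a sketch under assumed values (1 kHz bin, 343 m/s speed of sound, source at the origin), not the dataset's actual field.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s (assumed)
FREQ = 1000.0            # frequency bin to visualize (assumed)

# Grid of 2-D listener positions around a point source at the origin.
xs = np.linspace(-2.0, 2.0, 41)
grid = np.stack(np.meshgrid(xs, xs), axis=-1)        # (41, 41, 2)
dist = np.linalg.norm(grid, axis=-1) + 1e-6           # avoid division by zero

# Ideal monochromatic field: phase advances with propagation distance,
# amplitude falls off as 1/r.
field = np.exp(-2j * np.pi * FREQ * dist / SPEED_OF_SOUND) / dist
phase = np.angle(field)       # what the phase panel plots
amplitude = np.abs(field)     # what the amplitude panel plots
```

Plotting `phase` and `amplitude` as images (e.g. with `matplotlib.pyplot.imshow`) gives the ring patterns the figure compares against.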
Rendering pipeline. We sample points along each ray shot from the microphone and query the network to obtain signals and densities. A time delay is applied to account for wave propagation. We then combine signals and densities to perform acoustic volume rendering along each ray, yielding a directional signal. Finally, we integrate over the sphere, combining signals from all directions weighted by the gain pattern, to obtain the rendered impulse response.
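The pipeline steps above can be sketched as follows. This is a simplified, hedged illustration, not the paper's code: `query_field` is a hypothetical stand-in for the learned network, the frequency grid and sample counts are assumptions, and the gain pattern defaults to omnidirectional.

```python
import numpy as np

SPEED_OF_SOUND = 343.0                 # m/s (assumed)
FREQS = np.linspace(100.0, 8000.0, 64)  # frequency bins (assumed)

def query_field(points):
    """Hypothetical stand-in for the network: per-point density and a
    complex frequency-domain signal."""
    density = np.full(len(points), 0.3)
    signal = np.ones((len(points), len(FREQS)), dtype=complex)
    return density, signal

def render_ray(origin, direction, n_samples=32, d_far=10.0):
    """Acoustic volume rendering along one ray: sample points, apply the
    propagation delay per frequency, and alpha-composite the signals."""
    ds = np.linspace(0.1, d_far, n_samples)          # distances along the ray
    points = origin + ds[:, None] * direction
    density, signal = query_field(points)
    # Time delay as a phase factor e^{-i 2π f d / c} in the frequency domain.
    delay = np.exp(-2j * np.pi * FREQS[None, :] * ds[:, None] / SPEED_OF_SOUND)
    step = ds[1] - ds[0]
    alpha = 1.0 - np.exp(-density * step)            # per-sample opacity
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1])))  # transmittance
    weights = trans * alpha
    return (weights[:, None] * signal * delay).sum(axis=0)  # directional signal

def render_ir(mic_pos, n_dirs=128, gain=lambda d: 1.0):
    """Integrate directional signals over the sphere with a gain pattern."""
    rng = np.random.default_rng(0)
    dirs = rng.normal(size=(n_dirs, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # sphere samples
    ir_freq = sum(gain(d) * render_ray(mic_pos, d) for d in dirs) / n_dirs
    return ir_freq  # frequency-domain IR; an inverse FFT gives the waveform
```

A non-trivial `gain` function would model a directional microphone; with the omnidirectional default, every ray contributes equally.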
Visualization of spatial signal distributions
Synthesized impulse responses across different methods