

Interactive Spatial Audio Performance: Controlling the spatial location of sound sources (3rd-order Ambisonics) in real time by head movement detection

Demo video (Full Performance video is below)

 

In this project, I designed an AI tool for interactive spatial audio performance (with an Ambisonics loudspeaker array) that enables performers to manipulate the spatial locations of sound sources in real time through body movements such as head movement and hand gestures. The project demonstrates that a spatial audio environment can be shaped live, providing a unique and immersive experience for both performers and audiences.

 

The envisioned outcome of this project is an actual musical dance performance, where dancers act as conductors, shaping the spatial audio environment through their choreography. Audiences, in turn, will experience the spatial audio elements moving around them in synchronization with the dancers' movements, enhancing their overall sensory experience.

 

Full Performance video

 

 

How it works

 

In addition to the head-movement control shown in the performance video above, I also built a hand gesture detection system that controls the spatial locations of two sound sources independently, one with each hand.

Two trackers feed the system: VisionOSC, which detects hand gesture data, and FaceOSC, which detects head movement data.

The detected data are sent to Wekinator, which is trained to map them to azimuth and elevation angles.

The azimuth and elevation angles are then sent to the AmbiX plugin; a minimal sketch of this OSC hop is shown below.
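To make the OSC routing concrete, here is a minimal Python sketch (using the python-osc library) of the hop from Wekinator to the plugin host. Wekinator sends its model outputs to port 12000 at the address /wek/outputs by default; the /ambi/azimuth and /ambi/elevation addresses and port 9000 are hypothetical placeholders that depend on how the host maps incoming OSC to plugin parameters — this is a sketch, not the actual patch used in the performance.

```python
from pythonosc.dispatcher import Dispatcher
from pythonosc.osc_server import BlockingOSCUDPServer
from pythonosc.udp_client import SimpleUDPClient

# Hypothetical host port; depends on the DAW's OSC configuration.
host_client = SimpleUDPClient("127.0.0.1", 9000)

def on_wek_outputs(address, azimuth, elevation):
    # Forward Wekinator's two regression outputs to the plugin host.
    host_client.send_message("/ambi/azimuth", float(azimuth))
    host_client.send_message("/ambi/elevation", float(elevation))

dispatcher = Dispatcher()
dispatcher.map("/wek/outputs", on_wek_outputs)

# Wekinator sends model outputs to port 12000 by default.
server = BlockingOSCUDPServer(("127.0.0.1", 12000), dispatcher)
server.serve_forever()
```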

 

The AmbiX plugin is an open-source VST plugin that encodes a mono or stereo source into a higher-order Ambisonics format; the azimuth and elevation angles control the spatial location of the encoded sound source.
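The plugin's internals aren't reproduced here, but the underlying math is standard: an Ambisonics encoder multiplies the mono input by one real spherical-harmonic gain per output channel, evaluated at the source's azimuth and elevation. As an illustration, here is the first-order case in the ambiX convention (ACN channel order, SN3D normalization); the 3rd-order encoder used in this project extends the same pattern to 16 channels.

```python
import math

def encode_first_order(azimuth_deg, elevation_deg):
    """First-order ambiX (ACN order, SN3D) encoding gains for one mono
    source. Illustration only: 3rd order adds the degree-2 and degree-3
    spherical harmonics for a total of 16 channels."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = 1.0                          # ACN 0: omnidirectional
    y = math.sin(az) * math.cos(el)  # ACN 1: left/right
    z = math.sin(el)                 # ACN 2: up/down
    x = math.cos(az) * math.cos(el)  # ACN 3: front/back
    return [w, y, z, x]

# A source at 90° azimuth (hard left), 0° elevation puts all of the
# directional energy in the Y channel: [1.0, 1.0, 0.0, ~0.0]
print(encode_first_order(90.0, 0.0))
```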

 

In this project, 3rd-order Ambisonics, which requires 16 channels, was used.
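The 16-channel count follows from the general rule that Nth-order Ambisonics uses one channel per spherical harmonic up to order N, i.e. (N + 1)² channels in total; a quick check:

```python
# Channel count for Nth-order Ambisonics: (N + 1)**2.
for order in (1, 2, 3):
    print(f"order {order}: {(order + 1) ** 2} channels")  # 4, 9, 16
```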