
Music & Audio Technology

[Spatial Audio] Spatial Separation of Audio Language using a Line Array Loudspeaker system

* Detailed technical report on this project: [pdf link]

 


 

 

 

▲ Demo Video

 

In this project, we built a line array loudspeaker system that delivers different audio languages to different regions of space.

 

People standing in front of the line array on the right side heard the lyrics in Korean, while people on the left side heard them in Japanese. To accomplish this, we used multi-beam sound focusing and beam steering based on the Delay-And-Sum (DAS) structure.

 

 

 

In the far-field region, under the Fraunhofer approximation, the beam pattern of a line array loudspeaker system (in short, a line array) is given by the spatial Fourier transform of its spatial excitation, as shown in the figure above. A real-world line array can be modeled as a discrete line array obtained by sampling an ideal continuous line array along its axis. Spatial sampling multiplies the spatial excitation by an impulse train; by the convolution property of the Fourier transform, this corresponds to convolving the wavenumber spectrum with an impulse train in the wavenumber domain, so the beam pattern repeats periodically.
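As a rough illustration of the discrete spatial Fourier transform described above, the following Python sketch evaluates the normalized far-field response of a uniformly spaced line array. It is not the project's MATLAB code. The function name `array_factor`, the speed of sound of 343 m/s, and the default of eight loudspeakers at 41 mm are assumptions chosen to match the numbers mentioned elsewhere in this post.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def array_factor(theta_deg, freq_hz, num_speakers=8, spacing_m=0.041,
                 steer_deg=0.0, weights=None):
    """Normalized far-field magnitude response of a discrete line array.

    This is the discrete spatial Fourier transform of the element
    excitation, evaluated at the wavenumber k * sin(theta).
    """
    if weights is None:
        weights = [1.0] * num_speakers  # uniform (rectangular) excitation
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # acoustic wavenumber
    # Offset between the look direction and the steered direction in sin-space.
    u = math.sin(math.radians(theta_deg)) - math.sin(math.radians(steer_deg))
    total = sum(w * cmath.exp(1j * k * n * spacing_m * u)
                for n, w in enumerate(weights))
    return abs(total) / sum(abs(w) for w in weights)
```

Sweeping `theta_deg` from -90 to 90 degrees traces the beam pattern at one frequency. Passing a tapered `weights` list instead of the uniform default changes the trade-off between side lobe level and main lobe width.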

 

By assigning the appropriate time delay value for each loudspeaker, we can steer the beam pattern left or right on the horizontal plane relative to the direction the loudspeakers face.
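The delay assignment can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the authors' code: the helper name `steering_delays`, the speed of sound, and the sign convention (positive angles point toward the higher-indexed end of the array) are all chosen here for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_delays(steer_deg, num_speakers=8, spacing_m=0.041):
    """Per-loudspeaker delays in seconds that steer the main lobe.

    Loudspeaker n sits at position n * spacing_m. A listener far away in
    the steered direction is closer to the higher-indexed loudspeakers
    (for a positive angle), so those are delayed more to make all
    wavefronts arrive together.
    """
    step = spacing_m * math.sin(math.radians(steer_deg)) / SPEED_OF_SOUND
    raw = [n * step for n in range(num_speakers)]
    offset = min(raw)  # shift so that no delay is negative
    return [t - offset for t in raw]
```

For a steering angle of 0 degrees every delay is zero, and mirroring the angle reverses the order of the delays across the array.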

 

Two different language versions, Korean and Japanese, of the song <Eyes, Nose, Lips> by Taeyang were used.

 

The Korean version was steered to the right by feeding the eight loudspeakers eight differently delayed copies of the signal, one per loudspeaker, and the Japanese version was steered to the left in the same way. The delayed Korean and Japanese signals were then summed for each loudspeaker.
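The delay-then-sum procedure for two beams can be sketched as follows. The steering angles, the 48 kHz sample rate, the rounding of delays to whole samples, and all function names are assumptions made for this illustration; the post does not state the values that were actually used.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def delay_samples(steer_deg, sample_rate, num_speakers=8, spacing_m=0.041):
    """Non-negative integer sample delays for one steered beam."""
    step = spacing_m * math.sin(math.radians(steer_deg)) / SPEED_OF_SOUND
    raw = [n * step for n in range(num_speakers)]
    offset = min(raw)
    return [round((t - offset) * sample_rate) for t in raw]


def delayed(signal, shift, length):
    """Copy of signal delayed by `shift` samples, zero-padded to `length`."""
    out = [0.0] * length
    for i, x in enumerate(signal):
        if i + shift < length:
            out[i + shift] += x
    return out


def multibeam_feeds(signal_a, steer_a_deg, signal_b, steer_b_deg,
                    sample_rate=48000, num_speakers=8, spacing_m=0.041):
    """One feed per loudspeaker: delayed signal A plus delayed signal B."""
    da = delay_samples(steer_a_deg, sample_rate, num_speakers, spacing_m)
    db = delay_samples(steer_b_deg, sample_rate, num_speakers, spacing_m)
    length = max(len(signal_a) + max(da), len(signal_b) + max(db))
    feeds = []
    for n in range(num_speakers):
        a = delayed(signal_a, da[n], length)
        b = delayed(signal_b, db[n], length)
        feeds.append([x + y for x, y in zip(a, b)])
    return feeds
```

For example, `multibeam_feeds(korean, 30.0, japanese, -30.0)` would return eight sample lists, one per loudspeaker, where `korean` and `japanese` are hypothetical mono sample lists. Whole-sample rounding is a simplification; fractional-delay filtering would steer more precisely.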

 

For sharper sound focusing, the main lobe should be narrow, that is, the half-power beam width (HPBW) should be small. To this end, the total length of the line array should be as long as possible, as we can see in the equation below:

where L is the total length of the line array.
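The equation itself is not reproduced in this text. As a hedged stand-in, and not necessarily the form used in the report, the textbook approximation for a uniformly driven line array radiating at broadside, with L much larger than the wavelength λ (a symbol introduced here), is:

```latex
\mathrm{HPBW} \approx 0.886\,\frac{\lambda}{L} \quad \text{[rad]}
```

For a fixed wavelength, doubling L roughly halves the HPBW, which is why a longer array focuses more sharply.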

 

At the same time, because a discrete line array samples the excitation in space, the Nyquist sampling theorem applies in space: if the loudspeakers are too far apart, the periodic repetitions of the main lobe fall close enough together to appear as grating lobes. This gives an upper limit on the distance between adjacent loudspeakers:

where θ_0 is the steering angle.
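The equation is not reproduced in this text. A commonly used form of this condition, given here as an assumption rather than as the report's exact expression, keeps the first grating lobe out of the visible region for a beam steered to θ_0:

```latex
d < \frac{\lambda_{\min}}{1 + \lvert \sin\theta_0 \rvert}
```

Here d is the distance between adjacent loudspeakers and λ_min is the wavelength at the highest frequency of interest (both symbols are introduced here). With no steering (θ_0 = 0) the bound is d < λ_min, and it tightens toward λ_min / 2 as the beam is steered toward the array axis.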

 

The interval must stay below this upper limit. However, with a fixed number of loudspeakers (eight in our case), a larger interval also means a longer array and therefore a narrower half-power beam width.

 

 

For this reason, we chose an interval of 41 mm, which was close to the upper limit.
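To get a feel for what a 41 mm interval implies, the sketch below evaluates the commonly used grating lobe condition d < λ / (1 + |sin θ_0|). The condition, the speed of sound of 343 m/s, and the function names are assumptions made here; the post does not state the design frequency or steering angle.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def max_spacing(freq_hz, steer_deg):
    """Largest loudspeaker interval (m) free of grating lobes at freq_hz."""
    wavelength = SPEED_OF_SOUND / freq_hz
    return wavelength / (1.0 + abs(math.sin(math.radians(steer_deg))))


def max_frequency(spacing_m, steer_deg):
    """Highest frequency (Hz) free of grating lobes for a given interval."""
    return SPEED_OF_SOUND / (
        spacing_m * (1.0 + abs(math.sin(math.radians(steer_deg)))))
```

Under these assumptions, `max_frequency(0.041, 0.0)` is about 8.4 kHz and `max_frequency(0.041, 30.0)` is about 5.6 kHz, so content above those frequencies could produce grating lobes at the corresponding steering angles.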

 

Abbreviated version of the MATLAB code assigning the appropriate delay time to each loudspeaker. Many auxiliary parts of the code are omitted here.

 

MATLAB simulation: we found that a Taylor window provides better sound focusing

 

 

* This project was the final term project for the course Introduction to Audio Signal Processing, taught by Prof. Jungwoo Choi at KAIST (Korea Advanced Institute of Science and Technology) in 2018. Soohyun Kim, Minkyung Kim, and Sooin Sung worked on it as a team. Soohyun Kim played a major role in the MATLAB coding for this project.