
Music & Audio Technology

[Speech Recognition] My seminar notes: from GMM-HMM to recent DNN techniques

From June 2017 to February 2018, I gave monthly seminar presentations on Automatic Speech Recognition (ASR) at the Statistical Speech & Sound Computing Lab of Prof. Hoi-rin Kim. The seminars went from classical techniques (e.g. GMM, HMM) to recent DNN-based techniques, with mathematical derivations and proofs. They covered the history of these techniques, their extended versions, their strengths and drawbacks, and, most importantly, the mathematical theory behind them.


I followed the textbook <Automatic Speech Recognition – a deep learning approach> by Dong Yu and Li Deng. It gives a broad view of ASR, which is especially helpful for students who have just started learning the field.


I added further explanation where needed, and where the textbook omitted mathematical derivations and proofs, I worked them out myself and added them to my seminar notes.


Below are some example pages of my seminar notes: