IN4182 Digital Audio and Speech Processing

Introduction

The course covers advanced digital signal processing techniques used for acoustic signal processing. Specifically, the course covers techniques for audio coding/compression, suppression of noise in noisy (speech) signals, and speech coding/compression.

The course consists of plenary lectures and a mini project.

For study guide information (teaching goals, etc.), see Study guide.

 

Plenary Lectures

The plenary lectures cover the main subjects of the reading material (see the reading guide), but do not aim at being all-encompassing. By following the plenary lectures, easier access is gained to the reading material, and a clearer insight is obtained as to why certain subjects are relevant and how they relate to other subjects already covered. Although the plenary lectures are not mandatory, it is strongly recommended to follow them as topics needed for the mini projects are treated here.

Mini Project/Project groups

During the course a mini project is conducted. In this project, algorithms for speech enhancement are designed, implemented (in Matlab), evaluated, and documented. The project work is carried out in groups of 2 students, which are formed at the beginning of the course. Some of the project work is expected to be carried out outside the lectures, either at home or at the university. The deliverable of the project is a technical report motivating the algorithm design and describing the implementation and evaluation of the algorithms. As this technical report will be discussed during the evaluation, participation in the mini project is compulsory in order to qualify for the course exam.

 

The final report (per group of 2 students) must be uploaded (individually) before June 18th 2021 via Brightspace. To do so, go to the course page in Brightspace and select the tab "assignments".

Project details:

  • Design and build a single-channel speech enhancement (noise reduction) system for far-end noise reduction.
  • Use Matlab.
  • The speech enhancement system should consist of a gain function, noise PSD estimator and speech PSD estimator.
  • Perform an evaluation of the speech enhancement system.
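
The three building blocks listed above can be sketched as follows. This is only a minimal illustration of how they fit together, not the method expected in the project: it is written in Python for compactness (the project itself must be done in Matlab), all function names are hypothetical, the noise PSD is assumed to be estimated from a few initial noise-only frames, the speech PSD is obtained by simple power subtraction, and the gain is a Wiener gain with an assumed lower limit of 0.1.

```python
def estimate_noise_psd(noisy_periodograms, num_noise_frames):
    """Noise PSD per frequency bin: average of the first frames.

    Assumption: the first `num_noise_frames` frames contain noise only.
    """
    frames = noisy_periodograms[:num_noise_frames]
    num_bins = len(frames[0])
    return [sum(frame[k] for frame in frames) / len(frames)
            for k in range(num_bins)]


def estimate_speech_psd(noisy_periodogram, noise_psd):
    """Speech PSD per bin by power subtraction, floored at zero."""
    return [max(p - n, 0.0) for p, n in zip(noisy_periodogram, noise_psd)]


def wiener_gain(speech_psd, noise_psd, min_gain=0.1):
    """Wiener gain S / (S + N) per bin, limited from below by `min_gain`."""
    gains = []
    for s, n in zip(speech_psd, noise_psd):
        gain = s / (s + n) if (s + n) > 0.0 else min_gain
        gains.append(max(gain, min_gain))
    return gains


def enhance_frame(noisy_spectrum, noise_psd):
    """Apply the gain to the complex DFT coefficients of one frame."""
    periodogram = [abs(y) ** 2 for y in noisy_spectrum]
    speech_psd = estimate_speech_psd(periodogram, noise_psd)
    gains = wiener_gain(speech_psd, noise_psd)
    return [g * y for g, y in zip(gains, noisy_spectrum)]
```

In a complete system, `enhance_frame` would be applied to each windowed DFT frame of the noisy signal, followed by an inverse DFT and overlap-add. The noise PSD tracking methods treated in the lectures would replace the simple initial-frame average used here.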

Optional:

  • Implement a multi-microphone system

Sound files for mini-project:

Impulse responses to model multi-microphone signals in mini-project:

The .mat file contains 5 sets of impulse responses. Each set contains the impulse responses from a source location to the 4 microphone locations. Each of these 5 sets can then be convolved with a sound signal to model the signal at each of the microphone locations.
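
The convolution step described above can be sketched as follows. This is a minimal illustration in Python (in the project, Matlab's built-in convolution would be used instead). The function names are hypothetical, and the impulse responses are assumed to be already loaded as plain lists of samples, one list per microphone.

```python
def convolve(signal, impulse_response):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def simulate_microphones(source, impulse_responses):
    """Model the signal at each microphone for one source location.

    `impulse_responses` holds one impulse response per microphone,
    i.e. one of the sets described above.
    """
    return [convolve(source, h) for h in impulse_responses]
```

Calling `simulate_microphones` with one set of 4 impulse responses returns 4 signals, one per microphone location. Signals from several source locations (for example a target speaker and a noise source) can then be added sample by sample to obtain a noisy multi-microphone recording.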

Evaluation

The course is evaluated in an individual oral exam. In order to qualify for this exam, the project report documenting the project must be handed in in advance. During the exam, content from the project report is discussed. In addition, a random topic covered in the lectures is discussed. The final grade will be based on the discussion of the project and the discussion of the random topic.

During the course the students carry out a project (in pairs) and write a project report. In addition, the students are asked to write an individual essay of 2 A4 pages on the topic of "multi-microphone speech enhancement, and the related theory". Both the project report and the individual assignment must be uploaded (individually) before June 18th 2021 via Brightspace. To do so, go to the course page in Brightspace and select the tab "assignments".

During the oral examination, the project and its report, as well as the essay, are discussed. The discussion of each part counts for 50% of the final grade. In order to qualify for the oral exam, the project report and individual assignment must be handed in in advance.

Signing up for the oral exam (before June 2nd!): go to the tab "collaboration", choose "Groups" and then select "DASP Q&A session 2021". There you can sign up for a time slot. To be able to organise these sessions, I would like to ask you to sign up before June 2nd.

Disclaimer: information may change depending on the developments around the coronavirus.

Teachers

The responsible teacher for this course is dr. ir. Richard C. Hendriks (RCH). During this course we will have some guest lecturers: dr. ir. Richard Heusdens (RH), dr. Nikolay D. Gaubitch (NDG) and dr. Odette Scharenborg (OS).

Program

The program for the study year 2020 - 2021 is as follows:

 


| No. | Date | Lecturer | Content | Literature | Slides |
|-----|------|----------|---------|------------|--------|
| 1 | 19/4/2021 | RCH | 1) Introduction to the course 2) Introduction to speech enhancement | Overview book Speech Enhancement | SE 1 |
| 2 | 20/4/2021 | RCH | MMSE based speech enhancement | Overview book Speech Enhancement | SE 2 |
| 3 | 26/4/2021 | | No lecture, but read papers on quality assessment | An Evaluation of Intrusive Instrumental Intelligibility Metrics (Steven Van Kuyk et al.); An Instrumental Intelligibility Metric Based on Information Theory (Steven Van Kuyk et al.); Speech Quality Assessment (Grancharov & Kleijn); STOI (Cees Taal) | |
| 4 | 3/5/2021 | RCH | Noise PSD tracking for speech enhancement | Overview book Speech Enhancement; Voice activity detection; Minimum statistics; Unbiased MMSE-based Noise PSD Estimation | SE 3 |
| 5 | 4/5/2021 | RH | Psycho-acoustics | Perceptual coding of Digital Audio | PA |
| 6 | 10/5/2021 | RCH | Multi-microphone speech enhancement 1 | | MM 1 |
| 7 | 11/5/2021 | RH | Multi-microphone speech enhancement 2 | | MM 2 |
| 8 | 17/5/2021 | RCH | Speech production | Some background material | SP |
| 9 | 18/5/2021 | RH | Multi-microphone speech enhancement 3 | | MM 3 |
| 10 | 25/5/2021 | | No lecture | | |
| 11 | 31/5/2021 | OS | Automatic speech recognition | | ASR |
| 12 | 1/6/2021 | NDG | Audio Features 1 (audio features 1) | | FR1 |
| 13 | 7/6/2021 | NDG | Audio Features 2 (audio features 2) | | FR2 |
| 14 | 8/6/2021 | RH, RCH | Clock synchronization invariant beamforming; An outlook to Biomedical Signal processing and audio graduation topics | | GEVD; bio SP & audio topics |