IN4182 Digital Audio and Speech Processing

Introduction

The course covers advanced digital signal processing techniques used for acoustic signal processing. Specifically, the course covers techniques for audio coding/compression, suppression of noise in noisy (speech) signals, and speech coding/compression.

The course consists of plenary lectures and mini projects.

For study guide information (teaching goals, etc.), see Study guide.

Plenary Lectures

The plenary lectures cover the main subjects of the reading material (see the reading guide), but do not aim at being all-encompassing. By following the plenary lectures, easier access is gained to the reading material, and a clearer insight is obtained as to why certain subjects are relevant and how they relate to other subjects already covered. Although the plenary lectures are not mandatory, it is strongly recommended to follow them as topics needed for the mini projects are treated here.

Mini Project/Project groups

During the course a mini project is conducted. In this project, algorithms for speech enhancement are designed, implemented (in Matlab), evaluated, and documented. The project work is carried out in groups of 2 students, which are formed at the beginning of the course. Some of the project work is expected to be carried out at home or at the university. The outcome of the project is a technical report motivating the algorithm design and describing the implementation and evaluation of the algorithms. As this technical report will be discussed during the evaluation, participation in the mini project is compulsory in order to qualify for the course exam.

The final report (per group of 2 students) must be uploaded (individually) before June 18th 2021 via Brightspace. To do so, go to the course page in Brightspace and select the tab "assignments".

Project details:

Design and build a single-channel speech enhancement (noise reduction) system for far-end noise reduction.
Use Matlab.
The speech enhancement system should consist of a gain function, a noise PSD estimator, and a speech PSD estimator.
Perform an evaluation of the speech enhancement system.
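
To illustrate how the three components fit together, here is a minimal sketch. It is written in Python purely for illustration (the project itself must be implemented in Matlab) and is not the prescribed design. It assumes that the noisy signal has already been split into frames and transformed to periodograms, that the first few frames contain noise only, and that a Wiener gain with a gain floor is used. All function names and parameter values are hypothetical.

```python
def estimate_noise_psd(periodograms, num_noise_frames):
    """Noise PSD estimator: average the first frames per frequency bin.

    Assumption (for illustration only): the first num_noise_frames frames
    contain noise only. A real system would track the noise over time.
    """
    num_bins = len(periodograms[0])
    noise_frames = periodograms[:num_noise_frames]
    return [sum(frame[k] for frame in noise_frames) / len(noise_frames)
            for k in range(num_bins)]


def estimate_speech_psd(noisy_periodogram, noise_psd):
    """Speech PSD estimator: noisy periodogram minus noise PSD, floored at 0."""
    return [max(p - n, 0.0) for p, n in zip(noisy_periodogram, noise_psd)]


def wiener_gain(speech_psd, noise_psd, gain_floor=0.1):
    """Gain function: Wiener gain per frequency bin, limited from below."""
    gains = []
    for s, n in zip(speech_psd, noise_psd):
        total = s + n
        gain = s / total if total > 0.0 else 0.0
        gains.append(max(gain, gain_floor))
    return gains


# Toy example: 3 frames, 2 frequency bins (values are made up).
periodograms = [[1.0, 1.0], [1.0, 3.0], [5.0, 2.0]]
noise_psd = estimate_noise_psd(periodograms, num_noise_frames=2)
speech_psd = estimate_speech_psd(periodograms[2], noise_psd)
gains = wiener_gain(speech_psd, noise_psd)
print(noise_psd, speech_psd, gains)
```

In a complete system the gains would be applied to the noisy STFT coefficients of each frame before the signal is transformed back to the time domain.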

Optional:

Implement a multi-microphone system

Sound files for mini-project:

Impulse responses to model multi-microphone signals in mini-project:

Impulse responses

The .mat file contains 5 sets of impulse responses. Each set contains the impulse responses from a source location to the 4 microphone locations. Each of these 5 sets can then be convolved with a sound signal to model the signal at each of the microphone locations.
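
The convolution step can be sketched as follows. The example is in Python for illustration only; in Matlab the built-in conv function performs the same operation. The impulse responses and the source signal below are made-up toy values, and the function names are hypothetical. The variable names used inside the actual .mat file are not assumed here.

```python
def convolve(signal, impulse_response):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out


def simulate_microphones(source, impulse_responses):
    """Convolve one source signal with one set of impulse responses.

    impulse_responses holds one impulse response per microphone, so the
    result holds one simulated signal per microphone.
    """
    return [convolve(source, h) for h in impulse_responses]


# Toy source signal and one toy set of 4 impulse responses (one per microphone).
source = [1.0, 0.5, 0.0, -0.5]
impulse_responses = [
    [1.0],             # microphone 1: direct path only
    [0.0, 0.8],        # microphone 2: one sample delay, attenuated
    [0.0, 0.0, 0.6],   # microphone 3: two samples delay, attenuated
    [0.5, 0.0, 0.25],  # microphone 4: direct path plus one reflection
]
mic_signals = simulate_microphones(source, impulse_responses)
for index, mic_signal in enumerate(mic_signals, start=1):
    print("microphone", index, mic_signal)
```

To model several sources (for example a target speaker and a noise source), each source can be convolved with its own set of impulse responses and the results summed per microphone.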

Evaluation

The course is evaluated in an individual oral exam. In order to qualify for this exam, the project report documenting the project must be handed in beforehand. During the exam, content from the project report is discussed. In addition, a random topic covered in the lectures is discussed. The final grade will be based on the discussion of the project and the discussion of the random topic.

During the course the students carry out a project (in pairs) and write a project report. In addition, the students are asked to write an individual essay of 2 A4 pages on the topic of "multi-microphone speech enhancement, and the related theory". Both the project report and the individual assignment must be uploaded (individually) before June 18th 2021 via Brightspace. To do so, go to the course page in Brightspace and select the tab "assignments".

During an oral examination, the project and its report, as well as the essay, are discussed. The discussion of each of the two parts counts for 50% of the final grade. In order to qualify for the oral exam, the project report and individual assignment must be handed in beforehand.

Signing up for the oral exam (before June 2nd!): go to the tab "collaboration", choose "Groups", and then select "DASP Q&A session 2021". There you can sign up for a time slot. To make it possible to organise these sessions, please sign up before June 2nd.

Disclaimer: information may change depending on the developments around the coronavirus.

Teachers

The responsible teacher for this course is dr. ir. Richard C. Hendriks (RCH). During this course we will have some guest lecturers: dr. ir. Richard Heusdens (RH), dr. Nikolay D. Gaubitch (NDG), and dr. Odette Scharenborg (OS).

Program

The program for the study year 2020 - 2021 is as follows:

	Date	Lecturer	Content	Literature	Slides
1.	19/4/2021	RCH	1) Introduction to the course 2) Introduction to speech enhancement	Overview book Speech Enhancement	SE 1
2.	20/4/2021	RCH	MMSE-based speech enhancement	Overview book Speech Enhancement	SE 2
3.	26/4/2021		No lecture, but read papers on quality assessment	An Evaluation of Intrusive Instrumental Intelligibility Metrics - Steven Van Kuyk et al.; An Instrumental Intelligibility Metric Based on Information Theory - Steven Van Kuyk et al.; Speech Quality Assessment - Grancharov & Kleijn; STOI - Cees Taal
4.	3/5/2021	RCH	Noise PSD tracking for speech enhancement	Overview book Speech Enhancement; Voice activity detection; Minimum statistics; Unbiased MMSE-based Noise PSD Estimation	SE 3
5.	4/5/2021	RH	Psycho-acoustics	Perceptual coding of Digital Audio	PA
6.	10/5/2021	RCH	Multi-microphone speech enhancement 1		MM 1
7.	11/5/2021	RH	Multi-microphone speech enhancement 2		MM 2
8.	17/5/2021	RCH	Speech production	Some background material	SP
9.	18/5/2021	RH	Multi-microphone speech enhancement 3		MM 3
10.	25/5/2021		No lecture
11.	31/5/2021	OS	Automatic speech recognition		ASR
12.	1/6/2021	NDG	Audio Features 1 (audio features 1)		FR1
13.	7/6/2021	NDG	Audio Features 2 (audio features 2)		FR2
14.	8/6/2021	RH, RCH	Clock synchronization invariant beamforming; an outlook to biomedical signal processing and audio graduation topics	Clock sync. invariant BF	GEVD bio SP & audio topics