IN4182 Digital Audio and Speech Processing
Introduction
The course covers advanced digital signal processing techniques used for acoustic signal processing. Specifically, the course covers techniques for audio coding/compression, suppression of noise in noisy (speech) signals, and speech coding/compression.
The course consists of plenary lectures and mini projects.
For study guide information (teaching goals, etc.), see Study guide.
Plenary Lectures
The plenary lectures cover the main subjects of the reading material (see the reading guide), but do not aim at being all-encompassing. By following the plenary lectures, easier access is gained to the reading material, and a clearer insight is obtained as to why certain subjects are relevant and how they relate to other subjects already covered. Although the plenary lectures are not mandatory, it is strongly recommended to follow them as topics needed for the mini projects are treated here.
Mini Project/Project groups
During the course a mini project is conducted. In this project, algorithms for speech enhancement are designed, implemented (in Matlab), evaluated, and documented. The project work is carried out in groups of 2 students, which are formed at the beginning of the course. Some of the project work is expected to be carried out outside the lectures, either at home or at the university. The physical outcome of the project is a technical report motivating the algorithm design and describing the implementation and evaluation of the algorithms. As this technical report will be discussed during the evaluation, participation in the mini project is compulsory in order to qualify for the course exam.
The final report (per group of 2 students) must be uploaded (individually) before June 18th 2021 via Brightspace. To do so, go to the course page in Brightspace, select the tab "assignments".
Project details:
- Design and build a single-channel speech enhancement (noise reduction) system for far-end noise reduction.
- Use Matlab.
- The speech enhancement system should consist of a gain function, noise PSD estimator and speech PSD estimator.
- Perform an evaluation of the speech enhancement system.
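As a rough illustration of how the three building blocks named above fit together, the sketch below processes the frequency bins of a single frame. It is written in Python purely for illustration (the project itself is to be implemented in Matlab), and the specific choices are assumptions rather than the methods prescribed in the course: the speech PSD is estimated by power spectral subtraction, the gain is a Wiener gain with a lower limit, and the noise PSD is tracked by recursive smoothing. The names `enhance_frame`, `update_noise_psd`, `gain_floor` and `alpha` are hypothetical.

```python
def enhance_frame(noisy_psd, noise_psd, gain_floor=0.1):
    """Return one gain per frequency bin for a single frame.

    noisy_psd  : periodogram |Y(k)|^2 of the noisy frame, one value per bin
    noise_psd  : current noise PSD estimate, one value per bin
    gain_floor : hypothetical lower limit on the gain (an assumption)
    """
    gains = []
    for p_y, p_n in zip(noisy_psd, noise_psd):
        # Speech PSD estimator (assumed): power spectral subtraction, clipped at zero.
        p_s = max(p_y - p_n, 0.0)
        # Gain function (assumed): Wiener gain, limited from below by gain_floor.
        denom = p_s + p_n
        gain = p_s / denom if denom > 0.0 else 0.0
        gains.append(max(gain, gain_floor))
    return gains


def update_noise_psd(noise_psd, noisy_psd, alpha=0.9):
    """Noise PSD estimator (assumed): recursive smoothing of the periodogram.

    Only meaningful for frames that are judged to contain noise only.
    """
    return [alpha * p_n + (1.0 - alpha) * p_y
            for p_n, p_y in zip(noise_psd, noisy_psd)]
```

In a complete system the gains would be multiplied with the noisy DFT coefficients of each frame before the inverse transform and overlap-add; those steps are left out here.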
Optional:
- Implement a multi-microphone system
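For the multi-microphone option, the signal at each microphone can be modelled by convolving a source signal with the impulse response from the source location to that microphone. The sketch below shows this in Python, purely for illustration (the project itself asks for Matlab). It assumes the impulse responses for one source location are available as plain lists of samples; the function names `convolve` and `microphone_signals` are hypothetical.

```python
def convolve(signal, impulse_response):
    """Full linear convolution; output length is len(signal) + len(impulse_response) - 1."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += s * h
    return out


def microphone_signals(source, impulse_responses):
    """Model the signal at each microphone for one source location.

    impulse_responses: one impulse response per microphone (assumed layout).
    """
    return [convolve(source, h) for h in impulse_responses]
```

A noisy multi-microphone recording could then be modelled by summing, per microphone, the convolved speech signal and the convolved noise signals from the other source locations.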
Sound files for mini-project:
- Clean speech 1
- Clean speech 2
- Babble noise
- Artificial nonstationary noise
- Stationary speech shaped noise
Impulse responses to model multi-microphone signals in mini-project:
The .mat file contains 5 sets of impulse responses. Each set contains the impulse responses from a source location to the 4 microphone locations. Each of these 5 sets can then be convolved with a sound signal to model the signal at each of the microphone locations.
Evaluation
The course is evaluated in an individual oral exam. In order to qualify for this exam, the project report documenting the project must be handed in in advance. During the exam, content from the project report is discussed. In addition, a random topic covered in the lectures is discussed. The final grade will be based on the discussion of the project and the discussion on the random topic.
During the course the students carry out a project (in pairs) and write a project report. In addition, the students are asked to write an individual essay of 2 A4 pages on the topic of "multi-microphone speech enhancement, and the related theory". Both the project report and the individual assignment must be uploaded (individually) before June 18th 2021 via Brightspace. To do so, go to the course page in Brightspace and select the tab "assignments".
During an oral examination, both the project and its report, as well as the essay, are discussed. The discussion of each part counts for 50 % of the final grade. In order to qualify for the oral exam, the project report and individual assignment must be handed in in advance.
Signing up for the oral exam (before June 2nd!): go to the tab "collaboration", choose "Groups" and then select "DASP Q&A session 2021". There you can sign up for a time slot. To be able to organise these sessions, I would like to ask you to sign up before June 2nd.
Disclaimer: information may change depending on the developments around the coronavirus.
Teachers
The responsible teacher for this course is dr. ir. Richard C. Hendriks (RCH). During this course we will have some guest lecturers. These are dr. ir. Richard Heusdens (RH), dr. Nikolay D. Gaubitch (NDG) and dr. Odette Scharenborg (OS).
Program
The program for the study year 2020 - 2021 is as follows:
No. | Date | Lecturer | Content | Literature | Slides |
---|---|---|---|---|---|
1. | 19/4/2021 | RCH | 1) Introduction to the course 2) Introduction to speech enhancement | Overview, book Speech Enhancement | SE 1 |
2. | 20/4/2021 | RCH | MMSE based speech enhancement | Overview, book Speech Enhancement | SE 2 |
3. | 26/4/2021 | | No lecture, but read papers on quality assessment | An Evaluation of Intrusive Instrumental Intelligibility Metrics - Steven Van Kuyk et al.; An Instrumental Intelligibility Metric Based on Information Theory - Steven Van Kuyk et al. | |
4. | 3/5/2021 | RCH | Noise PSD tracking for speech enhancement | | SE 3 |
5. | 4/5/2021 | RH | Psycho-acoustics | Perceptual coding of Digital Audio | PA |
6. | 10/5/2021 | RCH | Multi-microphone speech enhancement 1 | | MM 1 |
7. | 11/5/2021 | RH | Multi-microphone speech enhancement 2 | | MM 2 |
8. | 17/5/2021 | RCH | Speech production | | |
9. | 18/5/2021 | RH | Multi-microphone speech enhancement 3 | | MM 3 |
10. | 25/5/2021 | | No lecture | | |
11. | 31/5/2021 | OS | Automatic speech recognition | | ASR |
12. | 1/6/2021 | NDG | Audio Features 1 | | FR1 |
13. | 7/6/2021 | NDG | Audio Features 2 | | FR2 |
14. | 8/6/2021 | RH, RCH | Clock synchronization invariant beamforming; an outlook to biomedical signal processing and audio graduation topics | | |