MSc thesis project proposal

Adaptive audio processing (with the company Axign)

Project outside the university

Axign

1.1 Earphones/earbuds and hearables

Audio reproduction with earphones, whether in-ear or over-ear, faces several problems. Earphones are used everywhere, indoors and outdoors, and people still want to listen to music undisturbed and free of noise. In headphones the spatial information of the sound image is minimal, resulting in a flat stereo image for the listener. In the gaming industry, precise head tracking combined with 3D audio processing is a much sought-after feature that makes computer games even more realistic.

Active noise cancelling (ANC) also plays an important role in reducing environmental noise. Most earbuds do not achieve hifi performance once ANC is switched on. In earbuds the noise reduction is limited to the low frequencies, with a reduction of 20 to 30 dBA. ANC algorithms achieve their best results with stationary noise; in real life, however, noise varies in frequency and in volume. The Axign technology should be able to deliver broadband noise cancelling (from 20 Hz to 15 kHz) while delivering music content in hifi mode.

In hearables an extra dimension is added to the performance of the device, namely compensating for the hearing loss of its user. Hearing loss can be measured separately by means of audiological measurements. During these interactive measurements the listener is asked to respond to tones at discrete frequencies, presented at predefined sound pressure levels. In this way the spectrum of the hearing loss is determined. The corresponding curve should be compensated for during use of the device. As an alternative way of determining the hearing loss, one could use the frequency response of the cochlea to stimuli generated by the hearing aid. This would enable continuous monitoring of the user's hearing loss.
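To illustrate why adaptive algorithms are attractive when noise is not stationary, the sketch below shows a normalised LMS (NLMS) noise canceller in plain Python. It is a simplified, purely electrical illustration and not Axign's technology: a real earbud ANC system also has to model the acoustic path from the loudspeaker to the ear. The reference microphone, the 2-tap leakage path and all parameter values are assumptions made for this example.

```python
import math
import random


def nlms_cancel(reference, primary, num_taps=4, mu=0.05, eps=1e-8):
    """Subtract an adaptive estimate of the noise contained in `primary`.

    reference: noise as picked up by an (assumed) outward-facing microphone
    primary:   wanted audio plus a filtered version of that noise
    Returns the cleaned signal, which is the NLMS error signal.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    cleaned = []
    for x, d in zip(reference, primary):
        history = [x] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - noise_estimate
        power = sum(h * h for h in history) + eps
        # Normalised LMS update: the step is scaled by the reference power.
        weights = [w + mu * error * h / power for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned


def demo(num_samples=5000, seed=0):
    """Return the residual noise power before and after cancelling."""
    rng = random.Random(seed)
    music = [0.5 * math.sin(2 * math.pi * 0.01 * n) for n in range(num_samples)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    # Hypothetical leakage path from outside to the ear: a 2-tap FIR filter.
    leaked = [0.6 * noise[n] + 0.3 * (noise[n - 1] if n > 0 else 0.0)
              for n in range(num_samples)]
    primary = [m + l for m, l in zip(music, leaked)]
    cleaned = nlms_cancel(noise, primary)

    def tail_power(signal):
        # Mean squared deviation from the music over the last 1000 samples.
        pairs = list(zip(signal, music))[-1000:]
        return sum((s - m) ** 2 for s, m in pairs) / len(pairs)

    return tail_power(primary), tail_power(cleaned)


if __name__ == "__main__":
    before, after = demo()
    print(f"noise power before: {before:.4f}, after: {after:.4f}")
```

Because the filter weights are updated at every sample, the same structure can follow a noise path that changes over time, at the cost of a small residual error that grows with the step size `mu`.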

1.2 Voice as human-machine interface

An ever-increasing number of products use voice-enabled interaction as their user interface: not only audio products, but also the controls of electrical appliances in the home. Major hiccups of such systems occur in loud and/or windy environments, or when there is a large distance between the person speaking and the microphone. In these cases the recognition engine is not able to discriminate the control words from the rest of the audio content. Improving the reception and discrimination of these control words will increase the acceptance of such systems in the market. In mobile phone applications these shortcomings can be masked by holding the microphone close to the mouth (near-field recognition), but in systems that have a fixed position, like a TV, a HomePod, an Echo or a Nest device, this is not possible, as they operate in far-field detection mode most of the time. These devices therefore need very good voice command recognition. Communication with these products starts with voice activation detection. The activation words are special words drawn from a limited set, which increases the speed of recognition. To improve speech recognition one can think of using beamforming microphones and bone conduction.
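As an illustration of the beamforming idea mentioned above, the following minimal sketch implements a delay-and-sum beamformer in plain Python. It assumes that the arrival delay of the voice at each microphone is already known as a whole number of samples; the four-microphone array, the delays and the noise level are hypothetical values chosen for the example.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Align each microphone channel by its integer delay and average them.

    channels: list of equally long sample lists, one per microphone
    delays:   arrival delay of the wanted voice at each microphone, in samples
    """
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


def demo(num_samples=2000, delays=(0, 1, 2, 3), seed=0):
    """Return the noise power of a single microphone and of the beam."""
    rng = random.Random(seed)
    speech = [math.sin(2 * math.pi * 0.02 * n) for n in range(num_samples)]
    channels = []
    for d in delays:
        # Each microphone hears the delayed voice plus its own independent noise.
        channels.append([
            (speech[n - d] if n >= d else 0.0) + rng.gauss(0.0, 0.5)
            for n in range(num_samples)
        ])
    beam = delay_and_sum(channels, delays)

    def noise_power(signal):
        pairs = list(zip(signal, speech))
        return sum((s - c) ** 2 for s, c in pairs) / len(pairs)

    return noise_power(channels[0]), noise_power(beam)


if __name__ == "__main__":
    single, beam = demo()
    print(f"single microphone: {single:.4f}, beamformer: {beam:.4f}")
```

The aligned voice adds up coherently while the independent noise averages out, so with four microphones the noise power drops to roughly a quarter. In a far-field product the delays would have to be estimated or steered, which is where adaptive processing comes in.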

1.3 Listening experience

Audio is produced in many conditions: from open-air music stages to small intimate rooms, from opera to talk shows, from quiet to noisy environments. The ideal audio system should be able to reproduce the listener's experience in any of these environments. To create such an experience, multi-speaker systems like Dolby surround systems have been developed. Although they enhance the experience, it is still not perfect. Stereo devices are limited in conveying the spatial impression of the sound stage, which limits the audio experience. The sound stage is only present in the sweet spot, where the listener and the two speakers form the corners of an equilateral triangle. D’Appolito speakers will enhance the stereo image, but only to a certain extent. In such a system, the drivers all have similar horizontal dispersion at the selected crossover frequency between tweeter and woofer. The result is an absence of any sudden change in directivity with frequency. This may not mean much for monitors, where the listening area is limited, but in a typical room, where a large percentage of the sound is reflected by the room, the effect is dramatic. Whether surround sound systems or D’Appolito speakers are used, the room characteristics remain an important factor in the reproduction and in creating the ultimate experience, and they need to be dealt with.

About Axign

Axign’s mission is to create the ultimate audio experience for high-volume products at the lowest system cost, enabled by Axign’s disruptive, innovative and patented inventions. Axign is based in the Netherlands, with offices in Enschede (headquarters) and Nijmegen. Its first product is a breakthrough audio controller chip that combines benchmark performance with a low bill of materials for audio system manufacturers. The chip is especially suited for active speaker systems in streaming audio devices. http://www.axign.nl/

For more information, please contact:

Jeroen Langevoort, CTO/Founder, Axign, jeroen.langevoort@axign.nl

or

Remko van Heeswijk, System Architect, remko.van.heeswijk@axign.nl

Assignment

Design adaptive audio algorithms or hardware accelerators

  1. Earphones/earbuds
    1. Active noise cancelling in earbuds
    2. Improve voice recognition/pickup in loud and windy environments
    3. 3D head-tracking-based spatial audio processing
    4. Improve direction-dependent noise cancelling in headphones
    5. When audio files mastered for stereo home HiFi setups are played back on earphones/headphones, spatial information is missing. How can we improve this without requiring binaurally recorded audio files?
  2. Smart speakers
    1. Improve voice recognition in the far field
    2. Optimize spatial audio field in a room
  3. Hearables
    1. Compensation of hearing loss in hearables: measure the user’s hearing loss in situ and adapt the settings of the earbud amplifier and equalizer to deliver the best audio performance given the user’s degree of hearing loss.
    2. Optimize voice recognition during cocktail parties
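
To make the hearing-loss compensation topic more concrete, the sketch below turns a measured audiogram into per-band equalizer gains using the classical half-gain rule (gain of roughly half the hearing loss in dB). This is only an illustrative starting point under stated assumptions, not a clinical fitting prescription and not the method the assignment prescribes; the audiogram values and the 40 dB gain cap are hypothetical.

```python
def half_gain_rule(audiogram_db_hl, max_gain_db=40.0):
    """Map hearing thresholds to equalizer gains with the half-gain rule.

    audiogram_db_hl: dict of frequency in Hz -> hearing loss in dB HL
    max_gain_db:     assumed upper limit of what the amplifier may apply
    Returns a dict of frequency in Hz -> gain in dB.
    """
    gains = {}
    for frequency_hz, loss_db in sorted(audiogram_db_hl.items()):
        gain_db = 0.5 * max(loss_db, 0.0)  # no attenuation for negative losses
        gains[frequency_hz] = min(gain_db, max_gain_db)
    return gains


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    # Hypothetical audiogram with a typical high-frequency loss.
    audiogram = {250: 10, 500: 20, 1000: 30, 2000: 45, 4000: 60, 8000: 70}
    for frequency_hz, gain_db in half_gain_rule(audiogram).items():
        print(f"{frequency_hz:>5} Hz: +{gain_db:.1f} dB "
              f"(x{db_to_linear(gain_db):.2f})")
```

With an in-situ measurement the audiogram could be refreshed regularly, so the gains would follow changes in the user's hearing instead of being fixed at fitting time.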

All of the above-mentioned research activities may involve the development of algorithms (possibly based on combinations of conventional signal processing and AI/machine learning) or of efficient hardware accelerators for adaptive audio processing and neural networks. Power efficiency of the solution is key.

Contact

dr.ir. Richard Hendriks

Circuits and Systems Group

Department of Microelectronics

Last modified: 2021-09-01