Development of a Neural Network-Based Acoustic Beamformer for Input Preprocessing in Digital Hearing Aids
Noise cancellation in hearing aids enables pleasant listening for users. In this study, we perform both spectral and spatial noise cancellation. We develop a Generalized Eigenvector (GEV) beamformer for spatial noise cancellation and apply time-frequency masking for spectral noise cancellation. The beamformer uses an estimated time-frequency mask to compute the spatial covariance matrices of the desired sound source and the noise separately. The beamforming coefficients are obtained from these covariance matrices by generalized eigenvector decomposition and are then used to form a weighted sum of the microphone inputs, producing a spatially filtered output. A four-microphone setup is chosen, with a pair of microphones worn on each ear, thereby satisfying hearing aid fitting constraints. The two hearing aids are assumed to have a wireless link, enabling the sharing of microphone signals between the two devices and providing a four-channel input to the beamformer. Mask estimation is performed with a Deep Neural Network (DNN), for which we explore the Ideal Binary Mask (IBM) and the Complex Ideal Ratio Mask (cIRM) as training targets. The spectral input features include Mel-Frequency Cepstral Coefficients (MFCCs) and the log spectrogram. The training dataset is custom-generated: desired source signals and noise signals are recorded individually and then mixed together. Source signals include speech, instrumental music, and various important sounds such as alarms and ambulance sirens. Noise signals include babble noise, crowd noise, vehicular noise, and pure white noise.
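The mask-based GEV beamforming pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the multichannel STFT and the estimated speech mask are already available, and it uses SciPy's generalized Hermitian eigensolver with a small diagonal loading on the noise covariance for numerical stability (an assumption made here, not stated in the abstract).

```python
import numpy as np
from scipy.linalg import eigh

def gev_weights(Y, mask):
    """Per-frequency GEV beamforming weights.

    Y:    complex multichannel STFT, shape (channels, frames, freqs)
    mask: estimated speech time-frequency mask, shape (frames, freqs), in [0, 1]
    Returns W with shape (freqs, channels).
    """
    C, T, F = Y.shape
    W = np.zeros((F, C), dtype=complex)
    for f in range(F):
        Yf = Y[:, :, f]                                  # (C, T)
        m = mask[:, f]                                   # (T,)
        # Mask-weighted spatial covariance matrices of speech and noise
        phi_s = (m * Yf) @ Yf.conj().T / max(m.sum(), 1e-10)
        phi_n = ((1.0 - m) * Yf) @ Yf.conj().T / max((1.0 - m).sum(), 1e-10)
        # Diagonal loading so phi_n is safely positive definite (assumption)
        phi_n += 1e-6 * np.eye(C)
        # Principal generalized eigenvector of the pencil (phi_s, phi_n)
        _, vecs = eigh(phi_s, phi_n)
        W[f] = vecs[:, -1]                               # largest eigenvalue last
    return W

def apply_beamformer(Y, W):
    """Weighted sum over channels: X_hat(t, f) = w_f^H Y(:, t, f)."""
    return np.einsum('fc,ctf->tf', W.conj(), Y)
```

With a four-channel input (two microphones per hearing aid shared over the wireless link), `Y` would have `channels = 4`, and the output of `apply_beamformer` is the single-channel spatially filtered spectrogram.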
Keywords: Deep Neural Network, Multi-Channel Speech Enhancement, Computational Auditory Scene Analysis (CASA), Automatic Speech Recognition (ASR).