Mathematics of the Ear and Sound Signal Processing

January 31 - February 2, 2005


The human auditory system, from the inner ear to the auditory cortex, is a complex multi-level pathway of sound information processing. Three distinct levels have been studied the most. The first is the preprocessing of sound in the peripheral system, most notably the mechanical vibrations of the basilar membrane (BM) in the cochlea of the inner ear. The vibration pattern encodes the acoustic characteristics of incoming sound signals. Although well-known partial differential equations (PDEs) from classical mechanics provide a solid foundation for describing this mechanical activity, additional nonlinearities must be modeled to capture nonlinear responses observed in in vivo experiments, such as tonal suppression and the generation of combination tones. Despite decades of work, this area remains active.

The next level of information processing occurs in as many as 30,000 nerve fibers running from the inner ear to the brain. Nonlinearities arise in the peripheral auditory neurons when hair cells convert sound signals from a mechanical to a neural representation. It is now well known that the outer hair cells of the cochlea play an active role in increasing the sensitivity and dynamic range of the ear. In addition, the frequency distribution of sound is maintained by the wave patterns on the BM and is preserved along the fibers, resulting in a tonotopic organization of frequency responses in the auditory cortex of the brain. The last level of the information flow is sound perception. The study of the relation between the physics of sounds and their perception is referred to as psychoacoustics, in which a large body of data has been collected from human subjects. Interestingly, as sound information reaches higher levels, many nonlinear features remain. For example, tonal suppression is observed in neural responses as well as in psychoacoustics, where it is called masking.

Mathematical models exist at all levels of auditory processing. Because physiological detail and understanding become increasingly scarce at higher levels, first-principles equations for the auditory system are not well formulated, and many available models are empirical. Systematic evaluation and investigation of these models is needed in order to find more accurate mathematical representations that match essential properties of neural and psychoacoustic data. It is encouraging that empirical mathematical models of the ear often provide a useful framework for applications. For example, MP3 is a remarkable sound compression tool that employs masking threshold curves from psychoacoustics to significantly reduce the number of bits needed to represent digital sound. It is routinely used for downloading and playing digital music. In addition, signal processing tools that correctly compensate for the reduced function of impaired ears can greatly improve the performance of hearing aids and enhance the efficiency of ear implants. Signal processing methods based on key aspects of human hearing also help to increase the performance of speech recognition systems.

The proposed workshop aims to explore mathematical models of the ear at all levels, PDE and statistical methods, and their connections to signal processing applications in industry and the health sciences. Some key workshop topics are:

  • Cochlea: models and signal processing.
  • Neural models: hair cell functions, responses of auditory neurons to sounds, and neural signal processing.
  • Psychoacoustic models, masking of tones and noises.
  • Multi-level signal processing models; PDE methods; statistical methods; analysis of models and computational algorithms.
  • Applications of signal processing methods to hearing aids, ear implants and speech recognition.

The key workshop participants include leading researchers from universities and industry. The workshop will bring hearing and speech experts together with applied mathematicians, and offer a wonderful opportunity to review and advance a field in which mathematics will play a crucial role in the future.

Organizing Committee

Li Deng (Microsoft Research)
Stanley Osher (IPAM)
Yingyong Qi (Qualcomm)
James Sneyd (University of Auckland)
Jack Xin, Chair (University of Texas, Austin)