Application of signal processing techniques - preliminary results

Examples (sound-files) | Publications | Nephthys project - by Philip Jackson

Aspiration Models for Signal Synthesis

A variety of techniques has been used to extract the periodic and noise components of speech signals.

The models for signal synthesis are typical of simple examples in the literature (Klatt DH, Review of text-to-speech conversion for English, Journal of the Acoustical Society of America, Vol.82 (3), pp.737-793, 1987).

Synthesis Schema

Basic Inverse Filters

The basic inverse filters used are popular signal processing techniques:
  1. Comb filter (Shields VC Jr, Separation of added speech signals by digital comb filtering, S. M. Thesis, Department of Electrical Engineering, Massachusetts Institute of Technology, 1970)
  2. Wiener filter (Deller JR, Proakis JG, and Hansen JHL, Discrete-time Processing of Speech Signals, ed. Griffin J, Macmillan Publishing Co., NJ, USA, 1993)
  3. Pitch-synchronous filter (Muta H, Baer T, Wagatsuma K, Muraoka T, and Fukuda H, A pitch-synchronous analysis of hoarseness in running speech, Journal of the Acoustical Society of America, Vol.84 (4), pp.1292-1301, 1988)
  4. Wavelet thresholding (Donoho DL, Nonlinear wavelet methods for recovery of signals, densities, and spectra from indirect and noisy data, Proceedings of Symposia in Applied Mathematics, Vol.47, 1993)

The following figures are schematic representations of these filters.

Filter Schema
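As a rough illustration of the first technique in the list, the sketch below splits a signal with a two-tap comb. It is a minimal example under stated assumptions, not the filter used to produce the results on this page: the pitch period is assumed to be known, constant, and a whole number of samples, and the name comb_split and the test signal are hypothetical.

```python
import math
import random


def comb_split(x, period):
    """Split x into periodic and aperiodic estimates using a two-tap comb.

    `period` is the pitch period in samples (assumed known and constant).
    """
    periodic, aperiodic = [], []
    for n, sample in enumerate(x):
        # Before one full period has elapsed there is no delayed sample,
        # so the current sample is reused and the aperiodic estimate is zero.
        delayed = x[n - period] if n >= period else sample
        periodic.append(0.5 * (sample + delayed))   # reinforces harmonics of 1/period
        aperiodic.append(0.5 * (sample - delayed))  # cancels harmonics of 1/period
    return periodic, aperiodic


# Hypothetical test signal: a sinusoid with an 8-sample period plus white noise.
random.seed(0)
period = 8
clean = [math.sin(2 * math.pi * n / period) for n in range(64)]
noisy = [c + random.gauss(0.0, 0.1) for c in clean]
periodic_est, aperiodic_est = comb_split(noisy, period)
```

The two outputs always sum to the input, so no signal energy is discarded; a perfectly periodic input yields an aperiodic estimate of zero.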

Basic Synthetic (Input) Signal

The figure below shows how signals were synthesised using the basic model.

Input Signal

Spectra of Pitch-synchronous Filter Input and Outputs

The figure below illustrates how the harmonic and noise power spectra are estimated from the input spectrum. The harmonic part (crosses) is taken from every fourth bin, since the windowed input signal contains four pitch periods, and the noise part is taken from the minimum of each group of four bins.

Pitch-synchronous Spectra
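The bin selection described above can be sketched as follows. This is a minimal illustration only. It assumes that each group of four starts at a harmonic bin and that the minimum is taken over the whole group, which the text does not specify. The function names are hypothetical, and a plain DFT is used to keep the example self-contained.

```python
import cmath
import math


def power_spectrum(x):
    """Power spectrum of x for bins 0 .. len(x)//2 - 1, via a direct DFT."""
    n = len(x)
    spectrum = []
    for k in range(n // 2):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def split_spectrum(power, periods=4):
    """Estimate harmonic and noise power from a pitch-synchronous spectrum.

    With `periods` pitch periods in the window, harmonics fall on every
    `periods`-th bin. The noise estimate is the minimum of each group of
    `periods` bins (grouping is an assumption, see the lead-in).
    """
    harmonic = power[::periods]
    noise = [min(power[i:i + periods]) for i in range(0, len(power), periods)]
    return harmonic, noise


# Hypothetical input: exactly four periods of an 8-sample sinusoid.
window = [math.sin(2 * math.pi * t / 8) for t in range(32)]
harmonic, noise = split_spectrum(power_spectrum(window))
```

For this noise-free input the fundamental lands on bin 4, so the second harmonic estimate carries all the power and every noise estimate is close to zero.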

Preliminary Results

The figure below shows some preliminary results, which correspond to the following improvements in signal-to-noise ratio (SNR):
  1. Comb filter: +19.9 dB (periodic), +33.3 dB (aperiodic)
  2. Wiener filter: +8.8 dB (periodic), +22.7 dB (aperiodic)
  3. Pitch-synchronous filter: +4.7 dB (periodic), +2.2 dB (aperiodic)
  4. Wavelet threshold filter: +5.9 dB (periodic), +19.6 dB (aperiodic)
Output Signals
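For synthetic signals the true components are known, so an SNR improvement of the kind listed above can be computed directly. The sketch below shows one common definition, assumed here because the page does not state the one actually used: the SNR of an estimate is the ratio of reference power to error power, and the improvement is the output SNR minus the SNR of the unprocessed mixture. The function names and sample values are hypothetical.

```python
import math


def snr_db(reference, estimate):
    """SNR in dB of `estimate`, treating its deviation from `reference` as noise."""
    signal = sum(r * r for r in reference)
    error = sum((e - r) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal / error)


def snr_improvement_db(reference, mixture, estimate):
    """Gain in SNR (dB) of the filter output over the unfiltered mixture."""
    return snr_db(reference, estimate) - snr_db(reference, mixture)


# Hypothetical values: the filter reduces the error amplitude tenfold.
reference = [1.0, -1.0, 1.0, -1.0]
mixture = [1.1, -0.9, 1.1, -0.9]
estimate = [1.01, -0.99, 1.01, -0.99]
gain = snr_improvement_db(reference, mixture, estimate)
```

The same function applies to either component: pass the true periodic part as the reference to score the periodic output, or the true noise part to score the aperiodic output.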

Preliminary Results (continued)

Wavelet Output
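The wavelet thresholding listed among the basic filters can be illustrated with a single-level Haar transform and soft thresholding of the detail coefficients. This is a simplified sketch under stated assumptions, not the configuration behind the results on this page: the choice of the Haar wavelet, the single decomposition level, the threshold value, and the function name are all hypothetical.

```python
import math


def haar_soft_threshold(x, threshold):
    """Denoise x by soft-thresholding single-level Haar detail coefficients.

    Returns (smooth, residual), where smooth + residual reproduces x.
    """
    if len(x) % 2:
        raise ValueError("input length must be even")
    root2 = math.sqrt(2.0)
    smooth = []
    for k in range(0, len(x), 2):
        approx = (x[k] + x[k + 1]) / root2  # local average (scaled)
        detail = (x[k] - x[k + 1]) / root2  # local difference (scaled)
        # Soft threshold: shrink the detail towards zero by `threshold`.
        shrunk = math.copysign(max(abs(detail) - threshold, 0.0), detail)
        # Inverse Haar step using the shrunken detail coefficient.
        smooth.append((approx + shrunk) / root2)
        smooth.append((approx - shrunk) / root2)
    residual = [xi - si for xi, si in zip(x, smooth)]
    return smooth, residual


# Hypothetical samples: the small difference in the second pair is removed.
smooth, residual = haar_soft_threshold([4.0, 4.0, 2.0, 2.2], threshold=0.5)
```

A threshold of zero leaves the signal unchanged, while larger thresholds move more of the fine detail into the residual.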

Examples of decomposed speech (wav-files)

Original speech | Periodic part | Aperiodic part

[ Projects | ISIS group | Dept. Electronics and Computer Science | University of Southampton ]

© maintained by Philip Jackson, last updated on 6 November 2002.