Michael McAuliffe

Welcome to the STRAIGHT tutorial


STRAIGHT is a program that analyzes speech, decomposes it into source and filter characteristics, and then resynthesizes speech from those components after some manipulation or interpolation.

This tutorial is a practical guide to using STRAIGHT to create various kinds of synthetic speech. More detailed information and publications about STRAIGHT are available on the official website.

What is STRAIGHT good for?

STRAIGHT is best used for creating continua between two naturally produced endpoints. It morphs the two recordings holistically, rather than mixing only one part of the sound file, so aspects of coarticulation are better preserved. The sound files are aligned both in time and in frequency, so peaks in the spectra can be shifted around rather than just mixed linearly.
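To make the difference between shifting and mixing concrete, here is a minimal sketch in Python. It is a toy illustration, not STRAIGHT's actual algorithm: the Gaussian "formant" envelopes, the single frequency anchor, and every function name below are assumptions made up for this example. Two envelopes have one peak each, at 500 Hz and 1500 Hz. A 50% linear mix keeps both peaks at half height, while aligning the peaks in frequency before mixing gives a single peak halfway between them.

```python
import math

F_MAX = 4000.0  # upper edge of the toy frequency axis, in Hz (assumed)

def peak_envelope(f, peak_hz, width_hz=100.0):
    """Toy spectral envelope: one Gaussian peak standing in for a formant."""
    return math.exp(-0.5 * ((f - peak_hz) / width_hz) ** 2)

def warp(f, src_anchor, dst_anchor, f_max=F_MAX):
    """Piecewise-linear map sending 0 -> 0, src_anchor -> dst_anchor, f_max -> f_max."""
    if f <= src_anchor:
        return f * dst_anchor / src_anchor
    return dst_anchor + (f - src_anchor) * (f_max - dst_anchor) / (f_max - src_anchor)

def linear_mix(f, peak_a, peak_b, rate):
    """Plain amplitude mix of the two envelopes at the same frequency."""
    return (1 - rate) * peak_envelope(f, peak_a) + rate * peak_envelope(f, peak_b)

def warped_morph(f, peak_a, peak_b, rate):
    """Align the two peaks in frequency first, then mix the aligned envelopes."""
    target = (1 - rate) * peak_a + rate * peak_b   # where the morphed peak should sit
    f_in_a = warp(f, target, peak_a)               # matching frequency in endpoint A
    f_in_b = warp(f, target, peak_b)               # matching frequency in endpoint B
    return ((1 - rate) * peak_envelope(f_in_a, peak_a)
            + rate * peak_envelope(f_in_b, peak_b))

def local_peaks(values, freqs, floor=0.1):
    """Frequencies of local maxima above a small amplitude floor."""
    return [freqs[i] for i in range(1, len(values) - 1)
            if values[i] > values[i - 1]
            and values[i] >= values[i + 1]
            and values[i] > floor]

freqs = [10.0 * i for i in range(401)]  # 0 to 4000 Hz in 10 Hz steps
mixed = [linear_mix(f, 500.0, 1500.0, 0.5) for f in freqs]
morphed = [warped_morph(f, 500.0, 1500.0, 0.5) for f in freqs]

mixed_peaks = local_peaks(mixed, freqs)      # two peaks, at the original positions
morphed_peaks = local_peaks(morphed, freqs)  # one peak, halfway between
print("linear mix peaks:", mixed_peaks)
print("warped morph peaks:", morphed_peaks)
```

The linear mix sounds like two voices overlaid, whereas the warped version behaves like one voice with an intermediate resonance. In a real morph the alignment covers many points in both time and frequency, not one hand-picked peak.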

What is STRAIGHT not good for?

  • Parametric synthesis
    • Can’t specify formant values
    • Can’t specify manipulation values precisely (click-and-drag interface)
      • At least in the GUI (the Matlab source gives greater control, but is less user-friendly)
  • Articulatory synthesis
  • Transient acoustic event synthesis
    • Waveform gets averaged out, so stop bursts come out weaker
  • Non-speech synthesis
    • Explicitly assumes source-filter approach

Structure of this tutorial