Analyzing a file takes three steps, which together separate a waveform into source and filter characteristics. The source and filter can then be manipulated separately and recombined in the final synthesis.
Analysis can be done either from the MorphingMenu program or from the TandemSTRAIGHTHandler program. I recommend analyzing all sound files in one big batch through the TandemSTRAIGHTHandler program, saving the results as .mat files, and then loading those .mat files as needed in the MorphingMenu to create continua.
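If you follow the batch workflow above, it helps to keep track of which sound files already have a saved analysis. The following is a minimal bookkeeping sketch in Python, not part of TandemSTRAIGHT. The function name `pending_analyses`, the directory names `sounds` and `analyses`, and the convention that each .mat file shares its base name with the .wav file are all assumptions made for illustration. The analysis itself is still done through the TandemSTRAIGHTHandler program.

```python
from pathlib import Path


def pending_analyses(wav_dir, mat_dir):
    """Return the .wav files in wav_dir that have no same-named .mat file in mat_dir."""
    wav_dir, mat_dir = Path(wav_dir), Path(mat_dir)
    pending = []
    for wav in sorted(wav_dir.glob("*.wav")):
        # Assumed naming convention: asi_initialstress.wav -> asi_initialstress.mat
        if not (mat_dir / (wav.stem + ".mat")).exists():
            pending.append(wav)
    return pending


if __name__ == "__main__":
    # "sounds" and "analyses" are hypothetical directory names.
    for wav in pending_analyses("sounds", "analyses"):
        print("still needs analysis:", wav.name)
```

Running this before a morphing session gives a quick list of files that still need to go through the three analysis steps.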
This step extracts the voiced source characteristics of the signal, for all voiced segments in a file. To begin extracting the F0 structure, press the “F0/F0 structure extraction” button to open the F0 extraction dialog.
The following screenshot is for the asi_initialstress.wav file. You’ll notice in the waveform at the bottom of the dialog that there are two vowel portions, a glottal stop at the beginning, and a sibilant between the vowels.
The first step is to press the “Calculate” button to generate an initial pitch track and voiced/unvoiced regions of the waveform.
This pitch track is generally pretty good, but clearly has some issues in silent and unvoiced sections. Likewise, there are a few errors in which frames are marked voiced versus unvoiced, namely missing voiced frames in the middle of vowels.
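To see why an initial pitch track can misbehave in silent and unvoiced stretches, here is a minimal sketch of a frame-level F0 estimate. It uses a plain autocorrelation peak, which is a textbook method and not the algorithm TandemSTRAIGHT uses. The function name `estimate_f0`, the search range of 60 to 400 Hz, and the voicing threshold of 0.5 are all assumptions chosen for illustration.

```python
import math
import random


def estimate_f0(frame, fs, f0_min=60.0, f0_max=400.0, threshold=0.5):
    """Return (f0_hz, voiced) for one frame using a plain autocorrelation peak."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0, False  # a silent frame has no periodicity to measure
    min_lag = int(fs / f0_max)
    max_lag = min(int(fs / f0_min), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        score = r / energy  # normalized so a perfectly periodic frame approaches 1
        if score > best_score:
            best_lag, best_score = lag, score
    if best_score < threshold:
        return 0.0, False  # weak periodicity: treat the frame as unvoiced
    return fs / best_lag, True


fs = 8000
n = 320  # one 40 ms frame
# Synthetic vowel-like frame: three harmonics of 125 Hz.
vowel_like = [
    sum(math.sin(2 * math.pi * k * 125.0 * i / fs) / k for k in (1, 2, 3))
    for i in range(n)
]
# Synthetic sibilant-like frame: uniform white noise.
rng = random.Random(0)
noise_like = [rng.uniform(-1.0, 1.0) for _ in range(n)]

print(estimate_f0(vowel_like, fs))
print(estimate_f0(noise_like, fs))
```

The voiced/unvoiced decision here hinges on a single threshold per frame, so a frame with weak periodicity in the middle of a vowel can drop out, and a noisy frame can occasionally pass. That is the kind of error the later clean-up steps address.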
The next step is to apply “Auto tracking” to clean up the pitch track.
The pitch contour is now much cleaner, and the missing voiced frames in the middle of vowels have been restored. However, extra voiced frames are now present in the middle of the sibilant, which should be corrected manually.
To change voicing regions, click the edge of a voiced region and drag it. To delete the erroneous regions, I find it easiest to drag one of the voiced regions from the vowel over them, and then drag it back to where it should be. You can also click the right edge of a voiced interval and drag it all the way to its left edge, which removes the interval. Refer to the F0 extraction interface page for more information about interacting with the dialog.
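The edits described above can be pictured as operations on a list of voiced intervals. The sketch below is only a model of that behavior in Python, not TandemSTRAIGHT's internal representation. The function names `flags_to_regions` and `drag_right_edge`, and the toy frame sequence, are made up for illustration.

```python
def flags_to_regions(flags):
    """Convert per-frame voicing flags into (start, end) frame pairs, end exclusive."""
    regions, start = [], None
    for i, voiced in enumerate(flags):
        if voiced and start is None:
            start = i  # a voiced region opens here
        elif not voiced and start is not None:
            regions.append((start, i))  # the region closes at the first unvoiced frame
            start = None
    if start is not None:
        regions.append((start, len(flags)))
    return regions


def drag_right_edge(regions, index, new_end):
    """Move the right edge of one region; collapsing it onto its left edge deletes it."""
    start, _ = regions[index]
    edited = list(regions)
    if new_end <= start:
        del edited[index]
    else:
        edited[index] = (start, new_end)
    return edited


# Toy utterance: vowel, sibilant with two spurious voiced frames, vowel.
flags = [0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0]
regions = flags_to_regions(flags)
print(regions)
# Drag the right edge of the spurious region back to its left edge to remove it.
print(drag_right_edge(regions, 1, regions[1][0]))
```

After the second call only the two vowel regions remain, which is the state you want before pressing “Finish/upload”.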
Once you’re happy with the voiced/unvoiced parts of the file, press the “Finish/upload” button to send what you’ve done back to the main analysis window.
This step extracts and analyzes the aperiodic portions of the signal. Aperiodic portions are usually obstruents, but this step can also (unintentionally) model noise in the recording.
This step is simple from the user’s point of view: no extra dialog opens and no extra input is needed.
This step extracts the filter characteristics of the signal. The result is similar to an LPC spectrum, or to a power spectrum with source information such as harmonics and noise sources removed.
As with the aperiodicity extraction, no extra dialog is opened when calculating the spectrum.
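To make the LPC comparison concrete, here is a minimal sketch of an LPC spectral envelope computed with the autocorrelation method and the Levinson-Durbin recursion. This is the textbook LPC procedure, shown only as a point of reference; it is not how TandemSTRAIGHT computes its spectrum. The function names and the toy two-pole filter (`true_a`) are assumptions made for illustration.

```python
import cmath
import math


def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns ([1, a1, ..., ap], residual energy)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lpc_envelope(a, err, n_points=256):
    """Magnitude of sqrt(err) / A(e^{jw}) at n_points frequencies from 0 up to pi."""
    env = []
    for m in range(n_points):
        w = math.pi * m / n_points
        A = sum(c * cmath.exp(-1j * w * idx) for idx, c in enumerate(a))
        env.append(math.sqrt(err) / abs(A))
    return env


# Toy "vocal tract": impulse response of a stable two-pole filter (one resonance).
true_a = [1.0, -1.3, 0.8]
h = []
for n in range(400):
    y = 1.0 if n == 0 else 0.0
    if n >= 1:
        y -= true_a[1] * h[n - 1]
    if n >= 2:
        y -= true_a[2] * h[n - 2]
    h.append(y)

a, err = levinson_durbin(autocorrelation(h, 2), 2)
envelope = lpc_envelope(a, err)
peak_bin = max(range(len(envelope)), key=envelope.__getitem__)
print([round(c, 3) for c in a], peak_bin)
```

The envelope is a smooth curve with a single peak at the filter's resonance and no harmonic ripple, which is the sense in which the extracted spectrum lacks source information.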
Once a waveform has been fully analyzed, it is ready to be manipulated and resynthesized or morphed with another analyzed sound file to create a continuum.
The most common use case is to create a continuum, which is covered in the morphing walkthrough.