Michael McAuliffe

Synthesizing a voicing continuum

This page focuses on making a continuum between “task” and “dask”. Below you’ll find the wav files used, as well as a prevoiced version of “dask” in case you’d like to play around with it.

You’ll see below the temporal anchors for the morphing substrate. The anchors concentrate on the region where the burst and aspiration occur, so they have been placed at four points: before the burst, after the burst, when voicing begins, and when the formant transitions are complete.

[Figure: temporal anchors for the morphing substrate]
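To make the role of the anchors more concrete, here is a minimal Python sketch of how paired anchor times from the two endpoint recordings could be blended for an intermediate step. This is only an illustration of the idea, not how STRAIGHT itself implements temporal morphing. The function name `interpolate_anchors` and all anchor times are hypothetical; the times are not measured from the actual “task” and “dask” files.

```python
def interpolate_anchors(anchors_a, anchors_b, rate):
    """Linearly blend paired anchor times (in seconds).

    rate = 0.0 returns the anchors of sound A, rate = 1.0 those of sound B.
    """
    if len(anchors_a) != len(anchors_b):
        raise ValueError("both sounds need the same number of anchors")
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must lie in [0, 1]")
    return [(1.0 - rate) * a + rate * b for a, b in zip(anchors_a, anchors_b)]


# Hypothetical anchor times in seconds (illustrative values only):
# before burst, after burst, voicing onset, end of formant transitions.
task_anchors = [0.100, 0.110, 0.180, 0.220]
dask_anchors = [0.100, 0.108, 0.120, 0.160]

# Anchor times for a step halfway between the two endpoints.
midpoint = interpolate_anchors(task_anchors, dask_anchors, 0.5)

# Interval between the "after burst" and "voicing onset" anchors, in ms.
aspiration_ms = (midpoint[2] - midpoint[1]) * 1000.0
```

With anchors like these, the interval between the burst and the onset of voicing shrinks steadily as the rate moves from the “task” end to the “dask” end, which is the dimension the continuum is meant to vary.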

It could be argued that, rather than placing a boundary where voicing begins, that boundary should be placed relative to the formant transitions. In “dask”, that aligns with voicing, but in “task” the formant transitions begin during the aspiration. This would result in a continuum centered on the source going from aperiodic to periodic, rather than on the duration of the aperiodic aspiration. I’m going to leave this as a task for you if you’re interested in comparing the resulting continua. It should be less natural, because the amplitude of prevoicing would weaken as the VOT gets shorter, which doesn’t make sense from an articulatory standpoint. A potentially better way would be to create two continua next to one another, one from aspirated to unaspirated and then one from unaspirated to prevoiced.

Below you’ll find the full morphed continuum from aspirated “task” to unaspirated “dask”.

Step  File               Step  File               Step  File
1     task-dask001.wav   2     task-dask002.wav   3     task-dask003.wav
4     task-dask004.wav   5     task-dask005.wav   6     task-dask006.wav
7     task-dask007.wav   8     task-dask008.wav   9     task-dask009.wav
10    task-dask010.wav   11    task-dask011.wav
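As a small convenience, the step numbers and file names in the table can be generated programmatically. The Python sketch below also attaches a morphing rate to each step, assuming the 11 steps are evenly spaced between the two endpoints. That spacing is an assumption for illustration, since the page does not state the rates that were used. The helper name `continuum_steps` is hypothetical.

```python
def continuum_steps(n_steps=11, prefix="task-dask"):
    """Return (step, morph_rate, filename) for each step of a continuum.

    Assumes evenly spaced morphing rates: 0.0 at step 1, 1.0 at the last step.
    """
    if n_steps < 2:
        raise ValueError("a continuum needs at least two steps")
    steps = []
    for step in range(1, n_steps + 1):
        rate = (step - 1) / (n_steps - 1)
        steps.append((step, rate, f"{prefix}{step:03d}.wav"))
    return steps


# List every step with its assumed morphing rate and file name.
for step, rate, filename in continuum_steps():
    print(f"{step:2d}  {rate:.1f}  {filename}")
```

A list like this is handy when building a stimulus list for a perception experiment, where each trial needs both the file to play and the step it represents.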

One thing to note is that, due to the way STRAIGHT does its synthesis, quick transients like stop bursts can be weakened compared to their original form. If we look at the natural stop burst for “task”, we see that it is higher in amplitude than the following aspiration.

[Figure: the natural stop burst and following aspiration in “task”]

However, the same cannot be said for the synthesized version, where the burst amplitude has been reduced and is comparable to the aspiration amplitude.

[Figure: the stop burst and following aspiration in the synthesized version]

In general, while STRAIGHT excels at modelling continuous sources like voicing and turbulence, it’s not as good at modelling transient sources like bursts.
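The burst weakening described above can be put in numbers by comparing the RMS level of the burst with that of the aspiration. The Python sketch below shows one way to do this, assuming you already have the samples as a list of floats and know the burst and aspiration intervals in seconds. The function names and the synthetic signal are hypothetical and stand in for real measurements from the natural and synthesized files.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    if not samples:
        raise ValueError("empty segment")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def burst_to_aspiration_db(samples, sample_rate, burst, aspiration):
    """Level of the burst relative to the aspiration, in dB.

    burst and aspiration are (start, end) intervals in seconds.
    A positive result means the burst is louder than the aspiration.
    """
    def segment(interval):
        start, end = interval
        first = int(round(start * sample_rate))
        last = int(round(end * sample_rate))
        return samples[first:last]

    burst_rms = rms(segment(burst))
    aspiration_rms = rms(segment(aspiration))
    if burst_rms == 0.0 or aspiration_rms == 0.0:
        raise ValueError("cannot compare against a silent segment")
    return 20.0 * math.log10(burst_rms / aspiration_rms)


# Synthetic stand-in signal: a 10 ms "burst" at twice the amplitude
# of the 50 ms "aspiration" that follows it.
sample_rate = 16000
signal = [0.5 * (-1) ** i for i in range(160)]
signal += [0.25 * (-1) ** i for i in range(800)]

difference_db = burst_to_aspiration_db(
    signal, sample_rate, burst=(0.0, 0.01), aspiration=(0.01, 0.06)
)
```

Running the same measurement on the natural “task” and on the first step of the continuum, with samples read using something like Python’s standard `wave` module, would be expected to give a clearly positive value for the natural token and a value closer to zero for the synthesized one.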