Michael McAuliffe

Applications

Speech Corpus Tools

Speech Corpus Tools is a graphical application for interacting with, querying, and visualizing large speech corpora. It parses a wide range of formats into a database, which allows for fast and consistent queries across corpora from different sources.

References
  1. McAuliffe, M., Stengel-Eskin, E., Socolof, M., & Sonderegger, M. (submitted). Polyglot and Speech Corpus Tools: a system for representing, integrating, and querying speech corpora. In Interspeech 2017.
  2. McAuliffe, M., & Sonderegger, M. (2016). Easier speech corpus analysis: A practical introduction to Montreal Corpus Tools (including Speech Corpus Tools). Glasgow, UK: Scottish Graduate School of Social Science; University of Glasgow.
  3. McAuliffe, M., Stengel-Eskin, E., Socolof, M., & Sonderegger, M. (2016). Speech Corpus Tools. [http://montrealcorpustools.github.io/speechcorpustools/]

Montreal Forced Aligner

Montreal Forced Aligner is a command-line utility for performing forced alignment on audio datasets using orthographic transcriptions and a pronunciation dictionary. It can be trained on larger datasets and can align smaller datasets using pretrained models. It is built using Kaldi.

References
  1. McAuliffe, M., Socolof, M., Mihuc, S., Wagner, M., & Sonderegger, M. (submitted). Montreal Forced Aligner: trainable text-speech alignment using Kaldi. In Interspeech 2017.
  2. McAuliffe, M., Socolof, M., Mihuc, S., & Wagner, M. (2016). Montreal Forced Aligner. [https://montrealcorpustools.github.io/Montreal-Forced-Aligner/]

Phonological CorpusTools

Phonological CorpusTools provides Python implementations of algorithms reported in the linguistic literature, along with the ability to run these algorithms on a wide variety of corpora. The primary contributors to this project are Kathleen Currie Hall, Blake Allen, Michael Fry, Scott Mackie, and me.

References
  1. Hall, K. C., Allen, B., Fry, M., Mackie, S., & McAuliffe, M. (2015). Phonological CorpusTools. [https://github.com/PhonologicalCorpusTools/CorpusTools/releases]
  2. McAuliffe, M. (2015). Statistical phonological analysis in corpora using Phonological Corpus Tools. Montreal, CA.
  3. Hall, K. C., Allen, B., Fry, M., Mackie, S., & McAuliffe, M. (2014). Phonological CorpusTools: A free, open-source tool for phonological analysis. In 14th Conference for Laboratory Phonology. Tokyo, Japan.

Python packages

PolyglotDB

PolyglotDB is the package responsible for the storage and database aspects of Speech Corpus Tools.

python-acoustic-similarity

python-acoustic-similarity represents most of my work in signal processing for creating MFCC, amplitude envelope, and gammatone representations of speech. Future versions will also include algorithms to calculate linguistically relevant measurements such as pitch and formants.
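
To give a rough sense of what an amplitude envelope representation involves, here is a minimal sketch that rectifies a signal and smooths it with a moving average. It is not the package's API or its actual algorithm: the function name `amplitude_envelope`, the `window` parameter, and the synthetic test signal are all assumptions made for illustration, and a real implementation would more likely use band-pass filtering and a proper low-pass filter.

```python
import math


def amplitude_envelope(samples, window):
    """Full-wave rectify the signal, then smooth it with a trailing moving average.

    Hypothetical helper for illustration only; not part of python-acoustic-similarity.
    """
    if window < 1:
        raise ValueError("window must be at least 1 sample")
    rectified = [abs(s) for s in samples]
    envelope = []
    running = 0.0
    for i, value in enumerate(rectified):
        running += value
        if i >= window:
            # Drop the sample that just left the trailing window.
            running -= rectified[i - window]
        # Divide by the number of samples currently in the window.
        envelope.append(running / min(i + 1, window))
    return envelope


# Assumed test signal: one second of a 100 Hz sine at 8 kHz with linearly rising amplitude.
sample_rate = 8000
signal = [
    (n / sample_rate) * math.sin(2 * math.pi * 100 * n / sample_rate)
    for n in range(sample_rate)
]

# A 160-sample window is 20 ms at 8 kHz, which spans two full periods of the sine.
envelope = amplitude_envelope(signal, window=160)
```

The envelope has one value per input sample and tracks the slow rise in loudness rather than the individual oscillations, which is the property that makes it useful as a coarse representation of speech.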

python-praat-scripts

Python package for calling Praat scripts, available here.

python-BLICK

Python port of Bruce Hayes’ BLICK for calculating phonotactic probability in English, available here.