Montreal Forced Aligner

Montreal Forced Aligner is a command-line utility for performing forced alignment on audio datasets using orthographic transcriptions and a pronunciation dictionary. It can be trained on larger datasets and can align smaller datasets using pretrained models. It is built on Kaldi.

Integrated Speech Corpus Analysis (ISCAN)

ISCAN is a web application for managing corpora and performing large-scale analyses through PolyglotDB. It includes both a REST API and a front-end application that lets non-technical users work with PolyglotDB.

Speech Corpus Tools

Speech Corpus Tools is a graphical application for interacting with, querying, and visualizing large speech corpora. It parses a wide range of formats into a database, which allows for fast and consistent queries across corpora from different sources.

Phonological CorpusTools

Phonological CorpusTools provides Python implementations of algorithms reported in the linguistic literature, with the ability to run these algorithms on a wide variety of corpora. The primary contributors to this project are Kathleen Currie Hall, Blake Allen, Michael Fry, Scott Mackie, and me.

Omnic Intelligence

Omnic Intelligence is a web application for automatically and manually annotating events in professional Overwatch matches on Twitch and YouTube. Automatic annotation is done through deep neural network models.


PolyglotDB is the package responsible for the storage and database aspects of Speech Corpus Tools.


Python-acoustic-similarity represents most of my work in signal processing for creating MFCC, amplitude envelope, and gammatone representations of speech. Future versions will also include algorithms to calculate linguistically relevant measurements such as pitch and formants.


Python package for calling Praat scripts, available here.


Python port of Bruce Hayes’ BLICK for calculating phonotactic probability in English, available here.