Projects

Montreal Forced Aligner

Montreal Forced Aligner is a command-line utility for performing forced alignment on audio datasets using orthographic transcriptions and a pronunciation dictionary. It can be trained on larger datasets and can align smaller datasets using pretrained models. It is built using Kaldi.

Integrated Speech Corpus Analysis (ISCAN)

ISCAN is a web application for managing corpora and performing large-scale analyses through PolyglotDB. It includes both a REST API and a front-end application that lets non-technical users work with PolyglotDB.

Speech Corpus Tools

Speech Corpus Tools is a graphical application for interacting with, querying, and visualizing large speech corpora. It parses a wide range of formats into a database, which allows for fast and consistent queries across corpora from different sources.

Phonological CorpusTools

Phonological CorpusTools provides Python implementations of algorithms reported in the linguistic literature, along with the ability to run these algorithms on a wide variety of corpora. The primary contributors to this project are Kathleen Currie Hall, Blake Allen, Michael Fry, Scott Mackie, and myself.

Omnic Intelligence

Omnic Intelligence is a web application for automatically and manually annotating events in professional Overwatch matches on Twitch and YouTube. Automatic annotation is done through deep neural network models.

PolyglotDB

PolyglotDB is the package responsible for the storage and database aspects of Speech Corpus Tools.

python-acoustic-similarity

Python-acoustic-similarity represents most of my work in signal processing for creating MFCC, amplitude envelope, and gammatone representations of speech. Future versions will also include algorithms to calculate linguistically relevant measurements such as pitch and formants.

python-praat-scripts

Python package for calling Praat scripts, available here.

python-BLICK

Python port of Bruce Hayes’ BLICK for calculating phonotactic probability in English, available here.