Get the MELI corpus

What's in the download?

The download comes with:

A README.md file
Tabular metadata including language background
Out-of-vocabulary (OOV) files used in the forced alignment process
A copy of this documentation
2 languages x 51 talkers = 102 .wav files (stereo, 16-bit, 44.1 kHz)
2 languages x 51 talkers = 102 .TextGrid files (each with 4 tiers: task, utterance-corrected, word, phone)
1 map x 51 talkers = 51 .png files from the draw-a-map task
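The file counts above can be used as a quick sanity check on a download. The following is a minimal sketch, not part of the corpus tooling. It assumes that files can be identified by extension alone, and the helper names (`count_by_extension`, `count_differences`) are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Expected counts taken from the list above (extensions compared in lowercase).
EXPECTED = {".wav": 102, ".textgrid": 102, ".png": 51}

def count_by_extension(names):
    """Count file names per expected extension, ignoring case."""
    counts = Counter(Path(name).suffix.lower() for name in names)
    return {ext: counts.get(ext, 0) for ext in EXPECTED}

def count_differences(names):
    """Return {extension: found - expected} for every extension that is off."""
    counts = count_by_extension(names)
    return {ext: counts[ext] - n for ext, n in EXPECTED.items() if counts[ext] != n}
```

To check a folder on disk, collect the names with something like `[p.name for p in Path("meli_download").rglob("*") if p.is_file()]`, where `meli_download` is a placeholder for wherever the corpus was unpacked. An empty result from `count_differences` means the counts match. Extra .png files may be harmless if the bundled documentation includes its own images.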

The .wav and .TextGrid files follow a consistent naming format. For example, the name F01A_man breaks down as follows:

Text    What it means    Other categories
F       Female           M = Male
01      Year of birth    varies, (19)89-(20)05
man     Language         eng

Unique participant IDs consist of the first four characters of the file name. In this example, that would be F01A. This ID is used in the language background summary.
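The naming convention can be unpacked programmatically. The sketch below is illustrative only. The function and class names are hypothetical, and it assumes the fourth character of the ID is a single letter or digit, whose meaning is not described here. It also assumes that two-digit years of 89 and above belong to the 1900s and all others to the 2000s, in line with the stated 1989-2005 range.

```python
import re
from typing import NamedTuple

# Matches stems such as "F01A_man": gender, two-digit birth year,
# one further ID character (assumed alphanumeric), underscore, language code.
NAME_RE = re.compile(r"^([FM])(\d{2})([A-Za-z0-9])_(man|eng)$")

GENDERS = {"F": "Female", "M": "Male"}

class MeliName(NamedTuple):
    participant_id: str
    gender: str
    birth_year: int
    language: str

def parse_meli_name(stem: str) -> MeliName:
    """Split a file stem (name without extension) into its components."""
    match = NAME_RE.match(stem)
    if match is None:
        raise ValueError(f"not a recognised MELI file stem: {stem!r}")
    gender_code, yy, _id_char, language = match.groups()
    # Assumption: 89-99 means 1989-1999, anything lower means the 2000s.
    year = int(yy)
    birth_year = 1900 + year if year >= 89 else 2000 + year
    # The participant ID is the first four characters of the stem.
    return MeliName(stem[:4], GENDERS[gender_code], birth_year, language)
```

For a file on disk, pass the stem, for example `parse_meli_name(Path("F01A_man.wav").stem)` after `from pathlib import Path`. The resulting `participant_id` is the key to look up in the language background summary.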

Ethics

The corpus was developed in accordance with the University of British Columbia Behavioural Research Ethics Board (H23-03205).

Funding

MELI was funded by Arts Graduate Research Awards to Suyuan Liu and by a Social Sciences and Humanities Research Council of Canada (SSHRC) Insight Grant to Molly Babel.