Get the MELI corpus

What's in the download?

The download comes with:

  • A README.md file
  • Tabular metadata including language background
  • Out-of-vocabulary (OOV) files used in the forced alignment process
  • A copy of this documentation
  • 2 languages x 51 talkers = 102 .wav files (stereo, 16-bit, 44.1 kHz)
  • 2 languages x 51 talkers = 102 .TextGrid files (each with 4 tiers: task, utterance-corrected, word, phone)
  • 1 map x 51 talkers = 51 .png files from the draw-a-map task
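
One way to sanity-check an unpacked download against the counts listed above is sketched below. It is only an illustration: the function names are hypothetical, the folder layout is not assumed (files are searched recursively), and the bundled copy of the documentation may contain images of its own, so a .png mismatch does not necessarily mean that a map is missing.

```python
from collections import Counter
from pathlib import Path

# Expected counts taken from the list above; extensions are compared in lower case.
EXPECTED_COUNTS = {".wav": 102, ".textgrid": 102, ".png": 51}


def count_by_extension(root):
    """Count files under root (recursively), keyed by lower-cased extension."""
    return Counter(p.suffix.lower() for p in Path(root).rglob("*") if p.is_file())


def find_mismatches(root, expected=EXPECTED_COUNTS):
    """Return {extension: (expected, found)} for every count that differs."""
    found = count_by_extension(root)
    return {
        ext: (n, found.get(ext, 0))
        for ext, n in expected.items()
        if found.get(ext, 0) != n
    }
```

Calling find_mismatches on the folder where the corpus was unpacked returns an empty dictionary when all three counts match, and otherwise lists the expected and found number for each extension that differs.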

The .wav and .TextGrid files follow a consistent naming format. For example, the name F01A_man breaks down as follows:

Text    What it means    Other categories
F       Female           M = Male
01      Year of birth    varies (19)89-(20)05
man     Language         eng


Unique participant IDs are made up of the first four characters. In this example, that would be F01A. This ID is used in the language background summary.
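
The naming scheme can be unpacked programmatically. The following is a minimal sketch, not part of the corpus tooling: the names parse_meli_name and MeliName are hypothetical, the fourth character of the ID is assumed to be a single letter or digit and is kept as an opaque part of the ID, and the century of the birth year is inferred from the stated 1989-2005 range.

```python
import re
from dataclasses import dataclass

GENDER_CODES = {"F": "Female", "M": "Male"}

# Gender letter, two-digit birth year, one further ID character (assumed to be
# a single letter or digit), an underscore, then the language code.
NAME_PATTERN = re.compile(r"^([FM])(\d{2})([A-Za-z0-9])_(man|eng)$")


@dataclass
class MeliName:
    participant_id: str  # first four characters, e.g. "F01A"
    gender: str
    birth_year: int
    language: str  # "man" or "eng"


def parse_meli_name(stem):
    """Split a file name without its extension, e.g. 'F01A_man', into fields."""
    match = NAME_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"unexpected file name: {stem!r}")
    gender_code, yy, _id_suffix, language = match.groups()
    year = int(yy)
    # Birth years span 1989-2005, so 89-99 are read as 19xx and the rest as 20xx.
    birth_year = 1900 + year if year >= 89 else 2000 + year
    return MeliName(stem[:4], GENDER_CODES[gender_code], birth_year, language)
```

For the example above, parse_meli_name("F01A_man") gives the participant ID F01A, the gender Female, the birth year 2001, and the language code man. The participant ID can then be used to look up the talker in the language background summary.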

Ethics

The corpus was developed in accordance with the University of British Columbia Behavioural Research Ethics Board (H23-03205).

Funding

MELI was funded by Arts Graduate Research Awards to Suyuan Liu and by a Social Sciences and Humanities Research Council of Canada (SSHRC) Insight Grant to Molly Babel.