CORPORA from CSLU: Isolet.
OHSU # 0681-F
Inventors:
- CSLU, SOM CSLU
Overview
ISOLET is a database of letters of the English alphabet spoken in isolation.
The database consists of 7800 spoken letters: two productions of each letter by
150 speakers. It contains approximately 1.25 hours of speech. The recordings
were made under quiet laboratory conditions with a noise-canceling microphone.
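The corpus size quoted above follows directly from the stated counts. A minimal Python sketch of that arithmetic is shown here; the variable names are illustrative only, and the mean duration assumes the 1.25-hour figure counts spoken material rather than the full recording buffers.

```python
import string

LETTERS = string.ascii_uppercase      # the 26 letters of the English alphabet
SPEAKERS = 150                        # from the overview
PRODUCTIONS_PER_LETTER = 2            # each speaker says every letter twice

# Utterances contributed by one speaker, and by the whole corpus.
utterances_per_speaker = len(LETTERS) * PRODUCTIONS_PER_LETTER
total_utterances = utterances_per_speaker * SPEAKERS

# Assumption: the 1.25 hours is speech only, so dividing by the utterance
# count gives a rough mean duration per spoken letter.
TOTAL_SPEECH_HOURS = 1.25
mean_duration_s = TOTAL_SPEECH_HOURS * 3600 / total_utterances

print(utterances_per_speaker)         # 52
print(total_utterances)               # 7800
print(round(mean_duration_s, 2))      # 0.58
```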
Recording Conditions
Speech was recorded in the OGI speech recognition laboratory. The room is 15'
by 15' with a tile floor and standard office wall board and drop ceiling. There
are two Sun workstations in the room, and three disk drives.
The recording equipment was selected to mimic the equipment used to collect the
TIMIT database as closely as possible. The speech was recorded with a
Sennheiser HMD 224 noise-canceling microphone, lowpass filtered at 7.6 kHz.
Data capture was performed using the AT&T DSP32 board installed in a Sun 4/110.
The data were sampled at 16 kHz.
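The filter cutoff and sampling rate above can be checked against each other. The short Python sketch below shows that the 7.6 kHz lowpass cutoff sits just under the Nyquist frequency implied by 16 kHz sampling; the constant names are illustrative and not part of any corpus tooling.

```python
SAMPLE_RATE_HZ = 16_000       # sampling rate stated for the corpus
LOWPASS_CUTOFF_HZ = 7_600     # lowpass filter cutoff stated for the corpus

# The highest frequency a sampled signal can represent is half the sample rate.
nyquist_hz = SAMPLE_RATE_HZ / 2

# Margin between the filter cutoff and the Nyquist frequency.
guard_band_hz = nyquist_hz - LOWPASS_CUTOFF_HZ

print(nyquist_hz)                        # 8000.0
print(guard_band_hz)                     # 400.0
print(LOWPASS_CUTOFF_HZ < nyquist_hz)    # True
```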
The subjects were seated in front of a Sun workstation and prompted with
letters in random order. After each prompt, the subject would strike the return
key and say the letter. Two seconds of speech were recorded and immediately
played back for verification. If the subject spoke too soon or too late and
missed the two-second buffer, or if the experimenter or subject decided the
letter was misspoken, the recording would be repeated. There was no attempt to
elicit ideal speech. A letter was judged misspoken only if there was a
significant departure from normal pronunciation.
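The prompting procedure above can be sketched as a simple loop. This is not the original collection software: `prompt_order`, `run_session`, and the `record` and `accept` callbacks are hypothetical names, and shuffling both productions of every letter together is an assumption, since the source only says the letters were prompted in random order.

```python
import random
import string

SAMPLE_RATE_HZ = 16_000
BUFFER_SECONDS = 2
# Samples captured per prompt: two seconds at 16 kHz.
BUFFER_SAMPLES = SAMPLE_RATE_HZ * BUFFER_SECONDS  # 32000

def prompt_order(productions=2, seed=None):
    """Return every letter `productions` times, in random order (assumed scheme)."""
    rng = random.Random(seed)
    prompts = list(string.ascii_uppercase) * productions
    rng.shuffle(prompts)
    return prompts

def run_session(record, accept, productions=2, seed=None):
    """Collect one accepted take per prompt.

    record(letter) -> samples and accept(letter, samples) -> bool are
    hypothetical callbacks standing in for capture and for the playback
    check by the experimenter and subject.
    """
    takes = []
    for letter in prompt_order(productions, seed):
        while True:
            samples = record(letter)
            if accept(letter, samples):   # otherwise the recording is repeated
                takes.append((letter, samples))
                break
    return takes
```

With stand-in callbacks that always accept, `run_session` returns 52 takes for one speaker, each holding `BUFFER_SAMPLES` samples.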
The ISOLET corpus was collected in 1990.
Speaker Population
Subjects were recruited through advertising. Each subject was given a free
dessert at a local restaurant in exchange for his or her participation. All
speakers reported English as their native language. Ages ranged from 14 to 72
years, with an average of 35.
For more information, contact:
Michele Gunness
Senior Technology Development Manager
503-494-4184
