CORPORA from CSLU: Isolet.
OHSU # 0681-F
ISOLET is a database of letters of the English alphabet spoken in isolation. The database consists of 7800 spoken letters, two productions of each letter by 150 speakers. It contains approximately 1.25 hours of speech. The recordings were done under quiet, laboratory conditions with a noise-canceling microphone.
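As a quick consistency check on the figures above, the following sketch recomputes the utterance count and derives an average duration per letter. The average is only an estimate derived from the approximate 1.25-hour total; it is not a figure stated in the corpus documentation.

```python
# Figures quoted in the corpus description.
LETTERS = 26          # letters of the English alphabet
PRODUCTIONS = 2       # productions of each letter per speaker
SPEAKERS = 150
SPEECH_HOURS = 1.25   # approximate total, as stated

# Total utterances and utterances contributed by each speaker.
total_utterances = LETTERS * PRODUCTIONS * SPEAKERS
per_speaker = LETTERS * PRODUCTIONS

# Derived estimate: mean speech duration per spoken letter, in seconds.
mean_seconds = SPEECH_HOURS * 3600 / total_utterances

print(total_utterances)        # 7800
print(per_speaker)             # 52
print(round(mean_seconds, 2))  # 0.58
```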
Speech was recorded in the OGI speech recognition laboratory. The room is 15' by 15' with a tile floor and standard office wall board and drop ceiling. There are two Sun workstations in the room, and three disk drives.
The recording equipment was selected to mimic the equipment used to collect the TIMIT database as closely as possible. The speech was recorded with a Sennheiser HMD 224 noise-canceling microphone, lowpass filtered at 7.6 kHz. Data capture was performed using the AT&T DSP32 board installed in a Sun 4/110. The data were sampled at 16 kHz.
The subjects were seated in front of a Sun workstation and prompted with letters in random order. After each prompt, the subject would strike the return key and say the letter. Two seconds of speech were recorded and immediately played back for verification. If the subject spoke too soon or too late and missed the two-second buffer, or if the experimenter or subject decided the letter was mis-spoken, the recording would be repeated. There was no attempt to elicit ideal speech. A letter was judged mis-spoken only if there was a significant departure from normal pronunciation.
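The capture parameters given above (16 kHz sampling, 7.6 kHz lowpass filter, two-second buffer) imply the following sizes. This is a minimal sketch of the arithmetic only; it makes no assumption about the sample width or the file format of the distributed data.

```python
# Capture parameters quoted in the description.
SAMPLE_RATE_HZ = 16_000
LOWPASS_HZ = 7_600
BUFFER_SECONDS = 2

# Highest frequency representable at this sampling rate.
nyquist_hz = SAMPLE_RATE_HZ / 2

# Number of samples in one captured two-second buffer.
samples_per_buffer = SAMPLE_RATE_HZ * BUFFER_SECONDS

# The lowpass cutoff sits below the Nyquist frequency, which limits aliasing.
cutoff_below_nyquist = LOWPASS_HZ < nyquist_hz

print(nyquist_hz)            # 8000.0
print(samples_per_buffer)    # 32000
print(cutoff_below_nyquist)  # True
```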
The ISOLET corpus was collected in 1990.
Subjects were obtained through advertising. Each subject was given a free dessert at a local restaurant in exchange for his or her participation. All speakers reported English as their native language. The ages varied from 14 to 72 years, with an average of 35.
There were 75 male and 75 female subjects used in this data collection.
After the recording session, each utterance was verified by a human examiner. First, the examiner viewed a waveform of the utterance to verify that the speech was padded with silence. Second, the examiner listened to the speech and noted any ambiguous or mis-spoken utterances.
All utterances noted by the examiner were examined by two other human examiners. If a majority of the examiners perceived that an utterance was abnormal, that utterance, and the rest of the utterances from that speaker, were removed from the corpus.
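The screening rule described above can be sketched as follows. This is an illustration only, not the procedure actually used at OGI: the function name, the data layout, and the example speaker IDs are hypothetical, and the sketch assumes each flagged utterance ends up with three votes (the first examiner plus the two others).

```python
def speakers_to_remove(votes_by_utterance):
    """Return the speakers with at least one utterance judged abnormal.

    votes_by_utterance maps (speaker_id, utterance_id) to a list of
    booleans, one per examiner, where True means "abnormal".
    (Hypothetical layout, chosen for illustration.)
    """
    removed = set()
    for (speaker_id, _utterance_id), votes in votes_by_utterance.items():
        # Strict majority of the examiners who judged this utterance.
        if sum(votes) > len(votes) / 2:
            removed.add(speaker_id)
    return removed


def screen_corpus(utterances, votes_by_utterance):
    """Drop every utterance from any speaker with an abnormal utterance."""
    removed = speakers_to_remove(votes_by_utterance)
    return [(spk, utt) for (spk, utt) in utterances if spk not in removed]


# Hypothetical example: two speakers, one flagged utterance each.
utterances = [("spk01", "A1"), ("spk01", "B1"), ("spk02", "A1"), ("spk02", "B1")]
votes = {
    ("spk01", "B1"): [True, True, False],   # majority says abnormal
    ("spk02", "A1"): [True, False, False],  # majority says normal
}
kept = screen_corpus(utterances, votes)
print(kept)  # [('spk02', 'A1'), ('spk02', 'B1')]
```

Note that one abnormal utterance removes the whole speaker, which keeps the number of productions per remaining speaker uniform.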
The Center for Spoken Language Understanding (CSLU) distributes corpora to commercial entities and academic institutions for a fee. Commercial entities can use these corpora for research and also for creating commercial products, for example by generating acoustic models for speech recognition.
To place your order:
1. Click on the type of license you wish to order: "Academic or non-profit entity" or "Commercial entity".
2. Terms of the license agreement can be viewed by clicking on the word "terms".
3. You agree to the terms of the license agreement when you click on "Add to Order" and proceed to the next screen.
4. If information on the "Order Contents" screen is correct, press "Check out".
5. On the next screen, a brief "Intended Use" is required. For "Recipient Scientist Information", enter your own information or, if you are placing the order for another person, that person's information. We will use this information should we have questions about the order, payment, or shipping address.
6. Once your payment has been received and verified by OHSU, your order will be approved by Technology Transfer & Business Development. The Center for Spoken Language Understanding will then send the DVD by FedEx within 5-10 business days.
For demos and more information, visit the CSLU Corpora website at: http://www.cslu.ogi.edu/corpora/corpCurrent.html
For more information, contact:
Technology Development Manager