OHSU

CORPORA from CSLU: SR4X v1.2

 

OHSU # 0681-P

General Description
This corpus is a collection of 36 speakers saying eleven words six times on four different channels.

Recording Details
The data were captured digitally from the CSLU T1 connection and saved as 8 kHz, 8-bit mu-law (ulaw).

Four different recording channels were used to call the T1 data line:

  • office telephone
  • home telephone
  • carbon microphone telephone
  • speaker phone


All of the data files use the standard RIFF (.wav) file format, with 16-bit linear encoding.
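As a quick sanity check on a downloaded file, the header can be inspected with Python's standard `wave` module. This is only a minimal sketch: the function names `describe_wav` and `is_16bit_linear` are made up for illustration, and the only property checked against the description above is the 16-bit (2-byte) sample width.

```python
import wave


def describe_wav(path):
    """Return basic header fields of a RIFF/WAVE file.

    wave.open raises wave.Error if the file is not PCM-encoded RIFF.
    """
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "frames": wav.getnframes(),
        }


def is_16bit_linear(path):
    """True if the file stores 2-byte (16-bit) linear PCM samples."""
    return describe_wav(path)["sample_width_bytes"] == 2
```

For example, `is_16bit_linear("SD-1030-computer-t4-42.wav")` would be expected to return `True` for a file from the speech directory if the format description above holds.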

Directory Structure
There are five top-level directories in this distribution:

  • docs -- the docs directory contains assorted documentation files
  • labels -- the labels directory contains time-aligned phonetic transcriptions
  • misc -- the misc directory contains scripts and archival information
  • speech -- the speech directory contains the .wav files containing speech data
  • trans -- the trans directory contains .txt files containing transcriptions of the corresponding .wav files in the speech directory


The speech and trans directories contain the data files, which have the following name structure:

SD-xxxx-tttt-tc-yy.zzz

xxxx = call number

tttt = utterance code

c = channel type (follows the literal letter t)

yy = arbitrary serial number

zzz = file extension (txt/wav)


For example:

SD-1030-computer-t4-42.wav



This utterance is from call 1030 and consists of the word "computer". It was recorded on channel four, and its serial number is 42. Serial numbers distinguish different instances of the same utterance by the same speaker.

The audio files are subdivided into directories based on their call number, so the files for this call are found in /sr4x/speech/1030.
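The naming scheme above can be unpacked programmatically. The following is a minimal sketch, not part of the distribution: the names `parse_name`, `speech_path` and `Sr4xName` are hypothetical, and the pattern is inferred from the template and the single documented example. It assumes that the call number and serial number are digits, that the utterance code contains no hyphen, and that each file sits directly inside its call-number directory.

```python
import re
from pathlib import PurePosixPath
from typing import NamedTuple

# Pattern inferred from SD-xxxx-tttt-tc-yy.zzz; the character classes
# are assumptions based on the one documented example.
_NAME_RE = re.compile(
    r"^SD-(?P<call>\d+)-(?P<utterance>[^-]+)"
    r"-t(?P<channel>\w+)-(?P<serial>\d+)\.(?P<ext>txt|wav)$"
)


class Sr4xName(NamedTuple):
    call: str       # xxxx, call number
    utterance: str  # tttt, utterance code
    channel: str    # c, channel type (the part after the literal "t")
    serial: str     # yy, serial number
    ext: str        # zzz, "txt" or "wav"


def parse_name(filename):
    """Split an SR4X file name into its fields, or raise ValueError."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not an SR4X file name: {filename!r}")
    return Sr4xName(**match.groupdict())


def speech_path(filename, root="/sr4x"):
    """Build the assumed location of an audio file under speech/<call>/."""
    name = parse_name(filename)
    return PurePosixPath(root) / "speech" / name.call / filename
```

With the example above, `parse_name("SD-1030-computer-t4-42.wav")` yields call `1030`, utterance `computer`, channel `4`, serial `42` and extension `wav`.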

Transcriptions
No orthographic transcriptions were made, but automatically generated word-level transcriptions exist. In addition, each call was verified and categorized as good, bad, noisy, or different. A detailed description of the categories can be found in the SR4X publication.



Agreement was about 85%. The following confusion matrix shows where most of the confusions occurred.

 

          good    bad    noisy    diff
good      6877      0      314     628
bad         31     45        0       2
noisy      142      3      414      60
diff       119      6       25       3
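The agreement figure can be checked directly from the matrix: agreement is the sum of the diagonal divided by the total count. The sketch below assumes that the fourth row corresponds to the "diff" category, so that rows and columns share the order good, bad, noisy, diff; the function name `agreement` is illustrative only.

```python
# Rows and columns in the order: good, bad, noisy, diff
# (the fourth row is assumed to be the "diff" category).
LABELS = ["good", "bad", "noisy", "diff"]
MATRIX = [
    [6877, 0, 314, 628],
    [31, 45, 0, 2],
    [142, 3, 414, 60],
    [119, 6, 25, 3],
]


def agreement(matrix):
    """Fraction of items that received the same category twice."""
    total = sum(sum(row) for row in matrix)
    agreed = sum(matrix[i][i] for i in range(len(matrix)))
    return agreed / total


print(f"{agreement(MATRIX):.1%}")  # 7339 of 8669 items, prints 84.7%
```

The largest off-diagonal cells are good/diff (628) and good/noisy (314), which is where most of the disagreement lies.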

 

The Center for Spoken Language Understanding (CSLU) distributes corpora to commercial entities and academic institutions for a fee. Commercial entities can use these corpora for research and also for creating commercial products, for example by generating acoustic models for speech recognition.

 

To place your order:

1. Click on the type of license you wish to order: Academic or non-profit entity, or Commercial entity.

2. Terms of the license agreement can be viewed by clicking on the word "terms".

3. You agree to the terms of the license agreement when you click on "Add to Order" and proceed to the next screen.

4. If information on the "Order Contents" screen is correct, press "Check out".

5. On the next screen, a brief "Intended Use" is required. For "Recipient Scientist Information", enter your own information or, if you are placing the order for another person, that person's information. We will use this information should we have questions about the order, payment, or shipping address.

6. Once OHSU has received and verified your payment, Technology Transfer & Business Development will approve your order, and the Center for Spoken Language Understanding will ship the DVD by FedEx within 5-10 business days.

 

For demos and more information, visit the CSLU Corpora website at:   

http://www.cslu.ogi.edu/corpora/corpCurrent.html  

For more information, contact:

Trina Voss
Technology Development Manager
503-494-9839

Option   Price
(terms)  $50.00
(terms)  $1500.00