OHSU

CORPORA from CSLU Portland cellular v1.3

 

OHSU # 0681-M

General Description
The Portland Cellular Corpus consists of utterances gathered from callers who were using cellular telephones. Each participant called a toll-free cellular number, then listened and responded to the protocol. A total of 515 different callers participated.

Recording Details
The data was captured digitally from the CSLU T1 connection and saved as 8 kHz, 16-bit linear audio.

Directory Structure
There are four top-level directories in this distribution: docs, speech, phn, and txt. The docs directory contains assorted documentation files.

The speech, txt, and phn directories contain the data files, which have the following name structure:

 

PCxxxyyy.zzz

xxx = call number

yyy = 3-letter utterance code

zzz = file extension (txt/wav/phn)



For example:

PC131FAS.wav



This utterance is from caller 131 and contains a response to the question "About how fast are you traveling right now?".
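The naming scheme above can be split mechanically. The following is a minimal sketch, not part of the distribution: the helper name `parse_name` is hypothetical, and it assumes the call number is always exactly three digits and the utterance code is always three letters, as in the PC131FAS.wav example.

```python
import re
from typing import NamedTuple


class CorpusName(NamedTuple):
    call: int   # call number (xxx)
    code: str   # 3-letter utterance code (yyy)
    ext: str    # file extension (zzz)


# Assumption: "PC", three digits, three letters, then txt/wav/phn.
_NAME_RE = re.compile(r"^PC(\d{3})([A-Za-z]{3})\.(txt|wav|phn)$")


def parse_name(filename: str) -> CorpusName:
    """Split a name such as 'PC131FAS.wav' into its three fields."""
    m = _NAME_RE.match(filename)
    if m is None:
        raise ValueError(f"not a PCxxxyyy.zzz name: {filename!r}")
    return CorpusName(int(m.group(1)), m.group(2), m.group(3))


print(parse_name("PC131FAS.wav"))
```

If the distribution contains call numbers that are not zero-padded to three digits, the pattern would need to be relaxed accordingly.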

The mappings between 3-letter utterance codes and the expected utterance can be found in the map file. This file is a comma delimited list that can be easily adapted to suit your purposes.
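Because the map file is comma delimited, it can be read with a standard CSV parser. The sketch below is illustrative only: the column layout (utterance code first, expected utterance or prompt second) is an assumption, the sample line is made up from the FAS example above, and `load_code_map` is a hypothetical helper name.

```python
import csv
import io


def load_code_map(text: str) -> dict[str, str]:
    """Build a {code: prompt} lookup from comma-delimited text.

    Assumes column 0 is the 3-letter code and column 1 is the prompt;
    check the actual map file before relying on this layout.
    """
    mapping: dict[str, str] = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        mapping[row[0].strip()] = row[1].strip()
    return mapping


# Hypothetical sample line, not copied from the real map file.
sample = 'FAS,"About how fast are you traveling right now?"\n'
code_map = load_code_map(sample)
print(code_map["FAS"])
```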

The corresponding text and phonetic transcriptions can be found in these files:

 

PC131FAS.txt

PC131FAS.phn



The audio, text, and phonetic files are subdivided into directories named after the call number divided by 10, discarding the remainder (131 / 10 gives 13). So, these files would be found in /portcell/speech/13, /portcell/txt/13, and /portcell/phn/13, respectively.
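The example places call 131 in directory 13, which corresponds to integer division of the call number by 10. A minimal sketch of building a path under that reading follows; `corpus_path` is a hypothetical helper, and the zero-padding of the call number and the unpadded subdirectory name are assumptions.

```python
from pathlib import PurePosixPath

# Top-level directory for each file extension, per the directory structure.
_TOP_DIR = {"wav": "speech", "txt": "txt", "phn": "phn"}


def corpus_path(call: int, code: str, ext: str,
                root: str = "/portcell") -> PurePosixPath:
    """Return the expected location of one data file."""
    subdir = str(call // 10)            # 131 -> "13" (assumed unpadded)
    name = f"PC{call:03d}{code}.{ext}"  # assumed 3-digit zero padding
    return PurePosixPath(root) / _TOP_DIR[ext] / subdir / name


print(corpus_path(131, "FAS", "wav"))
```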

Transcriptions
The text transcriptions follow the non-time-aligned, word-level conventions described in the CSLU Labeling Guide. This document is available at the CSLU web site.

Phonetic transcriptions are plain text files that carry time-aligned phonetic labels. The first two lines of each file are a header, which defines the length of a "frame" in milliseconds. Each remaining line consists of two numbers that define a frame range, followed by the label that applies to that region. For example:


MillisecondsPerFrame: 1.000000
END OF HEADER
2 113 .pau
113 191 w
191 267 ^
267 395 n


So, we can see here that a frame corresponds to 1 millisecond (ms) of time, and that from 2 to 113 ms into the file, there is a pause (.pau), with the first phoneme (w) starting at 113 ms and stretching to 191 ms.
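The format described above is simple enough to read by hand. The following is a minimal sketch rather than an official reader: `parse_phn` and `Segment` are hypothetical names, and the code assumes every label line has exactly three whitespace-separated fields with integer frame numbers.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    start_ms: float
    end_ms: float
    label: str


def parse_phn(text: str) -> list[Segment]:
    """Convert the contents of a .phn file into segments in milliseconds."""
    lines = text.splitlines()
    ms_per_frame = None
    body_start = 0
    for i, line in enumerate(lines):
        if line.startswith("MillisecondsPerFrame:"):
            ms_per_frame = float(line.split(":", 1)[1])
        elif line.strip() == "END OF HEADER":
            body_start = i + 1
            break
    if ms_per_frame is None:
        raise ValueError("missing MillisecondsPerFrame header")

    segments = []
    for line in lines[body_start:]:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        start, end, label = parts
        segments.append(
            Segment(int(start) * ms_per_frame, int(end) * ms_per_frame, label)
        )
    return segments


sample = """MillisecondsPerFrame: 1.000000
END OF HEADER
2 113 .pau
113 191 w
191 267 ^
267 395 n
"""

for seg in parse_phn(sample):
    print(seg.label, seg.end_ms - seg.start_ms)
```

Scaling by the header value rather than assuming 1 ms per frame keeps the reader correct if other files in the distribution use a different frame length.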

The Center for Spoken Language Understanding (CSLU) distributes corpora to commercial entities and academic institutions for a fee. Commercial entities can use these corpora for research and also for developing commercial products, for example by training acoustic models for speech recognition.

 

To place your order:

1. Click on the type of license you wish to order: Academic or non-profit entity or Commercial entity.

2. Terms of the license agreement can be viewed by clicking on the word "terms".

3. You agree to the terms of the license agreement when you click on "Add to Order" and proceed to the next screen.

4. If information on the "Order Contents" screen is correct, press "Check out".

5. On the next screen, a brief "Intended Use" is required. For "Recipient Scientist Information", enter your own information or, if you are placing the order for another person, that person's information. We will use this information if we have questions about the order, payment, or shipping address.

6. Once your payment has been received and verified by OHSU, your order will be approved by Technology Transfer & Business Development. The Center for Spoken Language Understanding will then ship the DVD by FedEx within 5-10 business days.

 

For demos and more information, visit http://www.cslu.ogi.edu/corpora/corpCurrent.html

For more information, contact:

Trina Voss
Technology Development Manager
503-494-9839

Option  Price
(terms)  $50.00
(terms)  $3500.00