OGI School of Science & Engineering Receives $1.8 Million Grant to Improve Current TTS Systems

11/11/02    Portland, Ore.

The Oregon Health & Science University school is a world leader in speech recognition technology

Current computer text-to-speech (TTS) synthesizers used by the hard-of-hearing and language-impaired are good at getting across basic facts. But typically, such synthesizers sound "bored," with little to no intonation or expression. Because words and sounds are just cut and pasted together, current TTS systems are punctuated with blips, clicks and an unevenness that makes it difficult for young listeners, in particular, to thoroughly understand what's being said.

Now, thanks to a new $1.8 million grant from the National Science Foundation to Oregon Health & Science University's OGI School of Science & Engineering, such technological shortcomings could someday be a problem of the past. Scientists at the school, based in Hillsboro, Ore., intend to create a computer TTS synthesizer system that more closely resembles natural speech. The system will figure into the next generation of interactive computer programs that teach language skills to children with reading, language and other communications disorders, such as autistic spectrum disorder.

"It's an ambitious project," said professor Jan van Santen, Ph.D., a mathematical psychologist in the Department of Electrical and Computer Engineering who heads the school's Center for Spoken Language Understanding (www.cslu.cse.ogi.edu) and is the lead scientist on the interdisciplinary, collaborative project. "But we think we can do it."

The project teams van Santen with a talented group of computational linguists, autism experts, computer scientists and neuropsychologists at Carnegie Mellon University, AT&T Research, and the School of Science & Engineering. The five-year grant totals $2.75 million.

"Intonation is an essential part of meaning," said van Santen. "For kids with developmental or language problems, having educational materials that contain great expression or intonation is essential to learning and paying attention in class.

"For a TTS system to be able to more expressively synthesize language, the computer has to be programmed to understand the context of, say, a children's story," said van Santen. "The computer needs to decide which words to emphasize, which words should be spoken as parenthetic, and where in the story to briefly accelerate to better express, for example, a bunny who is running. For this to be possible, we want to program the computer in ways that it can better understand the world so it can make all these inferences from text."

For example, said van Santen, the computer could be modeled so it would know that bunnies in young children's stories are often good and wolves are generally bad, or that a grocery cart should be pushed by hands gripping a handle, and not pushed by, say, the wheels of the cart. "This kind of innate real-world knowledge will help the computer system speak more realistically and, therefore, naturally," said van Santen. "It's a tough problem to solve for computers, but we think it's doable."

Speech technology has traditionally been driven by the military and telecommunications industries, noted van Santen, a longtime Bell Labs researcher who joined the School of Science & Engineering in 2001. "But there is huge potential for speech technology that is useful for education and health. We are trying to tap into that market and make our work helpful for the average person who has a learning or medical problem." The OGI School of Science & Engineering is the only school in the United States focusing on speech technology for education and health, he said.

Speech technology could someday be used to help illiterate people learn to read, to help non-native speakers learn English, and to give autistic people more ways to communicate. Researchers at the School of Science & Engineering are studying a variety of ways humans and computers can better interact, and are developing innovative solutions that are strengthening communications between man and machine.

The OGI School of Science & Engineering (formerly the Oregon Graduate Institute of Science & Technology) became one of four specialty schools of Oregon Health & Science University in 2001. The OHSU OGI School of Science & Engineering has 63 faculty and more than 300 master's and doctoral students in five academic departments.

The Center for Spoken Language Understanding has five full-time faculty, four postdocs, a dozen graduate students, and additional programming staff. For more information, visit www.cslu.cse.ogi.edu/.

###