Thursday, January 15, 2009

IBM ViaScribe for Captioning Live Events

Liberated Learning technology centers around two core applications: using speech recognition to automatically caption spoken language and to create web-accessible multimedia transcripts. Building upon a proof-of-concept application developed at Saint Mary's University, the Consortium has been researching and developing a second-generation technology called IBM ViaScribe. IBM’s Human Ability and Accessible Centre Asia Pacific additionally developed IBM Caption Editing System (CES), an application distinguished by its powerful editor. In 2008, the Consortium is anticipating the release of a powerful new system that will address a number of outstanding technical and user challenges.

ViaScribe Overview

ViaScribe contains a speech recognition engine capable of transcribing live or prerecorded speech. Live speech is delivered to the system via a standard or USB microphone. Typically, public speakers wear noise-canceling wireless headsets or lavalieres (lapel mics) that record high-quality sound without impeding movement. ViaScribe can also transcribe prerecorded speech from a variety of audio and video formats, including WAV, MP3, and AVI.

During a live presentation, ViaScribe serves as a real-time text display, like a closed captioning window, outputting text as it is processed by the speech recognition engine. Because natural spoken language generally does not follow the rules of written grammar and punctuation, ViaScribe promotes readability by introducing a paragraph break or other marker whenever the speaker pauses to take a breath. The pause settings can be customized according to the speaker’s individual speech characteristics.

ViaScribe text display
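
The pause-based paragraph breaking described above can be illustrated with a small sketch. This is not ViaScribe's actual implementation or API; the Word type, the break_on_pauses function, and the pause_threshold parameter are hypothetical names chosen for illustration, and the sketch assumes the recognizer supplies start and end times for each word.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with timing (hypothetical structure, not ViaScribe's)."""
    text: str
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording


def break_on_pauses(words, pause_threshold=1.0):
    """Group words into paragraphs.

    A new paragraph starts whenever the silence between the end of one word
    and the start of the next is at least pause_threshold seconds.
    """
    paragraphs = []
    current = []
    previous_end = None
    for word in words:
        long_pause = (
            previous_end is not None
            and word.start - previous_end >= pause_threshold
        )
        if long_pause and current:
            paragraphs.append(" ".join(current))
            current = []
        current.append(word.text)
        previous_end = word.end
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

A speaker who pauses briefly between sentences might use a lower pause_threshold, while a speaker with a slower delivery might raise it so that ordinary hesitations do not split paragraphs.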

The speaker can also use interactive voice commands to navigate PowerPoint slides or other applications during a live transcription, and automatically create captioned multimedia presentations.


Anonymous said...

I am profoundly deaf. Is it possible to get a copy of ViaScribe to run on a high-specification laptop?

Also, how much voice training is involved?

Michael (Ireland)

billcreswell said...

The more you train, the better it is.

I am at about 90-95% accuracy, depending on whether you count against what I spoke or the words actually spoken by the original speaker. (I do "echoing" for the live captioning I do.) I use CaptionMic instead, but it is still built on ViaVoice technology.