A speech server for Emacspeak and yasr (or other screen readers) that allows them to interface with Festival Lite, a free text-to-speech engine developed at the CMU Speech Center as an off-shoot of Festival.
Due to limitations inherited from its backend, EFlite currently supports only the English language.
A small fast run-time speech synthesis engine. It is the latest addition to the suite of free software synthesis tools including University of Edinburgh's Festival Speech Synthesis System and Carnegie Mellon University's FestVox project, tools, scripts and documentation for building synthetic voices. However, flite itself does not require either of these systems to run.
It currently only supports the English language.
A general multi-lingual speech synthesis system developed at the CSTR [Centre for Speech Technology Research] of University of Edinburgh.
Festival offers a full text-to-speech system with various APIs, as well as an environment for development and research of speech synthesis techniques. It is written in C++ with a Scheme-based command interpreter for general control.
Besides research into speech synthesis, festival is useful as a stand-alone speech synthesis program. It is capable of producing clearly understandable speech from text.
Recite is a program to do speech synthesis. The quality of sound produced is not terribly good, but it should be adequate for reporting the occasional error message verbally.
Given some English text, recite will convert it to a series of phonemes, then convert the phonemes to a sequence of vocal tract parameters, and then synthesize the sound a vocal tract would make to say the sentence. Recite can perform a subset of these operations, so it can be used to convert text into phonemes, or to produce an utterance based on vocal tract parameters computed by another program.
Provides a device independent layer for speech synthesis. It supports various software and hardware speech synthesizers as backends and provides a generic layer for synthesizing speech and playing back PCM data via those different backends to applications.
Various high level concepts like enqueueing vs. interrupting speech and application specific user configurations are implemented in a device independent way, therefore freeing the application programmer from having to yet again reinvent the wheel.
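The difference between enqueueing and interrupting speech can be illustrated with a minimal sketch. The class and method names below (SpeechQueue, enqueue, interrupt, speak_next) are hypothetical and chosen only for illustration; they are not the API of any package described here, and the "backend" is simply a Python list.

```python
from collections import deque


class SpeechQueue:
    """Toy device-independent message queue (hypothetical, not a real API)."""

    def __init__(self):
        self.pending = deque()  # messages waiting to be spoken
        self.spoken = []        # messages handed to the backend so far

    def enqueue(self, text):
        """Add a message behind whatever is already waiting."""
        self.pending.append(text)

    def interrupt(self, text):
        """Discard everything still waiting and make this message next."""
        self.pending.clear()
        self.pending.append(text)

    def speak_next(self):
        """Hand the next waiting message to the backend (here: a list)."""
        if not self.pending:
            return None
        text = self.pending.popleft()
        self.spoken.append(text)
        return text


queue = SpeechQueue()
queue.enqueue("You have new mail.")
queue.enqueue("Build finished.")
queue.interrupt("Battery low.")  # drops both earlier messages
queue.enqueue("Saving file.")
while queue.speak_next() is not None:
    pass
print(queue.spoken)  # ['Battery low.', 'Saving file.']
```

Because the queue knows nothing about the synthesizer behind it, the same logic would apply whether messages end up at a software or a hardware backend.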
All the currently available free solutions for software based speech synthesis seem to share one common deficiency: They are mostly limited to English, providing only very marginal support for other languages, or in most cases none at all. Among all the free software speech synthesizers for Linux, only Festival supports more than one natural language. Festival can synthesize English, Spanish and Welsh. German is not supported. French is not supported. Russian is not supported. When internationalization and localization are the trends in software and web services, is it reasonable to require blind people interested in Linux to learn English just to understand their computer's output and to conduct all their correspondence in a foreign tongue?
Unfortunately, speech synthesis is not really Jane Hacker's favourite homebrew project. Creating an intelligible software speech synthesizer involves time-consuming tasks. Concatenative speech synthesis requires the careful creation of a phoneme database containing all the possible combinations of sounds for the target language. Rules that determine the transformation of the text representation into individual phonemes also need to be developed and fine-tuned, usually requiring the division of the stream of characters into logical groups such as sentences, phrases and words. Such lexical analysis requires a language-specific lexicon seldom released under a free license.
One of the most promising speech synthesis systems is Mbrola, with phoneme databases for over ten different languages. Unfortunately, the license chosen by the project is very restrictive. Mbrola can only be distributed as a pre-built binary. In addition, the phoneme databases are for non-military and non-commercial use only. We contacted the project developers, but they were unable to change the licensing of their work due to the limitations set by various contributors. Given this restrictive licensing model, Mbrola cannot be used as a basis for further work in this direction, at least not in the context of the Debian Operating System.
Without a broadly multi-lingual software speech synthesizer, Linux cannot be accepted by assistive technology providers and people with visual disabilities. What can we do to improve this?
There are basically two approaches possible:
A speech output system that will allow someone who cannot see to work directly on a UNIX system. Once you start Emacs with emacspeak loaded, you get spoken feedback for everything you do. Your mileage will vary depending on how well you can use Emacs. There is nothing that you cannot do inside Emacs :-). This package includes speech servers written in Tcl to support the DECtalk Express and DECtalk MultiVoice speech synthesizers. For other synthesizers, look for separate speech server packages such as emacspeak-ss or eflite.
An Emacs client and an Elisp library for Speech Dispatcher. It provides a complex speech interface to Emacs, focused especially on (but not limited to) blind and visually impaired users. It allows the user to work with Emacs without looking at the screen, using the speech output produced by the synthesizers supported in Speech Dispatcher.
A daemon which provides access to the Linux console for a blind person using a soft braille display. It drives the braille terminal and provides complete screen review functionality.
The following display models are currently (as of version 3.4.1-2) supported:
BRLTTY also provides a client/server based infrastructure for applications wishing to utilize a Braille display. The daemon process listens for incoming TCP/IP connections on a certain port. A shared object library for clients is provided in the package libbrlapi. A static library, header files and documentation are provided in the package libbrlapi-dev. This functionality is, for instance, used by Gnopernicus to provide support for display types which are not yet supported by Gnopernicus directly.
The background program screader reads the screen and passes the information on to a software text-to-speech package (like `festival') or a hardware speech synthesizer.
The kernel package kernel-image-2.4.24-speakup contains a Linux kernel patched with speakup, a screen reader for the Linux console. The special property of speakup is that it runs in kernel space, which provides somewhat more low-level access to the system than other screen readers can offer. Speakup can, for instance, read critical kernel messages to you at a point where the kernel has already Oopsed and no user space program could do anything useful anymore.
Speakup currently supports the following hardware speech synthesizers:
A general-purpose console screen reader for GNU/Linux and other Unix-like operating systems. The name "yasr" is an acronym that can stand for either "Yet Another Screen Reader" or "Your All-purpose Screen Reader".
Currently, yasr attempts to support the Speak-out, DEC-talk, BNS, Apollo, and DoubleTalk hardware synthesizers. It is also able to communicate with Emacspeak speech servers and can thus be used with synthesizers not directly supported, such as Festival Lite (via eflite) or FreeTTS.
Yasr works by opening a pseudo-terminal and running a shell, intercepting all input and output. It looks at the escape sequences being sent and maintains a virtual "window" containing what it believes to be on the screen. It thus does not use any features specific to Linux and can be ported to other Unix-like operating systems without too much trouble.
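The idea of maintaining a virtual "window" from the intercepted output can be illustrated with a minimal sketch. This is not yasr's code: the class name VirtualScreen is hypothetical, only two escape sequences are understood (cursor positioning and clear screen), and scrolling is not modelled. A real screen reader would have to handle a far larger set of terminal control sequences.

```python
import re

# The only two control sequences this sketch understands:
# ESC [ row ; col H (move cursor) and ESC [ 2 J (clear screen).
CSI = re.compile(r"\x1b\[(\d*)(?:;(\d*))?([HJ])")


class VirtualScreen:
    """Hypothetical model of what is believed to be on the screen."""

    def __init__(self, rows=4, cols=20):
        self.rows, self.cols = rows, cols
        self.cells = [[" "] * cols for _ in range(rows)]
        self.row = self.col = 0

    def feed(self, data):
        """Process a chunk of intercepted terminal output."""
        pos = 0
        while pos < len(data):
            match = CSI.match(data, pos)
            if match:
                self._control(*match.groups())
                pos = match.end()
            else:
                self._put(data[pos])
                pos += 1

    def _control(self, first, second, final):
        if final == "H":
            # Terminal coordinates are 1-based; empty parameters mean 1.
            self.row = max(1, min(int(first or 1), self.rows)) - 1
            self.col = max(1, min(int(second or 1), self.cols)) - 1
        elif final == "J" and first == "2":
            self.cells = [[" "] * self.cols for _ in range(self.rows)]

    def _put(self, char):
        if char == "\r":
            self.col = 0
        elif char == "\n":
            # Simplification: stay on the last row instead of scrolling.
            self.row = min(self.row + 1, self.rows - 1)
        elif char.isprintable() and self.col < self.cols:
            self.cells[self.row][self.col] = char
            self.col += 1

    def line(self, index):
        """Return one screen line, e.g. to send it to a synthesizer."""
        return "".join(self.cells[index]).rstrip()


screen = VirtualScreen()
screen.feed("\x1b[2J\x1b[1;1Hlogin: \r\n$ ls\x1b[3;5Hhello")
print(repr(screen.line(0)))  # 'login:'
print(repr(screen.line(1)))  # '$ ls'
print(repr(screen.line(2)))  # '    hello'
```

Since the model is built purely from the byte stream passing through the pseudo-terminal, nothing in it depends on Linux-specific interfaces, which is the portability argument made above.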
Accessibility of graphical user interfaces on UNIX platforms has only recently received a significant upswing with the various development efforts around the GNOME Desktop, especially the GNOME Accessibility Project.
This package contains the core components of GNOME Accessibility. It allows assistive technologies such as screen readers to query all applications running on the desktop for accessibility-related information, and it provides bridging mechanisms to support toolkits other than GTK.
ATK is a toolkit providing accessibility interfaces for applications or other toolkits. By implementing these interfaces, those other toolkits or applications can be used with tools such as screen readers, magnifiers, and other alternative input devices.
The runtime part of ATK, needed to run applications built with it, is available in the package libatk1.0-0. Development files for ATK, needed to compile programs or toolkits which use it, are provided by the package libatk1.0-dev.
The GNOME Speech library gives programs a simple yet general API for converting text into speech, as well as for speech input.
Multiple backends are supported, but currently only the Festival backend is enabled in this package; the other backends require either Java or proprietary software.
Gnopernicus is designed to allow users with limited or no vision to access GNOME applications. It provides a number of features, including magnification, focus tracking, braille output, and more.
Dasher is an information-efficient text-entry interface, driven by natural continuous pointing gestures. Dasher is a competitive text-entry system wherever a full-size keyboard cannot be used - for example,
The eyetracking version of Dasher allows an experienced user to write text as fast as normal handwriting - 25 words per minute; using a mouse, experienced users can write at 39 words per minute.
Dasher uses a more advanced prediction algorithm than the T9(tm) system often used in mobile phones, making it sensitive to surrounding context.
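What "sensitive to surrounding context" means can be illustrated with a minimal sketch: a predictor that ranks the next character by what followed the same preceding characters in some training text. This is an assumption-laden toy, not Dasher's actual language model; the function names train and predict, the context length of two characters and the training sentence are all made up for illustration.

```python
from collections import Counter, defaultdict


def train(text, order=2):
    """Count which character follows each `order`-character context."""
    counts = defaultdict(Counter)
    for i in range(order, len(text)):
        counts[text[i - order:i]][text[i]] += 1
    return counts


def predict(counts, context, order=2):
    """Return candidate next characters with probabilities, best first."""
    followers = counts.get(context[-order:], Counter())
    total = sum(followers.values())
    return [(char, n / total) for char, n in followers.most_common()]


# Toy training text; a real system would learn from a large corpus.
model = train("the theory then thanked them")
print(predict(model, "th"))  # [('e', 0.8), ('a', 0.2)]
```

An interface driven by such a model can give likely continuations more room on screen, so that probable text needs smaller pointing gestures than improbable text.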
GOK [GNOME Onscreen Keyboard] is a dynamic onscreen keyboard for UNIX and UNIX-like operating systems. It features Direct Selection, Dwell Selection, Automatic Scanning and Inverse Scanning access methods and includes word completion.
GOK includes an alphanumeric keyboard and a keyboard for launching applications. Keyboards are specified in XML enabling existing keyboards to be modified and new keyboards to be created. The access methods are also specified in XML providing the ability to modify existing access methods and create new ones.
See the Debian contact page for information on contacting us.
Last Modified: Sat, Apr 3 18:10:11 UTC 2004
Copyright © 1997-2004 SPI; see license terms
Debian is a registered trademark of Software in the Public Interest, Inc.