Note
The CSS3 Speech module was not a recommendation at the time that EPUB 3 was finalized, but has since reached Candidate Recommendation status. Until an update to EPUB 3 occurs, all supported properties from the module must still be prefixed with -epub- (reading systems will automatically map the prefixed properties to the new official versions, so there will be no need to revisit old content).
The CSS3 Speech module provides additional text-to-speech (TTS) enhancement functionality. Unlike PLS lexicons and SSML markup, the Speech module properties are not focused on defining the correct pronunciation of words.
The primary property the CSS3 Speech module adds for enhancing TTS playback is speak-as. This property controls whether the TTS engine reads out each character (the spell-out value) or each digit (the digits value) in a string individually. (See Example 1 and Example 2.) TTS engines often use unreliable tests based on the apparent wordiness of acronyms to determine whether to voice them as words, but this property allows you to override that behavior.
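As a minimal sketch of how these two values might be applied, assuming the -epub- prefix required by the note above, the following rules spell out an acronym letter by letter and read a number digit by digit. The abbr.spell and span.phone selectors are hypothetical class names chosen for illustration, not part of any specification.

```css
/* Hypothetical selectors: adapt them to the markup in your own publication. */

/* Read each letter of the acronym separately instead of voicing it as a word. */
abbr.spell {
    -epub-speak-as: spell-out;
}

/* Read each digit separately rather than as a single large number. */
span.phone {
    -epub-speak-as: digits;
}
```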
The speak-as property also takes the complementary values literal-punctuation and no-punctuation. As their names suggest, these values control whether the TTS engine voices punctuation.
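For instance, a sample sentence in a grammar textbook could ask the engine to name its punctuation aloud. This is only a sketch: the p.punctuation-sample selector is a hypothetical class name, and the -epub- prefix is assumed per the note at the top of this page.

```css
/* Hypothetical selector for sample sentences whose punctuation matters. */
p.punctuation-sample {
    /* Name punctuation aloud instead of rendering it as natural pauses. */
    -epub-speak-as: literal-punctuation;
}
```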
The module also includes the speak property, which controls whether content is rendered by TTS at all, regardless of whether the containing element is visible. Setting the none value disables TTS rendering of an element, and setting the normal value enables it.
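A rough sketch of both values follows, assuming the -epub- prefix. The class names span.pagebreak and .tts-only are hypothetical, and whether a given reading system honors these rules is not something this example can guarantee.

```css
/* Hypothetical class names used for illustration only. */

/* Visible on screen, but skipped during TTS playback. */
span.pagebreak {
    -epub-speak: none;
}

/* Hidden visually, but still requested for TTS playback. */
.tts-only {
    display: none;
    -epub-speak: normal;
}
```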
The following table lists the remaining properties from the Speech module that are supported in EPUB 3. These properties are focused on non-prosodic aspects of TTS playback.
Property | Description |
---|---|
pause | The pause property inserts a pause around an element. If only a single time value is specified, that time is applied both before and after the associated element. You can individually control the time to pause before and after by including a second time value. The amount of pause specified occurs before any aural cue. |
cue | The cue property plays an audio clip around an element. Note that the cue property will render the associated audio clip both before and after the element if only a single value is specified. Readers typically only expect a cue to signal the start, so restrict the cue to play before the element. The aural cue occurs between any pause and rest. |
rest | The rest property inserts a rest around an element. If only a single time value is specified, that time is applied both before and after the associated element. You can individually control the time to rest before and after by including a second time value. The amount of rest specified occurs after any aural cue. |
voice-family | The voice-family property selects the voice used to read an element. Although it's possible to name the voice to use, in practice, with the wide variety of devices an EPUB may be played on, such specificity is of limited use, as it requires knowing the names of all voices available on all devices. Instead, it is better to request a voice using the pattern: age?, gender, integer? (where the question mark indicates the field is optional). The age value may be child, young or old. |
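The properties in the table above might be combined as in the following sketch. It assumes the -epub- prefix, hypothetical selectors, and a hypothetical audio file name (audio/chime.mp3). It also assumes the cue shorthand accepts none as its second value, as it does in the CSS3 Speech module, and that the voice-family fields are separated by spaces as in that module's grammar.

```css
/* Hypothetical selectors, timings, and file name: illustrative values only. */
h1 {
    /* Two values: 2s pause before the heading, 1s pause after it. */
    -epub-pause: 2s 1s;

    /* Two values: 500ms rest before the heading text, 250ms after it. */
    -epub-rest: 500ms 250ms;

    /* Play the clip before the heading only; no cue afterwards. */
    -epub-cue: url(audio/chime.mp3) none;
}

p.dialogue {
    /* Request a voice by characteristics (age, then gender) rather than by name. */
    -epub-voice-family: young female;
}
```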
At the time of writing, no reading systems have appeared that support the CSS3 Speech properties. Please send a report if the situation changes and this page has not been updated.
The Speech module does not provide a way to tell an engine it must voice a capitalized term. When including an acronym like EPUB, you would have to use a lexicon or attach an SSML pronunciation attribute to absolutely ensure that it does not get spelled out.
Although most engines will voice significant pause points, such as colons, they will typically not render each punctuation point in a document as it would ruin the reading experience. There are times when it is critical to ensure that the reader is able to hear all the punctuation in a sentence or phrase, such as in grammar textbooks, programming guides and the like. (See Example 3.)
Assistive technologies also enable the pronunciation of all punctuation by default in elements such as pre and code. Although the benefit of reading all punctuation in computer code should be obvious, preformatted text does not always need such detailed rendering. Applying no-punctuation to a pre block of text ensures that it is read without punctuation being announced.
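As a small sketch, preformatted text that is not code, such as a poem, could be marked up with a class and styled as follows. The pre.verse selector is a hypothetical class name, and the -epub- prefix is assumed.

```css
/* Hypothetical selector for preformatted text that is not computer code. */
pre.verse {
    /* Render punctuation as natural pauses rather than announcing it. */
    -epub-speak-as: no-punctuation;
}
```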