One of the limitations of the core HTML markup grammar is that it is not well-suited for defining rich
data structures because of its small set of elements. There may be hundreds of aside
elements in a publication, for example, but reliably distinguishing which ones represent notes,
which sidebars, and which warnings or alerts has not been possible.
For sighted readers, the deficiency this causes has been masked by the enhanced visual rendering that CSS style sheets afford (backgrounds, borders and shading are used to convey roles visually). For readers using assistive technologies — which rely on an understanding of the underlying markup in order to facilitate navigation — Web-based technologies, like EPUB, have only had limited accessibility because primary and secondary material was often indistinguishable below the visual surface.
To make ebooks more accessible, you need to consider that many readers will be interacting with the
content in non-visual ways, and for that reason the logical reading order must
be defined at the markup level. To facilitate this discovery, EPUB 3 includes a new
epub:type
attribute that allows more precise meanings to be applied to the generic tags,
a process called semantic inflection.
Although critical for accessible navigation, creating semantically-rich data has benefits for all readers. The enabling of specialized behaviors, such as the opening of footnotes, is directly predicated on content being properly identified. Rich data also future-proofs content, both by identifying the original authoring intent, in cases where it may be ambiguous, and by making it simpler to archive and reprocess.
Note
Semantic inflection can only be used to define the nature of structural markup. It is not defined for making associations between your content, a process called semantic enrichment. See the FAQ for more about the availability of semantic enrichment mechanisms in EPUB.
The epub:type
attribute can be attached to any element in the body of a document, and it
accepts any of the terms defined in the EPUB
Structural Semantics Vocabulary by default.
For example, the section containing the dedication for the work could be identified as follows:
<section epub:type="dedication">
…
</section>
The dedication
value used in the above example is not just a random
string, but is a predictable value that reading systems can expect to encounter across
publications.
Although, in theory, any semantic could be applied to any element, only certain semantics make sense
to use on any given tag. Marking an aside
element as a footnote is appropriate, for
example, but marking a section
as a footnote not so much. The Structural Semantics
Vocabulary lists the common element(s) each semantic is intended to be used in conjunction with to
facilitate this process (although exceptions to the rule may arise).
You are not limited to making only one statement in the epub:type
attribute, either. You
could, for example, explicitly note whether a dedication falls in the front or back matter by
including a second space-delimited semantic:
<section epub:type="dedication backmatter">
…
</section>
Note that the order of the semantics is not important to their processing. Including more than one semantic can affect styling, however. The following CSS rule for matching dedications:
section[epub|type='dedication'] {
…
}
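would not match the second example, only the first, because the attribute value is no longer
exactly dedication. When using attribute selectors in CSS,
you must account for space-separated values by using the ~=
notation. (The epub| prefix in these selectors assumes the style sheet declares the epub
namespace with an @namespace rule.) The following CSS
rule would match dedication
in both the preceding markup
examples: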
section[epub|type~='dedication'] {
…
}
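The difference between the two selectors can also be illustrated outside CSS. The following is a minimal Python sketch, not part of any EPUB toolchain; the function names matches_exact and matches_token are hypothetical and only mimic how the = and ~= operators compare an attribute value:

```python
def matches_exact(attr_value: str, term: str) -> bool:
    """Mimic [attr='term']: the whole attribute value must equal the term."""
    return attr_value == term


def matches_token(attr_value: str, term: str) -> bool:
    """Mimic [attr~='term']: the term must be one of the whitespace-separated tokens."""
    return term in attr_value.split()


single = "dedication"             # epub:type value from the first markup example
double = "dedication backmatter"  # epub:type value from the second markup example

print(matches_exact(single, "dedication"), matches_exact(double, "dedication"))
print(matches_token(single, "dedication"), matches_token(double, "dedication"))
```

Because the token test ignores position, it also matches when the semantics are listed in the opposite order.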
Once a semantic has been defined, the nature of the containing element influences all content defined
in it. For example, although the previous example attached the backmatter
semantic to the element containing the dedication, all the back matter
sections could be grouped into a parent backmatter
section as
follows:
<section epub:type="backmatter">
<section epub:type="dedication">
…
</section>
<section epub:type="index">
…
</section>
…
</section>
Since front/body/back matter is more of an ephemeral context in which content is used than an actual
section of content, a better approach is to include this information on the body
tag:
<body epub:type="backmatter">
<section epub:type="dedication">
…
</section>
<section epub:type="index">
…
</section>
…
</body>
When processing elements based on their semantics, applications typically will check the entire ancestor chain to determine the applicable relationships.
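One way an application might walk the ancestor chain is sketched below in Python with the standard xml.etree.ElementTree module. The function name inherited_semantics and the placeholder paragraph text are assumptions made for illustration; actual reading systems may resolve semantics differently.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"
# ElementTree exposes namespaced attributes as "{namespace-uri}localname".
TYPE_ATTR = "{http://www.idpf.org/2007/ops}type"


def inherited_semantics(root, target):
    """Collect epub:type tokens from target and each of its ancestors, nearest first."""
    # ElementTree keeps no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    semantics = []
    node = target
    while node is not None:
        semantics.extend(node.get(TYPE_ATTR, "").split())
        node = parent_of.get(node)
    return semantics


# Simplified version of the body example; the paragraph text is a placeholder.
doc = ET.fromstring(
    '<body xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:epub="http://www.idpf.org/2007/ops" epub:type="backmatter">'
    '<section epub:type="dedication"><p>Placeholder text.</p></section>'
    '<section epub:type="index"/>'
    '</body>'
)

dedication = doc.find(f"{XHTML}section")
print(inherited_semantics(doc, dedication))
```

The paragraph inside the dedication carries no epub:type of its own, yet the same walk reports both dedication and backmatter for it.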
epub namespace
When using the epub:type
attribute in a content document, the epub
namespace
must be declared on the element containing the attribute, or on one of its ancestors. The namespace
is typically declared once on the root html
element, as in the following example:
<html …
xmlns:epub="http://www.idpf.org/2007/ops">
…
<dl epub:type="glossary">
…
</dl>
…
</html>
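The effect of a missing declaration can be checked with any namespace-aware XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name read_epub_type is hypothetical, and returning None for unparseable markup is a choice made for this illustration only.

```python
import xml.etree.ElementTree as ET

EPUB_NS = "http://www.idpf.org/2007/ops"


def read_epub_type(xml_text):
    """Return the epub:type tokens on the root element, or None if parsing fails."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # An undeclared "epub" prefix makes the markup unparseable as XML.
        return None
    return root.get(f"{{{EPUB_NS}}}type", "").split()


declared = f'<dl xmlns:epub="{EPUB_NS}" epub:type="glossary"/>'
undeclared = '<dl epub:type="glossary"/>'

print(read_epub_type(declared))
print(read_epub_type(undeclared))
```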
The epub:type
attribute is not limited to values defined in the EPUB Structural Semantics Vocabulary. Additional
terms may be used, whether defined in an RDF vocabulary or not, so long as a unique prefix has been
defined in the epub:prefix
attribute for them. To use terms from the more expansive Z39.98 Structural Semantics
Vocabulary, for example, the prefix z3998
could be defined as follows:
<html …
xmlns:epub="http://www.idpf.org/2007/ops"
epub:prefix="z3998: http://www.daisy.org/z3998/2012/vocab/structure/#">
…
<section epub:type="frontmatter z3998:published-works">
…
</section>
…
</html>
The URI associated with a prefix is currently only a unique identifier string; it does not have to resolve to a document. It is recommended, however, that only terms from industry-standard vocabularies and controlled lists be used, since reading system support for arbitrary values is unlikely. That said, there is no reason to strip semantics that serve an internal workflow, for example.
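To show how a processor might treat prefixed terms, here is a minimal Python sketch. It assumes the epub:prefix value is a whitespace-separated series of "prefix: URI" pairs, as in the example above; the function names parse_prefixes and expand_term are hypothetical, and unprefixed terms are deliberately returned unchanged rather than mapped to a default vocabulary.

```python
def parse_prefixes(prefix_attr: str) -> dict:
    """Split an epub:prefix value into a {prefix: uri} mapping."""
    tokens = prefix_attr.split()
    mapping = {}
    # Tokens alternate between "name:" and the URI associated with it.
    for name, uri in zip(tokens[0::2], tokens[1::2]):
        if name.endswith(":"):
            mapping[name[:-1]] = uri
    return mapping


def expand_term(term: str, prefixes: dict):
    """Expand 'prefix:reference' to a full identifier; None if the prefix is unknown."""
    prefix, sep, reference = term.partition(":")
    if not sep:
        return term  # unprefixed term: left as-is in this sketch
    if prefix in prefixes:
        return prefixes[prefix] + reference
    return None


prefixes = parse_prefixes("z3998: http://www.daisy.org/z3998/2012/vocab/structure/#")
for term in "frontmatter z3998:published-works".split():
    print(term, "->", expand_term(term, prefixes))
```

Since the URI is only an identifier, the sketch concatenates strings and never attempts to fetch anything.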
Note
Although the EPUB specification reserves the option to define prefixes for industry-standard vocabularies and controlled lists, none are reserved for content documents at this time.
epub:type attribute
Support for both RDFa and microdata was added in the 3.0.1 revision, but these metadata frameworks
handle semantic enrichment (making the content itself more easily understandable and
processable). These technologies do not compete with the epub:type
attribute, but
complement it.
Creating meaningful class names for your CSS is certainly encouraged, but reading systems are
neither required nor expected to do anything with the class
attribute as far as
semantic processing goes.
Microformats, more generally, are not recommended: they blur the line between content authoring (and styling) and semantic inflection, and they appropriate elements and attributes for non-standard uses. This latter use creates problems for accessible processing and rendering of content.