IndexesCharterProposal  
Charter Proposal, Indexes
Updated Apr 6, 2012 by markus.g...@gmail.com

Charter Proposal - Indexes

Status of this proposal

This charter proposal has been accepted by vote of the IDPF membership as of 1/31/2012 and a Working Group formed. Please see the Indexes Working Group Main Page for current activity on this project. The final charter document is available at http://idpf.org/charters/2012/indexes/.

Note that the Definitions section contains definitions for terms as used in this document. The definitions are intended to apply narrowly, within the scope of this document, and should not be construed as applying to the field of indexing in general or to EPUBs as a whole.

Need for this proposal

Indexes are specialized navigational and supplemental information tools that offer readers an interaction with content that is enhanced, more powerful, and more specific than simple search. Users will expect to have indexes available in the EPUB3 ecosystem and accessible as easily as search. Publishers of EPUB3 content wish to make this data available to users, to allow them to explore book contents beyond what search results reveal.

Readers use indexes in a variety of ways: to quickly locate discussions in content, to discover relevant content that is discussed with differing synonyms, to discover new terminology for concepts, and to see details of topics covered in an eBook. Indexes convey a sense of the depth of topic coverage in an eBook, break down large concepts into important subcategories, and allow exploration of content through granular and user-friendly access points. Indexes provide the added value of human analysis, enabling an interactive conversation between the reader and the book. Indexers are not constrained to use as entries the terms used by the author, or even in some cases only the terms that appear in the entire document: indexers are focused on meanings, not just words. Indexes are also a pre-coordinate search system, whereas search tends to be post-coordinate.

Index information and metadata can be used by devices to provide navigation and supplemental search details to the reader. Search can be supplemented and fine-tuned by reading index metadata to provide better results. Index metadata can provide new views into the semantic underpinnings of an eBook.

This proposal describes the scope, required functionality, and timeline to deliver a standard for producing EPUB 3 publications that meet the use cases included in this proposal.

As a navigation feature, support for indexes relates directly to Item 6 in the EPUB Revision Working Group Charter, regarding enhanced navigation support (see here).

Main wiki page for the Working Group is here.

Scope

In-scope (Deliverables)

The scope of this project is to define a declarative mechanism for the representation of indexes in EPUB Publications. As further detailed in Use Cases, Needed Publication Properties, and Reading System Behaviors below, the delivered mechanism shall have the following top-level functional properties:

  • Allow users to read or browse an index in full chapter-like format
  • Allow users to quickly access index information in a search context
  • Allow users to see index entries associated with a range of text

Out of Scope

Indexers write indexes using a variety of tools, ranging from built-in modules in page layout and XML content management software to dedicated index preparation software. Details of how to implement indexing in those tools are out of scope.

Ordering of main headings and subheadings in the index is part of the creation process and thus out of scope.

Index display format in chapter form (e.g., indenting, spacing, etc.) can vary greatly, depending on the writer and publisher. Suggested presentation formats are out of scope.

Low-level, system-oriented functionality for fast lookup, reverse lookup, and retrieval, typically described in terms of a database-like file, is out of scope.

Integration Constraints

The defined mechanism shall integrate with EPUB 3 as follows:

  • Graceful fallback: it must allow EPUB 3 Reading Systems to open and reasonably render Publications containing the mechanism, even if the Reading System has not been updated to explicitly support the mechanism.
  • Native grammars and extension points: it must utilize EPUB 3 Content Document grammars to the maximum extent possible, and it must only use extension points defined within EPUB 3 and XHTML 5.
  • Shallow implementation: Reading System implementation of the mechanism must not require changes to underlying (browser-based or other) XHTML rendering engines; full implementations must be possible on the Reading System level alone.

Timeline and Participation

Project participation is open to IDPF members and invited experts. (Note that invited expert status needs to be renewed for each IDPF project.)

The project charter spans one year in total. Once formed, the working group will decide on feature prioritization and possibly also versioning strategies, after which the milestones below can be dated.

Milestone                                  Date
Draft Charter Proposal to WG for review    December 2, 2011
Submission to Membership for Approval      January 6, 2012
WG creation, formal project start          January 23, 2012
WG Face-to-face                            Feb timeframe TBD
First WG Draft                             TBD
Second WG Draft                            TBD
Proposed Specification                     TBD
Recommended Specification                  TBD
Maintenance/Tutorials                      Through Jan 2013

This project is intended to be run concurrently with the project on dictionaries and glossaries, and so shares the charter span with that project.

Working Group Leads

Suggested Leads of this working group are:

  • Michele Combs, American Society for Indexing (Co-Chair)
  • David Ream, American Society for Indexing (Co-Chair)
  • Jan Wright, American Society for Indexing (Co-Chair)

Use Cases

  1. Chapter-like index:
    • User navigates to chapter-like index to browse topics and find information.
    • User expands or collapses main headings/subheadings.
    • User selects special symbol, prefix or suffix (for example, asterisks or dagger symbols) to display meaning of symbol.
    • User hovers over index link to display contextual information (e.g., 3-4 words from each side of target location in text)
    • User clicks index links to navigate to the book's content.
    • User clicks cross reference to navigate to the target heading or to view a list of target headings.
  2. Pop-up index:
    • User selects a term or phrase and triggers a pop-up view of the index displaying the first matching main heading.
    • User opens index from book's content with nothing selected, displaying top of pop-up index or last-used position in index.
    • User browses for terms or enters search text in pop-up index display. Entering search text triggers stemming and auto-fill.
    • User expands or collapses main heading/subheading levels.
    • User hovers over index link to display contextual information (e.g., 3-4 words from each side of target location in text)
    • User clicks link in pop-up index display and returns to book content.
  3. Reverse index:
    • User selects a range of text and triggers pop-up list of all in-context index entries for the range.
    • User selects in-context index entry, and triggers access to selected entry in pop-up index.
  4. Standalone index:
    • User opens a publication that consists of one or more master indexes which contain links to other EPUBs.
    • User browses topics and finds information.
    • User expands or collapses main headings/subheadings.
    • User clicks index links to navigate to other EPUBs.
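
The hover behavior in use cases 1 and 2 (showing a few words from each side of the target location) can be illustrated with a minimal sketch. It assumes the target is known as a character offset into plain text; the function name `context_snippet` and its parameters are hypothetical and are not defined by this proposal.

```python
import re


def context_snippet(text: str, target_offset: int, words_each_side: int = 3) -> str:
    """Return the word at target_offset plus a few words on each side of it."""
    # Character spans of every whitespace-delimited word in the text.
    spans = [m.span() for m in re.finditer(r"\S+", text)]
    if not spans:
        return ""
    # First word that ends after the target offset; fall back to the last word.
    hit = next(
        (i for i, (_, end) in enumerate(spans) if end > target_offset),
        len(spans) - 1,
    )
    first = max(0, hit - words_each_side)
    last = min(len(spans) - 1, hit + words_each_side)
    return text[spans[first][0]:spans[last][1]]


sample = ("Homing pigeons were used widely across South America "
          "for carrying urgent messages")
snippet = context_snippet(sample, sample.index("South"))
```

A Reading System would more likely resolve the target from an element id than from a raw offset; the offset is used here only to keep the sketch self-contained.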

Needed Publication Properties

Package metadata

  • A publication contains one or several indexes as files or as sections, and declares these resources via package metadata.
  • A reverse index contains the same information as the chapter-like index but sorted in locator order rather than alphabetical order. The presence of a reverse index is declared via package metadata.
  • A publication contains one or more standalone indexes and declares these via package metadata.
  • An index contains group break navigation data, available for use in a floating or persistent navigation feature for the chapter-like index, and declares this resource via package metadata.
  • A publication contains a machine-discoverable index-symbols list (legend), and declares this resource via package metadata.
  • A publication containing semantic markup to display lists of related main headings for generic cross references declares this capability via package metadata.
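
As an illustration only, the sketch below shows how a Reading System might discover index resources if they were declared through the `properties` attribute of manifest items in the package document. This proposal does not define the declaration mechanism: the property token `index` and the function name `find_index_items` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"  # namespace of the EPUB package document
INDEX_PROPERTY = "index"                 # hypothetical token, not defined by this proposal


def find_index_items(opf_xml: str) -> list[str]:
    """Return the href of every manifest item flagged with INDEX_PROPERTY."""
    root = ET.fromstring(opf_xml)
    hrefs = []
    for item in root.iter(f"{{{OPF_NS}}}item"):
        # The properties attribute is a whitespace-separated list of tokens.
        if INDEX_PROPERTY in item.get("properties", "").split():
            hrefs.append(item.get("href"))
    return hrefs


SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="c1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="idx" href="index.xhtml" media-type="application/xhtml+xml" properties="index"/>
  </manifest>
</package>"""

index_hrefs = find_index_items(SAMPLE_OPF)
```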

Index links

  • Index links can identify single locations, multiple locations, or ranges for lengthy subject coverage in the publication's main content.
  • Group break navigation data provide links to symbol, number, or letter group breaks within an index.
  • Generic cross references can identify and display the related semantically-marked main headings as a list.
  • Index entries contain targets for the navigational system's links within the index; for instance, cross references will have targets, as will the group breaks for letter sections.
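
Group break navigation data could, for example, be derived from the main headings as follows. This is a rough sketch that assumes the headings are already in their final order (ordering is out of scope for this proposal) and that breaks fall into symbols, digits, and letters; the labels `Symbols` and `0-9` and the function names are hypothetical.

```python
def group_label(heading: str) -> str:
    """Classify a main heading into a symbol, number, or letter group."""
    first = heading.lstrip()[:1]
    if first.isdigit():
        return "0-9"
    if first.isalpha():
        return first.upper()
    return "Symbols"


def group_breaks(ordered_headings: list[str]) -> list[tuple[str, int]]:
    """Return (label, position of first heading) for each group break."""
    breaks: list[tuple[str, int]] = []
    for position, heading in enumerate(ordered_headings):
        label = group_label(heading)
        # A new break starts whenever the label differs from the previous one.
        if not breaks or breaks[-1][0] != label:
            breaks.append((label, position))
    return breaks


headings = [".NET", "3D printing", "401(k) plans", "Apples", "apricots", "Bananas"]
breaks = group_breaks(headings)
```

Each position identifies the heading that a group break link would target. Because `str.isalpha` and `str.upper` work on Unicode text, the same approach extends to non-Latin scripts that have letter case, although real group labels for other writing systems would need more care.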

Index presentation

  • Master indexes that index multiple volumes will include links to targets in other EPUBs.
  • Unique characters and numbers that act as group breaks for letter sections in the index are present, and marked in machine-discoverable form.
  • Headnotes, if present, are marked as such and are presented at the beginning of the chapter-like index.
  • In-line editor's notes, if present, are displayed in chapter-like and pop-up index.
  • Proper text alignment and indentation is maintained in the chapter-like index.
  • Special formatting of the index’s content (italics, bold, sub-, super-script, fonts, special characters) is preserved in the index’s content.
  • Decorations, prefixes or suffixes used in the index to annotate locators (daggers, ff, n, nn and so on), if present, are marked up as such and machine-discoverable.
  • A legend containing definitions or explanations of each decoration, prefix or suffix, if available, defines the description for each symbol.

Reading System Behaviors

Note: the intent of this project is not to mandate reading system behaviors. The list below only serves the purpose of illustrating Reading System/Index interactions.

Implied/assumed (existing functionality in EPUB readers that indexes will use)

  • Reading system will properly display text encoded with special formatting, i.e. bold, italic, subscript, superscript.
  • Reading system must be able to discover whether an EPUB contains one or more indexes.
  • Reading system must be able to discover whether an EPUB consists of one or more standalone indexes.
  • Reading system will properly display text encoded as a link, i.e. as text that can be hovered over or clicked to trigger an action (taking user to target, displaying contextual phrase, etc.)
  • Reading system includes buttons or menu options to access either the chapter-like index or the pop-up directly from the text, without having to visit the table of contents.
  • Reading system allows the reader to select collapsed or expanded views of the index levels (main headings only, main and subheadings, etc.).
  • Reading system determines how targeted location is displayed on the screen after its link has been clicked (top of screen, middle of screen, highlighted term, highlighted range of text, blinking symbol or indicator of location, etc.)
  • Reading system displays a legend for special symbols used in the index’s locator decorations in a pop-up.

Chapter-like index

  • Reading system displays chapter-like index as normal pages.
  • Reading system displays a floating or persistent set of group break navigational links in the chapter-like index to allow navigation to other sections of the index.
  • Reading system displays floating or persistent access to headnotes.
  • Reading system persistently displays applicable parent entry(ies) as user scrolls through lower-level entries, if applicable/necessary.

Pop-up index

  • Reading system displays pop-up index as separate window, automatically scrolled to the term selected when it was activated (or defaults to top of index if nothing was selected)
  • Reading system provides search functionality within popup index.
  • Reading system displays floating or persistent access to headnotes.
  • Reading system persistently displays applicable parent entry(ies) as user scrolls through lower-level entries, if applicable/necessary.

Reverse index

  • Reading system must be able to uniquely identify multiple index targets in a selected section of text (e.g., a paragraph).
  • Reading system must be able to extend that identification to include index targets whose range encompasses the selected text (e.g. a range that begins prior to the selected text and ends after the selected text)
  • Reading system must be able to locate the main headings in the index associated with each of those anchors.
  • Reading system must be able to display those main headings to the user.
  • Reading system must be able to render each main heading as a live link to the heading's location in the chapter-like index.
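
The lookup described in the list above can be sketched as an interval overlap test. The sketch assumes every index target is known as an inclusive range of character offsets (a single location has equal start and end); the class `IndexTarget` and the function `headings_for_selection` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexTarget:
    heading: str  # main heading the target belongs to
    start: int    # offset where coverage begins (inclusive)
    end: int      # offset where coverage ends (inclusive)


def headings_for_selection(targets: list[IndexTarget],
                           sel_start: int, sel_end: int) -> list[str]:
    """Return main headings whose targets fall in or span the selection."""
    # Two inclusive ranges overlap when each one starts before the other ends.
    # This also catches ranges that begin before and end after the selection.
    hits = [t for t in targets if t.start <= sel_end and t.end >= sel_start]
    hits.sort(key=lambda t: (t.start, t.end))  # locator order, not alphabetical
    headings: list[str] = []
    for target in hits:
        if target.heading not in headings:  # list each main heading once
            headings.append(target.heading)
    return headings


targets = [
    IndexTarget("pigeons", 100, 900),        # long range around the selection
    IndexTarget("homing", 400, 400),         # single location
    IndexTarget("South America", 450, 520),  # range starting inside the selection
    IndexTarget("doves", 950, 950),          # outside the selection
]
selected = headings_for_selection(targets, 380, 500)
```

A Reading System would then render each returned heading as a link into the chapter-like index.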

Standalone index

  • Reading system must be able to link from one EPUB to another, and have a return mechanism.

References

Definitions

This section contains definitions for terms as used in this document. The definitions are intended to apply narrowly, within the scope of this document, and should not be construed as applying to the field of indexing in general or to EPUBs as a whole.

Auto-fill
Auto-fill functionality pre-scrolls a pop-up index to main headings in the index matching the letters as they are typed in by the user.
Browsing
Reading/skimming index content.
Chapter-like index
An index presented in a book's content as a chapter, accessed from the table of contents and from special menus or icons. It can be paged through and browsed as normal content, with hyperlinks back into the book's content, and cross-reference hyperlinks to other areas of the index.
Cross reference
Entry in an index that directs the reader from one term to another term. An entry should be hyperlinked to the targeted term. There are three types: See references, See also references, and Generic cross references (defined below).
Decoration
A prefix, suffix, symbol or special formatting added to locators to indicate special content, such as tables, figures, or primary discussions.
Editor's note, inline
Editorial note that is part of an index entry, found inline after the main heading or subheadings.
Entry
A unit of an index, consisting of a main heading, zero or more subheadings, and at least one locator or cross reference.
Generic cross reference
Cross reference to a category of entries rather than a specific entry. For example, in a software manual: "Commands. See names of specific commands", or in a book on pets: "Dogs. See names of specific breeds, e.g. golden retriever".
Group break navigation data
A string of hyperlinked letters and/or digits (e.g., A-Z, 0-9) used to easily navigate to another section of the index: for example, clicking P would take the user to the section of the index beginning with P. Other alphabets and character systems would display the appropriate glyphs for any navigation data.
Headnote
Explanatory paragraph(s) at the head of the index that describe unique features of the index (e.g., special typography, scope of the index, omitted items, etc.) that the reader needs to know in order to effectively use the index.
Index
An intuitively sorted (usually alphabetical) list of entry terms providing a variety of different access points to all significant discussions of subjects, which might be concepts, entities, processes, individuals and organizations within a document, with associated locators indicating where these discussions are to be found.
Legend
A section of content that explains locator decorations, special symbols, or other typography for the user.
Level
Nested depth of subheadings beneath each main heading. A main heading is level 1; a subheading is level 2; a sub-subheading is level 3; and so on. There can be as many levels as the indexer and publisher feel necessary.
Locator
Pointer from an entry in the index to a significant treatment of the topic in the text, which may be a page number, section number, etc. In an EPUB the locator should appear as a hyperlink.
Main heading
Words, symbols, or phrases based on or selected from the book's content, expressing a concept, idea, or proper name. A main heading may or may not have subheadings, but must have one or more locators or a cross reference.
Master index
An index that covers more than one publication. A master index can be part of an EPUB with other content or part of a standalone index.
Package metadata
Data about the EPUB as a whole. Please see descriptions at package document and package metadata.
Pop-up index
Index view activated by user while in the text and displayed in a separate window.
Post-coordinate system
System in which the user enters one or more terms which are matched character-by-character in the target text. Search engines are an example of post-coordinate systems.
Pre-coordinate system
System in which co-relations (e.g., broader/narrower relations, semantic connections) between topics have been determined by human analysis, adding an enhanced level of sophistication and specificity. An index is an example of a pre-coordinate system.
Range
A locator that indicates a span of text, i.e., where coverage of a subject begins and ends.
Reverse index
Index view activated when the reader highlights a range of text, which displays in a separate window the index entries associated with the range.
See also reference
Cross reference that directs the reader to related, broader, or narrower subjects covered at other main headings.
See reference
Cross reference that directs the user from a term not used in the index to the preferred term in the index.
Standalone index
A publication that consists only of one or more indexes to other EPUBs or external targets.
Stemming
Stemming engines supply root forms of words and incorporate multiple versions (grow, growing, grows, growth) into search, extending the search's results.
Subheading
Second-level, third-level, fourth-level, etc. headings subordinate to a main heading.
Target
Unique id code located in book's content, available for links to use in navigation.
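
As an illustration of the Auto-fill definition in this section, the following minimal sketch finds the position a pop-up index could pre-scroll to as the user types. It assumes the main headings are already sorted case-insensitively; the function name `autofill_position` is hypothetical.

```python
import bisect
from typing import Optional


def autofill_position(sorted_headings: list[str], typed: str) -> Optional[int]:
    """Return the position of the first main heading that starts with `typed`.

    Assumes sorted_headings is sorted by its case-folded form.
    Returns None when no heading matches the typed letters.
    """
    keys = [heading.casefold() for heading in sorted_headings]
    prefix = typed.casefold()
    # Binary search for the first key that is not less than the prefix.
    pos = bisect.bisect_left(keys, prefix)
    if pos < len(keys) and keys[pos].startswith(prefix):
        return pos
    return None


main_headings = ["apples", "Bananas", "bandicoots", "Cherries"]
position = autofill_position(main_headings, "ban")
```

Here typing "ban" would pre-scroll to "Bananas". A production implementation would compare collation keys rather than case-folded strings.
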
Comment by bentraff...@gmail.com, Dec 6, 2011

This is a well-understood problem space. There should be a requirement that this work is informed by existing mechanisms defined in popular markup languages, such as the Text Encoding Initiative - see here for their work on back matter such as indices: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/DS.html#DSBACK

Comment by project member janwri...@gmail.com, Dec 6, 2011

From Steve Ingle: "1) It might be useful to mention the index's superiority not just to simple search but also to hyperlinked table of contents. Some editors have opined that the hyperlinked, detailed TOC, coupled with search, obviates the need for the index. So maybe in the first sentence of the "Need for this proposal" section, add "or hyperlinked table of contents" before the period.

2) The definition of index is given as "Alphabetical list of names..."; perhaps change to "A list, usually alphabetical, of names...", since some print indexes (and conceivably digital indexes as well) may not necessarily be alphabetical, but could be ordered numerically, chronologically, etc.

3) Add a definition for "breadcrumbs" to the "Definitions" section."

Comment by project member janwri...@gmail.com, Dec 6, 2011

From Mary Russell: "(1) Use cases / Chapter-like Indexes

a) there ought to be quick navigation tools to reduce scrolling down the index to get to the point you want.

b) there ought to be a mechanism to navigate back to the Index entry which was the starting point for navigating out to the text (or any intermediate point). This would save starting your indexing search again to get back to the same point in the index.

c) also there ought to be some indication of where they are in relation to where they came from (or some visual or textual map) and can jump

- example of a visual map would be the tree and branch path navigating out of the index into the text

- example of a textual map would be a text line showing the path, i.e. if a user clicks on index entry Pigeons, Homing, South America then a text line appears at the top or bottom of the page "Index: Pigeons: Homing: South America>"

(2) Generally

a) do indexes need any special nomenclature for URLs, video-clips etc?

To save jumping around you might have mixture of links in the index and wondered if you needed to note the difference in some way?"

Comment by project member janwri...@gmail.com, Dec 6, 2011

From me (as opposed to the comments above that are coming in from our indexing listserv): Could there be a mechanism that would allow an index to be added to a book later, or perhaps swapped out and replaced?

This might be feasible if there is an alternative place to look on the web, declared in package metadata, that updates the index or pulls in one if it is created later.

This is an interesting idea in light of Amazon's XRAY as well, in which books that didn't make the initial cut for having the sidefile are getting the feature added later. Readers are noticing that yesterday a book didn't have it, and then today it does.

I wonder if there would be a way to declare a URI that the eReader checks, and updates itself.

Comment by project member bkasd...@apexcovantage.com, Dec 7, 2011

At the risk of overcomplicating this (I think the separation of the index and dictionary work makes sense), would there be a benefit to providing the ability to link an index to a dictionary or, even better, a thesaurus, so that if a user didn't find a term being sought in an index, authoritative alternatives could be suggested?

Comment by project member eb2m...@gmail.com, Dec 9, 2011

Is this proposal intended to support authoring of indexes? For example, DocBook has indexterm elements, from which indexes can be generated. See Mastering DocBook Indexes

Or, is this proposal concerned about final EPUB documents, which are generated from other sources containing indexterm or something similar?

Does the proposed extension make it difficult to use HTML rendering engines as a basis for EPUB reading systems? Or, is everything intended to be represented by existing HTML5 constructs? If this is the case, why do we need this extension? Aren't in-house conventions by publishers good enough?

Comment by project member janwri...@gmail.com, Dec 10, 2011

The SI Publishing Technology Group would like to comment on the Charter Proposal – Indexes, on behalf of the Society of Indexers (UK).

First we want to congratulate DTTF on this valuable initiative and the rapid progress you have made in engaging with EPUB on behalf of the profession worldwide. We would like to suggest a few improvements in those few areas of the proposal whose terminology bears on what we see as the fundamental challenge facing human analytical indexing today: not of electronic delivery per se but the displacement of human indexes by mere listings of keyword and key phrase occurrences.

We hope DTTF would agree that, leaving aside later refinements like the provision of cross-references and subentries, the key contribution of human indexers is twofold: first, applying tests of significance and uniqueness to any instance of a discussion within a text and second, controlling the entry vocabulary. This latter might involve harmonising terminology choices (especially between multiple authors); choosing between groups of synonyms and near-synonyms; relating concepts within a hierarchy and biasing the language toward the anticipated readership. For example a technical manual may well describe processes only by their specific names but the user is likely to approach it in terms of a perceived problem, thinking of very different terms. If so, the key requirement for EPUB is that the indexer is not constrained to use as entries the terms used by the author, or even in some cases only terms that appear in the entire document: we index meanings, not words. Furthermore, the analysis of word occurrences – a concordance function – is a process that could be carried out faster and more efficiently by a computer; surely it is the application of a human reader’s judgment to the analysis of human discourse that adds value to the indexed document?

We are sure we also agree that without this careful exercise of judgement, reflecting the way authors express concepts and readers apprehend their meaning, keyword-based indexes are bloated by noise and false drops, while also failing to collocate instances of a discussion using even minor terminological variants. Keyword indexes waste the readers’ time by retrieving passing, negative, repetitive, indicative and figurative uses of terms, yet fail to include the same concepts described elliptically, allusively or indeed in any novel manner. Because human authors don’t repeat a predictable succession of identical key phrases, keyword indexes will always list too much, yet cannot provide confidence that everything relevant has been found.

This detail is not specifically spelled out in ‘Need for this proposal’ and, though the statement of need is commendably clear as far as it goes, we think that, in the definitions section, the entries for index and locators signally fail to make these necessary distinctions and run the risk of implying that an index is about keywords and that index entries will necessarily match terms in the text. We are sure this wasn’t DTTF’s intention. At present the first definition reads:

Index

Alphabetical list of names, subjects etc. that appear in a document, with associated locators showing the places where they occur.

We would suggest this seriously misrepresents what an index delivers and understates the value it adds. Might a better formulation be:

Index

An intuitively sorted (usually alphabetical) list of entry terms providing a variety of different access points to all significant discussions of subjects, which might be concepts, entities, processes, individuals and organisations within a document, with associated locators indicating where these discussions are to be found.

The second currently reads:

Locators

Pointers from an entry in the index to its occurrence in the text, such as page numbers, section numbers, etc. In an EPUB these locators should appear as hyperlinks.

Since the implication that entries must occur in the text is very misleading, we would suggest:

Locators

Pointers from an entry in the index to a significant treatment of the topic in the text, which may be page numbers, section numbers, etc. In an EPUB these locators should appear as hyperlinks.

Pre-coordination

The use of ‘sophistication’ is vague here; ‘specificity’ perhaps? Similarly in the Need for this proposal, ‘sophisticated’ might perhaps be replaced by ‘powerful’?

Standalone index

This may cause confusion. In the UK at least a standalone index is one supplied as a separate file, as opposed to an embedded index. We would suggest that, if it is retained, a clearer distinction needs to be drawn between this and the master index.

Under ‘In scope (Deliverables)’, we meet ‘semantic metadata’. Should this not appear in the definitions too?

Similarly, under Reading system behaviors, displaying breadcrumbs is listed. We think the utility of breadcrumb trails in a pop-up index needs to be explained and that ‘breadcrumbs’, as a relatively novel concept in indexing, should also be included in the definitions.

Under ‘Out of scope’, since indexes cannot be created by any tools without human intervention, we’d suggest a better opening sentence might be ‘Index-like listings can be created by a variety of tools…’

Floating access to headnotes is mentioned under ‘Reading system behaviors’ but, under ‘Needed publication properties’, headnotes are only required to be presented at the beginning of the chapter-like index.

Under ‘Reading system behaviors’, ‘highlighted term’ implies the necessary occurrence of an index term in the indexed text. This seems very undesirable for the reasons already stated. Highlighting a paragraph where the discussion begins or a section covered by a range indicator might be more generally applicable.

Under definitions, we would suggest that ‘generic cross-references’ are not a third category of cross-reference. They are simply a sub-class (usually of see reference, as generic see alsos would be of limited use). Indeed, under ‘Needed publication properties: index links’, the display of semantically-marked main headings could in some cases become completely unmanageable.

‘Entries’ is plural but the definition is of an entry.

A few general points:

We think the idea of a reverse index functionality opens up intriguing possibilities; a genuine expansion of the scope of an index permitted by electronic display.

We thought perhaps it might be helpful for the reader to be able to switch between browsing the alphabetical index and displaying terms related by cross-references or suggestions from thesauri, taxonomies or the master index. Some authorities differentiate between the low ‘specificity’ and high ‘sensitivity’ of keyword based indexes. Might such terms be helpful in our discussions?

While the ability to expand or collapse subentries when browsing chapter-like indexes is included, we believe a similar facility would be especially valuable if selecting less than a whole book or when accessing more than one related text. The expansion and collapse could even be triggered automatically by exceeding or dropping beneath a manageable number of unqualified entries (say six). We suspect this distinct area needs more investigation: unless indexing has been subject to tight vocabulary control, amalgamating indexes will result in a master index containing conflicting see references each burdened with locators left over from the text where they are preferred terms and potentially unmanageable numbers of occurrences, beside inconsistencies arising from conventional approaches to the metatopic. We believe it is our responsibility to point out to aggregators that indexes compiled for individual monographs are not designed for wider applicability, nor have authors been concerned to ensure terminological consistency, so master indexes should be implemented only after careful testing.

Bill Johncocks, Jan Ross, Maureen MacGlashan, Linda Sutherland, Ruth Ellis, Rebecca Linford

Publishing Technology Group (SI) 8th December 2011

Comment by del...@twcny.rr.com, Dec 10, 2011

From er eb2mmrt: Is this proposal intended to support authoring of indexes...or, is this proposal concerned about final EPUB documents?

The latter. How indexes are authored and the process by which they are rendered into EPUB is out of scope. However, by clearly defining the desired functionality of indexes in EPUBs (which may at some point include describing the EPUB/HTML5 encoding necessary to support same), publishers would, we hope, be encouraged to produce EPUB documents with indexes that contain that encoding, so that the end product is as useful to readers as possible.

We are concerned not only with how the index in an EPUB displays if you page to that section in the back of the book (what the Proposal calls the "chapter-like" index) but also with the larger question, "In what ways can/should we enable the reader, the body of the book, and the index to interact?"

Comment by project member eb2m...@gmail.com, Dec 11, 2011

Please refer to Unicode Technical Standard #10: UNICODE COLLATION ALGORITHM in the definition of sorting, and avoid the word A-Z.

http://unicode.org/reports/tr10/

In the case of Japanese, please mention parallel representations in Hiragana and Kanji: an index entry name in Kanji and a sort key in Hiragana. See "Parallel writing in East Asian languages and its representation in metadata in light of the DCMI Abstract Model" by Akira Miyazawa, available at:

http://dcpapers.dublincore.org/index.php/pubs/article/view/863
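
A minimal sketch of the parallel-representation idea, under stated assumptions: the `IndexEntry` class and its `heading` and `sort_key` fields are hypothetical names, not anything defined by EPUB or by this proposal. Each entry carries a displayed heading in Kanji and a separate Hiragana reading that is used only for ordering. Plain code point comparison stands in for the Unicode Collation Algorithm here; a production tool would presumably use a real UCA implementation with a Japanese tailoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    heading: str   # what the reader sees, e.g. in Kanji
    sort_key: str  # hypothetical parallel reading in Hiragana, used only for ordering


def sort_entries(entries):
    """Order entries by their reading rather than by the displayed heading.

    Code point comparison is a stand-in only; it is NOT the Unicode
    Collation Algorithm.
    """
    return sorted(entries, key=lambda entry: entry.sort_key)


entries = [
    IndexEntry(heading="東京", sort_key="とうきょう"),
    IndexEntry(heading="大阪", sort_key="おおさか"),
    IndexEntry(heading="京都", sort_key="きょうと"),
]

for entry in sort_entries(entries):
    print(entry.heading, entry.sort_key)
```

Sorting on the Kanji headings directly would give a different order from sorting on the readings, which is why the sort key needs to travel with the entry.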

Comment by project member eb2m...@gmail.com, Dec 11, 2011

Thank you for making clear that this proposal is concerned about final EPUB publications.

Please clearly indicate whether this proposal requires changes to browser components such as webkit. Functionalities beyond them would be nice but will endanger the synergy with the web world and the e-book world.

To support this proposal, EPUB reading systems will certainly need to be extended. But do components (e.g. webkit) need to be changed? In other words, are the requirements on such components the same as in 2.1.2 Reading System Conformance? Or do such components have to be extended?

Comment by project member eb2m...@gmail.com, Dec 11, 2011

When you use the word "Publication Property", are you talking about properties as defined in EPUB3 publications? http://idpf.org/epub/30/spec/epub30-publications.html#sec-property-datatype

Comment by project member eb2m...@gmail.com, Dec 11, 2011

The proposed schedule is not ambitious or aggressive. It is preposterous.

Comment by project member janwri...@gmail.com, Dec 12, 2011

@ebwmmrt: about "Does the proposed extension make it difficult to use HTML rendering engines as a basis for EPUB reading systems? Or, is everything intended to be represented by existing HTML5 constructs? If this is the case, why do we need this extension? Aren't in-house conventions by publishers good enough? "

We aren't proposing anything that HTML shouldn't be able to render. But we are proposing a file that eReading systems could script into an additional view of the index, one more conducive to search-like behavior. In-house conventions, if you mean the chapter-like indexes that publishers are including now, aren't good enough. They are, for the most part, missing or, if present, unlinked. If they are linked, they rarely link to a paragraph, so the reader is off by a screen and can't find the text. We are proposing to make them more accurate and easier to use. And we are proposing two expansions: a pop-up index and a reverse index view.

Comment by project member janwri...@gmail.com, Dec 12, 2011

@ebwmmrt: about "When you use the word "Publication Property", are you talking about properties as defined in EPUB3 publications? http://idpf.org/epub/30/spec/epub30-publications.html#sec-property-datatype"

No, we are using the word "property" in a very generic sense... as a term for the kinds of items an EPUB would have to have in it to provide the functionality we discuss.

Comment by project member janwri...@gmail.com, Dec 12, 2011

@bkasdorf: about "At the risk of overcomplicating this (I think the separation of the index and dictionary work makes sense), would there be a benefit to providing the ability to link an index to a dictionary or, even better, a thesaurus, so that if a user didn't find a term being sought in an index, authoritative alternatives could be suggested?" One of our interface prototypes actually suggests this very idea: that a glossary entry can be pulled in and displayed if available, and that the index entries appear below. If neither exists, Search would be invoked. If search found nothing, then the idea of hitting an outside resource could be incorporated, as in Amazon's XRay. I've been experimenting with XRay this week: I pulled out the sidefile, changed it, and found that I had created a link to someone who didn't exist on Wikipedia, which triggered a general search on Wikipedia for his name, with no results. That's not great for a user, but it was fun to change the file. We would have to allow the publisher to choose which resource they would like to go to, and designate that in metadata.

Comment by del...@twcny.rr.com, Dec 13, 2011

eb2mmrt asked, "Please clearly indicate whether this proposal requires changes to browser components such as webkit. Functionalities beyond them would be nice but will endanger the synergy with the web world and the e-book world."

I don't think we can say at this point. I think that will come out (if the Charter is approved and a working group authorized) as the Working Group begins examining the desired capabilities and considering what it would take to implement them. If a particular piece of functionality required such changes, then the decision would have to be made, "Does this functionality warrant endangering interoperability?" I assume that's the case with all other proposed changes to EPUB - that they are assessed based on their potential impact, right?

Michele Combs

Comment by del...@twcny.rr.com, Dec 13, 2011

Follow-up from Michele on changes to browser components etc: I would say we don't anticipate any such changes will be needed, and we would be willing to make "no changes" a prerequisite for our group's work.

Comment by project member janwri...@gmail.com, Dec 14, 2011

@Mary Russell: From Mary Russell: "(1) Use cases / Chapter-like Indexes a) there ought to be quick navigation tools to reduce scrolling down the index to get to the point you want. We are including A-Z navigational tools in the Chapter-like. For quick scrolling, a user would be better off choosing the pop-up index, that scrolls immediately down as you type into the field.

b) there ought to be a mechanism to navigate back to the Index entry which was the starting point for navigating out to the text (or any intermediate point). This would save starting your indexing search again to get back to the same point in the index. We have included this as a Reader specific suggestion.

c) also there ought to be some indication of where they are in relation to where they came from (or some visual or textual map) and can jump. We are addressing this by the breadcrumb-like persistent index entry. Maps and visual representations are beyond our scope for this iteration of the ePub functionality.

- example of a visual map would be the tree and branch path navigating out of the index into the text

- example of a textual map would be a text line showing the path, i.e. if a user clicks on index entry Pigeons, Homing, South America then a text line appears at the top or bottom of the page "Index: Pigeons: Homing: South America>"

(2) Generally

a) do indexes need any special nomenclature for URLs, video-clips etc? These can be indicated in the decorations of the locators if the publisher wishes to use special decorations for them.

To save jumping around you might have mixture of links in the index and wondered if you needed to note the difference in some way?" These again will be decorations.
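
The textual map in (1c) above could be sketched roughly as follows. This is an illustration only: the `breadcrumb` function and its parameters are hypothetical, and the output format simply mirrors the "Index: Pigeons: Homing: South America>" example given in the comment.

```python
def breadcrumb(path, prefix="Index", separator=": "):
    """Build a one-line textual map from an index entry path.

    `path` is the list of heading levels the reader clicked through,
    from main heading down to the deepest subentry. Empty levels are skipped.
    """
    parts = [prefix] + [level.strip() for level in path if level.strip()]
    return separator.join(parts) + ">"


print(breadcrumb(["Pigeons", "Homing", "South America"]))
# Index: Pigeons: Homing: South America>
```

A reading system could keep such a line visible after the reader follows a locator, which is one way to realise the persistent, breadcrumb-like index entry mentioned in the response.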

Comment by project member janwri...@gmail.com, Dec 14, 2011

re: pulling in replacement indexes or later material. This is out of scope, as it applies to epubs as a whole: replacement TOCs, replacement chapters, etc.

Comment by project member janwri...@gmail.com, Dec 14, 2011

@ bkasdorf "would there be a benefit to providing the ability to link an index to a dictionary or, even better, a thesaurus, so that if a user didn't find a term being sought in an index, authoritative alternatives could be suggested? " This is a great concept, but out of scope for this first go-round. Publishers could provide links between a proprietary dictionary or thesaurus shipped as part of an epub, but we don't want to get into linking to outside resources that could change, or define that linking structure at this point in the game.

Comment by project member markus.g...@gmail.com, Dec 14, 2011

Re schedule: the proposed charter length is one year (through 2012). I will clarify the prose around the milestones table to make clear that the WG once formed can rearrange milestones once feature prioritization has been completed.

The aggressive milestones that are in the table now should be read as a marker to remind us and the coming WG about the general consensus from the workshop that we should focus on producing an initial basic version of this feature, which would not aspire to address every use-case -- and then increase functionality with additional versions moving forward as and when this is deemed the right thing to do.

Comment by project member daver...@levtechinc.com, Dec 14, 2011

@bentrafford, Dec 6 and @eb2mmrt, Dec 9

"Is this proposal intended to support authoring of indexes? For example, DocBook has indexterm elements, from which indexes can be generated" and "There should be a requirement that this work is informed by existing mechanisms defined in popular markup languages, such as the Text Encoding Initiative"

Authoring and any other processes leading up to the generation of an EPUB file are considered out of scope. We are only defining how indexes should be presented in EPUBs to create a fuller integrated experience with the index metadata and the contents of an eBook. It is likely that new or upgraded existing tools will be necessary to accommodate the new functionalities that indexes in EPUBs could provide.

Comment by project member daver...@levtechinc.com, Dec 14, 2011

@eb2mmrt, Dec 11

The Unicode Collation Algorithm and any other sorting specifications that publishers require should be adhered to during the creation of the EPUB. It is not expected that any sorting will actually occur during the use of the index(es).

Comment by b...@sagehill.net, Dec 14, 2011

I'm wondering how an index term that has multiple target locations will be presented. That is, if there are ten pages containing a "cat" index term, how are the ten links for "cat" in the index displayed?

This is an issue the DocBook stylesheets have struggled with for HTML output. Currently each link displays the title of the section containing the target, but many people don't like that and it makes for a messy display. But there are no page numbers in HTML output.

While an epub browser can display page counts, that page count changes when you resize the type or the window, so an index display that shows clickable page numbers would have to be entirely recomputed after such a change.

You could have a row of ten icons after the index entry, perhaps little page icons numbered 1 through 10, for example.

Or perhaps clicking on an index term pops up a list of section titles to choose from, assuming there are section titles.

Using an entry with multiple links requires navigational aids to avoid user frustration. If the user cannot distinguish the ten links, then typically each link is followed until the right link is found. Being able to return to the index entry after following a link is very important in that context.

Comment by project member janwri...@gmail.com, Dec 15, 2011

From Mary Russell and the Australian and New Zealand Society of Indexers: ANZSI Council met today and discussed the EPUB Indexes Charter and the Dictionaries Charter proposal. We acknowledge and support all the work that has gone into putting these together. We had collected a few comments, but they have already been noted by the Society of Indexers in their detailed response.

The only additional thing that was raised was to do with master indexes and potential copyright implications. Master indexes of chapters from the same book, or books on the same subject from the same publisher, should not be an issue. The concern is with end users taking indexes on the same subject, but from different publishers, and creating a master index. In isolation each index should be fine; it is the forced amalgamation of different publishers’ indexes which might have copyright implications.

Council members all send good wishes to you.

Regards, Mary Australian and New Zealand Society of Indexers (ANZSI) President www.anzsi.org Life is easier with an index

Comment by del...@twcny.rr.com, Dec 15, 2011

bobs@sagehill.net asked: "I'm wondering how an index term that has multiple target locations will be presented. That is, if there are ten pages containing a "cat" index term, how are the ten links for "cat" in the index displayed?"

This is a good question, Bob. In terms of display, a link (in the index or elsewhere) is presented as whatever text is inside the <a> element. So if a publisher wants to use page numbers, they produce an index with <a>number?</a>; one could also use, as you say, an icon or image, or the ID numbers of the <indexterm> elements (that's what I use when generating DocBook indexes for proofing purposes), or whatever. Since this is determined during the authoring and/or production process, it's out of scope for our Charter.

However, your question did make us ponder whether there was some way to use the interactive capability of HTML/EPUBs to enable the reader to usefully distinguish between multiple links. What if, for example, when one hovers over an index link (whether page number or whatever) the reading system displayed contextual information about the target, a sort of "preview" of its location in the text? For example, maybe the 3-4 words on either side of the target? So for example, hovering over the first link to cat might display "...question. When choosing a cat many factors must be..." while the second might display "...types of pet foods. Cats are finicky eaters and..." This would give the reader some contextual information 'before' clicking on a link and leaving the index, which we think would be really valuable. So we've added some text about "hover" capability. Thanks!

Michele Combs
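
As a rough sketch of the "hover" preview described above, assuming the reading system already knows the character offset of a locator's target in the plain text: the `preview` function, its parameters, and the sample sentence are all hypothetical, and a real reading system would work on the rendered document rather than a bare string.

```python
import re


def preview(text, target_offset, words_each_side=4):
    """Return the target word with a few words of context on either side.

    `target_offset` is the character offset of the target within `text`.
    An ellipsis marks each side where the source text was cut off.
    """
    # Start and end offsets of every whitespace-delimited word.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    if not tokens:
        return ""
    # First word that ends after the offset, i.e. the word containing
    # or following the target; fall back to the last word.
    idx = next(
        (i for i, (_, end) in enumerate(tokens) if end > target_offset),
        len(tokens) - 1,
    )
    lo = max(0, idx - words_each_side)
    hi = min(len(tokens), idx + words_each_side + 1)
    snippet = text[tokens[lo][0]:tokens[hi - 1][1]]
    return ("..." if lo > 0 else "") + snippet + ("..." if hi < len(tokens) else "")


text = "That is a good question. When choosing a cat many factors must be weighed carefully."
print(preview(text, text.index("cat")))
# ...question. When choosing a cat many factors must be...
```

Counting whole words rather than characters keeps the preview from cutting a word in half, at the cost of variable snippet width.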

Comment by project member glendabr...@gmail.com, Dec 15, 2011

This is all excellent. I have one suggestion - that you add 'package metadata' to the glossary. While I can sort of guess the meaning, it is not a term with easily findable definitions on the web.

Comment by project member eb2m...@gmail.com, Dec 15, 2011

I asked, "Please clearly indicate whether this proposal requires changes to browser components such as webkit. Functionalities beyond them would be nice but will endanger the synergy with the web world and the e-book world."

Michele Combs wrote: "I don't think we can say at this point. I think that will come out (if the Charter is approved and a working group authorized) as the Working Group begins examining the desired capabilities and considering what it would take to implement them. If a particular piece of functionality required such changes, then the decision would have to be made, "Does this functionality warrant endangering interoperability?" I assume that's the case with all other proposed changes to EPUB - that they are assessed based on their potential impact, right?"

Then, please clearly indicate this point in the charter proposal so that IDPF members can have chances to explain their positions before they approve or disapprove the proposed WG. One approach is to join W3C to do required extensions as W3C specifications and extend EPUB accordingly. Note that the support of vertical writing and text-to-speech was introduced to EPUB3 in parallel to development at W3C.

Comment by del...@twcny.rr.com, Dec 16, 2011

Hi eb2mmrt - sorry for any confusion about my initial response (I was being excessively cautious!). In my followup we tried to clarify this, as follows: No, we do not anticipate that anything in this Charter would require any changes to webkit/browser components. We anticipate that anything we do would be supported by existing webkit/browser capability.

