Creating NIMAS Files
1. The NIMAS fileset
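A NIMAS fileset consists of the following: an XML content file, a package file (OPF), PDF-format pages of the title and copyright/ISBN pages, and the images from the print work.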
A NIMAS fileset is a set of source files that may be rendered into a variety of output formats, including student-ready versions such as audio books, Braille editions, etc. It is not a post-production product; it is a pre-production product. NIMAS files are intended for use by publishers, authorized entities, and others to produce accessible versions of printed instructional materials. They are not intended to be used as-is and should not be considered finished products. XML content files in NIMAS filesets must be conformant to the NIMAS 1.1 specification, a sub-set of the DAISY 2005-2 DTD. By definition, a NIMAS XML file will validate to the NIMAS 1.1 DTD.
A print work’s XML content file includes all of the content found in its print version, including frontmatter, such as tables of contents and prefaces; backmatter, such as epilogues and indices; front and back covers, if they include content that is not presented elsewhere; and bodymatter, such as chapter and section text, images, charts, tables, etc. Essential information about a print work should be included in its XML content file as well as in its OPF file so that the information is available whether or not a fileset’s OPF is actually used; such information typically includes a work’s title, author(s), publisher, copyright, and ISBN, and may include other similar items. A print work’s package file includes metadata about the content of the print work, including its images, as well as the metadata required by the NIMAC for submission to its repository. The required PDF-format pages give users of the fileset a way to check its accuracy and completeness and to verify the work itself. Images present in the print work must be included in its fileset (see below for additional information regarding images).
See the History and Core Technologies document for more information on the background of NIMAS and NIMAS files.
2. Creating a NIMAS-conformant XML content file
A NIMAS-conformant XML content file is an XML file that validates against the NIMAS DTD, meaning that the file is consistent with the requirements of the NIMAS DTD. A NIMAS-conformant XML file is also, necessarily, a well-formed XML document. Anyone new to NIMAS should begin by creating a well-formed XML file and continue from there to create a NIMAS file. Well-formed means that all of the XML is syntactically correct; it is analogous to writing that is grammatically correct. In the same way that correct writing does not have spelling errors, punctuation errors, syntax errors, and the like, well-formed XML does not have nesting errors, opening- or closing-tag errors, or syntax errors. A well-formed XML document is a basic, correct file; a NIMAS-conformant XML file is a correct file that uses the XML elements specific to the NIMAS DTD.
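As a rough illustration of the well-formedness test described above, the following Python sketch uses the standard library parser. The function name check_well_formed and the two sample strings are made up for this example. The parser checks well-formedness only; it does not validate against the NIMAS DTD.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The message includes the line and column where the parser gave up.
        return False, str(err)
    return True, None


# Hypothetical fragments: the second one closes its tags in the wrong order.
good = "<book><bodymatter><level1><p>Text</p></level1></bodymatter></book>"
bad = "<book><bodymatter><p>Text</bodymatter></p></book>"

print(check_well_formed(good))
print(check_well_formed(bad))
```

The second call reports a mismatched tag, which is the kind of nesting error an XML editor would point out.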
There are many XML editors available on the market, and it would probably be best for anyone new to XML to create files using one of these (although it is possible to create perfectly good XML with a simple text editor such as Notepad or TextPad). Examples of XML editors include Dreamweaver, XMLSpy, XMLmind, and <oXygen/>. These editors will ‘test’ an XML document for errors and let you know whether a document is well-formed and whether it validates to the DTD named in its document type (DOCTYPE) declaration. Many editors will also point out where, or approximately where, an error has occurred. The larger an XML document is, the more valuable this function is. The best way to create good XML files is simply to begin. The following tips may prove useful.
More information is available throughout this web site, in the NIMAS Technical Specification, and in the DAISY Structure Guidelines.
To create well-formed XML documents, keep the following in mind:
To create valid NIMAS-conformant XML files, keep the following in mind:
Since NIMAS files align with the DAISY standard, the following DAISY 2005-specific items should be kept in mind:
The following items are suggested best practices for NIMAS/DAISY XML files:
For more specific best practices recommendations based on NIMAS implementation and the NIMAS Technical Assistance Center technical support, see the NIMAS Files Best Practices document.
NIMAS 1.1 DTD Required Elements
Following is a list of elements that are required (taken from the NIMAS Technical Specification)—other elements may be required if applicable to content.
<dtbook>: The root element in a Digital Talking Book DTD. <dtbook> contains metadata in <head> and the contents itself in <book>.
<head>: Contains metainformation about the book but no actual content of the book itself, which is placed in <book>. This information is consonant with the <head> information in XHTML (see XHTML11STRICT). Other miscellaneous elements can occur before and after the required <title>. By convention, <title> should occur first.
<book>: Surrounds the actual content of the document, which is divided into <frontmatter>, <bodymatter>, and <rearmatter>. <head>, which contains metadata, precedes <book>.
<meta>: Indicates metadata about the book. It is an empty element that may appear repeatedly, but only in <head>. Metadata may appear in the OPF file instead.
<title>: Contains the title of the book but is used only as metainformation in <head>. Use <doctitle> within <frontmatter> for the actual book title, which will usually be the same.
NIMAS 1.1 DTD required attributes
Following is a list of attributes that are required if the noted elements are used.
version on <dtbook>
src and alt on <img>
<img src="./images/U01C04/p036-003.jpg" alt="photo of an apple"/>
type on <list>
render on <prodnote> and <sidebar>
id on <pagenum>, <note>, and <annotation>
idref on <noteref>, <annoref>
content on <meta>
dir on <bdo>
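The required-attribute list above can be turned into a quick pre-validation check. The sketch below is only an aid and not a substitute for validating against the DTD. The table is transcribed from the list; the function names and the sample document are hypothetical, and the sample is deliberately incomplete.

```python
import xml.etree.ElementTree as ET

# Required attributes, transcribed from the list above.
REQUIRED_ATTRIBUTES = {
    "dtbook": ["version"],
    "img": ["src", "alt"],
    "list": ["type"],
    "prodnote": ["render"],
    "sidebar": ["render"],
    "pagenum": ["id"],
    "note": ["id"],
    "annotation": ["id"],
    "noteref": ["idref"],
    "annoref": ["idref"],
    "meta": ["content"],
    "bdo": ["dir"],
}


def local_name(tag):
    """Drop a '{namespace}' prefix so the check works with or without xmlns."""
    return tag.rsplit("}", 1)[-1]


def missing_required_attributes(xml_text):
    """Return (element, attribute) pairs for every missing required attribute."""
    problems = []
    for element in ET.fromstring(xml_text).iter():
        name = local_name(element.tag)
        for attribute in REQUIRED_ATTRIBUTES.get(name, []):
            if attribute not in element.attrib:
                problems.append((name, attribute))
    return problems


# Hypothetical sample: <pagenum> lacks id and <img> lacks alt.
sample = """<dtbook version="2005-3">
  <head><meta name="dc:Title" content="Sample"/><title>Sample</title></head>
  <book><bodymatter><level1>
    <pagenum>36</pagenum>
    <img src="./images/U01C04/p036-003.jpg"/>
  </level1></bodymatter></book>
</dtbook>"""

print(missing_required_attributes(sample))
```

An empty alt="" placeholder counts as present in this check, since only the attribute itself is required.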
DTD and OPF references
While recognizing that the current NIMAS Technical Specification permits use of DAISY/NISO Z39.86 2005 (1) and later (-2 and -3) as the XML source file DTD reference, the NIMAS Center strongly urges the use of the most current DTD in NIMAS filesets. The most current DTD may be found here: http://www.daisy.org/daisyniso-standard-dtd-and-css-files.
The components of a fileset should be evaluated according to the NIMAS Technical Specification (currently v1.1), a sub-set of the DAISY ANSI/NISO Z39.86 standard. The relevant language regarding references and validation is as follows (the entire document is available at the AIM web site: http://aim.cast.org/experience/technologies/spec-v1_1):
"NIMAS-conformant content must be valid to the NIMAS 1.1 [see DAISY/NISO Z39.86 2005 or subsequent revisions]."
"Use of the most current standard is recommended."
"The package file is based on the Open eBook Publication Structure 1.2 package file specification (For most recent detail please see http://www.openebook.org/oebps/oebps1.2/download/oeb12-xhtml.htm#sec2.) A NIMAS package file must be a valid XML OeBPS 1.2 package file instance...."
As an extensible specification, the NIMAS Technical Specification was written to provide some flexibility as well as to state that updates and additions are and would be ongoing. The NIMAS is based upon ANSI/NISO Z39.86, a NISO standard. Because “all NISO standards undergo a review and maintenance cycle” (http://www.niso.org/standards/), the NIMAS must remain conformant to that standard as it is revised.
3. Validating a NIMAS-conformant XML content file
An XML file must validate to the NIMAS 1.1 DTD (derived from DAISY ANSI/NISO Z39.86) to be considered NIMAS-conformant. To validate a NIMAS XML file, add the following declaration to the top of the XML file:
<!DOCTYPE dtbook PUBLIC "-//NISO//DTD dtbook 2005-3//EN"
Once the XML file is completed, test it to see if it validates against the DTD. In an XML editor, use the validate function; otherwise, use an online validator:
A validator has been developed by the NIMAC repository contractor to ensure that NIMAS XML content files submitted to the national repository are valid, and a client version is available to qualified publishers so that files may be tested prior to submission to the NIMAC.
See the XML Editors and Validation section of the NIMAS site’s Content Development and Design page for more information.
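Before running a validator, it can help to confirm which DTD a file actually declares. The Python sketch below pulls the root element name and public identifier out of a DOCTYPE declaration with a regular expression. The function name is made up, and the system identifier "dtbook.dtd" in the first sample is a placeholder assumed for this example, not the official location of the DTD. The second sample uses the oebps 1.2 package declaration quoted in this document.

```python
import re

# Matches: <!DOCTYPE rootname PUBLIC "public identifier" ...
DOCTYPE_PATTERN = re.compile(r'<!DOCTYPE\s+(\w+)\s+PUBLIC\s+"([^"]+)"')


def doctype_public_id(xml_text):
    """Return (root element, public identifier), or None if no match is found."""
    match = DOCTYPE_PATTERN.search(xml_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


content_head = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE dtbook PUBLIC "-//NISO//DTD dtbook 2005-3//EN" "dtbook.dtd">'
)
package_head = (
    '<!DOCTYPE package PUBLIC "+//ISBN 0-9673008-1-9//DTD OEB 1.2 Package//EN" '
    '"http://openebook.org/dtds/oeb-1.2/oebpkg12.dtd">'
)

print(doctype_public_id(content_head))
print(doctype_public_id(package_head))
```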
In order to view a complete NIMAS file and confirm that images are placed in the appropriate sequence, change the file extension from .xml to .html and add a reference to a CSS file within <head>, such as <link rel="stylesheet" type="text/css" href="filename.css"/>. Close the document, and open it in a browser. Such an extension-only version should be used only as a visual representation of the XML file and as an aid to production: it is not a true HTML file, is not a student-ready version, and should not be used in place of either. Any HTML version of a NIMAS fileset intended for use by students or in other post-production capacities should be transformed with appropriate code to HTML or XHTML (software and XSLT transformations for doing this are freely available from the DAISY Consortium).
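The preview step can also be scripted. The following Python sketch inserts the stylesheet link after the opening <head> tag of the XML text. It assumes that <head> is written without attributes; the function name is hypothetical, and "filename.css" is the placeholder name used in the paragraph above.

```python
def add_stylesheet_link(xml_text, css_href="filename.css"):
    """Return xml_text with a stylesheet link added just inside <head>."""
    if "<head>" not in xml_text:
        raise ValueError("no <head> tag without attributes was found")
    link = '<link rel="stylesheet" type="text/css" href="%s"/>' % css_href
    # Replace only the first <head>, leaving the rest of the file unchanged.
    return xml_text.replace("<head>", "<head>\n" + link, 1)


# Hypothetical, heavily shortened content file.
source = "<dtbook><head><title>Sample</title></head><book/></dtbook>"
print(add_stylesheet_link(source))
```

Saving the returned text under a new name ending in .html leaves the original XML content file untouched, so the preview copy cannot be mistaken for the source file.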
4. Character Encoding (UTF-8 and UTF-16)
The Technical Assistance Center has received queries regarding encoding, as UTF-8 and UTF-16 are strongly recommended but not absolutely required by the technical specification. To clarify: UTF-8 and UTF-16 are the only encodings that applications processing NIMAS files are required to support. If another encoding is used, it is quite possible that the file will be corrupted when processed by, for example, a conversion tool or player. This corruption would most likely take the form of special characters (quotes, accented letters, etc.) being rendered incorrectly; more careful applications may refuse to process the file and return an 'unknown encoding' error. A program expecting UTF-8 but finding another encoding will render incorrectly if the file has any characters beyond the traditional ASCII range. There are several free software packages that convert character sets, and the DAISY Pipeline also has a "charset switcher" filter. In short, while UTF-8 is not strictly required, using an encoding other than UTF-8 (or UTF-16) is not a good idea.
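A small Python sketch of the encoding issue follows. It checks whether a file's bytes are valid UTF-8 and re-encodes them from a known source encoding. The function names are made up for this example, and the sample bytes stand in for the contents of a file.

```python
def decode_as_utf8(raw_bytes):
    """Return the decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return None


def convert_to_utf8(raw_bytes, source_encoding):
    """Re-encode bytes from a known source encoding as UTF-8."""
    return raw_bytes.decode(source_encoding).encode("utf-8")


# An accented letter saved as Latin-1 is not valid UTF-8.
latin1_bytes = "café".encode("latin-1")
print(decode_as_utf8(latin1_bytes))

# After conversion, the same text decodes cleanly.
fixed = convert_to_utf8(latin1_bytes, "latin-1")
print(decode_as_utf8(fixed))
```

Text that stays within the ASCII range decodes the same way under either encoding, which is why this problem usually surfaces only with quotes, accented letters, and similar characters. After converting a real file, the encoding named in its XML declaration should be updated to match.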
5. Preparing images for a NIMAS fileset
The Technology Working Group of the NIMAS Development Committee has recommended the following:
Organization of images
To simplify the organization of images and to handle them efficiently in NIMAS files, the following is recommended:
All images should be saved in an images folder:
Within this parent folder, images should be saved as follows:
The zeroes allow for sequential ordering.
– = bodymatter and rearmatter
Naming here provides information about image location to unit and chapter level.
Naming here provides for the fact that many works contain images that recur throughout content at all levels.
An icon that occurs several times in frontmatter and twice per chapter in bodymatter:
A folder of images occurring in the middle of a work:*
An image that occurs in the frontmatter of a work:
*Note that “U02C08” subsequent to “U01C01” in the second example above illustrates the fact that not all pages/chapters/units of all works will have images. The example demonstrates a numerical sequence where not every step in a possible sequence is necessary.
Go to CAST’s NIMAS Exemplars page to see examples of NIMAS-conformant files that include proper handling of images within a NIMAS-conformant fileset.
Value-added components: mark-up of images using previously-published (print) materials
When marking up images from a previously-published, print-based source, it is recommended that images be named according to their placement within original source content. Create filenames based on an image’s page location and sequential order. Example:
An image with this filename appears on page 4 of the original work and is the 2nd image on that page.
Do not use spaces in image filenames.
Choose an image’s sequential order number according to its position: name image files from top to bottom and from left to right. Example:
Occasionally it will not be entirely obvious which layout position an image holds. In such cases, simply choose a logical sequence number and make a note of it for production.
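The page-and-sequence naming rule can be sketched in Python as follows. The three-digit zero padding and the "p" prefix are inferred from filenames that appear elsewhere in this document (for example, p036-003.jpg) and are treated here as an assumption; the function names are hypothetical.

```python
import re

# Assumed pattern: "p" + 3-digit page + "-" + 3-digit sequence + extension.
FILENAME_PATTERN = re.compile(r"^p(\d{3})-(\d{3})\.(jpg|png|svg)$")


def image_filename(page, sequence, extension="jpg"):
    """Build a filename from a page number and the image's order on the page."""
    return "p%03d-%03d.%s" % (page, sequence, extension)


def parse_image_filename(name):
    """Return (page, sequence), or None if the name does not follow the pattern."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(image_filename(4, 2))
print(parse_image_filename("p036-003.jpg"))
print(parse_image_filename("page 4 image 2.jpg"))
```

The first call yields p004-002.jpg for the second image on page 4. The last call returns None because the name contains spaces and does not follow the pattern.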
Value-added components: writing alternative text and long descriptions
An alt text placeholder for images is required (example: <img alt=""/>), and publishers are strongly encouraged to include actual alt text and, when appropriate, long descriptions. The alt attribute is part of the <img> element, and its text is placed in standard attribute format (example: <img alt="text"/>). The <prodnote> element is used for image long descriptions in the NIMAS 1.1 DTD (example: <prodnote>Text of long description.</prodnote>). Alt text and long descriptions are especially important for math equations that are provided as images as a temporary measure, since these will not be accessible to screen readers and other devices without an alternate representation.
The following is taken from Editorial Process Guidelines for Creating Accessible Digital Textbooks (CAST, Inc., 2004).
Writing for Accessibility: Alt Tags and Long Descriptions
An alt tag is a brief description of an image. The “alt” in alt tag stands for “alternative” and an alt tag is alternative text—another option to the image. Alt tags should state the type of image and a brief summary of the image. They should not have any unnecessary text. Alt tag text should be approximately four to ten words long. Alt tag text is designed to be brief. The point is to capture the function of the graphic and to express it in terms that make sense.
Every image has an alt tag associated with it. An alt tag must appear for every purposeful image. The alt tags appear on screen with mouse-over, or when the mouse is moved over the image. Assistive technology such as a voice-output screen-reader will not “read” an image but will read the alt tag instead. Text-only browsers display alt tags over the image placeholder.
A long description is a detailed description of an image that supports or adds meaning to the text. Long descriptions are context-specific. The details given depend on how the image supports or supplements the text.
Their purpose is to provide content information conveyed by the image so that students who are unable to “read” the image, for whatever reason, still have access to the information relevant to instruction that is conveyed by the image.
Long descriptions are provided whenever an alt tag is not sufficient to convey the content of an image. Long descriptions should be written for each image (map, timeline, picture, chart, graph, photo, etc.) that supports the text or gives additional or new information needed to understand content or topic. A long description should be included whenever an alt tag cannot provide sufficient information about the object and its purpose for inclusion. Remember that long descriptions vary according to learning goals. Try to create a balance between brevity and sufficient information so that every learner can access key content.
Note: There is no limit to the length of a long description; LDs should be as long as necessary to convey image information.
Images and mark-up of math content
A standard, problem-free way to treat mathematical content in NIMAS mark-up has not yet been determined. The XML specification MathML has been formally accepted by the DAISY Consortium as a modular extension for mathematical content. MathML is therefore a part of the optional element set of the NIMAS specification. Producers are encouraged to begin using MathML where feasible. As an interim solution, where the use of MathML is not yet possible, math equations and other symbolic content should be presented as images with alternative text and long descriptions. When creating math content images, it is crucial to distinguish between content that is math content per se, such as in-line equations, and images that merely happen to contain math content, such as graphs. The suggested best practice for making this distinction is to include the notation EQ in the filenames of images that represent true math content. Images such as charts and illustrations that contain or are about math should be named as any other image would be. Examples are as follows:
Filename of an image of an in-line equation: EQp212-004.jpg
Filename of a pie chart: p010-002.jpg
Filename of a symbol presented within text: EQp005-003.jpg
Filename of an icon used throughout a math textbook: staricon.jpg
Filename of an equation that recurs throughout a unit: EQdifferential2.jpg
A text description of EQ images must be provided, either as alt tag placeholders or alt tag text, and/or as long description text. Coding in-line math content as images temporarily will make it accessible, because text descriptions will be recognized and read by text-to-speech software/readers.
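The EQ convention makes it straightforward to pick out math images by filename. The Python sketch below sorts the example filenames given above into math and non-math images; the function name is made up for this illustration.

```python
def is_math_image(filename):
    """True if the filename carries the EQ notation for true math content."""
    return filename.startswith("EQ")


# The example filenames listed above.
filenames = [
    "EQp212-004.jpg",
    "p010-002.jpg",
    "EQp005-003.jpg",
    "staricon.jpg",
    "EQdifferential2.jpg",
]

math_images = [name for name in filenames if is_math_image(name)]
other_images = [name for name in filenames if not is_math_image(name)]

print(math_images)
print(other_images)
```

A list like math_images could then be reviewed to confirm that each EQ image has its text description.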
See the Math resources page for more information.
6. Preparing a package file (OPF)
Package files created for use with NIMAS XML files must conform to the oebps 1.2 standard (the Open eBook Publication Structure Specification Version 1.2), i.e., validate to this specification. The following information about the oebps 1.2 Specification is based on information made available to the public by the International Digital Publishing Forum. NIMAS package files should validate to this oebps standard, i.e., these files are NIMAS OPF files, not DAISY OPF files. Additional enhancements will be necessary for a file to validate as a DAISY OPF.
Currently, NIMAS 1.1 includes two metadata elements that were created for future use but turned out not to be needed in practice and that are intended to be phased out as the technical specification is updated. However, these two items are part of NIMAS 1.1 and are therefore required. For all OPF files conforming to NIMAS 1.1, these metadata elements should be included yet left blank:
<meta name="nimas-SourceEdition" content=""/>
<meta name="nimas-SourceDate" content=""/>
NIMAS filesets must also meet metadata submission requirements for the National Instructional Materials Access Center (NIMAC), the national repository of NIMAS filesets. The NIMAC provides a comprehensive sample, instructions, and details at their NIMAC Metadata web site page. A sample NIMAC-valid OPF file is available through a link from this page, and may also be downloaded from the NIMAS site’s Exemplars page (exemplar 9). All of the NIMAS exemplar filesets include OPF files that are valid to the NIMAS specification and to NIMAC submission requirements.
The current oebps 1.2 declaration for an OPF file is as follows:
<!DOCTYPE package PUBLIC "+//ISBN 0-9673008-1-9//DTD OEB 1.2 Package//EN" "http://openebook.org/dtds/oeb-1.2/oebpkg12.dtd">
To create valid package files, keep the following in mind:
From the oebps specification publication:
The major parts of the OEBPS Package file are:
Package Identity: A unique identifier for the OEBPS Publication as a whole.
Metadata: Publication metadata (title, author, publisher, etc.).
Manifest: A list of files (documents, images, style sheets, etc.) that make up the publication. The manifest also includes fallback declarations for files of types not supported by this specification.
Spine: An arrangement of documents providing a linear reading order.
Tours: A set of alternate reading sequences through the publication, such as selective views for various reading purposes, reader expertise levels, etc.
Guide: A set of references to fundamental structural features of the publication, such as table of contents, foreword, bibliography, etc.
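To show how the manifest and spine relate, the following Python sketch reads a package file and resolves the spine's reading order through the manifest. The sample package is a hypothetical, stripped-down fragment and is not a valid OPF file (its metadata is empty and it has no DOCTYPE); the function names are made up for this example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop a '{namespace}' prefix so the lookup works with or without xmlns."""
    return tag.rsplit("}", 1)[-1]


def reading_order(opf_text):
    """Return the hrefs of the spine documents in linear reading order."""
    hrefs_by_id = {}
    spine_ids = []
    for element in ET.fromstring(opf_text).iter():
        name = local_name(element.tag)
        if name == "item":        # manifest entry: id -> file
            hrefs_by_id[element.get("id")] = element.get("href")
        elif name == "itemref":   # spine entry: points at a manifest id
            spine_ids.append(element.get("idref"))
    return [hrefs_by_id[item_id] for item_id in spine_ids]


# Hypothetical fragment for illustration only.
sample_opf = """<package unique-identifier="uid">
  <metadata/>
  <manifest>
    <item id="xmlexemplar" href="contentfilename.xml" media-type="application/x-dtbook+xml"/>
    <item id="copyrightpagepdf" href="copyrightpage.pdf" media-type="application/pdf"/>
  </manifest>
  <spine><itemref idref="xmlexemplar"/></spine>
</package>"""

print(reading_order(sample_opf))
```

A spine entry whose idref has no matching manifest item raises a KeyError here, which makes that kind of mismatch easy to spot.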
MIME (Multipurpose Internet Mail Extension)
MIME types are standard format extensions used to support the attaching of non-text files to standard Internet mail messages. Non-text files include graphics, spreadsheets, formatted word-processor documents, and sound files. The MIME standard specifies the type of file being sent and the method that should be used to turn it back into its original form.
Manifest list items in OPF files should include standard MIME types in order to be fully accurate. For example,
<item id="idname" href="filename.xml" media-type="text/xml"/> where text in media-type="text/xml" is a standard MIME type and xml in media-type="text/xml" is a standard sub-type. Another example is <item id="idname" href="path/path/filename.jpeg" media-type="image/jpeg"/> where image in media-type="image/jpeg" is a standard MIME type and jpeg in media-type="image/jpeg" is a standard sub-type.
The main types are application, audio, example, image, message, model, multipart, text, and video. Sub-types are extensive for each type, but, for example, the most common ones for image are gif, jpeg, png, and tiff; and for text are css, html, plain, richtext, and xml. Lists of types are available at http://www.iana.org/assignments/media-types/. Other MIME information is readily available over the Internet: type “MIME types” into a search engine.
NIMAS filesets and MIME types
MIME types for use with NIMAS filesets reflect the NIMAS’ alignment with DAISY and follow the guidelines of Z39.86-2005, section 3.3 (http://www.daisy.org/z3986/2005/z3986-2005.html#Manifest). They are as follows:
XML content files: media-type="application/x-dtbook+xml"
A full reference/OPF manifest item would read as follows:
<item id="xmlexemplar" href="contentfilename.xml" media-type="application/x-dtbook+xml"/>
PDF files: media-type="application/pdf"
A full reference/OPF manifest item would read as follows:
<item id="copyrightpagepdf" href="copyrightpage.pdf" media-type="application/pdf"/>
images: media-type="image/jpeg" or media-type="image/svg+xml" or media-type="image/png"
A full reference/OPF manifest item would read as follows:
<item id="imgarrow" href="images/U00C00/arrow.jpg" media-type="image/jpeg"/>
<item id="imgarrow" href="images/U00C00/arrow.svg" media-type="image/svg+xml"/>
<item id="imgarrow" href="images/U00C00/arrow.png" media-type="image/png"/>
NOTE: The DAISY DTD on which the NIMAS technical specification is based does not include all possible MIME media types and sub-types because official types are not a static, permanent list but evolve through the coordination of the Internet Assigned Numbers Authority (IANA [http://www.iana.org]).
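The media types listed above can be applied mechanically when building manifest items. The Python sketch below uses an explicit lookup table taken from that list. The function name is hypothetical, and mapping every .xml file to the DTBook type assumes that the .xml file in question is the NIMAS content file.

```python
import posixpath
from xml.sax.saxutils import quoteattr

# Extension -> media type, taken from the list above.
NIMAS_MEDIA_TYPES = {
    ".xml": "application/x-dtbook+xml",  # assumes the XML content file
    ".pdf": "application/pdf",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".svg": "image/svg+xml",
    ".png": "image/png",
}


def manifest_item(item_id, href):
    """Build an OPF manifest <item> for a file in the fileset."""
    extension = posixpath.splitext(href)[1].lower()
    media_type = NIMAS_MEDIA_TYPES[extension]  # KeyError for unlisted types
    return "<item id=%s href=%s media-type=%s/>" % (
        quoteattr(item_id), quoteattr(href), quoteattr(media_type))


print(manifest_item("xmlexemplar", "contentfilename.xml"))
print(manifest_item("imgarrow", "images/U00C00/arrow.jpg"))
```

An explicit table is used instead of a general-purpose MIME lookup because a generic lookup would report an ordinary XML type for the content file, not application/x-dtbook+xml.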
NIMAC and metadata
The NIMAC is the national repository for NIMAS files, managed by the American Printing House for the Blind (APH). There are additional metadata requirements for OPF files submitted to the NIMAC beyond those required for NIMAS OPF files. Those differences facilitate the storage, management, and retrieval of NIMAS files. The NIMAC web site has its own metadata page, and a NIMAC sample OPF file that includes both NIMAC and NIMAS metadata requirements is available there (full version with history and comments) and on CAST’s NIMAS Exemplars page (Word and XML versions; separate history and comments document). As noted above, all of the NIMAS exemplar filesets include OPF files that are valid to both the NIMAS specification and NIMAC submission requirements.
For quality control purposes, the NIMAC needs to receive from publishers a PDF of both the title page and the copyright/ISBN page for the book. Because the NIMAC staff has found that the XML file for a work may not always include all series, state edition, ISBN, or other essential metadata, the PDF helps ensure that the metadata submitted in the OPF is accurate, complete, and describes the actual file being submitted. Since the NIMAC does not have access to any print copies of works, these PDF pages provide the best, and in some cases the only, way to compare key metadata provided with the print version.
An important point that has come up regarding fileset submission to the NIMAC concerns discrete item identification information. In rare instances, a work submitted to the NIMAC may use a UPC code for unique identification rather than an ISBN. In these cases (and only these cases) where UPC information is the only source of unique item identification, UPC information should be contained in the <dc:Identifier> element. The format for this usage is “UPC23443562NIMAS” where “UPC” prefaces the UPC numeric code itself and “NIMAS” follows, without punctuation or spaces. Note that the <dc:Source> element is for ISBN information only and must not be used for these exceptions.
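The UPC format described above can be produced with a very small helper. This Python sketch is illustrative only: the function name is made up, and the digits-only check is an assumption about what counts as a UPC numeric code.

```python
def upc_identifier(upc_digits):
    """Format a UPC numeric code for use in <dc:Identifier>."""
    if not upc_digits.isdigit():
        raise ValueError("UPC code must contain digits only: %r" % upc_digits)
    # "UPC" before the code and "NIMAS" after it, with no punctuation or spaces.
    return "UPC" + upc_digits + "NIMAS"


print(upc_identifier("23443562"))
```

The call above prints UPC23443562NIMAS, matching the format given in the text.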
7. PDF pages
In NIMAS fileset submission, the NIMAC needs to receive PDF pages of both title and copyright/ISBN page(s) for a book. If copyright/ISBN (unique identification information) appears elsewhere, those page(s) must also be included.
Because the NIMAC has found that submitted XML (source) files are not always complete and may omit essential metadata such as ISBN information, these noted PDF pages help ensure that submitted files can be completed, are accurate, and describe submitted filesets. Since the NIMAC does not have access to print copies of books, these PDF pages provide the best, and, in some cases, only, way to compare provided metadata with a print version.
Go to CAST’s NIMAS Exemplars page to see examples of NIMAS-conformant files that include appropriate package files within a NIMAS-conformant file set.
Go directly to the oebps 1.2 specification at the <OeB> Open eBook Forum.
Opening the eBook, by Didier Martin. (2000.) XML.com. This article explains the oebps standard using IDPF content in a more readable style than the spec itself. Note the standard has since been updated from 1.0 to 1.2.
Changes were made to this document as of March 8, 2007:
Changes were made to this document as of August 20, 2007:
Changes were made to this document as of December 6, 2007:
Changes were made to this document as of January 4, 2008:
Changes were made to this document as of March 25, 2008:
Changes were made to this document as of May 29, 2008:
Changes were made to this document as of October 15, 2008:
Changes were made to this document as of May 5, 2009:
Changes were made to this document as of June 1, 2010:
Changes were made to this document as of October 8, 2010:
 Mark-up for math content—equations, symbols, and the like—should now be created using MathML where feasible. Please see the Images and mark-up of math content section for more information.
 _____. OEBPS 1.2 Specification. IDPF, http://www.idpf.org/oebps/oebps1.2/download/oeb12.doc#_Toc14771680