General Information

OASIS Standards

The most comprehensive information on the ODF format is found in the OASIS Standards, freely available on-line.  The OASIS Standard is ODF 1.2, approved on 2011-09-29 and in the process of being promulgated as ISO/IEC International Standard IS 26300:2015.  The major implementations of ODF 1.2 are the descendants of OpenOffice.org 3.x (especially current versions of LibreOffice and Apache OpenOffice) and Microsoft Office 2013 Word, Excel, and PowerPoint.  There are also degrees of support in Google Docs and in the Microsoft Office web applications and emerging device applications.

The current specifications can be downloaded from OASIS at <http://docs.oasis-open.org/office/v1.2/os/>.  If you are going to be working with the specification much, it is convenient to download the file OpenDocument-v1.2-os.zip, which contains all of the specifications in their PDF, ODT, and HTML forms along with the various schema files.  The Part 1 document has the important details.  Part 2 is about OpenFormula only.  Part 3 is about the use of Zip for packaging ODF documents as multi-part collections.  The unnumbered Part provides a combined Table of Contents for the 3 parts and also has some conformance statements that are important overall.

For ODF 1.2, the schemas are not included in the text.  The schemas are important companions: they are how one determines when an element or attribute occurrence is optional and whether one XML feature occurrence has pre-requisites on another.  Versions of the schemas that are searchable and navigable as hypertexts have been placed at <http://nfoworks.org/notes/2014/05/n140504d.htm> (OpenDocument-v1.2-manifest-schema.rng) and <http://nfoworks.org/notes/2014/05/n140504f.htm> (OpenDocument-v1.2-schema.rng).
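As a rough illustration of reading optionality out of a Relax NG schema, the sketch below sorts attributes into required and optional according to whether they sit inside an `<optional>` pattern.  The schema fragment and the `ex:` names are invented for the example and are not taken from OpenDocument-v1.2-schema.rng.  The function `classify_attributes` is likewise hypothetical, and it ignores `<choice>`, `<zeroOrMore>`, and `<ref>` indirection, all of which a real reading of the ODF schema would have to follow.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"

# Illustrative Relax NG fragment only; NOT copied from the ODF 1.2 schema.
SAMPLE_RNG = """
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:ex="urn:example:not-odf">
  <define name="example-element">
    <element name="ex:item">
      <attribute name="ex:id"/>
      <optional>
        <attribute name="ex:label"/>
      </optional>
    </element>
  </define>
</grammar>
"""


def classify_attributes(rng_text):
    """Split attribute names by whether they appear under <optional>."""
    root = ET.fromstring(rng_text)
    optional = set()
    for opt in root.iter(f"{{{RNG_NS}}}optional"):
        for attr in opt.iter(f"{{{RNG_NS}}}attribute"):
            optional.add(attr.get("name"))
    every = {a.get("name") for a in root.iter(f"{{{RNG_NS}}}attribute")}
    return {"required": sorted(every - optional), "optional": sorted(optional)}


if __name__ == "__main__":
    print(classify_attributes(SAMPLE_RNG))
```

For actual validation of documents, a Relax NG validator (such as the one mentioned under the ODF Toolkit below) is the appropriate tool; this sketch only shows where the optionality information lives.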

The most widely implemented version of ODF prior to ODF 1.2 is the OASIS Standard ODF 1.1, approved on 2007-02-01 and available at <http://docs.oasis-open.org/office/v1.1/errata01/os/> with its incorporated 2013 Errata.  This is the version that is aligned with the ISO/IEC International Standard 26300:2006/Amd 1:2012, although there are a few additional errata.  ODF 1.1 documents can be found in the wild, and there are some breaking changes between ODF 1.1 and ODF 1.2.  In particular, ODF 1.1 did not specify spreadsheet formulas, so implementations used a formula syntax that preceded the introduction of OpenFormula in ODF 1.2.  Some commercial software that remains in use (such as Microsoft Office 2007) has its support for OpenDocument files based on ODF 1.1.
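Because both versions are encountered, a consumer usually wants to know which one it has been handed.  The following is a minimal sketch that reads the `office:version` attribute from the root element of an ODF XML stream.  It assumes that ODF 1.2 documents carry `office:version="1.2"` on their root elements and that earlier documents may omit the attribute; the helper name `odf_version` and the two sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# Hypothetical root elements, reduced to the part that matters here.
SAMPLE_WITH_VERSION = (
    '<office:document-content xmlns:office="' + OFFICE_NS + '" '
    'office:version="1.2"/>'
)
SAMPLE_WITHOUT_VERSION = (
    '<office:document-content xmlns:office="' + OFFICE_NS + '"/>'
)


def odf_version(xml_text):
    """Return the office:version value of the root element, or None."""
    root = ET.fromstring(xml_text)
    return root.get(f"{{{OFFICE_NS}}}version")


if __name__ == "__main__":
    print(odf_version(SAMPLE_WITH_VERSION))
    print(odf_version(SAMPLE_WITHOUT_VERSION))
```

A missing attribute does not by itself identify the version, so a `None` result is best treated as "1.1 or earlier, or not conformant to 1.2" rather than as a definite answer.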

Exploring and Processing ODF Files

OASIS OpenDocument Essentials

An introductory book can be obtained here:  OASIS OpenDocument Essentials HTML  PDF.   It is part of this overall material: http://books.evc-cit.info/index.html. [Note: This otherwise-excellent 2005 book is based on ODF 1.0 and relies on OpenOffice.org 1.9 with its many deviations and implementation-dependent provisions.  (For example, external DTDs are not used in ODF, which has Relax NG schemas.)  The material should be used cautiously, in consultation with the ODF 1.1 specification (aligned with the current International Standard) and with OASIS Standard ODF 1.2 (especially its OpenFormula part), which is becoming the next version of the International Standard.  Current versions of OpenOffice.org descendants are in the 4.x range.]

Apache Project Resources

For implementation details, there is the source code and documentation of the Apache OpenOffice (AOO) project and the http://openoffice.org site. 

AOO is complex software.  An alternative source for some implementation fundamentals is found at the Apache ODF Toolkit podling.  Although the ODF Toolkit consists primarily of Java code, it is more concise, and its design concepts for the consumption and production of ODF files are easier to comprehend.  There is also a validator in the Toolkit that may be usable and also informative.

ODF Conformance/Compliance Assurance Helix

The scope for Corinthia includes:

Many office document programs claim to read/write to the ISO open standards for office documents, OpenDocument Format (ODF) and Office Open XML (OOXML), but do not document which parts are left unimplemented. Furthermore, the standards have a large number of "implementation defined" parts, making real-world congruence chancy. The Corinthia toolkit wants to put this unacknowledged aspect into the open and provide "compliance sheets" for document formats, as known from industry computer protocols.

Corinthia aims at generating a large set of test documents, which can be used to verify the "compliance sheets". The code can work as test case for other applications (or entities tendering for OOXML/ODF based systems) as well.

It is proposed to address this situation for OpenDocument Format using an Assurance Helix:

  • A set of test documents that provide an iteratively-expanded spiral of conformant and intentionally non-conformant documents.  The exercise of features confirms patterns of the document format that can be taken as exemplary of many more variations on the same patterns, with additional test documents created as needed when exceptions to that are discovered.
  • A set of iteratively-developed verification programs and reference implementations that spiral alongside the test-documents, taking those documents as units to employ in test-first development patterns.  An important behavior, at any level, is how features that are not supported or not recognized are dealt with in a dependable manner on which users can rely, along with how compliance is demonstrated with regard to features that are supported. 

The Assurance Helix is independently usable by any party to develop comparisons and calibrations of particular ODF-supporting implementations.  The development of compliance sheets is backed up by the Assurance Helix, and the helix is usable as backup to other demonstrations of conformance and identification of deviations and extensions.
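The pairing of test documents with verification programs can be sketched in miniature.  The example below builds a tiny in-memory corpus of one conformant and three intentionally non-conformant Zip packages and runs a checker over it.  The three rules checked (a `META-INF/manifest.xml` entry is present; a `mimetype` entry, if present, comes first and is stored uncompressed) reflect a reading of ODF 1.2 Part 3 and should be confirmed against the specification.  The names `build_package`, `package_problems`, `run_corpus`, and `CORPUS` are hypothetical, and the entry contents are stubs that the checker never examines.

```python
import io
import zipfile

MIMETYPE = "application/vnd.oasis.opendocument.text"
STUB = "<stub/>"  # placeholder content; not a real manifest or document


def build_package(entries):
    """Write (name, data, compression) entries to a Zip, in the order given."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data, method in entries:
            zf.writestr(name, data, compress_type=method)
    return buf.getvalue()


def package_problems(package_bytes):
    """Return a list of packaging problems found (empty if none)."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        names = [info.filename for info in zf.infolist()]
        if "META-INF/manifest.xml" not in names:
            problems.append("missing META-INF/manifest.xml")
        if "mimetype" in names:
            if names[0] != "mimetype":
                problems.append("mimetype is not the first entry")
            if zf.getinfo("mimetype").compress_type != zipfile.ZIP_STORED:
                problems.append("mimetype is compressed")
    return problems


S, D = zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED
MANIFEST = ("META-INF/manifest.xml", STUB, D)
CONTENT = ("content.xml", STUB, D)

# Each case: (label, package bytes, problems the checker is expected to report)
CORPUS = [
    ("conformant",
     build_package([("mimetype", MIMETYPE, S), CONTENT, MANIFEST]), []),
    ("mimetype-late",
     build_package([CONTENT, ("mimetype", MIMETYPE, S), MANIFEST]),
     ["mimetype is not the first entry"]),
    ("mimetype-deflated",
     build_package([("mimetype", MIMETYPE, D), CONTENT, MANIFEST]),
     ["mimetype is compressed"]),
    ("no-manifest",
     build_package([("mimetype", MIMETYPE, S), CONTENT]),
     ["missing META-INF/manifest.xml"]),
]


def run_corpus():
    """Compare the checker's findings with the expectation for each case."""
    return [(label, package_problems(data) == expected)
            for label, data, expected in CORPUS]


if __name__ == "__main__":
    for label, ok in run_corpus():
        print(label, "ok" if ok else "MISMATCH")
```

Growing the helix then amounts to adding a corpus entry for each newly examined feature or deviation and extending the checker until the expectations are met again.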

Further detailing of the Assurance Helix involves the following aspects.

  1. Conformance Requirements
    Conformance cases for ODF documents are complex.  A matrix of the overall conformance cases is used to characterize how the major format variations are treated.
  2. Single File Documents
    The ODF Specification provides for single XML files as carriers of complete ODF documents.  Such documents are not quite as flexible as their multi-part counterparts carried inside of an ODF Package (a special usage of Zip).  The single file documents are very useful, however:
    • Single-file documents are easy to create and employ for the creation of tests that exercise essential features using simple, annotated XML files.  These are useful in controlling the variations among features, making test cases more precise and isolating observed deviations in processing.
    • Single-file documents are appealing for automatic generation of documents from databases or other processes for mechanically producing ODF documents from other data.
    • Single-file documents are often preferable for embedded documents inside of ODF Packages for overall documents (a case that is not adequately verified).
    • Single-file documents can be simpler and preferable as template documents and as parts of master documents.
    • Single-file documents are useful as a form for conveying the essential simple structure of ODF document files for familiarization and for more-isolated testing of implementation functionality.
    • Single-file documents are a basis for familiarization with the ODF 1.2 schemas and their feature-level connection with semantic information in the specification.
  3. ODF Packaging
    The specific format employed for ODF Documents conveyed in ODF 1.2 Packages requires iterative development of features and their verification for the packaging structures themselves, as defined in ODF 1.2 Part 3 and supplemented by a few requirements in other parts.  There are a number of unique features (such as application of digital signatures and use of encryption/decryption) that apply for packages.  These are defined for use more broadly than the specific application as a carrier of ODF 1.2 documents.
  4. Multi-Part ODF Document Packages
    Single-file documents all have multi-part flavors that employ ODF Packaging.  There are portions of the Assurance Helix for those cases.  Multi-part forms have important additional cases involving use of embedded materials, cross-references among materials, and additional ways of linking and carrying meta-data information.  These are the most-common form of ODF documents "in the wild."  Multi-part documents will be the stress cases for interoperability, successful round-trip usage in collaborative work, and ability to substitute implementations while preserving fidelity to the intended document.
  5. Extended Features and Extension Mechanisms
    There are systematic provisions for the presence of extensions in the file formats for ODF documents. Providing benign or gracefully-reduced functions in the face of extensions that are not understood is an important factor in both the definition of extensions and in the recognition of them.  Extensions can be peppered almost anywhere in the above cases and are provided for in the conformance requirements.  The identification of extensions, the source of their definitions, and behavior in the face of them are also factors in the development of compliance information.
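To give a sense of how small a single-file test document (item 2 above) can be, the following sketch generates one with two paragraphs.  It assumes that an `office:document` root element carrying `office:version` and `office:mimetype`, with an `office:body` containing `office:text`, is enough for a text document; the result has not been validated against the ODF 1.2 schema here, and the function name `flat_text_document` is hypothetical.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Ask ElementTree to serialize with the conventional prefixes.
ET.register_namespace("office", OFFICE_NS)
ET.register_namespace("text", TEXT_NS)


def flat_text_document(paragraphs):
    """Return a single-file ODF text document as an XML string."""
    root = ET.Element(
        f"{{{OFFICE_NS}}}document",
        {
            f"{{{OFFICE_NS}}}version": "1.2",
            f"{{{OFFICE_NS}}}mimetype":
                "application/vnd.oasis.opendocument.text",
        },
    )
    body = ET.SubElement(root, f"{{{OFFICE_NS}}}body")
    text = ET.SubElement(body, f"{{{OFFICE_NS}}}text")
    for content in paragraphs:
        ET.SubElement(text, f"{{{TEXT_NS}}}p").text = content
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(flat_text_document(["First paragraph.", "Second paragraph."]))
```

Generating documents this way keeps each test case to a handful of elements, so an observed processing deviation can be tied to a specific feature.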

Questions and Answers

 

Information Store
