There is an implementation of this in Daffodil as of 2019-06-19 (git hash 015891ff982144ab07f092a25ab133707a9a31e9). See the SchemaComponent.scala file. This will be part of the Daffodil 2.4.0 release and subsequent releases.

Short Schema Component Designators (SSCD)

This is loosely based on the concepts of the W3C Schema Component Designator (SCD) spec:

http://www.w3.org/TR/xmlschema-ref/

However, this must be adapted to our needs, as it is a bit too verbose to use in diagnostic messages, and doesn't have a notion of schema document, etc.

To clarify:

  • Component: a schema component is one of the things a schema author writes in a schema.
  • Component Instance: a schema component instance is a sharable schema component as it appears in one particular usage context; the instance itself is not shared.
    • For example: a global type definition must be referenced from an element to be used. The type as it appears in the context of that element is called an 'instance' of that schema component.

SSCDs are conceptually related to SCDs, but there are a number of differences. The differences involve:

  • relative schema component designators
  • minimal set of axes which DFDL needs.
    • Note however, that we may need attributes. E.g., to refer to the maxOccurs attribute of a specific element declaration we would write e=e1@maxOccurs
  • only very abbreviated syntactic forms
  • our own abbreviated versions of sequence, choice, and group reference path steps.
  • quasi-elements for access to DFDL annotations
  • convention for referring to a specific schema document (via URI)

So long as an SSCD can easily be mapped onto an official W3C SCD, the form we use can be more abbreviated.

SSCD Syntax

An SSCD consists of a number of path steps separated by ":".

When an SSCD path step contains a reference to a DFDL schema named declaration/definition, the QName of that construct is used. If the schema has no namespace, then this QName will not have a prefix part. It will be a local-name only. When the schema has a target namespace, then this QName will use the usual prefix:name syntax, where the prefix is one of the prefix definitions for the namespace.

An NCName is the local part of a name, i.e., a name without a namespace prefix.

A path step is constructed as per this table. The single letter "N" denotes a number which is the position of the construct within its enclosing element, but only if that position is greater than 1. If the position is 1 (indexing is 1-based), then no number is used. This provides uniqueness, but does not provide XPath-style indexing based on the kind of construct. That is, if an element reference is followed by a sequence, then the element reference gets N and the sequence gets N+1, even though they are not the same kind of construct.

Construct / SSCD Path Step

Element Reference

<xs:element ref="QName" ....>

erN=QName

Local Element Decl

<xs:element name="name" ...>

eN=QName if the element form is "qualified"

eN=NCName if the element form is "unqualified"

Global Element Decl

<xs:element name="name"...>

e=QName

Global ComplexType Def

<xs:complexType name="name" ....>

ct=QName

Local Complex Type Def

<xs:element ...><xs:complexType>...

ct

Element's Type Reference to a Global Complex Type

ct=QName

Global Simple Type Def

<xs:simpleType name="name"...>

st=QName

Local Simple Type Def

<xs:element ...>
  <xs:simpleType>
    <xs:restriction  base="QName">
  ....

st

(tbd: how to reference parts of unions, clauses within restrictions)

Element's Type Reference to a Global Simple Type or to a primitive type.

st=QName

Choice Group

cN

Sequence Group

sN

Global Choice Group Def

<xs:group name="QName"><xs:choice ...>...

cgd=QName

Global Sequence Group Def

<xs:group name="QName"><xs:sequence ...>...

sgd=QName

Group Reference to a Global Choice Group Def

cgrN=QName

Group Reference to a Global Sequence Group Def

sgrN=QName
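
The path-step rules in the table above can be sketched in Python as follows. This is only an illustration, not the code in SchemaComponent.scala: the construct keys in STEP_TABLE and the function name path_step are hypothetical names chosen for this example, and the sample names ex:root and b are made up.

```python
# Hypothetical construct keys -> (abbreviation, takes position N, takes a name).
# A type reference to a global type uses the same step as the global type itself.
STEP_TABLE = {
    "element_ref":         ("er",  True,  True),
    "local_element":       ("e",   True,  True),
    "global_element":      ("e",   False, True),
    "global_complex_type": ("ct",  False, True),
    "local_complex_type":  ("ct",  False, False),
    "global_simple_type":  ("st",  False, True),
    "local_simple_type":   ("st",  False, False),
    "choice":              ("c",   True,  False),
    "sequence":            ("s",   True,  False),
    "global_choice_def":   ("cgd", False, True),
    "global_sequence_def": ("sgd", False, True),
    "choice_group_ref":    ("cgr", True,  True),
    "sequence_group_ref":  ("sgr", True,  True),
}


def path_step(construct, position=1, name=None):
    """Build one SSCD path step; position is the 1-based index in the parent."""
    abbrev, numbered, named = STEP_TABLE[construct]
    if named and not name:
        raise ValueError(f"{construct} needs a name")
    # The number is written only when the position is greater than 1.
    number = str(position) if numbered and position > 1 else ""
    suffix = f"={name}" if named else ""
    return f"{abbrev}{number}{suffix}"


steps = [
    path_step("global_element", name="ex:root"),
    path_step("local_complex_type"),
    path_step("sequence", position=1),
    path_step("local_element", position=3, name="b"),
]
print(":".join(steps))  # e=ex:root:ct:s:e3=b
```

Keeping the rules in one table makes it easy to check each row against the designator syntax by eye.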

Implementation Notes

It is not always possible to form an SSCD for a schema component in a DFDL schema that is not well-formed. For example, suppose a global element decl is missing its name attribute. There is no way to refer to that problematic part of the schema using an SSCD, because the QName must be part of the SSCD. For this case, use an XPath expression that treats the DFDL schema file as an XML document.

Since ":" is also used in QName syntax to separate namespace prefixes from local names, one cannot split an SSCD trivially on the ":" into path steps.
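
One possible way to split an SSCD into path steps anyway is sketched below in Python. It is a heuristic, not something defined on this page: it assumes the step abbreviations from the table, and it treats any ":"-separated token that does not look like the start of a step as the local part of the preceding QName. The function name split_sscd is hypothetical. The heuristic cannot resolve the case where a local name itself looks like an unnamed step (for example an element named ex:ct or ex:s2).

```python
import re

# Tokens that begin a named step, e.g. "e=", "er2=", "ct=", "sgr3=".
NAMED = re.compile(r"^(?:er\d*|e\d*|ct|st|cgd|sgd|cgr\d*|sgr\d*)=")
# Tokens that are a complete unnamed step, e.g. "ct", "st", "c", "s2".
UNNAMED = re.compile(r"^(?:ct|st|c\d*|s\d*)$")


def split_sscd(sscd):
    """Split an SSCD into path steps, re-joining prefix:local QNames."""
    steps = []
    for token in sscd.split(":"):
        if NAMED.match(token) or UNNAMED.match(token) or not steps:
            steps.append(token)
        else:
            # Not the start of a step: treat it as the local part of the
            # QName whose prefix ended the previous token.
            steps[-1] += ":" + token
    return steps


print(split_sscd("e=ex:root:ct:s:er2=ex:a"))
```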

Note that a schema component cannot create its SSCD step without knowing what its index is within its lexically enclosing parent. E.g., the 2nd sequence child of another sequence needs to create a step with a higher N value based on its position.
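
A minimal Python sketch of this dependency on the parent follows. The Component class here is a hypothetical stand-in for illustration and is not Daffodil's SchemaComponent; it only shows that a step needs the component's index among all of its siblings, whatever their kind.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Component:
    """Hypothetical stand-in for a schema component."""
    abbrev: str                     # step abbreviation, e.g. "e", "er", "s", "ct"
    name: Optional[str] = None      # QName or NCName, if the construct is named
    parent: Optional["Component"] = field(default=None, repr=False)
    children: List["Component"] = field(default_factory=list)

    def add(self, child: "Component") -> "Component":
        child.parent = self
        self.children.append(child)
        return child

    def position(self) -> int:
        # 1-based index among all siblings, regardless of their kind.
        if self.parent is None:
            return 1
        siblings = self.parent.children
        return next(i for i, c in enumerate(siblings, start=1) if c is self)

    def step(self) -> str:
        n = self.position()
        number = str(n) if n > 1 else ""
        suffix = f"={self.name}" if self.name is not None else ""
        return f"{self.abbrev}{number}{suffix}"

    def sscd(self) -> str:
        # Walk up to the root, then join the steps from the root down.
        steps = []
        node = self
        while node is not None:
            steps.append(node.step())
            node = node.parent
        return ":".join(reversed(steps))


root = Component("e", "ex:root")
seq = root.add(Component("ct")).add(Component("s"))
seq.add(Component("er", "ex:a"))
inner = seq.add(Component("s"))
last = seq.add(Component("e", "b"))
print(inner.sscd())  # e=ex:root:ct:s:s2
print(last.sscd())   # e=ex:root:ct:s:e3=b
```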

Since these will be used in diagnostic messages, the code that creates them must be minimal. Nothing can go wrong in it: it cannot throw any sort of exception, nor depend on, say, OOLAG LVs. The methods which create these will catch Throwable and call Assert.abort() if anything is thrown.

SSCD for DFDL Annotations

Not sure this is needed, but if we want to specify an SSCD for a specific DFDL annotation, then we use quasi-elements dfdl:format, dfdl:sequence, dfdl:simpleType, etc. That is, there is no representation of the annotation or appinfo constructs needed for long-format annotations.

SSCD for DFDL Schema Files

XML Schema and W3C SCD provide no means to refer to schema documents; hence, one cannot refer to individual top-level annotations. This is the same bug we see in XSOM and other schema object models.
We solve this by allowing a URI for the schema document, followed by a URI fragment which contains the SSCD for the dfdl:format annotation.
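
As a rough illustration of that convention, the Python sketch below joins a schema document URI and an SSCD with "#" and splits them apart again using the standard library's urllib.parse.urldefrag. The URI, the function names, and the use of dfdl:format as the whole fragment are assumptions made for this example; this page does not fix the exact form.

```python
from urllib.parse import urldefrag


def schema_document_designator(schema_uri, sscd):
    """Join a schema document URI and an SSCD as URI plus fragment."""
    return f"{schema_uri}#{sscd}"


def split_schema_document_designator(designator):
    """Return (schema document URI, SSCD) from a URI with a fragment."""
    uri, fragment = urldefrag(designator)
    return uri, fragment


# Hypothetical schema location; the fragment is the SSCD of the annotation.
d = schema_document_designator(
    "https://example.com/schemas/example.dfdl.xsd", "dfdl:format"
)
print(d)
print(split_schema_document_designator(d))
```

The characters ":" and "=" used by SSCDs are allowed in URI fragments, so no percent-encoding is needed for the steps shown on this page.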
