RAN-DOM (and RAN Infoset, XDM)

RAN-DOM is a DOM for RAN documents.  The RAN-XML infoset is a description of how a RAN document may appear when it is processed by XML systems.

(This is an incomplete outline only.)

RAN-DOM Document Object Model

Loosely, a RAN-DOM is an XML DOM with the following differences:

  • There are three subtypes of elements, text, comments, PIs, and attributes:  fragment, scoped, or general (i.e. XML).
  • There are two subtypes of attribute values: untyped (i.e. a string literal) or typed (where the type is an extension).
  • There are several subtypes of attributes, depending on the = delimiter used.
  • Under consideration:
    • link - extends element
    • RAN header PI
    • anonymous element - this is an array
  • The document is a list of fragments (or a single top-level element for compatibility).
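
As a rough illustration of the differences listed above, the node model might be sketched in Python as follows. All class and field names here (Scope, AttrValue, Attribute, Element, Document) are hypothetical and are not taken from any RAN specification; the items still under consideration are left out.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Union


class Scope(Enum):
    """The three subtypes shared by elements, text, comments, PIs and attributes."""
    FRAGMENT = "fragment"
    SCOPED = "scoped"
    GENERAL = "general"  # i.e. plain XML


@dataclass
class AttrValue:
    lexical: str
    type_name: Optional[str] = None  # None means untyped (a string literal)

    @property
    def is_typed(self) -> bool:
        return self.type_name is not None


@dataclass
class Attribute:
    name: str
    value: AttrValue
    delimiter: str = "="  # the attribute subtype depends on the delimiter used
    scope: Scope = Scope.GENERAL


@dataclass
class Element:
    name: str
    scope: Scope = Scope.GENERAL
    attributes: List[Attribute] = field(default_factory=list)
    children: List[Union["Element", str]] = field(default_factory=list)


@dataclass
class Document:
    # A list of fragments, or a single top-level element for compatibility.
    top_level: List[Element] = field(default_factory=list)
```

Keeping the scope on every node, rather than only on elements, mirrors the first bullet: text, comments, PIs and attributes carry the same three-way distinction.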

Required extensions:

  • PIs use the attribute syntax of start-tags.
  • All names and attribute values may be lexically typed: string, name, number, date-time-range, boolean, path.
  • An element may have an empty name "" or be anonymous, in which case the name is "[" and "]".
  • Element and fragment end-tags allow arbitrary data after the generic identifier, in the manner of a comment. 

There is no provision in RAN for data-content values to have datatypes, unless RAN-CSV is used, in which case the effective elements may have appropriate data values. The datatypes apply only to attribute values that are not string literals (i.e. not in double quotes).

The information in the preamble is common to all.  Namespaces are implemented so that a prefix on an element or attribute name can be used to look up the corresponding link.
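
The prefix lookup described above could be sketched like this. The Link class, the find_link function, the "href" attribute name, and the XML-style "prefix:local" separator are all assumptions made for illustration; RAN may spell these differently.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Link:
    """A link declared in the preamble (hypothetical shape)."""
    prefix: str
    attributes: Dict[str, str] = field(default_factory=dict)


def find_link(name: str, preamble_links: Dict[str, Link]) -> Optional[Link]:
    """Return the link for the prefix of `name`, or None if there is none."""
    prefix, sep, _local = name.partition(":")  # assumes a ":" separator
    if not sep:
        return None  # unprefixed name: nothing to look up
    return preamble_links.get(prefix)


links = {"xh": Link("xh", {"href": "http://www.w3.org/1999/xhtml"})}
print(find_link("xh:p", links))
```

Because the links live only in the preamble, the lookup is a single table shared by the whole document, with no per-element redeclaration to track.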

There is no equivalent of declarations, DOCTYPE, external entities, CDATA sections, namespace redeclaration, namespace defaults and so on in RAN.

An XML document read into the RAN-DOM with an XML parser will produce a tree identical to that of a normal DOM, with the exception that namespaces will be transplanted elsewhere.

The Apatak validator tables can be used for partial validation at the time elements are added, or before shipping.

XPath Data Model

An XML document loaded into a RAN-DOM document has the same nodes as in a conventional DOM. Similarly, it can have the same typed XDM behaviour as a document from an XML DOM.

Some aspects of RAN are already handled by the XDM (and XQuery), such as multiple top-level elements (fragments can be treated as elements).  As names in RAN may be strings as well as tokens, an XPath expression would have to use *[local-name()="some name"] in paths for that case.

The definition of the value of an element changes: it is not the concatenation of all descendant nodes, but the concatenation of all descendant nodes of the same scope. It excludes the contents of descendant scoped elements.
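
One reading of this rule, sketched in Python: text is gathered recursively, but the walk does not descend into descendant elements that are scoped. The Elem class and the string_value function are hypothetical names, and the boolean `scoped` flag is a simplification of the fragment/scoped/general distinction.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Elem:
    name: str
    scoped: bool = False
    children: List[Union["Elem", str]] = field(default_factory=list)


def string_value(elem: Elem) -> str:
    """Concatenate descendant text, skipping the contents of scoped descendants."""
    parts = []
    for child in elem.children:
        if isinstance(child, str):
            parts.append(child)
        elif not child.scoped:
            parts.append(string_value(child))
        # a scoped child opens a new scope, so its contents are left out
    return "".join(parts)


tree = Elem("a", children=[
    "x",
    Elem("b", children=["y"]),
    Elem("c", scoped=True, children=["z"]),
    "w",
])
print(string_value(tree))  # prints "xyw"
```

Calling string_value on the scoped element itself still returns its own text ("z" here), since the exclusion applies only when it is reached as a descendant.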

As with RAN-DOM, the preamble is available to all documents. A link tag's attributes are attached to the top-level elements and fragments with the same prefix.

The datatypes that do not have an equivalent in XDM are date-time-range and path. Consequently these should be treated as strings.
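
A minimal sketch of that fallback, as a lookup table from RAN lexical types to XDM type names. Only the string fallback for date-time-range and path comes from the text above; the other targets (xs:Name, xs:decimal, xs:boolean) are assumed mappings chosen for illustration, and xdm_type is a hypothetical helper.

```python
# RAN lexical type -> XDM atomic type name.
XDM_TYPE = {
    "string": "xs:string",
    "name": "xs:Name",        # assumption
    "number": "xs:decimal",   # assumption; xs:double would be another candidate
    "boolean": "xs:boolean",  # assumption
    # No XDM equivalent: treated as strings.
    "date-time-range": "xs:string",
    "path": "xs:string",
}


def xdm_type(ran_type: str) -> str:
    """Return the XDM type name used for a RAN lexical type."""
    return XDM_TYPE[ran_type]


print(xdm_type("path"))
```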

RAN Infoset

We start with the basic grammar productions:

stream    ::= preamble? body*
preamble  ::= (RAN-PI | link | fluff)*
body      ::= (element fluff*) | (fragment fluff*)+
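
As a sketch of what these productions accept, the following Python check runs over a sequence of already-classified top-level items. The kind labels and the is_valid_stream function are hypothetical; a real parser would work on the RAN text itself.

```python
import re

# One letter per top-level item kind.
KIND_CODE = {"RAN-PI": "P", "link": "L", "fluff": "F", "element": "E", "fragment": "G"}

# stream ::= preamble? body*
# The body* part is written in a simplified form: (E F* | (G F*)+)* accepts
# exactly the same sequences as ([EG] F*)*, and the simpler pattern avoids
# nested repetition.
STREAM = re.compile(r"[PLF]*(?:[EG]F*)*")


def is_valid_stream(kinds):
    """Return True if the sequence of item kinds matches the stream production."""
    codes = "".join(KIND_CODE[k] for k in kinds)
    return STREAM.fullmatch(codes) is not None


print(is_valid_stream(["RAN-PI", "link", "fragment", "fluff", "fragment"]))  # True
print(is_valid_stream(["element", "link"]))  # False: a link after the body
```

Note that the productions as written allow an empty stream and leading fluff, since both preamble and body are optional.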

Each top-level fragment is treated as a virtual XML document, according to the following rules:

  • The link declaration is applied to each of them.
    • RAN links provide XML namespace declarations and defaulted attribute values
  • Fragments are treated as XML elements
    • Element and attribute names that are literals must be replaced by e.g. Base64 versions of the literal (using - and _ instead of + and /).
    • Attribute values that are not literals are written as string literals and, where a type is available, typed with the nearest PSVI equivalent.
  • Top-level fluff, i.e. comments and PIs, are not visible as part of the infoset of any fragment.
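
The name replacement rule in the list above could be sketched with the URL-safe Base64 alphabet from the Python standard library. Two details are assumptions added here so that the result is a legal XML name: the "=" padding is dropped, and a "_" is prepended because a Base64 string may start with a digit or "-". The function names are hypothetical.

```python
import base64


def encode_literal_name(literal: str) -> str:
    """Turn a literal name into an XML-safe token (URL-safe Base64)."""
    raw = base64.urlsafe_b64encode(literal.encode("utf-8")).decode("ascii")
    # Assumption: strip "=" padding and prefix "_" to keep the name well-formed.
    return "_" + raw.rstrip("=")


def decode_literal_name(encoded: str) -> str:
    """Reverse encode_literal_name."""
    raw = encoded[1:]                 # drop the "_" prefix
    raw += "=" * (-len(raw) % 4)      # restore the padding
    return base64.urlsafe_b64decode(raw).decode("utf-8")


print(encode_literal_name("some name"))  # prints "_c29tZSBuYW1l"
```

The encoding is reversible, so an XML-side tool can recover the original literal name when writing results back to RAN.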