PVL is a simple language for concisely expressing simple validation constraints and an information-item stripping policy. It is suitable for derivation from a schema and for tight integration into an XML parser or SAX stream at a low level.
A PVL schema consists of one or more namespace declarations (ns elements) followed by an actions element. The actions element is line-oriented: each line holds a kind of XPath followed by an error keyword (+ allow, w warn, X error), optionally followed by one of the keywords - strip (do not pass the information on) or 0 fail (halt processing). Actions are tried from the top, and the first pattern that matches is applied. An item that matches no pattern fails.
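As an illustration of the line format, here is a minimal Python sketch that splits one action line into its pattern, error keyword, and optional flag. The names Action, parse_action, SEVERITIES and FLAGS are hypothetical, and the sketch assumes that patterns contain no whitespace and that fields are separated by spaces.

```python
from typing import NamedTuple, Optional

# Keyword tables taken from the description above.
SEVERITIES = {"+": "allow", "w": "warn", "X": "error"}
FLAGS = {"-": "strip", "0": "fail"}


class Action(NamedTuple):
    pattern: str          # the semi-XPath, kept as raw text here
    severity: str         # "allow", "warn" or "error"
    flag: Optional[str]   # "strip", "fail" or None


def parse_action(line: str) -> Action:
    """Parse one line of the form: pattern keyword [flag]."""
    fields = line.split()  # assumption: whitespace-separated fields
    if len(fields) not in (2, 3):
        raise ValueError(f"expected 'pattern keyword [flag]', got {line!r}")
    pattern, keyword = fields[0], fields[1]
    if keyword not in SEVERITIES:
        raise ValueError(f"unknown error keyword {keyword!r}")
    flag = None
    if len(fields) == 3:
        if fields[2] not in FLAGS:
            raise ValueError(f"unknown flag {fields[2]!r}")
        flag = FLAGS[fields[2]]
    return Action(pattern, SEVERITIES[keyword], flag)
```

For example, parse_action("xxx:yyy/#WS + -") yields an Action with severity "allow" and flag "strip".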
The semi-XPath is just something like
((( prefix ":")? name)? "/")?
( ((prefix ":")? ("@")? name)
| "#DATA" | "#WS" | "#COMMENT" | "#PI" | "#DOCTYPE" )
where prefix and name may each be the wildcard "*", and #WS stands for whitespace runs.
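The grammar is small enough to be recognized with a single regular expression. The following Python sketch is one possible reading of it, not a reference implementation: the NAME approximation (ASCII letters, digits, ".", "-", "_"), the Pattern tuple, and parse_pattern are all assumptions made for illustration. It follows the grammar as written, so "@" comes after the optional prefix.

```python
import re
from typing import NamedTuple, Optional

# Simplified stand-in for an XML name, or the wildcard "*" (assumption).
NAME = r"(?:\*|[A-Za-z_][\w.\-]*)"

PATTERN = re.compile(
    rf"""
    (?: (?P<parent> (?:{NAME}:)? {NAME} )? (?P<slash>/) )?    # optional parent step
    (?: (?P<kind> \#DATA | \#WS | \#COMMENT | \#PI | \#DOCTYPE )
      | (?: (?P<prefix>{NAME}) : )? (?P<attr>@)? (?P<name>{NAME})
    )
    """,
    re.VERBOSE,
)


class Pattern(NamedTuple):
    parent: Optional[str]   # "p:n" or "n"; "" for a bare "/"; None if absent
    kind: Optional[str]     # "#DATA", "#WS", ... or None for a name test
    prefix: Optional[str]
    name: Optional[str]
    is_attribute: bool


def parse_pattern(text: str) -> Pattern:
    """Split a semi-XPath into its parts, or raise ValueError."""
    m = PATTERN.fullmatch(text)
    if m is None:
        raise ValueError(f"not a PVL pattern: {text!r}")
    parent = None
    if m["slash"]:
        parent = m["parent"] or ""   # bare "/" means no named parent
    return Pattern(parent, m["kind"], m["prefix"], m["name"],
                   m["attr"] is not None)
```

With this reading, parse_pattern("/xxx:yyy") has an empty parent (a bare "/"), while parse_pattern("#COMMENT") has parent None, meaning the pattern places no constraint on the parent.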
PVL addresses several problems: how to remove non-significant whitespace before DOM construction without full validation; how to fail early on gross validation errors, so that more complete validation can run as a separate pass or phase without tying up resources; how to get most of the flexibility of order-free, contextual validation of elements and attributes without building a grammar or risking DFA blowout; how to get some of the benefits of paths without a full random-access, Schematron-style DOM; and how to enforce extra requirements, such as SOAP's no-PI rule, without confusing them with XML well-formedness. Most particularly, it addresses how to do all of this with a small language that would be trivial to implement using the parser's existing stack.
The following schema specifies that a document must have xxx:yyy as its root element, that whitespace children of xxx:yyy are stripped, that xxx:zzz children are allowed and may contain data content or an xxx:eee element, that xxx:eee is an empty element, that comments and PIs are stripped, and that a DOCTYPE declaration generates a warning. Any other element is an error that halts processing.
<pvl:schema xmlns:pvl="...">
<pvl:ns prefix="xxx" uri="..."/>
<pvl:actions>
/xxx:yyy +
/* X 0
/*:* X 0
xxx:yyy/#WS + -
xxx:yyy/xxx:zzz +
xxx:zzz/#DATA +
xxx:zzz/xxx:eee +
xxx:eee/#DATA X 0
#COMMENT + -
#PI + -
#DOCTYPE w
*:*/*:* X 0
*:*/* X 0
</pvl:actions>
</pvl:schema>
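To show how little machinery first-match evaluation needs, here is a hedged Python sketch that applies rules like those above to a stream of parse events, using only a stack of open elements. Everything in it is an assumption made for illustration: the event tuples ("start"/"end" with a qualified name, or a "#KIND" with None), the function names, the reading of a leading "/" as the document level, the reading of "*" alone as matching only unprefixed names, and the treatment of an unmatched item as a failure. Attribute patterns are omitted. The rule list adds an explicit xxx:yyy/xxx:zzz rule so that xxx:zzz children are accepted.

```python
from typing import List, Optional, Tuple

RULES_TEXT = """\
/xxx:yyy +
/* X 0
/*:* X 0
xxx:yyy/#WS + -
xxx:yyy/xxx:zzz +
xxx:zzz/#DATA +
xxx:zzz/xxx:eee +
xxx:eee/#DATA X 0
#COMMENT + -
#PI + -
#DOCTYPE w
*:*/*:* X 0
*:*/* X 0
"""

Rule = Tuple[Optional[str], str, str, Optional[str]]


def parse_rule(line: str) -> Rule:
    fields = line.split()
    flag = fields[2] if len(fields) > 2 else None
    parent, slash, item = fields[0].rpartition("/")
    # parent is None when the pattern has no "/", "" for a leading "/".
    return (parent if slash else None), item, fields[1], flag


def name_matches(pattern: str, qname: str) -> bool:
    p_prefix, _, p_local = pattern.rpartition(":")
    q_prefix, _, q_local = qname.rpartition(":")
    if bool(p_prefix) != bool(q_prefix):
        return False  # assumption: "*" alone matches only unprefixed names
    return p_prefix in ("*", q_prefix) and p_local in ("*", q_local)


def decide(rules: List[Rule], parent: str, item: str):
    """Return (keyword, flag) of the first rule matching parent/item."""
    for parent_pat, item_pat, keyword, flag in rules:
        if parent_pat == "" and parent != "":
            continue  # rule is for the document level only
        if parent_pat and (parent == "" or not name_matches(parent_pat, parent)):
            continue
        if item_pat.startswith("#") or item.startswith("#"):
            if item_pat != item:
                continue
        elif not name_matches(item_pat, item):
            continue
        return keyword, flag
    return "X", "0"  # assumption: an unmatched item fails


def run(rules: List[Rule], events):
    """Filter events; return (kept events, reports). Raise on a '0' flag."""
    stack, kept, reports = [], [], []  # stack holds (qname, stripped)
    for kind, name in events:
        if kind == "end":
            _, stripped = stack.pop()
            if not stripped:
                kept.append((kind, name))
            continue
        parent = stack[-1][0] if stack else ""
        item = name if kind == "start" else kind
        keyword, flag = decide(rules, parent, item)
        if flag == "0":
            raise ValueError(f"PVL failure at {parent}/{item}")
        if keyword in ("w", "X"):
            reports.append((keyword, parent, item))
        if kind == "start":
            stack.append((name, flag == "-"))
        if flag != "-":
            kept.append((kind, name))
    return kept, reports


RULES = [parse_rule(line) for line in RULES_TEXT.splitlines() if line.strip()]
```

Run over the events of a conforming document, the sketch drops whitespace and comment events and passes the rest through unchanged. A document with any other root element raises at the first start event, before anything further is read.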