What’s in Java 10 (and 9) for XML Developers?

Posted on April 28, 2018 by Rick Jelliffe

Java JDK 10 arrived in March 2018. Its changes are much less radical than those in Java 9. The general trend:

  • In: JSON/AJAX, Docker, efficiency, security, JavaDoc, immutable collections, more support for charsets in API
  • Out: XML-WS (SOAP), CORBA, Java WebStart, RMI

(When I say “out”, some are just removed from Java Standard Edition API, but are still in Java Enterprise Edition API.)

Here are the main features of interest to XML developers. I have added in some Java JDK 9 changes of interest too, in particular those in the standard OpenJDK implementation.

XML APIs

Changes:

  • javax/xml/namespace/NamespaceContext.java
    • Was Iterator getPrefixes(String namespaceURI)
    • Now
      Iterator<String> getPrefixes(String namespaceURI)
  • javax/xml/xpath/XPathFunction.java
    • Was public Object evaluate(List args)
    • Now
      public Object evaluate(List<?> args)
  • org/xml/sax/helpers/NamespaceSupport.java
    • Was
      public Enumeration getPrefixes ()
      public Enumeration getPrefixes (String uri)
      public Enumeration getDeclaredPrefixes ()
    • now
      public Enumeration<String> getPrefixes ()
      public Enumeration<String> getPrefixes (String uri)
      public Enumeration<String> getDeclaredPrefixes ()
  • OpenJDK 9: Transform, Validation and XPath implementations
    • Updated to use libxml2 2.7.2 and libxslt 1.1.28
      • NOTE: in the past, different versions were used on different platforms. It is not clear to me from the notes whether this really relates to JavaFX or something more
    • External libraries (From 9.0.4)
      • Prefer default parser even when other parser is on classpath
      • System property jdk.xml.overrideDefaultParser  can override
      • *Factory.setFeature() accepts this property
      • NOTE: Because they finally upgraded the horrible old version of their XSD validator that came with the JDK, XML developers won’t need to add Apache Xerces in order to get unbuggy XSD validation. Good. But it means yet another mechanism is in place: that makes five or six different methods in different implementations by the time you add web servers, now that lib/endorsed has gone away. Actually, this week I am facing this problem with JDK 8 and WebLogic: such a mess. Overcomplexified. The answer should have been to make the parser version a developer compile-time problem, like all other libraries, rather than a load-time ops issue.
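
To show what the generified signatures listed above mean in practice, here is a minimal sketch of a NamespaceContext whose getPrefixes() returns Iterator<String>. The class name MapNamespaceContext, the helper titleOf() and the urn:example:book namespace are made up for illustration, and the class skips parts of the full NamespaceContext contract (the reserved xml and xmlns prefixes, and null checks).

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: a fixed prefix-to-URI mapping (not a complete
// implementation of the NamespaceContext contract).
class MapNamespaceContext implements NamespaceContext {
    private final Map<String, String> prefixToUri;

    MapNamespaceContext(Map<String, String> prefixToUri) {
        this.prefixToUri = prefixToUri;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        return prefixToUri.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
    }

    @Override
    public String getPrefix(String namespaceURI) {
        // No cast needed: the iterator is typed.
        Iterator<String> it = getPrefixes(namespaceURI);
        return it.hasNext() ? it.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        return prefixToUri.entrySet().stream()
                .filter(e -> e.getValue().equals(namespaceURI))
                .map(Map.Entry::getKey)
                .iterator();
    }

    // Evaluates a namespaced XPath against a small document.
    static String titleOf(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new MapNamespaceContext(
                    Collections.singletonMap("b", "urn:example:book")));
            return xpath.evaluate("/b:book/b:title",
                    new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titleOf(
                "<book xmlns='urn:example:book'><title>XML</title></book>"));
    }
}
```

Code written against the old raw Iterator signature should still compile, usually with an unchecked warning where it casts.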

Deprecated in SE:

Removed:

  • com.sun.java.browser.plugin2.DOM
  • sun.plugin.dom.DOMObject

Un-deprecated:

  • javax.xml.stream.XMLInputFactory.newFactory() (deprecated by mistake in JDK 9)

Other

Java 10.  For syntax, Java 10 adds var for local variables with inferred types. This is a popular feature from other languages, and part of the trend in Java's evolution to reduce its verbosity: other features along these lines are null-related annotations and try-with-resources blocks that close resources automatically.
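
A minimal sketch of var combined with try-with-resources, assuming Java 10 or later. The class VarDemo and the method nonEmptyLines() are invented names for illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: collects the non-empty lines of a text.
class VarDemo {
    static List<String> nonEmptyLines(String text) {
        var lines = new ArrayList<String>();   // inferred as ArrayList<String>
        // The reader is closed automatically when the try block exits.
        try (var reader = new BufferedReader(new StringReader(text))) {
            for (var line = reader.readLine(); line != null; line = reader.readLine()) {
                if (!line.trim().isEmpty()) {
                    lines.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(nonEmptyLines("<a/>\n\n<b/>\n"));
    }
}
```

Note that var only works for local variables with an initializer; fields, parameters and return types still need explicit types.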

Best little API improvement?   We finally get a built-in way to transfer from a Reader to a Writer:  reader.transferTo(writer);   (Java is so sloppily unorthogonal on the streaming APIs: you should be able to just connect anything to anything!)
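
A minimal sketch of the call, assuming Java 10 or later. TransferDemo and copy() are invented names, and the in-memory StringReader and StringWriter stand in for whatever reader and writer you actually have.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical demo class: copies all characters from a Reader to a Writer.
class TransferDemo {
    static String copy(String text) {
        try (Reader reader = new StringReader(text);
             Writer writer = new StringWriter()) {
            // transferTo reads to end of stream and returns the number of chars copied.
            reader.transferTo(writer);
            return writer.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(copy("<doc>hello</doc>"));
    }
}
```

The method closes neither side, so the try-with-resources block is still doing that job.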

Java 9. Three good things for XML developers!

The big news is that finally we get the internal ports of Xerces  updated.  Oracle (and Sun before them) have been really slack in neglecting this so long: Java was stuck using Xerces 2.7.n for 11 years for goodness sake.  The new ports are equivalent to Apache Xerces 2.11.0. (NOTE: XSD is still 1.0 only, the XSD 1.1 updates have not been put in place, but this probably reflects Apache Xerces’ slow pace to make the changes official.)

Java 9 brings XML Catalogs support up to version 1.1, from 1.0.
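
As a rough sketch of what catalog resolution looks like from code, the following uses the javax.xml.catalog package that ships with Java 9 and later. The catalog content, the example.com system identifier and the names CatalogDemo and resolveSystemId() are all assumptions made up for this illustration. The catalog is written to a temporary file only to keep the example self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

// Hypothetical demo class: maps a system identifier to a local URI via a catalog.
class CatalogDemo {
    static String resolveSystemId(String systemId) {
        // A one-entry OASIS catalog (illustrative content only).
        String catalogXml =
              "<?xml version=\"1.0\"?>\n"
            + "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
            + "  <system systemId=\"http://example.com/dtd/book.dtd\" uri=\"book.dtd\"/>\n"
            + "</catalog>\n";
        try {
            Path file = Files.createTempFile("catalog", ".xml");
            file.toFile().deleteOnExit();
            Files.write(file, catalogXml.getBytes(StandardCharsets.UTF_8));
            Catalog catalog = CatalogManager.catalog(CatalogFeatures.defaults(), file.toUri());
            // Returns the mapped URI, or null when no entry matches.
            return catalog.matchSystem(systemId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveSystemId("http://example.com/dtd/book.dtd"));
    }
}
```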

A big change is the advent of Compact Strings.  This changes the implementation of the String class so that a string will only be stored as two bytes per character (UTF-16) if it needs it; otherwise it will use one byte (ISO 8859-1, aka Latin-1, which includes ASCII).  Because the one-byte characters have the same numbers as the first 256 Unicode code points, conversion is easy (just padding with a zero).  While this works for static strings, it is not clear to me how well this is integrated: for example, would parsing an XML document with only ASCII data content give a SAX stream using the smaller representation?   But memory allocation for strings has been a big problem with Java, and one reason that libraries written in C/C++ have tended to outperform it for larger documents.  Compact Strings improves several things: less memory use, fewer garbage collections on loaded systems, more data pulled into the L1/L2 caches, and potentially SIMD operations that can work on larger string fragments.
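
The "padding with a zero" step can be sketched as follows. This only illustrates the arithmetic and is not the JDK's internal code. Latin1WideningDemo and its methods are invented names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows why Latin-1 bytes widen trivially to UTF-16.
class Latin1WideningDemo {
    // True when every char fits in one byte, i.e. is in the range U+0000..U+00FF.
    static boolean fitsLatin1(String s) {
        return s.chars().allMatch(c -> c <= 0xFF);
    }

    // Widen each byte to a 16-bit char: the high byte is simply zero.
    static char[] widen(byte[] latin1) {
        char[] utf16 = new char[latin1.length];
        for (int i = 0; i < latin1.length; i++) {
            utf16[i] = (char) (latin1[i] & 0xFF);
        }
        return utf16;
    }

    // Encode to one byte per char, then widen back again.
    static String roundTrip(String s) {
        return new String(widen(s.getBytes(StandardCharsets.ISO_8859_1)));
    }

    public static void main(String[] args) {
        String cafe = "caf\u00e9";                       // all chars below U+0100
        System.out.println(fitsLatin1(cafe));            // true
        System.out.println(roundTrip(cafe).equals(cafe)); // true
        System.out.println(fitsLatin1("\u20ac"));        // false: the euro sign needs two bytes
    }
}
```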

In Java 9, the endorsed extensions mechanism has been dropped. This was a mechanism starting with Java 5 to override the built-in SAX and DOM API implementations, mainly used to add in Apache Xerces, by putting xercesImpl.jar and xml-apis.jar into <JAVA_HOME>/lib/endorsed.  As mentioned above, the rules by which the Factory classes locate an implementation of a parser etc. have changed.  (Underlying this is a move to implement the XML libraries using the standard java.util.ServiceLoader class.  But JAXP suffers the classic problem: whenever you have too many options, fixing it by adding a new option makes things more confusing in the short term.)
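
When chasing this kind of problem, it helps to print which implementation the factory lookup actually picked. The following is a minimal diagnostic sketch. JaxpLookupDemo and its method names are invented, and what it prints depends entirely on your JDK and classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical diagnostic class: reports which JAXP implementations are in play.
class JaxpLookupDemo {
    // DOM factory providers visible to ServiceLoader (often none on a bare JDK).
    static List<String> registeredDomProviders() {
        List<String> names = new ArrayList<>();
        for (DocumentBuilderFactory f : ServiceLoader.load(DocumentBuilderFactory.class)) {
            names.add(f.getClass().getName());
        }
        return names;
    }

    // The class the normal JAXP lookup resolves to for DOM parsing.
    static String resolvedDomFactory() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // The class the normal JAXP lookup resolves to for SAX parsing.
    static String resolvedSaxFactory() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("ServiceLoader providers: " + registeredDomProviders());
        System.out.println("DOM factory in use:      " + resolvedDomFactory());
        System.out.println("SAX factory in use:      " + resolvedSaxFactory());
    }
}
```

Running this inside the application server, rather than from a plain command line, is what shows the implementation the deployed code really gets.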