1. Empty. In XSLT2



MK, Andrew Welch and DC

Andrew Welch asked

Consider the following code:

<xsl:variable name="foo" select="nothing" as="xs:string?"/>
<xsl:choose>
	<xsl:when test="$foo != ''">A</xsl:when>
	<xsl:when test="$foo = ''">B</xsl:when>
	<xsl:when test="not($foo != '')">C</xsl:when>
</xsl:choose>

When there isn't a <nothing> element, the output is C. That is:

$foo != '' is false


$foo = '' also is false 

Which is strange. If I do "$foo is empty" then Saxon tells me $foo is a string and not a nodeset. After adding the explicit cast, the test passes:

string($foo) = '' 

Which suggests that $foo isn't a string (so which is it?). It's almost as if the empty nodeset doesn't get implicitly cast like a 'populated' nodeset, and the "as" attribute is ignored. Is there a difference between the way the two are handled?

Also, is using "!= ''" a bad way of checking whether the variable has content when the variable type is 'xs:string?' (i.e. optional)?

dc made the point:

But I think it's true to say that as="xs:string" does _not_ force an empty sequence to coerce to an empty string, isn't it?

(ednote: Making the point that a cast does not occur)

mk replied:

The variable was defined as:

<xsl:variable name="foo" select="nothing" as="xs:string?"/>

The select expression yields a node-sequence, the @as expression requires atomic values, so XSLT invokes atomization. The result of atomizing an empty node-sequence is an empty sequence of strings.

The only conversions forced by the "as" attribute are atomization and numeric promotion (e.g. int to double). It doesn't cause a cast. If the "as" attribute had said "xs:string" rather than "xs:string?", a type error would be reported.

(ednote: From the XSLT 2.0 WD. [ERR XT0570] It is a type error if the supplied value of a variable cannot be converted to the required type.)
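
As a minimal sketch of what this means for Andrew's test, the three cases can be told apart explicitly. The output strings and the choice of match="/*" are illustrative assumptions, not part of the thread; empty() is the standard XPath 2.0 function, and the stylesheet assumes an input document with no schema whose root element may or may not have a single <nothing> child:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <xsl:template match="/*">
    <!-- The child <nothing> is atomized; an absent child gives (), not '' -->
    <xsl:variable name="foo" select="nothing" as="xs:string?"/>
    <xsl:choose>
      <!-- Empty sequence: there was no <nothing> element at all -->
      <xsl:when test="empty($foo)">no value</xsl:when>
      <!-- One zero-length string: <nothing/> was present but empty -->
      <xsl:when test="$foo = ''">zero-length value</xsl:when>
      <xsl:otherwise>has content</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

The explicit string($foo) = '' in the question succeeds for a related reason: string() applied to an empty sequence returns the zero-length string, so it treats "absent" and "present but empty" alike.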

I asked just what was the meaning of the '?'.

dc replied:

It should be read the way ? is read in regex or dtd syntax, as 0-or-1. You could also use + or * there, again with their regex or dtd meanings of 1-or-more or 0-or-more.

A type of xs:string requires a value that is a string. A type of xs:string? requires a value that is a sequence of 0 or 1 strings. (as always though, there is no difference between a single string and a sequence of length 1 that contains a string)

(ednote: This (for me) is subtle. The emphasis is on *zero* or one strings. Hence an empty sequence is a valid value.)

mk expressed this differently. The ? is part of the type. It means that the value (after atomization) must either be a string, or nothing (an empty sequence).
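
A small sketch of the four forms side by side. The variable names are made up for illustration, and the template matches "/" so that any input document will do:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- No indicator: exactly one string -->
    <xsl:variable name="exactly-one" as="xs:string" select="'a'"/>
    <!-- ? : zero or one string, so () is allowed -->
    <xsl:variable name="zero-or-one" as="xs:string?" select="()"/>
    <!-- * : zero or more strings -->
    <xsl:variable name="zero-or-more" as="xs:string*" select="()"/>
    <!-- + : one or more strings, so () would be a type error here -->
    <xsl:variable name="one-or-more" as="xs:string+" select="('a', 'b')"/>
    <!-- Number of items in each value: 1 0 0 2 -->
    <xsl:value-of separator=" "
        select="count($exactly-one), count($zero-or-one),
                count($zero-or-more), count($one-or-more)"/>
  </xsl:template>

</xsl:stylesheet>

Changing the select of $exactly-one or $one-or-more to () should produce the type error mentioned above, because neither type admits an empty sequence.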

Finally, dc answered Andrew's question,

q. So, what is the difference between the atomization process when the node <abc/> is present but empty, and when it's not there?

a. In one case you get an empty string "" (for which $abc = '' is true)

In the other you get an empty sequence () (for which the test $abc='' is false, as no item in the sequence is equal to "")
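
The two cases can be sketched with a temporary tree and no schema. The names $doc, root and xyz are hypothetical; xyz simply stands for an element that is not there:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Temporary tree: <abc/> is present but empty, <xyz> does not exist -->
  <xsl:variable name="doc">
    <root><abc/></root>
  </xsl:variable>

  <xsl:template match="/">
    <!-- Present but empty: atomizes to a zero-length value, so = '' is true -->
    <xsl:value-of select="$doc/root/abc = ''"/>
    <xsl:text> </xsl:text>
    <!-- Absent: an empty sequence has no item to compare, so = '' is false -->
    <xsl:value-of select="$doc/root/xyz = ''"/>
    <xsl:text> </xsl:text>
    <!-- ... and != '' is false as well, for the same reason -->
    <xsl:value-of select="$doc/root/xyz != ''"/>
    <xsl:text> </xsl:text>
    <!-- string() turns the empty sequence into '', so this is true -->
    <xsl:value-of select="string($doc/root/xyz) = ''"/>
  </xsl:template>

</xsl:stylesheet>

Run against any input document, this should write "true false false true". The middle two results are the pair that puzzled Andrew: = and != are both false when one operand is an empty sequence.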

And mk came back with:

A world of difference. The typed value (i.e. the atomized value) of an empty element <abc/> actually depends on how it's described in the schema. If there's no schema, the typed value is a zero-length untypedAtomic, which compares equal to the string "". If there's a schema that describes <abc/> as having a simple type of string, then the typed value is a single zero-length string. However, if the schema says that the type is xs:NMTOKENS, then the typed value is an empty sequence. The empty sequence contains no value that's equal to "", so abc="" returns false.

If abc is defined in the schema as a complex type that doesn't allow mixed content, then atomizing <abc/> is an error.

These rules might seem arbitrary but they reflect the fact that the meaning of an empty element actually depends on what might have been there if it weren't empty.

I found that last sentence almost philosophical.

Thank you gentlemen for an enlightening thread.

If I can get past the DC math filter without reprimand,

sorry not today...

  I think his point that the {}  empty set is a member of
  any set was the Ah-ha moment for me.
  That's what  allows the as="xs:string?" to succeed rather than
  report the error.

The empty set isn't a _member_ of every set.

{} isn't a member of {1,2,3} for example.

It is a member of {{}} though (the set with one member, the empty set). It is a _subset_ of every set.