
XSD, a quick introduction

I often get asked to explain how XSDs work, and every time I explain it in a different manner. I thought that if I documented one way here, I could reference it for the rest of my life 🙂
Below is an example XSD that I will talk about. This is by no means a complete tutorial, just a simple look into the world of XSD.

XSD EXAMPLE

<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema">

<element name="student">
  <complexType>
    <sequence>
      <element name="name" type="string"/>
      <element name="pnr">
        <simpleType>
          <restriction base="string">
            <pattern value="[0-9]{2}[0-1][0-9][0-9][0-9][0-9a-z][0-9]{3}"/>
          </restriction>
        </simpleType>
      </element>
      <element name="address" type="string"/>
      <element name="zip" type="integer"/>
      <element name="city">
        <simpleType>
          <restriction base="string">
            <maxLength value="82"/>
            <minLength value="3"/>
          </restriction>
        </simpleType>
      </element>
    </sequence>
  </complexType>
</element>

</schema>

The root tag in this XSD is the element ‘student’. It defines the name of the “highest” element in the XML. The XSD also gives us the names of the tags below the root element: name, pnr, address, zip and city. Because they are declared inside a sequence, they have to appear in exactly that order. Let us look at each tag separately:
‘name’ – This tag has to obey the rules of the built-in type string. That includes pretty much any text of any length (even empty).
‘pnr’ – For this tag we cannot use a built-in XSD type, so we have to create our own. To do this we set a pattern value. This is a regular expression that has to match the whole value. In this case it demands that ‘pnr’ starts with 2 digits, followed by a digit that is 0 or 1, 3 more digits and 1 character that can be either a digit or a lowercase letter from a to z. It ends with 3 digits between 0 and 9.
‘address’ – This element also has to obey the rules of the built-in string type.
‘zip’ – This element has to obey the integer type, which is more stringent than string: only an optional sign followed by digits is allowed.
‘city’ – This element has to obey the two built-in restrictions minLength (3 characters) and maxLength (82 characters).
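
If you want to play with the two hand-written restrictions without a validator at hand, here is a minimal Python sketch. It assumes that Python's re module reads this particular pattern the same way XSD does (reasonable for plain character classes and counts like these, but not true for XSD regular expressions in general). The helper names pnr_ok and city_ok are made up for this illustration.

```python
import re

# The 'pnr' pattern copied from the XSD above.
PNR_PATTERN = re.compile(r"[0-9]{2}[0-1][0-9][0-9][0-9][0-9a-z][0-9]{3}")


def pnr_ok(value):
    """Hypothetical helper: mimic the <pattern> restriction on 'pnr'."""
    # fullmatch, because an XSD pattern has to cover the entire value.
    return PNR_PATTERN.fullmatch(value) is not None


def city_ok(value):
    """Hypothetical helper: mimic minLength=3 and maxLength=82 on 'city'."""
    return 3 <= len(value) <= 82


print(pnr_ok("1214567890"))   # True
print(pnr_ok("121456-7890"))  # False, the dash is not allowed
print(city_ok("Gothenburg"))  # True
print(city_ok("NY"))          # False, shorter than 3 characters
```

This only mirrors the two restrictions; it does not replace running the XSD through a real validator.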

A few words about simpleType and complexType
When we need to create our own restrictions we need to encapsulate them in a simpleType or a complexType. A simpleType describes a plain value (text only, no child elements or attributes), like ‘pnr’ and ‘city’ above. A complexType describes an element that contains other elements or attributes, which is how the whole ‘student’ structure in the example is described.

Matching XML example

<?xml version="1.0"?>

<student>
  <name>Niklas</name>
  <pnr>1214567890</pnr>
  <address>Kungsgatan 1</address>
  <zip>12345</zip>
  <city>Gothenburg</city>
</student>

Non matching XML example

<?xml version="1.0"?>

<student>
  <name>Niklas</name>
  <pnr>121456-7890</pnr>
  <address>Kungsgatan 1</address>
  <zip>123 45</zip>
  <city>Gothenburg</city>
</student>

It’s often more interesting to talk about non-matching examples, since they give us a more in-depth look at the problems you could encounter while creating your XSD. Here the ‘pnr’ element contains a dash (‘-’), which our pattern does not allow: it only accepts digits and, in the seventh position, a lowercase letter. The ‘zip’ element is also invalid, since we demand a value of type ‘integer’. The whitespace between ‘3’ and ‘4’ means the value is not an integer.
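
To see which elements trip over which rule, the checks can be sketched in a few lines of Python. This is a rough stand-in under stated assumptions, not a real XSD validator: the RULES table and the failing_elements function are hypothetical names, the rules are hand-written copies of the schema restrictions, and the sketch ignores element order and missing elements.

```python
import re
import xml.etree.ElementTree as ET

NON_MATCHING = """<student>
  <name>Niklas</name>
  <pnr>121456-7890</pnr>
  <address>Kungsgatan 1</address>
  <zip>123 45</zip>
  <city>Gothenburg</city>
</student>"""

PNR = r"[0-9]{2}[0-1][0-9][0-9][0-9][0-9a-z][0-9]{3}"

# Hand-written stand-ins for the schema rules (assumption, see above).
RULES = {
    "pnr": lambda v: re.fullmatch(PNR, v) is not None,
    # integer: optional sign, then digits. Whitespace around the value is
    # ignored, whitespace inside the number is not allowed.
    "zip": lambda v: re.fullmatch(r"[+-]?[0-9]+", v.strip()) is not None,
    "city": lambda v: 3 <= len(v) <= 82,
}


def failing_elements(xml_text):
    """Return the names of child elements that break a rule, in document order."""
    root = ET.fromstring(xml_text)
    failures = []
    for child in root:
        rule = RULES.get(child.tag)  # 'name' and 'address' have no extra rule
        if rule is not None and not rule(child.text or ""):
            failures.append(child.tag)
    return failures


print(failing_elements(NON_MATCHING))  # ['pnr', 'zip']
```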

To get a feel for the types and restrictions that are built in, I have listed a bunch of them below:
Built-in types:
decimal, float, double, integer, positiveInteger, negativeInteger, nonPositiveInteger, nonNegativeInteger, long, int, short, byte, unsignedLong, unsignedInt, unsignedShort, unsignedByte, dateTime, date, gYearMonth, gYear, duration, gMonthDay, gDay, gMonth, string, normalizedString, token, language, NMTOKEN, NMTOKENS, Name, NCName, ID, IDREFS, ENTITY, ENTITIES, QName, boolean, hexBinary, base64Binary, anyURI, NOTATION

Built-in restrictions:
minExclusive, minInclusive, maxExclusive, maxInclusive, totalDigits, fractionDigits, length, minLength, maxLength, enumeration, whiteSpace, pattern

XSDs can easily become very complicated, but when they do, please consider following the KISS rule and simplify them. You might understand an XSD that you have created yourself, no matter how complicated it is, but a colleague might not. This is the reason I have not touched the subjects of namespaces and imports, which often overcomplicate things.

Hope you found this small introduction to the world of XSDs useful.

Changing date format in Trac using mod_wsgi

Trac is a marvelous tool for us developers. Unfortunately it has some quirks. One is that you cannot change the date format in a “normal” way, e.g. through the admin panel. To change it you have to add a bit of code to trac.wsgi:

environ['trac.locale'] = 'sv_SE.UTF-8' # Any valid locale will do

Be sure to add it right after the “def application” line, like this:

...
import os

def application(environ, start_request):
    environ['trac.locale'] = 'sv_SE.UTF-8'
    if not 'trac.env_parent_dir' in environ:
        environ.setdefault('trac.env_path', '/var/trac/my_project')
...

After you have made the change be sure to restart the webserver!
For Apache on Debian:

/etc/init.d/apache2 restart

Tested on Debian Wheezy and Trac v0.12.3

Validate elements in any order and any number of times using XSD

Sometimes you just want to validate a set of elements in any order and any number of times. There is no intuitive way to accomplish this in XSD, so we have to use a trick to get it to work. The solution looks a little like this:

<xs:element name="myElement">
  <xs:complexType>
    <xs:sequence minOccurs="0" maxOccurs="unbounded">
      <xs:choice>
        <xs:element name="myId" type="xs:int" />
        <xs:element name="myName" type="xs:string" />
      </xs:choice>
    </xs:sequence>
  </xs:complexType>
</xs:element>

This lets us validate any of the elements listed in the <choice> (myId and myName) any number of times and in any order, so the following will validate:

<myElement>
  <myId>3</myId>
  <myName>Niklas</myName>
</myElement>

and:

<myElement>
  <myName>Niklas</myName>
  <myId>3</myId> 
</myElement>

and:

<myElement>
  <myId>3</myId>
  <myName>Niklas</myName>
  <myName>Anders</myName>
</myElement>

but not:

<myElement>
  <myId>hello</myId>
  <myName>Niklas</myName>
</myElement>

The last one does not validate since ‘hello’ is not an integer.
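
To make the rule behind the trick a bit more concrete, here is a small Python sketch of what the repeated <choice> effectively accepts: zero or more children, each of which has to be one of the listed elements with a valid value. It is only an approximation and not a replacement for a validator. The names is_xs_int, ALLOWED and validates are made up for this illustration, and stray text between the child elements is not checked.

```python
import re
import xml.etree.ElementTree as ET


def is_xs_int(text):
    """Rough stand-in for xs:int: a whole number that fits in 32 signed bits."""
    text = text.strip()
    if re.fullmatch(r"[+-]?[0-9]+", text) is None:
        return False
    return -2**31 <= int(text) <= 2**31 - 1


# One entry per element in the <choice>; xs:string accepts any text.
ALLOWED = {"myId": is_xs_int, "myName": lambda text: True}


def validates(xml_text):
    """True if every child of myElement is an allowed element with a valid value."""
    root = ET.fromstring(xml_text)
    if root.tag != "myElement":
        return False
    # No counting and no order check: any mix of allowed children passes,
    # including no children at all (minOccurs="0").
    return all(
        child.tag in ALLOWED and ALLOWED[child.tag](child.text or "")
        for child in root
    )


print(validates("<myElement><myName>Niklas</myName><myId>3</myId></myElement>"))   # True
print(validates("<myElement><myId>hello</myId><myName>Niklas</myName></myElement>"))  # False
```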

Tested with xmllint with libxml version 20708