XPath Expressions Dom4j vs XMLSpy


Hello,

this is my first real article so I hope I get things right…

This article is about a problem I recently ran into while using XPath expressions in Dom4j and XMLSpy. The goal was to select certain nodes in an XML file. To give you a better idea of what I wanted to select, here is the relevant snippet of the XML file:

   <!-- Individual: http://www.co-ode.org/ontologies/pizza/pizza.owl#Germany -->
   <owl:Thing rdf:about="#Germany">
       <rdf:type rdf:resource="#Country"/>
   </owl:Thing>

   <!-- Individual: http://www.co-ode.org/ontologies/pizza/pizza.owl#Italy -->
   <owl:Thing rdf:about="#Italy">
      <rdf:type rdf:resource="#Country"/>
   </owl:Thing>

   <owl:Thing rdf:about="#AmeriCunt">
       <rdf:type rdf:resource="#American"/>
   </owl:Thing>

   <American rdf:ID="Lappland"/>

   <AmericanHot rdf:ID="Toska"></AmericanHot>

   <AnchoviesTopping rdf:ID="Lappland"/>

   <ArtichokeTopping rdf:ID="Lappland"/>

   <AsparagusTopping rdf:ID="Lappland"/>

I wanted to select the five unprefixed elements at the end of the snippet (<American> through <AsparagusTopping>) via XPath expressions. So I tried to select the first of them with the expression “//American”. In XMLSpy everything worked just fine. So I took my Java code and tried to run the same XPath expression with Dom4j (see code below):

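import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;
import org.dom4j.XPath;
import org.dom4j.io.SAXReader;

// The following lines belong inside a method
Document doc = null;
try {
      doc = new SAXReader().read(
              new FileInputStream("files/pizza.owl"));

      // Expression without a prefix, exactly as it was typed into XMLSpy
      XPath xpathSelector =
      DocumentHelper.createXPath("//American");

      // selectNodes returns generic nodes, so iterate over Node rather than Element
      List<Node> elements = xpathSelector.selectNodes(doc);
      for (Node node : elements) {
          System.out.println("Found");
      }

} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (DocumentException e) {
    e.printStackTrace();
}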

That didn’t work so well: I ended up with an empty result set. After hours I finally came up with a solution to this problem (although I am not sure whether my explanation of why the problem occurred with Dom4j and not with XMLSpy is accurate, so treat it as a guess).

I figured out that the elements I tried to select have no namespace prefix, but that does not mean they belong to no namespace: if the XML file declares a default namespace (an xmlns="..." attribute without a prefix), unprefixed elements belong to it. The elements I wanted to select sit one level below the root element of the XML file, and the root element declares such a default namespace.
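To make the point about the default namespace concrete, here is a minimal sketch that parses a cut-down stand-in for pizza.owl and prints the namespace of the unprefixed <American> element. It uses the JDK's built-in DOM parser rather than Dom4j so that it runs without extra libraries; the class name NamespaceInspector, the method namespaceOf and the shortened XML string are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NamespaceInspector {

    // Hypothetical, cut-down stand-in for pizza.owl (not the real file)
    static final String XML =
        "<rdf:RDF xmlns=\"http://www.co-ode.org/ontologies/pizza/pizza.owl#\""
      + " xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
      + "<American rdf:ID=\"Lappland\"/>"
      + "</rdf:RDF>";

    /** Returns the namespace URI of the first element with the given local name, or null if none exists. */
    static String namespaceOf(String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // without this the parser ignores namespaces
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            // "*" matches elements in any namespace
            Node node = doc.getElementsByTagNameNS("*", localName).item(0);
            return node == null ? null : node.getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://www.co-ode.org/ontologies/pizza/pizza.owl#
        System.out.println(namespaceOf("American"));
    }
}
```

Even though <American> carries no prefix, the parser reports the URI from the root element's xmlns attribute as its namespace.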

My guess is that XMLSpy works out the namespaces of a given document on its own and can therefore select elements with an expression like “//American”. Dom4j, on the other hand, doesn’t do that: it evaluates XPath 1.0 (through the Jaxen library), where a name without a prefix in an expression always refers to an element in no namespace, regardless of any default namespace declared in the document. That is why I ended up with an empty result set on my first try.
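The empty result can be reproduced without Dom4j. The sketch below is an assumption-laden illustration, not the code from this article: it uses the JDK's own XPath 1.0 engine (javax.xml.xpath) on a shortened, made-up stand-in for pizza.owl, and the class name DefaultNamespaceDemo and the method count are hypothetical. It also shows local-name(), a standard XPath function that ignores namespaces, as an alternative I did not use in my own solution.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DefaultNamespaceDemo {

    // Hypothetical, cut-down stand-in for pizza.owl (not the real file)
    static final String XML =
        "<rdf:RDF xmlns=\"http://www.co-ode.org/ontologies/pizza/pizza.owl#\""
      + " xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
      + "<American rdf:ID=\"Lappland\"/>"
      + "</rdf:RDF>";

    /** Counts the nodes that an XPath 1.0 expression selects in the sample document. */
    static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // keep namespace information in the DOM
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed name: only matches elements in no namespace
        System.out.println(count("//American"));                   // prints 0
        // local-name() compares the name only and ignores the namespace
        System.out.println(count("//*[local-name()='American']")); // prints 1
    }
}
```

The local-name() form is handy for quick checks, but it would also match an <American> element from any other namespace, so binding a prefix is the more precise option.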

What I did was take the namespace that all elements without a prefix belong to (the xmlns attribute on the first line of the root element below) and make it known to Dom4j:

<rdf:RDF xmlns="http://www.co-ode.org/ontologies/pizza/pizza.owl#"
     xml:base="http://www.co-ode.org/ontologies/pizza/pizza.owl"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:owl="http://www.w3.org/2002/07/owl#">

Java code (the relevant changes are the namespace map, the “standard:” prefix in the expression, and the setNamespaceContext call):

// Needs java.util.HashMap, java.util.Map and org.jaxen.SimpleNamespaceContext
// in addition to what the first example needs
try {
    doc = new SAXReader().read(new FileInputStream("files/pizza.owl"));

    // Bind a prefix of our own choosing to the document's default namespace
    Map<String, String> map = new HashMap<>();
    map.put("standard", "http://www.co-ode.org/ontologies/pizza/pizza.owl#");

    // Use that prefix in the expression and register the binding
    XPath xpathSelector = DocumentHelper.createXPath("//standard:American");
    xpathSelector.setNamespaceContext(new SimpleNamespaceContext(map));

    List<Node> elements = xpathSelector.selectNodes(doc);
    for (Node node : elements) {
        System.out.println("Found");
    }

} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (DocumentException e) {
    e.printStackTrace();
}

Now everything was working fine and I got my desired elements…
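For comparison, the same prefix-binding idea also works with the XPath engine that ships with the JDK. This is only a sketch under assumptions: the XML string is a made-up, shortened stand-in for pizza.owl, and PrefixBindingDemo, CONTEXT and count are hypothetical names. It is not Dom4j's API; it just mirrors what SimpleNamespaceContext does in the code above.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PrefixBindingDemo {

    static final String PIZZA_NS = "http://www.co-ode.org/ontologies/pizza/pizza.owl#";

    // Hypothetical, cut-down stand-in for pizza.owl (not the real file)
    static final String XML =
        "<rdf:RDF xmlns=\"" + PIZZA_NS + "\""
      + " xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
      + "<American rdf:ID=\"Lappland\"/>"
      + "<AmericanHot rdf:ID=\"Toska\"/>"
      + "</rdf:RDF>";

    /** Binds the single prefix "standard" to the pizza namespace. */
    static final NamespaceContext CONTEXT = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            return "standard".equals(prefix) ? PIZZA_NS : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            return PIZZA_NS.equals(namespaceURI) ? "standard" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            return prefix == null
                    ? Collections.<String>emptyIterator()
                    : Collections.singletonList(prefix).iterator();
        }
    };

    /** Counts the nodes selected by the expression, with the "standard" prefix bound. */
    static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(CONTEXT);
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//standard:American")); // prints 1
        System.out.println(count("//American"));          // prints 0
    }
}
```

The prefix name is arbitrary: only the URI it is bound to has to match the default namespace declared in the document.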
