XML and XQuery

What is XQuery?

XQuery is to XML what SQL is to databases.

XQuery was designed to query XML data.

XQuery Example

for $x in doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title

Continue reading XML and XQuery

XML and XSLT

With XSLT you can transform an XML document into HTML.


Displaying XML with XSLT

XSLT (eXtensible Stylesheet Language Transformations) is the recommended style sheet language for XML.

XSLT is far more sophisticated than CSS. With XSLT you can add/remove elements and attributes to or from the output file. You can also rearrange and sort elements, perform tests and make decisions about which elements to hide and display, and a lot more.

XSLT uses XPath to find information in an XML document. Continue reading XML and XSLT

XML and XPath

What is XPath?

XPath is a major element in the XSLT standard.

XPath can be used to navigate through elements and attributes in an XML document.

  • XPath is a syntax for defining parts of an XML document
  • XPath uses path expressions to navigate in XML documents
  • XPath contains a library of standard functions
  • XPath is a major element in XSLT and in XQuery
  • XPath is a W3C recommendation

XPath Path Expressions

XPath uses path expressions to select nodes or node-sets in an XML document. These path expressions look very much like the expressions you see when you work with a traditional computer file system.

XPath expressions can be used in JavaScript, Java, XML Schema, PHP, Python, C and C++, and lots of other languages.

Continue reading XML and XPath

XML DOM

What is the DOM?

The Document Object Model (DOM) defines a standard for accessing and manipulating documents:

“The W3C Document Object Model (DOM) is a platform and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure, and style of a document.”

The HTML DOM defines a standard way for accessing and manipulating HTML documents. It presents an HTML document as a tree-structure.

The XML DOM defines a standard way for accessing and manipulating XML documents. It presents an XML document as a tree-structure.

Understanding the DOM is a must for anyone working with HTML or XML.


Continue reading XML DOM

XML Parser

All major browsers have a built-in XML parser to access and manipulate XML.

XML Parser

The XML DOM (Document Object Model) defines the properties and methods for accessing and editing XML.

However, before an XML document can be accessed, it must be loaded into an XML DOM object.

All modern browsers have a built-in XML parser that can convert text into an XML DOM object.

Continue reading XML Parser

XML HttpRequest

All modern browsers have a built-in XMLHttpRequest object to request data from a server.


The XMLHttpRequest Object

The XMLHttpRequest object can be used to request data from a web server.

The XMLHttpRequest object is a developers dream, because you can:

  • Update a web page without reloading the page
  • Request data from a server – after the page has loaded
  • Receive data from a server  – after the page has loaded
  • Send data to a server – in the background

Continue reading XML HttpRequest

Displaying XML

Raw XML files can be viewed in all major browsers.

Don’t expect XML files to be displayed as HTML pages.

Viewing XML Files

<?xml version="1.0" encoding="UTF-8"?>
- <note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

 

Most browsers will display an XML document with color-coded elements.

Often a plus (+) or minus sign (-) to the left of the elements can be clicked to expand or collapse the element structure.

To view raw XML source, try to select “View Page Source” or “View Source” from the browser menu.

Note: In Safari 5 (and earlier), only the element text will be displayed. To view the raw XML, you must right click the page and select “View Source”. Continue reading Displaying XML

XML Namespaces

XML Namespaces provide a method to avoid element name conflicts.

Name Conflicts

In XML, element names are defined by the developer. This often results in a conflict when trying to mix XML documents from different XML applications.

This XML carries HTML table information:

<table>
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

This XML carries information about a table (a piece of furniture):

<table>
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>

If these XML fragments were added together, there would be a name conflict. Both contain a <table> element, but the elements have different content and meaning.

A user or an XML application will not know how to handle these differences. Continue reading XML Namespaces

XML Attributes

XML elements can have attributes, just like HTML.

Attributes are designed to contain data related to a specific element.

XML Attributes Must be Quoted

Attribute values must always be quoted. Either single or double quotes can be used.

For a person’s gender, the <person> element can be written like this:

<person gender="female">

or like this:

<person gender='female'>

If the attribute value itself contains double quotes you can use single quotes, like in this example:

<gangster name='George "Shotgun" Ziegler'>

or you can use character entities:

<gangster name="George &quot;Shotgun&quot; Ziegler">

Continue reading XML Attributes

XML Elements

An XML document contains XML Elements.

What is an XML Element?

An XML element is everything from (including) the element’s start tag to (including) the element’s end tag.

<price>29.99</price>

An element can contain:

  • text
  • attributes
  • other elements
  • or a mix of the above
<bookstore>
  <book category="children">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>

In the example above:

<title>, <author>, <year>, and <price> have text content because they contain text (like 29.99).

<bookstore> and <book> have element contents, because they contain elements.

<book> has an attribute (category=”children”).


Empty XML Elements

An element with no content is said to be empty.

In XML, you can indicate an empty element like this:

<element></element>

You can also use a so called self-closing tag:

<element />

The two forms produce identical results in XML software (Readers, Parsers, Browsers).

Empty elements can have attributes.

XML Naming Rules

XML elements must follow these naming rules:

  • Element names are case-sensitive
  • Element names must start with a letter or underscore
  • Element names cannot start with the letters xml (or XML, or Xml, etc)
  • Element names can contain letters, digits, hyphens, underscores, and periods
  • Element names cannot contain spaces

Any name can be used, no words are reserved (except xml).


Best Naming Practices

Create descriptive names, like this: <person>, <firstname>, <lastname>.

Create short and simple names, like this: <book_title> not like this: <the_title_of_the_book>.

Avoid “-“. If you name something “first-name”, some software may think you want to subtract “name” from “first”.

Avoid “.”. If you name something “first.name”, some software may think that “name” is a property of the object “first”.

Avoid “:”. Colons are reserved for namespaces (more later).

Non-English letters like éòá are perfectly legal in XML, but watch out for problems if your software doesn’t support them!


Naming Conventions

Some commonly used naming conventions for XML elements:

Style Example Description
Lower case <firstname> All letters lower case
Upper case <FIRSTNAME> All letters upper case
Snake case <first_name> Underscore separates words (commonly used in SQL databases)
Pascal case <FirstName> Uppercase first letter in each word (commonly used by C programmers)
Camel case <firstName> Uppercase first letter in each word except the first (commonly used in JavaScript)

Tip! Choose your naming style, and be consistent about it!

XML documents often have a corresponding database. A common practice is to use the naming rules of the database for the XML elements.


XML Elements are Extensible

XML elements can be extended to carry more information.

Look at the following XML example:

<note>
  <to>Tove</to>
  <from>Jani</from>
  <body>Don't forget me this weekend!</body>
</note>

Let’s imagine that we created an application that extracted the <to>, <from>, and <body> elements from the XML document to produce this output:

MESSAGETo: Tove
From: Jani

Don’t forget me this weekend!

Imagine that the author of the XML document added some extra information to it:

<note>
  <date>2008-01-10</date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Should the application break or crash?

No. The application should still be able to find the <to>, <from>, and <body> elements in the XML document and produce the same output.

This is one of the beauties of XML. It can be extended without breaking applications.