SQL Server and XML
Just a few short years ago we would display data from our database on the Web as HTML using Active Server Pages to select and format the data. We would also send and receive data as plain text files in business-to-business (B2B) applications.
XML (Extensible Markup Language) is changing all of that, and SQL Server 2000's XML-related features are
helping to speed up the process of change. SQL Server provides native built-in
support for XML that allows us to select and write XML data in SQL Server. What
this means is that we can write queries and stored procedures that allow us to
retrieve data in XML format. SQL Server stores data natively as XML, and when
XML data is requested SQL Server formats the data selected into the required
XML format. All of this happens behind the scenes and is part of SQL Server's
built-in support for XML.
SQL Server also provides several features that allow us to select XML data using a URL (Universal Resource Locator) just as we would use a URL to access a web page. SQL Server also provides the XML-Data Reduced (XDR) language, which allows us to use annotated XDR schemas to describe the structure of the data in an XML document.
This chapter explores these new XML features in SQL Server 2000 and takes a look at how you can exploit them to your benefit. We will also take a look at how we can use XSL (Extensible Stylesheet Language) to format and display XML data.
So, this chapter will cover:
q A brief introduction to XML and its related technologies
q Accessing data using a URL
q Creating and using stored procedures that return XML formatted data
q Creating and using XML templates
q Creating and using XSL templates
q Creating and using annotated XDR schemas
XML is a huge topic, so we can only give a brief introduction here. For a more complete grounding in XML and its related technologies refer to David Hunter's Beginning XML, ISBN 1-861003-41-2, from Wrox Press, or visit the web links included in Appendix F.
XML is an acronym for Extensible Markup Language and is a language that is used to describe data and how it should be displayed. XML is not just for the Web, which is what it is commonly used for, as it can be used anywhere that self-describing data is needed. An example of this is in B2B (business-to-business) applications where an exchange of data is required. By using self-describing data, any recipient of that data can work with it and know what each item is. This is the great power of XML.
XML markup describes an XML document using tags and attributes in the document. The tags, also known as element names, describe the data, and the attributes describe some attribute of the actual data elements themselves. For example, consider the following element:
<FirstName Employee_ID="1">Thearon</FirstName>
The <FirstName> tag describes that the data is the first name of an employee and the attribute of Employee_ID describes the employee ID of this employee. The tag name and attribute name suggest the data structure, and the values 1 and Thearon represent the actual data itself.
In XML, unlike HTML, the developer defines the tags and attributes, so they can be given meaningful names. XML allows you to nest one element within another to represent groups of data.
Most tags in an XML document have a corresponding closing tag. The closing tag begins with a forward slash and contains the same name as the beginning tag. Using a corresponding closing tag for each defined tag creates what is known as a well-formed document that strictly adheres to the XML standard. Other rules for defining a well-formed XML document include ensuring that there is only one root element and that the element names do not contain spaces. We'll see more of this in just a few minutes.
To define a document as an XML document we must specify the <?XML?> element that allows any XML-compliant browser, such as Microsoft Internet Explorer, to properly parse and display the document. This element has several attributes and the syntax is shown:
<?XML version="version" standalone="DTDflag" encoding="encodingname" ?>
The version attribute specifies the version number of XML that is being used. The current version of XML is 1.0.
The DTDflag attribute is an optional argument that specifies a Boolean value (either yes or no) indicating whether or not this XML file includes a reference to an external Document Type Definition (DTD). A DTD defines the elements and attributes that can be used in an XML document and can also contain pre-defined XML data definitions. It also lays out any rules about how many of each element should be present, and in what order.
The optional encodingname attribute specifies the type of character set encoding used in the document. This argument is usually one of the character sets supported by Microsoft Internet Explorer, such as UTF-8. UTF-8 supports Unicode data that allows you to use non-ASCII characters.
Looking at the following example below we see that the <Employees> tag has a corresponding closing tag of </Employees>. This is very similar to HTML, where the closing tags for various HTML elements contain a forward slash to indicate a closing tag. Some XML tags do not require a separate closing tag as
this can be accomplished in the beginning tag. For example, suppose we had a tag for middle name and the employee did not have a middle name, we could specify the tag as <MiddleName />. This would indicate that there was no data for the <MiddleName> element.
XML data can usually be structured in a hierarchical way. The <Employees> element in the example below is referred to as the root element as it is the topmost data element in the document:
<?xml version="1.0" encoding="utf-8" ?>
<Employees>
<Employee>
<EmployeeID>1</EmployeeID>
<FirstName>Thearon</FirstName>
<LastName>Willis</LastName>
</Employee>
<Employee>
<EmployeeID>4</EmployeeID>
<FirstName>Cheryl</FirstName>
<LastName>Carson</LastName>
</Employee>
</Employees>
Within the Employees element we have two Employee elements, each with three child elements, EmployeeID, FirstName, and LastName.
In this example, and XML in general, elements form a hierarchy of parent-child relationships. The <Employees> element represents a group of employees and is the root element. The <Employee> parent element represents a single employee. Within the <Employee> element we have specified each of the employee child elements and each element has a tag describing the data (for example <FirstName>).
XML is case sensitive, meaning an element with a name of EmployeeID is not the same as an element with a name of employeeid. This is true for all XML elements and attributes.
When we save this file we save it with a .xml extension. The file can then be viewed in a browser, such as Microsoft Internet Explorer, by simply double-clicking on the file within Windows Explorer. This figure shows the example as it is displayed in IE5: |
|
Notice that each group of data in our document is expandable and collapsible, as indicated by the dash next to the group name. This means that we can collapse the employee groups, which are children of the employees group.
Note: this is a feature of Microsoft Internet Explorer 5.0 and other browsers may not allow you to expand and collapse the element groups in this way.
Now that we know what XML looks like, let's move on and see how SQL Server 2000 can help us create and display XML in a browser.