DTD

Introduction

A DTD (Document Type Definition) is a plain-text set of declarations that describes the permitted structure of a class of documents.

Example

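This is a DTD for an XML file that holds a list of tasks. Each task carries a required rrn attribute and contains a date, initials and a detail element, in that order. Hopefully, it is reasonably self-explanatory!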

<!ELEMENT tasks (task)*>
<!ELEMENT task (date, initials, detail)>
<!ATTLIST task rrn CDATA #REQUIRED>
<!ELEMENT date (#PCDATA)>
<!ELEMENT initials (#PCDATA)>
<!ELEMENT detail (#PCDATA)>

A DTD is a collection of declarations that defines the valid structure of a document conforming to it: which elements and attributes may be present and how they relate to one another. It consists of element, attribute, entity and notation declarations.

One of the greatest strengths of XML is that it allows you to create your own tag names. But for any given application, it is probably not meaningful for tags to occur in a completely arbitrary order. If a document is to have meaning, there must be some constraint on the sequence and nesting of tags. Declarations are where these constraints can be expressed.

XML content can be processed without a document type declaration. However, the declaration is required if the document is to be validated, or if it relies on entities or default attribute values declared in the DTD.

A document is well-formed when its elements, delimited by their start and end tags, are nested properly within one another and there is a unique root element. The well-formedness rules only determine whether the tags are correctly formed and nested. They do not check that the correct information is present from the viewpoint of any business requirement. In the tasks example above, each task must have a detail element, and the initials must follow the date. Requirements like these can be enforced by recording them in a DTD or XML Schema and validating the XML document against it.
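The difference between well-formed and valid can be sketched in a few lines of Python. This is only an illustration: the standard-library parser used here checks well-formedness but does not validate against a DTD, so the function task_problems is a hand-written, hypothetical stand-in for the rules in the tasks DTD, and the sample dates, initials and text are made up.

```python
import xml.etree.ElementTree as ET

# Sample documents (invented data, for illustration only).
VALID_DOC = """<tasks>
  <task rrn="1">
    <date>2004-03-01</date>
    <initials>AB</initials>
    <detail>Order new toner</detail>
  </task>
</tasks>"""

# Well-formed, but the task lacks the rrn attribute and the detail element.
INCOMPLETE_DOC = (
    "<tasks><task><date>2004-03-01</date>"
    "<initials>AB</initials></task></tasks>"
)

# Not well-formed: the date and initials elements overlap.
BROKEN_DOC = (
    "<tasks><task><date>2004-03-01<initials></date>"
    "AB</initials></task></tasks>"
)


def is_well_formed(text):
    """Return True if the text parses as XML at all."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def task_problems(text):
    """Hand-written stand-in for validating against the tasks DTD.

    Checks only the root name, the required rrn attribute and the
    order of child elements; a real validating parser checks more.
    """
    problems = []
    root = ET.fromstring(text)
    if root.tag != "tasks":
        problems.append("root element must be tasks")
    for number, child in enumerate(root, start=1):
        if child.tag != "task":
            problems.append(f"child {number}: expected task, found {child.tag}")
            continue
        if "rrn" not in child.attrib:
            problems.append(f"task {number}: missing required attribute rrn")
        if [c.tag for c in child] != ["date", "initials", "detail"]:
            problems.append(f"task {number}: content must be date, initials, detail")
    return problems


if __name__ == "__main__":
    for name, doc in [("valid", VALID_DOC),
                      ("incomplete", INCOMPLETE_DOC),
                      ("broken", BROKEN_DOC)]:
        ok = is_well_formed(doc)
        print(name, "well-formed:", ok)
        if ok:
            print("  problems:", task_problems(doc))
```

The incomplete document passes the well-formedness check and only the extra rule check reports it, which is the gap a DTD and a validating parser are meant to close.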

Including a DTD

If present, the document type declaration must appear before the root element; only the XML declaration, processing instructions and comments may precede it. The document type declaration identifies the root element of the document and may contain additional declarations. All XML documents must have a single root element that contains all of the content of the document. Additional declarations may come from an external DTD, called the external subset, or be included directly in the document, the internal subset, or both:

Example

The following document type declaration uses both an external and an internal subset.

<?xml version="1.0" standalone="no"?>
<!DOCTYPE chapter SYSTEM "dbook.dtd" [
<!ENTITY % ulink.module "IGNORE">
<!ELEMENT ulink (#PCDATA)*>
<!ATTLIST ulink
    xml:link        CDATA   #FIXED "SIMPLE"
    xml-attributes  CDATA   #FIXED "HREF URL"
    URL             CDATA   #REQUIRED>
]>
<chapter>...</chapter>

This example references an external DTD, dbook.dtd, and includes element and attribute declarations for the ulink element in the internal subset.

In this case, ulink is being given the semantics of a simple link.
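To see which parts of a document type declaration a parser reports, the following sketch uses the StartDoctypeDeclHandler callback of Python's xml.parsers.expat module. The document, the file name tasks.dtd and the priority attribute are assumptions made up for this illustration, and the helper read_doctype is hypothetical. As used here, the parser does not fetch the external file; it only reports the reference to it.

```python
import xml.parsers.expat

# Hypothetical document: tasks.dtd and the priority attribute are
# invented for this sketch and are not part of the tasks DTD above.
DOC = """<?xml version="1.0" standalone="no"?>
<!DOCTYPE tasks SYSTEM "tasks.dtd" [
  <!ATTLIST task priority CDATA "normal">
]>
<tasks></tasks>"""


def read_doctype(text):
    """Return what the parser reports about the document type declaration."""
    seen = {}

    def start_doctype(name, system_id, public_id, has_internal_subset):
        seen["root"] = name
        seen["system_id"] = system_id
        seen["public_id"] = public_id
        seen["has_internal_subset"] = bool(has_internal_subset)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = start_doctype
    parser.Parse(text, True)  # True marks this as the final chunk of input
    return seen


if __name__ == "__main__":
    print(read_doctype(DOC))
```

The reported name is the root element named in the declaration, the system identifier points at the external subset, and the last flag records whether an internal subset was present.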

Note that entity and attribute declarations in the internal subset override declarations of the same name in the external subset. The XML processor reads the internal subset before the external subset, and the first declaration takes precedence. Element declarations are different: a valid document may not declare the same element type twice. In order to determine if a document is valid, the XML processor must read the entire document type declaration (both internal and external subsets).
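The "first declaration takes precedence" rule can be observed with Python's standard-library parser. This is a sketch under one loud assumption: the parser as used here does not load an external subset, so both competing declarations are placed in the internal subset, with the later one standing in for a declaration that would normally arrive from the external DTD. The owner entity and the status attribute are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Both declarations of each name sit in the internal subset; the second
# of each pair plays the role of a declaration read later from an
# external DTD (an assumption of this sketch).
DOC = """<!DOCTYPE tasks [
  <!ENTITY owner "declared first">
  <!ENTITY owner "declared later">
  <!ATTLIST task status CDATA "open">
  <!ATTLIST task status CDATA "closed">
]>
<tasks><task rrn="1"><detail>&owner;</detail></task></tasks>"""

root = ET.fromstring(DOC)
task = root.find("task")

# The entity reference expands to the first declared replacement text.
entity_value = task.find("detail").text

# The attribute default comes from the first attribute declaration.
default_status = task.get("status")

if __name__ == "__main__":
    print("entity:", entity_value)
    print("default attribute:", default_status)
```

In both cases the later declaration is silently ignored, which is why a declaration placed in the internal subset wins over one of the same name in the external subset.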