For sophisticated authoring and publishing, document validation is a must-have. It ensures quality control during both manual authoring and automated publishing.
During authoring, the XML editor itself “enforces” validity by allowing only operations that are valid in the current context. For example, it only allows inserting the elements that the structure permits at that position. Using the example above, we can insert only one form-group element under an entry, and sense-group elements must come after the form-group.
Validation is important not only during human interaction but also during automated processing and on data exchange interfaces, where a schema acts as a “contract” that ensures data and content quality.
The next sections cover the most common validation types (XML schema languages).
Document Type Definition (DTD)
DTD is quite old and was designed for SGML (XML’s predecessor), but it is still fine for controlling the structure of text-only documents (without embedded data). Some legacy tools support only DTD, while newer ones might not support it at all.
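As a rough illustration, the entry structure mentioned earlier could be declared in a DTD roughly as follows. This is only a sketch: the element names entry, form-group and sense-group come from the example, while the "one or more sense-groups" rule, the plain-text content of the two child elements and the sample text are assumptions made to keep it short.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE entry [
  <!-- Exactly one form-group, followed by sense-groups.
       Assumption: at least one sense-group is required (+). -->
  <!ELEMENT entry (form-group, sense-group+)>
  <!-- Assumption: plain text stands in for the real child structure. -->
  <!ELEMENT form-group (#PCDATA)>
  <!ELEMENT sense-group (#PCDATA)>
]>
<entry>
  <form-group>validate</form-group>
  <sense-group>to check that a document conforms to its schema</sense-group>
  <sense-group>to confirm something officially</sense-group>
</entry>
```

A validating parser would reject this document if a second form-group were added or if a sense-group appeared before the form-group. Note that the DTD can only say that the content is text; it cannot say what kind of text.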
W3C XML Schema / RELAX NG
These go a big step further. They are more flexible in defining structure, but most importantly they can define data types and data patterns. So if you store data in XML, these schemas are a much better fit than DTD, since you can check whether a field value is an integer, a date, and so on.
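To make the difference concrete, here is a minimal W3C XML Schema sketch for the same entry element (RELAX NG could express much the same). The attributes id, frequency and updated, the EntryId type and its pattern are hypothetical and exist only to show data types and patterns; the string content of the child elements is likewise an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="entry">
    <xs:complexType>
      <!-- Structure: one form-group, then one or more sense-groups -->
      <xs:sequence>
        <xs:element name="form-group" type="xs:string"/>
        <xs:element name="sense-group" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- Hypothetical attributes, used to illustrate typed values -->
      <xs:attribute name="id" type="EntryId" use="required"/>
      <xs:attribute name="frequency" type="xs:positiveInteger"/>
      <xs:attribute name="updated" type="xs:date"/>
    </xs:complexType>
  </xs:element>

  <!-- Data pattern: the letter "e" followed by exactly six digits -->
  <xs:simpleType name="EntryId">
    <xs:restriction base="xs:string">
      <xs:pattern value="e[0-9]{6}"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
```

With a schema like this, an entry such as `<entry id="e000042" frequency="12" updated="2020-05-17">` would pass, while `frequency="often"` or `updated="17/05/2020"` would be reported as invalid, which a DTD cannot detect.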
Schematron
Schematron can be used in combination with the schema languages mentioned above. Beyond structure and data type validation, it can check complex business rules, which are defined as assertions written in XPath. It’s a very powerful approach.
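A small Schematron sketch may help to show what such a business rule looks like. The rules themselves, and the status and updated attributes they refer to, are invented for illustration; only the entry, form-group and sense-group element names come from the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern>
    <!-- The rule fires for every entry element -->
    <sch:rule context="entry">
      <!-- Hypothetical rule: a published entry needs an update date -->
      <sch:assert test="not(@status = 'published') or @updated">
        A published entry must have an "updated" attribute.
      </sch:assert>
      <!-- Hypothetical rule: the form-group must contain some text -->
      <sch:assert test="normalize-space(form-group) != ''">
        The form-group of an entry must not be empty.
      </sch:assert>
      <!-- Hypothetical rule: an upper limit chosen by the editorial team -->
      <sch:assert test="count(sense-group) &lt;= 10">
        An entry should not have more than 10 sense-groups.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Each assert holds an XPath expression that must evaluate to true for the context node; when it does not, the message inside the assert is reported. The first rule is a dependency between two attributes, which is the kind of constraint that is awkward or impossible to express in a DTD or in XML Schema 1.0.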