Information structure design - standard vs proprietary
Using standard structures has clear benefits:
- the design work is already done
- they were designed by experts in the domain
- tool support: editors and publishing tools have built-in, high-level support for the standards
However, using standard structures also has disadvantages. We usually want to add our own semantics by customizing the standard. That is certainly possible, but if we use only a small part of the standard and at the same time add a large custom layer, we can end up with more disadvantages than benefits. For example, the publishing tools that support a standard are usually fairly complex, so implementing our custom layer in them might be much more work than starting from scratch and building proprietary publishing pipelines. Think long term, and don’t forget that the amount of code we produce matters a lot during maintenance.
DocBook’s history goes back to 1991. It was designed for writing technical documentation about computer software and hardware, but it is not limited to that. It provides:
- low-level formatting elements
- basic constructs, like lists and tables
- high-level structure: book → chapter → section → para
- plus some semantic elements, like address, productname, etc.
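The hierarchy listed above can be sketched with Python's standard library. This is only a minimal illustration, assuming DocBook 5 and its namespace `http://docbook.org/ns/docbook`; the helper names `db` and `build_book` and all titles and text are hypothetical.

```python
import xml.etree.ElementTree as ET

DOCBOOK_NS = "http://docbook.org/ns/docbook"  # DocBook 5 namespace
ET.register_namespace("", DOCBOOK_NS)  # serialize without a prefix


def db(tag: str) -> str:
    """Qualify a tag name with the DocBook namespace."""
    return f"{{{DOCBOOK_NS}}}{tag}"


def build_book() -> ET.Element:
    """Build book -> chapter -> section -> para with one semantic inline."""
    book = ET.Element(db("book"), {"version": "5.0"})
    ET.SubElement(book, db("title")).text = "Example Manual"

    chapter = ET.SubElement(book, db("chapter"))
    ET.SubElement(chapter, db("title")).text = "Getting Started"

    section = ET.SubElement(chapter, db("section"))
    ET.SubElement(section, db("title")).text = "Installation"

    # A paragraph mixing plain text with a semantic element (productname).
    para = ET.SubElement(section, db("para"))
    para.text = "Install "
    product = ET.SubElement(para, db("productname"))
    product.text = "ExampleTool"
    product.tail = " before continuing."
    return book


book = build_book()
print(ET.tostring(book, encoding="unicode"))
```

Note how the semantic element (`productname`) sits inside the same paragraph as ordinary text: the high-level structure and the semantic layer are independent of each other.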
DocBook customization used to be very cumbersome, but version 5 simplified it considerably. Version 5 also improves support for topic-driven authoring and document assembly.
DITA was designed 10 years later, with topic-driven authoring, modularity and customization in mind. From Wikipedia: “... it uses the principles of specialization and inheritance, which is in some ways analogous to the naturalist Charles Darwin's concept of evolutionary adaptation ...”.
The base information model is the Topic, which is a relatively simple and very generic structure. The standard defines 3 specialized topic types: Concept, Task and Reference.
These derive from Topic and inherit its properties, but also add new properties through specialization, the standard’s customization approach. In the same way, projects can create new topic types derived from any of these.
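DITA records this ancestry in each element's `class` attribute (for example `- topic/topic task/task `), which lets a processor fall back to the nearest ancestor it knows how to handle. The sketch below illustrates that fallback idea only; the function names, the handler names and the `apiTask` type are hypothetical and not part of the standard or of any DITA tool.

```python
from typing import Dict, List


def ancestry(class_attr: str) -> List[str]:
    """Return the module/element tokens of a DITA class attribute, most general first."""
    # The leading "-" or "+" marker is not an ancestor, so keep only "a/b" tokens.
    return [token for token in class_attr.split() if "/" in token]


def pick_handler(class_attr: str, handlers: Dict[str, str]) -> str:
    """Choose the most specific known handler, falling back toward topic/topic."""
    for token in reversed(ancestry(class_attr)):
        if token in handlers:
            return handlers[token]
    raise LookupError(f"no handler for {class_attr!r}")


# Handlers a publishing pipeline might already have (illustrative names).
HANDLERS = {"topic/topic": "render_topic", "task/task": "render_task"}

# A hypothetical project-specific topic type specialized from task.
API_TASK = "- topic/topic task/task apiTask/apiTask "

print(pick_handler(API_TASK, HANDLERS))  # render_task
print(pick_handler("- topic/topic concept/concept ", HANDLERS))  # render_topic
```

The unknown `apiTask` type is still published, using the task handling it inherits. This is why a thin specialization layer is cheap to support, while every property that needs its own rendering adds publishing code.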
DITA is very powerful and customization is easy, but use it with caution: an overly complex custom layer leads to complex publishing customization too.
I’ve seen a DITA project that ignored most of the standard and implemented DocBook by specializing Topic. What is the point?
Proprietary semantic structure
Building a proprietary structure from scratch makes sense if standard structures don’t give you much benefit and/or you need to add a lot of semantics from your project domain.
However, you usually don’t start completely from scratch. You typically end up reusing some generic standards, such as:
- the low-level formatting layer (e.g. bold, italic, subscript, etc.)
- generic constructs, like lists and tables
The CALS table model is a widely used standard. It has great tool support and is customizable. At a minimum, you need to customize the content model of the table cell.
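For orientation, a minimal CALS table has the shape table > tgroup > (colspec, thead, tbody) > row > entry. The sketch below builds one with Python's standard library; the function name `cals_table` and the sample data are hypothetical, and the cells hold plain text only, which is exactly the spot where a project would plug in its own cell content model.

```python
import xml.etree.ElementTree as ET
from typing import List


def cals_table(header: List[str], rows: List[List[str]]) -> ET.Element:
    """Build a minimal CALS table from a header row and body rows of plain text."""
    table = ET.Element("table")
    tgroup = ET.SubElement(table, "tgroup", {"cols": str(len(header))})

    # One colspec per column, named c1, c2, ...
    for index in range(1, len(header) + 1):
        ET.SubElement(tgroup, "colspec", {"colname": f"c{index}"})

    head_row = ET.SubElement(ET.SubElement(tgroup, "thead"), "row")
    for text in header:
        ET.SubElement(head_row, "entry").text = text

    tbody = ET.SubElement(tgroup, "tbody")
    for cells in rows:
        if len(cells) != len(header):
            raise ValueError("row length must match the number of columns")
        row = ET.SubElement(tbody, "row")
        for text in cells:
            # Plain text here; a custom content model would add child elements.
            ET.SubElement(row, "entry").text = text
    return table


table = cals_table(
    ["Element", "Purpose"],
    [["tgroup", "column group"], ["entry", "table cell"]],
)
print(ET.tostring(table, encoding="unicode"))
```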