Content types

A DMS stores documents of different file types (docx, pdf, ...), but in a CMS we also distinguish types by the structure of their content.

Non-structured types

These are the typical office documents and media files that you would expect to find in a standard DMS.

Editorial document formats:

  • OpenDocument (ODF) - OASIS standard (2005) and also ISO/IEC standard (2006)
    • Used by LibreOffice, OpenOffice…
    • Formats: odt, ods, odp, odg...
  • Office Open XML (OOXML) - ECMA standard (2006) and also ISO/IEC standard (2008)
    • Used by Microsoft Office
    • Formats: docx, xlsx, pptx…

Publishing document formats:

  • PDF - ISO standard (2008)
  • HTML
  • EPUB - IDPF standard (2007)

Media types:

  • video
  • image
  • audio

Structured (XML) content types

CMS content types are configured based on:

  1. object granularity
  2. content structure, since content with a different structure falls into a different type

Huge documents are usually split into smaller objects for the editorial process and reassembled during publishing. This has several benefits:

  • smaller documents are lighter for the authoring tools to handle
  • easier collaboration - several editors can work in parallel on different parts of the (big) document
  • better reusability

For example, a DocBook-like document can be split into chapters, which can be split further into sections if necessary. A dictionary splits into letters, and each letter splits into articles.
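
The split-and-reassemble idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `book`, `chapter` and `ref` element names, the `target` attribute and the function names are hypothetical and not taken from any particular CMS or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook-like document; element names are assumptions.
BOOK = """<book id="b1">
  <title>Sample book</title>
  <chapter id="c1"><title>First</title><para>Text one.</para></chapter>
  <chapter id="c2"><title>Second</title><para>Text two.</para></chapter>
</book>"""


def split_book(xml_text):
    """Split a book into a skeleton plus one stored object per chapter.

    The skeleton keeps a <ref target="..."/> placeholder where each
    chapter used to be; objects maps chapter ids to serialized XML.
    """
    root = ET.fromstring(xml_text)
    objects = {}
    for index, child in enumerate(list(root)):
        if child.tag != "chapter":
            continue
        ref = ET.Element("ref", {"target": child.get("id")})
        ref.tail = child.tail      # keep the surrounding whitespace
        root[index] = ref          # swap the chapter for its placeholder
        child.tail = None          # the tail belongs to the skeleton
        objects[child.get("id")] = ET.tostring(child, encoding="unicode")
    return ET.tostring(root, encoding="unicode"), objects


def assemble(skeleton_text, objects):
    """Rebuild the full document by resolving every placeholder."""
    root = ET.fromstring(skeleton_text)
    for index, child in enumerate(list(root)):
        if child.tag != "ref":
            continue
        chapter = ET.fromstring(objects[child.get("target")])
        chapter.tail = child.tail
        root[index] = chapter
    return ET.tostring(root, encoding="unicode")


skeleton, objects = split_book(BOOK)
rebuilt = assemble(skeleton, objects)
```

Each entry in `objects` can then be edited and versioned on its own, while the skeleton records only the order of the parts. A second level of splitting (chapters into sections) would apply the same step to each stored chapter.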

We can of course also store documents with different content structures in the same CMS, and these naturally form different content types. They can still share the same schema if they have many elements in common but differ, for example, in their higher-level semantics.


I have already mentioned the CALS table model standard. It was designed in 1989-1990, well before XML was born, when SGML itself was still quite new. An XML version followed later, and it became very widely used in the publishing community.

A CALS table can be a standalone content type in a CMS, but it does not have to be. We can simply embed the model in our content schema with slight customization, usually of the content model of the cell (entry).

Converting CALS XML to HTML is a relatively simple task. It is recommended to use proportional column widths during the editorial process (the best XML editors support this out of the box) in order to produce flexible output that can adapt to any desktop or mobile device.
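
As a rough illustration, the conversion of a simple CALS table with proportional widths (`colwidth="2*"`) might look like the Python sketch below. It is deliberately minimal and rests on several assumptions: the function name is made up, cells contain plain text only, and spans (`namest`/`nameend`, `morerows`), `tfoot` and alignment attributes are ignored.

```python
import xml.etree.ElementTree as ET

# Small CALS sample with proportional column widths.
CALS = """<table>
  <tgroup cols="3">
    <colspec colname="c1" colwidth="1*"/>
    <colspec colname="c2" colwidth="2*"/>
    <colspec colname="c3" colwidth="1*"/>
    <thead><row><entry>Code</entry><entry>Name</entry><entry>Qty</entry></row></thead>
    <tbody><row><entry>A1</entry><entry>Widget</entry><entry>4</entry></row></tbody>
  </tgroup>
</table>"""


def cals_to_html(xml_text):
    """Convert a simple CALS table to an HTML table string."""
    tgroup = ET.fromstring(xml_text).find("tgroup")

    # "2*" means a weight of 2; a bare "*" or a missing value counts as 1.
    weights = [
        float(colspec.get("colwidth", "1*").rstrip("*") or 1)
        for colspec in tgroup.findall("colspec")
    ]
    total = sum(weights)

    table = ET.Element("table")
    colgroup = ET.SubElement(table, "colgroup")
    for weight in weights:
        # Percentages let the browser scale columns to any screen width.
        ET.SubElement(colgroup, "col", {"style": f"width: {100 * weight / total:g}%"})

    for section, cell_tag in (("thead", "th"), ("tbody", "td")):
        source = tgroup.find(section)
        if source is None:
            continue
        target = ET.SubElement(table, section)
        for row in source.findall("row"):
            tr = ET.SubElement(target, "tr")
            for entry in row.findall("entry"):
                ET.SubElement(tr, cell_tag).text = entry.text

    return ET.tostring(table, encoding="unicode", method="html")


html = cals_to_html(CALS)
```

Because the weights are turned into percentages rather than fixed units, the resulting table keeps the same column proportions on a wide desktop screen and on a narrow mobile one.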


From Wikipedia: “MathML 1 was released as a W3C recommendation in April 1998 as the first XML language to be recommended by the W3C”. The second edition of version 3 became a W3C recommendation in 2014 and an ISO/IEC standard in 2015. As its history suggests, it is a very important and actively maintained standard.

Tool support for MathML is pretty good. There are high-quality standalone desktop apps and XML editor plugins, and browser-based editors exist as well. Publishing to the web is easy via JavaScript polyfills or browser plugins, and there are also converters to PostScript or raster graphics.


SVG is an XML-based vector graphics standard that the W3C has been developing since 1999. Just like CALS or MathML, it can be embedded directly into XML content.

It has great tool support for both authoring and publishing.
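
Embedding works through XML namespaces: the SVG or MathML fragment keeps its own namespace inside the host document, so a publishing step can locate it and hand it to the right renderer. The Python sketch below shows this under stated assumptions: the host elements (`section`, `figure`, `equation`) and the function name are hypothetical, and only the two namespace URIs are the official W3C ones.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical host content with an SVG and a MathML fragment embedded.
SECTION = """<section id="s1">
  <para>A circle and a formula:</para>
  <figure><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10"><circle cx="5" cy="5" r="4"/></svg></figure>
  <equation><math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi><mo>=</mo><mn>1</mn></math></equation>
</section>"""

# Root element of each embedded vocabulary, as a namespace-qualified tag.
ISLAND_ROOTS = {
    f"{{{SVG_NS}}}svg": "svg",
    f"{{{MATHML_NS}}}math": "mathml",
}


def embedded_islands(xml_text):
    """List embedded fragments as (kind, number of descendant elements)."""
    found = []
    for element in ET.fromstring(xml_text).iter():
        kind = ISLAND_ROOTS.get(element.tag)
        if kind is not None:
            # iter() includes the element itself, hence the minus one.
            found.append((kind, len(list(element.iter())) - 1))
    return found


islands = embedded_islands(SECTION)
```

The same lookup by namespace is what lets a schema validate the embedded fragment against its own standard while the surrounding content follows the CMS content schema.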
