Data versus metadata

What is the difference between data embedded in the content and data about the content (metadata)? Not much, really. As I described earlier, some CMSes can populate metadata into the content, or vice versa, out of the box with simple configuration.

The real question is why this is necessary. Metadata is data about the (whole) object, while data embedded in the content might be connected to just a part of the structure rather than the whole object.


Consider a document that describes three different variants of a drug, each with its own strength data. We cannot add the strength at the global level, since it belongs to a substructure only.
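To make the drug case concrete, here is a minimal sketch in Python. The element and attribute names (`drug`, `variant`, `form`, `strength`, `unit`) and the values are invented for illustration and do not come from any real schema. It shows that each strength value only makes sense in the context of its own variant, so a single object-level field could not hold it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: all element names, attribute names and values
# are assumptions made up for this illustration.
DOC = """
<drug name="ExampleDrug">
  <variant form="tablet"><strength unit="mg">50</strength></variant>
  <variant form="capsule"><strength unit="mg">100</strength></variant>
  <variant form="injection"><strength unit="mg/ml">10</strength></variant>
</drug>
"""

def strengths_by_variant(xml_text):
    """Return the embedded strength of each variant, keyed by its form."""
    root = ET.fromstring(xml_text)
    result = {}
    for variant in root.findall("variant"):
        strength = variant.find("strength")
        result[variant.get("form")] = f"{strength.text} {strength.get('unit')}"
    return result

print(strengths_by_variant(DOC))
```

The result has three entries, one per variant, which is exactly the information a single "strength" metadata field on the whole object would lose.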

So the next question is: why do we need metadata at all? The data can be added to the content, and the structure will indicate whether it is “global” (valid for the whole object) or applies only to parts of it.

Still, CMSes tend to define metadata, for two main reasons:

  1. Metadata is not part of the content. It might control the publishing, but it does not have to be rendered as part of the content. If it’s defined as metadata, then the publishing process does not have to filter it out.
  2. Performance. Metadata may be faster to access than data embedded in the content, depending on how each is persisted.

It’s very important to be aware of how our content and metadata are stored and what the “price” of accessing them is. If our publishing automation process is completely controlled by data embedded in the content, then we need to be able to access that data quickly. If the content is stored in an XML database and indexed properly, this might not be a problem at all, and the cost of accessing embedded data or metadata is about the same. But if the XML content is stored in a relational database, accessing embedded data might not be very fast, and we can end up doing lots of “data extraction” (moving embedded data to the metadata level), which effectively means caching the data for quick access. This can add considerable complexity to the project, which has to be maintained in the long run.
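The “data extraction” idea can be sketched roughly as follows. This is only an illustration under assumptions: the class `ExtractedMetadataCache`, its method names, and the `strength` element are hypothetical, and a real CMS would persist the extracted values next to the object rather than keep them in memory. The sketch parses the XML once, stores the extracted values at the “metadata level”, and re-extracts only when the content changes.

```python
import hashlib
import xml.etree.ElementTree as ET

class ExtractedMetadataCache:
    """Hypothetical cache of values extracted from XML content."""

    def __init__(self):
        self._entries = {}    # doc_id -> (content hash, extracted values)
        self.parse_count = 0  # how many times the XML was actually parsed

    def get_strengths(self, doc_id, xml_text):
        """Return all <strength> values, parsing only if the content changed."""
        digest = hashlib.sha256(xml_text.encode("utf-8")).hexdigest()
        cached = self._entries.get(doc_id)
        if cached is not None and cached[0] == digest:
            return cached[1]  # cheap path: no XML parsing needed
        # Expensive path: parse the content and extract the embedded data.
        self.parse_count += 1
        root = ET.fromstring(xml_text)
        values = [s.text for s in root.iter("strength")]
        self._entries[doc_id] = (digest, values)
        return values

cache = ExtractedMetadataCache()
doc = ("<drug><variant><strength>50</strength></variant>"
       "<variant><strength>100</strength></variant></drug>")
first = cache.get_strengths("doc-1", doc)
second = cache.get_strengths("doc-1", doc)  # served from the cache
```

Even this small sketch needs a cache key, an invalidation rule, and a second copy of the data, which is the kind of maintenance cost described above.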
