What is semantics?

According to Wikipedia, “semantics is the study of meaning”. In our context, we enrich text with semantics to make it easier for computers to process. For us human readers, a lot remains implicit: we draw on our own knowledge when we read and interpret text, and we use the context of the words to disambiguate them.

Natural language processing is getting better. Still, if our content is enriched with semantic markup like this:

… some text … <mineral>ruby</mineral> … some text …

… then it is much easier for computers to process. I probably don’t have to mention that “ruby” has several meanings, such as a gemstone, a color, or a programming language.

Here we use XML syntax to mark up the semantics. XML and its related standards are commonly used in content management.

How can this semantic markup be used in practice? We can now easily write code that, for example, collects all mineral names mentioned in the content, or formats them in italics.
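
As a minimal sketch of that idea, the Python snippet below uses the standard library module xml.etree.ElementTree to collect the mineral names and to turn them into italics. The sample text, the <doc> wrapper element, the function names, and the choice of <i> as the italic tag are assumptions made for illustration only; real content would come from your own documents and use your own output format.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content: the <doc> wrapper and the sentences are
# invented for this example. Only the <mineral> element comes from the text.
SAMPLE = (
    "<doc>The crown holds a <mineral>ruby</mineral> and a "
    "<mineral>sapphire</mineral>. Another <mineral>ruby</mineral> "
    "is set in the ring.</doc>"
)


def collect_minerals(xml_text):
    """Return the distinct mineral names, in order of first appearance."""
    root = ET.fromstring(xml_text)
    names = []
    for element in root.iter("mineral"):
        name = (element.text or "").strip()
        if name and name not in names:
            names.append(name)
    return names


def italicize_minerals(xml_text):
    """Return the content with every <mineral> element renamed to <i>."""
    root = ET.fromstring(xml_text)
    # Materialize the matches first so renaming cannot disturb the iteration.
    for element in list(root.iter("mineral")):
        element.tag = "i"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(collect_minerals(SAMPLE))
    print(italicize_minerals(SAMPLE))
```

Because the markup says explicitly which words are minerals, the code does not have to guess whether “ruby” means the gemstone or something else. It simply looks for the element name.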

