2.1 Structured data

Common structured data models are:

  • relational - data is represented in terms of tuples, grouped into relations (e.g. a relational database)
  • hierarchical - data is organized into a tree-like structure, stored as records that are connected to one another through links (e.g. XML, JSON)
  • linked - structured data published on the web as interlinked resources, following a set of best practices

Most (legacy) applications use a relational database to persist their data. In a relational data model, data is represented in terms of tuples, grouped into relations. It is one of the oldest and most widely used data models in computer science, so archiving is likely to involve a lot of relational data.

Let’s use a typical example, a travel report: an employee reports the costs of a business trip. Our relational model uses three tables for this:

  • travel table: holds one record per trip
  • employee table: one row per employee, describing the name, etc. of the employee; connected to the travel table via the employee id field
  • transaction table: one row holds the information about one specific transaction; bound to the travel table via the travel id, so one trip can have several transactions
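The three tables described above could be sketched as follows, using an in-memory SQLite database. The table and column names, the employee id and the transaction ids are assumptions made for illustration; only the trip data itself is taken from this section.

```python
import sqlite3

# Illustrative schema: all table and column names are assumptions.
# The transaction table is named travel_transaction because
# TRANSACTION is a reserved word in SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE travel (
    id          INTEGER PRIMARY KEY,
    date        TEXT NOT NULL,
    reason      TEXT NOT NULL,
    employee_id INTEGER NOT NULL REFERENCES employee(id)
);
CREATE TABLE travel_transaction (
    id          INTEGER PRIMARY KEY,
    travel_id   INTEGER NOT NULL REFERENCES travel(id),
    description TEXT NOT NULL,
    amount      INTEGER NOT NULL,
    currency    TEXT NOT NULL
);

-- Employee id 1 and transaction ids 1 and 2 are hypothetical.
INSERT INTO employee VALUES (1, 'Ole Normann');
INSERT INTO travel VALUES (100, '2018-01-11', 'Customer meeting in Oslo', 1);
INSERT INTO travel_transaction VALUES (1, 100, 'bus ticket', 100, 'NOK');
INSERT INTO travel_transaction VALUES (2, 100, 'accommodation', 1000, 'NOK');
""")

# The information about one trip is spread over three tables,
# so reading it back takes one query per table (or a join).
for table in ("employee", "travel", "travel_transaction"):
    print(table, conn.execute(f"SELECT * FROM {table}").fetchall())
```

Each table on its own tells only part of the story: the travel row carries an employee id rather than a name, and the costs live in a separate table.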

Example 1:

From the end user's point of view, the data in this model is quite scattered. The user wants to see the data aggregated: all the data about one trip as one record.

Let's aggregate this relational data into hierarchical data:

Example 2:

  • travel id=100
    • date: 2018-01-11
    • reason: Customer meeting in Oslo
    • employee name: Ole Normann
    • transactions
      • bus ticket - 100 NOK
      • accommodation - 1000 NOK
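The aggregation step can be sketched in a few lines of Python. The rows are modeled as plain dictionaries, the field names and the helper `aggregate_travel` are hypothetical, and the ids other than travel id 100 are assumptions.

```python
# Rows as they might come out of the three relational tables.
# Field names and all ids except travel id 100 are assumptions.
employees = [{"id": 1, "name": "Ole Normann"}]
travels = [
    {"id": 100, "date": "2018-01-11",
     "reason": "Customer meeting in Oslo", "employee_id": 1},
]
transactions = [
    {"id": 1, "travel_id": 100, "description": "bus ticket",
     "amount": 100, "currency": "NOK"},
    {"id": 2, "travel_id": 100, "description": "accommodation",
     "amount": 1000, "currency": "NOK"},
]


def aggregate_travel(travel_id):
    """Build one hierarchical record for a trip by following the id links."""
    travel = next(t for t in travels if t["id"] == travel_id)
    employee = next(e for e in employees if e["id"] == travel["employee_id"])
    return {
        "id": travel["id"],
        "date": travel["date"],
        "reason": travel["reason"],
        "employee": {"name": employee["name"]},
        # One trip can have several transactions, so they become a nested list.
        "transactions": [
            {"description": tx["description"],
             "amount": tx["amount"],
             "currency": tx["currency"]}
            for tx in transactions
            if tx["travel_id"] == travel_id
        ],
    }


record = aggregate_travel(100)
print(record)
```

The function has to know which id fields connect the tables, whereas the resulting record can be read without any knowledge of the relational model.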

This can be rendered or formatted more nicely for the human reader, or a standard data syntax such as JSON or XML can be used for machines. We'll use XML for readability's sake:

Example 3:

Please note that using ID attributes on the travel, employee and transaction elements is possible but not required.
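A minimal sketch of how the aggregated trip record might be serialized to XML with Python's standard library is given below. The element and attribute names are assumptions for illustration, not a prescribed schema, and an ID attribute is set only on the travel element here.

```python
import xml.etree.ElementTree as ET

# The aggregated trip record; the dictionary layout is an assumption.
record = {
    "id": 100,
    "date": "2018-01-11",
    "reason": "Customer meeting in Oslo",
    "employee": {"name": "Ole Normann"},
    "transactions": [
        {"description": "bus ticket", "amount": 100, "currency": "NOK"},
        {"description": "accommodation", "amount": 1000, "currency": "NOK"},
    ],
}

# Element and attribute names below are illustrative.
travel = ET.Element("travel", id=str(record["id"]))
ET.SubElement(travel, "date").text = record["date"]
ET.SubElement(travel, "reason").text = record["reason"]

employee = ET.SubElement(travel, "employee")
ET.SubElement(employee, "name").text = record["employee"]["name"]

tx_list = ET.SubElement(travel, "transactions")
for tx in record["transactions"]:
    tx_el = ET.SubElement(tx_list, "transaction", currency=tx["currency"])
    ET.SubElement(tx_el, "description").text = tx["description"]
    ET.SubElement(tx_el, "amount").text = str(tx["amount"])

ET.indent(travel)  # pretty-printing; requires Python 3.9 or later
xml_text = ET.tostring(travel, encoding="unicode")
print(xml_text)
```

The nesting of the XML elements mirrors the nesting of the aggregated record, so no join or id lookup is needed to read it.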

In this example, we started from relational data and ended up with hierarchical data.

This is a matter of data semantics. The process that aggregates the records has to know the data model quite well; presenting the already aggregated records to the user, however, requires only simple formatting.
