Tutorial 3: Semantic Modeling
Next: Introducing RDFS & OWL
While RDF offers a flexible, graph-based model for recording data that is globally interchangeable, it does not by itself offer any means of recording semantics, or meaning. Saving the technical specifics for the next lesson, let's review the commonly available data models and explain what all the fuss is about.
After this tutorial, you should be able to:
You should have already understood the following lesson (and pre-requisites) before you begin:
There are various popular, mainstream ways to model data, some of which emerged later than others. Before exploring the benefits of the RDF model, it is worth reviewing some of the established approaches to modeling data.
Look at the table below, which compares the approaches and highlights some of the unique qualities of the semantic data model.
3.1 Comparing The Popular Data Models
Metadata is a term you will come across again and again when working with semantic web technologies. "Metadata" is not a complex term or concept: it simply means "data about data" (from the Greek prefix meta-, meaning "after" or "beyond", used here in the sense of "about"). The table above shows some examples of how you might classify the metadata for the different models.
3.2 Why Include Semantics In Data? Knowledge Integration
There's no point in adding semantics to your data if it does not provide significant benefits. One of the primary benefits of adding semantic meaning to your data is that it can be linked across domains of knowledge automatically. What do we mean by domains of knowledge? Let's illustrate using a simple example.
In our example, two websites are started independently from each other. One site hosts information on current and historic Oscar winning films; the other a large database of biographies of Hollywood actors and actresses.
The two website databases contain complementary information. First, we will cover how information sharing between these sites could happen without the use of semantics. Then, we will describe how the same information can be shared between the two sites, and potentially beyond, with the use of semantics.
Sharing Without Semantic Modeling
Our two sites, one fronting an MS SQL database of all Oscar winning films, and another one fronting a MySQL database of Hollywood actors, reside at http://www.oscarwinners.fake and http://www.actorbiographies2go.fake respectively. The two sites were started independently, and do not collaborate.
The Oscar Winners site lists, as its name suggests, all of the Oscar winning films ever produced, along with the actors and actresses who starred in them. However, it holds no actor information other than their name and date of birth.
The Actor Biographies site contains a listing of many current and former Hollywood actors, each with a complete biography and a list of the movies they starred in. However, it does not contain any film plots or screenshots of the films.
Let's look at how these two sites might collaborate under their current, more traditional data model:
Important Point: This sort of information interchange across incompatible, independently designed data systems takes time, money and human interpretation of the context of each dataset. It is also restricted to the data domains of these two websites alone; any further addition to their knowledge from elsewhere will demand a similar effort. It requires humans to understand the meaning of the data and to agree on common formats before the two databases can be integrated appropriately.
With the introduction of RDF and semantics, this becomes far easier. Let's investigate how it could be achieved using RDF and the semantic web, where much of the integration happens automatically rather than manually.
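To make that cost concrete, here is a minimal Python sketch of the kind of hand-written glue code such an integration tends to need. The column names (`film_title`, `actor_nm`, `dob`, `full_name`, `birth_date`, `bio`), the date formats and the sample rows are all hypothetical, invented for illustration; this tutorial does not describe either site's real schema.

```python
from datetime import datetime

# Hypothetical rows as each site might export them. Every field name and
# format here is an assumption made for this sketch.
oscar_rows = [
    {"film_title": "Example Film", "actor_nm": "DOE, JANE", "dob": "14/03/1970"},
]
biography_rows = [
    {"full_name": "Jane Doe", "birth_date": "1970-03-14", "bio": "Placeholder biography."},
]


def normalise_oscar_actor(row):
    """Convert an Oscar-site row into a (name, ISO date) key for matching."""
    surname, forename = [part.strip() for part in row["actor_nm"].split(",")]
    name = f"{forename} {surname}".title()
    born = datetime.strptime(row["dob"], "%d/%m/%Y").date().isoformat()
    return name, born


def join_sites(oscar_rows, biography_rows):
    """Attach biographies to Oscar-site rows using the hand-written mapping."""
    bios = {(b["full_name"], b["birth_date"]): b["bio"] for b in biography_rows}
    joined = []
    for row in oscar_rows:
        name, born = normalise_oscar_actor(row)
        joined.append(
            {"film": row["film_title"], "actor": name, "bio": bios.get((name, born))}
        )
    return joined


merged = join_sites(oscar_rows, biography_rows)
print(merged)
```

Every rule in `normalise_oscar_actor` encodes something a person had to work out by reading both datasets, and none of it carries over when a third data source is added.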
Sharing With The Semantic Web Model
In semantic modeling, the following are important terms you should know:
So how do we model the two site scenario using semantic modeling? Firstly, the two sites need to apply a common, standard vocabulary to describe their data that is contextually consistent. For example, the term 'film title' should mean the same thing for both sites, as should the term 'actor name' and 'actor birthdate'.
This may be done by the two sites adopting the same base ontology, or a common vocabulary, for expressing the meaning behind the data they expose, and publishing that data on a queryable endpoint so that the two sites can communicate with each other across the web.
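As a rough illustration of what a shared vocabulary buys, the Python sketch below models each site's data as plain (subject, predicate, object) tuples instead of using a real RDF library. The vocabulary namespace `http://vocab.example.fake/film#`, its term names, the resource paths under each site's address and the sample values are all hypothetical. The sketch also assumes both sites use the same identifier for the same actor, which in practice has to be agreed on or linked separately.

```python
# Hypothetical shared vocabulary. The namespace and terms are invented for
# illustration and do not correspond to any published ontology.
VOCAB = "http://vocab.example.fake/film#"
FILM_TITLE = VOCAB + "filmTitle"
ACTOR_NAME = VOCAB + "actorName"
ACTOR_BIRTHDATE = VOCAB + "actorBirthdate"
STARRED_IN = VOCAB + "starredIn"
BIOGRAPHY = VOCAB + "biography"

# Hypothetical resource identifiers (assumed to be shared by both sites).
ACTOR = "http://www.actorbiographies2go.fake/actor/jane-doe"
FILM = "http://www.oscarwinners.fake/film/example-film"

# Each site publishes its own statements using the common terms.
oscar_triples = {
    (FILM, FILM_TITLE, "Example Film"),
    (ACTOR, ACTOR_NAME, "Jane Doe"),
    (ACTOR, ACTOR_BIRTHDATE, "1970-03-14"),
    (ACTOR, STARRED_IN, FILM),
}
biography_triples = {
    (ACTOR, ACTOR_NAME, "Jane Doe"),
    (ACTOR, BIOGRAPHY, "Placeholder biography."),
}

# Because the terms mean the same thing on both sites, merging is a set
# union; no per-site field mapping is needed.
graph = oscar_triples | biography_triples


def objects(graph, subject, predicate):
    """Return all objects stated for a given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def film_titles_with_bios(graph):
    """Pair each film title with the biography of an actor who starred in it."""
    results = []
    for actor, predicate, film in graph:
        if predicate != STARRED_IN:
            continue
        for title in objects(graph, film, FILM_TITLE):
            for bio in objects(graph, actor, BIOGRAPHY):
                results.append((title, bio))
    return sorted(results)


print(film_titles_with_bios(graph))
```

The final query draws on statements from both sites, yet neither site needed to know the other's internal table layout. A production system would use an RDF store and a query language in place of these hand-rolled helpers.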
With this standard vocabulary in place:
We'll see in the next lesson what a semantic web ontology is made up of, how you query a semantic database and even how to perform machine inference on it.
Point Of Interest: The good news is you often won't have to go through the effort of defining and sharing your own ontology for your particular domain of knowledge. There are many popular, standard ontologies already distributed on the web which you can adopt and, if necessary, extend yourself. We will introduce some of these in the following section.
The cross-domain knowledge sharing discussed here need not just apply to websites, but also within the knowledge bases built by organisations. Semantic web technologies need not be restricted to applications or information published on the web.
Although a little more groundwork may be required when first setting up a semantic database, the benefits are potentially highly significant: easier cross-domain integration with data from across the globe, plus the time saved and ideas gained from doing so.
3.3 Metadata Initiatives
Standard vocabularies, or formal ontologies representing terms within a domain of knowledge, are already freely available from various organisations dedicated to creating them for a range of subjects, such as media, biomedical or scientific terms. Below are some examples:
You have now completed this lesson. In the next lesson, we will go into more thorough technical detail on how ontologies are defined in the semantic web, and how semantic web databases can be queried for information.
You should now understand the following:
You should now be able to start the following tutorial: