Sunday, August 19, 2012

RDF – A Data model for machines to process entities

If I compare the current web (the Web of Documents, or information web) to the Web of Data (the knowledge web), the latter is much more structured. However, each Web of Data source has its own schema, which does not follow strict rules as in a database. The schema changes over time when the underlying data changes; for example, new records can require new attributes. I therefore consider the Web of Data to be flexibly structured. In the Web of Data, information is organised as entities, and each entity is uniquely identified by a Uniform Resource Identifier (URI).

What is RDF?
RDF stands for Resource Description Framework. It was originally created by the W3C in early 1999 as a standard for encoding metadata. RDF is used not only for encoding metadata about Web resources, but also for describing any resources that exist in the real world and the relations between them, in any domain. In other words, RDF makes resource (entity) descriptions understandable to machines, and so allows them to be processed automatically.

In RDF, a resource/entity description is composed of statements about a given resource/entity. A statement is a triple consisting of a subject, a predicate and an object: it says that the subject has a property (the predicate) with some value (the object). A set of RDF statements forms a directed labelled graph. A node in an RDF graph can be one of three types: URI, literal or blank node. A URI serves as a globally unique identifier for a resource. A literal is a character string with an optional associated language tag or datatype. A blank node represents a resource/entity for which a URI is not given.

Understanding RDF is important because it plays a central role in the Semantic Web, so I repeat: RDF offers an abstract data model and framework that tells us how to decompose information/knowledge into small pieces. Each such piece is represented as a statement (triple) of the form (subject, predicate, object). A given RDF model can be expressed either as a directed graph or as a collection of statements. Each statement maps to one edge in the graph: the subject and object of a statement are nodes, and its predicate is the edge between them. Subjects and objects denote resources in the real world. Predicates denote the relationship (typically a verb phrase) between subjects and objects. RDF is stored in semantic repositories (also called reasoners, ontology servers, semantic stores, metastores or RDF databases).
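
To make the triple and graph views concrete, here is a minimal sketch in plain Python. It is not an RDF library: the class names (URI, Literal, BlankNode, Triple), the to_graph helper and the http://example.org/ identifiers are all made up for illustration; only the two FOAF property URIs come from an existing vocabulary.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import NamedTuple, Optional, Union


# The three node types described above. Frozen dataclasses are hashable and
# compare by type as well as value, so URI("x") != BlankNode("x").
@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str
    language: Optional[str] = None   # optional language tag, e.g. "en"
    datatype: Optional[str] = None   # optional datatype URI


@dataclass(frozen=True)
class BlankNode:
    label: str                       # local label only; no global identifier


Node = Union[URI, Literal, BlankNode]


class Triple(NamedTuple):
    subject: Node      # in RDF a subject is a URI or blank node, not a literal
    predicate: URI
    object: Node


def to_graph(triples):
    """Group triples by subject: each triple becomes one labelled edge."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


# Hypothetical example data (example.org identifiers are invented).
ALICE = URI("http://example.org/alice")
BOB = URI("http://example.org/bob")
FOAF_NAME = URI("http://xmlns.com/foaf/0.1/name")
FOAF_KNOWS = URI("http://xmlns.com/foaf/0.1/knows")
ADDRESS = BlankNode("b0")            # a resource for which no URI is given

TRIPLES = [
    Triple(ALICE, FOAF_NAME, Literal("Alice", language="en")),
    Triple(ALICE, FOAF_KNOWS, BOB),
    Triple(ALICE, URI("http://example.org/address"), ADDRESS),
    Triple(ADDRESS, URI("http://example.org/city"), Literal("Galway")),
]

GRAPH = to_graph(TRIPLES)

if __name__ == "__main__":
    for subject, edges in GRAPH.items():
        for predicate, obj in edges:
            print(subject, "--", predicate.value, "-->", obj)
```

The same information appears twice here: TRIPLES is the "collection of statements" view, and GRAPH is the "directed labelled graph" view, with one edge per statement.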

