Thursday, October 25, 2012

Semantic annotation and ontology population

According to Gartner, more than 95% of human-to-computer information input will consist of natural-language text, and this is not going to change any time soon. Hence the world needs to convert unstructured and semi-structured documents into a structured format (RDF) so that the content can be searched, browsed, analysed and, most importantly, processed by machines.

The first step in this conversion is semantic annotation. It takes two inputs: 1) an ontology and 2) the full text. First, structured information is extracted from the unstructured full-text documents (information extraction, IE). Second, the extracted information is linked to the ontology. In other words, semantic annotation assigns to the entities and relations in a text links to their semantic descriptions in an ontology. This kind of semantic metadata provides both class and instance information about the entities. Automatic semantic annotation enables many new applications; in particular, it gives existing text more meaning and makes web search work better.

Information extraction (IE) takes text as input and produces structured, unambiguous data as output. It processes text documents to identify selected information, such as particular named entities or the relations among them. In semantic annotation, the types used for annotation are taken from an ontology, to which the text can be linked and which acts as a type system for the extracted information.

In automatic ontology population, an OBIE (ontology-based information extraction) application identifies instances in the text that belong to concepts in a given ontology and adds these instances to the ontology in the correct location. Ontologies in OWL format provide a standardized means of modeling, querying, and reasoning over large knowledge bases (KBs).
Because the majority of the world's knowledge is encoded in natural-language text, automating the population of these ontologies from the results of NLP (natural language processing) analysis of documents has recently become a major challenge for NLP applications. The key difference is that in semantic annotation the document is modified, whereas in ontology population the ontology is modified.
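
The difference between the two tasks can be made concrete with a small sketch. This is only a toy illustration, not a real OBIE system: the ontology is a plain Python dictionary instead of an OWL file, and the class names, the instance labels, the sample sentence and the "X is a company" extraction pattern are all assumptions invented for this example.

```python
import re

# Toy ontology (hypothetical): class name -> known instance labels.
ONTOLOGY = {
    "Company": {"Gartner"},
    "City": {"London"},
}

def annotate(text, ontology):
    """Semantic annotation: find known instances and link each span to its class."""
    spans = []
    for cls, instances in ontology.items():
        for label in instances:
            for m in re.finditer(r"\b" + re.escape(label) + r"\b", text):
                spans.append((m.start(), m.end(), m.group(), cls))
    return sorted(spans)

def mark_up(text, spans):
    """Return a modified copy of the document with class tags around each span.

    Assumes the spans are sorted and do not overlap.
    """
    out, pos = [], 0
    for start, end, surface, cls in spans:
        out.append(text[pos:start])
        out.append(f"<{cls}>{surface}</{cls}>")
        pos = end
    out.append(text[pos:])
    return "".join(out)

# Hypothetical extraction rule: "<Name> is a company" / "<Name> is a city".
PATTERN = re.compile(r"\b([A-Z][a-zA-Z]+) is a (company|city)\b")

def populate(text, ontology):
    """Ontology population: return a copy of the ontology with new instances added."""
    updated = {cls: set(instances) for cls, instances in ontology.items()}
    for m in PATTERN.finditer(text):
        name, cls_word = m.group(1), m.group(2)
        updated[cls_word.capitalize()].add(name)
    return updated

text = "Gartner opened an office in London. Acme is a company based in Paris."

spans = annotate(text, ONTOLOGY)          # links text spans to ontology classes
annotated_text = mark_up(text, spans)     # the document changes
populated = populate(text, ONTOLOGY)      # the ontology changes

print(annotated_text)
print(sorted(populated["Company"]))
```

Here `annotate` and `mark_up` leave the ontology untouched and produce a tagged copy of the document, while `populate` leaves the document untouched and produces an ontology that now also lists "Acme" as a Company. A real pipeline would replace the dictionary with an OWL/RDF store and the regular expressions with a proper NLP extraction component.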
