Combining Named Entities with WordNet and Using Query-Oriented Spreading Activation for Semantic Text Search
Purely keyword-based text search is not satisfactory because named entities and WordNet words are also important elements to define the content of a document or a query in which they occur. Named entities have ontological features, namely, their aliases, classes, and identifiers. Words in WordNet also have ontological features, namely, their synonyms, hypernyms, hyponyms, and senses. Those features of concepts may be hidden from their textual appearance. Besides, there are related concepts that do not appear in a query, but can bring out the meaning of the query if they are added. We propose an ontology-based generalized Vector Space Model to semantic text search. It exploits ontological features of named entities and WordNet words, and develops a query-oriented spreading activation algorithm to expand queries. In addition, it combines and utilizes advantages of different ontologies for semantic annotation and searching. Experiments on a benchmark dataset show that, in terms of the MAP measure, our model is 42.5 keyword-based model, and 32.3 using only WordNet or named entities. Keywords: semantic search, spreading activation, ontology, named entity, WordNet.
READ FULL TEXT