
One of five student work projects at the Indian Summer School on Linked Data (http://lod2.eu/Article/ISSLOD2011). We implemented a simple entity disambiguation approach and made the results visible in a Web UI.

Introduction

  • Given:
    • Reference knowledge base(s) K
    • Text fragment T
    • Set E of Named Entities
  • Task:
    • Find the URI in K for each named entity in E

URI lookup

  • Input: string l (label of an entity)
    • Get all entities with rdfs:label l
      • SELECT DISTINCT ?uri WHERE { ?uri rdfs:label "Paris"@en. }
    • For each entity ?e, merge all labels
      • SELECT ?label WHERE { ?e rdfs:label ?label. }
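The two lookup queries above could be generated from an input label roughly as follows. This is only a sketch: the function names, the default language tag, and the escaping helper are assumptions made for illustration and are not taken from the project's code. Sending the queries to a SPARQL endpoint is left out.

```python
RDFS_PREFIX = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def candidates_query(label: str, lang: str = "en") -> str:
    """Build the query for all entities carrying the given rdfs:label.

    The language tag defaults to "en" (an assumption of this sketch).
    """
    return (
        RDFS_PREFIX
        + "SELECT DISTINCT ?uri WHERE { ?uri rdfs:label "
        + f'"{escape_literal(label)}"@{lang} . }}'
    )


def labels_query(uri: str) -> str:
    """Build the query for all labels of one candidate entity."""
    return RDFS_PREFIX + f"SELECT ?label WHERE {{ <{uri}> rdfs:label ?label . }}"


if __name__ == "__main__":
    print(candidates_query("Paris"))
    print(labels_query("http://dbpedia.org/resource/Paris"))
```

The labels returned by the second query would then be merged into one text per candidate entity.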

Disambiguation Approach

  1. Remove stopwords in T
  2. Stem each word in T
  3. For each c in C (the merged labels of each candidate entity found by the URI lookup)
    1. Remove stopwords in c
    2. Stem each word in c
    3. Calculate the Jaccard coefficient between c and T
  4. Return ranked list of entities
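
The steps above can be sketched in a few lines of Python. The Jaccard coefficient of two word sets A and B is |A ∩ B| / |A ∪ B|. The stopword list, the naive suffix-stripping stemmer, and the example URIs below are placeholders assumed for illustration; the page does not say which stopword list or stemmer the project used.

```python
import re

# Tiny illustrative stopword list (an assumption, not the project's list).
STOPWORDS = frozenset({"a", "an", "and", "in", "is", "of", "the", "to"})


def stem(word):
    """Naive suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize, remove stopwords, and stem; returns a set."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {stem(t) for t in tokens if t not in STOPWORDS}


def jaccard(a, b):
    """Jaccard coefficient of two sets; 0.0 if both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank(text, candidates):
    """Rank candidates (URI -> merged label text) by similarity to text."""
    t = preprocess(text)
    scored = [(uri, jaccard(preprocess(c), t)) for uri, c in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical candidates; the URIs are made up for this example.
    candidates = {
        "ex:Paris_Texas": "Paris city in Texas",
        "ex:Paris_France": "Paris capital of France",
    }
    for uri, score in rank("Paris is the capital of France", candidates):
        print(uri, round(score, 2))
```

Because the comparison works on sets, repeated words in T or c do not affect the score; only the vocabulary overlap does.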

Web UI
