| Workshop 2007 | Wednesday, May 16, 2012 |
| Entity disambiguation is the problem of determining whether two mentions of entities refer to the same object: e.g., trying to decide whether the entity called "Jim Clark" in one document is the same as the entity called "Jim Clark" in another document. To do this accurately, it is necessary to extract from these documents descriptions of these entities that are as exhaustive and accurate as possible. This in turn requires 'tracking' these entities in each document - identifying all or most of their mentions - and collecting their properties, particularly those that help most to discriminate between individuals. The goal of the workshop is to further the state of the art in entity disambiguation by developing better techniques for tracking entities and for extracting their properties. A particular focus will be improving entity tracking by using lexical and encyclopedic knowledge extracted both from structured lexical databases and from semi-structured repositories such as Wikipedia. Lack of such knowledge is one of the main problems with current entity tracking methods, which typically cannot detect, for example, that 'the Packwood proposal' and 'the Packwood plan' refer to the same entity.
Methods to be used include text mining techniques (supervised and unsupervised) to extract object properties; better machine learning techniques to improve entity tracking (e.g., using tree kernels); methods for extracting knowledge from WordNet, semantic role labellers, and Wikipedia; and clustering methods for entity disambiguation.
Click here for technical details | |||
| Team Members: | |||
| Massimo Poesio | Team Leader | University of Essex and University of Trento | poesio at essex dot ac dot uk |
| David Day | Co-Leader | MITRE | day at mitre dot org |
| Ron Artstein | Senior Researcher | University of Essex | artstein at essex dot ac dot uk |
| Jason Duncan | Senior Researcher | Department of Defense | emailjdd at gmail dot com |
| Alessandro Moschitti | Senior Researcher | University of Trento | moschitti at info dot uniroma2 dot it |
| Xiaofeng Yang | Senior Researcher | Institute for Infocomm Research, Singapore | xiaofengy at i2r dot a-star dot edu dot sg |
| Robert Hall | Graduate Student | University of Massachusetts | rhall at cs dot umass dot edu |
| Simone Ponzetto | Graduate Student | EML Research | ponzetto at eml-research dot de |
| Jason Smith | Graduate Student | Johns Hopkins University | jrs026 at gmail dot com |
| Yannick Versley | Graduate Student | University of Tübingen | versley at sfs dot uni-tuebingen dot de |
| Michael Wick | Graduate Student | University of Massachusetts | mwick at student dot umass dot edu |
| Vladimir Eidelman | Undergraduate Student | Columbia University | vae2101 at columbia dot edu |
| Alan Jern | Undergraduate Student | University of California, Los Angeles | ajern at ucla dot edu |
| Brett Shwom | Undergraduate Student | New York University | brett dot shwom at nyu dot edu |
| Affiliates: | |||
| Claudio Giuliano | FBK-IRST | ||
| Janet Hitzeman | MITRE | ||
| Veronique Hoste | University of Antwerp | ||
| Mijail Kabadjov | Edinburgh University | ||
| Sameer Pradhan | BBN | ||
| Emily Jamison | Ohio | ||
| Gideon Mann | University of Massachusetts | ||
| Walter Daelemans | University of Antwerp, Belgium | ||
| Michael Strube | EML Research | ||
| The Center for Language and Speech Processing The Johns Hopkins University 3400 North Charles Street, Barton Hall Baltimore, MD 21218 | |||||
| Telephone: (410) 516-4237 | Fax: (410) 516-5050 | E-mail: clsp@clsp.jhu.edu | |||