IIIT-Hyderabad Advanced School on Natural Language Processing
May 26th - June 9th, Hyderabad, India, Summer 2008

 

Projects

  • Miscellaneous Projects

    1. Machine learning of WSD and Selectional Restrictions

      Word Sense Disambiguation (WSD) is the following problem: given a word, its context, and its possible meanings, determine the meaning of the word in that context.
      In this project you will be given training and test texts (parsed, if needed), and the task will be to develop an algorithm for WSD.
      References:
      1. Word-Sense Disambiguation Using Statistical Models of Roget's Categories Trained on Large Corpora (1992), David Yarowsky
      http://citeseer.ist.psu.edu/rd/62988341%2C39762%2C1%2C0.25%2CDownload/http://citeseer.ist.psu.edu/cache/papers/cs/1083/http:zSzzSzwww.cs.jhu.eduzSz%7EyarowskyzSzpubszSzcoling92.pdf/yarowsky92wordsense.pdf
      2. Using Syntactic Dependency as Local Context to Resolve Word Sense Ambiguity (1997), Dekang Lin
      www.aclweb.org/anthology-new/P/P97/P97-1009.pdf
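
      As a starting point, the WSD task described above can be sketched as a supervised classifier. The example below is a minimal Naive Bayes baseline over bag-of-words contexts with add-one smoothing; it is not the method of either reference. The sense labels, the toy training data, and the function names train and disambiguate are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Collect counts from (sense, context_words) pairs for one ambiguous word."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)  # sense -> counts of context words
    vocab = set()
    for sense, context in examples:
        sense_counts[sense] += 1
        for word in context:
            word_counts[sense][word] += 1
            vocab.add(word)
    return sense_counts, word_counts, vocab


def disambiguate(model, context):
    """Return the sense with the highest smoothed log-probability."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in context:
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical toy training data for the word "bank".
training = [
    ("bank/finance", ["deposit", "money", "account"]),
    ("bank/finance", ["loan", "money", "interest"]),
    ("bank/river", ["river", "water", "shore"]),
    ("bank/river", ["fishing", "river", "mud"]),
]
model = train(training)
print(disambiguate(model, ["money", "loan"]))
```

      With parsed input, the bag of surrounding words could be replaced by syntactic features such as the dependency relations of the target word, without changing the rest of the sketch.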

      Selectional constraints specify the semantic classes acceptable in syntactic structures.
      The task is to identify the selectional restrictions of verbs, i.e. the semantic categories of their arguments, and to determine which arguments are mandatory and which are optional.
      This project is similar to building VerbNet, except that here the resource should be built by machine learning.
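
      One simple way to approach this is to count, for each verb and argument slot, which semantic classes fill the slot and how often the slot is present at all. The sketch below assumes that parsed verb instances and a noun-to-class lookup are already available; the SEMANTIC_CLASS table, the role names, the observations, and the 0.9 threshold are all hypothetical choices, not part of the project description.

```python
from collections import Counter, defaultdict

# Hypothetical noun -> semantic class lookup (could come from a lexical resource).
SEMANTIC_CLASS = {
    "boy": "animate",
    "dog": "animate",
    "apple": "food",
    "bread": "food",
    "spoon": "instrument",
}

# Hypothetical parsed instances: (verb, {role: head noun of the argument}).
observations = [
    ("eat", {"subj": "boy", "obj": "apple"}),
    ("eat", {"subj": "dog", "obj": "bread"}),
    ("eat", {"subj": "boy", "obj": "bread", "with": "spoon"}),
]


def learn_restrictions(observations, mandatory_threshold=0.9):
    """Estimate the preferred class and the optionality of each verb argument."""
    verb_freq = Counter()
    class_counts = defaultdict(Counter)  # (verb, role) -> semantic class counts
    for verb, args in observations:
        verb_freq[verb] += 1
        for role, noun in args.items():
            class_counts[(verb, role)][SEMANTIC_CLASS.get(noun, "unknown")] += 1

    frames = defaultdict(dict)
    for (verb, role), classes in class_counts.items():
        # Fraction of the verb's instances in which this role is filled.
        coverage = sum(classes.values()) / verb_freq[verb]
        frames[verb][role] = {
            "class": classes.most_common(1)[0][0],
            "mandatory": coverage >= mandatory_threshold,
        }
    return dict(frames)


frames = learn_restrictions(observations)
for role, info in frames["eat"].items():
    print(role, info)
```

      Raw counts are only a baseline; a real system would need far more data and some generalization over the class hierarchy.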

    2. Violence in Language

      This project would explore the use of words associated with violence in language. For example, "add a bullet point", "kick off the event", and "exploit the treebank" all use terms related to "physical" or "mental" violence, though here they are not used to indicate violence. The work would require building a set of such terms and then an ontology, using WordNet as well as other lexical resources. These would then be used to compute an index of "violent terms" in a text. A cross-language study would also be carried out, and "peaceful terms" would be explored as well.
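
      The index mentioned above is not defined in the project description. One possible reading, sketched below, is the fraction of tokens in a text that appear in a term list. The VIOLENT_TERMS set and the violence_index function are hypothetical; the term list is a hand-picked placeholder for the lexicon the project would actually build.

```python
import re

# Hypothetical seed lexicon; the project would derive this from lexical resources.
VIOLENT_TERMS = {"bullet", "kick", "exploit", "attack", "kill"}


def violence_index(text, lexicon=VIOLENT_TERMS):
    """Return (fraction of tokens found in the lexicon, list of matched tokens)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0, []
    hits = [token for token in tokens if token in lexicon]
    return len(hits) / len(tokens), hits


index, hits = violence_index("Add a bullet point and kick off the event.")
print(hits, round(index, 3))
```

      This sketch matches surface forms only, so inflected forms such as "kicked" are missed unless the text is lemmatized first. It also cannot tell literal from figurative use, which is the distinction the project sets out to study.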

      -----------------------------------------------------------------------------------------------