Natural Language Processing or Computational Linguistics (NLP/CL) deals with understanding and developing computational theories of human language. Such theories allow us to understand the structure of language and to build computer software that can process it. NLP/CL is expected to play a major role in facilitating human-machine as well as human-human communication. Its goals include creating computer systems that can speak with and listen to users, and machines that can translate from one language to another, thus bringing about a virtual revolution in access to information.

In the MT-NLP Lab at LTRC, IIIT-H, work is undertaken in many sub-areas of NLP, including syntax and parsing, semantics and word sense disambiguation, discourse and treebanking, and machine translation. Computational models inspired by linguistics are built and combined with machine learning techniques. The Lab, and the Centre as a whole, have done original work on developing the Computational Paninian Grammar (CPG) framework for Indian languages. Using this framework, treebanks for Indian languages have been developed. These provide a rich testbed for studying and understanding language in actual use, and are also used to develop parsers with machine learning. This has given rise to broad-coverage full-sentence parsers for Indian languages.

Work on semantic processing includes developing the semantic PurposeNet, building semantic category assigners, and identifying semantic relations in nominals. Work is also under way in discourse processing, including the development of a discourse treebank, and in dialog processing. Machine translation (MT) has been a driving application and is the subject of intense research, covering English-to-Hindi MT as well as MT from one Indian language to another. The various research areas of the Lab are given here.

Sampark (ILMT)

Sampark is a multipart machine translation system developed with the combined efforts of 11 institutions under the umbrella of the consortium project "Indian Language to Indian Language Machine Translation" (ILMT), funded by the TDIL programme of the Department of Information Technology, Government of India.

