ICON - 2011 : Tutorial Details

Tamil Computing

By
T V Geetha
Ranjani Parthasarathi
Madhan Karky
Anna University
Chennai

The Tamil language presents interesting issues and challenges to researchers and developers of natural language processing applications. The purpose of this tutorial is to discuss some of these challenges, to describe the methodologies available for natural language processing in general, and to present some solutions that can guide current and future practitioners in the field of Tamil Computing.

We begin the tutorial by outlining the significance of Tamil language processing and discussing some important characteristics of the Tamil language that make processing it a challenge. We then describe the typical components of a natural language processing system from the computing perspective, using Tamil as the case study.

Next, we discuss typical applications of natural language processing, along with some futuristic ones. We then cover rule-based and statistical approaches to natural language processing and its applications, followed by some recent approaches to semantics-based processing and applications. We also outline how the rich morphology and the partially free word order of Tamil pose interesting challenges to these rule-based and statistical approaches.

The tutorial will provide some basic tools, such as a dictionary and a morphological analyzer, as APIs that participants can use to build interesting applications as part of the hands-on component of the tutorial.