Statistical Machine Translation in Indian Languages
Machine translation (MT) is the process of encoding the syntactic and semantic information of a source-language text into a target language. Over the past two decades, MT has shown very promising results, particularly with Statistical Machine Translation (SMT), and especially for English and other European languages. However, its effectiveness in translating between Indian languages (ILs) and between English and ILs needs to be explored further.
The NLP Tools Contest at ICON 2014 aims to collectively explore the effectiveness of SMT for translation among ILs and between English and ILs.
The Contest
Training data will be provided to the contestants. It will consist of parallel corpora for different ILs and English, and contestants will have to train their systems on this data. A development corpus will also be provided so that contestants can refine and improve their systems. The final contest will be held in November 2014 with the test data. A workshop will be held as part of ICON to allow the shortlisted candidates to present their techniques and results. Details of the language pairs will be announced shortly. Translation will be tested in both directions for every language pair.
Details of the evaluation procedure and the policy on the use of additional resources and tools will be announced shortly.
Schedule
Release of training data: August 23, 2014
Release of test data: September 29, 2014
Deadline for system submission: October 5, 2014
Date for report submission: October 25, 2014
Prizes
1st Prize: Rs. 10,000
2nd Prize: Rs. 7,500
3rd Prize: Rs. 5,000
Last date for registration: August 18, 2014
NLP Tools contest registration page - link