The anusaaraka framework, with an initial data set for the English-Hindi pair, is almost ready for user-cum-developers.
The initial data, we believe, are sufficient for starting a "boot-strapping" process.
However, the data need to be enhanced several-fold, both qualitatively and quantitatively, before the system is ready for general use.
As far as the English side is concerned, a lot of material is freely available
under the General Public License (GPL).
What is required is
It is proposed to select a few Hindi-medium schools in Lucknow that have computers.
Selected students and teachers from these schools will be introduced to the anusaaraka system. The students will act as typical users and give feedback about the system. The teachers will provide concrete suggestions and data for improving the system.
To make sure that the students' time is not wasted, English versions of their science texts will be selected as input for the anusaaraka system.
This is the work where researchers and English-Hindi bilinguals can contribute.
Mahatma Gandhi International Hindi University's Bhasha Kendra at Lucknow and the Lucknow branch of CIEFL have agreed that their Ph.D. scholars will work on topics related to anusaaraka.
In what follows, we concentrate mainly on the English-Hindi anusaaraka, which is the culmination of around two decades of effort by the Akshara Bharati group.
The English-Hindi anusaaraka is supported by Satyam Computer Services Limited.
A snapshot of a sample English-Hindi anusaaraka output, with a brief explanation of each of the layers:
|Princess and the Pea|Java|XML|
|Lazy Man and Coconuts|Java|XML|
|Finding the Thief|Java|XML|
|What clothes Say?|Java|XML|
|Origin of Words|Java|XML|