ICON 2003 - International Conference On Natural Language Processing
Keynote Address Abstract


A Customizable, Self-Learnable Parameterized Machine Translation System Achieved via Two-Way Training

Dr. Keh-Yih Su
Behavior Design Corporation
2F, No.5, Industry E. Rd. IV,
Science-Based Industrial Park,
Hsinchu, Taiwan 30077, R.O.C.


Traditionally, machine translation systems adopt rule-based approaches and are designed around a general-purpose kernel in which only the dictionaries change when the domain is switched, in the hope that wide coverage and high quality can be obtained at the same time. Such approaches, however, have trouble dealing with non-deterministic knowledge and have great difficulty acquiring the huge amount of fine-grained knowledge required. A parameterized MT architecture, which allows self-learning and customization in a specific domain with high translation quality, is thus greatly desired. About fifteen years ago, IBM proposed a purely statistical approach to handle the problems mentioned above. However, because it adopts no linguistic or AI models, this approach fails to handle long-distance dependencies within the context and has a very large parameter space.

In this talk, the major problems of current machine translation systems are first outlined. The characteristics of NLP are then given. Afterwards, a new direction, which emphasizes the system's capability to be self-learnable and customizable, is proposed for attacking the problems described above, which mainly result from the intrinsic complexity of natural languages. The proposed solution first builds a stochastic language model on top of linguistic models, and then adopts an unsupervised two-way training mechanism and a parameterized architecture to automatically acquire the non-deterministic knowledge required, so that the system can be easily adapted to different domains and to the various preferences of individual users.

© LTRC All rights reserved