Natural Language Processing based on and for
Information Explosion on the Web



Abstract
 
The information explosion on the web has had two major impacts on the NLP research community. First, it provides us with huge knowledge sources from which we can extract both linguistic knowledge and extra-linguistic (common-sense) knowledge. This makes it possible to break the deadlock between language understanding and knowledge acquisition by bootstrapping. Second, NLP-based intelligent support for exploiting web information is becoming a killer application for NLP, since information on the web is increasingly important: it supplies criteria for the judgments people make in daily life and is starting to have a strong influence on governmental policy and business management. This lecture introduces several of our ongoing projects concerning NLP based on and for the information explosion on the web:
  1. Case frame acquisition from 500M parsed sentences
  2. Synonymous expression acquisition from an ordinary dictionary and web corpus
  3. TSUBAKI: an open search engine infrastructure with 100M Japanese web pages
  4. A clustering search engine
  5. Information credibility criteria on the web


