The information explosion on the web has had two major impacts on the
NLP research community. First, it provides huge knowledge sources from
which we can extract both linguistic knowledge and extra-linguistic
knowledge (common-sense knowledge). This makes it possible to break the
deadlock between language understanding and knowledge acquisition by
bootstrapping. Second, NLP-based intelligent support for exploiting web
information is becoming a killer application for NLP, since information
on the web is increasingly important: it provides criteria for
judgements in people's daily lives and is starting to have a strong
influence on governmental policy and business management. This lecture
introduces several of our ongoing projects concerning NLP based on, and
for, the information explosion on the web.
- Case frame acquisition from 500M parsed sentences
- Synonymous expression acquisition from an ordinary dictionary and a web corpus
- TSUBAKI: an open search engine infrastructure with 100M Japanese web pages
- A clustering search engine
- Information credibility criteria on the web