ICON 2018 - Tutorial

Explaining Deep Learning models for Natural Language Processing


Deep learning techniques have demonstrated tremendous success in the natural language processing (NLP) community, leading to the inclusion of deep learning components in many industry products.

Most models are trained on large amounts of data, generally created by users of the internet. These data may contain human biases and prejudices. Models learned from such data will carry those prejudices forward, which can be harmful. For example, Google Photos' algorithm was criticized as a 'racist algorithm' for labeling a black engineer's photos as a gorilla. Such mishaps could be prevented if companies were able to understand and validate the underlying rationale behind their deep learning components. Models and techniques that help uncover these rationales fall under the purview of Explainable AI.

Explainable AI technologies are the need of the hour. With the adoption of the GDPR, which includes, among other provisions, a right to explanation, the need for such technologies has grown further. The aim of this tutorial is to give an extensive overview of existing explainable AI techniques and describe which of them can be applied to deep learning models for NLP. Attendees will learn to frame their explanation requirements and apply these techniques to their own problem statements.

This tutorial spans three parts. In the first part, we discuss the basics of deep learning. In the second part, we discuss model explainability. In the third part, we demonstrate two techniques, LIME [25] and LRP [2,3], for explaining models trained on document classification and sentiment analysis tasks.
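To give a flavor of what the demonstration covers, the sketch below illustrates the core idea behind LIME (perturb the input, query the black-box model on each perturbation, and fit a locally weighted linear model whose coefficients act as per-word importance scores). This is a minimal illustration, not the `lime` library itself; the toy sentiment classifier `black_box` and all names here are assumptions for the example.

```python
import numpy as np

def black_box(words):
    """Toy stand-in for a trained sentiment model: P(positive)."""
    score = 1.0 * ("good" in words) - 1.0 * ("bad" in words)
    return 1.0 / (1.0 + np.exp(-score))

def lime_explain(words, n_samples=500, seed=0):
    """Return a per-word importance score via a local weighted linear fit."""
    rng = np.random.default_rng(seed)
    d = len(words)
    # Binary masks: which words are kept in each perturbed sample.
    Z = rng.integers(0, 2, size=(n_samples, d))
    Z[0] = 1  # include the unperturbed instance
    y = np.array([black_box([w for w, keep in zip(words, z) if keep])
                  for z in Z])
    # Proximity weights: samples closer to the original count more.
    w = np.exp(-(d - Z.sum(axis=1)) / d)
    # Append an intercept column, then solve ridge-regularized
    # weighted least squares for the local surrogate model.
    X = np.hstack([Z, np.ones((n_samples, 1))])
    A = X.T @ (w[:, None] * X) + 1e-3 * np.eye(d + 1)
    b = X.T @ (w * y)
    coefs = np.linalg.solve(A, b)
    return dict(zip(words, coefs[:d]))  # drop the intercept

explanation = lime_explain(["the", "movie", "was", "good"])
print(max(explanation, key=explanation.get))  # the word driving the prediction
```

Since the toy model responds only to the word "good", the surrogate assigns it the largest coefficient, which is exactly the kind of word-level attribution the tutorial demonstration produces on real classifiers.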