New Releases

About the datasets and tools:
Resources for several Indian languages are available here. They are divided into two types:

  1. Datasets
  2. Tools
The annotation in the dependency treebanks follows the Paninian Grammar Framework (see the guidelines in Documents). The mapping from these dependency labels to the Stanford dependency labels is also in Documents. The annotation is encoded in SSF (Shakti Standard Format); for further details on SSF, refer to the SSF_Guide.
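
As a rough illustration of what working with SSF-encoded text can look like, here is a minimal Python sketch. It rests on assumptions not stated on this page: that each line holds tab-separated columns (address, token, tag, feature structure), that chunks open with `((` and close with `))`, and that a dependency label appears as a `drel` attribute on the chunk. The sample sentence and the function name `parse_ssf` are hypothetical and written only for this example; the SSF_Guide remains the authoritative description of the format.

```python
import re

# Hypothetical sample written for illustration; it is NOT taken from the treebanks.
# Assumed layout: address <TAB> token <TAB> tag <TAB> feature structure.
SAMPLE = """<Sentence id="1">
1\t((\tNP\t<fs name='NP' drel='k1:VGF'>
1.1\traama\tNNP\t<fs af='raama,n,m,sg,3,d,0,0'>
\t))
2\t((\tVGF\t<fs name='VGF'>
2.1\tgayaa\tVM\t<fs af='jaa,v,m,sg,any,,yaa,yaa'>
\t))
</Sentence>
"""

# Matches key='value' pairs inside a feature structure such as <fs name='NP'>.
ATTR = re.compile(r"(\w+)='([^']*)'")


def parse_ssf(text):
    """Return a list of chunks, each with its tag, attributes, and tokens.

    Simplification: nested chunks are not handled.
    """
    chunks = []
    current = None
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) < 2:
            continue  # sentence wrapper lines and blank lines have no tabs
        token = cols[1]
        tag = cols[2] if len(cols) > 2 else ""
        if token == "((":
            # Chunk opening line: read the chunk tag and its attributes.
            fs = cols[3] if len(cols) > 3 else ""
            current = {"tag": tag, "attrs": dict(ATTR.findall(fs)), "tokens": []}
            chunks.append(current)
        elif token == "))":
            current = None  # chunk closing line
        elif current is not None:
            current["tokens"].append((token, tag))
    return chunks


chunks = parse_ssf(SAMPLE)
for chunk in chunks:
    print(chunk["tag"], chunk["attrs"].get("drel"), chunk["tokens"])
```

The sketch keeps only the chunk tag, the chunk attributes, and the (token, tag) pairs. Real treebank files may carry more structure than this handles, so check the output against the SSF_Guide before relying on it.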

Download Dataset :
To download a dataset, click the respective link in the table below and fill in the form.

Dataset Name                          Link For Download
Hindi Dependency All Domains          Link
Hindi Word Problems With Equations    Link
Telugu Clickbait                      Link
Telugu Dependency Treebank            Link
Telugu Emotion Identification         Link
Telugu Hate Speech                    Link
Telugu NER Data                       Link
Telugu Sarcasm                        Link
Telugu Sentiment Analysis             Link
Urdu Dependency Treebank              Link


Download Tool :
To download a tool, click the respective link in the table below and fill in the form.

Tool Name                                                 Link For Download    Supporting Document
Sampark Shallow Parser API                                Link                 File
SSF To CoNLL Converter                                    Link
Hindi Word Problem Solver                                 Link
WX to UTF and UTF to WX Converter for Indian Languages    Link
WX to UTF and UTF to WX Converter for Urdu                Link


Demo Tool :
To try a tool demo, click the respective link in the table below.

Tool Name                  Link For Demo    Supporting Document
Hindi Shallow Parser v3    Link