Multi-domain corpus for sentiment analysis
Authors : Gangula Rama Rohit Reddy, Radhika Mamidi
About the dataset : The corpus "Sentiraama" was created by G. Rama Rohit Reddy at the Language Technologies Research Centre, KCIS, IIIT Hyderabad, to aid sentiment analysis in Telugu. It consists of four datasets annotated on a two-value scale that distinguishes positive from negative sentiment at the document level. The datasets cover multiple domains: book reviews, product reviews, movie reviews and song lyrics. Each dataset was annotated by annotators who carefully followed the annotation procedure.
The folder "Song_Lyrics" contains 339 different Telugu song lyrics written in Telugu script, of which 230 are positive and 109 are negative. It contains a total of 13997 sentences and 81798 words.
The folder "Movie Reviews" contains 267 different Telugu movie reviews written in Telugu script, of which 136 are positive and 131 are negative. It contains a total of 20000 sentences and 165049 words.
The folder "Product Reviews" contains 200 different product reviews written in Telugu script, of which 100 are positive and 100 are negative. It contains a total of 43199 sentences and 259189 words.
The folder "Book Reviews" contains 200 different book reviews written in Telugu script, of which 100 are positive and 100 are negative. It contains a total of 6808 sentences and 33179 words.
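The label counts above determine the class balance of each dataset, which is worth checking before training a classifier. The following is a minimal Python sketch that uses only the positive and negative counts stated above; the names `LABEL_COUNTS`, `majority_baseline` and `summarize` are illustrative and are not part of the corpus distribution.

```python
# Positive/negative document counts per folder, copied from the description above.
LABEL_COUNTS = {
    "Song_Lyrics": (230, 109),
    "Movie Reviews": (136, 131),
    "Product Reviews": (100, 100),
    "Book Reviews": (100, 100),
}


def majority_baseline(positive, negative):
    """Accuracy obtained by always predicting the more frequent label."""
    return max(positive, negative) / (positive + negative)


def summarize(counts):
    """Return document count, positive share and majority baseline per folder."""
    summary = {}
    for name, (positive, negative) in counts.items():
        total = positive + negative
        summary[name] = {
            "documents": total,
            "positive_share": positive / total,
            "majority_baseline": majority_baseline(positive, negative),
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize(LABEL_COUNTS).items():
        print(
            f"{name}: {row['documents']} documents, "
            f"{row['positive_share']:.1%} positive, "
            f"majority baseline {row['majority_baseline']:.1%}"
        )
```

The two review datasets with 100 documents per class are balanced, whereas the song lyrics lean positive, so a classifier evaluated on that folder should be compared against its majority baseline rather than against 50%.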
Corpus Statistics :
Dataset | Documents | Sentences | Words |
---|---|---|---|
Song Lyrics | 339 | 13997 | 81798 |
Movie Reviews | 267 | 25278 | 164307 |
Product Reviews | 200 | 4357 | 37494 |
Book Reviews | 200 | 3340 | 15031 |
Total Corpus | 1006 | 46972 | 298630 |
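
The "Total Corpus" row can be re-derived from the four dataset rows. The sketch below, in Python, copies the figures from the table, sums each column, and derives average document and sentence lengths; the names `ROWS`, `column_totals` and `averages` are illustrative only.

```python
# (documents, sentences, words) per dataset, copied from the table above.
ROWS = {
    "Song Lyrics": (339, 13997, 81798),
    "Movie Reviews": (267, 25278, 164307),
    "Product Reviews": (200, 4357, 37494),
    "Book Reviews": (200, 3340, 15031),
}

# The "Total Corpus" row as printed in the table.
REPORTED_TOTAL = (1006, 46972, 298630)


def column_totals(rows):
    """Sum documents, sentences and words across all datasets."""
    documents = sum(r[0] for r in rows.values())
    sentences = sum(r[1] for r in rows.values())
    words = sum(r[2] for r in rows.values())
    return documents, sentences, words


def averages(documents, sentences, words):
    """Return (sentences per document, words per sentence)."""
    return sentences / documents, words / sentences


if __name__ == "__main__":
    totals = column_totals(ROWS)
    print("computed totals:", totals)
    print("matches table:", totals == REPORTED_TOTAL)
    for name, row in ROWS.items():
        per_doc, per_sentence = averages(*row)
        print(f"{name}: {per_doc:.1f} sentences/document, "
              f"{per_sentence:.1f} words/sentence")
```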
Download Dataset :
To download this dataset, kindly fill in the form given below.
If you use this corpus for your research, kindly cite it as follows:
Sentiraama Corpus by Gangula Rama Rohit Reddy, Radhika Mamidi. Language Technologies Research Centre, KCIS, IIIT Hyderabad. ltrc.iiit.ac.in/showfile.php?filename=downloads/sentiraama/