Published September 22, 2019 | Version v1
Dataset Open

BHAAV (भाव) - A Text Corpus for Emotion Analysis from Hindi Stories

  • 1. Adobe, New Delhi
  • 2. Bloomberg LP
  • 3. NSIT, New Delhi
  • 4. USICT, New Delhi
  • 5. IIIT, New Delhi

Description

BHAAV (भाव), which means "emotions" in Hindi, is the first and largest Hindi text corpus for analyzing the emotions that a writer expresses through the characters of a story, as perceived by a narrator/reader. The corpus consists of 20,304 sentences collected from 230 short stories spanning 18 genres, such as प्रेरणादायक (Inspirational) and रहस्यमयी (Mystery). Each sentence has been annotated with one of five emotion categories (anger, joy, suspense, sad, and neutral) by three native Hindi speakers with at least ten years of formal education in Hindi.

Files

Datasets-20190922T151602Z-001.zip

Files (24.7 MB)

md5:7754ad2c7a3737e98a601962830bfeec
24.7 MB
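
The MD5 checksum listed above can be used to confirm that the archive downloaded intact. The following is a minimal sketch, assuming the zip file has been saved in the current directory under the file name shown in this record; the helper names `md5_of_file` and `verify_download` are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksum and file name as listed in this record.
EXPECTED_MD5 = "7754ad2c7a3737e98a601962830bfeec"
ARCHIVE_NAME = "Datasets-20190922T151602Z-001.zip"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the whole archive is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumption: the archive was saved to the current working directory.
    archive = Path(ARCHIVE_NAME)
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum mismatch")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case downloading the archive again is the simplest remedy.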