The Multilingual Hatemail Corpus

Purpose

This corpus is intended as a go-to repository for researchers and hobbyists interested in hatemail.

Submission Guidelines

Anyone can submit.

Currently the submission format is JSON. Each entry holds one text string and a list of the languages it contains:

{ "entries" : [
	{
		"text" : "HATEMAIL IN HEBREW AND ENGLISH GOES HERE",
		"languages" : ["heb", "eng"]
	},
	{
		"text" : "MORE HATEMAIL IN ESTONIAN GOES HERE",
		"languages" : ["est"]
	}
]}

Tag each entry's languages with ISO 639-2 three-letter language codes, such as "heb", "eng", or "est".
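
A quick way to check a file against the format above before submitting is sketched below. This is not part of the corpus tooling; `validate_submission` is a hypothetical helper, and it assumes a language code is any three lowercase letters, so it checks the shape of each code rather than looking it up in the official ISO 639-2 list.

```python
import json
import re

# Assumption: an ISO 639-2 code is three lowercase ASCII letters. This checks
# shape only, not membership in the official code list.
CODE_PATTERN = re.compile(r"[a-z]{3}")


def validate_submission(raw):
    """Return a list of problems found in a submission; an empty list means none."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]

    entries = data.get("entries") if isinstance(data, dict) else None
    if not isinstance(entries, list):
        return ['top level must be an object with an "entries" list']

    problems = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: must be an object")
            continue
        # "text" must be present and contain more than whitespace.
        if not isinstance(entry.get("text"), str) or not entry["text"].strip():
            problems.append(f'entry {i}: "text" must be a non-empty string')
        # "languages" must be a non-empty list of three-letter codes.
        langs = entry.get("languages")
        if not isinstance(langs, list) or not langs:
            problems.append(f'entry {i}: "languages" must be a non-empty list')
        else:
            for code in langs:
                if not (isinstance(code, str) and CODE_PATTERN.fullmatch(code)):
                    problems.append(f"entry {i}: {code!r} is not a three-letter code")
    return problems
```

Calling `validate_submission(open("my_file.json", encoding="utf-8").read())` on a well-formed file should return an empty list; `my_file.json` is a placeholder name.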

Submit your file into a folder named after your username (or Anonymous, if you prefer). Data may be processed and reorganized later.

Please do not submit sender information.
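
The folder and privacy guidelines could be followed with a small script along these lines. It is only a sketch: `write_submission` is a hypothetical helper, the file name `submission.json` is an assumption (the guidelines do not name one), and dropping every field other than "text" and "languages" is one possible way to avoid including sender information by accident.

```python
import json
from pathlib import Path


def write_submission(root, entries, username=None, filename="submission.json"):
    """Write entries to <root>/<username or Anonymous>/<filename>; return the path."""
    # Fall back to the "Anonymous" folder when no username is given.
    folder = Path(root) / (username or "Anonymous")
    folder.mkdir(parents=True, exist_ok=True)

    # Keep only the two documented fields so stray metadata is not written out.
    cleaned = [{"text": e["text"], "languages": e["languages"]} for e in entries]

    path = folder / filename
    # ensure_ascii=False keeps non-Latin scripts readable in the saved file.
    path.write_text(
        json.dumps({"entries": cleaned}, ensure_ascii=False, indent="\t"),
        encoding="utf-8",
    )
    return path
```

For example, `write_submission(".", entries, username="alice")` would create `alice/submission.json` under the current directory, where `alice` is a placeholder.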