div5252/hindi-fake-news

Hindi Fake News Generation and Classification

Setting up the Repository

  1. Clone the repository
git clone git@github.com:div5252/hindi-fake-news.git
  2. Install the required dependencies in your Python environment
pip install -r requirements.txt

News article scraping

  1. Navigate to the news_scrapper directory
cd news_scrapper
  2. Create a new directory named data, where the downloaded news articles will be stored
mkdir data
  3. Run the navbharat_scrapper.py script
python navbharat_scrapper.py

Fake News Generation

Link to the dataset - https://drive.google.com/drive/folders/1YoEf0FxC_TNIgVlakFE3HjT0LK9EIkv2?usp=sharing

Using Split and Merge

  1. Navigate to the fake_news_generator directory
cd fake_news_generator
  2. Run the split_and_merge.py script
python split_and_merge.py
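
The README does not describe what split_and_merge.py does internally. The sketch below shows one plausible reading of the name, assuming a fake article is built by joining the first half of one real article with the second half of another. The function split_and_merge and its sentence_end parameter are hypothetical and are not taken from the repository.

```python
def split_and_merge(article_a, article_b, sentence_end="।"):
    """Join the first half of article_a with the second half of article_b.

    Hypothetical helper for illustration only; not the repository's code.
    Sentences are split on the Devanagari danda by default.
    """
    def sentences(text):
        # Drop empty fragments left over after the final sentence marker.
        return [s.strip() for s in text.split(sentence_end) if s.strip()]

    first, second = sentences(article_a), sentences(article_b)
    # Round the first half up so a one-sentence article still contributes.
    merged = first[: (len(first) + 1) // 2] + second[len(second) // 2 :]
    return " ".join(s + sentence_end for s in merged)


if __name__ == "__main__":
    print(split_and_merge("A. B.", "C. D.", sentence_end="."))
```

The merged text keeps fluent sentences from both sources, so the result reads naturally while no longer describing a single real event.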

Using NER replacement

  1. Navigate to the fake_news_generator directory
cd fake_news_generator
  2. Run the ner.py script
python ner.py --person_list=<csv containing hindi names> --location_list=<csv containing hindi locations> --organisation_list=<csv containing hindi organization names> --input_file=<input real news csv> --output_file=<output file name> --num_steps=<number of steps at which writing takes place>
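
The --person_list, --location_list and --organisation_list flags suggest that named entities in a real article are swapped for other names of the same type. The sketch below illustrates that idea on tokens that are already tagged. The label set (PER, LOC, ORG, O) and the function replace_entities are assumptions made for illustration, not the code in ner.py.

```python
import random


def replace_entities(tagged_tokens, replacements, rng=None):
    """Swap each tagged entity for a different name of the same type.

    tagged_tokens: list of (token, label) pairs.
    replacements: dict mapping a label to candidate replacement strings.
    Hypothetical helper; labels and signature are not taken from ner.py.
    """
    rng = rng or random.Random()
    out = []
    for token, label in tagged_tokens:
        # Never "replace" an entity with itself; untagged labels have no candidates.
        candidates = [c for c in replacements.get(label, []) if c != token]
        out.append(rng.choice(candidates) if candidates else token)
    return " ".join(out)


if __name__ == "__main__":
    tokens = [("Ram", "PER"), ("visited", "O"), ("Delhi", "LOC")]
    lists = {"PER": ["Shyam", "Mohan"], "LOC": ["Mumbai", "Pune"]}
    print(replace_entities(tokens, lists, rng=random.Random(0)))
```

Passing a seeded random.Random makes the generated dataset reproducible across runs.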

Using POS replacement

  1. Navigate to the fake_news_generator directory
cd fake_news_generator
  2. Run the pos.py script
python pos.py --input_file=<input real news csv file> --output_file=<output file containing the results>

Fake News Classification

Classification using similarity features

  1. Navigate to the fake_news_classifier directory
cd fake_news_classifier
  2. Run the news_similarity_classifier.py script
python news_similarity_classifier.py --train_path=<train dataset csv> --dev_path=<dev dataset csv> --test_path=<test dataset csv> --gold_path=<gold dataset csv> --result_path=<output text file> [--use_sentiment_features=<whether to use sentiment features>] [--bert_dir=<bert model file for similarity features>]
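
The README does not say which similarity features the classifier computes or which pieces of text it compares. As a rough illustration of what a similarity feature can look like, the sketch below computes bag-of-words cosine similarity between two texts using only the standard library. The function cosine_similarity is hypothetical and is not taken from news_similarity_classifier.py, which may instead rely on the BERT model passed via --bert_dir.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Bag-of-words cosine similarity between two whitespace-tokenised texts.

    Hypothetical feature function for illustration; not the repository's code.
    Returns 0.0 when either text is empty.
    """
    a, b = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    print(cosine_similarity("यह एक खबर है", "यह दूसरी खबर है"))
```

A score like this can be appended to the feature vector of each example before training a conventional classifier.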

Classification using BERT

  1. Navigate to the fake_news_classifier directory
cd fake_news_classifier
  2. Run the bert_classifier.py script
python bert_classifier.py --train_data=<train dataset csv> --test_data=<test dataset csv> --save_dir=<output directory> --num_epochs=<number of epochs>

About

Fake News Article Detection Datasets for Hindi Language
