Hindi Fake News Generation and Classification

Setting up the Repository

Clone the repository

git clone git@github.com:div5252/hindi-fake-news.git

Install the required dependencies in your Python environment

pip install -r requirements.txt

News article scraping

Navigate to the news_scrapper directory

cd news_scrapper

Create a new directory named data, where the downloaded news articles will be stored

mkdir data

Run the navbharat_scrapper.py script

python navbharat_scrapper.py

Fake News Generation

Link to the dataset - https://drive.google.com/drive/folders/1YoEf0FxC_TNIgVlakFE3HjT0LK9EIkv2?usp=sharing

Using Split and Merge

Navigate to the fake_news_generator directory

cd fake_news_generator

Run the split_and_merge.py script

python split_and_merge.py
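
The internals of split_and_merge.py are not described here. As a rough illustration of what a split-and-merge style generator might do, the sketch below splits each article into sentences on the Hindi danda (।) and joins the first half of one article to the second half of another. The function names, the halving rule, and the random pairing are all assumptions made for illustration, not the repository's code.

```python
import random

DANDA = "।"  # Hindi sentence terminator (purna viram)


def split_sentences(article):
    """Split an article into sentences on the danda, dropping empty pieces."""
    return [s.strip() for s in article.split(DANDA) if s.strip()]


def split_and_merge(article_a, article_b):
    """Join the first half of article_a to the second half of article_b."""
    a = split_sentences(article_a)
    b = split_sentences(article_b)
    head = a[: (len(a) + 1) // 2]  # first half, rounded up
    tail = b[len(b) // 2:]         # second half, rounded up
    return f"{DANDA} ".join(head + tail) + DANDA


def generate_fakes(articles, seed=0):
    """Pair every article with a randomly chosen different article (hypothetical helper)."""
    if len(articles) < 2:
        raise ValueError("need at least two articles to merge")
    rng = random.Random(seed)
    fakes = []
    for i, article in enumerate(articles):
        partner = rng.choice([a for j, a in enumerate(articles) if j != i])
        fakes.append(split_and_merge(article, partner))
    return fakes
```

A merged article keeps fluent sentences from both sources, so the result tends to look plausible locally while being inconsistent as a whole.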

Using NER replacement

Navigate to the fake_news_generator directory

cd fake_news_generator

Run the ner.py script

python ner.py --person_list=<CSV containing Hindi names> --location_list=<CSV containing Hindi locations> --organisation_list=<CSV containing Hindi organisation names> --input_file=<input real news CSV> --output_file=<output file name> --num_steps=<number of steps at which writing takes place>
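
The command-line options suggest that entities in real news are swapped for entries drawn from the supplied lists. A minimal sketch of that idea follows, assuming the lists are already loaded into Python and that entities are single whitespace-separated tokens. The function name and the token-matching rule are hypothetical; the actual script may rely on a proper NER tagger instead.

```python
import random


def replace_entities(text, entity_lists, rng):
    """Swap each token found in an entity list for a different entry of the same list."""
    out = []
    for token in text.split():
        for candidates in entity_lists.values():
            if token in candidates:
                # Pick any other entry of the same entity type, if one exists.
                alternatives = [c for c in candidates if c != token]
                if alternatives:
                    token = rng.choice(alternatives)
                break
        out.append(token)
    return " ".join(out)


# Toy lists standing in for the person/location CSV files (assumed contents).
entity_lists = {
    "person": ["राम", "मोहन"],
    "location": ["दिल्ली", "मुंबई"],
}
fake = replace_entities("राम दिल्ली गया", entity_lists, random.Random(0))
```

Replacing an entity only with another of the same type keeps the sentence grammatical while changing the facts it states.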

Using POS replacement

Navigate to the fake_news_generator directory

cd fake_news_generator

Run the pos.py script

python pos.py --input_file=<input real news csv file> --output_file=<output file containing the results>
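
How pos.py selects and replaces words is not documented here. The sketch below shows one possible form of POS-based replacement, assuming the sentence has already been tagged by some Hindi POS tagger (not shown) and that a per-tag vocabulary is available. The function name, the tag names, and the vocabulary are illustrative assumptions only.

```python
import random


def replace_by_pos(tagged_tokens, vocabulary, target_tags, rng):
    """Replace words whose tag is in target_tags with another word carrying the same tag."""
    out = []
    for word, tag in tagged_tokens:
        choices = [w for w in vocabulary.get(tag, []) if w != word]
        if tag in target_tags and choices:
            word = rng.choice(choices)
        out.append(word)
    return " ".join(out)


# Pre-tagged toy sentence; the tags would come from an external tagger.
tagged = [("वह", "PRON"), ("अच्छा", "ADJ"), ("लड़का", "NOUN"), ("है", "AUX")]
vocabulary = {"ADJ": ["अच्छा", "बुरा"]}
fake = replace_by_pos(tagged, vocabulary, {"ADJ"}, random.Random(0))
```

Restricting replacement to selected tags (here adjectives) changes the meaning of a sentence without disturbing its structure.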

Fake News Classification

Classification using similarity features

Navigate to the fake_news_classifier directory

cd fake_news_classifier

Run the news_similarity_classifier.py file

python news_similarity_classifier.py --train_path=<train dataset csv> --dev_path=<dev dataset csv> --test_path=<test dataset csv> --gold_path=<gold dataset csv> --result_path=<output text file> [--use_sentiment_features=<whether to use sentiment features>] [--bert_dir=<bert model file for similarity features>]
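
Which similarity features the classifier computes is not spelled out here, and the --bert_dir option suggests that embedding-based similarity is involved. As a simpler stand-in, the sketch below computes a bag-of-words cosine similarity between two texts, which is one common kind of similarity feature. The function name and the whitespace tokenisation are assumptions.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the token-count vectors of two texts."""
    a, b = Counter(text_a.split()), Counter(text_b.split())
    # Only tokens present in both texts contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0
```

A score near 1 means the two texts share most of their vocabulary, and a score of 0 means they share none. A classifier could take such a score as one input feature among several.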

Classification using BERT

Navigate to the fake_news_classifier directory

cd fake_news_classifier

Run the bert_classifier.py file

python bert_classifier.py --train_data=<train dataset csv> --test_data=<test dataset csv> --save_dir=<output directory> --num_epochs=<number of epochs>
