- Clone the repository
git clone git@github.com:div5252/hindi-fake-news.git
- Install the required dependencies in your Python environment
pip install -r requirements.txt
- Navigate to the news_scrapper directory
cd news_scrapper
- Create a new directory named data, where the downloaded news articles will be stored
mkdir data
- Run the navbharat_scrapper.py script
python navbharat_scrapper.py
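Note that `mkdir data` reports an error if the directory already exists, for example when the scraper is run a second time. A minimal sketch of an idempotent alternative follows; the helper name `ensure_data_dir` is hypothetical and not part of this repository.

```python
from pathlib import Path


def ensure_data_dir(base_dir: Path) -> Path:
    """Create base_dir/data if it does not exist yet and return its path."""
    data_dir = base_dir / "data"
    # exist_ok=True makes the call a no-op when the directory is already there.
    data_dir.mkdir(exist_ok=True)
    return data_dir


if __name__ == "__main__":
    # Run from inside news_scrapper, before starting navbharat_scrapper.py.
    print(ensure_data_dir(Path.cwd()))
```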
Link to the dataset - https://drive.google.com/drive/folders/1YoEf0FxC_TNIgVlakFE3HjT0LK9EIkv2?usp=sharing
- Navigate to the fake_news_generator directory
cd fake_news_generator
- Run the split_and_merge.py script
python split_and_merge.py
- Navigate to the fake_news_generator directory
cd fake_news_generator
- Run the ner.py script
python ner.py --person_list=<csv containing hindi names> --location_list=<csv containing hindi locations> --organisation_list=<csv containing hindi organization names> --input_file=<input real news csv> --output_file=<output file name> --num_steps=<number of steps at which writing takes place>
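Because the ner.py command takes six flags, it can help to see it with every placeholder filled in. The sketch below only assembles and prints the command line; all file names and the `num_steps` value are hypothetical placeholders, and the helper `build_ner_command` is not part of this repository.

```python
import shlex
import sys


def build_ner_command(person_list, location_list, organisation_list,
                      input_file, output_file, num_steps):
    """Return the argv list for ner.py, using only the flags listed above."""
    return [
        sys.executable, "ner.py",
        f"--person_list={person_list}",
        f"--location_list={location_list}",
        f"--organisation_list={organisation_list}",
        f"--input_file={input_file}",
        f"--output_file={output_file}",
        f"--num_steps={num_steps}",
    ]


# Every value below is a hypothetical placeholder; substitute your own files.
cmd = build_ner_command(
    person_list="hindi_names.csv",
    location_list="hindi_locations.csv",
    organisation_list="hindi_organisations.csv",
    input_file="real_news.csv",
    output_file="ner_fake_news.csv",
    num_steps=100,
)

# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))

# To execute it from inside fake_news_generator instead of printing:
#   import subprocess
#   subprocess.run(cmd, check=True)
```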
- Navigate to the fake_news_generator directory
cd fake_news_generator
- Run the pos.py script
python pos.py --input_file=<input real news csv file> --output_file=<output file containing the results>
- Navigate to the fake_news_classifier directory
cd fake_news_classifier
- Run the news_similarity_classifier.py file
python news_similarity_classifier.py --train_path=<train dataset csv> --dev_path=<dev dataset csv> --test_path=<test dataset csv> --gold_path=<gold dataset csv> --result_path=<output text file> [--use_sentiment_features=<whether to use sentiment features>] [--bert_dir=<bert model directory for similarity features>]
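The square brackets in the command above mark optional flags, which are left out entirely when not needed. The sketch below illustrates this by building the argument list with and without them. The helper `build_classifier_command` and all file names are hypothetical, and the values accepted by `--use_sentiment_features` are not specified here, so the sketch passes through whatever string the caller supplies.

```python
import sys
from typing import List, Optional


def build_classifier_command(train_path: str, dev_path: str, test_path: str,
                             gold_path: str, result_path: str,
                             use_sentiment_features: Optional[str] = None,
                             bert_dir: Optional[str] = None) -> List[str]:
    """Return the argv list for news_similarity_classifier.py."""
    cmd = [
        sys.executable, "news_similarity_classifier.py",
        f"--train_path={train_path}",
        f"--dev_path={dev_path}",
        f"--test_path={test_path}",
        f"--gold_path={gold_path}",
        f"--result_path={result_path}",
    ]
    # Optional flags are appended only when the caller provides a value.
    if use_sentiment_features is not None:
        cmd.append(f"--use_sentiment_features={use_sentiment_features}")
    if bert_dir is not None:
        cmd.append(f"--bert_dir={bert_dir}")
    return cmd


# Hypothetical file names, for illustration only.
required_only = build_classifier_command(
    "train.csv", "dev.csv", "test.csv", "gold.csv", "results.txt")
with_bert = build_classifier_command(
    "train.csv", "dev.csv", "test.csv", "gold.csv", "results.txt",
    bert_dir="bert_model")

print(" ".join(required_only))
print(" ".join(with_bert))
```

Either list can be passed to `subprocess.run(cmd, check=True)` from inside the fake_news_classifier directory.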
- Navigate to the fake_news_classifier directory
cd fake_news_classifier
- Run the bert_classifier.py file
python bert_classifier.py --train_data=<train dataset csv> --test_data=<test dataset csv> --save_dir=<output directory> --num_epochs=<number of epochs>