Text Classification and Real-World Problems

Code examples and presentation slides from the Machine Learning ID #2 Meetup - Yogyakarta.

Training:

Extract the dataset archive (dataset/bbc-fulltext.zip). The extracted directory should be named bbc.
Run compile.py in the dataset directory from your terminal to compile the text files into a single dataset file.

$ cd dataset

$ python compile.py
Run engine.py.

$ python engine.py
If there are no pickle files, the system will automatically train on the dataset and generate them. This could take some time, depending on your hardware.
If you want to retrain the system, simply delete the pickles directory.
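
The train-or-load behaviour described above can be sketched roughly as follows. This is only an illustration of the caching pattern, not the actual code in engine.py: the function name load_or_train, the train_fn callback, and the pickle file name in the usage comment are all hypothetical.

```python
import os
import pickle


def load_or_train(pickle_path, train_fn):
    """Return the cached object if the pickle exists; otherwise train and cache it.

    `pickle_path` and `train_fn` are hypothetical names used for illustration.
    """
    if os.path.exists(pickle_path):
        # A previous run already trained the model, so reuse it.
        with open(pickle_path, "rb") as f:
            return pickle.load(f)

    # No cache found: train from scratch (this is the slow path).
    model = train_fn()

    # Recreate the cache directory if it is missing (e.g. after deleting it).
    directory = os.path.dirname(pickle_path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(pickle_path, "wb") as f:
        pickle.dump(model, f)
    return model


# Example (hypothetical file name and training function):
# model = load_or_train("pickles/classifier.pickle", train_classifier)
```

With this pattern, deleting the cache directory is enough to force retraining on the next run, which matches the retraining instruction above.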

Classification:

Run the app.py file.

$ python app.py
Using Postman (or a similar tool), send a POST request to http://127.0.0.1:5050/classify with the following form data:

post : your_article
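
If you prefer a script to Postman, the same request can be sketched with Python's standard library. The URL and the post field come from the instructions above; the helper names are hypothetical, and the sketch assumes the server accepts a URL-encoded form body and returns text. The response format is not documented here, so it is returned as-is.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

CLASSIFY_URL = "http://127.0.0.1:5050/classify"


def build_classify_request(article, url=CLASSIFY_URL):
    """Build a form-encoded POST request with the article in the `post` field."""
    body = urlencode({"post": article}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def classify(article, url=CLASSIFY_URL):
    """Send the article to the running app and return the raw response text."""
    with urlopen(build_classify_request(article, url)) as response:
        return response.read().decode("utf-8")


# Example (requires app.py to be running locally):
# print(classify("your_article"))
```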
