Home
Welcome to the hcaptcha-model-factory wiki!
This project is a factory for hCaptcha binary classification models.
If this project is helpful to you, please leave a star!
Image recognition is one of the most common captcha categories and is offered by many captcha services, such as hCaptcha and reCAPTCHA. However, it can easily be solved with deep learning. Collecting and labeling data is the only thing you need to do.
Any image recognition task can, for now, be treated as a binary classification task. You only need to decide between "click" and "not click", or "true" and "false".
This project therefore serves as a pluggable module for hcaptcha-challenger that can be iterated on and updated quickly. When a new challenge appears, training a simple ResNet model for it is enough.
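The click / no-click framing can be sketched in a few lines. This is only an illustration: the function name `select_tiles`, the 0.5 threshold, and the scores are hypothetical and are not taken from hcaptcha-challenger or this project.

```python
from typing import List, Sequence


def select_tiles(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return the indices of tiles whose positive-class score meets the threshold."""
    return [i for i, score in enumerate(scores) if score >= threshold]


# Hypothetical per-image scores for one 3x3 challenge grid,
# e.g. the model's probability that a tile shows the target object.
scores = [0.91, 0.08, 0.47, 0.66, 0.02, 0.99, 0.13, 0.50, 0.31]
print(select_tiles(scores))  # [0, 3, 5, 7]
```

Each tile is judged independently, which is why one small binary model per challenge prompt is sufficient.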
This ResNetMini model is only 295 KB in ONNX format. But I don't know how big the hCaptcha generation model is, haha!
Make AI great again!
hcaptcha-model-factory
├── data
│   └── smiling_dog
│       ├── unlabel (If you use the auto-label tool, place all images here)
│       ├── bad (Place the images that do not contain a smiling dog here)
│       ├── yes (Place the images that contain a smiling dog here)
│       ├── all.yaml (auto generated)
│       ├── train.yaml (auto generated)
│       ├── val.yaml (auto generated)
│       └── test.yaml (auto generated)
├── LICENSE
├── model (After training, your model will be stored here)
│   └── smiling_dog
│       ├── smiling_dog.pth
│       ├── smiling_dog_100.pth
│       └── smiling_dog_200.pth
├── README.md
├── requirements.txt
└── src
- ResNetMini
- size: 295 KB
- params: 75154 trainable parameters
- structure: conv - bn - relu - conv - bn - conv - bn - relu
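As a rough sanity check, the parameter count and the file size listed above are consistent with each other. The arithmetic below assumes the weights are stored as 32-bit floats and that 1 KB means 1024 bytes; neither is stated on this page.

```python
PARAMS = 75154        # trainable parameters, from the list above
BYTES_PER_PARAM = 4   # assumption: float32 weights

weight_bytes = PARAMS * BYTES_PER_PARAM
weight_kib = weight_bytes / 1024

print(weight_bytes)          # 300616
print(round(weight_kib, 1))  # 293.6
```

Under these assumptions the raw weights take about 293.6 KB, so the remaining space in the 295 KB file would be graph structure and metadata.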
Recommended environment: Python 3.8, PyTorch==1.8.2 [Optional: CUDA>=10.2]
System: Windows/Linux/Mac
(It supports every system on which PyTorch can be installed, but I have only tested it on Windows. Please keep that in mind. PRs are welcome!)
Run the following commands.
git clone https://github.com/beiyuouo/hcaptcha-model-factory.git
cd hcaptcha-model-factory
pip install -r requirements.txt
cd src
Use this command to start a new challenge, and follow the prompts.
python main.py new
prompt[en] -> Please click each image containing a smiling dog
2022-09-12 19:17:08 | DEBUG - Diagnose task | task_name=smiling_dog
Use AI to automatically label datasets? {'y', 'n'} --> y
please put all the images in the `unlabel` folder and press any key to continue...
2022-09-12 19:17:55 | INFO - Found 1166 images in hcaptcha-model-factory\data\smiling_dog\unlabel
# After auto-labeling, you need to check and correct the labels.
2022-09-12 19:18:20 | INFO - Embeddings extracted
2022-09-12 19:18:20 | INFO - PCA..., shape of embs: (1166, 512)
2022-09-12 19:18:20 | INFO - PCA done, shape of embs: (1166, 128)
2022-09-12 19:18:20 | DEBUG - Clustering...
2022-09-12 19:18:20 | DEBUG - Clustering done
2022-09-12 19:18:20 | INFO - Saving labels...
2022-09-12 19:18:22 | DEBUG - Labels saved
2022-09-12 19:18:22 | SUCCESS - Auto labeling completed
Start automatic training? {'y', 'n'} --> y
# Your model is generated in the model folder.
P.S. main.py is the name of the scaffold file. After v0.1.x, the entry point will be moved to main.py.
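Because the auto-labeled images still need a manual check, it can help to see how many images ended up in each folder before training. The helper below is a hypothetical sketch and not part of this project; `count_labels` and the list of image extensions are assumptions, and only the folder names `yes`, `bad`, and `unlabel` come from the layout shown earlier.

```python
from pathlib import Path
from typing import Dict

# Assumption: images use one of these common extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_labels(task_dir: Path) -> Dict[str, int]:
    """Count image files in the yes/bad/unlabel folders of one task directory."""
    counts = {}
    for label in ("yes", "bad", "unlabel"):
        folder = task_dir / label
        if folder.is_dir():
            counts[label] = sum(
                1
                for path in folder.iterdir()
                if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
            )
        else:
            # A missing folder simply counts as zero images.
            counts[label] = 0
    return counts


if __name__ == "__main__":
    # Run from hcaptcha-model-factory/src, like the commands on this page.
    print(count_labels(Path("..") / "data" / "smiling_dog"))
```

A heavily skewed yes/bad split, or many files left in `unlabel`, is a hint that the labels deserve another look.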
# hcaptcha-model-factory/src
python main.py trainval --task=[labelName]
You must use quotation marks when the --task argument contains spaces.
python main.py trainval --task dog
python main.py trainval --task=smiling_dog
python main.py trainval --task="cat-shaped cookie"
The " ", "," and "-" characters are automatically replaced with "_".
- "cat-shaped cookie" => "cat_shaped_cookie"
- "cat with large, rounded head" => "cat_with_large_rounded_head"
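The replacement rule can be sketched as follows. This is an illustration that reproduces the two examples above, not the project's actual code; `normalize_task_name` is a hypothetical name, and collapsing a run such as ", " into a single "_" is inferred from the second example.

```python
import re


def normalize_task_name(label: str) -> str:
    """Replace each run of spaces, commas, and hyphens with a single underscore."""
    return re.sub(r"[ ,\-]+", "_", label.strip())


print(normalize_task_name("cat-shaped cookie"))             # cat_shaped_cookie
print(normalize_task_name("cat with large, rounded head"))  # cat_with_large_rounded_head
```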
# hcaptcha-model-factory/src
python main.py train --task=[labelName]
# hcaptcha-model-factory/src
python main.py val --task=[labelName]
# hcaptcha-model-factory/src
python main.py test_onnx --task=[labelName] (optional)--flag=["all" | "train" | "val" | "test"]
Copyright @BJ.YAN