MGTBench-2.0 provides reference implementations of machine-generated text (MGT) detection methods. It is under active development, and more detection methods and analysis tools will be added in the future.
git clone -b release https://github.com/Y-L-LIU/MGTBench-2.0
cd MGTBench-2.0
conda env create -f environment.yml
conda activate mgtbench2
Check out demo.ipynb for a quick start.
from mgtbench import AutoDetector, AutoExperiment
from mgtbench.loading.dataloader import load
model_name_or_path = '/data1/zzy/gpt2-medium'
metric = AutoDetector.from_detector_name('ll',
model_name_or_path=model_name_or_path)
experiment = AutoExperiment.from_experiment_name('threshold',detector=[metric])
data_name = 'AITextDetect'
detectLLM = 'gpt35'
category = 'Art'
data = load(data_name, detectLLM, category)
experiment.load_data(data)
res = experiment.launch()
print('train:', res[0].train)
print('test:', res[0].test)
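To make the quick start less of a black box, here is a minimal sketch of the general idea behind a threshold experiment on a scalar metric such as log-likelihood: choose a cut-off on the training scores, then apply it to the test scores. It is not MGTBench's actual implementation. The function names, the label convention (1 = machine-generated), the toy scores, and the assumption that machine text scores higher are all illustrative.

```python
from typing import List, Sequence, Tuple


def fit_threshold(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Pick the cut-off on `scores` that maximizes training accuracy.

    Assumed convention: label 1 = machine-generated, label 0 = human,
    and a text is predicted machine-generated when score >= threshold.
    """
    if len(scores) != len(labels) or len(scores) == 0:
        raise ValueError("scores and labels must be non-empty and equally long")
    best_threshold, best_accuracy = float("inf"), -1.0
    # Every distinct observed score is a candidate cut-off.
    for candidate in sorted(set(scores)):
        correct = sum(
            int((s >= candidate) == bool(y)) for s, y in zip(scores, labels)
        )
        accuracy = correct / len(scores)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = candidate, accuracy
    return best_threshold, best_accuracy


def predict(scores: Sequence[float], threshold: float) -> List[int]:
    """Apply a fitted threshold to unseen scores."""
    return [int(s >= threshold) for s in scores]


# Made-up average log-likelihood scores, for illustration only.
train_scores = [-3.9, -3.5, -3.2, -2.4, -2.1, -1.8]
train_labels = [0, 0, 0, 1, 1, 1]
threshold, train_acc = fit_threshold(train_scores, train_labels)
test_pred = predict([-3.6, -2.0], threshold)
```

In the real pipeline the scores would come from the detector (here `ll` with a GPT-2 model), and the `train` and `test` entries printed above report the experiment's own metrics.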
Currently, we support the following methods (the list is continuously updated):
- Metric-based methods:
- Model-based methods:
The AITextDetect dataset contains human-written and AI-polished text in different categories, including:
- STEM (Physics, Math, Computer, Biology, Chemistry, Electrical, Medicine, Statistics)
- Social Sciences (Education, Management, Economy and Finance)
- Humanities (Art, History, Literature, Philosophy, Law)
The texts are collected from Wikipedia, arXiv, and Project Gutenberg.
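When iterating over a whole domain, it can help to have the three groups as a lookup table. The sketch below assumes the identifier spellings that the dataset loader uses (for example `Computer_science`, `Electrical_engineering`, and `Economy`); the names `CATEGORY_GROUPS` and `group_of` are hypothetical helpers, not part of MGTBench.

```python
from typing import Dict, List

# Domain -> category identifiers, following the grouping listed above.
# The mapping of short names to identifiers (e.g. "Computer" ->
# "Computer_science") is an assumption.
CATEGORY_GROUPS: Dict[str, List[str]] = {
    "STEM": [
        "Physics", "Math", "Computer_science", "Biology",
        "Chemistry", "Electrical_engineering", "Medicine", "Statistics",
    ],
    "Social Sciences": ["Education", "Management", "Economy"],
    "Humanities": ["Art", "History", "Literature", "Philosophy", "Law"],
}

# Flat, sorted list of every category identifier.
ALL_CATEGORIES: List[str] = sorted(
    category for members in CATEGORY_GROUPS.values() for category in members
)


def group_of(category: str) -> str:
    """Return the domain that a category identifier belongs to."""
    for group, members in CATEGORY_GROUPS.items():
        if category in members:
            return group
    raise KeyError(f"unknown category: {category!r}")
```

For example, `CATEGORY_GROUPS["Humanities"]` gives the five splits to loop over when evaluating a detector on humanities text only.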
To check the dataset:
'''
supported LLMs and detect categories:
categories = ['Physics', 'Medicine', 'Biology', 'Electrical_engineering', 'Computer_science', 'Literature', 'History', 'Education', 'Art', 'Law', 'Management', 'Philosophy', 'Economy', 'Math', 'Statistics', 'Chemistry']
llms = ['Moonshot', 'gpt35', 'Mixtral', 'Llama3']
'Human' for human written data
'''
detectLLM = 'Llama3'
category = 'Math'
from datasets import load_dataset
# ai polished
polish = load_dataset("AITextDetect/AI_Polish_clean",
name=detectLLM,
split=category,
trust_remote_code=True
)
# human written
human = load_dataset("AITextDetect/AI_Polish_clean",
name='Human',
split=category,
trust_remote_code=True
)
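Detection is a binary task, so the two splits usually need to be merged into one labeled collection. The following is a minimal sketch of that step; `build_labeled_pairs` and the 0/1 label convention are hypothetical, and the `"text"` key in the stand-in rows is only an assumption about the row layout.

```python
from typing import Any, Iterable, List, Tuple

HUMAN_LABEL = 0    # assumed convention: 0 = human-written
MACHINE_LABEL = 1  # assumed convention: 1 = AI-polished


def build_labeled_pairs(
    human_rows: Iterable[Any], polished_rows: Iterable[Any]
) -> List[Tuple[Any, int]]:
    """Pair every row with a binary label, human rows first."""
    labeled = [(row, HUMAN_LABEL) for row in human_rows]
    labeled.extend((row, MACHINE_LABEL) for row in polished_rows)
    return labeled


# Stand-in rows so the sketch runs offline; with the real data you would
# pass the `human` and `polish` datasets loaded above instead.
demo = build_labeled_pairs(
    [{"text": "a human-written paragraph"}],
    [{"text": "a polished paragraph"}, {"text": "another polished paragraph"}],
)
```

The function only iterates over its inputs, so it should work with any iterable of rows, including plain lists.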
You can also download the dataset from Hugging Face and examine it locally:
from datasets import load_dataset
# for human data, chemistry category
human_chemistry = load_dataset("path/to/AITextDetect/AI_Polish_clean/Human/Chemistry")
To run the benchmark on the AITextDetect dataset:
# specify the model with local path to your model, or model name on huggingface
# distinguish Human vs. Llama3 using LM-D detector
python benchmark.py --detectLLM Llama3 \
    --method LM-D \
    --model /path/to/distilbert-base-uncased \
    --epochs 1 \
    --batch_size 64 \
    --lr 5e-6
# distinguish Human vs. gpt3.5 using log-likelihood detector
python benchmark.py --detectLLM gpt35 --method ll --model /path/to/gpt2-medium
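To repeat the same detector across every supported LLM, the command line can be generated rather than typed by hand. This is a sketch that uses only the `--detectLLM`, `--method`, and `--model` flags shown above; `build_benchmark_command` is a hypothetical helper, and the model path is a placeholder.

```python
import shlex
from typing import List

# LLM identifiers listed in the dataset section.
LLMS = ["Moonshot", "gpt35", "Mixtral", "Llama3"]


def build_benchmark_command(detect_llm: str, method: str, model: str) -> List[str]:
    """Build the argument list for one benchmark.py run."""
    if detect_llm not in LLMS:
        raise ValueError(f"unsupported LLM: {detect_llm!r}")
    return [
        "python", "benchmark.py",
        "--detectLLM", detect_llm,
        "--method", method,
        "--model", model,
    ]


commands = [
    build_benchmark_command(llm, "ll", "/path/to/gpt2-medium") for llm in LLMS
]
for command in commands:
    # Print the shell-quoted command. To execute it from the repository
    # root instead, call subprocess.run(command, check=True).
    print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems if a model path contains spaces.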
To run model attribution on the AITextDetect dataset:
# distinguish Human, Moonshot, gpt3.5, Mixtral, Llama3 using LM-D detector
# --model_save_dir is the path where the trained models are saved
python attribution_train_all.py \
    --model_save_dir /data1/model_attribution \
    --output_csv attribution_results_new.csv && \
python attribution_eval_all.py \
    --result_csv eval_result.csv
This produces an attribution_results_new.csv file with all results and an eval_result.csv file with the highest F1 score for each category. The figure folder contains the confusion matrix for each category.
Note that you can also specify your own datasets in dataloader.py.
If you find this repo and dataset useful, please consider citing our work:
@inproceedings{he2024mgtbench,
author = {He, Xinlei and Shen, Xinyue and Chen, Zeyuan and Backes, Michael and Zhang, Yang},
title = {{Mgtbench: Benchmarking machine-generated text detection}},
booktitle = {{ACM SIGSAC Conference on Computer and Communications Security (CCS)}},
pages = {},
publisher = {ACM},
year = {2024}
}
@software{liu2024rethinkingMGT,
author = {Liu, Yule and Zhong, Zhiyuan and Liao, Yifan and Leng, Jiaqi and Sun, Zhen and Chen, Yang and Gong, Qingyuan and Zhang, Yang and He, Xinlei},
month = {10},
title = {{MGTBench-2.0: Rethinking the Machine-Generated Text Detection}},
url = {https://github.com/Y-L-LIU/MGTBench-2.0},
version = {2.0.0},
year = {2024}
}