Translation scripts

Python scripts for LLM machine translation

Translation benchmarking for base models

translate_benchmarking.py: Benchmarks the translation capability of base LLMs (e.g., Poro, Viking, Mistral, Llama) using a few-shot prompt. The translation examples in the prompt are randomly selected from the dev sets of FLORES-101 and Tatoeba. The script can also benchmark open-source MT models such as Opus and NLLB.
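To illustrate the idea of a few-shot prompt built from randomly sampled dev-set pairs, here is a minimal sketch. It is not the script's actual code: the function name `build_few_shot_prompt`, the `LANG_NAMES` table, and the "Language: sentence" layout are all assumptions made for illustration, and the layout is not necessarily what `--format_type equals` produces.

```python
import random

# Hypothetical display names; the script's own language table is not shown in this README.
LANG_NAMES = {"eng": "English", "fin": "Finnish"}


def build_few_shot_prompt(src_sentence, dev_pairs, lang_pair, num_examples=8, seed=None):
    """Build a few-shot translation prompt from randomly sampled dev-set pairs.

    dev_pairs is a list of (source, target) sentence tuples.
    The prompt layout is an assumption, not the script's real format.
    """
    src_lang, trg_lang = lang_pair.split("-")
    src_name = LANG_NAMES.get(src_lang, src_lang)
    trg_name = LANG_NAMES.get(trg_lang, trg_lang)

    # Sample without replacement; never ask for more shots than are available.
    rng = random.Random(seed)
    shots = rng.sample(dev_pairs, k=min(num_examples, len(dev_pairs)))

    lines = []
    for src, trg in shots:
        lines.append(f"{src_name}: {src}")
        lines.append(f"{trg_name}: {trg}")
        lines.append("")  # blank line between examples

    # End with the sentence to translate and an open slot for the model to complete.
    lines.append(f"{src_name}: {src_sentence}")
    lines.append(f"{trg_name}:")
    return "\n".join(lines)


if __name__ == "__main__":
    pairs = [("Hello.", "Hei."), ("Thank you.", "Kiitos."), ("Good morning.", "Hyvää huomenta.")]
    print(build_few_shot_prompt("How are you?", pairs, "eng-fin", num_examples=2, seed=0))
```

A base model is expected to continue the text after the final target-language label, so the prompt deliberately ends with an open slot.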

Example usage

Translating the FLORES-101 devtest sentences from English to Finnish with Viking-7B using an 8-shot prompt:

    python translate_benchmarking.py \
                --model LumiOpen/Viking-7B \
                --src_file /scratch/project_462000444/finetuning_data/FLORES-101/eng-devtest.txt \
                --trg_file /scratch/project_462000444/finetuning_data/FLORES-101/fin-devtest.txt \
                --output_file /scratch/project_462000444/translation_evals/FLORES-101/viking-7b-eng-fin.jsonl \
                --lang_pair eng-fin \
                --test_data flores-101 \
                --format_type equals \
                --num_examples 8

Translation benchmarking for chat models

translate_benchmarking_chat.py: Benchmarks the translation capability of chat-tuned models (e.g., Poro-34B-chat, Llama-3.1-8B-Instruct). The chat model can be evaluated with zero-shot prompting (no in-context examples) or with few-shot prompting. The script automatically applies the chat template specified in the model's tokenizer config (tokenizer_config.json).
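The difference between zero-shot and few-shot prompting for a chat model can be sketched as a list of chat messages. This is an illustrative sketch only: `build_chat_messages`, the `LANG_NAMES` table, and the instruction wording are assumptions, not taken from the script.

```python
# Hypothetical display names; the script's own language table is not shown in this README.
LANG_NAMES = {"eng": "English", "fin": "Finnish"}


def build_chat_messages(src_sentence, src_lang, trg_lang, examples=()):
    """Build chat messages for a translation request.

    With no examples this is a zero-shot request; each (source, target)
    pair in `examples` adds one earlier user/assistant turn (few-shot).
    The instruction wording is an assumption made for illustration.
    """
    src_name = LANG_NAMES.get(src_lang, src_lang)
    trg_name = LANG_NAMES.get(trg_lang, trg_lang)
    instruction = f"Translate the following text from {src_name} into {trg_name}."

    messages = []
    for ex_src, ex_trg in examples:
        messages.append({"role": "user", "content": f"{instruction}\n{ex_src}"})
        messages.append({"role": "assistant", "content": ex_trg})

    # The final user turn carries the sentence the model should translate.
    messages.append({"role": "user", "content": f"{instruction}\n{src_sentence}"})
    return messages


if __name__ == "__main__":
    for message in build_chat_messages("Hello.", "eng", "fin", examples=[("Thanks.", "Kiitos.")]):
        print(message)
```

With a Hugging Face tokenizer loaded for the model, a message list like this can be rendered into the model's prompt format with `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`, which reads the template stored in tokenizer_config.json.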

Example usage

Translating the first 100 sentences of the FLORES-101 devtest set from English to Finnish with Poro-34B-chat using zero-shot prompting:

    python translate_benchmarking_chat.py \
                --model LumiOpen/Poro-34B-chat \
                --src_file /scratch/project_462000444/finetuning_data/FLORES-101/eng-devtest.txt \
                --trg_file /scratch/project_462000444/finetuning_data/FLORES-101/fin-devtest.txt \
                --src_lang eng \
                --trg_lang fin \
                --max_samples 100 \
                --outfile /scratch/project_462000444/zosaelai2/translation_evals/poro-33b-chat-eng-fin.txt

Translating datasets

translate_datasets.py: Translates SFT datasets from English into a target language using Poro or Viking. The dataset is assumed to be in the HF SFTTrainer conversational format:

{"messages": [{"role": "system", "content": "You are helpful"}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "..."}]}
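As a rough sketch of how records in this format could be processed, the code below translates the content of every message while leaving the roles and the record structure untouched. The names `translate_record` and `translate_jsonl_lines` are hypothetical, the translation step is a placeholder callable, and translating every role (including system messages) is an assumption about the script's behavior.

```python
import json


def translate_record(record, translate_fn):
    """Return a copy of a conversational record with each message's content translated.

    translate_fn is any callable mapping a source string to its translation;
    in practice it would wrap a call to the translation model.
    """
    return {
        **record,
        "messages": [
            {**msg, "content": translate_fn(msg["content"])}
            for msg in record["messages"]
        ],
    }


def translate_jsonl_lines(lines, translate_fn):
    """Translate an iterable of JSONL lines, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        translated = translate_record(json.loads(line), translate_fn)
        # ensure_ascii=False keeps characters such as å, ä, ö readable in the output.
        yield json.dumps(translated, ensure_ascii=False)


if __name__ == "__main__":
    sample = '{"messages": [{"role": "user", "content": "What is the capital of France?"}]}'
    # str.upper stands in for a real translation function.
    for out_line in translate_jsonl_lines([sample], str.upper):
        print(out_line)
```

Keeping the translation step as a plain callable makes the record handling easy to test without loading a model.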

Example usage

Translating our instruction-collection dataset from English to Swedish with Viking-33B:

    python translate_datasets.py \
            --model LumiOpen/Viking-33B \
            --filepath /scratch/project_462000444/finetuning_data/SFTTrainer_format/eng/instruction-collection/train.jsonl \
            --output_file /scratch/project_462000444/finetuning_data/SFTTrainer_format/swe/instruction-collection-viking/train.jsonl \
            --trg_lang swe
