
Can't find the file "apply_delta.py" #55

Closed
Summer-seu opened this issue Jul 12, 2023 · 3 comments

@Summer-seu

Can't find the file "apply_delta.py" mentioned below:

For 7B: Download vicuna-7b-delta-v0 and process it:
python3 apply_delta.py
--base /path/to/model_weights/llama-7b
--target vicuna-7b-v0
--delta lmsys/vicuna-7b-delta-v0

@yinanhe
Member

yinanhe commented Jul 12, 2023

You can get it from https://huggingface.co/CarperAI/stable-vicuna-13b-delta/raw/main/apply_delta.py

@ceyxasm

ceyxasm commented Jul 16, 2023

"""
Usage:
python3 apply_delta.py --base-model-path /path/to/model_weights/llama-13b --target-model-path stable-vicuna-13b --delta-path pvduy/stable-vicuna-13b-delta
"""
import argparse

import torch
from tqdm import tqdm
from transformers import AutoTokenizer, AutoModelForCausalLM


def apply_delta(base_model_path, target_model_path, delta_path):
    print("Loading base model")
    base = AutoModelForCausalLM.from_pretrained(
        base_model_path, torch_dtype=torch.float16, low_cpu_mem_usage=True)

    print("Loading delta")
    delta = AutoModelForCausalLM.from_pretrained(delta_path, torch_dtype=torch.float16, low_cpu_mem_usage=True)
    delta_tokenizer = AutoTokenizer.from_pretrained(delta_path)

    DEFAULT_PAD_TOKEN = "[PAD]"
    base_tokenizer = AutoTokenizer.from_pretrained(base_model_path, use_fast=False)
    # Add a pad token so the base vocabulary matches the delta's expanded vocabulary.
    num_new_tokens = base_tokenizer.add_special_tokens(dict(pad_token=DEFAULT_PAD_TOKEN))

    base.resize_token_embeddings(len(base_tokenizer))
    input_embeddings = base.get_input_embeddings().weight.data
    output_embeddings = base.get_output_embeddings().weight.data
    input_embeddings[-num_new_tokens:] = 0
    output_embeddings[-num_new_tokens:] = 0

    print("Applying delta")
    for name, param in tqdm(base.state_dict().items(), desc="Applying delta"):
        assert name in delta.state_dict()
        param.data += delta.state_dict()[name]

    print("Saving target model")
    base.save_pretrained(target_model_path)
    delta_tokenizer.save_pretrained(target_model_path)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--base-model-path", type=str, required=True)
    parser.add_argument("--target-model-path", type=str, required=True)
    parser.add_argument("--delta-path", type=str, required=True)

    args = parser.parse_args()

    apply_delta(args.base_model_path, args.target_model_path, args.delta_path)

Make sure that tokenizer_class in tokenizer_config.json is LlamaTokenizer, not
LLaMATokenizer or anything else.
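The tokenizer_class check above can be automated. Below is a minimal sketch (a hypothetical helper, not part of the script above) that patches tokenizer_config.json in place if the class name is wrong:

```python
import json


def fix_tokenizer_class(config_path):
    """Normalize tokenizer_class in a tokenizer_config.json to LlamaTokenizer."""
    with open(config_path) as f:
        cfg = json.load(f)
    # Older exports use "LLaMATokenizer", which transformers no longer recognizes.
    if cfg.get("tokenizer_class") != "LlamaTokenizer":
        cfg["tokenizer_class"] = "LlamaTokenizer"
        with open(config_path, "w") as f:
            json.dump(cfg, f, indent=2)
    return cfg["tokenizer_class"]
```

Run it against the target model directory after save_pretrained, e.g. fix_tokenizer_class("stable-vicuna-13b/tokenizer_config.json").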

@yinanhe
Member

yinanhe commented Jan 24, 2024

This issue has been inactive for a long time, so it has been temporarily closed. If you still have any problems, please feel free to reopen it.

@yinanhe yinanhe closed this as completed Jan 24, 2024
@yinanhe yinanhe pinned this issue Jan 24, 2024