Could you give an example of the fine-tuning (DPO) data format (JSON)? #224

Open
whk6688 opened this issue Dec 25, 2024 · 4 comments

Comments

@whk6688

whk6688 commented Dec 25, 2024

It would be helpful to have an example of the data format for fine-tuning and DPO, for instance an uploaded JSON file. Thanks.

@CSJianYang
Collaborator

For SFT:

{
     "messages":[
         {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
         {"role": "user", "content": "Write a regex expression to match any letter of the alphabet"},
         {"role": "assistant", "content": "The regex expression to match any letter of the alphabet (either in uppercase or lowercase) is: \n\n```regex\n[a-zA-Z]\n```"},
         {"role": "user", "content": "How about if I only want to match uppercase letters? Can you modify the regex expression for that?"},
         {"role": "assistant", "content": "Sure, the regex expression to match any uppercase letter of the alphabet is:\n\n```regex\n[A-Z]\n```"}
    ],
    "format": "chatml"
}
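As a rough sketch of how a sample in this shape could be assembled programmatically: the helper name `build_sft_sample` and its `turns` argument are hypothetical and not part of the repository; only the `messages`/`format` layout and the system prompt come from the example above.

```python
import json

# System prompt copied from the example above.
SYSTEM_PROMPT = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."


def build_sft_sample(turns, system=SYSTEM_PROMPT):
    """Build one SFT sample from (user_text, assistant_text) pairs.

    Hypothetical helper for illustration; not an API of the repository.
    """
    messages = [{"role": "system", "content": system}]
    for user_text, assistant_text in turns:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    return {"messages": messages, "format": "chatml"}


sample = build_sft_sample([
    ("Write a regex expression to match any letter of the alphabet", "[a-zA-Z]"),
    ("How about only uppercase letters?", "[A-Z]"),
])
# Serialize to a single line, ready to be written to a .jsonl file.
line = json.dumps(sample, ensure_ascii=False)
```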

For DPO:

{
     "prompt": "Prompt",
     "chosen": "The chosen response",
     "rejected": "The rejected response"
}
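A quick sanity check for records of this shape might look like the sketch below. The helper `validate_dpo_record` is hypothetical, and the check that `chosen` and `rejected` differ is an added assumption, not a requirement stated in this thread.

```python
import json

# Field names taken from the DPO example above.
REQUIRED_KEYS = ("prompt", "chosen", "rejected")


def validate_dpo_record(record):
    """Return a list of problems found in one preference record (empty if none).

    Hypothetical helper for illustration; not an API of the repository.
    """
    problems = []
    for key in REQUIRED_KEYS:
        if key not in record:
            problems.append(f"missing key: {key}")
        elif not isinstance(record[key], str):
            problems.append(f"{key} must be a string")
    # Assumption: a pair with identical responses carries no preference signal.
    if not problems and record["chosen"] == record["rejected"]:
        problems.append("chosen and rejected are identical")
    return problems


record = json.loads(
    '{"prompt": "Prompt", "chosen": "The chosen response", "rejected": "The rejected response"}'
)
issues = validate_dpo_record(record)
```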

@whk6688
Author

whk6688 commented Dec 25, 2024

If there are multiple samples, is the format as follows?

{
    "messages":[
        {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
        {"role": "user", "content": "Write a regex expression to match any letter of the alphabet"},
        {"role": "assistant", "content": "The regex expression to match any letter of the alphabet (either in uppercase or lowercase) is: \n\nregex\n[a-zA-Z]\n"}
    ],
    "format": "chatml"
},
{
    "messages":[
        {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
        {"role": "user", "content": "Write a regex expression to match any letter of the alphabet"},
        {"role": "assistant", "content": "The regex expression to match any letter of the alphabet (either in uppercase or lowercase) is: \n\nregex\n[a-zA-Z]\n"}
    ],
    "format": "chatml"
}

@CSJianYang
Collaborator

Yes. Each line contains one sample, with no commas between samples (.jsonl):

{"messages":[xxx], "format": "chatml"}
{"messages":[xxx], "format": "chatml"}
{"messages":[xxx], "format": "chatml"}
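For illustration, the one-sample-per-line layout can be written and read back with Python's standard library roughly as follows. The helper names `write_jsonl` and `read_jsonl` and the file name `train.jsonl` are hypothetical, not part of the repository's tooling.

```python
import json
import os
import tempfile


def write_jsonl(path, samples):
    """Write each sample as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for sample in samples:
            # json.dumps escapes newlines inside strings, so each sample stays on one line.
            f.write(json.dumps(sample, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a .jsonl file back into a list of dicts, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


samples = [
    {"messages": [{"role": "user", "content": "line one\nline two"}], "format": "chatml"},
    {"messages": [{"role": "user", "content": "Hi"}], "format": "chatml"},
]

# Hypothetical file name, placed in a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), "train.jsonl")
write_jsonl(path, samples)
loaded = read_jsonl(path)
```

Serializing with `json.dumps` rather than pretty-printing keeps multi-line message content from breaking the one-line-per-sample rule.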

@Sherww

Sherww commented Dec 28, 2024

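@CSJianYang Hello, is the data for ORPO fine-tuning in the following format?
[
{
"instruction": "human instruction (required)",
"input": "human input (optional)",
"chosen": "preferred response (required)",
"rejected": "dispreferred response (required)"
}
]
If so, do you have any suggestions for what I should put in the instruction field? I am fine-tuning on top of Qwen2.5-Coder, and my data is line-level code completion data. My current instruction is something like "Please complete the following code for me". However, even after hyperparameter tuning, the fine-tuned model has not improved, so I am looking for possible causes. Could the choice of instruction be one of them?
Looking forward to your reply. Thank you.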
