Questions about the dataset and the Plan and Tool metrics #43

Open
LiuJinzhe-Keepgoing opened this issue Mar 8, 2024 · 1 comment

Comments

@LiuJinzhe-Keepgoing

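Hello, and thank you for your excellent work! I think it is very meaningful. However, I have a few points of confusion.
As I understand it, for the Kagent framework, the dataset entries with "type": "plantooluse" are meant to test the model's multi-step planning ability. However: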

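  1. The JSON data in golden_result_list has no notion of a step, so it appears to carry no execution order. How can it serve as the label for multi-step reasoning? When the model runs several planning rounds, how is an entry from golden_result_list chosen as the reference label?
  2. Do the Planning and Tool-use evaluation metrics assess only a single planning round? If so, can they reflect multi-step reasoning ability?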

Thank you, and I look forward to your reply!

@Mars-1990

@LiuJinzhe-Keepgoing Hi, I had the same question when reading this work. Have you resolved it?
