[BUG]: Llama3.1-8B to fix #114

Open
Monstertail opened this issue Mar 5, 2025 · 0 comments
Assignees
Labels
bug Something isn't working priority-0 priority of the issue

Comments

@Monstertail
Collaborator

Description of the bug

Two errors:

  1. The RoPE (rotary position embedding) implementation is incorrect; see the reference linked here for the fix.
  2. The request pool is too small. For speculative decoding with Llama3.1-8B, when the tree size is larger than 128, the run fails with the following error:

    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/jinweiy/deft/project_deft/DeFT/deft/tree_decoding/branch_controller.py", line 23, in apply_branching
    return self.branching_function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/jinweiy/deft/project_deft/DeFT/deft/tree_decoding/generation/branch_func_example.py", line 403, in example_branch_Func4_SpeculativeDecoding
    model.tree.branch(model.tree.root, token_tree_size)
  File "/data/jinweiy/deft/project_deft/DeFT/deft/tree_decoding/tree_cache.py", line 361, in branch
    assert new_req is not None
           ^^^^^^^^^^^^^^^^^^^
AssertionError
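
The bare `assert new_req is not None` suggests that the allocator returns `None` once the pool has no free slots. The sketch below is a toy model of that failure mode, not DeFT's actual code: `ToyRequestPool`, `alloc`, and this `branch` helper are hypothetical names, and the assumption that the pool is a fixed-size free list is inferred only from the traceback. It shows one way to fail with a descriptive message instead of a bare assertion.

```python
class ToyRequestPool:
    """Hypothetical fixed-size slot pool (assumption, not DeFT's implementation)."""

    def __init__(self, size):
        self.size = size
        self.free_slots = list(range(size))

    def alloc(self):
        # Return a free slot index, or None when the pool is exhausted.
        return self.free_slots.pop() if self.free_slots else None


def branch(pool, num_children):
    """Allocate one slot per child, failing early with a clear message."""
    available = len(pool.free_slots)
    if num_children > available:
        # Check capacity up front so no slots are leaked on failure.
        raise RuntimeError(
            f"request pool too small: need {num_children} slots, "
            f"only {available} of {pool.size} free"
        )
    return [pool.alloc() for _ in range(num_children)]
```

With a check like this, a tree that asks for more slots than the pool holds reports the required and available counts, which makes it easier to choose a larger pool size.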

Steps To Reproduce

Run speculative decoding with Llama3.1-8B and set the tree size to 128 or larger.
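
The issue does not say what is wrong with the RoPE. One thing worth checking when reproducing with Llama3.1-8B is whether the rotary frequencies are rescaled: the released Llama 3.1 config specifies `rope_type: "llama3"` scaling, which plain RoPE code omits. The sketch below computes the rescaled inverse frequencies in pure Python. The function name is hypothetical, the default values are the ones published in the Llama 3.1 8B config, and whether this is the actual cause of the first error is an assumption.

```python
import math


def llama3_scaled_inv_freq(
    head_dim=128,
    base=500000.0,           # rope_theta
    factor=8.0,              # rope_scaling.factor
    low_freq_factor=1.0,
    high_freq_factor=4.0,
    original_max_pos=8192,   # original_max_position_embeddings
):
    """Return the per-pair inverse frequencies with Llama 3.1 style scaling."""
    low_freq_wavelen = original_max_pos / low_freq_factor
    high_freq_wavelen = original_max_pos / high_freq_factor
    scaled = []
    for i in range(0, head_dim, 2):
        inv_freq = 1.0 / (base ** (i / head_dim))
        wavelen = 2 * math.pi / inv_freq
        if wavelen < high_freq_wavelen:
            # High-frequency band: left unchanged.
            scaled.append(inv_freq)
        elif wavelen > low_freq_wavelen:
            # Low-frequency band: divided by the scaling factor.
            scaled.append(inv_freq / factor)
        else:
            # Medium band: interpolate smoothly between the two regimes.
            smooth = (original_max_pos / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor
            )
            scaled.append((1 - smooth) * inv_freq / factor + smooth * inv_freq)
    return scaled
```

Comparing these values against the frequencies the model code actually uses is a quick way to confirm or rule out missing scaling.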

Additional Information

No response

@Monstertail Monstertail added the bug Something isn't working label Mar 5, 2025
@Monstertail Monstertail self-assigned this Mar 5, 2025
@Monstertail Monstertail added the priority-0 priority of the issue label Mar 5, 2025