Logging and Sampling outlines.processors
#35
Conversation
Force-pushed from c07de55 to c3e8673.
The idea is interesting. Do we really want to increase the surface area of the library? There's a risk of spreading ourselves too thinly by adding extra code to maintain.
Force-pushed from 4d0b9bd to d49fd23.
Force-pushed from d49fd23 to 31a485a.
Force-pushed from 31a485a to 6e69272.
A few comments here and there. Was able to get this to work with a modified example:

import outlines
import outlines.processors as processors

model = outlines.models.transformers(
    "openaccess-ai-collective/tiny-mistral",
)

# Create a chained logits processor
logits_processor = (
    processors.sequence_logging(model.tokenizer) |  # Log the generated sequence
    processors.logits_logging(model.tokenizer) |    # Log the raw logits
    processors.regex(r"[0-9]*", model.tokenizer) |  # Restrict the logits to match the pattern
    processors.temperature(0.5) |                   # Set temperature to 0.5
    processors.logits_logging(model.tokenizer)      # Log the restricted, temperature-augmented, sampled logits
)

generator = outlines.generate.base(model, logits_processor)
generator("What is your favorite number? ")
We'll also need to export the base generator method, i.e. __init__.py should include this line:
from .api import SequenceGenerator
from .base import base
from .cfg import cfg
from .choice import choice
from .format import format
from .fsm import fsm
from .json import json
from .regex import regex
from .text import text
    processors.logits_logging(model.tokenizer)  # Log the restricted, temperature-augmented, sampled logits
)

generator = outlines.generate.base(model, logits_process)
Suggested change:
- generator = outlines.generate.base(model, logits_process)
+ generator = outlines.generate.text(model, logits_processor)
- should be logits_processor
- Is base defined here? I haven't been able to find it (yet)
model = outlines.models.llamacpp(
    repo_id="M4-ai/TinyMistral-248M-v2-Instruct-GGUF",
    filename="TinyMistral-248M-v2-Instruct.Q4_K_M.gguf"
)
This doesn't seem to work with llamacpp, but it does work with transformers:

import outlines
import outlines.processors as processors

model = outlines.models.transformers(
    "openaccess-ai-collective/tiny-mistral",
)

# Create a chained logits processor
logits_processor = (
    processors.sequence_logging(model.tokenizer) |  # Log the generated sequence
    processors.logits_logging(model.tokenizer) |    # Log the raw logits
    processors.regex(r"[0-9]*", model.tokenizer) |  # Restrict the logits to match the pattern
    processors.temperature(0.5) |                   # Set temperature to 0.5
    processors.logits_logging(model.tokenizer)      # Log the restricted, temperature-augmented, sampled logits
)

generator = outlines.generate.base(model, logits_processor)
generator("What is your favorite number? ")
The error was
Traceback (most recent call last):
File "/home/cameron/dottxt/outlines/demo-logging.py", line 16, in <module>
generator("What is your favorite number? ")
File "/home/cameron/dottxt/outlines/outlines/generate/api.py", line 503, in __call__
completions = self.model.generate(
File "/home/cameron/dottxt/outlines/outlines/models/llamacpp.py", line 288, in generate
completion = self.model(prompts, **llama_cpp_params)
File "/home/cameron/dottxt/outlines/.venv/lib/python3.10/site-packages/llama_cpp/llama.py", line 1799, in __call__
return self.create_completion(
File "/home/cameron/dottxt/outlines/.venv/lib/python3.10/site-packages/llama_cpp/llama.py", line 1732, in create_completion
completion: Completion = next(completion_or_chunks) # type: ignore
File "/home/cameron/dottxt/outlines/.venv/lib/python3.10/site-packages/llama_cpp/llama.py", line 1216, in _create_completion
for token in self.generate(
File "/home/cameron/dottxt/outlines/.venv/lib/python3.10/site-packages/llama_cpp/llama.py", line 810, in generate
token = self.sample(
File "/home/cameron/dottxt/outlines/.venv/lib/python3.10/site-packages/llama_cpp/llama.py", line 704, in sample
else logits_processor(self._input_ids[: idx + 1], logits)
File "/home/cameron/dottxt/outlines/.venv/lib/python3.10/site-packages/llama_cpp/llama.py", line 2250, in __call__
scores = processor(input_ids, scores)
File "/home/cameron/dottxt/outlines/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/cameron/dottxt/outlines/outlines/processors/base_logits_processor.py", line 82, in __call__
processed_logits = self.process_logits(
File "/home/cameron/dottxt/outlines/outlines/processors/base_logits_processor.py", line 168, in process_logits
result = processor.process_logits(input_ids, result)
File "/home/cameron/dottxt/outlines/outlines/processors/logging.py", line 63, in process_logits
self.logger.info(self.tokenizer.decode(input_ids))
File "/home/cameron/dottxt/outlines/outlines/models/llamacpp.py", line 56, in decode
decoded_bytes = self.tokenizer.detokenize(token_ids)
File "/home/cameron/dottxt/outlines/.venv/lib/python3.10/site-packages/llama_cpp/llama_tokenizer.py", line 52, in detokenize
return self._model.detokenize(tokens)
File "/home/cameron/dottxt/outlines/.venv/lib/python3.10/site-packages/llama_cpp/_internals.py", line 224, in detokenize
self.model, llama_cpp.llama_token(token), buffer, size, 0, special
TypeError: 'list' object cannot be interpreted as an integer
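For what it's worth, the TypeError suggests llama.cpp's detokenize() is being handed a nested list of token ids (one list per sequence in the batch) rather than a flat list of ints. A rough, untested sketch of a decode-before-logging helper that avoids this; the function name and the assumption that tokenizer.decode accepts a flat list of ints are mine, not the PR's actual code:

def log_decoded_prompt(tokenizer, input_ids, logger):
    # input_ids arrives batched (one sequence of token ids per row), but
    # llama.cpp's detokenize() expects a flat list of ints, so decode each
    # sequence separately instead of passing the nested structure through.
    for sequence in input_ids:
        logger.info(tokenizer.decode([int(token) for token in sequence]))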
    self.logger = logger
else:
    self.logger = logging.getLogger("logits_logger")
    self.logger.setLevel(logging.info)
I believe this should be:
- self.logger.setLevel(logging.info)
+ self.logger.setLevel(logging.INFO)
    self.logger = logger
else:
    self.logger = logging.getLogger("sequence_logger")
    self.logger.setLevel(logging.info)
Suggested change:
- self.logger.setLevel(logging.info)
+ self.logger.setLevel(logging.INFO)
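(In both spots the distinction is that logging.info is the module-level convenience function, while logging.INFO is the integer level constant that setLevel() expects; passing the function raises a TypeError. A quick standalone illustration:)

import logging

logger = logging.getLogger("logits_logger")
logger.setLevel(logging.INFO)     # logging.INFO is the level constant (the int 20)
logger.info("logger configured")  # logger.info / logging.info are the emit functions

# logger.setLevel(logging.info)   # would raise TypeError (level must be an int or str)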
Oh, also, I think the transformers output is the token indices, maybe.
Thanks for the review @cpfiffer! It's a good refresher since I haven't looked at this in a while. I'll see if I have some free time this weekend and can get this ready for review. And indeed, it is token indices. Decoded tokens would be better.