Describe the bug
I am running two operations from the Hugging Face tutorial "Fine-tune a language model". I have to define everything inside the mapped functions, including some library imports, because a mapped function does not recognize anything defined outside it while the dataset elements are being mapped:
https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb#scrollTo=iaAJy5Hu3l_B
`lm_datasets = tokenized_datasets.map(
    group_texts,
    batched=True,
    batch_size=batch_size,
    num_proc=4,
)

def tokenize_function(examples):
    model_checkpoint = 'gpt2'
    from transformers import AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_checkpoint, use_fast=True)
    return tokenizer(examples["text"])`
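For reference, here is a minimal sketch of what a fully self-contained `group_texts` might look like, so that it depends on no names defined outside the function. It is modeled on the grouping step in the tutorial notebook, but the `block_size` parameter, its default of 128, and the toy `batch` are assumptions made for illustration, not the notebook's exact code.

```python
from itertools import chain


def group_texts(examples, block_size=128):
    """Concatenate tokenized sequences and split them into fixed-size blocks.

    block_size is a parameter (assumed default: 128) rather than a global,
    so the function does not rely on anything defined outside it.
    """
    # Flatten each column (a list of token lists) into one long list.
    concatenated = {k: list(chain.from_iterable(v)) for k, v in examples.items()}
    # Drop the remainder so every block has exactly block_size tokens.
    total_length = len(concatenated["input_ids"])
    total_length = (total_length // block_size) * block_size
    result = {
        k: [t[i:i + block_size] for i in range(0, total_length, block_size)]
        for k, t in concatenated.items()
    }
    # For causal language modeling, the labels are a copy of the inputs.
    result["labels"] = [block.copy() for block in result["input_ids"]]
    return result


# Toy batch standing in for a tokenized dataset slice (hypothetical data).
batch = {
    "input_ids": [[1, 2, 3], [4, 5], [6, 7, 8, 9]],
    "attention_mask": [[1, 1, 1], [1, 1], [1, 1, 1, 1]],
}
grouped = group_texts(batch, block_size=4)
```

With the `datasets` library, a function like this could presumably be passed to `map` together with `fn_kwargs={"block_size": ...}`, so the value travels with the call instead of living in a global.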
Steps to reproduce the bug
Run the code above, handling all the imports inside the mapped functions.
Expected behavior
The code should work as expected in the notebook, but currently it does not.
https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb#scrollTo=iaAJy5Hu3l_B
Environment info
print(transformers.__version__)
4.46.1