Describe the bug
I wanted to use my own script to handle the processing, so I followed the tutorial documentation and wrote my own MyDatasetConfig and MyDatasetBuilder classes (the builder implements the _info, _split_generators and _generate_examples methods). With simple test data the script runs and outputs the processed results, but when I tried to do more complex processing I found that I could not debug it: breakpoints are never hit, even with the simple samples. No errors are reported, and the messages I print from _info, _split_generators and _generate_examples do appear, but execution never stops at the breakpoints.
Steps to reproduce the bug
my_dataset.py
import json
import datasets


class MyDatasetConfig(datasets.BuilderConfig):
    def __init__(self, **kwargs):
        super(MyDatasetConfig, self).__init__(**kwargs)


class MyDataset(datasets.GeneratorBasedBuilder):
    VERSION = datasets.Version("1.0.0")
#main.py
import os
os.environ["TRANSFORMERS_NO_MULTIPROCESSING"] = "1"
from datasets import load_dataset
dataset = load_dataset("my_dataset.py", split="train", cache_dir=None)
print(dataset[:5])
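One possible explanation, stated here as an assumption and not something confirmed in this report, is that the loader imports a cached copy of my_dataset.py, so breakpoints set in the original file never match the file that actually runs. The sketch below does not use datasets at all. It only illustrates that mechanism with the standard library, and `load_copy_of_script` is a hypothetical helper written for this example.

```python
import importlib.util
import shutil
import tempfile
from pathlib import Path


def load_copy_of_script(script_path: Path, cache_dir: Path):
    """Copy a script into cache_dir and import the copy (hypothetical helper).

    This mimics a loader that executes a cached copy, not the original file.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached_path = cache_dir / script_path.name
    shutil.copy(script_path, cached_path)
    spec = importlib.util.spec_from_file_location("cached_script", cached_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the dataset script: it records which file is executing.
    original = Path(tmp) / "my_dataset.py"
    original.write_text("EXECUTING_FILE = __file__\n")

    module = load_copy_of_script(original, Path(tmp) / "cache")
    executing_file = Path(module.EXECUTING_FILE)
    same_file = executing_file == original

    print("original :", original)
    print("executing:", executing_file)
    print("same file:", same_file)
```

If a similar check inside the real script (for example, printing `__file__` from _generate_examples) shows a path other than the one open in the IDE, setting the breakpoint in that file, or calling the built-in `breakpoint()` inside the method, may be worth trying.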
Expected behavior
Execution should pause at the breakpoints when the script is run under the debugger.
Environment info
PyCharm