I observed a possible memory leak (~1 GB/h) related to MultiprocessFileCache during training.
I defined the dataset class with the cache as in the tutorial:
```python
from typing import Any

from pfio.cache import MultiprocessFileCache

# File and TrainData are defined elsewhere in my code.


class CachedDataset:
    def __init__(self, common_config) -> None:
        self._reader_dict = {
            dataset.name: File(dataset.name, mode="a")
            for dataset in common_config.datasets
        }
        # __len__ is defined elsewhere in the class; omitted here for brevity.
        self._cache = MultiprocessFileCache(len(self), do_pickle=True)

    def _load_from_disk(self, i: int) -> TrainData:
        return ...

    def __getitem__(self, i: int) -> Any:
        return self._cache.get_and_cache(i, self._load_from_disk)
```
and used this CachedDataset as the dataset for training, as shown below:
```python
import torch
from torch.utils.data import DataLoader

train_set, val_set = torch.utils.data.random_split(
    dataset,
    [int(len(dataset) * train_set_ratio), len(dataset) - int(len(dataset) * train_set_ratio)],
)
train_loader = DataLoader(
    train_set, batch_size=train_args.batch_size, shuffle=True, collate_fn=collate_fn
)
```
The leak went away when I stopped using MultiprocessFileCache (roughly the variant sketched below).
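For reference, this is a minimal sketch of the cache-free variant I tested; the class name `UncachedDataset` is just for illustration, and `File`, `TrainData`, and `common_config` are the same as in the snippet above:

```python
# Sketch of the variant without MultiprocessFileCache (no leak observed).
class UncachedDataset:
    def __init__(self, common_config) -> None:
        self._reader_dict = {
            dataset.name: File(dataset.name, mode="a")
            for dataset in common_config.datasets
        }

    def _load_from_disk(self, i: int) -> TrainData:
        return ...

    def __getitem__(self, i: int) -> Any:
        # Every access reads from disk directly instead of going through the cache.
        return self._load_from_disk(i)
```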
It might be due to incorrect usage of MultiprocessFileCache on my part, but do you have any idea what could cause this leak?