[Proposal] Add a mechanism for generating a dataset not based on a Gymnasium etc source but from a plain time series data #276

Open
jamartinh opened this issue Feb 20, 2025 · 3 comments

Comments

@jamartinh
Contributor

Proposal

Add a mechanism for generating a dataset from plain time series data rather than from a Gymnasium (or similar) source.

Motivation

In many scenarios, the data for offline RL comes from a plain dataset, e.g. pandas, NumPy, etc. This makes it tedious or difficult to create a Minari dataset from it. So how about providing a clear interface for writing a Minari dataset from time series data stored in pandas, NumPy, or another plain dataset format, without requiring a Gymnasium environment?

Alternatives

  • Manually write an HDF5 or Parquet file in the Minari data format, which requires knowledge of Minari's internals.
  • Create a fake environment that iterates over the dataset to generate the Minari dataset.
@jamartinh jamartinh changed the title [Proposal] Add a mechanism for generating a dataset not based on a Gymnasium etc source but from a plan time series data [Proposal] Add a mechanism for generating a dataset not based on a Gymnasium etc source but from a plain time series data Feb 20, 2025
@younik
Member

younik commented Feb 20, 2025

Hi @jamartinh, we aim to support this indeed.

We should already support it with EpisodeBuffer and minari.create_dataset_from_buffers.
The only remaining dependency on Gymnasium in that case is in defining the action and observation spaces using Gymnasium. Is this a limitation for your use case?
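A minimal sketch of that route, assuming a flat time series in which each row carries an observation, an action, a reward, the next observation, and an end-of-episode flag. The row keys, the helper names `split_episodes` and `write_minari_dataset`, and the vector-valued spaces are hypothetical. `EpisodeBuffer` and `minari.create_dataset_from_buffers` are the names mentioned above, but the import path, field names, and keyword arguments used here are assumptions to check against the installed Minari version.

```python
from typing import Any, Dict, List


def split_episodes(rows: List[Dict[str, Any]]) -> List[Dict[str, list]]:
    """Group flat rows into episodes, closing an episode after each row with done=True.

    Each episode stores one more observation than actions: the first "obs"
    followed by every "next_obs".
    """
    episodes: List[Dict[str, list]] = []
    current = None
    for row in rows:
        if current is None:
            current = {
                "observations": [row["obs"]],
                "actions": [],
                "rewards": [],
                "terminations": [],
                "truncations": [],
            }
        current["observations"].append(row["next_obs"])
        current["actions"].append(row["action"])
        current["rewards"].append(row["reward"])
        current["terminations"].append(bool(row["done"]))
        current["truncations"].append(False)
        if row["done"]:
            episodes.append(current)
            current = None
    if current is not None:
        # Trailing rows without a done flag: mark the cut-off episode as truncated.
        current["truncations"][-1] = True
        episodes.append(current)
    return episodes


def write_minari_dataset(episodes, dataset_id: str, obs_dim: int, act_dim: int):
    """Assumed usage of the Minari API; verify names against your Minari version."""
    # Imported lazily so split_episodes has no third-party dependencies.
    import numpy as np
    import minari
    from gymnasium import spaces
    from minari.data_collector.episode_buffer import EpisodeBuffer  # assumed path

    buffers = [
        EpisodeBuffer(
            id=i,
            observations=np.asarray(ep["observations"], dtype=np.float32),
            actions=np.asarray(ep["actions"], dtype=np.float32),
            rewards=list(ep["rewards"]),
            terminations=list(ep["terminations"]),
            truncations=list(ep["truncations"]),
        )
        for i, ep in enumerate(episodes)
    ]
    # Gymnasium is still needed here, but only to describe the spaces.
    return minari.create_dataset_from_buffers(
        dataset_id=dataset_id,
        buffer=buffers,
        observation_space=spaces.Box(-np.inf, np.inf, shape=(obs_dim,), dtype=np.float32),
        action_space=spaces.Box(-np.inf, np.inf, shape=(act_dim,), dtype=np.float32),
    )
```

With a pandas DataFrame, the rows could come from `df.to_dict("records")`. No environment object is constructed at any point; the splitting step is plain Python and only the final call touches Minari and Gymnasium.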

@jamartinh
Contributor Author

jamartinh commented Feb 20, 2025 via email

Thanks, I have just seen this function. It should work, but can the observation and action spaces be optional? Or are they a hard requirement for Minari datasets? Or perhaps a Space.Unbounded?

@younik
Member

younik commented Feb 20, 2025

At the moment it cannot be optional, but you can set the space to be as generic as possible, e.g. a Box without bounds. However, we expect the shape and types of arrays to be the same across the episode, as well as the structure in the case of a nested array.
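As a rough illustration of both points, the sketch below builds a Box without bounds and checks that every step of an episode shares the same nested structure, lengths, and leaf types before anything is written. The helper names `unbounded_box`, `structure_of`, and `is_consistent` are hypothetical and not part of Minari; the check only approximates the constraint described above and works on plain Python lists and dicts.

```python
def unbounded_box(shape):
    """A Gymnasium Box with infinite bounds, as generic as a Box can be."""
    # Imported lazily so the consistency check below runs without Gymnasium.
    import numpy as np
    from gymnasium import spaces

    return spaces.Box(low=-np.inf, high=np.inf, shape=shape, dtype=np.float32)


def structure_of(value):
    """Describe a step's nested structure, sequence lengths, and leaf types."""
    if isinstance(value, dict):
        return ("dict", tuple((k, structure_of(v)) for k, v in sorted(value.items())))
    if isinstance(value, (list, tuple)):
        return ("seq", tuple(structure_of(v) for v in value))
    return ("leaf", type(value).__name__)


def is_consistent(steps):
    """True if every step has the same structure as the first one."""
    if not steps:
        return True
    first = structure_of(steps[0])
    return all(structure_of(step) == first for step in steps[1:])
```

Running `is_consistent` over the observations (and, separately, the actions) of each episode before building the buffers catches ragged rows or mixed types early, which is easier to debug than a failure inside the dataset writer.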
