Is your feature request related to a problem? Please describe.
The dataset configuration file has an unused `dataFormat` tag. It would be convenient to use this tag to automate basic data loading by default once the data has been downloaded from the URL, while still letting users supply their own data loading. The default option would be useful at production scale, when a dataset does not need fine-grained, dataset-specific pre-processing.
Describe the solution you'd like
If a dataset is in `npy` format, it will be read as a single NumPy array. If it is in `npz` format, the arrays will be read into a dictionary with the array names (headers) as keys and the arrays as values. If it is in `csv` or `stata` format, it will be read into a pandas DataFrame. If it is in `zip` format, it will be unzipped. If it is in Python `pickle` format, its content will be loaded. If a dataset consists of images for a classification task, the dataset will be constructed using folder names as labels, which requires users to put the images into the right sub-folders.
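The dispatch described above could look roughly like the sketch below. It is only an illustration, not an existing API of this project: the function name `load_default`, the lowercase format strings, and the `image_folder` tag for the image-classification case are all assumptions, and the image case only collects `(path, label)` pairs rather than decoding the images.

```python
import pickle
import zipfile
from pathlib import Path

import numpy as np
import pandas as pd


def load_default(path, data_format):
    """Hypothetical default loader keyed on the dataFormat tag."""
    path = Path(path)
    fmt = data_format.lower()

    if fmt == "npy":
        # A single NumPy array.
        return np.load(path)
    if fmt == "npz":
        # Dictionary of array name -> array.
        with np.load(path) as archive:
            return {name: archive[name] for name in archive.files}
    if fmt == "csv":
        return pd.read_csv(path)
    if fmt == "stata":
        return pd.read_stata(path)
    if fmt == "zip":
        # Unzip next to the archive and return the extraction directory.
        target = path.parent / (path.stem + "_extracted")
        with zipfile.ZipFile(path) as zf:
            zf.extractall(target)
        return target
    if fmt == "pickle":
        # Only unpickle files from a trusted source.
        with open(path, "rb") as fh:
            return pickle.load(fh)
    if fmt == "image_folder":
        # Assumed tag: one sub-folder per class, folder name used as label.
        samples = []
        for class_dir in sorted(p for p in path.iterdir() if p.is_dir()):
            for image_path in sorted(class_dir.iterdir()):
                if image_path.is_file():
                    samples.append((image_path, class_dir.name))
        return samples

    raise ValueError(f"No default loader for dataFormat={data_format!r}")
```

For example, `load_default("train.npz", "npz")` would return a plain dictionary of arrays, and an unknown format raises `ValueError` so the caller knows a custom loader is needed.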
Describe alternatives you've considered
A data loader is an essential component of an ML task. Users can bypass the default option the system provides and override it with their own custom data loaders.
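One possible shape for the override, sketched with hypothetical names (`register_default`, `load_dataset`, and the `loader` argument are not part of the project): defaults are looked up by format, and a user-supplied callable takes precedence whenever it is given.

```python
import pickle
from typing import Any, Callable, Dict, Optional

Loader = Callable[[str], Any]

# Hypothetical registry of default loaders, keyed by dataFormat value.
_DEFAULT_LOADERS: Dict[str, Loader] = {}


def register_default(data_format: str) -> Callable[[Loader], Loader]:
    """Decorator that registers a default loader for one format."""
    def decorator(func: Loader) -> Loader:
        _DEFAULT_LOADERS[data_format.lower()] = func
        return func
    return decorator


@register_default("pickle")
def _load_pickle(path: str) -> Any:
    # Only unpickle files from a trusted source.
    with open(path, "rb") as fh:
        return pickle.load(fh)


def load_dataset(path: str, data_format: str,
                 loader: Optional[Loader] = None) -> Any:
    """Use the custom loader if given, else the default for the format."""
    if loader is not None:
        return loader(path)
    try:
        default = _DEFAULT_LOADERS[data_format.lower()]
    except KeyError:
        raise ValueError(
            f"No default loader for dataFormat={data_format!r}"
        ) from None
    return default(path)
```

With this layout, `load_dataset(path, "pickle")` uses the default, while `load_dataset(path, "pickle", loader=my_loader)` bypasses it entirely.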