0.10.0 #2125
jaheba announced in Announcements
Awesome! Note that another breaking change in this release is #2017. In the improvements/fixes section, I also updated CodeQL (#2040) and fixed the potential pandas period issue (#2066).
Overview

Arrow-based datasets

We have added support for Parquet files, as well as Arrow's binary format. This is an opt-in feature that requires `pyarrow` to be installed. Use `pip install 'gluonts[pro]'` or `pip install 'gluonts[arrow]'` to ensure the correct version is installed.

`FileDataset` has been reworked to support `.parquet` and `.arrow` files. Previously, it assumed that all files use `jsonlines`. To continue using `jsonlines`, ensure that the files use one of the `.json`, `.jsonl`, `.json.gz`, or `.jsonl.gz` suffixes.

Depending on the dataset size and shape, Arrow can be much faster than the JSON variant. In more extreme cases we saw speedups of more than 100x when using Arrow instead of `jsonlines` (see #2003 for some examples).
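The suffix rule for `FileDataset` described above can be illustrated with a small standard-library sketch. This is not GluonTS code: `SUFFIX_TO_FORMAT` and `guess_format` are hypothetical names, and the real dispatch inside `FileDataset` may work differently. It only shows how a file name could be mapped to one of the formats named in these notes.

```python
from pathlib import Path
from typing import Optional

# Suffixes taken from the release notes. The mapping itself is an
# illustrative assumption, not GluonTS's internal table.
SUFFIX_TO_FORMAT = {
    ".json": "jsonlines",
    ".jsonl": "jsonlines",
    ".json.gz": "jsonlines",
    ".jsonl.gz": "jsonlines",
    ".parquet": "parquet",
    ".arrow": "arrow",
}


def guess_format(path: str) -> Optional[str]:
    """Return the format implied by the file suffix, or None if unknown."""
    name = Path(path).name.lower()
    # str.endswith handles compound suffixes such as ".json.gz",
    # which Path.suffix alone would report as ".gz".
    for suffix, fmt in SUFFIX_TO_FORMAT.items():
        if name.endswith(suffix):
            return fmt
    return None
```

A file named `train.csv` would yield `None` here, which mirrors the point of the note: files that should be read as `jsonlines` need one of the listed suffixes.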
To convert a given dataset into Arrow, you can use the `gluonts.dataset.arrow` utility.

PandasDataset

We have added support for `pandas.DataFrame` and `pandas.Series` as well. You can now directly model data given in a `DataFrame` using `gluonts.dataset.pandas.PandasDataset`. In this tutorial we describe in depth how you can use `PandasDataset` to speed up modelling using GluonTS.

Changelog
New Features

- #1631 - Add `TimeLimitCallback` to `mx/trainer` callbacks. (by @yx1215)
- #2000 - Add `arrow`-based dataset. (by @vafl, @lostella, @jaheba)
- #2061 - Add `DatasetWriter`. (by @jaheba)

Breaking Changes
- #1980 - Use `pd.Period` instead of `pd.Timestamp`. (by @jaheba)
- #1997 - Remove `freq` argument from `Forecast`. (by @kashif)
- #2011 - Remove `dct_reduce`. (by @jaheba)
- #2017 - Remove mandatory `freq` attribute of `Predictor`. (by @kashif)
- #2019 - Rework `FileDataset`. (by @jaheba)
- #2053 - Add `dataset_writer` to `get_dataset`. (by @Hongqing-work)
- #2070 - Add `jsonl.encode_json`, remove `serialize_data_entry`. (by @jaheba)

Bug Fixes / Minor Improvements
- #1991 - Remove packaged pytorch-ts from `gluonts.nursery.SCott` (by @lostella)
- #2005 - Rework `itertools`, add col-to-row and row-to-col functions. (by @jaheba)
- #2014 - Expose `Estimator`, `Predictor`, `Forecast` in `gluonts.model`. (by @jaheba)
- #2015 - Fix mean in `AffineTransformedDistribution` (by @stailx)
- #2020 - Remove unnecessary files from `docs` folder, update gitignore (by @lostella)
- #2024 - Fix README. Use `DataFramesDataset`. (by @jaheba)
- #2047 - Make `Quantile` derive from `pydantic.BaseModel`. (by @jaheba)
- #2051 - Add tutorial on `DataFramesDataset` (by @rsnirwan)
- #2057 - Add optional parameter `time_axis` to `forecast_start`. (by @melopeo)
- #2062 - Fix type annotations for `predict_to_numpy` (by @lostella)
- #2066 - Always pass `freq` explicitly to `pd.period_range`. (by @kashif)

This discussion was created from the release 0.10.0.