Releases: mars-project/mars

v0.4.5

05 Aug 03:48
15c064e

These are the release notes for v0.4.5. See here for the complete list of resolved issues and merged PRs.

New Features

  • DataFrame
    • Add support for arrow-based string dtype (#1440)
    • Add support for memory usage (#1447)

Bug fixes

  • Fix failure when serializing LearnShuffle operand (#1449)
  • Fix reference cycle in promise.all_ (#1456)
  • Fix kmeans hang for local cluster (#1446)
  • Support ArrowStringDtype for DataFrame.sort_values() (#1457)

v0.5.0b3

03 Aug 17:26
96c8290
Pre-release

These are the release notes for v0.5.0b3. See here for the complete list of resolved issues and merged PRs.

Announcements

  • From v0.5.0b3 on, the v0.5.x series no longer supports Python 3.5. Python 3.5 users should use the 0.4.x series.
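
One way to act on the announcement above is to choose the requirement string from the running interpreter. The following is a minimal sketch, not part of Mars: the helper name mars_requirement is hypothetical, and it assumes the package is installed from PyPI under the name pymars.

```python
import sys


def mars_requirement(version_info=sys.version_info):
    """Return a pip requirement string for the given interpreter version.

    Hypothetical helper for illustration; assumes the PyPI name "pymars".
    """
    if tuple(version_info[:2]) == (3, 5):
        # Python 3.5 is only supported by the 0.4.x series.
        return "pymars>=0.4,<0.5"
    # Other interpreters are left unpinned in this sketch.
    return "pymars"


if __name__ == "__main__":
    print(mars_requirement())
```

The returned string can be passed to pip or written to a requirements file. The sketch only handles the Python 3.5 case named in the announcement.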

New Features

  • DataFrame
    • Add support for arrow-based string dtype (#1438)
    • Support memory_usage on DataFrame objects (#1217)

Bug fixes

  • Fix crash when storing data inside Docker containers (#1429)
  • Fix kmeans hang for local cluster (#1445)
  • Fix failure when serializing LearnShuffle operand (#1442)

Installation

  • Drop support for Python 3.5 (#1435)

v0.4.4

29 Jul 06:57
46acc59

These are the release notes for v0.4.4. See here for the complete list of resolved issues and merged PRs.

New Features

  • Learn
    • Add mars.learn.cluster.KMeans support (#1428)

Enhancements

  • Optimize to_pandas, to_numpy, etc. to try fetch() first and fall back to execute().fetch() if that fails (#1410)
  • Create backup CalcActor when spawning a new graph in mars.remote (#1412)
  • Skip rechunk when DataFrame has unknown shape in sort_values (#1420)
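
The fetch-first behaviour described in the to_pandas / to_numpy entry above can be pictured with a small stand-in class. This is a sketch under assumptions, not Mars code: LazyResult, NotExecutedError, and to_local are hypothetical names, and the real implementation may detect a missing result differently.

```python
class NotExecutedError(Exception):
    """Raised by the hypothetical fetch() when no result is stored yet."""


class LazyResult:
    """Hypothetical stand-in for a lazily evaluated object."""

    def __init__(self, compute):
        self._compute = compute
        self._value = None
        self._executed = False
        self.execute_calls = 0  # counts how often the work actually ran

    def execute(self):
        # Run the computation and remember the result.
        self.execute_calls += 1
        self._value = self._compute()
        self._executed = True
        return self

    def fetch(self):
        # Only succeeds if execute() has already run.
        if not self._executed:
            raise NotExecutedError("result has not been computed yet")
        return self._value

    def to_local(self):
        # Try the cheap path first; fall back to execute().fetch().
        try:
            return self.fetch()
        except NotExecutedError:
            return self.execute().fetch()
```

With this shape, calling to_local() a second time reuses the stored result instead of running the computation again.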

Bug fixes

  • Fix worker assignment when no evaluation sets are specified in LGBM training (#1408)
  • Fix query alias & add estimation for object types (#1417)
  • Fix the dtype of LightGBM model's predicted results (#1421)
  • Fix the error raised when inferring dtype in DataFrame.transform (#1427)
  • Fix crash when storing data inside Docker containers (#1432)

v0.5.0b2

27 Jul 16:23
13ccafc
Pre-release

These are the release notes for v0.5.0b2. See here for the complete list of resolved issues and merged PRs.

New Features

  • Learn
    • Add mars.learn.cluster.KMeans support (#1426)

Enhancements

  • Optimize to_pandas, to_numpy, etc. to try fetch() first and fall back to execute().fetch() if that fails (#1409)
  • Create backup CalcActor when spawning a new graph in mars.remote (#1411)
  • Skip rechunk when DataFrame has unknown shape in sort_values (#1414)

Bug fixes

  • Fix worker assignment when no evaluation sets are specified in LGBM training (#1405)
  • Fix query alias & add estimation for object types (#1416)
  • Fix the dtype of LightGBM model's predicted results (#1419)
  • Fix the error raised when inferring dtype in DataFrame.transform (#1424)

v0.5.0b1

12 Jul 11:27
fbd2acc
Pre-release

These are the release notes for v0.5.0b1. See here for the complete list of resolved issues and merged PRs.

New Features

  • Tensor
    • Implements mars.tensor.stats.entropy (#1376)
  • DataFrame
    • Implements DataFrame.rename (#1359)
    • Implements {Series,Index}.rename (#1361)
    • Implement DataFrame.insert (#1389)
  • Remote
    • Add run_script support (#1299)
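
The mars.tensor.stats.entropy entry above names a quantity these notes do not define. Assuming it follows the usual Shannon entropy definition for a single distribution (as scipy.stats.entropy does), the computation can be sketched in plain Python. The helper name shannon_entropy is hypothetical and this is not the Mars implementation.

```python
import math


def shannon_entropy(pk, base=None):
    """Shannon entropy of a discrete distribution given as weights.

    The weights are normalized to sum to 1. Zero-probability entries
    contribute nothing. Natural logarithm unless `base` is given.
    """
    total = sum(pk)
    probs = [p / total for p in pk]
    h = -sum(p * math.log(p) for p in probs if p > 0)
    if base is not None:
        h /= math.log(base)  # change of logarithm base
    return h


if __name__ == "__main__":
    print(shannon_entropy([0.5, 0.5], base=2))
```

A fair coin has an entropy of one bit, and a distribution with all its mass on one outcome has zero entropy.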

Enhancements

  • Set output type when calling new_xxx methods on DataFrames (#1212)
  • Optimize DataFrame.{head, tail} when DataFrame has unknown chunk shape (#1328)
  • Make creation of Kubernetes clusters modular (#1369)
  • Optimize read_sql + head (#1377)
  • Use subgraph to represent fused nodes instead of a list (#1388)
  • Optimize read_csv if followed by DataFrame.getitem (#1390)

Bug fixes

  • Fix hang for distributed roc_curve (#1362, #1380)
  • Fix read_sql when no data selected & refine error when no worker attached (#1371)
  • Fix progress display for bokeh 2.1.x (#1382)
  • Fix serialization issue when remote function has argument which is an executed tileable (#1394)
  • Fix LightGBM when input tileables have unknown shape (#1396)

Installation

  • Specify encoding for long_description (#1402)

v0.4.3

12 Jul 15:33
87bae0e

These are the release notes for v0.4.3. See here for the complete list of resolved issues and merged PRs.

New Features

  • Tensor
    • Implements mars.tensor.stats.entropy (#1378)
  • DataFrame
    • Implements {DataFrame,Series,Index}.rename (#1366)
    • Implement DataFrame.insert (#1392)
  • Learn
    • Implements mars.learn.model_selection.train_test_split (#1355)
  • Remote
    • Add run_script support (#1397)

Enhancements

  • Optimize DataFrame.{head, tail} when DataFrame has unknown chunk shape (#1360)
  • Make creation of Kubernetes clusters modular (#1373)
  • Optimize read_sql + head (#1379)
  • Optimize read_csv if followed by DataFrame.getitem (#1398)

Bug fixes

  • Remove reliance on WHERE 1=0 in read_sql (#1353)
  • Fix hang for distributed roc_curve (#1367, #1387)
  • Fix read_sql when no data selected & refine error when no worker attached (#1374)
  • Fix progress display for bokeh 2.1.x (#1383)
  • Fix serialization failure when FetchDataFrame's object_type is a list (#1386)
  • Make local filesystem work when PyArrow not installed (#1391)
  • Fix serialization issue when remote function has executed tileable arguments (#1400)
  • Fix LightGBM when input tileables have unknown shape (#1399)

v0.5.0a3

25 Jun 07:45
11373f6
Pre-release

These are the release notes for v0.5.0a3. See here for the complete list of resolved issues and merged PRs.

New Features

  • DataFrame
    • Add support for {DataFrame,Series,Index}.drop (#1263)
    • Add {DataFrame,Series}.to_sql() and Series.to_csv() (#1264)
    • Implements {DataFrame,Series,Index}.drop_duplicates (#1285)
    • Implements DataFrame.melt (#1284)
    • Implements md.read_sql_query (#1297)
    • Implements {Series,Index}.to_frame() and Index.to_series() (#1317)
    • Support setting columns for DataFrame (#1326)
  • Learn
    • Add MarsDistributor for tsfresh library (#1277)
    • Implements mars.learn.model_selection.train_test_split (#1352)
  • Remote
    • Support tileables as arguments for spawned functions (#1296)

Enhancements

  • Allow client-side to use pickle to serialize / deserialize tensor data (#1289)
  • Support creating sessions from environment variables (#1265)

Bug fixes

  • Fix NearestNeighbors failing to run in cluster mode (#1262)
  • Fix graph hang on tile failure and execution failure (#1272)
  • Fix failure when executing None-result spawn functions (#1276)
  • Fix shape calculation in TensorIndex for tensor.__setitem__ (#1283)
  • Support fuse for Mars Remote (#1287)
  • Fix mt.linalg.norm when chunk shape on axis > 1 (#1302)
  • Fix error in calc_data_size() for GroupByWrapper (#1307)
  • Trigger execution in check_consistent_length when arrays have unknown shape (#1321)
  • Fix wrong columns value in reset_index (#1320)
  • Fix build_df when input DataFrame has duplicate columns (#1319)
  • Remove reliance on WHERE 1=0 in read_sql (#1335)
  • Make local filesystem work when PyArrow not installed (#1356)

Documentation

  • Add docs for remote API, getting started as well as GPU integration (#1266)
  • Use pydata-sphinx-theme for documentation (#1304)

Others

  • Use latest pandas wheel for Python 3.8 (#1333)

v0.4.2

22 Jun 11:59
c57d545

These are the release notes for v0.4.2. See here for the complete list of resolved issues and merged PRs.

New Features

  • DataFrame
    • Add support for {DataFrame,Series,Index}.drop (#1268)
    • Add {DataFrame,Series}.to_sql() and Series.to_csv() (#1267)
    • Implements {DataFrame,Series,Index}.drop_duplicates (#1292)
    • Implement DataFrame.melt (#1295)
    • Implements md.read_sql_query (#1300)
    • Implements {Series,Index}.to_frame() and Index.to_series() (#1323)
    • Support setting columns for DataFrame (#1327)
  • Learn
    • Add MarsDistributor for tsfresh library (#1281)
  • Remote
    • Support tileables as arguments for spawned functions (#1298)

Enhancements

  • Allow client-side to use pickle to serialize / deserialize tensor data (#1291)
  • Support creating sessions from environment variables (#1322)

Bug fixes

  • Fix NearestNeighbors failing to run in cluster mode (#1273)
  • Fix graph hang on tile failure and execution failure (#1275)
  • Fix failure for None-result spawn functions (#1280)
  • Fix shape calculation in TensorIndex for tensor.__setitem__ (#1293)
  • Support fuse for Mars Remote (#1294)
  • Fix mt.linalg.norm when chunk shape on axis > 1 (#1303)
  • Trigger execution in check_consistent_length when arrays have unknown shape (#1325)
  • Fix build_df when input DataFrame has duplicate columns (#1324)
  • Fix error in calc_data_size() for GroupByWrapper (#1329)
  • Fix wrong columns value in reset_index (#1330)

Documentation

  • Add docs for remote API, getting started as well as GPU integration (#1274)

Others

  • Use latest pandas wheel for Python 3.8 (#1332)

v0.5.0a2

29 May 08:52
d9e0c91
Pre-release

These are the release notes for v0.5.0a2. See here for the complete list of resolved issues and merged PRs.

New Features

  • DataFrame
    • Add size function for dataframes and groupbys (#1250)
    • Implements DataFrame.{iterrows, itertuples} (#1252)
  • Learn
    • Add support for LightGBM in Mars (#1244)
  • Remote
    • Support running tileables inside functions spawned via mr.spawn (#1248)

Bug fixes

  • Fix .fetch() possibly causing some ops to be executed again (#1243)
  • Fix df.describe() failing when df has unknown shape and chunk size > 1 (#1249)

Tests

  • Add checks for data consistency in learn module (#1246)

v0.4.1

29 May 12:04
8892948

These are the release notes for v0.4.1. See here for the complete list of resolved issues and merged PRs.

New Features

  • DataFrame
    • Add size function for dataframes and groupbys (#1253)
    • Implements DataFrame.{iterrows, itertuples} (#1258)
  • Learn
    • Add support for LightGBM in Mars (#1254)
  • Remote
    • Support running tileables inside functions spawned via mr.spawn (#1257)

Bug fixes

  • Fix .fetch() possibly causing some ops to be executed again (#1255)
  • Fix df.describe() failing when df has unknown shape and chunk size > 1 (#1256)

Tests

  • Add checks for data consistency in learn module (#1259)