Releases: alteryx/woodwork

v0.0.11

15 Mar 15:57
fec592f

v0.0.11 March 15, 2021

  • Changes
    • Restrict Koalas version to <1.7.0 due to breaking changes (#674)
    • Include unique columns in mutual information calculations (#687)
    • Add parameter to include index column in mutual information calculations (#692)
  • Documentation Changes
    • Update to remove warning message from statistical insights guide (#690)
  • Testing Changes
    • Update branch reference in tests to run on main (#641)
    • Make release notes updated check separate from unit tests (#642)
    • Update release branch naming instructions (#644)

Thanks to the following people for contributing to this release:
@gsheni, @tamargrey, @thehomebrewnerd
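
The v0.0.11 entries above concern which columns take part in mutual information calculations. As a rough illustration of why unique columns (such as an index) are a special case, here is a minimal pure-Python sketch of mutual information between two discrete columns. It is not Woodwork's implementation; the function name `mutual_information` and the toy columns are assumptions made for this example only.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences.

    Illustrative helper only; this is not part of the Woodwork API.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)         # marginal counts for xs
    count_y = Counter(ys)         # marginal counts for ys
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) simplifies to count * n / (count_x * count_y)
        mi += p_xy * math.log(count * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical toy columns, not taken from the release notes.
row_id = [0, 1, 2, 3]                             # all values unique, like an index
color = ["red", "red", "blue", "blue"]
shape = ["circle", "square", "circle", "square"]

mi_color_shape = mutual_information(color, shape)  # independent columns
mi_color_color = mutual_information(color, color)  # a column with itself
mi_id_color = mutual_information(row_id, color)    # unique column vs. color
```

In this toy data the unique `row_id` column shares as much information with `color` as `color` shares with itself, because every row identifier determines its color exactly. This is one reason a library might treat unique and index columns separately in such calculations.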

v0.0.10

25 Feb 20:31
c3f73ae

v0.0.10 February 25, 2021

  • Changes
    • Avoid calculating mutual info for non-unique columns (#563)
    • Preserve underlying DataFrame index if index column is not specified (#588)
    • Add blank issue template for creating issues (#630)
  • Testing Changes
    • Update branch reference in tests workflow (#552, #601)
    • Fix text on back arrow on install page (#564)
    • Refactor test_datatable.py (#574)

Thanks to the following people for contributing to this release:
@gsheni, @jeff-hernandez, @johnbridstrup, @tamargrey

v0.0.9

05 Feb 16:16
f047187

v0.0.9 February 5, 2021

  • Enhancements
    • Add Python 3.9 support without Koalas testing (#511)
    • Add get_valid_mi_types function to list LogicalTypes valid for mutual information calculation (#517)
  • Fixes
    • Handle missing values in Datetime columns when calculating mutual information (#516)
    • Support numpy 1.20.0 by restricting version for koalas and changing serialization error message (#532)
    • Move Koalas option setting to DataTable init instead of import (#543)
  • Documentation Changes
    • Add Alteryx OSS Twitter link (#519)
    • Update logo and add new favicon (#521)
    • Multiple improvements to Getting Started page and guides (#527)
    • Clean up API Reference and docstrings (#536)
    • Add Open Graph for Twitter and Facebook (#544)

Thanks to the following people for contributing to this release:
@gsheni, @tamargrey, @thehomebrewnerd

v0.0.8

25 Jan 21:47
db6f8fd

v0.0.8 January 25, 2021

  • Enhancements
    • Add DataTable.df property for accessing the underlying DataFrame (#470)
    • Set index of underlying DataFrame to match DataTable index (#464)
  • Fixes
    • Sort underlying series when sorting dataframe (#468)
    • Allow setting indices to current index without side effects (#474)
  • Changes
    • Fix release document with GitHub Actions link for CI (#462)
    • Don't allow registered LogicalTypes with the same name (#477)
    • Move str_to_logical_type to TypeSystem class (#482)
    • Remove pyarrow from core dependencies (#508)

Thanks to the following people for contributing to this release:
@gsheni, @tamargrey, @thehomebrewnerd

v0.0.7

15 Dec 15:18
1e73846

v0.0.7 December 14, 2020

  • Enhancements
    • Allow for user-defined logical types and inference functions in TypeSystem object (#424)
    • Add __repr__ to DataTable (#425)
    • Allow initializing DataColumn with numpy array (#430)
    • Add drop to DataTable (#434)
    • Migrate CI tests to GitHub Actions (#417, #441, #451)
    • Add metadata to DataColumn for user-defined metadata (#447)
  • Fixes
    • Update DataColumn name when using setitem on column with no name (#426)
    • Don't allow pickle serialization for Koalas DataFrames (#432)
    • Check DataTable metadata in equality check (#449)
    • Propagate all attributes of DataTable in _new_dt_including (#454)
  • Changes
    • Update links to use alteryx org GitHub URL (#423)
    • Support column names of any type allowed by the underlying DataFrame (#442)
    • Use object dtype for LatLong columns for easy access to latitude and longitude values (#414)
    • Restrict dask version to prevent 2020.12.0 release from being installed (#453)
    • Lower minimum requirement for numpy to 1.15.4, and set pandas minimum requirement 1.1.1 (#459)
  • Testing Changes
    • Fix missing test coverage (#436)

Thanks to the following people for contributing to this release:
@gsheni, @jeff-hernandez, @tamargrey, @thehomebrewnerd

v0.0.6

30 Nov 19:48
e44cec6

v0.0.6 November 30, 2020

  • Enhancements
    • Add support for creating DataTable from Koalas DataFrame (#327)
    • Add ability to initialize DataTable with numpy array (#367)
    • Add describe_dict method to DataTable (#405)
    • Add mutual_information_dict method to DataTable (#404)
    • Add metadata to DataTable for user-defined metadata (#392)
    • Add update_dataframe method to DataTable to update underlying DataFrame (#407)
    • Sort dataframe if time_index is specified; bypass sorting with already_sorted parameter (#410)
    • Add description attribute to DataColumn (#416)
    • Implement DataColumn.__len__ and DataTable.__len__ (#415)
  • Fixes
    • Rename data_column.py to datacolumn.py (#386)
    • Rename data_table.py to datatable.py (#387)
    • Rename get_mutual_information to mutual_information (#390)
  • Changes
    • Lower moto test requirement for serialization/deserialization (#376)
    • Make Koalas an optional dependency installable with woodwork[koalas] (#378)
    • Remove WholeNumber LogicalType from Woodwork (#380)
    • Updates to LogicalTypes to support Koalas 1.4.0 (#393)
    • Replace set_logical_types and set_semantic_tags with just set_types (#379)
    • Remove copy_dataframe parameter from DataTable initialization (#398)
    • Implement DataTable.__sizeof__ to return size of the underlying dataframe (#401)
    • Include Datetime columns in mutual info calculation (#399)
    • Maintain column order on DataTable operations (#406)
  • Testing Changes
    • Add pyarrow, dask, and koalas to automated dependency checks (#388)
    • Use new version of pull request GitHub Action (#394)
    • Improve parameterization for test_datatable_equality (#409)

Thanks to the following people for contributing to this release:
@ctduffy, @gsheni, @tamargrey, @thehomebrewnerd

v0.0.5

11 Nov 21:14
2e60a37

v0.0.5 November 11, 2020

  • Enhancements
    • Add __eq__ to DataTable and DataColumn and update LogicalType equality (#318)
    • Add value_counts() method to DataTable (#342)
    • Support serialization and deserialization of DataTables via csv, pickle, or parquet (#293)
    • Add shape property to DataTable and DataColumn (#358)
    • Add iloc method to DataTable and DataColumn (#365)
    • Add numeric_categorical_threshold config value to allow inferring numeric columns as Categorical (#363)
  • Fixes
    • Catch non numeric time index at validation (#332)
  • Changes
    • Support logical type inference from a Dask DataFrame (#248)
    • Fix validation checks and make_index to work with Dask DataFrames (#260)
    • Skip validation of Ordinal order values for Dask DataFrames (#270)
    • Improve support for datetimes with Dask input (#286)
    • Update DataTable.describe to work with Dask input (#296)
    • Update DataTable.get_mutual_information to work with Dask input (#300)
    • Modify to_pandas function to return DataFrame with correct index (#281)
    • Rename DataColumn.to_pandas method to DataColumn.to_series (#311)
    • Rename DataTable.to_pandas method to DataTable.to_dataframe (#319)
    • Remove UserWarning when no matching columns found (#325)
    • Remove copy parameter from DataTable.to_dataframe and DataColumn.to_series (#338)
    • Allow pandas ExtensionArrays as inputs to DataColumn (#343)
    • Move warnings to a separate exceptions file and call via UserWarning subclasses (#348)
    • Make Dask an optional dependency installable with woodwork[dask] (#357)
  • Documentation Changes
    • Create a guide for using Woodwork with Dask (#304)
    • Add conda install instructions (#305, #309)
    • Fix README.md badge with correct link (#314)
    • Simplify issue templates to make them easier to use (#339)
    • Remove extra output cell in Start notebook (#341)
  • Testing Changes
    • Parameterize numeric time index tests (#288)
    • Add DockerHub credentials to CI testing environment (#326)
    • Fix removing files for serialization test (#350)

Thanks to the following people for contributing to this release:
@ctduffy, @gsheni, @tamargrey, @thehomebrewnerd

v0.0.4

21 Oct 21:53
536cc6e

v0.0.4 October 21, 2020

  • Enhancements
    • Add optional include parameter for DataTable.describe() to filter results (#228)
    • Add make_index parameter to DataTable.__init__ to enable optional creation of a new index column (#238)
    • Add support for setting ranking order on columns with Ordinal logical type (#240)
    • Add list_semantic_tags function and CLI to get dataframe of woodwork semantic_tags (#244)
    • Add support for numeric time index on DataTable (#267)
    • Add pop method to DataTable (#289)
    • Add entry point to setup.py to run CLI commands (#285)
  • Fixes
    • Allow numeric datetime time indices (#282)
  • Changes
    • Remove redundant methods DataTable.select_ltypes and DataTable.select_semantic_tags (#239)
    • Make results of get_mutual_information more clear by sorting and removing self calculation (#247)
    • Lower minimum scikit-learn version to 0.21.3 (#297)
  • Documentation Changes
    • Add guide for dt.describe and dt.get_mutual_information (#245)
    • Update README.md with documentation link (#261)
    • Add footer to doc pages with Alteryx Open Source (#258)
    • Add types and tags one-sentence definitions to Understanding Types and Tags guide (#271)
    • Add issue and pull request templates (#280, #284)
  • Testing Changes
    • Add automated process to check latest dependencies (#268)
    • Add test for setting a time index with specified string logical type (#279)

Thanks to the following people for contributing to this release:
@ctduffy, @gsheni, @tamargrey, @thehomebrewnerd

v0.0.3

09 Oct 19:15
c50dc4a

v0.0.3 October 9, 2020

  • Enhancements
    • Implement setitem on DataTable to create/overwrite an existing DataColumn (#165)
    • Add to_pandas method to DataColumn to access the underlying series (#169)
    • Add list_logical_types function and CLI to get dataframe of woodwork LogicalTypes (#172)
    • Add describe method to DataTable to generate statistics for the underlying data (#181)
    • Add optional return_dataframe parameter to load_retail to return either DataFrame or DataTable (#189)
    • Add get_mutual_information method to DataTable to generate mutual information between columns (#203)
    • Add read_csv function to create DataTable directly from CSV file (#222)
  • Fixes
    • Fix bug causing incorrect values for quartiles in DataTable.describe method (#187)
    • Fix bug in DataTable.describe that could cause an error if certain semantic tags were applied improperly (#190)
    • Fix bug with instantiated LogicalTypes breaking when used with issubclass (#231)
  • Changes
    • Remove unnecessary add_standard_tags attribute from DataTable (#171)
    • Remove standard tags from index column and do not return stats for index column from DataTable.describe (#196)
    • Update DataColumn.set_semantic_tags and DataColumn.add_semantic_tags to return new objects (#205)
    • Update various DataTable methods to return new objects rather than modifying in place (#210)
    • Move datetime_format to Datetime LogicalType (#216)
    • Do not calculate mutual info with index column in DataTable.get_mutual_information (#221)
    • Move setting of underlying physical types from DataTable to DataColumn (#233)
  • Documentation Changes
    • Remove unused code from sphinx conf.py, update with GitHub URL (#160, #163)
    • Update README and docs with new Woodwork logo, with better code snippets (#161, #159)
    • Add DataTable and DataColumn to API Reference (#162)
    • Add docstrings to LogicalType classes (#168)
    • Add Woodwork image to index, clear outputs of Jupyter notebook in docs (#173)
    • Update contributing.md, release.md with all instructions (#176)
    • Add section for setting index and time index to start notebook (#179)
    • Rename changelog to Release Notes (#193)
    • Add section for standard tags to start notebook (#188)
    • Add Understanding Types and Tags user guide (#201)
    • Add missing docstring to list_logical_types (#202)
    • Add Woodwork Global Configuration Options guide (#215)
  • Testing Changes
    • Add tests that confirm dtypes are as expected after DataTable init (#152)
    • Remove unused none_df test fixture (#224)
    • Add test for LogicalType.__str__ method (#225)

Thanks to the following people for contributing to this release:
@gsheni, @tamargrey, @thehomebrewnerd

v0.0.2

28 Sep 20:23
b7324ba

v0.0.2 September 28, 2020

  • Fixes
    • Fix formatting issue when printing global config variables (#138)
  • Changes
    • Change add_standard_tags to use_standard_tags to better describe behavior (#149)
    • Change access of underlying dataframe to be through to_pandas with ._dataframe field on class (#146)
    • Remove replace_none parameter to DataTables (#146)
  • Documentation Changes
    • Add working code example to README and create Using Woodwork page (#103)

Thanks to the following people for contributing to this release:
@gsheni, @tamargrey, @thehomebrewnerd