Skip to content

Latest commit

 

History

History
140 lines (82 loc) · 6.34 KB

CHANGELOG.md

File metadata and controls

140 lines (82 loc) · 6.34 KB

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Added

  • ClickhouseServerAPI can register pandas tables with datetime columns, and allows integers to be signed #61.
  • ClickhouseServerAPI will now register dict or list via pandas #61.

0.4.0 - 2024-12-23

Changed

  • Renamed ClickhouseAPI and ClickhouseDataFrame to ClickhouseServerAPI and ClickhouseServerDataFrame respectively, and splinkclickhouse.clickhouse to splinkclickhouse.clickhouse_server #54.

0.3.4 - 2024-12-16

Added

  • Added Clickhouse appropriate versions of comparison level PairwiseStringDistanceFunctionLevel and comparison PairwiseStringDistanceFunctionAtThresholds to the relevant libraries #51.
  • ClickhouseAPI can now properly register pandas tables with string array columns #51.

Fixed

  • Table registration in chdb now works for pandas tables whose indexes do not have a 0 entry #49.

0.3.3 - 2024-12-05

Added

  • Term frequency adjustments are now not limited in Clickhouse server (or chdb when debug_mode is switched on) #46.

Changed

  • Dropped support for Splink <= 4.0.5 #46.

0.3.2 - 2024-10-23

Added

  • SQL UDF days_since_epoch to parse a date representing a string to the number of days since 1970-01-01 #39.
  • Custom Clickhouse ColumnExpression with additional transform parse_date_to_int to parse string to days since epoch #39.
  • Custom date comparison and comparison levels working with integer type representing days since epoch #39.

0.3.1 - 2024-10-14

Added

  • ClickhouseAPI now has a function .set_union_default_mode() to allow manually setting client state necessary for clustering, if session has timed out e.g. when running interactively #36.
  • Added support for Splink 4.0.4 #37.

Fixed

  • estimate_probability_two_random_records_match now works correctly when debug_mode is switched on #34.

0.3.0 - 2024-09-26

Changed

  • chdb is now an optional dependency, requiring opt-in installation for use of ChDBAPI #28.

0.2.5 - 2024-09-23

Changed

  • Added support for Splink >= 4.0.2, dropped support for 4.0.0, 4.0.1 #26.

0.2.4 - 2024-09-19

Added

  • Extended ClickhouseAPI pandas table registration to support float columns #24.
  • Added Clickhouse-specific library comparisons/levels - cll_ch.DistanceInKMLevel, cl_ch.DistanceInKMAtThresholds, and cl_ch.ExactMatchAtSubstringSizes #24.

0.2.3 - 2024-09-16

Changed

  • Dropped support for python 3.8 #20.
  • Removed numpyrequirements #20.

0.2.2 - 2024-09-12

Added

  • ClickhouseAPI now allows for registering tables directly from pandas DataFrames, if they contain only integer and string columns #18.

Fixed

  • Create an alias for rand, random so that Linker.visualisations.comparison_viewer_dashboard runs without error #14.
  • Workaround for Clickhouse count(*) filter ... parsing issue so that linker.clustering.compute_graph_metrics(...) now runs #18.

0.2.1 - 2024-09-12

Changed

  • Updated numpy dependency requirements to allow compatible versions for all supported python versions #9.

0.2.0 - 2024-09-11

Added

  • ClickhouseAPI and dataframe added to support running calculations in a Clickhouse instance #4.

0.1.1 - 2024-09-10

Fixed

  • Fix random_sample_sql so that u-training works when we don't sample the entire dataset #1.

Changed

  • try_parse_date and try_parse_timestamp now use DateTime64 to extend the range to more useful values, and no longer support custom format strings #2.

0.1.0 - 2024-09-09

Added

  • Basic working version of package with api for chdb