Releases: delta-io/delta-sharing

Delta Sharing 0.6.2

20 Dec 23:51

We’d like to announce the release of Delta Sharing 0.6.2, which introduces the following bug fix.

Bug fixes:

  • Fix comparison of the expiration time to the current time for pre-signed URLs. (#236)

Credits: Lin Zhou, William Chau

Delta Sharing 0.6.1

20 Dec 01:50

We’d like to announce the release of Delta Sharing 0.6.1, which introduces the following improvements and bug fixes.

Improvements:

  • Spark connector changes to consume size from metadata. (#228)
  • Improve Delta Sharing error messages (#235)

Bug fixes:

  • Extend DeltaSharingProfileProvider to customize tablePath and refresher (#223)
  • Refresh pre-signed URLs for CDF and streaming queries (#221, #222)
  • Allow 0 for versionAsOf parameter, to be consistent with Delta (#224)
  • Fix partitionFilters issue: apply it to all file indices. (#227, #229)

Credits: Abhijit Chakankar, Lin Zhou

Delta Sharing 0.6.0

02 Dec 20:23

We are excited to announce the release of Delta Sharing 0.6.0, which introduces the following improvements.

Improvements:

Credits: Abhijit Chakankar, Lin Zhou, Xiaotong Sun

Delta Sharing 0.5.2

10 Oct 22:56

Delta Sharing 0.5.2 contains a single change: the ability to override the HTTP headers included in requests to the Delta Sharing server.

  • Add a custom HTTP header provider (#192)

Credits: Xiaotong Sun

Delta Sharing 0.5.1

08 Sep 19:31

We are excited to announce the release of Delta Sharing 0.5.1, which introduces the following changes.

Improvements:

  • Upgrade AWS SDK to 1.12.189 (#170)
  • More tests on the error message when loading table fails (#164)
  • Add the ability to configure the Armeria server request timeout (#163)
  • Documentation improvements (#171, #179)

Bug fixes:

  • Fix column selection bug on Delta Sharing CDF Spark DataFrame (#184)
  • Fix GCS path reading (#181)

Credits: Antonio Irizarry, Lin Zhou, Shixiong Zhu, Pat McCauley

Delta Sharing 0.5.0

30 Aug 22:08

We are excited to announce the release of Delta Sharing 0.5.0, which introduces the following improvements.

Improvements:

Credits: Abhijit Chakankar, Alex Ott, Lin Zhou, Shixiong Zhu, William Chau, Xiaotong Sun, harksin, Kohei Toshimitsu, Vuong Nguyen

Delta Sharing 0.4.0

14 Jan 00:09

We are excited to announce the release of Delta Sharing 0.4.0, which introduces the following improvements and fixes.

Improvements:

  • Support Google Cloud Storage on Delta Sharing Server (#81, #105)
  • Add a new API to get the metadata of a Share (#97)
  • Protocol and REST API documentation enhancements (#85, #89, #93, #98)
  • Allow for customization of recipient profile in Apache Spark connector (#99, #107)

Bug fixes:

  • Block managed table creation for Delta Sharing to prevent user confusion (#92)

Credits: Denny Lee, Lin Zhou, Shixiong Zhu, William Chau, Xiaotong Sun, Kohei Toshimitsu

Delta Sharing 0.3.0

01 Dec 18:19

We are excited to announce the release of Delta Sharing 0.3.0, which introduces the following improvements and bug fixes:

Improvements:

  • Support Azure Blob Storage and Azure Data Lake Gen2 in Delta Sharing Server (#56, #59)
  • The Apache Spark Connector can now send the limitHint parameter when a user query uses limit (#55)
  • load_as_pandas in the Python Connector now accepts a limit parameter so that users can fetch only a few rows to explore (#76)
  • The Apache Spark Connector now re-fetches pre-signed URLs before they expire to support long-running queries (#69)
  • Add a new API to list all tables in a share to save network round trips (#63, #66, #67, #88)
  • Add a User-Agent header to requests sent from the Apache Spark Connector and the Python Connector (#75)
  • Add an optional expirationTime field to Delta Sharing Profile File Format to provide the token expiration time (#77)
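
As an illustration of the optional expirationTime field, the following minimal sketch parses a profile and checks whether its token has expired. The endpoint, token, and timestamp are placeholders, the helper name token_expired is hypothetical, and the field layout is assumed from the profile file format as commonly documented; it is not the connector's own implementation.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Hypothetical profile contents; endpoint, token, and timestamp are placeholders.
PROFILE_JSON = """
{
  "shareCredentialsVersion": 1,
  "endpoint": "https://sharing.example.com/delta-sharing/",
  "bearerToken": "<token>",
  "expirationTime": "2021-11-12T00:12:29Z"
}
"""


def token_expired(profile: dict, now: Optional[datetime] = None) -> bool:
    """Return True if the profile has an expirationTime that has already passed."""
    raw = profile.get("expirationTime")
    if raw is None:
        # The field is optional; without it there is no known expiry to check.
        return False
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    expires = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if expires.tzinfo is None:
        # Assumption for this sketch: timestamps without an offset are UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    current = now if now is not None else datetime.now(timezone.utc)
    return current >= expires


profile = json.loads(PROFILE_JSON)
```

A client could call token_expired(profile) before issuing requests and prompt the user to obtain a fresh profile file when it returns True.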

Bug fixes:

  • Fix a corner case where list_all_tables may not return correct results in the Python Connector (#84)

Credits: Denny Lee, Felix Cheung, Lin Zhou, Matei Zaharia, Shixiong Zhu, Will Girten, Xiaotong Sun, Yuhong Chen, kohei-tosshy, William Chau

Delta Sharing 0.2.0

11 Aug 05:31

We are excited to announce the release of Delta Sharing 0.2.0, which introduces the following improvements and fixes multiple issues:

Improvements:

  • Added official Docker images for Delta Sharing Server
  • Added an examples project to show how to try the open Delta Sharing Server (#26)
  • Added the conf directory to the Delta Sharing Server classpath to allow users to add their Hadoop configuration files in the directory (#45)
  • Added retry with exponential backoff for REST requests in the Python connector (#49)
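
The retry behaviour mentioned in the last item can be sketched generically as follows. This is a minimal illustration of exponential backoff, not the Python connector's actual code: the function name retry_with_backoff, the retried exception type, and the default delays are all assumptions made for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on ConnectionError with exponentially growing delays."""
    attempt = 0
    while True:
        try:
            return call()
        except ConnectionError:
            if attempt >= max_retries:
                # Retries exhausted: surface the last error to the caller.
                raise
            # The delay doubles on each retry: base, 2*base, 4*base, ...
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

The sleep function is injectable so that the delay schedule can be checked without actually waiting; a real client would also typically retry only on errors known to be transient.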

Bug fixes:

  • Added the minimum fsspec requirement in the Python connector (#23)
  • Fixed an issue when files in a table have no stats in the Python connector (#30)
  • Improved error handling in Delta Sharing Server to report 400 Bad Request properly (#32)
  • Fixed the table schema when a table is empty in the Python connector (#37)
  • Fixed KeyError when there are no shared tables in the Python connector (#50)

Credits: Denny Lee, Matei Zaharia, Shixiong Zhu, Yaohua, Yuhong Chen, dobachi

Delta Sharing 0.1.0

26 May 04:59

We are excited to announce the release of Delta Sharing 0.1.0.

Delta Sharing is an open protocol for secure real-time exchange of large datasets, which enables organizations to share data in real time regardless of which computing platforms they use. It is a simple REST protocol that securely shares access to part of a cloud dataset and leverages modern cloud storage systems, such as S3, ADLS, or GCS, to reliably transfer data.

With Delta Sharing, a user accessing shared data can directly connect to it through pandas, Tableau, Apache Spark, Rust, Python, or dozens of other systems that support the open protocol, without having to deploy a specific compute platform first. This makes life simpler for both data providers and consumers. Data providers can share a dataset once to reach a broad range of consumers on any platform, and data consumers can get started using the data in minutes on their existing computing tools.

This repo includes the following components:

  • Delta Sharing protocol specification.
  • Python Connector: A Python library that implements the Delta Sharing Protocol to read shared tables as pandas DataFrames or Apache Spark DataFrames.
  • Apache Spark Connector: An Apache Spark connector that implements the Delta Sharing Protocol to read shared tables from a Delta Sharing Server. The tables can then be accessed in SQL, Python, Java, Scala, or R.
  • Delta Sharing Server: A reference implementation server for the Delta Sharing Protocol for development purposes. Users can deploy this server to share existing tables in Delta Lake and Apache Parquet format on modern cloud storage systems.
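
As a rough illustration of the Python Connector, the sketch below reads a shared table into a pandas DataFrame. The profile path, share, schema, and table names are placeholders, and the helpers table_url and read_shared_table are hypothetical wrappers written for this example; the optional limit argument corresponds to the load_as_pandas parameter listed in the 0.3.0 notes.

```python
from typing import Optional


def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build a '<profile-file>#<share>.<schema>.<table>' table URL."""
    return f"{profile_path}#{share}.{schema}.{table}"


def read_shared_table(url: str, limit: Optional[int] = None):
    """Load a shared table as a pandas DataFrame (requires network access)."""
    # Imported lazily so the sketch can be loaded without the package installed.
    import delta_sharing  # pip install delta-sharing

    # limit is only accepted by connector versions that include the 0.3.0 change.
    return delta_sharing.load_as_pandas(url, limit=limit)


# Placeholder names; replace them with values from a real profile file.
url = table_url("config.share", "my_share", "default", "my_table")
```

Calling read_shared_table(url, limit=10) would then fetch a small sample of rows, assuming the profile file points at a reachable Delta Sharing server.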

See the documentation for more details.