Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: ort v2 BREAKING #12

Merged
merged 6 commits into from
Jan 8, 2024
Merged

feat: ort v2 BREAKING #12

merged 6 commits into from
Jan 8, 2024

Conversation

joshniemela
Copy link
Contributor

@joshniemela joshniemela commented Jan 7, 2024

Migrated the code to ort to 2.0.0 based on the breaking API changes
Changed padding strategy and wrote a sequential version of query_embed, this gives a speedup of 95%

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Compliance Checks

Thank you for your Pull Request! We have run several checks on this pull request in order to make sure it's suitable for merging into this project. The results are listed in the following section.

Issue Reference

In order to be considered for merging, the pull request description must refer to a specific issue number. This is described in our Contributing Guide.
This check is looking for a phrase similar to: "Fixes #XYZ" or "Resolves #XYZ" where XYZ is the issue number that this PR is meant to address.

Protected Branch

In order to be considered for merging, the pull request changes must not be implemented on the "main" branch. This is described in our Contributing Guide. We would suggest that you close this PR and implement your changes as described in our Contributing Guide and open a new pull request.

@Anush008 Anush008 changed the title Updated ort to 2.0.0 feat: ort v2 Jan 7, 2024
@Anush008
Copy link
Owner

Anush008 commented Jan 7, 2024

Cargo.toml Outdated
@@ -16,7 +16,7 @@ anyhow = { version = "1.0" }
flate2 = { version = "1.0" }
minreq = { version = "2.10", default-features = false, features = ["https-rustls"] }
ndarray = { version = "0.15", default-features = false }
ort = { version = "1", features = ["load-dynamic"] }
ort = { version = "2.0.0-alpha.4", features = ["load-dynamic"] }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Libraries should only enable features they need.

[dependencies]
ort = { version = "2.0.0-alpha.4", default-features = false, features = [ "ndarray" ] }

[dev-dependencies]
ort = "2.0.0-alpha.4"

@decahedron1
Copy link
Contributor

decahedron1 commented Jan 8, 2024

@Anush008 btw, 7e0526a has a critical performance fix. @joshniemela and I found that PaddingStrategy::Fixed caused fastembed-rs to be almost 15x slower than a Python equivalent on shorter sequences. Like the comment says, a new field should be added to InitOptions to change the padding strategy if the user wishes (InitOptions should also be marked #[non_exhaustive] imo).

@joshniemela
Copy link
Contributor Author

#9 is probably also related to this PR

@decahedron1
Copy link
Contributor

Indeed it significantly improves #9.

master

init: 138.47ms
passage_embed: 579.84ms
query_embed: 147.26ms

v2

init: 133.73ms
passage_embed: 39.97ms
query_embed: 4.50ms

@Anush008 Anush008 changed the title feat: ort v2 feat: ort v2 BREAKING Jan 8, 2024
@joshniemela
Copy link
Contributor Author

Indeed it significantly improves #9.

master

init: 138.47ms
passage_embed: 579.84ms
query_embed: 147.26ms

v2

init: 133.73ms
passage_embed: 39.97ms
query_embed: 4.50ms

Is it possible to do a test on python as well on the same machine?

@Anush008 Anush008 merged commit 36573a8 into Anush008:main Jan 8, 2024
7 of 8 checks passed
@decahedron1
Copy link
Contributor

Is it possible to do a test on python as well on the same machine?

init: 118.33ms
passage_embed: 17.55ms
query_embed: 3.51ms

github-actions bot pushed a commit that referenced this pull request Jan 8, 2024
# [1.11.0](v1.10.0...v1.11.0) (2024-01-08)

### Features

* ort v2 BREAKING ([#12](#12)) ([36573a8](36573a8))

## [1.11.0](v1.10.0...v1.11.0) (2024-01-08)

### 🍕 Features

* ort v2 BREAKING ([#12](#12)) ([36573a8](36573a8))
Copy link

github-actions bot commented Jan 8, 2024

🎉 This PR is included in version 1.11.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@Anush008
Copy link
Owner

Anush008 commented Jan 8, 2024

Bummer. Should've been a major release.

@Anush008
Copy link
Owner

Anush008 commented Jan 8, 2024

@joshniemela, @decahedron1 Thanks for the efforts. Super grateful. This is amazing.

Anush008 pushed a commit that referenced this pull request Jan 8, 2024
Anush008 pushed a commit that referenced this pull request Jan 8, 2024
BREAKING CHANGE: #12
github-actions bot pushed a commit that referenced this pull request Jan 8, 2024
# [2.0.0](v1.11.0...v2.0.0) (2024-01-08)

### Features

* ort v2 ([a6b5f55](a6b5f55))
* ort v2 ([dca1f86](dca1f86))
* ort v2\n\nBREAKING CHANGE: #12 ([1128471](1128471))

### BREAKING CHANGES

* #12

## [2.0.0](v1.11.0...v2.0.0) (2024-01-08)

### ⚠ BREAKING CHANGES

* #12

### 🍕 Features

* ort v2 ([a6b5f55](a6b5f55))
* ort v2 ([dca1f86](dca1f86))
* ort v2\n\nBREAKING CHANGE: #12 ([1128471](1128471))
@Anush008
Copy link
Owner

Anush008 commented Jan 8, 2024

After some commit slogging, it is finally up.
https://github.com/Anush008/fastembed-rs/releases/tag/v2.0.0

I've yanked 1.11.0.

@prattcmp
Copy link
Contributor

Thank you for the perf enhancements, but the ONNX runtime upgrade needs to be notated.

This should be released as v2.0.0-alpha or with some other indication that it is not a stable release since it depends on an alpha version of ONNX runtime.

@decahedron1
Copy link
Contributor

This should be released as v2.0.0-alpha or with some other indication that it is not a stable release since it depends on an alpha version of ONNX runtime.

ort v2.0.0-alpha.4 uses ONNX Runtime v1.16.3, so only a minor upgrade as ort 1.16 uses 1.16.0. It does not use an alpha version of ONNX Runtime.

@prattcmp
Copy link
Contributor

prattcmp commented Jan 12, 2024

This should be released as v2.0.0-alpha or with some other indication that it is not a stable release since it depends on an alpha version of ONNX runtime.

ort v2.0.0-alpha.4 uses ONNX Runtime v1.16.3, so only a minor upgrade as ort 1.16 uses 1.16.0. It does not use an alpha version of ONNX Runtime.

Thank you for correcting that the crate is alpha, not the actual runtime. Nevertheless, this release is not stable and should indicate such in its versioning.

P.S. I think its really awesome that you are maintaining this crate for the community! I'm just trying to help :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants