Simple inference from loading traced pytorch models #978
Hi, glad you are interested! I would advise you to see the llama example in candle-examples. If you run …
If the question is more about inference of a model exported from PyTorch by tracing in the TorchScript form, we don't support this at the moment. You may already be aware of this but tch-rs does support it with an example here.
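For context, loading and running a traced model with tch-rs looks roughly like the sketch below. The file name "traced_model.pt" and the input shape are placeholders, assuming a model traced and saved from Python with a single float input:

```rust
use tch::{CModule, Device, Kind, Tensor};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the TorchScript module exported via torch.jit.trace / torch.jit.save
    // (placeholder file name).
    let module = CModule::load("traced_model.pt")?;
    // Dummy input matching the traced signature (placeholder shape).
    let input = Tensor::randn(&[1, 3, 224, 224], (Kind::Float, Device::Cpu));
    // Run the forward pass of the traced graph.
    let output = module.forward_ts(&[input])?;
    println!("output shape: {:?}", output.size());
    Ok(())
}
```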
Yes, perhaps the Tensor type could become a trait and there would be TracedTensor and Tensor? This is an interesting idea.
Oh I don't mean we should support tracing in candle (we actually already do it with the graph that is maintained internally), more that we could support executing a model exported from PyTorch in the TorchScript form. No need for a new trait or anything for this; it would more likely be an external crate rather than done in …
Ah, ok. Supporting TorchScript might be another candle-xxx crate?
@LaurentMazare thanks for your reply. I'm working at SurrealDB, which is a database written in Rust, and we're building support for inference for traced models exported in PyTorch. We have integrated …
I was just looking at the .pt format: https://pytorch.org/tutorials/beginner/saving_loading_models.html. It looks like it is a serialized … I think another crate is a good idea because it is likely functionality not everyone needs. I would be happy to create it, but let me know. This could be very interesting - a sort of interpreter that could perhaps be JIT-compiled with inkwell.
@maxwellflitton just to propose something slightly different: do you have specific reasons to prefer exporting to TorchScript rather than exporting to ONNX? We've been thinking about having an ONNX runtime for candle for a while, and this would probably be easier as ONNX is a well-documented format and was designed explicitly for interoperability.
True, and TorchScript models can be converted to ONNX.
@LaurentMazare As long as we can do inference on models alongside neural networks with ONNX, I'm down for that. A lot of models in production are random forests, and hummingbird has managed to convert sklearn random forests to PyTorch models. It's an exciting time to build inference libraries. Because of the embedded nature of the database, I really want the ONNX C++ library to be directly fused with the Rust binary. What do you think about building a Rust wrapper for the ONNX library using the cxx crate? If we can take this approach, I think I have a strong case for my place of work to allocate resources to it.
If you're thinking about the onnxruntime C++ library, there are already pretty good Rust bindings for it - and you can probably statically link your binary with it, so no need to use the cxx crate or anything, just a …
I agree, as it would be more cohesive and we would need a wrapper for …
@LaurentMazare that would be amazing! Would definitely be up for this, as it would really lean into our desire to have pure Rust in our database given its embedded and WASM nature. Is there anywhere we can work on a roadmap? I'm going to get the basic C++ working in the database, which shouldn't take too long, and then I'm happy to commit to building an ONNX interpreter for candle. Is there anywhere we can draft tasks or define a layout?
Really up to you; I would suggest creating a specific repo & crate for the candle-onnx interpreter and either checking in some design documents in the repo or using the wiki/issues/... not sure what people typically use for this.
There is already an ONNX runtime written in Rust: wonnx.
Looks like it is its own entity, so its types and API would not match Candle's. Perhaps we could base a custom implementation off of wonnx. @maxwellflitton, what are your thoughts on this vs. implementing from scratch?
There is also tract, which can load ONNX files.
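For comparison, running an ONNX model with tract is roughly the sketch below; the file name, shape, and dummy input are placeholders, and the exact API may differ between tract versions:

```rust
use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    // Load the ONNX graph, pin the input shape, optimize, and make it runnable.
    let model = tract_onnx::onnx()
        .model_for_path("model.onnx")?
        .with_input_fact(0, f32::fact([1, 3, 224, 224]).into())?
        .into_optimized()?
        .into_runnable()?;

    // Dummy input with the same (placeholder) shape.
    let input: Tensor = tract_ndarray::Array4::<f32>::zeros((1, 3, 224, 224)).into();

    // Run the model; the result is a vector of output tensors.
    let outputs = model.run(tvec!(input.into()))?;
    println!("output shape: {:?}", outputs[0].shape());
    Ok(())
}
```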
@EdorianDark thanks, I checked out the …
@maxwellflitton did you get some time to start looking at this? If you didn't and think you may not have time soon, I may take a stab at it - I hope for it to be fairly simple and quite useful to potential users.
@LaurentMazare sorry, not yet. I've had to get ONNX into SurrealDB somehow for our ML feature, so I embedded the ONNX C runtime into the Rust binary, as seen here: https://github.com/surrealdb/surrealml/blob/main/modules/utils/src/execution/onnx_environment.rs
Once that's done and deployed, we'd be more than happy to get involved. If you're starting, do you have a scoped-out project board or anything?
I haven't started yet - I also haven't planned on having a board or anything. My guess is that it's something like a day or two of work (just supporting the ops that are readily available in candle), so I was thinking that it's more a matter of just trying it out rather than doing some planning ahead.
I've merged some preliminary support for onnx; for now it's in …
cargo run --example onnx_basics -- simple-eval --file myfile.onnx
As next steps, I'll try adding support for a bunch more ops, hopefully enough to run a couple of non-trivial models.
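Beyond that CLI example, programmatic use of the preliminary candle-onnx support looks roughly like the sketch below; the file name and the input name "data" are placeholders, and the function names reflect the early API so they may change:

```rust
use candle_core::{DType, Device, Tensor};
use std::collections::HashMap;

fn main() -> candle_core::Result<()> {
    // Parse the ONNX protobuf into a ModelProto.
    let model = candle_onnx::read_file("myfile.onnx")?;

    // Build the input map; the key has to match the graph's input name
    // ("data" here is an assumption about the model).
    let input = Tensor::zeros((1, 3, 224, 224), DType::F32, &Device::Cpu)?;
    let mut inputs = HashMap::new();
    inputs.insert("data".to_string(), input);

    // Evaluate the graph node by node and collect the named outputs.
    let outputs = candle_onnx::simple_eval(&model, inputs)?;
    for (name, tensor) in outputs.iter() {
        println!("{name}: {:?}", tensor.dims());
    }
    Ok(())
}
```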
Another update: the current implementation seems good enough to run squeezenet, so I've made this an example, squeezenet-onnx. I'll push on a couple more ops so as to be able to run more models, and after that the next step is likely to be adding the intermediary structure for the compute graph, validating the different ops/inputs/outputs properly, and polishing everything.
Hey, I'm super excited that this library is being built. I've been looking through the documentation and examples and I cannot see any examples of how to load a traced model and perform a simple inference on the loaded model. Is there a way to do this? If not, are there any plans to support such a thing and if so, are there any contribution guidelines?