This repository shows an example of using Apache Flink to serve an ONNX model created with PyTorch. While the initial model is simple, it demonstrates the process by which a machine learning model can be created in Python and then served by a Scala program written for streaming with Flink.
The `training/` directory provides two example Jupyter notebooks that use PyTorch to create simple networks. The networks are exported to ONNX through PyTorch. The simple network takes a variable-length floating point tensor and adds a constant offset. The MNIST network solves the classic handwritten digit classification problem that is a mainstay of neural network tutorials.
The ONNX models are packaged as resources so that they can be distributed to the Flink TaskManagers in the same JAR as the code.
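As a minimal sketch of what this looks like at runtime (the model file name here is an assumption for illustration), a resource bundled this way can be read back through the classloader:

```scala
// Files placed under src/main/resources/ are copied into the assembly JAR,
// so the model ships to every TaskManager alongside the classes.
// "/add_five.onnx" is an assumed name, not necessarily the repo's actual file.
val modelStream = getClass.getResourceAsStream("/add_five.onnx")
```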
The Flink application follows the standard template for SBT. The JAR can be built using `sbt` as follows:

```
sbt clean assembly
```
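A sketch of the build settings that would support this (the versions, coordinates, and sbt-assembly usage below are assumptions, not the repository's actual `build.sbt`):

```scala
// build.sbt (sketch; details are assumptions)
ThisBuild / scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // Flink is "provided": the cluster supplies it at runtime.
  "org.apache.flink" %% "flink-streaming-scala" % "1.13.0" % "provided",
  // ONNX Runtime's Java bindings, bundled into the fat JAR.
  "com.microsoft.onnxruntime" % "onnxruntime" % "1.8.1"
)

// The sbt-assembly plugin (declared in project/plugins.sbt) builds the fat JAR.
assembly / assemblyJarName := "flink-onnx-pytorch-assembly-0.0.1.jar"
```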
The Scala code is also tested to ensure quality; the tests are run with `sbt test` and exercise inference of the ONNX models.
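As a sketch of what such a test can look like (the resource name, the `OrtModel` method names, and the expected values are assumptions, using the hypothetical wrapper API sketched further below, not the repository's actual suite):

```scala
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

// Hypothetical test: the resource name, the OrtModel API, and the expected
// offset of five are assumptions rather than the repository's actual code.
class SimpleModelSpec extends AnyFlatSpec with Matchers {
  "the simple ONNX model" should "add a constant offset to each element" in {
    val model = new OrtModel("/add_five.onnx")
    try model.run(Array(1.0f, 2.0f, 3.0f)) shouldBe Array(6.0f, 7.0f, 8.0f)
    finally model.close()
  }
}
```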
The contents of the JAR can be inspected to see the included ONNX models and the `onnxruntime` library:

```
jar tf ./target/scala-2.11/flink-onnx-pytorch-assembly-0.0.1.jar
```
The ONNX model is served using the `onnxruntime` library through its Java API, which provides an interface similar to the Python one. The `OrtModel` class is added as a convenience wrapper to handle loading models from the resources directory.
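A minimal sketch of such a wrapper, assuming a 1 × n `Float` input tensor and made-up method names (the repository's actual `OrtModel` may differ):

```scala
import java.util.Collections

import ai.onnxruntime.{OnnxTensor, OrtEnvironment, OrtSession}

// Sketch of a resource-backed wrapper around an ONNX Runtime session.
// The real OrtModel class may differ; names here are assumptions.
class OrtModel(resourcePath: String) extends AutoCloseable {
  private val env = OrtEnvironment.getEnvironment()

  private val session: OrtSession = {
    val in = getClass.getResourceAsStream(resourcePath)
    val bytes =
      try Iterator.continually(in.read()).takeWhile(_ != -1).map(_.toByte).toArray
      finally in.close()
    env.createSession(bytes)
  }

  // Run a single 1 x n Float tensor through the model and return the flat output.
  def run(values: Array[Float]): Array[Float] = {
    val input = OnnxTensor.createTensor(env, Array(values))
    try {
      val result = session.run(
        Collections.singletonMap(session.getInputNames.iterator.next(), input))
      try result.get(0).getValue.asInstanceOf[Array[Array[Float]]].head
      finally result.close()
    } finally input.close()
  }

  override def close(): Unit = session.close()
}
```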
Running the model inference for the simple case is wrapped by the `AddFive` class, which extends `RichMapFunction`. This allows the `open` and `close` methods to handle loading and closing the model, while the `map` method runs the input value through the ONNX model. The types need to be handled carefully, since fractional values default to `Double` in Scala unless explicitly typed, while the model operates on `Float` values.
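A sketch of that pattern, reusing the hypothetical `OrtModel` wrapper from above (the actual class body may differ):

```scala
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration

// Sketch of the open/close/map pattern; names and types are assumptions.
class AddFive extends RichMapFunction[Float, Float] {
  @transient private var model: OrtModel = _

  // Load the ONNX model once per task, before any records arrive.
  override def open(parameters: Configuration): Unit =
    model = new OrtModel("/add_five.onnx") // assumed resource name

  // Release the native session when the task shuts down.
  override def close(): Unit =
    if (model != null) model.close()

  // Run one value through the model. Note the Float types: Scala fractional
  // literals default to Double, but the model here works on 32-bit floats.
  override def map(value: Float): Float =
    model.run(Array(value)).head
}
```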
With the JAR built, the Flink jobs can be submitted. For the simple example, the job reads a static dataset of numbers, which could easily be replaced by another source. The results are written to stdout for simplicity, which could likewise be replaced by another sink. The DataStream API is used throughout.
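A sketch of how such a job can be wired together (a hypothetical reconstruction, not the repository's actual `Job`):

```scala
import org.apache.flink.streaming.api.scala._

// Sketch of the job wiring: a static source, the inference map, a stdout sink.
object Job {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    env
      .fromElements(1.0f, 2.0f, 3.0f, 4.0f) // the f suffix keeps these Float, not Double
      .map(new AddFive)
      .print()

    env.execute("flink-onnx-pytorch simple example")
  }
}
```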
To run the JAR on a local cluster, make sure to start the cluster first:

```
./flink-1.13.0/bin/start-cluster.sh
```
The `com.datacolin.Job` class can be submitted to the cluster through the CLI:

```
./flink-1.13.0/bin/flink run -c com.datacolin.Job ./flink-onnx-pytorch/target/scala-2.11/flink-onnx-pytorch-assembly-0.0.1.jar
```
The local cluster writes its output to the `log/` directory, which can be watched:

```
tail -f ./flink-1.13.0/log/*.out
```
The numbers from the static dataset should appear in the output with the offset added.
The MNIST example can be run using the `ai.vectra.Job` class.
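Presumably it is submitted the same way, swapping in the other entry point (this command is assembled by analogy with the one above, not taken from the repository):

```
./flink-1.13.0/bin/flink run -c ai.vectra.Job ./flink-onnx-pytorch/target/scala-2.11/flink-onnx-pytorch-assembly-0.0.1.jar
```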
This demonstration provides the basic working pieces to get a Python machine learning model running in a streaming Scala application in Flink. There are a number of interesting real-time applications to try next.