
Enable AMP (Automatic Mixed Precision) in TensorFlow Serving #1583

Open
whatdhack opened this issue Mar 25, 2020 · 11 comments

Comments

@whatdhack

Describe the problem the feature is intended to solve

AMP accelerates inference significantly.

Describe the solution

A flag for enabling AMP

Describe alternatives you've considered

There is no alternative with TensorFlow Serving.

Additional context

N/A

@whatdhack
Author

whatdhack commented Apr 3, 2020

I think this should be a very high priority (at the least FP16); otherwise the case for TFS becomes weak.

@shadowdragon89
Contributor

AMP is mainly targeted at training rather than serving (https://www.tensorflow.org/guide/keras/mixed_precision).

Have you observed a significant performance difference for serving as well? If so, could you share the benchmark and related numbers?
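
For reference, the guide linked above enables mixed precision for training roughly like this (a minimal sketch against the TF 2.4+ Keras API; earlier 2.x releases expose the same thing under mixed_precision.experimental):

from tensorflow.keras import mixed_precision

# Layers created after this call compute in float16 while keeping
# float32 variables, which is the training-oriented use of AMP.
mixed_precision.set_global_policy('mixed_float16')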

@whatdhack
Author

whatdhack commented Apr 12, 2020

How do I turn on AMP in serving? I have observed a 50% improvement in processing time with fp16 over fp32, without any noticeable change in accuracy. Reduced precision is one of the cornerstones of NVIDIA TensorRT, etc. See also https://medium.com/@whatdhack/neural-network-inference-optimization-8651b95e44ee .

@whatdhack
Author

Is there a way to do the following in TFS?

import tensorflow as tf

# TF1: ask the grappler optimizer to run the auto mixed precision rewrite.
config = tf.ConfigProto()
config.graph_options.rewrite_options.auto_mixed_precision = 1
sess = tf.Session(config=config)
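
In TF 2.x the same grappler rewrite can be requested per process (a sketch using the public tf.config API; note this only configures the local runtime, not tensorflow_model_server, which is exactly the gap this issue is about):

import tensorflow as tf

# Enable the auto mixed precision graph rewrite for graphs optimized
# in this process (TF 2.x counterpart of the ConfigProto snippet above).
tf.config.optimizer.set_experimental_options({'auto_mixed_precision': True})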

@whatdhack
Author

I just ran some tests on a Mask R-CNN SavedModel in nvcr.io/nvidia/tensorflow:20.03-tf1-py3. TF_ENABLE_AUTO_MIXED_PRECISION seems to work very well for inference - it requires less memory and speeds things up significantly. The following are the numbers, if you need more convincing.

TF_ENABLE_AUTO_MIXED_PRECISION=1: memory = 4.2 GB, inference speed 0.25 sec

vs

default (fp32): memory = 7.1 GB, inference speed 0.53 sec
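
For anyone reproducing this inside that container, the variable has to be in the environment before the TF1 session is created (a sketch of the presumed setup; /path/to/maskrcnn is a placeholder for the exported SavedModel directory, and whether the stock TensorFlow Serving binary honors the same variable is the open question of this issue):

import os

# Set before TensorFlow builds the session so the grappler auto mixed
# precision pass picks it up.
os.environ['TF_ENABLE_AUTO_MIXED_PRECISION'] = '1'

import tensorflow as tf

sess = tf.Session()
tf.saved_model.loader.load(sess, ['serve'], '/path/to/maskrcnn')
# ... feed inputs and run inference as usual.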

@shadowdragon89
Contributor

Thanks for the experiments and numbers! Based on the numbers, we could add the option. I will also follow up with our GPU team.

@jeisinge

jeisinge commented Nov 2, 2020

Any update here? Also, is it possible to enable JIT/XLA as well, like #1515?
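
For the XLA part, TensorFlow exposes a per-process switch (a sketch; roughly equivalent to setting TF_XLA_FLAGS=--tf_xla_auto_jit=2 in the environment, and, as with AMP, there is no documented tensorflow_model_server flag for it):

import tensorflow as tf

# Turn on XLA auto-clustering (JIT compilation) for this process.
tf.config.optimizer.set_jit(True)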

@lre

lre commented Feb 22, 2022

Any update here?

@DerryFitz

I'd really appreciate this feature being added too.

@junA2Z

junA2Z commented May 17, 2023

Hi, Any updates here?

1 similar comment
@BobLiu20

Hi, Any updates here?
