Update benchmark numbers (OpenNMT#524)
guillaumekln authored Jul 26, 2021
1 parent 065548a commit 4c0536d
Showing 3 changed files with 19 additions and 9 deletions.
16 changes: 9 additions & 7 deletions README.md
@@ -365,6 +365,7 @@ We compare CTranslate2 with OpenNMT-py and OpenNMT-tf on their pretrained Englis
| - int16 | 187MB |
| - float16 | 182MB |
| - int8 | 100MB |
+| - int8 + float16 | 95MB |

CTranslate2 models are generally lighter and can go as low as 100MB when quantized to int8. This also results in a fast loading time and noticeably lower memory usage at runtime.

@@ -387,10 +388,10 @@ See the directory [`tools/benchmark`](tools/benchmark) for more details about th
| OpenNMT-tf 2.19.0 (with TensorFlow 2.5.0) | 364.1 | 2620MB | 26.93 |
| OpenNMT-py 2.1.2 (with PyTorch 1.9.0) | 472.6 | 1856MB | 26.77 |
| - int8 | 510.4 | 1712MB | 26.80 |
-| CTranslate2 2.1.0 | 1185.5 | 1091MB | 26.77 |
-| - int16 | 1531.2 | 944MB | 26.83 |
-| - int8 | 1758.6 | 795MB | 26.86 |
-| - int8 + vmap | 2167.8 | 788MB | 26.70 |
+| CTranslate2 2.3.0 | 1182.3 | 1037MB | 26.77 |
+| - int16 | 1532.0 | 954MB | 26.83 |
+| - int8 | 1785.2 | 810MB | 26.86 |
+| - int8 + vmap | 2263.4 | 692MB | 26.70 |

Executed with 8 threads on a [*c5.metal*](https://aws.amazon.com/ec2/instance-types/c5/) Amazon EC2 instance equipped with an Intel(R) Xeon(R) Platinum 8275CL CPU.

@@ -400,9 +401,10 @@ Executed with 8 threads on a [*c5.metal*](https://aws.amazon.com/ec2/instance-ty
| --- | --- | --- | --- | --- |
| OpenNMT-tf 2.19.0 (with TensorFlow 2.5.0) | 1815.2 | 2660MB | 1724MB | 26.93 |
| OpenNMT-py 2.1.2 (with PyTorch 1.9.0) | 1536.7 | 3046MB | 2987MB | 26.77 |
-| CTranslate2 2.1.0 | 3726.4 | 1266MB | 676MB | 26.77 |
-| - int8 | 5190.3 | 978MB | 567MB | 26.82 |
-| - float16 | 5361.3 | 786MB | 606MB | 26.75 |
+| CTranslate2 2.3.0 | 3696.7 | 1234MB | 555MB | 26.77 |
+| - int8 | 5201.9 | 946MB | 565MB | 26.82 |
+| - float16 | 5303.5 | 818MB | 607MB | 26.75 |
+| - int8 + float16 | 5824.3 | 722MB | 566MB | 26.88 |

Executed with CUDA 11 on a [*g4dn.xlarge*](https://aws.amazon.com/ec2/instance-types/g4/) Amazon EC2 instance equipped with an NVIDIA T4 GPU (driver version: 460.73.01).
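
The relative change between the CTranslate2 2.1.0 and 2.3.0 GPU rows can be computed directly from the table. The following is a minimal sketch and not part of the commit: the dictionary layout and the helper names `percent_change`, `compare`, and `speedup` are hypothetical, the key `default` stands for the row without a quantization label, and the first two columns are assumed to be tokens per second and maximum GPU memory in MB, since the header row falls outside the visible hunk.

```python
# GPU rows copied from the table in this commit.
# Assumed layout: name -> (tokens per second, max GPU memory in MB).
GPU_2_1_0 = {
    "default": (3726.4, 1266),
    "int8": (5190.3, 978),
    "float16": (5361.3, 786),
}
GPU_2_3_0 = {
    "default": (3696.7, 1234),
    "int8": (5201.9, 946),
    "float16": (5303.5, 818),
    "int8_float16": (5824.3, 722),
}


def percent_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


def compare(old_rows, new_rows):
    """Compare the rows present in both versions.

    Returns name -> (throughput change in %, GPU memory change in %).
    Rows that exist only in new_rows are skipped.
    """
    changes = {}
    for name, (old_tps, old_mem) in old_rows.items():
        if name not in new_rows:
            continue
        new_tps, new_mem = new_rows[name]
        changes[name] = (
            percent_change(old_tps, new_tps),
            percent_change(old_mem, new_mem),
        )
    return changes


def speedup(rows, name, baseline):
    """Return the throughput of one row relative to a baseline row."""
    return rows[name][0] / rows[baseline][0]


if __name__ == "__main__":
    for name, (tps, mem) in compare(GPU_2_1_0, GPU_2_3_0).items():
        print(f"{name}: throughput {tps:+.2f}%, GPU memory {mem:+.2f}%")
    ratio = speedup(GPU_2_3_0, "int8_float16", "default")
    print(f"int8_float16 vs default (2.3.0): {ratio:.2f}x")
```

Computed this way, the new `int8 + float16` row reaches about 1.58 times the throughput of the unquantized 2.3.0 row on this benchmark.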

1 change: 1 addition & 0 deletions tools/benchmark/benchmark_pretrained.py
@@ -89,6 +89,7 @@ def run(name, image, env=None):
 if gpu:
     run("- int8", ctranslate2, env={"COMPUTE_TYPE": "int8"})
     run("- float16", ctranslate2, env={"COMPUTE_TYPE": "float16"})
+    run("- int8 + float16", ctranslate2, env={"COMPUTE_TYPE": "int8_float16"})
 else:
     run("- int16", ctranslate2, env={"COMPUTE_TYPE": "int16"})
     run("- int8", ctranslate2, env={"COMPUTE_TYPE": "int8"})
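
The benchmark script selects a quantization mode by passing a `COMPUTE_TYPE` environment variable to the container. How the image consumes that variable is not shown in this diff, so the sketch below is only one plausible way to forward it to the `compute_type` argument of `ctranslate2.Translator`. The helper names `resolve_compute_type` and `load_translator`, and the fallback to `"default"` when the variable is unset, are assumptions for illustration.

```python
import os

# Compute types exercised by the benchmark script in this commit.
GPU_COMPUTE_TYPES = ("int8", "float16", "int8_float16")
CPU_COMPUTE_TYPES = ("int16", "int8")


def resolve_compute_type(environ=None, default="default"):
    """Read COMPUTE_TYPE from the environment (hypothetical helper).

    Falls back to "default" when the variable is unset or empty, which
    leaves the choice of computation type to CTranslate2.
    """
    environ = os.environ if environ is None else environ
    return environ.get("COMPUTE_TYPE") or default


def load_translator(model_dir, device):
    """Build a translator using the resolved compute type (hypothetical helper)."""
    # Imported lazily so the helper above can be used without the package.
    import ctranslate2

    return ctranslate2.Translator(
        model_dir,
        device=device,
        compute_type=resolve_compute_type(),
    )
```

With `COMPUTE_TYPE=int8_float16` set, `load_translator("/model", "cuda")` would request the combined int8 and float16 mode that this commit adds to the GPU benchmark.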
@@ -1,4 +1,11 @@
-FROM opennmt/ctranslate2:2.1.0-ubuntu20.04-cuda11.2 as model_converter
+FROM opennmt/ctranslate2:2.3.0-ubuntu20.04-cuda11.2 as model_converter

+RUN apt-get update && \
+    apt-get install -y --no-install-recommends \
+        wget \
+        && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*
+
RUN wget -q https://opennmt-models.s3.amazonaws.com/transformer-ende-wmt-pyOnmt.tar.gz && \
tar xf *.tar.gz && \
@@ -10,7 +17,7 @@ RUN ct2-opennmt-py-converter --model_path averaged-10-epoch.pt --output_dir /mod
RUN wget -q -P /model https://opennmt-models.s3.amazonaws.com/vmap.txt
RUN cp sentencepiece.model /model

-FROM opennmt/ctranslate2:2.1.0-ubuntu20.04-cuda11.2
+FROM opennmt/ctranslate2:2.3.0-ubuntu20.04-cuda11.2

COPY --from=model_converter /model /model
