Add Llama3.1-8B benchmark with disabled collective matmul #1317

bvandermoon · 2025-02-27T01:43:58Z

Description

Add a new Trillium benchmark that runs Llama3.1-8B with collective matmuls disabled. On v6e-8, throughput improved from ~350 TFLOP/s/device to ~410-420 TFLOP/s/device after this change.
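
For reference, the relative gain implied by those approximate numbers can be worked out with a quick sketch. The helper name `relative_gain` is illustrative only and is not part of the repository; the inputs are the rounded figures quoted above, not fresh measurements.

```python
def relative_gain(before: float, after: float) -> float:
    """Return the fractional improvement of `after` over `before`."""
    return (after - before) / before


# Approximate per-device throughput quoted in the description (TFLOP/s/device).
baseline = 350.0
improved_low, improved_high = 410.0, 420.0

low = relative_gain(baseline, improved_low)
high = relative_gain(baseline, improved_high)
print(f"{low:.1%} to {high:.1%}")  # prints "17.1% to 20.0%"
```

So the reported change corresponds to roughly a 17-20% throughput increase per device, within the precision of the rounded figures.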

This change is needed to support adding a reproducible recipe for v6e-8.

Tests

Ran this benchmark on v6e-8 using the following command:

python3 benchmarks/benchmark_runner.py xpk \
    --project=$PROJECT \
    --zone=$ZONE \
    --device_type=v6e-8 \
    --num_slices=1  \
    --cluster_name=${CLUSTER_NAME} \
    --base_output_directory=${OUTPUT_DIR} \
    --model_name="llama3_1_8b_8192_no_collective_matmul" \
    --libtpu_version=20241209 \
    --base_docker_image=maxtext_base_image

I got the performance described above. I also confirmed in the profile that the collective matmuls previously present in the MLP layer are now gone.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed.

gemini-code-assist · 2025-02-27T01:44:01Z

Important

The terms of service for this installation have not been accepted. Please ask the Organization owners to visit the Gemini Code Assist Admin Console to sign them.

Add Llama3.1-8B benchmark with disabled collective matmul

67a239d

bvandermoon requested review from gobbleturk, khatwanimohit, vipannalla, RissyRan, richjames0, rni418 and gagika as code owners February 27, 2025 01:43
