Merge pull request #12 from intelligent-machine-learning/pin_2023_12_27
Pin 2023_12_27
COMMIT_ID=efa6fcfdac5368330a0770e9019649eba08b5f56
wbmc authored Dec 28, 2023
2 parents 21c989d + 81ebc22 commit f5bf0b6
Showing 111 changed files with 3,532 additions and 8,845 deletions.
6 changes: 1 addition & 5 deletions .circleci/common.sh
@@ -69,6 +69,7 @@ function install_deps_pytorch_xla() {
pip install hypothesis
pip install cloud-tpu-client
pip install absl-py
+pip install pandas
pip install --upgrade "numpy>=1.18.5"
pip install --upgrade numba

@@ -150,11 +151,6 @@ function run_torch_xla_python_tests() {
# echo "Running MNIST Test"
# python test/test_train_mp_mnist_amp.py --fake_data --num_epochs=1
fi
-elif [[ "$RUN_XLA_OP_TESTS1" == "xla_op1" ]]; then
-# Benchmark tests.
-# Only run on CPU, for xla_op1.
-echo "Running Benchmark tests."
-./benchmarks/test/run_tests.sh
fi
fi
popd
1 change: 1 addition & 0 deletions .circleci/docker/install_conda.sh
@@ -42,6 +42,7 @@ function install_and_setup_conda() {
/usr/bin/yes | pip install cloud-tpu-client
/usr/bin/yes | pip install expecttest==0.1.3
/usr/bin/yes | pip install absl-py
+/usr/bin/yes | pip install pandas
# Additional PyTorch requirements
/usr/bin/yes | pip install scikit-image scipy==1.6.3
/usr/bin/yes | pip install boto3==1.16.34
2 changes: 1 addition & 1 deletion .github/workflows/_build.yml
@@ -38,7 +38,7 @@ on:
jobs:
build:
runs-on: ${{ inputs.runner }}
-timeout-minutes: 90
+timeout-minutes: 180
outputs:
docker-image: ${{ steps.upload-docker-image.outputs.docker-image }}
env:
1 change: 1 addition & 0 deletions .kokoro/common.sh
@@ -59,6 +59,7 @@ function install_deps_pytorch_xla() {
pip install hypothesis
pip install cloud-tpu-client
pip install absl-py
+pip install pandas
pip install --upgrade "numpy>=1.18.5"
pip install --upgrade numba

6 changes: 3 additions & 3 deletions TROUBLESHOOTING.md
@@ -43,7 +43,7 @@ vm:~$ git clone --branch r2.1 https://github.com/pytorch/xla.git
vm:~$ python xla/test/test_train_mp_imagenet.py --fake_data
```

-If you can get the resnet to run we can conclude that torch_xla is installed correctly. 
+If you can get the resnet to run we can conclude that torch_xla is installed correctly.


## Performance Debugging
@@ -64,10 +64,10 @@ The debugging tool will analyze the metrics report and provide a summary. Some e

```
pt-xla-profiler: CompileTime too frequent: 21 counts during 11 steps
-pt-xla-profiler: TransferFromServerTime too frequent: 11 counts during 11 steps
+pt-xla-profiler: TransferFromDeviceTime too frequent: 11 counts during 11 steps
pt-xla-profiler: Op(s) not lowered: aten::_ctc_loss, aten::_ctc_loss_backward, Please open a GitHub issue with the above op lowering requests.
pt-xla-profiler: CompileTime too frequent: 23 counts during 12 steps
-pt-xla-profiler: TransferFromServerTime too frequent: 12 counts during 12 steps
+pt-xla-profiler: TransferFromDeviceTime too frequent: 12 counts during 12 steps
```
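The profiler output shown above comes from torch_xla's client-side debugging facility, which is conventionally switched on via an environment variable. A usage sketch (the training script is the one from the installation check above; exact flag behavior may vary across releases):

```shell
# Enable pt-xla-profiler / debug-metric output for a training run.
PT_XLA_DEBUG=1 python xla/test/test_train_mp_imagenet.py --fake_data
```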

### Compilation & Execution Analysis
17 changes: 8 additions & 9 deletions benchmarks/README.md
@@ -90,13 +90,12 @@ python xla/benchmarks/result_analyzer.py --output-dirname=experiment_results

## Aggregating results

-Generated reports (e.g. `metric_report.csv` files mentioned above) can be
-aggregated to track performance improvements over time with the `aggregate.py`
-script.
+Aggregate reports can be generated directly from the output JSONL files
+(i.e., skipping `result_analyzer.py` altogether) with the `aggregate.py` script.
The script compares Pytorch/XLA performance numbers against Inductor numbers.
Because Inductor's performance also changes over time, the script takes
-the oldest Inductor performance numbers present in the CSV files (as determined
-by their timestamp) as the baseline for each benchmark.
+the oldest Inductor performance numbers present in the JSONL files (as
+determined by the records' timestamp) as the baseline for each benchmark.

Sample runs and sample output:

@@ -106,7 +105,7 @@ Sample runs and sample output:
- Note 3: we are using ASCII output here just to avoid checking in PNG files.

```
-$ python3 aggregate.py --accelerator=v100 --test=inference -i /tmp/test --format=png --report=histogram
+$ python3 aggregate.py --accelerator=v100 --test=inference --format=png --report=histogram /tmp/test/*.jsonl
Histogram of Speedup over Oldest Benchmarked Inductor
1.2 +------------------------------------------------------------------+
@@ -126,7 +125,7 @@ $ python3 aggregate.py --accelerator=v100 --test=inference -i /tmp/test --format
2000 2005 2010 2015 2020 2025 2030 2035 2040 2045
Date
-$ python3 aggregate.py --accelerator=v100 --test=inference -i /tmp/test --format=png --report=speedup
+$ python3 aggregate.py --accelerator=v100 --test=inference --format=png --report=speedup /tmp/test/*.jsonl
Geomean Speedup Over Oldest Benchmarked Inductor
1 +----------------------------------------------+
@@ -145,7 +144,7 @@ $ python3 aggregate.py --accelerator=v100 --test=inference -i /tmp/test --format
0.4 +----------------------------------------------+
2000 2005 2010 2015 2020 2025 2030 2035 2040 2045
Date
-$ python3 aggregate.py --accelerator=v100 --test=inference -i /tmp/test --format=png --report=latest
+$ python3 aggregate.py --accelerator=v100 --test=inference --format=png --report=latest /tmp/test/*.jsonl
Speedup Over Oldest Benchmarked Inductor as of 2023-11-11
1.8 +----------------------------------------------+
@@ -168,7 +167,7 @@ Speedup Over Oldest Benchmarked Inductor as of 2023-11-11

The last plot shows the "latest" snapshot for all benchmarks ("Workload" on the
plot), sorting them by speedup. That is, it shows the speedup of both Inductor
-and Pytorch/XLA over the oldest Inductor data point that we have in the CSV
+and Pytorch/XLA over the oldest Inductor data point that we have in the JSONL
files. (Note: to reiterate, because we are plotting data from single day,
Inductor gets speedup == 1 for all benchmarks). This plot also shows the
correctness gap between Pytorch/XLA and Inductor; there are benchmarks that do
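The baseline logic this README describes, taking the oldest Inductor record per benchmark as the reference and reporting every later record as a speedup over it, can be sketched in a few lines. This is a hypothetical simplification, not the actual `aggregate.py` code, and the field names (`timestamp`, `backend`, `workload`, `runtime_s`) are illustrative, not the real JSONL schema:

```python
import json

# Hypothetical JSONL records (one JSON object per line) mimicking the shape
# of the benchmark output; the real files use different field names.
lines = [
    '{"timestamp": 1, "backend": "inductor", "workload": "resnet50", "runtime_s": 2.0}',
    '{"timestamp": 2, "backend": "inductor", "workload": "resnet50", "runtime_s": 1.6}',
    '{"timestamp": 2, "backend": "xla", "workload": "resnet50", "runtime_s": 1.0}',
]
records = [json.loads(line) for line in lines]

# Baseline: the oldest Inductor runtime per workload, by record timestamp.
baseline = {}
for rec in sorted(records, key=lambda r: r["timestamp"]):
    if rec["backend"] == "inductor":
        baseline.setdefault(rec["workload"], rec["runtime_s"])

# Speedup of every record over that baseline (higher is better); the oldest
# Inductor record itself gets speedup == 1 by construction.
speedups = {
    (r["backend"], r["timestamp"]): baseline[r["workload"]] / r["runtime_s"]
    for r in records
}
print(speedups)
```

By construction the oldest Inductor data point always lands at speedup 1, which is why the plots above show Inductor at 1 when only a single day of data is present.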
