This is AMuLeT, the Automated Microarchitectural Leak Tester.
AMuLeT is a fuzz-testing framework that uses relational testing to find microarchitectural leakage in simulated CPUs and in Spectre countermeasures modeled in microarchitectural simulators. We compare executions on an emulator (the Unicorn Engine) against executions on a defense implemented in the gem5 microarchitectural simulator. If AMuLeT finds two test cases (program + input) with the same emulated behavior but different microarchitectural behavior (i.e., cache state, branch predictor state, etc.), it flags a microarchitectural leak in the defense.
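To make the relational-testing idea concrete, here is a minimal sketch of the core check (the function names and trace types below are hypothetical, not AMuLeT's actual API): inputs that produce the same emulated (contract) trace must also produce the same microarchitectural (hardware) trace; otherwise, a leak is flagged.

```python
# Minimal sketch of relational testing. All names are hypothetical,
# not AMuLeT's real implementation.
from collections import defaultdict

def find_violations(test_case, inputs, emulate, simulate):
    """emulate() returns the architectural (contract) trace from the
    emulator; simulate() returns the microarchitectural (hardware)
    trace - e.g., cache and branch-predictor state - from gem5."""
    # Group inputs into classes by their contract trace.
    classes = defaultdict(list)
    for inp in inputs:
        classes[emulate(test_case, inp)].append(inp)

    violations = []
    for contract_trace, cls in classes.items():
        # Same contract trace => the hardware traces must match too.
        hw_traces = {simulate(test_case, inp) for inp in cls}
        if len(hw_traces) > 1:
            violations.append((contract_trace, cls))  # microarchitectural leak
    return violations
```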
For more details, please take a look at our ASPLOS'25 paper.
If you use our tool, please cite our paper:
AMuLeT: Automated Design-Time Testing of Secure Speculation Countermeasures. Bo Fu*, Leo Tenenbaum*, David Adler, Assaf Klein, Arpit Gogia, Alaa R. Alameldeen, Marco Guarnieri, Mark Silberstein, Oleksii Oleksenko, and Gururaj Saileshwar
In 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2025
AMuLeT can currently test the following defenses:
- Baseline (no defense)
- InvisiSpec
- CleanupSpec
- STT
- SpecLFB
Please reach out if you would like to add support for your defense; we will be happy to accept a PR.
Hardware Requirements
- AMuLeT supports the x86 ISA, for both the host and the simulated target.
- A 100-core server with 128 GB RAM can run our full campaigns with 100 parallel runs (~50 cores can also run smaller campaigns with 50 parallel runs, which should still detect all our violations).
Software Requirements
- Docker
- We provide the steps to set up rootless Docker, since many shared systems do not give users the root privileges otherwise needed to run Docker. If Docker is already set up, this is not necessary; please run
```bash
docker run hello-world
```
to ensure Docker is correctly set up on your system.
- Dependencies (see https://docs.docker.com/engine/install/ubuntu/):
```bash
for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do sudo apt-get remove $pkg; done
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg libpng-dev libhdf5-dev
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
```
- Install:
```bash
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```
- Post-installation steps, so Docker can be run without sudo (see https://docs.docker.com/engine/install/linux-postinstall/):
```bash
sudo groupadd docker
sudo usermod -aG docker $USER  # Will no longer need sudo perms to run docker!
newgrp docker                  # Or log out & in
```
- Now, verify that your user is a member of the `docker` group:
```bash
# Should output that you are a member of the docker group, like so:
fubof@dolphin:~/code$ groups
docker sudo fubof
```
- Lastly, run these steps to start Docker and verify that it runs:
```bash
sudo service docker start
sudo service --status-all
docker run hello-world
export DOCKER_HOST=unix:///var/run/docker.sock
```
- Then, to build and run a test container for a defense (e.g., STT), and launch the fuzzing campaign inside it:
```bash
cd docker
./dockerRun.sh stt start
docker attach stt
$RVZR_RUN                # Inside container
./dockerRun.sh stt stop  # To end test container
```
- Please see `artifact_evaluation/debug.md` for debugging assistance!
Run `./run_artifact.sh`. Results for each table will be generated and placed in `artifact_evaluation/Table_4_Results.txt`, `artifact_evaluation/Table_5_Results.out`, `artifact_evaluation/Table_5_Results_stt_only.out`, and `artifact_evaluation/Table_6_Results.txt`, respectively.
If you would like to generate each table individually, follow these instructions from within `./docker`:

Run `./dockerRun.sh invisispec start uarch_trace_formats`, or run `Table_4_uarch_trace_formats.sh` inside of the docker container. This will output a LaTeX document containing the table to `docker/docker_invisispec/revizor-docker/artifact_evaluation/Table_4_Results.tex`, and a plaintext listing to `Table_4_Results.txt`.
You can also provide the number of programs as an argument, e.g., `./Table_4_uarch_trace_formats.sh 10`, if you want to try testing with a smaller number of programs first.
With 50 parallel instances, this table will take around 22 hours to generate; with 100 parallel instances, around 76 hours (both tested on an AMD EPYC 7713 @ 3.72 GHz).
Run `./run_benchmarks.sh`. It takes optional arguments `[TEST_CASES] [INPUTS] [PARALLEL_INSTANCES]`.
For example, to benchmark all defenses using 200 test cases, 70 inputs per test case (each input is boosted, so this is 140 total inputs), and 50 instances in parallel:
```bash
./run_benchmarks.sh 200 70 50
```
This will automatically run a campaign for every defense in sequence, and then generate a table with all the run results.
Alternatively, to get more detailed results for a given defense, run `$BENCHMARK_SH <defense>` inside of any defense's docker container.
- Currently supported defenses are `(InvisiSpec|CleanupSpec|STT|SpecLFB)`.

The output of `./run_benchmarks.sh` will be copied over to this directory as `Table_5_Results.out`. Here is some example output:
```
+-------------+----------+---------------------+---------------------------+-------------------------------------+-------------------------------+
| Defense     | Contract | Detected Violation? | Avg. Detection Time (sec) | Testing Throughput (test cases/sec) | Campaign Execution Time (sec) |
+-------------+----------+---------------------+---------------------------+-------------------------------------+-------------------------------+
| InvisiSpec  | CT-SEQ   | YES                 | 5.17                      | 627.03                              | 4465.53                       |
| CleanupSpec | CT-SEQ   | YES                 | 1.75                      | 2318.43                             | 1207.71                       |
| SpecLFB     | CT-SEQ   | YES                 | 1.29                      | 3042.56                             | 920.28                        |
+-------------+----------+---------------------+---------------------------+-------------------------------------+-------------------------------+
```
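If you want to post-process these results programmatically, a parser along these lines should work (a minimal sketch, assuming the exact ASCII-table layout shown above; the function name is ours):

```python
# Sketch: parse the ASCII results table above into a list of dicts.
# Assumes the +---+ / | ... | layout printed into Table_5_Results.out.
def parse_results_table(path):
    with open(path) as f:
        rows = [line.strip() for line in f if line.strip().startswith("|")]
    header = [cell.strip() for cell in rows[0].strip("|").split("|")]
    return [dict(zip(header, (c.strip() for c in r.strip("|").split("|"))))
            for r in rows[1:]]

for row in parse_results_table("Table_5_Results.out"):
    print(row["Defense"], row["Avg. Detection Time (sec)"])
```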
With 50 parallel instances, this table will take around 2.5 hours to generate; with 100 parallel instances, around 4 hours (both tested on an AMD EPYC 7713 @ 3.72 GHz).
We have also created `./run_benchmarks_stt_only.sh`, as this run takes the longest time. It takes the same arguments as `./run_benchmarks.sh`.
The output will be copied over as `Table_5_Results_stt_only.out`. Here is some example output:
```
+---------+----------+---------------------+---------------------------+-------------------------------------+-------------------------------+
| Defense | Contract | Detected Violation? | Avg. Detection Time (sec) | Testing Throughput (test cases/sec) | Campaign Execution Time (sec) |
+---------+----------+---------------------+---------------------------+-------------------------------------+-------------------------------+
| STT     | ARCH-SEQ | YES                 | 6502.58                   | 26.91                               | 104041.32                     |
+---------+----------+---------------------+---------------------------+-------------------------------------+-------------------------------+
```
With 50 parallel instances, this table will take around 18 hours to generate; with 100 parallel instances, around 57 hours (both tested on an AMD EPYC 7713 @ 3.72 GHz).
Run `./dockerRun.sh invisispec start smaller_uarch_structures`, or run `Table_6_Smaller_uarch_structures.sh` inside of the docker container. This will output a LaTeX document containing the table to `docker/docker_invisispec/revizor-docker/artifact_evaluation/Table_6_Results.tex`, and a plaintext listing to `Table_6_Results.txt`.
You can also provide the number of programs as an argument, e.g., `./Table_6_Smaller_uarch_structures.sh 10`, if you want to try testing with a smaller number of programs first.
Note: like any fuzzing framework, we automate vulnerability detection, but root-causing a detected leak is a manual process and is not provided by the above scripts. If you are using AMuLeT to test your own defense, please take a look at our paper to understand how we root-cause identified leaks.
With 50 parallel instances, this table will take around 3 hours to generate; with 100 parallel instances, around 9 hours (both tested on an AMD EPYC 7713 @ 3.72 GHz).
Given defense 'stt':
- `dockerRun.sh` calls `stt.Dockerfile` (in `docker_stt`)
- `stt.Dockerfile` calls `entrypoint.sh`
- `entrypoint.sh`:
  - "Borrows" your host system credentials (the entire `~/.ssh` dir) to pull https://github.com/sith-lab/amulet-gem5/tree/stt
  - Returns a bash shell. Invoke `$RVZR_RUN` to execute the fuzzing campaign.
To facilitate future debugging, please keep run-instantiation code in the respective `docker_<defense>/revizor_run.sh`!
- Look in the respective defense subfolder for run examples (e.g., `docker/docker_stt/revizor_run.sh`):
```bash
# Example usage of Revizor with IPC orchestration
YAML=/code/revizor-docker/docker/docker_stt/docker_gem5_v1_final_cache_tlb.yaml
cd /code/revizor-docker/src; mkdir -p logs
for i in $(seq 1 100); do
  python3.11 ./cli.py fuzz -s x86/isa_spec/base.json --nonstop --ruby --STT --STT_FuturisticSpec -i 70 -n 200 -c $YAML -p stt-$i &> logs/stt-$i.txt &
  echo "Launched stt-$i - Check /code/revizor-docker/src/logs/stt-$i.txt for results"
done
```
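To see which of these parallel campaigns flagged something, you can scan the per-run logs. A rough sketch follows; it assumes the fuzzer prints the word "violation" when one is found, so check your logs' actual wording first:

```python
# Rough sketch: report which parallel stt runs mention a violation.
# ASSUMPTION: the log text contains the word "violation" when one is
# detected; verify against your actual log output.
from pathlib import Path

log_dir = Path("/code/revizor-docker/src/logs")
for log in sorted(log_dir.glob("stt-*.txt")):
    if "violation" in log.read_text(errors="ignore").lower():
        print(f"{log.name}: possible violation reported")
```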
The fuzzer is controlled via a single command-line interface, `cli.py` (located in `src/cli.py`). It accepts the following arguments:
- `-s, --instruction-set PATH` - path to the ISA description file
- `-c, --config PATH` - path to the fuzzing configuration file
- `-n, --num-test-cases N` - number of test cases to be tested
- `-i, --num-inputs N` - number of input classes per test case. The number of actual inputs = input classes * `inputs_per_class`, which is a configuration option
- `-t, --testcase PATH` - use an existing test case instead of generating random test cases
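As a concrete illustration of the `-i` semantics above (the variable names below are ours, chosen for illustration):

```python
# Illustration of -i / --num-inputs: the actual input count per test case
# is the number of input classes times the inputs_per_class config option.
num_input_classes = 70  # passed via -i, as in the examples above
inputs_per_class = 2    # hypothetical config value (input boosting)
actual_inputs = num_input_classes * inputs_per_class
print(actual_inputs)    # -> 140, matching "70 inputs ... 140 total inputs"
```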
- Please see `artifact_evaluation/README.md` to generate all tables from the paper.
Within a docker container, to run a campaign against a defense, run `$BENCHMARK_SH <defense> [test cases] [inputs] [parallel_instances]`.
- Currently supported defenses are `(InvisiSpec|CleanupSpec|STT|SpecLFB)`.
- `[test cases] [inputs] [parallel_instances]` are optional args. By default, we run 200 test cases, each with 70 inputs, and 100 runs in parallel.
- The default is 100 `parallel_instances`. For systems with less parallelism, you can reduce it to 50 (which should also detect our violations).
Within `$RVZR_DIR/src/benchmark-out-<defense>/`, the output for each parallel fuzzing round will appear as `log_round<i>_config000`.
You can check how long the run has been going, as well as its stdout/stderr, in `$RVZR_DIR/src/logs/bench-<defense>.txt`.
Upon the end of a run, a results table will be generated, with this format:
```
==============================================================================================
Config 0: /code/revizor-docker/docker/docker_stt/docker_gem5_v1_final_cache_tlb.yaml
Command line:
python3.11 cli.py fuzz -s /code/revizor-docker/src/x86/isa_spec/base.json -i 2 -n 4 -c /code/revizor-docker/docker/docker_stt/docker_gem5_v1_final_cache_tlb.yaml -p bench_STT_config0_round0 --no-save-stats --result-dir=benchmark-out-STT/config0 --ruby --STT --STT_Futuristic --nonstop
avg_wall_time: 44.64
total_system_time: 446.41
total_test_programs: 30
input_count: 2
total_test_cases: 60
total_violations: 0
avg_violations: 0.00
contract_clause: ARCH-SEQ
detected_violation: NO
avg_detection_time: 0.00
testing_throughput: 1.34
campaign_execution_time: 44.64

Detailed Results - Config 0
===========================
Round    | Execution Time (s)   | Test Cases   | Violations Found   | First Violation (s)
-------- | -------------------- | ------------ | ------------------ | ----------------------
0        | 57.36                | 3            | 0                  | N/A
1        | 46.20                | 3            | 0                  | N/A
2        | 47.06                | 3            | 0                  | N/A
3        | 55.95                | 3            | 0                  | N/A
4        | 35.50                | 3            | 0                  | N/A
5        | 47.16                | 3            | 0                  | N/A
6        | 48.67                | 3            | 0                  | N/A
7        | 27.18                | 3            | 0                  | N/A
8        | 54.97                | 3            | 0                  | N/A
9        | 26.36                | 3            | 0                  | N/A
Mean     | 44.64                | -            | 0.0                | -
Std Dev  | 11.30                | -            | 0.00               | -
-------- | -------------------- | ------------ | ------------------ | ----------------------
==============================================================================================
```
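The Mean and Std Dev rows are plain summary statistics over the per-round execution times; here is a quick worked check against the rounds shown above:

```python
# Worked check of the Mean / Std Dev rows in the detailed results above.
from statistics import mean, stdev

round_times = [57.36, 46.20, 47.06, 55.95, 35.50,
               47.16, 48.67, 27.18, 54.97, 26.36]
print(round(sum(round_times), 2))    # 446.41, matching total_system_time
print(round(mean(round_times), 2))   # 44.64, matching avg_wall_time / Mean
print(round(stdev(round_times), 2))  # 11.3, matching the Std Dev row
```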
- If you plan to extend AMuLeT locally, please fork the `amulet` (core) and `amulet-gem5` (defense) repositories and update the remote addresses in the docker scripts.
- Any changes in your `amulet` or `amulet-gem5` repos must be pushed to the remote, so that the container can pull the new updates during builds.
- You must push new changes from outside the container and pull them inside the container to get updates. Pushing changes from inside the container will cause a git error.
We practice the following conventions for commit messages (similar to microsoft/sca-fuzzer):
```
<scope>: [<type>] <subject>
```
Where:
- `<scope>`: The scope of the change.
- `<type>`: The type of the change.
- `<subject>`: A short description of the change.
The following scopes are typical:

| Scope | Description |
|---|---|
| `all` | Changes that affect the entire project (e.g., major refactoring) |
| `docker` | Docker config changes |
| `root` | Root directory changes (e.g., readme, git, author list) |
| `fuzz` | Changes to the core fuzzer algorithm |
| `cli` | Changes to the command-line interface |
| `exec` | Changes to the executor |
| `model` | Changes to the model |
| `analyser` | Changes to the analyser |
| `mini` | Changes to the postprocessor (i.e., minimizer) |
| `gen` | Changes to the program generator |
| `input_gen` | Changes to the input generator |
| `tests` | Changes to the tests |
| `isa` | Changes to the ISA loader or to get_spec files |
If a commit covers several scopes, use the most relevant one.
If a commit targets a specific architecture (e.g., x86), add the architecture to the scope (e.g., `fuzz/x86`).
Use one of the following types:

| Type | Description |
|---|---|
| `feat` | A new feature |
| `fix` | A bug fix |
| `docs` | Documentation changes |
| `chore` | Changes to the build process or auxiliary tools |
| `ft` | Fault tolerance changes (e.g., adding error handling or recovery mechanisms) |
| `refact` | Refactoring of the codebase (this includes code style changes) |
| `perf` | Performance improvements |
| `revert` | Reverts a previous commit |
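Putting the convention together, here are a few illustrative (hypothetical) commit messages:

```
fuzz: [feat] add support for a new leakage contract
fuzz/x86: [fix] correct operand handling in the x86 generator
docker: [chore] update base image for the STT container
```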