From 5eaf831f184a219b71ff55f0ca2a48af16b4c26a Mon Sep 17 00:00:00 2001 From: John Kressel Date: Sat, 18 Jan 2025 13:05:10 +0000 Subject: [PATCH] Updates to HiPEAC 2025 tutorial --- docs/tutorials/hipeac2025/README.md | 37 ++++++++++--------- docs/tutorials/hipeac2025/exercise1/README.md | 4 +- docs/tutorials/hipeac2025/exercise2/README.md | 2 +- docs/tutorials/hipeac2025/exercise3/README.md | 2 +- docs/tutorials/hipeac2025/exercise4/README.md | 14 +++---- .../hipeac2025/introduction/README.md | 2 +- 6 files changed, 31 insertions(+), 30 deletions(-) diff --git a/docs/tutorials/hipeac2025/README.md b/docs/tutorials/hipeac2025/README.md index 1c08d32d..44adb3ed 100644 --- a/docs/tutorials/hipeac2025/README.md +++ b/docs/tutorials/hipeac2025/README.md @@ -16,6 +16,26 @@ > [!NOTE] > MAMBO can be also run natively, without Docker, on Armv8 Linux machines. Speak to us if you wish to do so and have any problems. +Complete exercises in the following order: + +## Introduction +Follow the [link](introduction/README.md) to start with the Introduction. + +## Exercise 1 - Callbacks and scan-time code analysis +Follow the [link](exercise1/README.md) to start Exercise 1. + +## Exercise 2 - Extending Scan-time Analysis + Follow the [link](exercise2/README.md) to start Exercise 2. + +## Exercise 3 - Run-time Instrumentation + Follow the [link](exercise3/README.md) to start Exercise 3. + +## Exercise 4 - Advanced Instrumentation + Follow the [link](exercise4/README.md) to start Exercise 4. + +## Appendix + Follow the [link](appendix/README.md) to start the additional exercises. + > [!NOTE] > After completing Exercise 1 you can either continue with your current code or start from the code template provided for you in subsequent exercises. @@ -59,20 +79,3 @@ ``` -## Introduction -Follow the [link](introduction/README.md) to start with the Introduction. - -## Exercise 1 - Callbacks and scan-time code analysis -Follow the [link](exercise1/README.md) to start Exercise 1. 
- -## Exercise 2 - Extending Scan-time Analysis - Follow the [link](exercise2/README.md) to start Exercise 2. - -## Exercise 3 - Run-time Instrumentation - Follow the [link](exercise3/README.md) to start Exercise 3. - -## Exercise 4 - Advanced Instrumentation - Follow the [link](exercise4/README.md) to start Exercise 4. - -## Appendix - Follow the [link](appendix/README.md) to start the additional exercises. diff --git a/docs/tutorials/hipeac2025/exercise1/README.md b/docs/tutorials/hipeac2025/exercise1/README.md index 9162c31d..e6654345 100644 --- a/docs/tutorials/hipeac2025/exercise1/README.md +++ b/docs/tutorials/hipeac2025/exercise1/README.md @@ -75,7 +75,7 @@ The `mambo_register_pre_basic_block_cb` event runs just before scanning a single int mambo_register_pre_basic_block_cb(mambo_context *ctx, mambo_callback cb); ``` -The `mambo_callback` is simply a pointer to a function with the following signature: +The `mambo_callback` is simply a pointer to a function, called when an event occurs, with the following signature: ```c int (*mambo_callback)(mambo_context *ctx); @@ -93,7 +93,7 @@ int mambo_register_post_basic_block_cb(mambo_context *ctx, mambo_callback cb); ``` >[!TIP] -> This callback can be used to backpatch instrumentation in the basic block based on information not available earlier (e.g. basic block size). +> This callback can be used to backpatch instrumentation in the basic block based on information that is not available until the whole basic block has been scanned (e.g. basic block size). > [!NOTE] > It is important to note that these callbacks enable analysis at **scan time**. 
diff --git a/docs/tutorials/hipeac2025/exercise2/README.md b/docs/tutorials/hipeac2025/exercise2/README.md index e1009e1e..a133f0c3 100644 --- a/docs/tutorials/hipeac2025/exercise2/README.md +++ b/docs/tutorials/hipeac2025/exercise2/README.md @@ -26,7 +26,7 @@ From outside they behave exactly as standard memory management routines, however ### MAMBO Hash Map -MAMBO provides a simply and light-weight hash map implementation for storing data within the plugin. It support three main operations: +MAMBO provides a simple and light-weight hash map implementation for storing data within the plugin. It supports three main operations: ```c int mambo_ht_init(mambo_ht_t *ht, size_t initial_size, int index_shift, int fill_factor, bool allow_resize); diff --git a/docs/tutorials/hipeac2025/exercise3/README.md b/docs/tutorials/hipeac2025/exercise3/README.md index 9ed682f4..c761849d 100644 --- a/docs/tutorials/hipeac2025/exercise3/README.md +++ b/docs/tutorials/hipeac2025/exercise3/README.md @@ -54,7 +54,7 @@ Replaces the incorrect: ## Step 2: Evaluation -Now, the `test` binary can be run with the modified plugin. Notice the output of the modified version of the plugin. The previously incorrect basic blocks count should display correct values. +Now, the `test` binary can be run with the modified plugin. Notice the output of the modified version of the plugin. The previously incorrect basic block count should now display correct values. Note that you will see a large number of basic blocks because the count includes basic blocks executed before and after `main`, such as those in libc. 
## Next Steps 👏 diff --git a/docs/tutorials/hipeac2025/exercise4/README.md b/docs/tutorials/hipeac2025/exercise4/README.md index 37e1b903..62722272 100644 --- a/docs/tutorials/hipeac2025/exercise4/README.md +++ b/docs/tutorials/hipeac2025/exercise4/README.md @@ -24,7 +24,7 @@ The second steps describes MAMBO facilities for decoding individual instructions ### PIE -MAMBO uses [PIE](https://github.com/beehive-lab/pie) (MAMBO custom instruction encoder/decoder generator) to generate functions for instruction decoding and encoding. Those are fairly low-level utilities, that closely follow conventions of the [ARM Architecture Reference Manual](https://developer.arm.com/documentation/ddi0487/ja/) and [RISC-V Specification](https://drive.google.com/file/d/1uviu1nH-tScFfgrovvFCrj7Omv8tFtkp/view). For ARM64, certain instructions are aggregated under the same basic type, and further decoding of specific fields may be required to identify the specific instruction. A good example of this concept is the MOV instruction, which moves a value from one register to another. This is decoded by MAMBO as an ADD instruction, since in the ARMv8 ISA, MOV Xd, Xn is simply an alias for the true operation in the hardware which adds the contents of the zero register to the value in register Xn and places the result in register Xd (ADD Xd, Xn, Xzr). +MAMBO uses PIE (MAMBO custom instruction encoder/decoder generator, found in `$MAMBO_ROOT/pie`) to generate functions for instruction decoding and encoding. Those are fairly low-level utilities, that closely follow conventions of the [ARM Architecture Reference Manual](https://developer.arm.com/documentation/ddi0487/ja/) and [RISC-V Specification](https://drive.google.com/file/d/1uviu1nH-tScFfgrovvFCrj7Omv8tFtkp/view). For ARM64, certain instructions are aggregated under the same basic type, and further decoding of specific fields may be required to identify the specific instruction. 
A good example of this concept is the MOV instruction, which moves a value from one register to another. This is decoded by MAMBO as an ORR instruction, since in the ARMv8 ISA, MOV Xd, Xn is simply an alias for the true operation in the hardware, which ORs the contents of the zero register with the value in register Xn and places the result in register Xd (ORR Xd, XZR, Xn). For RISC-V, each instruction is encoded separately, meaning that the desired instruction can be decoded directly. > [!TIP] @@ -59,7 +59,7 @@ Where `source_addr` was set by the `mambo_get_source_addr` function. > PIE instruction types do not directly map to the ARM assembly instructions, but they map to instructions types defined in the ARM Architecture Reference Manual. More user friendly disassembly could be achieved with tool such as [Capstone](https://www.capstone-engine.org/). ## ARM64: -In this exercise, we look for the `A64_DATA_PROC_REG3` (data processing on 3 registers) instruction type that includes the `MUL` instruction (see [ARM Architecture Reference Manual](https://developer.arm.com/documentation/ddi0487/ja/) for more details). For example: +In this exercise, we look for the `A64_DATA_PROC_REG3` (data processing on 3 registers) instruction type, which includes the `MUL` instruction, because PIE groups instructions by instruction type (see [ARM Architecture Reference Manual](https://developer.arm.com/documentation/ddi0487/ja/) for more details). For example: ```c a64_instruction instruction = a64_decode(source_addr); @@ -186,7 +186,7 @@ For the purpose of this exercise, it is enough to know that the first 8 integer ### Tasks -- [ ] Write a C function that prints current operands of the instruction. +- [ ] Write a C function that prints the current operands of the instruction by implementing the `foo` function described above. 
## Step 4: Emitting Function Calls @@ -234,13 +234,13 @@ emit_pop(ctx, (1 << x0) | (1 << x1) | (1 << lr)); ### Setting Arguments -Before calling the function, arguments have to be set. It was already explained that the function required for this exercise takes its arguments in `x0` and `x1`, so operands of `MUL` have to be moved into those registers. To do that, the `emit_mov` is used: +Before calling the function, arguments have to be set. The `foo()` function, which prints the operands, is passed its arguments in registers `x0` and `x1` (as per the ARM calling convention), so the operands of `MUL` have to be moved into those registers. To do that, `emit_mov` is used: ```c void emit_mov(mambo_context *ctx, enum reg rd, enum reg rn); ``` -The function takes MAMBO context, as well as, the index of the destination and source register. +The function takes the MAMBO context, as well as the indices of the destination register `rd` and the source register `rn`. The operands of the `MUL` instruction were already decoded by `a64_data_proc_reg3_decode_fields` and placed in `rm` and `rn` variables - assuming the exact code from this document has been used. Hence, setting the arguments can be as simple as: @@ -454,6 +454,4 @@ Since some of the C standard library (libc) functions use `MUL` there is more ou ## Next Steps 👏 -This is the last exercise, so please feel free to extend the plugin with any other ideas, ask us any questions or have a look at the [Appendix](../appendix/README.md) that discusses debugging MAMBO and its plugins with GDB. - -#### ✏️ Please help us improve the MAMBO tutorial by following the [link](https://forms.office.com/e/ZtDJSEgWhH). +This is the last exercise, so please feel free to extend the plugin with any other ideas, such as only printing `MUL`/`MULW` instructions in a particular function. You can also ask us any questions, or have a look at the [Appendix](../appendix/README.md) that discusses debugging MAMBO and its plugins with GDB. 
diff --git a/docs/tutorials/hipeac2025/introduction/README.md b/docs/tutorials/hipeac2025/introduction/README.md index 7f9d0f02..910f1653 100644 --- a/docs/tutorials/hipeac2025/introduction/README.md +++ b/docs/tutorials/hipeac2025/introduction/README.md @@ -23,7 +23,7 @@ make ## Step 2: Build MAMBO with the plugin -Copy line 13: `PLUGINS+=plugins/tutorial.c` of the makefile from `$MAMBO_ROOT/tutorials/hipeac2025/introduction/mambo` to the makefile in your MAMBO repository. This includes the new plugin into the build process. +Copy line 13: `PLUGINS+=plugins/tutorial.c` of the makefile from `$MAMBO_ROOT/docs/tutorials/hipeac2025/introduction/mambo` to the makefile in your MAMBO repository. This includes the new plugin into the build process. Then, copy the initial plugin template into `$MAMBO_ROOT/plugins/tutorial.c`: