Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

target/riscv: cache requests to trigger configuration #928

Merged
merged 1 commit into from
Nov 10, 2023

Conversation

AnastasiyaChernikova
Copy link
Contributor

@AnastasiyaChernikova AnastasiyaChernikova commented Sep 28, 2023

Depending on configuration, the existing implementation of watchpoints is rather inefficient for certain scenarios. Consider HW that:

  1. triggers 0-3 can be used as instruction breakpoints
  2. triggers 4-7 can be used as data breakpoints (watchpoints)
  3. NAPOT triggers are not supported.

Now, consider that we have a pending watchpoint. And we perform a "step" operation. According to the current implementation:

  • OpenOCD will disable watchpoints
  • Perform a single-step
  • Will try to restore the original watchpoints. It will need 12 attempts to find a suitable trigger: (8 attempts to try NAPOT, and another 4 to try GE+LE).

This patch introduces a dedicated cache for requests to triggers. It significantly speeds things up, since we cache failed attempts and no additional interactions with HW is necessary.

Copy link
Collaborator

@timsifive timsifive left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this improvement. It's going to help a lot, as you explained.

Why did you choose to use a linked list for the data structure? It looks like you're always allocating the whole thing. I think using an array would simplify the code significantly, but perhaps I'm missing some downside.

@AnastasiyaChernikova
Copy link
Contributor Author

AnastasiyaChernikova commented Oct 4, 2023

Why did you choose to use a linked list for the data structure? It looks like you're always allocating the whole thing. I think using an array would simplify the code significantly, but perhaps I'm missing some downside.

When caching tdata2 data, each new call is written to the top of the list (or moved if such a call has already been made), thus making it easier to find the data. And for tdata1 a list to make code uniform.


struct tdata1_cache *tdata1_cache = find_tdata1(&(r->wp_triggers_cache[idx]), tdata1);
if (tdata1_cache == NULL) {
tdata1_cache = create_new_tdata1_cache(&(r->wp_triggers_cache[idx]), tdata1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you making a new cache per trigger index here? I don't understand that. There can only be one tdata1 value per trigger at a time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use one trigger to set wp of different types (eq, napot, ge, lt). And in this implementation, all this information is cached. This is necessary if we installed wp on a certain trigger, then deleted it and want to reinstall it.

src/target/riscv/riscv.c Outdated Show resolved Hide resolved
Copy link
Collaborator

@timsifive timsifive left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think I got it now. Overall looks good. I have a lot of naming/comment suggestions that would have made this code more obvious to me, but nothing really about the structure of the code.

src/target/riscv/riscv.h Outdated Show resolved Hide resolved
@@ -700,6 +732,99 @@ static void log_trigger_request_info(struct trigger_request_info trig_info)
trig_info.tdata1, trig_info.tdata2, trig_info.tdata1_ignore_mask);
};

static struct tdata1_cache *create_new_tdata1_cache(struct list_head *tdata1_cache_head, riscv_reg_t tdata1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the example of e.g. riscv_batch_alloc, let's call this:

Suggested change
static struct tdata1_cache *create_new_tdata1_cache(struct list_head *tdata1_cache_head, riscv_reg_t tdata1)
static struct tdata1_cache *tdata1_cache_alloc(struct list_head *tdata1_cache_head, riscv_reg_t tdata1)

(How does this square with e.g. riscv_free_registers()? It's a bit messy and inconsistent, but at least when there's a simple struct then using _alloc() seems more consistent.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

return elem;
}

static void create_new_tdata2_cache(struct list_head *tdata2_cache_head, riscv_reg_t tdata2)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static void create_new_tdata2_cache(struct list_head *tdata2_cache_head, riscv_reg_t tdata2)
static void tdata2_cache_alloc(struct list_head *tdata2_cache_head, riscv_reg_t tdata2)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

list_add(&elem->elem_tdata2, tdata2_cache_head);
}

struct tdata2_cache *find_call_in_tdata2_cache(struct list_head *tdata2_cache_head, riscv_reg_t find_tdata2)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "call" in this context? Isn't this simply tdata2_cache_search?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

return NULL;
}

struct tdata1_cache *find_tdata1(struct list_head *tdata1_cache_head, riscv_reg_t find_tdata1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
struct tdata1_cache *find_tdata1(struct list_head *tdata1_cache_head, riscv_reg_t find_tdata1)
struct tdata1_cache *tdata1_cache_search(struct list_head *tdata1_cache_head, riscv_reg_t find_tdata1)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

src/target/riscv/riscv.c Outdated Show resolved Hide resolved
Comment on lines 802 to 819
static bool is_use_wp_trigger_in_cache(struct target *target, unsigned int idx,
riscv_reg_t tdata1, riscv_reg_t tdata2)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static bool is_use_wp_trigger_in_cache(struct target *target, unsigned int idx,
riscv_reg_t tdata1, riscv_reg_t tdata2)
static bool wp_triggers_cache_search(struct target *target, unsigned int idx,
riscv_reg_t tdata1, riscv_reg_t tdata2)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternate suggestion for your consideration:

is_known_unsupported_trigger(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

src/target/riscv/riscv.c Outdated Show resolved Hide resolved
src/target/riscv/riscv.c Outdated Show resolved Hide resolved
src/target/riscv/riscv.c Outdated Show resolved Hide resolved
src/target/riscv/riscv.h Outdated Show resolved Hide resolved
Comment on lines 802 to 819
static bool is_use_wp_trigger_in_cache(struct target *target, unsigned int idx,
riscv_reg_t tdata1, riscv_reg_t tdata2)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternate suggestion for your consideration:

is_known_unsupported_trigger(...)

@AnastasiyaChernikova
Copy link
Contributor Author

@timsifive All issues were addressed. Could you please take a look?

Copy link
Collaborator

@JanMatCodasip JanMatCodasip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seem to be two unaddressed (overlooked?) items:

Other than that, the code looks all right.

timsifive
timsifive previously approved these changes Nov 2, 2023
JanMatCodasip
JanMatCodasip previously approved these changes Nov 3, 2023
@AnastasiyaChernikova
Copy link
Contributor Author

@timsifive this commit has an unexpected (to me test failure) that I haven't observed locally. That being said - I used relatively old riscv-tests to check my changes (riscv-tests ed0a63ea4f , which is from around Jul 10, 2023). Looks like I'll need to re-test my changes with an updated riscv-tests.

@aap-sc
Copy link
Collaborator

aap-sc commented Nov 3, 2023

@timsifive this commit has an unexpected (to me test failure) that I haven't observed locally. That being said - I used relatively old riscv-tests to check my changes (riscv-tests ed0a63ea4f , which is from around Jul 10, 2023). Looks like I'll need to re-test my changes with an updated riscv-tests.

@AnastasiyaChernikova , @timsifive, @JanMatCodasip actually, I've tried to update riscv-tests version to the current TOT and I have UnavailableHaltedTest test failure without this MR applied. Looks like the test is broken. Could it be a known issue with the test itself?

@aap-sc
Copy link
Collaborator

aap-sc commented Nov 3, 2023

I have UnavailableHaltedTest test failure without this MR applied.

and just for the reference (I think this should be mentioned). This attempt to update was in repository that has few additional patches applied on top of riscv-openocd (minor fixes that were backported from https://sourceforge.net/p/openocd/ . And I used a relatively old version of spike (around 2 months old)

@TommyMurphyTM1234
Copy link
Collaborator

TommyMurphyTM1234 commented Nov 3, 2023

@timsifive this commit has an unexpected (to me test failure) that I haven't observed locally. That being said - I used relatively old riscv-tests to check my changes (riscv-tests ed0a63ea4f , which is from around Jul 10, 2023). Looks like I'll need to re-test my changes with an updated riscv-tests.

@AnastasiyaChernikova , @timsifive, @JanMatCodasip actually, I've tried to update riscv-tests version to the current TOT and I have UnavailableHaltedTest test failure without this MR applied. Looks like the test is broken. Could it be a known issue with the test itself?

If I build everything (without the changes in this PR) from scratch from the latest sources (toolchain, Spike, RISC-V OpenOCD) then the UnavailableHaltedTest runs succeeed.

If I copy in the modified riscv.c and riscv.h from here and rebuild OpenOCD those test runs still succeed.

@timsifive
Copy link
Collaborator

UnavailableHaltedTest seems to fail intermittently in the github action. I haven't seen it fail this way when I run it locally. :-( Hopefully I'll have some time to investigate more next week.

@timsifive
Copy link
Collaborator

I've restarted this test run several times, and it fails UnavailableHaltedTest consistently. Is it simply based on older code?
Can you rebase this off the latest riscv branch? That should also address the conflict that has appeared.

Depending on configuration, the existing implementation of watchpoints is
rather inefficient for certain scenarios. Consider HW that:

1. triggers 0-3 can be used as instruction breakpoints
2. triggers 4-7 can be used as data breakpoints (watchpoints)
3.  NAPOT triggers are not supported.

Now, consider that we have a pending watchpoint. And we perform a "step"
operation. According to the current implementation:

* OpenOCD will disable watchpoints
* Perform a single-step
* Will try to restore the original watchpoints. It will need 12 attempts
to find a suitable trigger: (8 attempts to try NAPOT, and another 4 to try
GE+LE).

This patch introduces a dedicated cache for requests to triggers. It
significantly speeds things up, since we cache failed attempts and no
additional interactions with HW is necessary.

Change-Id: Ic272895eaa763a7ae84d14f7633790afd015ca9d
Signed-off-by: Anastasiya Chernikova <[email protected]>
@timsifive
Copy link
Collaborator

For some reason, in this branch only, UnavailableHaltedTest consistently fails. I have no idea why. It doesn't make any sense with this change and what that test does. I cannot reproduce the failure on my machine either. :-(

I'm weary merging this change, though, because I don't want all testing to suddenly start failing. Is this blocking other work for you?

@aap-sc
Copy link
Collaborator

aap-sc commented Nov 7, 2023

@timsifive

For some reason, in this branch only, UnavailableHaltedTest consistently fails. I have no idea why. It doesn't make any sense with this change and what that test does. I cannot reproduce the failure on my machine either. :-(

I can actually. However, the issue is reproducible even without this patch on my machine.

This may be related to python version or some weird timing issues. Sometimes OpenOCD log file contain non-printable characters (like ^H and stuff). It seems that this sometimes messes up matching algorithms used in testing infrastructure. There are 2 types of issues I observe:

  • testlib.TestLibError: Timed out waiting for b'riscv\\ dmi_write\\ 0x1f\\ 0x3\\\n' - this I cannot reproduce reliably (yet).
  • cannot use a string pattern on a bytes-like object these are caused that sometimes we have a mismatch between regexp and self.log_buf from expect method of testlib.py. Mismatch happens because sometimes these objects are bytes and sometimes these are strings

@aap-sc
Copy link
Collaborator

aap-sc commented Nov 8, 2023

This may be related to python version or some weird timing issues.

I think I've found the root cause. It's a timing/parsing issue. I've filed riscv-software-src/riscv-tests#520 against riscv-tests.

@timsifive timsifive merged commit 1ea0e9b into riscv-collab:riscv Nov 10, 2023
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants