Perf event array map kernel-side implementation #4144
Moving to draft until the ring buffer refactoring is merged in #4204.
This is currently blocked on #4204, as it uses the new ring buffer for wait-free reserve (at dispatch) in the per-CPU rings. Most of the refactoring to use the new ring buffer is done; once the new ring buffer implementation is merged, I'll update and republish this.
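For readers unfamiliar with the term, here is a minimal sketch of what a wait-free reserve on a per-CPU ring can look like. It assumes a single producer per ring (for example, code running at dispatch level on the owning CPU) and a single consumer. Every name below (`sketch_ring_t`, `sketch_ring_reserve`, `sketch_ring_submit`) is hypothetical and is not the API from #4204; the real ring buffer may use a different layout and publication scheme.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical per-CPU ring bookkeeping; not the layout used in #4204.
typedef struct
{
    size_t capacity;          // Ring size in bytes; must be a power of two.
    size_t reserved;          // Producer-private: total bytes reserved so far.
    _Atomic size_t committed; // Total bytes published to the consumer.
    _Atomic size_t consumed;  // Total bytes released by the consumer.
} sketch_ring_t;

// Reserve length bytes. On success, *offset is the starting byte offset in the
// ring (the caller handles wrap-around when copying). There is no loop and no
// retry, so the call finishes in a bounded number of steps: it is wait-free
// under the single-producer assumption.
static bool
sketch_ring_reserve(sketch_ring_t* ring, size_t length, size_t* offset)
{
    assert(ring->capacity != 0 && (ring->capacity & (ring->capacity - 1)) == 0);
    size_t consumed = atomic_load_explicit(&ring->consumed, memory_order_acquire);
    size_t used = ring->reserved - consumed;
    if (length == 0 || length > ring->capacity - used) {
        return false; // Not enough free space; the caller drops the record.
    }
    *offset = ring->reserved & (ring->capacity - 1);
    ring->reserved += length;
    return true;
}

// Publish everything reserved so far. The release store makes the record
// bytes written before this call visible to a consumer that acquires
// committed.
static void
sketch_ring_submit(sketch_ring_t* ring)
{
    atomic_store_explicit(&ring->committed, ring->reserved, memory_order_release);
}
```

Keeping `reserved` private to the producer means the consumer never observes a record before `sketch_ring_submit` runs, at the cost of requiring that reserve and submit calls on one ring are not interleaved across producers.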
I split the context header support requirement out into #4267. After that passes all CI/CD tests, I'll rebase this PR and split it into two commits: context header support, then the perf event array implementation.
    _Inout_ ebpf_operation_ring_buffer_map_async_query_reply_t* reply,
_ebpf_core_protocol_map_async_query(
    _In_ const ebpf_operation_map_async_query_request_t* request,
    //_Inout_updates_bytes_(reply_length) ebpf_operation_map_async_query_reply_t* reply,
nit: delete commented-out code.
Fixed in #4300
* @brief Subscribe for notifications from the input perf event array map.
*
* @param[in] perf_event_array_map_fd File descriptor to the perf event array map.
* @param[in, out] sample_callback_context Pointer to supplied context to be passed in notification callback.
Suggested change:
- * @param[in, out] sample_callback_context Pointer to supplied context to be passed in notification callback.
+ * @param[in, out] callback_context Pointer to supplied context to be passed in notification callback.
Fixed in #4300
async_ioctl_completion(nullptr), async_ioctl_failed(false)
{
}
~_ebpf_perf_event_array_subscription() { EBPF_LOG_ENTRY(); }
nit: normally ENTRY and EXIT traces should be paired.
Fixed in #4300
@@ -2688,3 +2696,40 @@ ebpf_program_set_flags(_Inout_ ebpf_program_t* program, uint64_t flags)
{
    program->flags = flags;
}

void
ebpf_program_set_header_context_descriptor(
This is invoked only from ebpf_program.c. Should this be marked static?
}

void
ebpf_program_get_header_context_descriptor(
Same as earlier: invoked only from ebpf_program.c. Should this be marked static?
size_t perf_event_array_size = 0;
uint32_t dummy;

result = _get_map_descriptor_properties(
Is it possible to cache the map descriptor in ebpf_perf_event_array_subscription_t to avoid a possible IOCTL call to get map properties each time?
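One way to read this suggestion, as a minimal sketch: query the properties once, store them on the subscription, and reuse them on later calls. The types and functions below (`sketch_subscription_t`, `sketch_get_properties`, `sketch_fake_query`) are hypothetical stand-ins, not the PR's `ebpf_perf_event_array_subscription_t` or `_get_map_descriptor_properties`, and the sketch assumes the properties do not change for the lifetime of the map.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical subset of map properties; the real descriptor has more fields.
typedef struct
{
    uint32_t type;
    uint32_t max_entries;
} sketch_map_properties_t;

typedef int (*sketch_query_fn)(int map_fd, sketch_map_properties_t* properties);

// Hypothetical subscription object that caches the properties after first use.
typedef struct
{
    int map_fd;
    bool properties_cached;
    sketch_map_properties_t properties;
} sketch_subscription_t;

// Counts how often the (fake) expensive query runs.
static int sketch_fake_query_calls = 0;

// Stand-in for an IOCTL-backed property query; returns 0 on success.
static int
sketch_fake_query(int map_fd, sketch_map_properties_t* properties)
{
    (void)map_fd;
    sketch_fake_query_calls++;
    properties->type = 1;
    properties->max_entries = 64;
    return 0;
}

// Return the cached properties, issuing the query only on the first call.
// A failed query is not cached, so a later call retries it.
static int
sketch_get_properties(
    sketch_subscription_t* subscription, sketch_query_fn query, const sketch_map_properties_t** properties)
{
    if (!subscription->properties_cached) {
        int result = query(subscription->map_fd, &subscription->properties);
        if (result != 0) {
            return result;
        }
        subscription->properties_cached = true;
    }
    *properties = &subscription->properties;
    return 0;
}
```

If the subscription can be used from several threads, the cached flag would need synchronization, which this sketch leaves out.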
{
    ebpf_core_map_t core_map;
    uint32_t ring_count;
    // Flag that is set the first time an async operation is queued to the map.
nit: which flag is this?
This was for the tripwire flag which has been moved into a shared struct. Fixed in #4300.
    EBPF_FROM_FIELD(ebpf_core_perf_event_array_map_t, core_map, map);
uint32_t cpu_id = (uint32_t)index;
ebpf_assert(cpu_id < perf_event_array_map->ring_count);
ebpf_core_perf_ring_t* ring = &perf_event_array_map->rings[cpu_id];
Do we need to handle the CPU hot-add case here? Or at least fail the call if cpu_id >= perf_event_array_map->ring_count?
I added a runtime check of the cpu_id in #4300.
Currently we allocate rings for the maximum logical processor count -- so hot add CPUs will already get rings allocated during initialization.
This was an issue for epoch memory because all CPUs participating in epochs need to participate in the consensus protocol. For perf array, a hot-add CPU that never gets added is functionally equivalent to a CPU that existed at boot but never has records queued to it.
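As an illustration of the kind of runtime check described here (the actual change is in #4300 and may differ), the sketch below validates the index before it is narrowed to 32 bits and fails the lookup instead of relying only on a debug assert. The types and the function name are hypothetical simplifications of the PR's structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical simplifications of the PR's ring and map types.
typedef struct
{
    uint64_t records_queued;
} sketch_perf_ring_t;

typedef struct
{
    uint32_t ring_count; // Assumed sized for the maximum logical processor count.
    sketch_perf_ring_t* rings;
} sketch_perf_event_array_t;

// Return the ring for the given index, or NULL when the index is out of range.
// The comparison happens on the 64-bit value so that an index above UINT32_MAX
// cannot alias a valid ring after truncation.
static sketch_perf_ring_t*
sketch_get_ring(sketch_perf_event_array_t* map, uint64_t index)
{
    assert(map != NULL && map->rings != NULL);
    if (index >= map->ring_count) {
        return NULL;
    }
    return &map->rings[(uint32_t)index];
}
```

Because rings exist for every possible processor index, a CPU that is hot-added later already has a ring, and the check only rejects indices that can never be valid.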

ebpf_core_perf_event_array_map_t* perf_event_array_map =
    EBPF_FROM_FIELD(ebpf_core_perf_event_array_map_t, core_map, map);
ebpf_core_perf_ring_t* ring = &perf_event_array_map->rings[cpu_id];
Same as above.
See above answer.
if (irql_at_enter < DISPATCH_LEVEL) {
    ebpf_lower_irql(irql_at_enter);
}
ExitPreDispatch:
nit: snake_case.
Fixed in #4300.
Approved, modulo some comments.
Description
Implements the kernel side of perf event array maps.
This includes the core and kernel changes for #658, with part of the user-side API skeleton.
Testing
Includes platform and execution context tests.
Documentation
Documented in code and PerfEventArray.md.
Installation
N/A