
Perf event array map kernel-side implementation #4144

Merged: 13 commits merged into microsoft:main on Mar 23, 2025

Conversation

@mikeagun (Contributor) commented Jan 16, 2025

Description

Implements the kernel side of perf event array maps.

This includes the core and kernel changes for #658, with part of the user-side API skeleton.

Testing

Includes platform and execution context tests.

Documentation

Documented in code and PerfEventArray.md.

Installation

N/A

@mikeagun marked this pull request as ready for review on Feb 4, 2025.
@mikeagun changed the title from "DRAFT Perf event array map implementation." to "Perf event array map kernel-side implementation." on Feb 4, 2025.
@mikeagun (Author) commented:

Moving to draft until the ring buffer refactoring is merged in #4204

@mikeagun marked this pull request as draft on Feb 12, 2025.
@mikeagun (Author) commented:

This is currently blocked on #4204, as it uses the new ring buffer for wait-free reserve (at dispatch IRQL) in the per-CPU rings. Most of the refactoring to use the new ring buffer is done; once the new ring buffer implementation is merged, I'll update and re-publish this PR.
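As background for the comment above, here is a minimal sketch of what a per-CPU, wait-free output path looks like. All type and function names below (perf_ring_t, perf_array_t, ring_reserve, ring_submit, current_cpu_index) are illustrative placeholders, not the actual ring buffer API from #4204; only the shape of the logic is the point.

    // Placeholder types/functions standing in for the new ring buffer API.
    #include <stdint.h>
    #include <string.h>          // memcpy
    #include "ebpf_result.h"     // ebpf_result_t, EBPF_SUCCESS (real header; the rest is hypothetical)

    typedef struct _perf_ring { uint8_t opaque[64]; } perf_ring_t;                            // placeholder per-CPU ring
    typedef struct _perf_array { uint32_t ring_count; perf_ring_t* rings; } perf_array_t;     // placeholder map

    extern uint32_t current_cpu_index(void);                                                  // placeholder
    extern ebpf_result_t ring_reserve(perf_ring_t* ring, uint8_t** record, size_t length);    // placeholder
    extern void ring_submit(uint8_t* record);                                                 // placeholder

    static ebpf_result_t
    perf_array_output(perf_array_t* map, const uint8_t* data, size_t length)
    {
        // Pick the ring belonging to the CPU the program is currently running on.
        perf_ring_t* ring = &map->rings[current_cpu_index()];
        uint8_t* record = NULL;

        // Wait-free reservation: no locks are taken, which is what makes this safe at DISPATCH_LEVEL.
        ebpf_result_t result = ring_reserve(ring, &record, length);
        if (result != EBPF_SUCCESS) {
            return result; // ring full: the record is dropped rather than waited on
        }

        memcpy(record, data, length);
        ring_submit(record); // publish the record to the user-mode consumer
        return EBPF_SUCCESS;
    }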

@mikeagun (Author) commented Mar 7, 2025:

I split the context header support requirement out into #4267. Once that passes all CI/CD tests, I'll rebase this PR into two commits: context header support first, then the perf event array implementation.

@mikeagun changed the title from "Perf event array map kernel-side implementation." to "Perf event array map kernel-side implementation" on Mar 7, 2025.
@shankarseal (Contributor) previously approved these changes on Mar 21, 2025.
_ebpf_core_protocol_map_async_query(
    _In_ const ebpf_operation_map_async_query_request_t* request,
    //_Inout_updates_bytes_(reply_length) ebpf_operation_map_async_query_reply_t* reply,
    _Inout_ ebpf_operation_ring_buffer_map_async_query_reply_t* reply,
Reviewer comment (Contributor):

nit: delete commented code.

Author reply (@mikeagun):

Fixed in #4300

* @brief Subscribe for notifications from the input perf event array map.
*
* @param[in] perf_event_array_map_fd File descriptor to the perf event array map.
* @param[in, out] sample_callback_context Pointer to supplied context to be passed in notification callback.
Reviewer comment (Contributor):

Suggested change
* @param[in, out] sample_callback_context Pointer to supplied context to be passed in notification callback.
* @param[in, out] callback_context Pointer to supplied context to be passed in notification callback.

Author reply (@mikeagun):

Fixed in #4300
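Because the doc comment above is only a fragment, a hypothetical usage sketch may help orient readers. The function name, callback signature, and subscription handle below are assumptions modeled on the general map-subscribe pattern; the authoritative declarations are the ones this PR adds to the user-mode headers.

    // Hypothetical sketch; every name here is an assumption, not the verified API.
    #include <stddef.h>
    #include <stdint.h>

    typedef struct _perf_subscription perf_subscription_t; // placeholder handle type

    // Placeholder callback shape: invoked for records produced on a given CPU.
    static void
    my_perf_event_callback(void* callback_context, uint32_t cpu_id, const void* data, size_t size)
    {
        (void)callback_context;
        (void)cpu_id;
        (void)data;
        (void)size; // process one record here
    }

    // Hypothetical subscribe call mirroring the doc comment's parameters:
    //   result = perf_event_array_map_subscribe(
    //       perf_event_array_map_fd,   // fd of the perf event array map
    //       callback_context,          // the [in, out] context discussed in the review comment
    //       my_perf_event_callback,    // called as records arrive
    //       &subscription);            // out: handle later passed to an unsubscribe call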

async_ioctl_completion(nullptr), async_ioctl_failed(false)
{
}
~_ebpf_perf_event_array_subscription() { EBPF_LOG_ENTRY(); }
Reviewer comment (Contributor):

nit: normally ENTRY and EXIT traces should be paired.

Author reply (@mikeagun):

Fixed in #4300
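For readers unfamiliar with the nit above: the project's trace macros are intended to bracket a function, so a destructor that logs entry should also log exit. A minimal sketch of the paired form, assuming the EBPF_LOG_EXIT counterpart macro from ebpf_tracelog.h (the actual fix landed in #4300):

    ~_ebpf_perf_event_array_subscription()
    {
        EBPF_LOG_ENTRY(); // trace entry into the destructor
        // ... release resources owned by the subscription ...
        EBPF_LOG_EXIT();  // paired trace on the way out
    }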

@@ -2688,3 +2696,40 @@ ebpf_program_set_flags(_Inout_ ebpf_program_t* program, uint64_t flags)
{
program->flags = flags;
}

void
ebpf_program_set_header_context_descriptor(
Reviewer comment (Contributor):

this is invoked only from ebpf_program.c. Should this be marked static?

}

void
ebpf_program_get_header_context_descriptor(
Reviewer comment (Contributor):

same as earlier -- invoked only from ebpf_program.c. Should this be marked static?
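For both comments above, the suggestion is the standard C fix: a function referenced only within ebpf_program.c gets internal linkage, so it cannot be called from (or collide with) other translation units. A sketch using the codebase's leading-underscore convention for file-local helpers; whether these particular functions need external linkage for another reason is for the PR author to confirm, and the parameter lists are elided rather than guessed:

    // File-local helpers: `static` gives internal linkage, visible only inside ebpf_program.c.
    static void
    _ebpf_program_set_header_context_descriptor(/* ...same parameters as declared in the PR... */);

    static void
    _ebpf_program_get_header_context_descriptor(/* ...same parameters as declared in the PR... */);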

size_t perf_event_array_size = 0;
uint32_t dummy;

result = _get_map_descriptor_properties(
Reviewer comment (@saxena-anurag, Mar 22, 2025):

is it possible to cache the map descriptor in ebpf_perf_event_array_subscription_t to avoid a possible IOCTL call to get map properties each time?
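The suggestion above amounts to querying the map properties once and storing them on the subscription object, so later operations reuse the cached values instead of issuing another IOCTL. A rough sketch under that assumption; the field names and the query helper below are placeholders, not the PR's actual types:

    #include <stdbool.h>
    #include <stdint.h>
    #include "ebpf_result.h" // ebpf_result_t, EBPF_SUCCESS

    typedef struct _subscription_sketch
    {
        int map_fd;              // stands in for the map file descriptor
        uint32_t max_entries;    // cached map property (e.g. per-CPU ring size)
        uint32_t value_size;     // cached map property
        bool properties_cached;  // set after the first successful query
    } subscription_sketch_t;

    // Placeholder for the IOCTL-backed property query (_get_map_descriptor_properties in the PR).
    extern ebpf_result_t
    query_map_properties(int map_fd, uint32_t* max_entries, uint32_t* value_size);

    static ebpf_result_t
    ensure_properties_cached(subscription_sketch_t* subscription)
    {
        if (subscription->properties_cached) {
            return EBPF_SUCCESS; // later calls skip the IOCTL entirely
        }
        ebpf_result_t result = query_map_properties(
            subscription->map_fd, &subscription->max_entries, &subscription->value_size);
        if (result == EBPF_SUCCESS) {
            subscription->properties_cached = true;
        }
        return result;
    }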

{
ebpf_core_map_t core_map;
uint32_t ring_count;
// Flag that is set the first time an async operation is queued to the map.
Reviewer comment (Contributor):

nit: which flag is this?

Author reply (@mikeagun):

This was for the tripwire flag which has been moved into a shared struct. Fixed in #4300.
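For context on the "tripwire" mentioned above: it is a one-shot flag that flips the first time an async operation is queued against the map, so later code paths can tell a consumer has attached. A generic sketch of such a flag using an interlocked compare-exchange; the shared struct that actually holds it landed in #4300 and may look different:

    // Requires <windows.h> (user mode) or wdm.h (kernel mode) for LONG/InterlockedCompareExchange.
    typedef struct _async_state_sketch
    {
        volatile LONG async_started; // 0 = no async op queued yet, 1 = tripwire has fired
    } async_state_sketch_t;

    static void
    on_async_operation_queued(async_state_sketch_t* state)
    {
        // InterlockedCompareExchange returns the previous value, so only the first
        // caller observes 0 and performs the one-time setup.
        if (InterlockedCompareExchange(&state->async_started, 1, 0) == 0) {
            // First async operation against this map: do any one-time work here.
        }
    }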

EBPF_FROM_FIELD(ebpf_core_perf_event_array_map_t, core_map, map);
uint32_t cpu_id = (uint32_t)index;
ebpf_assert(cpu_id < perf_event_array_map->ring_count);
ebpf_core_perf_ring_t* ring = &perf_event_array_map->rings[cpu_id];
Reviewer comment (Contributor):

Do we need to handle CPU hot-add case here? Or at least fail the call if cpu_id > perf_event_array_map->ring_count?

Author reply (@mikeagun, Mar 24, 2025):

I added a runtime check of the cpu_id in #4300.

Currently we allocate rings for the maximum logical processor count, so hot-add CPUs already get rings allocated during initialization.

This was an issue for epoch memory because all CPUs participating in epochs need to participate in the consensus protocol. For the perf event array, a ring for a CPU that is never hot-added is functionally equivalent to one for a CPU that existed at boot but never has records queued to it.
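Concretely, the runtime guard added in #4300 can be as simple as a bounds check against the ring count fixed at map creation; since the ring count is sized from the maximum (not currently active) logical processor count, every CPU that can ever be hot-added already has a ring. A sketch mirroring the snippet quoted above (the EBPF_INVALID_ARGUMENT error choice is illustrative):

    // ring_count was sized at creation from the maximum logical processor count
    // (e.g. KeQueryMaximumProcessorCountEx(ALL_PROCESSOR_GROUPS)), so hot-added
    // CPUs already have rings; the check below only rejects out-of-range ids.
    uint32_t cpu_id = (uint32_t)index;
    if (cpu_id >= perf_event_array_map->ring_count) {
        return EBPF_INVALID_ARGUMENT; // fail the call instead of relying on ebpf_assert
    }
    ebpf_core_perf_ring_t* ring = &perf_event_array_map->rings[cpu_id];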


ebpf_core_perf_event_array_map_t* perf_event_array_map =
EBPF_FROM_FIELD(ebpf_core_perf_event_array_map_t, core_map, map);
ebpf_core_perf_ring_t* ring = &perf_event_array_map->rings[cpu_id];
Reviewer comment (Contributor):

same as above.

Author reply (@mikeagun):

See above answer.

if (irql_at_enter < DISPATCH_LEVEL) {
ebpf_lower_irql(irql_at_enter);
}
ExitPreDispatch:
Reviewer comment (Contributor):

nit: snake_case.

Author reply (@mikeagun):

Fixed in #4300.
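Tying the thread above together, the pattern under review raises to DISPATCH_LEVEL only when entered below it, restores the caller's IRQL on the way out, and (after the rename) uses a snake_case label for the early-exit path taken before raising. A standalone sketch using the documented Ke* IRQL routines rather than the ebpf platform wrappers the PR uses:

    // Kernel-mode sketch; KeGetCurrentIrql/KeRaiseIrql/KeLowerIrql are declared in wdm.h.
    static ebpf_result_t
    do_work_at_dispatch(bool input_valid)
    {
        ebpf_result_t result = EBPF_SUCCESS;
        KIRQL irql_at_enter = KeGetCurrentIrql();

        if (!input_valid) {
            result = EBPF_INVALID_ARGUMENT;
            goto exit_pre_dispatch; // leave before the IRQL has been raised
        }

        if (irql_at_enter < DISPATCH_LEVEL) {
            KIRQL old_irql;
            KeRaiseIrql(DISPATCH_LEVEL, &old_irql);
            (void)old_irql; // same value as irql_at_enter, kept only to satisfy the API
        }

        // ... work that must run at DISPATCH_LEVEL ...

        if (irql_at_enter < DISPATCH_LEVEL) {
            KeLowerIrql(irql_at_enter); // restore the caller's IRQL
        }

    exit_pre_dispatch: // snake_case label, per the review feedback above
        return result;
    }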

Review by @saxena-anurag (Contributor):

Approved, modulo some comments.

@shankarseal added this pull request to the merge queue on Mar 23, 2025.
Merged via the queue into microsoft:main with commit 2465eef Mar 23, 2025
95 of 97 checks passed
Development

Successfully merging this pull request may close these issues.

eBPF for Windows should support perf_event_array
6 participants