scx_rustland_core: Forbid mmap() syscall #1812
Looks like nix doesn't like [...]
Cherry-pick aba0b3d and you should be good.
15fec88 to 82a1c5d
The user-space schedulers should never perform blocking memory allocations, otherwise the entire scheduling pipeline may get stuck.

To prevent this from happening, scx_rustland_core implements a GlobalAlloc with a custom memory allocator that operates on a pre-allocated locked memory arena, and all the memory of the process is automatically locked.

However, external libraries/crates (e.g., libc) can still execute mmap() syscalls directly, potentially stalling the scheduler, for example:

    R scx_rustland[159] -5016ms
        scx_state/flags=3/0x1 dsq_flags=0x0 ops_state/qseq=2/1
        sticky/holding_cpu=-1/-1 dsq_id=(n/a)
        dsq_vtime=0 slice=0 weight=100
        cpus=ff

      asm_sysvec_apic_timer_interrupt+0x1a/0x20
      mmap_region+0x65/0x140
      do_mmap+0x47d/0x620
      vm_mmap_pgoff+0xbc/0x1c0
      do_syscall_64+0xbb/0x1e0
      entry_SYSCALL_64_after_hwframe+0x77/0x7f

To catch these calls, introduce a seccomp filter that returns EPERM when the mmap() syscall is invoked.

This doesn't solve the problem, but it lets us catch the code that invokes mmap() and exit early instead of waiting for the watchdog timeout.

Signed-off-by: Andrea Righi <[email protected]>
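For readers unfamiliar with seccomp, here is a minimal sketch of what a filter that returns EPERM for mmap() looks like at the classic-BPF level. It is not the code from this PR, which may well build the filter through a helper crate. The names `deny_syscall_filter`, `run_filter` and `SYS_MMAP_X86_64` are hypothetical, and the syscall number is assumed to be the x86_64 one. `run_filter` is a tiny user-space interpreter covering only the three opcodes used here, included so that the decision logic can be checked without installing anything. On a real system the program would be handed to the kernel with prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) followed by prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog).

```rust
/// One classic BPF instruction; mirrors `struct sock_filter` in <linux/filter.h>.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SockFilter {
    pub code: u16,
    pub jt: u8,
    pub jf: u8,
    pub k: u32,
}

// Opcodes and return values as defined in the Linux UAPI headers.
pub const BPF_LD_W_ABS: u16 = 0x20; // BPF_LD | BPF_W | BPF_ABS
pub const BPF_JEQ_K: u16 = 0x15; // BPF_JMP | BPF_JEQ | BPF_K
pub const BPF_RET_K: u16 = 0x06; // BPF_RET | BPF_K
pub const SECCOMP_RET_ALLOW: u32 = 0x7fff_0000;
pub const SECCOMP_RET_ERRNO: u32 = 0x0005_0000;
pub const EPERM: u32 = 1;

/// Assumption: x86_64 syscall number for mmap(); other architectures differ.
pub const SYS_MMAP_X86_64: u32 = 9;

/// Builds a program that fails syscall `nr` with EPERM and allows the rest.
/// A production filter should also validate `seccomp_data.arch` (omitted).
pub fn deny_syscall_filter(nr: u32) -> [SockFilter; 4] {
    [
        // Load seccomp_data.nr (offset 0) into the accumulator.
        SockFilter { code: BPF_LD_W_ABS, jt: 0, jf: 0, k: 0 },
        // On a match fall through to the EPERM return, otherwise skip it.
        SockFilter { code: BPF_JEQ_K, jt: 0, jf: 1, k: nr },
        SockFilter { code: BPF_RET_K, jt: 0, jf: 0, k: SECCOMP_RET_ERRNO | EPERM },
        SockFilter { code: BPF_RET_K, jt: 0, jf: 0, k: SECCOMP_RET_ALLOW },
    ]
}

/// Hypothetical helper: interprets the program for syscall number `nr`,
/// supporting only the three opcodes emitted above.
pub fn run_filter(prog: &[SockFilter], nr: u32) -> u32 {
    let mut acc = 0u32;
    let mut pc = 0usize;
    loop {
        let insn = prog[pc];
        pc += 1;
        match insn.code {
            // Only offset 0 (the syscall number) is modeled here.
            BPF_LD_W_ABS => acc = if insn.k == 0 { nr } else { 0 },
            BPF_JEQ_K => {
                let off = if acc == insn.k { insn.jt } else { insn.jf };
                pc += usize::from(off);
            }
            BPF_RET_K => return insn.k,
            other => panic!("unsupported opcode {:#x}", other),
        }
    }
}
```

The low 16 bits of a SECCOMP_RET_ERRNO result carry the errno handed back to the caller, so mmap() fails with EPERM on a filtered thread while every other syscall goes through unchanged.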
82a1c5d to d5237cc
LGTM, I was wondering if there was a way to enforce the seccomp config on the main thread only. That way other libraries could mmap in separate threads and in theory not interfere with scheduling.
Yeah... that was the idea: have the main thread never block in mmap(), and create separate threads that can do mmap() and other blocking operations. Since the seccomp filter is applied in [...]
Right, I can confirm that: I just added an explicit call to mmap() in the stats server and everything's fine. If I put the same mmap() call in the main user-space scheduler thread, then the mmap() fails with EPERM. Basically, you need to create all the threads before initializing the scheduler; then they can use mmap().
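The ordering described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the scheduler's actual startup code: `run_with_helper` is a hypothetical name, and `install_filter` is a stand-in closure for whatever installs the seccomp filter on the calling thread. It relies on a seccomp filter installed without thread synchronization applying only to the calling thread and to threads created after it, so a helper spawned earlier stays unrestricted.

```rust
use std::io;
use std::sync::mpsc;
use std::thread;

/// Spawns a helper thread first, then restricts the calling thread.
/// `install_filter` is a placeholder for the real filter installation.
pub fn run_with_helper<F>(install_filter: F) -> io::Result<usize>
where
    F: FnOnce() -> io::Result<()>,
{
    let (go_tx, go_rx) = mpsc::channel::<()>();

    // 1. Create the helper before any filter exists, so it does not
    //    inherit one and keeps the ability to call mmap().
    let helper = thread::spawn(move || {
        // Wait until the scheduler thread has applied its filter.
        go_rx.recv().ok();
        // A large allocation, which glibc's allocator typically serves
        // with mmap(); this is fine on an unfiltered thread.
        let buf = vec![0u8; 4 << 20];
        buf.len()
    });

    // 2. Restrict only the calling (scheduler) thread.
    install_filter()?;
    go_tx.send(()).ok();

    // 3. The scheduling loop would run here and must never call mmap().
    helper
        .join()
        .map_err(|_| io::Error::new(io::ErrorKind::Other, "helper thread panicked"))
}
```

Calling `run_with_helper(|| Ok(()))` exercises the flow with a no-op filter. With a real filter, the main thread's own allocations would have to come from the pre-allocated arena, since anything on that thread that falls back to mmap() would get EPERM.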