Supervising and emulating syscalls

Formal Metadata

Title
Supervising and emulating syscalls
Title of Series
Number of Parts
490
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Recently, the kernel gained seccomp support for SECCOMP_RET_USER_NOTIF, which enables a process (the supervisee) to retrieve an fd for its seccomp filter. This fd can then be handed to another, usually more privileged, process (the supervisor). The supervisor can then receive seccomp notifications about the syscalls the supervisee attempts. We have integrated this feature into userspace and currently make heavy use of it to intercept mknod(), mount(), and other syscalls in user namespaces, that is, in containers. For example, if a mknod() syscall matches a device in a predetermined whitelist, the privileged supervisor performs the mknod() syscall in lieu of the unprivileged supervisee and reports the success or failure of the attempt back to the supervisee. If the syscall does not match a device in the whitelist, we simply report an error. This talk will show how this works, what limitations we ran into, and what future improvements we plan to make in the kernel.
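
The whitelist check described in the abstract can be sketched roughly as follows. This is only an illustration of the decision step, not the code used in the talk: the names `dev_type`, `allowed_dev`, `whitelist`, `is_allowed`, and `supervise_mknod` are hypothetical, the whitelist entries are example values, and returning `-EPERM` for a rejected device is an assumption (the abstract only says that an error is reported). The sketch does not touch the seccomp notification fd itself; a real supervisor would read the request from that fd, perform the mknod() on the supervisee's behalf when the check passes, and write the result back.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical device class used by this sketch. */
enum dev_type { DEV_CHAR, DEV_BLOCK };

/* Hypothetical whitelist entry: device class plus major/minor number. */
struct allowed_dev {
    enum dev_type type;
    unsigned int major;
    unsigned int minor;
};

/* Example whitelist (an assumption, chosen for illustration only). */
static const struct allowed_dev whitelist[] = {
    { DEV_CHAR, 1, 3 }, /* /dev/null */
    { DEV_CHAR, 1, 5 }, /* /dev/zero */
    { DEV_CHAR, 1, 8 }, /* /dev/random */
    { DEV_CHAR, 1, 9 }, /* /dev/urandom */
};

/* Return 1 if the requested device is on the whitelist, 0 otherwise. */
int is_allowed(enum dev_type type, unsigned int major, unsigned int minor)
{
    size_t n = sizeof(whitelist) / sizeof(whitelist[0]);

    for (size_t i = 0; i < n; i++) {
        if (whitelist[i].type == type &&
            whitelist[i].major == major &&
            whitelist[i].minor == minor)
            return 1;
    }
    return 0;
}

/*
 * Decide how the supervisor answers an intercepted mknod() request.
 * Returns 0 when the supervisor should go on to create the device itself,
 * or a negative errno value to report back to the supervisee.
 * -EPERM is an assumed choice of error code.
 */
int supervise_mknod(enum dev_type type, unsigned int major, unsigned int minor)
{
    return is_allowed(type, major, minor) ? 0 : -EPERM;
}
```

With this example whitelist, a request for character device 1:3 is accepted, while a request for block device 8:0 is rejected with `-EPERM`.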