This repository has been archived by the owner on May 3, 2023. It is now read-only.

contrib: integrate with systemd units #61

91 changes: 91 additions & 0 deletions README.md
@@ -63,6 +63,97 @@ $ sudo curl --unix-socket /run/traceloop.socket 'http://localhost/dump-by-cgroup

```

## With systemd services, using traceloopctl and the HTTP interface

The `contrib/traceloopctl` helper is a command line tool to manage traceloop logs and has a special mode for systemd units.
This mode works when the trace of a systemd unit is named `systemd_%n_$INVOCATION_ID`, that is, the prefix `systemd`, the systemd unit file name, and the systemd service invocation ID joined by `_` characters.

```
$ contrib/traceloopctl
Usage: contrib/traceloopctl COMMAND|-h|--help
Needs to be run with access to /run/traceloop.socket (e.g., with sudo).
Commands:
list-all
dump-id ID
dump-name NAME
close-id ID
close-name NAME
add-current-cgroup NAME

list-sd-units
list-sd-traces SERVICE
dump-sd SERVICE INVOCATION|-1
close-sd SERVICE INVOCATION|-1|all
add-current-cgroup-sd SERVICE INVOCATION

The *-sd commands assume the trace name format is systemd_UNIT_INVOCATIONID which can be automated with:
ExecStartPre=+/…/contrib/traceloopctl add-current-cgroup-sd "%n" "$INVOCATION_ID"
```
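The naming convention can be sketched in a few lines of shell. The helper names `sd_trace_name`, `sd_trace_unit`, and `sd_trace_invocation` are hypothetical and not part of `traceloopctl`. The sketch assumes the invocation ID contains no `_` character (systemd invocation IDs are 32 hexadecimal characters), so it splits at the last `_` and also copes with unit names that contain underscores.

```shell
# Hypothetical helpers illustrating the systemd_UNIT_INVOCATIONID format.

# Build a trace name from a unit name ($1) and an invocation ID ($2).
sd_trace_name() {
  printf 'systemd_%s_%s\n' "$1" "$2"
}

# Extract the invocation ID: everything after the last "_".
sd_trace_invocation() {
  printf '%s\n' "${1##*_}"
}

# Extract the unit name: drop the "systemd_" prefix and the "_INVOCATIONID" suffix.
sd_trace_unit() {
  name="${1#systemd_}"
  printf '%s\n' "${name%_*}"
}
```

For example, `sd_trace_unit "$(sd_trace_name my-service.service "$INVOCATION_ID")"` prints the unit name again.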

First you need to make sure that traceloop runs as a service:

```
sudo cp contrib/traceloop.service /etc/systemd/system/traceloop.service
# You can enable it always or just start it on demand pulled in as dependency:
sudo systemctl enable --now traceloop.service
```

Now add `Requires=traceloop.service` and `After=traceloop.service` directives to the `[Unit]` section of your service.
In the `[Service]` section, add a command that registers the unit's cgroup with traceloop as the very first `ExecStartPre=` line: `ExecStartPre=+/PATH/TO/kinvolk/traceloop/contrib/traceloopctl add-current-cgroup-sd "%n" "$INVOCATION_ID"`.
The `+` prefix makes systemd run the command with full privileges: as root, regardless of any `User=` directive, and without filesystem restrictions such as `ProtectSystem=`.
This lets the command use the system's executables and write to `/run/traceloop.socket` regardless of the restrictions that apply to the regular `ExecStartPre=`/`ExecStart=` processes.
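Because `ExecStartPre=` commands run inside the unit's cgroup, the registration command can look up its own cgroup and pass it on. The following is a minimal sketch of such a lookup on a cgroup v2 system. The function name `current_cgroup_v2` is hypothetical, and it is an assumption that the `contrib/current-cgroup` helper works in a similar way; the exact path format traceloop expects for `cgrouppath` is not shown here.

```shell
# Hypothetical helper: print the cgroup v2 path of the current process.
# $1 (optional): file to read, defaults to /proc/self/cgroup.
current_cgroup_v2() {
  file="${1:-/proc/self/cgroup}"
  # On cgroup v2 the entry has the form "0::/path"; print only the path part.
  sed -n 's/^0:://p' "$file"
}
```

Called without arguments inside a service, this prints a path such as `/system.slice/my-service.service`, which is relative to the cgroup mount point (commonly `/sys/fs/cgroup`).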

An example unit `my-service.service` looks like this:

```
[Unit]
Description=My Service
# Add:
Requires=traceloop.service
# Add:
After=traceloop.service

[Service]
User=1000
Group=1000
ProtectSystem=strict
NoNewPrivileges=yes

# Add:
ExecStartPre=+/PATH/TO/kinvolk/traceloop/contrib/traceloopctl add-current-cgroup-sd "%n" "$INVOCATION_ID"

ExecStart=/bin/echo Hello World
```

Instead of modifying the original `my-service.service` unit file you can also do the traceloop registration through a small drop-in unit file in `/etc/systemd/system/my-service.service.d/10-traceloop.conf`:

```
[Unit]
Requires=traceloop.service
After=traceloop.service

[Service]
# The + prefix runs the command as root, ignoring User= and filesystem restrictions like ProtectSystem=. This allows us to use the system's curl and write to /run/
ExecStartPre=+/PATH/TO/kinvolk/traceloop/contrib/traceloopctl add-current-cgroup-sd "%n" "$INVOCATION_ID"
```

Start the service with `sudo systemctl daemon-reload; sudo systemctl restart my-service.service` and observe the traces:

```
# List the traced systemd units:
$ contrib/traceloopctl list-sd-units
Traces Units
------- -----
1 my-service.service
# Now list the traces:
$ contrib/traceloopctl list-sd-traces my-service.service
a72e5d0f2b7e405894ca4664ddf205b1
# Dump the trace:
$ contrib/traceloopctl dump-sd my-service.service -1 | less
# Clean up afterwards:
$ contrib/traceloopctl close-sd my-service.service all
closed
```
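The unit listing above is a per-unit count of matching trace names. The sketch below reuses the pipeline from `list-sd-units` but reads from standard input, so it can be tried without a running traceloop. The function name `count_sd_units` is hypothetical, and the sample input in the usage note only assumes that each line of the `/list` output contains the trace name in square brackets.

```shell
# Hypothetical helper: count traces per systemd unit from /list-style input on stdin.
count_sd_units() {
  # Keep only the "[systemd_...]" part of each line, take the field between
  # the first and second "_" as the unit name, then count duplicates.
  grep -o '\[systemd_.*\]' | cut -d _ -f 2 | sort | uniq -c
}
```

Piping two lines that contain `[systemd_my-service.service_ID]` with different IDs into `count_sd_units` yields a count of 2 for `my-service.service`. Note that this field-based split truncates unit names that contain an underscore.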

### Talk at Linux Plumbers Conference 2020

A comprehensive presentation was held at LPC 2020 in the Networking and BPF Summit.
12 changes: 12 additions & 0 deletions contrib/traceloop.service
@@ -0,0 +1,12 @@
[Unit]
Description=Traceloop

[Service]
Type=notify
NotifyAccess=all
ExecStartPre=/bin/rm -f /run/traceloop.socket
ExecStart=/bin/sh -c "/PATH/TO/kinvolk/traceloop/traceloop serve & while ! curl -fsS --unix-socket /run/traceloop.socket 'http://localhost/list' > /dev/null; do sleep 1; echo Waiting for traceloop to start up; done ; systemd-notify --ready; wait"
Member

Instead of calling systemd-notify, you could do it in Golang:
https://vincent.bernat.ch/en/blog/2017-systemd-golang
https://github.com/coreos/go-systemd/blob/v22.3.2/daemon/sdnotify.go#L56

daemon.SdNotify(false, daemon.SdNotifyReady)

Member Author

Yes. If the return value `(false, nil)` is ignored, it should also work fine when traceloop runs outside of a systemd unit.

ExecStopPost=/bin/rm -f /run/traceloop.socket

[Install]
WantedBy=multi-user.target
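The readiness logic packed into the `ExecStart=` one-liner can be sketched as two small shell functions. Both names (`wait_until_ready`, `notify_ready`) and the attempt limit are hypothetical and not part of this change; the unit as written retries forever. `notify_ready` only calls `systemd-notify` when `NOTIFY_SOCKET` is set, which keeps the script usable outside of a systemd unit.

```shell
# Hypothetical helpers; not part of contrib/traceloop.service.

# Retry a probe command ($1) once per second, up to $2 attempts.
# Returns 0 as soon as the probe succeeds, 1 if all attempts fail.
wait_until_ready() {
  probe="$1"
  attempts="$2"
  i=0
  until sh -c "$probe" > /dev/null 2>&1; do
    i=$((i + 1))
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    sleep 1
  done
}

# Tell systemd the service is ready, but only when started by systemd.
notify_ready() {
  if [ -n "${NOTIFY_SOCKET:-}" ]; then
    systemd-notify --ready
  fi
}
```

A wrapper script could start `traceloop serve` in the background, call `wait_until_ready "curl -fsS --unix-socket /run/traceloop.socket 'http://localhost/list'" 30 && notify_ready`, and then `wait` for the server process. Since `systemd-notify` runs as a child of the shell, `NotifyAccess=all` is still needed.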
131 changes: 131 additions & 0 deletions contrib/traceloopctl
@@ -0,0 +1,131 @@
#!/bin/bash
CMD="$1"
ARG1="$2"
ARG2="$3"
set -euo pipefail

if [ $# -lt 1 ] || [ "$CMD" = "-h" ] || [ "$CMD" = "--help" ]; then
echo "Usage: $0 COMMAND|-h|--help"
echo "Needs to be run with access to /run/traceloop.socket (e.g., with sudo)."
echo "Commands:"
echo " list-all"
echo " dump-id ID"
echo " dump-name NAME"
echo " close-id ID"
echo " close-name NAME"
echo " add-current-cgroup NAME"
echo
echo " list-sd-units"
echo " list-sd-traces SERVICE"
echo " dump-sd SERVICE INVOCATION|-1"
echo " close-sd SERVICE INVOCATION|-1|all"
echo " add-current-cgroup-sd SERVICE INVOCATION"
echo
echo "The *-sd commands assume the trace name format is systemd_UNIT_INVOCATIONID which can be automated with:"
echo ' ExecStartPre=+/…/contrib/traceloopctl add-current-cgroup-sd "%n" "$INVOCATION_ID"'
exit 1
fi

SCRIPT_FOLDER="$(dirname "$(readlink -f "$0")")"
CURL="curl -fsS --unix-socket /run/traceloop.socket"
if [ "$CMD" = list-all ]; then
$CURL "http://localhost/list"
elif [ "$CMD" = dump-id ]; then
if [ "$ARG1" = "" ]; then
echo "Expected ID argument" >&2
exit 1
fi
ID="$ARG1"
$CURL "http://localhost/dump?id=${ID}"
elif [ "$CMD" = dump-name ]; then
if [ "$ARG1" = "" ]; then
echo "Expected NAME argument" >&2
exit 1
fi
NAME="$ARG1"
$CURL "http://localhost/dump-by-name?name=${NAME}"
elif [ "$CMD" = close-id ]; then
if [ "$ARG1" = "" ]; then
echo "Expected ID argument" >&2
exit 1
fi
ID="$ARG1"
$CURL "http://localhost/close?id=${ID}"
elif [ "$CMD" = close-name ]; then
if [ "$ARG1" = "" ]; then
echo "Expected NAME argument" >&2
exit 1
fi
NAME="$ARG1"
$CURL "http://localhost/close-by-name?name=${NAME}"
elif [ "$CMD" = add-current-cgroup ]; then
if [ "$ARG1" = "" ]; then
echo "Expected NAME argument" >&2
exit 1
fi
NAME="$ARG1"
CURRENT_CGROUP=$("${SCRIPT_FOLDER}"/current-cgroup)
$CURL "http://localhost/add?name=${NAME}&cgrouppath=${CURRENT_CGROUP}"
elif [ "$CMD" = list-sd-units ]; then
echo " Traces Units"
echo "------- -----"
$CURL "http://localhost/list" | grep -o '\[systemd_.*\]' | cut -d _ -f 2 | sort | uniq -c
Member

The systemd-specific commands *-sd could be added in the Go program, so that the shell script could be simplified. (could be in a follow-up PR)

Member Author

And internally, would traceloop still concatenate a special name, or would you introduce new attributes for Tracelet?

Member

I guess introducing new attributes but I have not given it much thought. Do you have a preference?

elif [ "$CMD" = list-sd-traces ]; then
if [ "$ARG1" = "" ]; then
echo "Expected SERVICE argument" >&2
exit 1
fi
SERVICE="$ARG1"
$CURL "http://localhost/list" | grep -o "\[systemd_${SERVICE}.*\] " | cut -d _ -f 3- | cut -d ] -f 1
elif [ "$CMD" = dump-sd ]; then
if [ "$ARG1" = "" ]; then
echo "Expected SERVICE argument" >&2
exit 1
fi
SERVICE="$ARG1"
if [ "$ARG2" = "" ]; then
echo "Expected INVOCATION argument" >&2
exit 1
fi
INVOCATION="$ARG2"
if [ "$INVOCATION" = "-1" ]; then
INVOCATION=$($CURL "http://localhost/list" | grep -o "\[systemd_${SERVICE}.*\] " | cut -d _ -f 3- | cut -d ] -f 1 | tail -n 1)
fi
$CURL "http://localhost/dump-by-name?name=systemd_${SERVICE}_${INVOCATION}"
elif [ "$CMD" = close-sd ]; then
if [ "$ARG1" = "" ]; then
echo "Expected SERVICE argument" >&2
exit 1
fi
SERVICE="$ARG1"
if [ "$ARG2" = "" ]; then
echo "Expected INVOCATION argument" >&2
exit 1
fi
if [ "$ARG2" = "-1" ]; then
INVOCATIONS=$($CURL "http://localhost/list" | grep -o "\[systemd_${SERVICE}.*\] " | cut -d _ -f 3- | cut -d ] -f 1 | tail -n 1)
elif [ "$ARG2" = all ]; then
INVOCATIONS=$($CURL "http://localhost/list" | grep -o "\[systemd_${SERVICE}.*\] " | cut -d _ -f 3- | cut -d ] -f 1)
else
INVOCATIONS="$ARG2"
fi
for ID in $INVOCATIONS; do
$CURL "http://localhost/close-by-name?name=systemd_${SERVICE}_${ID}"
done
elif [ "$CMD" = add-current-cgroup-sd ]; then
if [ "$ARG1" = "" ]; then
echo "Expected SERVICE argument" >&2
exit 1
fi
SERVICE="$ARG1"
if [ "$ARG2" = "" ]; then
echo "Expected INVOCATION argument" >&2
exit 1
fi
INVOCATION="$ARG2"
CURRENT_CGROUP=$("${SCRIPT_FOLDER}"/current-cgroup)
$CURL "http://localhost/add?name=systemd_${SERVICE}_${INVOCATION}&cgrouppath=${CURRENT_CGROUP}"
else
echo "Unknown command \"$CMD\"" >&2
exit 1
fi
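The argument checks repeated in every branch of the script could be factored into one helper. This is only a sketch of that idea; the name `require_arg` is hypothetical and the script in this change does not define it.

```shell
# Hypothetical helper: fail with the script's usual message when an argument is empty.
# $1: label used in the message (e.g. ID, NAME, SERVICE), $2: the value to check.
require_arg() {
  if [ -z "$2" ]; then
    echo "Expected $1 argument" >&2
    return 1
  fi
}
```

A branch such as `dump-id` would then start with `require_arg ID "$ARG1" || exit 1` instead of the four-line `if` block.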