feat: Add nftables for firewalling #5

MoeMahhouk · 2024-10-21T16:57:27Z

No description provided.

Ruteri · 2024-10-21T18:14:33Z

recipes-core/custom-nftables-rules/files/nftables-production.conf

+        tcp dport 8551 accept comment "Consensus Client Engine API"
+        tcp dport 30303 accept comment "Execution Client P2P"
+        tcp dport { 8545, 8546 } accept comment "Execution Client RPC"
+        tcp dport 27017 accept comment "Searcher Input Channel"


What's listening on this one?

Based on Angela's doc, it is the searcher's input channel

Ruteri · 2024-10-21T18:16:15Z

recipes-core/custom-nftables-rules/files/nftables-production.conf

+        ct state established,related accept
+
+        # Production mode rules
+        tcp dport 8551 accept comment "Consensus Client Engine API"


Hm, should we be accepting requests to CL API apart from p2p?
I think Lighthouse listens at 9000 by default, should probably be listened for and accepted

yeah, you are right but I was following the table in Angela's doc here.
I guess, this is specific to the deployment of searcher stuff within podman. We need more context sync with Angela on that doc

Spoke with searcher, they prefer to run CL on their machine too. So let's remove 8551, open 9000 on our trusted CL node outside, and forward to 9000 on the searcher's machine.

Ruteri · 2024-10-21T18:18:41Z

recipes-core/custom-nftables-rules/files/nftables-production.conf

+
+        # Production mode rules
+        tcp dport 30303 accept comment "Execution Client P2P"
+        tcp dport { 8545, 8546 } accept comment "Execution Client RPC"


Hm, should we accept EL RPC to anyone external?
I wonder if there's a good way for us to restrict RPC to only between containers

I am not sure about that but I guess princess is running an EL node within podman too. Or that was my impression from Angela's doc.
However, this is just a draft and not the final nftable confs. I would require @bakhtin to check them out and provide me with the conf that we usually require from production instances instead.

We wanted to open 8545 for the hackathon so we wouldn't have to wait for their EL node inside to sync, but removing this from the spec now.

Ruteri

I guess one caveat is this only works if the untrusted user themselves cannot listen on those ports, which can be the policy but should be up front and obvious that it's the requirement
I'm guessing the script enabling and disabling maintenance mode is wip?

I wonder if we can make the user containers have explicit networks https://github.com/containers/podman/blob/main/docs/tutorials/basic_networking.md

Ruteri · 2024-10-21T18:27:00Z

One interesting thing is putting containers in a single pod (see Communicate between two rootless containers in a pod) -- that way they can communicate over localhost, and we would not have to open any of the ports on the host machine.
This could get rid of the requirement for allowing RPC ports in particular

MoeMahhouk · 2024-10-21T18:51:14Z

I guess one caveat is this only works if the untrusted user themselves cannot listen on those ports, which can be the policy but should be up front and obvious that it's the requirement I'm guessing the script enabling and disabling maintenance mode is wip?

I wonder if we can make the user containers have explicit networks https://github.com/containers/podman/blob/main/docs/tutorials/basic_networking.md

yes, the maintenance and production mode switching script is the next step after finalizing those nftable firewalling stuff.

I guess we need both the nftables firewalling for the root/user and also explicit network configuration for the user's containers.
If our devops (@bakhtin) already have such confs available, I can re-use the in-prod ones. Otherwise, we need to craft those for this use case.

MoeMahhouk · 2024-10-21T18:54:15Z

One interesting thing is putting containers in a single pod (see Communicate between two rootless containers in a pod) -- that way they can communicate over localhost, and we would not have to open any of the ports on the host machine. This could get rid of the requirement for allowing RPC ports in particular

Interesting, we could look into that approach but not sure what's the hard requirements for princess use-case is. wouldn't their LH/EL nodes within the pod containers require any external access?
Is it only going to be the port for providing the state diffs through it? Not sure how their deployment works since they do it manually.
Some of it is described here

MoeMahhouk · 2024-10-24T00:06:44Z

recipes-core/searcher-ssh-key/files/ns-enter.c

+    if (!wait_for_namespace()) {
+        fprintf(stderr, "Network namespace not available, running in default namespace\n");
+        execl("/bin/sh", "sh", NULL);
+        perror("execl");
+        return 1;


We should probably just crash it if the namespace is not available and not allow any ssh login as a drastic security measurement

MoeMahhouk · 2024-10-24T00:08:30Z

recipes-core/searcher-ssh-key/files/ns-enter.c

+    // Temporarily elevate privileges
+    if (setuid(0) != 0) {
+        perror("setuid");
+        return 1;
+    }


unfortunately, despite setting the binary as SUID to execute with root privileges, it didn't work because setting the namespace requires root privileges and was returning operation not permitted. So we need this hack here were we escalate the privileges just to enter the name space and then set it back to the searcher user in line 80

I'd add dropping privileges as atexit just in case

fnerdman · 2024-10-28T08:47:28Z

recipes-core/custom-nftables-rules/files/nftables-init

There should be some logic that puts the machine in automatic maintenance mode if the nftables firewall failed to be setup correctly. Ideally maintenance mode shouldn't require firewall rules to work, so if the firewall setup fails for some reason, we're still in a situation where no live data can be leaked.

fnerdman · 2024-10-28T09:06:49Z

recipes-core/searcher-ssh-key/files/ns-enter.c

+    return 0;
+}
+
+int enter_namespace() {


what are the restrictions of this namespace? is it only network wise?

This is all described in details in the searcher-net-init script below.

fnerdman · 2024-10-28T16:17:36Z

One of the main headaches I have with this approach is entering the namespace via .profile.

Could you try setting up a wrapper shell that is calling ns-enter?

# 1. Create the restricted shell
cat > /bin/rsh << 'EOF'
#!/bin/busybox sh
exec /bin/sh
EOF

# 2. Set correct permissions
chmod 755 /bin/rsh
chown root:root /bin/rsh

# 3. Add to /etc/shells
echo "/bin/rsh" >> /etc/shells

MoeMahhouk · 2024-10-29T10:37:14Z

One of the main headaches I have with this approach is entering the namespace via .profile.

Could you try setting up a wrapper shell that is calling ns-enter?
# 1. Create the restricted shell
cat > /bin/rsh << 'EOF'
#!/bin/busybox sh
exec /bin/sh
EOF

# 2. Set correct permissions
chmod 755 /bin/rsh
chown root:root /bin/rsh

# 3. Add to /etc/shells
echo "/bin/rsh" >> /etc/shells

yeah, I simplified this yesterday through the dropbear configuration to enter a similar wrapper shell for the transition.
The latest commit that I pushed added the mode-controller feature with the wrapper shell to work well together

Ruteri · 2024-10-29T11:32:47Z

recipes-core/custom-nftables-rules/files/nftables-maintenance.conf

+define BUILDER_ENDPOINT = 0.0.0.0     # Replace with actual builder endpoint IP
+
+table inet filter {
+    chain input {


Should we not accept by default and drop everything coming from & going to BUILDER_ENDPOINT?

I thought of this too but I think in the maintenance mode the searcher would need this to sync their stuff when deploying the podman for their EL/CL. Wouldn't they need to send and receive from peer connections?

That's why I only opened those and dropped everything else including the state diff stream which is sensitive and not necessary in the maintenance stage.

Ruteri · 2024-10-29T11:33:24Z

recipes-core/custom-nftables-rules/files/nftables-production.conf

+        ct state established,related accept
+
+        # Allow only builder endpoint connections to these ports
+        ip saddr $BUILDER_ENDPOINT tcp dport 9000 accept comment "FB-operated Consensus Client P2P"


I guess more accurate would be "Builder-operated..."

yeah, I took the naming from the table in the document but that can be changed for sure

Ruteri · 2024-10-29T11:36:33Z

recipes-core/custom-nftables-rules/files/nftables-production.conf

+        type filter hook output priority 0; policy drop
+
+        # Allow established/related connections
+        ct state established,related accept


hm, if a searcher establishes a connection during maintenance to their own server, wouldn't this rule let them continue sending stuff? Or are you maybe dropping the established connections on transition to production?

I am currently dropping/killing everything the searcher establishes during transition. But that is why I also want a way to test this somehow.
It is kind of difficult to check which ones to keep established and which one to drop. Also, we need to make sure that the undesired established connection doesn't try to re-establish itself upon termination during the transition. But that shouldn't be possible because the transition nftable configuration should drop anything not allowed.

Ruteri · 2024-10-29T11:37:11Z

recipes-core/custom-nftables-rules/files/nftables-production.conf

+        ip daddr $BUILDER_ENDPOINT tcp dport 8645 accept comment "Builder RPC"
+
+        # Allow fluent-bit to access log endpoint
+        ip daddr $SEARCHER_LOG_HOST tcp dport $SEARCHER_LOG_PORT accept comment "Logs Collector"


Should this be only from the root network ns?

yes, only the root should be able to access this and intercept sent/received stuff over that address:port

Ruteri · 2024-10-29T11:37:57Z

recipes-core/custom-nftables-rules/files/nftables-production.conf

+        ip daddr $SEARCHER_LOG_HOST tcp dport $SEARCHER_LOG_PORT accept comment "Logs Collector"
+
+        # Allow traffic to searcher namespace
+        oifname "veth0" accept comment "Searcher network interface"


Isn't veth0 root ns? I think the rule is correct but the comment is not

# Searcher Namespace Root Namespace External # [veth1 10.0.0.2] [veth0 10.0.0.1] [eth0]

IIUC, this opens that port to the outside. so it is more like veth0 -> eth0-> external world.
Internally, the veth1 is connected to veth0 and anything from the searcher trying to reach that address will be intercepted by veth0 to handle with our delayed fluent-bit.

Ruteri · 2024-10-29T11:39:14Z

recipes-core/custom-nftables-rules/files/nftables-transition.conf

+flush ruleset
+
+table inet filter {
+    chain input {


I'd guess we should allow EL and CL

Yes, we probably should keep those EL/CL connections. I believe I did this as a strict transition configuration and then re-allow them.
I have no strong opinion on this. So we can re-add the EL/CL rules here again without the state diff ofc

Ruteri · 2024-10-29T11:40:20Z

recipes-core/searcher-ssh-key/files/mode-controller.c

+    // Wait for processes to terminate
+    sleep(2);
+
+    // Force kill remaining processes


Hm, why are you killing processes? Wouldn't dropping connections be enough?

I tried dropping but the already established ones where still intakt and upon several tries with different approaches, the killing was doing it. But I am for sure open to suggestions on how and which connections to drop

Ruteri · 2024-10-29T11:42:30Z

recipes-core/searcher-ssh-key/files/mode-controller.c

+    long pid;
+
+    // Elevate privileges
+    if (setuid(0) != 0 || setgid(0) != 0) {


I'd set reverting privilege as atexit before elevating them
https://en.cppreference.com/w/c/program/atexit

Interesting, yeah. We can abstract the drop privileges in a separate function and wrap it with atexit

Ruteri · 2024-10-29T11:44:14Z

recipes-core/searcher-ssh-key/files/ns-enter.c

+    // Temporarily elevate privileges
+    if (setuid(0) != 0) {
+        perror("setuid");
+        return 1;
+    }


I'd add dropping privileges as atexit just in case

Ruteri · 2024-10-29T11:54:10Z

recipes-core/searcher-ssh-key/files/mode-controller.c

+        return 1;
+    }
+
+    if (strcmp(argv[1], "transition-to-maintenance") == 0) {


General comment that strcmp is a buffer overrun waiting to happen - I'd strongly recommend strnlen_s + strncmp for all the comparisons (or C++ but it's a bit too late for that)

I can try to refactor all to C++ or different language if necessary. Like Go.
First I just wanted to quickly have something running and doing the logic we need in a compiled binary.
I avoided Rust for the sake of delivery speed but we can also consider it if needed

Ruteri · 2024-10-29T11:59:09Z

recipes-core/custom-nftables-rules/files/nftables-transition.conf

+    }
+
+    chain output {
+        type filter hook output priority 0; policy drop


I'd guess established output is alright?

in the transition, you need to make sure that the killed undesired connection do not re-establish themselves during transition. So anything not explicitly added should be by default not allowed to establish during transition

Ruteri · 2024-10-29T12:05:39Z

recipes-core/searcher-ssh-key/files/searcher-net-init

+
+    # Configure veth1 inside the namespace
+    log "Configuring veth1 in namespace $NAMESPACE..."
+    if ! ip netns exec "$NAMESPACE" ip addr add 10.0.0.2/24 dev veth1; then


Shouldn't this be 10.0.0.2/32 or 10.0.2.0/24? Similar comment for the 10.0.0.1

Could you elaborate more on why?

Ruteri · 2024-10-29T12:09:19Z

recipes-core/searcher-ssh-key/files/mode-controller.c

+    }
+
+    // Apply final rules
+    snprintf(cmd, sizeof(cmd), "/usr/sbin/nft -f /etc/nftables-%s.conf", to_mode);


I'd avoid reusing the buffer here

I agree, but in this case it is hardcoded size but I see we don't clear it which indeed might be a problem.
I will refactor later on.

Ruteri · 2024-10-29T12:09:50Z

recipes-core/searcher-ssh-key/files/mode-controller.c

+    return ret;
+}
+
+static int execute_transition_script(const char *from_mode, const char *to_mode) {


I'd guess the function would be simpler if you hardcoded the transitions as one function each

I had them as one function each but it became annoying each time to refactor an issue/bug, I had to handle them in each separately due to duplication. So here I left the common code.
I can separate them again if we see it better or more readable.

Ruteri · 2024-10-29T12:11:01Z

recipes-core/searcher-ssh-key/files/mode-controller.c

+    uid_t real_gid = getgid();
+
+    log_message(LOG_INFO, "Applying transition rules...");
+    if (setuid(0) != 0) {


I don't think you drop the privilege

good catch. This only elevates and down below it should drop it again

Ruteri · 2024-10-29T12:20:18Z

recipes-core/searcher-ssh-key/files/mode-controller.c

+}
+
+static int read_current_mode(char *mode, size_t size) {
+    if (acquire_lock() != 0) {


I'd probably lock the whole execution -- I don't think concurrent access is handled

So you mean not on function level but rather directly in the main. Lock it and only unlock it when finished instead?

Ruteri · 2024-10-29T12:25:15Z

recipes-core/searcher-ssh-key/files/shell-wrapper.sh

+export PS1='searcher@tdx:\w\$ '
+
+# Check current mode and handle transition if needed
+mode=$(mode-controller status | awk '{print $NF}')


An alternative would be to not read the mode here, instead call mode-controller with no arguments and let it figure out the mode.
I think /usr/bin/ns-enter can be called regardless of mode which is super convenient

I had it all in mode-controller originally but there were issues regarding the user not being set correctly.
Also the execution status check was not working correctly upon ssh-in into the machine while in production mode.
That's why we need this check in the login-shell for it to check prior to passing it to the mode-controller

MoeMahhouk requested review from bakhtin and Ruteri October 21, 2024 16:57

feat: Add nftables for firewalling

90bce81

MoeMahhouk force-pushed the firewalling branch 2 times, most recently from 7e99dc8 to 946d047 Compare October 21, 2024 17:31

Add the template nftables confs for maintenance and production modes

9d228a4

MoeMahhouk force-pushed the firewalling branch from 946d047 to 9d228a4 Compare October 21, 2024 17:51

Ruteri reviewed Oct 21, 2024

View reviewed changes

MoeMahhouk force-pushed the firewalling branch 2 times, most recently from c329e38 to 7243a78 Compare October 23, 2024 15:49

Adjust the firewall configurations for maintenance and production mode

41a607d

MoeMahhouk force-pushed the firewalling branch from 7243a78 to 41a607d Compare October 23, 2024 15:49

MoeMahhouk added 2 commits October 23, 2024 23:14

feat: Add network namespace separation for searcher user

bdc0d51

Add chain forward for the nftable configurations

dedf994

MoeMahhouk commented Oct 24, 2024

View reviewed changes

fnerdman reviewed Oct 28, 2024

View reviewed changes

feat: Add mode controller for the execution mode transition

2275ee0

Ruteri reviewed Oct 29, 2024

View reviewed changes

feat: Add nftables for firewalling #5

Are you sure you want to change the base?

feat: Add nftables for firewalling #5

Conversation

MoeMahhouk commented Oct 21, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MoeMahhouk Oct 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Ruteri left a comment • edited Loading

Choose a reason for hiding this comment

Ruteri commented Oct 21, 2024

MoeMahhouk commented Oct 21, 2024

MoeMahhouk commented Oct 21, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

fnerdman commented Oct 28, 2024

MoeMahhouk commented Oct 29, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MoeMahhouk Oct 21, 2024 •

edited

Loading

Ruteri left a comment •

edited

Loading