WIP tests/e2e: Add encrypted image test for operator #446

Open
wants to merge 1 commit into
base: main

GabyCT
Contributor

@GabyCT GabyCT commented Sep 18, 2024

This PR adds an encrypted image test for the operator repository.

@GabyCT GabyCT requested a review from a team as a code owner September 18, 2024 22:54
@GabyCT GabyCT changed the title tests/e2e: Add encrypted image test for operator WIP tests/e2e: Add encrypted image test for operator Sep 18, 2024
uses: docker/login-action@v3
with:
  registry: ghcr.io
  username: ${{ github.actor }}
Contributor Author

@stevenhorsman I think we need to create these secrets and have that info. Is this something you can help me set up?

Member

I think the github.actor value is set automatically by GitHub Actions (it is the user who triggered the workflow run), rather than being a secret.

spec:
  runtimeClassName: RUNTIMECLASS
  containers:
  - image: docker://ghcr.io/confidential/nginx:latest
Contributor Author

The image name may change depending on the secrets we end up having for the GitHub Actions workflow.

Member

Same comment as above

Contributor Author

changes applied

- name: Install skopeo
  shell: |
    sudo apt-get install -y build-essential libgpgme-dev libassuan-dev libbtrfs-dev pkg-config go-md2man
    git clone https://github.com/containers/skopeo
Member

I think @fidencio found in confidential-containers/guest-components#669 that we needed a specific version of skopeo in order to get the round trip encryption working? It might be we can get a newer one going, but maybe it's safest to start with that one?

Contributor Author

Well, I tried skopeo at commit f64a376, but it does not seem to work. When I check the logs to see whether the key is being retrieved from the kbs, using kubectl logs -f "$POD" | grep "$KEY_PATH", the key path does not show up. The kbs log contains only:

[2024-09-23T17:30:00Z INFO  kbs] Using config file /etc/kbs/kbs-config.toml
[2024-09-23T17:30:00Z WARN  attestation_service::rvps] No RVPS address provided and will launch a built-in rvps
[2024-09-23T17:30:00Z INFO  attestation_service::token::simple] No Token Signer key in config file, create an ephemeral key and without CA pubkey cert
[2024-09-23T17:30:00Z INFO  kbs] Starting HTTP server at [0.0.0.0:8080]
[2024-09-23T17:30:00Z INFO  actix_server::builder] starting 256 workers
[2024-09-23T17:30:00Z INFO  actix_server::server] Tokio runtime found; starting in existing Tokio runtime

export kbs_svc_name="kbs"
export kbs_ingress_name="kbs"
export runtimeclass="kata-qemu"
export encrypted_image="ghcr.io/confidential/nginx:latest"
Member

Suggested change
export encrypted_image="ghcr.io/confidential/nginx:latest"
export encrypted_image="ghcr.io/confidential-containers/nginx:encrypted"

Member

If we can pass in the repo via an environment variable and set it in the workflow, that would be even better.
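
A minimal sketch of that suggestion, assuming a hypothetical `image_repo` variable exported by the workflow (the name is illustrative and not taken from this PR). The script falls back to a default only when the variable is unset or empty:

```shell
# Hypothetical helper: build the encrypted image reference from an
# environment variable set by the workflow, with a fallback default.
# `image_repo` is an assumed variable name, not one defined in this PR.
resolve_encrypted_image() {
    local repo="${image_repo:-ghcr.io/confidential-containers}"
    echo "${repo}/nginx:encrypted"
}
```

The workflow could then export `image_repo` in a step's environment, and the script would call `resolve_encrypted_image` wherever it currently hardcodes the image name.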

Contributor Author

changes applied

spec:
  runtimeClassName: RUNTIMECLASS
  containers:
  - image: docker://ghcr.io/confidential/nginx:latest
Member

Same comment as above

with:
  registry: ghcr.io
  username: ${{ github.actor }}
  password: ${{ secrets.GITHUB_TOKEN }}
Contributor Author

@stevenhorsman I think here we will need to create the secret

Member

The GITHUB_TOKEN is automatically created, so we don't need to make it.

popd
}

deploy_k8s_kbs() {
Contributor Author

Do we need something else for the kbs configuration?

kubectl delete "${pod_name}"
}

check_image_key() {
Contributor Author

This is the part that is currently failing. It seems we can't find the secret, and I'm not sure whether some configuration is missing @wainersm

Member

@fitzthum fitzthum left a comment

Looks good. I made a few suggestions.

Does this pass as is?

guest_components_repo="https://github.com/confidential-containers/guest-components.git"
guest_components_path="${script_dir}/guest-components"
if [ -d "${guest_components_path}" ]; then
echo "Guest components repo directory exists"
Member

Maybe we should always clone guest components in case the version on the host is out of date

Contributor Author

changes applied

tag_name="coco-keyprovider"
docker build -t "${tag_name}" -f ./attestation-agent/docker/Dockerfile.keyprovider .
mkdir -p oci/{input,output}
skopeo copy docker-daemon:unencrypted:latest dir:./oci/input
Member

Once we get the basic flow working, let's think about adding a random secret to the image (rather than "something confidential") and checking for that value inside the guest. Also, we can grep for that string in the encrypted layer and make sure it doesn't show up.
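
A rough sketch of that idea, assuming the encrypted layers are available as files in a local directory (for example the skopeo `dir:` output). Both function names are hypothetical:

```shell
# Generate a random 32-character hex string to embed in the image
# instead of a fixed value.
generate_secret() {
    head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Return 0 only if the secret does NOT appear in any file under the
# given directory, 1 if it is found, 2 if the directory is missing.
secret_absent_from_layers() {
    local secret="$1" layers_dir="$2"
    [ -d "${layers_dir}" ] || return 2
    if grep -rqF -- "${secret}" "${layers_dir}"; then
        return 1
    fi
    return 0
}
```

Note that image layers are normally compressed, so a plain grep would likely miss the string even in an unencrypted layer. Treat this as a coarse smoke test unless the comparison is made against uncompressed data.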

fi

pushd "${script_dir}/trustee/kbs/config/kubernetes"
echo "somesecret" > overlays/$(uname -m)/key.bin
Member

btw you could probably simplify things by providing the image decryption key here (instead of "somesecret"). in that case you could probably remove provide_image_key.
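
A minimal sketch of writing a real key to that file instead of a placeholder string. The 32-byte length is an assumption (the PR does not state the expected key size), and `write_image_key` is a hypothetical name:

```shell
# Write a random key to the given path.
# Assumption: 32 random bytes; adjust if the encryption step expects
# a different key length or encoding.
write_image_key() {
    local key_file="$1"
    head -c 32 /dev/urandom > "${key_file}"
}
# Usage: write_image_key "overlays/$(uname -m)/key.bin"
```

The same file would then have to be used when encrypting the image, and the resource path under which kbs serves it would presumably need to match the key ID the image was encrypted with.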

force: yes
retries: 3
delay: 10
- name: Build and install the kbs client
Member

It doesn't look like you actually use the kbs-client in this test. You can probably skip installing and uninstalling it.

@GabyCT GabyCT force-pushed the topic/addencryptedtest branch 3 times, most recently from 0523d88 to ac44ba4 Compare September 23, 2024 20:32
export aa_kbc="${aa_kbc:-cc_kbc}"
export kbs_svc_name="${kbs_svc_name:-kbs}"
export kbs_ingress_name="${kbs_ingress_name:-kbs}"
export runtimeclass="${runtimeclass:-kata-qemu}"
Member

Hi @GabyCT !

I started reviewing and, more importantly, running this script. I'm executing it function by function, and so far I have run build_encryption_key().

The first thing that caught my attention was the default value of runtimeclass. It should be kata-qemu-coco-dev, and the encrypted test should not run on kata-qemu or kata-clh.
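
One way to enforce this in the script is a small guard, sketched below. The allow-list is an assumption based on this thread (kata-qemu-coco-dev for non-TEE hosts, with kata-qemu-tdx as an example TEE runtime class), and the function name is hypothetical:

```shell
# Return 0 if the encrypted image test should run for the given
# runtime class, 1 otherwise.
# Assumption: only these runtime classes can pull encrypted images.
supports_encrypted_images() {
    case "$1" in
        kata-qemu-coco-dev|kata-qemu-tdx) return 0 ;;
        *) return 1 ;;
    esac
}
# Usage:
#   supports_encrypted_images "${runtimeclass}" || { echo "skipping"; exit 0; }
```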

Contributor Author

@wainersm the kata-qemu-coco-dev runtime class does not exist, as we can see

Member

Yes, not yet. It's on my TODO list to create a new job for kata-qemu-coco-dev.

export encrypted_image="${encrypted_image:-ghcr.io/$username/nginx:encrypted}"

build_encryption_key() {
docker build -t unencrypted .
Member

Side note: On v0 of this test I think it's fine to build the encrypted image all the time, but long term we will need to catch up with @portersrc 's work to have these test images all built once and shared amongst tests.


pushd "${guest_components_path}"
export tag_name="coco-keyprovider"
docker build -t "${tag_name}" -f ./attestation-agent/docker/Dockerfile.keyprovider .
Member

guest-components published an image of the keyprovider for v0.10.0. There is a workflow that will keep publishing that image for future releases, so I think we can consume the prebuilt one instead of building it ourselves.

export kbs_ingress_name="${kbs_ingress_name:-kbs}"
export runtimeclass="${runtimeclass:-kata-qemu}"
export username="${username:-}"
export encrypted_image="${encrypted_image:-ghcr.io/$username/nginx:encrypted}"
Member

The operator e2e CI installs and launches a container registry at localhost:5000. One option we could explore later is to push the encrypted image to that registry rather than ghcr.io. I managed to push an encrypted image to localhost:5000, but I had to disable TLS on all skopeo operations because the local registry is configured with HTTP only. Thus, image-rs in kata's guest may not be able to talk to the local registry.

I wanted to record that idea, but it should not be a priority now :D

echo "somesecret" > overlays/$(uname -m)/key.bin
export DEPLOYMENT_DIR=nodeport
./deploy-kbs.sh
kbs_pod=$(kubectl -n "${kbs_namespace}" get pods -o NAME)
Member

The kbs pod may not have started yet, in which case kbs_pod will be empty. This happened in my environment.

On kata-containers we use the waitForProcess function to wait for the pod to show up. But a simpler solution would be to use the kubectl rollout command (and drop the next kubectl wait call): kubectl rollout status -w --timeout=30s deployment/kbs -n coco-tenant
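
If polling is preferred over the rollout command, a generic wait helper could look like the sketch below. `wait_for` and `kbs_pod_listed` are hypothetical names written in the spirit of waitForProcess, not the kata-containers implementation:

```shell
# Poll a command until it succeeds or the timeout (in seconds) expires.
# Arguments: timeout, polling interval, then the command to run.
wait_for() {
    local timeout="$1" interval="$2"
    shift 2
    local waited=0
    while ! "$@"; do
        if [ "${waited}" -ge "${timeout}" ]; then
            return 1
        fi
        sleep "${interval}"
        waited=$((waited + interval))
    done
    return 0
}

# Example predicate: true once at least one pod is listed in the namespace.
kbs_pod_listed() {
    [ -n "$(kubectl -n "${kbs_namespace}" get pods -o NAME 2>/dev/null)" ]
}
# Usage: wait_for 30 2 kbs_pod_listed
```

Listing a pod only shows that it exists, not that it is ready, so the rollout command remains the stricter check.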

Contributor Author

changes applied

yq -i ".${annotation_key} = \"${value}\"" "${yaml}"
}

set_aa_kbc() {
Member

The added kernel_params metadata is not quoted:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx-encrypted
  annotations:
    io.katacontainers.config.hypervisor.kernel_params: agent.guest_components_rest_api=resource agent.aa_kbc_params=cc_kbc::http://192.168.122.180:31231
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx

<snip>

I don't know whether it will be a problem, but we had better quote it to avoid any parsing issues. Something like io.katacontainers.config.hypervisor.kernel_params: ' agent.guest_components_rest_api=resource agent.aa_kbc_params=cc_kbc::http://192.168.122.180:31231'
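
A possible way to get the quoted form out of yq, sketched under the assumption that the script uses yq v4 (mikefarah), where the `style` operator and `strenv()` are available. Both function names are hypothetical, and the annotation path is taken from the YAML above:

```shell
# Build the kernel_params annotation value. The leading space mirrors
# the quoted example above.
kernel_params_value() {
    local kbs_address="$1"
    printf ' agent.guest_components_rest_api=resource agent.aa_kbc_params=%s::%s' \
        "${aa_kbc:-cc_kbc}" "${kbs_address}"
}

# Assumption: yq v4, where style="single" forces single quotes on the
# emitted scalar and strenv() reads a string from the environment.
set_kernel_params() {
    local yaml="$1" kbs_address="$2"
    VALUE="$(kernel_params_value "${kbs_address}")" yq -i '
        .metadata.annotations."io.katacontainers.config.hypervisor.kernel_params" = strenv(VALUE) |
        .metadata.annotations."io.katacontainers.config.hypervisor.kernel_params" style="single"
    ' "${yaml}"
}
```

Passing the value through the environment avoids interpolating it into the yq expression, which sidesteps the escaped double quotes used in the current helper.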

spec:
  runtimeClassName: RUNTIMECLASS
  containers:
  - image: docker://ghcr.io/USERNAME/nginx:encrypted
Member

@wainersm wainersm Sep 26, 2024

docker:// isn't needed. It even refuses to start the container.
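
If the same variable is shared between skopeo (which accepts the `docker://` transport prefix) and the pod manifest, a small helper can drop the prefix before templating the YAML. This is a sketch and `pod_image_ref` is a hypothetical name:

```shell
# Drop a leading "docker://" transport prefix so the reference can be
# used in a pod spec's image field. Other references pass through as-is.
pod_image_ref() {
    echo "${1#docker://}"
}
```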

Contributor Author

changes applied


launch_pod() {
kubectl apply -f "${script_dir}/nginx-encrypted.yaml"
export pod_name=$(kubectl -n default get pods -o NAME)
Member

Use kubectl rollout here too.

Contributor Author

changes applied

@wainersm
Member

Hi @GabyCT ,

I ran your code and have some results to share.

With the generated nginx-encrypted.yaml, it fails and doesn't even try to connect to kbs. You can debug using the kbs logs by running:

kubectl logs -l app=kbs -n coco-tenant

Then I decided to apply the following pod yaml, which is pretty similar to what is used on Kata CI. Notice that I'm using the kata-qemu-coco-dev runtimeclass and the encrypted image generated by this script (ghcr.io/wainersm/nginx:encrypted):

apiVersion: v1
kind: Pod
metadata:
  name: test-e2e
  annotations:
    io.containerd.cri.runtime-handler: kata-qemu-coco-dev
    io.katacontainers.config.hypervisor.kernel_params: ' agent.guest_components_procs=confidential-data-hub agent.aa_kbc_params=cc_kbc::http://192.168.122.86:30327'
    io.katacontainers.config.agent.policy: IyBDb3B5cmlnaHQgKGMpIDIwMjMgTWljcm9zb2Z0IENvcnBvcmF0aW9uCiMKIyBTUERYLUxpY2Vuc2UtSWRlbnRpZmllcjogQXBhY2hlLTIuMAojCgpwYWNrYWdlIGFnZW50X3BvbGljeQoKZGVmYXVsdCBBZGRBUlBOZWlnaGJvcnNSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBBZGRTd2FwUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgQ2xvc2VTdGRpblJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IENvcHlGaWxlUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgQ3JlYXRlQ29udGFpbmVyUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgQ3JlYXRlU2FuZGJveFJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IERlc3Ryb3lTYW5kYm94UmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgRXhlY1Byb2Nlc3NSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBHZXRNZXRyaWNzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgR2V0T09NRXZlbnRSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBHdWVzdERldGFpbHNSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBMaXN0SW50ZXJmYWNlc1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IExpc3RSb3V0ZXNSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBNZW1Ib3RwbHVnQnlQcm9iZVJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IE9ubGluZUNQVU1lbVJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFBhdXNlQ29udGFpbmVyUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgUHVsbEltYWdlUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgUmVhZFN0cmVhbVJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFJlbW92ZUNvbnRhaW5lclJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFJlbW92ZVN0YWxlVmlydGlvZnNTaGFyZU1vdW50c1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFJlc2VlZFJhbmRvbURldlJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFJlc3VtZUNvbnRhaW5lclJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFNldEd1ZXN0RGF0ZVRpbWVSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBTZXRQb2xpY3lSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBTaWduYWxQcm9jZXNzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU3RhcnRDb250YWluZXJSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBTdGFydFRyYWNpbmdSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBTdGF0c0NvbnRhaW5lclJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFN0b3BUcmFjaW5nUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgVHR5V2luUmVzaXplUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgVXBkYXRlQ29udGFpbmVyUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgVXBkYXRlRXBoZW1lcmFsTW91bnRzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgVXBkYXRlSW50ZXJmYWNlUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgVXBkYXRlUm91dGVzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgV2FpdFByb2Nlc3NSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBXcml0ZVN0cmVhbVJlcXVlc3QgOj0gdHJ1ZQo=
spec:
  runtimeClassName: kata-qemu-coco-dev
  containers:
    - name: test-container
      #image: ghcr.io/confidential-containers/test-container:multi-arch-encrypted
      image: ghcr.io/wainersm/nginx:encrypted
      imagePullPolicy: Always
      command:
        - sleep
        - "30"

This time it is able to connect to kbs:

[2024-09-26T19:33:41Z INFO  kbs::http::attest] Auth API called.
[2024-09-26T19:33:41Z INFO  actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/auth HTTP/1.1" 200 74 "-" "attestation-agent-kbs-client/0.1.0" 0.000117
[2024-09-26T19:33:41Z INFO  kbs::http::attest] Attest API called.
[2024-09-26T19:33:41Z INFO  attestation_service] Sample Verifier/endorsement check passed.
[2024-09-26T19:33:41Z INFO  attestation_service] Policy check passed.
[2024-09-26T19:33:41Z INFO  attestation_service] Attestation Token (Simple) generated.
[2024-09-26T19:33:41Z INFO  actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/attest HTTP/1.1" 200 2171 "-" "attestation-agent-kbs-client/0.1.0" 0.002289
[2024-09-26T19:33:41Z WARN  kbs::token::coco] No Trusted Certificate in Config, skip verification of JWK cert of Attestation Token
[2024-09-26T19:33:41Z INFO  kbs::http::resource] Get resource from kbs:///default/image_key/nginx
[2024-09-26T19:33:41Z ERROR kbs::http::error] Resource not permitted.
[2024-09-26T19:33:41Z INFO  actix_web::middleware::logger] 10.244.0.1 "GET /kbs/v0/resource/default/image_key/nginx HTTP/1.1" 401 112 "-" "attestation-agent-kbs-client/0.1.0" 0.000556
[2024-09-26T19:33:41Z INFO  kbs::http::attest] Auth API called.
[2024-09-26T19:33:41Z INFO  actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/auth HTTP/1.1" 200 74 "-" "attestation-agent-kbs-client/0.1.0" 0.000088
[2024-09-26T19:33:41Z INFO  kbs::http::attest] Attest API called.
[2024-09-26T19:33:41Z INFO  attestation_service] Sample Verifier/endorsement check passed.
[2024-09-26T19:33:41Z INFO  attestation_service] Policy check passed.
[2024-09-26T19:33:41Z INFO  attestation_service] Attestation Token (Simple) generated.
[2024-09-26T19:33:41Z INFO  actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/attest HTTP/1.1" 200 2171 "-" "attestation-agent-kbs-client/0.1.0" 0.001573

But notice the error [2024-09-26T19:33:41Z ERROR kbs::http::error] Resource not permitted. in the logs above. It means that fetching the resource (the encryption key) is not allowed. This happens with kata-qemu-coco-dev, where we have to set the "allow all" policy in kbs (please see https://github.com/kata-containers/kata-containers/blob/main/tests/integration/kubernetes/k8s-guest-pull-image-encrypted.bats#L33). When running within a TEE environment (e.g. kata-qemu-tdx), this policy should not be set, and it should work as long as attestation is working.
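
To spot this situation quickly in a test, the kbs log can be filtered for denied resources. The sketch below assumes the log format shown above, and `denied_resources` is a hypothetical helper name:

```shell
# Read kbs log lines on stdin and print each resource URI whose request
# was followed by a "Resource not permitted" error.
denied_resources() {
    awk '
        /Get resource from/      { last = $NF }
        /Resource not permitted/ { if (last != "") print last; last = "" }
    '
}
# Usage: kubectl logs -l app=kbs -n coco-tenant | denied_resources
```

For the log excerpt above this prints the requested key path, which makes it easier to tell a policy denial apart from a missing resource.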

OK, then I tried to set the "allow all" policy but hit an error, maybe a bug in the latest kbs-client:

$ kbs-client --url http://192.168.122.86:30327 config --auth-private-key trustee/kbs/config/kubernetes/overlays/x86_64/key.bin set-resource-policy --policy-file ./trustee/kbs/sample_policies/allow_all.rego
Error: Parse error

@fitzthum ^ is it a known issue?

@GabyCT GabyCT force-pushed the topic/addencryptedtest branch 8 times, most recently from 4489846 to d193795 Compare September 26, 2024 23:08
@fitzthum
Member

is it a known issue?

Yeah. I thought we had fixed this, but we have seen problems parsing the policies in the past. Maybe try randomly changing the format :'(

@fitzthum
Member

fitzthum commented Oct 2, 2024

Yeah. I thought we had fixed this, but we have seen problems parsing the policies in the past. Maybe try randomly changing the format :'(

Actually, let me be clearer. I wouldn't expect a parsing problem with that policy in particular, and I just tested it on my machine, where it works fine. We have seen some issues parsing policies in general, but not with this file.

@GabyCT GabyCT force-pushed the topic/addencryptedtest branch 2 times, most recently from 0186de8 to e53c18d Compare October 16, 2024 18:11
This PR adds an encrypted image test for the operator repository.

Signed-off-by: Gabriela Cervantes <[email protected]>

4 participants