-
Notifications
You must be signed in to change notification settings - Fork 344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for internal-frontend #602
base: main
Are you sure you want to change the base?
Add support for internal-frontend #602
Conversation
@@ -242,6 +242,33 @@ server: | |||
containerSecurityContext: {} | |||
topologySpreadConstraints: [] | |||
podDisruptionBudget: {} | |||
internal-frontend: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems to me the helm chart is uses camelCase
convention, would it ok if we renamed internal-frontend
to internalFrontend
? It would also makes the usage a bit easier, we could use {{ if $.Values.server.internalFrontend.enabled }}
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamko147 I had initially done this as internalFrontend
but recall these identifiers are actually used to generate the kubernetes services and it doesn't allow uppercase letters - hence internal-frontend
which corresponds to what the actual temporal feature is named.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh, thanks, did not realize that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as a proposition: you can create a simple helper to use internalFrontend as a "yaml" part and internal-frontend in names when needed:
{{- define "serviceName" -}}
{{- $service := index . 0 -}}
{{- if eq $service "internalFrontend" }}
{{- print "internal-frontend" }}
{{- else }}
{{- print $service }}
{{- end }}
{{- end -}}
Usage:
{{ $service := include "serviceName" (list "frontend" "internalFrontend" "matching" "history" "worker") }}
@@ -242,6 +242,33 @@ server: | |||
containerSecurityContext: {} | |||
topologySpreadConstraints: [] | |||
podDisruptionBudget: {} | |||
internal-frontend: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, to be backwards compatible, should we disable internal frontend by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamko147 agreed, that was meant to be disabled by default, I'll make this update
This seems to contain all of the work (and more) that was done in #571. I'm happy to close that one if we can get this over the line. |
… services and pdbs are not created if disabled
@adamko147 @dleblanc just pushed a commit to address disabling by default, let me know if there's anything else needed here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍looks good to me. still needs 👀 from code owners...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx for the MR, I'm waiting for an official implementation of feature!
@@ -242,6 +242,33 @@ server: | |||
containerSecurityContext: {} | |||
topologySpreadConstraints: [] | |||
podDisruptionBudget: {} | |||
internal-frontend: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as a proposition: you can create a simple helper to use internalFrontend as a "yaml" part and internal-frontend in names when needed:
{{- define "serviceName" -}}
{{- $service := index . 0 -}}
{{- if eq $service "internalFrontend" }}
{{- print "internal-frontend" }}
{{- else }}
{{- print $service }}
{{- end }}
{{- end -}}
Usage:
{{ $service := include "serviceName" (list "frontend" "internalFrontend" "matching" "history" "worker") }}
@@ -33,9 +33,17 @@ spec: | |||
env: | |||
# TEMPORAL_CLI_ADDRESS is deprecated, use TEMPORAL_ADDRESS instead | |||
- name: TEMPORAL_CLI_ADDRESS | |||
{{- if index $.Values.server "internal-frontend" "enabled" }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamko147 Just a question about security:
If we use internal frontend here then everyone who has access to the admin-tools pod will be able to execute commands without any additional auth.
Is this expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, that’s how we use it in our use case. Temporal is running in dedicated cluster/namespace and only frontend is exposed to outside world. access to the k8s cluster controlled via IAM, so unless you don’t have cluster access, you don’t have admin tools access.
it’s also works better for ops person when troubleshooting, who’s logs into admin tools using kubectl and from there has access w/o need to generate access token. and as I said, kubectl access is controlled by IAM connected to our internal company IDP where we also set the auth access
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamko147 what you described matches our use case as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My internal "deny all" rule is screaming that it will be a problem for some people who has stricter polices related to this topic. But if code owners are ok with that then I'm ok too.
@@ -173,8 +182,10 @@ data: | |||
{{- toYaml . | nindent 6 }} | |||
{{- end }} | |||
|
|||
{{- if not (index $.Values.server "internal-frontend" "enabled") }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you test it with authorization?
When I was preparing similar changes I had to add condition to "authorization" + "publicClient" only for frontend. Otherwise internalFrontend will try to do auth and fail. Also, if we skip auth section for frontend then users will be able to access UI without permission.
I ended up with creating a loop to make separated configmap per service
{{- range $originalService := (list "frontend" "internalFrontend" "history" "matching" "worker") }}
{{ $serviceValues := index $.Values.server $originalService }}
{{ $service := include "serviceName" (list $originalService) }}
...
{{- if eq $service "frontend" }}
{{- with $server.config.authorization }}
authorization:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- end }}
...
{{- if ne $service "frontend"}}
internal-frontend:
rpc:
grpcPort: {{ $server.internalFrontend.service.port }}
httpPort: {{ $server.internalFrontend.service.httpPort }}
membershipPort: {{ $server.internalFrontend.service.membershipPort }}
bindOnIP: "0.0.0.0"
{{- end }}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TheGeniesis We have tested it with authorization on frontend (in our case using OIDC) which is working, but I haven't specifically tested auth for internal-frontend. I was under the impression (possibly mistakenly?) that the purpose of internal-frontend was to bypass authorization for internal access, but perhaps there's a middle ground where it has a different level of auth configured. If you have insights or ideas here would appreciate any contributions to this PR as it's a bit outside of my wheelhouse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TheGeniesis I don't think you're mistaken about the internal frontend purpose. It was introduced at temporalio/temporal#3706
The server worker needs to make connections to the frontend with effective "admin" access (to all namespaces). With pluggable claim mappers and authorizers, we can't control the logic in them, and requiring all users to set up mTLS with special claims is a large burden. Adding internal frontends as an alternative is easier and more reliable.
In our use case we have https://docs.temporal.io/self-hosted-guide/security#plugins custom authorizer/claim mapper for JWT tokens issued by our auth server and there is no way other temporal components (e.g. server worker) could get/refresh the token required for frontend access, so internal-frontend solves exactly this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've successfully tested this PR with authorization enabled. Currently we're enabling auth using the built-in authorizer and claim plugins, and it's working without issue.
I haven't leveraged the features that were failing when we didn't have the internal worker enabled (eg: scheduled jobs), but the system did come up without the errors and seemed to be working properly.
I agree with @adamko147 about the internal frontend's purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been trying to find time in between meetings to test this, so I appreciate that confirmation, @dleblanc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I wasn't clear enough.
About definitions
As you mentioned - internal-frontend shouldn't have any authorization and should by used by internal pods (like worker).
Frontend should have authorization and should be used by users (direct calls/web).
My auth scenario
I use Azure EntraID to manage accesses with default Temporal Authorizer + mapper.
The EntraID was configured by following a medium article. The only difference is in system namespace which in the newer version is temporal-system
instead of system
.
The problem which I faced
During tests services were failing, because internal-frontend had "authorization" part in the config. I had to disable it for the internal-frontend, so it stopped trying to do auth for internal services.
The Frontend needs the "authorization" part and publicClient (since it's public).
This MR
The current config uses the same configmap for "frontend" and "internal-frontend". So, it's not possible to have different setup for both (internal without authorization, frontend without internal-frontend).
How I solved the issue in my repo
I have separated configmaps for frontend and internal-frontend with enabled/disabled options described above
I see it's working for you. There might be one more factor which I'm not aware of and it was the reason why it didn't fully work for me. Unfortunately, I'm behind my working schedule and I don't have time to test deployments again with this MR, to confirm that I'm able to reproduce the same problem again :/
type: ClusterIP | ||
port: 7233 | ||
membershipPort: 6933 | ||
httpPort: 7243 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these ports be 6936 and 7236 by default to match https://github.com/temporalio/temporal/blob/main/docker/config_template.yaml#L283-L284
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ports 6936 and 7236 also matches the suggestion in the release notes for version 1.20 (which introduced the internal-frontend).
What was changed
Adds support for internal-frontend which has been available since v1.20
Why?
Because it is necessary for certain auth configurations to allow internal access, also addresses #560