-
Notifications
You must be signed in to change notification settings - Fork 412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-48469: GCP: Update /etc/hosts file when ClusterHostedDNS is enabled #4800
base: master
Are you sure you want to change the base?
Conversation
@sadasu: This pull request references Jira Issue OCPBUGS-48469, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
1bc96ca
to
3280ff8
Compare
apiServerIntURL={{ .Infra.Status.APIServerInternalURL }} | ||
# Add the/etc/hosts configuration file | ||
mkdir -p /etc/hosts/conf.d | ||
cat <<EOF | tee /etc/hosts/conf.d/etc-hosts.conf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could maybe name the file api.conf
or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping that naming it as etc-hosts.conf
would be it obvious that this file contains some configuration for /etc/hosts
. Happy to call it api.conf
is it contains information about resolving the API/-Int urls.
{{ else }} | ||
exit 0 | ||
{{ end }} | ||
if [ -z "${apiIntLBIPs}" ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be:
if [ -z "${apiIntLBIPs}" ]; then | |
if [ -z "{{$apiIntLBIPs}}" ]; then |
?
We appear to be missing a line like:
apiIntLBIPs={{$apiIntLBIPs}}
to get the template variable that we defined on line 19 into a bash variable.
mkdir -p /etc/hosts/conf.d | ||
cat <<EOF | tee /etc/hosts/conf.d/etc-hosts.conf | ||
# Added by OpenShift | ||
${apiLBIPs[0]} ${apiServerURL} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are treating apiLBIPs
as a bash array here. That's fine if it is, but bash arrays always seem like kind of a pain to set up to me. It might be simpler to do this in the template:
${apiLBIPs[0]} ${apiServerURL} | |
{{$apiLBIPs[0]}} ${apiServerURL} |
@@ -0,0 +1,11 @@ | |||
mode: 0755 | |||
path: "/usr/local/bin/update-etc-hosts" | |||
contents: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially a simpler way, avoiding the script and systemd service:
path: "/etc/hosts"
append:
- inline: |
{{ if and (eq .Infra.Status.PlatformStatus.Type "GCP") (.Infra.Status.PlatformStatus.GCP) (.Infra.Status.PlatformStatus.GCP.CloudLoadBalancerConfig) (eq .Infra.Status.PlatformStatus.GCP.CloudLoadBalancerConfig.DNSType "ClusterHosted") }}
{{ $apiIntLBIPs := cloudPlatformAPIIntLoadBalancerIPs . }}
{{ if len $apiIntLBIPs > 0 }}
{{ $apiLBIPs := cloudPlatformAPILoadBalancerIPs . }}
{{ if len $apiLBIPs > 0 }}{{ $apiLBIPs[0] }}{{ else }}{{ $apiIntLBIPs[0] }}{{ end }} {{ .Infra.Status.APIServerURL }}
{{ $apiIntLBIPs[0] }} {{ .Infra.Status.APIServerInternalURL }}
{{ end }}
{{ end }}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The systemd service is providing us a way to time the running of this script before kubelet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it were laid down on disk by ignition in the initramfs then it would be before everything.
To check if feature to run in-cluster DNS on GCP and AWS is enabled by checking if the value of `PlatformStatus.GCP.CloudLoadBalancerConfig.DNSType` is set to `ClusterHosted`.
15c0227
to
3d451a6
Compare
91d44ca
to
1b13a9c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logically seems fine, although I am unsure how to test, so let me know if you'd like any QE pre-merge testing on this
@@ -777,6 +778,33 @@ func cloudPlatformIngressLoadBalancerIPs(cfg RenderConfig) (interface{}, error) | |||
} | |||
} | |||
|
|||
// cloudPlatformLBIPAvailable returns true when DNSType is set to `ClusterHosted` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, based on the comment I'd expect some check for clusterhosted in the function. I guess it's implicit since the service enablement is dependent on this field?
(I know we do the same elsewhere in the template rendering, so I'm fine with it as is)
Hi @sadasu still something wrong with the syntax of the bash in gcp-update-etc-hosts.service
|
77c93b5
to
171e94a
Compare
@gpei thanks for testing the 2 previous versions. I don't see why the bash script with correct syntax has failures when run within the systemd unit. I have reorganized my code and posted another version. |
@sadasu thanks for the update, with the latest code, we can move one step further now, but the Here are the contents of some key files on the master for your reference:
It looks like two empty lines have been added to file /etc/conf.d/etc-hosts.conf, the value of ${apiLBIP}/${apiServerURL}/${apiIntLBIP}/${apiServerIntURL} were not loaded when running |
fi | ||
mkdir -p /etc/conf.d | ||
etc_hosts_config_filename="/etc/conf.d/etc-hosts.conf" | ||
echo "${apiLBIP} ${apiURL%:*}" >> ${etc_hosts_config_filename} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another issue I'm seeing is, when I manually executed the following command
[root@gpei-0130-gcpdns-hq6q8-master-0 ~]# /usr/local/bin/update-etc-hosts 34.160.207.88 https://api.gpei-0130-gcpdns.qe.gcp.devcluster.openshift.com:6443 10.0.0.2 https://api-int.gpei-0130-gcpdns.qe.gcp.devcluster.openshift.com:6443
34.160.207.88 https://api.gpei-0130-gcpdns.qe.gcp.devcluster.openshift.com
10.0.0.2 https://api-int.gpei-0130-gcpdns.qe.gcp.devcluster.openshift.com
Done updating /etc/hosts
It's adding records like
10.0.0.2 https://api-int.gpei-0130-gcpdns.qe.gcp.devcluster.openshift.com
to /etc/hosts file, but it can't help on resolve the URL, so looks like we also need to remove the beginning "https://" in addition.
Something like below should work based on my test, just for your reference.
apiHostname=${apiURL#*//}
apiIntHostname=${apiIntURL#*//}
echo "${apiLBIP} ${apiHostname%%:*}" >> ${etc_hosts_config_filename}
echo "${apiIntLBIP} ${apiIntHostname%%:*}" >> ${etc_hosts_config_filename}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I had removed the tailing port number previously. Now updated to also remove the leading https://.
f790dfb
to
30ea55d
Compare
@gpei thanks for you attention to this work. I ended up moving some simple bash code back to templates/common/gcp/units/gcp-update-etc-hosts.service.yaml. My last attempt before I move to the use of "Environment". |
30ea55d
to
b02b304
Compare
/retest-required |
@sadasu Now a new error was raised in the gcp-update-etc-hosts.service
I tried to put the creating /etc/conf.d directory task into ExecStartPre as following, it's working but still nothing was written to "/etc/conf.d/etc-hosts.conf"
Not sure if it's related to the write permission on that directory somehow, but even I specified |
what if we created the file in I am working on another version, that moves all file operations back to templates/common/gcp/files/usr-local-bin-update-etc-hosts.yaml |
8a70084
to
bd13be9
Compare
@sadasu I think it's working as expected this time 🎉 IP address mapping of API/API-INT were both added into /etc/hosts, and masters are joined into the cluster.
|
The new problem I can see now is the workers are trying to fetch worker ignition from the original MCS address "https://api-int.gpei-0201-gcpdns.qe.gcp.devcluster.openshift.com:22623/config/worker" which couldn't be resolved, looks like we'll need to replace the URL to the internal LB IP address? Here's the console Serial of worker:
I believe we can create another bug to track this issue. |
@gpei Thanks! The problem you are seeing with worker ignition was already solved for masters. I'll replicate that for the workers. |
/label acknowledge-critical-fixes-only |
echo "${apiIntLBIP} ${apiIntServerHostname}" >> ${etc_hosts_config_filename} | ||
if [ -f ${etc_hosts_config_filename} ] | ||
then | ||
cat /etc/conf.d/etc-hosts.conf >> /etc/hosts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure that you check if this entry does not already exist in /etc/hosts; otherwise running /usr/local/bin/update-etc-hosts
multiple times will keep adding the same stuff and potentially the file explodes
/label qe-approved |
@sadasu: This pull request references Jira Issue OCPBUGS-48469, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Append /etc/hosts files with entries to resolve cluster api and api-int URLS. /etc/hosts will provide resolution for these URLs until kubelet joins the cluster and runs its CoreDNS pod which will then take over resolution of those 2 URLs
bd13be9
to
5c619c9
Compare
/hold |
/retest |
@sadasu: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Append /etc/hosts files with entries to resolve cluster api and api-int URLS. /etc/hosts will provide resolution for these URLs until kubelet joins the cluster and runs its CoreDNS pod which will then take over resolution of those 2 URLs
- What I did
- How to verify it
- Description for the changelog