Enabling Istio ambient mesh on a Kubernetes namespace puts Redis replica pods in CrashLoopBackOff #16228

Open
tanvik1-sig opened this issue Feb 7, 2025 · 0 comments

Hi,

I installed Istio 1.24.2 in ambient mode on a GKE cluster.

  1. After enabling ambient mode on the namespace, not all pods come up. The Redis replica pods are stuck in CrashLoopBackOff.

  2. The ztunnel logs show errors for almost all pods deployed in the cluster, including Redis. Two examples:

error access connection complete src.addr=x.x.x.x:60704 src.workload="planeta-scass-scanner-medium-6645898d5b-47rsk" src.namespace="planeta-scan-service" src.identity="spiffe://cluster.local/ns/planeta-scan-service/sa/planeta-scass-scanner" dst.addr=x.x.x.x:15008 dst.hbone_addr=x.x.x.x:5672 dst.service="planeta-scass-rabbitmq.planeta-scan-service.svc.cluster.local" dst.workload="planeta-scass-rabbitmq-server-2" dst.namespace="planeta-scan-service" dst.identity="spiffe://cluster.local/ns/planeta-scan-service/sa/planeta-scass-rabbitmq-server" direction="outbound" bytes_sent=0 bytes_recv=0 duration="10001ms" error="http status: 503 Service Unavailable"

error access connection complete src.addr=x.x.x.x:50348 src.workload="planeta-scass-redis-replicas-0" src.namespace="planeta-scan-service" src.identity="spiffe://cluster.local/ns/planeta-scan-service/sa/planeta-scass-redis-replica" dst.addr=x.x.x.x:15008 dst.hbone_addr=x.x.x.x:6379 dst.service="planeta-scass-redis-headless.planeta-scan-service.svc.cluster.local" dst.workload="planeta-scass-redis-master-0" dst.namespace="planeta-scan-service" dst.identity="spiffe://cluster.local/ns/planeta-scan-service/sa/planeta-scass-redis-master" direction="inbound" bytes_sent=0 bytes_recv=0 duration="10001ms" error="connection failed: deadline has elapsed"
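
For anyone triaging similar output, here is a minimal sketch of how the `key=value` fields in these ztunnel access lines could be grouped by destination and error. The names `FIELD_RE`, `parse_access_line`, and `summarize` are hypothetical helpers written for this illustration and are not part of Istio or ztunnel. The sample lines are abbreviated copies of the two entries above, and the parser assumes values are either double-quoted or contain no spaces.

```python
import re
from collections import Counter

# Matches key=value pairs; the value is either double-quoted or a bare token.
# Assumption: quoted values never contain an embedded double quote.
FIELD_RE = re.compile(r'([\w.]+)=(?:"([^"]*)"|(\S+))')

# Abbreviated copies of the two log lines from this issue (fields dropped, none altered).
SAMPLE_LINES = [
    'error access connection complete '
    'src.workload="planeta-scass-scanner-medium-6645898d5b-47rsk" '
    'dst.hbone_addr=x.x.x.x:5672 dst.workload="planeta-scass-rabbitmq-server-2" '
    'direction="outbound" error="http status: 503 Service Unavailable"',
    'error access connection complete '
    'src.workload="planeta-scass-redis-replicas-0" '
    'dst.hbone_addr=x.x.x.x:6379 dst.workload="planeta-scass-redis-master-0" '
    'direction="inbound" error="connection failed: deadline has elapsed"',
]


def parse_access_line(line):
    """Return the key=value fields of one access log line as a dict."""
    fields = {}
    for key, quoted, bare in FIELD_RE.findall(line):
        # Exactly one of the two value groups is non-empty for a given match.
        fields[key] = quoted if quoted else bare
    return fields


def summarize(lines):
    """Count failures per (destination workload, destination port, error)."""
    counts = Counter()
    for line in lines:
        fields = parse_access_line(line)
        # The port is whatever follows the last ':' in dst.hbone_addr.
        port = fields.get("dst.hbone_addr", "").rpartition(":")[2]
        key = (fields.get("dst.workload", "?"), port, fields.get("error", "?"))
        counts[key] += 1
    return counts


if __name__ == "__main__":
    for (workload, port, error), n in summarize(SAMPLE_LINES).items():
        print(f"{n}x {workload}:{port} -> {error}")
```

Grouping by destination port separates the failures that reach RabbitMQ (5672) from those that reach Redis (6379), which helps when the same ztunnel log mixes errors from many workloads.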

  3. An existing NetworkPolicy for Redis, installed as part of the Helm chart, has the following rules:
  egress:
  - ports:
    - port: 15008
      protocol: TCP
    - port: 6379
      protocol: TCP
  ingress:
  - ports:
    - port: 15008
      protocol: TCP
    - port: 6379
  policyTypes:
  - Ingress
  - Egress
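
To make explicit which ports this policy allows in each direction, here is a minimal sketch that transcribes the fragment above into a Python dict and lists the allowed `(port, protocol)` pairs. `POLICY_SPEC` and `allowed_ports` are hypothetical names used only for this illustration. The sketch handles only rules that list ports explicitly, and it relies on the Kubernetes behavior that a port entry without `protocol` defaults to TCP.

```python
# The policy fragment from this issue, transcribed by hand as a Python dict.
POLICY_SPEC = {
    "egress": [
        {"ports": [{"port": 15008, "protocol": "TCP"},
                   {"port": 6379, "protocol": "TCP"}]},
    ],
    "ingress": [
        {"ports": [{"port": 15008, "protocol": "TCP"},
                   {"port": 6379}]},  # no protocol given; defaults to TCP
    ],
    "policyTypes": ["Ingress", "Egress"],
}


def allowed_ports(spec, direction):
    """Return the set of (port, protocol) pairs allowed for 'ingress' or 'egress'.

    Returns None when the direction is not listed in policyTypes, meaning
    this policy does not restrict traffic in that direction.
    Simplification: rules without a 'ports' list (which allow every port)
    are not modeled here.
    """
    if direction.capitalize() not in spec.get("policyTypes", []):
        return None
    allowed = set()
    for rule in spec.get(direction, []):
        for entry in rule.get("ports", []):
            allowed.add((entry["port"], entry.get("protocol", "TCP")))
    return allowed


if __name__ == "__main__":
    for direction in ("ingress", "egress"):
        print(direction, sorted(allowed_ports(POLICY_SPEC, direction)))
```

With the policy as posted, both directions come out as TCP 6379 and TCP 15008 only. Dropping `Egress` from `policyTypes` makes the function return `None` for egress, which corresponds to the variant tested later in this report.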


The Redis replica pod logs the following error:

Unable to connect to MASTER: Resource temporarily unavailable
Connecting to MASTER planeta-scass-redis-master-0.planeta-scass-redis-headless.planeta-scan-service.svc.cluster.local:6379

If I remove "Egress" from the NetworkPolicy policyTypes, the Redis replica pod is still in CrashLoopBackOff, but its logs now show:

Connecting to MASTER planeta-scass-redis-master-0.planeta-scass-redis-headless.planeta-scan-service.svc.cluster.local:6379
 * MASTER <-> REPLICA sync started





I need help fixing the ztunnel network connection to Redis.