[quota management] preemption condition missing when AW is preempted #381

Open
@asm582

When an AppWrapper (AW) running in another namespace is preempted, the preemption condition is missing from its status block:

Status:
  Conditions:
    Last Transition Micro Time:  2023-05-17T17:20:36.946554Z
    Last Update Micro Time:      2023-05-17T17:20:36.946554Z
    Status:                      True
    Type:                        Init
    Last Transition Micro Time:  2023-05-17T17:20:57.805962Z
    Last Update Micro Time:      2023-05-17T17:20:57.805962Z
    Reason:                      AwaitingHeadOfLine
    Status:                      True
    Type:                        Queueing
    Last Transition Micro Time:  2023-05-17T17:20:36.954879Z
    Last Update Micro Time:      2023-05-17T17:20:36.954879Z
    Reason:                      FrontOfQueue.
    Status:                      True
    Type:                        HeadOfLine
    Last Transition Micro Time:  2023-05-17T17:20:38.805666Z
    Last Update Micro Time:      2023-05-17T17:20:38.805666Z
    Reason:                      AppWrapperRunnable
    Status:                      True
    Type:                        Dispatched
    Last Transition Micro Time:  2023-05-17T17:20:38.814683Z
    Last Update Micro Time:      2023-05-17T17:20:38.814683Z
    Reason:                      PodsRunning
    Status:                      True
    Type:                        Running
    Last Transition Micro Time:  2023-05-17T17:20:57.828012Z
    Last Update Micro Time:      2023-05-17T17:20:57.828012Z
    Message:                     Insufficient quota to dispatch AppWrapper.
    Reason:                      AppWrapperNotRunnable.  Failed to allocate quota on quota designation 'quota_context'
    Status:                      True
    Type:                        Backoff
  Controllerfirsttimestamp:      2023-05-17T17:20:36.946422Z
  Filterignore:                  true
  Queuejobstate:                 HeadOfLine
  Sender:                        before ScheduleNext - setHOL
  State:                         Pending
  Systempriority:                100
Events:                          <none>
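
To make the gap easy to check mechanically, here is a minimal Python sketch that pulls the condition types out of `kubectl describe`-style text and reports which expected types are absent. The `STATUS` string is an abbreviated copy of the block above (timestamps, reasons and messages omitted). The condition type name `Preempted` is an assumption about what the missing condition would be called; this report does not confirm it. The helper names are hypothetical.

```python
import re

def condition_types(describe_text):
    """Return the condition types found in describe-style text, in order."""
    # Each condition in the describe output ends with a "Type: <name>" line.
    return re.findall(r"^[ \t]*Type:[ \t]+(\S+)[ \t]*$", describe_text, flags=re.MULTILINE)

def missing_conditions(describe_text, expected):
    """Return the expected condition types that do not appear in the text."""
    seen = set(condition_types(describe_text))
    return [t for t in expected if t not in seen]

# Abbreviated copy of the status block above.
STATUS = """\
Status:
  Conditions:
    Status:  True
    Type:    Init
    Status:  True
    Type:    Queueing
    Status:  True
    Type:    HeadOfLine
    Status:  True
    Type:    Dispatched
    Status:  True
    Type:    Running
    Status:  True
    Type:    Backoff
"""

# "Preempted" is an assumed type name, used only for illustration.
print(condition_types(STATUS))
print(missing_conditions(STATUS, ["Preempted"]))
```

With the status block from this report, the second call lists the assumed `Preempted` type as missing, which matches the behavior described above.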

quota tree:

apiVersion: ibm.com/v1
kind: QuotaSubtree
metadata:
  name: context-root
  namespace: kube-system
  labels:
    tree: quota_context
spec:
  children:
    - name: context-root
      quotas:
        requests:
          cpu: 8000
          memory: 24Gi
          nvidia.com/gpu: 1
---
apiVersion: ibm.com/v1
kind: QuotaSubtree
metadata:
  name: context-root-children
  namespace: kube-system
  labels:
    tree: quota_context
spec:
  parent: context-root
  children:
    - name: alpha
      quotas:
        hardLimit: true
        requests:
          cpu: 8000
          memory: 24Gi
          nvidia.com/gpu: 1
    - name: beta
      quotas:
        hardLimit: false
        requests:
          cpu: 3000
          memory: 6Gi
          nvidia.com/gpu: 1
    - name: gamma
      quotas:
        hardLimit: true
        requests:
          cpu: 3000
          memory: 6Gi
          nvidia.com/gpu: 4

sample AW-1:

apiVersion: mcad.ibm.com/v1beta1
kind: AppWrapper
metadata:
  name: batch-job-1
  namespace: namespace-1
  labels:
    quota_context: "beta"
spec:
  service:
    spec: {}
  priority: 100
  resources:
    metadata: {}
    GenericItems:
    - replicas: 1
      completionstatus: Complete
      custompodresources:
      - replicas: 1
        requests:
          cpu: 8000m
          memory: 15Gi
        limits:
          cpu: 8000m
          memory: 15Gi
      generictemplate:
        apiVersion: batch/v1
        kind: Job
        metadata:
          name: batch-job-1
          labels:
            appwrapper.mcad.ibm.com: batch-job-1
          namespace: namespace-1
        spec:
          parallelism: 1
          completions: 1
          template:
            metadata:
              labels:
                appwrapper.mcad.ibm.com: batch-job-1
              namespace: namespace-1
            spec:
              containers:
              - name: batch-job-1
                image: ubi8-minimal:latest
                command: [ "/bin/bash", "-c", "--" ]
                args: [ "sleep infinity" ]
                resources:
                  requests:
                    memory: 15Gi
                    cpu: "8000m"
                  limits:
                    memory: 15Gi
                    cpu: "8000m"
              restartPolicy: Never

sample AW-2:

apiVersion: mcad.ibm.com/v1beta1
kind: AppWrapper
metadata:
  name: batch-job-2
  namespace: namespace-2
  labels:
    quota_context: "alpha"
spec:
  service:
    spec: {}
  priority: 100
  resources:
    metadata: {}
    GenericItems:
    - replicas: 1
      completionstatus: Complete
      custompodresources:
      - replicas: 1
        requests:
          cpu: 3000m
          memory: 3Gi
          nvidia.com/gpu: 0
        limits:
          cpu: 3000m
          memory: 3Gi
          nvidia.com/gpu: 0
      generictemplate:
        apiVersion: batch/v1
        kind: Job
        metadata:
          name: batch-job-2
          labels:
            appwrapper.mcad.ibm.com: batch-job-2
          namespace: namespace-2
        spec:
          parallelism: 1
          completions: 1
          template:
            metadata:
              labels:
                appwrapper.mcad.ibm.com: batch-job-2
              namespace: namespace-2
            spec:
              containers:
              - name: batch-job-2
                image: ubi8-minimal:latest
                command: [ "/bin/bash", "-c", "--" ]
                args: [ "sleep infinity" ]
                resources:
                  requests:
                    memory: 3Gi
                    cpu: "3000m"
                  limits:
                    memory: 3Gi
                    cpu: "3000m"
              restartPolicy: Never
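
For readers reproducing this, the following Python sketch walks through the CPU arithmetic of the two samples against the quota tree. It is one plausible reading of the scenario, not a description of the quota manager's actual algorithm. It assumes the `cpu` values in the quota tree are millicores, that `hardLimit: false` lets a node borrow beyond its own quota, and that a borrowing AW becomes a preemption candidate when another AW needs capacity within its own quota. All names in the code are hypothetical.

```python
# CPU quotas copied from the QuotaSubtree manifests (assumed to be millicores).
ROOT_CPU = 8000
QUOTA = {
    "alpha": {"cpu": 8000, "hard_limit": True},
    "beta": {"cpu": 3000, "hard_limit": False},
    "gamma": {"cpu": 3000, "hard_limit": True},
}

def can_borrow(node):
    """Assumption: only nodes with hardLimit false may exceed their own quota."""
    return not QUOTA[node]["hard_limit"]

def borrowed_cpu(node, request):
    """CPU requested beyond the node's own quota (0 if the request fits)."""
    return max(0, request - QUOTA[node]["cpu"])

# CPU requests copied from the two sample AppWrappers.
aw1 = {"name": "batch-job-1", "node": "beta", "cpu": 8000}
aw2 = {"name": "batch-job-2", "node": "alpha", "cpu": 3000}

# AW-1 exceeds beta's quota, which is only possible because beta can borrow.
aw1_borrowed = borrowed_cpu(aw1["node"], aw1["cpu"])
# Once AW-1 is dispatched, nothing is left at the root.
root_free_after_aw1 = ROOT_CPU - aw1["cpu"]
# AW-2 fits inside alpha's own quota but finds no free capacity at the root.
aw2_within_own_quota = borrowed_cpu(aw2["node"], aw2["cpu"]) == 0
aw1_is_preemption_candidate = (
    can_borrow(aw1["node"])
    and aw1_borrowed > 0
    and aw2_within_own_quota
    and aw2["cpu"] > root_free_after_aw1
)

print("AW-1 borrowed CPU:", aw1_borrowed)
print("Root CPU free after AW-1:", root_free_after_aw1)
print("AW-1 is a preemption candidate:", aw1_is_preemption_candidate)
```

Under these assumptions, AW-1 borrows 5000m beyond beta's quota and leaves no CPU free at the root, so dispatching AW-2 would require preempting AW-1. That is the point at which a preemption condition would be expected in AW-1's status.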