nrp-k8s - SKILL.md Agent Skill

name: nrp-k8s
description: "Deploy and manage workloads on the NRP Nautilus Kubernetes cluster. Covers batch jobs (opportunistic priority class, resource requests, GPU node avoidance), ingress with HAProxy CORS and timeout annotations, NRP usage policies, and credential wiring. TRIGGER when the user mentions: kubectl, k8s, kubernetes, NRP, Nautilus, rollout, restart deployment, apply yaml, pod, job, namespace, ingress, or any cluster operation. Namespace is 'biodiversity'. Always load this skill BEFORE running any kubectl command."
license: Apache-2.0

NRP Kubernetes

Shared academic cluster. Namespace: biodiversity. Usage policies — key rules: no sleep in jobs, resource requests must reflect actual usage (~20% tolerance), no interactive pods >6h.

Batch Job Requirements

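All CPU jobs need priorityClassName: opportunistic, explicit resource requests equal to limits, and restartPolicy: Never. Avoiding GPU nodes is recommended; the nodeAffinity rule below excludes nodes labeled as having an NVIDIA PCI device (vendor ID 10de):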

apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
spec:
  backoffLimit: 2
  ttlSecondsAfterFinished: 10800
  template:
    spec:
      priorityClassName: opportunistic
      restartPolicy: Never
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: feature.node.kubernetes.io/pci-10de.present
                    operator: NotIn
                    values: ["true"]
      containers:
        - name: worker
          image: ghcr.io/boettiger-lab/datasets:latest
          command: ["bash", "-c", "echo hello"]
          resources:
            requests:
              cpu: "4"
              memory: "8Gi"
            limits:
              cpu: "4"
              memory: "8Gi"

Add ephemeral-storage: "250Gi" to requests/limits if scratch disk is needed.
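
As a sketch, the container from the job above with scratch space added could look like this. The emptyDir volume and the /scratch mount path are assumptions for illustration, not something the cluster requires; data written to an emptyDir counts against the pod's ephemeral-storage limit.

```yaml
# Fragment of spec.template.spec in the Job above.
containers:
  - name: worker
    image: ghcr.io/boettiger-lab/datasets:latest
    command: ["bash", "-c", "echo hello"]
    resources:
      requests:
        cpu: "4"
        memory: "8Gi"
        ephemeral-storage: "250Gi"
      limits:
        cpu: "4"
        memory: "8Gi"
        ephemeral-storage: "250Gi"
    volumeMounts:
      - name: scratch          # hypothetical volume name
        mountPath: /scratch    # hypothetical mount path
volumes:
  - name: scratch
    emptyDir: {}
```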

Secrets

S3 credentials (aws secret):

env:
  - name: AWS_ACCESS_KEY_ID
    valueFrom: {secretKeyRef: {name: aws, key: AWS_ACCESS_KEY_ID}}
  - name: AWS_SECRET_ACCESS_KEY
    valueFrom: {secretKeyRef: {name: aws, key: AWS_SECRET_ACCESS_KEY}}

Rclone config (rclone-config secret):

volumeMounts:
  - name: rclone-config
    mountPath: /root/.config/rclone
    readOnly: true
volumes:
  - name: rclone-config
    secret:
      secretName: rclone-config
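
The two fragments above sit at different levels of the pod spec: env and volumeMounts belong to the container, while volumes belongs to the pod. A minimal sketch of how they combine in a Job template, assuming the worker container from the batch job example:

```yaml
# Fragment of spec.template.spec; priorityClassName and resources omitted for brevity.
containers:
  - name: worker
    image: ghcr.io/boettiger-lab/datasets:latest
    env:                      # container level
      - name: AWS_ACCESS_KEY_ID
        valueFrom: {secretKeyRef: {name: aws, key: AWS_ACCESS_KEY_ID}}
      - name: AWS_SECRET_ACCESS_KEY
        valueFrom: {secretKeyRef: {name: aws, key: AWS_SECRET_ACCESS_KEY}}
    volumeMounts:             # container level
      - name: rclone-config
        mountPath: /root/.config/rclone
        readOnly: true
volumes:                      # pod level, sibling of containers
  - name: rclone-config
    secret:
      secretName: rclone-config
```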

Ingress

NRP uses HAProxy (not nginx). TLS is cluster-terminated — just list the hostname, no cert secret needed.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    haproxy-ingress.github.io/cors-enable: "true"
    haproxy-ingress.github.io/cors-allow-origin: "*"
    haproxy-ingress.github.io/cors-allow-methods: "GET, POST, OPTIONS"
    haproxy-ingress.github.io/cors-allow-headers: "DNT,X-CustomHeader,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization,mcp-session-id"
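    # Browsers reject credentialed (cookie) requests when the returned origin is a literal "*";
    # list explicit origins in cors-allow-origin if credentials are actually needed.
    haproxy-ingress.github.io/cors-allow-credentials: "true"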
    haproxy-ingress.github.io/cors-max-age: "86400"
    haproxy-ingress.github.io/timeout-server: "600s"
    haproxy-ingress.github.io/timeout-tunnel: "3600s"  # required for SSE/WebSocket
spec:
  ingressClassName: haproxy
  tls:
    - hosts:
        - my-service.nrp-nautilus.io
  rules:
    - host: my-service.nrp-nautilus.io
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service
                port:
                  number: 80

For MCP servers, cors-allow-headers must include mcp-session-id (already present in the example above).
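
The Ingress above routes to a Service named my-service on port 80. A sketch of a matching Service follows; the app: my-service selector label and target port 8080 are placeholders for whatever the backing pods actually use.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: biodiversity
spec:
  selector:
    app: my-service      # hypothetical pod label
  ports:
    - port: 80           # matches backend.service.port.number in the Ingress
      targetPort: 8080   # hypothetical container port
```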

Dedicated Node

stratus1.nrp-espm.berkeley.edu is our Berkeley-owned node — use when jobs get preempted on shared nodes. It carries a nautilus.io/issue taint; tolerate it with operator: Exists:

nodeSelector:
  kubernetes.io/hostname: stratus1.nrp-espm.berkeley.edu
tolerations:
  - key: "nautilus.io/issue"
    operator: Exists
    effect: NoSchedule

To discover taints on other nodes, check the scheduler error from a pending pod: kubectl -n biodiversity describe pod <pod>. The message lists every untolerated taint verbatim. Tolerate each one with its exact key/value pair rather than a blanket operator: Exists toleration, which the admission webhook may reject.
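
For example, if describe pod reported an untolerated taint with a key and a value, the matching toleration would name both explicitly. The key and value below are placeholders, not real NRP taints:

```yaml
tolerations:
  - key: "example.com/some-taint"   # placeholder: copy the key from the scheduler message
    operator: Equal
    value: "some-value"             # placeholder: copy the value from the scheduler message
    effect: NoSchedule              # use the effect shown in the message
```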

Deployment Rollouts

Always kubectl apply ConfigMaps before restarting — rollout restart recycles pods using whatever is already in the cluster; git push alone does nothing:

kubectl apply -f k8s/my-configmap.yaml
kubectl -n biodiversity rollout restart deployment/<name>

Stuck rollout? If rollout status hangs for more than 2 minutes, the pod likely landed on a broken node. Check with kubectl -n biodiversity get pods -o wide and kubectl -n biodiversity describe pod <pod>. If it's a node issue (not your image), delete the stuck pod. It reschedules, and the old pod stays live throughout as long as the Deployment's rolling-update strategy sets maxUnavailable: 0 (the Kubernetes default is 25%):

kubectl -n biodiversity delete pod <stuck-pod>
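
Whether the old pod keeps serving while a replacement is stuck depends on the Deployment's rolling-update strategy. A minimal sketch of the relevant fragment, assuming a surge of one extra pod (maxSurge cannot be 0 when maxUnavailable is 0):

```yaml
# Fragment of a Deployment manifest.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never take the old pod down before a new one is Ready
      maxSurge: 1         # assumption: allow one extra pod during the rollout
```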

Common Pitfalls

No priorityClassName: opportunistic — required for all CPU jobs
Missing resource requests/limits — pod won't schedule
sleep in batch jobs — policy violation, grounds for ban
rollout restart without kubectl apply — config changes won't take effect
nginx ingress annotations — NRP uses HAProxy; nginx annotations are silently ignored
Max 200 completions per indexed job — hard cluster limit
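
A sketch of an indexed Job that stays within the 200-completion limit and meets the batch job requirements above. The job name and the parallelism value are assumptions; GPU node avoidance is left out for brevity. Workloads with more than 200 shards would need to be split across several Jobs.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: my-indexed-job        # hypothetical name
spec:
  completionMode: Indexed
  completions: 200            # at the stated cluster limit
  parallelism: 20             # assumption: tune to the workload
  backoffLimit: 2
  ttlSecondsAfterFinished: 10800
  template:
    spec:
      priorityClassName: opportunistic
      restartPolicy: Never
      containers:
        - name: worker
          image: ghcr.io/boettiger-lab/datasets:latest
          # Kubernetes sets JOB_COMPLETION_INDEX (0..199 here) in each pod;
          # bash expands it at run time.
          command: ["bash", "-c", "echo processing shard ${JOB_COMPLETION_INDEX}"]
          resources:
            requests:
              cpu: "4"
              memory: "8Gi"
            limits:
              cpu: "4"
              memory: "8Gi"
```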