oauth

OAuth2-proxy authentication setup, configuration, and troubleshooting for the K3s cluster.

By gilesknap. Updated 6/6/2026

OAuth / Authentication

Architecture

  • Provider: GitHub via oauth2-proxy (lightweight, ~128Mi vs ~2GB for Authentik/Keycloak)
  • Flow: Cloudflare Access (wildcard *.gkcluster.org) → oauth2-proxy gateway → backend services
  • No native OIDC — Cloudflare Access + oauth2-proxy only (branch external-auth-n-z was parked)

Configuration

  • oauth2-proxy enabled in kubernetes-services/values.yaml
  • Reusable ingress sub-chart at kubernetes-services/additions/ingress/ supports oauth2_proxy annotation mode
  • Services behind OAuth get ingress annotations pointing to the oauth2-proxy auth endpoint
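As a rough illustration of the annotation mode described above, the sketch below builds the two annotations a protected ingress would typically carry. It assumes the ingress-nginx controller (the `nginx.ingress.kubernetes.io/...` annotation names) and oauth2-proxy's standard `/oauth2/auth` and `/oauth2/start` endpoints. The function name and the public host `oauth2.gkcluster.org` are hypothetical and do not come from the sub-chart.

```python
# Sketch only: annotation names assume ingress-nginx; the function name and
# the public sign-in host are hypothetical, not taken from the sub-chart.

def oauth2_proxy_annotations(auth_service: str, public_auth_host: str) -> dict[str, str]:
    """Return ingress annotations that delegate auth to oauth2-proxy.

    auth_service: cluster-internal service DNS name, called by the ingress
        controller itself for the auth sub-request (append a port if the
        service does not listen on 80).
    public_auth_host: externally reachable host that the browser is
        redirected to when it has no valid session.
    """
    return {
        # Sub-request made from inside the cluster, so use the internal service.
        "nginx.ingress.kubernetes.io/auth-url": f"http://{auth_service}/oauth2/auth",
        # Redirect followed by the browser, so this one must be externally reachable.
        "nginx.ingress.kubernetes.io/auth-signin": (
            f"https://{public_auth_host}/oauth2/start"
            "?rd=$scheme://$host$escaped_request_uri"
        ),
    }


annotations = oauth2_proxy_annotations(
    "oauth2-proxy.oauth2-proxy.svc", "oauth2.gkcluster.org"
)
for key, value in annotations.items():
    print(f"{key}: {value}")
```

The split between the two URLs is the point of the sketch: only the sign-in redirect needs the public domain, while the auth check stays on the internal service.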

MCP servers need a different auth model — do NOT try to reuse oauth2-proxy

Non-browser MCP clients (claude.ai's connector, Claude Code --transport http) cannot authenticate through this cluster's standard pattern. Two reasons:

  1. Cloudflare Access challenges with Google SSO — solvable only in a browser.
  2. oauth2-proxy is cookie-based forward-auth — it's not an OAuth 2.1 authorization server, so MCP clients can't run a token flow against it.

The established pattern for MCP servers in this cluster (open-brain-mcp historically; thoth going forward) is:

  • The MCP server embeds an OAuth 2.1 authorization server (RFCs 8414 + 9728 + 7591 DCR + 7636 PKCE) and issues its own JWTs with GitHub as the upstream IdP.
  • The MCP hostname (brain.gkcluster.org, thoth.gkcluster.org) needs a Cloudflare Access bypass policy so the in-pod OAuth flow isn't shadowed by Google SSO. See /cloudflare skill.
  • The MCP server's 401 response on /mcp must include WWW-Authenticate: Bearer resource_metadata="<host>/.well-known/oauth-protected-resource" — without this header, MCP clients silently fail to discover the auth server and just give up.
  • Single-replica MCP pods can keep PKCE state and DCR client_ids in-memory; clients re-register transparently on pod restart.
  • Reference implementation: gilesknap/open-brain-mcp/oauth.py (provider-agnostic). When standing up the next MCP service, port the structure rather than reinventing.
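Two pieces of the pattern above are small enough to sketch in isolation: the 401 discovery header and the PKCE S256 check from RFC 7636. This is a minimal illustration and not the code in open-brain-mcp's oauth.py. The function names are hypothetical, and the `https://` scheme on the metadata URL is an assumption.

```python
import base64
import hashlib
import hmac


def unauthorized_headers(host: str) -> dict[str, str]:
    """Headers for a 401 on /mcp so clients can discover the auth server.

    Assumes the protected-resource metadata is served over https on the
    same host as the MCP endpoint.
    """
    metadata_url = f"https://{host}/.well-known/oauth-protected-resource"
    return {"WWW-Authenticate": f'Bearer resource_metadata="{metadata_url}"'}


def pkce_s256_matches(code_verifier: str, code_challenge: str) -> bool:
    """Check a PKCE verifier against the stored S256 challenge (RFC 7636).

    The challenge is the unpadded base64url encoding of SHA-256(verifier).
    """
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    expected = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, code_challenge)


print(unauthorized_headers("thoth.gkcluster.org")["WWW-Authenticate"])
```

With in-memory state, as the single-replica note describes, the challenge would simply be kept in a dict keyed by authorization code until the token request arrives.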

Gotchas

  • Chrome caching can cause stale redirects/blank pages after auth changes — fix with chrome://settings/reset
  • SealedSecrets for oauth2-proxy credentials must match name+namespace exactly (encryption is bound)
  • Merging into existing secrets requires sealedsecrets.bitnami.com/managed: "true" annotation on target
  • ArgoCD Dex audiences are hardcoded: server.additional.audiences does nothing for Dex. Override the argo-cd client in dex.config with trustedPeers instead. See additions/argocd/README.md.
  • oidc.config disables Dex — having oidc.config in argocd-cm causes IsDexDisabled()=true. Use dex.config only.
  • Re-sealing secrets requires pod restart — pod env vars from secretKeyRef are read at startup. After rotating with just seal-argocd-dex <subcommand>, restart affected pods.
  • Dex secret needs ArgoCD label — the argocd-dex-secret SealedSecret template must include app.kubernetes.io/part-of: argocd label. Without it, ArgoCD's $secret:key resolution in dex.config silently fails, passing literal key names as OAuth client IDs (→ GitHub 404).
  • Dex static client secrets must all be in argocd-dex-secret: dex.config uses $argocd-dex-secret:key for each static client. Missing keys silently resolve to empty, causing "Failed to get token from provider" on login. The rebuild path (scripts/seal-from-json) seals all clients (grafana, open-webui, argocd-monitor) along with the matching service-side secrets in one pass.
  • Sidecar oauth2-proxy cookie clash — the shared oauth2-proxy sets cookie-domain=.gkcluster.org, so its _oauth2_proxy cookie reaches all subdomains. Any service with its own oauth2-proxy sidecar (e.g. argocd-monitor) must use --cookie-name=<unique> to avoid validating the shared proxy's cookie with its own secret (→ infinite login loop).
  • oauth2-proxy email_domains must be [] — the Helm chart defaults to email_domains = ["*"], which allows any GitHub user through and silently overrides authenticatedEmailsFile. The fix is config.configFile with email_domains = []. This bug was found and fixed in PR #279 — do not remove the configFile override.
  • oauth2-proxy cookie-secret size — must be exactly 16, 24, or 32 bytes for the AES cipher. base64.b64encode(token_bytes(32)) produces 44 chars and crashes oauth2-proxy. Use secrets.token_hex(16) (32 hex chars = 32 bytes). This bug has regressed before — do not change the generation in scripts/seal-argocd-dex. See /sealed-secrets skill.
  • Dex duplicate argo-cd static client — ArgoCD auto-generates an argo-cd Dex client (without trustedPeers). Our dex.config also declares one (with trustedPeers: [argocd-monitor]). Dex v2.45+ stores the first and drops the duplicate, so trustedPeers never takes effect. Fixed in PR #297 by adding oidc.config with allowedAudiences: [argo-cd, argocd-monitor], which lets argocd-monitor authenticate as itself. The duplicate argo-cd client in dex.config is harmless but redundant — kept for clarity.
  • Dex/Grafana need restart after re-sealing — pods that read secrets via envFrom or secretKeyRef cache values at startup. After --tags cluster or just seal-argocd-dex, run just restart-dex and kubectl rollout restart sts grafana-prometheus -n monitoring. Without this, Dex reports "invalid client_secret" even though the Secret objects match. See /sealed-secrets for the full namespace list.
  • Ingress auth-url must be cluster-internal — the ingress sub-chart's auth-url uses the internal service (oauth2-proxy.oauth2-proxy.svc). Using the external domain resolves via Cloudflare to IPv6, which is unreachable from the cluster, causing intermittent 500s on all oauth2-protected ingresses.
  • Dex base URL redirects: /api/dex 301s to /api/dex/, which returns 404. OIDC clients that don't follow redirects (e.g. Open WebUI's authlib) need the full discovery URL ending in .well-known/openid-configuration.
  • Grafana 12.x requires [users].allow_sign_up — the per-provider allow_sign_up under [auth.generic_oauth] is not sufficient alone. Also set [auth].disable_signup_form: true to block manual signup.
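The cookie-secret size rule in the list above is easy to check before sealing. The sketch below applies only the length rule as stated there (16, 24, or 32 bytes) and is a simplified stand-in for oauth2-proxy's own validation. The helper name `is_valid_cookie_secret` is hypothetical and is not part of scripts/seal-argocd-dex.

```python
import base64
import secrets

# AES key sizes in bytes (AES-128, AES-192, AES-256).
VALID_AES_KEY_SIZES = (16, 24, 32)


def is_valid_cookie_secret(secret: str) -> bool:
    """Simplified check: the secret's raw byte length must be an AES key size."""
    return len(secret.encode("utf-8")) in VALID_AES_KEY_SIZES


# 16 random bytes rendered as hex: 32 characters, so 32 bytes as a string.
good = secrets.token_hex(16)

# 32 random bytes rendered as standard base64: 44 characters including padding.
bad = base64.b64encode(secrets.token_bytes(32)).decode("ascii")

print(len(good), is_valid_cookie_secret(good))
print(len(bad), is_valid_cookie_secret(bad))
```

Running a check like this in the sealing script before the secret is written would surface a regression at seal time instead of as a crash-looping pod.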

Key Files

  • kubernetes-services/values.yaml — oauth2-proxy toggle and config
  • kubernetes-services/additions/ingress/ — reusable ingress with auth modes
  • kubernetes-services/templates/ — ArgoCD Application CRDs

Install via CLI

npx skills add https://github.com/gilesknap/tpi-k3s-ansible --skill oauth

Repository Details

  • Stars: 33
  • Forks: 6
  • Branch: main
  • Path: SKILL.md