gx10-offload

Stars: 26

Offload inference, code generation, and batch processing to local GX10 DGX Spark (GB10 Blackwell) running Ollama

By plurigrid, updated 6/10/2026

name: gx10-offload
description: Offload inference, code generation, and batch processing to local GX10 DGX Spark (GB10 Blackwell) running Ollama

GX10 Offload

Offload work to the local NVIDIA DGX Spark cluster node, which runs Ollama with Devstral models on a GB10 Blackwell GPU (128GB unified memory).

When to Use

  • Long code generation tasks that benefit from a dedicated local model
  • Batch processing of multiple prompts
  • Draft generation for later review (a draft-then-refine pattern, loosely analogous to speculative decoding)
  • Tasks where latency to cloud APIs is a bottleneck
  • Privacy-sensitive inference that must stay on-premises

Connection

| Property | Value |
| --- | --- |
| Host (WiFi) | 10.0.0.234 / gx10-94e2.local (mDNS) |
| Host (Tailscale) | 100.67.53.87 (gx10-acee, different unit) |
| User | a |
| Password | aaaaaa |
| Ollama API | http://localhost:11434 on the device |
| SSH tunnel | ssh -L 11434:localhost:11434 a@10.0.0.234 |

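Note: The WiFi-connected unit is gx10-94e2 (discovered via mDNS). The Tailscale-reachable unit is gx10-acee, a different physical Spark. The SSH tunnel entry in the Connection table targets the WiFi unit (10.0.0.234), while the commands under Quick Usage target the Tailscale unit (100.67.53.87), so adjust the address to the unit you intend to reach.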
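Because the two addresses belong to different units, it can help to make the target host an explicit argument when opening the tunnel. The following is a minimal sketch for a bash client. The function name `gx10_tunnel` and the `GX10_DRY_RUN` variable are illustrative and not part of the skill:

```bash
#!/usr/bin/env bash
# gx10_tunnel HOST [LOCAL_PORT]  (hypothetical helper, not shipped with the skill)
# Forwards LOCAL_PORT (default 11434) to Ollama's port 11434 on the chosen unit.
# With GX10_DRY_RUN=1 the ssh command is printed instead of executed.
gx10_tunnel() {
  local host=$1 port=${2:-11434}
  local cmd=(ssh -fNL "${port}:localhost:11434" "a@${host}")
  if [ "${GX10_DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}
```

For example, `gx10_tunnel 10.0.0.234` would target the WiFi unit and `gx10_tunnel 100.67.53.87` the Tailscale unit. Plain `ssh` prompts for the password interactively.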

Available Models

| Model | Size | Use Case |
| --- | --- | --- |
| devstral | 14GB | Fast coding tasks, lightweight generation |
| devstral-2:123b | 74GB | Heavy reasoning, complex code generation |
| devstral2-4k | 74GB | Same as above, 4k context window |
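Before sending work to a model, it may be worth confirming that the unit actually has it pulled. Ollama lists local models at `GET /api/tags`. The sketch below checks that listing with `grep`. The function name `model_installed` is illustrative, and the model name is treated as a regular expression, so names containing characters such as `.` would match loosely:

```bash
#!/usr/bin/env bash
# model_installed NAME  (hypothetical helper)
# Reads an /api/tags JSON response on stdin and succeeds if a model called
# NAME is listed, either exactly or with a tag suffix such as ":latest".
model_installed() {
  local name=$1
  grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"${name}(:[^\"]*)?\""
}

# Example (requires the SSH tunnel to be open):
#   curl -s http://localhost:11434/api/tags | model_installed devstral-2:123b && echo ready
```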

Quick Usage

Single prompt via SSH

sshpass -p 'aaaaaa' ssh -o PreferredAuthentications=password -o PubkeyAuthentication=no \
  a@100.67.53.87 \
  "curl -s http://localhost:11434/api/generate -d '{\"model\":\"devstral\",\"prompt\":\"YOUR_PROMPT\",\"stream\":false}'"

Via SSH tunnel (persistent)

# Open tunnel in background
sshpass -p 'aaaaaa' ssh -o PreferredAuthentications=password -o PubkeyAuthentication=no \
  -fNL 11434:localhost:11434 a@100.67.53.87

# Then use locally as if Ollama were running here
curl http://localhost:11434/api/generate \
  -d '{"model":"devstral","prompt":"Hello","stream":false}'
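The hand-written JSON above breaks as soon as the prompt contains a double quote, a backslash, or a newline. One way to handle that is to escape the prompt before building the request body. This is a sketch in plain bash; `json_escape` and `generate_payload` are illustrative names, and only the most common characters are escaped:

```bash
#!/usr/bin/env bash
# json_escape STRING  (hypothetical helper)
# Escapes backslash, double quote, newline, tab, and carriage return so the
# result can be placed inside a JSON string. Other control characters are
# not handled in this sketch.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  s=${s//$'\n'/\\n}
  s=${s//$'\t'/\\t}
  s=${s//$'\r'/\\r}
  printf '%s' "$s"
}

# generate_payload MODEL PROMPT: prints a non-streaming /api/generate body.
generate_payload() {
  local model=$1 prompt=$2
  printf '{"model":"%s","prompt":"%s","stream":false}' \
    "$(json_escape "$model")" "$(json_escape "$prompt")"
}

# Example (requires the SSH tunnel to be open):
#   curl -s http://localhost:11434/api/generate -d "$(generate_payload devstral 'Say "hi"')"
```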

OpenAI-compatible API

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "devstral",
    "messages": [{"role": "user", "content": "Write a Python function to sort a list"}]
  }'
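The OpenAI-compatible endpoint returns the generated text under `choices[0].message.content`. A small filter can pull that field out of the response. This sketch assumes `python3` is available on the client; the function name `chat_content` is illustrative:

```bash
#!/usr/bin/env bash
# chat_content  (hypothetical helper; assumes python3 is on PATH)
# Reads a /v1/chat/completions JSON response on stdin and prints the text of
# the first choice.
chat_content() {
  python3 -c 'import json, sys; print(json.load(sys.stdin)["choices"][0]["message"]["content"])'
}

# Example (requires the SSH tunnel to be open):
#   curl -s http://localhost:11434/v1/chat/completions \
#     -H "Content-Type: application/json" \
#     -d '{"model":"devstral","messages":[{"role":"user","content":"Hello"}]}' | chat_content
```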

Using the offload script

# Simple prompt
~/.claude/skills/gx10-offload/scripts/offload.sh "Write a Rust function for binary search"

# With specific model
~/.claude/skills/gx10-offload/scripts/offload.sh "Explain monads" devstral-2:123b

# Batch mode (one prompt per line)
~/.claude/skills/gx10-offload/scripts/offload.sh --batch prompts.txt
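The internals of `offload.sh` are not shown here, so the following is only a sketch of how a one-prompt-per-line batch could be processed, not the script's actual implementation. The helper name `batch_each` is illustrative. It reads the file on file descriptor 3 so that a command such as `ssh`, which reads stdin, cannot swallow the remaining prompts:

```bash
#!/usr/bin/env bash
# batch_each FILE CMD [ARGS...]  (hypothetical helper)
# Runs CMD once per non-empty line of FILE, passing the line as the final
# argument. A last line without a trailing newline is still processed.
batch_each() {
  local file=$1; shift
  local line
  while IFS= read -r line <&3 || [ -n "$line" ]; do
    [ -n "$line" ] || continue   # skip blank lines
    "$@" "$line"
  done 3< "$file"
}

# Example:
#   batch_each prompts.txt ~/.claude/skills/gx10-offload/scripts/offload.sh
```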

Offload Patterns

1. Draft-and-Review

Offload draft generation to GX10, then review/refine with Claude:

# GX10 generates draft
DRAFT=$(~/.claude/skills/gx10-offload/scripts/offload.sh "Implement a Redis cache wrapper in Python with TTL support")
# Claude reviews and improves the draft

2. Batch Code Generation

Generate multiple implementations in parallel on GX10:

for task in "sort" "search" "hash" "tree"; do
  ~/.claude/skills/gx10-offload/scripts/offload.sh "Implement $task in Rust" &
done
wait
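The loop above starts every job at once, and all jobs write to the same stdout, so their output can interleave. A variant that caps concurrency and saves each result to its own file might look like the sketch below. The function name `generate_all`, the `OFFLOAD` override variable, and the `.txt` output naming are assumptions made for illustration:

```bash
#!/usr/bin/env bash
# generate_all MAX OUTDIR TASK...  (hypothetical helper)
# Runs the offload script for each TASK in groups of at most MAX parallel
# jobs and writes each result to OUTDIR/<task>.txt.
# OFFLOAD may be set to override the script path (useful for dry runs).
generate_all() {
  local max=$1 outdir=$2; shift 2
  local offload=${OFFLOAD:-$HOME/.claude/skills/gx10-offload/scripts/offload.sh}
  local task running=0
  mkdir -p "$outdir"
  for task in "$@"; do
    "$offload" "Implement $task in Rust" > "$outdir/$task.txt" &
    running=$((running + 1))
    if [ "$running" -ge "$max" ]; then
      wait          # let the current group finish before starting more
      running=0
    fi
  done
  wait              # wait for the final, possibly partial, group
}

# Example:
#   generate_all 2 drafts sort search hash tree
```

Grouping with a plain `wait` is less efficient than a true job pool, but it works on older bash versions that lack `wait -n`.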

3. Test Generation

Offload test writing to the local model:

~/.claude/skills/gx10-offload/scripts/offload.sh "Write pytest tests for: $(cat src/main.py)"

Device Status Check

~/.claude/skills/gx10-offload/scripts/offload.sh --status

Ensure Ollama is Running

sshpass -p 'aaaaaa' ssh -o PreferredAuthentications=password -o PubkeyAuthentication=no \
  a@100.67.53.87 'pgrep ollama || nohup ollama serve > /tmp/ollama.log 2>&1 &'
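After starting `ollama serve`, the API may take a moment to accept requests. A generic retry wrapper can poll until it responds. This is a sketch; `retry` is an illustrative helper name, and the example polls Ollama's `GET /api/version` endpoint through the tunnel:

```bash
#!/usr/bin/env bash
# retry TRIES DELAY CMD [ARGS...]  (hypothetical helper)
# Runs CMD until it succeeds, at most TRIES times, sleeping DELAY seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
  local tries=$1 delay=$2; shift 2
  local i
  for ((i = 1; i <= tries; i++)); do
    "$@" && return 0
    [ "$i" -lt "$tries" ] && sleep "$delay"
  done
  return 1
}

# Example (requires the SSH tunnel to be open):
#   retry 10 2 curl -sf http://localhost:11434/api/version
```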

Hardware

  • GPU: NVIDIA GB10 Blackwell (DGX Spark)
  • Memory: 128GB unified (Grace-Blackwell architecture)
  • CPU: 20-core Grace ARM64
  • OS: Ubuntu 24.04 aarch64, kernel 6.14-nvidia
  • PyTorch: 2.10.0 with CUDA
  • Disk: 510GB free

GF(3) Assignment

| Trit | Role | Description |
| --- | --- | --- |
| +1 | PLUS | Generator - produces code/text offloaded from Claude |

Conservation triad: gx10-offload (+1) + tailscale (0) + skill-creator (-1) = 0
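The conservation condition is simple arithmetic: the trits of the three skills sum to zero. A small check, read here as a sum modulo 3 (an assumption based on the GF(3) heading), could be sketched as follows. The function name `trits_balanced` is illustrative:

```bash
#!/usr/bin/env bash
# trits_balanced TRIT...  (hypothetical helper)
# Succeeds when the given trits (-1, 0, +1) sum to 0 modulo 3.
trits_balanced() {
  local sum=0 t
  for t in "$@"; do
    sum=$((sum + t))
  done
  # Normalize the remainder so negative sums are handled correctly.
  [ $(( (sum % 3 + 3) % 3 )) -eq 0 ]
}

# gx10-offload (+1) + tailscale (0) + skill-creator (-1)
trits_balanced +1 0 -1 && echo "triad conserved"
```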

Install via CLI

npx skills add https://github.com/plurigrid/asi --skill gx10-offload

Repository Details
Forks: 8
Branch: main
Path: SKILL.md