claude-batch-ai

star 0

Fast batch AI processing using `claude -p` CLI with OAuth credentials. Use when: (1) you need programmatic Claude calls but have no API key, (2) batch processing hundreds of items with AI (titles, summaries, labels), (3) `claude -p` is slow (75s+ per call) due to MCP/tool loading overhead. Key insight: `--tools "" --disable-slash-commands` cuts startup from 75s to 9s. Covers parallel execution with ThreadPoolExecutor for throughput.

mcevoyinit By mcevoyinit schedule Updated 2/23/2026

name: claude-batch-ai description: | Fast batch AI processing using claude -p CLI with OAuth credentials. Use when: (1) you need programmatic Claude calls but have no API key, (2) batch processing hundreds of items with AI (titles, summaries, labels), (3) claude -p is slow (75s+ per call) due to MCP/tool loading overhead. Key insight: --tools "" --disable-slash-commands cuts startup from 75s to 9s. Covers parallel execution with ThreadPoolExecutor for throughput. author: Claude Code version: 1.0.0 date: 2026-02-11

Claude CLI Batch AI Processing

Problem

You need to make many programmatic Claude API calls (e.g., generating titles, summaries, labels for hundreds of items) but:

  1. No ANTHROPIC_API_KEY is available in the environment
  2. Claude Code uses OAuth credentials stored in the macOS keychain, not API keys
  3. Naive claude -p calls take 75+ seconds each due to MCP server and tool loading

Context / Trigger Conditions

  • Need batch AI processing (100+ items)
  • No API key set (ANTHROPIC_API_KEY env var is empty)
  • Claude Code is installed and authenticated via OAuth (claude.ai login)
  • Keychain shows Claude Code-credentials with claudeAiOauth tokens
  • Individual claude -p calls are unacceptably slow

Solution

1. Strip CLI overhead with flags

The critical optimization: disable all tools and skills to eliminate MCP server startup.

# SLOW (75s) — loads all MCP servers, tools, skills
echo "prompt" | claude -p --model haiku

# FAST (9s) — bare model call, no tool/MCP overhead
echo "prompt" | claude -p --model haiku --no-session-persistence --tools "" --disable-slash-commands

Flag breakdown:

  • --tools "" — Disables ALL tool loading (Bash, Edit, Read, MCP servers). This is the biggest win.
  • --disable-slash-commands — Skips skill/slash-command loading
  • --no-session-persistence — Don't save session to disk (prevents thousands of junk JSONL files)
  • --model haiku — Use cheapest/fastest model for simple tasks

2. Parallelize with ThreadPoolExecutor

import concurrent.futures
import subprocess

CLAUDE_BIN = str(Path.home() / ".local" / "bin" / "claude")

def call_claude(prompt):
    result = subprocess.run(
        [CLAUDE_BIN, "-p", "--model", "haiku",
         "--no-session-persistence", "--tools", "", "--disable-slash-commands"],
        input=prompt,
        capture_output=True,
        text=True,
        timeout=60,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip()[:200])
    return result.stdout.strip()

# 6 workers: ~0.5 items/sec throughput (9s per call / 6 parallel = 1.5s effective)
with concurrent.futures.ThreadPoolExecutor(max_workers=6) as executor:
    futures = {executor.submit(call_claude, prompt): item_id
               for item_id, prompt in work_items}
    for future in concurrent.futures.as_completed(futures):
        item_id = futures[future]
        result = future.result()
        # process result...

3. Performance characteristics

Workers Per-call Throughput 764 items
1 ~9s 0.11/s ~115 min
3 ~9s 0.33/s ~39 min
6 ~9s 0.5/s ~24 min
8 ~10s 0.7/s ~18 min

Beyond 8 workers, diminishing returns due to CPU contention from Node.js startup.

4. Clean model output

Haiku sometimes returns markdown formatting or preamble. Clean it:

import re
title = result.stdout.strip()
title = re.sub(r'\*+', '', title)        # Remove **bold**
title = re.sub(r'^#+\s*', '', title)     # Remove # headers
title = title.strip('"\'').rstrip('.')   # Remove quotes/periods
if '\n' in title:
    title = title.split('\n')[0].strip() # Take first line only

Verification

# Single call should complete in <15s
time echo "Say hello" | claude -p --model haiku --no-session-persistence --tools "" --disable-slash-commands

# Should output just the response, no tool calls or system messages

Example

Real-world usage: generating AI titles for 764 conversation sessions.

cd ~/.claude-conversations
python3 generate_titles.py --workers 6
# Result: 764 titles, 0 errors, 24.3 minutes, ~0.5 titles/sec

Notes

  • OAuth tokens are stored in macOS keychain under Claude Code-credentials
  • The --tools "" flag is the key insight — without it, each call loads all configured MCP servers (kubernetes, playwright, memory-keeper, etc.), adding 60+ seconds
  • For API-key users, the anthropic Python SDK is faster than CLI subprocess calls
  • Rate limiting is handled by the Claude infrastructure; no manual throttling needed
  • Each claude -p call spawns a full Node.js process — memory usage scales with workers
  • --no-session-persistence prevents generating thousands of JSONL transcript files
Install via CLI
npx skills add https://github.com/mcevoyinit/agentic-skills --skill claude-batch-ai
Repository Details
star Stars 0
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator