torre-operate-external-job-batches

Use when a user wants to publish dozens or thousands of external jobs from a browser workflow, a listing page, or a large link set and needs durable queue tracking, resumable progress, and reporting across a long-running Torre ingest run.

By torre-labs. Updated 6/11/2026

Torre Operate External Job Batches

Overview

Use this skill for long-running batch publication runs.

This skill does not replace per-job selection or resolution. It gives those steps a durable operating model so the work can continue across 200, 1000, or 2000 jobs without losing state.

Use these companion skills inside the run:

  • torre-select-external-jobs
  • torre-resolve-external-job-context
  • torre-post-external-jobs
  • a browser capability chosen jointly by the user and the agent when the run starts from browser interaction

When to Use

  • The user wants to publish many jobs from one or more sources
  • The source is a browser flow or a listing page with many candidate roles
  • The run may outlive one short interaction
  • The user needs a resumable queue or an operator report
  • The backend load matters and polling/submission must be paced

Batch Pattern

1. Choose the intake mode first

Ask this kickoff question:

Do you want me to start from a browser flow or from links I can inspect directly?

If the answer is browser, immediately align on the browser path:

Do we want to use your connected browser, a browser automation tool available in this agent, or switch to links I can inspect directly?

Choose:

  • browser: Use when the work requires navigation, expansion, filters, logins, or extracting links from interactive pages.
  • link: Use when the user already has a listing URL, a company jobs page, or a link set that can be parsed directly.

If the user chooses browser intake:

  • do not assume a specific browser tool
  • prefer the browser capability that is actually available and preferred in the current environment
  • only mention playwright when it is the chosen browser tool for this run
  • if neither shared browser access nor a suitable browser tool is available, fall back to link-based intake

2. Initialize a durable run folder

Create one folder per run:

output/torre-batches/<run-id>/

Recommended run-id shape:

YYYYMMDD-HHMMSS-<source-slug>

Minimum files:

  • run.md
  • queue.jsonl
  • report.md

Optional support folder:

  • artifacts/
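
The folder layout above can be sketched as follows. This is a minimal illustration, not the skill's own tooling: the function names `makeRunId` and `initRunFolder`, the `baseDir` parameter, and the choice of UTC timestamps are assumptions made for the example.

```javascript
const fs = require("fs");
const path = require("path");

// Build a run id shaped like YYYYMMDD-HHMMSS-<source-slug>.
// Assumption: the timestamp is taken in UTC so ids sort the same on every machine.
function makeRunId(date, sourceSlug) {
  const pad = (n) => String(n).padStart(2, "0");
  const day = `${date.getUTCFullYear()}${pad(date.getUTCMonth() + 1)}${pad(date.getUTCDate())}`;
  const time = `${pad(date.getUTCHours())}${pad(date.getUTCMinutes())}${pad(date.getUTCSeconds())}`;
  return `${day}-${time}-${sourceSlug}`;
}

// Create output/torre-batches/<run-id>/ with the minimum files and artifacts/.
// Existing files are left untouched so re-running this never wipes saved state.
function initRunFolder(baseDir, runId) {
  const runDir = path.join(baseDir, "output", "torre-batches", runId);
  fs.mkdirSync(path.join(runDir, "artifacts"), { recursive: true });
  for (const name of ["run.md", "queue.jsonl", "report.md"]) {
    const file = path.join(runDir, name);
    if (!fs.existsSync(file)) fs.writeFileSync(file, "");
  }
  return runDir;
}
```

Calling `initRunFolder(process.cwd(), makeRunId(new Date(), "hn-jobs"))` would create the run folder relative to the current working directory.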

3. Persist the run configuration

Write run.md with:

  • run goal
  • intake mode
  • source URLs
  • selection filters
  • TORRE_API_URL, defaulting to https://crawl.torre.ai/api
  • default_sharer_gg_id
  • optional default_subtorre
  • submission_interval_ms
  • max_in_flight_submissions
  • rate_limit_cooldown_ms
  • polling interval
  • notes about chunk size or concurrency

If the run needs a sharer_gg_id and the user has not provided default_sharer_gg_id, ask for it before the first submission phase.
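
One possible way to persist this configuration is to render it as a flat key/value list. The sketch below is only an illustration: the `renderRunConfig` name and the `- key: value` layout of run.md are assumptions, and only the default values shown are taken from this document.

```javascript
// Defaults stated in this document; every other key must come from the user.
const RUN_DEFAULTS = {
  TORRE_API_URL: "https://crawl.torre.ai/api",
  submission_interval_ms: 5000,
  max_in_flight_submissions: 1,
  rate_limit_cooldown_ms: 65000,
};

// Render run.md as a flat "- key: value" list (a hypothetical layout).
// Values passed in `overrides` replace the defaults above.
function renderRunConfig(overrides) {
  const config = { ...RUN_DEFAULTS, ...overrides };
  const lines = ["Run configuration", ""];
  for (const [key, value] of Object.entries(config)) {
    const shown = Array.isArray(value) ? value.join(", ") : String(value);
    lines.push(`- ${key}: ${shown}`);
  }
  return lines.join("\n") + "\n";
}
```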

4. Build a queue instead of holding everything in memory

Use queue.jsonl with one current-state row per job.

Minimum fields per row:

  • queue_id
  • source_type
  • source_url
  • company_name
  • job_title
  • canonical_company_url
  • canonical_job_url
  • crawled
  • request_id
  • status
  • attempts
  • resolve_request_id
  • fallback_request_id
  • fallback_strategy
  • browser_snapshot_path
  • fallback_blocked_reason
  • last_error
  • last_resolve_error
  • last_fallback_error
  • last_status_check_at

Use crawled as a nullable boolean in the queue. Keep it null when the source/operator did not specify it, because the ingest API defaults omitted crawled values to true.
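
A row factory makes the nullable `crawled` rule concrete. This is a sketch under assumptions: `newQueueRow` and `crawledField` are hypothetical helper names, and where `crawled` belongs in the ingest body is not specified here, so the second helper only returns a fragment to spread wherever torre-post-external-jobs says it goes.

```javascript
// Create a queue row with every minimum field present.
// Unknown values stay null so later phases can tell "not set" from "set to false".
function newQueueRow(queueId, sourceType, sourceUrl, fields = {}) {
  return {
    queue_id: queueId,
    source_type: sourceType,
    source_url: sourceUrl,
    company_name: null,
    job_title: null,
    canonical_company_url: null,
    canonical_job_url: null,
    crawled: null, // null means the source/operator did not specify it
    request_id: null,
    status: "discovered",
    attempts: 0,
    resolve_request_id: null,
    fallback_request_id: null,
    fallback_strategy: null,
    browser_snapshot_path: null,
    fallback_blocked_reason: null,
    last_error: null,
    last_resolve_error: null,
    last_fallback_error: null,
    last_status_check_at: null,
    ...fields,
  };
}

// Return the crawled value as a fragment to spread into an outgoing body.
// null yields {}, so the key is omitted and the ingest API default (true) applies.
// Assumption: the caller knows the correct placement of this key in the payload.
function crawledField(row) {
  return row.crawled === null ? {} : { crawled: row.crawled };
}
```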

Recommended statuses:

discovered -> selected -> resolved -> ready -> submitted -> polling -> fallback_ready -> fallback_submitted -> fallback_polling -> posted|skipped|failed|manual_review
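
The status flow can be enforced with a small transition table. The table below is one reading of the flow above, not a confirmed specification: the side exits (`discovered` to `skipped`, `selected` to `manual_review`) and the helper names `isTerminal` and `transition` are assumptions.

```javascript
// Terminal statuses: rows here are never reprocessed.
const TERMINAL = new Set(["posted", "skipped", "failed", "manual_review"]);

// Allowed moves between statuses (an interpretation of the recommended flow).
const NEXT = {
  discovered: ["selected", "skipped"], // assumption: rejected candidates are skipped
  selected: ["resolved", "ready", "manual_review"], // assumption: unresolvable rows need review
  resolved: ["ready"],
  ready: ["submitted"],
  submitted: ["polling"],
  // failed/manual_review from polling should only be used once the
  // fallback path has been evaluated and ruled out.
  polling: ["posted", "skipped", "fallback_ready", "failed", "manual_review"],
  fallback_ready: ["fallback_submitted", "manual_review"],
  fallback_submitted: ["fallback_polling"],
  fallback_polling: ["posted", "skipped", "failed", "manual_review"],
};

function isTerminal(status) {
  return TERMINAL.has(status);
}

// Return a copy of the row in the new status, or throw on an illegal move.
function transition(row, nextStatus) {
  const allowed = NEXT[row.status] || [];
  if (!allowed.includes(nextStatus)) {
    throw new Error(`illegal transition ${row.status} -> ${nextStatus}`);
  }
  return { ...row, status: nextStatus };
}
```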

5. Process the queue by phase

Discovery

  • capture candidate rows from browser or links
  • write them as discovered

Selection

  • reduce the candidate set using torre-select-external-jobs
  • move approved rows to selected

Resolution

  • resolve company identity and canonical URLs with torre-resolve-external-job-context
  • move clean rows to resolved or ready

Submission

  • submit only ready rows
  • assign a stable request_id per submission
  • default to one in-flight submission at a time
  • wait at least 5000ms between submit attempts unless the user has confirmed a safer backend-specific limit
  • when a submit returns 429 or a transient 5xx, keep the row retryable and pause all submissions for at least 65000ms
  • move rows to submitted

Polling

  • move async rows to polling
  • when the resolve request succeeds, mark the row posted or skipped according to its terminal outcome
  • when the resolve request fails but the source is still trustworthy, move the row to fallback_ready instead of failed
  • when the resolve request returns completed_with_skips with terminal_reason: "insufficient_strengths" and the source is still trustworthy, move the row to fallback_ready instead of treating the skip as final
  • update failed or manual_review only after the fallback path has also been evaluated
  • do not treat bulk retries of resolve_and_publish as fallback; they are still first-path retries
  • do not make more than two resolve_and_publish attempts for the same canonical job. After the second recoverable resolve outcome, use direct fallback or mark manual_review
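
The polling rules above reduce to a small decision function. This is a sketch, assuming a normalized `outcome` object of the form `{ ok, skipped, terminalReason }` that the caller derives from the status response; that shape, the function names, and the choice of `manual_review` for failures from untrusted sources are assumptions, not part of the Torre API.

```javascript
const MAX_RESOLVE_ATTEMPTS = 2;

// Decide the next queue status after polling a resolve_and_publish request.
// `outcome` is a hypothetical normalized shape, not the raw API response.
function nextStatusAfterResolve(outcome, sourceTrustworthy) {
  if (outcome.ok && !outcome.skipped) return "posted";

  // Failures and insufficient-strengths skips are recoverable through fallback.
  const recoverable =
    !outcome.ok || outcome.terminalReason === "insufficient_strengths";
  if (recoverable && sourceTrustworthy) return "fallback_ready";

  if (outcome.ok && outcome.skipped) return "skipped";

  // Assumption: a failure from an untrusted source has no viable fallback,
  // so the fallback path counts as evaluated and the row needs a human.
  return "manual_review";
}

// A row may be retried on the first path only while it has fewer than
// two resolve_and_publish attempts.
function canRetryResolve(row) {
  return row.attempts < MAX_RESOLVE_ATTEMPTS;
}
```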

Fallback

  • for every fallback_ready row, run browser/source remediation before building the fallback payload:
    • open the canonical job URL in Chrome, a connected browser, or the browser tool available in the current agent
    • capture the final URL, page title, visible job text, HTML, and structured job data such as JSON-LD when available
    • save the evidence under artifacts/ and write its path to browser_snapshot_path
    • if browser access is blocked, try the public ATS/API source only when it returns the full job description
    • if neither source exposes enough role content, set fallback_blocked_reason and move the row to manual_review
  • use torre-post-external-jobs fallback rules to build company.direct_publish, job.direct_publish, or both from the remediated evidence
  • for place/location validation failures, build the fallback with an explicit valid place
  • for insufficient-strengths skips, build the fallback with explicit source-backed opportunity.strengths; do not submit strengths: []
  • assign a new fallback_request_id; never reuse resolve_request_id with a different body
  • move rows to fallback_submitted or fallback_polling
  • preserve the first-pass failure in last_resolve_error
  • preserve fallback failures separately in last_fallback_error
  • count first-pass resolve effectiveness and final effectiveness separately in report.md
  • a batch is not complete while any row remains in fallback_ready, fallback_submitted, or fallback_polling
  • if more than 10% of selected jobs fail first-pass resolve, pause broad retries and run the browser remediation pass on a representative chunk before continuing
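
Three of the fallback rules lend themselves to small helpers: assigning a fresh `fallback_request_id`, recording a blocked remediation, and the 10% pause threshold. The sketch below uses hypothetical function names and assumes `last_error` holds the first-pass failure when `last_resolve_error` has not been set yet.

```javascript
const crypto = require("crypto");

// Attach remediation evidence and a fresh fallback_request_id to a row.
// The resolve_request_id is never reused for the fallback body.
function prepareFallbackRow(row, snapshotPath, strategy) {
  let fallbackRequestId = crypto.randomUUID();
  while (fallbackRequestId === row.resolve_request_id) {
    fallbackRequestId = crypto.randomUUID();
  }
  return {
    ...row,
    fallback_request_id: fallbackRequestId,
    fallback_strategy: strategy,
    browser_snapshot_path: snapshotPath,
    // Preserve the first-pass failure separately from later fallback errors.
    last_resolve_error: row.last_resolve_error ?? row.last_error,
  };
}

// Record why remediation could not produce enough role content.
function blockFallback(row, reason) {
  return { ...row, fallback_blocked_reason: reason, status: "manual_review" };
}

// True when more than 10% of selected jobs failed first-pass resolve.
// Integer arithmetic avoids floating-point edge cases at exactly 10%.
function shouldPauseBroadRetries(selectedCount, firstPassFailedCount) {
  return selectedCount > 0 && firstPassFailedCount * 10 > selectedCount;
}
```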

6. Checkpoint after every chunk

Do not wait until the end of the run.

After each chunk, update:

  • queue.jsonl
  • report.md
  • run.md notes when the operating plan changes

This is what makes the run resumable.
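
A checkpoint is only useful if an interrupted write cannot corrupt the queue. One common approach, sketched here with hypothetical helper names, is to write a temporary file and rename it over queue.jsonl; the `.tmp` suffix is an assumption.

```javascript
const fs = require("fs");
const path = require("path");

// Serialize rows as JSON Lines: one current-state row per line.
function toJsonl(rows) {
  return rows.map((row) => JSON.stringify(row)).join("\n") + (rows.length ? "\n" : "");
}

// Write queue.jsonl through a temporary file and rename it into place, so a
// crash mid-write leaves the previous checkpoint intact rather than a
// truncated queue.
function checkpointQueue(runDir, rows) {
  const target = path.join(runDir, "queue.jsonl");
  const temp = `${target}.tmp`;
  fs.writeFileSync(temp, toJsonl(rows));
  fs.renameSync(temp, target);
  return target;
}
```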

7. Resume from persisted state

When the run restarts:

  • reload queue.jsonl
  • skip terminal rows
  • continue only rows still in non-terminal states
  • preserve each row's request_id, attempts, and latest known error context
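
The resume steps can be sketched as a parse-and-filter pass over the persisted queue. The names `parseQueue` and `planResume` are hypothetical; rows are returned unchanged so `request_id`, `attempts`, and error context carry over.

```javascript
const TERMINAL_STATUSES = new Set(["posted", "skipped", "failed", "manual_review"]);

// Parse queue.jsonl text into rows, ignoring blank lines.
function parseQueue(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Pick the rows that still need work. Rows are not modified, so each keeps
// its request_id, attempts, and latest error context.
function planResume(rows) {
  const pending = rows.filter((row) => !TERMINAL_STATUSES.has(row.status));
  return { pending, terminalCount: rows.length - pending.length };
}
```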

Queue File Pattern

Example row:

{
  "queue_id": "hn-20260421-0001",
  "source_type": "listing_page",
  "source_url": "https://news.ycombinator.com/jobs",
  "company_name": "Acme Labs",
  "job_title": "Senior Backend Engineer",
  "canonical_company_url": "https://acme.com",
  "canonical_job_url": "https://jobs.acme.com/backend-engineer",
  "crawled": null,
  "request_id": "a6f5b64f-7a5f-4f1d-8e4c-0f26d6df4f52",
  "resolve_request_id": "a6f5b64f-7a5f-4f1d-8e4c-0f26d6df4f52",
  "fallback_request_id": null,
  "fallback_strategy": null,
  "browser_snapshot_path": null,
  "fallback_blocked_reason": null,
  "status": "polling",
  "attempts": 1,
  "last_error": null,
  "last_resolve_error": null,
  "last_fallback_error": null,
  "last_status_check_at": "2026-04-21T16:05:00Z"
}

Reporting Pattern

Keep report.md human-readable. Update counts such as:

  • discovered
  • selected
  • resolved
  • ready
  • submitted
  • polling
  • fallback_ready
  • fallback_submitted
  • fallback_polling
  • posted
  • skipped
  • failed
  • manual_review

Include these effectiveness metrics:

  • first-pass posted rate from resolve_and_publish
  • fallback attempted count
  • fallback posted rate
  • fallback blocked count and reasons
  • final posted rate after fallback

Also keep short sections for:

  • current source and filters
  • latest successful chunk
  • repeated failure reasons
  • repeated fallback failure reasons
  • browser/source remediation coverage
  • items needing manual follow-up
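
The counts and effectiveness metrics can be derived directly from the queue rows. The denominators are not defined in this document, so the sketch below assumes them: rates are computed over rows that have a `resolve_request_id`, and a row counts as a fallback attempt once it has a `fallback_request_id`. The function and key names are also hypothetical.

```javascript
const STATUSES = [
  "discovered", "selected", "resolved", "ready", "submitted", "polling",
  "fallback_ready", "fallback_submitted", "fallback_polling",
  "posted", "skipped", "failed", "manual_review",
];

// Return null instead of NaN when nothing has been attempted yet.
function ratio(numerator, denominator) {
  return denominator === 0 ? null : numerator / denominator;
}

function summarize(rows) {
  const counts = Object.fromEntries(STATUSES.map((status) => [status, 0]));
  for (const row of rows) {
    if (STATUSES.includes(row.status)) counts[row.status] += 1;
  }
  // Assumption: a row entered the first path once it has a resolve_request_id,
  // and entered the fallback path once it has a fallback_request_id.
  const firstPath = rows.filter((r) => r.resolve_request_id != null);
  const fallback = rows.filter((r) => r.fallback_request_id != null);
  const posted = rows.filter((r) => r.status === "posted");
  const fallbackPosted = posted.filter((r) => r.fallback_request_id != null);
  return {
    counts,
    first_pass_posted_rate: ratio(posted.length - fallbackPosted.length, firstPath.length),
    fallback_attempted: fallback.length,
    fallback_posted_rate: ratio(fallbackPosted.length, fallback.length),
    fallback_blocked: rows.filter((r) => r.fallback_blocked_reason != null).length,
    final_posted_rate: ratio(posted.length, firstPath.length),
  };
}
```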

Rate and Load Guardrails

  • Treat a batch as a durable queue of paced API calls, not as permission to fire many requests at once.
  • Default max_in_flight_submissions to 1.
  • Default submission_interval_ms to 5000.
  • Default rate_limit_cooldown_ms to 65000.
  • Never poll the status endpoint faster than once per second.
  • Prefer 2000-5000ms between polls for large runs.
  • Keep submission concurrency intentionally conservative; increase it only when the user confirms the backend can absorb it.
  • Do not fire unbounded status loops against the same backend.
  • Do not run multiple batch workers against the same backend unless the user explicitly confirms the intended combined rate.
  • If any request returns 429, treat it as backend pressure, not as a job failure. Slow the whole run before retrying.
  • Stop polling a row as soon as it reaches a terminal state.
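
These guardrails can be applied mechanically when a run loads its pacing settings. The sketch below is an assumption-laden illustration: `resolvePacing`, the `status_poll_interval_ms` key, and the `userConfirmedBackendLimit` flag are hypothetical names, while the numeric floors come from the list above.

```javascript
const PACING_DEFAULTS = {
  max_in_flight_submissions: 1,
  submission_interval_ms: 5000,
  rate_limit_cooldown_ms: 65000,
  status_poll_interval_ms: 5000, // hypothetical key name for the polling interval
};

// Merge requested pacing with the defaults, then clamp it to the guardrails.
// Faster submission settings are honored only when the user has confirmed
// that the backend can absorb them.
function resolvePacing(requested = {}, userConfirmedBackendLimit = false) {
  const pacing = { ...PACING_DEFAULTS, ...requested };

  // Never poll faster than once per second.
  pacing.status_poll_interval_ms = Math.max(pacing.status_poll_interval_ms, 1000);
  // Pause for at least 65000 ms after backend pressure.
  pacing.rate_limit_cooldown_ms = Math.max(pacing.rate_limit_cooldown_ms, 65000);

  if (!userConfirmedBackendLimit) {
    pacing.max_in_flight_submissions = Math.min(pacing.max_in_flight_submissions, 1);
    pacing.submission_interval_ms = Math.max(pacing.submission_interval_ms, 5000);
  }
  return pacing;
}
```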

Batch API Pacing Example

Batch publishing should use the normal ingest API from a queue. Do not build one huge burst of parallel POST requests unless Torre.ai provides a dedicated bulk endpoint and the user confirms that endpoint should be used.

Use this operating shape for batch submissions:

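// Pacing defaults from the guardrails above.
const SUBMISSION_INTERVAL_MS = 5000;
const STATUS_POLL_INTERVAL_MS = 5000;
const RATE_LIMIT_COOLDOWN_MS = 65000;
const MAX_IN_FLIGHT_SUBMISSIONS = 1;

for (const row of readyRows) {
  // Block until fewer than MAX_IN_FLIGHT_SUBMISSIONS requests are outstanding.
  await waitForAvailableSubmissionSlot(MAX_IN_FLIGHT_SUBMISSIONS);

  const response = await postJson(`${TORRE_API_URL}/crawling/ingest`, {
    request_id: row.request_id,
    company: row.company,
    job: row.job
  });

  // A 429 or a 5xx (treated here as transient) signals backend pressure,
  // not a job failure: keep the row retryable and pause all submissions.
  if (response.status === 429 || response.status >= 500) {
    markRetryable(row, response);
    await sleep(RATE_LIMIT_COOLDOWN_MS);
    continue;
  }

  recordSubmissionResult(row, response);
  await sleep(SUBMISSION_INTERVAL_MS);
}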

Use this operating shape for status checks:

for (const row of pollingRows) {
  const response = await getJson(
    `${TORRE_API_URL}/crawling/ingest/status/${row.request_id}`
  );

  if (response.status === 429) {
    keepPollingLater(row, response);
    await sleep(RATE_LIMIT_COOLDOWN_MS);
    continue;
  }

  updateTerminalStateIfReady(row, response);
  await sleep(STATUS_POLL_INTERVAL_MS);
}

This example is intentionally sequential. If a run needs more throughput, first reduce repeated polling, then increase chunk size carefully, and only then consider more submission concurrency with explicit user confirmation.
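
The loops above call `sleep`, `postJson`, and `getJson` without defining them. A minimal sketch of those helpers follows, assuming a runtime with a global `fetch` (for example Node 18 or later) and that callers only need the HTTP status and the parsed JSON body. The `fetchImpl` parameter is a hypothetical hook for testing, and any authentication the ingest API requires is not covered here.

```javascript
// Resolve after the given number of milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Reduce a fetch Response to { status, body }, with body null when the
// response is empty or is not valid JSON (as can happen on a 429 or 5xx).
async function readResponse(response) {
  const text = await response.text();
  let body = null;
  if (text) {
    try {
      body = JSON.parse(text);
    } catch {
      body = null;
    }
  }
  return { status: response.status, body };
}

// POST a JSON payload. Non-2xx responses are returned, not thrown, so the
// caller can branch on response.status.
async function postJson(url, payload, fetchImpl = fetch) {
  const response = await fetchImpl(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  return readResponse(response);
}

// GET a JSON resource with the same { status, body } result shape.
async function getJson(url, fetchImpl = fetch) {
  const response = await fetchImpl(url, { headers: { Accept: "application/json" } });
  return readResponse(response);
}
```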

Quick Reference

| Situation | Action |
| --- | --- |
| User gives one or a few jobs | Stay in torre-post-external-jobs |
| User gives a listing with many jobs | Start a batch run |
| Source requires clicks or browser state | Use browser intake with the user-approved browser capability |
| Source is directly inspectable by URL | Use link intake |
| Run is interrupted | Resume from queue.jsonl |
| Backend is under pressure | Slow polling and reduce chunk size |
| Any request returns 429 | Pause the whole run, keep the row retryable, and retry later |
| Resolve fails but source is trustworthy | Move row to fallback_ready and try direct fallback |
| Many rows fail with timeout, missing opportunity id, extraction failure, or validation errors | Open failed jobs in browser/source remediation and build direct payloads |

Common Mistakes

  • Starting a 200+ job run without a persistent queue file
  • Keeping progress only in the chat context
  • Mixing source URLs and canonical URLs
  • Polling every request too aggressively
  • Treating a batch as parallel API submission by default
  • Retrying immediately after a 429
  • Reprocessing already terminal rows on resume
  • Publishing before the queue has a confirmed selected set
  • Assuming playwright is always the right browser path
  • Treating URL rewriting or another resolve retry as the pass-through fallback
  • Ending the run with recoverable failed rows that were never opened in browser/source remediation

Install via CLI

npx skills add https://github.com/torre-labs/skills --skill torre-operate-external-job-batches