name: build-paths-intermediate description: Scaffold a Grok-4 tool-calling agent over an Oracle schema using langchain-oracledb + oracle-database-mcp-server + in-DB ONNX embeddings (registered MiniLM model, no external embedding API) + Open WebUI. For users who've built RAG before and want to rebuild it on the production-feeling Oracle stack. inputs: - target_dir: where to scaffold (default = current working directory; ask if it isn't empty) - topic: optional; one of intermediate/project-ideas.md, or a free-text pitch
The user picked the intermediate path. They've built RAG and chatbots before. Your job is to introduce them to two new ideas at once: (a) an LLM agent that calls live SQL via oracle-database-mcp-server, and (b) embeddings that happen inside the database via a registered ONNX model. The stack is production-shaped: OCI GenAI Grok 4, in-DB ONNX, Open WebUI. No Ollama, no external embedding API.
Step 0 — Read these references first
shared/references/sources.mdshared/references/oracle-26ai-free-docker.mdshared/references/langchain-oracledb.mdshared/references/oci-genai-openai.md← Pattern 1 SigV1 authshared/references/onnx-in-db-embeddings.md← load-bearing for embeddingsshared/references/oracledb-python.mdshared/references/ai-vector-search.mdshared/references/hybrid-search.md(idea 3 specifically)shared/references/exemplars.mdintermediate/project-ideas.mdskills/oracle-aidb-docker-setup/SKILL.mdskills/langchain-oracledb-helper/SKILL.mdskills/oracle-mcp-server-helper/SKILL.md
Step 1 — Interview
Run shared/interview.md. For intermediate specifically:
- Q3 (DB target) — default to local Docker. Allow "already-running container" if user says so.
- Q4 (Inference) — not optional at this tier. OCI GenAI for the LLM (
xai.grok-4via the OpenAI-compat bearer-token endpoint atus-phoenix-1). In-DB ONNX for embeddings. Confirm:OCI_GENAI_API_KEY(ask-...value) is set or about to be added to project.env. If absent, stop and ask the user to generate one in the OCI GenAI service console. No~/.oci/config/ no compartment OCID needed.- Default endpoint is
https://inference.generativeai.us-phoenix-1.oci.oraclecloud.com; override viaOCI_GENAI_BASE_URL. - In-DB ONNX model: default =
sentence-transformers/all-MiniLM-L6-v2, registered asMY_MINILM_V1(384 dim). The user does not need to download this themselves — the skill scaffolds viaonnx2oracleCLI (one command).
- Q5 (Topic) — one of the three from
intermediate/project-ideas.md. Map free-text pitches; default to idea 1 (NL2SQL). - Q6 (Notebook) — default yes.
- Q7 (intermediate-only) — sql_mode for MCP? —
read_only(default — covers all three idea shapes safely) orread_write. Idea 1 and idea 2 are read-only. Idea 3 can be either. Capture an explicityifread_writeselected.
Print confirmation block. Wait for y.
Step 2 — Resolve choices
| Variable | Value |
|---|---|
project_slug |
derived from topic |
package_slug |
snake_case |
embedder |
in-db-onnx |
embedding_dim |
384 |
onnx_model_local_id |
sentence-transformers/all-MiniLM-L6-v2 |
onnx_model_db_name |
MY_MINILM_V1 |
llm_model |
grok-4 (or chosen fallback) |
oci_base_url |
https://inference.generativeai.us-phoenix-1.oci.oraclecloud.com (the OpenAI client appends /v1; do not add the legacy /20231130/actions/openai path — that is for SigV1, not bearer-token) |
collections |
per-idea: idea 1 → ["CONVERSATIONS"] only; idea 2 → ["SCHEMA_DOCS_DOCUMENTS", "CONVERSATIONS"]; idea 3 → ["INVOICES_DOCS", "CONVERSATIONS"] |
mcp_sql_mode |
read_only (default) |
mcp_allowed_tools |
per-idea (see below) |
notebook |
yes |
Step 3 — Scaffold
Order matters: building-block skills first, then project code.
3a — Foundation via building-block skills
- Refuse if
target_diris non-empty. - Invoke
skills/oracle-aidb-docker-setup. Block until OK. - Append the Open WebUI service to the generated
docker-compose.yml(same as beginner SKILL step 3a-3). - Register the in-DB ONNX model via
onnx2oracleCLI before invoking the langchain helper, since the helper's dim assertion needs the model registered:- Add
onnx2oracleto the project'spyproject.tomldeps. - Install:
~/miniconda3/envs/<env>/bin/pip install onnx2oracle. - Run:
onnx2oracle load all-MiniLM-L6-v2 --name MY_MINILM_V1 --dsn "$DB_USER/$DB_PASSWORD@$DB_DSN" --force— outputsMY_MINILM_V1registered in the DB. - Smoke:
SELECT VECTOR_EMBEDDING(MY_MINILM_V1 USING 'test' AS data) FROM dualreturns a 384-vector. If not, stop and surface the loader error (most common: missing GRANTs — seeshared/references/onnx-in-db-embeddings.md"Required GRANTs"). - The required GRANTs (
CREATE MINING MODEL,EXECUTE ON SYS.DBMS_VECTOR) are issued byoracle-aidb-docker-setupStep 6. If they're missing, the docker-setup didn't run fully — fix that first.
- Add
- Invoke
skills/langchain-oracledb-helper. Passtarget_dir,package_slug,embedder=in-db-onnx(the helper writes theInDBEmbeddingssubclass),collections=...,has_chat_history=True. Block until OK. - Invoke
skills/oracle-mcp-server-helper. Passtarget_dir,package_slug,sql_mode=...,allowed_tools=.... Block until OK. Tool list per idea:- Idea 1:
[list_tables, describe_table, run_sql] - Idea 2:
[list_tables, describe_table, describe_schema, run_sql, vector_search] - Idea 3:
[run_sql, vector_search](the agent doesn't need to discover tables — they're known)
- Idea 1:
3b — Per-idea seeding
- Idea 1 (NL2SQL with seeded fake data). Generate
migrations/100_seed_dummy.sql— 10 tables (customers, orders, products, employees, suppliers, invoices, payments, regions, categories, returns), populated viaFakerfromscripts/seed_faker.py. ~50K rows. Run during bootstrap. - Idea 2 (Schema doc Q&A). Reuse the seed schema from idea 1 if the user wants; otherwise expect them to point at their real schema.
- Idea 3 (Hybrid retrieval). Generate
INVOICE_PDFS/folder viascripts/seed_invoice_pdfs.py(usesreportlabto make 20 fake invoice PDFs). Runingest.pyonce at bootstrap to embed them intoINVOICES_DOCSvia in-DB embeddings. Plus the seed schema from idea 1.
3c — Project-specific code (the only files this skill writes itself)
target_dir/.gitignore— extend withdata/,INVOICE_PDFS/,*.onnx,scripts/__pycache__/.target_dir/pyproject.toml— start fromshared/templates/pyproject.toml.template(do NOT hand-roll one). The template already pins the correct build backend (setuptools.build_meta, NOTsetuptools.backends.legacy:build— that name does not exist insetuptools>=68andpip install -e .will fail withModuleNotFoundErroron a fresh venv; v3 friction P0-V3-N4). Then extenddependencieswith:- Always:
fastapi>=0.110,uvicorn[standard]>=0.27,langchain-core>=0.3,langchain-community>=0.3,langchain-openai>=0.2,langgraph>=0.2,openai>=1.40,onnx2oracle,Faker>=24,python-multipart. - Idea 1: + (no extras).
- Idea 2: + (no extras).
- Idea 3: +
reportlab>=4,pypdf>=4. - Do NOT add
oci-openai,oci,oracle-database-mcp-server, or hand-rolled ONNX deps. Those are friction P0-1 / P0-2 / P0-4 — superseded by the bearer-token + local-BaseTool +onnx2oraclepaths. - Imports use the installed package name, not the on-disk path. Even though sources live under
src/<package_slug>/, the[tool.setuptools.packages.find] where = ["src"]line in the template meanspip install -e .installs the package as<package_slug>(nosrc.prefix). Always importfrom <package_slug>.foo import bar—from src.<package_slug>.fooraisesModuleNotFoundError: No module named 'src'(v3 friction P1-V3-F-3).
- Always:
src/<package_slug>/inference.py— copyshared/snippets/oci_chat_factory.pyverbatim. It uses the upstreamopenaiSDK against the OCI Generative AI bearer-token endpoint (OCI_GENAI_BASE_URLdefaults tous-phoenix-1, auth viaOCI_GENAI_API_KEY). Model id is the fullxai.grok-4. The earlier OCI-SDK SigV1 path is inarchive/only.Per-idea agent module — IMPORTANT, read both warnings before writing code:
Warning A — LangChain 1.x removed
AgentExecutorandcreate_tool_calling_agent. They were inlangchain.agentsin 0.3.x; in 1.x the agent loop has moved to LangGraph (langgraph.prebuilt.create_react_agent) or to plain.bind_tools()+ manual loop. Do NOT import them fromlangchain.agents—ImportErroron a fresh venv (v3 friction P0-V3-N2).Warning B — Grok-4 over the OCI OpenAI-compat endpoint stops emitting structured
tool_callsafter ~2 turns. On the 3rd+ tool call it returns plain text likeFunction: run_sql({"query": "..."})instead of an OpenAI-shapetool_callsobject, which LangGraph + LangChain agents cannot parse. The reliable shape at this tier is therefore a 2-step pipeline, not an open agent loop (v3 friction P0-V3-N3):# src/<package_slug>/agent.py — 2-step pipeline (canonical) from <package_slug>.inference import get_chat_client from <package_slug>.tool_registry import get_tools def answer(user_q: str) -> dict: tools = get_tools() # local BaseTool subclasses from shared/snippets llm = get_chat_client() # Step 1: LLM picks ONE tool + args (single tool_call — reliable) plan = llm.bind_tools(tools).invoke([{"role": "user", "content": user_q}]) # Step 2: execute, then synthesise (no further tool turns) results = [t.run(call["args"]) for call in plan.tool_calls for t in tools if t.name == call["name"]] final = llm.invoke([ {"role": "user", "content": user_q}, {"role": "assistant", "content": str(plan.tool_calls)}, {"role": "tool", "content": "\n".join(map(str, results))}, ]) return {"answer": final.content, "tool_calls": plan.tool_calls, "tool_results": results}The 2-step pipeline produces grounded answers + the SQL/tool args used, which is what the demo needs. If a multi-step loop is essential (e.g. idea 3's "vector then SQL then both" routing), split it into multiple top-level
answer()calls and orchestrate from the FastAPI adapter — never let the LLM drive >2 tool turns in one call. The intermediate v3 cold-start walk proved this.- Idea 2 →
src/<package_slug>/generate.py(one-shot script that walks the schema and INSERTs rows intoSCHEMA_DOCS_DOCUMENTSwith embeddings viaVECTOR_EMBEDDING(MY_MINILM_V1 USING :description)) +src/<package_slug>/agent.py(RAG over the generated docs viavector_searchMCP tool). - Idea 3 →
src/<package_slug>/agent.pywith a system prompt that explicitly teaches the agent the two-modality choice (vector for "find similar invoices to this PDF", run_sql for "sum unpaid amounts", both for "find unpaid invoices similar to X").
- Idea 2 →
src/<package_slug>/adapter.py— FastAPI/v1/chat/completionswrapping the agent (same shape as beginner; differences: handles tool-call streaming events from the agent executor, surfaces them as OpenAI-compatible "function_call" deltas).SQLcl-tee logging (folded in by default at this tier — friction-pass decision). Wrap the
run_sqlBaseTool withshared/snippets/sqlcl_tee.pyso every SQL the agent emits gets teed through SQLcl into<target>/logs/sqlcl_<ts>.log. The wrapper appends[sqlcl_log: <path>]to the streamed response. Setup:- Pre-flight: check that SQLcl is installed before scaffolding the wiring. Run
which sql(orcommand -v sql); if not on PATH, follow the install steps inshared/references/sqlcl-tee.md(~/opt/sqlcl) BEFORE writing the wiring. Do NOT assume/home/ubuntu/sqlcl/bin/sqlor any other host-specific path is present (v3 friction P2-V3-N5). The wrapper degrades gracefully when SQLcl is missing — it appends[sqlcl_tee: skipped — SQLcl not installed]and the inner tool result passes through — but the user loses the inspectable log, so install is strongly recommended. - In
src/<package_slug>/tool_registry.py, importfrom shared.snippets.sqlcl_tee import wrap_with_sqlcl_teeand wrap therun_sqltool that comes back frommcp_client.list_tools(). - Document SQLcl install in the project's README (link to
shared/references/sqlcl-tee.md). - Why MCP+SQLcl: MCP shows the SQL the agent emits; SQLcl shows what the DB actually did (rows, errors, plan). Together you can debug an agent turn end-to-end.
- Observability inherited from
oracle-mcp-server-helperSteps 4.5+4.6: everyrun_sqlcall goes out tagged/* LLM in use is <model> */, sessions populateV$SESSION.MODULE/ACTION, and (if the user opts in) one row is inserted intoCYP_MCP_LOGper call. README should mention the three diagnostic queries:SELECT module, action FROM v$session,SELECT * FROM v$sql WHERE sql_text LIKE '/* LLM in use is %', andSELECT * FROM CYP_MCP_LOG ORDER BY ts DESC FETCH FIRST 20 ROWS ONLY. - If the user has SQLcl 25.2+: mention in the README that
sql -mcpis a drop-in alternative for the local-tool transport (Oracle's first-party MCP server, ships withDBTOOLS$MCP_LOGnatively). Do not auto-switch — the local-tool scaffold remains the workshop default for portability.
- Pre-flight: check that SQLcl is installed before scaffolding the wiring. Run
verify.py— fill template:- Round-trip:
len(get_embedder().embed_query("dim check")) == 384. - Smoke: query the registered ONNX model directly via SQL.
- Smoke: list MCP tools — assert at least the per-idea allowed list is present.
- Smoke: a single chain call asking a simple question of the seeded data.
- Round-trip:
notebook.ipynb— 8 cells:- Setup (load
.env, smokeverify). - Show the registered ONNX model (
SELECT * FROM USER_MINING_MODELS WHERE MODEL_NAME='MY_MINILM_V1'). - Show MCP tools list.
- One direct
vector_searchMCP call. - One
run_sqlMCP call. - One full agent turn (idea-specific question).
- Show the chat history table populated.
- "Now run
python -m <pkg>.adapterand openhttp://localhost:3000."
- Setup (load
README.md— fill placeholders. "Why Oracle" paragraph names: in-DB ONNX embeddings, AI Vector Search, oracle-database-mcp-server, JSON Duality (idea 3), persistent chat history. Include the "Why in-DB embeddings?" callout fromintermediate/project-ideas.mdverbatim — it's the load-bearing pitch.
Step 4 — Verify
- DB is up (skill 1).
- ONNX model registered (step 3a-4).
- From
target_dir:python -m pip install -e .. python verify.py. Expectverify: OK (db, vector, inference, mcp).- Run notebook end-to-end:
jupyter nbconvert --to notebook --execute notebook.ipynb. Must complete clean. - Bring Open WebUI up. Boot adapter, hit
/v1/models, kill it. Don't keep it running. - On any failure, follow
shared/verify.mdrecovery loop, max 3 retries.
Step 5 — Polish for sharing
- README placeholders filled.
docs/— note: drop a 60s demo GIF showing tool-call traces.- Final report:
Done. project at: <target_dir> features used: in-DB ONNX (MY_MINILM_V1), oracle-database-mcp-server, OracleVS, OracleChatHistory run with: cd <target_dir> docker compose up -d python -m <pkg>.adapter # blocks; Open WebUI on :3000 verify: OK notebook: <target_dir>/notebook.ipynb (executed clean) ui: http://localhost:3000 next: record a 60s tool-call demo, push to GitHub.
Stop conditions
- OCI selected but
~/.oci/configmissing — stop, point atoci setup config. - ONNX export fails (BertTokenizer model only — SentencePiece will fail). Surface error, stop.
- ONNX model registers but its
embed_queryreturns dim ≠ 384. Drop the model, surface error. - MCP server fails to initialize within 30s — stop and surface stderr.
sql_mode=read_writewithout explicit usery.- Verify fails 3 times.
When to graduate to OAMP
If you grow this intermediate project into a multi-user agent — multiple humans, each wanting their preferences and durable facts auto-extracted and recalled across sessions — swap the manual chat history layer for OAMP (oracleagentmemory PyPI package). OAMP owns per-user threads, automatic memory extraction, and prompt-ready context cards; the advanced tier wires it via shared/snippets/oamp_helpers.py against the same in-DB ONNX embedder + Grok-4 you're using here. See shared/references/oamp.md for the OAMP-vs-OracleVS-vs-OracleChatHistory-vs-SQL decision tree. Until then, the manual chat history is correct for single-user demos.
What you must NOT do
- Don't bypass the metadata-as-string monkeypatch (
langchain-oracledb-helperincludes it; just don't remove the import). - Don't write raw
VECTOR_DISTANCESQL whenOracleVS.similarity_searchcovers it. - Don't introduce non-Oracle vector stores anywhere.
- Don't introduce Ollama as a fallback. OCI GenAI only at this tier.
- Don't introduce Cohere embeddings as a fallback. In-DB ONNX is the contract — the whole pedagogical point is "no external embedder."
- Don't pin a model that doesn't exist in the user's region without warning.
- Don't ship without the executed notebook.
- Don't claim done before verify is green AND the notebook runs clean AND the adapter boots.