name: connect-colab-local-tts description: Launch a local-tts-on-google-colab OpenAI-compatible TTS server through Colab MCP Go, expose it with trycloudflare, then validate it from @aituber-onair/voice using the Node.js openaiCompatible example. Use when requests mention connecting Colab local TTS, local-tts-on-google-colab, Colab MCP Go, trycloudflare TTS URLs, or running @aituber-onair/voice against a Colab OpenAI-compatible speech endpoint.
Connect Colab Local TTS
Goal
Use Colab MCP Go to drive Google Colab, launch
shinshin86/local-tts-on-google-colab, capture its public
trycloudflare URL, and verify speech generation through
@aituber-onair/voice.
This skill consumes an existing OpenAI-compatible Colab TTS project. Do not rewrite TTS wrappers unless the user explicitly asks for wrapper development.
Inputs
Collect or infer:
repo_url: defaulthttps://github.com/shinshin86/local-tts-on-google-colab.gitrepo_ref: defaultmain; prefer a tag or commit for reproducible runsengine: defaultIrodori-TTSwhen the user does not choose onemodel: OpenAI-compatible model id; default to the selected enginevoice: optional; leave empty for engines with no useful voice selectortext: Japanese smoke text to synthesize
Good sample engines:
Irodori-TTS: useful Japanese default candidatePiper-Plus: lightweight multilingual candidateMOSS-TTS-v1.5: notable Hugging Face model, but A100-class Colab is usually required
Procedure
- Confirm Colab MCP Go is available.
- Use
open_colab_browser_connectionwhen the Colab session is not connected. - Then call
list_colab_tools. - Do not assume exact remote tool names; choose the available tool that can create/replace notebook cell code and run it.
- Use
- If the run needs a GPU, gated model access, or
HF_TOKEN, ask the user to prepare those items in Colab before launching.- The user handles Google login, runtime selection, gated-model license acceptance, and token/secret setup.
- The agent should still explain the concrete missing prerequisite.
- Generate one Colab cell that launches the selected engine.
- Use
repo_url,repo_ref, andengine. - Keep
EXPOSE_PUBLIC_URL = True. - Pass engine-specific options only when the user chose them.
- Set
OPENAI_MODEL_IDtomodelwhen provided.
- Use
- Execute the cell through the Colab MCP tools and monitor output.
- Wait for
=== Public Ready ===. - Extract
Speech : https://...trycloudflare.com/v1/audio/speech. - If the MCP result only shows
CompletedProcess(...), inspect/content/openai-compatible-local-tts/logs/cloudflared.logand extract the latesthttps://...trycloudflare.comURL. - If the run fails, inspect the tail shown by the Colab output before changing settings.
- Wait for
- Validate the Colab server itself.
- Use the generated test curl from Colab output when possible.
- Confirm
GET /v1/modelsandPOST /v1/audio/speechwork before running the local voice package example.
- Validate
@aituber-onair/voicelocally.- Build the voice package before running the Node example.
- Run
packages/voice/examples/node-basic/openai-compatible-colab-example.jswith dynamic environment variables:
OPENAI_COMPATIBLE_TTS_URL="<speech_url>" \
OPENAI_COMPATIBLE_TTS_MODEL="<model>" \
OPENAI_COMPATIBLE_TTS_VOICE="<voice>" \
OPENAI_COMPATIBLE_TTS_TEXT="<text>" \
node packages/voice/examples/node-basic/openai-compatible-colab-example.js
Use OPENAI_COMPATIBLE_TTS_PLAY=0 when the environment cannot play audio but
WAV generation should still be verified.
Colab Cell Shape
Use the current canonical launch flow from
local-tts-on-google-colab. The generated cell should be equivalent to:
REPO_URL = "https://github.com/shinshin86/local-tts-on-google-colab.git"
REPO_REF = "main"
WORKDIR = "/content/local-tts-on-google-colab"
ENGINE = "Irodori-TTS"
EXPOSE_PUBLIC_URL = True
TEST_TEXT = "こんにちは。AITuber OnAir Voice から再生しています。"
TEST_SPEED = 1.0
TEST_VOICE = ""
OPENAI_MODEL_ID = "Irodori-TTS"
import os
import subprocess
if not os.path.exists(WORKDIR):
subprocess.run(["git", "clone", REPO_URL, WORKDIR], check=True)
subprocess.run(["git", "fetch", "--all", "--tags"], cwd=WORKDIR, check=True)
subprocess.run(["git", "checkout", REPO_REF], cwd=WORKDIR, check=True)
cmd = [
"python", "colab/bootstrap.py",
"--engine", ENGINE,
"--test-text", TEST_TEXT,
"--test-speed", str(TEST_SPEED),
"--openai-model-id", OPENAI_MODEL_ID,
]
if TEST_VOICE:
cmd.extend(["--test-voice", TEST_VOICE])
if EXPOSE_PUBLIC_URL:
cmd.append("--expose-public-url")
else:
cmd.append("--no-expose-public-url")
subprocess.run(cmd, cwd=WORKDIR, check=True)
For Piper-Plus, add --piper-plus-model <value> only when the user chooses
one. For MOSS-TTS-v1.5, warn that A100 is expected and pass
--moss-tts-v1-5-language, --moss-tts-v1-5-default-voice, or prompt WAV
options only when selected.
Verification Commands
Run from the repository root after a code change:
npm -w @aituber-onair/voice run fmt
npm -w @aituber-onair/voice run build
npm -w @aituber-onair/voice run test
For a live Colab endpoint:
OPENAI_COMPATIBLE_TTS_URL="<speech_url>" \
OPENAI_COMPATIBLE_TTS_MODEL="<model>" \
OPENAI_COMPATIBLE_TTS_VOICE="<voice>" \
OPENAI_COMPATIBLE_TTS_PLAY=0 \
node packages/voice/examples/node-basic/openai-compatible-colab-example.js
Failure Modes
- Colab MCP tools are unavailable: open the Colab browser connection and list tools again.
- No public URL appears: check
cloudflaredoutput and whether the server became ready locally. - Colab reports success but no detailed output: read the files under
/content/openai-compatible-local-tts/logs, especiallycloudflared.logand the engine-specific uvicorn log. MOSS-TTS-v1.5fails on L4/T4: retry on A100 or choose a lighter engine.POST /v1/audio/speechfails but/v1/modelsworks: inspect Colab logs and verify model, voice, prompt WAV, language, and gated-model prerequisites.- Node example saves no audio: confirm
OPENAI_COMPATIBLE_TTS_URLincludes/v1/audio/speech, and setOPENAI_COMPATIBLE_TTS_PLAY=0to isolate playback from synthesis.