name: xrblocks
description: >-
Guide to building AI + XR applications with the XR Blocks SDK (the xrblocks
package, source in src/). Use when writing, editing, or debugging WebXR /
Android XR / VR / AR / mixed-reality experiences on this framework — authoring
xb.Script classes, configuring xb.Options, and wiring spatial UI, hand
tracking, gestures, depth & occlusion, plane/object detection, physics,
spatial audio, or Gemini/OpenAI integration — and when running them in the
desktop simulator or on-device. Covers the canonical app skeleton, the real
enable* option methods, the Script lifecycle hooks, the global aliases
(xb.core, xb.user, xb.world, xb.ai...), and the most common
hallucinated-API mistakes to avoid. Read before generating XR Blocks code.
XR Blocks SDK
XR Blocks (import * as xb from 'xrblocks') is a cross-platform JavaScript/TypeScript
SDK for rapidly prototyping AI + XR apps. It is built on three.js,
targets Chrome 136+ with WebXR on Android XR, and ships a desktop simulator so the
same code runs in a normal browser. The whole SDK is re-exported from
src/xrblocks.ts — that barrel is the public API surface; if a symbol is
not exported there it is internal.
The single most important rule when generating code: only call APIs that exist. The framework's own evaluation found that hallucinated/inconsistent APIs are the #1 cause of broken generated apps. When unsure, grep
src/xrblocks.ts and the relevant subfolder, or copy a pattern from a real file in samples/, demos/, or templates/.
Canonical app skeleton
Every app is one or more xb.Script subclasses added before xb.init(). This is the
minimal, verified pattern (see templates/ and the README for full HTML):
import * as THREE from 'three';
import * as xb from 'xrblocks';

class MainScript extends xb.Script {
  init() {
    // called once; may be async
    this.add(new THREE.HemisphereLight(0xffffff, 0x666666, 3));
    const geometry = new THREE.CylinderGeometry(0.2, 0.2, 0.4, 32);
    const material = new THREE.MeshPhongMaterial({color: 0xffffff});
    this.player = new THREE.Mesh(geometry, material);
    // Place it in front of the user at a comfortable height/distance.
    this.player.position.set(0, xb.user.height - 0.5, -xb.user.objectDistance);
    this.add(this.player); // `this` is itself a THREE.Object3D
  }

  onSelectEnd() {
    // desktop click OR XR pinch
    this.player.material.color.set(Math.random() * 0xffffff);
  }
}

document.addEventListener('DOMContentLoaded', () => {
  xb.add(new MainScript()); // register the script
  xb.init(new xb.Options()); // boot the engine + render loop
});
Script is a THREE.Object3D, so this.add(obj) puts things in the scene under it.
Do not manage the render loop, WebXR session, or camera yourself — Core owns them.
Mental model
- Core is a singleton (core/Core.ts). xb.init() builds the renderer, camera, WebXR session, and every subsystem, then drives the frame loop. Core is also exposed as xb.core, with convenience aliases: xb.scene, xb.user, xb.world, xb.ai, xb.depth, xb.sound, xb.input, xb.camera, plus functions xb.add(), xb.init(), xb.getDeltaTime(), xb.getElapsedTime().
- Script is your extension point (core/Script.ts): Unity MonoBehaviour-style lifecycle hooks (below).
- Options configures everything (core/Options.ts): one object with chainable enable*() methods and per-subsystem sub-options.
- The conceptual model from the papers is the Reality Model: user, world, and AI agents as first-class primitives, with an interaction grammar that separates explicit events (onSelectStart, click, pinch) from implicit intent (gesture, gaze, voice).
Enabling features (xb.Options)
These chainable methods exist — verified in core/Options.ts:
const options = new xb.Options();
options.enableUI(); // spatial UI + reticles
options.enableReticles(); // pointing cursor
options.enableControllers(); // tracked controllers
options.enableHands(); // hand tracking (joints, pinch)
options.enableHandRays(); // visible rays from hands/controllers
options.enableGestures(); // pinch/fist/point/etc. (see input/gestures)
options.enableStrokes(); // $1 unistroke recognition
options.enableDepth(); // WebXR depth sensing + depth mesh
options.enablePlaneDetection(); // detected planes in xb.world
options.enableObjectDetection(); // object detection (also sets camera permission)
options.enableCamera('environment'); // passthrough device camera ('environment'|'user')
options.enableAI(); // Gemini/OpenAI via xb.ai
options.enableXRTransitions(); // fade transitions
options.enableVR(); // immersive-vr instead of immersive-ar
xb.init(options);
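One way to keep feature selection in one place is a small helper that receives an Options instance and calls only the enable* methods listed above. This is a sketch, not SDK code: the helper name configureOptions and its flags object are hypothetical, and it assumes each enable* method returns the Options object, as "chainable" implies.

```javascript
// Hypothetical helper, not an SDK API: applies a feature set to an
// xb.Options-like object using only enable* methods listed above.
function configureOptions(
  options,
  {hands = false, depth = false, ai = false, vr = false} = {}
) {
  options.enableUI();
  if (hands) {
    // Assumes enable* methods are chainable (each returns the options object).
    options.enableHands().enableHandRays();
  }
  if (depth) options.enableDepth();
  if (ai) options.enableAI();
  if (vr) options.enableVR();
  return options;
}

// In an app:
//   xb.init(configureOptions(new xb.Options(), {hands: true, ai: true}));
```

Passing the Options object in, rather than constructing it inside the helper, keeps the helper free of SDK imports so it can be exercised with a stub.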
There is no enablePhysics() or enableLighting() — these are configured directly:
import RAPIER from '@dimforge/rapier3d-simd-compat'; // physics engine
options.physics.RAPIER = RAPIER; // assigning RAPIER enables physics
// ...then implement initPhysics(physics) / physicsStep() in your Script.
options.formFactor = 'desktop' autostarts the simulator; ?formFactor=desktop in the
URL does the same. options.catchScriptExceptions (default true) keeps one buggy
script from crashing the app.
Script lifecycle hooks
Override only what you need (all verified against core/Script.ts and core/User.ts):
| Hook | When |
|---|---|
| init(deps?) | once, after registration; may return a Promise; receives injected deps |
| update(time?, frame?) | every frame |
| initPhysics(physics) / physicsStep() | physics setup / per physics step |
| onSelectStart(e) / onSelectEnd(e) | pinch (XR) or click (desktop) |
| onSqueezeStart(e) / onSqueezeEnd(e) | grip button |
| onKeyDown(e) / onKeyUp(e) | keyboard (e.code) |
| onXRSessionStarted(session?) / onXRSessionEnded() | entering/leaving XR |
| onSimulatorStarted() | desktop simulator booted |
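
Per-frame logic in update() is usually scaled by the frame's delta time so that motion does not depend on frame rate. A minimal sketch: stepAngle is a hypothetical helper, and the usage comment assumes xb.getDeltaTime() returns seconds, which this guide does not confirm.

```javascript
// Hypothetical helper: advance an angle at a fixed angular speed,
// wrapping into [0, 2π) so the value never grows without bound.
function stepAngle(angle, deltaSeconds, radiansPerSecond) {
  const fullTurn = 2 * Math.PI;
  const next = (angle + radiansPerSecond * deltaSeconds) % fullTurn;
  return next < 0 ? next + fullTurn : next;
}

// Inside an xb.Script subclass (assumes xb.getDeltaTime() is in seconds):
//   update() {
//     const y = this.player.rotation.y;
//     this.player.rotation.y = stepAngle(y, xb.getDeltaTime(), Math.PI / 2);
//   }
```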
Object-targeted hooks fire on the Script whose subtree was hit. Return true to mark
the event handled and stop it propagating to ancestors:
onObjectSelectStart/End, onObjectTouchStart/Touching/End,
onObjectGrabStart/Grabbing/End, onHoverEnter/Hovering/Exit.
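
The "return true to stop propagation" rule can be illustrated with a toy dispatcher that walks from the hit node up through its ancestors. This only models the behavior described above: ToyNode and dispatchUp are hypothetical names, and nothing here reflects how the SDK dispatches events internally.

```javascript
// Toy model of handled-event propagation; not SDK code.
class ToyNode {
  constructor(name, parent = null, handler = null) {
    this.name = name;
    this.parent = parent;
    this.handler = handler; // optional: (event) => boolean | undefined
  }
}

// Calls handlers from the hit node upward and stops at the first one that
// returns true. Returns the names of the nodes whose handlers ran.
function dispatchUp(hitNode, event) {
  const visited = [];
  for (let node = hitNode; node !== null; node = node.parent) {
    if (!node.handler) continue;
    visited.push(node.name);
    if (node.handler(event) === true) break;
  }
  return visited;
}

const root = new ToyNode('root', null, () => {});
const panel = new ToyNode('panel', root, () => true); // marks the event handled
const button = new ToyNode('button', panel, () => {}); // does not handle it

// The button and panel handlers run; root never sees the event.
console.log(dispatchUp(button, {type: 'select'}));
```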
Talking to the user, world, and AI
// User (xb.user — see core/User.ts)
xb.user.height;
xb.user.objectDistance;
xb.user.panelDistance;
xb.user.handedness;
xb.user.isSelecting(); // any controller pinching/clicking? (id optional)
xb.user.isSelectingAt(object); // is the user selecting this object/subtree?
xb.user.isPointingAt(object); // hover test
xb.user.getReticleTarget(0); // object under controller 0's reticle
xb.user.hands; // hand joints when enableHands()
// AI (xb.ai — see ai/AI.ts). Requires a key (?key=... or keys.json) — guard it.
if (xb.ai.isAvailable()) {
  const res = await xb.ai.query({prompt: 'Write a haiku about dust.'});
  // multimodal: xb.ai.query({type: 'multiPart', parts: [{text}, {inlineData:{data,mimeType}}]})
  // res.text holds the answer; xb.ai.startLiveSession(config) for Gemini Live.
}
// World (xb.world) — detected planes / objects / meshes after enabling detection.
// Depth (xb.depth), Sound (xb.sound: spatial audio, speech recog/synth).
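
The guard-then-query pattern can be wrapped so that callers always get a string back. A sketch under stated assumptions: askOrFallback is a hypothetical helper, it takes the AI facade as a parameter (pass xb.ai in an app), and it relies only on isAvailable(), query({prompt}), and res.text as shown above. Treating a thrown error or an empty response as "unavailable" is a choice made for this example, not documented SDK behavior.

```javascript
// Hypothetical helper: returns the model's text, or `fallback` when the AI
// facade is unavailable, the query throws, or no text comes back.
async function askOrFallback(ai, prompt, fallback = '') {
  if (!ai || !ai.isAvailable()) return fallback;
  try {
    const res = await ai.query({prompt});
    const hasText = res && typeof res.text === 'string' && res.text.length > 0;
    return hasText ? res.text : fallback;
  } catch (err) {
    console.warn('AI query failed; using fallback.', err);
    return fallback;
  }
}

// In a Script's async init():
//   const line = await askOrFallback(xb.ai, 'Write a haiku about dust.', '(AI unavailable)');
```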
Spatial UI (core)
The core UI is a declarative grid built from xb.SpatialPanel (see
ui/layouts/SpatialPanel.ts and templates/1_ui):
const panel = new xb.SpatialPanel({
  backgroundColor: '#2b2b2baa',
  width: 2.5,
  height: 1.5,
});
panel.position.set(0, xb.user.height, -xb.user.panelDistance);
this.add(panel);

const grid = panel.addGrid();
grid
  .addRow({weight: 0.7})
  .addText({text: 'Hello XR', fontColor: '#fff', fontSize: 0.08});
const button = grid
  .addRow({weight: 0.3})
  .addCol({weight: 1})
  .addIconButton({text: 'check_circle', fontSize: 0.5}); // `text` = Material icon name
button.onTriggered = () => console.log('clicked/pinched/touched'); // unified select
onTriggered unifies click / pinch / touch on buttons. For flexbox-rich cards,
gradients, and shadows, use the uiblocks addon instead — see
addons/uiblocks/SKILL.md and "two UI systems" below.
Directory map (read deeper on demand)
| Path | What lives there |
|---|---|
| core/ | Core singleton, Script, Options, User, DI Registry, XRButton, WebXR session mgmt |
| input/ | controllers, hands, gaze, mouse, gamepad; gestures/; strokes/ |
| world/ | World + planes/, mesh/, objects/ (Gemini & MediaPipe backends), sounds/ |
| depth/ | depth sensing, depth mesh, occlusion/ shaders & passes |
| ai/ | AI facade over Gemini + OpenAI (query / live / image gen) |
| agent/ | agent framework: tools, memory, context (WIP; see agent/README.md) |
| ui/ | core spatial UI: SpatialPanel, Grid/Row/Col, views, ModelViewer, Reticle |
| ux/ | DragManager, reusable interaction behaviors |
| simulator/ | desktop XR simulator (virtual user/hands/depth/planes, control modes) |
| sound/ | spatial audio, speech recognizer/synthesizer (see sound/README.md) |
| physics/ | Rapier3D integration |
| lighting/, camera/, video/, stereo/ | light estimation, device camera, video streams, stereo utils |
| utils/ | ModelLoader, dependency injection, helpers |
| addons/ | opt-in modules, each often with its own README/skills: uiblocks, netblocks, glasses, volumes, virtualkeyboard, simulator UI, ... |
Two UI systems — pick deliberately
- Core UI (xb.SpatialPanel + .addGrid() / .addRow() / .addCol() / .addText() / .add*Button()): lightweight, no extra deps, good for HUDs, menus, and quick panels.
- uiblocks addon (UICard, UIPanel, UIText, UIImage, UIIcon): full flexbox layout (@pmndrs/uikit), gradients, strokes, drop/inner shadows, and spatial behaviors. Import from xrblocks/addons/uiblocks/src and call options.uikit.enable(uikit). See addons/uiblocks/SKILL.md.
Don't mix the two on the same panel, and don't import UIPanel/UICard from xrblocks
core — they only exist in the uiblocks addon.
Common hallucinated-API mistakes to avoid
| ❌ Don't | ✅ Do |
|---|---|
| options.enablePhysics() | options.physics.RAPIER = RAPIER + implement initPhysics() |
| Use xb.core.renderer / xb.core.physics in a constructor | They're created during xb.init(); use them in/after init() |
| new xb.UIPanel(...) / new xb.UICard(...) | Those are the uiblocks addon; core uses xb.SpatialPanel().addGrid() |
| xb.ai.query('text') (bare string) | xb.ai.query({prompt: 'text'}), and guard with xb.ai.isAvailable() |
| Assume AI works with no key | Provide ?key=... or keys.json; handle the unavailable case |
| rgba()/hsla() colors in UI | hex strings ('#ffffff') or THREE.Color |
| Drive your own requestAnimationFrame loop | Put per-frame logic in update(time, frame) |
| Forget xb.add(script) before xb.init() | Register every Script first |
| Import bare three without the pinned importmap | Use the importmap from the README / a template |
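
When a design spec gives colors as rgba(), they can be converted to hex strings before being passed to the UI. A small sketch: rgbaToHex is a hypothetical helper, and the 8-digit #rrggbbaa output assumes alpha is accepted in that form, as the '#2b2b2baa' panel background earlier in this guide suggests.

```javascript
// Hypothetical helper: converts 0-255 channels plus a 0-1 alpha into a hex
// string. Emits '#rrggbb' when alpha is 1 (or omitted), '#rrggbbaa' otherwise.
function rgbaToHex(r, g, b, a = 1) {
  const toByte = (v) => Math.min(255, Math.max(0, Math.round(v)));
  const hex = (v) => toByte(v).toString(16).padStart(2, '0');
  const rgb = `#${hex(r)}${hex(g)}${hex(b)}`;
  return a >= 1 ? rgb : rgb + hex(a * 255);
}

console.log(rgbaToHex(255, 255, 255)); // #ffffff
console.log(rgbaToHex(43, 43, 43, 2 / 3)); // #2b2b2baa
```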
Design principles (honor these when contributing)
- Simplicity & readability: a Script should read like a high-level description of the experience. Simple things stay simple; complex logic stays explicit.
- Creator experience first: absorb incidental complexity (sensor fusion, AI, cross-platform input) behind ready-to-use primitives.
- Pragmatism over completeness: "worse is better" (small, modular, adaptable).
- Legible to AI: favor high-level, semantic, hard-to-misuse APIs; consistent naming; export through src/xrblocks.ts. The SDK is meant to ground LLM code generation.
Contributing conventions
- TypeScript throughout; new public symbols must be re-exported from src/xrblocks.ts.
- Tests are colocated *.test.ts (Vitest): npm test.
- npm run lint (ESLint) and npm run format (Prettier) before a PR.
- Local dev: npm run dev (Rollup watch + http-server on :8080); npm run serve to just serve. Build: npm run build. Addons build separately into build/addons/*.