# How Yugma works

## Architecture
### The pipeline

```
User prompt
  → referenceResolver ("that" / "the red sphere" → IDs)
  → aiSerializer (YSL scene context, ~45 tokens/object)
  → aiCompose (Cloud Function, the brain)
      ├─ system prompt (coordinates, scale, materials, design principles)
      ├─ 19 tool schemas
      └─ agentic loop (max 3 iterations)
  → TOOL_DISPATCH → Zustand stores → Three.js rerenders
  → collab layer broadcasts deltas to peers
```
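Below is a minimal sketch of the TOOL_DISPATCH step. All names here (`useSceneStore`, `add_object`, the action signatures) are illustrative assumptions, not Yugma's actual identifiers: each parsed tool call is matched by name and applied as a Zustand action, and R3F components subscribed to the store rerender.

```ts
import { create } from "zustand";

type SceneObject = { id: string; [key: string]: unknown };

interface SceneState {
  objects: Record<string, SceneObject>;
  addObject: (obj: SceneObject) => void;
  removeObject: (id: string) => void;
}

// Hypothetical store; Yugma's real stores will differ.
const useSceneStore = create<SceneState>((set) => ({
  objects: {},
  addObject: (obj) =>
    set((s) => ({ objects: { ...s.objects, [obj.id]: obj } })),
  removeObject: (id) =>
    set((s) => {
      const { [id]: _removed, ...rest } = s.objects;
      return { objects: rest };
    }),
}));

interface ToolCall {
  name: string;                  // e.g. "add_object"
  args: Record<string, unknown>; // JSON arguments emitted by the model
}

// Each streamed tool call becomes a typed store mutation; R3F
// components subscribed to `objects` rerender automatically.
export function dispatchToolCall(call: ToolCall): void {
  const store = useSceneStore.getState();
  switch (call.name) {
    case "add_object":
      store.addObject(call.args as SceneObject);
      break;
    case "remove_object":
      store.removeObject(call.args.id as string);
      break;
    default:
      console.warn(`unhandled tool: ${call.name}`);
  }
}
```

Routing everything through store actions keeps the model one layer away from Three.js: it can only mutate typed state, never emit code that touches the renderer.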
### The 19 tools

- Objects: add, update, remove, duplicate, animate
- Scene: set environment, clear scene, focus camera
- Materials: apply material presets
- Organization: align, distribute, group, tag, search-select
- Generation: Sketchfab import, layout generation, preview / commit
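For a sense of what the model receives, here is a hedged sketch of what one tool schema could look like in OpenAI-style function-calling format; the field names and enum values are assumptions, not Yugma's actual schema.

```ts
// Hypothetical schema for a single tool, as it would be sent to the LLM.
const addObjectTool = {
  name: "add_object",
  description: "Add a primitive object to the scene",
  parameters: {
    type: "object",
    properties: {
      name: { type: "string", description: "Human-readable label" },
      type: { type: "string", enum: ["box", "sphere", "cylinder", "plane"] },
      position: {
        type: "array",
        items: { type: "number" },
        minItems: 3,
        maxItems: 3,
        description: "World-space [x, y, z]",
      },
      material: { type: "string", description: "Material preset name" },
    },
    required: ["name", "type", "position"],
  },
} as const;
```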
### The scene graph

Every object is `{ id, name, type, transform, material, geometry, tags, parentId, ... }`. Stored in a Zustand store as the source of truth. Rendered by R3F. Serialized to GLB / USDZ / PNG for export.
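A TypeScript sketch of the node shape implied above; exact field types, the material/geometry internals, and anything behind the `...` are assumptions.

```ts
interface Transform {
  position: [number, number, number];
  rotation: [number, number, number]; // Euler angles, radians (assumed)
  scale: [number, number, number];
}

interface SceneObject {
  id: string;              // stable ID, referenced by later prompts
  name: string;            // human-readable label ("red sphere")
  type: string;            // primitive or imported-model type
  transform: Transform;
  material: { preset?: string; color?: string };
  geometry: Record<string, number>; // e.g. { radius: 0.5 }
  tags: string[];
  parentId: string | null; // grouping via parent links
}
```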
### Why this design

- Speed. Cerebras streams the first tool call ~3-5 s after the prompt is sent; you see objects appear before the AI finishes.
- Reliability. Typed schemas mean the AI can't emit malformed code that breaks rendering.
- Editability. Every object has a stable ID, so subsequent prompts can reference and mutate it (see the sketch after this list).
- Exportability. The graph serializes cleanly to whatever format you need.
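As a concrete, hypothetical example of that editability: after referenceResolver maps "the red sphere" to its ID, a follow-up prompt becomes an update call against that ID. Tool and field names here are illustrative.

```ts
// Follow-up prompt: "make the red sphere twice as big"
const followUp = {
  name: "update_object",
  args: {
    id: "obj_a1b2c3",                // stable ID from the scene graph
    transform: { scale: [2, 2, 2] }, // only the changed fields
  },
};
```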
## FAQ
### How fast is the first tool call?

Streaming SSE delivers the first tool call about 3-5 seconds in; a full scene typically takes 10-30 seconds, depending on object count.
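A minimal client-side sketch of consuming that stream, assuming a hypothetical `/aiCompose` endpoint that emits one JSON tool call per SSE `data:` line; the real wire format may differ. `dispatchToolCall` is the sketch from the pipeline section.

```ts
async function streamCompose(prompt: string): Promise<void> {
  const res = await fetch("/aiCompose", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // SSE events are separated by blank lines; handle complete ones.
    let sep;
    while ((sep = buffer.indexOf("\n\n")) !== -1) {
      const event = buffer.slice(0, sep);
      buffer = buffer.slice(sep + 2);
      if (event.startsWith("data: ")) {
        dispatchToolCall(JSON.parse(event.slice(6))); // render as it arrives
      }
    }
  }
}
```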
### How does the AI know spatial relationships?
A spatial pre-processor handles obvious patterns (circle of N, grid, stack, scatter) by computing exact positions. The LLM owns the rest, guided by scale references in the system prompt.
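As an illustration, here is the kind of exact-position math the pre-processor performs for "a circle of N objects"; the function name and the Y-up, XZ-plane conventions are assumptions.

```ts
function circlePositions(n: number, radius: number): [number, number, number][] {
  return Array.from({ length: n }, (_, i): [number, number, number] => {
    const theta = (2 * Math.PI * i) / n; // evenly spaced angles
    return [radius * Math.cos(theta), 0, radius * Math.sin(theta)];
  });
}

// circlePositions(6, 3) → six points on a radius-3 circle, ready to
// be written into add_object calls verbatim.
```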
### Can I see a demo?
Open Yugma Studio; type a sentence; watch the scene build.