
How Yugma works

Architecture

The pipeline

User prompt
  → referenceResolver       ("that"/"the red sphere" → IDs)
  → aiSerializer            (YSL scene context, ~45 tok/object)
  → aiCompose (Cloud Function, the brain)
      ├─ System prompt (coords, scale, materials, design principles)
      ├─ 19 tool schemas
      └─ Agentic loop (max 3 iters)
  → TOOL_DISPATCH → Zustand stores → Three.js rerenders
  → Collab layer broadcasts deltas to peers
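
The tail of the pipeline can be sketched in TypeScript. This is a minimal illustration, not the actual implementation: the store shape, tool names, and TOOL_DISPATCH entries are assumptions; only Zustand's create API is real.

import { create } from "zustand";

// Minimal stand-in for the scene store (see "The scene graph" below).
interface SceneState {
  objects: Record<string, { id: string; [k: string]: unknown }>;
  upsert: (obj: { id: string }) => void;
  remove: (id: string) => void;
}

const useSceneStore = create<SceneState>((set) => ({
  objects: {},
  upsert: (obj) => set((s) => ({ objects: { ...s.objects, [obj.id]: obj } })),
  remove: (id) =>
    set((s) => {
      const { [id]: _removed, ...rest } = s.objects;
      return { objects: rest };
    }),
}));

type ToolCall = { name: string; args: any };

// Illustrative dispatch table: each streamed tool call mutates the store,
// R3F rerenders from the store, and the same delta is relayed to peers.
const TOOL_DISPATCH: Record<string, (args: any) => void> = {
  add_object: (args) => useSceneStore.getState().upsert(args),
  remove_object: (args) => useSceneStore.getState().remove(args.id),
  // ...the remaining tools
};

function applyToolCall(call: ToolCall, broadcast: (delta: ToolCall) => void) {
  TOOL_DISPATCH[call.name]?.(call.args);
  broadcast(call); // collab layer sends the delta to peers
}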

The 19 tools

Add, update, remove, duplicate, and animate objects. Set the environment, clear the scene, focus the camera. Apply material presets. Align, distribute, group, tag, and search-select. Plus Sketchfab, layout generation, and preview / commit.
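
For a sense of what the model sees, here is a plausible shape for one of the tool schemas. The name, fields, and enum values are illustrative guesses, not Yugma's actual schema.

// Illustrative tool schema as it might be sent to the LLM.
const addObjectTool = {
  name: "add_object",
  description: "Add a primitive or model to the scene.",
  parameters: {
    type: "object",
    properties: {
      name: { type: "string" },
      type: { type: "string", enum: ["box", "sphere", "cylinder", "model"] },
      position: { type: "array", items: { type: "number" }, minItems: 3, maxItems: 3 },
      material: { type: "string", description: "Material preset name" },
    },
    required: ["name", "type", "position"],
  },
} as const;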

The scene graph

Every object is { id, name, type, transform, material, geometry, tags, parentId, ... }, stored in a Zustand store as the single source of truth, rendered by R3F, and serialized to GLB / USDZ / PNG for export.
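
A plausible TypeScript typing of that record, with a sample instance. The exact field shapes (tuple transforms, preset strings, the example values) are assumptions; only the field names come from the description above.

interface SceneObject {
  id: string;
  name: string;
  type: "box" | "sphere" | "cylinder" | "plane" | "model";
  transform: {
    position: [number, number, number];
    rotation: [number, number, number]; // Euler angles, radians
    scale: [number, number, number];
  };
  material: { preset?: string; color?: string };
  geometry?: Record<string, number>; // e.g. { radius: 1 } for a sphere
  tags: string[];
  parentId: string | null; // grouping via parent references, not nesting
}

// Hypothetical record as it might sit in the store:
const redSphere: SceneObject = {
  id: "obj_42",
  name: "red sphere",
  type: "sphere",
  transform: { position: [0, 1, 0], rotation: [0, 0, 0], scale: [1, 1, 1] },
  material: { preset: "glossy", color: "#e74c3c" },
  geometry: { radius: 1 },
  tags: ["hero"],
  parentId: null,
};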

Why this design

FAQ

How fast is the first tool call?

Streaming over SSE delivers the first tool call roughly 3-5 seconds in; the full scene typically lands in 10-30 seconds, depending on object count.
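
Incremental application is what makes this feel fast: each tool call is dispatched the moment its SSE frame arrives. A sketch of the client side, assuming a hypothetical /aiCompose endpoint that emits one JSON tool call per data: frame (endpoint and frame format are assumptions; the fetch / TextDecoder stream reading is standard browser API).

async function streamCompose(prompt: string, onToolCall: (call: unknown) => void) {
  const res = await fetch("/aiCompose", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // SSE frames are separated by a blank line; parse each complete frame.
    let idx;
    while ((idx = buffer.indexOf("\n\n")) !== -1) {
      const frame = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 2);
      const data = frame
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).trim())
        .join("\n");
      if (data) onToolCall(JSON.parse(data)); // apply now, don't wait for the full scene
    }
  }
}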

How does the AI know spatial relationships?

A spatial pre-processor handles obvious patterns (circle of N, grid, stack, scatter) by computing exact positions. The LLM owns the rest, guided by scale references in the system prompt.
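
As an example of what "computing exact positions" means, here is a sketch of the circle-of-N case, so the model never does trigonometry itself. The function name and Y-up / XZ-plane convention are assumptions (though Y-up matches Three.js defaults).

function circlePositions(n: number, radius: number, y = 0): [number, number, number][] {
  return Array.from({ length: n }, (_, i): [number, number, number] => {
    const angle = (2 * Math.PI * i) / n; // evenly spaced around the ring
    return [radius * Math.cos(angle), y, radius * Math.sin(angle)];
  });
}

// circlePositions(6, 4) → six points evenly spaced on a radius-4 ring in the XZ plane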

Can I see a demo?

Open Yugma Studio; type a sentence; watch the scene build.