Vision

Vision is the layer where you think through the product idea. It is a living scratchpad for goals, constraints, non-goals, audience, competitive context, open questions, and freeform notes. Vision is never finished. It evolves as the project develops, and changes ripple forward into every downstream layer.

What Vision captures

Vision has seven categories, each a list of short written items.

Goals. What the product should achieve for its users. Outcomes, not features.
Constraints. What limits the design. Time, budget, tech stack, organizational realities, accessibility requirements, regulatory rules.
Non-goals. What the product explicitly will not do. The things you have decided to leave out so that the goals stay sharp.
Audience. Who the product is for and their context.
Competitive context. What exists in this space and how this product differs.
Open questions. Unresolved decisions. The things the designer wants to return to.
Scratchpad. Freeform thinking. The stuff that doesn't fit anywhere else yet.

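Each item has a short string of content, a status (proposed or confirmed), and a source (ai or manual). Status records whether you have accepted the item: proposed items are suggestions Claude Code made during a conversation and that still await your decision, and confirmed items are the ones you accepted. Source records who wrote the item: ai for items Claude Code proposed, manual for items you wrote or pinned yourself.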

The canvas

Vision is the one layer where the terminal is the primary surface and the canvas is secondary.

The embedded Claude Code terminal takes the dominant width. You type into it. Claude Code responds. The conversation is where the thinking happens.

Beside the terminal, a sidebar accumulates the items worth keeping, grouped under the seven categories. Items appear in the sidebar in real time as Claude Code proposes them and as you pin text from the terminal. Proposed items show with a distinct visual state until you confirm or dismiss them.

Two capture paths feed the sidebar:

Claude Code proposes. During the /vision conversation, the skill identifies candidate goals, constraints, audience notes, and open questions, and writes them to design/vision.json with status proposed. The sidebar shows them, and you confirm or dismiss each one.
You pin terminal text. Select any text in the terminal output, right-click, and assign it to a category. The selection is written to the sidebar as a manual item with status confirmed.
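
The two capture paths can be sketched in Python as follows. This is a minimal illustration, not Denote's implementation. The function names (propose, pin, confirm, dismiss), the UUID ids, and the ISO 8601 timestamps are assumptions, as are the choices to keep source as ai when you confirm a proposal and to remove dismissed items outright. Only the category keys, the item fields, and the status and source values come from this page.

```python
import uuid
from datetime import datetime, timezone

# The seven category keys used in design/vision.json.
CATEGORIES = ("goals", "constraints", "nonGoals", "audience",
              "competitiveContext", "openQuestions", "scratchpad")


def empty_vision():
    """Return a vision document with all seven categories present and empty."""
    return {name: [] for name in CATEGORIES}


def _new_item(content, status, source):
    # Assumption: ids are UUID strings and createdAt is an ISO 8601 UTC timestamp.
    return {
        "id": str(uuid.uuid4()),
        "content": content,
        "status": status,
        "source": source,
        "createdAt": datetime.now(timezone.utc).isoformat(),
    }


def propose(vision, category, content):
    """Capture path 1: the skill suggests an item that waits for your decision."""
    item = _new_item(content, status="proposed", source="ai")
    vision[category].append(item)
    return item


def pin(vision, category, selected_text):
    """Capture path 2: pinned terminal text is stored as manual and confirmed."""
    item = _new_item(selected_text.strip(), status="confirmed", source="manual")
    vision[category].append(item)
    return item


def confirm(vision, category, item_id):
    """Accept a proposed item. Only the status changes (assumed behavior)."""
    for item in vision[category]:
        if item["id"] == item_id:
            item["status"] = "confirmed"
            return item
    raise KeyError(item_id)


def dismiss(vision, category, item_id):
    """Reject an item by removing it from its category (assumed behavior)."""
    vision[category] = [i for i in vision[category] if i["id"] != item_id]
```

In this sketch, an item Claude Code proposed and you confirmed ends up with status confirmed and source ai, while a pinned selection is confirmed and manual from the start.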

You can also edit any item inline from the sidebar, drag items between categories, or delete items you no longer need.

The file written

Vision writes one file: design/vision.json. The schema has a top-level object with seven array properties (goals, constraints, nonGoals, audience, competitiveContext, openQuestions, scratchpad). Each item is { id, content, status, source, createdAt }.

The file is small, human-readable, and intentionally flat. You can edit it in a code editor and Denote will pick up the change.
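
Because the file can be edited by hand, a quick structural check after an edit may be useful. The Python sketch below checks only what this page states: seven array properties, five fields per item, two status values, and two source values. The function name check_vision is hypothetical, and Denote's own validation, if it has any, may differ.

```python
import json

CATEGORIES = ("goals", "constraints", "nonGoals", "audience",
              "competitiveContext", "openQuestions", "scratchpad")
ITEM_KEYS = {"id", "content", "status", "source", "createdAt"}
STATUSES = ("proposed", "confirmed")
SOURCES = ("ai", "manual")


def check_vision(text):
    """Return a list of problems found in the text of a vision.json file."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg} at line {exc.lineno}"]
    if not isinstance(doc, dict):
        return ["top level must be an object"]

    problems = []
    for name in CATEGORIES:
        items = doc.get(name)
        if not isinstance(items, list):
            problems.append(f"{name}: expected an array")
            continue
        for index, item in enumerate(items):
            where = f"{name}[{index}]"
            if not isinstance(item, dict):
                problems.append(f"{where}: expected an object")
                continue
            missing = ITEM_KEYS - item.keys()
            if missing:
                problems.append(f"{where}: missing {sorted(missing)}")
            # A missing status or source is also reported by the checks below.
            if item.get("status") not in STATUSES:
                problems.append(f"{where}: status must be proposed or confirmed")
            if item.get("source") not in SOURCES:
                problems.append(f"{where}: source must be ai or manual")
    return problems
```

An empty list means the file matches the shape described above. Calling check_vision(open("design/vision.json").read()) after a manual edit would list any structural slips, such as a misspelled status.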

The skill

/vision is the conversational skill that populates the Vision layer.

Reads:

design/vision.json (if it exists; the skill merges new items into it rather than replacing it)
Any starting context you pass as arguments

Writes:

design/vision.json

Behavior:

Treats the first run as a kickoff conversation and later runs as refinement sessions.
Proposes items progressively during the conversation rather than waiting for a complete picture.
Asks follow-up questions to push on ambiguity before committing items.
Writes items with status: proposed by default so you stay in control of what reaches the confirmed set.

You can run /vision multiple times. Each run picks up where the last left off.
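
One way to picture "merges, does not replace" is the Python sketch below. This page does not specify the exact merge rules, so the behavior here is an assumption: every existing item is kept untouched, including its status, and a new proposal is appended only if no item in the same category already has the same content, ignoring case and surrounding whitespace. The function name merge_proposals is hypothetical.

```python
CATEGORIES = ("goals", "constraints", "nonGoals", "audience",
              "competitiveContext", "openQuestions", "scratchpad")


def merge_proposals(existing, proposals):
    """Return a document holding every existing item plus the new proposals.

    The category lists are new lists; the item dicts themselves are shared
    with the inputs rather than copied.
    """
    merged = {}
    for name in CATEGORIES:
        kept = list(existing.get(name, []))
        seen = {item["content"].strip().lower() for item in kept}
        for item in proposals.get(name, []):
            key = item["content"].strip().lower()
            if key in seen:
                continue  # already captured, so skip the duplicate
            seen.add(key)
            kept.append(item)
        merged[name] = kept
    return merged
```

Under these assumptions, a goal you confirmed in the first session stays confirmed in later sessions, and a refinement run that repeats the same goal does not add a second copy.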

Why Vision is different

Vision is divergent and exploratory. The other five layers are convergent and structural. Extract organizes what is already known into entities. Map commits governance. Structure commits a page inventory. Each downstream layer narrows the decision space.

Vision is the one place where the decision space is still wide open. It is where you refine the idea itself.

This matters because a structural pipeline needs a thinking step before the structuring starts. If you skip straight to Extract without a Vision, you will commit to entities based on a half-formed idea, and you will find yourself rewriting them later. Vision is cheap to edit. Entities committed too early are not.

The recipe example

A Vision for the recipe portion calculator might look like this after a short /vision session.

Goals

Home cook scales any recipe up or down by serving count, on a phone, in under thirty seconds.
Saved scaling ratios per recipe so the cook does not recalculate favorites each time.

Constraints

Must work on a phone held with one hand while cooking.
No account or login. Local storage only.

Non-goals

Not a recipe storage or recipe-discovery product. The cook brings the recipe.
No meal planning, shopping lists, or nutritional calculation.

Audience

Home cooks cooking for family sizes that do not match the original recipe.
Comfortable with phones but not interested in configuring anything.

Competitive context

Existing recipe apps either ignore scaling or bury it in a menu. No focused scaling tool.

Open questions

Should the app handle ingredient unit conversions (cups to grams) or stay portion-only?
How are fractional results rounded so the cook does not need to eyeball "0.34 teaspoons"?

This is enough for /extract to find the objects (Recipe, Ingredient, Serving, ScalingRatio) and for /map to reason about pattern assignments.
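
To connect the example to the file format, the Python sketch below turns the lists above into the shape of design/vision.json. The helper name to_vision_json, the id scheme (category name plus a counter), the timestamp, and the choice to mark every item confirmed with source ai are placeholders for illustration only. A real session would produce its own ids, timestamps, and mix of statuses.

```python
import json

# The recipe example from this page, keyed by the seven vision.json properties.
RECIPE_VISION = {
    "goals": [
        "Home cook scales any recipe up or down by serving count, on a phone, in under thirty seconds.",
        "Saved scaling ratios per recipe so the cook does not recalculate favorites each time.",
    ],
    "constraints": [
        "Must work on a phone held with one hand while cooking.",
        "No account or login. Local storage only.",
    ],
    "nonGoals": [
        "Not a recipe storage or recipe-discovery product. The cook brings the recipe.",
        "No meal planning, shopping lists, or nutritional calculation.",
    ],
    "audience": [
        "Home cooks cooking for family sizes that do not match the original recipe.",
        "Comfortable with phones but not interested in configuring anything.",
    ],
    "competitiveContext": [
        "Existing recipe apps either ignore scaling or bury it in a menu. No focused scaling tool.",
    ],
    "openQuestions": [
        "Should the app handle ingredient unit conversions (cups to grams) or stay portion-only?",
        'How are fractional results rounded so the cook does not need to eyeball "0.34 teaspoons"?',
    ],
    "scratchpad": [],
}


def to_vision_json(notes, status, source, created_at):
    """Wrap plain strings as items and serialize the document as JSON text."""
    doc = {}
    for category, contents in notes.items():
        doc[category] = [
            {
                "id": f"{category}-{n}",  # placeholder id scheme
                "content": text,
                "status": status,
                "source": source,
                "createdAt": created_at,
            }
            for n, text in enumerate(contents, start=1)
        ]
    return json.dumps(doc, indent=2)


# Placeholder status, source, and timestamp, chosen only for the illustration.
print(to_vision_json(RECIPE_VISION, "confirmed", "ai", "2025-01-01T00:00:00Z"))
```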

What Vision is not for

Entity modeling. That is Extract. Vision captures the idea. Extract commits the objects.
Pattern decisions. That is Map. Vision does not say "this should be a card grid." It says "the cook needs to pick a recipe quickly."
Page inventory. That is Structure. Vision does not enumerate pages.
Copy or content. Vision is about direction, not wording.

The discipline is that Vision stays high-level. Short items. No implementation. No layout.

Where to next

Extract takes Vision as input and pulls out the objects, personas, and tasks.
The six layers overview for where Vision sits in the pipeline.
The two-app model for how the embedded terminal and the sidebar share the design/vision.json file.