Is rubr_flow a programming language?

No. It is pseudocode for instructions, not a runtime. There is no compiler and nothing to install: a person reads it, an agent follows it directly, and the artifact is still the instruction you paste into your tool.

Does this make my instructions longer for no reason?

The goal is not length; it is exposure. rubr_flow surfaces the decisions an agent would otherwise guess (inputs, boundaries, step order, and the finish line), so a short, loose request becomes a short, bounded procedure.

Which blocks are required?

TASK, INPUTS, RULES, FLOW, OUTPUT, and VERIFY carry the procedure. CONTEXT, TOOLS, STATE, and ON are optional and used only when the work needs durable facts, tool limits, working memory, or explicit failure handling.

How does Rubrkit use it?

Rubrkit grades whether a rubr_flow block is bounded enough to reuse: whether inputs are named, whether edits are constrained, and whether VERIFY defines a pass or fail condition. You get the marks, a stronger rewrite, and an eval check that proves the rewrite holds.

Can I paste a rubr_flow into any agent?

Yes. Because it is plain text with boring keywords and indentation, it drops into a prompt, a command, a skill, or an agent spec without special syntax. The format documents intent; the agent still does the work.

rubr_flow

A compact procedure format for instructions an agent can actually follow.

rubr_flow turns loose prompts, commands, skills, agent specs, and workflows into bounded, testable procedures with visible inputs, rules, steps, outputs, and a pass/fail finish line.

Specimen RBR-FLOW

Ready to audit

TASK "Ship safer agent instructions"
INPUTS
  artifact = prompt_or_agent_spec
RULES
  preserve user intent
  flag unsupported claims
FLOW
  CALL tool "Rubrkit audit" WITH artifact -> audit
  EDIT weak dimensions FROM audit -> revision
  WRITE eval checks FROM audit, revision -> tests
OUTPUT
  score: audit.score
  revision
  tests
VERIFY
  PASS WHEN tests can judge the result
  FAIL WHEN tests cannot judge the result

What it is

Pseudocode for instructions, not a runtime.

The artifact is still the instruction. A person can read it, an agent can follow it, and Rubrkit can grade whether the work is bounded enough to reuse.

Readable by default

Boring keywords, indentation, labels, and named outputs instead of clever syntax. A teammate can review it on first read.

Built for grading

Facts, constraints, actions, outputs, and verification sit in separate blocks, so a weak dimension is easy to mark.

Pasteable into agents

No compiler and no special runtime. The procedure is plain text meant to be followed directly by the model you already use.

Anatomy

Every block has one job.

rubr_flow works because it makes the hidden control surface visible: what the agent knows, what it may do, how it moves, and how success is checked. Six blocks carry the procedure; four are optional.

TASK

States the objective in one sentence so the agent and the reader agree on what done means.

TASK "Improve onboarding completion"

CONTEXT

Separates durable facts from the procedure so the agent stops re-deriving what it already knows.

CONTEXT user is new to [PRODUCT]

INPUTS

Names the files, data, and assumptions the work depends on instead of leaving them implicit.

INPUTS current_flow, [ANALYTICS], drop_off_point

RULES

Makes boundaries and preservation requirements visible so edits stay inside the lines.

RULES change only copy and step order

TOOLS (optional)

Declares which capabilities the agent may call, turning an open toolbox into a short list.

TOOLS READ [ANALYTICS]

STATE (optional)

Initializes the working memory the flow accumulates, so intermediate results have a home.

STATE friction_notes = []

FLOW

Lists the ordered work as labelled steps with branches and handoffs the agent follows in sequence.

FLOW REVIEW each screen -> friction_notes

ON (optional)

Handles the predictable failure: what to do when context is missing or a step cannot complete.

ON missing_context ASK user -> detail

OUTPUT

Defines the shape of the final artifact before the work starts, so the result is never a surprise.

OUTPUT changed_copy, rationale, risk_notes

VERIFY

Gives the agent a pass/fail finish line instead of asking it to decide when the work is good enough.

VERIFY PASS WHEN next action is obvious
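
The split between the six required blocks and the four optional ones can be checked mechanically. The sketch below is a hypothetical helper, not part of Rubrkit: the names declared_blocks and missing_required are assumptions, and it assumes each block keyword starts an unindented line, as in the specimens on this page.

```python
# Hypothetical helper, not a Rubrkit API: report which rubr_flow blocks a text declares.
REQUIRED = ("TASK", "INPUTS", "RULES", "FLOW", "OUTPUT", "VERIFY")
OPTIONAL = ("CONTEXT", "TOOLS", "STATE", "ON")


def declared_blocks(text: str) -> list[str]:
    """Return block keywords that start an unindented line, in order of appearance."""
    found = []
    for line in text.splitlines():
        if not line or line[0].isspace():
            continue  # blank or indented lines belong to the block above them
        keyword = line.split()[0]
        if keyword in REQUIRED + OPTIONAL and keyword not in found:
            found.append(keyword)
    return found


def missing_required(text: str) -> list[str]:
    """Return the required blocks that the text does not declare."""
    present = declared_blocks(text)
    return [name for name in REQUIRED if name not in present]


specimen = """TASK "Improve onboarding completion"
CONTEXT user is new to [PRODUCT]
INPUTS current_flow, [ANALYTICS], drop_off_point
RULES change only copy and step order
FLOW
  REVIEW each screen -> friction_notes
OUTPUT changed_copy, rationale, risk_notes
VERIFY PASS WHEN next action is obvious
"""

print(missing_required(specimen))  # prints []
# A loose request declares none of the required blocks, so all six are reported.
print(missing_required("Review our onboarding flow and fix anything confusing."))
```

A check like this only confirms that the blocks exist. Whether each block is bounded enough to reuse is still a grading question.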

How to write one

From loose request to bounded procedure in four passes.

You are not adding ceremony. Each pass names a decision the agent would otherwise have to guess.

State the task and pin the context

Write the objective as a single TASK line, then move every durable fact the agent should assume into CONTEXT so it is not buried in the steps.

Declare inputs and rules

List the files, data, and assumptions under INPUTS, then set the boundaries and preservation requirements under RULES so edits stay in scope.

Order the flow and handle failure

Break the work into labelled FLOW steps that each name their output, and add an ON block for the failure you can predict instead of hoping it will not happen.

Define output and a pass/fail check

Describe the final artifact under OUTPUT, then close with a VERIFY block whose PASS, STOP, or FAIL conditions can be scored without a judgement call.
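
The four passes can be sketched as a small function that assembles the text in the same order. This is an illustration under stated assumptions: write_rubr_flow and its parameters are hypothetical names, not a Rubrkit API, and the layout simply mirrors the specimens on this page.

```python
# Illustrative only: the function and parameter names are assumptions.
def write_rubr_flow(task, inputs, rules, flow, output, verify, context=None, on=None):
    """Assemble a rubr_flow text in the order of the four passes."""
    # Pass 3 asks every FLOW step to name its output, so reject steps that do not.
    for action, result in flow:
        if not result:
            raise ValueError(f"FLOW step {action!r} does not name its output")

    # Pass 1: state the task and pin the context.
    lines = [f'TASK "{task}"']
    if context:
        lines.append(f"CONTEXT {context}")

    # Pass 2: declare inputs and rules.
    lines.append("INPUTS " + ", ".join(inputs))
    lines.append("RULES")
    lines += [f"  {rule}" for rule in rules]

    # Pass 3: order the flow and handle the predictable failure.
    lines.append("FLOW")
    lines += [f"  {action} -> {result}" for action, result in flow]
    if on:
        lines.append(f"ON {on}")

    # Pass 4: define the output and a pass/fail check.
    lines.append("OUTPUT " + ", ".join(output))
    lines.append("VERIFY")
    lines += [f"  {condition}" for condition in verify]
    return "\n".join(lines)


text = write_rubr_flow(
    task="Improve onboarding completion",
    context="user is new to [PRODUCT]",
    inputs=["current_flow", "[ANALYTICS]", "drop_off_point"],
    rules=["change only copy and step order"],
    flow=[("REVIEW each screen", "friction_notes")],
    on="missing_context ASK user -> detail",
    output=["changed_copy", "rationale", "risk_notes"],
    verify=["PASS WHEN next action is obvious"],
)
print(text)
```

The only behavior beyond formatting is the early ValueError: a FLOW step without a named output is rejected before any text is written.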

Worked specimens

Loose request in. Bounded procedure out.

The same grading loop from the rest of Rubrkit: mark the weak dimensions, rewrite as rubr_flow, then define the eval check that proves the rewrite holds.

Onboarding improvement

Before · 52/100

Review our onboarding flow and fix anything confusing.

No inputs

Unbounded edits

No finish line

After · 91/100

TASK "Improve onboarding completion"
CONTEXT user is new to [PRODUCT]
INPUTS current_flow, [ANALYTICS], drop_off_point
RULES change only copy and step order
FLOW
  REVIEW each screen -> friction_notes
  RANK issues by user impact -> ranked
  EDIT the highest-impact issue -> change
OUTPUT changed_copy, rationale, risk_notes
VERIFY PASS WHEN the next action is obvious in one pass

Why it improved

Inputs are named, edits are bounded to copy and order, and VERIFY gives the agent a finish line it can score.

Sample eval check

Passes if a first-time user can identify the next action in one pass and nothing outside copy and step order changed, legal text included.

Coding-agent repair loop

Before · 48/100

Fix the failing tests and clean up anything related.

Scope creep

No root cause

No retry limit

After · 90/100

TASK "Repair failing checkout tests"
INPUTS failing_command = [TEST CMD], changed_files = git diff
RULES edit only checkout code and its focused tests
FLOW
  RUN [TEST CMD] -> result
  DECIDE root_cause FROM result, changed_files
  EDIT minimal patch -> patch
  RUN [TEST CMD] -> verification
OUTPUT root_cause, patch, verification
VERIFY
  PASS WHEN verification.status == "passed"
  FAIL WHEN the same failure repeats 3 times

Why it improved

Scope is pinned to checkout, a root cause is required before editing, and a retry ceiling stops an endless loop.

Sample eval check

Passes if the patch touches only checkout files and the suite reports passed within three attempts.
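
An eval check like this one is concrete enough to sketch in code. The version below is a minimal illustration, not Rubrkit's evaluator: the Attempt record, the eval_repair function, and the "checkout/" path prefix are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One repair attempt: the files its patch touched and the suite result."""
    touched_files: list[str]
    status: str  # "passed" or "failed"


def eval_repair(attempts, allowed_prefix="checkout/", max_attempts=3):
    """Pass only if the suite passes within max_attempts and every patch stays in scope."""
    for attempt in attempts[:max_attempts]:
        if not all(path.startswith(allowed_prefix) for path in attempt.touched_files):
            return False  # scope creep fails the check even if the tests pass
        if attempt.status == "passed":
            return True
    return False  # retry ceiling reached without a passing run


history = [
    Attempt(["checkout/cart.py"], "failed"),
    Attempt(["checkout/cart.py", "checkout/test_cart.py"], "passed"),
]
print(eval_repair(history))  # prints True
```

Keeping the scope test ahead of the status test means an out-of-scope patch cannot pass by making the suite green.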

FAQ

What people ask before they write their first block.

Turn your next loose instruction into rubr_flow.

Paste a prompt, command, skill, agent spec, or workflow and get the marks, the rewrite, and the eval that proves it.
