Rubrkit vs Promptfoo

Promptfoo is an open-source CLI for evaluating and red-teaming LLM apps: you define prompts and test cases in config files, then run assertions and security scans from your repo. Rubrkit grades instruction quality against a rubric, explains each mark, and produces a proof report that stakeholders can read. It covers agents, skills, and workflows, and it stays model-neutral. Choose Promptfoo for config-driven assertions and security red-teaming; choose Rubrkit for a falsifiable quality verdict you can hand to a non-engineer.

At a glance

How Rubrkit and Promptfoo compare

| Dimension | Rubrkit | Promptfoo |
| --- | --- | --- |
| Primary job | Grade instruction quality and prove the improvement | Run assertion-based evals and red-team tests from the CLI |
| Artifact types | Prompts, agents, skills, commands, workflows, and rubr_flow | Prompts and test cases defined in config |
| Quality model | Rubric score 0–5 per dimension with the evidence behind each mark | Pass/fail assertions you author per test case |
| Security red-teaming | Not a red-teaming tool | Built-in scans for prompt injection, PII, and jailbreaks |
| Stakeholder output | A shareable proof report a non-engineer can read | CLI output and reports aimed at developers |
| Ease of first signal | Grade an artifact against a ready rubric; no config to write | Write a config and test cases before you get a result |
| CLI / CI | npx rubrkit plus CI quality gates | CLI-first and CI-friendly by design |
| Ownership / neutrality | Independent and model-neutral | OpenAI-owned since March 2026; core stays MIT and model-agnostic |
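
To make the "Quality model" distinction concrete, here is a minimal Python sketch of the two shapes of result. It is illustrative only: `RubricMark`, `grade_summary`, and `run_assertions` are hypothetical names invented for this example, the averaging step is an assumption, and nothing here reflects either tool's actual API or config format.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RubricMark:
    """One rubric dimension: a 0-5 score plus the evidence behind it (hypothetical shape)."""
    dimension: str
    score: int
    evidence: str


def grade_summary(marks: list[RubricMark]) -> float:
    """Average the per-dimension scores. The aggregation rule is an assumption."""
    if not marks:
        raise ValueError("at least one rubric mark is required")
    for mark in marks:
        if not 0 <= mark.score <= 5:
            raise ValueError(f"score out of range for {mark.dimension}: {mark.score}")
    return sum(mark.score for mark in marks) / len(marks)


def run_assertions(output: str, checks: dict[str, Callable[[str], bool]]) -> dict[str, bool]:
    """Apply each author-written check to a model output and record pass/fail."""
    return {name: check(output) for name, check in checks.items()}


# Rubric-style result: graded marks, each carrying its evidence.
marks = [
    RubricMark("clarity", 4, "States the task in the first sentence."),
    RubricMark("constraints", 2, "No output format is specified."),
]

# Assertion-style result: a boolean per check that you wrote yourself.
results = run_assertions(
    "Hello, Ada!",
    {
        "mentions_name": lambda text: "Ada" in text,
        "under_20_chars": lambda text: len(text) < 20,
    },
)
```

The rubric result keeps a graded score and a reason per dimension, which is what makes it readable as a report; the assertion result is a set of booleans that is easy to gate on in CI.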

Who it's for

Pick the tool that fits the job

Choose Rubrkit when

Your team wants a rubric-backed quality verdict and a readable proof report across prompts, agents, and skills, without writing a test config first.

Choose Promptfoo when

Your engineers want config-driven, repo-resident assertions and built-in security red-teaming run from the CLI.

Promptfoo’s security red-teaming — prompt-injection, PII, and jailbreak scanning — is genuinely stronger than anything Rubrkit offers. If adversarial testing is your goal, Promptfoo is purpose-built for it and Rubrkit is not.
