Changelog

What's new

RUQA ships continuously. Every meaningful change is logged here. Subscribe to the build-in-public newsletter for weekly digests.

2026-05-04 · v0.4.0 · MAJOR

Sandbox · multi-LLM playground

Test prompts on Claude Sonnet 4.6, GPT-5.5, Gemini 2 Pro, and Llama 4 side by side. Score outputs against an eval rubric. Promote winners to prod.

+ New /sandbox page with side-by-side comparison
+ Auto eval rubric (5 criteria: format, coherence, accuracy, brevity, actionability)
+ Latency, tokens, cost tracked per provider
+ Promote-to-prod flow for harness library
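
As a rough illustration of how a five-criterion rubric could be rolled up into one score per provider, here is a minimal sketch. Only the five criterion names come from this entry. The 0-to-1 score scale, the equal weighting, the tie-break on cost, and every name in the code (ProviderRun, rubric_score, pick_winner) are assumptions made for the example, not RUQA's actual implementation.

```python
from dataclasses import dataclass
from typing import Dict, List

# The five criteria named in the changelog entry.
CRITERIA = ("format", "coherence", "accuracy", "brevity", "actionability")


@dataclass
class ProviderRun:
    """One model's result for a single prompt (hypothetical structure)."""
    provider: str
    scores: Dict[str, float]  # criterion -> score in [0, 1] (assumed scale)
    latency_ms: float
    tokens: int
    cost_usd: float


def rubric_score(scores: Dict[str, float]) -> float:
    """Unweighted mean over the five criteria (equal weights are an assumption)."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    return sum(scores[c] for c in CRITERIA) / len(CRITERIA)


def pick_winner(runs: List[ProviderRun]) -> ProviderRun:
    """Highest rubric score wins; ties go to the cheaper run (assumed rule)."""
    return max(runs, key=lambda r: (rubric_score(r.scores), -r.cost_usd))
```

Keeping latency, tokens, and cost on the same record as the rubric scores means a promote decision can weigh quality against spend without a second lookup.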

2026-05-04 · v0.3.0 · MINOR

Stop tracking tools. Start tracking the metaskills that don't go stale.

+ 16 metaskills across 4 categories (thinking, communication, execution, leadership)
+ Auto-scored from real outputs
+ Team heatmap view and project fit matching
+ Strength / weakness drill-down

2026-05-04 · v0.2.0 · MINOR

Anti-gaming evaluation that holds up. T1-T5 with HARD/SOFT flag system.

+ Self-report vs AI baseline vs Git timestamps vs volume regression vs peer median
+ Automatic deviation flagging (>30% triggers manager review)
+ Calibration period guarantee (90 days, no consequences)
+ Algorithm fully documented
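
To make the 30% rule concrete, here is a minimal sketch of one way a deviation check could work. The threshold comes from this entry. Everything else is an assumption for illustration: that deviation is measured relative to each reference signal, that the compared quantity is something like reported hours, and the function and signal names themselves. The documented algorithm may define deviation differently.

```python
from typing import Dict, List

REVIEW_THRESHOLD = 0.30  # ">30% triggers manager review" per the changelog entry


def relative_deviation(self_report: float, reference: float) -> float:
    """Absolute gap as a fraction of the reference value (assumed definition)."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return abs(self_report - reference) / abs(reference)


def signals_needing_review(self_report: float, references: Dict[str, float]) -> List[str]:
    """Names of reference signals the self-report deviates from by more than 30%."""
    return [
        name
        for name, value in references.items()
        if relative_deviation(self_report, value) > REVIEW_THRESHOLD
    ]


# Hypothetical example: 8 self-reported hours checked against four other signals.
refs = {
    "ai_baseline": 6.0,
    "git_timestamps": 7.5,
    "volume_regression": 5.0,
    "peer_median": 7.0,
}
print(signals_needing_review(8.0, refs))  # ['ai_baseline', 'volume_regression']
```

In this example the self-report is within 30% of the Git timestamps and the peer median, so only the other two comparisons would be surfaced for review.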

2026-05-04 · v0.1.0 · INITIAL

Auto-synthesized daily reports from real signals. The original RUQA feature.
