Structured Multi-Agent Collaboration for
AI-Assisted Software Engineering
"Reasoning is the primary artifact. Code is output."
The Challenge
AI makes hundreds of choices (architecture, patterns, trade-offs) but the reasoning disappears. No audit trail. No traceability. No way to understand why six months later.
The same AI that writes the code evaluates it. It grades its own homework. No independent review means no safety net for blind spots, hallucinations, or compounding errors.
Unstructured, ad-hoc AI interactions produce wildly inconsistent quality. No repeatable process. No standards enforcement. Each session is a gamble.
Insights from AI sessions aren't captured, curated, or reused. Every session starts from zero. Patterns discovered once are lost and rediscovered again and again.
There has to be a better way.
Foundation
Core Architecture
How AI reasoning gets captured.
Raw event streams sealed after every reasoning session
discussions/ directory. Each discussion contains events.jsonl (machine-readable event stream) and transcript.md (human-readable rendering). Events track: agent identity, intent type (proposal, critique, question, evidence, synthesis, decision, reflection), confidence scores, and risk flags. After closure, these files are locked; corrections require new discussions that reference the original.
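As a rough illustration, appending one such event to events.jsonl could look like the sketch below. The function name and field names (`agent`, `intent`, `confidence`, `risk_flags`) mirror the description above but are assumptions, not the framework's actual schema.

```python
import json
import datetime

def log_event(path, agent, intent, content, confidence, risk_flags=()):
    """Append one reasoning event as a single JSON line (hypothetical schema)."""
    event = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "agent": agent,            # agent identity
        "intent": intent,          # proposal | critique | question | evidence | ...
        "content": content,
        "confidence": confidence,  # 0.0 - 1.0
        "risk_flags": list(risk_flags),
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")  # one JSON object per line

log_event("events.jsonl", "security-reviewer", "critique",
          "Token lifetime is unbounded", 0.8, ["auth"])
```

Append-only JSONL keeps the stream cheap to write and trivial to replay, which is what makes sealing a discussion after closure practical.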
SQLite database for querying and metrics across all discussions
Human-approved patterns, decisions, and rules promoted from Layers 1-2
memory/ directory holds promoted knowledge: decision summaries, code patterns, agent reflections, lessons learned, and graduated rules. Promotion requires 2+ independent confirmations plus explicit human approval. Every promoted artifact has a 90-day forgetting curve: it must be reconfirmed or it gets archived. This prevents knowledge rot and keeps the curated layer deliberately lean.
Semantic retrieval when the corpus outgrows keyword search
From raw events → queryable metrics → curated knowledge. Nothing is lost.
The Team
Every agent has a defined lane, explicit triggers, and anti-patterns it must avoid.
Collaboration
Independent contributions, no inter-agent exchange
Collaborative building: each agent adds to the previous
Coopetitive multi-round discussion
Thesis-antithesis-synthesis with ACH matrix
Red team: security, fault injection, anti-groupthink only
| Intensity | Description |
|---|---|
| Low | Primary analysis with brief notes on alternatives |
| Medium | 2β3 alternatives with trade-off analysis |
| High | Thorough exploration of edge cases & failure modes |
Agents share goals but have different professional priorities: a security specialist and a performance analyst will naturally surface different concerns. This creates productive tension without manufactured opposition.
Workflow
/review Pipeline
A 10-step automated workflow, from risk assessment to sealed report.
Every command auto-captures reasoning via the capture pipeline. The model cannot opt out of logging; it's enforced at the tooling layer.
Safety
Safety is enforced at the tooling layer, not by asking the AI to behave.
Atomic locks prevent concurrent agent edits, auto-expire after 120 seconds
Scans for 12 secret patterns: API keys, AWS keys, JWT, PATs, private keys, and more
Blocks edits to .env, .git/, evaluation.db, and critical config files
Formatting, linting, tests, and coverage must all pass before any commit
Blocks direct pushes to main/master with remediation instructions
Runs ruff format + ruff check --fix on every Python file after every edit
Releases file locks after write/edit completes; cleanup is automatic
Saves in-flight task state to BUILD_STATUS.md before context compaction
Reads BUILD_STATUS.md on session resume to restore working context
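The atomic lock with 120-second auto-expiry can be sketched with nothing more than an exclusive file create. This is an illustrative implementation, not the framework's actual locking code; `O_CREAT | O_EXCL` makes creation atomic at the filesystem level, and a stale lock older than the TTL is reclaimed.

```python
import os
import time

LOCK_TTL = 120  # seconds, matching the auto-expiry described above

def acquire_lock(target: str) -> bool:
    """Try to take an exclusive lock on `target`; reclaim if the holder expired."""
    lock = target + ".lock"
    try:
        # O_EXCL fails atomically if the lock file already exists.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.write(fd, str(time.time()).encode())
        os.close(fd)
        return True
    except FileExistsError:
        if time.time() - os.path.getmtime(lock) > LOCK_TTL:
            os.remove(lock)              # lock expired: reclaim it
            return acquire_lock(target)
        return False                     # lock held and still fresh

def release_lock(target: str) -> None:
    """Release after write/edit completes; idempotent by design."""
    try:
        os.remove(target + ".lock")
    except FileNotFoundError:
        pass
```

Because the lock lives on disk next to the file it guards, a crashed agent can never hold it forever: the TTL turns every lock into a lease.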
Self-Improvement
After each discussion, agents write structured reflections: what they missed, what they'd improve, confidence calibration. Reflections are stored in SQLite and feed candidate improvement rules.
The /retro command queries SQLite for: reopened decisions, override frequency, frequent issue tags, time-to-resolution stats, and adoption pattern evaluation (PENDING → CONFIRMED or REVERTED).
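One of those /retro metrics, issue-tag frequency, amounts to a GROUP BY over the reflections store. A minimal sketch; the table and column names are illustrative, not the framework's real schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the evaluation database
conn.execute("CREATE TABLE reflections (agent TEXT, issue_tag TEXT)")
conn.executemany(
    "INSERT INTO reflections VALUES (?, ?)",
    [("reviewer", "missed-edge-case"),
     ("architect", "missed-edge-case"),
     ("reviewer", "overconfident")],
)

# Most frequent issue tags first: candidate inputs for improvement rules.
rows = conn.execute(
    """SELECT issue_tag, COUNT(*) AS n
       FROM reflections
       GROUP BY issue_tag
       ORDER BY n DESC"""
).fetchall()
print(rows)
```

A tag that keeps topping this list across retros is exactly the kind of signal that graduates into a candidate improvement rule.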
The /meta-review command produces: agent effectiveness scoring, drift analysis, rule update candidates, and decision churn index. Drives framework-level evolution.
Single-loop: tune thresholds within existing rules. Double-loop: change what counts as "good" based on accumulated evidence. The framework doesn't just follow rules; it evolves them.
Human Ownership
AI explains the code step by step β what it does, why decisions were made, how components interact
Bloom's taxonomy assessment, from recall to analysis to evaluation. Includes debug scenarios and change-impact questions.
Developer explains the code in their own words β proving comprehension, not just recognition
Only after completing all three steps. Proportional to complexity and risk.
70% pass threshold. At least 1 debug scenario + 1 change-impact question per quiz. Scaffolding fades as competence grows.
AI writes the code, but the human must own it.
Evolution
How common?
How clean?
Proven results?
Compatible?
Sustainable?
Patterns seen in 3+ independent projects get priority consideration. Validates that a pattern isn't a one-off novelty but a genuinely useful practice.
Threshold: only patterns scoring ≥ 20/25 are recommended for adoption. Every adoption and rejection is documented with reasoning; decision lineage is preserved per Principle #1.
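Reading the five questions above as criteria scored 0-5 each (an assumption: the source gives only the 25-point total and the 20-point threshold), the adoption decision reduces to a sum and a cutoff:

```python
# Criterion names paraphrase the five questions above; scale is assumed 0-5 each.
CRITERIA = ("frequency", "cleanliness", "proven_results",
            "compatibility", "sustainability")
THRESHOLD = 20  # out of a 25-point maximum

def adoption_decision(scores: dict) -> tuple:
    """Return (total, verdict) for a candidate pattern."""
    assert set(scores) == set(CRITERIA), "score every criterion"
    assert all(0 <= s <= 5 for s in scores.values()), "each criterion is 0-5"
    total = sum(scores.values())
    return total, ("adopt" if total >= THRESHOLD else "reject")

print(adoption_decision({"frequency": 5, "cleanliness": 4, "proven_results": 4,
                         "compatibility": 4, "sustainability": 4}))
```

A high bar like 20/25 means a pattern must score well on nearly every axis; one weak criterion is enough to sink a candidate.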
4 patterns achieved Rule of Three status, validated across 3+ independent projects. The framework practices what it preaches: its own evolution follows the structured analysis pipeline.
Structure
The framework lives alongside your code.
The framework doesn't run as an external service. It lives inside your project's directory structure: agent definitions, commands, hooks, and rules are all version-controlled files alongside your source code.
A single file that codifies all project conventions, principles, boundaries, and ID formats. Every agent reads it. It's the source of truth for how the framework operates in this project.
Agent definitions, commands, rules, ADRs, reviews: all Markdown with YAML frontmatter. Human-readable, version-controllable, and diff-friendly. No proprietary formats.
The framework is designed for Claude Code inside VS Code. Slash commands integrate directly into the Claude Code interface. Hooks fire automatically via Claude Code's hook system.
Quality
Every commit must pass the quality gate. No exceptions. No --no-verify.
The quality gate runs automatically as a git pre-commit hook. If any check fails, the commit is blocked. This is non-negotiable; the hook cannot be bypassed without an explicit developer override and a documented reason.
Every quality gate run appends a JSONL record to a log file. This data feeds into sprint retrospectives and framework meta-reviews, enabling trend analysis of code quality over time.
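Given such a log, trend analysis starts with something as simple as a pass rate over recent runs. A sketch with hypothetical record fields (`ts`, `passed`, `failed_check` are assumptions, not the actual log schema):

```python
import json

# Three example quality-gate records, one JSON object per line as in JSONL.
log_lines = [
    json.dumps({"ts": "2025-06-01T10:00:00Z", "passed": False,
                "failed_check": "coverage"}),
    json.dumps({"ts": "2025-06-02T10:00:00Z", "passed": True}),
    json.dumps({"ts": "2025-06-03T10:00:00Z", "passed": True}),
]

records = [json.loads(line) for line in log_lines]
pass_rate = sum(r["passed"] for r in records) / len(records)
print(f"pass rate: {pass_rate:.0%}")  # trend input for retros and meta-reviews
```

A falling pass rate across sprints is the kind of early signal the retrospective loop is meant to catch before it shows up as shipped defects.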
Gate 1: Quality gate (automated). Gate 2: Multi-agent code review via /review (agent-assisted). Both must pass before code is committed.
Run the quality gate with --fix to automatically remediate formatting and lint issues. Tests and coverage still require manual attention.
Summary
"Reasoning is the primary artifact. Code is output."
AI-Native Agentic Development Framework v2.1 · Diviner Dojo
diviner-dojo@gmail.com