Brownfield Experiment: Semantic Traceability Matrix

This matrix compares three reverse-engineering approaches (Direct, Socratic, Two-Phase) against the original documentation. Items are matched by semantic content, not by filenames or IDs. All three generated versions include team-answered Open Questions.

1. Design Decisions (5 original ADRs)

Decision	Direct	Socratic	Two-Phase
Use JSONC as DSL (6 alternatives, Pugh +20)	✅ Decision + rationale + all 6 alternatives + Pugh matrix	✅ Decision + rationale + Pugh matrix. Confirmed via team answer.	✅ Decision + rationale + 6 alternatives + Pugh matrix + code refs
Use Go as language (vs Python, Kotlin/JVM)	⚠️ Go mentioned in PRD/NFR but no ADR. ADR-002 is about Cobra CLI instead.	❌ No ADR about implementation language. Go mentioned but no alternatives evaluated.	✅ ADR with 5 alternatives (Go, Rust, Node TS, GraalVM, Python) + Pugh matrix
Risk Classification via Vibe-Coding Risk Radar (Tier 2 assessment)	❌ Not documented. ADR-003 is about three-way diff instead.	❌ Not documented.	✅ ADR documents Risk Radar adoption with Pugh matrix
Reject sequence diagram export (4 approaches evaluated, all rejected)	❌ Not documented.	❌ Not documented.	✅ ADR marked as Rejected. 3 alternatives + Pugh matrix.
Bespoke layered layout engine (vs Graphviz, dagre, force-directed)	✅ Documented as ADR-007. Pugh matrix, 3 modes, static-binary rationale.	❌ Layout mentioned in sync spec but no ADR.	✅ ADR with 4 alternatives + Pugh matrix

New decisions invented by generated versions

Version	Decision	In Original?
Direct	Cobra CLI framework	🆕 No
Direct	Three-way diff sync strategy	🆕 No
Direct	Model-wins conflict policy	🆕 No
Direct	etree XML library	🆕 No
Direct	Embedded templates (go:embed)	🆕 No
Socratic	Model-wins conflict policy	🆕 No
Socratic	Pure sync function architecture	🆕 No
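
The ADR counts reported for each version (for example "2/5 (+5 new)" for Direct) can be re-derived from the two tables above. The following is a minimal Python sketch of that tally. The dictionary layout and the function name `adr_summary` are illustrative only, and it assumes that a ⚠️ entry does not count as a match.

```python
# Status per original ADR topic, transcribed from the Design Decisions table.
# Order: JSONC as DSL, Go as language, Risk Radar, sequence-diagram rejection,
# layout engine.
ADR_STATUS = {
    "Direct":    ["✅", "⚠️", "❌", "❌", "✅"],
    "Socratic":  ["✅", "❌", "❌", "❌", "❌"],
    "Two-Phase": ["✅", "✅", "✅", "✅", "✅"],
}

# Decisions each version introduced that are absent from the original
# (counted from the "New decisions" table; Two-Phase has no rows there).
NEW_DECISIONS = {"Direct": 5, "Socratic": 2, "Two-Phase": 0}


def adr_summary(version: str) -> str:
    """Return 'matched/total (+N new)'.

    Assumption: only a full ✅ counts as a match; ⚠️ and ❌ do not.
    """
    statuses = ADR_STATUS[version]
    matched = sum(1 for s in statuses if s == "✅")
    new = NEW_DECISIONS[version]
    suffix = f" (+{new} new)" if new else ""
    return f"{matched}/{len(statuses)}{suffix}"


if __name__ == "__main__":
    for name in ADR_STATUS:
        print(name, adr_summary(name))
```

Counting ⚠️ as a partial match would raise Direct's score, so the scoring rule should be stated wherever such counts are reported.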

2. Functional Requirements (7 original FRs)

Requirement	Direct	Socratic	Two-Phase
FR-1: Model Definition (JSONC, elements, relationships, nesting)	✅ Extensive (FR-001..006)	✅	✅
FR-2: draw.io Generation	✅	✅	✅
FR-3: Bidirectional Sync	✅ Extensive	✅	✅
FR-4: Relationship Handling (center attachment, label sync)	✅	⚠️ Center attachment not explicit	⚠️ Center attachment not explicit
FR-5: Navigation / Drill-Down	✅ In ACs	✅ In PRD + concepts	✅ In PRD + sync spec
FR-6: CLI Interface	✅ Most detailed	✅	✅
FR-7: Template System	✅	⚠️ Partial	✅

3. Quality Goals (3 goals, priority order matters)

Goal (Original Priority)	Direct	Socratic	Two-Phase
#1 Learnability (productive in 30 min)	❌ Not among the quality goals; 6 different priorities are listed instead.	✅ Priority #1, "productive in 30 minutes"	⚠️ Listed as "zero-friction install" (#3), not as #1
#2 IDE Support (JSON Schema, no plugins)	⚠️ Covered by NFR-010 but not listed as a quality goal	✅ Priority #2	⚠️ Mentioned but not priority #2
#3 LLM Friendliness (JSON readable by AI)	⚠️ Listed as priority #5 (not #3)	✅ Priority #3	⚠️ Priority #2 as "LLM predictability" (different framing)

Only Socratic reproduces all three goals in the correct priority order. This is surprising: the team answers gave the same information to all three versions, yet only Socratic's Question Tree decomposition led to the correct priorities.
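
The priority-order check can be made explicit. The sketch below encodes the Quality Goals table as rank mappings and tests each version for an exact match with the original order. The encoding is an interpretation: `None` stands for "not listed as a ranked quality goal", and Two-Phase's "zero-friction install" (#3) is treated as its Learnability entry. All names are illustrative.

```python
from typing import Dict, List, Optional

# Original priority order of the three quality goals.
ORIGINAL: Dict[str, Optional[int]] = {
    "Learnability": 1,
    "IDE Support": 2,
    "LLM Friendliness": 3,
}

# Rank each generated version assigns to the same goal.
# None = the goal is not listed as a ranked quality goal (assumption).
REPRODUCED: Dict[str, Dict[str, Optional[int]]] = {
    "Direct":    {"Learnability": None, "IDE Support": None, "LLM Friendliness": 5},
    "Socratic":  {"Learnability": 1, "IDE Support": 2, "LLM Friendliness": 3},
    "Two-Phase": {"Learnability": 3, "IDE Support": None, "LLM Friendliness": 2},
}


def priority_matches(version: str) -> bool:
    """True only if every goal carries exactly its original rank."""
    return REPRODUCED[version] == ORIGINAL


def versions_with_correct_order() -> List[str]:
    """List the versions that reproduce the original priority order."""
    return [v for v in REPRODUCED if priority_matches(v)]


if __name__ == "__main__":
    print(versions_with_correct_order())
```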

4. Use Cases (8 original UCs)

Use Case	Direct	Socratic	Two-Phase
Init Architecture Model	✅	✅	✅
Define Elements (manual IDE)	⚠️ Only CLI add, not manual editing	❌	❌
Forward Sync	✅ Combined with reverse	✅ Combined	✅ Combined
Reverse Sync	✅ Combined with forward	✅ Combined	✅ Combined
Watch Mode	✅	✅	✅
LLM-Driven Modification	⚠️ Implicit (LLM as actor)	⚠️ Implicit	⚠️ Implicit
Drill-Down Navigation	⚠️ In ACs, not standalone UC	⚠️ In concepts, not standalone UC	⚠️ In spec, not standalone UC
Export Diagram Views	✅	✅	✅

New use cases discovered from code

All three versions independently discovered the same additional use cases: Validate, Add Element via CLI, Add Relationship via CLI, Export Tables, Export Text Diagrams. These exist in the code but are not in the original spec.

5. Performance Budgets (4 metrics)

Metric	Direct	Socratic	Two-Phase
Startup <10ms	✅	✅	✅
Sync <100ms (200 elements)	✅	✅	✅
Binary 10-15 MB	✅	✅	✅
Zero dependencies	✅	✅	✅

All three versions reproduce all 4 metrics exactly. The values came from team answers (OQ-050 / OQ-8 / Q-3.10.2).
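
As an illustration of how these budgets could be checked against measurements, here is a small Python sketch. The `Measurement` type and the `budget_violations` function are hypothetical and not part of the documented tool. The sketch also assumes that "10-15 MB" is enforced as an upper bound of 15 MB and that "zero dependencies" means zero runtime dependencies.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Measurement:
    """Hypothetical container for one set of measured values."""
    startup_ms: float
    sync_ms_200_elements: float
    binary_mb: float
    runtime_dependencies: int


def budget_violations(m: Measurement) -> List[str]:
    """Return a message for every budget from the table that is exceeded."""
    violations = []
    if not m.startup_ms < 10:
        violations.append(f"startup {m.startup_ms} ms, budget < 10 ms")
    if not m.sync_ms_200_elements < 100:
        violations.append(
            f"sync {m.sync_ms_200_elements} ms at 200 elements, budget < 100 ms"
        )
    # Assumption: the 10-15 MB range is treated as an upper bound only.
    if m.binary_mb > 15:
        violations.append(f"binary {m.binary_mb} MB, budget <= 15 MB")
    if m.runtime_dependencies != 0:
        violations.append(
            f"{m.runtime_dependencies} runtime dependencies, budget 0"
        )
    return violations


if __name__ == "__main__":
    # Made-up sample values, for demonstration only.
    print(budget_violations(Measurement(8.0, 90.0, 12.5, 0)))
```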

6. Security / Trust Model (3 boundaries)

Boundary	Direct	Socratic	Two-Phase
Fully Trusted: CLI Flags	✅	✅	✅
Semi-Trusted: Model/Template Files	✅ Most detail (size, depth, checksum)	✅	✅
Untrusted: Agent/CI	✅	✅	✅
Non-Threats (no network, no code exec, no credentials)	⚠️ Only "no network"	⚠️ Only "no network"	⚠️ Only "no network"
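
The Direct version records size, depth, and checksum details for the semi-trusted model and template files. The sketch below shows what such checks might look like. It is written in Python for illustration only; the limits `MAX_BYTES` and `MAX_DEPTH` and the function names are hypothetical and not taken from the tool. Python's `json` module does not accept JSONC comments, so the sketch assumes comments have already been stripped.

```python
import hashlib
import json
from typing import Any, Optional

MAX_BYTES = 1_000_000  # hypothetical size limit, not from the source
MAX_DEPTH = 32         # hypothetical nesting limit, not from the source


def nesting_depth(value: Any) -> int:
    """Depth of nested dicts/lists; scalars have depth 0."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0


def load_semi_trusted(raw: bytes, expected_sha256: Optional[str] = None) -> Any:
    """Parse a model file only after size, checksum, and depth checks pass."""
    # Check the size first so oversized input is never parsed.
    if len(raw) > MAX_BYTES:
        raise ValueError("model file exceeds size limit")
    # The checksum is optional; verify it only when the caller supplies one.
    if expected_sha256 is not None:
        if hashlib.sha256(raw).hexdigest() != expected_sha256:
            raise ValueError("checksum mismatch")
    model = json.loads(raw)
    if nesting_depth(model) > MAX_DEPTH:
        raise ValueError("model nesting too deep")
    return model


if __name__ == "__main__":
    sample = b'{"elements": [{"id": "a", "children": [{"id": "b"}]}]}'
    print(nesting_depth(load_semi_trusted(sample)))
```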

7. Summary

Metric	Original	Direct	Socratic	Two-Phase
Total doc lines	~13,800	3,886	2,481	4,083
ADRs (matching original topics)	5	2/5 (+5 new)	1/5 (+2 new)	5/5
Team answer markers	0	26	35	50
Q-ID traceability refs	0	101	123	109
Open Questions remaining	—	0	0	0
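
The raw counts are easier to compare once normalized by document size. A short Python sketch of that arithmetic follows. The original line count is approximate ("~13,800"), so the size ratios are approximate too; the function names are illustrative.

```python
ORIGINAL_LINES = 13_800  # approximate, given as "~13,800" in the table

# Line counts and Q-ID traceability references, transcribed from the table.
VERSIONS = {
    "Direct":    {"lines": 3886, "qid_refs": 101},
    "Socratic":  {"lines": 2481, "qid_refs": 123},
    "Two-Phase": {"lines": 4083, "qid_refs": 109},
}


def size_ratio_percent(version: str) -> float:
    """Generated doc size as a percentage of the original line count."""
    return round(100 * VERSIONS[version]["lines"] / ORIGINAL_LINES, 1)


def qid_refs_per_100_lines(version: str) -> float:
    """Q-ID traceability references per 100 lines of generated docs."""
    v = VERSIONS[version]
    return round(100 * v["qid_refs"] / v["lines"], 2)


if __name__ == "__main__":
    for name in VERSIONS:
        print(name, size_ratio_percent(name), qid_refs_per_100_lines(name))
```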