[NOTE]
====
This ADR was reverse-engineered from code. The team did not provide a specific narrative about implementation language; the rationale below is inferred from build configuration, dependency choices, and the team's performance budget (OQ-8).
====
Brownfield Experiment 2: Socratic Code Theory Recovery (Two-Phase)
.1. Experiment Design
.1.1. Hypothesis
The Two-Phase workflow combines the strengths of both previous approaches: the Socratic approach’s honesty about unknowns (Phase 1) with the Direct approach’s comprehensive documentation (Phase 2). By routing Open Questions to the team between phases, the documentation should include correct rationale, quality goal priorities, and performance budgets that neither pure approach could produce.
.1.2. Setup
- Project: Bausteinsicht (same as Experiments 1a and 1c)
- Branch: brownfield-2-phases
- Phase 1 Prompt: Socratic Code Theory Recovery (51 lines)
- Phase 2 Prompt: Documentation synthesis with 11 team-answered Open Questions
- LLM: Claude (fresh session per phase)
.1.3. Method
- Phase 1: LLM builds Question Tree from code (166 questions)
- LLM produces OPEN_QUESTIONS.adoc with 11 unanswerable questions, routed by role
- "Team" answers all 11 questions from the original documentation (simulating Product Owner, Architect, Developer input)
- Phase 2: LLM synthesizes documentation from answered questions + code evidence + team answers
- Compare output against Original, Direct (1a), and Socratic (1c)
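The hand-off between the two phases can be sketched as a small data structure. This is only an illustration: the `OpenQuestion` class, its field names, the role assignment, and the placeholder question are assumptions made for this example and do not reflect the actual layout of OPEN_QUESTIONS.adoc. Only the OQ-4 wording is taken from the experiment.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpenQuestion:
    """One question Phase 1 could not answer from code alone (hypothetical shape)."""
    qid: str                      # e.g. "OQ-4"
    role: str                     # who should answer it (assumed role names)
    text: str
    answer: Optional[str] = None  # filled in by the team between phases


def route_by_role(questions):
    """Group open questions by the role expected to answer them."""
    routed = defaultdict(list)
    for q in questions:
        routed[q.role].append(q)
    return dict(routed)


def ready_for_phase_2(questions):
    """Phase 2 should only start once every open question has an answer."""
    return all(q.answer is not None for q in questions)


# Hypothetical sample data; the role for OQ-4 is an assumption.
questions = [
    OpenQuestion("OQ-4", "Architect", "Are ADRs maintained, and where?"),
    OpenQuestion("OQ-X", "Product Owner", "Placeholder question text"),
]

routed = route_by_role(questions)
before = ready_for_phase_2(questions)   # no answers yet

for q in questions:
    q.answer = "(team answer goes here)"
after = ready_for_phase_2(questions)    # every question answered

print(sorted(routed), before, after)
```

Keeping the answer on the same record as the question is one way to preserve the OQ-ID when Phase 2 later cites a team answer.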
.2. Results at a Glance
| Metric | Original | Direct (1a) | Socratic (1c) | Two-Phase |
|---|---|---|---|---|
| Total doc lines | ~13,800 | 3,850 | 2,434 | 4,083 |
| arc42 chapter lines | 1,300 | 642 | 429 | 1,090 |
| ADRs | 5 | 6 (wrong topics) | 3 | 5 (correct topics) |
| ADR topics match Original | n/a | No | No | Yes |
| Use Cases | 8 | 9 | 9 | 9 |
| Acceptance Criteria | 40 | 69 | ~20 | 47 |
| Open Questions remaining | n/a | 33 | 26 | 0 |
| Q-ID traceability | 0 files | 1 file | 23 files | 24 files |
| Team answer markers | 0 | 0 | 0 | 50 |
| Performance budgets (Ch. 7) | Yes | No | No | Yes |
| Quality goal priorities (Ch. 1) | Yes | No | No | Yes |
| Glossary terms | 2 | 31 | 19 | 97 |
.3. The Breakthrough: ADRs Match the Original
The most significant result: the Two-Phase approach produced exactly the right 5 ADR topics, matching the Original 1:1.
| ADR | Original | Direct (1a) | Socratic (1c) | Two-Phase |
|---|---|---|---|---|
| DSL Format | ✅ | ✅ (fewer alternatives) | ✅ | ✅ (full rationale from OQ-5) |
| Implementation Language | ✅ | ❌ (CLI Framework instead) | ❌ | ✅ |
| Risk Classification | ✅ | ❌ (missing) | ❌ | ✅ |
| Sequence Diagram Export | ✅ (Rejected) | ❌ (Conflict Policy instead) | ❌ | ✅ (Rejected) |
| Auto-Layout Engine | ✅ | ❌ (XML Library instead) | ❌ | ✅ |
This happened because OQ-4 asked "Are ADRs maintained, and where?" and the team answered with the complete list of 5 ADRs including their topics and status. The LLM then wrote ADRs for those exact topics instead of guessing which decisions were important.
ADR-001 (DSL Format) includes the real rationale: 6 alternatives evaluated, JSONC scored +20, and the key reasons for choosing it (no parser needed, bidirectional sync, IDE support, LLM-native, JSONC comments). This came directly from the OQ-5 team answer.
.4. Previously "Poorly Derivable" Chapters: Now Strong
.4.1. Chapter 1: Quality Goal Priorities
| Version | Top 3 Quality Goals |
|---|---|
| Original | 1. Learnability (30-min onboarding), 2. IDE Support, 3. LLM Friendliness |
| Direct (1a) | 6 goals inferred from code, no prioritization |
| Socratic (1c) | 4 goals inferred, no priority (Q-3.1.2 left open) |
| Two-Phase | 1. Bidirectional correctness, 2. Predictability for LLM agents, 3. Zero-friction install (OQ-6 team answer) |
The Two-Phase approach is the only generated version with explicit quality goal priorities. The framing differs slightly from the Original (correctness vs. learnability as #1), but the intent is captured because the team provided the competitive context (OQ-6: "draw.io is the most widely used free diagramming tool").
.4.2. Chapter 7: Performance Budgets
| Version | Performance Metrics |
|---|---|
| Original | Startup <10ms, Sync <100ms (200 elements), Binary 10-15MB, Zero deps |
| Direct (1a) | Not present |
| Socratic (1c) | Not present |
| Two-Phase | Startup <10ms, Sync <100ms (200 elements), Binary 10-15MB, Zero deps (OQ-8 team answer) |
Identical to Original. The OQ-8 answer provided the exact thresholds.
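To make the thresholds concrete, the following sketch encodes them as data and checks a set of measurements against them. The dictionary layout, the metric names, the reading of "Binary 10-15MB" as an upper bound of 15 MB, and the sample measurements are all assumptions for illustration; they are not taken from the Bausteinsicht repository.

```python
# Budgets from the OQ-8 team answer, encoded as (limit, inclusive).
# Treating 15 MB as an inclusive upper bound is an assumption.
BUDGETS = {
    "startup_ms": (10.0, False),             # Startup < 10 ms
    "sync_ms_200_elements": (100.0, False),  # Sync < 100 ms for 200 elements
    "binary_mb": (15.0, True),               # Binary up to 15 MB
}


def check_budgets(measured, budgets=BUDGETS):
    """Return the names of metrics that are missing or over budget."""
    violations = []
    for name, (limit, inclusive) in budgets.items():
        value = measured.get(name)
        within = value is not None and (
            value <= limit if inclusive else value < limit
        )
        if not within:
            violations.append(name)
    return violations


# Hypothetical measurements, not real benchmark results.
measured = {"startup_ms": 7.5, "sync_ms_200_elements": 120.0, "binary_mb": 12.0}
violations = check_budgets(measured)
print(violations)
```

A missing measurement counts as a violation here, so a budget cannot pass silently just because nobody measured it.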
.4.3. Chapter 9: Architecture Decisions
| Version | ADR References |
|---|---|
| Original | Table of 5 ADRs with status |
| Direct (1a) | List of 6 ADRs (wrong topics) |
| Socratic (1c) | Notes "no explicit ADR files exist", 3 reverse-engineered |
| Two-Phase | Table of 5 ADRs with correct status (4 Accepted, 1 Rejected) |
.5. Traceability: Two Sources, Clearly Marked
The Two-Phase documentation distinguishes two types of claims:
- Code-derived: Cited with file:function references. Example: The model uses JSONC format (Q-2.2.1, internal/model/loader.go:StripJSONC)
- Team-provided: Marked with (team answer). Example: The project competes with Structurizr and LikeC4 (OQ-6 team answer)
50 team-answer markers across 18 files. This dual-source traceability is unique to the Two-Phase approach.
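Counts like these could be gathered with a short script. The sketch below is an assumption about how one might do it, not the tooling used in the experiment: the two regular expressions are inferred from the two example citations above, and the file names are hypothetical.

```python
import re

# Patterns inferred from the two example citations; the real documents
# may use additional marker variants.
TEAM_MARKER = re.compile(r"\((?:OQ-\d+ )?team answer\)")
QID_REF = re.compile(r"\bQ-\d+(?:\.\d+)*\b")  # \b keeps "OQ-6" from matching


def count_markers(docs):
    """docs maps file name -> text; returns marker and file counts per kind."""
    summary = {}
    for kind, pattern in (("team", TEAM_MARKER), ("qid", QID_REF)):
        per_file = {name: len(pattern.findall(text)) for name, text in docs.items()}
        summary[kind] = {
            "markers": sum(per_file.values()),
            "files": sum(1 for n in per_file.values() if n > 0),
        }
    return summary


# Hypothetical file names; the two marked sentences are the examples above.
docs = {
    "a.adoc": "The model uses JSONC format "
              "(Q-2.2.1, internal/model/loader.go:StripJSONC)",
    "b.adoc": "The project competes with Structurizr and LikeC4 "
              "(OQ-6 team answer)",
    "c.adoc": "No markers here.",
}
summary = count_markers(docs)
print(summary)
```

Counting markers and files separately matches how the result is reported: a total number of markers and the number of files that contain at least one.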
ADRs use [NOTE] blocks to mark reverse-engineered sections while citing team answers directly where provided; the note at the top of this page is an example of such a block.
.6. arc42 Chapter Comparison
| Ch. | Title | Orig | Direct | Socratic | Two-Phase | Winner |
|---|---|---|---|---|---|---|
| 1 | Introduction and Goals | 114 | 36 | 37 | 74 | Two-Phase (priorities restored) |
| 2 | Constraints | 69 | 22 | 21 | 74 | Two-Phase (most complete) |
| 3 | Context and Scope | 143 | 53 | 30 | 81 | Original (most detail) |
| 4 | Solution Strategy | 106 | 59 | 20 | 70 | Original (design patterns) |
| 5 | Building Block View | 139 | 137 | 51 | 154 | Two-Phase (most detailed) |
| 6 | Runtime View | 210 | 138 | 49 | 133 | Original (most scenarios) |
| 7 | Deployment View | 142 | 51 | 35 | 92 | Two-Phase (budgets restored) |
| 8 | Crosscutting Concepts | 190 | 95 | 64 | 143 | Original (more topics) |
| 9 | Decisions | 36 | 10 | 11 | 32 | Two-Phase (correct ADRs) |
| 10 | Quality Requirements | 131 | 21 | 36 | 121 | Two-Phase (closest to Original) |
| 11 | Risks | 66 | 39 | 56 | 119 | Two-Phase (most risks) |
| 12 | Glossary | 54 | 31 | 19 | 97 | Two-Phase (most terms) |
Two-Phase wins 8 of 12 chapters. The Original wins the other 4 (Context, Strategy, Runtime, Concepts), which are the chapters where narrative depth and design pattern knowledge matter most.
At 1,090 lines, Two-Phase reaches 84% of the Original’s 1,300-line arc42 — the closest of any approach.
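The tally and the percentage can be reproduced from the figures already reported. In this sketch the winners are copied from the chapter comparison table and the arc42 totals from "Results at a Glance"; the variable and function names are arbitrary.

```python
from collections import Counter

# Winner per arc42 chapter, copied from the chapter comparison table.
WINNERS = {
    1: "Two-Phase", 2: "Two-Phase", 3: "Original", 4: "Original",
    5: "Two-Phase", 6: "Original", 7: "Two-Phase", 8: "Original",
    9: "Two-Phase", 10: "Two-Phase", 11: "Two-Phase", 12: "Two-Phase",
}

# Total arc42 chapter lines per approach, from "Results at a Glance".
ARC42_LINES = {
    "Original": 1300,
    "Direct (1a)": 642,
    "Socratic (1c)": 429,
    "Two-Phase": 1090,
}


def coverage_percent(approach):
    """arc42 line count relative to the Original, rounded to whole percent."""
    return round(100 * ARC42_LINES[approach] / ARC42_LINES["Original"])


tally = Counter(WINNERS.values())
print(tally["Two-Phase"], tally["Original"])   # chapters won by each
for name in ("Direct (1a)", "Socratic (1c)", "Two-Phase"):
    print(name, coverage_percent(name))
```

Line count is only a proxy for completeness, so the percentage says how much was written, not how much of it is correct.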
.7. What the Two-Phase Approach Solved
| Problem | Direct (1a) | Socratic (1c) | Two-Phase | How |
|---|---|---|---|---|
| ADRs had wrong topics | ❌ | ❌ | ✅ | OQ-4 team answer listed the 5 real ADR topics |
| ADR rationale was guessed | ❌ | Honest (flagged) | ✅ | OQ-5 team answer provided the real +20 Pugh Matrix rationale |
| Quality goals had no priority | ❌ | ❌ | ✅ | OQ-6 team answer provided competitive context |
| Performance budgets missing | ❌ | ❌ | ✅ | OQ-8 team answer provided exact thresholds |
| Business context generic | ❌ | ❌ | ✅ | OQ-1 personas, OQ-3 collaboration scope, OQ-6 draw.io rationale |
| Open Questions unresolved | 33 open | 26 open | 0 open | Team answered all 11, Phase 2 integrated them |
| Claims not traceable to source | No Q-IDs | Q-IDs but no team markers | Both | Dual traceability: code refs + team answer markers |
.8. Threats to Validity
.8.1. Unfair comparison
The Two-Phase approach had an information advantage the other approaches did not: 11 team-answered Open Questions. This means the comparison in the preceding section measures the value of the 11 answers, not the value of the two-phase structure. If we had given the Direct (1a) or Socratic (1c) approaches the same team answers as additional context, they might have improved significantly too.
A fairer experimental design would include a control: run Phase 1 + Phase 2 without team answers (the LLM documents from the Question Tree alone). The difference between that control and the current Two-Phase result would isolate the value of the team answers. The difference between the control and the Direct approach would isolate the value of the two-phase structure.
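The arithmetic of that control design is simple enough to write down. In the sketch below the metric is "ADR topics matching the Original, out of 5": the Direct and Two-Phase scores come from the ADR comparison table, while the control score is a made-up placeholder, because that run has not been performed.

```python
def decompose(direct, control, two_phase):
    """Split the Direct -> Two-Phase improvement into two contributions.

    control is the score of Phase 1 + Phase 2 run WITHOUT team answers.
    """
    return {
        "structure_effect": control - direct,    # two-phase structure alone
        "answers_effect": two_phase - control,   # added by the team answers
    }


DIRECT_MATCHES = 1       # from the ADR table: only DSL Format matched
TWO_PHASE_MATCHES = 5    # from the ADR table: all 5 topics matched
CONTROL_MATCHES = 2      # HYPOTHETICAL placeholder, not a measured result

effects = decompose(DIRECT_MATCHES, CONTROL_MATCHES, TWO_PHASE_MATCHES)
print(effects)
```

By construction the two effects always add up to the total improvement over Direct, so the control run only decides how that total is split.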
What we can say with confidence: the two-phase structure produced the right questions. The 11 Open Questions identified by Phase 1 were precise enough that answering them closed the gap. Whether a single-shot prompt would have asked the same questions is unknown.
.8.2. Glossary inflation
The Two-Phase glossary (97 lines) is disproportionately large compared to the Original (54 lines, mostly placeholder) and Direct (31 lines). Several entries include code references and implementation details that belong in the Data Models spec, not in a glossary. A glossary should define domain terms concisely; the Two-Phase version treats it as a mini-encyclopedia. This inflates the line count without adding proportional value.
.9. What the Two-Phase Approach Still Cannot Do
Even with team answers, four categories of information remain weaker than the Original:
- Narrative depth: Chapters 4 (Solution Strategy) and 8 (Crosscutting Concepts) are shorter because the LLM summarizes patterns rather than explaining them with examples and trade-off discussions.
- Aspirational features: UC-7 Drill-Down Navigation is still missing. The team answers didn’t explicitly mention it, and the code doesn’t implement it. Only the Original spec describes it.
- Tutorials and guides: 06_tutorial.adoc (266 lines) and 07_template_guide.adoc (322 lines) remain absent. These require didactic skill, not just knowledge.
- Historical artifacts: ATAM reviews, security review reports, and E2E test plans are process outputs, not recoverable from code or Q&A.
.10. Conclusion
The Two-Phase Socratic Code Theory Recovery workflow produces the most accurate Brownfield documentation of all tested approaches. It combines:
- The Socratic approach’s honesty (Question Tree, explicit unknowns)
- The Direct approach’s completeness (full arc42, all spec files)
- A new dimension: team knowledge routed through Open Questions
The key insight is that 11 well-targeted questions (identified by recursive decomposition) plus their answers were sufficient to close most of the gap between reverse-engineered documentation and the Original. The Open Questions mechanism is not just an honesty device; it is a precision instrument for knowledge transfer.
A caveat: we have not yet shown that the two-phase structure is better than simply giving the same 11 answers to a single-shot prompt. What the experiment does show is that Phase 1 identifies the right questions to ask. Whether the answer integration in Phase 2 is superior to a direct prompt with answers appended remains to be tested.
| Approach | Best for |
|---|---|
| Direct (1a) | Quick documentation from code alone, no team access |
| Socratic (1c) | Identifying knowledge gaps, audit trails |
| Two-Phase | Production-quality Brownfield documentation with team involvement |
For teams preparing legacy projects for the Dark Factory, the Two-Phase workflow is the recommended approach: Phase 1 identifies exactly what the team needs to provide (typically 10-15 questions), Phase 2 produces documentation that includes both code evidence and human knowledge, with full traceability to both sources.