Phase 1: Strategic Architecture

Agile: Sprint 0 / Inception
Roles: Architect + Product Owner
Human: 65% | LLM: 35%

In brief: This phase creates the documented architectural vision (ADRs + diagrams) that provides the LLM with the “mental map” it naturally lacks. Strategic decisions are made by humans; alternatives are explored with LLM assistance.

Why This Phase Is Critical

The problem without Phase 1: The LLM generates code from generic internet patterns, without understanding the specific context. It proposes solutions disconnected from real needs (e.g., a plain SQL database where vector search is required). Fundamental architectural errors are discovered late and require a complete redesign.

The solution provided: The Strategic Architecture Document creates an explicit representation of the system: why each component exists, which constraints must be respected, which trade-offs are accepted. The LLM receives the business and technical context it cannot guess.

LLM limitations addressed:

No internal representation: ADRs and diagrams create the explicit “mental map” (components, dependencies, impacts)
No understanding of architecture role: Documentation explains why a component exists, what business problem it solves, which architectural trade-offs it represents

Expected impact:

Architectural redesigns avoided thanks to validated decisions before coding
Shorter onboarding time for new developers
Improved code generation quality
Technical debt avoided: Documented decisions prevent “tribal knowledge”

Process Flow

Inputs:

Business requirements and success criteria
Technical constraints (existing systems, standards, performance)
Organizational constraints (team skills, timeline, budget)
Stakeholder priorities
Existing system documentation (if refactoring/extension)

1. Problem Crystallization ⏱️⏱️

Architect 80%, LLM 20%

Architect articulates business problem clearly
LLM helps identify unstated assumptions
Separate symptoms from root causes
Define quantitative success criteria

Output: Precise problem statement with success metrics

2. Constraint Mapping ⏱️

Architect 60%, LLM 40%

Architect identifies constraints from experience
LLM generates comprehensive constraint checklist
Prioritize constraints (mandatory vs. nice-to-have)
Document organizational/political constraints

Output: Prioritized constraint list (technical + organizational)

3. Solution Generation ⏱️⏱️

Architect 50%, LLM 50%

Architect provides 1-2 initial solution directions
LLM generates 2-3 alternative approaches
Ensure at least one innovative/unconventional approach
Document each approach with architecture diagrams

Output: 3-4 solution approaches with C4 diagrams

4. Trade-off Analysis ⏱️⏱️

Architect 70%, LLM 30%

Architect evaluates approaches against business priorities
LLM generates trade-off comparison matrix
Identify risks and mitigation strategies per approach
Architect makes final decision with documented justification

Output: Trade-off matrix + justified decision

5. Stakeholder Validation ⏱️⏱️

Human 90%, LLM 10%

Present recommendation to Product Owner
Discuss trade-offs, validate business alignment
Revisions based on feedback (1-2 cycles may be needed)
Formal final approval

Output: Approved Architecture Document

Strategic Architecture Document Deliverable

Length: 2,000-4,000 words

Sections:

Executive Summary (~200 words): Problem + recommended solution
Problem Definition: Root cause analysis, success criteria
Constraints: Technical, organizational, timeline, budget
Solution Approaches: 3-4 approaches with C4 level 1-2 diagrams
Trade-off Analysis: Comparison matrix, risk evaluation
Recommendation: Chosen approach with explicit justification
Success Metrics: How to measure if solution succeeds
Risk Mitigation: Top 3-5 risks and strategies

Definition of Done

This phase is considered complete when:

The business problem is clearly articulated with quantified success criteria
At least 3 ADRs (Architecture Decision Records) document major decisions
A high-level architecture diagram exists (C4 levels 1-2)
Critical technical and organizational constraints are identified and prioritized
The chosen solution includes clear justification of trade-offs vs. alternatives
Main architectural risks are documented with mitigation strategies
The Product Owner validates that the proposed solution meets business needs
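
The checklist above can be tracked mechanically. The following is a minimal sketch, not part of the method itself: the `Phase1Status` fields and the `unmet_dod_items` helper are hypothetical names chosen for illustration, and the boolean fields assume a human has already judged each item.

```python
from dataclasses import dataclass


@dataclass
class Phase1Status:
    """Hypothetical summary of where the Phase 1 deliverables stand."""
    success_criteria_quantified: bool
    adr_count: int
    has_c4_level_1_2_diagram: bool
    constraints_prioritized: bool
    tradeoffs_justified: bool
    risks_have_mitigations: bool
    po_approved: bool


def unmet_dod_items(status: Phase1Status) -> list[str]:
    """Return the Definition of Done items that are not yet satisfied."""
    checks = [
        (status.success_criteria_quantified, "quantified success criteria"),
        (status.adr_count >= 3, "at least 3 ADRs"),
        (status.has_c4_level_1_2_diagram, "C4 level 1-2 diagram"),
        (status.constraints_prioritized, "prioritized constraints"),
        (status.tradeoffs_justified, "justified trade-offs"),
        (status.risks_have_mitigations, "risks with mitigation strategies"),
        (status.po_approved, "Product Owner validation"),
    ]
    return [label for passed, label in checks if not passed]


if __name__ == "__main__":
    draft = Phase1Status(True, 2, True, True, True, True, False)
    print(unmet_dod_items(draft))
```

An empty list means the phase can be closed; anything else names what is still missing.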

Going Deeper

See concrete examples, prompts and detailed ADRs

Complete Example: Nutritional Recommendation System

Business Context

A nutrition application wants to recommend foods based on user profile (allergies, preferences, health goals). Currently: static list by category. Goal: personalized real-time recommendations.

1. Problem Crystallization

Initial statement (vague): “We want to improve food recommendations to make the app more useful.”

After crystallization with LLM:

Precise problem: Users abandon the app (Day-7 retention rate: 12%) because current recommendations don’t account for their constraints (peanut allergies, vegetarian diet, weight loss goal). 35% of users report “irrelevant suggestions” in surveys.

Root causes:

Current system: static if/then logic (120 lines of spaghetti code)
No similarity consideration between foods
No confidence score on recommendations
No learning mechanism (user feedback ignored)

Quantified success criteria:

Day-7 retention rate: 12% → 25% (+108%)
Recommendation satisfaction: 2.1/5 → 4.0/5
API response time: < 200ms (95th percentile)
Allergy/restriction coverage: 100% (vs. 60% current)
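
The percentage in the retention criterion is a relative change from the baseline. As a small sketch (the helper names `relative_change` and `format_criterion` are illustrative, not part of the method), it can be reproduced like this:

```python
def relative_change(baseline: float, target: float) -> float:
    """Relative change from baseline to target, in percent."""
    return (target - baseline) / baseline * 100


def format_criterion(name: str, baseline: float, target: float, unit: str = "") -> str:
    """Render a criterion as 'name: baseline -> target (+x%)' using the arrow notation above."""
    change = relative_change(baseline, target)
    return f"{name}: {baseline:g}{unit} → {target:g}{unit} ({change:+.0f}%)"


if __name__ == "__main__":
    # (25 - 12) / 12 = 1.083..., i.e. +108%
    print(format_criterion("Day-7 retention rate", 12, 25, "%"))
```

Writing the baseline next to the target makes it obvious when a criterion has no current measurement yet.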

2. Constraint Mapping

Mandatory technical constraints:

Backend: Python 3.11+ (existing stack)
API latency: < 200ms p95 (UX critical)
Database: PostgreSQL existing (migration forbidden due to cost)
Volume: 50,000 active users, 10M foods in DB

Organizational constraints:

Team: 2 backend developers, 1 frontend developer (no data scientist)
Timeline: MVP in 6 weeks (business deadline)
Infrastructure budget: +$500/month max
No in-house ML/AI expertise

Nice-to-have constraints:

Recommendation explainability (why this food?)
Basic offline mode (cache recent recommendations)
Multilingual support (FR/EN)
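
One way to keep the constraint list usable in later prompts is to store it as structured data. This is a sketch under assumptions: the `Constraint` and `Priority` types are hypothetical, and tagging the nice-to-have items as "technical" is an illustrative choice the example above does not make.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    MANDATORY = 0
    NICE_TO_HAVE = 1


@dataclass(frozen=True)
class Constraint:
    description: str
    kind: str  # "technical" or "organizational"
    priority: Priority


def prioritized(constraints: list[Constraint]) -> list[Constraint]:
    """Mandatory constraints first; the sort is stable, so input order is kept within a group."""
    return sorted(constraints, key=lambda c: c.priority)


CONSTRAINTS = [
    Constraint("Recommendation explainability", "technical", Priority.NICE_TO_HAVE),
    Constraint("API latency < 200ms p95", "technical", Priority.MANDATORY),
    Constraint("MVP in 6 weeks", "organizational", Priority.MANDATORY),
    Constraint("Infrastructure budget +$500/month max", "organizational", Priority.MANDATORY),
]

if __name__ == "__main__":
    for c in prioritized(CONSTRAINTS):
        print(f"[{c.priority.name}] ({c.kind}) {c.description}")
```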

3. Solution Generation

Approach 1: Advanced Rules System (Architect)

Architecture:
┌──────────────────┐
│  User Profile    │
│  (allergies,     │
│   restrictions)  │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐      ┌──────────────────┐
│  Rules Engine    │─────▶│  Food Database   │
│  (500+ rules)    │      │  (PostgreSQL)    │
└────────┬─────────┘      └──────────────────┘
         │
         ▼
┌──────────────────┐
│  Ranked Results  │
└──────────────────┘

Advantages:

Fully explainable (rules = justification)
No ML expertise needed
Predictable latency (< 50ms)

Disadvantages:

Maintenance nightmare (500+ rules to maintain)
No learning (user feedback lost)
Limited scalability (rules explosion with complexity)

Complexity: Medium (simple implementation, high maintenance)
Risks: Major technical debt after 1 year

Approach 2: Vector Search + Filtering (LLM - innovative)

Architecture:
┌──────────────────┐
│  User Profile    │
│  embeddings      │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐      ┌──────────────────┐
│  Vector Search   │      │  pgvector        │
│  (cosine sim)    │─────▶│  (PG extension)  │
└────────┬─────────┘      └──────────────────┘
         │
         ▼
┌──────────────────┐
│  Hard Filters    │
│  (allergies)     │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Ranked Results  │
│  + confidence    │
└──────────────────┘

Advantages:

Semantic similarity (finds related foods)
No rule maintenance (embeddings learned)
Natural confidence scores (cosine distance)
Uses existing PostgreSQL (pgvector extension)

Disadvantages:

Embeddings to generate (initial cost)
Reduced explainability (cosine distance ≠ business reason)
Team must learn vector concepts

Complexity: Medium-High (technical novelty)
Risks: Team learning curve, embedding quality

Approach 3: External ML API (LLM - alternative)

Architecture:
┌──────────────────┐
│  User Profile    │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐      ┌──────────────────┐
│  API Gateway     │      │  OpenAI API      │
│  (1h cache)      │─────▶│  (embeddings)    │
└────────┬─────────┘      └──────────────────┘
         │
         ▼
┌──────────────────┐
│  Post-processing │
│  (filters)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Ranked Results  │
└──────────────────┘

Advantages:

Zero in-house ML expertise needed
Embedding quality guaranteed (OpenAI)
Minimal maintenance

Disadvantages:

Recurring cost (~$2000/month estimated)
External dependency (API availability)
Variable latency (network)
User data sent externally (privacy concerns?)

Complexity: Low (simple API call)
Risks: Costs explode with volume, vendor lock-in

4. Trade-off Analysis

Comparison matrix:

Dimension           | Advanced Rules       | Vectors (pgvector)      | External ML API
--------------------|----------------------|-------------------------|----------------------
Dev time            | 4 weeks              | 5-6 weeks               | 2-3 weeks
Tech complexity     | 4/10                 | 7/10                    | 3/10
Maintainability     | 2/10 (rules explode) | 8/10 (auto-learning)    | 9/10 (outsourced)
Performance         | 9/10 (< 50ms)        | 7/10 (< 150ms)          | 5/10 (200ms+ network)
Infrastructure cost | $50/month            | $200/month              | $2000/month
Risk                | High (tech debt)     | Medium (learning curve) | Medium (dependency)
Scalability         | Low                  | High                    | High
Explainability      | 10/10 (clear rules)  | 4/10 (cosine distance)  | 3/10 (black box)
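
The numeric rows of the matrix can be combined into a single weighted score per approach. The sketch below is illustrative only: the integer weights are assumptions (the example does not state any), and complexity is inverted into a "simplicity" score so that higher is always better.

```python
# 1-10 scores copied from the comparison matrix; complexity is "lower is better".
MATRIX = {
    "Advanced Rules": {"maintainability": 2, "performance": 9, "explainability": 10, "complexity": 4},
    "pgvector": {"maintainability": 8, "performance": 7, "explainability": 4, "complexity": 7},
    "External ML API": {"maintainability": 9, "performance": 5, "explainability": 3, "complexity": 3},
}

# Hypothetical weights reflecting a strong preference for maintainability.
WEIGHTS = {"maintainability": 5, "performance": 2, "explainability": 1, "simplicity": 2}


def weighted_scores(matrix: dict, weights: dict) -> dict:
    """Sum weight * score per approach, with simplicity = 10 - complexity."""
    totals = {}
    for name, row in matrix.items():
        scores = {
            "maintainability": row["maintainability"],
            "performance": row["performance"],
            "explainability": row["explainability"],
            "simplicity": 10 - row["complexity"],
        }
        totals[name] = sum(weights[dim] * scores[dim] for dim in weights)
    return totals


if __name__ == "__main__":
    print(weighted_scores(MATRIX, WEIGHTS))
    print(weighted_scores(MATRIX, {dim: 1 for dim in WEIGHTS}))
```

With these assumed weights the totals are 50 (rules), 64 (pgvector) and 72 (external API); with equal weights they are 27, 22 and 24. The ranking is therefore sensitive to the weights, and the scores alone do not encode hard limits such as the $500/month infrastructure cap, which is what rules out the external API.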

Recommendation: Approach 2 (pgvector Vectors)

Justification:

Respects budget constraints: $200/month fits the +$500/month infrastructure cap; the external API (~$2000/month) does not
Long-term scalability: Avoids technical debt from rules
Uses existing stack: pgvector = PostgreSQL extension (no migration)
Acceptable timeline: 5-6 weeks vs. 4 for rules, still within the 6-week MVP deadline
Learning investment: Team gains a reusable skill (embeddings and vector search)

Accepted trade-offs:

Technical complexity: 7/10 vs. 4/10 for simple rules (+3 points, i.e. +30% of the 10-point scale)
Reduced explainability (but confidence scores partially compensate)
Team learning curve (mitigated by documentation + 2-day training)

Rejected trade-offs:

External API: recurring cost (~$2000/month) is 10x the pgvector option and 4x the infrastructure budget
Advanced rules: unacceptable technical debt (explosive maintenance)
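
Before any scoring, each approach can be screened against the mandatory constraints from step 2. This is a sketch under stated assumptions: the `Approach` type is hypothetical, dev time uses the upper end of each estimate, and the latency figures are the bounds quoted in the matrix (with "200ms+" taken as 200).

```python
from dataclasses import dataclass

# Mandatory limits from the constraint mapping step.
MAX_DEV_WEEKS = 6
MAX_INFRA_COST = 500      # USD per month
MAX_P95_LATENCY_MS = 200  # must be strictly below this


@dataclass(frozen=True)
class Approach:
    name: str
    dev_weeks: int
    infra_cost: int
    latency_ms: int


def violations(approach: Approach) -> list[str]:
    """List the mandatory constraints this approach breaks."""
    broken = []
    if approach.dev_weeks > MAX_DEV_WEEKS:
        broken.append("timeline")
    if approach.infra_cost > MAX_INFRA_COST:
        broken.append("budget")
    if approach.latency_ms >= MAX_P95_LATENCY_MS:
        broken.append("latency")
    return broken


APPROACHES = [
    Approach("Advanced Rules", 4, 50, 50),
    Approach("pgvector", 6, 200, 150),
    Approach("External ML API", 3, 2000, 200),
]

if __name__ == "__main__":
    for a in APPROACHES:
        print(a.name, violations(a) or "passes")
```

Under these readings, only the external API breaks a mandatory constraint (budget and latency); the choice between rules and pgvector is then a matter of trade-offs, not feasibility.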

5. Produced ADRs

ADR-001: Use pgvector for Similarity Search

Status: Accepted
Date: 2025-12-28
Decision Makers: Architect + Product Owner

Context: Current recommendation system (static if/then rules) doesn’t scale. Need semantic similarity between foods for relevant recommendations.

Decision: Use pgvector extension in existing PostgreSQL for vector-based search with cosine similarity.

Alternatives considered:

Advanced Rules System: Rejected (explosive maintenance, not scalable)
External ML API (OpenAI): Rejected (~$2000/month exceeds the $500/month infrastructure budget)
Dedicated vector database migration (Pinecone): Rejected (migrating off PostgreSQL is forbidden due to cost)

Consequences:

Uses existing infrastructure (PostgreSQL)
Acceptable infrastructure cost ($200/month)
Semantic similarity without rule maintenance
Team must learn embedding concepts (2-day training)
Reduced explainability vs. rules (compensated by confidence scores)
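
To make the decision concrete, here is a sketch of what a cosine-similarity query could look like. pgvector's `<=>` operator returns cosine distance, so `1 - distance` gives a similarity usable as a confidence score. The table name `foods`, its columns, and the psycopg-style named placeholders are assumptions for illustration, not part of the ADR.

```python
def to_vector_literal(values: list[float]) -> str:
    """Format a Python list in pgvector's text form, e.g. '[0.5,1.0]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def build_similarity_query() -> str:
    """SQL ranking foods by cosine distance to the user profile embedding.

    Assumes a hypothetical table foods(id, name, embedding vector(384)).
    """
    return (
        "SELECT id, name, "
        "1 - (embedding <=> %(profile)s::vector) AS confidence "
        "FROM foods "
        "ORDER BY embedding <=> %(profile)s::vector "
        "LIMIT %(limit)s"
    )


if __name__ == "__main__":
    params = {"profile": to_vector_literal([0.5, 1.0]), "limit": 10}
    print(build_similarity_query())
    print(params)
```

Passing the embedding as a bound parameter rather than interpolating it into the SQL string keeps the query safe to reuse across users.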

ADR-002: Generate Embeddings via Local sentence-transformers

Status: Accepted
Date: 2025-12-28

Context: Need to generate embeddings for 10M foods + user profiles. Choice: external API (OpenAI) vs. local model.

Decision: Use sentence-transformers (model all-MiniLM-L6-v2) locally.

Justification:

No per-request API cost after initial generation (vs. ~$2000/month for the external API)
Predictable latency (no network dependency)
Privacy: data stays on-premise
Performance sufficient (384 dimensions, quality acceptable for nutrition domain)

Consequences:

Costs controlled long-term
Stable latency (< 50ms embedding generation)
Initial generation of 10M embeddings = 8h compute (one-time)
Embedding quality likely lower than OpenAI's hosted embedding models (acceptable for MVP)
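
The figures in this ADR imply a required throughput and a storage footprint that are worth checking before committing. The arithmetic below is a back-of-the-envelope sketch: the batch size of 256 is an assumption, and the storage figure counts only raw 4-byte floats, before any index or row overhead.

```python
TOTAL_FOODS = 10_000_000
COMPUTE_HOURS = 8
DIMENSIONS = 384          # all-MiniLM-L6-v2 output size
BYTES_PER_FLOAT = 4       # single-precision float


def required_throughput(total: int, hours: float) -> float:
    """Embeddings per second needed to finish within the time budget."""
    return total / (hours * 3600)


def batch_count(total: int, batch_size: int) -> int:
    """Number of batches needed, rounding up."""
    return -(-total // batch_size)


def raw_vector_bytes(total: int, dimensions: int) -> int:
    """Raw size of all vectors, excluding index and row overhead."""
    return total * dimensions * BYTES_PER_FLOAT


if __name__ == "__main__":
    print(f"{required_throughput(TOTAL_FOODS, COMPUTE_HOURS):.0f} embeddings/s")
    print(f"{batch_count(TOTAL_FOODS, 256)} batches of 256")
    print(f"{raw_vector_bytes(TOTAL_FOODS, DIMENSIONS) / 1e9:.2f} GB of raw vectors")
```

Roughly 347 embeddings per second and about 15 GB of raw vector data are the numbers the 8-hour estimate and the infrastructure budget have to accommodate.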

ADR-003: Hard Filtering of Allergies BEFORE Vector Search

Status: Accepted
Date: 2025-12-28

Context: Users with allergies (peanuts, gluten, lactose) MUST NEVER receive dangerous recommendations, even if vector similarity is high.

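Decision: Apply allergy filters as PostgreSQL WHERE clauses BEFORE the vector search, so unsafe foods never enter the candidate set. Safety > relevance. This reverses the order drawn in the Approach 2 diagram, where hard filters follow the search.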

Justification:

Safety criticality: Allergen recommendation = health risk
Performance: SQL filtering very fast (index)
Certainty: Hard filters = 100% guarantee (vs. probabilistic ML)

Consequences:

Allergy safety guaranteed
Health regulatory compliance
Reduced candidate pool (may impact diversity)
Allergen list maintenance (DB table up-to-date)
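
The filter-then-rank order can be illustrated without a database. This in-memory sketch uses made-up two-dimensional embeddings and hypothetical field names; in the real system the filter would be a WHERE clause and the ranking would be done by pgvector.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(profile: list[float], user_allergens: set[str], foods: list[dict], k: int = 3) -> list[str]:
    """Drop foods sharing any allergen with the user, then rank the rest by similarity."""
    safe = [f for f in foods if not (f["allergens"] & user_allergens)]
    ranked = sorted(safe, key=lambda f: cosine_similarity(profile, f["embedding"]), reverse=True)
    return [f["name"] for f in ranked[:k]]


# Toy data: embeddings are invented for the example.
FOODS = [
    {"name": "peanut butter", "embedding": [1.0, 0.0], "allergens": {"peanuts"}},
    {"name": "almond butter", "embedding": [0.9, 0.1], "allergens": {"tree nuts"}},
    {"name": "hummus", "embedding": [0.6, 0.4], "allergens": set()},
    {"name": "rice", "embedding": [0.0, 1.0], "allergens": set()},
]

if __name__ == "__main__":
    print(recommend([1.0, 0.0], {"peanuts"}, FOODS, k=2))
```

Peanut butter is the closest match to the profile vector, yet it never reaches the ranking step for a user with a peanut allergy; this is the "reduced candidate pool" consequence in practice.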

Recommended Prompts

Prompt 1: Problem Crystallization

I'm designing a solution for [business problem]. Here's the initial statement:

PROBLEM:
[paste vague/initial problem description]

CONTEXT:
- Users: [who]
- Current system: [how it works today]
- Main complaints: [user feedback]

Help me crystallize this problem by:

1. **Identify unstated assumptions**: What assumptions have I made
   that aren't explicit?

2. **Separate symptoms from root causes**:
   - Observed symptoms: [what users see]
   - Probable root causes: [why it happens]

3. **Suggest 3-5 QUANTITATIVE success criteria**:
   - Format: Current metric → Target metric (+% improvement)
   - Examples: "Day-7 retention rate: 12% → 25% (+108%)"

4. **Scope drift risks**: What aspects could dangerously expand
   the project?

Response format: Structured markdown with clear sections.

Prompt 2: Solution Generation

Given this problem definition and constraints:

PROBLEM:
[paste crystallized problem]

MANDATORY CONSTRAINTS:
- Technical: [ex: Python 3.11+, latency < 200ms, existing PostgreSQL]
- Organizational: [ex: team 3 devs, 6 weeks, no ML expertise]
- Budget: [ex: +$500/month infrastructure max]

BUSINESS PRIORITIES:
[ex: 1. User retention, 2. Time-to-market, 3. Operating costs]

I've identified this initial approach:

ARCHITECT APPROACH:
[paste initial solution 2-3 paragraphs]

Generate 2-3 ALTERNATIVE solution approaches that:
- Address the same problem differently
- Work within stated constraints
- **INCLUDE at least ONE innovative/unconventional approach**
- Are technically feasible for this context

For EACH solution, provide:

1. **Approach name** (short descriptive)
2. **High-level architecture**:
   - Main components (3-5 max)
   - Interactions between components (data flow)
   - Simple ASCII diagram
3. **Key advantages** (3-5 points)
4. **Key disadvantages** (3-5 points)
5. **Estimated complexity**: Low / Medium / High (justify)
6. **Main risks** (top 2-3 with impact)
7. **Dev time estimation**: X weeks (acceptable range)

Format: Markdown, one section per approach.

Prompt 3: Trade-off Analysis

Create a trade-off comparison matrix for these approaches:

SOLUTION APPROACHES:
[paste 3-4 previously generated approaches]

Compare along these dimensions (rate 1-10 or Low/Medium/High):

1. **Development time**: Weeks estimated
2. **Technical complexity**: 1-10 (1 = trivial, 10 = expert required)
3. **Maintainability**: Long-term maintenance cost (Low/Med/High)
4. **Performance**: Meets requirements? Estimated latency
5. **Risk level**: Low/Medium/High (technical + business)
6. **Organizational fit**: Team skills, culture
7. **Scalability**: Future growth (users, data)
8. **Infrastructure cost**: $/month estimated
9. **Explainability**: Can we explain results to users? (1-10)

For each dimension, briefly explain the rating (1-2 sentences).

Then RECOMMEND which approach best balances trade-offs
for this context:

BUSINESS PRIORITIES (in order of importance):
[paste priorities: ex: 1. Retention, 2. Time-to-market, 3. Costs]

NON-NEGOTIABLE CONSTRAINTS:
[paste hard constraints]

Recommendation format:
- **Recommended approach**: [which one]
- **Justification**: Why this approach (5-7 points)
- **Accepted trade-offs**: What we consciously sacrifice
- **Rejected trade-offs**: What we refuse to sacrifice

Quality Standards

Good Architecture Document

Characteristics:

Quantified problem: “Day-7 retention: 12%” not “poor retention”
3+ alternatives evaluated: No single decision without comparison
Explicit trade-offs: “We accept 30% complexity increase to avoid technical debt”
Concrete ADRs: Documented decisions with rejected alternatives
Clear C4 diagrams: Components + interactions visible
Formal PO validation: Signature/explicit approval

Example good documented decision:

ADR-001: Use pgvector

Alternatives considered:
1. Advanced rules (rejected: technical debt)
2. OpenAI API (rejected: $2000/month)
3. Pinecone migration (rejected: migration cost)

Decision: pgvector
Justification: Balances cost/performance/maintenance
Accepted trade-off: 30% complexity increase, learning curve
Consequences: 2-day team training, embedding documentation

Bad Architecture Document

Problems:

Vague problem: “Improve the app” without metrics
Single solution: No evaluated alternatives
Technology bias: “Let’s use GraphQL” without justifying why
Hidden trade-offs: Only benefits, no downsides
Ignored constraints: Theoretically perfect solution, team can’t execute it
No PO validation: Architect decides alone

Example bad decision:

We'll use GraphQL.

[End. No alternatives, no why, no trade-offs]

→ Impossible to understand context or challenge the decision

Common Pitfalls

1. Single-Approach Analysis

Problem: Architect evaluates only their preferred solution. No true alternatives. False impression of objective choice.

Solution:

Minimum 3 approaches: Architect + 2 LLM alternatives
One “crazy” approach: Force at least one innovative/unconventional option
Honest evaluation: Each approach has advantages AND disadvantages
Comparison matrix: Bias is harder to hide when every dimension is listed for every approach

DoD Check #5: “Chosen solution includes justification vs. alternatives”

2. Technology-First Thinking

Problem: “Let’s use Kubernetes!” before understanding the problem. Choosing the tech stack before the needs are known is a reliable route to over-engineering.

Example:

Architect: "We'll do microservices with Kubernetes"
Reality: 3 users/day, a one-developer team
→ Massive over-engineering

Solution:

Strict order: Problem → Constraints → Solutions → Technologies
Tech justification: Every technology choice must address a specific constraint
YAGNI: “You Aren’t Gonna Need It” - default to simplicity

Good process:

1. Problem: API latency > 500ms (frustrated users)
2. Constraint: Latency must drop below 200ms
3. Solution: In-memory cache
4. Technology: Redis (justified for speed)

3. Ignoring Organizational Constraints

Problem: The design is technically sound on paper, but the team can’t execute it. For example, the architecture requires ML expertise and the team has none.

Real example:

Architecture: Custom deep learning recommendation system
Actual team: 2 Python backend devs, zero ML
Timeline: 6 weeks
→ Impossible. Project fails.

Solution:

Skills mapping: Who can actually do what
Budgeted learning: If a technology is new to the team, add training time to the plan
Realistic alternatives: Replace the “perfect but impossible” solution with a “good and feasible” one

DoD Check #4: “Technical AND organizational constraints identified”

4. Analysis Paralysis

Problem: The LLM generates 15 approaches. The architect spends 2 weeks analyzing them all. The deadline slips and no decision is ever made.

Solution:

Strict limit: 3-4 approaches max
Timeboxing: Phase 1 = 1-2 days, not 2 weeks
“Good enough” > “Perfect”: Aim for 80% certainty, not 100%
Bias toward deciding: Better to decide quickly and adjust than to analyze forever

Problem signal: Phase 1 lasts > 3 days = analysis paralysis

5. Vague Success Criteria

Problem: “Improve performance”, “increase user satisfaction”. Impossible to measure success or failure.

Vague examples:

"Make the app faster"
"Increase quality"
"Improve user experience"

Solution:

Quantified metrics: Precise numbers
Current baseline: Where we start
Specific target: Where we want to go
% improvement: Magnitude of change

Good examples:

"Reduce p95 latency: 500ms → < 200ms (-60%)"
"Day-7 retention rate: 12% → 25% (+108%)"
"Satisfaction score: 2.1/5 → 4.0/5 (+90%)"

DoD Check #1: “Business problem with quantified success criteria”

6. Missing Stakeholder Buy-in

Problem: The architect decides the architecture alone and presents a “fait accompli” to the Product Owner. The PO discovers the solution is misaligned with business priorities, and Phase 1 has to be redone.

Solution:

PO involved from problem crystallization: Validates success criteria
Collaborative trade-off review: PO arbitrates conflicting priorities
Formal approval: PO signs document before Phase 2
Feedback loops: 1-2 revision cycles are normal

DoD Check #7: “Product Owner validates solution meets business needs”

Standard ADR Format

Reusable template:

# ADR-XXX: [Decision Title]

**Status**: [Proposed / Accepted / Rejected / Deprecated / Superseded]
**Date**: YYYY-MM-DD
**Decision Makers**: [Who decided]

## Context

[Why is this decision necessary? What problem does it solve?
2-4 paragraphs of business and technical context.]

## Decision

[What decision was made? Clear and concise statement. 1-2 paragraphs.]

## Alternatives Considered

### Alternative 1: [Name]
**Advantages**: [2-3 points]
**Disadvantages**: [2-3 points]
**Reason for rejection**: [Why not chosen]

### Alternative 2: [Name]
[Same structure]

## Justification

[Why this decision versus alternatives? Reference business priorities.
3-5 justified paragraphs.]

## Consequences

**Positive**:
- [Positive consequence 1]
- [Positive consequence 2]

**Negative / Accepted Trade-offs**:
- [Trade-off 1]
- [Trade-off 2]

**Required Actions**:
- [ ] [Action 1 to implement decision]
- [ ] [Action 2]

## References

- [Link to architecture document]
- [Link to Slack/email discussion]
- [Performance benchmark]
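
If ADRs are generated or validated by tooling, the template can be mirrored in a small data structure. This is a partial sketch covering only the header, Context, Decision and Alternatives sections; the `ADR` and `Alternative` class names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Alternative:
    name: str
    rejection_reason: str


@dataclass
class ADR:
    number: int
    title: str
    status: str
    decided_on: date
    decision_makers: str
    context: str
    decision: str
    alternatives: list[Alternative] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the record following the headings of the template above."""
        lines = [
            f"# ADR-{self.number:03d}: {self.title}",
            "",
            f"**Status**: {self.status}",
            f"**Date**: {self.decided_on.isoformat()}",
            f"**Decision Makers**: {self.decision_makers}",
            "",
            "## Context",
            "",
            self.context,
            "",
            "## Decision",
            "",
            self.decision,
            "",
            "## Alternatives Considered",
            "",
        ]
        for index, alt in enumerate(self.alternatives, start=1):
            lines.append(f"### Alternative {index}: {alt.name}")
            lines.append(f"**Reason for rejection**: {alt.rejection_reason}")
            lines.append("")
        return "\n".join(lines)


if __name__ == "__main__":
    adr = ADR(
        1, "Use pgvector for Similarity Search", "Accepted", date(2025, 12, 28),
        "Architect + Product Owner",
        "Static if/then rules don't scale.",
        "Use the pgvector extension in the existing PostgreSQL.",
        [Alternative("Advanced Rules System", "explosive maintenance")],
    )
    print(adr.to_markdown())
```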

Next step: Phase 2: Tactical Planning + Critical Handoff →

Need help? See the Roles and Responsibilities document to clarify who does what in this phase.