Teaching Claude Code to Remember | A Persistent Context Architecture for SEs

The Problem

Every session starts from zero

Out of the box, Claude Code has no persistent memory. No matter how many hours you've spent together, tomorrow it won't remember a thing.

Context Window Bloat

A monolithic CLAUDE.md that dumps 15,000 tokens of instructions every session. Deployment details when you're writing emails. Slide rendering rules when you're debugging Apex. 7.5% of a 200K-token context window gone before you type a word.

Groundhog Day

"I'm a Solution Engineer working on Salesforce. Here's my project structure. Don't use these words. Remember the 70/30 rule..." Every. Single. Session. The first 10 minutes are always re-orientation.

No Compound Learning

Claude makes the same mistake twice. Uses a deprecated product name you corrected last week. Suggests a pattern you've already rejected. Knowledge from yesterday's session vanishes overnight.

The real cost isn't time. It's that Claude can never get better at working with you specifically. Without persistent context, every session is a first date with a genius who has amnesia.

The Solution

Five-tier persistent context

Instead of one massive instruction file, split context into five tiers. Load what you need, when you need it. The right information at the right time.

P1 · Identity: CLAUDE.md (always loaded, ~2,800 tokens)
Who you are, essential rules (anti-slop, anti-hallucination), wiki routing table. The irreducible minimum.

P2 · Session Init: Hooks (auto on start, ~200 tokens)
A Python script analyzes your directory, git history, and branch name, then suggests the right wiki pages to load.

P3 · Knowledge Base: Wiki (on demand, variable size)
51 pages of deep reference: architecture patterns, deployment runbooks, design systems. Loaded only when the task needs them.

P4 · Long-Term Memory: Files (when recalled, ~2,000 tokens)
128 topic files: decisions made, mistakes corrected, project status, feedback. The MEMORY.md index is auto-loaded; details are read on demand.

P5 · Behavioral Rules: Guardrails (always active, 10 files)
Security governance, architecture standards, testing quality, communication style, context-mode policy, data protection. Loaded from ~/.claude/rules/ every session.

Context window at session start — before vs. after

Before (monolithic CLAUDE.md): ~15,000 tokens (7.5%) burned upfront

After (tiered architecture: P1 + P2 + P4 index): ~5,000 tokens (2.5%)

3x less overhead. The remaining 97.5% of your context window is available for actual work: code, conversation, and output.
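The percentages follow from simple arithmetic. Here is a quick sketch to reproduce them, assuming a 200,000-token context window (an assumption inferred from 15,000 tokens being quoted as 7.5%):

```python
# Token budget arithmetic for the before/after comparison.
# CONTEXT_WINDOW is an assumption inferred from "15,000 tokens (7.5%)".
CONTEXT_WINDOW = 200_000

monolithic = 15_000
tiers = {
    "P1 CLAUDE.md": 2_800,
    "P2 session-init output": 200,
    "P4 MEMORY.md index": 2_000,
}
tiered = sum(tiers.values())


def share(tokens: int) -> float:
    """Percentage of the context window consumed by `tokens`."""
    return 100 * tokens / CONTEXT_WINDOW


print(f"before: {monolithic} tokens = {share(monolithic):.1f}% of window")
print(f"after:  {tiered} tokens = {share(tiered):.1f}% of window")
print(f"overhead ratio: {monolithic / tiered:.1f}x")
```

P3 and P5 are left out of the sum on purpose: the wiki loads on demand, and the article gives a file count for the rules but no token count.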

Comparison

Before and after

The same person, the same projects, the same Claude Code. One has persistent context architecture. The other doesn't.

Before — Out-of-the-Box Claude Code

10 min ramp-up every session

Same corrections repeated

Context fills fast (3x sooner)

Quality varies session to session

After — Persistent Context Architecture

Start working immediately

Corrections stick forever

3-4x longer conversations

Knowledge compounds every session

Deep Dive

The five components

Each piece is simple on its own. Together they create a system that's worth more than the sum of its parts.

# What goes in CLAUDE.md
Section 1: Owner identity (name, role, context)
Section 2: Environment (OS, editor, model, GCP project)
Section 3: Essential rules — only the ones Claude needs EVERY time
• Anti-slop banned words (50+)
• Anti-hallucination golden rule
• Product name corrections
• Visual design philosophy
Section 4: Wiki routing table (11 work contexts → wiki pages)
Section 5: Project inventory (names, repos, status only)
Section 6: Available commands (table format)

Key principle: CLAUDE.md answers "who am I and where do I look things up?" — not "here's everything I'll ever need." The routing table is the critical innovation: it tells Claude which wiki page to load for any given task, without loading the content upfront.
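To make the routing idea concrete, here is a minimal sketch of the lookup a routing table encodes. In the real setup the table is plain markdown that Claude reads, so no code runs at all. The `ROUTING_TABLE` name, the `pages_for` helper, and every row except "slides" are hypothetical placeholders.

```python
from typing import Dict, List

# Hypothetical routing table: work context -> wiki pages to load on demand.
# Only the "slides" row comes from the article; the other rows are placeholders.
ROUTING_TABLE: Dict[str, List[str]] = {
    "slides": ["pitch-craft.md", "slide-visual-design.md"],
    "apex": ["apex-patterns.md"],         # placeholder page name
    "deployment": ["deploy-runbook.md"],  # placeholder page name
}


def pages_for(task_description: str) -> List[str]:
    """Return wiki pages whose context keyword appears in the task text."""
    text = task_description.lower()
    pages: List[str] = []
    for context, wiki_pages in ROUTING_TABLE.items():
        if context in text:
            for page in wiki_pages:
                if page not in pages:
                    pages.append(page)
    return pages


print(pages_for("Fix the slides for Friday"))
```

The point of the design is that the table costs a few lines of context, while the pages it points to cost nothing until a task matches.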

6 Projects · 19 Concepts · 7 Tools · 5 Insights

# wiki/ directory structure
wiki/
  index.md              # Master catalog (auto-maintained)
  inbox.md              # Auto-capture queue
  concepts/             # Architecture, patterns, systems
  projects/             # Project overviews and status
  tools/                # Tool configs, MCP servers, CLIs
  entities/             # Companies, people, accounts
  insights/             # Strategic observations, theses

How it grows: At the end of each session, Claude checks if it learned anything worth preserving. If so, it appends a one-liner to wiki/inbox.md. The next session processes inbox items into proper wiki pages. Knowledge compounds automatically.
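The capture step is small enough to sketch. In the article's setup Claude itself appends the line; this is a rough Python equivalent, where the entry format (dash, ISO date, note) and the `format_entry` and `capture` names are assumptions, not the author's actual convention.

```python
from datetime import date
from pathlib import Path


def format_entry(note: str, today: date) -> str:
    """Build a dated one-liner; newlines and repeated spaces are collapsed."""
    one_liner = " ".join(note.split())
    return f"- {today.isoformat()}: {one_liner}\n"


def capture(note: str, inbox: Path = Path("wiki/inbox.md")) -> None:
    """Append the note to the inbox, creating wiki/ if it does not exist yet."""
    inbox.parent.mkdir(parents=True, exist_ok=True)
    with inbox.open("a", encoding="utf-8") as fh:
        fh.write(format_entry(note, date.today()))
```

Appending instead of rewriting keeps the capture cheap and safe: the end of a session only ever adds a line, and the heavier work of turning lines into pages waits for the next session.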

Memory Types

• user: role, preferences, expertise level
• feedback: corrections and confirmed approaches
• project: status, decisions, deadlines
• reference: external resource pointers

How It Works

• MEMORY.md: index loaded every session
• Topic files: read on demand when relevant
• Auto-save: Claude saves when it learns something
• Staleness: old info verified before acting on it

# Example memory file
---
name: Product Name Verification
description: Never assume SF product names are correct
type: feedback
---
LLMs hallucinate plausible Salesforce product names.
Always verify against salesforce.com before using.
**Why:** Claude once wrote "System of Agency"
instead of "Agentforce" in a customer-facing deck.
**How to apply:** Check salesforce.com before
using any product name. When in doubt, ask John.
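The frontmatter is what makes a cheap index possible: a tool (or Claude) only needs the header fields to decide whether the file is worth opening. Below is a minimal sketch of that idea. The `parse_memory` and `index_line` helpers, the index line format, and the filename `feedback-product-names.md` are all hypothetical.

```python
from typing import Dict


def parse_memory(text: str) -> Dict[str, str]:
    """Split a memory file into frontmatter fields plus a 'body' key."""
    lines = [line.rstrip() for line in text.splitlines()]
    if not lines or lines[0] != "---":
        return {"body": text.strip()}
    try:
        end = lines.index("---", 1)  # closing frontmatter delimiter
    except ValueError:
        return {"body": text.strip()}
    fields: Dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    fields["body"] = "\n".join(lines[end + 1:]).strip()
    return fields


def index_line(filename: str, fields: Dict[str, str]) -> str:
    """One index entry: type, file, and the one-line description."""
    return f"- [{fields.get('type', 'unknown')}] {filename}: {fields.get('description', '')}"


sample = """---
name: Product Name Verification
description: Never assume SF product names are correct
type: feedback
---
LLMs hallucinate plausible Salesforce product names.
"""
print(index_line("feedback-product-names.md", parse_memory(sample)))
```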

SessionStart · session-init.py: creates session dir, wiki-aware context routing
PreToolUse · SF CLI guardrail: warns on deprecated sfdx commands
PreToolUse · SOQL schema check: validates queries before execution
PreToolUse · Architecture guardrail: surfaces infrastructure constraints + ADRs before editing constrained files
PostToolUse · Validator: checks written code against naming/security standards
PostToolUse · Debug log analyzer: parses SF debug logs automatically
Stop · Completion check: reviews for incomplete items before closing
Notification · macOS notification: plays sound when Claude needs attention

The key insight: Hooks run your code at specific moments in Claude's workflow. SessionStart is the most powerful — it's where context routing happens. But PreToolUse and PostToolUse are where you enforce quality standards automatically.
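As an illustration, here is a rough sketch of what a PreToolUse guardrail for legacy `sfdx` commands could look like. It assumes the hook receives a JSON payload on stdin with `tool_name` and `tool_input.command` fields, and that exit code 2 blocks the call and passes stderr back to Claude. Check both against the current Claude Code hooks documentation. The author's hook reportedly only warns, so the blocking behavior and the `decide` helper here are assumptions.

```python
import json
import sys
from typing import Tuple


def decide(payload: dict) -> Tuple[int, str]:
    """Map a PreToolUse payload to (exit_code, message)."""
    if payload.get("tool_name") != "Bash":
        return 0, ""  # only shell commands are inspected
    command = payload.get("tool_input", {}).get("command", "")
    if "sfdx" in command.split():
        return 2, "`sfdx` is the legacy executable; use the equivalent `sf` command instead."
    return 0, ""


def main() -> int:
    """Entry point for the hook script: the payload arrives as JSON on stdin."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code
```

In the actual hook file you would end with `sys.exit(main())`. Keeping the decision in a pure function makes the rule testable without a live session.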

security-governance.md

architecture.md

testing-quality.md

communication.md

salesforce-instructions.md

salesforce-platform.md

tailwind-ui.md

slds2-lwc-ui.md

infrastructure-constraints.md

Rules vs. CLAUDE.md: Rules files live in ~/.claude/rules/ and apply to all projects globally. CLAUDE.md is project-specific. Use rules for standards you want everywhere (security, architecture patterns, communication style). Use CLAUDE.md for project-specific context.

In Practice

A real session, step by step

You open Claude Code in your slide-generation project directory. Here's what happens before you type a single word.

1. CLAUDE.md loads (P1)

Claude learns your identity, role, and essential rules. Sees the routing table: "If working on slides → load pitch-craft.md and slide-visual-design.md." Cost: 2,800 tokens.

2. session-init.py fires (P2)

Hook detects cwd contains "slides". Checks git log: recent commits mention "deck" and "slide." Outputs: "Relevant wiki pages: pitch-craft.md, slide-visual-design.md, golden-narrative-framework.md." Cost: 200 tokens.
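A minimal sketch of that routing step is below. It is not the author's session-init.py: the `KEYWORD_PAGES` map, the helper names, and the ten-commit lookback are assumptions. It also assumes that whatever a SessionStart hook prints to stdout is added to Claude's context.

```python
import subprocess
from pathlib import Path
from typing import List

SLIDE_PAGES = ["pitch-craft.md", "slide-visual-design.md", "golden-narrative-framework.md"]

# Hypothetical keyword -> wiki pages map; extend it with your own work contexts.
KEYWORD_PAGES = {"slide": SLIDE_PAGES, "deck": SLIDE_PAGES}


def suggest_pages(cwd: str, branch: str, commit_subjects: List[str]) -> List[str]:
    """Match keywords against the directory, branch name, and recent commits."""
    haystack = " ".join([cwd, branch, *commit_subjects]).lower()
    pages: List[str] = []
    for keyword, wiki_pages in KEYWORD_PAGES.items():
        if keyword in haystack:
            for page in wiki_pages:
                if page not in pages:
                    pages.append(page)
    return pages


def git_lines(args: List[str], cwd: str) -> List[str]:
    """Run a git command and return its stdout lines; empty list if git fails."""
    try:
        result = subprocess.run(
            ["git", *args], cwd=cwd, capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return result.stdout.splitlines()


def main() -> None:
    cwd = str(Path.cwd())
    branch = "".join(git_lines(["rev-parse", "--abbrev-ref", "HEAD"], cwd))
    subjects = git_lines(["log", "-10", "--pretty=%s"], cwd)
    pages = suggest_pages(cwd, branch, subjects)
    if pages:
        print("Relevant wiki pages: " + ", ".join(pages))
```

The script only names pages, it never prints their contents. That is what keeps the cost near 200 tokens.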

3. MEMORY.md index loads (P4)

Claude sees 128 memory entries. Relevant ones jump out: "feedback-imagen-cheesy.md — AI backgrounds look cheesy; solid = premium" and "feedback-seller-urgency-ban.md — Never use SF contract timelines as Why Now drivers." Cost: ~2,000 tokens.

4. You say "Build me a Starbucks deck"

Claude reads pitch-craft.md (golden narrative framework), slide-visual-design.md (design tokens, layout library), and the Starbucks memory file. It already knows: solid backgrounds, 70/30 customer-to-Salesforce ratio, no banned words, brand color resolution. No setup. Just builds.

5. PostToolUse hooks validate output

After Claude writes code, the validator hook checks naming conventions and security patterns automatically. No manual review of basics — the system catches the easy stuff.
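The article does not list the validator's rules, so the sketch below uses two hypothetical checks to show the shape of "catching the easy stuff": a hardcoded credential pattern and a rough heuristic for Salesforce record IDs (many standard-object IDs start with `00`; custom-object IDs would be missed).

```python
import re
from typing import List

# Hypothetical checks; swap in your own naming and security rules.
CHECKS = [
    (
        "hardcoded credential",
        re.compile(r"\b(password|secret|api_key)\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE),
    ),
    (
        "possible hardcoded Salesforce ID",
        re.compile(r"['\"]00[0-9A-Za-z]{13}(?:[0-9A-Za-z]{3})?['\"]"),
    ),
]


def validate(code: str) -> List[str]:
    """Return one finding per (line, check) match, in line order."""
    findings: List[str] = []
    for lineno, line in enumerate(code.splitlines(), start=1):
        for label, pattern in CHECKS:
            if pattern.search(line):
                findings.append(f"line {lineno}: {label}")
    return findings


sample = 'String password = "hunter2";\nId acct = \'001xx000003DGb2AAG\';\nInteger n = 1;'
for finding in validate(sample):
    print(finding)
```

A hook built on this would run `validate` on the file content Claude just wrote and report the findings back, so basic slips get fixed in the same turn.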

6. Session ends → knowledge captured

Claude learned that Starbucks uses a green-on-dark palette. It appends to wiki/inbox.md: "Starbucks brand colors resolved to #00704A." Next session processes this into a proper wiki entry. The system just got a little smarter.

Origin

Standing on Karpathy's shoulders

This system implements the LLM Wiki pattern that Andrej Karpathy described in early 2026: instead of re-deriving knowledge every query, have the LLM build and maintain a persistent wiki that compounds over time.

Karpathy Layer 1: Raw Sources
Immutable input. Human curates what goes in.
• intel-digests/
• X feed (GraphQL)
• Exa search results
• Articles, docs, PDFs

Karpathy Layer 2: Wiki
Synthesized, interlinked. Claude maintains it.
• wiki/ (51 pages)
• Cross-referenced
• Auto-indexed
• Lint-checked

Karpathy Layer 3: Schema
Structure and conventions. Human + Claude co-evolve.
• CLAUDE.md
• Auto-memory (128 files)
• Wiki index + log
• Rules (10 files)

Operations: Karpathy pattern → our implementation
• Ingest → /ingest command
• Query → every conversation
• Lint → /wiki-lint command

The core Karpathy insight: the maintenance burden is what kills every knowledge system. Claude handles all the bookkeeping — summarizing, cross-referencing, filing, consistency checking. Your job is sourcing, directing, and thinking. The wiki grows while you work. You never have to organize it.

Why flat files, not vector embeddings?

RAG (retrieval-augmented generation) retrieves by meaning — great for searching large datasets you didn't write. But for personal context — your preferences, your project state, your corrections — structured markdown files that Claude can directly read and update are simpler, faster, and don't lose fidelity. We use both: flat-file wiki for personal knowledge, pgvector RAG for corporate data (product names, customer stories, competitive intel).

Beyond the pattern

Karpathy described the idea. We validated it, then extended it in five directions he hasn't addressed.

Token economics

Karpathy doesn't address context window costs. We engineered tiered loading: 2,800 tokens always-on, everything else on demand. 82% reduction vs. loading the full schema every session.

Automated routing

Karpathy's workflow is manual — you tell the LLM what to do. Our session-init hook analyzes your git state and auto-suggests which wiki pages to load. No human in the loop.

Multi-modal persistence

Karpathy has one artifact: the wiki. We separate four types — wiki (reference), memory (decisions), rules (standards), schema (routing) — each loaded at different times for different reasons.

Curation & lifecycle

Karpathy describes Lint. We go further: a /curate command that scans memory for staleness, processes the wiki inbox, and flags what needs attention. Memory files have decay rates. His system grows but doesn’t prune. Ours does.
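One way to picture decay rates: each memory type gets a maximum age, and anything past it is flagged for re-verification instead of being trusted. The day counts, the `last_verified` field, and the memory names below are hypothetical; the article does not give the actual values.

```python
from datetime import date
from typing import Dict, List

# Hypothetical decay rates in days per memory type.
MAX_AGE_DAYS: Dict[str, int] = {"project": 30, "reference": 90, "feedback": 365, "user": 365}


def is_stale(memory_type: str, last_verified: date, today: date) -> bool:
    """A memory is stale once its age exceeds the limit for its type."""
    limit = MAX_AGE_DAYS.get(memory_type, 90)  # default for unknown types
    return (today - last_verified).days > limit


def curate(memories: List[dict], today: date) -> List[str]:
    """Return the names of memories to re-verify before acting on them."""
    return [m["name"] for m in memories if is_stale(m["type"], m["last_verified"], today)]


memories = [
    {"name": "q2-roadmap", "type": "project", "last_verified": date(2026, 4, 1)},
    {"name": "product-names", "type": "feedback", "last_verified": date(2026, 1, 1)},
]
print(curate(memories, today=date(2026, 6, 1)))
```

Project status ages fast and corrections age slowly, which is why the limits differ by type.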

Voluntary → intrinsic memory

Karpathy’s system relies on the LLM voluntarily looking things up. We applied DDR-001 from the Agentforce Agent Harness: “agents forget to remember.” Infrastructure constraints load as rules (always). PreToolUse hooks surface ADRs before editing constrained files. The information is in your face when it matters.

“I think there is room here for an incredible new product instead of a hacky collection of scripts.”

— Andrej Karpathy, on the LLM Wiki pattern

We took that seriously. Five named tiers, 19 commands, 10 MCP servers, automated context routing, tiered token loading, architecture guardrail hooks, and ADRs. Not a product — but not a hacky collection of scripts either.

Get Started

Build your own in stages

You don't need to build the full system on day one. Start with CLAUDE.md, add layers as you go. Each stage delivers value on its own.

1. Day One (30 minutes): CLAUDE.md + Rules

Create CLAUDE.md in your project root with:

• Your name, role, and what you work on
• Your environment (OS, editor, key tools)
• 3-5 rules Claude should always follow
• Any words or patterns to avoid

Add rules files to ~/.claude/rules/ for cross-project standards.

2. Week One (1-2 hours): Memory + Wiki Foundation

Enable auto-memory in CLAUDE.md instructions. Start a wiki/ directory with:

• index.md (master catalog)
• 3-5 pages for your main topics
• inbox.md (auto-capture queue)

Add a routing table to CLAUDE.md mapping work contexts to wiki pages.

3. Month One (2-3 hours total): Hooks + Automation

Add hooks to settings.json for automated behaviors:

• SessionStart: context routing script
• PostToolUse: code validation
• Notification: OS alerts

Slim down CLAUDE.md as your wiki grows. Move detail out, keep routing in.
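For orientation, here is a sketch that prints a starting `hooks` block for settings.json. The nesting (event name, optional `matcher`, then a list of `command` hooks) follows the Claude Code hooks schema as best understood here, so verify it against the current documentation. The script paths, the `Write|Edit` matcher, and the `afplay` sound command are placeholders, not the author's configuration.

```python
import json

# Placeholder paths and commands; point them at your own hook scripts.
settings = {
    "hooks": {
        "SessionStart": [
            {"hooks": [{"type": "command", "command": "python3 ~/.claude/hooks/session-init.py"}]}
        ],
        "PostToolUse": [
            {
                "matcher": "Write|Edit",  # run only after file-writing tools
                "hooks": [{"type": "command", "command": "python3 ~/.claude/hooks/validator.py"}],
            }
        ],
        "Notification": [
            {"hooks": [{"type": "command", "command": "afplay /System/Library/Sounds/Glass.aiff"}]}
        ],
    }
}

print(json.dumps(settings, indent=2))
```

Paste the printed JSON into settings.json and merge it with whatever is already there. Start with SessionStart alone if you want to add one hook at a time.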

4. Ongoing (5 min/week): Curation

Knowledge bases rot without maintenance. Run a weekly /curate command:

• Archive stale session memories
• Process wiki inbox items
• Check index health & routing gaps

Claude does the legwork. You just say keep, update, or delete.

Minimal CLAUDE.md to start with

# Project Context

## Owner
- Name: [Your name]
- Role: [Your role and team]
- Context: [How you use Claude Code — what are you building?]

## Environment
- OS: macOS / Windows / Linux
- Editor: VS Code / JetBrains / Terminal
- Key tools: [sf CLI, npm, docker, etc.]

## Rules (Always Follow)
1. [Your most important rule — e.g., "Never use deprecated APIs"]
2. [Quality standard — e.g., "All Apex must be bulk-safe"]
3. [Voice rule — e.g., "Write like an SE, not a marketing team"]
4. [Safety rule — e.g., "Never hardcode IDs or credentials"]

## Current Projects
- [Project 1]: [One-line description + status]
- [Project 2]: [One-line description + status]

## Preferences
- [How you like to work with Claude]
- [What kind of output you expect]

Step-by-Step Getting Started Guide

Copy-paste templates, code blocks, and a completion checklist. 30 minutes from zero to done.

Results

What changes when Claude remembers

82%

Less context overhead

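15,000 tokens → 2,800 tokens always loaded in CLAUDE.md. Adding the session-init output and the MEMORY.md index brings session start to about 5,000 tokens. The rest comes in on demand. Your context window stays clear for actual work.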

0 min

Ramp-up time

No more "here's my project, here are my rules, remember that thing from yesterday." Claude knows who you are, what you're working on, and what you've decided.

128

Decisions remembered

Every correction, every preference, every architectural decision — saved once, applied forever. Claude stops making the same mistake twice.

∞

Compound returns

The wiki grows every session. Memory accumulates. The system gets measurably better the more you use it. Session 100 is dramatically better than session 1.
