60 days ago I taught Claude Code to remember. Then compound learning took over. What started as a filing system became an autonomous development platform.
Install Claude Code today. Open a terminal. Here's what you get:
Every session starts from scratch. You explain who you are. You explain your project. You re-state constraints Claude violated yesterday. At minute 45, the context fills. You start over. The genius has total amnesia.
Day 0: You work with Claude.
Day 60: Claude works for you.
Nobody planned a 2.1 GB local intelligence system. It grew from three forces that compound on each other:
Every mistake Claude made, I corrected. Every correction became a memory file. Every repeated mistake became a rule. Every ignored rule became a hook. Every hook that needed data became an MCP server. Every MCP server that overflowed context became context-mode. Each layer exists because the previous layer wasn't enough.
I never wrote a line of Python. Never authored a YAML skill file. Never debugged a hook script. Every piece of infrastructure was built by Claude based on my requirements. The hooks, the graph indexer, the skills — all AI-authored. The barrier isn't coding ability. It's knowing what you want and being specific about constraints.
Day 1–20: memory + wiki (the filing system). Day 20–40: hooks + MCP (the enforcement layer). Day 40–60: skills + graph + context-mode (the expertise layer). Each layer made the next faster to build. By day 60, Claude creates new skills in 5 minutes because it has the patterns from the previous 179.
Example: --set-env-vars wipes all Cloud Run vars → correction → memory → infrastructure rule → PreToolUse hook blocks it forever
Part 1 had five tiers. The current system has eleven layers. Five evolved from the original. Six are entirely new.
Part 1 described a passive system: store things, load them when relevant. What evolved is active: it intercepts, validates, and enforces without human intervention.
| Part 1 (Passive) | | Now (Active) |
|---|---|---|
| "Hooks auto-suggest wiki pages" | → | Hooks block unsafe operations, validate output, enrich graphs |
| "Memory persists corrections" | → | Memory is entity-linked, graph-indexed, auto-curated |
| "Rules load as standards" | → | Rules enforced by hook chains — gated, not suggested |
| "Wiki pages load on demand" | → | Wiki has inbox, lint, auto-capture, entity extraction |
| "CLAUDE.md routes context" | → | CLAUDE.md is a full orchestration manifest |
| "Contacts live in CRM" | → | Entity pages capture working style, decision patterns, relationship history |
Every passive component attracted an active counterpart. Storage attracted enforcement. Reference attracted validation. Memory attracted curation. The system developed an immune response to its own failure modes.
Memory stores corrections. Wiki stores knowledge. But neither stores people. Entity pages close that gap — a structured page per person that compounds across every interaction.
Two templates (internal + external). Structured top for instant prep — role, org position, working style, territory. Append-only timeline below that grows with every meeting, decision, and collaboration. Seeded programmatically from Slack profiles, then enriched by overnight intelligence crons.
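The "structured top, append-only timeline" shape can be sketched as a small data structure. This is a minimal illustration, not the actual template: the class name, field names, and markdown layout are assumptions chosen to mirror the fields listed above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class EntityPage:
    """Hypothetical entity page: fixed header fields plus an append-only timeline."""
    name: str
    role: str
    org_position: str
    working_style: str
    territory: str
    timeline: list = field(default_factory=list)  # list of (date, note) tuples

    def log(self, when: date, note: str) -> None:
        # Append-only: entries are added, never edited or removed.
        self.timeline.append((when, note))

    def to_markdown(self) -> str:
        # Structured top for quick prep, chronological timeline below.
        header = [
            f"# {self.name}",
            f"- Role: {self.role}",
            f"- Org position: {self.org_position}",
            f"- Working style: {self.working_style}",
            f"- Territory: {self.territory}",
            "",
            "## Timeline",
        ]
        entries = [f"- {d.isoformat()}: {note}" for d, note in self.timeline]
        return "\n".join(header + entries)
```

Calling `log()` after each meeting and rewriting the file from `to_markdown()` keeps the header stable while the timeline only grows.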
| Without | | With entity pages |
|---|---|---|
| "Who's the AE on that account?" | → | Full org tree with SE/AE assignments in one Read |
| "What did we discuss last time?" | → | Timeline shows every interaction chronologically |
| Meeting prep = 20 min searching | → | One page, auto-routed by SessionStart hook |
Result: 150+ people mapped across sales and solutions org trees. Every leader with their reports, Slack IDs, and coverage areas. The CRM knows accounts — entity pages know the humans working them.
Individual layers are useful. The interactions between them are where compound growth happens.
Deploy with --set-env-vars instead of --update-env-vars. All env vars wiped. Correct Claude. Correction becomes memory. Memory cited in infrastructure rule. Hook now blocks any deploy using --set-env-vars. That error class can never happen again — even in a brand new session.
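A guardrail like this can be small. The sketch below is an illustration, not the hook from this setup. It assumes Claude Code passes the pending tool call to a PreToolUse hook as JSON on stdin, with `tool_name` and `tool_input.command` fields, and that exit code 2 blocks the call and returns stderr to Claude. The function names `check_command` and `run` are hypothetical.

```python
import json
import re
import sys
from typing import Optional

# Matches the flag that replaces the whole env var set on deploy.
BLOCKED_FLAG = re.compile(r"(?<![\w-])--set-env-vars\b")


def check_command(command: str) -> Optional[str]:
    """Return a block reason if the shell command is unsafe, else None."""
    if "gcloud" in command and BLOCKED_FLAG.search(command):
        return ("Blocked: --set-env-vars replaces all existing env vars. "
                "Use --update-env-vars instead.")
    return None


def run(stdin_text: str) -> int:
    """Process one hook event and return the exit code the hook should use."""
    event = json.loads(stdin_text)  # assumed shape: {"tool_name": ..., "tool_input": {...}}
    if event.get("tool_name") != "Bash":
        return 0  # only shell commands are inspected
    command = event.get("tool_input", {}).get("command", "")
    reason = check_command(command)
    if reason is None:
        return 0
    print(reason, file=sys.stderr)
    return 2  # assumed convention: exit code 2 blocks the tool call
```

To use it as a hook script, end the file with `sys.exit(run(sys.stdin.read()))` and register it under the PreToolUse event with a Bash matcher in the Claude Code settings file. Keeping `check_command` pure makes the rule easy to test without a live session.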
Research Agentforce Agent Script. Claude writes a wiki page. Graph extracts entities. Next session, mention "agent routing" — SessionStart hook surfaces that wiki page + related memories + architecture doc. Context arrives before you ask.
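The lookup step in that loop can be sketched as a phrase match against an entity index. Everything here is hypothetical: the index contents, the file paths, and the function name are placeholders, and the real graph indexer may work quite differently.

```python
import re

# Hypothetical graph index: entity phrase -> files that mention it.
GRAPH_INDEX = {
    "agent routing": ["wiki/agent-script.md", "memory/routing-correction.md"],
    "cloud run": ["rules/infrastructure.md"],
}


def surface_context(text: str, index: dict) -> list:
    """Return files linked to any indexed entity mentioned in the text."""
    # Lowercase and collapse whitespace so "Agent  Routing" still matches.
    normalized = re.sub(r"\s+", " ", text.lower())
    hits = []
    for entity, files in index.items():
        if entity in normalized:
            for path in files:
                if path not in hits:  # keep first-seen order, drop duplicates
                    hits.append(path)
    return hits
```

A hook could print the returned paths (or their contents) to stdout so they land in context before the first answer. Substring matching is the simplest possible choice; a real index would likely need aliases and ranking.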
Run /account-prep. Skill knows the workflow. Calls MCP servers (CRM, news, web). Each response compresses through context-mode. Skill assembles the prep doc using anti-slop rules and corporate grounding. One command, five systems, zero manual context management.
This is the compound effect. No single layer produced these outcomes. Memory alone doesn't prevent errors. Hooks alone don't know what's dangerous. Skills alone can't access external data. The value is in the connections.
In Part 1, I cited DDR-001 from the Agentforce Agent Harness: "agents forget to remember." The fix was intrinsic constraint awareness. Two months later, DDR-001 applies at every layer:
| Layer | "Forgetting" Failure Mode | Fix |
|---|---|---|
| Memory | Doesn't check before claiming | Graph recalls relevant memories proactively |
| Rules | Reads but ignores under pressure | PreToolUse hook intercepts before execution |
| Skills | Reinvents instead of loading | Skill routing matches task automatically |
| Wiki | Pattern-matches from training data | ADR hook injects decisions when files in scope |
| MCP | Hallucinates instead of querying | Grounding rules require real data queries |
Meta-lesson: Every knowledge source needs a corresponding enforcement mechanism. Knowledge without enforcement is a suggestion. Knowledge with enforcement is a constraint. Constraints compound. Suggestions decay.
When Part 1 went out, this felt like a personal hack. Four weeks later:
The pattern we built on (ingest-synthesize-evolve) went viral. Spawned persistent memory tools. We'd been running it in production for months.
The VP-level Futures team published on the exact framework. The scaffolding that gives a model tools, data, and constraints. Direct validation at the highest level.
"Dreaming" — agents consolidating learnings between sessions. The same pattern as our memory + /curate system, productized by Anthropic themselves.
58 tools = 55K tokens per turn. We run 200+ with deferred loading. Others are only now discovering a problem we had already solved.
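The arithmetic behind that claim, as a rough sketch. It assumes token cost scales linearly with tool count, and the deferred-loading numbers (5 expanded tools, 20-token stubs for the rest) are illustrative assumptions, not measurements from this setup.

```python
TOOLS_MEASURED = 58
TOKENS_MEASURED = 55_000

# Roughly 948 tokens per tool definition, assuming linear scaling.
tokens_per_tool = TOKENS_MEASURED / TOOLS_MEASURED


def eager_cost(tool_count: int) -> int:
    """Tokens per turn if every tool definition is loaded up front."""
    return round(tool_count * tokens_per_tool)


def deferred_cost(tool_count: int, loaded: int, stub_tokens: int) -> int:
    """Tokens per turn if only `loaded` tools are expanded and the rest are short stubs."""
    return round(loaded * tokens_per_tool + (tool_count - loaded) * stub_tokens)


print(eager_cost(200))            # 189655
print(deferred_cost(200, 5, 20))  # 8641
```

Under these assumptions, loading 200 tools eagerly would cost about 190K tokens per turn, while deferring all but a handful stays under 10K.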
I didn't plan most of this. Each correction, wiki page, and skill grew from a specific session's need. The 5-tier framework just gave those decisions somewhere to land.
128 memory files didn't stop infrastructure mistakes. 3 PreToolUse hooks did. If you only build one new thing after Part 1, build a guardrail hook for your most expensive mistake.
Part 1 optimized startup (15K to 5K tokens). The real limit hit at minute 45 when research filled the window. Context-mode was the breakthrough that enabled multi-hour sessions.
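One way to keep large tool output out of the window is to persist it and pass only a bounded preview into context. The sketch below illustrates that general idea. It is not a description of how context-mode works internally; the character budget, file naming, and function name are assumptions.

```python
import hashlib
from pathlib import Path

MAX_CHARS = 2_000  # assumed budget per tool response


def compress_response(payload: str, store: Path, max_chars: int = MAX_CHARS) -> str:
    """Return small payloads unchanged; persist large ones and return a preview."""
    if len(payload) <= max_chars:
        return payload
    # Content-addressed file name so identical payloads are stored once.
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    path = store / f"{digest}.txt"
    path.write_text(payload, encoding="utf-8")
    preview = payload[:max_chars]
    omitted = len(payload) - max_chars
    return f"{preview}\n[truncated {omitted} chars; full output at {path}]"
```

The model sees the preview and a pointer, and can read the stored file only if the task needs the rest.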
One skill replaces 30 minutes of manual instruction per session. I have 186 skills assembled from three suites — project management, Salesforce platform, and developer tooling — plus 28 custom commands I directed Claude to build for my specific SE workflow. You don't build 186 skills from scratch. You install the right systems and customize the last mile.
I've never opened a Python file to write the hooks. Never manually created a skill. Claude builds its own infrastructure from my direction. Domain expertise + clear requirements + an AI that builds its own tooling = this.
Priority order based on what gave me the most return:
| # | What | Why | Effort |
|---|---|---|---|
| 1 | One PreToolUse guardrail hook | Prevents your most expensive recurring mistake | 30 min |
| 2 | 3 domain skills | Replaces 30 min of instruction per session each | 2 hrs |
| 3 | One MCP server | Closes the gap between "understands" and "does" | 1 hr |
| 4 | Context overflow management | Removes the 45-minute session ceiling | 1 hr |
| 5 | ADR directory | Reasoning cache — stops re-debating settled decisions | 15 min |
You don't need 220 memory files or 186 skills. You need the feedback loops. One guardrail that fires when it matters. One skill that loads expertise automatically. One MCP server that fetches real data. Start the loops, then let compound growth do the rest.
Part 1 solved a specific problem: Claude forgetting between sessions. The system that grew from that fix solves a different problem: making a non-developer as productive as a senior engineering team.
199 commits in 28 days built an AI platform with 6 subagents, 25 data connectors, and production infrastructure. That was with the early five-tier system. With the current eleven-layer stack, the same scope takes less than a week.
The gap between "domain expertise + clear requirements" and "production software" is closing. Not because models got smarter. Because the local infrastructure around the model compounds. Memory remembers. Hooks enforce. Skills teach. MCP servers act. The graph connects. And none of it requires you to be a developer.
“I think there is room here for an incredible new product instead of a hacky collection of scripts.”
— Andrej Karpathy, on the LLM Wiki pattern
60 days later, the hacky scripts became a platform. Not because I designed one — because compound learning doesn't stop at memory.
Two proof points. One about intelligence. One about autonomy.
I shared this article with Gemini Pro and asked it to audit my setup. It gave 10 confident recommendations:
| Gemini's Recommendation | Reality |
|---|---|
| "You need a deprecation/garbage collection layer" | Already built. /curate runs weekly with decay rates and prune cycles. |
| "Risk of over-constraint paralysis from hooks" | Claude Code's permission system already has bypass. Not a rigid rule engine. |
| "Graph indexing will cause startup latency" | Already measured: 61ms per file, 151ms session-init. Imperceptible. |
| "Users will have dependency conflicts" | Starter kit is a clean template. No machine-specific paths. |
| "You need an auth onboarding wizard" | Recipients are Salesforce employees with sf CLI already authenticated. |
| "Risk of identity/state bleed in the repo" | Starter kit is already scrubbed. Personal data never ships. |
Confident, well-structured advice that would have been correct for a generic setup — but wrong for mine. It couldn't know because it had no context.
The gap isn't intelligence. It's context. Both are frontier-class models. The difference is what the system remembers about YOUR specific situation.
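The "decay rates and prune cycles" mentioned for /curate in the audit table can be sketched with a simple exponential decay. This is an assumption about the general approach, not the actual /curate logic: the 30-day half-life, the 0.25 threshold, and the function names are all hypothetical.

```python
from datetime import date

HALF_LIFE_DAYS = 30      # assumed decay rate
PRUNE_THRESHOLD = 0.25   # assumed cutoff for review


def relevance(last_used: date, today: date, half_life: float = HALF_LIFE_DAYS) -> float:
    """Exponential decay: the score halves every `half_life` days since last use."""
    age_days = (today - last_used).days
    return 0.5 ** (age_days / half_life)


def prune_candidates(memories: dict, today: date) -> list:
    """Return names of memories whose score fell below the threshold, sorted."""
    return sorted(
        name for name, last_used in memories.items()
        if relevance(last_used, today) < PRUNE_THRESHOLD
    )
```

With these values, a memory untouched for 90 days scores 0.125 and is flagged, while one used last week stays well above the cutoff.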
The first case study showed context makes AI smarter. This one shows it makes AI autonomous. It doesn't just answer better. It operates independently.
I had a 7-pillar architecture plan for transforming a production intelligence platform. Three rounds of architectural audit. 14 accepted fixes. 6 rejected findings with documented rationale. Approved at 10 PM. I said: "Build overnight. Don't stop." Then I went to sleep.
| Phase | Original Estimate | Actual |
|---|---|---|
| Salesforce metadata (object + 12 fields + event + service + tests) | 3-4 days | ~2 hours |
| Pipeline infrastructure (types, extractor, evaluator, connector) | 3-4 days | ~1.5 hours |
| Three API endpoints (competitive, risk, temporal) | 2-3 days | ~45 minutes |
| Brand design system provider | 3-4 days | ~30 minutes |
| Conversational Agent Router | 4-5 days | ~30 minutes |
| Total (5 phases) | 15-20 days | ~5 hours |
This wasn't just "AI writes code fast." Any frontier model can generate files. The reason it shipped as a coherent system — not disconnected fragments — is the operating system underneath:
| OS Layer | What It Contributed |
|---|---|
| Memory | Knew the entire architecture, 196 cached accounts, data model history, deployment patterns — no re-briefing needed |
| Rules | Security governance, architecture patterns, voice standards — all enforced automatically on every file written |
| Plan | 3-round audited plan with explicit file paths, interface contracts, dependency graph — zero ambiguity |
| Wiki | 60+ pages of project knowledge, integration topology, platform constraints — answered questions without asking |
| Hooks | Metadata validators scored every file (90-120/120). Caught issues at write-time, not deploy-time |
| Role | Time |
|---|---|
| Approve the plan (after 3 audit rounds) | 30 min |
| Say "build overnight" | 5 seconds |
| Sleep | 8 hours |
| Verify build + tests + deploy next morning | 10 min |
| Total human involvement | ~45 minutes |
The gap isn't speed — it's autonomy. Context architecture doesn't just make AI faster at answering questions. It makes AI capable of independent execution. The human role shifts from "writing code" to "approving plans and verifying outcomes."
I built this over 60 days through trial and error. You don't have to. Here's the architecture as a deployable kit — clone the repo, run setup, start from Day 30 instead of Day 0.
What transfers vs. what doesn't: The architecture, templates, and feedback loops transfer. The 220+ specific memories don't — those are YOUR corrections, YOUR project state. The kit gives you the scaffolding. Compound learning fills it in.
Template CLAUDE.md with routing table structure, role definition, project registry, and essential standards (anti-slop, anti-hallucination). Pre-wired wiki index. You fill in YOUR role, YOUR projects, YOUR constraints. Takes 10 minutes.
3 starter rules files: communication.md (voice standards, banned patterns), security-governance.md (CRUD/FLS, sharing, secrets), architecture.md (decision framework, preferred patterns). Plus 2 hook scripts: session-init (context routing on start) and one PreToolUse guardrail (blocks your most expensive recurring mistake).
5 production-ready skills: /account-prep (pre-meeting intelligence), /deal-strategy (competitive + talk track), /email-draft (anti-slop customer email), /post-meeting (capture + CRM + follow-up), /demo-prep (script from brief). Each one replaces 30 minutes of manual instruction per use.
Slack channel roster with 25 pre-mapped internal channels (product, competitive, enablement, win/loss, leadership). Overnight gather scripts that scan Exa + HN + GitHub + X + Slack and synthesize a morning brief. One MCP server (GitHub) to prove the action layer. Swap in your OU channels and go.
wiki/people/ directory with two templates (internal + external). Seed your org tree from Slack profiles — Claude pulls name, title, email, timezone programmatically. Structure: org position at top, append-only timeline below. After one session, every leader and their reports are mapped. After a month, you have a relationship graph no CRM captures.
| Category | Channels | What You Get |
|---|---|---|
| Competitive Intel | #tmt-solutions, #analyst-coverage | CI drops, IDC/ISG/Gartner reports |
| Product & Roadmap | #agentforce-updates, #platform-releases | What shipped, what's GA, deprecations |
| AI & Tooling | #ai-club, #ai-engineering-productivity, #solutions-ai-tooling | Internal AI tools, techniques, launches |
| SE Enablement | #se-enablement, #demo-sharing | New assets, techniques that work |
| Territory | #tmt-commercial, #tmt-solutions-broadcast | OU updates, leadership priorities |
| Industry | #industries-communications, #industries-media, #industries-tech | Vertical trends, reference stories |
You don't need to code any of this. Tell Claude what you want enforced, what workflow you need automated, what mistake to never make again. Claude builds the hooks, writes the skills, configures the MCP servers. Your job is direction and domain expertise — the same skills that make you good at your actual job.
Get the kit: git clone https://github.com/jtehrani84/claude-code-se-starter-kit.git && cd claude-code-se-starter-kit && ./setup.sh — or contact John Tehrani (jtehrani@salesforce.com) for access.