Andrej Karpathy (English)
Generated: 2026-06-20 14:45:23
---
Can you believe it? Karpathy's LLM Wiki that the entire internet went crazy over—I spent a week wrestling with it, only to realize the key thing isn't even technical at all!
Don't swipe away just yet. Let me explain slowly.
That day I was lying on the couch scrolling through my phone when my social media feed got flooded with the same tweet—Andrej Karpathy had posted a method. 20 million views. Shot straight to over 5,000 stars on GitHub. The comments were all shouting "RAG is dead," "new paradigm for knowledge management." It felt exactly like when blockchain first came out.
My first reaction: Hah, isn't this just having AI write notes for me? Is it that big of a deal?
But honestly, I have a bad habit: the more hyped something is, the more I have to tear it apart myself. So I spent an entire week on it, going from a bare setup to a working system, and from excitement to cursing to finally loving it. This post is my heart-to-heart with you, after hitting every pitfall, about the conclusion I ended up with.
Karpathy's system isn't a tool, not even a method; it's a mental switch. Flip it right, and your knowledge base comes alive; flip it wrong, and it's just a dump of documents.
---
The thing that hooked me the most—you might not believe it
What is traditional RAG? Simply put, a glorified data mover. Every time you ask a question, it goes back to the raw documents, searches through them, and pieces together an answer. Ask "Why is the attention mechanism in Transformers so good?" twenty times in a month, and it goes looking twenty times, running the same process every time and accumulating nothing. Think about it: how is that different from planting rice every single time you want to cook a meal?
Karpathy's sharpest cut is right here: He changed "retrieve-time compilation" to "write-time compilation."
What does that mean? You drop a paper into the raw folder, and the LLM doesn't wait to be retrieved; it immediately reads, breaks it down, and writes. It automatically extracts entities from the paper and updates the corresponding wiki pages. If the new evidence contradicts a previous conclusion, it immediately marks the contradiction in red. If the conclusion is reinforced, it strengthens it. Your knowledge base isn't a pile of files; it's a living network that's constantly being "compiled."
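To make "write-time compilation" concrete, here is a minimal sketch of the idea, not Karpathy's actual implementation. Everything in it is my own assumption for illustration: the `WikiPage` class, the `ingest` and `ask` functions, and especially `extract_entities`, which is a plain keyword match standing in for the LLM call.

```python
from dataclasses import dataclass, field


@dataclass
class WikiPage:
    title: str
    notes: list = field(default_factory=list)  # (source name, text) pairs accumulated over time
    links: set = field(default_factory=set)    # titles of pages mentioned in the same sources


def extract_entities(text, known_terms):
    """Hypothetical stand-in for the LLM: return the known terms found in the text."""
    lowered = text.lower()
    return [term for term in known_terms if term.lower() in lowered]


def ingest(wiki, source_name, text, known_terms):
    """Write-time compilation: update every affected page as soon as a source arrives."""
    entities = extract_entities(text, known_terms)
    for title in entities:
        page = wiki.setdefault(title, WikiPage(title))
        page.notes.append((source_name, text))
        # Link this page to the other entities mentioned in the same source.
        page.links.update(e for e in entities if e != title)
    return entities


def ask(wiki, title):
    """Query time is a lookup of already-compiled notes, not a fresh search of raw files."""
    page = wiki.get(title)
    return [] if page is None else page.notes
```

The point of the sketch is where the work happens: `ingest` does the reading and linking once per document, so asking the same question twenty times costs twenty lookups instead of twenty searches.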
The first time I ran ingest and watched those nodes auto-connect in Obsidian's graph view, I got goosebumps, honestly. It felt like the moment your IDE tells you "there's a reference you can jump to here," except that the IDE only surfaces a reference that already existed in the code, while LLM Wiki builds the associations itself and even fills in missing logic. Tell me that doesn't get you.
But hold on—the barrier to trying this thing out is three times higher than what Karpathy's Gist suggests. The pitfalls I fell into—you'd better take notes.
---
Pitfall log, every one a lesson of blood and tears
First pitfall: Don't copy his prompt directly.
Karpathy posted a prompt in his gist, and I shoved it straight into Claude Code. What did it write for my wiki? Nothing but filler like "this is an important concept" and "please refer to relevant materials." I almost threw my laptop out the window.
After digging through a few community posts I realized that the prompt was written for his own custom agent, loaded with context from his personal workflow. You have to adapt it to your own scenario: whether the raw material is PDFs or web pages, whether the wiki is in English or Chinese, whether entity names use camelCase or underscores. If you don't change these, what comes out is a mess.
I iterated four rounds before I got a version suitable for a Chinese vertical research domain. The core changes were just two: first, forcing the LLM to cite specific paragraphs from the original document when extracting entities (otherwise it hallucinates); second, adding a rule that if new info conflicts with an existing conclusion, first mark the contradiction, don't immediately delete the old page. This one rule saved me several times—otherwise many important findings would have been overwritten.
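Those two rules live in my prompt, but they can also be enforced in code as a safety net. The sketch below is one way to do it under my own assumptions: `Claim`, `Page`, and `add_claim` are hypothetical names, the citation check is a simple verbatim substring match, and deciding which old claim a new one conflicts with is left to the caller (in practice, the LLM).

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    statement: str
    quote: str               # passage copied verbatim from the source document
    source: str
    status: str = "active"   # "active" or "contested"


@dataclass
class Page:
    title: str
    claims: list = field(default_factory=list)


def add_claim(page, claim, source_text, conflicts_with=None):
    """Add a claim to a page while enforcing the two rules."""
    # Rule 1: refuse claims whose quote cannot be found in the source text.
    if not claim.quote or claim.quote not in source_text:
        raise ValueError("claim must cite a passage that appears in the source")
    # Rule 2: on conflict, flag both claims and keep the old one instead of deleting it.
    if conflicts_with is not None:
        conflicts_with.status = "contested"
        claim.status = "contested"
    page.claims.append(claim)
    return page
```

Keeping both claims on the page with a "contested" status is what stops an important older finding from being silently overwritten; a human can resolve the conflict later.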
Second pitfall: The domain must be vertical—don't be greedy!
The first time, I threw papers from three directions (AI safety, Python tutorials, digital signal processing) all into the same raw folder. The LLM started flip-flopping between "Alignment" (for AI) and "Memory Alignment" (for data structures), and the whole page structure went haywire. Think about it: it couldn't even tell whether "alignment" refers to model training or data structures. How could my Wiki possibly turn out good?
Later I figured out: this system hates mixing multiple domains with large semantic gaps. Because it organizes knowledge via backlinks and entity relationships, once domains cross, the relationship graph becomes a tangled mess of yarn. So after that, I created a separate wiki for each vertical domain. One for AI safety, one for large language model safety, one for prompt injection alone. They don't interfere, each grows on its own. That felt right.
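For what it's worth, the split can be as simple as one directory tree per domain. This is a rough sketch of a layout, not a prescribed structure: the `raw` folder name comes from the workflow described above, while the `wiki` folder name, the domain slugs, and the helper `create_domain_wikis` are assumptions of mine.

```python
from pathlib import Path


def create_domain_wikis(base, domains):
    """Create an isolated raw/ and wiki/ folder pair for each domain."""
    layout = {}
    for domain in domains:
        root = Path(base) / domain
        paths = {"raw": root / "raw", "wiki": root / "wiki"}
        for path in paths.values():
            # exist_ok=True makes the call safe to repeat on an existing layout.
            path.mkdir(parents=True, exist_ok=True)
        layout[domain] = paths
    return layout
```

Calling `create_domain_wikis("wikis", ["ai-safety", "prompt-injection"])` gives each domain its own `raw` folder, so a paper dropped into one never gets ingested into another domain's graph.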
Third pitfall: Ingest isn't a one-time action—it's a loop.
After my first ingest of 13 AI safety papers, I got 31 wiki pages and a graph view full of dense nodes. I felt like a god. Then the next day I threw in a new paper, expecting a simple
Cael Lee
Full-stack developer with 8+ years of experience. Currently building AI-powered developer tools. I've tested 20+ AI API providers and coding assistants.