Based on Andrej Karpathy's "[1hr Talk] Intro to Large Language Models"
Andrej Karpathy doesn't just explain how large language models work — he reframes what they are. In his landmark talk "Intro to Large Language Models," Karpathy argues that LLMs are not sophisticated autocomplete engines. They are the kernel of a new computing paradigm. This post distills that argument, traces the technical layers beneath it, and explores where it leads.
Two Files, One Mind
At the physical level, a large language model is startlingly minimal. Any frontier model — GPT, Claude, Llama — essentially consists of just two files: a parameters file and a run file. That's it. The parameters file holds billions of floating-point weights (Llama-2-70B's weighs around 140GB), each number the product of months of optimization across trillions of tokens. The run file, by contrast, can be implemented in under 500 lines of C. It defines the Transformer architecture and tells the hardware how to use those weights to generate text.
This two-file abstraction has a profound implication: once you have the weights, you can run the model anywhere — disconnected from the internet, on local hardware, entirely under your control. The intelligence is in the numbers, not in the cloud.
From Raw Data to Helpful Assistant: Three Stages of Becoming
Training an LLM is not a single act — it's a layered process. Karpathy outlines three stages, each building on the last.
Pre-training is the most expensive and computationally intensive phase. The model consumes internet-scale text — books, code, papers, forums — and learns to predict the next token. Llama-3's training run illustrates the scale: 15 trillion tokens, 16,000 H100 GPUs, 400 teraflops of throughput. What emerges is a "base model" with vast factual knowledge but no conversational instincts — it completes documents rather than answers questions.
Supervised Fine-Tuning (SFT) bridges that gap. Using roughly 100,000 human-written question-and-answer pairs, developers teach the model to behave like an assistant — to respond helpfully, decline inappropriate requests, and maintain coherent logic across a conversation.
RLHF (Reinforcement Learning from Human Feedback) is where alignment happens. Human evaluators compare model outputs and rank them by accuracy, safety, and helpfulness. A reward model learns to simulate those preferences, and the LLM is fine-tuned to maximize that reward signal. The result is not just a safer model — it's a demonstrably better reasoner on complex tasks.
System 1 vs. System 2: The Next Leap in Machine Thinking
Karpathy borrows Daniel Kahneman's dual-process theory to describe a crucial limitation — and a coming breakthrough. Current LLMs operate in System 1 mode: they generate each token sequentially, with equal computational effort per step. Solving "2+2" and proving a theorem take the same amount of "thinking time." Fast, but brittle.
System 2 is deliberate, step-by-step reasoning. OpenAI's o1 and o2 models are the first serious attempt at this: before producing a final answer, they run internal chain-of-thought computations, explore multiple solution paths, and self-correct when they detect errors. Crucially, extending inference-time compute — letting the model "think longer" — produces performance gains that follow the same power-law curves as adding more training data. This opens a new dimension of scaling: not just bigger models, but deeper thinking.
The LLM OS: A New Architecture of Intelligence
Here is where Karpathy's thesis becomes most radical. He proposes treating LLMs not as applications, but as the kernel of a new operating system. The analogy maps cleanly:
The context window is RAM — it holds everything the model is currently "thinking about." RAG (Retrieval-Augmented Generation) is the file system — pulling in external knowledge on demand. Tool integrations — web search, code execution, API calls via protocols like MCP — are the peripherals and I/O layer. The LLM coordinates all of it, translating natural language intent into structured action.
This architecture has already evolved into multi-agent systems. Complex tasks — say, building a software feature from scratch — are decomposed and distributed: one agent designs the architecture, another writes code, another runs tests, another audits for security. They work in parallel, cross-checking each other's outputs. The single monolithic model gives way to a coordinated swarm.
Security in the Age of the LLM OS
With increased power comes a new attack surface. The threats LLMs face aren't traditional code vulnerabilities — they're linguistic and structural.
Jailbreaking attempts to bypass safety alignment through clever prompting — roleplay scenarios, encoded instructions, semantic sleight of hand. Prompt injection, especially its indirect variant, is more insidious: malicious instructions hidden inside web pages or documents can hijack an LLM agent's behavior mid-task, causing it to exfiltrate data or perform unauthorized actions without the user's awareness.
Data poisoning operates at the training level. By inserting maliciously crafted samples with specific trigger phrases into open-source training datasets, attackers can embed backdoors that remain dormant under normal conditions but activate when triggered. As LLMs become critical infrastructure, these supply-chain threats demand the same rigor we apply to securing traditional software pipelines.
Conclusion: The Ghost Is Learning to Think
Karpathy once described early LLMs as "summoning a spirit from the internet" — vast, knowledgeable, but fundamentally uncontrolled. The trajectory he outlines is toward something different: a system that reasons carefully, acts through tools, coordinates with other agents, and aligns its behavior with human values.
The shift from predicting the next token to orchestrating complex workflows is not a matter of adding more parameters. It requires rethinking what training means, what inference means, and what it means for a machine to "think." Karpathy's framework — dual-file architecture, three-stage training, System 1 / System 2 cognition, LLM-as-OS — gives us the vocabulary to ask those questions clearly. And in a domain moving this fast, asking the right questions is half the work.
Source: Andrej Karpathy, "[1hr Talk] Intro to Large Language Models"
Leave a comment ✎