Karpathy: GPT training and inference in 243 lines of pure Python
2026-03-06
Here's what matters in AI right now.
Today: Meta drops SAM Audio to isolate any sound with a text prompt, GPT-5.2 derives a new result in theoretical physics, and Cursor's third era arrives as cloud agents overtake the IDE.
🧠 LAUNCH
Meta drops SAM Audio — isolate any sound with a text prompt.
Meta SAM Audio is the first unified model that separates any sound from complex audio mixtures using text, visual, or span prompts. Open-sourced with a perception encoder, benchmarks, and papers — this opens audio editing workflows that were previously impossible or required hours of manual work. If you work with audio in any capacity, this is worth exploring immediately. (6,450 likes | 933 RTs) Read more →
Perplexity pplx-embed-v1-0.6b enters the embedding game. At just 0.6B parameters, it's highly deployable for RAG and search pipelines — and it comes from a team that lives and breathes retrieval quality. If you're running a retrieval stack, benchmark this against whatever you're currently using. (125 likes | 14.5K downloads) Read more →
Claude is a "space to think": Anthropic reframes Claude's product direction — not a chatbot, not a search engine, but a thinking space. This is a deliberate divergence from the ChatGPT/Gemini paradigm, and if you're building on Claude's API, it signals where the platform is headed. Read more →
🔧 TOOL
LocoOperator-4B fills a real gap: a 4B-parameter model purpose-built for GUI interaction and tool calling, small enough to self-host. If you've been waiting for a local agent model that can actually click buttons and fill forms without sending everything to a cloud API, this is your starting point. (268 likes | 3.7K downloads) Read more →
Claude Code Security gets its full blog post. Anthropic details how it scans codebases for vulnerabilities and generates targeted patches for human review — positioning Claude as a security tool, not just a coding assistant. If you're on a security team, request the research preview. Read more →
📝 TECHNIQUE
Karpathy says CLIs are the killer agent interface — and he's right. The argument: CLIs are deterministic, composable, and machine-readable, which is exactly what agents need. Pair GitHub CLI + Polymarket CLI + a coding agent and you get arbitrary dashboards built on demand. The "legacy" interface turns out to be the future one. (11,698 likes | 1,106 RTs) Read more →
Your LLM doesn't write correct code — and it's your fault. Practical guide arguing that LLM code quality is primarily a prompting problem: define acceptance criteria, test cases, and constraints upfront, and correctness improves dramatically. Backed by real quantitative examples, not vibes. (393 likes | 278 RTs) Read more →
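The article's advice boils down to turning vibes into a spec before you prompt. One way to operationalize that, sketched in Python (the task, criteria, and helper names here are hypothetical, not from the article):

```python
# Hypothetical sketch: a code-generation prompt that states acceptance
# criteria and test cases upfront, per the article's advice.
spec = {
    "task": "Write a function slugify(title) that turns a blog title into a URL slug.",
    "acceptance_criteria": [
        "output is lowercase",
        "runs of whitespace become a single hyphen",
        "characters outside [a-z0-9-] are stripped",
    ],
    "test_cases": [
        ("Hello, World!", "hello-world"),
        ("  Multiple   Spaces ", "multiple-spaces"),
    ],
}

def build_prompt(spec):
    # Assemble task, criteria, and executable tests into one prompt string.
    lines = [spec["task"], "", "Acceptance criteria:"]
    lines += [f"- {c}" for c in spec["acceptance_criteria"]]
    lines += ["", "The code must pass these tests:"]
    lines += [f"assert slugify({i!r}) == {o!r}" for i, o in spec["test_cases"]]
    return "\n".join(lines)

print(build_prompt(spec))
```

The point isn't the template; it's that the assertions give the model (and you) a machine-checkable definition of "correct" before any code exists.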
AI-resistant technical evaluations: Anthropic's engineering team shares their framework for interview questions that remain meaningful when candidates have AI tools. The key insight: test judgment under ambiguity, not factual recall. If you're hiring engineers in 2026, this is required reading. Read more →
🔬 RESEARCH
GPT-5.2 derives a new result in theoretical physics.
OpenAI, collaborating with IAS, Vanderbilt, Cambridge, and Harvard, published a preprint showing GPT-5.2 discovered that a gluon interaction many physicists believed to be impossible can arise under specific conditions. This is the first credible AI-derived result in fundamental physics — not just pattern-matching on existing literature, but genuinely extending human knowledge. (9,618 likes | 1,507 RTs) Read more →
LeCun drops empirical ammo in the "understanding" debate. Yann LeCun amplifies research showing GPT gives contradictory moral judgments when question framing changes — judging the same action acceptable or unacceptable depending on how it's described. It's not a new argument, but it's the cleanest experimental evidence yet that pattern matching ≠ understanding. Matters a lot for anyone deploying LLMs in safety-critical domains. (22,774 likes | 2,586 RTs) Read more →
💡 INSIGHT
Cursor's third era: cloud agents have overtaken the IDE.
Cursor has acquired Graphite and Autotab and says cloud agents now exceed its IDE use case in volume. This is a $50B company telling you the future isn't VS Code forks — it's headless agents that write, review, and ship code without ever opening an editor. Every dev-tools founder should be rethinking their roadmap today. Read more →
OpenAI enters the Pentagon's classified network.
Sam Altman announces OpenAI will deploy models inside the Department of War's classified network, while emphasizing prohibitions on domestic mass surveillance and offensive cyber operations. This is a landmark moment — the first frontier AI provider inside classified military infrastructure. The policy guardrails sound reassuring, but enforcement in a classified environment is inherently opaque. (34,437 likes | 4,061 RTs) Read more →
"We might all be AI engineers now": A developer's personal reckoning with the reality that every software role now involves AI engineering — whether you chose it or not. Backend, frontend, infra — the distinctions are dissolving. If you haven't assessed your own AI skill gaps, you're already behind. (110 likes | 158 RTs) Read more →
🏗️ BUILD
Karpathy: GPT training and inference in 243 lines of pure Python.
No PyTorch. No TensorFlow. No dependencies. Karpathy distills the full algorithmic content of GPT into 243 lines of dependency-free Python — a masterclass in separating what actually matters from what's just efficiency scaffolding. If you want to truly understand transformers rather than just import them, study this code. (25,229 likes | 3,179 RTs) Read more →
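Karpathy's 243 lines are the thing to study; for a taste of what dependency-free transformer code looks like, here's a toy single-head causal self-attention in plain Python (an illustrative sketch, not code from his repo):

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(q, k, v):
    """Single-head causal self-attention.
    q, k, v: lists of T token vectors, each of dimension d."""
    d = len(q[0])
    out = []
    for t in range(len(q)):
        # Scaled dot-product scores against positions <= t
        # (the causal mask falls out of the loop bound).
        scores = [sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        w = softmax(scores)
        # Weighted sum of value vectors.
        out.append([sum(w[s] * v[s][j] for s in range(t + 1))
                    for j in range(d)])
    return out
```

Everything else in a real GPT (batching, kernels, KV caches, mixed precision) is the efficiency scaffolding Karpathy strips away.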
Qwen3-14B distilled from Claude Opus 4.5 is available in GGUF for local inference. Ironic timing given Anthropic's anti-distillation stance — and practically useful if you need Opus-level reasoning on consumer hardware. 14B parameters means this runs comfortably on a decent GPU. (277 likes | 83.1K downloads) Read more →
🎓 MODEL LITERACY
Distillation: When you see a model "distilled from Claude Opus 4.5," it means a smaller model was trained to mimic the larger model's outputs rather than learning from raw data. The student model is fed inputs, the teacher model generates responses, and the student learns to reproduce them. The result: a fraction of the parameters, a fraction of the compute cost, and — if done well — surprisingly close performance on specific tasks. The catch? Distilled models inherit the teacher's biases and blind spots, and they tend to be narrow where the teacher was broad. It's compression with lossy tradeoffs.
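The classic recipe (Hinton-style distillation) trains the student to match the teacher's temperature-softened output distribution. Here's a minimal sketch of that loss in plain Python — illustrative only, not the actual training setup behind the Qwen3-14B distill above:

```python
import math

def softmax_T(logits, T=1.0):
    # Temperature-scaled softmax: higher T yields softer targets,
    # exposing more of the teacher's relative preferences.
    zs = [z / T for z in logits]
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    s = sum(es)
    return [e / s for e in es]

def distill_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) over temperature-softened distributions.
    Zero when the student exactly matches the teacher; the student is
    trained to drive this toward zero across many inputs."""
    p = softmax_T(teacher_logits, T)
    q = softmax_T(student_logits, T)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

In practice this KL term is averaged over large batches of teacher-generated outputs (often mixed with a standard loss on ground-truth labels), which is why the student inherits the teacher's biases along with its skills.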
⚡ QUICK LINKS
- AI Engineer will be the LAST job: Latent Space on why AI engineering may be the last technical role standing as agents take over implementation. Link
- Truth in the time of Artifice: The epistemological crisis of AI-generated content becoming indistinguishable from human work. Link
🎯 PICK OF THE DAY
Cursor's cloud-agent pivot is the canary in the coal mine for every dev tool. When a $50B company that built its empire on "better VS Code" tells you the IDE is no longer the main event, pay attention. Cursor acquiring Graphite (code review) and Autotab (browser agents) isn't diversification — it's a bet that the entire developer workflow moves from human-in-the-editor to human-reviewing-agent-output. The implications cascade: if coding agents run headlessly in the cloud, your IDE becomes a review tool, your CI/CD pipeline becomes the primary interface, and the CLI — as Karpathy pointed out today — becomes the agent's native habitat. For developers, the question isn't whether this shift happens, but whether you're the one directing the agents or the one being replaced by them. Read more →
Until next time ✌️