Anthropic Founds the Anthropic Institute to Tackle AI's Societal Fallout
LAUNCH
Anthropic Founds the Anthropic Institute to Tackle AI's Societal Fallout
Anthropic isn't just doing safety research anymore: it's launching a full institute dedicated to the policy, governance, and societal challenges that powerful AI creates. This is a deliberate shift from "how do we make models safe" to "how does society adapt to models that are already here." The founding charter signals Anthropic wants a seat at every regulatory table, not just the technical ones. If you care about where AI governance is headed, this is the org to watch. Read more →
Google Ships Gemini Embedding 2, Its First Fully Multimodal Embedding Model
Gemini Embedding 2 puts text, images, and video into a single vector space, meaning your retrieval pipeline can now search across modalities without separate models or stitched-together hacks. This is Google's most capable embedding model yet, and the multimodal piece is the real unlock: imagine searching a video library with a text query and getting frame-level results. If you're running any kind of RAG or search system, benchmark this against whatever you're using today. (2,325 likes | 289 RTs) Read more →
Meta ships four generations of custom AI silicon in two years. MTIA, Meta's custom training and inference chip, has gone through four architectural generations since 2024, a pace that closes the gap between chip cycles and the model architecture changes that demand new hardware. Custom silicon is no longer a nice-to-have; it's a competitive moat that determines how fast you can iterate on model training. (264 likes | 44 RTs) Read more →
Fish Audio drops s2-pro, a next-gen open TTS model. Already trending on HuggingFace, s2-pro is an open text-to-speech model worth testing if you're building voice interfaces or audio content pipelines. If your current TTS sounds robotic, this is your excuse to switch. (240 likes | 746 downloads) Read more →
TOOL
Claude Code Gets /btw: Side Conversations While the Agent Works
The new /btw command in Claude Code lets you ask side questions without interrupting the agent's main task, solving the "do I wait or do I interrupt?" dilemma that kills flow in agentic coding sessions. Think of it as whispering to your pair programmer while they're mid-keystroke. This is the kind of UX detail that separates a tool you tolerate from one you love. (22,849 likes | 1,388 RTs) Read more →
OpenAI ships a Codex skill to automate GPT-5.4 migration. Instead of manually swapping model IDs and tweaking prompts, this dedicated skill handles the model swap, prompt adjustments, and compatibility checks for you. If you're still on GPT-5.3, this cuts migration from hours to minutes. (700 likes | 26 RTs) Read more →
Claude for Excel and PowerPoint now share context seamlessly. Cross-app context sharing means you can pull spreadsheet analysis into a deck without re-explaining your data: Claude remembers what it saw in Excel when you switch to PowerPoint. A genuine workflow unlock for anyone who lives in Office. (407 likes | 34 RTs) Read more →
Gemini CLI adds Plan Mode for read-only exploration. You can now explore and strategize with the model before making any changes, a safety-first pattern that lets you think through a refactor without accidentally touching files. More CLI tools should steal this idea. (250 likes | 22 RTs) Read more →
TECHNIQUE
Why your agentic coding evals are lying to you: Agentic coding benchmarks can vary by several percentage points between identical runs, because infrastructure config, server load, and non-deterministic tool use all inject noise. If you're choosing a model based on a 2% benchmark edge, you might just be measuring luck. Run your own evals, multiple times, on your own infra. (480 likes | 16 RTs) Read more →
Simon Willison's patterns for AI-assisted code that's actually better: Not faster code, but better code. Willison lays out practical agentic engineering patterns that prioritize correctness, readability, and maintainability over raw speed. The key insight: AI amplifies whatever you optimize for, so optimize for quality. Read more →
LangChain breaks down the anatomy of an agent harness: The harness is the system that wraps a model to turn intelligence into work, covering tool routing, memory management, error recovery, and orchestration. If you're building production agents and your architecture doesn't map cleanly to these components, you probably have a gap. Read more →
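Here's a rough way to sanity-check a benchmark edge against run-to-run noise. This is a minimal sketch, not a rigorous statistical test: the pass rates below are made up, `gap_exceeds_noise` is a hypothetical helper name, and the two-standard-error threshold is an assumption you should tune to your own risk tolerance.

```python
import statistics

def summarize(scores):
    """Return (mean, sample standard deviation) for one model's repeated runs."""
    return statistics.mean(scores), statistics.stdev(scores)

def gap_exceeds_noise(scores_a, scores_b, k=2.0):
    """True if the gap in means is larger than k standard errors of the difference."""
    mean_a, sd_a = summarize(scores_a)
    mean_b, sd_b = summarize(scores_b)
    # Standard error of the difference between two independent means.
    se = (sd_a ** 2 / len(scores_a) + sd_b ** 2 / len(scores_b)) ** 0.5
    return abs(mean_a - mean_b) > k * se

# Hypothetical pass rates (percent) from five identical runs per model.
model_a = [61.0, 64.0, 59.0, 63.0, 62.0]
model_b = [63.0, 60.0, 65.0, 62.0, 64.0]

# A one-point gap in means sits well inside the run-to-run spread here.
print(gap_exceeds_noise(model_a, model_b))
```

With only a handful of runs the estimate is crude, so treat a "no" as "not enough evidence yet" rather than "the models are equal."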
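To make the four harness components concrete, here's a toy loop that labels each one. It's a sketch under heavy assumptions: `scripted_model` is a hard-coded stand-in for a real LLM call, the `add` tool and every name here are hypothetical, and none of this is LangChain's API.

```python
from typing import Callable, Dict, List, Tuple

def add(a: str, b: str) -> str:
    """Hypothetical tool: add two integers passed as strings."""
    return str(int(a) + int(b))

TOOLS: Dict[str, Callable[..., str]] = {"add": add}

def scripted_model(memory: List[str]) -> Tuple[str, List[str]]:
    """Stand-in for an LLM call: picks the next action from what is in memory."""
    if not memory:
        return "add", ["2", "x"]      # deliberately bad call to exercise recovery
    if memory[-1].startswith("error"):
        return "add", ["2", "3"]      # retry with corrected arguments
    return "finish", [memory[-1]]     # last observation is the answer

def run_harness(model, max_steps: int = 5) -> str:
    memory: List[str] = []                     # memory management
    for _ in range(max_steps):                 # orchestration loop
        action, args = model(memory)
        if action == "finish":
            return args[0]
        tool = TOOLS.get(action)               # tool routing
        try:
            if tool is None:
                raise KeyError(f"unknown tool {action!r}")
            memory.append(tool(*args))
        except Exception as exc:               # error recovery: record and continue
            memory.append(f"error: {exc}")
    return "gave up"

print(run_harness(scripted_model))
```

The design choice worth noticing is that a tool failure becomes an observation in memory instead of a crash, which gives the model a chance to correct itself on the next step.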
RESEARCH
AlphaEvolve establishes new results in extremal combinatorics. DeepMind's AlphaEvolve isn't just verifying known proofs; it's producing novel mathematical results in extremal combinatorics that human mathematicians hadn't found. This is AI contributing to pure math at the frontier, not pattern-matching on textbook problems. (1,615 likes | 168 RTs) Read more →
NVIDIA Cosmos Policy connects world models to robot control. Cosmos Policy bridges the gap between simulation-trained world foundation models and actual physical manipulation, turning a model that understands physics into one that can act on it. The unified architecture means one model handles perception, planning, and control instead of a fragile pipeline of specialists. (115 likes | 19 RTs) Read more →
NVIDIA publishes methodology for synthetic code training data from concept seeds. The approach generates high-quality training examples by starting from programming concepts rather than scraping repos, giving you control over coverage, difficulty distribution, and domain focus. Practical for anyone fine-tuning code models who's hit the ceiling on natural data. Read more →
INSIGHT
Karpathy: The IDE Isn't Dying, It Needs to Level Up
Karpathy argues that the rise of agentic coding doesn't kill the IDE; it demands a fundamentally better one. Humans now program at a higher level of abstraction, and the basic unit of work has shifted from characters and lines to intentions and plans. The tooling needs to match that shift: better visualization of agent state, richer feedback loops, and interfaces designed for supervising work rather than typing it. If you're building developer tools, this is the north star. (2,920 likes | 198 RTs) Read more →
Security researchers use an AI agent to hack McKinsey's AI platform. Codewall details how they compromised McKinsey's production AI system using an autonomous agent, exploiting tool-use permissions and prompt injection vectors that most enterprise deployments haven't hardened against. If you're deploying agents with real credentials, audit your attack surface before someone else does. (223 likes | 88 RTs) Read more →
Anthropic opens Sydney office, its fourth in Asia-Pacific. After Tokyo, Bengaluru, and Seoul, Sydney rounds out an aggressive APAC footprint. Four offices across four major AI markets in under a year signals both a talent play and a regulatory engagement strategy: Anthropic wants local relationships with every government that's writing AI rules. Read more →
BUILD
Open-source browser protocol built for AI agents. Standard browsers weren't designed for autonomous navigation; this protocol handles page interaction, data extraction, and multi-step workflows in a way that's native to how agents actually work. If you've been fighting Puppeteer or Playwright to make your agent browse the web, this is the purpose-built alternative. (33 likes | 14 RTs) Read more →
MODEL LITERACY
Multimodal Embedding Spaces: Traditional embedding models map text into a vector space where similar meanings cluster together, but what happens when you add images and video? Gemini Embedding 2 puts all three modalities into a single shared space, meaning a text description of a sunset and a photo of a sunset land near each other in the same vector neighborhood. This works because the model is trained on aligned pairs across modalities, learning that "golden retriever playing fetch" (text) and an image of exactly that should have nearly identical vectors. Why this matters: retrieval pipelines can now search across modalities without separate models or post-hoc alignment. Ask a question in text, get back the relevant video frame, image, or document in one query.
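Once everything lives in one space, cross-modal search is just nearest-neighbor lookup. The sketch below shows that idea with hand-written 3-dimensional vectors standing in for real embeddings; it makes no call to Gemini Embedding 2, and the file names, vectors, and the `search` helper are all made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical index: (modality, item, embedding). Real embeddings would come
# from a multimodal model and have hundreds or thousands of dimensions.
INDEX = [
    ("text",  "doc: sunset over the ocean",   [0.9, 0.1, 0.0]),
    ("image", "photo_sunset.jpg",             [0.8, 0.2, 0.1]),
    ("video", "clip_dog_fetch.mp4 @ 00:12",   [0.1, 0.9, 0.2]),
    ("text",  "doc: quarterly revenue table", [0.0, 0.1, 0.9]),
]

def search(query_vec, k=2):
    """Rank every item, regardless of modality, by similarity to the query."""
    ranked = sorted(INDEX, key=lambda item: cosine(query_vec, item[2]), reverse=True)
    return [(modality, name) for modality, name, _ in ranked[:k]]

# Pretend this is the embedding of the text "golden retriever playing fetch".
query = [0.1, 0.95, 0.1]
print(search(query, k=1))  # the video frame ranks first for a text query
```

Note that the ranking code never looks at the modality field. That's the whole point of a shared space: one index, one similarity function.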
QUICK LINKS
- HuggingFace Storage Buckets: S3-like mutable storage with Xet dedup for ML checkpoints. (15 likes | 5 RTs) Link
- TEI v1.9: HuggingFace Text Embeddings Inference adds NVIDIA Blackwell GPU support. (25 likes | 3 RTs) Link
- Claude Code outages: Login issues and slower performance across Claude Code and claude.ai. (1,500 likes | 41 RTs) Link
- China's OpenClaw craze: MIT Tech Review profiles the gold rush of AI entrepreneurs building on China's open-source device control tool. Link
PICK OF THE DAY
AlphaEvolve producing novel combinatorics results marks a genuine inflection point. There's a meaningful difference between AI verifying known proofs faster and AI discovering results that humans hadn't found, and AlphaEvolve just crossed that line. Extremal combinatorics isn't a toy domain; these are hard, open problems that mathematicians have been chipping away at for decades. What makes this significant isn't the specific results; it's the methodology. AlphaEvolve uses evolutionary search guided by a language model to explore proof strategies that human intuition wouldn't prioritize. This is AI as a collaborator in unknown math, not a calculator for known math. The distinction reshapes what "AI for science" actually means: not just accelerating existing research pipelines, but expanding the space of what gets explored in the first place. Every research lab running compute-intensive searches over combinatorial spaces should be paying close attention. (1,615 likes | 168 RTs) Read more →
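For a feel of what "evolutionary search over a combinatorial space" looks like, here's a toy loop that hunts for large Sidon sets (sets where all pairwise sums are distinct, a classic extremal combinatorics object). To be clear, this is not AlphaEvolve: random mutation stands in for the language-model-proposed changes, and the problem choice, parameters, and function names are all assumptions for illustration.

```python
import random

def is_sidon(s):
    """True if all pairwise sums a + b (with a <= b) are distinct."""
    sums = set()
    items = sorted(s)
    for i, a in enumerate(items):
        for b in items[i:]:
            if a + b in sums:
                return False
            sums.add(a + b)
    return True

def mutate(s, n, rng):
    """Random stand-in for model-proposed edits: add, drop, or swap an element."""
    child = set(s)
    move = rng.choice(["add", "drop", "swap"])
    if move in ("drop", "swap") and child:
        child.discard(rng.choice(sorted(child)))
    if move in ("add", "swap"):
        child.add(rng.randint(1, n))
    return frozenset(child)

def evolve(n=30, generations=200, population=20, seed=0):
    """Keep a pool of valid sets, mutate them, and retain the largest survivors."""
    rng = random.Random(seed)
    pool = [frozenset()]  # the empty set is trivially valid
    for _ in range(generations):
        children = [mutate(rng.choice(pool), n, rng) for _ in range(population)]
        valid = [c for c in children if is_sidon(c)]  # evaluator rejects bad candidates
        pool = sorted(set(pool) | set(valid), key=len, reverse=True)[:population]
    return max(pool, key=len)

best = evolve()
print(sorted(best), len(best))
```

The shape is what matters: propose candidates, verify them with a cheap exact checker, keep the best, repeat. Swap the random `mutate` for a smarter proposer and you have the general pattern the blurb describes.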
Until next time