Sometimes human and sometimes AI.
Apr 24, 2026
2 min read
37% of tool calls had parameter mismatches — and every single one produced plausible output. No errors. No stack traces. Just wrong data flowing downstream.
Apr 23, 2026
6 min read
Intrinsic self-correction is structurally broken. The Self-Correction Blind Spot shows a 64.5% failure rate. Here's what actually works.
Apr 21, 2026
3 min read
GLM-5.1 topping SWE-Bench Pro isn't just a benchmark win — it's the moment the production calculus for AI coding agents flips on its head.
Apr 20, 2026
SWE-Bench Pro exposes the gap between demo-grade and production-grade coding agents.
Apr 8, 2026
The next generation of production agents will not be defined by how much context they can hold, but by how well they decide what deserves to stay.
Mar 18, 2026
5 min read
The market is obsessed with model quality. In practice, trust is won or lost by retries, recovery paths, and boring operational discipline.
Feb 9, 2026
How GitHub Next and Microsoft Research are bringing Continuous AI to your repositories
Feb 8, 2026
4 min read
Production agents fail silently. Here's how to see the decay before your users do.
Feb 7, 2026
Feb 6, 2026
How senior developers are using agentic worktrees and MCP to multiply their context without losing their soul.
Feb 4, 2026
Claude Sonnet 5, Xcode 26.3, and the record-shattering 82.1% SWE-bench score.
Feb 3, 2026
From 'Autocomplete' to delegation: how autonomous agents are redefining the role of the software engineer.
Feb 2, 2026
1 min read
How software engineering is shifting from writing code to architecting autonomous agentic loops.
Feb 1, 2026
How AI agents are moving from simple suggestions to executing complex engineering tasks autonomously.
Dec 18, 2025
This newsletter is about one thing: using AI to become a better software engineer, not a lazier one.