
QCon London, the State of AI Coding: More Capable, Costlier, and More Dangerous Coding Agents

📅 2026-03-22 16:08 · Olimpiu Pop · Artificial Intelligence · 5-minute read
AI coding agents · Context engineering · Prompt injection · Security risks · Agent swarms
📌 One-sentence summary: In her QCon London keynote, Thoughtworks Distinguished Engineer Birgitta Böckeler analysed the evolution of AI coding from vibe coding to autonomous agents, highlighting two key challenges: escalating security threats from prompt injection and rising development costs.

📝 Detailed summary: This article reports on Birgitta Böckeler's QCon London keynote on the state of AI coding. The talk covered three major shifts: the evolution from vibe coding to autonomous coding agents and agent swarms, significant progress in context engineering (from a single rules file to granular skills with lazy loading), and emerging security and cost concerns. Böckeler proposed a risk framework based on the probability of mistakes, their impact, and the detectability of errors.

In her QCon London keynote, Birgitta Böckeler, Distinguished Engineer for AI-assisted Software Delivery at Thoughtworks, reflected on the changes in the AI coding space over the past year. She emphasised a shift from vibe coding to using autonomous coding agents or swarms of agents. According to her, two major concerns in the field are the worsening security landscape and the rising costs of agent-based development.

In the introductory part of her presentation, she reminded the audience about the state of AI coding just a year ago: "Vibe coding was just two months old", "MCP was all the rage," and "Claude Code was not even generally available yet." She highlighted that context engineering is probably the most significant development of the year. This context refers to the curated information a model or agent reads to improve its results. Last spring, it was limited to a single rules file (agents.md or claude.md) loaded at the start of each session to capture coding conventions and recurring pitfalls.


Anthropic has since broken down this "monolithic" file into smaller skills, resulting in a more granular approach to coding capabilities. This enables a more pragmatic approach known as "lazy loading," in which different sets of rules are loaded based on the task at hand. This not only improves organisation but also ensures that the limited context window fills more slowly. However, Böckeler pointed out that a "fresh" Claude Code session had already reached 15% capacity before any prompt was even given.
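Anthropic's published skills format gives a concrete picture of the lazy loading described here: each skill lives in its own directory with a SKILL.md file whose short frontmatter (name and description) is all that is loaded at startup, while the full body is read only when the agent judges the skill relevant to the task. The skill below is a hypothetical example, not one from the talk:

```markdown
---
name: db-migrations
description: Conventions for writing and reviewing database migrations
---

<!-- The body below is loaded only when the agent activates this skill -->
- Every migration must be reversible; include a `down` step.
- Never rename columns in place; add the new column, backfill, then drop the old one.
```

Because only the two frontmatter lines per skill count against the context window at session start, a project can carry many skills without the "monolithic" rules file's up-front cost.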

Böckeler emphasised that we are moving closer to "hands-off" coding, as these coding agents can now run unsupervised for up to 20 minutes. Headless CLI modes can directly connect to CI/CD pipelines via GitHub Actions. Some practitioners, following Steve Yegge's "eight stages of dev evolution to AI", run three or more local sessions in parallel; however, Böckeler noted her experience of "typing the wrong thing into the wrong session."
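The headless pattern she referenced can be sketched as a GitHub Actions workflow invoking the CLI's non-interactive print mode. The workflow below is an illustration under assumptions, not a setup from the talk; the `-p`/`--print` and `--allowedTools` flags follow Claude Code's documented CLI, and the workflow and secret names are invented:

```yaml
name: agent-triage
on:
  issues:
    types: [opened]
jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Claude Code headless
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          # Non-interactive run; restrict tools so the agent can read
          # the repo but cannot write files or reach the network.
          claude -p "Summarise this issue and suggest a label" \
            --allowedTools "Read,Grep"
```

Restricting the tool allowlist in CI is exactly the kind of guardrail the later security discussion argues for.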


An even more advanced approach involves coding agent swarms. She argued, though, that experiments from Cursor and Anthropic, in which a "team" of coding agents built C compilers or web browsers in a few days, are somewhat skewed: those tasks are well defined and come with extensive public test suites, which is rarely true of enterprise software. A more accessible entry point is Claude Code's Agent Teams feature, which orchestrates a small number of agents with a clear coordination model.

To ensure the appropriate level of supervision, she proposed a risk framework based on three variables: the probability that the AI will make a mistake, the impact of that mistake, and the detectability of the error. Only the first variable is genuinely novel: developing intuition for how well a tool can handle a given task. The other two are engineering judgments that experienced developers should already possess.
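Her framework can be read as a simple decision aid. The sketch below is an illustration of the idea, not code from the talk: the talk proposed the three variables, while the low/medium/high scoring and the thresholds are invented here.

```python
# Illustrative sketch of the three-variable risk framework:
# probability of a mistake, impact of the mistake, detectability of the error.
# Scores and thresholds are invented for illustration only.

LEVELS = {"low": 1, "medium": 2, "high": 3}

def supervision_level(mistake_probability: str, impact: str, detectability: str) -> str:
    """Suggest how closely to supervise an agent on a given task.

    High probability and high impact push toward supervision; high
    detectability (errors easy to spot, e.g. via a strong test suite)
    pulls away from it.
    """
    score = LEVELS[mistake_probability] * LEVELS[impact] - LEVELS[detectability]
    if score <= 1:
        return "hands-off"
    if score <= 4:
        return "spot-check"
    return "review every change"

# Example: a risky refactor in code with weak test coverage
print(supervision_level("high", "high", "low"))  # → review every change
```

The interesting property the talk highlights survives the simplification: detectability can compensate for probability, which is why well-tested codebases tolerate more autonomy.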


Beyond simply generating functionally incorrect code, security incidents involving coding agents now occur weekly, most rooted in prompt injection. Eleven days before the talk, an attacker used a crafted GitHub issue to extract secrets and upload malicious packages to an NPM registry, a direct result of an unsupervised agent operating without sufficient sandboxing. Simon Willison's "lethal trifecta" captures when this becomes dangerous: an agent that combines exposure to untrusted content, access to private data, and the ability to communicate externally. Connecting an email account with read-and-send permissions, for example, satisfies all three conditions.
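Willison's trifecta lends itself to a mechanical audit of an agent's capability grants. The sketch below is a minimal illustration; the capability names are invented here, not part of any real tool's API:

```python
from dataclasses import dataclass

@dataclass
class AgentCapabilities:
    reads_untrusted_content: bool   # e.g. public GitHub issues, inbound email
    accesses_private_data: bool     # e.g. secrets, private repos, mail archive
    communicates_externally: bool   # e.g. HTTP requests, sending email

def lethal_trifecta(caps: AgentCapabilities) -> bool:
    """True only when all three risk conditions hold at once.

    Removing any one leg (sandboxing untrusted input, withholding
    secrets, or blocking outbound channels) breaks the trifecta.
    """
    return (caps.reads_untrusted_content
            and caps.accesses_private_data
            and caps.communicates_externally)

# An email integration with read-and-send permissions hits all three:
mail_agent = AgentCapabilities(True, True, True)
print(lethal_trifecta(mail_agent))  # → True
```

The point of framing it this way is that mitigation does not require solving prompt injection outright; denying any single capability is enough to defuse the combination.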

> Böckeler: Security is not a technical problem; it's a conceptual problem.

In her conclusion, Böckeler noted that while model improvements are real, they are the least interesting developments compared to the shifts in tooling and practices surrounding them. An OpenAI team running a five-month autonomous greenfield project still reported entropy creeping in, despite custom linters and garbage-collection agents. The main question she posed to the audience was, "What practices will you enforce on your coding agent?" Whether these practices are good or bad, AI coding will amplify them.
