Why agents are bad pair programmers

Original link: https://justin.searls.co/posts/why-agents-are-bad-pair-programmers/

As of May 2025, LLM agents like GitHub Copilot are increasingly used for pair programming, but their rapid coding speed creates problems. The author finds the experience similar to pairing with an expert programmer who monopolizes the keyboard: the agent's speed is overwhelming, which erodes engagement and makes the code hard to follow. Rather than real-time, editor-based "agentic pairing," the author recommends asynchronous workflows, such as GitHub's Coding Agent combined with code review. For editor-based work, he suggests the "Edit" or "Ask" modes, which require manual approval of suggestions. A consistent workflow like ping-pong pairing, where the LLM proposes edits and the user approves them, works better. The author urges AI tool-makers to make the pairing experience more human: throttle coding speed, allow pauses for clarification, integrate issue tracking, and add voice chat. Agents should also express doubt, solicit advice, and validate decisions to better emulate human collaboration. The goal is to turn AI agents from fast coders into collaborative partners.

The Hacker News discussion thread covers the challenges of pair programming with AI. Many users find that although AI agents are fast, the code they generate often mismatches their intent or coding style, causing frustration and rework. A common complaint is that the AI produces large blocks of code without consulting the user, which makes it hard to give feedback and maintain a consistent codebase.

Some commenters offer strategies for improving the AI pairing experience: start with a detailed planning session, break tasks into smaller steps, and give explicit instructions about coding preferences and project goals. Several users also recommend the "Ask" or "Edit" modes for tighter control over the process. The general consensus is that AI agents are useful for generating boilerplate or handling tasks outside one's own expertise, but they require careful supervision and a proactive approach to avoid producing unmaintainable or suboptimal code. Tailoring prompts to a project's specific needs is also emphasized.

Original article

LLM agents make bad pairs because they code faster than humans think.

I'll admit, I've had a lot of fun using GitHub Copilot's agent mode in VS Code this month. It's invigorating to watch it effortlessly write a working method on the first try. It's a relief when the agent unblocks me by reaching for a framework API I didn't even know existed. It's motivating to pair with someone even more tirelessly committed to my goal than I am.

In fact, pairing with top LLMs evokes many memories of pairing with top human programmers.

The worst memories.

Memories of my pair grabbing the keyboard and—in total and unhelpful silence—hammering out code faster than I could ever hope to read it. Memories of slowly, inevitably becoming disengaged after expending all my mental energy in a futile attempt to keep up. Memories of my pair hitting a roadblock and finally looking to me for help, only to catch me off guard and without a clue as to what had been going on in the preceding minutes, hours, or days. Memories of gradually realizing my pair had been building the wrong thing all along and then suddenly realizing the task now fell to me to remediate a boatload of incidental complexity in order to hit a deadline.

So yes, pairing with an AI agent can be uncannily similar to pairing with an expert programmer.

What should we do instead? Two things:

  1. The same thing I did with human pair programmers who wanted to take the ball and run with it: I let them have it. In a perfect world, pairing might lead to a better solution, but there's no point in forcing it when both parties aren't bought in. Instead, I'd break the work down into discrete sub-components for my colleague to build independently. I would then review those pieces as pull requests. Translating that advice to LLM-based tools: give up on editor-based agentic pairing in favor of asynchronous workflows like GitHub's new Coding Agent, whose work you can also review via pull request
  2. Continue to practice pair-programming with your editor, but throttle down from the semi-autonomous "Agent" mode to the turn-based "Edit" or "Ask" modes. You'll go slower, and that's the point. Also, just like pairing with humans, try to establish a rigorously consistent workflow as opposed to only reaching for AI to troubleshoot. I've found that ping-pong pairing with an AI in Edit mode (where the LLM can propose individual edits but you must manually accept them) strikes the best balance between accelerated productivity and continuous quality control
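
The ping-pong workflow in item 2 can be sketched as a turn-based loop. This is a hypothetical illustration, not Copilot's actual API: the `ProposedEdit` type and the `review`/`apply_edit` hooks are names I've invented to show the key property, which is that nothing lands in the codebase until the human explicitly accepts each individual edit.

```python
from dataclasses import dataclass

@dataclass
class ProposedEdit:
    """A single edit the agent suggests; nothing is applied until approved."""
    file: str
    description: str
    patch: str

def ping_pong_session(agent_propose, review, apply_edit):
    """Turn-based loop: the agent proposes one edit at a time, and the
    human must explicitly accept or reject each one before the next."""
    applied = []
    for edit in agent_propose():
        if review(edit):        # human gate: no silent bulk changes
            apply_edit(edit)
            applied.append(edit)
    return applied
```

The design point is the gate itself: quality control happens continuously, one small diff at a time, instead of as a painful review of hundreds of lines after the fact.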

Give people a few more months with agents and I think (hope) others will arrive at similar conclusions about their suitability as pair programmers. My advice to the AI tool-makers would be to introduce features to make pairing with an AI agent more qualitatively similar to pairing with a human. Agentic pair programmers are not inherently bad, but their lightning-fast speed has the unintended consequence of undercutting any opportunity for collaborating with us mere mortals. If an agent were designed to type at a slower pace, pause and discuss periodically, and frankly expect more of us as equal partners, that could make for a hell of a product offering.

Just imagining it now, any of these features would make agent-based pairing much more effective:

  • Let users set how many lines-per-minute of code—or words-per-minute of prose—the agent outputs
  • Allow users to pause the agent to ask a clarifying question or push back on its direction without derailing the entire activity or train of thought
  • Expand beyond the chat metaphor by adding UI primitives that mirror the work to be done. Enable users to pin the current working session to a particular GitHub issue. Integrate a built-in to-do list to tick off before the feature is complete. That sort of thing
  • Design agents to act with less self-confidence and more self-doubt. They should frequently stop to converse: validate why we're building this, solicit advice on the best approach, and express concern when we're going in the wrong direction
  • Introduce advanced voice chat to better emulate human-to-human pairing, which would allow the user both to keep their eyes on the code (instead of darting back and forth between an editor and a chat sidebar) and to light up the parts of the brain that find mouth-words more engaging than text
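
The first feature above, a user-configurable lines-per-minute cap, amounts to a simple pacing layer over the agent's output stream. A minimal sketch, assuming hypothetical names (`throttled_lines` and its parameters are not any real tool's API):

```python
import time

def throttled_lines(lines, lines_per_minute=30, sleep=time.sleep):
    """Yield agent output one line at a time, pausing between lines so a
    human can read along; the pace is set by the user, not the model."""
    delay = 60.0 / lines_per_minute
    for i, line in enumerate(lines):
        if i:
            sleep(delay)  # pause between lines, not before the first
        yield line
```

The `sleep` parameter is injected so the pacing policy stays testable and swappable, e.g. a UI could replace it with "wait until the user scrolls past this line."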

Anyway, that's how I see it from where I'm sitting the morning of Friday, May 30th, 2025. Who knows where these tools will be in a week or month or year, but I'm fairly confident you could find worse advice on meeting this moment.

As always, if you have thoughts, e-mail 'em.


Got a taste for hot, fresh takes?

Then you're in luck, because you can subscribe to this site via RSS or Mastodon! And if that ain't enough, then sign up for my newsletter and I'll send you a usually-pretty-good essay once a month. I also have a solo podcast, because of course I do.
