Wasting Inferences with Aider

Original link: https://worksonmymachine.substack.com/p/wasting-inferences-with-aider

This experiment explores using multiple AI coding agents at once to automate bug fixing. When a bug is assigned to a designated agent user in Asana, a Sublayer agent triggers three separate Aider instances, each driven by a different LLM (GPT-4o, Claude 3.5 Sonnet, Gemini 2.0 Flash). Each instance attempts a fix, working on its own Git branch and opening a separate pull request.

The results suggest that "wasting inferences" (running multiple AI attempts in parallel) is viable because capable models are cheap. The approach provides redundancy, so a successful fix can emerge even if one model fails, and it yields a valuable comparison between the models' approaches. More importantly, it automates the entire bug-fixing process, turning manual coding into a background task triggered by a project-management workflow. The experiment suggests this "wasteful" agent-cluster approach is not a distant future concept but a practical strategy for improving coding automation today, especially as models become cheaper and more capable.

This Hacker News thread centers on the article's idea of using multiple AI agents to solve coding problems, dubbed "Wasting Inferences." The author argues that running the same task against multiple AI models and prompts produces better solutions than relying on a single attempt.

Commenters debated the practicality of the approach. Some pointed out that it could increase costs and requires careful review of multiple pull requests, which might outweigh the benefits. Others suggested that AI agents could also review code generated by other agents, or evaluate the quality of tests. There was also concern that large language models may converge on similar, and possibly flawed, solutions.

Several users shared their experience with AI coding tools, noting their strengths and weaknesses, the importance of clear prompts, and the need for human oversight. The discussion touched on the future of software development, including AI's role in problem solving and the evolution of developer skills; some emphasized the importance of continuous feedback and retraining of AI models.

Original article

This week’s ‘Works on My Machine’ explores a pattern that leans into the “Waste Inferences!” concept I’ve shared in the past and asks the question: what if we trigger multiple AI attempts automatically from our project management tool and just pick the best result? Inspired by Steve Yegge’s thoughts on Agent Clusters/Fleets in Revenge of The Junior Developer, this experiment connects Asana directly to the Aider coding agent via a Sublayer agent.

This demo showcases an automated workflow triggered entirely by assigning an Asana task:

  • A bug exists in a simple Rails Todo app, tracked as a task in Asana.

  • We assign the task to our designated “BugfixAgent” user in Asana.

  • A running Sublayer BugMonitorAgent (code below!) detects this assignment via an AsanaAssignedToAgentTrigger.

  • The agent takes a step: it fetches the task details (title, description) from Asana.

  • It then scripts the Aider coding agent, instructing it to fix the bug based on the task info.

  • Crucially, the agent runs Aider three separate times using the exact same prompt but targeting different powerful LLMs: GPT-4o, Claude 3.5 Sonnet, and Gemini 2.0 Flash.

  • Each Aider run operates in its own Git branch, applying the changes and running tests.

  • The result: Three distinct PRs are automatically created on GitHub, each representing one LLM’s attempt to fix the bug.
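The fan-out step above can be sketched in Ruby. The `--model`, `--message`, and `--yes` flags are real Aider CLI options, but everything else here is an illustrative assumption rather than the actual agent code: the model identifiers, the branch names, and the use of the GitHub CLI (`gh pr create`) to open each PR.

```ruby
# Sketch: one Aider run per model, each on its own Git branch.
# Model identifiers and branch names below are illustrative assumptions.
MODELS = {
  "gpt-4o"                     => "fix/gpt-4o",
  "claude-3-5-sonnet-20241022" => "fix/claude-3-5-sonnet",
  "gemini/gemini-2.0-flash"    => "fix/gemini-2-0-flash",
}.freeze

# Build the Aider invocation for one model.
# --model, --message, and --yes are real Aider CLI flags.
def aider_command(model:, prompt:)
  ["aider", "--model", model, "--message", prompt, "--yes"]
end

# Run each model on its own branch, then open one PR per attempt.
def fix_bug_in_parallel_branches(task_title, task_description)
  prompt = "Fix this bug:\n#{task_title}\n\n#{task_description}"
  MODELS.each do |model, branch|
    system("git", "checkout", "-b", branch) or next
    system(*aider_command(model: model, prompt: prompt))
    system("git", "push", "-u", "origin", branch)
    # Open a PR for this attempt (assumes `gh` is installed and authenticated)
    system("gh", "pr", "create",
           "--title", "#{model}: #{task_title}",
           "--body", "Automated fix attempt by #{model}",
           "--head", branch)
    system("git", "checkout", "main")
  end
end
```

Each `system` call is a plain shell-out; a real agent would check exit statuses and run the test suite before pushing.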

This experiment highlights a few key things relevant today:

  1. Agent Clusters are Accessible: While Yegge places Agent Clusters/Fleets in late 2025/2026, this simple setup demonstrates a basic form available now. We’re not coordinating deeply, but we are leveraging multiple “minds” on one problem.

  2. “Wasting Inferences” is Cheap: The core idea from my previous post holds up surprisingly well here. Running three separate attempts with powerful models cost basically nothing for this simple bug - less than 10 cents! If we continue to see the costs of models at this scale fall by 90% again, not only will parallel attempts be economically feasible, but they could become the standard way we get a high success rate when working with LLM-based agents.

  3. Redundancy & Comparison: Even if one model fails or produces a suboptimal fix (which happened!), others might succeed. You also get to compare different valid approaches (like Claude and GPT-4o in the video). This can be valuable learning or provide options.

  4. Automation Potential: The entire bug-fixing attempt happened automatically, “out of band” from my direct workflow, triggered solely by an Asana assignment. It transforms a manual coding task into a background process initiated by standard project management workflows. The idea of manually “driving” these AI agents with chat seems to me like it won’t be a thing much longer…
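The cost claim in point 2 can be sanity-checked with back-of-envelope arithmetic. The token counts and per-million-token prices below are illustrative assumptions for a small bug-fix run, not published pricing:

```ruby
# Back-of-envelope cost check for three parallel fix attempts.
# Each entry: [input_tokens, output_tokens, $/M input, $/M output].
# All four numbers per row are assumptions, not published prices.
RUNS = {
  "gpt-4o"            => [8_000, 1_000, 2.50, 10.00],
  "claude-3.5-sonnet" => [8_000, 1_000, 3.00, 15.00],
  "gemini-2.0-flash"  => [8_000, 1_000, 0.10, 0.40],
}.freeze

# Cost of one run in dollars, given token counts and per-million prices.
def run_cost(input_tokens, output_tokens, in_price, out_price)
  (input_tokens * in_price + output_tokens * out_price) / 1_000_000.0
end

total = RUNS.values.sum { |args| run_cost(*args) }
puts format("total: $%.4f", total) # about $0.07 with the assumed numbers
```

Even with generous token counts, the three attempts together land well under the "10 cents" observed in the demo.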

While this demo used a simple bug and app, the low cost and automation potential suggest this “Waste Inferences” / Agent Cluster approach could absolutely scale to more complex scenarios, especially as models and tooling improve. It shows us that we can and should actually experiment with these future workflows today because they’re closer than we think.

The “Buggy Todo App” is available at sublayerapp/buggy_todo_app, but the code for the agent that interacts with Asana and Aider is in the bugfix_agent/agents/bug_monitor_agent.rb file.

We also heavily relied on the open-source Aider coding agent: https://aider.chat

The agent and the Asana trigger are built with the Sublayer Rubygem, an agent framework designed to make it easy to build your own AI-powered devtools.
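For flavor, here is a hand-rolled approximation of what an "assigned to the agent" trigger has to do: poll the task list and pick out tasks newly assigned to the agent user. This is not Sublayer's actual AsanaAssignedToAgentTrigger implementation; the field names mirror Asana's task schema (`gid`, `assignee`), but the polling logic and stub data are assumptions.

```ruby
# Approximation of an "assigned to agent" trigger: given a batch of tasks,
# return those assigned to the agent user that we haven't processed yet.
# Field names follow Asana's task schema; the logic itself is an assumption.
def newly_assigned_to_agent(tasks, agent_gid, seen_gids)
  tasks.select do |task|
    task.dig("assignee", "gid") == agent_gid && !seen_gids.include?(task["gid"])
  end
end

# Example with stubbed task data (a real trigger would fetch these
# from the Asana API, sleep, and re-poll in a loop):
tasks = [
  { "gid" => "101", "name" => "Fix todo bug", "assignee" => { "gid" => "agent-1" } },
  { "gid" => "102", "name" => "Write docs",   "assignee" => { "gid" => "human-7" } },
]
hits = newly_assigned_to_agent(tasks, "agent-1", [])
# hits contains only task "101"; after kicking off the fix, its gid
# would be added to seen_gids so it isn't processed twice.
```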

I’d love to know what you think of this multi-LLM, “wasteful” approach to automated coding. Even I was surprised at how cheap it ended up being. If you get a chance to try this out or have been playing around with ideas like this on your own, I’d love to chat and hear how it’s going for you!
