AI Is Like a Crappy Consultant

Original link: https://lukekanies.com/writing/ai-is-like-a-crappy-consultant/

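In a blog post dated May 12, 2025, the author looks back on using AI coding assistants to build iOS apps with Swift and SwiftUI, having previously been a staunch AI/LLM skeptic. They argue that AI should be treated like an untrustworthy consultant: supervise it closely and, at least at first, don't let it edit code directly, so that you actually learn. AI sped up development, but it proved to be a poor architect that proposed clumsy solutions and showed no strategic thinking. The author highlights where AI does excel: quickly finding syntax errors and handling repetitive work such as renaming methods or generating boilerplate. As they grew more familiar with Swift and SwiftUI, the author came to see the AI as a junior developer who gets the grunt work, while they keep control of key decisions and review all of its code. Future plans include letting the AI edit code directly for specific tasks, such as rolling out an error management framework, but always with thorough verification. The key takeaway is that AI coding assistants are valuable tools when used with care and human oversight, especially for tedious tasks and error detection.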

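A Hacker News discussion compared AI to a "crappy consultant" or a "junior engineer". Many users shared their experience that AI helps with well-defined tasks and clear instructions, much like mentoring a junior engineer. However, AI struggles with ambiguity, exploration, and architectural design, and often produces suboptimal or misleading code. The discussion stressed the importance of prompt engineering, including asking the AI to state its understanding and express its uncertainty. Some users found that writing clear instructions for the AI was harder than doing the task themselves. Others pointed to the "sycophancy problem": AI tends to assume that existing code and the user's assumptions are correct, which limits its ability to offer alternatives. AI can speed up code completion and refactoring, but some users warned against blindly trusting AI-generated code and stressed the need for testing and verification. The discussion also touched on the possibility of AI replacing some development tasks, which raised concerns about job losses. Overall, AI was seen as a useful tool that needs careful guidance and human oversight.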

Original text

I decided to finally give vibe coding a try. I’ve barely written any code since I hired developers at Puppet in like 2009. And I’ve been a staunch AI/LLM skeptic. But I figured I should at least be an educated skeptic.

After a false start, and a couple of months of periodic usage, I’ve come to some conclusions about it. The first one is the most important:

You should treat AI like an untrustworthy consultant.

Think of this scenario: You need help with your company’s core product. You have to bring in an outside expert, either just as another body, or more likely, because they have knowledge your team doesn’t. What do you do?

Give them commit access and let them work unsupervised? Of course not.

At the least, you have someone sit by their side, checking every line.

More likely, you don’t even let them touch the keyboard. After all, you want more than the help – you want your team trained up so they don’t need that help next time. The only way you’ll get that is if your team does the work, even if someone else is telling them what to do.

That’s how I consulted at Puppet: I show up, I walk you through all the work, and when I leave you have more than a functioning system; you actually know how to use it. Sure, it was faster and easier to do all the work myself. But no one ever learned anything then.

All the coding I’ve done with AI help is in Swift, using SwiftUI, to build iOS apps. I’ve never worked with anything like any of that – I’ve not used any of the specific tech (other than an iPhone as a user). I’ve never used a UI framework. I’ve never worked in statically typed languages. And I’ve barely ever worked in compiled languages. So, it seemed like a solid use case for getting some fast help.

On my first try, I used Cursor and let it edit everything.

After a few iterations, I had a bare-bones application. But… I felt like I was pushing around a bag full of bolts. There was definitely stuff in there. But I didn’t even know how to think about the changes I needed, because I didn’t understand enough.

So on the second iteration, I decided I would do all the typing: The AI does not get to touch the keyboard. When I started it was all gibberish, because I didn’t understand anything. But after only a session or two, I had a pretty good sense of how both the language and framework work. That’s about when I realized the second big thing:

AIs are crappy architects.

It kept giving me stupid advice. For instance, every time it encountered an error, it would just catch it and print some logs. Uhhh… that’s bad. It would encounter a small problem, and design a big stupid solution instead of doing a small rearchitecture. Because it can’t think, it couldn’t realize when it hit a design wall that needed rethinking.
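To make the complaint concrete, here is a minimal Swift sketch of the catch-and-print pattern next to the alternative of letting the error propagate. Everything named here (`SettingsError`, `requireValue`, `loadTimeout`, the settings dictionary) is hypothetical and made up for illustration; it is not code from the app described in this post.

```swift
// Hypothetical error type, for illustration only.
enum SettingsError: Error, Equatable {
    case missingKey(String)
}

// Hypothetical helper: throws when a required key is absent.
func requireValue(_ key: String, in settings: [String: Int]) throws -> Int {
    guard let value = settings[key] else {
        throw SettingsError.missingKey(key)
    }
    return value
}

// The pattern complained about above: catch, log, and carry on.
// The caller only ever sees nil and cannot tell what went wrong.
func loadTimeoutSwallowing(from settings: [String: Int]) -> Int? {
    do {
        return try requireValue("timeout", in: settings)
    } catch {
        print("Failed to load timeout: \(error)")
        return nil
    }
}

// The alternative: propagate the error so the caller has to decide
// what a missing value means at that point in the program.
func loadTimeout(from settings: [String: Int]) throws -> Int {
    try requireValue("timeout", in: settings)
}

// Usage: the failure is handled where there is enough context to act on it.
do {
    let timeout = try loadTimeout(from: [:])
    print("timeout = \(timeout)")
} catch SettingsError.missingKey(let key) {
    print("cannot continue without \(key)")
} catch {
    print("unexpected error: \(error)")
}
```

The swallowing version looks safe because nothing crashes, but every caller inherits an optional with no explanation attached. The throwing version costs one `try` at each call site and keeps the decision with code that knows what to do about it.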

After a while, I concluded that it wasn’t very good at the back-end code – I have a lot of experience with modeling and data flow, and I kept finding dumb things it did. But I never found those dumb things in the area where I have no experience: the UI.

But… then I thought for a bit. And I realized, duh, that’s probably just because I’m not good enough yet to recognize the dumb stuff it’s doing.

Coincidentally, I am now hitting a repetitive wall with Swift’s type checking. I keep having to break a view into smaller and smaller files so that it can compile fast enough. Turns out that’s the AI’s only trick for fixing this problem. But a small amount of research shows there are other options. In particular, I can instrument the compile and see what’s actually taking all the time, and focus on just rewriting that code to be more compiler-friendly. This is the kind of stuff an experienced programmer does without thinking, but a crappy consultant whose entire experience is based on trawling Stack Overflow probably never figures it out.
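The post doesn't say which instrumentation is meant, so treat the following as one possible approach rather than the author's method. The Swift compiler has frontend flags that warn when a single expression or function body takes longer than a threshold to type-check; once a warning points at a specific expression, splitting it into explicitly typed steps usually helps. The function names and the layout math below are hypothetical, and the 100 ms threshold is an arbitrary choice.

```swift
// Flags to pass to the compiler (in Xcode, via "Other Swift Flags"):
//   -Xfrontend -warn-long-expression-type-checking=100
//   -Xfrontend -warn-long-function-bodies=100
// Each emits a warning for any expression or function body whose
// type-checking takes longer than the given number of milliseconds.

// Hypothetical layout math written as one dense expression. The compiler
// has to resolve every literal and operator overload in one pass.
func rowHeightDense(count: Int, spacing: Double, padding: Double) -> Double {
    return Double(count) * 44 + Double(max(count - 1, 0)) * spacing + padding * 2
}

// The same math, split into small steps with explicit types. Each line
// gives the type checker a much smaller problem to solve.
func rowHeight(count: Int, spacing: Double, padding: Double) -> Double {
    let rows: Double = Double(count) * 44.0
    let gaps: Double = Double(max(count - 1, 0)) * spacing
    let insets: Double = padding * 2.0
    return rows + gaps + insets
}
```

This dense version is small enough to compile quickly; it only shows the shape of the rewrite. The point of measuring first is that only the expressions the compiler flags need this treatment, instead of breaking every view into ever smaller files.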

I did find one area where LLMs absolutely excel, and I’d never want to be without them:

AIs can find your syntax error 100x faster than you can.

They’ve been a useful tool in multiple areas, to my surprise. But this is the one space where they’ve been an honestly huge help: I know I’ve made a mistake somewhere and I just can’t track it down. I can spend ten minutes staring at my files and pulling my hair out, or get an answer back in thirty seconds.

There are whole categories of coding problems that look like this, and LLMs are damn good at nearly all of them. Where did I fail to log correctly? Where am I not handling an error appropriately? If I am updating this method name, which files do I have to change?

I’ve had about five to ten sessions of using Claude – just on the command line, no fancy tools, and no editing allowed. I now have enough experience with Swift and SwiftUI that the terms of the relationship have changed. I feel more like the senior engineer, and it is the junior developer I have working on a short-term contract. I can pass it the stupid grunt work (rename this method through the whole system, fill out this boilerplate for a new view, figure out how this new library works).

But I absolutely can’t trust it to make any big decisions. And I have to check all of its work. (Even better, it won’t be offended when I do.)

After a few more sessions, I’ll probably start letting it edit files directly. But only for that kind of large-scale, small-change work. Once I build the error management framework, it can probably push it through the system (especially since I’ve sprinkled my code with comments wherever error management is missing). But I’ll still walk through all the changes to ensure they actually make sense.
