(Comments)

Original link: https://news.ycombinator.com/item?id=43451742

The Hacker News discussion of the "Bitter Lesson" article centers on the trade-off between fully automated AI and human-supervised systems. One commenter stresses the importance of managing user expectations, arguing that an agent with slightly lower but more consistent accuracy (80% ± 10%) beats one with potentially higher but far more variable performance (90% ± 40%). Another commenter draws an analogy to chess computers: although "superhuman" performance has been achieved, the market is dominated by "good enough" solutions such as Stockfish. They argue that just because massive compute makes something possible does not guarantee a commensurately large market to pay for it, and they highlight the enormous infrastructure and human effort (e.g., model training) required to make compute-intensive systems actually useful. Other commenters agree that more compute generally yields better results than human-guided approaches, though some worry about the high cost of this approach, particularly the cost of GPUs.

Related articles
  • The Bitter Lesson is about AI agents. 2025-03-23
  • (Comments) 2025-03-22
  • (Comments) 2025-02-25
  • (Comments) 2025-03-12
  • (Comments) 2024-04-25

  • Original
    Bitter Lesson is about AI agents (ankitmaloo.com)
    14 points by ankit219 7 hours ago | 4 comments

    This misses that if the agent occasionally goes haywire, the user leaves and never comes back. AI deployments are about managing expectations - you're much better off with an agent that's 80 +/- 10% successful than 90 +/- 40%. The more you lean into full automation, the more guardrails you give up and the more variance your system has. This is a real problem.
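
    A minimal sketch of the consistency-vs-mean trade-off described above, under purely illustrative assumptions: per-session quality is drawn from a normal distribution with the stated mean and spread, and a single session below a 50% success threshold drives the user away permanently. The threshold, session count, and distribution are hypothetical choices, not taken from the comment.

        import random

        # Toy churn model (illustrative assumptions): per-session quality is
        # drawn from N(mean, std); one "haywire" session below `threshold`
        # makes the user leave and never come back.
        def retention(mean, std, threshold=0.50, sessions=20, users=50_000, seed=0):
            rng = random.Random(seed)
            kept = 0
            for _ in range(users):
                if all(rng.gauss(mean, std) >= threshold for _ in range(sessions)):
                    kept += 1
            return kept / users

        if __name__ == "__main__":
            # The two agents from the comment: 80 +/- 10% vs 90 +/- 40%.
            print(f"80 +/- 10%: {retention(0.80, 0.10):.1%} retained")
            print(f"90 +/- 40%: {retention(0.90, 0.40):.1%} retained")

    Under these toy assumptions, the lower-mean but lower-variance agent retains roughly 97% of users after 20 sessions, while the higher-mean, high-variance agent retains only a few percent.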


    Going back to the original "Bitter Lesson" article, I think the analogy to chess computers could be instructive here. A lot of institutional resources were spent trying to achieve "superhuman" chess performance; it was achieved, and today almost the entire TAM for computer chess is covered by good-enough Stockfish, while most of the money tied up in chess is in matching human players with each other across the world, and playing against computers is sort of what you do when you're learning, or don't have an internet connection, or you're embarrassed about your skill and don't want to get trash-talked by an Estonian teenager.

    The "Second Bitter Lesson" of AI might be that "just because massive amounts of compute make something possible doesn't mean that there will be a commensurately massive market to justify that compute".

    "Bitter Lesson" I think also underplays the amount of energy and structure and design that has to go into compute-intensive systems to make them succeed: Deep Blue and current engines like Stockfish take advantage of tablebases of opening and closing positions that are more like GOFAI than deep tree search. And the current crop of LLMs are not only taking advantage of expanded compute, but of the hard-won ability of companies in the 21st century to not only build and resource massive server farms, but mobilize armies of contractors in low-COL areas to hand-train models into usefulness.



    Good stuff, but the original "Bitter Lesson" article has the real meat, which is that by applying more compute power we get better results (just more accurate token predictions, really) than with human guardrails.


    "More" generally beats "better." That's the continual lesson from data-intensive workloads: more compute, more data, more bandwidth.

    The part I've been scratching my head over is whether we'll see a retreat from aspects of this due to the high costs involved. For CPU-based workloads it was a workable approach, since prices kept falling. GPU pricing, by contrast, has generally scaled roughly in proportion to available FLOPS, and the current hardware approach amounts to pouring in more power to get better results.
