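Original link: https://news.ycombinator.com/item?id=43620452
A Hacker News thread discusses allegations that Meta "gamed AI benchmarks" with its Llama models. Commenters expressed disappointment with the model's performance, saying it does not stand out among other open-source alternatives, and questioned whether it was released prematurely. Some doubted the value of LMArena as a reliable evaluation tool, arguing that its subjectivity makes it useful only to companies focused on user engagement. The discussion also extended to allegations that OpenAI gamed benchmarks by using training data it had promised not to use. One user argued that "optimizing for conversationality" might prioritize flattering prompts, which raised concerns about the motives behind benchmark comparisons. The discussion closed by noting that "open-weight" black-box models could be manipulated in unpredictable ways.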
This is about what I expected, but it makes you wonder what they're going to do next. At this point it looks like they are falling behind the other open models, and they made an ambitious bet on MoEs that hasn't paid off.
Did Zuck push for the release? I'm sure they knew it wasn't ready yet.