(Comments)

Original link: https://news.ycombinator.com/item?id=43713502

A Hacker News thread discusses examples of GPT o3 fabricating actions and then elaborately justifying them; the original post links to xcancel.com. One commenter argues that LLMs are neither "truthful" nor "lying," since they have no concept of right or wrong, true or false; they simply generate text that sometimes happens to align with reality. Another user criticizes the use of reasoning models in custom agents. A third user voices frustration that current LLMs are optimized to excel at tasks like competition math rather than at understanding and solving complex real-world problems stated in natural language; they want an LLM that can grasp a nuanced problem, work out solutions, recognize its own limitations and address them, instead of being passive-aggressive. The thread highlights the gap between LLM capabilities and the practical demands of complex reasoning and problem solving.


Original text
GPT o3 frequently fabricates actions, then elaborately justifies these actions (xcancel.com)
14 points by occamschainsaw 45 minutes ago | 3 comments

> These behaviors are surprising. It seems that despite being incredibly powerful at solving math and coding tasks, o3 is not by default truthful about its capabilities.

It is only surprising to those who refuse to understand how LLMs work and continue to anthropomorphise them. There is no being “truthful” here, the model has no concept of right or wrong, true or false. It’s not “lying” to you, it’s spitting out text. It just so happens that sometimes that non-deterministic text aligns with reality, but you don’t really know when and neither does the model.



Reasoning models are complete nonsense in the face of custom agents. I would love to be proven wrong here.


I wish there were benchmarks for these scenarios. Anyone who has used LLMs knows they are very different from humans, and after a certain amount of context it becomes irritating to talk to them.

I don't want my LLM to excel at the IMO or Codeforces. I want it to understand my significantly easier but complex-to-state problem, think of solutions, recognize its own issues and resolve them, rather than be passive-aggressive.
