(Comments)

Original link: https://news.ycombinator.com/item?id=43627354

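Hacker News users are discussing Google's new "Deep Research" feature in Gemini 2.5 Pro Experimental. User doctoboggan is skeptical of the human preference ratings used to evaluate and train large language models, worrying that they reward sycophantic responses rather than factual accuracy. Other users compared Gemini's research capabilities with ChatGPT's. User infecto found ChatGPT "a little more educated sounding" and better at taking on personas. Another user, nico, ran a research test about a distant relative and found ChatGPT markedly better at finding sources, connecting information, and building on its research. By contrast, Jeffbee described a positive experience: Gemini first gave a useless answer, then produced an impressive research report with 120 sources about an odd topological feature of Azusa, although the text was somewhat verbose. Overall, the discussion shows mixed experiences with different AI research tools and highlights the ongoing debate over their strengths and weaknesses.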


  • Original text
    Deep Research is now available on Gemini 2.5 Pro Experimental (blog.google)
    35 points by extesy 1 hour ago | 4 comments










    > In our testing, raters preferred the reports generated by Gemini Deep Research powered by 2.5 Pro over other leading deep research providers by more than a 2-to-1 margin.

    Are these raters experts in the field the report was written on? Did they rate the reports on factuality, breadth, and insight?

    These sorts of tests (and RLHF in general) are the reason that LLMs often respond with "Great question, you are exactly right to wonder..." or "Interesting insight, I agree that...". I do not want this obsequious behavior, I want "correct answers"[0]. We need some better benchmarks when it comes to human preference.

    [0]: I know there are no objectively correct answers for some questions.



    Has anyone tested Google's functionality vs ChatGPT? I have lightly played around with it but felt that generally ChatGPT's implementation was a little more educated sounding and took on whatever persona was necessary well.


    Just did a test last week and OpenAI's research was way better. Found 10x more sources and did an overall pretty great job.

    The task was to look up information about a late distant family member who had been a prominent employee in a certain foreign government about 100 years ago.

    Gemini barely scratched the surface and pretty much gave up.

    ChatGPT, on the other hand, kept building on its research, connecting the dots and leveraging each bit of acquired information to try to find more.



    I stumbled across the feature a few hours ago. I had asked Gemini why there's a hole in the middle of the city of Azusa, topologically speaking. It had given me a useless tautological response: because they never annexed it. Then it offered to create a research report and I agreed. Five minutes later I got a notification on my mobile that the report was ready. It had 120 sources including assessor's maps, historical maps, court cases, and narrative articles. The text that went along with it was too verbose and still contained paragraphs of vague stuff, but it had key information linking the Mexican land grants, the founding of the city, and other events of history. Very impressive.





