
Original link: https://news.ycombinator.com/item?id=44025851

This Hacker News thread discusses a flagged report that an xAI employee merged a questionable pull request (PR) into Grok's prompt repository; the change was later reverted. Although it seemingly never reached production, commenters speculate the merge was a joke, raising concerns about how seriously xAI treats sensitive topics. It follows an earlier incident in which Grok called Elon Musk the biggest spreader of misinformation on Twitter, after which a line was added to the prompt to suppress that answer, a change later attributed to a new employee. A more recent incident involved "White Genocide" in South Africa: an added line caused Grok to bring the topic up even in unrelated conversations. xAI has since published its prompts on GitHub, and commenters wonder whether the repository will be actively maintained or go stale the way Twitter's open-sourced recommendation algorithm did. The conversation underscores the importance of transparency and consistent updates when disclosing LLM prompts.


Original text
[flagged]
pavel_lishin 1 day ago


> Though the issue was reverted before it seemingly could affect the production version of Grok

Doesn’t that undercut this whole trolling attempt (about an extremely serious subject that should not be taken flippantly)?



We can only speculate why the employee decided to merge the PR, but my guess is that they did it as a joke — which would mean that at least one person with merge permissions does view the whole affair flippantly, and was comfortable broadcasting that view to the public.

Which doesn't really mean much, but it's not a great look in juxtaposition with the more serious problem of xAI now having two incidents where it was caught adding weird things to Grok's prompt.



If you're worried about a serious subject not being taken seriously, I'd encourage you to be outraged by someone slapping a prompt about it on the front of an extremely complex machine and expecting it to just work.



Amazing if true. Maybe the PR was approved by AI?



There was a previous incident where Grok would say that Elon was the biggest spreader of misinformation on Twitter, which led someone to add a line to the prompt so it would stop saying that. It was eventually claimed that a new employee had made that change.

The newer incident was related to White Genocide in South Africa. Another employee added a line about the topic, which caused Grok to start bringing it up at random, even in response to unrelated questions.

This latest incident ultimately caused the xAI team to decide to publish their prompts on GitHub. Even if their processes around this are a bit undercooked, they'll probably keep iterating until they land in a good spot. It would be great if we could get a stronger guarantee that the prompt tagged on GitHub matches what is actually deployed in production.
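As a rough illustration of what that kind of guarantee could look like, here is a minimal sketch that compares a hash of the published prompt against a hash of whatever the deployed prompt turns out to be. The raw URL and file name are assumptions, not the actual repo layout, and getting the production prompt text would require some out-of-band mechanism from the provider:

    import hashlib
    import urllib.request

    # Hypothetical raw URL; the real repo layout and file names are assumptions.
    PUBLISHED_PROMPT_URL = (
        "https://raw.githubusercontent.com/xai-org/grok-prompts/main/grok_system_prompt.md"
    )

    def prompt_digest(text: str) -> str:
        """SHA-256 of a prompt with normalized line endings and trimmed whitespace."""
        normalized = text.replace("\r\n", "\n").strip()
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def fetch_published_prompt(url: str = PUBLISHED_PROMPT_URL) -> str:
        """Download the prompt file that is tagged/published on GitHub."""
        with urllib.request.urlopen(url) as resp:
            return resp.read().decode("utf-8")

    def prompts_match(published: str, deployed: str) -> bool:
        """True if the published and deployed prompts hash to the same value."""
        return prompt_digest(published) == prompt_digest(deployed)

    if __name__ == "__main__":
        published = fetch_published_prompt()
        # "deployed" would have to come from the provider (e.g. a signed digest of
        # the prompt actually served); the published text is reused here only as a
        # stand-in so the script runs end to end.
        deployed = published
        print("published digest:", prompt_digest(published))
        print("match:", prompts_match(published, deployed))

The check only becomes meaningful if the provider publishes something verifiable for the production side, such as a signed digest alongside each tagged release; otherwise outsiders can only hash the GitHub copy.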

Something to look out for is whether this repo will actually be treated as a source of truth, or whether it'll end up being a one-time dump that never gets updated. Previously, Elon made a lot of noise about open-sourcing Twitter's recommendation algorithm, but after the initial release the repo was never updated again, even though changes were clearly still being made to the algorithm. When I last saw someone ask Elon about this, he claimed he would release the updated algorithm, but it hasn't happened.

I think releasing the prompts for major LLMs is generally a good thing, and I'm hopeful that this might push other companies to follow along. At a minimum, platforms should disclose when they make changes to system prompts. Ideally, they would also release the prompt itself, but goodness exists on a scale and I try to be pragmatic.



“I merged a PR that had no effect” isn’t particularly interesting. The fact that it had no effect and that the repo was reset indicates this isn’t xAI’s actual process.

The PR author’s comments in the blog post and the PR indicate an unhealthy obsession with Elon, given that xAI didn’t say Elon made the change.
