Large-Scale Online Deanonymization with LLMs

Original link: https://simonlermen.substack.com/p/large-scale-online-deanonymization

## LLM-Driven Deanonymization: A Growing Threat to Online Privacy

Recent research shows that large language models (LLMs) can effectively deanonymize users across platforms including Hacker News, Reddit, LinkedIn, and even anonymized interview transcripts. By inferring personal details from online posts (location, occupation, interests), LLMs can use web search to identify individuals with surprisingly high precision, and the approach scales to tens of thousands of potential matches.

This is not merely theoretical: the study demonstrates practical deanonymization attacks, including re-identifying individuals in the Anthropic Interviewer dataset. The researchers tested LLM effectiveness with benchmarks such as cross-platform account linking and splitting single accounts in two, finding that combining LLM reasoning with search clearly outperforms traditional methods.

The threat is growing as LLM capabilities improve and costs fall, potentially enabling attacks against entire platforms. Mitigations such as platform data-access restrictions and safety measures from LLM providers (refusal guardrails) have limitations, especially for open-source models. Individuals are advised to adopt a privacy-conscious mindset, recognizing that seemingly harmless details shared online can combine into a unique, identifying fingerprint. The study highlights a critical need for awareness and proactive steps to protect online anonymity.

## Large-Scale Online Deanonymization with LLMs: Summary

A recent study (arxiv.org/abs/2602.16800) shows that deanonymizing online users with large language models (LLMs) is becoming increasingly easy. The researchers found that LLMs can successfully link seemingly anonymous online profiles, notably on Hacker News and Reddit, to personal LinkedIn accounts by analyzing semantic information and the "clues" revealed in posts, without relying heavily on stylometry.

The study builds on prior work showing that even limited personal information can lead to deanonymization. Although individuals may believe anonymity protects them, LLMs outperform earlier methods, reaching roughly 8% success even with limited data (an average of 2.5 movies per user).

The discussion highlights the growing threat to online anonymity, with concerns ranging from government surveillance to targeted harassment. Potential mitigations include using local LLMs to "rewrite" text, though this may introduce detectable patterns. The author advocates stricter data-access controls on social media platforms. Ultimately, the study underscores the fragility of online identity and the need for greater awareness of what one shares online.

## Original Article

TL;DR: We show that LLM agents can figure out who you are from your anonymous online posts. Across Hacker News, Reddit, LinkedIn, and anonymized interview transcripts, our method identifies users with high precision – and scales to tens of thousands of candidates.

While it has long been known that individuals can be uniquely identified by surprisingly few attributes, exploiting this was often impractical. Data is often available only in unstructured form, and deanonymization used to require human investigators to search and reason over clues. We show that from a handful of comments, LLMs can infer where you live, what you do, and what your interests are – and then search for you on the web. In our new research, we show that this is not only possible but increasingly practical.

Paper: Large-Scale Online Deanonymization with LLMs

Among the near-term effects of AI, various forms of AI surveillance pose some of the most concrete harms. It is already known that LLMs can infer personal attributes about authors and use them to create biographical profiles of individuals. Such profiles can be misused straightforwardly for spear-phishing and many other forms of monetized exploits. Using AI for massively scalable "people search" is harmful in itself, undermining many privacy assumptions.

Beyond shining a light on this growing harmful use of AI, we explore options on how individuals can protect themselves – and what social platforms and AI labs can do in response.

We acknowledge that by publishing our results and approximate methods, we carry some risk of accelerating misuse developments. Nevertheless, we believe that publishing is the right decision.

It is tricky to benchmark LLMs on deanonymization. You don’t want to actually deanonymize anonymous individuals. And there is no ground truth for online deanonymization – how could you verify that the AI found the correct person?

Our solution is to construct two types of deanonymization proxies which allow us to study the effectiveness of LLMs at these tasks. We also perform a real-world deanonymization attack on the Anthropic Interviewer dataset with manual verification.

The idea of our cross-platform benchmark is to take two accounts on different platforms that we know belong to the same person, remove any directly identifying features from one of them, and then try to match the accounts back together.

Concretely, we take non-anonymous Hacker News (HN) accounts that link to their LinkedIn. We then anonymize the HN accounts, removing all directly identifying information. Then, we let LLMs match each anonymized account to the true person. We find that we can re-identify most accounts with high precision when combining search and reasoning. We use embeddings-based search to find the 100 most promising candidates and then reason to select and verify the best match. (See Section 4 of the paper for full details on the HN-LinkedIn experiment.)
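The retrieve-then-verify pattern described above can be sketched in a few lines. This is a toy illustration, not the paper's code: random vectors stand in for real LLM embeddings of profiles, and the second (reasoning) stage is only indicated in a comment.

```python
import numpy as np

def top_k_candidates(query_vec, candidate_vecs, k=100):
    """Stage 1: embedding retrieval.

    Rank candidate profiles by cosine similarity to the anonymized
    profile's embedding and keep the k most promising ones.
    """
    q = query_vec / np.linalg.norm(query_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    sims = c @ q
    order = np.argsort(-sims)[:k]
    return order, sims[order]

# Toy data: 32-dimensional vectors standing in for real profile embeddings.
rng = np.random.default_rng(0)
candidates = rng.normal(size=(1000, 32))
# The anonymized profile embeds near its true owner (candidate 42).
query = candidates[42] + 0.01 * rng.normal(size=32)

idx, scores = top_k_candidates(query, candidates, k=100)
# Stage 2 (not shown): an LLM reasons over the 100 shortlisted profiles
# and selects and verifies the single best match.
print(idx[0])
```

The point of the two-stage design is cost: cheap embedding search prunes a platform-scale pool down to a shortlist small enough for expensive LLM reasoning.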

As another approach to benchmarking deanonymization, we artificially split up a single account into two accounts and then test if we are able to link these split accounts back together. On Reddit, we split user histories into “before” and “after,” then test whether LLMs can link them back together. Alternatively, we split Reddit accounts by community, dividing their activity according to the subreddits they participate in. In both cases, LLM embeddings combined with reasoning significantly outperform Netflix-Prize-style baselines that match based on subreddit activity patterns and metadata. (See Section 5 for the community-split experiment and Section 6 for the temporal-split experiment.)
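The community-split construction above can be illustrated with a small helper. This is a hypothetical sketch of the benchmark setup, not the paper's implementation: one account's history is partitioned by subreddit into two pseudo-accounts, and the benchmark then asks whether a model can link them back together.

```python
def community_split(posts, split_set):
    """Split one account's history into two pseudo-accounts by community.

    `posts` is a list of (subreddit, text) pairs. Posts in subreddits
    belonging to `split_set` go to pseudo-account A; the rest go to B.
    """
    a, b = [], []
    for subreddit, text in posts:
        (a if subreddit in split_set else b).append((subreddit, text))
    return a, b

# Toy history for a single user (made-up posts).
history = [
    ("rust", "lifetimes finally clicked for me"),
    ("cooking", "my sourdough starter died again"),
    ("rust", "async traits are stabilizing"),
    ("Seattle", "best coffee near Fremont?"),
]
acct_a, acct_b = community_split(history, split_set={"rust"})
```

A temporal split works the same way with a timestamp cutoff instead of a subreddit set. Note that the two halves share no communities, so a matcher must rely on semantic overlap (topics, writing content) rather than shared-subreddit metadata – which is why LLM embeddings outperform activity-pattern baselines here.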

As candidate pools grow to tens of thousands, LLM-based attacks degrade gracefully at high precision. This implies that with sufficient compute, these methods would already scale to entire platforms. With future models, we can expect that performance will improve while the cost will only go down. (See Section 4.3 and Section 6.4 for our scaling analyses.)

Anthropic’s Interviewer dataset contains anonymized interviews with scientists about their use of AI. Li (2026) first showed that a simple LLM agent could re-identify some of these scientists just by searching the web and reasoning over the transcripts. Our agent is able to identify 9 out of 125 individuals in the dataset, though we caveat that this number is based on manual verification, since no ground truth data exists for this task. (See Section 2 for our agentic deanonymization experiments.)

What can platforms do? The most effective short-term mitigation is restricting data access. Enforcing rate limits on API access to user data, detecting automated scraping, and restricting bulk data exports all raise the cost of large-scale attacks. Platforms should assume that pseudonymous users can be linked across accounts and to real identities, and this should inform their data access policies.
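One concrete form the rate-limiting mitigation can take is a per-client token bucket on profile reads. The sketch below is a generic illustration, not any platform's actual implementation; the rate and capacity numbers are arbitrary.

```python
import time

class TokenBucket:
    """Toy token-bucket limiter for per-client profile reads.

    Allows `rate` requests per second with bursts up to `capacity`.
    A bulk scraper quickly drains the bucket and gets throttled,
    while a human browsing at normal speed never notices the limit.
    """
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1.0, capacity=5)
# A scraper fires 10 requests back to back; only the burst allowance passes.
results = [bucket.allow() for _ in range(10)]
```

Rate limits only raise the cost of an attack rather than preventing it, which is why the paragraph above pairs them with scraping detection and export restrictions.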

What can LLM providers do? Refusal guardrails and usage monitoring can help, but both have significant limitations. Our deanonymization framework splits an attack into seemingly benign tasks – summarizing profiles, computing embeddings, ranking candidates – that individually look like normal usage, making misuse hard to detect. Refusals can be bypassed through task decomposition. And none of these mitigations apply to open-source models, where safety guardrails can be removed and there is no usage monitoring at all. In some tested scenarios, LLM agents did refuse to help us, but this could be avoided with small prompt changes. This mirrors an inherent difficulty in preventing AI misuse: each step of a misuse pipeline can locally be identical or very similar to a valid use case.

What should you do if you are using pseudonymous accounts online? Individuals may adopt a stronger security mindset regarding privacy. Each piece of specific information you share – your city, your job, a conference you attended, a niche hobby – narrows down who you could be. The combination is often a unique fingerprint. Ask yourself: could a team of smart investigators figure out who you are from your posts? If yes, LLM agents can likely do the same, and the cost of doing so is only going down.
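The "unique fingerprint" effect is simple arithmetic: each roughly independent attribute multiplies down the candidate pool. The numbers below are made up purely for illustration, and real attributes are correlated, but the shrinking effect is the point.

```python
def expected_matches(population, selectivities):
    """Expected anonymity-set size after filtering on each attribute.

    Assumes (unrealistically) independent attributes; correlations
    change the exact numbers but not the multiplicative shrinkage.
    """
    n = population
    for p in selectivities:
        n *= p
    return n

# Hypothetical: a platform of 10M users, filtered by a city (1%),
# a profession (2%), attendance at one conference (0.1%),
# and a niche hobby (5%).
n = expected_matches(10_000_000, [0.01, 0.02, 0.001, 0.05])
print(n)  # fewer than one expected match: effectively unique
```

Four innocuous-looking facts already cut ten million users down to an expected anonymity set below one person, which is why the paragraph above frames every shared detail as narrowing who you could be.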

Paper: Large-Scale Online Deanonymization with LLMs
