A new study from researchers at Anthropic and ETH Zurich reveals something unsettling: modern AI systems can identify the real-world identities behind supposedly anonymous internet accounts. The findings, posted as a preprint on arXiv, demonstrate that large language models (LLMs) can analyze online behavior and link pseudonymous profiles to actual people with surprising accuracy and efficiency.

The research, titled "Large-scale online deanonymization with LLMs," investigates how AI agents can automate the deanonymization process — essentially matching anonymous or pseudonymous accounts with real identities at massive scale.

Traditionally, this kind of work required investigators to manually hunt through posts, analyze writing style, and follow digital breadcrumbs scattered across the internet. The research team shows that modern AI can handle much of this detective work automatically.

The AI system analyzed public text from online platforms and extracted identity signals such as personal interests, demographic hints, writing patterns, and accidental details revealed in posts. It then searched for matching profiles across the internet and evaluated whether these clues matched known individuals.

To test their approach, researchers built several datasets with pre-identified real identities. In one experiment, the AI attempted to match Hacker News forum users with their LinkedIn profiles — even with obvious identifiers like names and usernames removed.

Another dataset involved linking pseudonymous Reddit accounts active across multiple communities. A third test split one person's posting history into two separate profiles to see if the AI could recognize both belonged to the same individual.

The results were striking. LLM-based systems vastly outperformed traditional deanonymization techniques. In some cases, the model achieved recall of up to 68% at roughly 90% precision. In other words, it identified up to about two-thirds of the target accounts, and roughly nine in ten of the identifications it made were correct. Traditional methods barely registered meaningful results by comparison.
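To make the two metrics concrete, here is a minimal sketch in Python. The counts are made up for illustration and are not taken from the paper; the function name `precision_recall` and its parameters are likewise hypothetical. Precision is the share of the system's guesses that were right, while recall is the share of all target accounts that it managed to identify.

```python
def precision_recall(correct, guesses_made, total_accounts):
    """Return (precision, recall) for a matching experiment.

    correct        -- guesses that named the right person
    guesses_made   -- accounts for which the system offered any guess
    total_accounts -- all accounts that had a true match to find
    """
    precision = correct / guesses_made if guesses_made else 0.0
    recall = correct / total_accounts if total_accounts else 0.0
    return precision, recall


# Illustrative numbers only: 100 target accounts, 75 guesses, 68 of them right.
precision, recall = precision_recall(correct=68, guesses_made=75, total_accounts=100)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.91 recall=0.68
```

In this toy case the system declines to guess on 25 accounts, which is how a matcher can keep precision high while recall stays well below 100%.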

According to the researchers, these results suggest AI can now replicate tasks that previously took many hours of human investigative work. A single AI system can automatically extract identity-relevant features from text, search through thousands of potential profiles, and infer which candidate is most likely correct.

This development raises serious concerns because anonymity has long been considered a basic protection for internet users. Pseudonymous accounts are widely used by journalists, whistleblowers, activists, and individuals who want to discuss sensitive topics without revealing their identity.

The research suggests this protective layer, sometimes called "practical obscurity," is weakening as AI systems grow better at connecting digital traces across multiple platforms. If automated tools can perform this linking quickly and cheaply, the barrier to identifying anonymous users could drop dramatically.

The researchers estimate the cost to deanonymize a single online account using their experimental system could be as low as $1 to $4 per profile — meaning large-scale investigations could become economically feasible.
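A rough back-of-the-envelope calculation shows what that per-account figure implies at scale. The sketch below assumes cost grows linearly with the number of accounts, with no bulk discounts or fixed overhead, which is a simplification of ours and not a claim from the paper. The function name `cost_range` is hypothetical.

```python
def cost_range(num_accounts, low=1.0, high=4.0):
    """Return the (low, high) total cost in dollars for num_accounts,
    assuming a flat $1 to $4 per account (linear scaling is an assumption)."""
    return num_accounts * low, num_accounts * high


for n in (100, 10_000, 1_000_000):
    low_total, high_total = cost_range(n)
    print(f"{n:>9,} accounts: ${low_total:,.0f} to ${high_total:,.0f}")
```

Under these assumptions, targeting ten thousand accounts would cost somewhere between $10,000 and $40,000.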

That said, the authors note they conducted this research in a controlled environment using only public data. The work hasn't undergone peer review yet, and they deliberately withheld certain technical details to reduce misuse risk.

Still, the findings quickly sparked debate among privacy and tech experts. Many argue that users may need to reconsider how much personal information they share online, even in spaces that feel anonymous.

Looking ahead, researchers believe we need deeper investigation into both the risks of AI-powered deanonymization and the defenses against it. Potential solutions could include better privacy tools, stronger platform security measures, or AI systems designed to automatically redact sensitive information before content goes public.
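As a very simple illustration of the redaction idea, the Python sketch below masks email addresses and links in a draft post before it is published. It is a rule-based stand-in of our own, not the AI-based redaction the researchers have in mind, and it would miss the subtler clues (interests, writing style, incidental details) that the study relies on. The patterns and the function name `redact` are illustrative assumptions.

```python
import re

# Deliberately simple patterns; real-world emails and URLs are messier.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL = re.compile(r"https?://\S+")


def redact(text):
    """Replace email addresses and http(s) links with neutral placeholders."""
    text = EMAIL.sub("[email]", text)
    text = URL.sub("[link]", text)
    return text


draft = "Reach me at jane.doe@example.com or see https://example.com/jane for my talk."
print(redact(draft))
# prints: Reach me at [email] or see [link] for my talk.
```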

The real concern is this: as artificial intelligence becomes more powerful at analyzing vast amounts of online content, we face a new challenge. How do we balance AI's investigative capabilities with our fundamental need for privacy in the digital age?
