Antislop: A framework for eliminating repetitive patterns in language models

Original link: https://arxiv.org/abs/2510.15061

## Antislop: Reducing Repetitive Language in Large Language Models

This paper introduces **Antislop**, a new framework for identifying and eliminating "slop", the repetitive phrasing characteristic of large language model (LLM) output that degrades quality and reveals AI authorship. Antislop rests on three key innovations: the **Antislop Sampler** (inference-time pattern suppression), an **automated slop-profiling pipeline** (which generates training data against human baselines), and **Final Token Preference Optimization (FTPO)**, a targeted fine-tuning method. The study shows that LLMs produce slop patterns far more often than human writers (over 1,000x in some cases), and that the Antislop Sampler can effectively manage thousands of patterns, outperforming simple token banning. Most importantly, FTPO achieves a **90% reduction in slop** *while maintaining or improving* performance across benchmarks (GSM8K, MMLU, creative writing), unlike methods such as DPO, which sacrifice quality for weaker suppression. All code and results are publicly available under the MIT license.
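The inference-time component is easiest to picture as a sampler that backtracks whenever a banned string shows up in the decoded output. The following is a minimal sketch of that general idea, not the paper's actual implementation: `decode`, `sample_next`, and the `banned_at` bookkeeping are illustrative assumptions standing in for a real tokenizer and sampling loop.

```python
from typing import Callable, Dict, List, Sequence, Set

def generate_with_backtracking(
    decode: Callable[[Sequence[int]], str],                  # token ids -> text
    sample_next: Callable[[Sequence[int], Set[int]], int],   # (prefix ids, banned ids) -> next id
    prompt_ids: List[int],
    slop_strings: List[str],
    max_new_tokens: int = 256,
) -> List[int]:
    out = list(prompt_ids)
    banned_at: Dict[int, Set[int]] = {}  # position -> token ids disallowed at that position

    while len(out) - len(prompt_ids) < max_new_tokens:
        pos = len(out)
        out.append(sample_next(out, banned_at.get(pos, set())))

        # Does the decoded continuation now end with a banned pattern?
        tail = decode(out[len(prompt_ids):])
        hit = next((s for s in slop_strings if tail.endswith(s)), None)
        if hit is None:
            continue

        # Walk back to the token where the pattern begins: the largest index
        # whose decoded suffix still contains the full banned string.
        start = len(out) - 1
        while start > len(prompt_ids) and hit not in decode(out[start:]):
            start -= 1

        # Forbid that token at that position, drop the offending suffix, and
        # let sampling resume from the backtracked position with the ban in place.
        banned_at.setdefault(start, set()).add(out[start])
        banned_at = {k: v for k, v in banned_at.items() if k <= start}
        del out[start:]

    return out
```

Because the ban is applied only at the position where the pattern started, the rest of the vocabulary stays usable everywhere else, which is the property the paper contrasts with blanket token banning.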

## Hacker News Discussion: Tackling "Slop" in Language Models

A recent arXiv paper ("Antislop: A framework for eliminating repetitive patterns in language models") sparked a lively Hacker News discussion about a persistent problem in AI-generated text, commonly called "slop." Users highlighted familiar annoyances with models like ChatGPT, even in the latest versions: excessive em-dashes, random emoji, affirmations, oddly specific adjectives, and arbitrary bolding. The core complaint is not just repetitiveness but a broader lack of quality and effort in AI output. While tools exist to mitigate these surface issues, commenters debated whether they merely treat symptoms or address an underlying "mode collapse" that affects semantic and creative output; some suggested integrating slop detection into the training process itself. Many agreed that AI-generated content is becoming harder to identify, which may devalue human writing. The discussion also touched on the origins of these patterns, linking them to online communication trends and to biases inherent in the massive datasets used to train these models. Ultimately, the conversation underscored the ongoing challenge of getting AI to produce genuinely high-quality, nuanced text.

## Original

Antislop: A Comprehensive Framework for Identifying and Eliminating Repetitive Patterns in Language Models, by Samuel Paech and 3 other authors
Abstract: Widespread LLM adoption has introduced characteristic repetitive phraseology, termed "slop," which degrades output quality and makes AI-generated text immediately recognizable. We present Antislop, a comprehensive framework providing tools to both detect and eliminate these overused patterns. Our approach combines three innovations: (1) The Antislop Sampler, which uses backtracking to suppress unwanted strings at inference time without destroying vocabulary; (2) An automated pipeline that profiles model-specific slop against human baselines and generates training data; (3) Final Token Preference Optimization (FTPO), a novel fine-tuning method that operates on individual tokens, surgically adjusting logits wherever a banned pattern has appeared in an inference trace. We demonstrate that some slop patterns appear over 1,000x more frequently in LLM output than human text. The Antislop Sampler successfully suppresses 8,000+ patterns while maintaining quality, whereas token banning becomes unusable at just 2,000. Most importantly, FTPO achieves 90% slop reduction while maintaining or improving performance in cross-domain evals including GSM8K, MMLU, and creative writing tasks. In contrast, DPO suffers significant degradation in writing quality and lexical diversity despite achieving weaker suppression. We release all code and results under MIT license: this https URL.
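As a rough illustration of what "surgically adjusting logits" at a single position could look like, here is a minimal per-token preference sketch, assuming a Hugging Face-style causal LM whose forward pass returns `.logits`. The names (`chosen_ids`, `rejected_id`) and the pairwise logistic loss form are assumptions made for illustration, not the paper's exact FTPO objective.

```python
import torch
import torch.nn.functional as F

def final_token_preference_loss(model, prefix_ids: torch.Tensor,
                                chosen_ids: torch.Tensor, rejected_id: int) -> torch.Tensor:
    """Penalize the token that began a flagged slop pattern at the position where it
    appeared, relative to acceptable alternatives, leaving all other positions alone."""
    logits = model(prefix_ids.unsqueeze(0)).logits[0, -1]  # next-token logits at the final position
    chosen = logits[chosen_ids]      # logits of acceptable continuation tokens
    rejected = logits[rejected_id]   # logit of the slop-initiating token
    # Pairwise logistic preference: each chosen token should out-score the rejected one.
    return -F.logsigmoid(chosen - rejected).mean()
```

Training on losses of this shape touches only the contexts where a banned pattern actually fired, which is consistent with the abstract's claim that FTPO avoids the broad quality degradation seen with sequence-level methods like DPO.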
From: Samuel Paech
[v1] Thu, 16 Oct 2025 18:22:22 UTC (536 KB)
[v2] Tue, 21 Oct 2025 21:42:07 UTC (536 KB)