RFC 454545 – Human Em Dash Standard

Original link: https://gist.github.com/bignimbus/a75cc9d703abf0b21a57c0d21a79e2be

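## Human Em Dash (HED) – Summary

RFC 454545 proposes a new Unicode character, the Human Em Dash (HED), which is visually identical to the standard em dash (—) but has a distinct encoding, to address the problem of "Dash Authenticity Collapse" (DAC). The problem stems from the increasingly frequent, and often overconfident, use of em dashes in AI-generated text, which leaves human authors anxious that their stylistic choices may be misattributed.

The HED standard introduces a paired "Human Attestation Mark" (HAM), an invisible or barely visible character placed *before* the HED to indicate human authorship. Systems that verify human input (through pauses, backspaces, or even audible sighs!) would insert the `<HAM><HED>` sequence.

The proposal acknowledges that adversaries may try to fake hesitation, and it suggests "Human Cognitive Proof-of-Work" (HCPoW), such as incongruous emoji usage, as a means of verification. Existing em dashes remain valid, but their authenticity may be questioned. The RFC also anticipates possible laws and regulations on the use of the HED by automated systems, and asks IANA to establish a "Human Punctuation" registry. Ultimately, the HED aims to re-establish the em dash as a mark of genuine human thought and expression.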

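A Hacker News discussion centers on RFC 454545, which proposes a distinct Unicode character called the "Human Em Dash" as a potential watermark for AI-generated text. The idea was previously featured on the "99% Invisible" show and aims to subtly identify AI-created content.

However, commenters are generally skeptical. The concerns raised include that AI companies are unlikely to comply with the RFC's "MUST NOT" requirement, and that LLMs might simply *use* the character. Others point out that, because of encoding quirks, unofficial "watermarks" already exist in AI output.

The conversation also touches on past Unicode controversies (such as the byte order mark wars) and on the performative nature of the proposal, questioning its practical effectiveness at detecting AI-generated content. Some users jokingly suggest prompting LLMs to specifically *use* the Human Em Dash.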

Original text
RFC 454545
Human Em Dash Standard
Status: Informational
March 2026
Authors: Janice Wilson, Jeff Auriemma

RFC 454545 — Human Em Dash Standard

Abstract

This document proposes the Human Em Dash (HED), a Unicode character visually indistinguishable from the traditional em dash (—) but encoded separately for the purpose of indicating probable human authorship.

Recent proliferation of automated text generation systems has produced a measurable increase in the frequency and enthusiasm of em dash usage. This trend has created ambiguity for human writers who have historically relied upon the em dash as a stylistic device.

The Human Em Dash standard introduces a new Unicode code point and an associated Human Attestation Mark (HAM) that allows writers to signal that the dash in question originated from a human cognitive process involving hesitation, revision, or mild frustration.

1. Status of This Memo

This memo provides information for the Internet community. Distribution of this memo is unlimited.

2. Problem Statement

Historically, the em dash (—) has served as a flexible punctuation mark used by human authors to indicate interruption, emphasis, or sudden changes in thought.

Recent developments in large-scale automated text generation have altered the punctuation ecosystem in several notable ways.

- Automated systems frequently produce em dashes with suspicious regularity.
- The dash is often used with unwavering grammatical confidence.
- Human writers report increasing anxiety that their punctuation choices may be interpreted as machine-generated.

This phenomenon has produced what researchers have termed Dash Authenticity Collapse (DAC).

As a result, a mechanism is required to allow human authors to distinguish their punctuation from that of automated systems.

3. Terminology

The key words MUST, SHOULD, and MAY in this document are to be interpreted as described in RFC 2119.

Additional terms defined for this document include:

- HED (Human Em Dash): A Unicode character representing a human-authenticated em dash.
- HAM (Human Attestation Mark): A preceding Unicode mark asserting that the following dash was produced by a human.
- Automated System: Any system incapable of genuine hesitation.
- Hesitation Event: A pause, deletion, cursor movement, or other evidence of cognitive uncertainty.

4. The Human Em Dash

4.1 Character Definition

The Human Em Dash (HED) SHALL be visually identical to the traditional em dash but encoded as a distinct Unicode code point.

    U+10EAD HUMAN EM DASH

Rendering engines MUST display the Human Em Dash identically to a standard em dash in order to preserve typographic compatibility.

4.2 Human Attestation Mark (HAM)

To assert human authorship, implementations MUST precede the Human Em Dash with a Human Attestation Mark (HAM).

Example proposed code point:

    U+10EAC HUMAN ATTESTATION MARK

The resulting sequence is:

    <HAM><HED>

The HAM character SHOULD render invisibly or with negligible typographic impact.

Automated systems MUST NOT emit the Human Attestation Mark.

Non-compliant implementations SHOULD be considered adversarial.

4.3 Behavioral Verification

Conforming implementations SHOULD verify evidence of human authorship prior to insertion of the Human Em Dash.

Evidence MAY include one or more of the following:

- A pause exceeding 137 milliseconds
- A backspace event
- Cursor repositioning
- A visible moment of indecision
- Audible sighing

Systems incapable of hesitation MUST NOT emit the Human Em Dash.

5. Proof-of-Work Mechanisms

Some environments MAY require additional verification before allowing insertion of the Human Em Dash.

Suggested mechanisms include:

- Incongruous emoji usage
- Neuroticism
- Expression of personal values or accountability

These measures are collectively referred to as Human Cognitive Proof-of-Work (HCPoW).

6. Backwards Compatibility

Legacy em dashes (—) remain valid punctuation.

However, in contexts where authorship authenticity is important, legacy dashes MAY be interpreted as unverified punctuation artifacts.

Readers SHOULD avoid making harsh judgments when encountering such characters.

7. Security Considerations

Adversaries may attempt to simulate human hesitation through randomized delays or artificially inserted backspaces.

Advanced implementations SHOULD monitor for suspicious patterns such as:

- Excessively consistent hesitation intervals
- Statistically improbable grammar perfection
- Uncanny servility

These patterns, when noticed alongside em dash usage, are indicative of LLM-generated text.

8. Policy Considerations

Jurisdictions MAY regulate the use of the Human Em Dash by automated systems.

Use of the Human Em Dash by non-human agents MAY constitute punctuation impersonation.

Policy frameworks for such regulation remain under development and will likely be debated at length.

9. IANA Considerations

IANA is requested to establish the Human Punctuation Registry, including but not limited to:

- Human Em Dash
- Human Ellipsis
- Authentic Parenthetical Aside

Allocation procedures SHOULD involve excessive documentation.

10. Non-Normative Examples

Traditional usage:

    The committee reached a conclusion—after some debate.

Human-authenticated usage:

    The committee reached a conclusion<HAM><HED>after some debate.

In compliant systems, both render identically.

11. Acknowledgments

The author would like to acknowledge human writers everywhere who now find themselves nervously reconsidering their punctuation choices.

Special thanks to the em dash, which did nothing to deserve this.

12. References

RFC 2119 — Key words for use in RFCs to Indicate Requirement Levels.

Various style guides, all of which disagree about the proper usage of the em dash.
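The character definitions in Sections 4.1 and 4.2 and the evidence list in Section 4.3 could be sketched roughly as follows. This is a minimal, non-normative illustration that assumes the proposed code points U+10EAC and U+10EAD are used exactly as written. The `HesitationEvidence` record, the function names, and the fallback to a legacy em dash for unattested input are hypothetical choices made for this sketch and do not come from the RFC text.

```python
from dataclasses import dataclass

# Code points as proposed in Sections 4.1 and 4.2 of the text.
HAM = chr(0x10EAC)  # HUMAN ATTESTATION MARK (proposed)
HED = chr(0x10EAD)  # HUMAN EM DASH (proposed)
LEGACY_EM_DASH = "\u2014"
MIN_PAUSE_MS = 137  # Section 4.3: a pause *exceeding* 137 milliseconds


@dataclass
class HesitationEvidence:
    """Hypothetical record of what an editor observed before the dash."""
    pause_ms: float = 0.0
    backspaces: int = 0
    cursor_moves: int = 0


def is_human(evidence: HesitationEvidence) -> bool:
    """Return True if at least one measurable Section 4.3 signal is present."""
    return (
        evidence.pause_ms > MIN_PAUSE_MS
        or evidence.backspaces > 0
        or evidence.cursor_moves > 0
    )


def dash_for(evidence: HesitationEvidence) -> str:
    """Return <HAM><HED> for attested input, else a legacy em dash.

    Falling back to the legacy dash is an assumption of this sketch;
    the text only says unattested systems MUST NOT emit the HED.
    """
    return HAM + HED if is_human(evidence) else LEGACY_EM_DASH


def count_attested_dashes(text: str) -> int:
    """Count HED characters that are immediately preceded by a HAM."""
    return text.count(HAM + HED)


# Usage, mirroring the example in Section 10.
sentence = (
    "The committee reached a conclusion"
    + dash_for(HesitationEvidence(pause_ms=420))
    + "after some debate."
)
print(count_attested_dashes(sentence))  # 1
```

Signals such as visible indecision or audible sighing are left out because they have no obvious numeric representation. A pause of exactly 137 ms does not qualify under this reading, since the text asks for a pause exceeding that value.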