E.W. Dijkstra: Numbering should start at zero (1982)

Original link: https://www.cs.utexas.edu/~EWD/transcriptions/EWD08xx/EWD831.html

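Dijkstra argues that sequences are better numbered from zero than from one, focusing on the clarity and elegance this brings to expressing subsequences and subscripts. He presents four conventions for denoting a subsequence (for example 2, 3, ..., 12) and concludes that the convention "a ≤ i < b" is preferable.

His core argument for zero-based indexing is simplicity: when numbering a sequence of length N, starting at 0 gives the range "0 ≤ i < N", which is cleaner than the "1 ≤ i < N+1" obtained by starting at 1. With this choice, an element's index equals the number of elements preceding it.

He criticizes FORTRAN and SASL for starting subscripts at 1, and ALGOL 60 and PASCAL for adopting the convention "a ≤ i ≤ b", and notes that experience with Mesa confirmed the benefits of sticking to the convention "a ≤ i < b". Dijkstra also rejects the charge that numbering from zero is pedantic. He suggests that the resistance may stem from fear that the dissenter is right, and draws an analogy with corporate religions.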

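A Hacker News thread discusses Edsger W. Dijkstra's argument for zero-based indexing. The original post highlights Dijkstra's position that numbering should start at zero.

One commenter, roenxi, argues that Dijkstra's influence stems partly from his reputation rather than from how sound his views are. Roenxi prefers one-based indexing, which he finds more intuitive, even though zero-based indexing is the established convention in mathematics. He speculates that zero-based indexing prevails because of memory offsets in arrays rather than any logical reasoning.

Another commenter, sim7c00, shares a quotation that shows how heated the debate around this topic can be, in which a mathematician criticizes computer scientists for persistently numbering from zero. The thread shows that the seemingly simple choice of where to start counting remains the subject of ongoing and sometimes contentious discussion.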

Original text

Why numbering should start at zero

To denote the subsequence of natural numbers 2, 3, ..., 12 without the pernicious three dots, four conventions are open to us

a) 2 ≤ i < 13
b) 1 < i ≤ 12
c) 2 ≤ i ≤ 12
d) 1 < i < 13
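
As an illustration that is not part of the original note, the four conventions can be written as Python predicates and checked against the sequence 2, 3, ..., 12. The function names and the window of candidates 0 to 19 are assumptions made for this sketch.

```python
# Illustrative sketch: four ways of writing the bounds for 2, 3, ..., 12.
# Function names and the candidate window are hypothetical.

def convention_a(lo, hi, candidates):
    """Select i with lo <= i < hi."""
    return [i for i in candidates if lo <= i < hi]

def convention_b(lo, hi, candidates):
    """Select i with lo < i <= hi."""
    return [i for i in candidates if lo < i <= hi]

def convention_c(lo, hi, candidates):
    """Select i with lo <= i <= hi."""
    return [i for i in candidates if lo <= i <= hi]

def convention_d(lo, hi, candidates):
    """Select i with lo < i < hi."""
    return [i for i in candidates if lo < i < hi]

candidates = range(0, 20)      # a window of natural numbers to filter
target = list(range(2, 13))    # 2, 3, ..., 12

a = convention_a(2, 13, candidates)
b = convention_b(1, 12, candidates)
c = convention_c(2, 12, candidates)
d = convention_d(1, 13, candidates)

print(a == b == c == d == target)  # True
```

All four pairs of bounds select the same eleven numbers; they differ only in which bounds are written down.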

Are there reasons to prefer one convention to the other? Yes, there are. The observation that conventions a) and b) have the advantage that the difference between the bounds as mentioned equals the length of the subsequence is valid. So is the observation that, as a consequence, in either convention two subsequences are adjacent means that the upper bound of the one equals the lower bound of the other. Valid as these observations are, they don't enable us to choose between a) and b); so let us start afresh.
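
The two observations above can be checked with a short Python sketch. Python's built-in `range(lo, hi)` happens to follow convention a), so it is used here as a stand-in; the particular split point 7 is an arbitrary choice for the example.

```python
# Convention a): lo <= i < hi, which is how Python's range(lo, hi) behaves.
first = range(2, 7)     # 2 <= i < 7
second = range(7, 13)   # 7 <= i < 13

# The difference between the bounds equals the length of the subsequence.
print(len(first) == 7 - 2)      # True
print(len(second) == 13 - 7)    # True

# Adjacent subsequences share a bound: the upper bound of one
# equals the lower bound of the other, with no gap and no overlap.
print(first.stop == second.start)                        # True
print(list(first) + list(second) == list(range(2, 13)))  # True

# Convention c), lo <= i <= hi, needs a +1 correction for the length.
lo, hi = 2, 12
inclusive = [i for i in range(0, 20) if lo <= i <= hi]
print(len(inclusive) == hi - lo + 1)                     # True
```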

There is a smallest natural number. Exclusion of the lower bound —as in b) and d)— forces for a subsequence starting at the smallest natural number the lower bound as mentioned into the realm of the unnatural numbers. That is ugly, so for the lower bound we prefer the ≤ as in a) and c). Consider now the subsequences starting at the smallest natural number: inclusion of the upper bound would then force the latter to be unnatural by the time the sequence has shrunk to the empty one. That is ugly, so for the upper bound we prefer < as in a) and d). We conclude that convention a) is to be preferred.
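
The argument about "unnatural" bounds can be made concrete with a small Python sketch. It assumes that the smallest natural number is 0; the helper names are hypothetical. For a subsequence starting at the smallest natural number, each helper returns the pair of bounds that one convention would have to write down.

```python
# Assumption for this sketch: the smallest natural number is 0.
SMALLEST = 0

def bounds_a(first, length):
    """Bounds (lo, hi) such that lo <= i < hi."""
    return first, first + length

def bounds_b(first, length):
    """Bounds (lo, hi) such that lo < i <= hi."""
    return first - 1, first + length - 1

def bounds_c(first, length):
    """Bounds (lo, hi) such that lo <= i <= hi."""
    return first, first + length - 1

def bounds_d(first, length):
    """Bounds (lo, hi) such that lo < i < hi."""
    return first - 1, first + length

def all_natural(bounds):
    """True if no bound falls below the smallest natural number."""
    return all(bound >= SMALLEST for bound in bounds)

# Shrink a subsequence that starts at the smallest natural number.
for length in (2, 1, 0):
    for name, make in (("a", bounds_a), ("b", bounds_b),
                       ("c", bounds_c), ("d", bounds_d)):
        pair = make(SMALLEST, length)
        print(length, name, pair, all_natural(pair))
```

Under these assumptions, conventions b) and d) need the lower bound -1 at every length, convention c) needs the upper bound -1 once the subsequence is empty, and only convention a) keeps both bounds natural throughout.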

Remark  The programming language Mesa, developed at Xerox PARC, has special notations for intervals of integers in all four conventions. Extensive experience with Mesa has shown that the use of the other three conventions has been a constant source of clumsiness and mistakes, and on account of that experience Mesa programmers are now strongly advised not to use the latter three available features. I mention this experimental evidence —for what it is worth— because some people feel uncomfortable with conclusions that have not been confirmed in practice. (End of Remark.)

* * *

When dealing with a sequence of length N, the elements of which we wish to distinguish by subscript, the next vexing question is what subscript value to assign to its starting element. Adhering to convention a) yields, when starting with subscript 1, the subscript range 1 ≤ i < N+1; starting with 0, however, gives the nicer range 0 ≤  i < N. So let us let our ordinals start at zero: an element's ordinal (subscript) equals the number of elements preceding it in the sequence. And the moral of the story is that we had better regard —after all those centuries!— zero as a most natural number.
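
A minimal Python sketch of this point follows. The list `letters` is made-up example data. It compares the two subscript ranges under convention a) and checks that a zero-based subscript equals the number of preceding elements.

```python
letters = ["a", "b", "c", "d"]   # example data, N = 4
n = len(letters)

zero_based = range(0, n)         # 0 <= i < N
one_based = range(1, n + 1)      # 1 <= i < N+1

# Zero-based: an element's subscript equals the number of elements before it.
for i in zero_based:
    assert len(letters[:i]) == i

# One-based: the count of preceding elements is the subscript minus one.
for k in one_based:
    assert len(letters[:k - 1]) == k - 1

print(list(zero_based))  # [0, 1, 2, 3]
print(list(one_based))   # [1, 2, 3, 4]
```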

Remark  Many programming languages have been designed without due attention to this detail. In FORTRAN subscripts always start at 1; in ALGOL 60 and in PASCAL, convention c) has been adopted; the more recent SASL has fallen back on the FORTRAN convention: a sequence in SASL is at the same time a function on the positive integers. Pity! (End of Remark.)

* * *

The above has been triggered by a recent incident, when, in an emotional outburst, one of my mathematical colleagues at the University —not a computing scientist— accused a number of younger computing scientists of "pedantry" because —as they do by habit— they started numbering at zero. He took consciously adopting the most sensible convention as a provocation. (Also the "End of ..." convention is viewed as provocative; but the convention is useful: I know of a student who almost failed at an examination by the tacit assumption that the questions ended at the bottom of the first page.) I think Antony Jay is right when he states: "In corporate religions as in others, the heretic must be cast out not because of the probability that he is wrong but because of the possibility that he is right."

Plataanstraat 5
5671 AL NUENEN
The Netherlands
11 August 1982
prof.dr. Edsger W. Dijkstra
Burroughs Research Fellow

Transcriber: Kevin Hely.
Last revised on Fri, 2 May 2008.
