Bill Gates's Personal Easter Eggs in 8 Bit BASIC (2008)

Original link: https://www.pagetable.com/?p=43

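Microsoft's BASIC interpreter, originally developed for the Altair 8800, was later ported to the 6502 processor and powered early computers such as the KIM-1 and the Commodore PET. A hidden “MICROSOFT!” easter egg on the PET, triggered by typing “WAIT6502,1”, gave rise to rumors of tension between Bill Gates and Commodore founder Jack Tramiel. The easter egg, cleverly hidden in ROM, was written specifically for the PET's character encoding and memory addresses, which suggests it was targeted. Commodore removed the easter egg because of memory constraints, but acknowledged Microsoft's copyright in later versions. Interestingly, the “MICROSOFT!” string, although encoded, also appears in the BASIC versions of other systems such as the Apple II and the TRS-80 Color Computer. The Color Computer even has a more obvious easter egg, activated by typing “CLS9”, which points to an apparent oversight on Microsoft's part. Bill Gates is speculated to have been involved in implementing the WAIT function and the original easter egg, but Ric Weiland was probably the main author of 6502 BASIC.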

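This Hacker News thread discusses the easter eggs Bill Gates hid in 8-bit BASIC. It links to a comment by Gates himself in which he mentions other BASIC versions written for Japanese machines. Users discuss MSX-Basic and its features. One user claims to have won a Spectravideo MSX computer by writing a demo program. The discussion then turns to whether anti-cloning measures could survive in the modern era, and to the “ugly arms race” of digital piracy. Other topics include the historical context of vintage computers, easter eggs as intellectual property protection, and a comparison of BASIC and Forth. A contentious sub-thread questions Gates's technical ability, with one user claiming he was “never a developer at all”. That claim is countered with evidence of Gates's programming experience, including his work on FAT and Fortran and his critiques of code performance. The thread also brings up Gates's early programming years and anecdotes from people who worked with him, which rebut the claim that he was not a developer. Another user claims that Gates's divorce was related to Jeffrey Epstein.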

Original text

If you type “WAIT6502,1” into a Commodore PET with BASIC V2 (1979), it will show the string “MICROSOFT!” at the top left corner of the screen. Legend has it Bill Gates himself inserted this easter egg “after he had had an argument with Commodore founder Jack Tramiel”, “just in case Commodore ever tried to claim that the code wasn’t from Microsoft”.

In this episode of “Computer Archeology”, we will not only examine this story, but also track down the history of Microsoft BASIC on various computers, and see how Microsoft added a second easter egg to the TRS-80 Color Computer – because they had forgotten about the first one.

Stolen From Apple?

This whole story sounds similar to Apple embedding a “Stolen From Apple” icon into the Macintosh firmware in 1983, so that in case a cloner copies the ROM, in court, Steve Jobs could hit a few keys on the clone, revealing the icon and proving that not just a “functional mechanism” was copied but instead the whole software was copied verbatim.

Altair BASIC

Let’s dig into the history of Microsoft’s BASIC interpreters. In 1975, Microsoft (back then still spelled “Micro-soft”) released Altair BASIC, a 4 KB BASIC interpreter for the Intel 8080-based MITS Altair 8800, which, despite all its other limitations, included a 32 bit floating point library.

An extended version (BASIC-80) that consisted of 8 KB of code contained extra instructions and functions, and, most importantly, support for strings.

Microsoft BASIC for the 6502

In 1976, MOS Technology launched the KIM-1, an evaluation board based around the new 6502 CPU from the same company. Microsoft converted their BASIC for the Intel 8080 to run on the 6502, keeping both the architecture of the interpreter and its data structures the same, and created two versions: an 8 KB version with a 32 bit floating point library (6 digits), and a 9 KB system with 40 bit floating point support (9 digits).

Some sources claim that, while BASIC for the 8080 was 8 KB in size, Microsoft just couldn’t fit BASIC 6502 into 8 KB, while other sources claim there was an 8 KB version for the 6502. The truth is somewhere in the middle. The BASIC ROMs of the Ohio Scientific Model 500/600 (KIM-like microcomputer kits from 1977/1978) and the Compukit UK101 were indeed 8 KB in size, but unlike the 8080 version, they didn’t leave enough room for the machine-specific I/O code that had to be added by the OEM, so these machines required an extra ROM chip containing this I/O code.

In 1977, Microsoft changed the 6 digit floating point code to support 9 digits and included actual error strings instead of two-character codes, while leaving everything else unchanged. A 6502 machine with BASIC in ROM needed more than 8 KB anyway, so why not make it a little bigger to add extra features? The 6 digit math code was still an assembly time option; the 1981 Atari Microsoft BASIC used that code.

In 1977, Ohio Scientific introduced the “Model 500”, which was the first machine to contain (6 digit) Microsoft BASIC 1.0 in ROM. Upon startup, it printed:

OSI 6502 BASIC VERSION 1.0 REV 3.2
COPYRIGHT 1977 BY MICROSOFT CO.
OK

In the same year, MOS started selling a tape version of 9 digit Microsoft BASIC 1.1 for the KIM-1. Its start message was:

MOS TECH 6502 BASIC V1.1
COPYRIGHT 1977 BY MICROSOFT CO.
OK

Woz Integer BASIC

The 1976 Apple I was the first system besides the KIM to use the MOS 6502 CPU, but Steve Wozniak wrote his own 4KB BASIC interpreter instead of licensing Microsoft’s. An enhanced version of Woz’ “Integer BASIC” came in the ROM of the Apple II in 1977; Microsoft BASIC (called “AppleSoft”) was available as an option on tape. On the Apple II Plus (1978), AppleSoft II replaced Integer BASIC.

Commodore PET

Commodore had bought MOS in October 1976 and worked on converting the KIM platform into a complete computer system. They licensed Microsoft BASIC for 6502 (also October 1976), renamed it to Commodore BASIC, replaced the “OK” prompt with “READY.”, stripped out the copyright string and shipped it in the ROMs of the first Commodore PET in 1977.

The Easter Egg

In 1979, Commodore started shipping update ROMs with a version 2 of Commodore BASIC for existing PETs. Apart from updates in array handling, it also contained the WAIT 6502 easter egg.

This is what the easter egg code looks like:

.,D710 20 C6 D6  JSR $D6C6      fetch address and value
.,D713 86 46     STX $46        save second parameter
.,D715 A2 00     LDX #$00       default for third parameter
.,D717 20 76 00  JSR $76        CHRGOT get last character
.,D71A F0 29     BEQ $D745      no third parameter
.,D71C 20 CC D6  JSR $D6CC      check for comma and fetch parameter
.,D71F 86 47     STX $47        save 3rd parameter
.,D721 A0 00     LDY #$00
.,D723 B1 11     LDA ($11),Y    read from WAIT address
.,D725 45 47     EOR $47        third parameter
.,D727 25 46     AND $46        second parameter
.,D729 F0 F8     BEQ $D723      keep waiting
.,D72B 60        RTS            back to interpreter loop
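
The polling loop in this listing can be modeled in a few lines of Python. This is only a sketch of the logic visible in the disassembly: the `read_byte` callback and the `max_polls` limit are hypothetical additions so the loop can run outside a real machine.

```python
def basic_wait(read_byte, address, mask, invert=0, max_polls=None):
    """Poll `address` until (value XOR invert) AND mask is non-zero.

    `read_byte` is a hypothetical callback standing in for a memory read.
    `max_polls` is not part of the ROM code; it only bounds this sketch.
    Returns the number of reads performed, or None if max_polls ran out.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        value = read_byte(address) & 0xFF   # LDA ($11),Y
        polls += 1
        if (value ^ invert) & mask:         # EOR $47 / AND $46 / BEQ
            return polls                    # RTS
    return None


# Example: the watched bit 2 becomes set on the third read.
samples = iter([0x00, 0x00, 0x04])
print(basic_wait(lambda addr: next(samples), 0x0097, 0x04))
```

The third parameter defaults to 0, matching the `LDX #$00` at $D715, so a two-parameter WAIT simply waits for any masked bit to become set.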

On pre-V2 BASIC, the branch at $D71A just skipped the next line: If there is no third parameter, don’t fetch it. On V2, the line is subtly changed to make the two-parameter case branch to a small patch routine:

.,D745 A5 11     LDA $11        low byte of address
.,D747 C9 66     CMP #$66       = low of $1966 (=6502)
.,D749 D0 D4     BNE $D71F      no, back to original code
.,D74B A5 12     LDA $12        high byte of address
.,D74D E9 19     SBC #$19       = high of $1966 (=6502)
.,D74F D0 CE     BNE $D71F      no, back to original code
.,D751 85 11     STA $11        low byte of screen buffer = 0
.,D753 A8        TAY            index = 0
.,D754 A9 80     LDA #$80       high byte of screen buffer
.,D756 85 12     STA $12        screen buffer := $8000
.,D758 A2 0A     LDX #$0A       10 characters
.,D75A BD 81 E0  LDA $E081,X    read character
.,D75D 29 3F     AND #$3F       throw away upper bits
.,D75F 91 11     STA ($11),Y    store into screen RAM
.,D761 C8        INY
.,D762 D0 02     BNE $D766      no carry
.,D764 E6 12     INC $12        increment screen buffer high address
.,D766 CA        DEX
.,D767 D0 F1     BNE $D75A      next character
.,D769 C6 46     DEC $46
.,D76B D0 EB     BNE $D758      repeat n times
.,D76D 60        RTS            back to interpreter loop
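
To make the control flow of the patch easier to follow, here is a rough Python model of it. The `screen` dictionary is a stand-in for screen RAM and the function name is invented for this sketch; the constants are taken from the listing.

```python
# The ten obfuscated bytes as stored in the PET ROM at $E082-$E08B.
HIDDEN = bytes([0xA1, 0x54, 0x46, 0x8F, 0x13, 0x8F, 0x52, 0x43, 0x89, 0xCD])
SCREEN_BASE = 0x8000


def wait_patch(address, count, screen):
    """Model of the V2 patch routine.

    Returns False when the address is not 6502 ($1966), in which case the
    ROM falls back to the normal WAIT loop. `screen` is a dict standing in
    for screen RAM (an assumption of this sketch).
    """
    if address != 6502:
        return False
    ptr = SCREEN_BASE                             # $11/$12 := $8000
    remaining = count & 0xFF                      # second parameter in $46
    while True:
        for x in range(10, 0, -1):                # LDX #$0A ... DEX / BNE
            screen[ptr] = HIDDEN[x - 1] & 0x3F    # LDA $E081,X / AND #$3F
            ptr += 1                              # INY, INC $12 on wrap
        remaining = (remaining - 1) & 0xFF        # DEC $46
        if remaining == 0:
            return True


screen = {}
wait_patch(6502, 1, screen)
print([hex(screen[SCREEN_BASE + i]) for i in range(10)])
```

Because the counter is decremented before it is tested, this model suggests that a second parameter of 0 would wrap around and write the string 256 times.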

The text “MICROSOFT!” is stored in 10 consecutive bytes at $E082, cleverly hidden after a table of coefficients that is used for the SIN() function:

.;E063 05                        6 coefficients for SIN()
.;E064 84 E6 1A 2D 1B            -((2*PI)**11)/11! = -14.3813907
.;E069 86 28 07 FB F8             ((2*PI)**9)/9!   =  42.0077971
.;E06E 87 99 68 89 01            -((2*PI)**7)/7!   = -76.7041703
.;E073 87 23 35 DF E1             ((2*PI)**5)/5!   =  81.6052237
.;E078 86 A5 5D E7 28            -((2*PI)**3)/3!   = -41.3417021
.;E07D 83 49 0F DA A2               2*PI           =  6.28318531
.;E082 A1 54 46 8F 13            "SOFT!" | backwards and with
.;E087 8F 52 43 89 CD            "MICRO" | random upper bits

If we reverse the bytes, we get

CD 89 43 52 8F 13 8F 46 54 A1

The easter egg code clears the upper 2 bits, resulting in

0D 09 03 12 0F 13 0F 06 14 21

The easter egg code does not print the characters through library routines, but instead writes the values directly into screen RAM. While BASIC used the ASCII character encoding, the Commodore character set had its own encoding, with “A” starting at $01, but leaving digits and special characters at the same positions as in ASCII. Thus, the 10 hidden and obfuscated bytes decode into:

MICROSOFT!
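
The decoding steps above (reverse, mask, map screen codes to letters) can be reproduced with a short Python sketch. The helper names are made up for illustration, and the screen code mapping only covers the ranges the article describes.

```python
# The ten bytes from $E082, in ROM order.
HIDDEN = bytes([0xA1, 0x54, 0x46, 0x8F, 0x13, 0x8F, 0x52, 0x43, 0x89, 0xCD])


def screen_code_to_char(code):
    """Map a PET screen code to a printable character (partial mapping)."""
    if 0x01 <= code <= 0x1A:       # "A" starts at $01
        return chr(code + 0x40)
    if 0x20 <= code <= 0x3F:       # digits and punctuation match ASCII
        return chr(code)
    return "?"                     # anything else is outside this sketch


def decode_pet(hidden):
    """Reverse the bytes, clear the upper two bits, then map to text."""
    codes = [b & 0x3F for b in reversed(hidden)]
    return "".join(screen_code_to_char(c) for c in codes)


print(decode_pet(HIDDEN))  # prints MICROSOFT!
```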

Microsoft’s Code?

Commodore engineers are known for putting easter eggs into ROM, but there would be no reason for them to encode the string “MICROSOFT!” and hide it so well. The “WAIT 6502” easter egg did not show up in Commodore BASIC until version 2, which is in contrast to almost all sources claiming Commodore licensed Microsoft BASIC for a flat fee and never returned to Microsoft for updates, but continued improving BASIC internally.

Commodore had indeed updated its source with Microsoft’s changes since V1. 6502 guru Jim Butterfield states:

Commodore paid Microsoft an additional fee to write a revision to the original BASIC that they had bought. Among other things, spaces-in-keywords were changed, zero page shifted around, and (unknown to Commodore) the WAIT 6502,x joke was inserted.

Targeting Commodore?

While all of Microsoft BASIC only depends on the CPU, makes no other assumptions on the hardware it runs on (be it Commodore, Apple, Atari, …), and does all its input and output by calling into ROM functions external to BASIC, the easter egg writes directly to screen RAM at a fixed address of $8000, and uses the PET character encoding for it: The easter egg has clearly been written specifically for the PET.

We can only speculate on the reasons why Microsoft and possibly Bill Gates himself added the easter egg. A possible reason is that Microsoft wanted to make sure Commodore could not take credit for “Commodore BASIC” – similar to the “Stolen From Apple” case.

Or it was only about showing the world who really wrote it. Jim Butterfield: As an afterthought, Microsoft would have liked to see their name come up on the screen. But it wasn’t in the contract.

Commodore’s Reaction

The easter egg only exists in BASIC version 2 on the PET. All later Commodore computers didn’t contain it: The branch was restored and the extra code as well as the 10 bytes hidden after the SIN() coefficients were removed.

Jim Butterfield: Shortly after that implementation, I show this to Len Tramiel [of Commodore engineering] at the Commodore booth of a CES show. He was enraged: “We have a machine that’s short of memory space, and the #$#!* [Gates] put that kind of stuff in!!”

Commodore employee Andy Finkel states that the “Gates” (!) easter egg had to be removed for space reasons. It had occupied 51 extra bytes.

Interestingly, starting with the BASIC V7 on the C128 six years later, Commodore started crediting Microsoft, like this:

        COMMODORE BASIC V7.0 122365 BYTES FREE
          (C)1985 COMMODORE ELECTRONICS, LTD.
                (C)1977 MICROSOFT CORP.
                  ALL RIGHTS RESERVED

According to Jim Butterfield, this is probably due to negotiations concerning Microsoft BASIC for the Amiga.

The Easter Egg before the PET

But Microsoft did not encode its company name specifically for Commodore: The 9 digit BASIC 6502 version 1.1 for the KIM-1 contained the 10 hidden bytes:

.;3FAA 05                        6 coefficients for SIN()
.;3FAB 84 E6 1A 2D 1B            -((2*PI)^11)/11! = -14.3813907
.;3FB0 86 28 07 FB F8             ((2*PI)^9)/9!   =  42.0077971
.;3FB5 87 99 68 89 01            -((2*PI)^7)/7!   = -76.7041703
.;3FBA 87 23 35 DF E1             ((2*PI)^5)/5!   =  81.6052237
.;3FBF 86 A5 5D E7 28            -((2*PI)^3)/3!   = -41.3417021
.;3FC4 83 49 0F DA A2               2*PI           =  6.28318531
.;3FC9 A6 D3 C1 C8 D4            "!TFOS"
.;3FCE C8 D5 C4 CE CA            "ORCIM"

The extra bytes here are:

A6 D3 C1 C8 D4 C8 D5 C4 CE CA

If we XOR every byte with 0x87, we get:

21 54 46 4f 53 4f 52 43 49 4d

which, again, is “MICROSOFT!” backwards, but this time in the ASCII encoding. (Note that no XOR or add/sub can be found for the 10 bytes in Commodore BASIC that would convert them into ASCII instead of PETSCII. Also, thanks to Tom for his help here.)
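
The XOR step is easy to check in Python. This is just a verification sketch; the variable and function names are arbitrary.

```python
# The ten extra bytes from the KIM-1 BASIC 1.1 image at $3FC9.
KIM_BYTES = bytes([0xA6, 0xD3, 0xC1, 0xC8, 0xD4, 0xC8, 0xD5, 0xC4, 0xCE, 0xCA])


def decode_kim(data, key=0x87):
    """XOR every byte with the key, then reverse to get readable ASCII."""
    plain = bytes(b ^ key for b in data)
    return plain[::-1].decode("ascii")


print(decode_kim(KIM_BYTES))  # prints MICROSOFT!
```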

The version of Microsoft BASIC for the 6502-based Apple II, called “AppleSoft”, contains the same 10 bytes after the coefficients in all tape and ROM versions. On AppleSoft II, for example, they are located at address $F075.

KIM-1 BASIC was released in 1977, AppleSoft II in spring 1978, and the V2 ROM of the PET in spring 1979. So Microsoft didn’t “target” Commodore with this at first, but probably put the data in for all their customers – possibly right after they had shipped the easter-egg-free V1 to Commodore. And when Commodore came back to them, they changed their codebase to encode the string differently and added the easter egg code to show the string.

The Easter Egg after the PET

After the second source drop to Commodore, they removed the “WAIT6502” code again, but kept the 10 encoded bytes in their master codebase: Every non-Commodore post-1978 6502 Microsoft BASIC with the 40 bit floating point library contains the 10 encoded bytes after the SIN() coefficients – still in PET encoding:

  • Tangerine Microtan 65
  • Tangerine Oric-1 and Oric-Atmos
  • Pravetz 8D

This is a snippet from microtan/tanex_h2.rom:

0000fd8: 0f da a2 a1 54 46 8f 13  ....TF..
0000fe0: 8f 52 43 89 cd a5 d5 48  .RC....H

The ROM of the Ohio Scientific Superboard II (and its clone, the Compukit UK101) as well as the Atari Microsoft BASIC tape are based on the 32 bit floating point version and don’t contain the easter egg data.
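
Checking a ROM dump for either form of the hidden string comes down to a byte search. The following Python sketch assumes the image is already loaded into a `bytes` object; the function name and the dictionary keys are invented for this example.

```python
# PET screen code form, as found in the PET V2 ROM and later 6502 BASICs.
PET_SIGNATURE = bytes([0xA1, 0x54, 0x46, 0x8F, 0x13, 0x8F, 0x52, 0x43, 0x89, 0xCD])
# Older form: reversed ASCII text XORed with $87 (KIM-1 BASIC 1.1).
ASCII_SIGNATURE = bytes(b ^ 0x87 for b in b"!TFOSORCIM")


def find_signatures(rom):
    """Return a dict mapping each signature name to its first offset."""
    hits = {}
    for name, sig in (("pet", PET_SIGNATURE), ("ascii_xor_87", ASCII_SIGNATURE)):
        offset = rom.find(sig)
        if offset != -1:
            hits[name] = offset
    return hits


# The 16 bytes from the Microtan snippet above serve as a tiny test image.
sample = bytes.fromhex("0fdaa2a154468f138f524389cda5d548")
print(find_signatures(sample))
```

On a 32 bit floating point ROM such as the Superboard II one, this search would be expected to return an empty dictionary.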

“MICROSOFT!” on the 6800 and the 6809

It doesn’t stop there: Even the BASIC versions on the TRS-80 Color Computer and the TRS-80 MC-10, which were versions for the 6809 and 6800 CPU architectures, respectively (BASIC-69 and BASIC-68), had the encoded “MICROSOFT!” string after the SIN() coefficients. Here is a snippet of Spectral Associates’ disassembly of the CoCo ROM in their book “Color Basic Unravelled II”:

                       * MODIFIED TAYLOR SERIES SIN COEFFICIENTS
BFC7 05                LBFC7   FCB   6-1                   SIX COEFFICIENTS
BFC8 84 E6 1A 2D 1B    LBFC8   FCB   $84,$E6,$1A,$2D,$1B   * -((2*PI)**11)/11!
BFCD 86 28 07 FB F8    LBFCD   FCB   $86,$28,$07,$FB,$F8   *  ((2*PI)**9)/9!
BFD2 87 99 68 89 01    LBFD2   FCB   $87,$99,$68,$89,$01   * -((2*PI)**7)/7!
BFD7 87 23 35 DF E1    LBFD7   FCB   $87,$23,$35,$DF,$E1   *  ((2*PI)**5)/5!
BFDC 86 A5 5D E7 28    LBFDC   FCB   $86,$A5,$5D,$E7,$28   * -((2*PI)**3)/3!
BFE1 83 49 0F DA A2    LBFE1   FCB   $83,$49,$0F,$DA,$A2   *    2*PI

BFE6 A1 54 46 8F 13 8F LBFE6   FCB   $A1,$54,$46,$8F,$13   UNUSED GARBAGE BYTES
BFEC 52 43 89 CD               FCB   $8F,$52,$43,$89,$CD   UNUSED GARBAGE BYTES

You can tell that Microsoft didn’t reimplement BASIC for the remaining 8 bit architectures, but practically converted the 6502 code, copying all constants verbatim, even the ones they did not understand, since these are still the obfuscated bytes in PET-encoding.
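
The coefficient rows that recur in all of these listings are 5-byte values in the 40 bit floating point format. Here is a minimal sketch for decoding them, assuming the commonly documented layout: one exponent byte biased by $80, followed by a 32 bit mantissa whose top bit holds the sign and stands in for an implied leading 1. The function name is made up for illustration.

```python
def mbf40_to_float(raw):
    """Decode a 5-byte Microsoft 40 bit float (assumed layout, see above)."""
    exponent, m1, m2, m3, m4 = raw
    if exponent == 0:                      # exponent 0 encodes zero
        return 0.0
    sign = -1.0 if m1 & 0x80 else 1.0      # sign lives in the mantissa MSB
    mantissa = ((m1 | 0x80) << 24) | (m2 << 16) | (m3 << 8) | m4
    # The mantissa is a fraction in [0.5, 1), scaled by 2**(exponent - 128).
    return sign * (mantissa / 2**32) * 2.0 ** (exponent - 0x80)


# The last table entry, annotated as 2*PI in the listings.
print(mbf40_to_float(bytes.fromhex("83490FDAA2")))
```

The printed value should land close to 6.28318531. Running the ten trailing bytes through the same function yields nothing meaningful, which fits the "UNUSED GARBAGE BYTES" comment in the CoCo disassembly.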

A Second Easter Egg on the Color Computer

The TRS-80 Color Computer (1980) also has an easter egg in BASIC: If you type “CLS9” (or any higher number), it will clear the screen and print “MICROSOFT”.

Let’s see how it is done:

                  * CLS
A910 BD 01 A0     CLS     JSR RVEC22     HOOK INTO RAM
A913 27 13                BEQ LA928      BRANCH IF NO ARGUMENT
A915 BD B7 0B             JSR LB70B      CALCULATE ARGUMENT, RETURN VALUE IN ACCB
A918 C1 08                CMPB #8        VALID ARGUMENT?
A91A 22 1B                BHI LA937      IF ARGUMENT >8, GO PRINT ‘MICROSOFT’
[...]
A937 8D EF        LA937   BSR LA928      CLEAR SCREEN
A939 8E A1 65             LDX #LA166-1   *
A93C 7E B9 9C             JMP LB99C      * PRINT ‘MICROSOFT’

The string to be printed is stored here:

A166 4D 49 43 52 4F 53 LA166 FCC 'MICROSOFT'
A16C 4F 46 54
A16F 0D 00             LA16F FCB CR,$00

That’s right, Microsoft added a different easter egg, and included the string “MICROSOFT” again, this time in cleartext. They seem to have forgotten about the obfuscated 10 bytes intended for the PET that had been copied from the 6502 version to the 6800 during conversion, and were still present in the Color Computer ROM.
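
The argument check in the CLS handler can be summarized with a small Python model. It is a sketch of the branches shown in the listing only: the return values are invented labels, and the path for arguments 0 to 8 is elided in the disassembly, so it appears here as a placeholder.

```python
def cls(argument=None):
    """Return (action, printed_text) for a CLS call, per the listing."""
    if argument is None:                  # BEQ LA928: no argument given
        return ("clear", "")
    if argument > 8:                      # CMPB #8 / BHI LA937
        # BSR LA928 clears the screen, then the string at LA166 is
        # printed: 'MICROSOFT' followed by a carriage return.
        return ("clear", "MICROSOFT\r")
    # Arguments 0..8: this part is elided ([...]) in the listing, so the
    # label below is only a placeholder, not a description of the ROM.
    return ("clear_with_argument", "")


for arg in (None, 3, 9, 200):
    print(arg, cls(arg))
```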

The same easter egg exists on the 6800-based TRS-80 MC-10 (also 1980), which also had the 10 PET bytes in ROM:

FBBF 27 13                BEQ $FBD4       ; branch if no argument
FBC1 BD EF 0D             JSR $EF0D       ; get argument
FBC4 C1 08                CMPB #$08       ; easter egg?
FBC6 22 1D                BHI $FBE5       ; yes
[...]
FBE5 8D ED                BSR $FBD4       ; clear screen
FBE7 CE F8 33             LDX #$F834-1
FBEA 7E E7 A8             JMP $E7A8       ; print "MICROSOFT"
[...]
F834 4D 49 43 52 4F       FCC "MICROSOFT"
F839 53 4F 46 54 0D       FCB $0D
F83E 00                   FCB $00
[...]
F724 A1 54 46 8F 13       FCB $A1,$54,$46,$8F,$13 ; "!TFOS"
F729 8F 52 43 89 CD       FCB $8F,$52,$43,$89,$CD ; "ORCIM"

Microsoft BASIC 6502 Timeline

  • Version 1.0 (in the 6 digit version) is used on the Ohio Scientific, and contains a major bug in the garbage collection code.
  • Version 1.0 (in the 9 digit version) is also used in the first Commodore PET as Commodore BASIC V1. It is the oldest known Microsoft BASIC to support 9 digit floating point.
  • Version 1.1, which contained bug fixes, is used on the KIM-1. It is the oldest version to contain the “MICROSOFT!” string (in ASCII).
  • AppleSoft BASIC I is forked from Microsoft BASIC 1.1. It contains the ASCII string.
  • Microsoft BASIC version 2 changes the ASCII string to PET screencode, adds the easter egg code, and is given to Commodore.
  • The code is removed again after the source drop to Commodore. The Tangerine Microtan is based on this.
  • Apple, Commodore and Tangerine continue development of their respective forks without the involvement of Microsoft.
  • The BASIC V2 used on the VIC-20 and the C64 is actually a stripped-down version of PET BASIC 4.0 and not a ported version of PET BASIC V2.

So did Bill Gates write it himself?

Altair BASIC was written by Bill Gates, Paul Allen (the founders of Microsoft) and Monte Davidoff (a contractor), as comments in the original source printout show:

00560  PAUL ALLEN WROTE THE NON-RUNTIME STUFF.
00580  BILL GATES WROTE THE RUNTIME STUFF.
00600  MONTE DAVIDOFF WROTE THE MATH PACKAGE.

Bill Gates wrote “the runtime stuff” (which probably means the implementation of the instructions), as opposed to “the non-runtime stuff” (probably meaning tokenization, memory management) and “the math package”. Consequently, the implementation of the WAIT command would have been his work – on the 8080, at least.

Now who wrote the 6502 version? The KIM-1 BASIC manual credits Gates, Allen and Davidoff, the original authors of the 8080 version, but it might only be left over from the manual for the 8080 version. Davidoff, who worked for Microsoft in the summers of 1975 and 1977, had not been there when BASIC 6502 was written in the summer of 1976, but he probably changed the 6 digit floating point code into the 9 digit version that is first found in BASIC 6502 1.1 (KIM-1, 1977).

The ROM of the 1977/1978 Ohio Superboard II Model 500/600 (6 digit BASIC 1.0) credits RICHARD W. WEILAND, and the 1977 9 digit KIM-1 BASIC 1.1 as well as the 1981 Atari Microsoft BASIC 2.7 credit “WEILAND & GATES”. Ric Weiland was the second Microsoft employee. These credits, again, were easter eggs: While they were clearly visible when looking at the ROM dump, they were only printed when the user entered “A” when BASIC asked for the memory size.

According to apple2history.org, Marc McDonald (employee number 1) wrote the 6502 version, but it is more likely that McDonald wrote the 6800 simulator and Weiland ported 8080 BASIC to the 6800, and that McDonald then adapted the 6800 simulator to the 6502 while Weiland wrote the 6502 BASIC.

This and the hidden credits in version 1.0 of 6502 BASIC suggest that Weiland was the main author of 6502 BASIC. Gates is added to the hidden credits in the 1.1 version, so Gates probably contributed to the 1.1 update.

So it is very possible that Gates wrote the easter egg code himself, given that he was responsible for the implementation of WAIT on the 8080, he is credited in BASIC 6502 1.1+, Finkel and Butterfield refer to WAIT6502 as “Gates'” easter egg – and after all, he can write code.

Open Questions

  • What was Atari’s version based on? What versions were there? Atari Microsoft BASIC images are very hard to find.
  • Why did Atari use the 6 digit version, if they extended it with lots of commands (so size couldn’t have been an issue)?

Annotated Disassembly of Different Versions
