To explain the source (http://www.gammon.com.au/GPascal/source/) a bit: a lot of work went into fitting it into the available memory. One approach I used was to tokenise things like error messages.
Message tokens
This was done by putting bytes with the high-order bit set inside messages, and then expanding them out at display time. Here is a table of the various tokens, in hex, that I extracted (from PAS1.ASM lines 1754+):
B0 = P-codes
B1 = full
B2 = Constant
B3 = Identifier
B4 = expected
B5 = missing
B6 = Illegal
B7 = Incorrect
B8 = string
BA = compiler
BB = literal
BC = mismatch
BD = Error
BE = zero
BF = source file
C0 = of
C1 = or
C2 = to
C3 = ended at
C4 = Symbol
C6 = Stack
C7 = Instruction
C8 = table
C9 = Type
CA = list
CC = Number
CD = Line
CE = Gambit
CF = Games
D2 = Version 3.1 Ser# 5001
D3 = Copyright 1983
D4 = <C>ompile
D5 = <S>yntax
D6 = Written by Nick Gammon
D7 = <Q>uit
D8 = Range
D9 = Parameter
DA = <E>dit,
DB = <
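To illustrate the idea, here is a minimal Python sketch of the expansion step. It is not the original 6502 routine: the function name `expand_message`, the dictionary, and the byte layout of the sample message are assumptions made for illustration, and only a few tokens from the table above are included.

```python
# Subset of the message-token table listed above (token byte -> text).
MESSAGE_TOKENS = {
    0xB1: "full",
    0xB2: "Constant",
    0xB3: "Identifier",
    0xB4: "expected",
    0xC4: "Symbol",
    0xC8: "table",
}


def expand_message(raw: bytes) -> str:
    """Expand bytes with the high-order bit set into their token text."""
    out = []
    for b in raw:
        if b & 0x80:                 # high bit set: look the token up
            out.append(MESSAGE_TOKENS[b])
        else:                        # plain 7-bit character: copy as-is
            out.append(chr(b))
    return "".join(out)


# Hypothetical encoding of error 37 ("Symbol table full"). The real byte
# layout in PAS1.ASM may differ, for example in how spaces are stored.
packed = bytes([0xC4, 0x20, 0xC8, 0x20, 0xB1])
print(expand_message(packed))
print(len(packed), "bytes stored,", len(expand_message(packed)), "characters displayed")
```

Under these assumptions the message takes 5 bytes in memory but displays as 17 characters, and the text for each token is stored only once no matter how many messages use it.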
Error messages
The error messages, in decimal, once the tokens are expanded, are (from PAS1.ASM lines 1222+):
1: Memory full
2: Constant expected
3: = expected
4: Identifier expected
5: , or : expected
6: bug
7: *) expected
8: Incorrect string
9: . expected
10: ; expected
11: Undeclared Identifier
12: Illegal Identifier
13: := expected
14: literal string of zero length
15: compiler limits exceeded
16: THEN expected
17: ; or END expected
18: DO expected
19: Incorrect Symbol
20: bug
21: Use of procedure Identifier in expression
22: ) expected
23: Illegal factor
24: Type mismatch
25: BEGIN expected
26: "of " expected
27: Stack full
28: TO or DOWNTO expected
29: string literal too big
30: Number out of Range
31: ( expected
32: , expected
33: [ expected
34: ] expected
35: Parameters mismatched
36: Data Type not recognised
37: Symbol table full
38: Duplicate Identifier
Source tokens
When the source was being processed it was turned into "tokens" (eg. numbers, symbols, reserved words, identifiers, etc.).
This made it easy to do comparisons in the compiler proper, because rather than having to do string compares, you simply checked a single byte. The source tokens, in hex, are (from PAS1.ASM line 559+):
81 = get
82 = const
83 = var
84 = array
85 = of
86 = procedure
87 = function
88 = begin
89 = end
8A = or
8B = div
8C = mod
8D = and
8E = shl
8F = shr
90 = not
91 = mem
92 = if
93 = then
94 = else
95 = case
96 = while
97 = do
98 = repeat
99 = until
9A = for
9B = to
9C = downto
9D = write
9E = read
9F = call
A1 = char
A2 = memc
A3 = cursor
A4 = xor
A5 = definesprite
A6 = plot
A7 = getkey
A8 = clear
A9 = address
AA = wait
AB = chr
AC = hex
AD = spritefreeze
AE = close
AF = put
DF = sprite
E0 = positionsprite
E1 = voice
E2 = graphics
E3 = sound
E4 = setclock
E5 = scroll
E6 = spritecollide
E7 = groundcollide
E8 = cursorx
E9 = cursory
EA = clock
EB = paddle
EC = spritex
ED = joystick
EE = spritey
EF = random
F0 = envelope
F1 = scrollx
F2 = scrolly
F3 = spritestatus
F4 = movesprite
F5 = stopsprite
F6 = startsprite
F7 = animatesprite
F8 = abs
F9 = invalid
FA = load
FB = save
FC = open
FD = freezestatus
FE = integer
FF = writeln
Notice that the "message tokens" in the range 0xB0 to 0xDB are not in the list. This is so that the output routine can expand tokens back into text, whether they are messages or reserved words, without clashes.
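As a rough sketch of what this buys you, the Python below maps a few reserved words to their single-byte tokens and checks that none of them falls in the message-token range. The names `RESERVED`, `MESSAGE_RANGE` and `tokenise_word` are made up for this example, and the case-insensitive lookup is an assumption rather than something taken from the source.

```python
from typing import Optional

# Subset of the reserved-word table above (word -> token byte).
RESERVED = {
    "begin": 0x88, "end": 0x89, "if": 0x92, "then": 0x93,
    "repeat": 0x98, "until": 0x99, "sprite": 0xDF, "writeln": 0xFF,
}

# Every message-token byte in the first table lies in 0xB0..0xDB.
MESSAGE_RANGE = range(0xB0, 0xDC)


def tokenise_word(word: str) -> Optional[int]:
    """Return the one-byte token for a reserved word, or None otherwise."""
    # Assumption: matching ignores case.
    return RESERVED.get(word.lower())


# The compiler proper can then compare one byte instead of a string.
if tokenise_word("UNTIL") == 0x99:
    print("got UNTIL")

# No reserved-word token lands in the message-token range, so one output
# routine can tell the two kinds of token apart from the byte value alone.
print(any(tok in MESSAGE_RANGE for tok in RESERVED.values()))
```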
This makes the snippet of source in the earlier post more understandable:
                1800 * REPEAT
                1801 *
9734: 20 02 90  1802 REPEAT   JSR PSHPCODE
9737: 20 49 80  1803 REP1     JSR GTOKEN
973A: 20 63 93  1804          JSR STMNT
973D: A5 16     1805          LDA TOKEN
973F: C9 3B     1806          CMP #';'
9741: F0 F4     1807          BEQ REP1
9743: A9 99     1808          LDA #$99
9745: A2 0A     1809          LDX #10
9747: 20 34 80  1810          JSR CHKTKN
974A: 20 40 90  1811          JSR GETEXPR
974D: 20 55 80  1812          JSR PULWRK
9750: 20 51 90  1813          JSR WRK:OPND
9753: A9 3D     1814          LDA #61
9755: 4C 88 80  1815          JMP GENRJMP
The code calls GTOKEN (get token) and processes a statement. Then it checks whether we got a ";" token, and if so, gets another statement. When the statements separated by semicolons run out, it checks for token 0x99 (which is "until" from the above table), and if it doesn't get it, it outputs error 10 (which is "; expected" from the earlier table). Finally it gets the expression after UNTIL and generates P-code 61 (0x3D, which is JMZ, "jump if (sp) zero", in the P-code table).
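For readers who don't follow 6502 assembler, the same control flow can be sketched in Python. This is only an approximation: the `Parser` class, its method names, and the dummy statement and expression handlers are hypothetical stand-ins for GTOKEN, STMNT, CHKTKN and GETEXPR. It also assumes that the jump target is the start of the loop body, and that the token after UNTIL is fetched before the expression is parsed.

```python
TOK_UNTIL = 0x99      # "until" from the source token table
ERR_SEMICOLON = 10    # "; expected" from the error table
OP_JMZ = 0x3D         # 61 decimal, "jump if (sp) zero" from the P-code table


class CompileError(Exception):
    pass


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)   # pre-tokenised source (hypothetical form)
        self.pos = 0
        self.token = None            # current token, like TOKEN in the listing
        self.code = []               # emitted code as (opcode, operand) pairs

    def get_token(self):             # plays the role of GTOKEN
        self.token = self.tokens[self.pos] if self.pos < len(self.tokens) else None
        self.pos += 1

    def statement(self):             # stand-in for STMNT: one dummy statement
        self.code.append(("STMT", self.token))
        self.get_token()

    def expression(self):            # stand-in for GETEXPR
        self.code.append(("EXPR", self.token))
        self.get_token()

    def check_token(self, wanted, error):    # plays the role of CHKTKN
        if self.token != wanted:
            raise CompileError(error)

    def repeat(self):
        loop_start = len(self.code)  # remember where the loop body begins
        while True:
            self.get_token()
            self.statement()
            if self.token != ord(";"):       # no more statements
                break
        self.check_token(TOK_UNTIL, ERR_SEMICOLON)
        self.get_token()
        self.expression()
        # Jump back to the start of the body while the test is zero (false).
        self.code.append((OP_JMZ, loop_start))


p = Parser(["a", ord(";"), "b", TOK_UNTIL, "cond"])
p.repeat()
print(p.code)
```

Feeding it a token list with no UNTIL after the statements raises `CompileError(10)`, which corresponds to the "; expected" message.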
P-codes
These are the meanings of the P-codes (pseudo machine codes):
Code Function Description
---- ---------- ------------------------------------
00 = LIT Load constant
01 = DEF:SPRT DEFINESPRITE
02 = NEG Negate (sp)
03 = HPLOT PLOT
04 = ADD Add (sp) to (sp - 1)
05 = TOHPLOT PLOT (not used)
06 = SUB Subtract (sp) from (sp - 1)
07 = GETKEY GETKEY
08 = MUL Multiply (sp) * (sp - 1)
09 = CLEAR CLEAR
0A = DIV Divide (sp - 1) / (sp)
0B = MOD Modulus (sp - 1) MOD (sp)
0C = ADRNN Address of integer
0D = ADRNC Address of character
0E = ADRAN Address of integer array
0F = ADRAC Address of character array
10 = EQL Test (sp - 1) == (sp)
11 = FINISHD Stop run (end program)
12 = NEQ Test (sp - 1) != (sp)
13 = CUR Cursor position
14 = LSS Test (sp - 1) < (sp)
15 = FREEZE:S SPRITEFREEZE
16 = GEQ Test (sp - 1) >= (sp)
17 = INH Input hex number
18 = GTR Test (sp - 1) > (sp)
19 = LEQ Test (sp - 1) <= (sp)
1A = ORR OR (sp - 1) | (sp)
1B = AND AND (sp - 1) & (sp)
1C = INP Input number
1D = INPC Input character
1E = OUT Output number
1F = OUTC Output character
20 = EOR Not (sp) (logical negate)
21 = OUH Output hex number
22 = SHL Shift left (sp) bits
23 = OUS Output string
24 = SHR Shift right (sp) bits
25 = INS Input string into array
26 = INC Increment (sp) by 1
27 = CLL Relative procedure/function call
28 = DEC Decrement (sp) by 1
29 = RTN Procedure/function return
2A = MOV Copy (sp) to (sp + 1)
2B = CLA Call absolute address
2C = LOD Load integer onto stack
2D = LODC Load character onto stack
2E = LDA Load absolute address integer
2F = LDAC Load absolute address character
30 = LDI Load integer indexed
31 = LDIC Load character indexed
32 = STO Store integer
33 = STOC Store character
34 = STA Store integer absolute address
35 = STAC Store character absolute address
36 = STI Store integer indexed
37 = STIC Store character indexed
38 = ABSCLL Absolute procedure/function call
39 = WAIT WAIT
3A = XOR XOR (sp - 1) ^ (sp)
3B = INT Increment stack pointer
3C = JMP Jump unconditionally
3D = JMZ Jump if (sp) zero
3E = JM1 Jump if (sp) not zero
3F = SPRITE SPRITE
40 = MVE:SPRT POSITIONSPRITE
41 = VOICE VOICE
42 = GRAPHICS GRAPHICS
43 = SOUND SOUND
44 = SET:CLK SETCLOCK
45 = SCROLL SCROLL
46 = SP:COLL SPRITECOLLIDE
47 = BK:COLL GROUNDCOLLIDE
48 = CURSORX CURSORX
49 = CURSORY CURSORY
4A = CLOCK CLOCK
4B = PADDLE PADDLE
4C = SPRT:X SPRITEX
4D = JOY JOYSTICK
4E = SPRT:Y SPRITEY
4F = OSC3 RANDOM
50 = VOICE3 ENVELOPE
51 = SCROLLX SCROLLX
52 = SCROLLY SCROLLY
53 = SPT:STAT SPRITESTATUS
54 = MOV:SPT MOVESPRITE
55 = STOP:SPT STOPSPRITE
56 = STRT:SPT STARTSPRITE
57 = ANM:SPT ANIMATESPRITE
58 = ABS ABS (absolute value of (sp))
59 = INVALID INVALID
5A = LOADIT LOAD
5B = SAVEIT SAVE
5C = X:OPEN OPEN
5D = FR:STAT FREEZESTATUS
5E = OUTCR Output a carriage-return
5F = X:CLOSE CLOSE
60 = X:GET GET
61 = X:PUT PUT
Operations that mention (sp) refer to "whatever value is on the top of the stack", and (sp - 1) is the second value from the top.
Thus, for example, when you add, it pulls the top value from the stack, and then the second-from-top value, adds them, and pushes the result onto the stack.
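A toy Python interpreter for a handful of these P-codes shows the stack behaviour. It is a sketch only: the opcode values come from the table above, but the `(opcode, operand)` program format and the `run` function are assumptions for illustration, and it does not model how the real P-codes encode their operands or the size of the integers.

```python
# Opcode values from the P-code table above.
LIT, NEG, ADD, SUB, MUL = 0x00, 0x02, 0x04, 0x06, 0x08


def run(program):
    """Run a list of (opcode, operand) pairs and return the final stack."""
    stack = []
    for op, arg in program:
        if op == LIT:
            stack.append(arg)                # load constant
        elif op == NEG:
            stack.append(-stack.pop())       # negate (sp)
        elif op in (ADD, SUB, MUL):
            top = stack.pop()                # (sp)
            second = stack.pop()             # (sp - 1)
            if op == ADD:
                stack.append(second + top)
            elif op == SUB:
                stack.append(second - top)   # subtract (sp) from (sp - 1)
            else:
                stack.append(second * top)
        else:
            raise ValueError(f"unhandled P-code {op:#04x}")
    return stack


# 2 + 3 * 4 in postfix order: push 2, push 3, push 4, multiply, add.
program = [(LIT, 2), (LIT, 3), (LIT, 4), (MUL, None), (ADD, None)]
print(run(program))  # prints [14]
```

Note the operand order for SUB: the top of the stack is subtracted from the value beneath it, which is what makes an expression like 10 - 3 come out right when the operands are pushed left to right.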