Extended text tokens - Elite on the BBC Micro and NES

The extended text token system in the enhanced versions of Elite

All versions of Elite use a clever text tokenisation system to store the game's text in an efficient manner - you can read all about it in the deep dive on printing text tokens. In addition to this system, the enhanced disc and 6502 Second Processor versions of Elite have two additional extended text token systems that provide a lot more text, supporting missions, disc access menus, extended system descriptions and more.

For example, here's the second briefing screen for the Constrictor mission, which uses token 10 from the extended token table at TKN1:

The second briefing screen for the Constrictor mission in BBC Micro Elite

The two extended text token systems are as follows:

There are 256 recursive tokens in the TKN1 table that can be printed with the DETOK routine. This is the bulk of the extended token system, and contains any game text that isn't already covered by the standard text tokens or the special extended descriptions in RUTOK.
There are 27 special extended system descriptions in the RUTOK table that can be printed with the DETOK3 routine. These override the procedurally generated descriptions for a small group of systems, typically during the two missions when they are used to guide the player towards their mission briefings and goals (though there are some non-mission descriptions in there that provide some interesting Easter eggs for the player to find).

To print an extended token, we simply put the token number into the accumulator and call either DETOK or DETOK3. It's a lot simpler than the encoding system we have to use with TT27 for the standard tokens, though under the hood, the extended token system is just as complicated...

Types of extended token
-----------------------

Just like the standard text tokens, the tokens in the TKN1 and RUTOK tables are themselves composed of different types of token, though this complexity is hidden inside the routine that does the actual printing. This routine is known by two names, which are aliases of each other: TT26, which shares its name with the character printing routine in the BBC Micro cassette version, and DASC, which points to exactly the same routine. We'll talk about DASC here, as it's a slightly friendlier name.

Like the standard token system, with its control codes, two-letter tokens and recursive tokens, there are quite a few different types of extended token that DASC prints. They are:

Jump tokens: Instead of printing, these tokens call the corresponding routine from the jump table at JMTB. These can do anything from setting the letter case to rotating ships on screen while waiting for key presses.
Characters: These are standard ASCII characters, with the case determined by the extended token flags.
Random tokens: These are used to display the procedurally generated extended system descriptions, which use the random number generator to generate random sequences of tokens. This randomness can be controlled by seeding the random number generator before printing, which is how we ensure each system always has the same description.
Extended recursive tokens: These work in the same way as the recursive tokens from the standard text token system, allowing us to include tokens within tokens.
Extended two-letter tokens: These work in the same way as the two-letter tokens in the standard text token system, but with a larger range of two-letter sequences that extends the standard set, at the expense of dropping four of the original tokens.

As with the standard token system, the type of token is determined by the character code that is stored in memory. So, in the same way that control codes in standard text tokens are in the range 0-13, jump tokens in the extended text tokens are in the range 1-31, and random tokens are in the range 91-128. Here's a breakdown of the code ranges:

Character	Macro	Process
1-31	EJMP	Call the corresponding JMTB routine
32-64	ECHR	Print numbers and punctuation with TT27
65-90	ECHR	Print letters A-Z in the correct case with DASC
91-128	ERND	Print an extended recursive token with DETOK, fetching the token number from the MTIN table (subtract 91 to get 0-37 then add random 0-4)
129-214	ETOK	Print an extended recursive token with DETOK
215-255	ETWO	Print an extended two-letter token from table TKN2 (subtract 215 to get 0-40)
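
To make the ranges in the table concrete, here's a minimal Python sketch that classifies a character code by the macro that would encode it. The function name classify_extended is made up for this illustration, and the real DASC routine is 6502 code that branches on these ranges rather than returning strings.

```python
def classify_extended(code: int) -> str:
    """Return the macro name for an extended token character code.

    The ranges are taken straight from the table above; the function
    itself is a hypothetical helper, not part of the Elite source.
    """
    if not 1 <= code <= 255:
        raise ValueError("character code must be in the range 1 to 255")
    if code <= 31:
        return "EJMP"   # jump token, calls a JMTB routine
    if code <= 90:
        return "ECHR"   # numbers, punctuation and letters A-Z
    if code <= 128:
        return "ERND"   # random token, looked up via MTIN
    if code <= 214:
        return "ETOK"   # extended recursive token
    return "ETWO"       # extended two-letter token


print(classify_extended(10), classify_extended(65), classify_extended(215))
```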

As with the standard text tokens, let's make things easier to follow by referring to the four token types like this, where n is the character code:

  {n}           Jump token                n = 1 to 31
  [n?]          Random token              n = 91 to 128
  [n]           Recursive token           n = 129 to 214
  <n>           Two-letter token          n = 215 to 255

Also like the standard text tokens, the extended text tokens are stored in memory in an obfuscated manner, though this time they are EOR'd with the value of the VE configuration variable (&57) rather than the EOR 35 that is used to hide the standard tokens. To make the source code easier to read, we use various macros to assemble the tokens into memory while retaining some level of human readability of the source code. For extended tokens, the names all start with an "E", and they are:

  ECHR n          Insert ASCII character n        n = 32 to 90
  EJMP n          Insert jump token n             n = 1 to 31
  ERND n          Insert random token n           n = 0 to 37
  ETOK n          Insert recursive token n        n = 129 to 214
  ETWO 'x', 'y'   Insert two-letter token "xy"    "xy" is in the table below
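
The obfuscation itself is just an exclusive-OR, so the same operation both hides and reveals a token. Here's a small Python sketch of the idea, using the VE value of &57 given above. The helper names obfuscate and reveal are invented for this example, and the sample codes are arbitrary rather than taken from the TKN1 table.

```python
VE = 0x57  # the EOR value for extended tokens, as stated above


def obfuscate(codes):
    """EOR each character code with VE, as the macros do at assembly time."""
    return bytes(code ^ VE for code in codes)


def reveal(stored):
    """EOR each stored byte with VE to recover the original character code."""
    return [byte ^ VE for byte in stored]


# An arbitrary sample: jump token 1, the letters "HI", two-letter token 215
sample = [1, ord("H"), ord("I"), 215]
stored = obfuscate(sample)

print([hex(b) for b in stored])
print(reveal(stored) == sample)
```

Because EOR is its own inverse, applying it twice with the same value always gets us back to where we started.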

Let's look at each of these types in turn, but before we do, it's worth noting that as part of the extended token system, it's possible to switch from extended tokens back to standard text tokens, and then back again, all within one extended token (we do this using jump tokens 5 and 6). When standard tokens are enabled, the DASC routine does the following instead:

Character	Macro	Implementation
1-31	EJMP	Call the corresponding JMTB routine
32-255	RTOK	Print a standard text token with TT27

This behaviour is controlled by bit 7 of the print flag in DTW3: if it is clear then extended tokens are enabled, and if it is set then standard tokens are enabled.
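
As a rough sketch of this switch, the following Python function picks a handler from the character code and the DTW3 flag. Only the bit 7 test comes from the description above. The function name handler_for and the returned labels are assumptions made for the example.

```python
def handler_for(code: int, dtw3: int) -> str:
    """Pick a handler for a character code, given the DTW3 print flag.

    Jump tokens go to the JMTB table in both modes. Everything else
    depends on bit 7 of DTW3: clear means extended tokens are enabled,
    set means standard tokens are enabled.
    """
    if 1 <= code <= 31:
        return "JMTB"
    if dtw3 & 0x80:          # bit 7 set: standard tokens enabled
        return "TT27"
    return "extended"        # bit 7 clear: use the extended token ranges


print(handler_for(65, 0x00), handler_for(65, 0x80), handler_for(5, 0x80))
```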

Let's now take a look at the various types of token that make up the extended text token system.

Extended recursive tokens: [n]
------------------------------

Extended recursive tokens work in the same way as standard recursive tokens, in that tokens can contain other tokens. However, the range of extended tokens that can be included in other tokens is a bit smaller than in the standard system, where you can include all but three tokens recursively. There are 256 extended tokens in the TKN1 table that the DETOK routine can print, but only tokens in the range 129 to 214 can be included in other tokens.

Apart from this, recursive tokens expand in the same way as in the standard system, and tokens can contain tokens that contain other tokens, recursing as deep as you like.
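
To show how nested expansion plays out, here's a Python sketch with a made-up three-entry table. The table contents and the expand function are purely illustrative and do not reflect the real TKN1 entries. The check on 129 to 214 follows the ETOK range in the code table earlier in this article.

```python
# A toy table: strings are literal text, integers are embedded tokens.
# These entries are invented for the example, not real TKN1 content.
TOY_TABLE = {
    129: ["GOOD "],
    130: [129, "LUCK"],
    131: [130, ", COMMANDER"],
}


def expand(token: int, table) -> str:
    """Expand a token, recursing into any tokens embedded inside it."""
    parts = []
    for part in table[token]:
        if isinstance(part, int):
            if not 129 <= part <= 214:
                raise ValueError("only tokens 129 to 214 can be embedded")
            parts.append(expand(part, table))
        else:
            parts.append(part)
    return "".join(parts)


print(expand(131, TOY_TABLE))
```

Token 131 contains token 130, which in turn contains token 129, so printing 131 recurses two levels deep before any text appears.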

Extended two-letter tokens: <n>
-------------------------------

Also similar to the standard token system, the extended two-letter token system is based on the range of standard two-letter tokens from the table at QQ16, but with an additional set of tokens, and four of the original tokens dropped. The full range of extended two-letter tokens is as follows:

  215     {crlf}
  216     AB
  217     OU
  218     SE
  219     IT
  220     IL
  221     ET
  222     ST
  223     ON
  224     LO
  225     NU
  226     TH
  227     NO
  228     AL
  229     LE
  230     XE
  231     GE
  232     ZA
  233     CE
  234     BI
  235     SO
  236     US
  237     ES
  238     AR
  239     MA
  240     IN
  241     DI
  242     RE
  243     A?
  244     ER
  245     AT
  246     EN
  247     BE
  248     RA
  249     LA
  250     VE
  251     TI
  252     ED
  253     OR
  254     QU
  255     AN

Tokens 215 to 227 are exclusive to the extended token system, while the tokens shared with the standard system start at 228. The shared tokens have numbers that are 100 higher than the same tokens in the standard system, which is why the last four standard tokens are not available in the extended list, as they would need token numbers higher than 255.

The new two-letter tokens can be found in the table at TKN2, which appears directly before the standard two-letter token table at QQ16. This means we can subtract 215 from the token number to get a number in the range 0-40, which acts as an index into the TKN2/QQ16 table when doubled (as each entry takes up two bytes).
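
Here's the index calculation as a Python sketch, with the two tables laid out back to back as one string. The pairs are copied from the list above. Representing {crlf} as character codes 12 and 10 is an assumption based on the {cr} and {lf} jump tokens, and the name two_letter is invented for the example.

```python
# Two-letter pairs for tokens 216 to 255, in the order listed above
PAIRS = (
    "AB OU SE IT IL ET ST ON LO NU TH NO "
    "AL LE XE GE ZA CE BI SO US ES AR MA IN DI RE A? "
    "ER AT EN BE RA LA VE TI ED OR QU AN"
).split()

# Token 215 is {crlf}, assumed here to be codes 12 and 10, followed by
# the remaining pairs, so each entry takes up exactly two characters
TABLE = chr(12) + chr(10) + "".join(PAIRS)


def two_letter(token: int) -> str:
    """Look up an extended two-letter token in the combined table."""
    if not 215 <= token <= 255:
        raise ValueError("two-letter tokens are in the range 215 to 255")
    offset = (token - 215) * 2   # index 0-40, doubled for two-byte entries
    return TABLE[offset:offset + 2]


print(two_letter(216), two_letter(228), two_letter(255))
```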

Random tokens: [n?]
-------------------

Random tokens are encoded with values in the range 91-128. When DASC is asked to print a random token in this range, it subtracts 91 from the token number to get a number in the range 0 to 37, and then it fetches the corresponding entry from the table at MTIN, adds a random number in the range 0-4 to this number, and calls DETOK to print that token.

The ERND macro, which we use to encode random tokens in the TKN1 and RUTOK tables, takes an argument between 0 and 37, which corresponds to the lookup value in MTIN.

Random tokens are used to generate the extended descriptions for each system. For example, the entry at position 14 in the MTIN table (counting from 0) is 66, so ERND 14 will expand into a random token in the range 66-70, i.e. one of "JUICE", "BRANDY", "WATER", "BREW" and "GARGLE BLASTERS".
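
The arithmetic can be sketched in Python as follows. The MTIN entry is passed in as a plain number rather than looked up, and Python's random module stands in for the game's own random number generator, so this shows the shape of the calculation rather than the exact values the game would produce. Both function names are made up for the example.

```python
import random


def mtin_index(code: int) -> int:
    """Convert a random token's character code into an MTIN index."""
    if not 91 <= code <= 128:
        raise ValueError("random tokens are in the range 91 to 128")
    return code - 91


def pick_token(mtin_entry: int, rng: random.Random) -> int:
    """Add a random number in the range 0-4 to the MTIN entry."""
    return mtin_entry + rng.randrange(5)


# Seeding the generator first makes the choice repeatable, which is the
# same idea the game relies on to give each system a fixed description
first = pick_token(66, random.Random(1234))
second = pick_token(66, random.Random(1234))

print(mtin_index(91), mtin_index(128))
print(66 <= first <= 70, first == second)
```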

Jump tokens: {n}
----------------

Jump tokens do exactly that - they call subroutines instead of being printed. The jump token is a very powerful token type, and implements all sorts of functionality, from drawing boxes and setting letter case, to justifying text and fetching input from the keyboard.

Jump tokens are in the range 1 to 31, though tokens 20 and 31 don't do anything. The best way to work out what each token does is to visit the relevant routine in the source. Here is a list of jump tokens, along with the subroutines that they call (the MT routines are listed in more detail below):

Jump token	Shorthand in documentation	Routine
1	{all caps}	MT1
2	{sentence case}	MT2
3	{selected system name}	TT27
4	{commander name}	TT27
5	{extended tokens}	MT5
6	{standard tokens, sentence case}	MT6
7	{beep}	DASC
8	{tab 6}	MT8
9	{clear screen}	MT9
10	{lf}	DASC
11	{draw box around title}	NLIN4
12	{cr}	DASC
13	{lower case}	MT13
14	{justify}	MT14
15	{left align}	MT15
16	{drive number}	MT16
17	{system name adjective}	MT17
18	{random 1-8 letter word}	MT18
19	{single cap}	MT19
20	Unused	DASC
21	{clear bottom of screen}	CLYNS
22	{display ship, wait for key press}	PAUSE
23	{move to row 10, white, lower case}	MT23
24	{wait for key press}	PAUSE2
25	{incoming message screen, wait 2s}	BRIS
26	{fetch line input from keyboard}	MT26
27	{mission captain's name}	MT27
28	{mission 1 location hint}	MT28
29	{move to row 6, white, lower case}	MT29
30	{white}	WHITETEXT
31	Unused	DASC
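
The table above amounts to a dispatch table indexed by the jump token number. Here's a Python sketch of that lookup, using the routine names as stand-ins for the addresses that the real JMTB table holds. The function name routine_for is invented, and the sketch makes no claim about how the table is laid out in memory.

```python
# Routine names for jump tokens 1 to 31, copied from the table above
JMTB = (
    "MT1 MT2 TT27 TT27 MT5 MT6 DASC MT8 MT9 DASC "
    "NLIN4 DASC MT13 MT14 MT15 MT16 MT17 MT18 MT19 DASC "
    "CLYNS PAUSE MT23 PAUSE2 BRIS MT26 MT27 MT28 MT29 WHITETEXT "
    "DASC"
).split()


def routine_for(token: int) -> str:
    """Return the name of the routine that a jump token calls."""
    if not 1 <= token <= 31:
        raise ValueError("jump tokens are in the range 1 to 31")
    return JMTB[token - 1]


print(routine_for(1), routine_for(11), routine_for(30))
```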

Here's a list of MT routines that implement the bulk of the jump token functionality. The number of the MT routine corresponds to the jump token that triggers that routine.

Routine	Function
MT1	Switch to ALL CAPS when printing extended tokens
MT2	Switch to Sentence Case when printing extended tokens
MT5	Switch to extended tokens
MT6	Switch to standard tokens in Sentence Case
MT8	Tab to column 6 and start a new word when printing extended tokens
MT9	Clear the screen and set the current view type to 1
MT13	Switch to lower case when printing extended tokens
MT14	Switch to justified text when printing extended tokens
MT15	Switch to left-aligned text when printing extended tokens
MT16	Print the character in variable DTW7
MT17	Print the selected system's adjective, e.g. Lavian for Lave
MT18	Print a random 1-8 letter word in Sentence Case
MT19	Capitalise the next letter
MT23	Move to row 10, switch to white text, and switch to lower case when printing extended tokens
MT26	Fetch a line of text from the keyboard
MT27	Print the captain's name during mission briefings
MT28	Print the location hint during the mission 1 briefing
MT29	Move to row 6, switch to white text, and switch to lower case when printing extended tokens

The MT routines use a number of extended print flags to store the current text state. The best way to work out what each flag does is to read the relevant variable's header in the source. They are as follows:

Location	Function	Set by
DTW1	A mask for applying the lower case part of Sentence Case to extended text tokens	MT1, MT2, MT13
DTW2	A flag that indicates whether we are currently printing a word	CLYNS, DASC, MT8, TTX66
DTW3	A flag for switching between standard and extended text tokens	MT5, MT6
DTW4	Flags that govern how justified extended text tokens are printed	MT14, MT15, MESS
DTW5	The size of the justified text buffer at BUF	DASC, MESS, MT14, MT15, MT17
DTW6	A flag to denote whether printing in lower case is enabled for extended text tokens	MT1, MT2, MT13
DTW7	Contains the character printed by MT16	CATS
DTW8	A mask for capitalising the next letter in an extended text token	DASC, MT19
