Saturday, July 13, 2024

Pretty pictures, bootable floppy disks, and the first Canon Cat demo?

Now that our 1987 Canon Cat is refurbished and ready to go another nine innings or so, it's time to get into the operating system and pull some tricks.
As you'll recall from our historical discussion of the Canon Cat, the Cat was designed by Jef Raskin as a sophisticated user-centric computer but demoted to office machine within Canon's typewriter division, which was tasked with selling it. Because Canon only ever billed the Cat as a "work processor" for documents and communications, and then abruptly tanked it after just six months, it never had any commercially produced software packages. In fact, I can't find any software written for it other than the original Tutor and Demo diskettes included with the system and a couple of Canon-specific utilities, which I don't have and which don't seem to be imaged anywhere.

So this entry will cover a lot of ground: first, we have to be able to reliably read, image and write Canon disks on another system, then decipher the format, and then patch those images to display pictures and automatically run arbitrary code. At the end we'll have three examples we can image on any PC 3.5" floppy drive and insert into a Cat, turn it on or hit DISK, and have the Cat automatically run: a Jef Raskin "picture disk," a simple but useful dumb terminal, and the world's first (I believe) Canon Cat, two-disk, slightly animated and finely dithered, slideshow graphics demo!

But before we get to pumping out floppy disks, we first need to talk about how one uses a Cat. And that means we need to talk a bit about Forth, the Canon Cat's native tongue. And that means we should probably talk more about the operating system too. Oh, and we should go down to CompUSA and buy a brand new floppy drive while we're at it.

For those of you without a Cat, which sadly will be most of you, at least a little of this entry can be done in MAME, which is at present the sole extant Cat emulator. (A Wasm version with ROMs ready to go is at the Internet Archive, though here I'll be running it on my Raptor Talos II in Fedora.) Unfortunately MAME emulates neither the internal disk drive nor the serial port, so you won't be able to run these demos on it yet, and there's a couple glitches we need to work around, but this will still give you a flavour of the machine and let you poke around a bit.

Everything in this entry assumes the production v1.74 ROMs, which were to the best of my knowledge the only version mass-produced Cats were ever officially shipped with. Source code and binaries for v2.40 ROMs exist but my Cat doesn't run them. If you want to run the v2.40 ROMs anyway, don't be surprised if some of this doesn't work right, and I'm very sure the demos won't.

The major glitches to rectify are the emulator freezing when the beeper is triggered (probably incomplete emulation of one of the custom Canon gate array chips) and switching the keyboard layout so we can type angle brackets (unpossible with the default US layout, also true of a real Cat). Let's fix those issues first.

The default screen after power on. This is where you would enter a document, which we'll get to in a second. These MAME grabs (captured at the system's raw 672x344 screen resolution) have been corrected for aspect ratio so that they'll look approximately the same as they would on the Cat's CRT display.
The Cat's interface is entirely keyboard-driven with no mouse, joystick or even cursor keys. To perform special functions, the USE FRONT key acts like a Command or Control key when combined with others. Here, we'll enter the setup menu by holding down USE FRONT (in the default MAME key layout, this is probably Control) and pressing SETUP, which is usually the left brace key. Keep USE FRONT/Control depressed after you release the SETUP key.
The first setup screen then appears. With USE FRONT/Control still down, tap SETUP/left brace again for the second screen. Do not release USE FRONT/Control yet.
The two settings we'll want to change are the keyboard and problem signal. The keyboard is most likely set to United States (or as appropriate to your own locale), but we'll need it set to true ASCII for hacking purposes. With USE FRONT/Control still down, repeatedly tap either of the LEAP keys (usually mapped to Alt/Option) to cycle through the choices until "ASCII" is shown. If you started at United States, tap the left LEAP/Alt/Option key for Dvorak and again for ASCII, and that should do it, but do not let go of USE FRONT/Control yet.

Then, to change the problem signal, with USE FRONT/Control still down, tap SPACE twice to move to that, and tap LEAP until "Flash" is shown (and only flash, none of the beep options). Your screen should look like the grab above. Assuming it does, now you can let go of USE FRONT. After a brief pause the settings will be written to the battery-backed settings RAM and the Cat will go back to the main document.

So that you understand what we'll be trying to accomplish (and, eventually, subvert), let's have a quick look at what makes the Cat's interface unique.

A tour of the Cat

The Cat's operating system is entirely in ROM, so there's no waiting to start — though if you have a disk in the drive, and the machine generally expects you will so that your workspace can be autosaved, it will load when you power it on. Either way, you can start entering text almost immediately. What looks like two cursors will track your typing, but what they really are is the selection (inverse video) and the insertion point (grey block). The selection is what is deleted immediately by pressing ERASE (backspace).

At the bottom of the screen is the ruler. This shows your document margins and your tab stops, though in characters, not inches — the Cat uses fixed-width, non-proportional characters. Paragraph style, line spacing and the keyset in use (I or II, for those keys with multiple character pairs assigned to them) are at the bottom, along with a memory usage gauge and the current line number within the current document.

The Cat doesn't have a filesystem, nor even a concept of one. Instead, all documents are subsumed into one big overarching workspace, each workspace being identified by a unique key stored to its matching disk (so that you can't overwrite a disk with the wrong workspace, and you won't lose unsaved text with a new disk — that is, unless you force it to do those things). Disk and memory permitting, you can create as many documents as you like with the DOCUMENT/PAGE (typically Page Down) key. They can be individually titled so you can scan the list to see what's present, and carry their own independent margin and document settings. The Cat was never shipped with a memory capacity greater than what its floppy drive could store, so one disk always equals one workspace.
How do you navigate without cursor keys or a mouse? The LEAP keys, when held down, search either backwards (left) or forwards (right) in the workspace based on what you type. You can LEAP by characters, or paragraphs (LEAP to RETURN or RETURN-RETURN), or even documents (LEAP-DOCUMENT). SHIFT-LEAP can be used to scroll the screen line by line.

LEAPing also is how you mark text. If you LEAP first to the beginning of a block of text and then LEAP to the end of it, then press both LEAP keys, that text is selected. Besides applying styles like boldface or all-caps to it, you can move it by LEAPing elsewhere, or copy it and move the copy, or delete it. A single level of undo is available with the UNDO key.

Selecting text has other functions too. When I say everything goes in the workspace, I do mean everything. The Cat is designed to be collaborative: you can hook up your Cat to a phone line, or at least you could when landlines were more ubiquitous, and someone could call in and literally type into your document remotely. If you dialed up a service, you would type into the document and mark and send text to the remote system, and the remote system's response would also become part of your document. (That goes for the RS-232 port as well, by the way. In fact, we'll deliberately exploit this capability for the projects in this article.)

Raskin's intention was that the document should handle dynamic text as fluidly as static text (if Canon had let him and the firmware permitted, this would undoubtedly have been how graphics were handled also), and that such text should be computable on the fly. Let's compute my favourite easy-to-remember approximation of π. Type 355/113 and highlight it with the LEAP keys: hold down LEAP LEFT and type 3 (jumps to the first three), then hold down LEAP RIGHT and type 3 again (jumps to the last three), then press both LEAP keys.

We can now tap USE FRONT-CALC (the G key), a little CALC icon will briefly appear, and suddenly the result of the computation is displayed: 3.14. (The Cat defaults to two decimal digits of displayed precision but this is configurable.) The dotted line underneath it indicates it is a generated value.
That's handy as a desk calculator, and the syntax supports operator precedence and parentheses as well as functions like sqrt, but the really fun part is when we use this to store values in memory. Let's define a variable pi with the same calculation 355/113. We enter pi:355/113 and then highlight it with the LEAP keys (leap left to the p, leap right to the 3, then with LEAP RIGHT still held down press USE FRONT-LEAP RIGHT to leap again to the second three, then with LEAP RIGHT still held down tap LEAP LEFT and finally release both keys).
The same result appears, but we've now stored that as an internal variable pi. We'll do the same for radius and define that to be 6.
With those variables computed, we can now use them in other calculations. Anybody for πr²? We enter pi*radius*radius and calculate that.
The correct answer to the given precision is shown.
As you may have already guessed, those values we entered aren't fixed: they can be changed and recalculated. Let's change radius to 10 by tapping LEAP LEFT until we're back in that field (or LEAP LEFT to the 6). If we press USE FRONT-CALC while within the result of a calculation, the expression you entered to generate it is preserved and now can be edited.
We change the 6 to a 10 and press USE FRONT-CALC to recalculate its result, which in turn will update the variable and recompute anything depending on that variable.
And thus our pi*radius*radius duly changes as well. If we changed the name of the variable or entered other nonsense or did something like divide by zero, the Cat will calmly display question marks to indicate any dependent result is invalid, and you can edit the expression with CALC again to fix it.
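For those following along without a Cat, the recalculation behaviour above can be mimicked in a few lines of Python. This is only a toy model of the idea and not how the ROM implements CALC; the names recalc and show are hypothetical.

```python
# Toy model of CALC-style named expressions (illustrative only; this is
# not how the Cat's ROM implements CALC, and all names are hypothetical).

def recalc(expressions):
    """Evaluate {name: expression} and return {name: value or None}."""
    values = {}
    pending = dict(expressions)
    progressed = True
    while pending and progressed:
        progressed = False
        for name, expr in list(pending.items()):
            try:
                values[name] = eval(expr, {"__builtins__": {}}, dict(values))
            except NameError:
                continue              # depends on a name not computed yet
            except (ArithmeticError, SyntaxError, TypeError):
                values[name] = None   # invalid, like the Cat's question marks
            del pending[name]
            progressed = True
    for name in pending:              # names that could never be resolved
        values[name] = None
    return values

def show(value):
    """Two decimal places by default, or question marks if invalid."""
    return "????" if value is None else f"{value:.2f}"

sheet = {"pi": "355/113", "radius": "6", "area": "pi*radius*radius"}
print(show(recalc(sheet)["area"]))    # area with radius 6

sheet["radius"] = "10"                # edit one expression...
print(show(recalc(sheet)["area"]))    # ...and the dependent result follows
```

Every result is recomputed from its stored expression rather than cached, which is the simplest way to make a change ripple through to anything that depends on it.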

This process can be slow if there are lots of things to compute, and the 5MHz 68K isn't exactly a speed demon. Pressing any key while the CALC icon is displayed will cancel the operation, true for any such task including printing except DISK (for obvious reasons) and one more to be discussed.

Does this sound a little like a spreadsheet? I'm so glad you asked! Because Cat text is fixed-width, setting up consistent rows and columns is simply a matter of tab stops and character width, adjusting the document margins if necessary. Tabs separate cells and the editor treats them as such with the function use().

Let's enter a simple spreadsheet where we will compute the tax on a sale of widgets and then total the transaction. We'll specify unit price and number sold, and compute a fixed tax of 10% (though this could just as easily be a variable). There are two tab stops between the Unit Price and the Sales, so the expression becomes use(-3)*use(-1)*0.10 to compute unit price times sales times 10%.

That yields the tax.
To get the total, we will now reference that cell and add it to the sales price with use(-1)+(use(-2)*use(-4)) (notice that everything is relative; positive values are also valid).
Even though the unit price and sales count are simply plain text we entered and didn't originate as computed values, we can still change them and press CALC to recalculate the miniature spreadsheet. If we wanted to address a different row, we add a second argument to use() for the Y delta (such as use(-1 1) for the value in the next row but the previous column; the arguments to the function are separated by spaces). All of your variables and expressions are saved to disk as part of the document.
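The relative addressing is easier to see outside the editor. Here is a minimal Python sketch of the same idea, assuming one row of tab-separated cells; make_use and the sample figures (a unit price of 2.50 and 4 units sold) are hypothetical and not taken from the screenshots.

```python
# Sketch of relative cell addressing in the spirit of use(); the helper
# names and the sample figures below are hypothetical.

def make_use(grid, row, col):
    """Return a use(dx, dy=0) function relative to the cell at (row, col)."""
    def use(dx, dy=0):
        r, c = row + dy, col + dx
        if r < 0 or c < 0:
            raise IndexError("reference falls outside the sheet")
        return float(grid[r][c])
    return use

# One row of tab-separated text: unit price, an empty cell, units sold.
cells = "2.50\t\t4".split("\t")          # ['2.50', '', '4']
grid = [cells]

# The tax cell is the next one to the right (column 3).
use = make_use(grid, 0, 3)
tax = use(-3) * use(-1) * 0.10
cells.append(f"{tax:.2f}")

# The total cell follows the tax cell (column 4).
use = make_use(grid, 0, 4)
total = use(-1) + (use(-2) * use(-4))
cells.append(f"{total:.2f}")

print(cells)
```

Because each reference is an offset from the cell doing the asking, the same expression can be pasted into another row and will pick up that row's values.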
Of course, you can do other sorts of lists with it ... or sort other sorts of lists with it. Here is a list of random, commonplace, everyday words you regularly use in conversation. We'll highlight them and then press USE FRONT-SORT (usually the comma key).
And presto, a dictionary-sorted list. Lists have other applications, naturally, like mail merges. The LEARN facility provides keystroke macros, so with a list and some unique field sequences you could implement such a scheme in a kludgey but effective fashion.
It would be a gross distortion to say every nuance of the Cat is discoverable and/or self-explanatory. In fact, some procedural aspects of it, while sensible once you learn them, are not at all obvious.

But there is ample online help if you hold down USE FRONT/Control and press HELP (by default the N key). This is the default screen if there is no pending error state. If there were an error, such as a problem reading the disk, the Cat will beep politely and pressing USE FRONT-HELP will instead explain the error in prose. (Raskin hated modals.)

Either way, once in the help facility, you can release the HELP key and, while USE FRONT/Control is still down, press a key you want information about. Like, here's help on ... HELP.

Here are the other keys we looked at before, SETUP, CALC and SORT. Every bit of their text is also built-in to ROM.

You can also get a hidden credits screen from here.
In the document, hold down LEAP LEFT, then hold down SHIFT, then type Q W E R A S D F Z X C V (nothing will appear), then release SHIFT, release LEAP, and finally press USE FRONT-EXPLAIN. Good to see a real doctor doctor on the team.

Sallying tForth into the ROM

Anyway, except for that easter egg, everything you saw was what Canon wanted you to see — a sophisticated word processor. But what Jef Raskin wanted you to see, and Canon didn't, was a general-purpose computer that just happened to use a word processor-like primary interface, complete with a built-in programming language. That language is Forth, as descended from the original Swyft, or more accurately tForth, an implementation using tokens (the "t") instead of direct addresses for each word in a word definition. This requires a level of indirection to look up the execution address when a word is executed, a semantic difference between this concept and a bytecode approach, but it also means that words can be redefined or even deleted without leaving stale pointers in unrelated words that reference them. More importantly, code can be considerably more compact because tokens don't necessarily need to be the size of an entire address (Cat tokens are byte-sized, though often taken as word-sized groups) and the code they reference can be moved and compacted. This was important given the 256K of memory the Cat shipped with, which wasn't a huge amount even in the mid-1980s and had to be shared with the video circuitry.
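To make the token idea concrete, here's a conceptual sketch in Python. It models only the indirection (a table mapping tokens to implementations) and is not tForth's actual data layout; every name in it is hypothetical.

```python
# Token-threaded dispatch in miniature (a conceptual sketch, not tForth's
# actual data structures; every name here is hypothetical).

token_table = []   # token number -> implementation
tokens = {}        # word name -> token number

def define(name, impl):
    """Create or redefine a word; an existing word keeps its token."""
    if name in tokens:
        token_table[tokens[name]] = impl
    else:
        tokens[name] = len(token_table)
        token_table.append(impl)
    return tokens[name]

def compile_words(names):
    """Turn a list of word names into a compact list of tokens."""
    return [tokens[n] for n in names]

def execute(token, stack):
    impl = token_table[token]      # the extra lookup on every execution
    if callable(impl):
        impl(stack)                # primitive word
    else:
        for t in impl:             # threaded word: a list of tokens
            execute(t, stack)

define("dup", lambda s: s.append(s[-1]))
define("*", lambda s: s.append(s.pop() * s.pop()))
define("square", compile_words(["dup", "*"]))
define("fourth", compile_words(["square", "square"]))

stack = [3]
execute(tokens["fourth"], stack)
print(stack)                       # 3 to the fourth power

# Redefine square (deliberately as doubling). The token list for fourth is
# untouched, but it picks up the new behaviour through the table.
define("+", lambda s: s.append(s.pop() + s.pop()))
define("square", compile_words(["dup", "+"]))
stack = [3]
execute(tokens["fourth"], stack)
print(stack)
```

The caller stores only small token numbers, so nothing in it goes stale when the word behind a token changes or moves.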

Nevertheless, although officially undocumented, a gateway to Forth remains in the Cat. It is triggered by a straightforward — albeit unlikely — series of keystrokes, which works in both MAME and the real hardware. Start out by typing Enable Forth Language (easiest in a blank document or workspace).
Highlight it by pressing both LEAP keys.
Now press USE FRONT-ANSWER (usually your backspace or [Mac] delete key). The bottom of the screen will flash and/or a real Cat will beep, and a little "FORTH" icon will briefly light and then disappear.
When the FORTH icon has disappeared, press SHIFT-USE FRONT-SPACE. Nothing will appear to happen until you hit RETURN/ENTER a few times ... and get a Forth ok prompt.
Lock this change in by typing the magic incantation -1 wheel! savesetup re and press ENTER. This enables "expert mode" and saves it to battery-backed settings RAM, then returns to the editor. Now the Forth mode is unlocked, and the hacking can begin. In future you won't need to do the "Enable Forth Language" dance again as long as your settings RAM or battery doesn't get whacked.

Forth is accessed in two ways, one of which you just saw for entering commands at a traditional REPL-like prompt. This is the mode we will primarily be using, and is fully supported by the operating system, but the other fashion is clearly more Raskinesque. Remember how you could highlight expressions and have the Cat compute them?

Well, now that works for Forth, but instead of USE FRONT-CALC, you'll use USE FRONT-ANSWER. This article will not seek to teach you Forth (the online version of Starting Forth is a far better instructional tool than I could ever write), and I certainly do not profess to have superior fluency in it myself. However, for illustrative purposes I'll be using relatively simple code that most people should be able to grok at a conceptual level. Just remember that since Forth is the canonical stack language, everything gets pushed and popped to stacks, and as such control structures and arithmetic appear "backwards" compared to other programming languages. (Forth's famous use of reverse Polish notation is thus simply a logical consequence of the language.)

In any event, upon pressing USE FRONT-ANSWER ...

... the answer is computed (1 2 3 + +, i.e., 3 plus 2, and that result plus 1) and printed (.), the editor captures the output, and then inserts the result: six.
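If stack evaluation is new to you, this minimal Python sketch walks the same expression. It understands only numbers, a few arithmetic words and the printing word, and it treats numbers as hexadecimal on the assumption that it should match the Cat's default base; run is a hypothetical name.

```python
# A tiny stack evaluator for expressions like "1 2 3 + + ." (a sketch of
# the evaluation order only, nowhere near a real Forth).

def run(source):
    """Evaluate space-separated words; return what '.' printed."""
    stack, out = [], []
    for word in source.split():
        if word in ("+", "-", "*"):
            b, a = stack.pop(), stack.pop()   # top of stack is the right operand
            stack.append({"+": a + b, "-": a - b, "*": a * b}[word])
        elif word == ".":
            out.append(format(stack.pop(), "x"))   # print in hex
        else:
            stack.append(int(word, 16))            # numbers are entered in hex
    return " ".join(out)

print(run("1 2 3 + + ."))
```

Each number is pushed as it is read, and each operator consumes the top two entries, which is why the operators trail their operands.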
Interestingly, if you try to ask the online help for information on the ANSWER key, you just get the ERASE key help page no matter what you do. ANSWER isn't even referenced in the user manual. While ANSWER does appear on the front of the key just like any other key you can USE FRONT with, it does nothing until you enable Forth, and many or even most of its contemporary users likely never pondered what it was there for.

Aside: what if you make an error?

Say we entered bogus . instead, a Forth word you can take on faith is not built-in to the Cat, and USE FRONT-ANSWER with that. The FORTH icon lights up and flashes, and there is no output. Here, USE FRONT-HELP will tell you what it didn't like:
This is tForth for "I don't know that word." Forth has a reputation as a language so close to the metal that one wrong move will bring the system down. I've personally experienced that writing for the 68K Mac in Pocket Forth, where unbalancing the stack can lead to a system error faster than you can say guru meditation. Yes, Forth programmers are Real Men, Manly Men, Speakers of the One True Manly Language, Men on the Bleeding Edge, who point and laugh uproariously at you toddling along with your IDE and your wussy little C compiler. (Or womenly women, or neither. This blog is totes egalitarian.)

So here's a little warning: tForth is not nearly that hostile, but while it may have more guardrails than the typical Forth implementation, that's not saying very much. Although the ROM is relatively resistant to rookie mistakes and I'm not aware of anything that can permanently brick the logic board, you can damage things (like getting the track and address transposed when you command the disk drive to seek!), you can't interrupt a Forth operation if you inadvertently cause an infinite loop, you can freely trash the system and anything in memory, and you can lock up the system so severely that nothing will fix it but a powercycle. To borrow a quote from Terry A. Davis, tForth "is a motorbike. If you lean over too far, you'll fall off. Don't do that." This particular minefield won't blow off your arms, but it might throw you on the ground pretty hard.

Anyway, we'll return to our previous example and do some more messing around in MAME before we switch over to the real system. Press SHIFT-USE FRONT-SPACE and type the Forth word page to both clear the screen and home the cursor (cls home also works).

I should note parenthetically that on a real Cat you can soft-reset the system by typing cold at the ok prompt. It's very handy for reducing wear on the power switch, but doesn't seem to work in MAME.

A fair bit of the fundamentals on which this article is based come from the copious technical documentation on the platform, collected and preserved at canoncat.net. Some of it actually hails from the Swyft's development, but the Cat is so similar to its ancestor that most of its content is still relevant. We'll then build on this basis for the tricks we'll pull.

Here's the memory map from the Cat tForth manual, which I have marked up and corrected according to production models. The stock Cat officially comes with 256K of main memory plus 8K of battery-backed RAM for storing settings and the user dictionary. The battery-backed RAM was apparently 16K in earlier iterations, and is 16K in MAME, but my system just has an 8K SRAM and only an 8K SRAM is described in the official service manual. 384K is the greatest amount of main RAM supported by the hardware.

The 256K system ROM is located at the bottom of the address space with the lowest 1K being the 68K exception vector table, and above that the battery-backed settings RAM from $040000 to $041fff. The main RAM starts at $400000, which is also where the 28896-byte (~28K) 672x344 monochrome display bitmap is stored. From $600000 and up is the memory-mapped I/O range, but we will rarely manipulate those devices directly.

From the Forth prompt you can do any Forth operation, including defining new words, though long lines don't wrap and complex words are better defined within the editor. Here we're storing a simple vertical strip of bytes to video memory as a "hello world" (20 0 do i i 54 * 400000 + c! loop). The framebuffer is linear and byte-addressed with 84 bytes for each 672 pixel line, and is white on black (i.e., set pixels are white, unset pixels are black). Note that all values displayed and entered are in hexadecimal by default.
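For anyone who doesn't read Forth, the loop above translates roughly as follows. This Python sketch models the framebuffer as a plain byte array at the base address given in the memory map; poke is a hypothetical stand-in for c!, and nothing here touches real hardware.

```python
# The "hello world" strip, replayed against a simulated framebuffer.
VRAM_BASE = 0x400000
WIDTH, HEIGHT = 672, 344
BYTES_PER_LINE = WIDTH // 8                 # 84, which is $54 in hex

framebuffer = bytearray(BYTES_PER_LINE * HEIGHT)

def poke(addr, value):
    """Store one byte, like Forth's c! (value addr c!)."""
    framebuffer[addr - VRAM_BASE] = value & 0xFF

# 20 0 do i i 54 * 400000 + c! loop   (all numbers in hex)
for i in range(0x20):
    poke(VRAM_BASE + i * 0x54, i)

print(len(framebuffer))                     # size of the display bitmap in bytes
print(framebuffer[5 * BYTES_PER_LINE])      # leftmost byte of line 5
```

Each pass writes the loop counter into the leftmost byte of successive lines, so the strip is eight pixels wide and 32 lines tall with the bit pattern counting upwards as it goes down.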
Usually, the next thing you do in a new Forth environment after your "hello world" or moral equivalent is ask what other words are around. But here, by default, nothing shows.
tForth has the concept of vocabularies, which are organized into a tree and a search order, and the user can define additional ones. The word existing tells us the present vocabularies, which are function, arithmetic, user, hidden and forth, all of which descend from forth, which is the root. As shipped, however, this scheme is not greatly exploited: the arithmetic vocab is completely empty, there are only a handful of math and stat operations in function, and you obviously have to add your own to user. Most of the ROM word action occurs in hidden, which contains the words for the editor (its words are hidden from listings, but are all executable), and in forth, from which all blessings flow. Although sixteen vocabularies may be in play at any time, it doesn't look like this capability was ever taken to its fullest extent.
We'll mostly deal with the forth vocabulary, and the user vocabulary, of course, because we'll be adding words of our own. However, the hidden vocabulary is absolutely present and available and you needn't explicitly open a vocabulary simply to use the words in it.

Also, you don't need to try to read every word on this screen and extract them from ROM either because I've already done it for you. Go ahead and open up the GitHub Catbox project and refer to the handy list of words, including everything in the hidden vocab. We'll be using tools from the Catbox to do more spelunking in the ROM very shortly.

A couple more things about tForth before we get to talking about the disk format. The ' word (a literal apostrophe, say "tick") tells you the token for any word, including words in ROM. As usual for Forth, all words are (at least initially) executable machine code, though some words are entirely so. You can use the token to do a lookup and find where that code begins, or you can use the word c' ("c-tick") which will do that for you. The lookup is done in the token table, which exists in RAM and whose location can potentially vary, though on a 1.74 system the first address reliably seems to remain at $410400 (not $40ec00 as in the memory map). The token table gets initialized and copied into RAM as part of startup, making it possible to patch ROM words if necessary to point to new in-RAM definitions. Spoiler alert: this will be necessary. Here, we've dumped some of the memory at cold's execution address to demonstrate it's written in 68K assembly, as you would expect for something that cold-starts the machine.
But not every word in ROM is in assembler, just ones where performance was important or direct code merely more expedient. The save word, which we'll be exploring a lot more later on, is written in tForth. We know this because the first 68K opcode to get executed is $4ed3 or jmp (a3), jumping to the address in 68K register A3, which during tForth execution contains the address for the nesting routine. This block of native code saves the tForth equivalent of the return address, computing it as a delta from the calling word to the new word. It then does some other housekeeping, sets the new tForth instruction pointer in A5 to the byte after the jmp (a3) and falls through to the very critical next routine (kept in A4), which grabs the execution address for the next word at the A5 instruction pointer and jumps into it.
For housekeeping, words in a vocabulary have headers, which contain an encoded offset into the starting address table and the word's ASCII string with a length and flags byte. Only the first 32 bytes of a word's ASCII string are considered significant. The header for a word can ordinarily be obtained with n' ("n-tick") but words in ROM don't appear to have one. (In fact, they do, but the routines intentionally? don't look at that portion of ROM.)
However, new words we define certainly have headers, like this more canonical "hello world" example (: hello ." hello world" cr ;). Here we can see it was given a token, see its execution address (starting with jmp (a3)), and see its header containing the encoded offset, then the length-and-flags byte, and then the string hello, plus some trailing garbage (the entry ends right after the string). The encoded offset and flags are mutable and can be changed later.

That's a sufficient understanding of tForth under the hood for what's coming up.

Reading and writing Cat disks

In our previous entry on the Canon Cat hardware, we shored up the Cat's internal floppy drive, which is a Canon-manufactured device specific to it. However, the disks it uses are ordinary 3.5" floppy disks, and modern low-level imaging tools can read them in a standard PC drive.

The basic rudiments of the format have been known for a while thanks to Dwight Elvey (if Raskin was the father of the Cat, then Elvey would surely be its godfather), and his notes indicate the drive is writing regular 512-byte sectors, 10 per track and 80 tracks per disk, single-sided. (Atari ST enthusiasts will find this format familiar. While the ST by default stores only nine sectors per track, it can store ten, as on titles like Dungeon Master.) That's 400K a disk, and since the Cat only shipped with 256K and had additional RAM sockets only for 384K total, the entirety of the supported RAM will fit. Both double-density and high-density 3.5" floppies apparently work, though I have plenty of new-in-box DD floppies here and we'll use those.
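As a quick sanity check on those numbers, here is the arithmetic in Python, using only the geometry and RAM sizes quoted above (the constant names are mine).

```python
# Disk capacity versus the most RAM a Cat can hold.
SECTOR_BYTES = 512
SECTORS_PER_TRACK = 10
TRACKS = 80                      # single-sided, so one head only

disk_bytes = SECTOR_BYTES * SECTORS_PER_TRACK * TRACKS
max_ram_bytes = 384 * 1024       # 256K stock plus the expansion sockets

print(disk_bytes, disk_bytes // 1024)      # bytes per disk, and the same in K
print(disk_bytes - max_ram_bytes)          # bytes to spare with RAM maxed out

# For comparison, the same disk at the ST's default nine sectors per track.
print(SECTOR_BYTES * 9 * TRACKS // 1024)
```

Even a fully expanded Cat leaves 16K of headroom on the disk, while the nine-sector layout would come up 24K short of a 384K machine.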

Since I wanted nice clean disks written on a good quality drive for my Cat, I decided to go down to CompUSA and buy a new floppy drive too.

Yeah, okay, eBay. But while I was looking for a new shrink-wrapped drive I ran across this NOS CompUSA-boxed floppy drive package with additional floppy disks and even a cleaning kit. The funny part is I do remember buying one of these back in the day! I just don't know where it went, and I wanted a new one anyhow.
The drive is one of the very common Teac FD-235 series. They are very easy to find used, and, it turns out, new as well. The disk cleaner kit went immediately to doing a few cycles on the Cat drive itself even though the head looked pretty good when I inspected it.
For our disk imaging we'll use an off-the-hobbyist-shelf and inexpensive Greaseweazle. These USB-based devices have open source software for Linux, Windows and macOS and are available from multiple sellers.

Sadly, here's where we leave MAME behind, since it doesn't support the floppy drive or disk images yet. Remember us typing Enable Forth Language? Let's make a disk out of that on the real Cat. The DISK help screen looks like this:

In short, if you have a blank disk in the drive and have unsaved text in memory, then pressing USE FRONT-DISK will cause the Cat to format the disk and save your document. However, the tForth documentation also explains that all Forth words in RAM get written as well, which is critical for our hackery. Ordinarily there wouldn't be any, of course, but it would if there were. Correspondingly, DISK will also load a document and any associated words from a new disk, though any disk access regardless of direction is still centered on the editor: the Cat doesn't run applications in the traditional sense. (Spoiler alert: hang tight.)

That's not all it's saving, though. If you read the DISK help carefully, you'll note it says something about showing "a sample of what's on the disk" if you have a different disk present. Ignoring the question right now of how it determines which disk goes with what workspace, how can it show the document on the floppy disk if the entirety of RAM could be on it? Wouldn't it overwrite the document in memory?

Let's dump this disk and investigate. It's possible that the Canon drive uses some weird-ass GCR format, but it turns out to our delight that it's regular MFM. Since we know it's single-sided, there's no need to check both sides, and we can stop scanning at track 80. Thus, we can get a read of the disk as a straight in-order sector dump with gw read --format=ibm-scan --tracks=c=0-79:h=0 test.img.

% gw read --format=ibm-scan --tracks=c=0-79:h=0 test.img
Reading c=0-79:h=0 revs=2
Format ibm.scan
T0.0: IBM MFM (10/10 sectors) from Raw Flux (91178 flux in 400.47ms)
T1.0: IBM MFM (10/10 sectors) from Raw Flux (91231 flux in 400.50ms)
[...]
T18.0: IBM MFM (10/10 sectors) from Raw Flux (85652 flux in 400.49ms)
T19.0: IBM Empty from Raw Flux (50039 flux in 400.50ms)
T20.0: IBM Empty from Raw Flux (50038 flux in 400.46ms)
[...]
T78.0: IBM Empty from Raw Flux (50043 flux in 400.51ms)
T79.0: IBM Empty from Raw Flux (50043 flux in 400.52ms)
Cyl-> 0         1         2         3         4         5         6         7         
H. S: 01234567890123456789012345678901234567890123456789012345678901234567890123456789
0. 0: ...................                                                            
0. 1: ...................                                                            
0. 2: ...................                                                            
0. 3: ...................                                                            
0. 4: ...................                                                            
0. 5: ...................                                                            
0. 6: ...................                                                            
0. 7: ...................                                                            
0. 8: ...................                                                            
0. 9: ...................                                                            
Found 190 sectors of 190 (100%)

An immediately interesting thing about this image dump is that each track (cylinder)'s ten sectors are numbered 0-9, not 1-10 which is more typical.

If we dump strings, we can find our text Enable Forth Language near the end of the image. This makes sense since the text is stored in the upper part of RAM, above the Forth dictionary. We can see some other strings like -1 wheel! savesetup re presumably from when we entered that command, though it appears earlier in the file.

We know that each track is 5120 bytes ($1400, 512 bytes times 10 sectors) long. The first track is 10 repeated sectors of

00000000  00 00 81 2c 00 00 db c6  00 00 74 e0 00 00 74 c6  |...,......t...t.|
00000010  00 00 74 d4 00 02 c8 ea  00 40 7a e8 00 40 7c fc  |..t......@z..@|.|
00000020  00 00 00 2c 00 00 ff 81  00 00 00 11 00 00 ff ff  |...,............|
00000030  00 41 10 10 00 02 c8 d0  00 41 00 68 00 41 01 dd  |.A.......A.h.A..|
00000040  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00000060  ff ff ff ff 20 08 33 26  00 11 ff ff ff ff ff ff  |.... .3&........|
00000070  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
00000080  01 d7 3c 0c 00 42 00 00  00 00 00 ed 00 41 40 20  |..<..B.......A@ |
00000090  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
000000a0  00 00 00 00 00 00 00 00  00 43 c0 28 ff ff ff ff  |.........C.(....|
000000b0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
000000c0  ff ff ff ff 00 42 00 08  00 43 ac 00 00 42 06 8f  |.....B...C...B..|
000000d0  00 43 ab b4 00 43 ab b8  00 42 06 8e 00 00 00 00  |.C...C...B......|
000000e0  00 00 00 00 00 41 3f b4  00 42 00 00 00 43 ac 20  |.....A?..B...C. |
000000f0  00 41 38 00 00 00 00 00  00 04 00 10 ff ff ff ff  |.A8.............|
00000100  19 52 e5 78 43 7d 72 d2  dd d2 1a 46 10 9d f7 58  |.R.xC}r....F...X|
00000110  03 dc a6 13 88 75 16 74  00 00 00 00 00 00 00 00  |.....u.t........|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000180  00 41 35 52 00 00 00 00  00 41 35 5a 00 00 00 00  |.A5R.....A5Z....|
00000190  00 41 35 62 00 00 00 00  00 41 35 6a 00 00 00 00  |.A5b.....A5j....|
000001a0  00 41 35 72 00 00 00 00  00 41 35 7a 00 00 00 00  |.A5r.....A5z....|
000001b0  00 41 35 82 00 00 00 00  00 41 35 8a 00 00 00 00  |.A5......A5.....|
000001c0  00 41 35 92 00 00 00 00  00 41 35 9a 00 00 00 00  |.A5......A5.....|
000001d0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00000200

This repeated sector is called the idblock and appears in the technical documentation for the Cat's editor. Because this is a 68K, everything is big-endian as G-d Himself intended, though a fair bit of it is unused. It contains copies of the 68000 data and address registers in the first 64 bytes, the 68000 status register, a two-byte disk format ID ($3326 here, but an earlier $3325 format is also accepted) required to identify the disk as written by a Cat, a number of tracks ($0011 == 17, though we found nineteen), various editor state variables, a 128-byte disk identifier called the idtable (that's how it knows which workspace is what, though only about 24 bytes of it are actually populated here), and then pointers for any keyboard macros that were defined.
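To make the layout concrete, here is a minimal Python sketch that pulls those fields out of an idblock sector. The offsets are my reading of the hexdump above plus the ROM disassembly later in this entry (address registers at $00, data registers at $20, status register at $64, format ID at $66, track count at $68); the function name and the dictionary keys are made up for illustration and are not part of the Catbox.

```python
import struct

# First $70 bytes of the idblock, copied from the hexdump above.
SAMPLE = bytes.fromhex(
    "0000812c0000dbc6000074e0000074c6"
    "000074d40002c8ea00407ae800407cfc"
    "0000002c0000ff81000000110000ffff"
    "004110100002c8d000410068004101dd"
    + "ff" * 32 +
    "ffffffff200833260011ffffffffffff"
)

def parse_idblock(sector):
    """Decode the register image and disk ID fields (offsets are assumptions)."""
    a_regs = struct.unpack_from(">8L", sector, 0x00)   # A0-A7, big-endian longs
    d_regs = struct.unpack_from(">8L", sector, 0x20)   # D0-D7
    sr, format_id, tracks = struct.unpack_from(">3H", sector, 0x64)
    return {"a": a_regs, "d": d_regs, "sr": sr,
            "format_id": format_id, "tracks": tracks}

info = parse_idblock(SAMPLE)
print("A7 = $%06x, format = $%04x, tracks = %d"
      % (info["a"][7], info["format_id"], info["tracks"]))
```

On the sample bytes this reports the saved stack pointer, the $3326 format ID and the 17-track count discussed above.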

The documentation notes that backup idblocks appear elsewhere on the disk as well. One of these is in the first sector of the second track. After that we see a lot of binary data with some scattered snippets of text (later comparison shows this is the contents of the SV-RAM), and then starting at offset $3c00, the first sector of the fourth track, this peculiarly familiar-looking pattern:

00003c00  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00003df0  ff ff ff ff ff ff ff ff  ff ff 55 55 55 55 55 55  |..........UUUUUU|
00003e00  55 55 55 55 55 55 55 55  55 55 55 55 55 55 55 55  |UUUUUUUUUUUUUUUU|
*
00003e40  55 55 55 55 55 55 55 55  55 55 ff ff ff ff aa aa  |UUUUUUUUUU......|
00003e50  aa aa aa aa aa aa aa aa  aa aa aa aa aa aa aa aa  |................|
*
00003e90  aa aa aa aa aa aa aa aa  aa aa aa aa aa aa ff ff  |................|
00003ea0  ff ff 55 55 55 55 55 55  55 55 55 55 55 55 55 55  |..UUUUUUUUUUUUUU|
00003eb0  55 55 55 55 55 55 55 55  55 55 55 55 55 55 55 55  |UUUUUUUUUUUUUUUU|
*
00003ef0  55 55 ff ff ff ff aa aa  aa aa aa aa aa aa aa aa  |UU..............|
00003f00  aa aa aa aa aa aa aa aa  aa aa aa aa aa aa aa aa  |................|
*
00003f40  aa aa aa aa aa aa ff ff  ff ff 55 55 55 55 55 55  |..........UUUUUU|
00003f50  55 55 55 55 55 55 55 55  55 55 55 55 55 55 55 55  |UUUUUUUUUUUUUUUU|
*
00003f90  55 55 55 55 55 55 55 55  55 55 ff ff ff ff aa aa  |UUUUUUUUUU......|
00003fa0  aa aa aa aa aa aa aa aa  aa aa aa aa aa aa aa aa  |................|
*
00003fe0  aa aa aa aa aa aa aa aa  aa aa aa aa aa aa ff ff  |................|
00003ff0  ff ff 55 55 55 55 55 55  55 55 55 55 55 55 55 55  |..UUUUUUUUUUUUUU|
00004000  55 55 55 55 55 55 55 55  55 55 55 55 55 55 55 55  |UUUUUUUUUUUUUUUU|
*
00004040  55 55 ff ff ff ff aa aa  aa aa aa aa aa aa aa aa  |UU..............|
00004050  aa aa aa aa aa aa aa aa  aa aa aa aa aa aa aa aa  |................|
*
00004090  aa aa aa aa aa aa ff ff  ff ff ff ff ff ff ff ff  |................|
000040a0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|

If you look at our screenshot for Enable Forth Language, you see two 50% dithered grey blocks above and below it. Those are the document separators. Anyone who's messed around with the video memory in a monochrome framebuffer will recognize the pattern of aa 55 as 10101010 01010101 — a 50% grey. Since we know our framebuffer is linear and contiguous, let's pull out the 28896 bytes (84 bytes per line times 344 lines) from this offset and stick a Portable Bit Map header on it. Guess what:

That's our text as a screenshot, just inverted, because PBM is black on white. As the Cat also dumps video memory when it saves to disk, the preview feature is nothing more than just loading it back. A clever idea, too: it's quick, it's reversible with a repaint and it doesn't disturb anything else in memory. In fact, we can also see when fully loading a document disk into the Cat that the preview image is loaded into video memory as part of that process too (which gives the illusion of it loading faster than it truly does). Moreover, the Cat doesn't seem to care what image data is there as long as it can read it.
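If you want to try the extraction yourself, a rough Python sketch follows. It assumes the in-order sector dump (test.img), where the framebuffer starts at offset $3c00 as in the hexdump above; an eDSK needs the sector map consulted first, so this is not what catrpic does. The function name and the invert flag are my own, the latter there because PBM treats a set bit as black.

```python
WIDTH, HEIGHT = 672, 344
FB_SIZE = WIDTH // 8 * HEIGHT          # 84 bytes per line * 344 lines = 28896

def framebuffer_to_pbm(image, offset=0x3C00, invert=False):
    """Wrap the Cat's saved video memory in a raw (P4) PBM header.

    offset is where the framebuffer sits in the in-order sector dump;
    $3c00 is taken from the hexdump above and is an assumption for other disks.
    """
    fb = image[offset:offset + FB_SIZE]
    if len(fb) != FB_SIZE:
        raise ValueError("image too short to hold a full framebuffer")
    if invert:
        fb = bytes(b ^ 0xFF for b in fb)   # flip every pixel
    return b"P4\n%d %d\n" % (WIDTH, HEIGHT) + fb

# Typical use (file names are placeholders):
#   with open("test.img", "rb") as f:
#       pbm = framebuffer_to_pbm(f.read(), invert=True)
#   with open("screen.pbm", "wb") as f:
#       f.write(pbm)
```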

So, as a test to make sure we can write our own arbitrary data to a Cat floppy, let's write something that can patch in a custom image. For the picture data, I found a photo of Jef Raskin sitting at a desk in front of a Cat, cropped and resized it to the right proportions, then crushed and dithered it out to a ready PBM with magick raskin.png -resize 672x344\! -dither FloydSteinberg -remap pattern:gray50 -negate raskin.pbm.

Now we need a disk format we can master a new floppy from. This sector dump, as easy as it was to parse, will not suffice for this purpose; there's not enough metadata about the sectors and tracks for the Greaseweazle software to create the right geometry. We'd also like a format that's well-documented and higher-level than raw flux or raw MFM because we know the Canon drive is writing renumbered but otherwise normal sectors, and it will make the disk image easier to alter. And, of course, the Greaseweazle software needs to support it. After reading through the Greaseweazle wiki's list of supported formats, I settled on the CPC extended DSK format since it has a fully documented specification and was likely to capture any unforeseen necessary nuances of the Cat's format, but still provides the actual sector data as decoded bytes. It can be captured with a command line like gw read --format=ibm.scan --tracks=c=0-79:h=0 test.edsk.

To verify viability, I captured the same floppy as eDSK and then wrote it to another blank disk (gw write test.edsk), and checked if the Cat would read that. It did!

However, moving to a higher fidelity format sometimes introduces complications. One complication while reading test disks written by the Cat was that some (not all) of them didn't actually start on physical track zero: they started on physical track two, leaving physical tracks zero and one unformatted, even though the track as written claimed to be track zero. I wasn't sure of the significance of this at the time — we'll revisit this later. Interestingly, the eDSK format also revealed that every sector on track zero is tagged as sector one, all ten of them, probably to speed up checking if the idtable on disk matches the current workspace or not since any sector will do.

The second complication is that we no longer get sectors in logical order but rather in the physical order they appear on disk, which may or may not match. The Cat doesn't seem to write tracks with any particular interleave, so to read and write the eDSK sectors we'll need to consult the Track Information Block (TIB) for each track to determine which sector is at which offset.
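As an illustration of what consulting the TIB involves, here is a small Python sketch based on the published extended DSK layout: a 256-byte disk header with a per-track size table at offset $34, then for each formatted track a 256-byte Track-Info block whose sector list starts at offset $18 with eight bytes per sector. It is not the code in catrpic or catwpic, and the function names are invented. It returns a list rather than a dictionary keyed by sector ID, since several sectors can carry the same ID.

```python
import struct

def edsk_tracks(image):
    """Yield (index, offset, length) for every formatted track in an eDSK."""
    if not image.startswith(b"EXTENDED CPC DSK File"):
        raise ValueError("not an extended DSK image")
    ntracks, nsides = image[0x30], image[0x31]
    offset = 0x100                           # first TIB follows the disk header
    for i in range(ntracks * nsides):
        length = image[0x34 + i] * 256       # size table holds length / 256
        if length:                           # zero means unformatted: no data
            yield i, offset, length
        offset += length

def edsk_sectors(image, track_offset):
    """Return [(C, H, R, N, data_offset, data_length)] in physical order."""
    tib = image[track_offset:track_offset + 0x100]
    if not tib.startswith(b"Track-Info"):
        raise ValueError("no Track-Info block at this offset")
    data_offset = track_offset + 0x100       # sector data follows the TIB
    sectors = []
    for i in range(tib[0x15]):               # sector count
        c, h, r, n, _st1, _st2, length = struct.unpack_from(
            "<6BH", tib, 0x18 + 8 * i)       # length field is little-endian
        sectors.append((c, h, r, n, data_offset, length))
        data_offset += length
    return sectors
```

To fetch logical sector R of a track, scan the list for the matching third field and slice the image at the recorded offset and length.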

In the Catbox are two Perl tools, catrpic that will extract a PBM-format image from a Cat eDSK image (emitting the picture to standard output), and catwpic that will take a PBM-format image of the right dimensions and insert it into a Cat eDSK (emitting the hybrid eDSK to standard output). You've seen this picture before, but here it is again, the Cat loading from this modified disk image:

Unfortunately there's not much else on this disk to load, so the picture doesn't stay on screen for very long. To rectify that we'll have to figure out how to make the Cat automatically run code from a disk. Spoiler alert: there's a way to make the Cat automatically run code from a disk.

Hack No. 1: The Jef Raskin Picture Disk

In fact, some of you may have already realized how. Let's look at the idblock again, in which I mentioned the first 64 bytes are an image of the 68K data and address registers, eight each. In the 68000 architecture register A7 is the stack pointer. Because we have control of the stack pointer, we have control of the CPU's return address: if we change A7 in the idblock to another location, we can store a different return address there, and on the next RTS we'll take over the CPU. This isn't a flaw in the Cat's operating system, by the way — think of it more like save-file hacking, except that there's no ASLR or execute protection, so once you're in, you're in.

Reprising our disk's idblock, A7 is the last longword of the second row (00 40 7c fc, at offset $1c):

00000000  00 00 81 2c 00 00 db c6  00 00 74 e0 00 00 74 c6  |...,......t...t.|
00000010  00 00 74 d4 00 02 c8 ea  00 40 7a e8 00 40 7c fc  |..t......@z..@|.|
00000020  00 00 00 2c 00 00 ff 81  00 00 00 11 00 00 ff ff  |...,............|
00000030  00 41 10 10 00 02 c8 d0  00 41 00 68 00 41 01 dd  |.A.......A.h.A..|
00000040

That location $407cfc is past the end of display memory at $4070e0 but before the variable area declared to start at $40a280, so we can hypothesize the processor stack normally lives somewhere in this range (the tForth memory map calls this block part of the display memory, but it doesn't appear onscreen).
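Swapping the saved stack pointer is a four-byte edit per idblock copy. A minimal Python sketch, assuming A7 sits at offset $1c as in the dump above (the helper name is hypothetical, and a real patcher would apply it to every copy of the idblock on the disk):

```python
import struct

A7_OFFSET = 0x1C   # A0-A7 occupy the first 32 bytes, so A7 is the eighth long

def patch_a7(idblock, new_a7):
    """Return (patched_idblock, old_a7) without touching any other byte."""
    (old_a7,) = struct.unpack_from(">L", idblock, A7_OFFSET)
    patched = bytearray(idblock)
    struct.pack_into(">L", patched, A7_OFFSET, new_a7)
    return bytes(patched), old_a7
```

The old value is handed back because the diverted code has to put it back before rejoining the normal restore path.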

The Cat editor documentation provides two words that control saving and loading from a disk: save and restore. In the Catbox I wrote up a tForth ROM disassembler in Perl called fdis that works with the 1.74 ROMs (recovered tForth source code also exists for this, but you can be confident you're seeing exactly what's being run by pulling from the ROM). If you've got the MAME ROMs, unzip cat.zip and convert the 1.74 ROMs from interleaved to continuous using the Catbox's interleave_rom tool, like so:

% ./interleave_rom r74*.ic[24] > rom
% ./interleave_rom r74*.ic[35] >> rom

Now we can use fdis to walk the ROM and print out the tForth opcodes for a given token (if it's a variable or 68K code, it will say so instead). We'll start with the restore routine, since we're trying to intercept loading. It appears deceptively short:

% ./fdis rom restore
token = 0x016b
0002cb88 1128 cstate
0002cb8a 65   off
0002cb8b 0305 <restore>
0002cb8d ff   don
0002cb8e 4e   raddr
0002cb8f d3   emit
0002cb90 120c lastop
0002cb92 14e8 %disk
0002cb94 7c   =
0002cb95 26   <;>

Yes, words can have angle brackets in them or indeed any symbol other than whitespace. The meat appears to be in the word <restore>, but the editor technical documentation says it's tricky:

The word <restore> reads useful information from the idblock (already read from the disk by the Disk command ... After this, it reads the display memory, immediately displaying it on the screen. Then it reads the remaining off-screen contents of the disk into memory.

It then executes the second half of <save> by use of a non-standard programming method. The entire machine state, including return stack, program counter and instruction pointer, is recorded. When it is restored (copied back into memory), execution resumes where it left off when <save> began writing the image into memory. This creates an unusual situation.

The same code (in the second half of <save>) is used by both save and <restore>.

The text isn't quite accurate because the PC is not, in fact, part of the idblock, but it does tell us most of the action is actually in <save>. The reason both operations exit through common code is because the save operation compacts the dictionary and working text in memory before writing them out, so both the restore and save operations must unpack them before returning to the editor — in the restore case, after loading it from disk, and in the save case, after writing it. This particular word has a somewhat lengthy definition and we're going to tear it apart later, but we can see right in the beginning of <save> that the magic is done by a 68K assembly routine:

% ./fdis rom "<save>"
token = 0x0304
0002c8d2 0433 notepointers
0002c8d4 0151 packforth
0002c8d6 0435 packtext
0002c8d8 0434 noteramsize
0002c8da 21   wlit 0x3326
0002c8dd 22   lit 0x00407666
0002c8e2 5b   w!
0002c8e3 0106 recal
0002c8e5 e5   ?diskerror
0002c8e6 21   wlit 0x812c
0002c8e9 dd   call
[...]

This is calling a ROM routine at $812c. This routine, in part, starts by capturing the CPU registers:

 00812C  move    SR, $407664.l                               40F9 0040 7664
 008132  movem.l A0-A7, $407600.l                            48F9 FF00 0040 7600
 00813A  movem.l D0-D7, $407620.l                            48F9 00FF 0040 7620

which is used for the memory image of the outgoing idblock starting at $407600. When the restore runs, another 68K ROM routine is called at the end of <restore> after loading everything:

% ./fdis rom "<restore>"
token = 0x0305
0002ca6c d4   cls
0002ca6d 0106 recal
0002ca6f e5   ?diskerror
0002ca70 0109 idblock
[...]
0002cb5b 21   wlit 0x8368
0002cb5e dd   call
0002cb5f 55   1
0002cb60 37   <"> "Unable to restore text from disk."
0002cb83 38   <abort">
0002cb84 26   <;>

The ROM routine it calls at $8368 finishes by restoring those registers from the in-memory idblock.

 008368  move    #$2700, SR                                  46FC 2700
 00836C  clr.l   $40723c.l                                   42B9 0040 723C
 008372  movea.l $40fdfc.l, A0                               2079 0040 FDFC
 008378  move.l  A0, $40725c.l                               23C8 0040 725C
 00837E  movea.l #$407260, A1                                227C 0040 7260
 008384  move.l  #$265, D0                                   203C 0000 0265
 00838A  move.b  (A0)+, (A1)+                                12D8
 00838C  dbra    D0, $838a                                   51C8 FFFC
 008390  move.l  $410068.l, $407230.l                        23F9 0041 0068 0040 7230
 00839A  movea.l #$407800, A0                                207C 0040 7800
 0083A0  move.w  $407668.l, D0                               3039 0040 7668
 0083A6  move.l  $40fbd0.l, D1                               2239 0040 FBD0
 0083AC  lea     ($6,PC) ; ($83b4), A1                       43FA 0006
 0083B0  jmp     $7e5e.w                                     4EF8 7E5E
 0083B4  move.l  D0, $407238.l                               23C0 0040 7238
 0083BA  beq     $83cc                                       6710
 0083BC  move.l  #$12345678, $40723c.l                       23FC 1234 5678 0040 723C
 0083C6  jmp     $3210c.l                                    4EF9 0003 210C
 0083CC  movem.l $407600.l, A0-A7                            4CF9 FF00 0040 7600
 0083D4  movem.l $407620.l, D0-D7                            4CF9 00FF 0040 7620
 0083DC  move.l  $40725c.l, $40fdfc.l                        23F9 0040 725C 0040 FDFC
 0083E6  move    $407664.l, SR                               46F9 0040 7664
 0083EC  rts                                                 4E75

Assuming no error condition is detected, this routine has the effect of restoring the stack pointer and all registers, including the state of the tForth VM which is entirely stored in registers, to point to the tForth word after the call to $812c in <save>. I note for completeness that this call actually occurs in two places, but both are in the first part of the word, so the second part of <save> is executed regardless. (The LEA and JMP idiom at $83ac is a fast way of doing a subroutine call by not pushing to the stack and appears frequently in the Cat ROM.)

Now that we know where it comes out, and since we know restoring the previous stack pointer will eventually return to the editor, we can insert a diversion: during the loading process we'll have the user press a key to continue, so that they can appreciate our splendidly rendered image, and then return to the editor via the rest of <save>. Let's make this a picture disk, complete with Jef Raskin's Wikipedia entry, so that you can read his biography using the Cat he designed. It should show his picture, wait for the user's admiration, and then load the article which can be then scrolled and searched with the LEAP keys.

For the text, we're going to take advantage of the serial port to load the document for us. I used Lynx to download and render the Wikipedia text to 80-column plain text, cleaned it up in a text editor, and added some instructions.

The next step is to transmit it. Oddly enough, the serial connection does not use a null modem: use a straight-thru DB-25 male on the Cat end to DE-9 ("DB-9") female on your workstation's end, with your favourite USB to RS-232 dongle if you need it. The serial port on the Cat is between the printer port and the phone jacks.
By default the serial port is used as an alternate printer port, so in the Setup screen we'll switch it to the "SEND Command" (used for serial communications) and "Full Duplex" with "CR/LF" for your line terminator at 9600bps, 8-N-1. (The serial port is not supported in MAME yet either.)
Next we'll widen the margins to the full 80 columns (USE FRONT-LEFT MARGIN and LEAP LEFT, then USE FRONT-RIGHT MARGIN and LEAP RIGHT).
You may have noticed the absence of flow control options in the Setup screen, and that's because the Cat only supports software flow control with XON/XOFF. (This will be relevant in the third hack.) Sending the text full blast will drop characters because the CPU has to wrap and format it as text arrives.

To avoid this problem, you could obviously use a terminal program that supports XON/XOFF, or you could simply send data with a slight delay after each character. I provide just such a program (written in C) in the Catbox; compile it with something like gcc -O2 -o sendfile sendfile.c. To send the file with a 10ms intercharacter delay, use a command line like sendfile /dev/ttyUSB0 9600 raskin.txt 1,10 which says to use /dev/ttyUSB0 at 9600 baud, sending raskin.txt with 10ms after every one character. Here the POWER9 Linux workstation is pushing the document to the Cat which streams as characters are received to the screen. Make sure text you transmit this way has CR-LF line endings (e.g., set ff=dos in vim).
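The pacing idea is simple enough to sketch in Python as well. This is not the Catbox's sendfile.c, just an illustration of the same approach: normalize line endings to CR-LF, then write a few bytes at a time with a pause in between. The port argument is assumed to be any already-opened, already-configured binary file-like object (setting 9600 8-N-1 on the device is left out), and both function names are invented.

```python
import io
import time

def to_crlf(text):
    """Normalize LF or CR-LF line endings to CR-LF."""
    return text.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")

def send_paced(port, data, every=1, delay_ms=10, sleep=time.sleep):
    """Write data to port, pausing delay_ms after each group of `every` bytes."""
    for i in range(0, len(data), every):
        port.write(data[i:i + every])
        port.flush()
        sleep(delay_ms / 1000.0)

# Dry run against an in-memory buffer instead of a serial device:
buf = io.BytesIO()
send_paced(buf, to_crlf(b"Hello, Cat\n"), every=1, delay_ms=0)
```

With every=1 and delay_ms=10 this mirrors the 1,10 argument in the command line above.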

With the document loaded, we'll lock it against editing by pressing DOCUMENT LOCK and save it to disk with USE FRONT-DISK. The disk is blank, so it saves it immediately, after which we image it to an eDSK.

Now for the executable portion. Obviously we could write this code in 68K assembly, but we have a limited amount of space before potentially running into the real stack and we'd like to keep the code as small as possible. That means writing it in tForth, so we need a means to jump back into tForth and have the VM run anonymous code that is not actually stored as a "real" word.

As it happens, tForth supports this as part of the goto word, which directly executes "naked" tForth code from memory given an execution address. It does this using a special reserved word called temp. temp's definition ordinarily is simply to return an error message:

% ./fdis rom temp
token = 0x00f8
0000dc52 55   1
0000dc53 37   <"> "unassigned token "
0000dc66 38   <abort">
0000dc67 26   <;>

The <abort"> word takes a condition and a string, and if the condition is true, displays the string as an error and aborts. As the default definition pushes a literal 1 to the stack as the first value, running this word under normal circumstances will immediately raise an error and halt. But remember that the token table is in RAM and therefore mutable. The goto word exploits this fact:

% ./fdis rom goto
token = 0x00fe
0000dbbe 21   wlit 0x00f8
0000dbc1 92   +table
0000dbc2 5c   !
0000dbc3 f8   temp
0000dbc4 26   <;>

The word +table is 68K code that takes the given token value (it's $f8 here, which we know is temp) and returns the location of its execution address in the token table, basically multiplying it by 4 and adding the base address. This gets called a lot, so it needs to be fast. In practice the address it yields appears to be constant and predictable, so the execution address for temp will always be located in the token table at $4107e0 on a 1.74 ROM system. It then stores the new address — presto, the word is redefined — and calls it. The tForth technical docs warn that the code must end with a next token instead of the usual <;> for proper continuation, which we can accommodate. To run the new chimaeric temp, we'll call the Forth word execute, which is a 68K ROM routine residing at $d5ea that takes the token value and token table address of a word and runs it.

The 68K assembly equivalent of this is straightforward, which we'll call our trampoline. We'll change A7 in every copy of the idblock in the eDSK to point to a location holding a new 32-bit return address, which in turn points to the longword right after it, where our machine code payload will reside, followed by our tForth tokens. Since we know where the display is on disk and we know the framebuffer ends in the middle of a sector (28896 is not an even multiple of 512: the last sector holds 224 bytes of display, leaving 512 - 224 = 288), we have 288 extra bytes in that sector to insert the code right there and still remain reasonably clear of the real stack. This address could be $4070e0, immediately after the display ends, but since things can get pushed temporarily it's better to move it down a bit so garbage doesn't spill onto the bottom of the screen (I use $407120 which gives a 64-byte red zone). The code looks like this, which we compute at the time we patch the eDSK:

00407120 address pointing to entry 0040 7124
entry:
00407124 movea.l #OLD_A7,a7        2E7C OLD_ A7__
0040712a movea.l #$4107e0,a1       227C 0041 07E0
00407130 move.l #target,(a1)       22BC 0040 7142
00407136 move.l #$000000f8,d0      203C 0000 00F8
0040713c jmp $0000d5ea             4EF9 0000 D5EA
target:
00407142 jmp (a3)                  4ED3
; tforth tokens follow

We restore the old A7 value, store the address of our payload (target) to temp's entry in the token table (leaving the entry's address in A1), load the token for temp to D0 and jump into execute. I chose to use full 32-bit long encodings for the parameters to make it easier to adjust for other ROM versions with different addresses and token values. Notice that our payload looks just like a regular tForth word, starting with the call into the nesting routine. The D0 and A1 registers belong to a volatile set used for arguments to ROM routines and are not part of the tForth register state, so we can clobber them with impunity. The only relevant register we changed is A7, and we immediately set it back, so nothing else is disturbed beyond the diversion into our own code. When our tForth block finishes, our cuckoo word will be unnested and we'll fall right back into <save> to complete the restore process as if we never left it.
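Since the trampoline is computed at patch time, here is a Python sketch of that computation which reproduces the bytes in the listing above. It is an illustration rather than the actual Catbox patcher. The token table base of $410400 is my inference from temp's slot ($4107e0 minus 4 times $f8), so treat it, like the other defaults, as specific to the 1.74 ROM.

```python
import struct

def build_trampoline(old_a7, base=0x407120, token=0xF8,
                     table_base=0x410400, execute=0xD5EA):
    """Assemble the return-address long, the 68K stub and the nesting jump."""
    slot = table_base + 4 * token       # temp's entry in the token table
    entry = base + 4                    # code starts right after the pointer
    target = entry + 5 * 6              # five 6-byte instructions precede it
    code = struct.pack(">L", entry)                 # address the RTS will pop
    code += struct.pack(">HL", 0x2E7C, old_a7)      # movea.l #old_a7,a7
    code += struct.pack(">HL", 0x227C, slot)        # movea.l #slot,a1
    code += struct.pack(">HL", 0x22BC, target)      # move.l #target,(a1)
    code += struct.pack(">HL", 0x203C, token)       # move.l #token,d0
    code += struct.pack(">HL", 0x4EF9, execute)     # jmp execute
    code += struct.pack(">H", 0x4ED3)               # target: jmp (a3)
    return code                         # tForth tokens are appended after this

print(build_trampoline(0x407CFC).hex())
```

The result is 36 bytes long; the tForth payload tokens go immediately after it.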

Finally, the tForth itself, which immediately follows the machine code trampoline. The <demit> word will draw text to the display at the given address; if the text has the high bit set, it will render it in reverse video. Bold or normal text is determined by what was used the last time a character was displayed, though for this purpose either weight works. The code we want to execute is " press any key" over + swap do i c@ i 407106 - 17 <demit> loop beep key drop next ; which will put a string on the stack with its length, loop over it and display each character, beep once, wait for a key, throw it away and then exit as goto requires through next. We could look the tokens up by hand, but we also can just define it to a throwaway word and dump from its execution address using c'.

Our patcher will insert all of this into the 288 bytes at the end of display memory on disk. Since most of you don't have a Cat, here's a video of the picture disk in action.

We turn on the Cat, insert the Raskin picture disk and press USE FRONT-DISK. The disk loads, displaying our picture, and pauses with "press any key" and a beep. (The disk light remains on as a side effect since we haven't yet run the word that turns the disk drive off.) We press a key, loading completes, and the document appears. Next, we use SHIFT and the LEAP keys to scroll line by line, then LEAP to the word "Apple" forwards and back a few times (using USE FRONT-LEAP to leap again with the same search), LEAP to the word "Aza," and finally LEAP to the DOCUMENT/PAGE marker, which will eventually bring us to the bottom of the text. Sorry about the darkness and changing light levels, but it's hard to video a CRT with a Pixel 7 Pro.

The picture disk image is in the Catbox. The patching program we'll unveil at the end, since we're going to add more to it for the next couple hacks. Speaking of!

Hack No. 2: CatTerm

There are a couple downsides to what we just pulled for Hack No. 1, all due to the fact we temporarily deviate from the restore process before it completes. One minor drawback is that the floppy disk drive is still active during code execution as the drive motor hasn't been turned off yet. But the big one is our limited execution area: the Forth dictionary and document workspace are not unpacked until the end of loading, so the technique we just used can only run within the small space provided and can't call longer words we create, only ones already in ROM.

That means we have to let the loading process finish to run larger words, but if we do that naïvely then the Cat will still wind up in the editor and never run our code. Altering the execution address for <save> in the token table won't help us because it's already in motion, and we can't change the code that's executing ... unless we copy <save> to RAM and run that as the payload to our trampoline.

Here's the rest of <save>, including the second call to $812c. Either way we'll end up in common code at $2c900 assuming no errors.

0002c8f5 7e   0=
0002c8f6 3a   and
0002c8f7 2b   <0bran> 0x08 (0x0002c900)
0002c8f9 0106 recal
0002c8fb e5   ?diskerror
0002c8fc 21   wlit 0x812c
0002c8ff dd   call
0002c900 15fc system.status
0002c902 20   blit 0x18
0002c904 66   +
0002c905 58   @
0002c906 10d4 trkbuf
0002c908 61   to
0002c909 15fc system.status
0002c90b 20   blit 0x1c
0002c90d 66   +
0002c90e 58   @
0002c90f 1854 ramend
0002c911 61   to
0002c912 22   lit 0x00407230
0002c917 58   @
0002c918 1868 t#on
0002c91a 61   to
0002c91b 06d1 parksafe
0002c91d 0100 doff
0002c91f 0436 unpacktext
0002c921 21   wlit 0x0088
0002c924 010a @ptr
0002c926 11b0 text
0002c928 0152 unpackforth
0002c92a 22   lit 0x00407700
0002c92f 1574 idtable
0002c931 21   wlit 0x0080
0002c934 8a   move
0002c935 21   wlit 0x00a0
0002c938 010a @ptr
0002c93a 13dc idadvance
0002c93c 61   to
0002c93d 54   0
0002c93e 0592 getdata
0002c940 54   0
0002c941 06b4 oldsetdata
0002c943 21   wlit 0x0100
0002c946 8a   move
0002c947 06c5 setupcat
0002c949 0279 resetcursor
0002c94b 0205 initruler
0002c94d 0209 checkline#
0002c94f 020b checkgauge
0002c951 020c checkbattery
0002c953 0492 resetphonelight
0002c955 0495 checklocallight
0002c957 0336 rule
0002c959 14e8 %disk
0002c95b 1210 curop
0002c95d 61   to
0002c95e 22   lit 0x00407238
0002c963 58   @
0002c964 2b   <0bran> 0x07 (0x0002c96c)
0002c966 11fc dirtytext?
0002c968 64   on
0002c969 0741 verifyerror
0002c96b 9d   error
0002c96c 48   ?dup
0002c96d 2b   <0bran> 0x05 (0x0002c973)
0002c96f 11fc dirtytext?
0002c971 64   on
0002c972 e5   ?diskerror
0002c973 26   <;>

The goal is to add code after the ?diskerror word at $2c972 (which aborts on errors) and before the end of the definition at $2c973. At that point the Cat's state is fully restored and expanded from disk but the editor has not yet been entered. Our tForth payload therefore becomes a copy of everything from $2c900 to $2c972 inclusive, followed by pushing our word's token value with a wlit literal word and execute re <;> with the explicit terminal call to re cleaning up any residual leftovers. Mercifully, the branches are relative and need not be relocated, and with the stack padding and trampoline it still fits in the 288 bytes available, so we just spew out the same tokens otherwise.
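As a sanity check on the space claim, here is the arithmetic worked through in Python. The sizes of the red zone, pointer, trampoline and copied range come from the figures above; the only guess is the tail, where I assume execute, re and <;> each take at most two bytes, since every token in the disassembly is one or two bytes long.

```python
SECTOR = 512
FRAMEBUFFER = 84 * 344                        # 28896 bytes of display memory

def payload_budget():
    """Tally the bytes used in the slack after the framebuffer's last sector."""
    spare = SECTOR - FRAMEBUFFER % SECTOR     # room left in that sector
    parts = {
        "red zone": 0x407120 - 0x4070E0,      # padding below the screen
        "return address": 4,
        "68K trampoline": 5 * 6,              # five 6-byte instructions
        "jmp (a3)": 2,
        "copied <save> tail": 0x2C972 - 0x2C900 + 1,
        "wlit + token value": 3,
        "execute re <;> (upper bound, assumed)": 3 * 2,
    }
    used = sum(parts.values())
    return spare, used, parts

spare, used, parts = payload_budget()
print("%d of %d bytes used, %d to spare" % (used, spare, spare - used))
```

Under those assumptions the whole thing comes to 224 of the 288 bytes, so there is a little headroom left.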

This idea has a lot of possibilities. For example, you could write up a word to automatically execute from the disk to patch up editor words in the token table for additional features, or install a utility, or even take over the system entirely. (Spoiler alert: stay tuned.) So for this second hack, we're going to do something useful for a change: a simple terminal program for the Cat. But wait, you say, doesn't the Cat already have the capability to send and receive serial data from the editor? It does, but it's not very asynchronous: you have to highlight and manually send text (using SEND) and you can't just type to the remote system, and individual keystrokes are particularly inconvenient. Plus, maybe you just don't want your entire transcript in your document, especially if you're running low on space.

As in the first hack, we'll load the tForth source over the serial port. Here it is:

Converted to text:

: cattermloop  begin
                    ser.?rxrdy if
                              @ser.char
                              demit
                    then
                    <?k> if
                              char char? off record dup
                              e1 = if
                                        leave
                              then
                              semit
                    then
          0 until ;
: cattermhi cls home ." CatTerm oldvcr.blogspot.com UNDO to quit" cr ;
: catterm edde if 0 40f884 ! -1 410078 ! cattermhi cattermloop 0 410078 ! -1 40f884 ! new-display else cattermhi cattermloop then ;

(Now here's a great advancement in open source: it comes with its own source code as the document, preparsed into the dictionary and set to autostart. You can even preview the code with USE FRONT-DISK and see what code you're about to run. Wanna change it? Change the code in the editor and USE FRONT-ANSWER, and the words will be redefined and ready to execute. Save it to disk, repatch the disk image, and make it your own.)

The main loop in cattermloop bounces between checking if a character is waiting at the serial port and if a keypress is pending. If one is at the port, it reads it and prints it to the terminal at the cursor position. If one is at the keyboard, it grabs it, stuffs it into the LEARN buffer if appropriate, and then checks to see if it's the UNDO key (code $e1). If it is, it exits the loop.

The word we'll execute, however, is catterm. I decided to be a little silly here to prove a point. The Cat can operate in various modes, with the variable edde set to true if it's in the editor and crt set to true if it's at the prompt. But these are just memory locations, so you can either do edde on or -1 40f884 !, which references its address explicitly (yes, values are signed). By forcing these modes, you can make text go different places. We don't want anything emitted to the document, so if we started execution from the editor we turn the edde variable off and the crt variable on, start the loop, and then set them back and repaint the screen (otherwise we "just do it"). Properly written, however, you can also just say edde on and crt off, and vice versa.

We highlight the entire text and do USE FRONT-ANSWER to define the words, and then take a test drive with catterm from the ok prompt. It works! Now we'll grab the token value for catterm and use that as a parameter to the patch utility. The exact value will vary depending on what words were defined already, but in this case, it's $07e2.

As a last convenience we enter and highlight the word catterm in the document so that you can just run it again from the editor by pressing USE FRONT-ANSWER. We save it to disk and create a disk image, and patch it. Here it is booting. I couldn't run to the workstation and type at the same time as I was filming, but you can see that it autostarts and cleanly returns to the editor. The CatTerm disk image is in the Catbox.

I also demonstrate running it from the ok prompt by pressing SHIFT-USE FRONT-SPACE, entering page, entering catterm, and after pressing UNDO entering re to return to the editor. Everything works just as it should. Port speed and settings are configured with SETUP, just like everywhere else.

But what if you don't want to return to the editor?

Hack No. 3: The First Canon Cat Demo?

This might be where I become branded a heretic by the School of Raskin for escaping the primary interface. But it's a general purpose computer, and we should be able to use it for general purpose computer things. Plenty of stuff took over the Mac entirely, so I don't see why the Cat should be any different.

For this last hack, we'll go all the way live and make a full-fledged slideshow demo. I'll select and convert some pictures as a love letter to the 1980s office and we'll have the Cat load them in sequence directly from disk. When you exit the demo, it will cold-start the machine back to the editor (so we'll dub this a "cold booter").

Our payload can now be shorter, because the words for setting up the editor no longer need to be run. However, we still need to unpack the document text, such as it is (we can overwrite it later if we want), because it must be shifted up in RAM first to ensure the Forth dictionary ends up in the right place. In addition, we need to set a system flag to prevent screen repaints during that process or any title screen we put on the disk will get obliterated.

After the doff at $2c91d, which turns off the disk drive, in our cold booter payload I insert a showmove? off to ensure unpacktext doesn't try to repaint the screen. We'll then do everything up to, but stop short of, the setupcat at $2c947. Instead, at that point we'll push our token and end with execute cold ; since we won't be returning to the editor and we won't try to keep any of the system variables intact.

Now for the pictures. It will be simplest to segregate the slides to a second disk, all of which we can use. We'll have the Cat format it, then promptly overwrite it with our own data. The rtrk and wtrk words will load or save an entire track of data (5120 bytes) to or from a given memory address, so each 28896-byte picture needs 28896 / 5120 = 5.64375 tracks, i.e., six tracks apiece rounded up. This is a little inefficient for the last portion of the display but we're optimizing for speed here.
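The track arithmetic can be checked with a few lines of Python. This is just a worked example: the picture and track sizes are the ones quoted above, and the base address 400000 is the screen address used in the Forth snippets in this post. The last track is placed so that it ends exactly at the end of the picture, which is why it overlaps the one before it.

```python
import math

PICTURE_BYTES = 28896   # one full-screen bitmap
TRACK_BYTES = 5120      # what rtrk/wtrk move in one go
BASE = 0x400000         # screen address used in the post's snippets

TRACKS = math.ceil(PICTURE_BYTES / TRACK_BYTES)


def track_addresses():
    """Memory address for each of the tracks making up one picture."""
    addrs = [BASE + n * TRACK_BYTES for n in range(TRACKS - 1)]
    # The final track is backed up so it ends at the last byte of the picture.
    addrs.append(BASE + PICTURE_BYTES - TRACK_BYTES)
    return addrs


if __name__ == "__main__":
    print(PICTURE_BYTES / TRACK_BYTES, TRACKS)
    print([format(a, "x") for a in track_addresses()])
    print("overlap bytes:", TRACKS * TRACK_BYTES - PICTURE_BYTES)
```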

If we're going to ask for a new disk, we should do so with style, so this will be our screenshot image. C'mon, he's the original father of the Mac too!

We will blink the question mark until the disk is switched, as a classic Mac would do, then draw the original Happy Mac over it and proceed with the slides. Here they are, largely extracted from contemporary marketing material, then edited in Krita and Floyd-Steinberg dithered with ImageMagick. Any blurriness in these grabs is due to the upscaling I did to match the CRT aspect ratio; they are sharp on the Cat.
For the credits screen, I typed it into the editor and saved it to another disk.
I then used catrpic from the first hack to extract it and added it to the pile as the final image.

To create the slides, after having the Cat format a blank disk, we'll transmit the bitmap data via the serial port. This is binary safe if we disable XON/XOFF first, or otherwise the Cat will intercept and act on those characters. For example, this snippet at the ok prompt will wait for data from the serial port and read 28896 bytes from it, spewing them to the screen directly. We assume your serial port is still configured for "SEND Command/Full Duplex" at 9600bps, 8-N-1.

ser.xon.tx.off page 4070e0 400000 do ser.rx i c! loop key drop

Then send the 28896-byte raw bitmap data (in our case, the scaled and dithered PBM with the PBM header removed) with something like sendfile /dev/ttyUSB0 9600 image.cat — no intercharacter delay is necessary as this loop will have no problem keeping up. Note that we use ser.xon.tx.off to disable acting on received XON/XOFF characters, but actually fetch characters with ser.rx (which will wait for a byte to be ready). Once all bytes are loaded, it will wait for a key, and then stop.
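If you want to script the header removal, something like the following Python sketch would do it. It assumes a binary (P4) PBM, and the function name is made up for illustration. The 672x344 dimensions in the example are inferred from the 84-byte ($54) row stride in the Forth code and the 28896-byte size, not something the tools require.

```python
def strip_pbm_header(data):
    """Return (width, height, raster) for a binary P4 PBM held in bytes."""
    if data[:2] != b"P4":
        raise ValueError("not a binary (P4) PBM")
    pos = 2
    fields = []
    while len(fields) < 2:
        # Skip whitespace and '#' comments before each number.
        while True:
            if pos >= len(data):
                raise ValueError("truncated header")
            c = data[pos:pos + 1]
            if c.isspace():
                pos += 1
            elif c == b"#":
                while pos < len(data) and data[pos:pos + 1] not in (b"\r", b"\n"):
                    pos += 1
            else:
                break
        start = pos
        while pos < len(data) and data[pos:pos + 1].isdigit():
            pos += 1
        if start == pos:
            raise ValueError("bad header field")
        fields.append(int(data[start:pos]))
    pos += 1  # exactly one whitespace byte separates header and raster
    width, height = fields
    raster = data[pos:]
    if len(raster) != ((width + 7) // 8) * height:
        raise ValueError("raster size does not match dimensions")
    return width, height, raster


if __name__ == "__main__":
    sample = b"P4\n# test\n672 344\n" + bytes(28896)
    w, h, raw = strip_pbm_header(sample)
    print(w, h, len(raw))
```

The size check at the end catches images that were not scaled to the full screen before you spend time sending them.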

We'll test it on Captain Solo by writing out a track and trying to read it back.

Converted to text,

ser.xon.tx.off
4070e0 400000 do ser.rx i c! loop 400000 0 wtrk 401400 1 wtrk 402800 2 wtrk 403c00 3 wtrk 405000 4 wtrk 405ce0 5 wtrk . . . . . .

We turn off XON/XOFF for receive, load the image, then write out six tracks at the given addresses (5 overlaps slightly with 4). The six dots at the end report out the status of each track write with wtrk.

Don't transpose the arguments to these track read/write routines, by the way. The operating system will try to seek to track 4194304 ($400000), causing the head to repeatedly hit the outside rail and possibly damage itself! There is no bounds checking on these routines!
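Since the Cat won't check for you, one option is to generate these command lines on the host and do the checking there. Here's a hedged Python sketch: the function name and the `max_track` default of 79 are my assumptions for illustration, and numbers are emitted in hex without a prefix to match the snippets in this post.

```python
def wtrk_commands(first_track, base=0x400000, size=28896,
                  track_bytes=5120, max_track=79):
    """Build the 'addr track wtrk' words for one picture, with range checks."""
    count = -(-size // track_bytes)            # ceiling division
    last_track = first_track + count - 1
    if not 0 <= first_track <= last_track <= max_track:
        # Catches transposed arguments such as passing 0x400000 as a track.
        raise ValueError("track range %d..%d is not on the disk"
                         % (first_track, last_track))
    parts = []
    for n in range(count):
        # The final track is backed up so it ends with the picture.
        offset = min(n * track_bytes, size - track_bytes)
        parts.append("%x %x wtrk" % (base + offset, first_track + n))
    return " ".join(parts)


if __name__ == "__main__":
    print(wtrk_commands(0))
```

For track 0 this reproduces the test line above, and an address passed where a track number belongs is rejected before it ever reaches the drive.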

Receiving and writing the test image.
Now for the other direction, which is quite logically the reverse with rtrk.

400000 0 rtrk 401400 1 rtrk 402800 2 rtrk 403c00 3 rtrk 405000 4 rtrk 405ce0 5 rtrk key drop . . . . . .
The image reappears from the floppy.
Now with a successful test, we will create the disk with this small program. The return values from wtrk are just left on the stack since we'll be cold booting it after anyway to set up the main code word. XON/XOFF is already (x)off.

3c 0 do 4070e0 400000 do ser.rx i c! loop 400000 i wtrk 401400 i 1 + wtrk 402800 i 2 + wtrk 403c00 i 3 + wtrk 405000 i 4 + wtrk 405ce0 i 5 + wtrk 6 +loop
Leaving the blinking disk screen for the boot disk, we transmit each bitmap from the Linux workstation using sendfile. When all the bytes have been received, the Cat will write it to disk and come back for the next, and so on until all the images are sent.
Unlike every other graphic image in the demo, however, we'll draw the Happy Mac from an internal set of bitmap values instead of trying to fetch it from disk as well. Here I'm messing with the positioning to determine the best addresses to blink the question mark and where onscreen the Happy Mac should be loaded. Eventually I settled on the following, which is the entirety of the demo source code in Forth:

( wait for a key )
: keywait ( delay - key )
        0 swap
        0 do
                drop
                <?k> if 
                        char char? off leave
                then
                1 ms
                0
        loop ;

( wait until the disk is out )
: diskwaitout ( delay - flag )
        0 swap
        0 do
                drop
                ?diskrdy not dup if
                        leave
                then
                1 ms
        loop doff ;

( wait until the disk is in )
: diskwaitin ( delay - flag )
        0 swap
        0 do
                drop
                ?diskrdy dup if
                        leave
                then
                1 ms
        loop doff ;

( general disk wait loop )
: diskwait ( waitword )
        begin
                ( erase question mark )
                403cdd 4038ed do ffff i w! 54 +loop
                ( ensure drive is on, needs a delay )
                drive0 40 800000 or! 180 ms
                dup 0500 swap execute if drop leave then

                ( draw question mark - bitmap data is lifo )
                fe7f fe7f ffff fe7f fe7f fe3f ff1f ff8f f3cf f3cf f00f f81f
                        403cdd 4038ed do i w! 54 +loop
                drive0 40 800000 or! 180 ms
                dup 0500 swap execute if drop leave then
        0 until doff ;

( display a picture stored on disk at a particular track number )
: trackpic ( track - )
        ( each picture is six tracks long with overlap in the last track )
        dup
        400000 swap rtrk drop
        dup 1 +
        401400 swap rtrk drop
        dup 2 +
        402800 swap rtrk drop
        dup 3 +
        403c00 swap rtrk drop
        dup 4 +
        405000 swap rtrk drop
        5 +
        405ce0 swap rtrk drop
        ( wait for a keypress for 2560 ticks or reset if UNDO pressed )
        0a00 keywait e1 = if cold then
        ;

( main loop )
: slideshow ( - )
        ( turn off editor )
        edde off crt on

        ( mac gimme disk screen is already encoded in the screenshot )
        ['] diskwaitout diskwait
        ['] diskwaitin diskwait

        ( make sure we can read track 0 )
        recal 0 <> if
                page ." disk read failure"
                1000 keywait drop cold
        then

        ( draw 32x32 happy mac and erase bottom of floppy disk icon )
        aaaaaaaa 50000015 a7ffffca 57ffffd5 a7ffffca 50000015 afffffea 4fffffe5 afffffea 4fffffe5 acffc0ea 4fffffe5 afffffea 4fffffe5 afffffea 4e0000e5 adffff6a 4dffff65 adf87f6a 4df7bf65 adffff6a 4dfcff65 adfeff6a 4dfeff65 adeeef6a 4deeef65 adffff6a 4dffff65 adffff6a 4e0000e5 afffffea 57ffffd5 a800002a
                403dd8 403304 do i ! 54 +loop

        ( shocked mac slide, then loop the rest of the slideshow )
        400 ms 10 keywait drop ( prime key events ) 0 trackpic begin
                0a ( number of pictures, then multiplied by tracks per pic )
                6 * 6 do i trackpic 6 +loop ( each picture is six tracks )
        0 until ;

( shamelessness )
: credits ." copyright 2024 cameron kaiser" cr ;

This is all pushed over the serial link and parsed into words on the Cat using the same process as before.

The main code starts with slideshow where we first make sure all updates go direct to the screen. After that, the diskwait word is what waits for the state changes of the floppy. It has its own routines for erasing and redrawing the blinking question mark. This word takes the token of another word as an argument which it calls to determine whether the desired condition is satisfied, so we call it first with diskwaitout to wait for the first disk to be removed, then diskwaitin to wait for the second disk to be inserted. These words directly manipulate the floppy drive through Gate Array #3 by briefly turning it on to check for the disk, then waiting for a settling period after the motor starts to get the disk status back. The question mark is encoded as a series of literal words pushed onto the stack in reverse order so they get pulled off in the right order for the loop.
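To see the reverse-order trick at work, this small Python sketch decodes the twelve question mark words the way the draw loop consumes them: the last literal pushed is the first one stored, so it becomes the top row. It assumes 0 bits are the drawn pixels, which I'm inferring from the erase step writing ffff.

```python
# The literals exactly as they appear in diskwait, in push order.
WORDS = "fe7f fe7f ffff fe7f fe7f fe3f ff1f ff8f f3cf f3cf f00f f81f".split()


def render(words):
    """Return the glyph rows top to bottom, '#' for 0 bits and '.' for 1 bits."""
    rows = []
    for w in reversed(words):        # top of stack is stored first
        bits = format(int(w, 16), "016b")
        rows.append("".join("#" if b == "0" else "." for b in bits))
    return rows


if __name__ == "__main__":
    print("\n".join(render(WORDS)))
```

Running it prints a recognizable question mark, with the hook at the top and the dot at the bottom below an all-ones gap row.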

With the next disk in, the recal word seeks for track zero and returns 0 for success or an error code for failure. Assuming it can find it, it draws the Happy Mac (with whole 32-bit words this time but using the same principle), primes the keyboard, and starts with the first picture, the Shocked Mac. The trackpic word loads the six-track image and waits for a keypress in a given timeframe, cold-starting the Cat if the UNDO key is pressed. The cold start process will of course try to read the disk, but it won't be recognized as a Cat floppy and will be ignored, thus putting the user back into the editor.

This does give the floppy drive quite a workout. You'll notice that the Shocked Mac appears only the very first time it boots and never in the loop after that, and it's because I think my floppy drive has a marginal track zero switch (at the age these drives are, I suspect other otherwise-working Cat drives have a similar problem). This switch appears to be mechanical on the Canon drive rather than the more typical LED/sensor combination. More times than not this particular disk drive could get back to track zero, but sometimes it wouldn't, which explains why some of my reads from disks the Cat wrote started on physical track two even though the Cat's firmware obviously believed it was writing track zero. It doesn't seem like an alignment problem with the switch or the drive because when it does find track zero, it reads disks written by the Teac (our "perfectly aligned" brand spanking new floppy drive) just fine, and obviously it's also able to write track zero more often than not or this demo might not have been possible.

Accordingly, the problem I ran into with this hack was the drive couldn't reliably get back to track zero after the last picture either (it's nearly 60 tracks away by then). If this happens to you in the editor, you can just try reading the disk again until it "gets it." But here rtrk will pause waiting patiently for the correct track to come by, which probably won't happen for the good minute or so it waits until it times out. Trying to manually step the head back in track by track didn't really help. The best solution was just to start back over on track 6 instead of track 0, and that seemed to work generally okay, though this is probably not a demo you want to leave running repeatedly on the Cat regardless.
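The resulting playback order is easy to model. This Python sketch (names are mine, and the defaults mirror the 0a pictures and six tracks per picture in the slideshow word) yields the starting track of each slide: track 0 once, then tracks 6 onward forever.

```python
from itertools import islice


def slide_tracks(pictures=10, tracks_per_picture=6):
    """Yield the starting track of each slide in the order the demo shows them."""
    yield 0                                   # Shocked Mac, first pass only
    while True:                               # begin ... 0 until
        for track in range(tracks_per_picture,
                           pictures * tracks_per_picture,
                           tracks_per_picture):
            yield track                       # 6 do i trackpic 6 +loop


if __name__ == "__main__":
    print(list(islice(slide_tracks(), 12)))
```

The head never has to travel all the way back to track zero once the loop is running, which is the whole point of the workaround.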

Here's the finished product. Both disk images are in the Catbox.

This hack brings you the final iteration of catcpic, which now is a three-headed application. It requires your original eDSK and a properly formatted and inverted PBM, generates a nice randomized idtable so that your Cat can't confuse it with anything else (though you can pass it one in hex with -idtable=... if you want) and emits the resulting disk image to standard output. If you don't pass it a token in hex, it will install the "press any key" patch; if you do, and you specify -cold, you'll get a cold booter; otherwise you'll get an autostarter that will enable you to cleanly return to the editor.

To compensate for the situation where your Cat's drive might also be acting up and not truly writing to track zero either, I've also included an eDSK optimizer tool. It will remove and properly relocate unformatted tracks, or even formatted tracks before the detected idtrack if you use -force, adjusting the eDSK DIB and TIBs and emitting the corrected disk image to standard output. You can also just say -test to see what it thinks about a particular image (-test -force will make it continue even if it doesn't think the disk is a Cat disk).

Things to do

Some considerations for a future entry:

  • Figure out how to lash something like a Gotek floppy emulator to the Cat because I'm not sure how much more life this disk drive's got in it. The idea should be along the lines of building one of the adapters that let you use a PC floppy drive, but then plug an emulator into it instead.
  • Fix for v2.4x ROMs, which some Cat enthusiasts run now, but that would require me to modify my Cat to do so, which I'm not entirely willing to do without a good reason. I'll take patches, though.
  • A proper set of games for the Cat. How about a text adventure played in the editor?
  • Audio demo? The tone and related words for playing audio tones documented in the tForth manual are not in the 1.74 ROM. There is a beep word which twiddles with the DUART, but it's not clear if it only beeps or can be made to play other tones.
  • uCLinux? If a Palm Pilot can run it ...
  • Write shorter blog posts? (Nah.)

Everything in this blog post is on GitHub under the BSD 3-clause license.
