Saturday, November 16, 2024

One-parting some Commodore 64 utilities for fun and profit

I've got a few retrocomputing bucket list items I'm slowly working down, and a couple of them involve some little Commodore 64 games I've had kicking around on the backburner. However, every game needs media assets, and while there are many great tools for doing art on your present-day workstation and exporting it, sometimes you just want what you used to work with — in as convenient and quick-loading a way as possible that blends with modern emulation workflows. So here's two I tweaked and one-parted — Ultrafont+ and DOODLE! — and some tips for doing that yourself.

Ultrafont+ is a fairly popular character set editor and the first one I ever used, originating from the COMPUTE! type-in magazine world as yet another Charles "SpeedScript" Brannon tour de force. This is "V.2" from the September 1986 COMPUTE!'s Gazette, an update of Ultrafont+ from July 1984 which was itself an update of the original Ultrafont in COMPUTE!'s First Book of 64 Sound and Graphics. I couldn't afford a subscription back then but I did periodically buy copies off the newsstand and fortunately I was able to get the disk for this one too, which had some great games like the massively sprite multiplexed "Eagles and Gators" and an exceptionally flexible text windowing package called Window Wizard. I personally found Gazette to be at its peak around the mid-late 1980s (with a later local maximum during the General Media days when yours truly actually got some of my own articles published) and this issue was a great example.

Gazette type-ins generally weren't multiloads, and Ultrafont+ as published was already in one file. However, I ran across a nice version (no idea who did this) with a BASIC stub that printed the keystroke commands, which I thought was a handy addition. My own Perl-based BASIC cross-linker can one-part a BASIC program and any number of additional modules, so I took the original BASIC loader and made the following changes and deletions:

10
120 print "{Sr}        press a key for ultrafont+      "
255 poke198,.:wait198,1:geta$
260
300 sys49152
310 print"{Sq}'run' for help screen and ultrafont+"
320

Since my linker correctly resets the end-of-BASIC pointer (recall that LOADing any program file, even a non-BASIC program outside of the normal BASIC text range, will set this pointer), there's no need to NEW everything out like the old BASIC stub did to avoid hitting spurious out-of-memory errors. Plus, doing so keeps the instructions in memory so you can just re-RUN it.

The next step was to make it respect the default device number. Ultrafont+ supports both cassette and disk, but it only supports disk device 8. In both Power64 and VICE, I usually reserve device 8 for any mounted D64s and 9 for the development filesystem, and I'd prefer that files I save and load go directly to the filesystem for cross-manipulation, linking, etc.; ideally it should save and load from the device that was most recently accessed. That kind of patch is ordinarily trivial and usually you can find what needs to be changed by looking for any Kernal SETLFS calls to $ffba, the routine that sets the logical file number, device and secondary address. This call occurs in exactly two places in Ultrafont+:

.C:ca32  A0 01       LDY #$01
.C:ca34  A9 01       LDA #$01
.C:ca36  20 BA FF    JSR $FFBA

.C:caf2  A9 00       LDA #$00
.C:caf4  20 BD FF    JSR $FFBD
.C:caf7  A9 0F       LDA #$0F
.C:caf9  A2 08       LDX #$08
.C:cafb  A0 0F       LDY #$0F
.C:cafd  20 BA FF    JSR $FFBA
.C:cb00  20 C0 FF    JSR $FFC0

The first call is curiously incomplete and doesn't seem to set a device number at all (it would be in the X register). The second one sets a null filename, then opens disk drive 8's command channel (the equivalent of OPEN 15,8,15 with no string). We definitely want to patch that too, but it's not what's being used to actually access any files.
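
The search for SETLFS calls can also be done outside the monitor. Here is a minimal Python sketch, assuming the binary is a standard PRG with a two-byte little-endian load address at the front; find_jsr is a name I made up for illustration, and the demo bytes are synthetic rather than the real Ultrafont+ image. Since it only matches the byte pattern 20 lo hi, it can also hit data that happens to look like a JSR, so treat the hits as candidates to check in a disassembly.

```python
def find_jsr(prg: bytes, target: int) -> list[int]:
    """Return the C64 addresses of every JSR to `target` in a PRG image."""
    load_addr = prg[0] | (prg[1] << 8)        # PRG header: little-endian load address
    body = prg[2:]
    needle = bytes([0x20, target & 0xFF, target >> 8])  # JSR abs, operand low byte first
    hits = []
    pos = body.find(needle)
    while pos != -1:
        hits.append(load_addr + pos)
        pos = body.find(needle, pos + 1)
    return hits


# Synthetic image loading at $C000 (not the real binary).
demo = bytes([0x00, 0xC0,          # load address $C000
              0xA0, 0x01,          # LDY #$01
              0xA9, 0x01,          # LDA #$01
              0x20, 0xBA, 0xFF,    # JSR $FFBA (SETLFS)
              0x60])               # RTS
print([hex(a) for a in find_jsr(demo, 0xFFBA)])
```

The same function works for LOAD ($ffd5) and SAVE ($ffd8) by changing the target.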

Next I went looking for LOAD, reasoning that a binary font file would be LOADed using the Kernal file load call instead of byte-by-byte. That's at $ffd5, and this call appears exactly once:

.C:cb3c  20 0F CA    JSR $CA0F
.C:cb3f  20 52 CB    JSR $CB52
.C:cb42  A9 00       LDA #$00
.C:cb44  20 D5 FF    JSR $FFD5

LOAD still has to have the device number set somehow, so we look at the subroutines it calls first. It starts off with one at $ca0f.

.C:ca0f  20 E7 FF    JSR $FFE7
.C:ca12  A9 36       LDA #$36
.C:ca14  A0 C4       LDY #$C4
.C:ca16  20 91 C4    JSR $C491
.C:ca19  20 E4 FF    JSR $FFE4
.C:ca1c  F0 FB       BEQ $CA19
.C:ca1e  A2 01       LDX #$01
.C:ca20  C9 54       CMP #$54
.C:ca22  F0 0B       BEQ $CA2F
.C:ca24  A2 08       LDX #$08
.C:ca26  C9 44       CMP #$44
.C:ca28  F0 05       BEQ $CA2F
.C:ca2a  68          PLA
.C:ca2b  68          PLA
.C:ca2c  4C 8D C4    JMP $C48D
.C:ca2f  8D 4D 02    STA $024D
.C:ca32  A0 01       LDY #$01
.C:ca34  A9 01       LDA #$01
.C:ca36  20 BA FF    JSR $FFBA

This routine closes all channels, then calls a utility routine to display a string at the bottom of the screen terminated by a Commodore backarrow character (ASCII 95). The string at $c436 is "{r}T{R}APE OR {r}D{R}ISK?" Hey, that looks familiar! It accepts a key and based on T or D sets the X register to 1 or 8 and branches to $ca2f ... which completes our mysterious first SETLFS call.

What about for saves ($ffd8)?

.C:cab0  20 0F CA    JSR $CA0F
.C:cab3  20 52 CB    JSR $CB52
.C:cab6  A9 00       LDA #$00
.C:cab8  85 FD       STA $FD
.C:caba  85 FB       STA $FB
.C:cabc  A9 70       LDA #$70
.C:cabe  85 FC       STA $FC
.C:cac0  A2 FF       LDX #$FF
.C:cac2  A0 77       LDY #$77
.C:cac4  A9 FB       LDA #$FB
.C:cac6  20 D8 FF    JSR $FFD8

It makes the same call to $ca0f and thus uses the same code to set the device. The last device number used by the system is stored in location 186, which is zero page, so we can replace a2 08 (LDX #8) with a6 ba (LDX 186) and not have to make any other adjustments. That means the following in-place changes to both the device prompt routine and the routine that opens the disk drive command channel (two bytes in each of two places; byte "address" is actually the offset within the main Ultrafont+ binary):

 00000a00  99 00 02 20 d2 ff c8 4c  ba c9 a9 5f 99 00 02 98  |... ...L..._....|
 00000a10  60 20 e7 ff a9 36 a0 c4  20 91 c4 20 e4 ff f0 fb  |` ...6.. .. ....|
-00000a20  a2 01 c9 54 f0 0b a2 08  c9 44 f0 05 68 68 4c 8d  |...T.....D..hhL.|
+00000a20  a2 01 c9 54 f0 0b a6 ba  c9 44 f0 05 68 68 4c 8d  |...T.....D..hhL.|
 00000a30  c4 8d 4d 02 a0 01 a9 01  20 ba ff a9 48 a0 c4 20  |..M..... ...H.. |
 00000a40  c4 c4 20 b8 c9 d0 07 ad  4d 02 c9 54 d0 ed ad 4d  |.. .....M..T...M|

 00000ad0  d0 06 20 72 cb 4c 8d c4  20 72 cb 20 e7 ff ad 4d  |.. r.L.. r. ...M|
 00000ae0  02 c9 44 f0 0f a9 23 a0  c4 20 91 c4 20 e4 ff f0  |..D...#.. .. ...|
-00000af0  fb 4c 8d c4 a9 00 20 bd  ff a9 0f a2 08 a0 0f 20  |.L.... ........ |
+00000af0  fb 4c 8d c4 a9 00 20 bd  ff a9 0f a6 ba a0 0f 20  |.L.... ........ |
 00000b00  ba ff 20 c0 ff a2 0f 20  c6 ff a0 00 20 cf ff c9  |.. .... .... ...|
 00000b10  0d f0 07 99 00 02 c8 4c  0a cb a9 5f 99 00 02 20  |.......L..._... |
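
If you'd rather script the change than poke at a hex editor, here is a minimal Python sketch of the same two patches. It assumes the binary is a PRG with a two-byte load address header (which is why the file offset is the address minus $c000 plus 2), and it refuses to patch unless the expected a2 08 is actually there. The helper names are hypothetical, and the demo runs on a zero-filled stand-in image rather than the real file.

```python
LDX_IMM_8 = bytes([0xA2, 0x08])   # LDX #$08 (hardcoded device 8)
LDX_ZP_BA = bytes([0xA6, 0xBA])   # LDX $BA  (location 186, last device used)


def file_offset(prg: bytes, address: int) -> int:
    """Map a C64 address to an offset within a PRG image."""
    load_addr = prg[0] | (prg[1] << 8)
    return address - load_addr + 2            # +2 skips the load address header


def patch_at(prg: bytearray, address: int, old: bytes, new: bytes) -> None:
    """Replace `old` with `new` at `address`, verifying the original bytes first."""
    if len(old) != len(new):
        raise ValueError("patch must not change the file length")
    off = file_offset(prg, address)
    if prg[off:off + len(old)] != old:
        raise ValueError(f"unexpected bytes at ${address:04x}")
    prg[off:off + len(new)] = new


# Stand-in image: $C000 load address, 3260 zero bytes, LDX #$08 planted
# at the two addresses seen in the disassembly above.
image = bytearray(2 + 3260)
image[0:2] = bytes([0x00, 0xC0])
for addr in (0xCA24, 0xCAF9):
    off = file_offset(image, addr)
    image[off:off + 2] = LDX_IMM_8

for addr in (0xCA24, 0xCAF9):
    patch_at(image, addr, LDX_IMM_8, LDX_ZP_BA)
    print(f"${addr:04x} -> file offset {file_offset(image, addr):#06x}")
```

The printed offsets should line up with the 0a26 and 0afb positions in the hexdump diff.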

Save that, and we'll generate a nice compressed single-parter with the linker and pucrunch:

% linkb ufl.bas ufl9.bin 
Commodore 64 BASIC Linker for Perl (c)2006, 2022 Cameron Kaiser.
Distributed under the Floodgap Free Software License; see documentation.
Use /home/spectre/bin/linkb --version for version string and help information.

Loaded 1068 bytes of BASIC text.
BASIC text allocated from 2049 ($0801) to 3116 ($0c2c).
Loaded ufl9.bin (3260 bytes) from 49152 ($c000) to 52411 ($ccbb).
4328 total bytes in raw binary data.
Creating runnable relocator.
Package successfully created.
% pucrunch -fshort apkg.prg ultrafnt.prg
Load address 0x0801=2049, Last byte 0x19a2=6562
Exec address 0x080d=2061
New load address 0x0801=2049
Interrupts enabled and memory config set to $37 after decompression
Runnable on Commodore 64
Checked: 4514 
Selecting the number of escape bits.. Selected 2-bit escapes
Optimizing LZ77 and RLE lengths...
Selecting LZPOS LO length.. Selected 8-bit LZPOS LO part
Note: Using option -m6 you may get better results.
In: 4514, out: 3516, ratio: 77.90% (6.24[5.76] b/B), gained: 22.11%
Gained RLE: 119 (S+L:119+0), LZ: 1256, Esc: -90, Decompressor: -284
Times  RLE: 35 (35+0), LZ: 690, Esc: 145 (normal: 1822), 2 escape bits
Saving C64 short
ultrafnt.prg uses the memory $2d-$30, $f7-$1d7, and $0801-$19aa.
Compressed 4514 bytes in 0.00 seconds (2844.00 kB/sec)

For the record, -m6 saved me nothing and -m5 saved me a whole stinking byte. There wasn't much speed difference with -ffast either for this small of a program. Ah, it runs, who cares.

So there's that. Armed with these changes, you should be able to generate your own single-parted self-documenting device-patched Ultrafont+ by following the same steps. Doodle!, on the other hand, was a little trickier.

My folks had gotten us a number of fun graphics programs for the Commodore 64, starting with a Koala Pad touch tablet and KoalaPainter on cartridge (by Audio Light), and eventually culminating in Inkwell Systems' Flexidraw, a light-pen driven system with separate drawing and colour modules. This was especially fitting since I grew up in San Diego, California, where Inkwell was based. Along the way we got Doodle, which is a rather capable hi-res drawing program controlled primarily by joystick and hotkeys using simple menus. Although for a certain period of time it was heavily advertised, and seemed very well regarded by reviewers, I know little about its history other than that it was originally written in 1983 by Mark R. Rubin and Omni Unlimited and distributed by City Software in Milwaukee, Wisconsin. I can't find any other program he wrote, they developed or they published (though there is an apocryphal mention that they may have been involved in PaperClip's distribution at one point), and Doodle advertising had disappeared from Commodore magazines by 1985.

The version we will be using here is "REV 2.0" but I haven't ever seen any earlier version, and even this version has a 1983 copyright date (unless they didn't change it). It loads several modules (the numerically third first, then first and second) and displays the welcome screen, which after a little snooping in the VICE monitor appears to be at $501a. The modules do not overlap, with a large segment under the BASIC ROM:

(C:$f13e) l "module03" 8
Loading module03 from 9000 to B55F (2560 bytes)
(C:$f13e) l "module01" 8
Loading module01 from 5240 to 55BF (0380 bytes)
(C:$f13e) l "module02" 8
Loading module02 from 8000 to 82BF (02C0 bytes)

At this point it doesn't load anything further (at least obligatorily) and the program is fully operational.
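
The load ranges VICE reported can be sanity-checked with a few lines of Python. This is just a sketch of the arithmetic: the ranges are copied from the monitor output above, the BASIC ROM window of $a000-$bfff is the standard C64 memory map, and the helper names are mine.

```python
# (start, end) inclusive, as reported by the VICE monitor.
MODULES = {
    "module03": (0x9000, 0xB55F),
    "module01": (0x5240, 0x55BF),
    "module02": (0x8000, 0x82BF),
}
BASIC_ROM = (0xA000, 0xBFFF)      # standard C64 BASIC ROM window


def size(rng: tuple[int, int]) -> int:
    """Number of bytes in an inclusive address range."""
    return rng[1] - rng[0] + 1


def overlap(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Number of bytes two inclusive ranges have in common (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


for name, rng in MODULES.items():
    print(f"{name}: ${rng[0]:04x}-${rng[1]:04x}, ${size(rng):04x} bytes, "
          f"${overlap(rng, BASIC_ROM):04x} under BASIC ROM")

names = list(MODULES)
clashes = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
           if overlap(MODULES[a], MODULES[b])]
print("overlapping modules:", clashes)
```

The sizes come out to the same hex byte counts the monitor printed, and only module03 reaches into the ROM window, which is why the BASIC ROM has to be banked out before jumping in.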

My first idea was to patch out the portions of the 75-block (!) booter that loaded these modules, link everything together, and use that as the one-part executable (the disk also contains printer drivers and sample images but we won't need those). With a little more work in VICE I was able to identify where the three modules loaded from (third module at $4ed5, then first module at $4ef7 and second at $4f14), but pre-loading the modules and turning those Kernal LOADs into no-ops caused the booter to abort with a "disk error" (despite the disk drive reporting 00, OK, 00, 00). Some additional disassembly, besides finding an odd routine at $2b47 that sends commands placed at $2b5c to the disk drive command channel and leaves it open, showed that the booter frequently interrogates the disk drive for its status. This didn't look like copy protection or obfuscation and seemed more like paranoia on Mr. Rubin's part, but it was similarly unwelcome for this purpose.

I dithered over patching all of those calls out as well and seeing what was left, but it dawned on me I had already determined Doodle's entry point, so all I should need to do was drop a breakpoint when it got there and save memory from $0801 to $b55f (the very end of module03), then jump back in at $501a with the BASIC ROM banked out.

This almost worked. It turns out that Doodle also drops some data in low memory: when switching to the zoomed-in pixel editor, the sprite used for the cursor was garbled. Looking at the VIC-II registers I could see we were in the lowest 16K bank and the cursor's sprite pointer (as determined by location 53272, indicating the start of screen memory) was pointing to location 832 in the cassette buffer, which we didn't save. I noted what was there with Doodle loaded normally and searched for it in the dump, finding it at $5440. I also noticed that our background and foreground colours in the "colour mode" were a bit scrambled; disassembly around $47b1 shows Doodle keeps this value as two nybbles at $08 in zero page and this had been earlier initialized to 1 (black on white) prior to the main entry point. Since we're going to need a fixup trampoline to start Doodle from the memory dump anyway, I made a mental note to add that change as well. There are probably some other low memory variables that aren't getting initialized but I didn't notice any other problems either.
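
For anyone following along, the sprite arithmetic and the search through the dump look roughly like this in Python. It is a sketch under assumptions: a sprite pointer addresses 64-byte blocks within the current VIC-II bank, the dump was saved from $0801 with a two-byte load address in front, and the dump here is a zero-filled stand-in with the first bytes of the cursor sprite planted at $5440. The function names are invented for the example.

```python
DUMP_START = 0x0801               # first address saved in the memory dump


def sprite_data_address(pointer: int, bank_base: int = 0) -> int:
    """Address of the 64-byte sprite block selected by a sprite pointer."""
    return bank_base + pointer * 64


def dump_offset(address: int) -> int:
    """File offset of a C64 address within the dump (2-byte header first)."""
    return address - DUMP_START + 2


def dump_address(offset: int) -> int:
    """Inverse of dump_offset()."""
    return offset - 2 + DUMP_START


# First bytes of the cursor sprite as seen at location 832.
CURSOR_HEAD = bytes.fromhex("00 00 00 00 00 00 0f ff f0 0f ff f0 0c 00 30")

# Stand-in dump covering $0801-$b55f with the sprite planted at $5440.
dump = bytearray(2 + (0xB55F - DUMP_START + 1))
dump[0:2] = bytes([0x01, 0x08])
start = dump_offset(0x5440)
dump[start:start + len(CURSOR_HEAD)] = CURSOR_HEAD

print(hex(sprite_data_address(13)))             # pointer 13, lowest bank
print(hex(sprite_data_address(0x51, 0x4000)))   # pointer $51, bank at 16384
print(hex(dump_address(dump.find(CURSOR_HEAD))))
```

Pointer 13 in the lowest bank lands on 832, which is exactly the cassette buffer region the dump doesn't cover.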

Now for the device number. Searching for SETLFS calls, I found multiple places where device 8 had been hardcoded (a2 08) and initially changed them to consult location 186 again (a6 ba), but Doodle went right on accessing device 8; location 186 somehow got reset somewhere, probably by some non-obvious disk call I hadn't found yet. One solution might have been to set a watchpoint to find where this happens and modify that, but that gets a lot of false hits in the Kernal. A simpler one is to copy the device number to another zero page address Doodle isn't using and read it from there. Location 190 ($be) was free and didn't seem to get changed by anything, so patching the dump looked like this (remember that the "addresses" are file offsets):

 00002320  01 d0 0b ad 78 08 c9 10  b0 03 ee 78 08 60 ce 77  |....x......x.`.w|
 00002330  08 60 ad 78 08 c9 01 d0  0b ad 77 08 c9 10 b0 03  |.`.x......w.....|
-00002340  ee 77 08 60 ce 78 08 60  48 a9 0f a2 08 a0 0f 20  |.w.`.x.`H...... |
+00002340  ee 77 08 60 ce 78 08 60  48 a9 0f a6 be a0 0f 20  |.w.`.x.`H...... |
 00002350  ba ff 68 a2 5c a0 2b 20  bd ff 4c c0 ff 49 10 05  |..h.\.+ ..L..I..|
 00002360  29 7f 4c 6b 2b 48 a9 93  20 d2 ff 68 0a a8 b9 00  |).Lk+H.. ..h....|

 00002490  08 a9 9b 20 5d 2b 4c 9d  2c a9 9c 20 5d 2b a9 99  |... ]+L.,.. ]+..|
 000024a0  20 5d 2b a9 01 20 47 2b  20 cc 25 90 03 4c c4 30  | ]+.. G+ .%..L.0|
-000024b0  a9 02 a2 08 a0 00 20 ba  ff a9 01 a2 e8 a0 30 20  |...... .......0 |
+000024b0  a9 02 a6 be a0 00 20 ba  ff a9 01 a2 e8 a0 30 20  |...... .......0 |
 000024c0  bd ff 20 c0 ff 20 34 26  90 03 4c c4 30 a2 02 20  |.. .. 4&..L.0.. |
 000024d0  c6 ff a0 03 a2 0b 18 20  f0 ff a9 97 20 d2 ff 20  |....... .... .. |

 00003b00  01 80 20 e4 ff d0 fb 4c  d4 42 c9 56 d0 d4 4c eb  |.. ....L.B.V..L.|
 00003b10  43 a9 01 8d 52 08 20 77  2c ad a7 08 f0 b7 20 29  |C...R. w,..... )|
-00003b20  44 20 10 0a a9 01 a2 08  a0 00 20 ba ff ad a7 08  |D ........ .....|
+00003b20  44 20 10 0a a9 01 a6 be  a0 00 20 ba ff ad a7 08  |D ........ .....|
 00003b30  18 69 02 a2 aa a0 08 20  bd ff a9 00 a2 00 a0 5c  |.i..... .......\|
 00003b40  20 d5 ff a9 01 20 c3 ff  20 3f 44 a9 0f 20 c3 ff  | .... .. ?D.. ..|
 00003b50  20 a4 0a 4c d4 42 a9 00  8d 52 08 20 77 2c ad a7  | ..L.B...R. w,..|
-00003b60  08 d0 03 4c d4 42 20 29  44 20 10 0a a9 01 a2 08  |...L.B )D ......|
+00003b60  08 d0 03 4c d4 42 20 29  44 20 10 0a a9 01 a6 be  |...L.B )D ......|
 00003b70  a0 01 20 ba ff a9 40 8d  a8 08 ad a7 08 18 69 04  |.. ...@.......i.|
 00003b80  a2 a8 a0 08 20 bd ff a9  00 85 1b a9 5c 85 1c a9  |.... .......\...|

 00003ba0  a9 0f 20 c3 ff 20 a4 0a  4c d4 42 a9 80 8d 52 08  |.. .. ..L.B...R.|
 00003bb0  20 77 2c ad a7 08 d0 03  4c d4 42 20 29 44 a9 0f  | w,.....L.B )D..|
-00003bc0  20 c3 ff a9 0f a2 08 a0  0f 20 ba ff a9 53 8d a8  | ........ ...S..|
+00003bc0  20 c3 ff a9 0f a6 be a0  0f 20 ba ff a9 53 8d a8  | ........ ...S..|
 00003bd0  08 ad a7 08 18 69 04 a2  a8 a0 08 20 bd ff 20 c0  |.....i..... .. .|
 00003be0  ff 20 3f 44 a9 0f 20 c3  ff 4c d4 42 a9 1f 20 5d  |. ?D.. ..L.B.. ]|
 00003bf0  2b 20 e4 ff f0 fb c9 59  f0 03 4c d4 42 20 29 44  |+ .....Y..L.B )D|
-00003c00  a9 0f 20 c3 ff a9 0f a2  08 a0 0f 20 ba ff a9 56  |.. ........ ...V|
+00003c00  a9 0f 20 c3 ff a9 0f a6  be a0 0f 20 ba ff a9 56  |.. ........ ...V|
 00003c10  8d a8 08 a9 01 a2 45 a0  44 20 bd ff 20 c0 ff 20  |......E.D .. .. |
 00003c20  3f 44 a9 0f 20 c3 ff 4c  d4 42 a9 01 20 47 2b 20  |?D.. ..L.B.. G+ |
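
All of the patched spots above share the same shape: LDX #$08, then LDY with some value, then JSR $FFBA. That makes them easy to rewrite in one pass. Here is a hedged Python sketch using a bytes regular expression; repoint_device is a hypothetical helper, the default of $be is the spare zero page location chosen above, and the sample bytes are copied from the first diff. Because this is plain pattern matching, it is worth reviewing the hit count against what you found by hand before trusting the output.

```python
import re

# LDX #$08, LDY #<anything>, JSR $FFBA. DOTALL lets "." match byte $0a too.
SETLFS_DEV8 = re.compile(rb"\xa2\x08(\xa0.\x20\xba\xff)", re.DOTALL)


def repoint_device(data: bytes, zp: int = 0xBE) -> tuple[bytes, int]:
    """Turn each matched LDX #$08 into LDX <zp>; return (patched, hit count)."""
    return SETLFS_DEV8.subn(lambda m: bytes([0xA6, zp]) + m.group(1), data)


# Bytes from the routine at file offset $2348 in the first diff above.
sample = bytes.fromhex("48 a9 0f a2 08 a0 0f 20 ba ff 68")
patched, hits = repoint_device(sample)
print(hits, patched.hex(" "))

# An LDX #$08 that is not followed by LDY/JSR $FFBA is left alone.
untouched, misses = repoint_device(bytes.fromhex("a2 08 c9 44"))
print(misses, untouched.hex(" "))
```

The replacement is the same length as the original, so no file offsets move.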

With that, we can now construct our trampoline. This assembles with xa and automatically links in the memory dump (stored as doodledump). I also added a "DECOMPRESSING DOODLE" message since we're in screen memory range anyway.

        .word $0340
        * = $0340

; sprite data for zoom mode
; copied from ($51 = 81 * 64 + 16384 = $5440)

.byt $00, $00, $00, $00,  $00, $00, $0f, $ff,  $f0, $0f, $ff, $f0,  $0c, $00, $30, $0c
.byt $00, $30, $0c, $00,  $30, $0c, $00, $30,  $0c, $00, $30, $0c,  $00, $30, $0c, $00
.byt $30, $0c, $00, $30,  $0c, $00, $30, $0c,  $00, $30, $0c, $00,  $30, $0c, $00, $30
.byt $0f, $ff, $f0, $0f,  $ff, $f0, $00, $00,  $00, $00, $00, $00,  $00, $00, $00, $00

        ; enter fixup code at $0380 = 896
        ; set up default screen
        jsr $ff84       ; this resets $01
        lda #$36
        sta $01
        lda #12
        sta $d020
        lda #1
        sta $d021

        ; stash current device (186 gets overwritten)
        lda 186
        sta 190

        ; set ink to white background, black ink (see routine ~$47b1)
        lda #$01
        sta $08

        ; call the screen display routine $2b5d and enter main loop
        jmp $501a

eot     = *

        ; print helpful message
        * = 1024+80+2
        .dsb (*-eot),32
        * = 1024+80+2

        .asc "decompressing doodle ..."

eom     = *

        ; account for starting address
        * = $07ff
        .dsb (*-eom),32
        * = $07ff

.bin 0,0,"doodledump"

The screen portion requires the xa feature to transform ASCII into Commodore screen codes, so don't forget to add -O PETSCREEN to select the proper character set when assembling. We then pass a couple special options to pucrunch when generating the compressed single-parter.

% xa -O PETSCREEN -o dumple.o runner.xa
% pucrunch +f -ffast -c64 -g54 -x896 dumple.o 1doodle
Discarding execution address 0x080d=2061
Load address 0x0340=832, Last byte 0xb560=46432
Exec address 0x0380=896
New load address 0x0801=2049
Interrupts enabled and memory config set to $36 after decompression
Runnable on Commodore 64
Checked: 45601 
Selecting the number of escape bits.. Selected 2-bit escapes
Optimizing LZ77 and RLE lengths...
Selecting LZPOS LO length.. Selected 8-bit LZPOS LO part
In: 45601, out: 18051, ratio: 39.59% (3.17[3.12] b/B), gained: 60.42%
Gained RLE: 12424 (S+L:1119+11305), LZ: 15780, Esc: -320, Decompressor: -332
Times  RLE: 187 (182+5), LZ: 4662, Esc: 512 (normal: 7920), 2 escape bits
Saving C64 fast
1doodle uses the memory $2d/$2e, $f7-$1cd, $200-$234, and $0340-$b56a.
Compressed 45601 bytes in 0.02 seconds (2113.24 kB/sec)

The -x896 option makes the cruncher ignore the existing entry point and use our trampoline in low memory as the execution address, and -g54 ($36) sets banking correctly on exit. As part of the process, our message gets splashed on screen while the program decompresses.

The best tools are the ones that aren't in your way.
