Saturday, February 10, 2024

CAP-X and COMP-X: how the Tandy Pocket Computers got a sucky Japanese assembler

I grew up primarily with the Commodore 64, where if you wanted to do anything really cool and useful, you had to do it in 6502 assembly language. Today I still write 6502 assembly, plus some Power ISA and even a little TMS9900. I like assembly languages and how in control of the CPU you feel writing in one. But you know what would make me not like an assembly language? One that was contrived and not actually the CPU it was running on. And you know what would make me like it even less? If it were kneecapped, convoluted and limited without even proper I/O facilities.

But this particular odd little assembler dialect had the bureaucratic weight of the Japanese government behind it, because in 1969 what was then the Ministry of International Trade and Industry (MITI, 通商産業省) developed a completely artificial processor architecture to help ensure everyone taking the Information Technology Engineer Examination (情報処理技術者試験) would do so on an even footing. No one would have been an expert in this architecture or how to program it because we just made it up, reasoned the Ministry, so therefore no one will have an unfair advantage on the test.

Of course, that lasted only a few years before the specifications got out, and soon afterwards a handful of Japanese manufacturers had added it to their computers as a feature — including their pocket computer line. Through the magic of Tandy badge engineering, two of them made it to Radio Shack stores in the United States in the mid-1980s, perplexing a generation of larval nerds like me who couldn't understand what the heck it was doing there. While it was no secret the Tandy PC-5 and Tandy PC-6 Assembler feature was a fake, few people knew its history or ever did a detailed exploration. Let's dig into the dark and gloomy corners of this utterly bogus virtual CPU that a few real computers ran — sort of — and write our own cross-assembler and virtual machine so that future geeks can be just as befuddled.

In 1970 (昭和45年) the Japanese National Diet passed the Promotion of Information Technology Act (情報処理の促進に関する法律), which among other changes formalized a certification examination for "information technology engineers" (情報処理技術者). This test, in an updated form, is still administered today and is the second-most taken national exam in Japan after the driver's license examination, with as many as half a million people sitting for it annually. Those who pass the Information Technology Engineer Examination (henceforth ITEE) receive an Information Technology Engineer Examination Certificate (情報処理技術者試験合格証書) from the Minister of what is now called the Ministry of Economy, Trade and Industry (経済産業省), the successor to MITI. Originally created by then-MITI and today maintained by METI's subordinate Information Technology Promotion Agency, the ITEE nowadays is administered in two halves over an entire day and comes in four skill levels. While an ITEE certificate is not a legal requirement to do IT work in Japan, it is considered very helpful and a gateway to other exams or educational opportunities, and public institutions hiring such positions will often list one or more of the levels (and/or subcategories) as a prerequisite. A smattering of countries outside Japan even accept it as a credential.

Unlike many industry certification exams, however, the ITEE doesn't primarily deal in specific vendors or platforms; instead, it seeks to test more generalized knowledge such as strategy, project management, data structures and algorithms. There were only two tiers when the test was first established in 1969 (prior to the Act), namely class I (advanced) and class II (basic), today combined as subcategories in the modern level two exam.

As part of the test, both exam classes allowed you to select questions in the programming language you felt most competent at, which at that point consisted of Fortran, COBOL, PL/I, ALGOL and assembly language. To eliminate an examinee's knowledge of a particular CPU as being an advantage (or disadvantage), the assembler questions instead used a simple and completely contrived virtual machine that the examinee was expected to learn on the fly. This virtual machine was called COMP-X, and its mnemonics and assembler syntax were collectively called CAP-X.

While COMP-X and CAP-X were always optional on the class II exam and you could skip them by choosing one of the other languages, in 1977 they became mandatory for the class I. The pass rate was never very high (even today it hovers around the ten percent range or so) and until 1979 there wasn't even a reference text for COMP-X that you could study from. You only got to see it the day of, and the only computer that it ran on then was between your ears with a pencil as your co-processor. Since you were doing all the number crunching on paper, there weren't any I/O instructions either since they wouldn't be needed to answer the exam questions.

Probably the first actual computer that could run CAP-X code — in a simulator, mind you — was one of the OKITAC-4300 line, which stood for Oki Transistor Automatic Computer. Oki Electric's OKITAC series was diverse, starting in 1960 with the 4000-transistor OKITAC-5080, the first Japanese domestic computer to use core memory, and expanding to large systems and mainframes including the planned 1971 OKITAC-8000 that was 32-bit, dual CPU and supported up to a full megaword of total memory to compete directly with IBM's System/360.

However, although it was completed, the 8000 couldn't be released due to an existing joint venture arrangement with Sperry Rand (now part of Unisys), with which it would also compete. During the 8000's ultimately doomed development Oki determined that the minicomputer market could give them another way out. The 1968 OKITAC-4300 was the first of these smaller machines, a 16-bit system supporting up to 32kW of core memory and running at about 667kHz, and retailing for the comparatively inexpensive equivalent of around US$10,000 (in 2024 about $88,000). It became a successful architecture for Oki and 4300-class systems were manufactured well into the early 1980s, the last being the 1980 4300c shown here in a 2010 Oki product retrospective.

This particular machine is a 1974 OKITAC-4300b, the first with an LSI CPU, offering multiplication and division instructions standard instead of an option. It was also the first OKITAC-4300 system that could use conventional RAM chips instead of core, though the core variant was noticeably faster (1.667MHz compared to 1.429MHz). As a public service, Oki Electric partnered with a study group in Akashi in October 1979 to develop a COMP-X simulator on a 4300b and allow study group members to send in answers to practice questions for execution. COMP-X and the OKITAC LSI CPU weren't especially similar architecturally other than both being 16-bit and having some similar minicomputer idiosyncrasies — which we'll discuss — but the CAP-X instruction set was small enough to make a full software model practical. Because there was no other way to remotely debug sample code, the simulator added unofficial read and write instructions to directly deposit and examine data in COMP-X registers.
With the specification now public in major Japanese computer magazines like I/O, there were certainly other software implementations for the home and personal computers of the time such as this one for UCSD Pascal, but the architecture became a potential selling point for other small systems — really small ones. Indeed, probably CAP-X's most famous commercial implementations were in pocket computers.

It might be instructive first to talk about pocket computers as a class. Pocket computers and handheld computers occupied a strange niche in the 1980s and very early 1990s sandwiched between calculators and more conventional portable computers (laptops, portable workstations and so forth), largely using bespoke low-power CPUs and small, sometimes single-line LCD screens. There is somewhat of a continuum between pockets and handhelds; pockets tended towards portability at the cost of keyboard or computing power while handhelds tended towards computing power at the cost of size, so on the "definitely a pocket computer" side would be things like the Casio PB-100 and Sharp PC-1250, and on the "definitely a handheld computer" side would be things like the Kyotronic 85 family (TRS-80 Model 100, NEC PC-8201A, etc.), the Texas Instruments CC-40 and the Canon X-07. However, there were larger pocket computers powerful enough to rival handhelds such as the Sharp PC-1500, the Casio PB-2000C, the Panasonic HHC-4 and the Texas Instruments TI-74, and a few full-size handhelds that had computing capabilities more typical of pocket computers, like the VTech Laser 50. As with most computers of this era, the vast majority of both computing classes provided BASIC as their primary programming language.

While many companies made pocket-class machines, Sharp and Casio were probably the best known and most prolific, and their devices are the ones most commonly encountered in the West. They were also cloned and rebadged both domestically and abroad, and both companies made pocket computers that implemented CAP-X compatibility as a secondary feature. Sharp's 1985 PC-1440 was a 4K device (3500 bytes free) using the 8-bit Hitachi SC61860 CPU employed in some of Sharp's other systems. The PC-1440's CPU could be directly programmed with BASIC PEEK, POKE and CALL, similar to their bigger and more powerful PC-1500, but the only advertised assembly language feature on the PC-1440 was CAP-X. In 1986 Sharp made a cheaper variant with slightly less memory, the PC-1416G. Both systems prominently badged their case with their CAP-X capability but neither of them was widely seen outside of Japan.

Casio's units became much better known, however, not least because of who imported them. These early-generation Casio pocket computers were based around the Hitachi 8-bit HD61700 and 4-bit HD61900 microcontroller families, such as the popular PB-100 and its many relatives which used a 200kHz HD61913. The microcontrollers incorporated the CPU, mask ROM, I/O (primarily for the keyboard) and an LCD controller into a single monster IC, with most of the RAM usually in separate chip(s). Casio's CAP-X compatible line started with the 1985 FX-770P, a clamshell folding device with 2K of RAM, expanding to the 1985 FX-780P with 4K, the FX-781P with 2K again (but with a 2K expansion option), the FX-785P also with 2K (but an 8K expansion option), the 1986 FX-790P with 8K (and the 8K option), and the FX-791P with 10K (and the 8K option also). All of these machines had exactly the same form factor and LCD with mostly the same onboard software and keyboard, differing only in memory size and in the Data Bank features of the FX-785P and up. Unlike Sharp's entries, however, the Casio devices carried no mention of CAP-X or COMP-X on the case, merely that they had an "Assembler" mode.

I mentioned that both Sharp and Casio pocket computers were cloned and rebadged, and at least in the United States and for most countries that had a Radio Shack retail presence, the most significant of these rebadgers by far was the Tandy Corporation. While Tandy did have internal development resources, they generally preferred to expand their product line through aggressive badge engineering instead, selling a smattering of relabeled import consumer electronics direct to customers in Radio Shack stores. Tandy's first pocket computer, the 1980 PC-1, was a rebadge of the Sharp PC-1211, the direct successor and upgraded RAM version of the PC-1210, the first "true" pocket computer. Tandy selected two more Sharp units to rebadge (and one more several years later) before migrating to Casio's cheaper line in 1983 with the PC-4 (PB-100) and finally the 1985 PC-5 and 1986 PC-6 (from the FX-780P and FX-790P respectively), the units that came to puzzle young me reading about them in the Radio Shack catalogue. And that's how CAP-X came to America.

Tandy's interest probably most came from the FX-770P series' laptop-like form factor (here showing the PC-5/FX-780P compared with an M1 MacBook Air), but as 2K of RAM wasn't much of an upgrade over the 1K base PC-4 which Tandy was still selling, the company instead chose to start with the midrange 4K FX-780P. The arrangement was particularly convenient for Tandy as Casio had fitted the FX-770P series with the same I/O port as the PB-100 and relatives, so the same tape interface and printer Tandy was already selling for the PC-4 would work for the PC-5 and PC-6, though they had to use a rather clumsy link cable to connect them. The FX-780P was sold officially as the Tandy Pocket Scientific Computer PC-5 to emphasize its scientific and statistical features (as was the PC-6), and the PC-5 will be the unit we'll concentrate on today since it was the first on these shores.
Unfortunately, you can't really use it like a laptop (compare with the Tandy 600) because the alphabet, symbol and some function keys are on an obnoxious membrane on the top, with number, scientific and the other function keys on more proper Chiclets on the bottom. This really makes thumb typing difficult, by the way.

It's also not advisable to put it totally flat because the tiny hinge covers on these things are under enormous tension, many have cracked already, and they're exceptionally difficult to repair. Once they do crack, the hinge pin won't stay in and starts putting pressure on the ribbon cable between the two sides (I've had this happen to me as well). Be sure either to put finger pressure over the hinge covers as you open it to reinforce them, or cover them with a strip of sturdy tape connecting the two halves. Don't you keep the Commandments?

The PC-5 has 4K of RAM on board of which 3,552 bytes are available to the user (the 8K PC-6 in the base configuration has 7,520 bytes free). A memory-remaining counter appears during program entry. The 3,552 bytes can be used between the 10 BASIC program spaces, which serve more or less as a primitive fixed filesystem, or the Data Bank, which is effectively an on-board text editor where lines can also be optionally treated as records with comma-separated fields.
The FX-770P family has two HD61747 microcontrollers in the top half that operate independently, the left-most CPU (the lower "B" number, indicating its particular mask ROM identifier) being the primary processor. It should be noted that this is not a multiprocessing system because only one CPU is running at any given time, regularly switching to the other. This is not an unusual thing to find in pocket computers when there are significant I/O or ROM requirements: for example, the O.G. Sharp PC-1210 and PC-1211 have two CPUs as well which also switch back and forth because there wasn't enough ROM capacity in each individual microcontroller by itself. Each HD61747 contains its own ROM, controls particular parts of the LCD and services particular keys, communicating with the other CPU through reserved memory locations in the common RAM pool.

In the PC-5/FX-780P, the HD61747B10 on the left is the main CPU and contains the ROM for BASIC. It services the lower keyboard and maintains LCD character positions 1-6 and 13-18. The HD61747B11 on the right is the secondary CPU; it contains the ROM for the calculator, CAP-X/COMP-X and the Data Bank, and also drives external I/O devices. It services the upper keyboard and maintains LCD character positions 7-12 and 19-24. The PC-6/FX-790P has HD61747B20/B25 CPUs which are functionally equivalent to the B10/B11 except for different ROMs; the PC-5 lacks the Data Bank search mode and obviously has a lower memory size and ceiling. (Compare with the pathetic Tandy PC-7 which has only one CPU, and thus lacks half the ROM, half the display and many of the PC-5's built-in features, including the I/O port!)

The lower half contains RAM, interface logic, the I/O connector and the power supply circuit. In the PC-5/FX-780P, RAM is provided by four HD61914C SRAM chips containing 1K each. The HD61914 is also the RAM chip used in the 4-bit PC-4/PB-100; despite the HD61747 being an 8-bit CPU internally, it only exposes four data lines on the bus which is the same as these accept. The lower board in the PC-5 has an expansion connector that would seem to allow more RAM, probably with something like a 4K OR-4, but the connector is blocked off with a black plastic adhesive pad which also covers two of the SRAMs. I removed the pad for this photograph. The PC-6 has a different connector here for Tandy's 8K OR-8 equivalent and uses a single Hitachi HM6264A 8K SRAM instead of eight HD61914s for its base memory.
Although the Sharp-derived Radio Shack TRS-80 PC-2 (PC-1500) had documented PEEK, POKE and CALL commands in BASIC to run machine language programs on its 8-bit Sharp LH5801 CPU, the only pocket computers Radio Shack advertised as having an assembly language feature were the PC-5 and PC-6 — which as we know by now has bupkis to do with their actual processors.

I think I've tantalized you enough now with the history, so let's finally get to the COMP-X virtual machine itself. I've included photographs of pages from the US domestic Tandy PC-5 manual. Amusingly it comes straight out and says that the assembler is a simulation, but nowhere does it mention where the assembly language dialect comes from — and why would it? Hardly anybody in the United States would have even heard of this exam!

The COMP-X virtual machine is a 16-bit von Neumann architecture using signed two's-complement arithmetic in which program code and data share the same memory and all memory access is by word (no bytes). Although by convention the most significant bit of a COMP-X 16-bit word is numbered 0, there is no way to directly access the internal structure of a word in pieces, and the VM therefore has no intrinsic endianness. With the exception of a 1-bit condition code register (CC), which is really just a flag containing the sign bit from adds and subtracts, all registers are 16-bit as well. More or less visible to the programmer are the condition code register (CC); the sequence counter (SC), better understood as the program counter; the base register (BR), which is used as the MSB of the effective address (more on that shortly); and four general purpose registers numbered GR0 through GR3. There are no floating point registers, no vector registers and no stack pointer, and things like interrupts and NMIs are not defined in the spec. Execution begins where you tell it to and the initial contents of the GPRs and memory are undefined (in practice no implementation clears them).

Two additional 16-bit registers are really an implementation detail and aren't observable by user programs: the instruction register (IR) contains the value of the currently executing instruction, and the operand register (OR) is internal storage for words fetched from memory prior to operation. Since this is all crap anyhow, we'll go ahead and model these accurately in our reimplementation later, but they could just as easily be elided away.
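
As a rough sketch of the register file just described, here's how the state might be modeled. Python is my own choice for illustration, and the class, field and method names are hypothetical rather than anything from the spec. I've filled the registers with zeroes purely for convenience, since the spec leaves their initial contents undefined.

```python
from dataclasses import dataclass, field

WORD_MASK = 0xFFFF  # every COMP-X word and register (except CC) is 16 bits


def to_signed(word: int) -> int:
    """Interpret a 16-bit word as a two's-complement signed value."""
    word &= WORD_MASK
    return word - 0x10000 if word & 0x8000 else word


@dataclass
class CompXState:
    # Hypothetical names; zero-fill is an assumption, the spec says "undefined".
    gr: list = field(default_factory=lambda: [0, 0, 0, 0])  # GR0..GR3
    sc: int = 0   # sequence counter (the program counter)
    br: int = 0   # base register; its low byte is always zero
    cc: int = 0   # 1-bit condition code: sign of the last add/subtract
    ir: int = 0   # instruction register (internal, not observable)
    opr: int = 0  # operand register "OR" (renamed: `or` is a Python keyword)

    def add(self, r: int, operand: int) -> None:
        """Add a fetched memory operand to GRr, wrapping to 16 bits."""
        self.opr = operand & WORD_MASK
        self.gr[r] = (self.gr[r] + self.opr) & WORD_MASK
        self.cc = (self.gr[r] >> 15) & 1  # CC mirrors the result's sign bit
```

Adding 1 to 0x7FFF in this model wraps to 0x8000, which reads as -32768 when treated as signed, and sets CC.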

The instruction set is extremely simple and strongly influenced by minicomputer architectures of the era, with which it has many presumably intentional similarities. (A thought about this at the very end.) There are only 14 major instruction types supported by the VM, and originally just twelve. I should note that the history of CAP-X/COMP-X prior to 1979 or so is very murky, so for convenience I've chosen to treat the spec as springing fully formed from the head of the Emperor even though it may have gone through various revisions now lost to history. The architecture is not canonical load/store because one of the operands usually comes from memory (i.e., the OR), with the exception of the I/O, shift (SFT) and load immediate (LAI) instructions that use immediates. Addition, subtraction and bitshifts are all signed.

Despite its simplicity, this set is enough to be Turing-complete and to do all standard operations (multiplication, for example, is just repeated addition, and division is repeated subtraction), though of course doing so is not necessarily efficient. One of the biggest gaps is a complete lack of register-to-register transfers, invariably requiring a trip to memory. For logical operations, the VM offers only logical-AND and exclusive-OR, which are not functionally complete by themselves and can't compute all possible Boolean operations. With logical-NOT and logical-AND we can be functionally complete (effectively logical-NAND, which is functionally complete all on its own), and we can use exclusive-OR with a true value to implement logical-NOT, but we have to have a constant and/or burn a register for that (load it with all one bits). Likewise, comparisons can be implemented with subtraction, but that also requires burning a register to receive the result. I'll show you some examples in a moment.
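
To make the functional-completeness argument concrete, here is a minimal sketch in Python of building NOT and OR out of nothing but AND, EOR and an all-ones constant, plus multiplication as repeated addition. The function names are mine and purely illustrative; on the real VM each step would also involve trips through memory.

```python
MASK = 0xFFFF
ALL_ONES = 0xFFFF  # the constant you'd have to keep in memory or burn a register on


def c_and(a: int, b: int) -> int:
    return a & b & MASK


def c_eor(a: int, b: int) -> int:
    return (a ^ b) & MASK


def c_not(a: int) -> int:
    # NOT x is x EOR all-ones
    return c_eor(a, ALL_ONES)


def c_or(a: int, b: int) -> int:
    # De Morgan: a OR b == NOT(NOT a AND NOT b)
    return c_not(c_and(c_not(a), c_not(b)))


def c_mul(a: int, n: int) -> int:
    # Multiplication as repeated addition (n >= 0), wrapping to 16 bits
    total = 0
    for _ in range(n):
        total = (total + a) & MASK
    return total
```

Note that the OR alone costs three NOTs and an AND, which is why nobody would call this efficient.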

I mentioned that the architecture lacks a hardware stack, but this wasn't unusual in minicomputer architectures of the time either, such as the famous DEC PDP-8. For subroutine calls the PDP-8 gets around this problem by depositing the return address in the first word of the destination routine, and returns jump back out through that value. COMP-X is somewhat more advanced in that the JSR opcode lets you designate any of the four GPRs as a link register; the current program counter plus one is put into the specified register and the new program counter and base register are loaded from the specified word in memory. However, also like the PDP-8, this scheme doesn't allow a subroutine to be recursive without additional work. To return from a subroutine, the temporary link register needs to be deposited in memory somewhere so that JSR (which does double duty as the call instruction and the return instruction — in returns, the address previously stored to is provided as the branch target and the new "link register" is never stored anywhere) can branch to it, and if this location does not change between calls, it will be overwritten by later calls just like the PDP-8's will.
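
The call-and-return dance can be sketched like so in Python. This is only my reading of the description above: I'm assuming the link register receives the full 16-bit program counter plus one, and that the base register takes the high byte of the fetched target word. The dict layout, addresses and function names are all made up for the example.

```python
MASK = 0xFFFF


def jsr(state: dict, mem: list, link_reg: int, ea: int) -> None:
    """Sketch of JSR: save the return address, then load SC and BR from mem[ea].

    `state` is a plain dict with keys 'gr' (four words), 'sc' and 'br'.
    """
    state["gr"][link_reg] = (state["sc"] + 1) & MASK  # address after the JSR
    target = mem[ea] & MASK
    state["sc"] = target
    state["br"] = target & 0xFF00  # assumption: BR is the target's high byte


def call_and_return_demo() -> tuple:
    mem = [0] * 0x200
    mem[0x0040] = 0x0100           # word holding the subroutine's address
    state = {"gr": [0, 0, 0, 0], "sc": 0x0005, "br": 0x0000}

    jsr(state, mem, 3, 0x0040)     # call: GR3 becomes the link register
    entered_at = state["sc"]

    mem[0x0150] = state["gr"][3]   # subroutine parks the return address in memory
    state["sc"] = 0x0120           # pretend the body ran up to this point
    jsr(state, mem, 2, 0x0150)     # return: GR2 gets a "link" nobody will use
    return entered_at, state["sc"], state["br"]
```

If the parking spot at 0x0150 were reused by a nested call to the same routine, the first return address would be lost, which is exactly the recursion problem.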

Three of these instructions are unique to COMP-X. The HJ (Halt and Jump) instruction ceases execution, leaving the program counter pointing to the specified location. In some of our examples we'll abuse this feature to serve as a return value. The other two, READ and WRITE, respectively accept a value from the user into the specified GPR and display that GPR's contents on the screen, in the radix requested (only decimal and hexadecimal were supported). On the pocket computer implementations, these instructions necessarily pause execution to either allow data entry or to make sure you can read the value on the single line LCD before it gets replaced by something else. They were the only I/O opcodes in the instruction set after their appearance in the OKITAC-4300b simulator, meaning you can't write a COMP-X "hello world" program in the traditional sense (we'll address this). To the best of my knowledge, the I/O instructions never actually appeared on any version of the ITEE and were only ever artifacts of the unofficial computer implementations.

Instructions, like data, are 16 bits wide. The opcode is in the most significant nybble, then the GPR being referenced (2 bits), then the GPR being used as an index (2 bits), and then an eight-bit operand, which is usually an address. Like the much later PowerPC, COMP-X has a mscdfr0 "means something completely different for r0" phenomenon in that GR0 can never be an index register: if the specified index register is 0, then the index is literally 0, not the value of GR0 (like those instructions in Power ISA where specifying zero for the register is treated as zero, such as addi).
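
Here's a small Python sketch of that bit layout (4-bit opcode, 2-bit GPR, 2-bit index, 8-bit operand). The helper names are hypothetical, and the opcode number used in the usage note is an arbitrary placeholder, not a real COMP-X opcode assignment.

```python
def encode(opcode: int, gr: int, xr: int, operand: int) -> int:
    """Pack the four fields into one 16-bit instruction word."""
    if not (0 <= opcode <= 0xF and 0 <= gr <= 3
            and 0 <= xr <= 3 and 0 <= operand <= 0xFF):
        raise ValueError("field out of range")
    return (opcode << 12) | (gr << 10) | (xr << 8) | operand


def decode(word: int) -> tuple:
    """Split a 16-bit word into (opcode, gr, xr, operand)."""
    return (word >> 12) & 0xF, (word >> 10) & 0x3, (word >> 8) & 0x3, word & 0xFF


def index_value(gr_file: list, xr: int) -> int:
    """An index field of 0 means a literal zero, never the contents of GR0."""
    return 0 if xr == 0 else gr_file[xr]
```

For instance, a placeholder opcode 0xA with GR1, index GR2 and operand 0x34 packs to 0xA634, and decoding that word gives the same four fields back.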

You might wonder how an eight-bit address can specify a 16-bit address, and the answer is the base register. Although the base register is notionally 16 bits too, its least significant byte is always zero. Once the address is computed from the address field and the index register (or 0), then its LSB is combined with the MSB of the base register to yield the effective address. As it happens, only a jump to a subroutine with JSR can change the base register; no other instruction can modify it. This seems to be a concession to allow subroutines some sort of data protection from their callers (that is, as long as they aren't called recursively or reentrantly), since one page can't access another page directly without calling into that page, but it also has some important implications when we get to edge cases in the VM like when an address or the program counter wraps.
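
Sketched in Python, the effective address calculation might look like this. The function name is mine, and wrapping the sum within the page is my interpretation of "its LSB is combined with the MSB of the base register."

```python
def effective_address(br: int, addr_field: int, xr: int, gr_file: list) -> int:
    """Combine the 8-bit address field, optional index and base register.

    An index field of 0 contributes a literal zero rather than GR0.
    """
    index = 0 if xr == 0 else gr_file[xr]
    low = (addr_field + index) & 0xFF  # only the low byte of the sum survives
    return (br & 0xFF00) | low         # page number comes from the base register
```

With the base register at 0x0100, an address field of 0xF0 and an index register holding 0x20, the sum overflows the byte and lands at 0x0110 on the same page instead of spilling into the next one.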

However, you won't need much of the base register's range anyway because the memory model in the Casio units is very constrained. Only up to 512 words' addressing space is supported on the 4K PC-5 regardless of how many bytes are currently available (i.e., just two pages), and even the 8K PC-6 only supports up to 2048 words (eight). Plus, you'll only get that on either machine if that much memory is actually free — if you have less than 512 bytes of free memory, the assembler will refuse to run at all! (At least memory expansion can help on the PC-6.) When we get to reimplementing the VM later in this article, you'll be able to use a full 64kW.

Because the 8-bit load immediate instruction is unsigned and clears the MSB of the destination register, and there is no way to directly OR or EOR a second immediate value after a bitshift (a la the load upper-lower or load-shift-or pattern many RISCs use), loading a full 16-bit value or any negative value will always require a fetch from memory. In practice it's easier just to load all but small trivial non-negative values from a separate constant pool.
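
As an illustration of that rule of thumb, this hypothetical Python helper decides whether a value fits in an LAI or has to come from a constant pool, and emits CAP-X-style source text for either case. The label K0 and the helper name are made up, and I'm assuming negative constants are written as their 16-bit two's-complement hexadecimal word.

```python
def load_plan(value: int, reg: int, label: str = "K0") -> tuple:
    """Return (code_line, pool_line) for getting `value` into GR`reg`.

    pool_line is None when an 8-bit unsigned LAI immediate is enough.
    """
    if not -0x8000 <= value <= 0xFFFF:
        raise ValueError("value does not fit in a 16-bit word")
    if 0 <= value <= 0xFF:
        return f":LAI:{reg}:{value}", None
    word = value & 0xFFFF  # two's-complement word for negative values
    # The CONST line belongs down in the data area, not right after the LD
    return f":LD:{reg}:{label}", f"{label}:CONST:{word:04X}"
```

So 100 gets a plain LAI, while 28672 needs an LD from a word defined as CONST 7000.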

The CAP-X assembler format is similar to that of other assemblers, with fields for the line label (three characters max and the first one must be a capital letter), the instruction mnemonic, the referenced GPR, the index GPR and an eight-bit immediate or address. However, unlike many other assemblers, the colon separates fields, and whitespace other than newlines is ignored; in our examples here we'll line things up nicely, but this is not required and in fact wastes memory. Memory pressure also explains why comments aren't supported.

Not all fields need be specified: the index GPR is optional and interpreted as zero if left out, and if the label field is left blank, no label is generated (though you still need the leading colon). As such, the particular program line in this photograph defines a label L1 at the current address and is a load instruction putting the contents of memory location M (previously defined as a label) into GR0. If there were an index register, it would follow the address in CAP-X even though it's not encoded that way in the instruction.
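
A minimal Python sketch of splitting one such source line into its fields follows. The function name and returned dict layout are mine, and I'm assuming the label characters after the first are capital letters or digits, which the source doesn't actually spell out.

```python
import re


def parse_line(text: str) -> dict:
    """Split one CAP-X source line into label, mnemonic and arguments.

    For instructions the arguments are GPR, address, then the optional index.
    """
    fields = re.sub(r"\s+", "", text).split(":")  # whitespace is ignored
    if len(fields) < 2:
        raise ValueError("need at least the leading colon and a mnemonic")
    label, mnemonic, *args = fields
    # Assumption: up to three characters, capital letter first
    if label and not re.fullmatch(r"[A-Z][A-Z0-9]{0,2}", label):
        raise ValueError(f"bad label: {label!r}")
    return {"label": label or None, "mnemonic": mnemonic, "args": args}
```

Feeding it `L1 :LD   :0:M` yields the label L1, the mnemonic LD and the arguments 0 and M, while a line starting with a bare colon comes back with no label.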

CAP-X also specifies a handful of assembler pseudo-ops for setting code location (START), marking the end of subroutines and main code (END), reserving memory words (RESV), and embedding constants (CONST for literal constants, ADCON for addresses and labels) in the emitted object code. Strangely, while CONST takes its argument in hexadecimal, the other pseudo-ops take arguments in decimal.
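
The radix quirk is easy to trip over, so here's a tiny Python sketch of how a cross-assembler might read numeric pseudo-op arguments. It only handles numeric arguments for CONST, START and RESV; label arguments (as with ADCON and END) are out of scope, and the function name is hypothetical.

```python
def pseudo_op_value(mnemonic: str, arg: str) -> int:
    """Parse a numeric pseudo-op argument using the radix that op expects."""
    if mnemonic == "CONST":
        return int(arg, 16) & 0xFFFF  # CONST is hexadecimal
    if mnemonic in ("START", "RESV"):
        return int(arg, 10)           # the others are decimal
    raise ValueError(f"not a numeric pseudo-op in this sketch: {mnemonic}")
```

That means `CONST:0011` is the value 17, not 11, and `CONST:7000` is 28672.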

On the Casio systems, the Data Bank is how CAP-X assembler lines are entered, using it in this case as the internal text editor (clear it in MODE 1 with NEW#; press MODE 9 to enter the editor). The assembler assumes everything in the Data Bank is program text, even after an END, so don't store your phone numbers in the machine you're using for your white-hot COMP-X application or your source code won't parse.

This is a good time for some examples.

   :START:0
GO :READ :0:10
   :READ :1:10
   :ST   :1:TAD
   :ADD  :0:TAD
   :WRITE:0:10
   :HJ   :0:GO
TAD:RESV :1
   :END  :GO

This program begins assembly (START) at location 0. It asks the user for two signed base-10 numbers (numbers out of range or other characters don't cause an error; the VM will just ask you again), depositing them in GR0 and GR1. Because it's not possible to add GR1 and GR0 together directly, we store GR1 to a temporary location we've defined (TAD, marked by the RESV pseudo-op which inserts the specified number of zero words, here one), and then add its contents to GR0, display it to the user, and halt. The END pseudo-op provides the default starting address to the VM and indicates the end of assembly; all code must include it at the end. Once entered in the Data Bank, we can now press the Asmbl key to enter the monitor.

When the Asmbl key is pressed, all lines in the Data Bank are scanned and converted to COMP-X bytecode, and if assembly is successful then this menu will appear (Go/Dump/Source/Cal). The number at the beginning is the number of pages reserved for the VM, which on the PC-5 will always be one or two, but on the PC-6 may be up to eight. The options, in order, start execution (G key), show either memory (as words and disassembled instructions) or the state of the registers (D key), return to the Data Bank editor to rework the source (S key), or return to calculator mode (C key).
If we press G for Go, it will ask for the starting address, defaulting to zero (since the label GO is at word zero, and that's what we provided to END), and then run the program.
Asking for the first register.
And, after the second one is entered, the sum is reported, and the PC-5 returns to the monitor.

Let's try a more substantial example. This one is a modified version of an example from the manual, computing the greatest common divisor of two values.

   :START:0
L0 :READ :0:10
   :READ :1:10
   :ST   :0:M
   :ST   :1:N
L1 :LD   :0:M
   :SUB  :0:N
   :JNZ  :0:L2
   :LD   :0:M
   :WRITE:0:10
   :HJ   :0:L0
L2 :JC   :2:L3
   :LD   :0:N
   :SUB  :0:M
   :ST   :0:N
   :JC   :3:L1
L3 :ST   :0:M
   :JC   :3:L1
M  :RESV :1
N  :RESV :1
   :END  :L0

After reading the desired values into locations M and N (I use separate registers so that the prompt changes when an acceptable value is entered), the program does the equivalent of comparing M and N by a subtraction, putting the result in GR0. If GR0 ends up zero, then M == N, and no further iterations need to be made; the JNZ falls through to the following instructions in which the contents of M are shown (the greatest common divisor for both values is necessarily in both locations because they're equal) and the program ends. If the result is 1, then they had no other divisors in common.

Otherwise, the CC flag is set from the most significant bit of the result, i.e., its sign. If the CC flag is clear (tested by option 2 to the jump conditional instruction), then the result of M - N was positive, meaning M is greater than N, and it jumps to the code at L3 storing the current value of GR0 (that is, M - N) back into M for another iteration (option 3 to the jump conditional instruction means an unconditional jump). Otherwise, M must have been less than N, so the code then computes N - M into GR0 instead and stores GR0 into N for another iteration, looping until M == N as above.
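
For anyone who'd rather read the control flow in a higher-level language, here is a Python sketch mirroring the same loop, with the corresponding CAP-X steps in the comments. It assumes both inputs are positive and below 32768 so the sign bit is meaningful; the function name and the loop-trip counter are my additions.

```python
def gcd_by_subtraction(m: int, n: int) -> tuple:
    """Return (gcd, loop_trips) using only subtract, test-zero and test-sign."""
    trips = 0
    while True:
        diff = (m - n) & 0xFFFF        # LD 0,M / SUB 0,N
        if diff == 0:                  # JNZ falls through when M == N
            return m, trips
        cc = diff >> 15                # CC is the sign bit of the result
        if cc == 0:                    # JC 2,L3: positive, so M > N
            m = diff                   # L3: ST 0,M
        else:                          # negative, so M < N
            n = (n - m) & 0xFFFF       # LD 0,N / SUB 0,M / ST 0,N
        trips += 1
```

Run on 28672 and 17, this takes 1692 trips around the loop to arrive at 1, since it has to peel 17 off the larger value 1686 times before anything interesting happens.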

How swiftly does all this run? While the program flow seems convoluted, if the runtime were particularly efficient it should be possible to run the simpler COMP-X opcodes faster than BASIC tokens, which can sometimes be quite complex. Let's write a version of this program that runs 100 times and time it, using a worst-case setup like 28672 and 17 (they have no common divisor other than 1).

   :START:0
LI :LAI  :1:100
L0 :LD   :0:OM
   :ST   :0:M
   :LD   :0:ON
   :ST   :0:N
L1 :LD   :0:M
   :SUB  :0:N
   :JNZ  :0:L2
   :SUB  :1:ONE
   :JNZ  :1:L0
   :LD   :0:M
   :WRITE:0:10
   :HJ   :0:LI
L2 :JC   :2:L3
   :LD   :0:N
   :SUB  :0:M
   :ST   :0:N
   :JC   :3:L1
L3 :ST   :0:M
   :JC   :3:L1
M  :RESV :1
N  :RESV :1
OM :CONST:7000
ON :CONST:0011
ONE:CONST:0001
   :END  :LI

The first time I did this I sat there for several minutes. Eventually I got impatient, stopped the program (there's a BRK key) and changed the loop counter at LI to just 2.

Almost 52 seconds. That's absolutely potty. What about BASIC? Here is a simpleminded conversion of the above to make the operations as similar as possible. To make the fight fair, we'll also cap it to two iterations.

1 L=2
2 M=28672:N=17
10 G=M-N:IF G≠0 THEN 20
11 L=L-1:IF L>0 THEN 2
12 PRINT M:END
20 IF G>0 THEN 30
21 G=N-M:N=G:GOTO 10
30 M=G:GOTO 10

You should be able to identify the same steps we're taking (I'm also modeling the same idea of putting math results into an intermediate register like what COMP-X has to do) and the same sections of code. Although a relatively vanilla BASIC, Casio Pocket Computer BASIC was notable for having actual characters for not equals, greater than or equal to, etc., instead of the more typical BASIC compound operators.

Laboriously keying it in really makes you wonder what Casio was thinking with the split keyboard approach, because for most of these lines you have to jump back and forth between halves. On the other hand, it did maintain the shifted quick entry tokens of earlier units like the PB-100, so typing is faster than it looks.

Anyway, some of you will be expecting the big reveal that BASIC will run rings around COMP-X. Heck, I was expecting that. Drum roll please.

Drum roll please.

Drum roll ... is this thing done yet?

It's not even close. At 52 seconds versus 346 seconds, COMP-X does the same computations nearly seven times faster. There are obvious ways to improve the BASIC code (on a quick first pass we can rework a couple unnecessary jumps) but it's unlikely even a maximally optimized version would run anywhere near as quickly as the COMP-X original. Let's hear it for completely contrived bolted-on virtual machines!

That said, while both programs are interpreted, they're both also running on a deliberately underpowered set of CPUs to eke out as much battery life as possible, so we shouldn't be surprised their respective run times absolutely stink. If you want the answer quickly, go buy a Cray.

You'll have noticed from this example that the CC flag is not a carry flag (it's the sign bit), so how is overflow dealt with? Let's answer that question.

   :START:0
L  :LD   :0:M
   :ADD  :0:M
   :WRITE:0:10
   :HJ   :0:L
M  :CONST:7FFF
   :END  :L

That's right: it throws a fatal exception. The current value of the sequence counter (program counter) is reported for debugging purposes. (The result would naturally be FFFE or -2, not +65534.) The same thing happens if we add another constant for 1 and add that instead of itself, or if we underflow with subtraction instead — anything that would change the sign unexpectedly will get an error. Bafflingly, the manual says that "[o]verflow will be disregarded in this case." Which overflow would that be, exactly?
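The behaviour just described can be modelled as a signed 16-bit add that refuses to wrap. The sketch below is a Python approximation under that assumption; `OverflowFault`, `to_signed` and `checked_add` are illustrative names, and the real runtime's error reporting is not reproduced.

```python
class OverflowFault(Exception):
    """Stand-in for the fatal error the monitor reports."""

def to_signed(word):
    """Interpret a 16-bit word as a two's complement value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def checked_add(a, b):
    """Add two 16-bit words, faulting if the signed result doesn't fit."""
    total = to_signed(a) + to_signed(b)
    if not -0x8000 <= total <= 0x7FFF:
        raise OverflowFault(f"signed overflow: {total}")
    return total & 0xFFFF

print(checked_add(1, 2))            # 3
try:
    checked_add(0x7FFF, 0x7FFF)     # the test program above
except OverflowFault as fault:
    print("fault:", fault)
```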

I get why they did this and it eliminates a particular class of bugs, but it also makes wider sums much more difficult. If we want an unsigned add with carry so we can handle larger numbers, we'll need to dance around those operations that could cause the VM to fault. Here's one solution, which I don't claim is optimal nor covers all edges. It ass-U-mes you are adding one unsigned value to another. Hold on to your ankles:

   :START:0
L  :READ :1:16
   :ST   :1:M
   :AND  :1:BM1
   :ST   :1:TM1
   :READ :0:16
   :ST   :0:N
   :AND  :0:BM1
   :ADD  :0:TM1
   :ST   :0:TM1
   :AND  :0:BM1
   :ST   :0:TM2
   :LD   :0:TM1
   :SFT  :0:8
   :ST   :0:TM1
   :LD   :0:M
   :SFT  :0:8
   :AND  :0:BM1
   :ST   :0:TM3
   :LD   :0:N
   :SFT  :0:8
   :AND  :0:BM1
   :ADD  :0:TM3
   :ADD  :0:TM1
   :ST   :0:TM3
   :AND  :0:BM1
   :ST   :0:TM1
   :SFT  :0:8:1
   :EOR  :0:TM2
   :LD   :1:TM1
   :AND  :1:BM2
   :JNZ  :1:FIX
X  :LD   :1:TM3
   :SFT  :1:8
   :AND  :1:ONE
   :WRITE:0:16
   :WRITE:1:16
   :HJ   :0:L
FIX:EOR  :0:NEG
   :JC   :3:X
M  :RESV :1
N  :RESV :1
BM1:CONST:00FF
BM2:CONST:0080
NEG:CONST:8000
ONE:CONST:0001
TM1:RESV :1
TM2:RESV :1
TM3:RESV :1
   :END  :L

We ask for the arguments in hex this time, since we're dealing with unsigned values (if we did it in base 10, it would expect signs). This routine effectively does the addition in two steps byte by byte as we cannot directly add the numbers together. We start with the LSBs of each value by masking off their MSBs and store the sum, mask it again to generate the LSB of the answer, then shift down the MSB of the original result as the carry. We then move to adding the MSBs of each value (by shifting them down), masking them to their LSBs since the shift is signed, and also adding the carry from the LSBs at the same time.

At this point you'd think the remainder is merely a matter of then shifting the MSBs back into position and OR-ing in the 8-bit sum of the LSBs (we use an EOR, which is fine because the shift leaves a clean lower half). However, remember shifts are also signed, which would obliterate the high sign bit because these values are all "positive," so after the shift-and-EOR we look back at what the MSB result was as a byte value. If this byte is 128 or higher, then we know we need to restore the most significant bit, which we EOR on at FIX. The routine ends with the low 16 bits of the sum in GR0 and the 17th bit in GR1, which we mask off all but the least significant bit of before displaying the two result registers.

What a pain! Other thoughts solicited in the comments. Anyway, the usual edge cases seem to work, like 8000 + 8000 correctly equals 1 0000 (as does FFFF + 0001), FFFF + FFFF = 1 FFFE, 7FFF + 1 = 0 8000 and 7FFF + 7FFF = 0 FFFE. Subtraction is left as an exercise for the supremely masochistic reader.
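Stripped of the register shuffling, the byte-by-byte idea looks like this in Python. It's a sketch of the arithmetic only: Python integers don't overflow, so the sign fixup at FIX has no counterpart here, and `add16_bytewise` is an invented name.

```python
def add16_bytewise(m, n):
    """Add two unsigned 16-bit words a byte at a time; return (bit 17, low 16 bits)."""
    low_sum = (m & 0xFF) + (n & 0xFF)              # at most 0x1FE
    carry = low_sum >> 8                           # carry out of the low bytes
    high_sum = ((m >> 8) & 0xFF) + ((n >> 8) & 0xFF) + carry
    low16 = ((high_sum & 0xFF) << 8) | (low_sum & 0xFF)
    return high_sum >> 8, low16

for m, n in ((0x8000, 0x8000), (0xFFFF, 0x0001), (0xFFFF, 0xFFFF),
             (0x7FFF, 0x0001), (0x7FFF, 0x7FFF)):
    bit17, low16 = add16_bytewise(m, n)
    print(f"{m:04X} + {n:04X} = {bit17} {low16:04X}")
```

No intermediate value ever exceeds nine bits, which is the whole point: nothing here could trip a signed 16-bit overflow check.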

Since we've started stepping into the bloodied buzzsaw of edge cases, let's look at some other dark corners of CAP-X/COMP-X. (I should also note that the behaviour we will document in this article is observed on the Casios, not on the less common Sharp models, and it is possible some of the edge cases could behave differently.)

We know we can change locations we reserve in the assembler source, but can we modify our own program code? It turns out we can, and although self-modifying code is understandably discouraged in modern practice, in a language this constrained the ability to do so can simplify some tasks. First, let's see what happens with an illegal instruction. There are no COMP-X instructions with an opcode nybble of 7 or 9, but we can make one with CONST.

   :START:0
L7 :CONST:7000
L9 :CONST:9000
   :HJ   :0:0
   :END  :L7

If you start this with Go to either label L7 or L9, the result is the same except for where the fault is:

Our next test is to overwrite an invalid instruction with a valid one. We know this will crash if it hits the bogus instruction, but if the halt-jump replaces it first, it should terminate normally (with SC set to 65).

   :START:0
L  :LD   :0:OPC
   :ST   :0:PTR
PTR:CONST:7000
OPC:HJ   :0:65
   :END  :L

And it does (we can see the result of SC by either dumping the registers or "going again," which I did here, which will use the current value). Likewise, we get a crash by replacing a valid instruction with an illegal one, basically reversing the two instructions in the same code:

   :START:0
L  :LD   :0:OPC
   :ST   :0:PTR
PTR:HJ   :0:65
OPC:CONST:7000
   :END  :L

This faults, as expected.
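What matters in both tests is what the word holds at the moment it is fetched, not what was assembled there. A minimal Python sketch of that idea follows; it assumes the opcode lives in the top nybble of each word and that HJ :0:65 encodes as 0041, and it doesn't model the LD and ST encodings at all. `fetch` and `IllegalInstruction` are hypothetical names.

```python
ILLEGAL_OPCODES = {0x7, 0x9}    # the two unused opcode nybbles named above

class IllegalInstruction(Exception):
    pass

def fetch(memory, sc):
    """Return the word at sc, faulting if its top nybble is not an opcode."""
    word = memory[sc]
    if (word >> 12) in ILLEGAL_OPCODES:
        raise IllegalInstruction(f"illegal instruction {word:04X} at {sc}")
    return word

# First test: PTR (word 2) holds 7000 and OPC (word 3) holds the halt-jump.
# Words 0 and 1 are placeholders for the LD/ST pair.
memory = [0, 0, 0x7000, 0x0041]
memory[2] = memory[3]               # the effect of LD :0:OPC then ST :0:PTR
print(f"{fetch(memory, 2):04X}")    # 0041
```

Swap the two constants and the same copy plants 7000 at PTR instead, so the fetch faults.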

Let's next look at the edges around effective addresses: we can modify an address with an index register, so what happens when the result wraps? That brings us back to the base register and how page segmentation is implemented. In the typical case, we start with a pointer and modify it with an index, and if it all stays in the same 256 word page, then everything occurs as expected. For example,

   :START:0
NP :CONST:0001
W  :CONST:0000
W2 :CONST:0000
   :START:128
L  :LD   :1:NP
   :LAI  :0:65
   :ST   :0:W:1
   :HJ   :0:L
   :END  :L

(Yes, you can have multiple STARTs. You can even have them overlap: later code will overwrite previous code, which doesn't seem like a good idea.) Recall that GR0 can never be an index. We load GR1 with the index offset (1) and GR0 with a sentinel value (65), then store GR0 to location W with GR1 added to it. At the end W remains 0 and W2, the word one up (W + 1), is 65. We can prove that by dumping that memory location from the monitor ("object" in this case refers to memory):

This remains valid even if you pass it a negative index. If we change NP to FFFF (i.e., 65535, or -1 as a signed 16-bit integer), then the effective address will be zero, and the program will overwrite its own constant!

What if W and W2 are in separate pages? I mentioned that this is where the base register comes into play. Only the least significant byte of the computed address is used; the most significant byte comes from the base register. That means if you run

   :START:0
NP :CONST:0100
W  :CONST:0000
   :START:128
L1 :LD   :1:NP
   :LAI  :0:65
   :ST   :0:1:1
   :HJ   :0:L1
   :START:256
   :CONST:0200
W2 :CONST:0000
   :END  :L1

you can't use the pointer NP to access memory in the next page; the store will occur to W and not W2 even though our index is 256. The same thing happens if you do

   :START:0
NP :CONST:0002
W  :CONST:0000
   :START:128
L1 :LD   :1:NP
   :LAI  :0:65
   :ST   :0:255:1
   :HJ   :0:L1
   :START:256
   :CONST:0200
W2 :CONST:0000
   :END  :L1

or even

   :START:0
NP :CONST:00FF
W  :CONST:0000
   :START:128
L1 :LD   :1:NP
   :LAI  :0:65
   :ST   :0:2:1
   :HJ   :0:L1
   :START:256
   :CONST:01FF
W2 :CONST:0000
   :END  :L1

That brings up this interesting situation where a pointer to word 513, which should be illegal on a PC-5, actually turns into a pointer to word 1 and overwrites itself. This program terminates normally, but will have corrupted its own bytecode in the process.

   :START:0
L  :LD   :0:PTR
   :ST   :0:PTR
   :HJ   :0:0
PTR:CONST:0201
   :END  :L

The wrapping works even if the resulting effective address is negative.

   :START:0
W  :CONST:0000
NP :CONST:FFFF
   :START:128
L  :LD   :1:NP
   :LAI  :0:65
   :ST   :0:W:1
   :HJ   :0:L
   :START:255
W2 :CONST:0000
   :END  :L

This doesn't fault either despite the fact that in signed math 65535/FFFF equals -1, making the effective address 0 + -1 == -1. The access remains clipped to the current page regardless, so at the end of execution, W2 at word 255 (that's hex FF for everybody keeping score at home) is 65.
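All of these cases reduce to one rule: add the index to the address, keep only the low byte, and take the page from the base register. Here is that rule as a Python sketch; `effective_address` is an illustrative name and the formula is inferred from the observed behaviour above, not from any Casio documentation.

```python
def effective_address(base, address, index=0):
    """Combine the base register page with the low byte of address + index."""
    return ((base & 0xFF) << 8) | ((address + index) & 0xFF)

print(effective_address(0, 1, 0x0100))   # 1: an index of 256 changes nothing
print(effective_address(0, 255, 2))      # 1: 257 wraps back into the page
print(effective_address(0, 2, 0x00FF))   # 1: same sum, same result
print(effective_address(0, 0, 0xFFFF))   # 255: "-1" lands on the last word
```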

There is simply no way to write to (or, for that matter, read from) another page outside of the page currently executing. So ... how about we make the program counter wrap? Unfortunately, we get our first clue this doesn't make CAP-X happy when we try to simply generate code straddling a page boundary.

   :START:255
L  :LAI  :0:0
   :HJ   :0:L
   :END  :L

This won't even assemble; the message Error1 generally comes up when we've overflowed a particular tract of memory, such as the label space or the current page.

This variant will assemble, and has a halt-jump at 256. However, you'll notice that I've also put a halt-jump at 0 with a different SC.

   :START:0
   :HJ   :0:63
   :START:255
L1 :LAI  :0:0
   :START:256
   :HJ   :0:65
   :END  :L1

The program starts at word 255. If we can wrap into the next page, we should get a final SC of 65; if we can't, the SC at 0 will be 63 (numbers picked randomly and have no significance otherwise). Wanna guess what SC will be at the end? It indeed ends cleanly, but SC is 63, not 65. This wrapping might seem inexplicable, but it was also seen on other heavily page-oriented CPUs like the Texas Instruments TMS1000, which limits execution to the page in the page buffer register.

What if we try to use an index register to springboard us into the next page? Same thing:

   :START:0
   :HJ   :0:63
   :START:253
L1 :LD   :1:NP
   :JC   :3:0:1
NP :CONST:0100
   :START:256
   :HJ   :0:65
   :END  :L1

Here, we're adding 256 to the address of zero passed to the unconditional jump (JC:3). The program terminates cleanly, but SC is still 63, not 65.

   :START:0
   :HJ   :0:63
   :HJ   :0:64
   :START:253
L1 :LD   :1:NP
   :JC   :3:1:1
NP :CONST:00FF
   :START:256
   :HJ   :0:65
   :HJ   :0:66
   :END  :L1

What if we just give it 255 so it all fits in the low byte and then add 255 to 1 in the unconditional jump? No dice, SC is 63 (not 64, 65 or 66 — consider why I might have put those in as checks).
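The sequence counter appears to obey the same clipping as data accesses. A Python sketch of that assumption, with `next_sequential` and `jump_target` as invented names, reproduces all three results:

```python
def next_sequential(sc):
    """Advance to the next word, staying inside the current 256-word page."""
    return (sc & ~0xFF) | ((sc + 1) & 0xFF)

def jump_target(sc, address, index=0):
    """Resolve a jump: page from the current SC, offset from the low byte only."""
    return (sc & ~0xFF) | ((address + index) & 0xFF)

print(next_sequential(255))            # 0: falls off word 255 onto word 0
print(jump_target(254, 0, 0x0100))     # 0: JC :3:0:1 with GR1 = 0100
print(jump_target(254, 1, 0x00FF))     # 0: JC :3:1:1 with GR1 = 00FF
```

In every case execution arrives at word 0, which is why SC ends up as 63 each time.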

This is very different from segmentation schemes like the 8086's where you can have multiple different paths to the same memory location and segments can be continuous. In COMP-X the only way you'll be able to access data in other pages is to execute from those other pages, and only one instruction can move execution into another page: JSR. This snippet writes a value into the correct location by calling a subroutine in the new page to do it. Notice that the address is still three!

   :START:0
L  :LAI  :1:65
   :JSR  :0:NP
   :HJ   :0:0
W  :CONST:0000
NP :ADCON:NPC
   :START:256
NPC:ST   :0:LR
   :ST   :1:3
   :JSR  :0:LR
W2 :CONST:0000
LR :CONST:0000
   :END  :L

This is our first use of ADCON to insert the address of the NPC routine, since CONST (arbitrarily?) doesn't accept labels. Although the store NPC makes at word 257 has an address of 3, the base register is now 1 (+256) and therefore the store goes to W2 (word 259), not W (word 3).

Thus, at the end, W2 is finally 65 and W is 0, as expected.
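Here's a toy Python model of what JSR seems to be doing in this example. It assumes JSR reads a full 16-bit pointer from its operand, loads the base register from the pointer's high byte, and leaves a full 16-bit return address in the named register; that last detail is a guess that happens to make the return trip work. The `Machine` class and its methods are purely illustrative.

```python
class Machine:
    def __init__(self, words=512):
        self.mem = [0] * words
        self.gr = [0, 0, 0, 0]
        self.br = 0          # base register: the current page number
        self.sc = 0          # sequence counter, kept as a full word address

    def ea(self, address):
        """Loads and stores only ever reach the current page."""
        return (self.br << 8) | (address & 0xFF)

    def st(self, reg, address):
        self.mem[self.ea(address)] = self.gr[reg]

    def jsr(self, reg, address):
        """Jump through the pointer at address, leaving the return address in reg."""
        target = self.mem[self.ea(address)]
        self.gr[reg] = (self.br << 8) | ((self.sc + 1) & 0xFF)
        self.br = target >> 8
        self.sc = target

m = Machine()
m.mem[4] = 256     # NP: ADCON pointing at NPC in page 1
m.gr[1] = 65       # LAI :1:65
m.sc = 1           # the JSR sits at word 1
m.jsr(0, 4)        # JSR :0:NP, now executing in page 1
m.st(0, 4)         # ST :0:LR saves the return address at word 260
m.st(1, 3)         # ST :1:3 lands on word 259, not word 3
m.sc = 258
m.jsr(0, 4)        # JSR :0:LR returns to the HJ at word 2
print(m.mem[3], m.mem[259], m.sc)   # 0 65 2
```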

Armed with this information, we can construct code longer than one page as long as we have a linking stanza straddling the page boundary (here, the JSR, the pointer at NP and the START that follows it):

   :START:253
L  :LAI  :1:65
   :JSR  :0:NP
NP :CONST:0100
   :START:256
   :HJ   :0:65
   :END  :L

The JSR brings execution over the line to the next page, using the code pointer at NP (you'd have a different pointer and a matching START for each page boundary) and throwing away the return address in GR0, and execution ends cleanly with SC set to 65.

As inconvenient as this arrangement is, an interesting consequence is that you can almost never have an out-of-bounds access to memory except if the base register gets out of range of the allocated maximum. That can only happen with a JSR instruction that was given a bogus code pointer, such as

   :START:0
L  :JSR  :0:DIE
DIE:CONST:FFFF
   :END  :L

That's the COMP-X equivalent of a segmentation fault, and the only way such a condition can be triggered, assuming we're dealing with a system with less than 64kW available (like the PC-5 and PC-6). Otherwise, we never have to worry about illegal loads and stores because we have no way of putting instructions in an illegal page to execute — sucks for programmers but simple for the VM designers. Strangely, unlike the other two exceptions, this exception doesn't provide the illegal value of the sequence counter or what set it.

Let's tie this room together with the rug of a game. I'm going to be crass enough to advance that this may be the first game ever written for CAP-X/COMP-X. If you did one, post in the comments. I laboriously typed this into a real PC-5, so I know it works. The game itself is Rock, Paper, Scissors.

:start:200
l:jsr:0:xoi
:lai:0:0
:st:0:you
:st:0:me
lup:read:1:10
:and:1:bit
:st:1:yom
:jnz:1:rps
:jc:3:lup
rps:jsr:0:xoj
:and:3:bit
:st:3:mem
:jnz:3:shm
:jc:3:rps
shm:write:3:10
sam:eor:3:yom
:jnz:3:tpa
:jc:3:sho
tpa:ld:3:mem
:eor:3:bit
:jnz:3:paa
:eor:1:bio
:jnz:1:paa
:jc:3:lus
paa:ld:3:mem
:ld:1:yom
:eor:1:bit
:jnz:1:sco
:eor:3:bio
:jnz:3:sco
:jc:3:win
sco:ld:3:mem
:sub:3:yom
:jc:1:lus
win:lai:0:1
:add:0:me
:st:0:me
:jc:3:sho
lus:lai:1:1
:add:1:you
:st:1:you
sho:ld:0:me
:ld:1:you
:write:0:10
:write:1:10
:jc:3:lup
you:const:0000
yom:const:0000
me:const:0000
mem:const:0000
bit:const:0003
bio:const:0001
xoi:adcon:xoc
xoj:adcon:xos

This first part is the game loop. It sets up its variables and asks for your move, where 1 = paper, 2 = scissors and 3 = rock, and rock smashes scissors, scissors cut paper and paper covers rock. It then gets a random number and displays that as its move. It checks that the two moves weren't the same (using EOR; if so, tie with no score) and that there wasn't a paper-rock setup (if so, paper wins), and then determines which is the higher value and awards them a point. Its score is printed, followed by your score, and the game repeats.
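The scoring rules are easier to see without the jumps. This Python sketch follows the same three tests in the same order as the listing; `round_winner` is a made-up name, and "me" is the computer, matching the listing's labels.

```python
PAPER, SCISSORS, ROCK = 1, 2, 3

def round_winner(yours, mine):
    """Return 'tie', 'you' or 'me' using the same three tests as the listing."""
    if yours ^ mine == 0:                  # EOR leaves zero when the moves match
        return "tie"
    if mine == ROCK and yours == PAPER:    # paper covers rock
        return "you"
    if yours == ROCK and mine == PAPER:
        return "me"
    return "you" if mine - yours < 0 else "me"   # otherwise the higher value wins

print(round_winner(PAPER, ROCK))       # you
print(round_winner(SCISSORS, ROCK))    # me
```

The paper-rock pair is the only one where the lower number wins, which is why it needs special handling before the subtraction.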

The two pointers at the end are into this set of subroutines that compute a pseudorandom number using an Xorshift 16-bit variation. The first routine initializes state, and the second generates a number between 0 and 32767. Notice we save our return address before entering.

:start:256
rtn:resv:1
xot:resv:1
xoc:st:0:rtn
:jnz:3:xod
:lai:3:1
xod:st:3:xot
:jsr:0:rtn
xos:st:0:rtn
:ld:3:xot
:sft:3:7:1
:eor:3:xot
:st:3:xot
:sft:3:9
:eor:3:xot
:st:3:xot
:sft:3:8:1
:eor:3:xot
:st:3:xot
:jsr:0:rtn
:end:l

This somewhat abuses the likelihood of GR3 having stale data in it, and as such employs it as a seed (ensuring it's nonzero, of course). It then does all the needed shifts to return another pseudorandom number, which the game clips to the 1-3 range as its move.
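For comparison, here is the generator as a Python sketch. The shift semantics are assumptions pieced together from the unsigned adder discussion earlier: SFT with no third field is taken as an arithmetic right shift, and SFT with :1 as a left shift that leaves bit 15 alone. The names `sft_left`, `sft_right`, `xos` and `next_move` are illustrative, and the output may not match a real PC-5 bit for bit.

```python
def to_signed(word):
    return word - 0x10000 if word & 0x8000 else word

def sft_right(word, count):
    """Arithmetic right shift of a 16-bit word (assumed SFT behaviour)."""
    return (to_signed(word) >> count) & 0xFFFF

def sft_left(word, count):
    """Left shift that keeps bit 15 and moves only the low 15 bits (assumed)."""
    return (word & 0x8000) | ((word << count) & 0x7FFF)

def xos(state):
    """One pass of the 7/9/8 shift-and-EOR sequence in the listing."""
    state ^= sft_left(state, 7)
    state ^= sft_right(state, 9)
    state ^= sft_left(state, 8)
    return state

def next_move(state):
    """Draw numbers until the low two bits are nonzero, as the rps loop does."""
    while True:
        state = xos(state)
        if state & 3:
            return state, state & 3

print(next_move(1))   # (385, 1)
```

Under these assumptions a positive seed never sets bit 15, which would explain the 0 to 32767 range.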

That's pretty much the breadth of CAP-X/COMP-X as implemented on the Tandy Pocket Computers, and with those kinds of bizarre limitations I think it makes a stellar unintentional esoteric programming language. So much so, in fact, that I've reimplemented it as a Perl-based assembler, disassembler and runtime I've christened "CRAP-X," which you can find on GitHub. It includes full documentation of not only itself but also CAP-X/COMP-X since most of those resources are in Japanese. There are a few things I did improve upon:

  • The runtime now supports an addressing space of up to a full 64 kilowords.
  • If you specify a radix (base) of 0 to READ or WRITE, then it will either read an ASCII character from standard input into the register, or write an ASCII character in the register to standard output. Now we can finally write a proper hello world! Sorry, even though we could support something like UCS-2 this way, it's currently limited to 7-bit ASCII for simplicity.
  • I also added a radix (base) 1 for them as well which just shows/accepts signed decimal numbers without all the other nonsense.
  • You can have comments, delimited by either ; or #.
  • Labels can be more than three characters, though everything is still case-insensitive.
  • You can specify hexadecimal addresses/immediates with a preceding $ to those instructions and pseudo-ops expecting a decimal argument (CONST is always hex, but the $ is accepted and politely ignored if you include it).
  • CONST can take labels now, making ADCON more or less into a synonym.

If you provide the -pc5 or -pc6 arguments to the assembler or runtime, then these changes are turned off and the memory limits of the respective unit are enforced, so you can write and test code on the Perl runtime before laboriously keying it into your Casio-Tandy and have good confidence you aren't using something it doesn't support.

The assembler reads a source file on standard input and emits binary bytecode on standard output, with the load address, starting address and length encoded in the first six words (all words are big-endian as divinely intended), followed by the entire program, followed by a symbol table. This compound file is what you pass to the runtime for execution. By default the runtime just runs it from the starting address and returns to your shell, but if you pass -debug, it brings up a facsimile of the FX-series monitor instead for interactive debugging. The disassembler is the most undeveloped of the three, but at least it lets you see what's in a binary.

I've included a few sample scripts in the demo folder. Here's a sample run, with two CRAP-X test scripts (hello world and a ROT-13 generator), and one of the classic CAP-X ones (add two numbers).

% perl asmbl.pl examples/hello.cap > hello.bin
% perl mon.pl hello.bin
hello world
% perl asmbl.pl examples/rot13.cap > rot13.bin
% echo 'hello world in rot13' | perl mon.pl rot13.bin
uryyb jbeyq va ebg13
% echo 'hello world in rot13' | perl mon.pl rot13.bin | perl mon.pl rot13.bin
hello world in rot13
% perl asmbl.pl examples/classic/add.cap > add.bin
% perl mon.pl add.bin
GR0 (10) 123
GR1 (10) 456
GR0 (10) 579 
% perl mon.pl -debug add.bin
256:Go/Dump/Symbols/^Cal or Trace (Off) t
Trace is On
256:Go/Dump/Symbols/^Cal or Trace (On) s
0000 GO
0006 TAD
256:Go/Dump/Symbols/^Cal or Trace (On) g
Go 0 
GR0 (10) 123
    0:500A 123 -6000 -25203 -15365 
GR1 (10) 456
    1:540A 123 456 -25203 -15365 
    2:D406 123 456 -25203 -15365 
    3:A006 579 456 -25203 -15365 
GR0 (10) 579 
    4:600A 579 456 -25203 -15365 
exiting with SC = 0000
256:Go/Dump/Symbols/^Cal or Trace (On) d
Dump:Object/Register r
BR : 0000 0
GR0: 0243 579
GR1: 01C8 456
GR2: 9D8D -25203
GR3: C3FB -15365
SC : 0000 0
CC : 0000 0
256:Go/Dump/Symbols/^Cal or Trace (On) d
Dump:Object/Register o
from 0
from 0 to 10
from 0 to 10
    0:500A READ 00   10
    1:540A READ 10   10
    2:D406 ST   10   6
    3:A006 ADD  00   6
    4:600A WRITE00   10
    5:0000 HJ   00   0
    6:01C8 HJ   01   200
    7:0000 HJ   00   0
    8:0000 HJ   00   0
    9:0000 HJ   00   0
   10:0000 HJ   00   0
256:Go/Dump/Symbols/^Cal or Trace (On) c
%
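The object dump in that transcript is enough to guess at the instruction word layout: a 4-bit opcode, two bits for the register, two bits for the index register and an 8-bit address. The Python sketch below decodes words on that assumption. Its mnemonic table holds only the opcodes visible in the dump, and `decode` is an invented name, not part of the Perl tools.

```python
# Only the mnemonics visible in the dump above; the rest of the table is omitted.
MNEMONICS = {0x0: "HJ", 0x5: "READ", 0x6: "WRITE", 0xA: "ADD", 0xD: "ST"}

def decode(word):
    """Split a 16-bit word into (mnemonic, register, index register, address)."""
    opcode = (word >> 12) & 0xF
    gr = (word >> 10) & 0x3
    xr = (word >> 8) & 0x3
    address = word & 0xFF
    return MNEMONICS.get(opcode, "?"), gr, xr, address

for word in (0x500A, 0x540A, 0xD406, 0xA006, 0x600A, 0x01C8):
    name, gr, xr, address = decode(word)
    print(f"{word:04X} {name:<5}{gr}{xr}   {address}")
```

Note how the data word 01C8 (456, the second input) decodes as a perfectly plausible HJ: the disassembler has no way to tell code from data.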

In the classic examples folder, I've also included PC-5 compatible versions of the Rock Paper Scissors game you just saw, the addition program from earlier (both the little "hello world" one and the unsigned adder), the GCD program from earlier, the Xorshift RNG as an endless loop of random numbers, and even a Nim misère game I adapted from the First Book of KIM. I've also included all our test cases in the edges folder so you can test them yourself.

In the full CRAP-X examples folder, you'll find the hello world and ROT-13 programs, plus an enhanced Rock Paper Scissors with proper prompts and inputs. To prove at least a vague sort of Turing completeness, or at least as a bounded-storage machine, I've also included a Subleq machine example that implements probably the most useful of the one-instruction set computers and embeds a hello world program in Subleq that it will run. Note that for long programs you'd need to implement separate pages with their own copy of the Subleq interpreter and figure out some way to jump around on branches, an exercise left for the truly obsessed.

Let's finish the story. Given all the architecture's idiosyncrasies, in my estimation there's no good technical reason why Sharp included CAP-X/COMP-X, especially on units that already had an (almost certainly faster) native code execution feature. To be sure, such code is running directly on the metal, so if you screw up it could lock up the machine until reset or corrupt anything else in memory. However, Sharp had already embraced this risk for great reward, particularly with the hardcore PC-1500/Tandy PC-2, so CAP-X/COMP-X didn't really add any new technological capabilities and native machine language programs would have had infinitely greater access to the hardware. With that in mind, Sharp's interest in the architecture appears to have been purely for the purpose of adding just another feature.

But Casio had yet to offer support for native assembly on any of its pocket computers up to this point. What CAP-X/COMP-X gave it was a faster runtime that was fully trappable and protected, and (as for Sharp) a public standard by then that, at least in Japan, its customer base would find worth knowing. A COMP-X program can't run wild and trash your BASIC programs, it can be halted with the BRK key if it gets stuck, and it can be debugged and stepped through instruction by instruction. Sure, it was cumbersome and convoluted, but Casio COMP-X programs were much swifter than Casio BASIC even if you could get more BASIC code in, and unlike naked machine code were absolutely safe to run. Now, if only Casio had allowed you to mix CAP-X code and BASIC, and/or store multiple CAP-X/COMP-X programs in memory at once like BASIC can, we could have had the best of both worlds with faster code and BASIC convenience. Plus, what modern standard do we know that offers you a contrived virtual machine but has no "I/O," so to speak, of its own?

It should also be observed that for as dire an assembly language as CAP-X/COMP-X is, it was never intended to be anything other than a means of skill assessment, and certainly not for programming in the large. As such we really can't blame MITI for Japanese tech companies, perceiving a potential market in a language that likely buyers might want to learn and required no licensing fees, sticking it in their products as a value-added feature. It must have been worth it to those companies, too: in 1986 MITI introduced a successor to CAP-X/COMP-X in that year's ITEE, which was named CASL/COMET (CASL being the assembly language, allegedly for "Common Algebraic Specification Language," and COMET being the virtual machine), and within months Sharp and Casio had introduced CASL-compatible pockets, suggesting the feature generated at least some sales. The first CASL/COMET pocket computer, the 1986 Sharp PC-1445, was virtually identical to the CAP-X/COMP-X PC-1440 otherwise, changing only the ROM and the names on the bezel and key (as was the PC-1417G, derived from the PC-1416G). Casio followed suit with the 1987 FX-840P, derived from the FX-850P, and putting the feature on the case for the first time. However, Tandy imported neither unit, and while they were sold in Japan and to some degree in Europe they are rather rare in the United States. Although Casio did release one more FX-770P-based model, the FX-795P in 1987 with 16K of RAM, the CAP-X mode was completely removed and replaced with additional specialized math operations (matrix math, calculus and even complex numbers).

CASL/COMET is similar conceptually to CAP-X/COMP-X and is also a 16-bit system, but it uses 32-bit instruction words to accommodate 23 operations, a full 16-bit address/immediate field, and an additional fourth GPR, which also serves as the stack pointer. While still not true load/store, it did have an instruction to load an effective address a la Intel lea that could add the value of the specified index GPR to the target GPR simultaneously with a constant, finally supporting limited register-to-register math (and if the index register was zero, it did double duty as a 16-bit load immediate). There was also no more base register and no more segmentation, a more useful condition field register was added, and special instructions to read and display entire strings replaced the clumsy READ and WRITE. CPUs had become less quirky, so CASL/COMET became less quirky too. That also made it much nicer to program in, though correspondingly much less interesting as an esolang.

CASL/COMET had other implementations, including this one for the NEC PC-9800 that I spotted on a Japanese auction site, but its most interesting appearance in pocket computers was as part of the remarkable Casio PB-1000 family. The PB-1000 had a much larger 192x32 dot-addressable LCD that could display four lines of 32 characters and 8K of RAM (supporting up to 40K), and while still based on the HD61700, Casio supported programming it directly in assembly — except for the 1987 PB-1000C, in which Casio replaced the HD61700 assembler with CASL/COMET. In 1989, Casio introduced the PB-1000's successor, the PB-2000C. This unit kept the same screen but eliminated the clamshell and came with 32K of RAM (supporting up to 64K with an RP-33), plus a ROM slot for cartridges. Notably, the unit is programmable in real, honest to goodness C with an on-board compiler and editor. The cartridges included BASIC, a surprisingly credible pocket Prolog implementation (a subject of a future post) ... and CASL/COMET.

The last CASL/COMET Casio I could find was the 1989 Casio VX-4, a Japan-only reworking of the HD61700-based FX-870P with 18K RAM (expandable to 50K with an RP-33) that supported BASIC, CASL/COMET and C all in the same unit. This machine is also notable for a "secret" built-in self-test accessible with the undocumented command SYSTEM *. Near as I can determine, Sharp never issued any others of its own.

In 2001 CASL/COMET was revised into CASL II and COMET II, adding more register-to-register operations, expanding the register file, adding an overflow condition bit, and changing the instruction mix as well as reversing the bit order (the most significant bit is now 15). Being the current dialect of the language on today's ITEE, CASL/COMET II has several open source implementations including PyCASL2/PyCOMET2 and a Java conversion. But by then pocket computers had given way to PDAs and graphing calculators and the architecture has yet to grace a handheld from the factory again, so if you want it on your Android, I suppose you'll have to write it yourself. That's good enough for a passing grade, isn't it?

One parting thought before we close: why on earth would MITI have put in all these bizarre quirks? Probably because so many other contemporary systems had similar quirks of their own and we've already pointed out a few. COMP-X wasn't supposed to be an ideal CPU — it was supposed to be a real one, and real ones back then were hairy. CPUs were much more primitive and a lot of their weird edges would be unashamedly exposed to the programmer, and a competent IT professional of the era would have needed to account for them. Real world code had to work and be achievable in spite of those idiosyncrasies. Or, put more succinctly, CAP-X/COMP-X sucks because it was designed to.

The Perl "CRAP-X" reimplementation of CAP-X/COMP-X is on GitHub under a 3-clause BSD license.
