Sunday, April 25, 2021

Refurb weekend: Hewlett-Packard 9000/350

I'm not really a "big iron" enthusiast; I've always liked small systems (for one thing, you can collect more of them without annoying your spouse, though my wife points out for the record she is generally tolerant of my hobbies). One really must specialize in those kinds of machines as a collector, not only for their power and space demands, but also their sometimes unusually complex maintenance requirements.

That doesn't mean I don't have larger machines, however. Besides my three Apple Network Servers (about the size of a decent dorm fridge), a PDP-11/44 in storage I'm not sure what to do with yet and the 2U-in-a-tower IBM POWER6 which runs Floodgap, my other "big" system is my only 1980s-era Un*x workstation, a 1987 HP 9000/350. It came to me already named (homer).

Homer's system processing unit has a 25MHz 68020 and 20MHz 68881 FPU paired with HP's custom MMU (not a 68851) and 32K of cache, which HP claimed was four times as fast as the VAX 11/780 at integer math. It is closely related to the slower, stripped-down 330 (both CPU and FPU at 16.67MHz, no cache; in fact, HP calls the 350 a 98562B and the 330 a 98562A). 9000/300 systems are unusually modular by modern standards: the SPU is in a separate, self-contained box from the rest of the peripherals, all of which are installed in a custom HP steel rack. As internal options it has a HP 98545A colour graphics board (in the bundled configuration HP sold as the 350C) that delivers 1024x768 graphics with 16 colours, 16MB of parity RAM (up to 32MB, but that needs the three-connector system bus plate which I don't have) and a standard Human Interface board (HP 98562-66530, with later versions sold as the HP 98247A) containing low-speed HP-IB, NIC (10Base2, the last Thinnet machine on my household network), HP-HIL (with 46010A keyboard and 46060A mouse), audio and RS-232, plus the HP 98562-66531 optional high-speed HP-IB board necessary for booting from a hard disk. The monitor is the largest CRT display I own, a 19" Sony GDM-1902 that HP repackaged as the 98782A, capable of 1024x768 megapixel graphics.

Over high-speed HP-IB it is connected to a HP 6000 C2203A 670H, an indestructible 670MB CS/80 hard disk with the system name on the front that will outlast the cockroaches. I also have a benighted 9144A tape drive that refuses to stay locked in the rack and requires pre-formatted IOTAMAT QIC cartridges yet won't read them even with a retrofitted capstan, and a 9122D dual DS/DD 3.5" floppy drive. (Yet to be racked, pending investigation, are a 600/A CD-ROM and a 6400 C1511A 1300H DDS-1 tape drive.) It runs HP-UX 8.0, though I am told the NetBSD port is excellent.

In 1987 this would have been a heck of a computer, but you would have paid somewhere north of $50,000 for this configuration which would be a whopping $115,000+ in 2021 money. For comparison, the most I've ever personally paid for a computer was $11,000 for my POWER6, purchased used in 2010 (in 2021 about $13,300), whereas this machine I got for "come and get it" over a decade and a half ago (tip of the hat to Stan and Kevin). It also came with a separate 9000/319C+ system unit, but that's in storage since the 350 is much more powerful (the 319C+ is essentially a consolidated, cost-reduced and minimally upgradeable 330). The Homer Simpson doll was included.

A refurb weekend had been planned for Homer for a while owing to the dead clock battery (it uses the slightly larger 2325 lithium coin cells instead of the more typical 2032s), and it had always had a flaky 10Base2 connection to the network backbone, which I chalked up to cabling because I could usually fix it by messing with the cable and resetting the LAN hardware in /usr/bin/landiag.

This time, however, no amount of tweaking and cajoling could get the network connection back up again. The time had finally come for ... a Refurb Weekend!

The 350 and relatives subdivide internally into RAM board(s), the CPU board, any graphics and other option boards, and the Human Interface board, which is where most of the peripheral connections reside in the default loadout. Its original HP 98562-66530 board looks like this:

The low-speed HP-IB, HP-HIL, audio, RS-232 and NIC are all consolidated onto a single unit, replacing the separate boards (HP 98625 HP-IB, HP 98643 LAN, HP 98620B DMA and HP 98644 serial port cards) required in earlier models. The golden board piggybacked on it is the 98562-66531 high-speed HP-IB board with an integrated cable, which is a functional substitute for the HP 98625B. The unified Human Interface board idea is nice in that you don't need a separate expansion box to get a good mix of devices but bad in that repair is correspondingly much less granular.

The self-test screen showed a valid MAC address for the NIC (the 080009... code), which suggested the MAC portion was working and the problem was either the port or the Thinnet PHY (you kids today have it easy with your newfangled integrated NIC chipsets). On a board of this era they would be separated, and later we'll demonstrate this, but you can already see that everything was soldered down and not socketed. Since I was uncertain at the time what really was the fault, let alone what I would actually replace the faulty component with, I decided to see if I could simply replace the board.

This turned out to be serendipitous because someone was selling a two-pack of 98562-66534 Human Interface boards for a very reasonable price.

("MADE IN USA": don't see that much anymore!) These newer boards were introduced with the later 360 and 370, but because those SPUs are also quite similar to the 350, they'll work just fine in a 330 or 350. In particular the 66534 variant was especially handy to find because it had a more conventional AUI connector to the MAC (the 360/370's 66533 variant was also Thinnet). Just make sure when you get the board that you slide it into the card guides and fully engage the connectors, or you'll get weird DMA and device failures like this:

After an initial moment of panic, making sure the board had a good connection made the problem go away. Both of them checked out and passed the system self-test. Still, since one had thumbscrews and the other didn't, I decided to use the thumbscrewed one. First, let's replace that bad old battery which was almost certainly burned out as well:

A nice 3 volts and change. Next, let's move the high-speed HP-IB board over (the 670H cannot be booted from the on-board low-speed HP-IB). The integrated cable needs to be removed first, so a bit of nylon spudger action frees that up:

With the cable disconnected, removing the board is then a matter of removing the four screws holding it on its standoffs and levering it out of its connector with the spudger again:

Inspecting the 66531 board.

No damage, pin headers look good. Lots of glue logic and not much else.

Now to slip off the cable. The integrated cable has two metal chokes which serve to orient it on the rear plate. You don't need to remove these chokes, but you do need to slide the part of the plate holding the backmost choke off to the side (a pair of pliers helps). Don't pull out that stud holding it; the stud is merely an axis. Just grab and pull the plate tab itself.

With the plate tab off, the cable can now be gently pulled out of its clamp.

When installing the high-speed board in its new home, connect the HP-IB cable to the board first (there's a hollow tab that serves as a key; in the installed position this hollow tab should be up and visible) so that you don't trap the cable under the card installing it. The 66534 board does not have a moveable tab, just a gap for the divot in the rear choke to sit in as shown here. Also, the cable clamp is facing up on the 66534 and not to the side as in the 66530, so the whole thing just goes straight down onto it and the board connector.

Seat the board and put the screws back in. It may flex a little until it settles into its connector. I then made sure the DIP switches matched between the old board and the new board since they would be configured the same way.

One last detail is what we'll use to connect it to the network. My hub does have an AUI port, so I could just run a straight-thru DA-15 cable, but I might as well put that box of MAU transceivers I have to good use. I've been in this business long enough to even have some favourite brands:

BoseLAN MAUs have lots of blinkenlights and are the most compact, but this model's RJ-45 jack is to the side, which would be right in the way of the high-speed board's HP-IB cable. (BoseLAN got bought by Cable Design Technologies, which later merged with Belden.) I am also a big fan of Allied Telesyn gear (now Allied Telesis) -- that 10MBit backbone hub is an AT unit that has been in almost continuous service since about 1999 -- and their MAUs are also very good, but I don't like opening up NOS boxes if I have something loose that will suffice. So I dug out a Transition (still around, apparently) MAU which has very few blinkenlights but wasn't sitting pretty in a new box either. Yes, I've got a shoebox stuffed full of these things.

Installed the new board, but not without a little bit of blood from the side rails. I'm not sure the degrading foam from the sides of the rack is so good for open wounds either.

Booting HP-UX. No more errors!

And testing out the new network card by firing up the Chimera web browser under HP VUE. CDE jockeys will recognise the Visual User Environment as an ancestor, not least because of the Motif interface, and indeed CDE was strongly influenced by it.

After all that, a post-mortem: was the board repairable? Other than minor differences in chip and component revision (and tape covering an EPROM window, which was replaced by a conventional ROM on the later card), the only differences of significance between the 66530 and 66534 are in the corner near the bar code where the AUI or 10b2 port would be. The most obvious change is a large chip marked Reliability 2VP5U9 ("QUALITY IS RELIABILITY") which is not present on the 66534. The 2VP5U9 LAN-PAC is a DC/DC converter that turns up many places, including the Commodore A2065 Ethernet card, and according to its blurb "is designed to provide power and isolation for Local Area Network transceiver chips."

The pinout for these things is quite simple (here is a scan from the datasheet); most of the pins are wired together. There are other variants of this part but this one specifically serves for Thinnet (which was also called "Cheapernet" because it was cheaper than the alternatives, as shown in the table).

The underside of the board shows its connections. The chip itself is at U20. There's really only one line involved here, which naturally is one of the outputs.

With the pinout and following the lone trace, it looks like the 2VP5U9 powers U14, which is some glue logic also not on the 66534, and the surrounding discrete components, but not T1 (you'll notice the traces carefully avoid its pins), which is preserved on the 66534. Helpfully the region is outlined in a lighter green than the PCB, which likely constitutes the entirety of the PHY, but any of these components or any combination thereof could have been faulty. HP warns about this in the service manual: "Field Repair Philosophy for the Model 330/350 Computers and the HP 98568A Opt. 132 and 98570A Expander is assembly, or board level." Well, I guess that's what we ended up doing anyway.

I'm happy to have it fully working again, though it's sad not to have a reason to mess with Thinnet anymore. Or, maybe it's not that sad: I remember how much I hated it in large deployments. But now that it's back in action I'd like to get at least one tape drive working too; that and a port of Crypto Ancienne will be the next project(s).

Tuesday, April 20, 2021

The better way to get VICE on Ethernet with SELinux

Although I was a registered hardcore user of Power64 when my daily driver was still a Power Mac, now that I'm a daily Linux user on this Raptor Talos II the best Commodore 64 emulator is clearly VICE, the Versatile Commodore Emulator. It not only has highly accurate emulation, but can talk to real disk drives over OpenCBM (I use it with a ZoomFloppy xum1541) and even emulates a whole mess of peripherals, including Ethernet cartridges like the RRNet and clones (on my real Commodore 128, I use a 64NIC+).

However, I'm a Fedora user and SELinux is on by default. SELinux will really ruin your day here because it (quite reasonably) sees a random user application trying to tunnel out a network connection through libpcap/libnet as a security risk and disables it by policy. You find this out the hard way by trying to enable the Ethernet cartridge from the VICE preferences interface and getting a message you need to run it as root. I don't run things like Commodore emulators as root, spank you very much.

Fortunately, there's an easy, (probably) one-time workaround; with libpcap and libnet installed (using tun/tap isn't supported yet), you will have to be root just once to fix the problem. Assuming x64sc (or whichever VICE component you're using) is in /usr/bin, you can give it raw network access with setcap cap_net_raw,cap_net_admin=eip /usr/bin/x64sc. Now you should be able to run it without root privileges and be able to access the raw interface. Here's a little test in Kipper BASIC:

Makes cross-development a lot easier!
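
If you want to double-check that the capabilities stuck, getcap /usr/bin/x64sc prints them back. Here is a minimal Python sketch of parsing that output. The two line formats in the comments are my assumption about what newer and older libcap releases print, and parse_getcap_line and has_raw_network_caps are hypothetical helpers, not part of VICE or libcap.

```python
import re

# Hypothetical helper: parse one line of `getcap` output.
# Assumed formats (they appear to differ between libcap releases):
#   newer: /usr/bin/x64sc cap_net_admin,cap_net_raw=eip
#   older: /usr/bin/x64sc = cap_net_admin,cap_net_raw+eip
# Paths containing spaces are not handled by this sketch.
def parse_getcap_line(line):
    path, _, rest = line.strip().partition(" ")
    rest = rest.lstrip("= ")
    match = re.fullmatch(r"([a-z0-9_,]+)[=+]([eip]+)", rest)
    if match is None:
        return path, set(), ""
    return path, set(match.group(1).split(",")), match.group(2)


def has_raw_network_caps(line):
    """True if the line grants both capabilities from the setcap command above."""
    _path, caps, flags = parse_getcap_line(line)
    wanted = {"cap_net_raw", "cap_net_admin"}
    return wanted <= caps and set("eip") <= set(flags)
```

Feed it the line getcap prints for your binary; a False result suggests the setcap step needs repeating (for example, after a package update replaced the executable).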

Sunday, April 18, 2021

Don't be fooled by cheap USB multimeters

A fair number of computers people nowadays would refer to as vintage have USB either as an option or built-in, and USB ports crap out like everything else. Accordingly there are testers: at Big Box Hardware Store the other day while I was buying paint, they had one on clearance for just $3 each. That was worth picking up a couple to mess with.

Whoever wrote the package copy was either a slick advertiser or a liar, but I repeat myself. Among other things it bills itself as a "USB multimeter." This is barely technically true since it does measure both DC voltage and amperage, but is definitely not what you'd consider a typical multimeter. It also says on the back that it's "USB 3.0-3.1 Type A" yet it lacks the blue tongue and extra pins of a true Type A USB 3.x connector.

Still, it's cheap, and it will correctly tell you the voltage off the port (as tested with one of my real multimeters). This isn't enough to tell you if the whole shebang including data lines and signaling is working but it seems unlikely you'd have voltage but nothing else, assuming no monkey business like a "condom" was installed. If that's all you want, plus some reassurance the voltage you're getting is nominal, then this is $3 well spent.

However, what it doesn't accurately tell you, and apparently none of the similar small devices of this type will, is the available current. You'll be able to estimate the current draw by plugging something in the other end, but you won't be able to use it to tell if you're connected to a 1 amp or 2.1 amp port. There are USB testers that will put an adjustable constant load of however many amps on the line, and you can determine the available current by how high you can go before the lines sag, but while basic ones aren't exorbitant they certainly cost more than this one did.
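
To make the constant-load idea concrete, here is a toy Python simulation. Everything in it is an assumption for illustration: the port model (a 5.1V source behind 100 milliohms with a hard current limit), the 4.75V sag cutoff, and the function names. A real load tester measures an actual port instead of calling port_voltage_mv.

```python
# Hypothetical port model: a fixed source behind a small series
# resistance, with a hard current limit beyond which output collapses.
def port_voltage_mv(load_ma, limit_ma, open_mv=5100, series_mohm=100):
    if load_ma > limit_ma:
        return 0  # over the limit; real ports behave less tidily
    # mA * milliohm = microvolts, so divide by 1000 for millivolts
    return open_mv - (load_ma * series_mohm) // 1000


def max_sustained_ma(limit_ma, sag_mv=4750, step_ma=100, ceiling_ma=3000):
    """Step the load upward; return the last current before the voltage sags."""
    best = 0
    for load in range(step_ma, ceiling_ma + 1, step_ma):
        if port_voltage_mv(load, limit_ma) < sag_mv:
            break
        best = load
    return best
```

With this model, max_sustained_ma(1000) and max_sustained_ma(2100) tell the two kinds of port apart, which is exactly what a passive inline meter cannot do because it never applies a load of its own.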

You may be able to infer that a device is drawing more power than is available, but you'll need a powered hub to compare. For example, attaching my INOGENI VGA2USB3 showed exactly 60mA draw and a voltage of 4.99V as connected directly to this Raptor Talos II (and doesn't work). Connected to a powered USB 3.0 hub, however, it reads 5.14V and 550mA (and works). You wouldn't have any idea it's not enough without seeing how it performs connected to something else, and you can't assume the port only offers 60mA because the device may simply not draw anything when it fails to initialize. Likewise, the voltage difference probably isn't salient because the USB spec allows up to 5% variance under load, meaning even a voltage of 4.75V wouldn't necessarily be "sagging" per se.
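
The arithmetic behind those readings, as a small Python sketch. The 5% tolerance is simply the figure quoted above, and the helper names are mine; integer millivolts and milliamps are used to sidestep floating-point rounding.

```python
NOMINAL_MV = 5000
TOLERANCE_PCT = 5  # the variance figure quoted in the text


def within_tolerance(measured_mv):
    """True if the reading is within 5% of the nominal 5V."""
    window_mv = NOMINAL_MV * TOLERANCE_PCT // 100  # 250 mV
    return abs(measured_mv - NOMINAL_MV) <= window_mv


def power_mw(measured_mv, current_ma):
    """Power drawn, in whole milliwatts (mV * mA gives microwatts)."""
    return measured_mv * current_ma // 1000


# The two readings from the INOGENI test, as (millivolts, milliamps)
readings = {"direct": (4990, 60), "powered hub": (5140, 550)}
for name, (mv, ma) in readings.items():
    print(name, within_tolerance(mv), power_mw(mv, ma), "mW")
```

Both voltages are comfortably in range, so the only tell is the roughly ninefold difference in power drawn, and that only shows up because there was a working configuration to compare against.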

Cheap USB testers like this aren't utterly useless but they're really more useful for confirming normal function rather than troubleshooting. If you get low voltage you'd still need to test the computer's power supply as well, and you can only correctly estimate a device's current demand if it's actually functioning and drawing power. You also can't conclude anything about the port's performance under a consistent load since it doesn't generate one. I don't think I wasted my money but you probably don't want to spend any more than that for such a limited device either.

Monday, March 29, 2021

The final official release of Classilla

An apology is owed to the classic Mac users who depend on Classilla as the only vaguely recent browser on Mac OS 9 (and 8.6). I've lately regretted how neglected Classilla has been, largely because of TenFourFox, and (similar to TenFourFox in kind if not degree) the sheer enormity of the work necessary to bring it up to modern standards. I did a lot of work on this in the early days and I think I can say unequivocally it is now far more compatible than its predecessor WaMCoM was, but the Web moves faster than a solo developer and the TLS apocalypse has rendered all old browsers equal by simply chopping everyone's legs off at once. There is also the matter of several major security issues with it that I have been unable to resolve without seriously gutting the browser, and as a result of all of those factors I haven't done an official release of Classilla since 9.3.3 in 2014.

Now that I've announced TenFourFox is winding down, let's just recognize the inevitable and officially declare that Classilla is no longer supported. I may still do minor work on it for my own purposes (for example, like issuing updates to stelae that I think might be helpful to other people, and if so I will post announcements about that here), but I don't make any guarantees on when or in what fashion or even if I will publicly release such work, and I will not be accepting bug reports or feature requests. Effective immediately, the Classilla website is now a placeholder with static documents. Files have been placed on a new permanent location on the Floodgap Gopher server, though files on SourceForge will remain as a faster HTTP mirror. Report-A-Bug is disabled and the Github project (which was never really used) is read-only as of now and will eventually be removed. Today Classilla is once again just a "hobby" project and I'm sorry I couldn't make it more than that.

Naturally the browser has always been open-source. The build instructions are intimidating but they do work, and I've collected the build prerequisites on the gopher server, which Classilla can access, of course. If you decide to make your own build of Classilla, all I ask is that you change the name to something else so people don't ask me about it.

To sort of make it up to folks, today I'm also releasing the incomplete work I have done towards 9.3.4. I'm calling this a "beta" since it hasn't had a great deal of testing, but you can use it or 9.3.3 (there are no substantial differences in security content). 9.3.4b has updates to layout, eliminating the old manual "slow scroll" option and automatically doing a more conservative repaint on sites that used to scroll incorrectly. Unfortunately this is sometimes also slower and occasionally dramatically so, and some sites will flicker as well, but no site will render worse than 9.3.3. It also has a contributed fix to JavaScript to fix a problem with high-precision math (due to a compiler bug in the version of CodeWarrior Pro 7 I use), and adds a convenience hot key (Command-Shift-Z) to toggle between "no style" and the default style sheet so that a badly rendering page can immediately be destyled.

Finally, as a last-minute thing, I also updated some of the built-in stelae and added a couple more for SourceForge, this blog and the TenFourFox Development blog. While the latter renders fine (if slowly), this blog doesn't, so I sped up both, and fixed visual problems and download issues with SF at the same time. I did this after I'd certified the source, though, so just copy them from the revised binary archive if you want to roll them into your own builds (I didn't feel like doing several more hours validating the source archive again for plain text files which can simply be copied).

However, these changes don't help much on modern pages, the majority of which require TLS 1.2 to access at all. Although 9.3.3 added support for SHA-2 certificates and SNI, it's still limited to TLS 1.0, which was recently deprecated and which many servers no longer offer. Adding TLS 1.2 (and, for that matter, 1.3) capability needs sizeable updates to both Necko and NSS which are technically possible but non-trivial. However, now that we have Crypto Ancienne, an easier route is to modify Necko's proxy support to use carl as a backend, which I also implemented in 9.3.4b. If you run Classilla under Classic (as you might on 10.0 through 10.3), or Rhapsody's Mac OS mode, or run Power MachTen, then you can even self-host crypto support without a second system. Here's how.

First, set up carl, Crypto Ancienne's combination proxy and command line demonstration application, either locally or on a machine on your local network (I'll explain why this is important in a second). For Rhapsody/Classic and Power MachTen users, I have pre-compiled binaries available on the Floodgap gopher server that also include micro_inetd configured to bind to localhost. You can download these directly from Classilla or any other compatible Gopher client. The Power MachTen version runs on 4.1.4 and possibly earlier versions. The Rhapsody version runs on any Power Mac running any version of Mac OS X or Rhapsody 5.6/OS X Server v1.2, and possibly earlier versions. Source code is included.

  • Download the binary archive. On Power MachTen, put it into the root folder of the drive you are running Power MachTen from. On OS X or Rhapsody, you can leave it in your home directory or any other desired location.
  • On Power MachTen, log into the virtual machine; on Rhapsody or OS X, start a Terminal session.
  • On Power MachTen only, dfork //carl-machten-414.tar.gz ~/carl-machten-414.tar.gz (yes, two slashes). Change ~ to the desired destination if you want it anywhere else. This copies the archive from the Power MachTen root and strips any resource fork it may have accidentally acquired.
  • cd ~ (or where you put/copied the archive to)
  • gunzip carl-machten-414.tar.gz or gunzip carl-rhapsody-56.tar.gz
  • tar xvf carl-machten-414.tar or tar xvf carl-rhapsody-56.tar
  • You will now have a new folder cryanc with the binaries, so cd cryanc

Now bring up the proxy. On OS X with Classic or Power MachTen (assuming you are tunneling Power MachTen through Open Transport, which is the default), start micro_inetd listening to port 8765 like so: ./micro_inetd 8765 ./carl -p

Don't forget the ./s and the -p, or it won't start or listen correctly on the socket. If for some unexplained reason you are already using port 8765, then change that number in the command line and everywhere you see it below.

On Rhapsody, your instance of "Blue Box" Mac OS may be set up to use a separate IP address (this is the case on my Wally G3), which means connecting to localhost won't work. If you have only one IP address assigned to your main Rhapsody installation, but this address is different from what Mac OS is using, then run ./micro_inetd_any 8765 ./carl -p instead to listen on that interface. Be careful if your Rhapsody machine has a publicly routable IP address; this will make your system into an open proxy! If your Rhapsody install has multiple IPs, however, you really should be handy enough to modify micro_inetd.c and recompile it to listen on the right one.
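
For anyone unfamiliar with the inetd model, the idea is that a small wrapper owns the listening socket and hands each accepted connection to a program as its standard input and output. Here is a rough Python sketch of that idea only; it is not micro_inetd's source, the function names are hypothetical, and it assumes a Unix-like host. The host argument mirrors the localhost-versus-any-interface distinction described above.

```python
import socket
import subprocess


def make_listener(host="127.0.0.1", port=8765):
    """Bind a listening TCP socket. "127.0.0.1" keeps it loopback-only;
    "0.0.0.0" would accept connections on every interface."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(5)
    return listener


def serve_once(listener, argv):
    """Accept one connection and run argv with the socket wired to its
    stdin and stdout, then close the connection when the child exits."""
    conn, _peer = listener.accept()
    with conn:
        subprocess.run(argv, stdin=conn.fileno(), stdout=conn.fileno(),
                       check=False)


def serve_forever(listener, argv):
    """Handle connections one at a time (a real inetd forks per client)."""
    while True:
        serve_once(listener, argv)
```

Something like serve_forever(make_listener(), ["./carl", "-p"]) would be the moral equivalent of the command line above. Binding to loopback by default is the safe choice for the same reason given in the open proxy warning.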

Regardless, with micro_inetd running, now configure Classilla 9.3.4b. Classilla is based on an earlier version of Mozilla that allowed separate proxy definitions for "regular" HTTP proxies and special "SSL proxies" that supported the CONNECT method, since in those days doing so was not necessarily guaranteed. (Today every modern HTTP proxy supports CONNECT and the distinction is no longer relevant.) Here we have set both proxy settings to localhost on port 8765 — if you are on Rhapsody or using a non-standard Power MachTen configuration, or running carl on a separate machine, substitute that IP or hostname for localhost as necessary — though you don't have to proxy unencrypted HTTP traffic through Crypto Ancienne (that said, it will politely pass such traffic through).

The reason the browser prefers to use CONNECT is so the connection between the server and the browser is encrypted end-to-end (all the proxy is doing, in this case, is shoveling data back and forth). However, this means the browser is doing the encryption, which is not what we want. 9.3.4b adds a new preference called network.http.proxy.use-http-proxy-for-https which says that the browser should make an unencrypted request for an encrypted resource and defer the encryption to the proxy. Find this preference in about:config and set it to true.

Now view any https:// URL. The request will be forwarded to Crypto Ancienne, which will do the encryption for you. Here is Classilla 9.3.4b accessing Hacker News.

You'll note that the padlock icon which would ordinarily indicate a secure link shows an insecure one, and if you click on it Classilla will indicate that the connection is not encrypted. This is correct and intentional: the connection between you and the proxy is not encrypted. It just so happens that the proxy is the same computer via the internal loopback, so nothing can get in the middle. However, if you place Crypto Ancienne on another system on the local network, other systems on the network can potentially snoop you, and if you use this method to connect to a proxy that's not on your local network ... well, that's just dumb. Don't do that.
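
The difference between the two proxying styles is easiest to see in the first line each one sends to the proxy. This Python sketch builds both for a given URL. It illustrates the general HTTP proxy conventions only: the exact request line and headers Classilla emits aren't shown here, so treat the HTTP/1.0 version and the Host header as assumptions, and the function names as hypothetical.

```python
from urllib.parse import urlsplit


def connect_request(url):
    """Tunnel style: ask the proxy for a raw pipe to host:port, after
    which the browser itself would negotiate TLS through it."""
    parts = urlsplit(url)
    port = parts.port or 443
    return f"CONNECT {parts.hostname}:{port} HTTP/1.0\r\n\r\n"


def plain_proxy_request(url):
    """Deferred style: send the full https:// URL in the clear and leave
    the TLS connection to the proxy."""
    parts = urlsplit(url)
    return f"GET {url} HTTP/1.0\r\nHost: {parts.netloc}\r\n\r\n"
```

For https://example.com/ the first yields a CONNECT to example.com:443 and the second a GET of the whole URL. The second form is why the hop between browser and proxy is readable by anyone who can see it.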

In a like fashion, since Classilla never sees the server's certificate in this configuration, it can't verify its authenticity either. Crypto Ancienne may do this in a future version but for now, you may wish to find other means of confirming the host you have connected to is the host you want to connect to.

Pulling it all together, here is a screenshot from a running system, my dual 1.8GHz Mirrored Drive Doors Power Mac G4 (the system used to develop Classilla) running 9.2.2 and Power MachTen. Yes, my copy is legally purchased from Tenon, spank you very much.

You can also use the same installation of Crypto Ancienne for MacLynx; just change the proxy URLs in lynx.cfg to http://localhost:8765/ for HTTPS and/or HTTP as appropriate.

As a polite warning for Power MachTen users, Classilla can still unsettle your Mac, and if it does, that can corrupt your MachTen FFS volumes. Once you get this set up, you may want to back up your FFS images and consider running Power MachTen and Classilla from separate partitions or hard disks. At some point I would like to port carl into an MPW Tool so you can run it there, but I haven't even started on that yet, and I don't make any guarantees I'll ever do so. If I end up doing that, just like with stelae for other important sites that you can download and add to the Classilla `Byblos` folder, I will post about those things here and there is a "classilla" tag for them.

Regardless, Classilla still serves a basic purpose for me, and with judicious use of destyling serves as a very basic browser that's a little more than MacLynx and Netscape, a little faster than iCab, and now can access more pages than IEMac can. Hopefully this gives it a little longer lease on life because classic Mac OS still has an interface and user experience no other OS, even little-m macOS, has ever matched. I learned a lot from working on it. Thanks to everyone who said kind things about it.

Saturday, March 20, 2021

When you have too much memory for SheepShaver

When I first got my 133MHz BeBox (not new, sadly), it had "only" 32MB of memory and it had four more SIMM slots to fill. While Be only officially supported 256MB of RAM, I was blissfully ignorant of that, bought an additional 256MB of memory in four equally sized 72-pin SIMMs and installed it for 288MB of RAM. (It can actually take up to 1GB, I later learned.) Nice, I said! And then SheepShaver never worked again.

SheepShaver is a desperate pun and an unusual emulator: much like Classic on PowerPC Mac OS X, on big-endian PowerPC most of the MacOS and its applications run natively on the processor, in a form analogous to KVM-PR. In fact, SheepShaver on Leopard is pretty much the best way to run Classic applications on Power Macs that must run Leopard, though it also runs on Tiger and presents certain advantages there as well. It existed first on BeOS as a paid product before becoming open source, though multiple later forks fix various problems on modern platforms.

My original theory was that I had somehow broken something in the update or some other installation, and so I never did much with it (especially since I have plenty of real Power Macs around here). But while I was doing other work on the machine, after a game of BeOS Doom I accidentally double clicked on its icon on the desktop and ... it started up! What could have restored it, I feverishly wondered? Did something monkey around with the memory map? (Foreshadowing music plays here.) It only ran the one time, however, and I spent hours trying to retrace my steps to see if I could make it work again and I never could.

But this at least told me that the install was fine and the problem lay elsewhere. I had never closely looked at it in a debugger. Perhaps it was time.

The BeOS debugger isn't gdb, but you get the idea. The offending instruction was an stbu (store byte with update), but the effective address was ... really weird. It looks like it's wrapped around the entire addressing space back to 0! How did this program even work?

In the source code, for all supported platforms, SheepShaver (and Basilisk II, a 68K emulator it shares substantial code with) has a SIGSEGV handler for trapping segmentation faults; here is BeOS's. My initial thought was that somehow the handler wasn't being installed, but a couple debug printfs in the handler showed that not only was the handler being triggered, it was actually passing the segfault along to the system handler apparently on purpose.

A partial explanation appears in the Darwin (Mac OS X) port:

Under Mach there is very little assumed about the memory map of object files. It is the job of the loader to create the initial memory map of an executable. In a Mach-O executable there will be numerous loader commands that the loader must process. Some of these will create the initial memory map used by the executable. Under Darwin the static object file linker, ld, automatically adds the __PAGEZERO segment to all executables. The default size of this segment is the page size of the target system and the initial and maximum permissions are set to allow no access. This is so that all programs fault on a NULL pointer dereference. Arguably this is incorrect and the maximum permissions should be rwx so that programs can change this default behavior. Then programs could be written that assume a null string at the null address, which was the convention on some systems. In our case we need to have 8K mapped at zero for the low memory globals and this program modifies the segment load command in the basiliskII [sic] executable so that it can be used for data.

So, the handler expects actual memory to be mapped at an effective address of zero for the MacOS's low memory globals, a holdover from the 68K days (and if I'd read the Basilisk technical notes, I would have realized that sooner). Since such a fault should never have gotten to the handler in the first place, it just passes it along and crashes. That kind of significant address space remapping clearly could not come from a user-level executable on BeOS; there had to be some sort of system component doing that remapping.

Turns out SheepShaver did in fact install a couple system extensions:

$ find /boot/home/config/add-ons -name 'sheep*' -print
$ find /boot/beos/system/add-ons -name 'sheep*' -print
The last one is used for tunneling emulated networking through the host machine; the sheep driver is the one we want (the two sheep drivers are actually the same file; the dev/ one is a symlink to the actual file in bin/). After a little digging in the source tree, I found the C source for it. It became rapidly obvious after a cursory readthrough that it manipulates the PowerPC page tables.

On PowerPC (prior to POWER9 which introduces a higher-performance radix MMU), the mapping between virtual addresses and physical addresses is maintained by a set of hashed page tables, divided into page table entry groups, or PTEGs. (There is an alternate pathway using block address translation "BAT" registers but I'm going to ignore that for the purposes of this discussion.) The low memory globals region is 8K in size, so (with 32-bit PowerPC) we need two 4K memory pages to map to 0x0000 and 0x1000, which needn't be contiguous in real memory since we'll set up mappings for each page individually. The driver allocates three pages with malloc() and takes a page-aligned slice of two pages within it, then tries to find where in physical memory those pages got mapped to using get_memory_map(). Now we want to make those pages' effective address mapping in SheepShaver point to 0x0000 and 0x1000 instead.
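The reason for allocating three pages to get two is alignment: malloc() makes no promise about page boundaries, so up to just under one page can be lost rounding up to the next one. Here is a minimal Python sketch of that arithmetic (the function names are mine, not the driver's), checked against the addresses in the driver log shown later in this post:

```python
PAGE_SIZE = 0x1000  # 4K pages on 32-bit PowerPC


def page_align_up(addr: int) -> int:
    """Round an address up to the next page boundary (no-op if already aligned)."""
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def aligned_slice(base: int, pages_allocated: int = 3, pages_needed: int = 2):
    """Return (start, end) of a page-aligned run inside an unaligned allocation."""
    start = page_align_up(base)
    end = start + pages_needed * PAGE_SIZE
    # With one spare page, the aligned run always fits inside the allocation.
    assert end <= base + pages_allocated * PAGE_SIZE
    return start, end


if __name__ == "__main__":
    start, end = aligned_slice(0x0202B228)  # malloc() result from the first log
    print(f"aligned slice: {start:#010x}..{end:#010x}")
```

With only two pages allocated, any unaligned base would leave the second page hanging off the end of the buffer, which is why the spare page is needed.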

To find a real address in 32-bit PowerPC, the top four bits of the effective address select one of 16 segment registers, each mapping a 256MB block of the effective address space. The segment register's low 24 bits (the Virtual Segment ID) are combined with the effective address's 16-bit page number and 12-bit byte offset within that page to generate a 52-bit virtual address. The VSID and the page number then get hashed and combined with the storage description register SDR1 to yield the address of the PTEG, the correct PTE is found within it, and the real page number in that PTE then becomes the upper 20 bits of the resulting real address. We're going to work this in a similar fashion to find the PTEG that would contain the mapping for these lowest page addresses.
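To make the lookup concrete, here is a rough Python sketch of the split and the hash as I understand them from the 32-bit PowerPC architecture: the primary hash is the low 19 bits of the VSID XORed with the page number, and the secondary hash is its ones' complement. It is simplified in one important way: instead of modelling SDR1's HTABORG and HTABMASK fields, it just masks the hash to a power-of-two table size and returns byte offsets from the start of the table. The function names are made up for illustration, and the 65536-PTEG table size is the figure for this BeBox.

```python
PTEG_BYTES = 64  # eight 8-byte PTEs per group


def split_ea(ea: int):
    """Split a 32-bit effective address into (segment register, page number, byte offset)."""
    return (ea >> 28) & 0xF, (ea >> 12) & 0xFFFF, ea & 0xFFF


def virtual_address(vsid: int, ea: int) -> int:
    """24-bit VSID + 16-bit page number + 12-bit byte offset = 52-bit virtual address."""
    _, page, offset = split_ea(ea)
    return ((vsid & 0xFFFFFF) << 28) | (page << 12) | offset


def pteg_offsets(vsid: int, ea: int, n_ptegs: int):
    """Byte offsets of the primary and secondary PTEGs (assumes n_ptegs is a power of two)."""
    _, page, _ = split_ea(ea)
    mask = n_ptegs - 1
    primary = ((vsid & 0x7FFFF) ^ page) & mask
    secondary = ~primary & mask  # ones' complement of the primary hash
    return primary * PTEG_BYTES, secondary * PTEG_BYTES


if __name__ == "__main__":
    p, s = pteg_offsets(0xD37, 0x00000000, 65536)
    print(f"PTEG1 +{p:#x}, PTEG2 +{s:#x}")  # PTEG1 +0x34dc0, PTEG2 +0x3cb200
```

Adding those offsets to a table base of 0x30000000 gives 0x30034dc0 and 0x303cb200, the same PTEG1 and PTEG2 addresses the driver reports for EA 0 with VSID d37 in the first log further down.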

Traditionally the number of PTEGs is optimally half the number of real pages to be accessed, and since the next highest power of two in a 288MB BeBox is 512MB, that means 2^29 addressable bytes in (divided by 4K, or 2^12) 2^17 pages. Halving that yields 2^16, or 65536, 64-byte PTEGs for a total size of 4MB. BeOS has a specific memory area for this, appropriately named pte_table, that we can look up with find_area() (thus giving us the effective address of the page table pointed to by SDR1). We find the relevant PTEG for each page by doing the same hashing steps the processor would do to resolve the address. In that PTEG, each PTE's highest bit is whether it's valid, followed by the 24-bit VSID, one bit for the hash type flag, and six bits of the effective address called the Abbreviated Page Index; the second word carries the 20-bit Real Page Number and the protection and access control fields.
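As a sanity check on both the sizing and the PTE layout, here is a small Python sketch. The field positions are the 32-bit PowerPC layout as I understand it (first word: 1-bit V, 24-bit VSID, 1-bit H, 6-bit API; second word: 20-bit RPN, then R, C, WIMG and PP), and decode_pte is just an illustrative helper, not anything from the driver.

```python
RAM_CEILING = 512 * 1024 * 1024  # next power of two above 288MB
PAGE_SIZE = 4096
PTEG_BYTES = 64

pages = RAM_CEILING // PAGE_SIZE     # 2**17 pages
n_ptegs = pages // 2                 # 2**16 PTEGs
table_bytes = n_ptegs * PTEG_BYTES   # 4MB page table


def decode_pte(hi: int, lo: int) -> dict:
    """Unpack the two 32-bit words of a PTE into named fields."""
    return {
        "V": (hi >> 31) & 1,           # valid bit
        "VSID": (hi >> 7) & 0xFFFFFF,  # virtual segment ID
        "H": (hi >> 6) & 1,            # hash function flag (0 = primary)
        "API": hi & 0x3F,              # abbreviated page index
        "RPN": (lo >> 12) & 0xFFFFF,   # real page number
        "R": (lo >> 8) & 1,            # referenced
        "C": (lo >> 7) & 1,            # changed
        "WIMG": (lo >> 3) & 0xF,       # cache control bits
        "PP": lo & 0x3,                # page protection
    }


if __name__ == "__main__":
    print(n_ptegs, table_bytes // 1024, "KB")
    fields = decode_pte(0x80069B80, 0x00000010)
    print({name: hex(value) for name, value in fields.items()})
```

The sample words are one of the "found" entries from the driver log below; unpacked this way they come out as a valid PTE with VSID d37, API 0 and RPN 0.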

We won't know the VSID without looking at the segment registers, but we can just walk the entire page table instead since we only have to set this mapping up once. When we find a valid PTE that matches the API, then we know this is a candidate PTEG and derive the VSID from that. We can then either directly modify an existing PTE within it or take advantage of the fact that each PTEG essentially offers up to eight hash collision resolution slots to add a PTE of our own. If we do this to the first place the CPU will look, we will take over that memory mapping for the life of the process.
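The "modify an existing PTE or claim a free slot" choice can be sketched like this in Python. This is my reconstruction of the idea, not the driver's actual code: the function names are invented, the PTEG is modelled as a list of eight (word0, word1) tuples rather than live memory, and the default WIMG and PP values are simply the ones that appear in the words the driver logs as written.

```python
def encode_pte(vsid: int, api: int, rpn: int, wimg: int = 2, pp: int = 2, h: int = 0):
    """Build the two 32-bit words of a valid PTE."""
    hi = 0x80000000 | ((vsid & 0xFFFFFF) << 7) | ((h & 1) << 6) | (api & 0x3F)
    lo = ((rpn & 0xFFFFF) << 12) | ((wimg & 0xF) << 3) | (pp & 0x3)
    return hi, lo


def choose_slot(pteg, vsid: int, api: int):
    """Pick a slot in an 8-entry PTEG: a matching valid PTE if present, else the first free one."""
    free = None
    for i, (hi, _lo) in enumerate(pteg):
        valid = (hi >> 31) & 1
        if valid and ((hi >> 7) & 0xFFFFFF) == vsid and (hi & 0x3F) == api:
            return i, "existing"
        if not valid and free is None:
            free = i
    return (free, "free") if free is not None else (None, "full")


if __name__ == "__main__":
    # A PTEG with two unrelated valid entries followed by empty slots.
    pteg = [(0x80069B80, 0x00147010), (0x80076280, 0x082B5190)] + [(0, 0)] * 6
    slot, kind = choose_slot(pteg, vsid=0xC70, api=0)
    pteg[slot] = encode_pte(vsid=0xC70, api=0, rpn=0x45A8)
    print(slot, kind, [f"{w:08x}" for w in pteg[slot]])
```

A real implementation also has to worry about what happens when the group is full and about invalidating the TLB after the write, neither of which this sketch attempts.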

The memory mapper conveniently has debug logging support for a simple tool called PortLogger that I patched up for BeOS R5. I compiled it with debugging on, restarted, ran PortLogger, started SheepShaver (it crashed, of course) and looked at the output:

$ ./PortLogger 
    PortLogger version 0.4.1
   Cameron Kaiser    - 14/02/21
   Simon Thornington - 14/02/97
control(10000) data 0xfd001bb8, len 00000000
3 pages malloc()ed at 0x0202b228
Address aligned to 0x0202c000
Memory locked
get_memory_map returned 0
PTE table seems to be at 0x30000000
PTE table size: 4096KB
Found page 0  PtePos 58b84 V1 VSID c70 H0 API 08 RPN b11a R0 C0 WIMG2 PP0 
Found page 1d PtePos 58ba4 V1 VSID c70 H0 API 08 RPN b11b R0 C0 WIMG2 PP0 
Found page 1d PtePos 178580 V1 VSID d37 H0 API 2c RPN b11b R1 C1 WIMG2 PP0 
Found page 0  PtePos 1785a0 V1 VSID d37 H0 API 2c RPN b11a R1 C1 WIMG2 PP0 
Trying to map EA 0x00000000 -> RA 0x0b11a000
PTEG1 at 0x30034dc0, PTEG2 at 0x303cb200
 found 80069b80 00000010
 existing PTE found (PTEG1)
 written 80069b80 0b11a012 to PTE
Trying to map EA 0x00001000 -> RA 0x0b11b000
PTEG1 at 0x30034d80, PTEG2 at 0x303cb240
 found 80069b80 00001010
 existing PTE found (PTEG1)
 written 80069b80 0b11b012 to PTE
The driver seemed to properly reserve memory and find the real address (and thus real page number) for its mapping, and was able to resolve and walk the page table. But one problem jumped out immediately: we only have two pages (here 0 and 1d). Why is it that it found four? Notice that the "fraternal twin" pages have matching RPNs, but the VSIDs are different and we don't know which VSID is right. Did our algorithm effectively cause its own hash collision?

Continuing on, when we look at the existing PTE we found, the RPN is the first through fifth hex digits in the second word and both effective addresses match their real ones (80069b80 00000010 and 80069b80 00001010). That seems hinky.

My first thought was maybe we had a stale TLB and our PTE change didn't stick, because on the PowerPC 603 and 603e the code doesn't do a tlbsync to synchronize the translation lookaside buffer (which caches all this work) and this BeBox has two 603e CPUs. However, despite the code and Metrowerks saying it's 604-only, tlbsync is listed as a valid instruction in my copy of the 603e User's Manual Appendix A. I forced it to do a tlbsync by commenting out the check, compiled it again, restarted, ran PortLogger and started SheepShaver. Unfortunately, while it didn't do anything worse, it didn't work either.

My next guess was to see if maybe we were working on the wrong "twin." Assuming we really did have two sets of colliding hashes, what if we used the other one? A line of code to stop the search at the first page pair rather than the second was added and I tried again:

$ ./PortLogger 
    PortLogger version 0.4.1
   Cameron Kaiser    - 14/02/21
   Simon Thornington - 14/02/97
control(10000) data 0xfd001bb8, len 00000000
3 pages malloc()ed at 0x01587bb8
Address aligned to 0x01588000
Memory locked
get_memory_map returned 0
PTE table seems to be at 0x30000000
PTE table size: 4096KB
Found page 0  PtePos 33f04 V1 VSID c70 H0 API 05 RPN 45a8 R1 C1 WIMG2 PP0 
Found page 1f PtePos 33f24 V1 VSID c70 H0 API 05 RPN 45cd R0 C0 WIMG2 PP0 
Trying to map EA 0x00000000 -> RA 0x045a8000
PTEG1 at 0x30031c00, PTEG2 at 0x303ce3c0
 found 80069b80 00147010
 found 80076280 082b5190
 found 00000000 00000000
 free PTE found (PTEG1)
 written 80063800 045a8012 to PTE
Trying to map EA 0x00001000 -> RA 0x045cd000
PTEG1 at 0x30031c40, PTEG2 at 0x303ce380
 found 80069b80 00146010
 found 80076280 082b4190
 found 00000000 00000000
 free PTE found (PTEG1)
 written 80063800 045cd012 to PTE
Success! Now we actually have a free PTE, instead of modifying a questionable one, and we alter that. The mapping now takes precedence over anything else for that effective address and SheepShaver starts and runs normally. It also fixed Basilisk II, which would not run for the same reason, though SheepShaver seems to run 68K applications rather better than Basilisk II does.

Why was this never noticed? Well, like I say, Be never advertised support for more than 256MB in the BeBox, and in 1997 that would have been a significant amount of memory (my Power Mac 7300 in 2000 had 192MB and I thought that was a lot). Most PowerPC systems running BeOS probably had substantially less. As with many other bugs triggered by clock speed or RAM size, no one ever dreamed future users would have such a surfeit of either.

The patched binary and source code are on Be-Power.

Sunday, March 14, 2021

Those mysterious Toshiba T-chips (plus: VTech madness with the Laser 50 and Type-right)

One of my wife's favourite possessions from childhood is an Australian Dick Smith Electronics (she's an Aussie, I'm a half-breed) Type-right. Much like Tandy Radio Shack in the United States, Dick Smith would happily rebadge anything he didn't have to engineer or build from scratch, and a number of his early computers were actually rebadges from Hong Kong company VTech. For example, the 6502-based VTech Creativision was sold in Dick Smith stores as the Wizzard [sic], and several of their Laser computers (notably the Z80-based 200, 210 and 310) were sold as the VZ line.

It won't come as a surprise to learn that this one is a rebadge too:

Yes, it's another VTech unit. In the United States it was sold (as the VTech Type-right) with an instruction manual and a tutorial cassette, but my wife only remembers it coming with a spiral-bound manual she is no longer able to find. Let's power it on.
The display is a simple 8-character LCD. (Cheap is the name of the game here. More on that in a moment.) We can select lessons "CLASS" or a "GAME," which simply shows a queue of letters you have to clear from the screen by typing. She liked it, and it was kinda fun, in a Typing Tutor III Letter Invaders kind of way. The class mode, however, offers multiple lessons:
Yeah, okay, you want to see the hardware, too. This is where my wife turned a little paler than usual when I went after it with a screwdriver. Don't worry, it still works!

Another thing you won't be surprised by: there's not much inside. It's pretty much mostly keyboard.

These cheap circuit boards are typical of the time. Everything in sight is covered in belched-up strands of epoxy. Although the sticker on the underside of the case has a 1985 copyright date, the actual board seems to have a manufacture date of "20 JAN 1988" with a yellow sticker "U.K." (Hong Kong was of course still a British territory then, but who knows what the sticker actually means).
There is a chip visible, but if we peel back the "T R" sticker,
we see it's just an Intel NMOS 8K 2764 ROM. (Best guess from the visible markings is that it was manufactured in the 35th week of 1985.) So where's the CPU?

If we look back at the boards, we see a dotted area outline with the legend "U1" under that folded length of stiff ribbon cable. Time to pull out more screws and turn the board over.

There is one big chip here in that position. Close up,
it's a surface mount Toshiba chip labeled T7951. What is this thing? Clearly it controls just about everything on the board!

The first clue comes from another VTech computer called the Laser 50, though not one that Dick Smith Electronics is known to have rebadged, which was sold in several countries including the United States. It resembles a Casio Pocket Computer and functions much the same way (even with the same P0 through P9 BASIC program spaces), but has a full keyboard and a carrying handle. Hmmmmm, does this form factor look familiar?

The underside also carries a 1985 copyright date.
Cracked open to reveal, again, an obscenely cheap design. Besides the liberal application of epoxy, also notice the cardboard washers where the screws go in the left PCB to prevent them from shorting traces. There are card edges for a 16K memory expander and a cartridge/peripheral bus, but I've never seen them used, and I'm told they mated incredibly badly. External power (though it does do marvelously well on batteries) and cassette connectors are on the top.
On the right board we again see silkscreened outlines labeled U1 and U2. This unit is non-functional and you can see they're definitely not meant to be repaired, so we're not going to make anything worse trying to get the board out. This turned out to be an unexpectedly difficult undertaking because the chips were literally stuck with (now degenerating) adhesive foam pads to the back of the LCD and it took a spudger to pry them off. They don't seem like they were there for cooling and in fact probably made things worse. I scraped off one so you can see.
Zooming in, this chip is labeled Toshiba T7813. I scraped the pad off the other chip after, and logically it's labeled Toshiba T7812. Otherwise they look identical.
The only other chip we saw was a Sharp LH5116-15, which is a 2K (16Kbit) static RAM chip. It has a manufacture date of 33rd week 1988.

There is precedent for subsuming multiple functions into a single chip; it makes manufacture cheaper (and these things are all about cheap). Many of the small Pocket Computers used similar all-in-one chips that contained the CPU, LCD driver, I/O, mask ROM and RAM. Armed with this information, we can comfortably conjecture that there are two in the Laser 50 because it has a 16 character display, so they must each run one half, and the Type-right just has one because it only has eight characters. Likely the only difference between the T7812 and T7813 is the contents of the mask ROM.

That just leaves what the CPU is. The Laser 50 does not have any facility for directly writing machine code programs, and we can't get into its ROM/ROMs (though if I wanted to wreck my wife's precious toy and be doomed to sleep outside forever I might be able to get the ROM out of it), but there are other Toshiba T-chips serving as CPUs in other machines. The ones I could find after a few hours of Googling include:

  • T7775 (MSX). This contains a Toshiba Z80 clone CPU with Intel i8255 PPI clone I/O and clock-bus-mapper-glue, with separate VDP (video), PSG (sound), ROM and RAM. It only appeared in four systems and they all dated from 1985.
  • T7826. This turns up in another cheap-as VTech device, the Whiz Kid. This unit has a copyright date of 1984. It seems even more deprived than the units here, but later, related-in-name-only toys in this series were more functional and some even had speech synthesis.
  • T7937 and T7937A (MSX-Engine v1). This also contains a Z80 clone CPU (the TMP84C00A), plus other notionally discrete peripheral T-chips on die, namely the Toshiba T7766A (PSG clone) and Toshiba T6950B (VDP clone). It also carries a TMP82C55A (8255 PPI clone) and clock-bus-mapper-glue, but uses separate RAM and ROM.
  • T9763, T9769 and T9769A,B,C (MSX-Engine v2). These are upgraded versions of the T7937 series with MSX 2/2+ capability, but are otherwise the same, and also use separate RAM and ROM.

By 1988, VTech seems to have abandoned the Toshiba T-chips and used a discrete and genuine Zilog Z80 in its PreComputer 1000. With this in mind, while we don't know what was in the T7826, all the other CPU-like T-chips I could find incorporate Z80s and VTech kept using the Z80 even after they stopped using T-chips, so these systems are probably Z80-based too.

Last but not least, which came first, the chicken Type-right or the Laser 50? Given the tendency of cost-reduced designs to be follow-ons, and that T-chip numbers appear to be more or less sequential, the Type-right appears to have been a cost-reduced design based on the Laser 50. And, well, probably a better design as well. After all, hers still works.