Sunday, August 30, 2020

A second look at computer stereoscopy with the Minoru 3D webcam and Camaglyph

This article will be part of a series -- you can read other entries

Did you see what I did there? I'm proud of that title. Thank you very much.

Is 3D vintage computing? Well, it's complicated. 3D in computing didn't really have much traction in the early microcomputer age because of the limited palette, resolution and computing power: it was hard enough generating one image, let alone two and then figuring out ways to merge them. (I have some ideas about that but more later.) Apart from colour anaglyphs, best known as the red-blue glasses method, early stereoscopic computing was generally limited to active technologies (i.e., show one eye and then show the other), and the slower switching rates made this a rather flickery, headache-inducing affair. While some higher-end video cards like the Nvidia Quadro FX 4500 in my Power Mac Quad G5 (circa 2005-6) have connectors for active stereo glasses -- in this case a 3-pin mini-DIN -- almost all of these were proprietary and very few software packages supported them. Worse, these glasses wouldn't work with flat-panel LCDs, which were already displacing CRTs by then, because of the higher refresh rates required. There were some home gaming products like the Sega Scope 3D for the Master System, plus the infamous Nintendo Famicom 3D System and Nintendo Virtual Boy, but these only succeeded in showing the technical infancy of the genre and possibly increasing local referrals to ophthalmologists. (I'm intentionally ignoring lenticular and Pulfrich stereoscopy here because of their limited applications.)

IMHO, stereoscopic computing didn't really hit the mainstream until the advent of Blu-ray 3D, which increased, at least temporarily, the home market for polarized LCD displays that could show 3D content using passive glasses. Colour anaglyphs are neat, and can be done well, but they necessarily interfere with the colour palette to ensure that each eye gets as separate an image as possible (even tricks like Dolby 3D do this, though usually imperceptibly). Although active displays are still a thing -- my 3D home theatre is DLP, and uses DLP-Link with 120Hz active glasses -- active glasses are heavier and have to be powered, and active display solutions are overall more expensive. Passive LCD 3D screens, however, were for at least a few years plentiful and relatively inexpensive, and most TVs of a certain vintage came with the feature when it was thought 3D would be the next big thing. 3D movie cameras powered many major studio production shoots and 3D movies were plentiful in consumer stores. Many games for contemporary consoles, notably the Xbox 360 and PlayStation 3, offered stereoscopic gameplay, and you could even buy computer monitors that were 3D. My secondary display is a Mitsubishi Diamondcrysta RDT233WX-3D, which is an exceptionally fine IPS passive display I ended up importing from Japan, but there were many manufacturers on both shores.

Well, those days have died. If you look at Wikipedia's list of stereoscopic video games, which is a good summation of the industry, there is a dramatic falloff around 2014. Most 4K TVs in the United States don't offer a 3D mode of any sort, partially for technical reasons I'll mention, and only a subset of console games still support stereoscopy. Many Blu-ray players will still play them, but Blu-ray 3D movies are all but gone from the American market. While movie theatres still offer(ed) 3D showings, COVID-19 has kind of crushed that industry into little tiny theatre bits, and very few major motion pictures are filmed in native 3D anymore. I think it's safe to say that until such time as the fad revives, as fads do, stereoscopic computing has retreated back to the small realm of enthusiasts.

And, well, I'm one of them. I'm not as nuts as some, but in addition to the Diamondcrysta monitor I have a 3D home theatre (using DLP and DLP-Link glasses), a Vizio 3D TV, a Nintendo 3DS and a Fuji FinePix Real 3D W3 still and video camera. I've got heaps of Blu-ray 3D that I imported at sometimes confiscatory rates, but happily most of them are all-region. I'm pretty much on my own for software, though, so I figured writing a simple display tool for a 3D webcam was a good place to start with writing my own stereoscopic applications. And there's one available very cheaply on the remaindered market:

This device is the Minoru 3D webcam, circa 2009. Despite the name (it means "reality" in Japanese), the device is actually a British creation. It is not a particularly good camera as webcams go, but it's cute, it's cheap and it's 3D. However, it only comes with Windows drivers, and they don't even work with Windows 10 (probably why it's so cheap now). My normal daily driver is Linux. Challenge accepted.

The Minoru appears to the system as two USB Video Class cameras in a single unit connected by an internal hub; lsusb sees the two cameras as Vimicro Venus USB 2.0 devices, which is a very common USB camera chipset. Despite the documentation, the maximum resolution and frame rate of the cameras is 640x480 at 30fps and the 800x600 mode advertised appears to be simply software upscaling. Nevertheless, when treated as separate devices, the individual video cameras "just work" with Video4Linux2 in VLC (the "eye" lights up when it's viewing you), so what we really need is something that can take the two images and merge them.

The Minoru's included drivers offer a side-by-side mode but most people will run it in anaglyph mode. Appropriately, it comes with a metric crapton of cardboard glasses you can give to your friends so they can see you in all your dimensions. There are many colour schemes for anaglyphs but the most common is probably red on the left lens and blue (or preferably cyan) on the right, and there are many algorithms for doing that. I wrote a simple V4L2 backend that runs both camera sides simultaneously and pulls left and right frames with an SDL-based frontend as the visualizer, and then selected two methods that are generally considered high(er) quality.

The first, the optimized anaglyph method, is quite straightforward to implement: for the merged image, use the green and blue channels from the right image, and compute the red channel of the merged image using 0.3 of the left image's blue channel and 0.7 of the left image's green channel. (Some versions of this apply a 1.5 factor, i.e., a 50% boost, prior to merging them, which helps with dimness but means the resulting value needs to be clamped.) This has the effect of dropping both images' red channels completely but the eye can compensate somewhat, and the retinal rivalry between eyes is reduced compared to more simplistic methods. A SIMD-friendly method of doing this is to simply copy the entire right image to the merged image, and then overwrite the red channel in a separate loop. There is still a bit of ghosting but this sort of image can be rendered very quickly. Here is an optimized anaglyph image of yours truly from the Minoru, viewable with red-cyan glasses:
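In code, the optimized anaglyph merge described above might look something like the following. This is a minimal sketch rather than the actual Camaglyph source: the function name `optimized_anaglyph`, the `boost` parameter and the packed 8-bit RGB layout (three bytes per pixel, R then G then B) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round to nearest and clamp into the 0..255 range of one channel. */
static uint8_t clamp_u8(float v)
{
    if (v <= 0.0f) return 0;
    if (v >= 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/*
 * Hypothetical helper: merge two packed RGB frames of npixels each.
 * boost is 1.0f for the plain method, or 1.5f for the brightened
 * variant (which is why the clamp is needed).
 */
void optimized_anaglyph(const uint8_t *left, const uint8_t *right,
                        uint8_t *out, size_t npixels, float boost)
{
    size_t i;

    assert(left && right && out);

    /* Pass 1: bulk copy of the right view; its green and blue stay. */
    memcpy(out, right, npixels * 3);

    /* Pass 2: overwrite red with 0.7 left green + 0.3 left blue. */
    for (i = 0; i < npixels; i++) {
        const uint8_t *l = left + i * 3;
        out[i * 3] = clamp_u8(boost * (0.7f * l[1] + 0.3f * l[2]));
    }
}
```

Splitting the work into a bulk copy plus a short red-only loop mirrors the two-pass approach in the text; a compiler has a fair chance of vectorizing both passes, though that depends on the compiler and flags used.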

A superior and widely adopted anaglyph method is the one devised by Eric Dubois. His innovation was using a least-squares approach in the CIE-XYZ colourspace which can be approximated with precomputed coefficients, using matrix multiplication to merge the two images. It is slow to compute and requires saturation math to deal with clipping, and reds get turned into more of an umber (you can see this on what's supposed to be my red anaglyph lens), but the colour rendition is overall better and ghosting between the two sides is almost absent. Here is a Dubois image of me in almost the same position, also viewable with red-cyan glasses:
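The matrix form of the Dubois merge can be sketched as below. Two loud assumptions: the coefficients are the commonly circulated red-cyan values, reproduced here from memory, so check them against Dubois's published matrices before relying on them; and they are applied directly to gamma-encoded 8-bit RGB, which is a simplification. The names `dubois_anaglyph`, `DUBOIS_L` and `DUBOIS_R` are hypothetical and not taken from Camaglyph.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed red-cyan coefficients (verify against the published values).
 * Row = output channel (R, G, B); column = input channel (R, G, B). */
static const float DUBOIS_L[3][3] = {
    {  0.4561f,     0.500484f,   0.176381f   },
    { -0.0400822f, -0.0378246f, -0.0157589f  },
    { -0.0152161f, -0.0205971f, -0.00546856f }
};
static const float DUBOIS_R[3][3] = {
    { -0.0434706f, -0.0879388f, -0.00155529f },
    {  0.378476f,   0.73364f,   -0.0184503f  },
    { -0.0721527f, -0.112961f,   1.2264f     }
};

/* Saturating conversion: results can go negative or exceed 255. */
static uint8_t saturate_u8(float v)
{
    if (v <= 0.0f) return 0;
    if (v >= 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/* Merge two packed 8-bit RGB frames (3 bytes per pixel) into out. */
void dubois_anaglyph(const uint8_t *left, const uint8_t *right,
                     uint8_t *out, size_t npixels)
{
    size_t i;
    int c;

    assert(left && right && out);

    for (i = 0; i < npixels; i++) {
        const uint8_t *l = left + i * 3;
        const uint8_t *r = right + i * 3;
        for (c = 0; c < 3; c++) {
            float v = DUBOIS_L[c][0] * l[0] + DUBOIS_L[c][1] * l[1]
                    + DUBOIS_L[c][2] * l[2]
                    + DUBOIS_R[c][0] * r[0] + DUBOIS_R[c][1] * r[1]
                    + DUBOIS_R[c][2] * r[2];
            out[i * 3 + c] = saturate_u8(v);
        }
    }
}
```

Each output channel takes six multiplies plus a saturating conversion, compared with two multiplies for one channel in the optimized method, which is where the extra cost comes from. The first row of the left matrix also shows why reds shift: a pure red in the left view contributes less than half its value to the output red channel.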

But anaglyph is just not satisfactory for good colour rendition even if you get accustomed to it, and even the best anaglyph glasses with good lenses and dioptres will still have some ghosting between sides on most colour images. This is where passive 3D comes in.

Most passive 3D monitors use alternating polarization, implemented as either a second glass substrate called a patterned retarder, or more recently using a film overlay (called, naturally, a film-based patterned retarder). Each separate line of the display alternates polarization, which without polarized glasses only shows up as a very faint linear interlace. My Diamondcrysta and Vizio passive 3D displays are 1080p, so with glasses off you get all 1080 lines, and with glasses on 540 lines go to one eye and 540 lines go to the other. (This is probably why 4K displays don't do this, other than cost: it would make upscaling 3D 1080p content lower quality because the horizontal lines cannot be interpolated or smoothed.)

This is obviously lower 3D resolution than an active display, where the entire screen (like on my DLP projector) alternates between each eye. However, the chief advantage to us as homebrewers is that the polarization is an intrinsic property of the display -- anything displayed on that screen is subject to it, rather than relying on some vendor-specific video mode. That means anything can display anything in 3D as long as you understand which lines will be activated by what you draw.

My Diamondcrysta is an R-L display (the polarization goes right-left-right-left-etc.), so starting with line 1 at the top, odd lines must show the right image and even lines the left image. This requires us to find out where the actual image is positioned onscreen, not just the position of the window, since window decorations will shift the image down by some amount. SDL 2.0 will tell us where the window is, but I'm using SDL 1.2 for maximal compatibility (I'd like to port this to my G5 running Mac OS X Tiger). Instead, when we display the image we ask SDL for the underlying widget, get its on-screen coordinates, ask X11 how big the window decorations are and then compute the actual Y-position of the image. You could then draw the merged image by doing alternating memcpy()s line by line, but since today's CPUs potentially have big SIMD vector registers, I simply did a big copy of one entire view and then drew in the lines from the other view on every other line, which is noticeably faster. This yields the following image, which redraws itself with the proper lines in the proper places when the window is moved:

You'll need to view the full-size image (click on it). To view this on a passive polarized 3D display or a 3D TV acting as a monitor, you've got exactly a 50% chance that your web browser will already have it at the right Y-position. If it doesn't, you may want to save the full size image to disk, open it up on that display, and shift its window up or down pixel by pixel with your polarized glasses on until it "pops." Your monitor does not have to be put into any special "3D mode" for this to work.
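The row interleaving for a passive display, including the dependence on the image's on-screen Y-position, can be sketched roughly as follows. This is an illustration under assumptions and not the Camaglyph code: `interleave_rows` and its parameters are hypothetical, `image_top_y` is taken to be the 0-based screen row where the image's first row lands (however you obtained it from the windowing system), and an R-L panel is modelled as "right view on even 0-based screen rows", which corresponds to line 1 at the top being a right line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: build a row-interleaved frame from two views of
 * identical size. bpp is bytes per pixel. right_on_even_screen_rows is
 * nonzero for an R-L display as described in the text.
 */
void interleave_rows(const uint8_t *left, const uint8_t *right,
                     uint8_t *out, size_t width, size_t height,
                     size_t bpp, int image_top_y,
                     int right_on_even_screen_rows)
{
    size_t pitch = width * bpp;
    size_t y;
    /* Does image row 0 land on an even screen row? */
    int top_is_even = (image_top_y % 2) == 0;
    /* If so, row 0 takes whichever view the panel puts on even rows. */
    int row0_is_right = (top_is_even == (right_on_even_screen_rows != 0));
    const uint8_t *first  = row0_is_right ? right : left;
    const uint8_t *second = row0_is_right ? left : right;

    assert(left && right && out);

    /* One big copy of the view that owns image rows 0, 2, 4, ... */
    memcpy(out, first, pitch * height);

    /* Then draw the other view's lines over rows 1, 3, 5, ... */
    for (y = 1; y < height; y += 2)
        memcpy(out + y * pitch, second + y * pitch, pitch);
}
```

When the window moves vertically by an odd number of pixels, `top_is_even` flips and the two views swap rows, which is the same effect as nudging a saved image up or down by one pixel until it "pops."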

The source code is in pure C, runs entirely in userspace, and should build on any Linux system (including big-endian) with V4L2 and SDL 1.2. I call it Camaglyph and it's on GitHub. It's designed to be somewhat modular, so once I sit down and write a QuickTime or QTKit grabber I should be able to make this work on my G5. I've included a configuration file for akvcam so you can use it as, you know, a webcam.

In future posts we'll look at converting classic games to work with stereoscopic displays instead of just having anaglyph modes, and explore some ideas about how we can get classic computers to display anaglyph 3D.
