Welcome to my blog. Have a look at the most recent posts below, or browse the tag cloud on the right. An archive of all posts is also available.


I've been into retro-gaming for a couple years now, and, after the 30th anniversary of the Amiga, I've been getting into retro-computing too. To feed my dirty habit, I've been doing quite a bit of garage sale hunting. Last weekend I went to a garage sale that was advertised on craigslist as "HUGE HOARDER SALE". It did not disappoint. There were a bunch of Atari 7800 games in boxes, so that by itself was worth the trip.

I talked to the guy a bit. He told me there was more stuff inside, but he hadn't had a chance to go through it yet. He didn't want to "sell an $80 game for $3." Fair enough. A little more chatting, and he offered to let me have a look. The games were almost all common stuff, but there was a Mega Man IV, in the box... I told him to look up a price for that one, and I still got a really good deal.

The rest of the house... OMG. It really was like you see on TV. Boxes covering almost every inch of the floor and stacked to the ceiling in places. The guy told me that he thought it would take about two years to get the place fully cleaned out. Ouch.

In the kitchen (obviously) there was a big, big pile of computers. I started looking through that... and there was some weird stuff there. The most interesting bits were an Osborne 1, a Mac SE/30, and a DaynaFile.

I first saw the DaynaFile from this view, and I thought, "An external SCSI chassis. I can use that."

I picked it up, turned it around, and almost dropped it. First of all, that's a 5.25" floppy drive in a SCSI chassis. That earns some serious WTF marks alone. Look closer... 360k.

Now I had to have it. :) I added it to my pile. I figured that after amortizing the total price over all the things I got, I paid about $1 for it.

Upon getting it home, I dissected it. Inside is a perfectly ordinary 360k 5.25" floppy drive.

The whole thing is controlled by a little 8031 microcontroller that bridges the SCSI bus to the floppy bus.

As if the whole thing weren't crazy enough... there's the date on the ROM. It's hard to read in the picture, but it says, "DAYNAFILE REV. 3.1 @ 1989". Yes... 1989. Why? WHY?!? Why did someone need a 360k 5.25" SCSI floppy drive in 1989?!? By that time, Macs, Amigas, Atari STs, and even most PCs had 720k 3.5" floppy drives as standard. I understand wanting to read 1.2M 5.25" PC floppies or 1.44M/720k 3.5" PC floppies on a Mac, but 360k? For shame!

The bummer is that there's no power supply for it. I found a user manual, which is filled with some serious lolz. "'Reading and writing files' is MS-DOS terminology.... Reading a file is the same as opening a document and writing a file is the same as closing a document and saving changes." Now I remember why I used to make fun of Macintosh users.

What the manual doesn't have anywhere in its 122 pages is a pinout of the power supply connector. The Internet says it uses an Elpac WM220... should be possible to rig something up.

Posted Thu Oct 8 18:27:18 2015

I just finished my talk at XDC 2014. The short version: UBO support in OpenGL drivers is terrible, and I have the test cases to prove it.

There are slides, a white paper, and, eventually, a video.

UPDATE: Fixed a typo in the white paper reported by Jonas Kulla on Twitter.

UPDATE: Direct link to video.

Posted Wed Oct 8 06:28:09 2014 Tags:

On Saturday I went to the Seattle Retro Gaming Expo. It was fun (as usual), and I spent too much money (also as usual). On Sunday I rolled up my sleeves, and I started working on a game for a retro system. I've been talking about doing this for at least a year, but it was hard to find time while I was still working on my master's degree. Now that I'm done with that, I have no excuses.

Since this is my first foray (back) into game programming on hella old hardware, I decided to stick with a system I know and a game I know. I'm doing Tetris for the TI-99/4a. Around 1997 I did a version of Tetris in 8051 assembly, and the TI is the first computer that I ever really programmed.

My stepdad used to work at Fred Meyer in the 80's, and he got the computer practically for free when TI discontinued them... and retailers heavily discounted their stock-on-hand. It was a few years until the "kids" were allowed to play with the computer. Once I got access, I spent basically all of the wee hours of the night hogging the TV and the computer. :) It was around that same time that my older stepsister swiped the tape drive, so I had no way to save any of my work. I would often leave it on for days while I was working on a program. As a result, I don't have any of those old programs. It's probably better this way... all of those programs were spaghetti BASIC garbage. :)

The cool thing (if that's the right phrase) about starting with the TI is that it uses the TMS9918 VDC. This same chip is used in the ColecoVision, MSX, and Sega SG-1000 (system before the system before the Sega Master System). All the tricks from the TI will directly apply to those other systems.

Fast forward to the present... This first attempt is also in BASIC, but I'm using TI Extended BASIC now. This has a few features that should make things less painful, but I'm pretty sure this will be the only thing I make using the actual TI as the development system... I'm basically writing code using ed (but worse), and I had repressed the memories of how terrible that keyboard is.

On Sunday, for the first time in 28 years, I saved and restored computer data on audio cassette.

Anyway... my plan is:

  1. Get Tetris working. Based on a couple hours hacking around on Sunday, I don't think BASIC is going to be fast enough for the game.
  2. Redo parts in assembly, which I can call from TI Extended BASIC, so that the game is playable.
  3. Maybe redo the whole thing in assembly... dunno.
  4. Move on to the next game.

I really want to do a version of Dragon Attack. The MSX has the same VDC, so it should be possible. Funny thing there... In 1990 I worked for HAL America (their office was on Cirrus Drive in Beaverton) as a game counselor. I mostly helped people with Adventures of Lolo and Adventures of Lolo 2. Hmm... maybe I should just port Lolo (which also started on the MSX) to the TI...

Posted Mon Jun 30 10:09:35 2014 Tags:

The slides and video of my talk from Steam Dev Days have been posted. It's basically the Haswell refresh (inside joke) of my SIGGRAPH talk from last year.

Posted Thu Feb 13 08:33:51 2014 Tags:

The slides from my FOSDEM talk are now available.

Posted Sat Feb 1 04:19:04 2014 Tags:

Multidimensional arrays are added to GLSL via either the GL_ARB_arrays_of_arrays extension or GLSL 4.30. I've had a couple people tell me that the multidimensional array syntax is either wrong or just plain crazy. When viewed from the proper angle, it should actually be perfectly logical to any C / C++ programmer. I'd like to clear up a bit of the confusion.

Starting with the easy syntax, the following does what you expect:

    vec4 a[2][3][4];

If a is inside a uniform block, the memory layout would be the same as in C. cdecl would call this, "declare a as array 2 of array 3 of array 4 of vec4".

Using GLSL constructor syntax, the array can also be initialized:

    vec4 a[2][3][4] = vec4[][][](vec4[][](vec4[](vec4( 1), vec4( 2), vec4( 3), vec4( 4)),
                                          vec4[](vec4( 5), vec4( 6), vec4( 7), vec4( 8)),
                                          vec4[](vec4( 9), vec4(10), vec4(11), vec4(12))),
                                 vec4[][](vec4[](vec4(13), vec4(14), vec4(15), vec4(16)),
                                          vec4[](vec4(17), vec4(18), vec4(19), vec4(20)),
                                          vec4[](vec4(21), vec4(22), vec4(23), vec4(24))));

If that makes your eyes bleed, GL_ARB_shading_language_420pack and GLSL 4.20 add the ability to use C-style array and structure initializers. In that model, a can be initialized to the same values by:

    vec4 a[2][3][4] = {
        {
            { vec4( 1), vec4( 2), vec4( 3), vec4( 4) },
            { vec4( 5), vec4( 6), vec4( 7), vec4( 8) },
            { vec4( 9), vec4(10), vec4(11), vec4(12) }
        },
        {
            { vec4(13), vec4(14), vec4(15), vec4(16) },
            { vec4(17), vec4(18), vec4(19), vec4(20) },
            { vec4(21), vec4(22), vec4(23), vec4(24) }
        }
    };

Functions can be declared that take multidimensional arrays as parameters. In the prototype, the name of the parameter can be present, or it can be omitted.

    void function_a(float a[4][5][6]);
    void function_b(float  [4][5][6]);

Other than the GLSL constructor syntax, there hasn't been any madness yet. However, recall that array sizes can be associated with the variable name or with the type. The prototype for function_a associates the size with the variable name, and the prototype for function_b associates the size with the type. Like GLSL constructor syntax, this has existed since GLSL 1.20.

Associating the array size with just the type, we can declare a (from above) as:

    vec4[2][3][4] a;

With multidimensional arrays, the sizes can be split among the two, and this is where it gets weird. We can also declare a as:

    vec4[3][4] a[2];

This declaration has the same layout as the previous two forms. This is usually where people say, "It's bigger on the inside!" Recall the cdecl description, "declare a as array 2 of array 3 of array 4 of vec4". If we add some parentheses, "declare a as array 2 of (array 3 of array 4 of vec4)", and things seem a bit more clear.

GLSL ended up with this syntax for two reasons, and seeing those reasons should illuminate things. Without GL_ARB_arrays_of_arrays or GLSL 4.30, there are no multidimensional arrays, but the same effect can be achieved, very inconveniently, using structures containing arrays. In GLSL 4.20 and earlier, we could also declare a as:

    struct S1 {
        float a[4];
    };

    struct S2 {
        S1 a[3];
    };

    S2 a[2];

I'll spare you having to see the GLSL constructor initializer for that mess. Note that we still end up with a[2] at the end.

Using typedef in C, we could also achieve the same result using:

    typedef float T[3][4];

    T a[2];

Again, we end up with a[2]. If cdecl could handle this (it doesn't grok typedef), it would say "declare a as array 2 of T", and "typedef T as array 3 of array 4 of float". We could substitute the description of T and, with parentheses, get "declare a as array 2 of (array 3 of array 4 of float)".

Where this starts to present pain is that function_c has the same parameter type as function_a and function_b, but function_d does not.

    void function_c(float[5][6] a[4]);
    void function_d(float[5][6]  [4]);

However, the layout of the parameter for function_e is the same as function_a and function_b, even though the actual type is different.

    struct S3 {
        float a[6];
    };

    struct S4 {
        S3 a[5];
    };

    void function_e(S4 [4]);

I think if we had it to do all over again, we may have disallowed the split syntax. That would remove the more annoying pitfalls and the confusion, but it would also remove some functionality. Most of the problems associated with the split are caught at compile-time, but some are not. The two obvious problems that remain are transposing array indices and incorrectly calculated uniform block layouts.

    layout(std140) uniform U {
        float [1][2][3] x;
        float y[1][2][3];
        float [1][2] z[3];
    };

In this example x and y have the same memory organization, but z does not. I wouldn't want to try to debug that problem.

Posted Wed Jan 29 09:08:42 2014 Tags:

This previous weekend was the Portland Retro Gaming Expo, and it was awesome! I was there all day both days.

I got some really good deals. :) Gaiares (loose) for $5! Tatsujin (import version of Truxton) for $8! I got a lot of games. Just don't ask how much I spent altogether...

One of the highlights of the show was the performances by 8Bit Weapon. You'll notice in the second image that she's playing an Atari paddle controller. You can also see a C64 (donated by the local Commodore User Group) in the last image.

The GIANT arcade was... amazing! There were some games there that I haven't played in years. My uncle would have enjoyed Mappy. :) And I didn't win the tabletop Asteroids game. I spent a lot of time playing APB (Hey Ground Kontrol: put that in the arcade!!!). I was also pretty surprised that the only console version of APB is for the Lynx. Oof.

I love the Space Wars instructions... I think those are still the instructions for Windows... lol.

There was a cool presentation about making games for the Atari 2600. There were even a couple "celebrities" in the audience. On the right is Darrell Spice Jr. (programmer of Medieval Mayhem), and on the left is Rebecca Heineman (programmer of London Blitz for the 2600 among other things). She's also responsible for the awesome disassembly of the game Freeway.

Joe Decuir was also in the audience, and I met him later at the Atari Age booth (I had to get Medieval Mayhem and Moon Cresta :) ). I can't believe Joe doesn't have a Wikipedia page. He was one of the original designers of the Atari 2600, programmer of Video Olympics and Combat (with Larry Wagner). He was also one of the original Amiga designers. He was having a conversation with one of the Atari Age guys about wanting to design a new console in the style of an old Atari 8-bit or an Amiga. I told him that there's a group trying to make an Amiga 2000 from scratch (using FPGAs for custom chips). So now I'm tasked with finding that project again and getting him in contact with them. I'm sure they'd appreciate any help he could offer!

The Tetris World Championships was so tense!!! I think I have to watch the movie now. After being down 2 games, Jonas came back to win it... for the fourth time in a row!

Posted Mon Oct 7 19:23:50 2013 Tags:

Now my Mesa: State of the Project talk is done too, and the slides are available.

I'm assuming that someone will send along a link to the video soon... ;)

Posted Tue Sep 24 10:54:52 2013 Tags:

My XDC talk "Effects Framework for OpenGL Testing" just got done, and the slides are available. The talk went pretty well, and the discussion was healthy.

The three big high points of the discussion were:

  • For the most part, adding more language to learn won't necessarily make it easier to add more complex tests. Just writing C tests in piglit isn't bad these days. The worst parts are dealing with cmake and all.tests. The best thing about shader_runner (and similar) is that you don't have to muck about with any of that.
  • One difficulty with complex tests is validating the correctness of results. The red / green box tests are good because the pass / fail metric is obvious. Perceptual difference algorithms can work (VMware uses them), but they can be twitchy and frustrating (Cairo gave up on them).
  • The shader_runner parser is a mess because everyone just added one more piece of duct tape for the next tiny feature they need. There used to be a clean, simple piece of code, but you can't see it now... all you can see is the big ball of duct tape. One advantage of nvFX is that it is a consistent, defined language... instead of a ball of duct tape. We could borrow their syntax for some of the things that shader_runner already does.

There may have been other important points, but those are the three that really stuck with me. The forum is open. :)

Posted Mon Sep 23 14:27:45 2013 Tags:

I'm giving a talk tomorrow at LPC in the Graphics and Display microconf. Since the time slots are so short (25 minutes!), I wanted (okay, Laurent requested) to provide some details before the talk to prime the pump, so to speak.

One line summary: Right now debugging and profiling graphics applications and graphics systems on Linux is a disaster. It's better on Windows, but not by a lot (especially for OpenGL).

There are very few tools, and the tools that exist are either insufficient or are vendor-specific. Moreover, the tools don't provide any kind of system view. At some point in some desktop environment, every developer has to figure out why / how the compositor is wrecking performance. This often takes a lot of work because there is no system view.

The one tool for Windows that can provide a system view is GPUView.

As a result, even on Windows, many developers end up rolling their own tools. The methods remind me a lot of the old days of sprinkling rdtsc() calls all over the code. What has changed is the level of detail provided by the tools that display the logged data. Valve has famously talked about the system they use. Other developers have told me they use similar systems.

There is a common thematic problem in all of these tools and approaches. The developer either gets a lot of detailed data and is tied to a particular vendor, or the developer gets very coarse data. And they don't get system-wide data.

There are several disparate groups that need data:

  • People creating stand-alone debug / profile tools (e.g., apitrace).
  • People building data collection into their application and using an external, post-hoc visualization tool.
  • People building data collection and visualization into their application.

Generally, folks doing one of the last two are doing both to varying degrees.

So here's the question. Can we provide a set of interfaces, probably from the kernel, that:

  • Provides finer grained data than is available from, say, GL_ARB_timer_query about the execution of commands on the GPU.
  • Provides the above data at a system level with semantic information (e.g., this block of time was your call to glDrawArrays, this block of time was the compositor doing "stuff", this block of time was your XRender request, etc.) without leaking information in a way that compromises security.
  • Allows closed-source drivers to expose these interfaces.

UPDATE: The slides and notes from the talk are available. Thanks to Nathan Willis for reminding me to post them.

Posted Wed Sep 18 09:00:38 2013 Tags:
