Welcome to my blog. Have a look at the most recent posts below, or browse the tag cloud on the right. An archive of all posts is also available.


On Saturday I went to the Seattle Retro Gaming Expo. It was fun (as usual), and I spent too much money (also as usual). On Sunday I rolled up my sleeves and started working on a game for a retro system. I've been talking about doing this for at least a year, but it was hard to find time while I was still working on my master's degree. Now that I'm done with that, I have no excuses.

Since this is my first foray (back) into game programming on hella old hardware, I decided to stick with a system I know and a game I know. I'm doing Tetris for the TI-99/4a. Around 1997 I did a version of Tetris in 8051 assembly, and the TI is the first computer that I ever really programmed.

My stepdad used to work at Fred Meyer in the 80's, and he got the computer practically for free when TI discontinued them... and retailers heavily discounted their stock-on-hand. It was a few years until the "kids" were allowed to play with the computer. Once I got access, I spent basically all of the wee hours of the night hogging the TV and the computer. :) It was around that same time that my older stepsister swiped the tape drive, so I had no way to save any of my work. I would often leave it on for days while I was working on a program. As a result, I don't have any of those old programs. It's probably better this way... all of those programs were spaghetti BASIC garbage. :)

The cool thing (if that's the right phrase) about starting with the TI is that it uses the TMS9918 VDC. This same chip is used in the ColecoVision, MSX, and Sega SG-1000 (system before the system before the Sega Master System). All the tricks from the TI will directly apply to those other systems.

Fast forward to the present... This first attempt is also in BASIC, but I'm using TI Extended BASIC now. This has a few features that should make things less painful, but I'm pretty sure this will be the only thing I make using the actual TI as the development system... I'm basically writing code using ed (but worse), and I had repressed the memories of how terrible that keyboard is.

On Sunday, for the first time in 28 years, I saved and restored computer data on audio cassette.

Anyway... my plan is:

  1. Get Tetris working. Based on a couple hours hacking around on Sunday, I don't think BASIC is going to be fast enough for the game.
  2. Redo parts in assembly, which I can call from TI Extended BASIC, so that the game is playable.
  3. Maybe redo the whole thing in assembly... dunno.
  4. Move on to the next game.

I really want to do a version of Dragon Attack. The MSX has the same VDC, so it should be possible. Funny thing there... In 1990 I worked for HAL America (their office was on Cirrus Drive in Beaverton) as a game counselor. I mostly helped people with Adventures of Lolo and Adventures of Lolo 2. Hmm... maybe I should just port Lolo (which also started on the MSX) to the TI...

Posted Mon Jun 30 10:09:35 2014 Tags:

The slides and video of my talk from Steam Dev Days have been posted. It's basically the Haswell refresh (inside joke) of my SIGGRAPH talk from last year.

Posted Thu Feb 13 08:33:51 2014 Tags:

The slides from my FOSDEM talk are now available.

Posted Sat Feb 1 04:19:04 2014 Tags:

Multidimensional arrays are added to GLSL via either the GL_ARB_arrays_of_arrays extension or GLSL 4.30. I've had a couple of people tell me that the multidimensional array syntax is either wrong or just plain crazy. When viewed from the proper angle, it should actually be perfectly logical to any C / C++ programmer. I'd like to clear up a bit of the confusion.

Starting with the easy syntax, the following does what you expect:

    vec4 a[2][3][4];

If a is inside a uniform block, the memory layout would be the same as in C. cdecl would call this "declare a as array 2 of array 3 of array 4 of vec4".
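As a concrete (and purely hypothetical) illustration of that point, such a block might look like the sketch below; the block name Example is made up:

    // Hypothetical block for illustration only.  With the std140 layout,
    // the elements of a are stored contiguously in the same order as the
    // equivalent C array of vec4.
    layout(std140) uniform Example {
        vec4 a[2][3][4];
    };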

Using GLSL constructor syntax, the array can also be initialized:

    vec4 a[2][3][4] = vec4[][][](vec4[][](vec4[](vec4( 1), vec4( 2), vec4( 3), vec4( 4)),
                                          vec4[](vec4( 5), vec4( 6), vec4( 7), vec4( 8)),
                                          vec4[](vec4( 9), vec4(10), vec4(11), vec4(12))),
                                 vec4[][](vec4[](vec4(13), vec4(14), vec4(15), vec4(16)),
                                          vec4[](vec4(17), vec4(18), vec4(19), vec4(20)),
                                          vec4[](vec4(21), vec4(22), vec4(23), vec4(24))));

If that makes your eyes bleed, GL_ARB_shading_language_420pack and GLSL 4.20 add the ability to use C-style array and structure initializers. In that model, a can be initialized to the same values by:

    vec4 a[2][3][4] = {
        {
            { vec4( 1), vec4( 2), vec4( 3), vec4( 4) },
            { vec4( 5), vec4( 6), vec4( 7), vec4( 8) },
            { vec4( 9), vec4(10), vec4(11), vec4(12) }
        },
        {
            { vec4(13), vec4(14), vec4(15), vec4(16) },
            { vec4(17), vec4(18), vec4(19), vec4(20) },
            { vec4(21), vec4(22), vec4(23), vec4(24) }
        }
    };

Functions can be declared that take multidimensional arrays as parameters. In the prototype, the name of the parameter can be present, or it can be omitted.

    void function_a(float a[4][5][6]);
    void function_b(float  [4][5][6]);

Other than the GLSL constructor syntax, there hasn't been any madness yet. However, recall that array sizes can be associated with the variable name or with the type. The prototype for function_a associates the size with the variable name, and the prototype for function_b associates the size with the type. Like GLSL constructor syntax, this has existed since GLSL 1.20.
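The same distinction already exists for single-dimensional arrays. These two declarations are equivalent (the variable names are made up for illustration):

    // Size associated with the variable name (C-style):
    float weights[4];

    // Size associated with the type (allowed since GLSL 1.20):
    float[4] more_weights;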

Associating the array size with just the type, we can declare a (from above) as:

    vec4[2][3][4] a;

With multidimensional arrays, the sizes can be split between the two, and this is where it gets weird. We can also declare a as:

    vec4[3][4] a[2];

This declaration has the same layout as the previous two forms. This is usually where people say, "It's bigger on the inside!" Recall the cdecl description, "declare a as array 2 of array 3 of array 4 of vec4". If we add some parentheses, "declare a as array 2 of (array 3 of array 4 of vec4)", things seem a bit clearer.
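However the declaration splits the sizes, element access is unchanged: the first index always selects within "array 2", the second within "array 3", and the third within "array 4". A minimal sketch (the function name is made up):

    vec4[3][4] a[2];   // same declaration as above

    vec4 pick(int i, int j, int k)
    {
        // Indexing reads exactly like the vec4 a[2][3][4] form:
        // i selects in "array 2", j in "array 3", k in "array 4".
        return a[i][j][k];
    }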

GLSL ended up with this syntax for two reasons, and seeing those reasons should illuminate things. Without GL_ARB_arrays_of_arrays or GLSL 4.30, there are no multidimensional arrays, but the same effect can be achieved, very inconveniently, using structures containing arrays. In GLSL 4.20 and earlier, we could also declare a as:

    struct S1 {
        float a[4];
    };

    struct S2 {
        S1 a[3];
    };

    S2 a[2];

I'll spare you having to see the GLSL constructor initializer for that mess. Note that we still end up with a[2] at the end.

Using typedef in C, we could also achieve the same result using:

    typedef float T[3][4];

    T a[2];

Again, we end up with a[2]. If cdecl could handle this (it doesn't grok typedef), it would say "declare a as array 2 of T", and "typedef T as array 3 of array 4 of float". We could substitute the description of T and, with parentheses, get "declare a as array 2 of (array 3 of array 4 of float)".

Where this starts to cause pain is that function_c has the same parameter type as function_a and function_b, but function_d does not.

    void function_c(float[5][6] a[4]);
    void function_d(float[5][6]  [4]);

However, the layout of the parameter for function_e is the same as for function_a and function_b, even though the actual type is different.

    struct S3 {
        float a[6];
    };

    struct S4 {
        S3 a[5];
    };

    void function_e(S4 [4]);

I think if we had it to do all over again, we might have disallowed the split syntax. That would remove the more annoying pitfalls and the confusion, but it would also remove some functionality. Most of the problems associated with the split are caught at compile time, but some are not. The two obvious problems that remain are transposed array indices and incorrectly calculated uniform block layouts.

    layout(std140) uniform U {
        float [1][2][3] x;
        float y[1][2][3];
        float [1][2] z[3];
    };

In this example x and y have the same memory organization, but z does not. I wouldn't want to try to debug that problem.
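Applying the earlier reading, z above is "array 3 of (array 1 of array 2 of float)", so its dimensions come out transposed relative to x and y. In other words, it has the same type as this hypothetical declaration:

    // Note the transposed dimensions compared to x[1][2][3] and y[1][2][3].
    float z_transposed[3][1][2];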

Posted Wed Jan 29 09:08:42 2014 Tags:


This previous weekend was the Portland Retro Gaming Expo, and it was awesome! I was there all day both days.

I got some really good deals. :) Gaiares (loose) for $5! Tatsujin (import version of Truxton) for $8! I got a lot of games. Just don't ask how much I spent altogether...

Among the highlights of the show were the performances by 8Bit Weapon. You'll notice in the second image that she's playing an Atari paddle controller. You can also see a C64 (donated by the local Commodore User Group) in the last image.

The GIANT arcade was... amazing! There were some games there that I haven't played in years. My uncle would have enjoyed Mappy. :) And I didn't win the tabletop Asteroids game. I spent a lot of time playing APB (Hey Ground Kontrol: put that in the arcade!!!). I was also pretty surprised that the only console version of APB is for the Lynx. Oof.

I love the Space Wars instructions... I think those are still the instructions for Windows... lol.

There was a cool presentation about making games for the Atari 2600. There were even a couple of "celebrities" in the audience. On the right is Darrell Spice Jr. (programmer of Medieval Mayhem), and on the left is Rebecca Heineman (programmer of London Blitz for the 2600, among other things). She's also responsible for the awesome disassembly of the game Freeway.

Joe Decuir was also in the audience, and I met him later at the Atari Age booth (I had to get Medieval Mayhem and Moon Cresta :) ). I can't believe Joe doesn't have a Wikipedia page. He was one of the original designers of the Atari 2600 and the programmer of Video Olympics and Combat (with Larry Wagner). He was also one of the original Amiga designers. He was having a conversation with one of the Atari Age guys about wanting to design a new console in the style of an old Atari 8-bit or an Amiga. I told him that there's a group trying to make an Amiga 2000 from scratch (using FPGAs for the custom chips). So now I'm tasked with finding that project again and getting him in contact with them. I'm sure they'd appreciate any help he could offer!


The Tetris World Championships was so tense!!! I think I have to watch the movie now. After being down 2 games, Jonas came back to win it... for the fourth time in a row!

Posted Mon Oct 7 19:23:50 2013 Tags:

Now my Mesa: State of the Project talk is done too, and the slides are available.

I'm assuming that someone will send along a link to the video soon... ;)

Posted Tue Sep 24 10:54:52 2013 Tags:

My XDC talk "Effects Framework for OpenGL Testing" just got done, and the slides are available. The talk went pretty well, and the discussion was healthy.

The three big high points of the discussion were:

  • For the most part, adding more language to learn won't necessarily make it easier to add more complex tests. Just writing C tests in piglit isn't bad these days. The worst parts are dealing with cmake and all.tests. The best thing about shader_runner (and similar) is that you don't have to muck about with any of that.
  • One difficulty with complex tests is validating the correctness of results. The red / green box tests are good because the pass / fail metric is obvious. Perceptual difference algorithms can work (VMware uses them), but they can be twitchy and frustrating (Cairo gave up on them).
  • The shader_runner parser is a mess because everyone just added one more piece of duct tape for the next tiny feature they need. There used to be a clean, simple piece of code, but you can't see it now... all you can see is the big ball of duct tape. One advantage of nvFX is that it is a consistent, defined language... instead of a ball of duct tape. We could borrow their syntax for some of the things that shader_runner already does.

There may have been other important points, but those are the three that really stuck with me. The forum is open. :)

Posted Mon Sep 23 14:27:45 2013 Tags:

I'm giving a talk tomorrow at LPC in the Graphics and Display microconf. Since the time slots are so short (25 minutes!), I wanted (okay, Laurent requested) to provide some details before the talk to prime the pump, so to speak.

One line summary: Right now debugging and profiling graphics applications and graphics systems on Linux is a disaster. It's better on Windows, but not by a lot (especially for OpenGL).

There are very few tools, and the tools that exist are either insufficient or are vendor-specific. Moreover, the tools don't provide any kind of system view. At some point in some desktop environment, every developer has to figure out why / how the compositor is wrecking performance. This often takes a lot of work because there is no system view.

The one tool for Windows that can provide a system view is GPUView.

As a result, even on Windows, many developers end up rolling their own tools. The methods remind me a lot of the old days of sprinkling rdtsc() calls all over the code. What has changed is the level of detail provided by the tools that display the logged data. Valve has famously talked about the system they use. Other developers have told me they use similar systems.

There is a common thematic problem in all of these tools and approaches. The developer either gets a lot of detailed data but is tied to a particular vendor, or gets very coarse data. Either way, they don't get system-wide data.

There are several disparate groups that need this data:

  • People creating stand-alone debug / profile tools (e.g., apitrace).
  • People building data collection into their application and using an external, post-hoc visualization tool.
  • People building data collection and visualization into their application.

Generally, folks doing one of the last two are doing both to varying degrees.

So here's the question. Can we provide a set of interfaces, probably from the kernel, that:

  • Provides finer-grained data about the execution of commands on the GPU than is available from, say, GL_ARB_timer_query.
  • Provides the above data at a system level with semantic information (e.g., this block of time was your call to glDrawArrays, this block of time was the compositor doing "stuff", this block of time was your XRender request, etc.) without leaking information in a way that compromises security.
  • Allows closed-source drivers to expose these interfaces.

UPDATE: The slides and notes from the talk are available. Thanks to Nathan Willis for reminding me to post them.

Posted Wed Sep 18 09:00:38 2013 Tags:

I've just pushed a branch to my fd.o tree that brings back the standalone compiler. I've also made some small enhancements to it. Why? Thanks for asking. :)

One thing that game developers frequently complain about (and they're damn right!) is the massive amount of variability among GLSL compilers. Compiler A accepts a shader, but compiler B generates errors. Then the real fun begins... which compiler is right? Dig through the spec, report bugs to the vendors, wait for the vendors to finger point... it's pretty dire.

The Mesa compiler has a reputation for sticking to the letter of the spec. This has caused some ruffled feathers with game developers and with some folks in the Mesa community. In cases where our behavior disagrees with all other shipping implementations, I have submitted numerous spec bugs. If nobody does what the spec says, you change the spec.

This isn't to say our conformance is perfect or that we don't have any bugs. Reality is quite the contrary. :) However, we are really picky about stuff that other people aren't quite so picky about. When we find deviations from the behavior of other implementations, one way or another, we sort it out.

Sometimes that means changing our behavior (and adding piglit tests).

Sometimes that means changing our behavior (and getting the spec changed).

Sometimes that means implementing a work-around for specific apps (that is only enabled for those apps!).

Sometimes that means not changing anything (and taking a hard line that someone else needs to fix their code).

The combination of our ability to build our compiler on many common platforms and our spec pedantry puts Mesa in a fairly interesting position. It means that developers could use our compiler, without the baggage of the rest of a driver, as the third party to settle disputes. It can be the "if it compiles here, it had better compile anywhere" oracle.

Even if it fails at that, we emit a lot of warnings. Sometimes we emit too many warnings. A standalone compiler gives a good start for "lint" for GLSL. There are already a couple places that we give portability warnings (things we accept that some compilers are known to incorrectly reject), so there's already some value there.

Posted Tue Sep 10 14:58:43 2013 Tags:

Three weeks ago today (that's how far behind I am!), I gave one of Intel's "Sponsored Technical Sessions" at SIGGRAPH. Last year I presented one slide in the OpenGL ES BoF. This year I presented in my company's paid room. Next year... the world.

The presentation was a brief overview of performance tuning graphics applications... games... for the open-source driver on Intel's GPUs. This is a collection of tips and suggestions that my team has gathered from tuning our driver for shipping apps and working with ISVs like Valve. I'm already working to improve the slide set, and I'm hoping to present something similar at GDC. We'll see how that turns out.

Anyway, my slides are available from Intel.

Posted Tue Aug 13 20:59:19 2013 Tags:
