Welcome to my blog. Have a look at the most recent posts below, or browse the tag cloud on the right. An archive of all posts is also available.

The slides and video of my talk from Steam Dev Days have been posted. It's basically the Haswell refresh (inside joke) of my SIGGRAPH talk from last year.

Posted Thu Feb 13 08:33:51 2014 Tags:

The slides from my FOSDEM talk are now available.

Posted Sat Feb 1 04:19:04 2014 Tags:

Multidimensional arrays are added to GLSL via either the GL_ARB_arrays_of_arrays extension or GLSL 4.30. I've had a couple of people tell me that the multidimensional array syntax is either wrong or just plain crazy. When viewed from the proper angle, it should actually be perfectly logical to any C / C++ programmer. I'd like to clear up a bit of the confusion.

Starting with the easy syntax, the following does what you expect:

    vec4 a[2][3][4];

If a is inside a uniform block, the memory layout would be the same as in C. cdecl would call this, "declare a as array 2 of array 3 of array 4 of vec4".
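
To make the C analogy concrete, here is a minimal C sketch (not GLSL) of the row-major indexing that "array 2 of array 3 of array 4" implies. The helper name flat_index is made up for illustration, and it counts whole elements only, so it says nothing about any padding a particular uniform block layout might add.

```c
#include <assert.h>
#include <stddef.h>

/* Dimensions of the example: vec4 a[2][3][4]; */
enum { DIM0 = 2, DIM1 = 3, DIM2 = 4 };

/* Hypothetical helper: element offset of a[i][j][k] in row-major order,
 * where the last index varies fastest, as in C. */
static size_t flat_index(size_t i, size_t j, size_t k)
{
    assert(i < DIM0 && j < DIM1 && k < DIM2);
    return (i * DIM1 + j) * DIM2 + k;
}
```

For example, flat_index(1, 0, 0) is 12: stepping the first index skips one whole "array 3 of array 4", which is 12 elements.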

Using GLSL constructor syntax, the array can also be initialized:

    vec4 a[2][3][4] = vec4[][][](vec4[][](vec4[](vec4( 1), vec4( 2), vec4( 3), vec4( 4)),
                                          vec4[](vec4( 5), vec4( 6), vec4( 7), vec4( 8)),
                                          vec4[](vec4( 9), vec4(10), vec4(11), vec4(12))),
                                 vec4[][](vec4[](vec4(13), vec4(14), vec4(15), vec4(16)),
                                          vec4[](vec4(17), vec4(18), vec4(19), vec4(20)),
                                          vec4[](vec4(21), vec4(22), vec4(23), vec4(24))));

If that makes your eyes bleed, GL_ARB_shading_language_420pack and GLSL 4.20 add the ability to use C-style array and structure initializers. In that model, a can be initialized to the same values by:

    vec4 a[2][3][4] = {
        {
            { vec4( 1), vec4( 2), vec4( 3), vec4( 4) },
            { vec4( 5), vec4( 6), vec4( 7), vec4( 8) },
            { vec4( 9), vec4(10), vec4(11), vec4(12) }
        },
        {
            { vec4(13), vec4(14), vec4(15), vec4(16) },
            { vec4(17), vec4(18), vec4(19), vec4(20) },
            { vec4(21), vec4(22), vec4(23), vec4(24) }
        }
    };

Functions can be declared that take multidimensional arrays as parameters. In the prototype, the name of the parameter can be present, or it can be omitted.

    void function_a(float a[4][5][6]);
    void function_b(float  [4][5][6]);

Other than the GLSL constructor syntax, there hasn't been any madness yet. However, recall that array sizes can be associated with the variable name or with the type. The prototype for function_a associates the size with the variable name, and the prototype for function_b associates the size with the type. Like GLSL constructor syntax, this has existed since GLSL 1.20.

Associating the array size with just the type, we can declare a (from above) as:

    vec4[2][3][4] a;

With multidimensional arrays, the sizes can be split among the two, and this is where it gets weird. We can also declare a as:

    vec4[3][4] a[2];

This declaration has the same layout as the previous two forms. This is usually where people say, "It's bigger on the inside!" Recall the cdecl description, "declare a as array 2 of array 3 of array 4 of vec4". If we add some parentheses, "declare a as array 2 of (array 3 of array 4 of vec4)", things seem a bit clearer.

GLSL ended up with this syntax for two reasons, and seeing those reasons should illuminate things. Without GL_ARB_arrays_of_arrays or GLSL 4.30, there are no multidimensional arrays, but the same effect can be achieved, very inconveniently, using structures containing arrays. In GLSL 4.20 and earlier, we could also declare a as:

    struct S1 {
        float a[4];
    };

    struct S2 {
        S1 a[3];
    };

    S2 a[2];

I'll spare you having to see the GLSL constructor initializer for that mess. Note that we still end up with a[2] at the end.

Using typedef in C, we could also achieve the same result using:

    typedef float T[3][4];

    T a[2];

Again, we end up with a[2]. If cdecl could handle this (it doesn't grok typedef), it would say "declare a as array 2 of T", and "typedef T as array 3 of array 4 of float". We could substitute the description of T and, with parentheses, get "declare a as array 2 of (array 3 of array 4 of float)".
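
If you want to check the typedef claim rather than take my word for it, a small C sketch along these lines should do. The names a_plain, a_split, and FLOAT_OFFSET are made up for this example, and static_assert assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>

typedef float T[3][4];          /* array 3 of array 4 of float */

static float a_plain[2][3][4];  /* sizes on the variable name  */
static T     a_split[2];        /* sizes split via the typedef */

/* Both spellings describe the same amount of storage. */
static_assert(sizeof(T[2]) == sizeof(float[2][3][4]),
              "array 2 of T is array 2 of array 3 of array 4 of float");

/* Hypothetical helper: distance, in floats, from the start of arr
 * to arr[i][j][k]. */
#define FLOAT_OFFSET(arr, i, j, k) \
    ((size_t)((const char *)&(arr)[(i)][(j)][(k)] - (const char *)(arr)) \
     / sizeof(float))
```

FLOAT_OFFSET(a_plain, 1, 2, 3) and FLOAT_OFFSET(a_split, 1, 2, 3) both come out as 23, the last element of either array.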

Where this starts to present pain is that function_c has the same parameter type as function_a and function_b, but function_d does not.

    void function_c(float[5][6] a[4]);
    void function_d(float[5][6]  [4]);
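
One way to convince yourself of this is to let a C compiler compare the types. C has no split syntax, so this sketch assumes a typedef (the hypothetical Inner) in place of float[5][6], and it passes pointers to whole arrays because a bare C array parameter would decay and lose its outermost size. The TAKES_4_5_6 macro is also made up for the illustration; it relies on C11's _Generic and static_assert.

```c
#include <assert.h>

typedef float Inner[5][6];             /* stands in for GLSL's float[5][6] */

/* Prototypes only; nothing here is ever called. */
void function_a(float (*a)[4][5][6]);  /* float a[4][5][6]                 */
void function_c(Inner (*a)[4]);        /* float[5][6] a[4]                 */
void function_d(float (*a)[5][6][4]);  /* float[5][6] [4] = float[5][6][4] */

/* 1 if f takes a pointer to float[4][5][6], 0 otherwise. */
#define TAKES_4_5_6(f) \
    _Generic(&(f), void (*)(float (*)[4][5][6]): 1, default: 0)

static_assert(TAKES_4_5_6(function_a), "function_a takes float[4][5][6]");
static_assert(TAKES_4_5_6(function_c), "function_c takes the same type");
static_assert(!TAKES_4_5_6(function_d), "function_d takes float[5][6][4]");
```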

However, the layout of the parameter for function_e is the same as for function_a and function_b, even though the actual type is different.

    struct S3 {
        float a[6];
    };

    struct S4 {
        S3 a[5];
    };

    void function_e(S4 [4]);

I think if we had it to do all over again, we may have disallowed the split syntax. That would remove the more annoying pitfalls and the confusion, but it would also remove some functionality. Most of the problems associated with the split are caught at compile-time, but some are not. The two obvious problems that remain are transposing array indices and incorrectly calculated uniform block layouts.

    layout(std140) uniform U {
        float [1][2][3] x;
        float y[1][2][3];
        float [1][2] z[3];
    };

In this example x and y have the same memory organization, but z does not. I wouldn't want to try to debug that problem.
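
Here is a rough C sketch of why z is the odd one out. It assumes a typedef (the hypothetical Z_inner) in place of the split syntax, and it uses plain C layout, so it does not model std140's array stride rules. The struct and macro names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef float Z_inner[1][2];   /* stands in for GLSL's float[1][2] */

struct U_sketch {
    float   y[1][2][3];        /* float y[1][2][3];                      */
    Z_inner z[3];              /* float [1][2] z[3]; i.e. float[3][1][2] */
};

static struct U_sketch u;

/* Lengths of the three dimensions, outermost first. */
#define OUTER_LEN(arr)  (sizeof (arr) / sizeof (arr)[0])
#define MIDDLE_LEN(arr) (sizeof (arr)[0] / sizeof (arr)[0][0])
#define INNER_LEN(arr)  (sizeof (arr)[0][0] / sizeof (arr)[0][0][0])

/* Same number of floats, but a different shape. */
static_assert(sizeof(float[1][2][3]) == sizeof(Z_inner[3]),
              "y and z hold six floats each");
```

The dimensions of u.y come out as 1, 2, 3, while those of u.z come out as 3, 1, 2. Both hold six floats, so nothing looks wrong from the total size alone, but y[0][1][2] is a valid element and z[0][1][2] is out of bounds.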

Posted Wed Jan 29 09:08:42 2014 Tags:

This previous weekend was the Portland Retro Gaming Expo, and it was awesome! I was there all day both days.

I got some really good deals. :) Gaiares (loose) for $5! Tatsujin (import version of Truxton) for $8! I got a lot of games. Just don't ask how much I spent altogether...

One of the highlights of the show was the performances by 8Bit Weapon. You'll notice in the second image that she's playing an Atari paddle controller. You can also see a C64 (donated by the local Commodore User Group) in the last image.

The GIANT arcade was... amazing! There were some games there that I haven't played in years. My uncle would have enjoyed Mappy. :) And I didn't win the tabletop Asteroids game. I spent a lot of time playing APB (Hey Ground Kontrol: put that in the arcade!!!). I was also pretty surprised that the only console version of APB is for the Lynx. Oof.

I love the Space Wars instructions... I think those are still the instructions for Windows... lol.

There was a cool presentation about making games for the Atari 2600. There were even a couple of "celebrities" in the audience. On the right is Darrell Spice Jr. (programmer of Medieval Mayhem), and on the left is Rebecca Heineman (programmer of London Blitz for the 2600, among other things). She's also responsible for the awesome disassembly of the game Freeway.

Joe Decuir was also in the audience, and I met him later at the Atari Age booth (I had to get Medieval Mayhem and Moon Cresta :) ). I can't believe Joe doesn't have a Wikipedia page. He was one of the original designers of the Atari 2600, programmer of Video Olympics and Combat (with Larry Wagner). He was also one of the original Amiga designers. He was having a conversation with one of the Atari Age guys about wanting to design a new console in the style of an old Atari 8-bit or an Amiga. I told him that there's a group trying to make an Amiga 2000 from scratch (using FPGAs for custom chips). So now I'm tasked with finding that project again and getting him in contact with them. I'm sure they'd appreciate any help he could offer!

The Tetris World Championships was so tense!!! I think I have to watch the movie now. After being down 2 games, Jonas came back to win it... for the fourth time in a row!

Posted Mon Oct 7 19:23:50 2013 Tags:

Now my Mesa: State of the Project talk is done too, and the slides are available.

I'm assuming that someone will send along a link to the video soon... ;)

Posted Tue Sep 24 10:54:52 2013 Tags:

My XDC talk "Effects Framework for OpenGL Testing" just got done, and the slides are available. The talk went pretty well, and the discussion was healthy.

The three big high points of the discussion were:

  • For the most part, adding more language to learn won't necessarily make it easier to add more complex tests. Just writing C tests in piglit isn't bad these days. The worst parts are dealing with cmake and all.tests. The best thing about shader_runner (and similar) is that you don't have to muck about with any of that.
  • One difficulty with complex tests is validating the correctness of results. The red / green box tests are good because the pass / fail metric is obvious. Perceptual difference algorithms can work (VMware uses them), but they can be twitchy and frustrating (Cairo gave up on them).
  • The shader_runner parser is a mess because everyone just added one more piece of duct tape for the next tiny feature they need. There used to be a clean, simple piece of code, but you can't see it now... all you can see is the big ball of duct tape. One advantage of nvFX is that it is a consistent, defined language... instead of a ball of duct tape. We could borrow their syntax for some of the things that shader_runner already does.

There may have been other important points, but those are the three that really stuck with me. The forum is open. :)

Posted Mon Sep 23 14:27:45 2013 Tags:

I'm giving a talk tomorrow at LPC in the Graphics and Display microconf. Since the time slots are so short (25 minutes!), I wanted (okay, Laurent requested) to provide some details before the talk to prime the pump, so to speak.

One line summary: Right now debugging and profiling graphics applications and graphics systems on Linux is a disaster. It's better on Windows, but not by a lot (especially for OpenGL).

There are very few tools, and the tools that exist are either insufficient or are vendor-specific. Moreover, the tools don't provide any kind of system view. At some point in some desktop environment, every developer has to figure out why / how the compositor is wrecking performance. This often takes a lot of work because there is no system view.

The one tool for Windows that can provide a system view is GPUView.

As a result, even on Windows, many developers end up rolling their own tools. The methods remind me a lot of the old days of sprinkling rdtsc() calls all over the code. What has changed is the level of detail provided by the tools that display the logged data. Valve has famously talked about the system they use. Other developers have told me they use similar systems.
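
For anyone who hasn't seen the roll-your-own approach, it tends to look something like the following C sketch. This is not anyone's actual system; every name in it is made up, and it uses C11's timespec_get as a portable stand-in for rdtsc(). Note that it only sees CPU-side wall-clock time, which is exactly the limitation I'm complaining about.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define MAX_EVENTS 64

/* One logged region of code. */
struct timing_event {
    const char *label;
    struct timespec start;
    struct timespec end;
};

static struct timing_event events[MAX_EVENTS];
static size_t event_count;

/* Record the start of a labeled region; returns its slot in the log. */
static size_t event_begin(const char *label)
{
    assert(event_count < MAX_EVENTS);
    struct timing_event *e = &events[event_count];
    e->label = label;
    timespec_get(&e->start, TIME_UTC);
    return event_count++;
}

/* Record the end of a region previously opened with event_begin(). */
static void event_end(size_t slot)
{
    assert(slot < event_count);
    timespec_get(&events[slot].end, TIME_UTC);
}

/* Elapsed wall-clock time of a finished event, in nanoseconds. */
static long long event_ns(size_t slot)
{
    assert(slot < event_count);
    const struct timing_event *e = &events[slot];
    return (long long)(e->end.tv_sec - e->start.tv_sec) * 1000000000LL
         + (e->end.tv_nsec - e->start.tv_nsec);
}
```

The application wraps interesting regions in event_begin() / event_end() pairs and dumps the events array for an external viewer to display.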

There is a common thematic problem in all of these tools and approaches. The developer either gets a lot of detailed data and is tied to a particular vendor, or the developer gets very coarse data. And they don't get system-wide data.

There are several disparate groups that need data:

  • People creating stand-alone debug / profile tools (e.g., apitrace).
  • People building data collection into their application and using an external, post-hoc visualization tool.
  • People building data collection and visualization into their application.

Generally, folks doing one of the last two are doing both to varying degrees.

So here's the question. Can we provide a set of interfaces, probably from the kernel, that:

  • Provides finer grained data than is available from, say, GL_ARB_timer_query about the execution of commands on the GPU.
  • Provides the above data at a system level with semantic information (e.g., this block of time was your call to glDrawArrays, this block of time was the compositor doing "stuff", this block of time was your XRender request, etc.) without leaking information in a way that compromises security.
  • Allows closed-source drivers to expose these interfaces.

UPDATE: The slides and notes from the talk are available. Thanks to Nathan Willis for reminding me to post them.

Posted Wed Sep 18 09:00:38 2013 Tags:

I've just pushed a branch to my fd.o tree that brings back the standalone compiler. I've also made some small enhancements to it. Why? Thanks for asking. :)

One thing that game developers frequently complain about (and they're damn right!) is the massive amount of variability among GLSL compilers. Compiler A accepts a shader, but compiler B generates errors. Then the real fun begins... which compiler is right? Dig through the spec, report bugs to the vendors, wait for the vendors to finger point... it's pretty dire.

The Mesa compiler has a reputation for sticking to the letter of the spec. This has caused some ruffled feathers with game developers and with some folks in the Mesa community. In cases where our behavior disagrees with all other shipping implementations, I have submitted numerous spec bugs. If nobody does what the spec says, you change the spec.

This isn't to say our conformance is perfect or that we don't have any bugs. Reality is quite the contrary. :) However, we are really picky about stuff that other people aren't quite so picky about. When we find deviations from the behavior of other implementations, one way or another, we sort it out.

Sometimes that means changing our behavior (and adding piglit tests).

Sometimes that means changing our behavior (and getting the spec changed).

Sometimes that means implementing a work-around for specific apps (that is only enabled for those apps!).

Sometimes that means not changing anything (and taking a hard line that someone else needs to fix their code).

The combination of our ability to build our compiler on many common platforms and our spec pedanticism puts Mesa in a fairly interesting position. It means that developers could use our compiler, without the baggage of the rest of a driver, as the third party to settle disputes. It can be the "if it compiles on here, it had better compile anywhere" oracle.

Even if it fails at that, we emit a lot of warnings. Sometimes we emit too many warnings. A standalone compiler gives a good start for "lint" for GLSL. There are already a couple places that we give portability warnings (things we accept that some compilers are known to incorrectly reject), so there's already some value there.

Posted Tue Sep 10 14:58:43 2013 Tags:

Three weeks ago today (that's how far behind I am!), I gave one of Intel's "Sponsored Technical Sessions" at SIGGRAPH. Last year I presented one slide in the OpenGL ES BoF. This year I present in my company's paid room. Next year... the world.

The presentation was a brief overview of performance tuning graphics applications... games... for the open-source driver on Intel's GPUs. This is a collection of tips and suggestions that my team has gathered from tuning our driver for shipping apps and working with ISVs like Valve. I'm already working to improve the slide set, and I'm hoping to present something similar at GDC. We'll see how that turns out.

Anyway, my slides are available from Intel.

Posted Tue Aug 13 20:59:19 2013 Tags:

So... I've been on an old games kick for some time now. As part of that, I recently purchased a Namco neGcon Playstation controller. I'm not going to dig out a copy of Wipeout... I want to support it in some of the demo programs I write for the graphics programming class I teach... because I can. :)

A tiny bit of background for people too lazy to click the Wikipedia link... This is an old Playstation controller. It came out long before even the Dual Analog (April 1997, according to the Wikipedia article). I couldn't find a firm release date, but I remember seeing ads for it around the launch time of the Playstation. I could find some rec.games.video.sony posts from March 1996 about importing it from Japan. What makes it special are the quirky analog inputs. The left trigger and the two red buttons are analog. The real kicker is that twisting it in the middle is also an analog input.

I hooked it up to my laptop with a generic Playstation-to-USB converter, and hacked up a demo program in SDL to see how this thing reports itself. The first disappointing thing is the name. SDL just reports it as "USB Gamepad " (yes, with a space at the end). I'm sure that's a quirk of the adapter. Since I have several controllers that I use, I use the name to set default button mappings. My Logitech DualShock look-a-like reports as "Logitech Logitech Dual Action" (yes, Logitech twice), and my PS3 Sixaxis reports as "Sony PLAYSTATION(R)3 Controller".

It shows up as 5 axes, 12 buttons, and a hat. Let's look at the mapping to see it get crazy:

  • Twisting: axis 0
  • Buttons I and II (the red ones): axis 1. Yes, the two jolly, red, candy-like analog buttons show up, together, as axis 1. Button I gets the negative values, and button II gets the positive values.
  • Button A: button 1
  • Button B: button 0
  • Left trigger: axis 3
  • Right trigger: button 7
  • Start: button 9
  • D-pad: Hat 0. I hate controllers that advertise the d-pad as a hat. My Logitech controller does that, but the Sixaxis just shows them as buttons.
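
To actually use buttons I and II, the combined axis has to be pulled apart again. Here is a minimal C sketch of how I'd expect that to look, assuming only the mapping above and SDL's signed 16-bit axis range. The struct and function names are made up, and it is deliberately free of SDL calls so it can be tried on its own.

```c
#include <assert.h>

/* SDL reports joystick axes as signed 16-bit values. */
enum { AXIS_MIN = -32768, AXIS_MAX = 32767 };

/* Hypothetical result type: how far each red button is pressed. */
struct negcon_buttons {
    int button_i;   /* 0 (released) up to 32768 (fully pressed) */
    int button_ii;  /* 0 (released) up to 32767 (fully pressed) */
};

/* Split the shared axis 1 value: negative values belong to button I,
 * positive values belong to button II. */
static struct negcon_buttons split_axis1(int axis_value)
{
    assert(axis_value >= AXIS_MIN && axis_value <= AXIS_MAX);
    struct negcon_buttons b = { 0, 0 };
    if (axis_value < 0)
        b.button_i = -axis_value;
    else
        b.button_ii = axis_value;
    return b;
}
```

The obvious catch is that a single number cannot describe two buttons pressed at the same time, so whatever the adapter reports in that case, this split can only ever credit one of them.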

All of this begs the question, "WTF?" It also begs a couple follow-up questions. Is all of this madness caused by the adapter, or is it endemic to the controller itself? I suspect it shows up as 12 buttons because of the DualShock. The DualShock actually has 13 buttons (the "Analog" selector), but I don't think that gets sent over the protocol. I think that just changes the function of the controller itself. My Logitech has a similar "mode" button, and that doesn't go over the protocol.

Did anyone ever use a neGcon with a parallel port adapter? How did it show up? It looks like the Linux kernel has supported it for ages, so someone must have done it...

Posted Fri Apr 12 12:31:39 2013 Tags:
