I'm giving a talk tomorrow at LPC in the Graphics and Display microconf. Since the time slots are so short (25 minutes!), I wanted (okay, Laurent requested) to provide some details before the talk to prime the pump, so to speak.

One-line summary: Right now debugging and profiling graphics applications and graphics systems on Linux is a disaster. It's better on Windows, but not by a lot (especially for OpenGL).

There are very few tools, and the tools that exist are either insufficient or are vendor-specific. Moreover, the tools don't provide any kind of system view. At some point in some desktop environment, every developer has to figure out why / how the compositor is wrecking performance. This often takes a lot of work because there is no system view.

The one tool for Windows that can provide a system view is GPUView.

As a result, even on Windows, many developers end up rolling their own tools. The methods remind me a lot of the old days of sprinkling rdtsc() calls all over the code. What has changed is the level of detail provided by the tools that display the logged data. Valve has famously talked about the system they use. Other developers have told me they use similar systems.
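For readers who haven't seen the roll-your-own approach, here is a minimal sketch of what it tends to look like. This is not any particular developer's system: every name in it (trace_begin, trace_end, trace_dump, MAX_EVENTS) is made up for illustration, and it assumes a POSIX system with clock_gettime() standing in for rdtsc(). Note that it only timestamps the CPU side. Wrapping a GL call this way tells you when the call returned, not when the GPU did the work.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime() under strict -std= modes */
#endif
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* All names below are hypothetical; this is not a real profiling library. */
#define MAX_EVENTS 1024

struct trace_event {
    const char *label;   /* what the application says it was doing */
    uint64_t start_ns;   /* CPU timestamp when the region began */
    uint64_t end_ns;     /* CPU timestamp when the region ended, 0 if still open */
};

static struct trace_event trace_log[MAX_EVENTS];
static size_t trace_count;

/* Monotonic CPU-side clock in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Open a timed region. Returns an event id, or -1 if the log is full. */
static int trace_begin(const char *label)
{
    if (trace_count >= MAX_EVENTS)
        return -1;
    trace_log[trace_count].label = label;
    trace_log[trace_count].start_ns = now_ns();
    trace_log[trace_count].end_ns = 0;
    return (int)trace_count++;
}

/* Close a region previously opened with trace_begin(). */
static void trace_end(int id)
{
    if (id >= 0)
        trace_log[id].end_ns = now_ns();
}

/* Write the log as "label start end" lines for a post-hoc visualization tool. */
static void trace_dump(FILE *out)
{
    for (size_t i = 0; i < trace_count; i++)
        fprintf(out, "%s %llu %llu\n", trace_log[i].label,
                (unsigned long long)trace_log[i].start_ns,
                (unsigned long long)trace_log[i].end_ns);
}
```

An application would bracket the interesting regions (a frame, a batch of draw calls) with trace_begin() and trace_end(), then call trace_dump() at exit and feed the output to whatever viewer it prefers.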

There is a common thematic problem in all of these tools and approaches. The developer either gets a lot of detailed data and is tied to a particular vendor, or the developer gets very coarse data. And they don't get system-wide data.

There are several disparate groups that need data:

  • People creating stand-alone debug / profile tools (e.g., apitrace).
  • People building data collection into their application and using an external, post-hoc visualization tool.
  • People building data collection and visualization into their application.

Generally, folks doing one of the last two are doing both to varying degrees.

So here's the question. Can we provide a set of interfaces, probably from the kernel, that:

  • Provides finer grained data than is available from, say, GL_ARB_timer_query about the execution of commands on the GPU.
  • Provides the above data at a system level with semantic information (e.g., this block of time was your call to glDrawArrays, this block of time was the compositor doing "stuff", this block of time was your XRender request, etc.) without leaking information in a way that compromises security.
  • Allows closed-source drivers to expose these interfaces.

UPDATE: The slides and notes from the talk are available. Thanks to Nathan Willis for reminding me to post them.

Pardon my ignorance, but I thought that ApiTrace was a very good tool for graphics debugging/profiling. Is it missing features? Has it bitrotted too much?
Comment by Vincent dP Thu Sep 19 01:39:45 2013
apitrace is still under very active development, and quite a few developers use it. Recording and (later) playing back GL commands makes it really unsuitable for a lot of kinds of performance work. It is, however, good for a lot of kinds of debugging.
Comment by IanRomanick Thu Sep 19 10:10:39 2013
While I can understand the use cases for unprivileged tracing, in which you'd want to avoid exposing any sensitive information about other processes or the system, why not focus on privileged GPU tracing, where you can safely give a complete view into what the system was doing? Performance analysis tools are mostly used by developers, and developers have root on their own systems.
Comment by Josh Fri Sep 20 01:39:47 2013