Tag Archives: Shaders

Texture Atlas

Texture atlas [1][2] is a technique to group smaller textures into a larger texture. This decreases the number of state switches [3] a renderer needs to do and therefore often increases performance. Texture atlases have been used for a long time in the video game industry for sprite animations. When using texture atlases, the uv-coordinates of the models have to be changed so the original 0..1 map to the textures tile in the atlas. Grouping of textures can be done manually by texture artists or with tools. The texture coordinate system can be changed to map the new texture in a tool, or in the shader at run-time.

An example texture atlas
Image from article [2]

There are some limitations with using texture atlases compared to normal textures. First of all, all texture coordinates must initially be within 0..1 range. So for example, no “free” tiling can be used. The other problem is bleeding between tiles in the atlas when doing filtering, for example when using mipmaps.

Some additional information from Ivan-Assen Ivanov, author of article [2].

” – separate textures hurts not only batching (in facts, it hurts batching less than years ago), but also memory – as there is a certain per-texture overhead. This is especially painful on consoles – on PCs, the overhead is still there, I guess, but the driver hides it from you. The exact numbers are under NDA, of course, but on an old, unreleased project, we saved about 9 MB by atlas-sing a category of textures we didn’t atlas before.

- vertex interpolators are expensive! make sure you measure the remapping from 0..1 to the actual UVs in the atlas both in the vertex and in the pixel shader. Sounds counterintuitive, but on modern GPUs and with dense geometry, pixel shader is actually faster.” 

[1] “Improve Batching Using Texture Atlases” http://http.download.nvidia.com/developer/NVTextureSuite/Atlas_Tools/Texture_Atlas_Whitepaper.pdf

[2] “Practical Texture Atlases” (borrowed image from this page)

[3] “Batch, Batch, Batch: What Does It Really Mean?”


Starting with DirectX Pixel Shader Model 3.0 there exist an input type called VPOS. It’s the current pixels position on the screen and it’s automatically generated. This can be useful when sampling from a previously rendered texture when rendering an arbitrarily shaped mesh to the screen. To do this, we need uv-coords that represents where to sample on the texture. These coordinates can be gained by simply dividing VPOS with the screen dimensions.
When working with older hardware, that doesn’t support shader model 3.0, there is a need to manually create the VPOS in the vertex shader and pass it to the fragment shader as a TEXCOORD. This is the way to do so ( including the scaling to uv-range which manually has to be done for VPOS if you’re using it).

Vertex Shader:

float4x4 matWorldViewProjection;
float2 fInverseViewportDimensions;
struct VS_INPUT
   float4 Position : POSITION0;
struct VS_OUTPUT
   float4 Position : POSITION0;
   float4 calculatedVPos : TEXCOORD0;
float4 ConvertToVPos( float4 p )
   return float4( 0.5*( float2(p.x + p.w, p.w - p.y) + p.w*fInverseViewportDimensions.xy), p.zw);
VS_OUTPUT vs_main( VS_INPUT Input )
   VS_OUTPUT Output;
   Output.Position = mul( Input.Position, matWorldViewProjection );
   Output.calculatedVPos = ConvertToVPos(Output.Position);
   return( Output );

Pixel Shader:

float4 ps_main(VS_OUTPUT Input) : COLOR0
   Input.calculatedVPos /= Input.calculatedVPos.w;
   return float4(Input.calculatedVPos.xy,0,1); // test render it to the screen

The image below shows an elephant model rendered with the shader above. As can be seen, the color (red and green channels) correctly represents the uv-coords for a fullscreen quad. Since 0,0,0 = black, 1,0,0 = red, 0,1,0 = green, 1, 1,0 = yellow.

VPOS Elephant
This is how the pixel shader would have looked like if VPOS were used instead (note: no special vertex shader needed in this case).
struct PS_INPUT
   float2 vPos : VPOS;
float4 ps_main(PS_INPUT Input) : COLOR0
   return float4(Input.vPos*fInverseViewportDimensions + fInverseViewportDimensions*0.5,0,1); // test render it to the screen

The original code, more info and proof can be found here:

Fluid Simulation and Rendering

For effects like smoke or water, a fluid simulation and rendering approach is needed. There are currently two popular methods for this:

  1. Simulate the fluid on the CPU and send the result as particles to the GPU for rendering as billboards. This is often called a particle system. The technique has been around since the dawn of computer graphics.

  2. Simulate the fluid on the GPU and render the result into textures. This will then be rendered by doing volume ray casting (or ray marching) on the GPU. This technique is new and rather unexplored, and there are few real-life implementations. The result can be very realistic but slow.

Technique one burdens both CPU, bandwidth and GPU. Although in modern solutions, it’s the bandwidth that’s the bottleneck. The GPU based technique only burdens the GPU ( but a lot ).

The movie shows the GPU method of fluid simulation and rendering. More info about this particular implementation in the two last links.

Building an Advanced Particle System
Building a Million Particle System
Real-Time Simulation and Rendering of 3D Fluids
The previous page’s authors homepage: