Texture Atlas

A texture atlas [1][2] is a technique for grouping smaller textures into one larger texture. This decreases the number of state switches [3] a renderer needs to perform and therefore often increases performance. Texture atlases have been used in the video game industry for a long time, for example for sprite animations. When using a texture atlas, the uv-coordinates of the models have to be remapped so that the original 0..1 range maps to the texture's tile in the atlas. Grouping of textures can be done manually by texture artists or with tools. The remapping of the texture coordinates can also be done in a tool, or in the shader at run-time.
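
As a rough illustration, the run-time remapping in a pixel shader could look like this (fTileOffset, fTileScale and atlasSampler are made-up names for this sketch, not taken from the referenced articles):

// assumed per-object constants describing where the tile lies in the atlas,
// e.g. fTileOffset = (0.5, 0.0) and fTileScale = (0.25, 0.25) for a 4x4 atlas
float2 fTileOffset;
float2 fTileScale;
sampler atlasSampler;
 
float4 ps_main( float2 uv : TEXCOORD0 ) : COLOR0
{
   // map the original 0..1 uv range into the tile's sub-rectangle of the atlas
   float2 atlasUV = uv * fTileScale + fTileOffset;
   return tex2D(atlasSampler, atlasUV);
}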

An example texture atlas (image from article [2])

There are some limitations to using texture atlases compared to normal textures. First of all, all texture coordinates must initially be within the 0..1 range, so no “free” tiling via the hardware wrap mode can be used. The other problem is bleeding between neighbouring tiles in the atlas when filtering, for example when using mipmaps.
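
If tiling is still needed, it can be emulated by wrapping the coordinates manually in the pixel shader before the remap. A rough sketch (fTileRepeat and the other names are again made up), with the caveat that the wrap seam reintroduces the filtering and mipmapping problems mentioned above:

float2 fTileOffset;   // placement of the tile in the atlas
float2 fTileScale;
float2 fTileRepeat;   // how many times the original texture should repeat
sampler atlasSampler;
 
float4 ps_main( float2 uv : TEXCOORD0 ) : COLOR0
{
   // wrap the coordinates "by hand" into 0..1 before remapping into the tile
   float2 tiledUV = frac(uv * fTileRepeat);
   return tex2D(atlasSampler, tiledUV * fTileScale + fTileOffset);
}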

Some additional information from Ivan-Assen Ivanov, author of article [2].

“– separate textures hurts not only batching (in facts, it hurts batching less than years ago), but also memory – as there is a certain per-texture overhead. This is especially painful on consoles – on PCs, the overhead is still there, I guess, but the driver hides it from you. The exact numbers are under NDA, of course, but on an old, unreleased project, we saved about 9 MB by atlas-sing a category of textures we didn’t atlas before.

- vertex interpolators are expensive! make sure you measure the remapping from 0..1 to the actual UVs in the atlas both in the vertex and in the pixel shader. Sounds counterintuitive, but on modern GPUs and with dense geometry, pixel shader is actually faster.” 

[1] “Improve Batching Using Texture Atlases” http://http.download.nvidia.com/developer/NVTextureSuite/Atlas_Tools/Texture_Atlas_Whitepaper.pdf

[2] “Practical Texture Atlases” (borrowed image from this page)
http://www.gamasutra.com/features/20060126/ivanov_01.shtml

[3] “Batch, Batch, Batch: What Does It Really Mean?”
http://developer.nvidia.com/docs/io/8230/batchbatchbatch.pdf

VPOS

Starting with DirectX Pixel Shader Model 3.0 there exists an input semantic called VPOS. It is the current pixel's position on the screen and it is generated automatically. This can be useful when sampling from a previously rendered texture while rendering an arbitrarily shaped mesh to the screen. To do this, we need uv-coords that represent where to sample the texture. These coordinates can be obtained by simply dividing VPOS by the screen dimensions (plus a half-pixel offset in Direct3D 9).
When working with older hardware that doesn't support Shader Model 3.0, the VPOS value has to be created manually in the vertex shader and passed to the fragment shader as a TEXCOORD. This is how to do so (including the scaling to uv range, which has to be done manually when the real VPOS is used).

Vertex Shader:

float4x4 matWorldViewProjection;
float2 fInverseViewportDimensions;
struct VS_INPUT
{
   float4 Position : POSITION0;
};
struct VS_OUTPUT
{
   float4 Position : POSITION0;
   float4 calculatedVPos : TEXCOORD0;
};
// Computes the screen space uv (including the Direct3D 9 half-pixel offset),
// pre-multiplied by w so it interpolates correctly; divide by w in the pixel shader.
float4 ConvertToVPos( float4 p )
{
   return float4( 0.5*( float2(p.x + p.w, p.w - p.y) + p.w*fInverseViewportDimensions.xy), p.zw);
}
 
VS_OUTPUT vs_main( VS_INPUT Input )
{
   VS_OUTPUT Output;
   Output.Position = mul( Input.Position, matWorldViewProjection );
   Output.calculatedVPos = ConvertToVPos(Output.Position);
   return( Output );
}

Pixel Shader:

float4 ps_main(VS_OUTPUT Input) : COLOR0
{
   Input.calculatedVPos /= Input.calculatedVPos.w; // perspective divide to get the final screen space uv
   return float4(Input.calculatedVPos.xy,0,1); // test render it to the screen
}

The image below shows an elephant model rendered with the shader above. As can be seen, the color (red and green channels) correctly represents the uv-coords for a fullscreen quad, since (0,0,0) = black, (1,0,0) = red, (0,1,0) = green and (1,1,0) = yellow.

VPOS Elephant

This is how the pixel shader would have looked if VPOS were used instead (note: no special vertex shader is needed in this case).

float2 fInverseViewportDimensions;
struct PS_INPUT
{
   float2 vPos : VPOS;
};
float4 ps_main(PS_INPUT Input) : COLOR0
{
   return float4(Input.vPos*fInverseViewportDimensions + fInverseViewportDimensions*0.5,0,1); // test render it to the screen
}

The original code, more info and proof can be found here:
http://www.gamedev.net/community/forums/topic.asp?topic_id=506573

Position Reconstruction

There are many occasions when the fragment's world space position needs to be reconstructed from a texture holding the scene depth (a depth texture). One example is deferred rendering, where memory usage can be decreased by storing only the depth instead of the full position. This results in one channel of data instead of the three channels needed to store the whole position.

 

View space scene depth

There are different ways to save the depth; the most popular are view space depth and screen space depth. Saving the depth in view space instead of screen space gives two advantages: it is faster to reconstruct the position from, and it gives better precision because the depth is linear in view space.
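
As a point of comparison, storing linear view space depth could look roughly like the code below. This sketch assumes a floating point render target and a matWorldView uniform (neither is taken from the references); the raw view space z is simply written out.

float4x4 matWorldViewProjection;
float4x4 matWorldView;
struct VS_OUTPUT
{
   float4 Pos       : POSITION;
   float3 posInView : TEXCOORD0;
};
 
// vertex shader
VS_OUTPUT vs_main( float4 Pos : POSITION )
{
   VS_OUTPUT Out = (VS_OUTPUT) 0;
   Out.Pos       = mul(Pos, matWorldViewProjection);
   Out.posInView = mul(Pos, matWorldView).xyz; // position in view space
   return Out;
}
 
// pixel shader
float4 ps_main( VS_OUTPUT Input ) : COLOR
{
   return Input.posInView.z; // linear view space depth
}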

This is how screen space depth can be rendered in HLSL:

float4x4 matWorldViewProjection;
struct VS_OUTPUT
{
   float4 Pos: POSITION;
   float4 posInProjectedSpace: TEXCOORD0;
};
 
// vertex shader
VS_OUTPUT vs_main( float4 Pos: POSITION )
{
   VS_OUTPUT Out = (VS_OUTPUT) 0;
   Out.Pos = mul(Pos,matWorldViewProjection);
   Out.posInProjectedSpace = Out.Pos;
   return Out;
}
 
// pixel shader
float4 ps_main( VS_OUTPUT Input ) : COLOR
{
   float depth = Input.posInProjectedSpace.z / Input.posInProjectedSpace.w;
   return depth;
}

The HLSL pixel shader below shows how the position can be reconstructed from the depth map stored with the code above. Note, however, that this is one of the slowest ways of doing position reconstruction, since it requires a full matrix multiplication per pixel.

float4x4 matViewProjectionInverse;
float2 fInverseViewportDimensions;
sampler depthTexture;
float4 ps_main(float2 vPos : VPOS) : COLOR0
{
   float depth = tex2D(depthTexture,vPos*fInverseViewportDimensions + fInverseViewportDimensions*0.5).r;
 
   // scale it to -1..1 (screen coordinates)
   float2 projectedXY = vPos*fInverseViewportDimensions*2-1;
   projectedXY.y = -projectedXY.y;
 
   // create the position in projected (normalized device) space
   float4 pos = float4(projectedXY,depth,1);
 
   // transform position into world space by multiplication with the inverse view projection matrix
   pos = mul(pos,matViewProjectionInverse); 
 
   // homogeneous divide
   pos /= pos.w; // result will be (x,y,z,1) in world space
 
   return pos; // for now, just render it out
}

To reconstruct the position from view space depth, a ray from the camera position towards the frustum far plane is needed. For a full screen quad, this ray can be precalculated for the four corners and passed to the shader; this is how the computer game Crysis did it [1]. For arbitrary geometry, as needed in deferred rendering, the ray must be calculated in the shaders [2].
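
For the arbitrary geometry case, a minimal sketch of the idea could look like the pixel shader below (an illustration in the spirit of [2], not code taken from it). It assumes the linear view space depth was stored as in the earlier sketch, and that the geometry being rendered (for example a light volume) passes its own view space position as a TEXCOORD, which defines the ray through the current pixel:

float2 fInverseViewportDimensions;
sampler depthTexture;
 
struct PS_INPUT
{
   float3 posInView : TEXCOORD0; // view space position of the rendered geometry (e.g. a light volume)
   float2 vPos      : VPOS;
};
 
float4 ps_main( PS_INPUT Input ) : COLOR0
{
   // uv for sampling the depth texture, as described in the VPOS section
   float2 uv    = Input.vPos*fInverseViewportDimensions + fInverseViewportDimensions*0.5;
   float  depth = tex2D(depthTexture, uv).r; // stored linear view space z
 
   // the interpolated view space position defines a ray from the camera (the view space origin)
   // through this pixel; rescale it so that its z component equals the stored depth
   float3 viewRay   = float3(Input.posInView.xy / Input.posInView.z, 1.0);
   float3 posInView = viewRay * depth;
 
   return float4(posInView, 1); // view space position; multiply by the inverse view matrix for world space
}

Compared to the screen space version above, no matrix multiplication is needed to recover the view space position, which is what makes the view space variant faster.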

[1] “Finding next gen: CryEngine 2”
http://ati.amd.com/developer/gdc/2007/mittring-finding_nextgen_cryengine2(siggraph07).pdf

[2] “Reconstructing Position From Depth, Continued”
http://mynameismjp.wordpress.com/2009/05/05/reconstructing-position-from-depth-continued/