# Position Reconstruction

There are many occasions when the fragment position in world space needs to be reconstructed from a texture holding the scene depth (a depth texture). One example is deferred rendering, where memory usage can be decreased by saving only the depth instead of the position. This results in one channel of data, instead of the three channels needed when saving the whole position.

There are different ways to save the depth. The most popular are view space depth and screen space depth. Saving the depth in view space instead of screen space gives two advantages: it’s faster, and it gives better precision because the depth is linear in view space.
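For reference, writing linear view space depth could look like this minimal GLSL sketch (the farPlane uniform is a hypothetical name, and the two shaders are separate sources):

```
// vertex shader
varying float viewDepth;

void main(void)
{
   gl_Position = ftransform();
   // view-space z is negative in front of the camera in OpenGL, so negate it
   viewDepth = -(gl_ModelViewMatrix * gl_Vertex).z;
}

// fragment shader
varying float viewDepth;
uniform float farPlane; // distance to the far clip plane

void main(void)
{
   // store linear view-space depth scaled to 0..1
   gl_FragColor = vec4(viewDepth / farPlane);
}
```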

This is how screen space depth can be rendered in HLSL:

```
float4x4 matWorldViewProjection;

struct VS_OUTPUT
{
   float4 Pos : POSITION;
   float4 posInProjectedSpace : TEXCOORD0;
};

// vertex shader
VS_OUTPUT vs_main( float4 Pos : POSITION )
{
   VS_OUTPUT Out = (VS_OUTPUT)0;
   Out.Pos = mul(Pos, matWorldViewProjection);
   Out.posInProjectedSpace = Out.Pos;
   return Out;
}

// pixel shader
float4 ps_main( VS_OUTPUT Input ) : COLOR
{
   // perspective divide gives the screen space depth in 0..1
   float depth = Input.posInProjectedSpace.z / Input.posInProjectedSpace.w;
   return depth;
}
```

The HLSL pixel shader below shows how the position can be reconstructed from the depth map stored with the code above. Note that this is one of the slowest ways of doing position reconstruction, since it requires a full matrix multiplication per fragment.

```
sampler depthTexture;
float2 fInverseViewportDimensions;   // (1/viewportWidth, 1/viewportHeight)
float4x4 matViewProjectionInverse;

float4 ps_main( float2 vPos : VPOS ) : COLOR0
{
   // sample the stored depth at the center of the current pixel
   float depth = tex2D(depthTexture, vPos*fInverseViewportDimensions + fInverseViewportDimensions*0.5).r;

   // scale it to -1..1 (screen coordinates)
   float2 projectedXY = vPos*fInverseViewportDimensions*2 - 1;
   projectedXY.y = -projectedXY.y;

   // create the position in screen space
   float4 pos = float4(projectedXY, depth, 1);

   // transform the position into world space by multiplying with the inverse view projection matrix
   pos = mul(pos, matViewProjectionInverse);

   // make it homogeneous
   pos /= pos.w; // result will be (x,y,z,1) in world space

   return pos; // for now, just render it out
}
```

To reconstruct the position from view space depth, a ray from the camera position to the far plane of the frustum is needed. For a full screen quad, this ray can be precalculated for the four corners and passed to the shader; this is how the computer game Crysis did it [1]. But for arbitrary geometry, as needed in deferred rendering, the ray must be calculated in the shaders [2].
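As a minimal GLSL sketch of the full screen quad variant (the names frustumCorner and depthTex are hypothetical, the depth texture is assumed to hold linear view space depth scaled to 0..1, and the two shaders are separate sources):

```
// vertex shader: pass the precalculated corner ray down to the fragments
attribute vec3 frustumCorner; // view-space vector from the camera to the matching far-plane corner
varying vec3 viewRay;
varying vec2 uv;

void main(void)
{
   gl_Position = ftransform();
   uv = gl_MultiTexCoord0.xy;
   viewRay = frustumCorner; // interpolated over the quad
}

// fragment shader: scale the interpolated ray by the stored depth
uniform sampler2D depthTex; // linear view-space depth, scaled to 0..1
varying vec3 viewRay;
varying vec2 uv;

void main(void)
{
   float depth = texture2D(depthTex, uv).r;
   // the ray reaches the far plane at depth 1, so scaling it by the
   // normalized depth gives the view-space position directly
   vec3 posInViewSpace = viewRay * depth;
   gl_FragColor = vec4(posInViewSpace, 1.0);
}
```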

[1] “Finding next gen: CryEngine 2”
http://ati.amd.com/developer/gdc/2007/mittring-finding_nextgen_cryengine2(siggraph07).pdf

[2] “Reconstructing Position From Depth, Continued”
http://mynameismjp.wordpress.com/2009/05/05/reconstructing-position-from-depth-continued/

# SSAO

This is my SSAO (Screen Space Ambient Occlusion) implementation; it’s both fast and gives good results. It’s inspired by the Crysis SSAO algorithm, but also by the Starcraft II implementation and a little by NVIDIA’s implementation.

To use the SSAO shader, render a fullscreen quad with the SSAO shader applied to it. This shader traces 10–16 rays for every fragment (pixel, when there is no multisampling) on the screen. The rays are shot in random directions in a hemisphere around the normal of the current fragment. By doing texture lookups, the depth and the normals are compared between the current fragment and the traced ones. The comparison is a simple step formula, and from the result an SSAO term can be evaluated. This SSAO term is saved to the red channel of the texture. The result must be blurred before being combined with the original scene render; the blur can for example be a bilateral blur. Both the SSAO term and the blurring can be done at a much lower resolution (half or a quarter of the screen size) to save some clock cycles. Then, after blurring, one can upsample it to full screen size and get some blur for free.

The combination with the original scene can be as simple as multiplying this AO term with the already rendered screen.
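As a minimal sketch, such a combine pass could look like this in GLSL (the sampler names sceneTex and ssaoTex are hypothetical):

```
uniform sampler2D sceneTex; // the already rendered scene
uniform sampler2D ssaoTex;  // the blurred SSAO term, AO in the red channel
varying vec2 uv;

void main(void)
{
   vec4 scene = texture2D(sceneTex, uv);
   float ao = texture2D(ssaoTex, uv).r;
   // darken the scene by the ambient occlusion term
   gl_FragColor = vec4(scene.rgb * ao, scene.a);
}
```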

The shader is written to save the AO in the red channel, so render it to a single-channel texture to save memory.
The shader requires that the normalMap sampler holds a renderable texture with the normals of the scene in screen space in the RGB channels, and the linear depth (scaled to 0..1) in the A channel. NOTE: These maps are not the same maps as when doing deferred lighting. For example, the normals should be pure face normals rendered to the screen (no normal mapping).
The rnm sampler should hold a texture with random normals in the RGB channels, and can for example be the picture below.

These are the parameter values that worked well for me (though what gives the best result differs from setup to setup):

```
uniform float totStrength = 1.38;
uniform float strength = 0.07;
uniform float offset = 18.0;
uniform float falloff = 0.000002;
uniform float rad = 0.006;
```

Here’s the SSAO GLSL fragment shader:

```
uniform sampler2D rnm;
uniform sampler2D normalMap;
varying vec2 uv;
uniform float totStrength;
uniform float strength;
uniform float offset;
uniform float falloff;
uniform float rad;
#define SAMPLES 16 // 10 is good
const float invSamples = 1.0/16.0;

void main(void)
{
   // these are the random vectors inside a unit sphere
   vec3 pSphere[16] = vec3[](vec3(0.53812504, 0.18565957, -0.43192),vec3(0.13790712, 0.24864247, 0.44301823),vec3(0.33715037, 0.56794053, -0.005789503),vec3(-0.6999805, -0.04511441, -0.0019965635),vec3(0.06896307, -0.15983082, -0.85477847),vec3(0.056099437, 0.006954967, -0.1843352),vec3(-0.014653638, 0.14027752, 0.0762037),vec3(0.010019933, -0.1924225, -0.034443386),vec3(-0.35775623, -0.5301969, -0.43581226),vec3(-0.3169221, 0.106360726, 0.015860917),vec3(0.010350345, -0.58698344, 0.0046293875),vec3(-0.08972908, -0.49408212, 0.3287904),vec3(0.7119986, -0.0154690035, -0.09183723),vec3(-0.053382345, 0.059675813, -0.5411899),vec3(0.035267662, -0.063188605, 0.54602677),vec3(-0.47761092, 0.2847911, -0.0271716));
   // alternative kernels for other sample counts:
   //const vec3 pSphere[8] = vec3[](vec3(0.24710192, 0.6445882, 0.033550154),vec3(0.00991752, -0.21947019, 0.7196721),vec3(0.25109035, -0.1787317, -0.011580509),vec3(-0.08781511, 0.44514698, 0.56647956),vec3(-0.011737816, -0.0643377, 0.16030222),vec3(0.035941467, 0.04990871, -0.46533614),vec3(-0.058801126, 0.7347013, -0.25399926),vec3(-0.24799341, -0.022052078, -0.13399573));
   //const vec3 pSphere[12] = vec3[](vec3(-0.13657719, 0.30651027, 0.16118456),vec3(-0.14714938, 0.33245975, -0.113095455),vec3(0.030659059, 0.27887347, -0.7332209),vec3(0.009913514, -0.89884496, 0.07381549),vec3(0.040318526, 0.40091, 0.6847858),vec3(0.22311053, -0.3039437, -0.19340435),vec3(0.36235332, 0.21894878, -0.05407306),vec3(-0.15198798, -0.38409665, -0.46785462),vec3(-0.013492276, -0.5345803, 0.11307949),vec3(-0.4972847, 0.037064247, -0.4381323),vec3(-0.024175806, -0.008928787, 0.17719103),vec3(0.694014, -0.122672155, 0.33098832));
   //const vec3 pSphere[10] = vec3[](vec3(-0.010735935, 0.01647018, 0.0062425877),vec3(-0.06533369, 0.3647007, -0.13746321),vec3(-0.6539235, -0.016726388, -0.53000957),vec3(0.40958285, 0.0052428036, -0.5591124),vec3(-0.1465366, 0.09899267, 0.15571679),vec3(-0.44122112, -0.5458797, 0.04912532),vec3(0.03755566, -0.10961345, -0.33040273),vec3(0.019100213, 0.29652783, 0.066237666),vec3(0.8765323, 0.011236004, 0.28265962),vec3(0.29264435, -0.40794238, 0.15964167));

   // grab a normal for reflecting the sample rays later on
   vec3 fres = normalize((texture2D(rnm,uv*offset).xyz*2.0) - vec3(1.0));

   vec4 currentPixelSample = texture2D(normalMap,uv);
   float currentPixelDepth = currentPixelSample.a;

   // current fragment coords in screen space
   vec3 ep = vec3(uv.xy,currentPixelDepth);
   // get the normal of the current fragment
   vec3 norm = currentPixelSample.xyz;

   float bl = 0.0;
   // adjust for the depth (not sure if this is good..)
   float radD = rad/currentPixelDepth;

   vec3 ray, se, occNorm;
   float occluderDepth, depthDifference, normDiff;

   for(int i=0; i<SAMPLES; ++i)
   {
      // get a vector (randomized inside of a sphere with radius 1.0) from a texture and reflect it
      ray = radD*reflect(pSphere[i],fres);

      // if the ray is outside the hemisphere then change direction
      se = ep + sign(dot(ray,norm))*ray;

      // get the depth of the occluder fragment
      vec4 occluderFragment = texture2D(normalMap,se.xy);

      // get the normal of the occluder fragment
      occNorm = occluderFragment.xyz;

      // if depthDifference is negative, the occluder is behind the current fragment
      depthDifference = currentPixelDepth-occluderFragment.a;

      // calculate the difference between the normals as a weight
      normDiff = (1.0-dot(occNorm,norm));

      // the falloff equation, starts at falloff and falls off roughly like 1/x^2
      bl += step(falloff,depthDifference)*normDiff*(1.0-smoothstep(falloff,strength,depthDifference));
   }

   // output the result
   float ao = 1.0-totStrength*bl*invSamples;
   gl_FragColor.r = ao;
}
```

This is an optimized version of the same shader; it is a little harder to read and understand:

```
uniform sampler2D rnm;
uniform sampler2D normalMap;
varying vec2 uv;
const float totStrength = 1.38;
const float strength = 0.07;
const float offset = 18.0;
const float falloff = 0.000002;
const float rad = 0.006;
#define SAMPLES 10 // 10 is good
const float invSamples = -1.38/10.0; // -totStrength/SAMPLES baked into one constant

void main(void)
{
   // these are the random vectors inside a unit sphere
   vec3 pSphere[10] = vec3[](vec3(-0.010735935, 0.01647018, 0.0062425877),vec3(-0.06533369, 0.3647007, -0.13746321),vec3(-0.6539235, -0.016726388, -0.53000957),vec3(0.40958285, 0.0052428036, -0.5591124),vec3(-0.1465366, 0.09899267, 0.15571679),vec3(-0.44122112, -0.5458797, 0.04912532),vec3(0.03755566, -0.10961345, -0.33040273),vec3(0.019100213, 0.29652783, 0.066237666),vec3(0.8765323, 0.011236004, 0.28265962),vec3(0.29264435, -0.40794238, 0.15964167));

   // grab a normal for reflecting the sample rays later on
   vec3 fres = normalize((texture2D(rnm,uv*offset).xyz*2.0) - vec3(1.0));

   vec4 currentPixelSample = texture2D(normalMap,uv);
   float currentPixelDepth = currentPixelSample.a;

   // current fragment coords in screen space
   vec3 ep = vec3(uv.xy,currentPixelDepth);
   // get the normal of the current fragment
   vec3 norm = currentPixelSample.xyz;

   float bl = 0.0;
   // adjust for the depth (not sure if this is good..)
   float radD = rad/currentPixelDepth;

   float depthDifference;
   vec4 occluderFragment;
   vec3 ray;

   for(int i=0; i<SAMPLES; ++i)
   {
      // get a vector (randomized inside of a sphere with radius 1.0) from a texture and reflect it
      ray = radD*reflect(pSphere[i],fres);

      // flip the ray into the hemisphere and fetch the occluder fragment in one step
      occluderFragment = texture2D(normalMap,ep.xy + sign(dot(ray,norm))*ray.xy);

      // if depthDifference is negative, the occluder is behind the current fragment
      depthDifference = currentPixelDepth-occluderFragment.a;

      // weight by the normal difference; the falloff starts at falloff and falls off roughly like 1/x^2
      bl += step(falloff,depthDifference)*(1.0-dot(occluderFragment.xyz,norm))*(1.0-smoothstep(falloff,strength,depthDifference));
   }

   // output the result
   gl_FragColor.r = 1.0+bl*invSamples;
}
```

To use the SSAO effect, render a fullscreen quad over the screen with the following vertex shader and the previous SSAO fragment shader.

```
varying vec2 uv;

void main(void)
{
   gl_Position = ftransform();
   gl_Position = sign( gl_Position );

   // Texture coordinate for screen aligned (in correct range):
   uv = (vec2( gl_Position.x, -gl_Position.y ) + vec2( 1.0 )) * 0.5;
}
```

Here’s the source for a school project we did in six weeks (half-time work). It’s a simple FPS game, but it does demonstrate the SSAO. Use the “O” key to toggle the different SSAO modes. This source is unfortunately hard to build since it uses a lot of third party libraries (Ogre3D, boost, fmod, ode …)

http://sourceforge.net/svn/?group_id=244295

And here’s the compiled version. Just unzip it and run the exe.

# Screen Space Blurred Shadow Mapping

This is probably the first technique one will think of for creating soft shadows when doing shadow mapping. The shadows are rendered to a texture (in screen space), this texture is then blurred (in screen space), and the result is later applied to the screen. This is a very easy technique for getting soft shadows. The main drawbacks are shadow bleeding and the cost of the extra passes.
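As a rough sketch, one pass of such a screen space blur could be a separable Gaussian like the one below in GLSL (the sampler and uniform names are hypothetical; run it once with a horizontal texelStep and once with a vertical one):

```
uniform sampler2D shadowTex; // screen space shadow factors
uniform vec2 texelStep;      // (1/width, 0) for the X pass, (0, 1/height) for the Y pass
varying vec2 uv;

void main(void)
{
   // 5-tap Gaussian kernel (1 4 6 4 1)/16, applied separably in X and Y
   float weights[3];
   weights[0] = 0.375;
   weights[1] = 0.25;
   weights[2] = 0.0625;

   float sum = texture2D(shadowTex, uv).r * weights[0];
   for(int i = 1; i < 3; ++i)
   {
      sum += texture2D(shadowTex, uv + float(i)*texelStep).r * weights[i];
      sum += texture2D(shadowTex, uv - float(i)*texelStep).r * weights[i];
   }
   gl_FragColor = vec4(sum);
}
```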

An article about it on gamedev.net:
http://www.gamedev.net/reference/articles/article2193.asp

# Screen Space Ambient Occlusion

SSAO is the new technique that most new games just must include, because of the hype around it since the computer game Crysis. It’s a technique for creating a rough approximation of ambient occlusion by using the depth of the rendered scene. It works by comparing the current fragment’s depth with some random sample depths around it, to see if the current fragment is occluded or not. The current fragment is occluded if the sample is closer to the eye than the current fragment. Although this sounds like a very crude way to do it, in practice it works beyond all expectations.

How to take the samples is a big concern, as it affects what will occlude and by how much. The best current implementations take random samples in a hemisphere in the direction of the normal, which limits the amount of self-occlusion. Another problem is that if only the depth is taken into consideration, a flat surface might occlude itself because of the perspective. By also comparing the normals when calculating the AO, this problem goes away.
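In shader code the hemisphere constraint can be a single sign(), as in the SSAO shaders earlier on this page; pulled out as a small GLSL helper it reads:

```
// mirror a random unit sphere direction into the hemisphere around the normal:
// if the sample points away from the surface, sign() flips it back
vec3 toHemisphere(vec3 ray, vec3 norm)
{
   return sign(dot(ray, norm)) * ray;
}
```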

One of the hard parts of implementing SSAO is choosing the correct smoothing technique. Because of the big cost of taking occlusion samples, you want to take as few samples as possible, but this introduces a lot of noise in the SSAO, so the result needs smoothing. A simple Gaussian blur will not do, as it makes the SSAO bleed across edges. Instead, a blur that considers the depth and/or the normals is needed. One such filter is the bilateral filter, which is often used in combination with SSAO.
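A minimal sketch of one depth-aware (bilateral) blur pass in GLSL; the ssaoTex and texelStep names and the depth threshold are hypothetical, while normalMap is the same map as in the SSAO shader above (linear depth in the A channel):

```
uniform sampler2D ssaoTex;    // the noisy SSAO term in the red channel
uniform sampler2D normalMap;  // same map as in the SSAO shader: linear depth in the A channel
uniform vec2 texelStep;       // (1/width, 0) for the X pass, (0, 1/height) for the Y pass
varying vec2 uv;

void main(void)
{
   float centerDepth = texture2D(normalMap, uv).a;
   float sum = 0.0;
   float weightSum = 0.0;

   // 9-tap blur that skips samples on the far side of a depth edge
   for(int i = -4; i <= 4; ++i)
   {
      vec2 sampleUV = uv + float(i)*texelStep;
      float sampleDepth = texture2D(normalMap, sampleUV).a;

      // only samples at a similar depth contribute, so the AO does not bleed across edges
      float weight = (abs(centerDepth - sampleDepth) < 0.001) ? 1.0 : 0.0;
      sum += texture2D(ssaoTex, sampleUV).r * weight;
      weightSum += weight;
   }

   // the center sample always contributes, so weightSum is never zero
   gl_FragColor.r = sum / weightSum;
}
```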

The steps of a simple SSAO implementation:

1. Render the scene. Save the linear depth in a texture. Save the normals in eye space in a texture.
2. Render a full screen quad with the SSAO shader. Save the result to a texture.
3. Blur the result in X.
4. Blur the result in Y.
5. Blend the blurred SSAO texture with the scene, or use it directly when rendering the scene.

An optimization is to render the SSAO at a lower resolution than the screen and upsample it when blurring. Another optimization is to store both the normals and the depth in a single texture.

## SSAO in the NVIDIA SDK

Probably one of the best implementations of SSAO is this one by NVIDIA (although it’s rather slow). SDK 10 has a paper about the technique and also source code!

And here are three papers/presentations from NVIDIA describing their SSAO in detail:

## SSAO in Starcraft II

Some information about how Starcraft II will use SSAO is included in this paper (see chapter 5.5):
http://ati.amd.com/developer/SIGGRAPH08/Chapter05-Filion-StarCraftII.pdf

## SSAO in Two Worlds

A link to a description of the SSAO implementation in the game Two Worlds:
http://www.drobot.org/pub/GCDC_SSAO_RP_29_08.pdf

## SSAO in Crysis

The paper and game that started it all (see chapter 8.5.4.3):
http://delivery.acm.org/10.1145/1290000/1281671/p97-mittring.pdf?key1=1281671&key2=9942678811&coll=ACM&dl=ACM&CFID=15151515&CFTOKEN=6184618

## Hardware Accelerated Ambient Occlusion

One of the papers that probably inspired the Crysis team: