SSAO

This is my SSAO (Screen Space Ambient Occlusion) implementation. It is fast and gives good results. It is inspired mainly by the Crysis SSAO algorithm, with ideas from the Starcraft II implementation and a few from NVIDIA’s.

To use the SSAO shader, render a fullscreen quad with the shader applied to it. The shader shoots 10-16 sample rays for every fragment on the screen (a fragment is a pixel when multisampling is off). Each ray points in a random direction inside a hemisphere around the normal of the current fragment. Texture lookups fetch the depth and the normal at the end of each ray, and these are compared with those of the current fragment. The comparison is a simple step formula, and the results are accumulated into an SSAO term, which is written to the red channel of the output texture. This result must be blurred before it is combined with the original scene render; a bilateral blur is one option. Both the SSAO pass and the blur can run at a much lower resolution than the screen (half or a quarter of the size) to save GPU time. Upsampling the blurred result to full screen size then gives some extra blur for free.

The combination with the original scene can be as simple as multiplying the already rendered scene by this AO term.
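
The multiply described above could look roughly like the fragment shader below. This is only a sketch: the sampler names sceneMap and aoMap are made up for the example, and the only thing taken from the SSAO shader is that the AO term lives in the red channel. It can be drawn with the same fullscreen-quad vertex shader as the SSAO pass.

```glsl
#version 120
// Hypothetical sampler names, not part of the original shader.
uniform sampler2D sceneMap; // the already rendered scene colour
uniform sampler2D aoMap;    // blurred SSAO term, stored in the red channel
varying vec2 uv;

void main(void)
{
   vec4 scene = texture2D(sceneMap, uv);

   // clamp in case a float render target kept values outside 0..1
   float ao = clamp(texture2D(aoMap, uv).r, 0.0, 1.0);

   // darken the colour by the occlusion term and leave alpha untouched
   gl_FragColor = vec4(scene.rgb * ao, scene.a);
}
```

If aoMap has a lower resolution than the screen, linear texture filtering on it does the upsampling during this lookup.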

SSAO enabled on the left, SSAO disabled on the right

The shader writes the AO term to the red channel only, so render it to a single-channel texture to save memory.
The SSAO term is saved only in the red channel
The shader requires that the normalMap sampler holds a render-target texture with the screen-space normals of the scene in the RGB channels and the linear depth (scaled to 0..1) in the A channel. NOTE: These are not the same maps as the ones used for deferred lighting. For example, the normals should be pure face normals rendered to the screen (no normal mapping).
The rnm sampler should hold a texture with random normals in the RGB channels; the picture below is one example.
Random normals
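
One possible way to fill the normalMap texture is sketched below. It is not the pass used in the project; it is only an illustration of the layout described above. The uniform farClip and the varying eyePos are names invented for the example. The face normal is derived from the screen-space derivatives of the eye-space position, which gives flat per-triangle normals without any extra vertex data. Since the SSAO shader reads the normals without unpacking them, the sketch assumes a floating-point render target that can hold signed values.

```glsl
// Vertex shader (sketch): pass the eye-space position to the fragment shader.
varying vec3 eyePos; // hypothetical varying

void main(void)
{
   vec4 p = gl_ModelViewMatrix * gl_Vertex;
   eyePos = p.xyz;
   gl_Position = gl_ProjectionMatrix * p;
}
```

```glsl
// Fragment shader (sketch): face normal in RGB, linear depth (0..1) in A.
uniform float farClip; // hypothetical: distance to the far clip plane
varying vec3 eyePos;

void main(void)
{
   // the two derivatives span the triangle's plane, so their cross product is the face normal
   vec3 n = normalize(cross(dFdx(eyePos), dFdy(eyePos)));

   // OpenGL eye space looks down -Z, so negate z to get a positive distance
   float d = clamp(-eyePos.z / farClip, 0.0, 1.0);

   gl_FragColor = vec4(n, d);
}
```

With the default OpenGL window orientation the normal of a front-facing triangle points towards the camera. If the render target is flipped vertically, the sign of n may have to be inverted.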

These are the values that worked well for me (what gives the best result differs from setup to setup):

uniform float totStrength = 1.38;
uniform float strength = 0.07;
uniform float offset = 18.0;
uniform float falloff = 0.000002;
uniform float rad = 0.006;

Here’s the SSAO GLSL fragment shader:

// the array constructors used below (vec3[](...)) need GLSL 1.20
#version 120
uniform sampler2D rnm;
uniform sampler2D normalMap;
varying vec2 uv;
uniform float totStrength;
uniform float strength;
uniform float offset;
uniform float falloff;
uniform float rad;
#define SAMPLES 16 // 10 also works well; the size of pSphere must match
const float invSamples = 1.0/float(SAMPLES);
void main(void)
{
// these are the random vectors inside a unit sphere
vec3 pSphere[16] = vec3[](vec3(0.53812504, 0.18565957, -0.43192),vec3(0.13790712, 0.24864247, 0.44301823),vec3(0.33715037, 0.56794053, -0.005789503),vec3(-0.6999805, -0.04511441, -0.0019965635),vec3(0.06896307, -0.15983082, -0.85477847),vec3(0.056099437, 0.006954967, -0.1843352),vec3(-0.014653638, 0.14027752, 0.0762037),vec3(0.010019933, -0.1924225, -0.034443386),vec3(-0.35775623, -0.5301969, -0.43581226),vec3(-0.3169221, 0.106360726, 0.015860917),vec3(0.010350345, -0.58698344, 0.0046293875),vec3(-0.08972908, -0.49408212, 0.3287904),vec3(0.7119986, -0.0154690035, -0.09183723),vec3(-0.053382345, 0.059675813, -0.5411899),vec3(0.035267662, -0.063188605, 0.54602677),vec3(-0.47761092, 0.2847911, -0.0271716));
//const vec3 pSphere[8] = vec3[](vec3(0.24710192, 0.6445882, 0.033550154),vec3(0.00991752, -0.21947019, 0.7196721),vec3(0.25109035, -0.1787317, -0.011580509),vec3(-0.08781511, 0.44514698, 0.56647956),vec3(-0.011737816, -0.0643377, 0.16030222),vec3(0.035941467, 0.04990871, -0.46533614),vec3(-0.058801126, 0.7347013, -0.25399926),vec3(-0.24799341, -0.022052078, -0.13399573));
//const vec3 pSphere[12] = vec3[](vec3(-0.13657719, 0.30651027, 0.16118456),vec3(-0.14714938, 0.33245975, -0.113095455),vec3(0.030659059, 0.27887347, -0.7332209),vec3(0.009913514, -0.89884496, 0.07381549),vec3(0.040318526, 0.40091, 0.6847858),vec3(0.22311053, -0.3039437, -0.19340435),vec3(0.36235332, 0.21894878, -0.05407306),vec3(-0.15198798, -0.38409665, -0.46785462),vec3(-0.013492276, -0.5345803, 0.11307949),vec3(-0.4972847, 0.037064247, -0.4381323),vec3(-0.024175806, -0.008928787, 0.17719103),vec3(0.694014, -0.122672155, 0.33098832));
//const vec3 pSphere[10] = vec3[](vec3(-0.010735935, 0.01647018, 0.0062425877),vec3(-0.06533369, 0.3647007, -0.13746321),vec3(-0.6539235, -0.016726388, -0.53000957),vec3(0.40958285, 0.0052428036, -0.5591124),vec3(-0.1465366, 0.09899267, 0.15571679),vec3(-0.44122112, -0.5458797, 0.04912532),vec3(0.03755566, -0.10961345, -0.33040273),vec3(0.019100213, 0.29652783, 0.066237666),vec3(0.8765323, 0.011236004, 0.28265962),vec3(0.29264435, -0.40794238, 0.15964167));
   // grab a normal for reflecting the sample rays later on
   vec3 fres = normalize((texture2D(rnm,uv*offset).xyz*2.0) - vec3(1.0));
 
   vec4 currentPixelSample = texture2D(normalMap,uv);
 
   float currentPixelDepth = currentPixelSample.a;
 
   // current fragment coords in screen space
   vec3 ep = vec3(uv.xy,currentPixelDepth);
 // get the normal of current fragment
   vec3 norm = currentPixelSample.xyz;
 
   float bl = 0.0;
   // scale the sampling radius by the inverse depth (not sure if this is good..)
   float radD = rad/currentPixelDepth;
 
   vec3 ray, se, occNorm;
   float depthDifference, normDiff;
 
   for(int i=0; i<SAMPLES;++i)
   {
      // take a vector (randomized inside of a sphere with radius 1.0) from pSphere and reflect it in the random normal
      ray = radD*reflect(pSphere[i],fres);
 
      // if the ray is outside the hemisphere then change direction
      se = ep + sign(dot(ray,norm) )*ray;
 
      // get the normal (rgb) and the depth (a) of the occluder fragment
      vec4 occluderFragment = texture2D(normalMap,se.xy);
 
      // get the normal of the occluder fragment
      occNorm = occluderFragment.xyz;
 
      // if depthDifference is negative = occluder is behind current fragment
      depthDifference = currentPixelDepth-occluderFragment.a;
 
      // calculate the difference between the normals as a weight
      normDiff = (1.0-dot(occNorm,norm));
      // the falloff equation: full weight at depthDifference == falloff, fading smoothly to zero at strength
      bl += step(falloff,depthDifference)*normDiff*(1.0-smoothstep(falloff,strength,depthDifference));
   }
 
   // output the result
   float ao = 1.0-totStrength*bl*invSamples;
   gl_FragColor.r = ao;
 
}

This is an optimized version of the same shader. It is a little harder to read and understand:

uniform sampler2D rnm;
uniform sampler2D normalMap;
varying vec2 uv;
const float totStrength = 1.38;
const float strength = 0.07;
const float offset = 18.0;
const float falloff = 0.000002;
const float rad = 0.006;
#define SAMPLES 10 // 10 is good
const float invSamples = -1.38/10.0; // -totStrength/SAMPLES folded into one constant
void main(void)
{
// these are the random vectors inside a unit sphere
vec3 pSphere[10] = vec3[](vec3(-0.010735935, 0.01647018, 0.0062425877),vec3(-0.06533369, 0.3647007, -0.13746321),vec3(-0.6539235, -0.016726388, -0.53000957),vec3(0.40958285, 0.0052428036, -0.5591124),vec3(-0.1465366, 0.09899267, 0.15571679),vec3(-0.44122112, -0.5458797, 0.04912532),vec3(0.03755566, -0.10961345, -0.33040273),vec3(0.019100213, 0.29652783, 0.066237666),vec3(0.8765323, 0.011236004, 0.28265962),vec3(0.29264435, -0.40794238, 0.15964167));
 
   // grab a normal for reflecting the sample rays later on
   vec3 fres = normalize((texture2D(rnm,uv*offset).xyz*2.0) - vec3(1.0));
 
   vec4 currentPixelSample = texture2D(normalMap,uv);
 
   float currentPixelDepth = currentPixelSample.a;
 
   // current fragment coords in screen space
   vec3 ep = vec3(uv.xy,currentPixelDepth);
  // get the normal of current fragment
   vec3 norm = currentPixelSample.xyz;
 
   float bl = 0.0;
   // scale the sampling radius by the inverse depth (not sure if this is good..)
   float radD = rad/currentPixelDepth;
 
   float depthDifference;
   vec4 occluderFragment;
   vec3 ray;
   for(int i=0; i<SAMPLES;++i)
   {
      // take a vector (randomized inside of a sphere with radius 1.0) from pSphere and reflect it in the random normal
      ray = radD*reflect(pSphere[i],fres);
 
      // get the normal and depth of the occluder fragment; the ray is flipped if it points out of the hemisphere
      occluderFragment = texture2D(normalMap,ep.xy + sign(dot(ray,norm) )*ray.xy);
      // if depthDifference is negative = occluder is behind current fragment
      depthDifference = currentPixelDepth-occluderFragment.a;
 
      // weight by the difference between the normals and apply the falloff:
      // full weight at depthDifference == falloff, fading smoothly to zero at strength
      bl += step(falloff,depthDifference)*(1.0-dot(occluderFragment.xyz,norm))*(1.0-smoothstep(falloff,strength,depthDifference));
   }
 
   // output the result
   gl_FragColor.r = 1.0+bl*invSamples;
 
}

To use the SSAO effect, render a fullscreen quad over the screen with the following vertex shader and the previous SSAO fragment shader.

varying vec2  uv;
 
void main(void)
{
   gl_Position = ftransform();
   // snap every coordinate to -1, 0 or +1 so the corners of the quad land on the corners of the screen
   gl_Position = sign( gl_Position );
 
   // Texture coordinate for screen aligned (in correct range):
   uv = (vec2( gl_Position.x, - gl_Position.y ) + vec2( 1.0 ) ) * 0.5;
}

Here’s the source for a school project we did in six weeks (working half-time). It’s a simple FPS game, but it does demonstrate the SSAO. Press the “O” key to toggle between the different SSAO modes. The source is unfortunately hard to build since it uses a lot of third-party libraries (Ogre3D, boost, fmod, ode …)

http://sourceforge.net/svn/?group_id=244295

And here’s the compiled version. Just unzip it and run the exe.

Download zip

Screen Space Ambient Occlusion

SSAO is the new technique that most new games simply must include because of the hype around it since the computer game Crysis. It’s a technique for creating a crude approximation of ambient occlusion by using the depth of the rendered scene. It works by comparing the current fragment’s depth with the depths of some random samples around it to see whether the current fragment is occluded. The current fragment is occluded if a sample is closer to the eye than the current fragment. Although this sounds like a poor approximation, in practice it works far better than one would expect.

How the samples are taken is a big concern, as it affects what will occlude and how much. The best current implementations take random samples in a hemisphere oriented along the normal. This limits the amount of self-occlusion. Another problem is that if you only take the depth into consideration, a flat surface might occlude itself because of the perspective. Comparing the normals as well when calculating the AO makes this problem go away.

One of the hard parts of implementing SSAO is choosing the right smoothing technique. Occlusion samples are expensive, so you want to take as few as possible, but few samples give a noisy SSAO term, so the result needs smoothing. A simple Gaussian blur is not good enough, because it makes the SSAO bleed across object edges. Instead, a blur that takes the depth and/or the normals into account is needed. One such blur is the bilateral filter, which is often used in combination with SSAO.
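
A depth-aware blur of this kind might look roughly like the sketch below. It is not the filter used in the project, just one simple separable variant: each tap gets a Gaussian weight for its distance from the centre, multiplied by a second Gaussian weight that shrinks as the tap's depth moves away from the centre depth, so samples from the other side of an edge contribute very little. The names aoMap, direction and depthSharpness, and the RADIUS of 4, are assumptions made for the example.

```glsl
#version 120
// Hypothetical uniforms, chosen for this example only.
uniform sampler2D aoMap;      // noisy SSAO term in the red channel
uniform sampler2D normalMap;  // linear depth in the alpha channel, as in the SSAO pass
uniform vec2 direction;       // one texel step: (1.0/width, 0.0) or (0.0, 1.0/height)
uniform float depthSharpness; // larger values preserve edges more strictly
varying vec2 uv;

#define RADIUS 4

void main(void)
{
   float centerDepth = texture2D(normalMap, uv).a;
   float sum = 0.0;
   float weightSum = 0.0;

   for(int i=-RADIUS; i<=RADIUS; ++i)
   {
      vec2 tap = uv + float(i)*direction;

      // spatial weight: Gaussian over the distance from the centre tap
      float x = float(i)/float(RADIUS);
      float spatial = exp(-2.0*x*x);

      // range weight: Gaussian over the depth difference to the centre tap
      float dd = (texture2D(normalMap, tap).a - centerDepth)*depthSharpness;
      float range = exp(-dd*dd);

      float w = spatial*range;
      sum += texture2D(aoMap, tap).r*w;
      weightSum += w;
   }

   // the centre tap always has weight 1.0, so weightSum is never zero
   gl_FragColor.r = sum/weightSum;
}
```

Run it twice: once with direction set to one texel in X, then on that result with direction set to one texel in Y. A suitable depthSharpness depends on how the linear depth is scaled, so it has to be tuned per scene.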

The steps of a simple SSAO implementation:

  1. Render the scene. Save the linear depth in a texture. Save the normals in eye space in a texture.
  2. Render a full screen quad with the SSAO shader. Save the result to a texture.
  3. Blur the result in X
  4. Blur the result in Y
  5. Blend the blurred SSAO texture with the scene, or use it directly when rendering the scene.

An optimization is to render the SSAO at a lower resolution than the screen and upsample it when blurring. Another optimization is to store both the normals and the depth in a single texture.

SSAO in the NVIDIA SDK

Probably one of the best implementations of SSAO is this one by NVIDIA (although it’s rather slow). SDK 10 includes a paper about the technique and also source code!
http://developer.download.nvidia.com/SDK/10.5/direct3d/samples.html

And here are three papers/presentations from NVIDIA describing their SSAO in detail:
http://developer.download.nvidia.com/presentations/2008/GDC/GDC08_Ambient_Occlusion.pdf
http://developer.download.nvidia.com/presentations/2008/SIGGRAPH/HBAO_SIG08b.pdf
http://developer.download.nvidia.com/SDK/10.5/direct3d/Source/ScreenSpaceAO/doc/ScreenSpace AO.pdf

SSAO in Starcraft II

Some information on how Starcraft II will use SSAO is included in this paper (see chapter 5.5):
http://ati.amd.com/developer/SIGGRAPH08/Chapter05-Filion-StarCraftII.pdf

SSAO in Two Worlds

A link to a description of the SSAO implementation in the game Two Worlds:
http://www.drobot.org/pub/GCDC_SSAO_RP_29_08.pdf

SSAO in Crysis

The paper and game that started it all (see chapter 8.5.4.3):
http://delivery.acm.org/10.1145/1290000/1281671/p97-mittring.pdf?key1=1281671&key2=9942678811&coll=ACM&dl=ACM&CFID=15151515&CFTOKEN=6184618

Hardware Accelerated Ambient Occlusion

One of the papers that probably inspired the Crysis team:
http://perumaal.googlepages.com/

Kindernoiser SSAO

A simple but smart SSAO implementation, with well-commented shader source code:
http://rgba.scenesp.org/iq/computer/articles/ssao/ssao.htm

A gamedev.net thread with lots of discussion about SSAO:
http://www.gamedev.net/community/forums/topic.asp?topic_id=463075