SSAO

This is my SSAO (Screen Space Ambient Occlusion) implementation. It is fast and gives good results. It’s inspired by the Crysis SSAO algorithm, by the Starcraft II implementation, and a little by Nvidia’s implementation.

To use the SSAO shader, render a fullscreen quad with the SSAO shader applied to it. The shader traces 10-16 rays for every fragment on the screen (a fragment is a pixel when there is no multisampling). Each ray is shot in a random direction inside the hemisphere around the normal of the current fragment. Texture lookups at the ray end points give the depth and normal of the traced fragments, which are compared with those of the current fragment. The comparison is a simple step formula, and the SSAO term is evaluated from its result. This term is written to the red channel of the output texture.

The result must be blurred before it is combined with the original scene render. The blur can, for example, be a bilateral blur. Both the SSAO pass and the blur can run at a much lower resolution than the screen (half or a quarter of the size) to save some clock cycles. After blurring, upsampling the result to full screen size gives some extra blur for free.
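
The blur step could look something like the sketch below: a one-directional, depth-aware (bilateral) blur that is run twice, once horizontally and once vertically. This is not part of the original shader. The sampler name aoMap, the uniforms texelStep and depthSharpness, and the 9-tap kernel are assumptions made for illustration. It reuses the normalMap layout described in this post (linear depth in the alpha channel).

```glsl
#version 120
// Sketch of a separable, depth-aware (bilateral) blur for the AO buffer.
// Hypothetical names: aoMap, texelStep, depthSharpness.
uniform sampler2D aoMap;      // raw SSAO term in the red channel
uniform sampler2D normalMap;  // linear depth (0..1) in the alpha channel
uniform vec2 texelStep;       // (1/width, 0) for the horizontal pass, (0, 1/height) for the vertical pass
uniform float depthSharpness; // assumed tuning knob: higher values preserve depth edges more strongly
varying vec2 uv;

void main(void)
{
   float centerDepth = texture2D(normalMap, uv).a;
   float sum = 0.0;
   float weightSum = 0.0;

   for (int i = -4; i <= 4; ++i)
   {
      vec2 tapUv = uv + float(i) * texelStep;
      float tapDepth = texture2D(normalMap, tapUv).a;
      float tapAo = texture2D(aoMap, tapUv).r;

      // spatial weight: Gaussian falloff (sigma = 2 taps) with distance from the center tap
      float spatial = exp(-float(i * i) / 8.0);

      // range weight: taps at a very different depth contribute less,
      // which keeps the AO from bleeding across silhouettes
      float dz = (tapDepth - centerDepth) * depthSharpness;
      float range = exp(-dz * dz);

      float w = spatial * range;
      sum += tapAo * w;
      weightSum += w;
   }

   // weightSum is never zero: the center tap always has weight 1.0
   gl_FragColor.r = sum / weightSum;
}
```

If the AO buffer is rendered at half or quarter resolution, texelStep should be derived from the size of that buffer rather than from the screen size.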

The combination with the original screen can be as simple as just multiplying this AO term with the already rendered screen.
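
A minimal sketch of that multiplication as a fullscreen pass is shown here. The sampler names sceneMap and aoMap are hypothetical, and the clamp is an added precaution rather than something the original shader does.

```glsl
#version 120
// Sketch of the final combine pass. Hypothetical names: sceneMap, aoMap.
uniform sampler2D sceneMap; // the already rendered scene
uniform sampler2D aoMap;    // blurred SSAO term in the red channel
varying vec2 uv;

void main(void)
{
   vec4 scene = texture2D(sceneMap, uv);

   // clamp because a large totStrength can push the AO term below 0.0
   // when the AO buffer is a float texture
   float ao = clamp(texture2D(aoMap, uv).r, 0.0, 1.0);

   // darken the color, leave alpha untouched
   gl_FragColor = vec4(scene.rgb * ao, scene.a);
}
```

Sampling a low-resolution aoMap with linear filtering in this pass is also where the upsampling to full screen size happens.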

SSAO enabled on the left, SSAO disabled on the right

The shader writes the AO term to the red channel only, so render it to a single-channel texture to save memory.
The SSAO term is saved only in the red channel
The shader requires that the normalMap sampler holds a renderable texture with the normals of the scene in screen space in the RGB channels, and the linear depth (scaled to 0..1) in the A channel. The shader reads the normals as they are, without remapping them from 0..1 to -1..1, so store them in a float texture. NOTE: These maps are not the same maps as when doing deferred lighting. For example, the normals should be pure face normals rendered to the screen (no normal mapping).
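
A pre-pass that fills such a texture might look like the following pair of shaders. This is only a sketch under stated assumptions: it takes "screen space" to mean view (eye) space, it uses the distance to the far clip plane (a hypothetical farClip uniform) to scale the depth to 0..1, and it renders into a float RGBA target. The varying names are made up for the example.

```glsl
#version 120
// Pre-pass vertex shader (sketch). farClip is a hypothetical uniform.
uniform float farClip;     // distance to the far clip plane, in view-space units
varying vec3 viewNormal;
varying float linearDepth;

void main(void)
{
   vec4 viewPos = gl_ModelViewMatrix * gl_Vertex;

   // geometric normal only, no normal mapping
   viewNormal = gl_NormalMatrix * gl_Normal;

   // the camera looks down -Z in view space, so negate z to get a positive distance
   linearDepth = -viewPos.z / farClip;

   gl_Position = gl_ProjectionMatrix * viewPos;
}
```

```glsl
#version 120
// Pre-pass fragment shader (sketch), paired with the vertex shader above.
varying vec3 viewNormal;
varying float linearDepth;

void main(void)
{
   // RGB: unit-length normal stored as-is (a float target keeps the negative components)
   // A:   linear depth in 0..1
   gl_FragColor = vec4(normalize(viewNormal), clamp(linearDepth, 0.0, 1.0));
}
```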
The rnm sampler should hold a texture with random normals in the RGB channels, stored in the 0..1 range (the shader remaps them to -1..1). It can, for example, be the picture below.
Random normals

These are the parameter values that worked well for me (the best values differ from setup to setup):

uniform float totStrength = 1.38;
uniform float strength = 0.07;
uniform float offset = 18.0;
uniform float falloff = 0.000002;
uniform float rad = 0.006;

Here’s the SSAO GLSL fragment shader:

uniform sampler2D rnm;
uniform sampler2D normalMap;
varying vec2 uv;
uniform float totStrength;
uniform float strength;
uniform float offset;
uniform float falloff;
uniform float rad;
#define SAMPLES 16 // 10 is also good; if you change this, switch pSphere below to the array of the same size
const float invSamples = 1.0/float(SAMPLES);
void main(void)
{
// these are the random vectors inside a unit sphere
vec3 pSphere[16] = vec3[](vec3(0.53812504, 0.18565957, -0.43192),vec3(0.13790712, 0.24864247, 0.44301823),vec3(0.33715037, 0.56794053, -0.005789503),vec3(-0.6999805, -0.04511441, -0.0019965635),vec3(0.06896307, -0.15983082, -0.85477847),vec3(0.056099437, 0.006954967, -0.1843352),vec3(-0.014653638, 0.14027752, 0.0762037),vec3(0.010019933, -0.1924225, -0.034443386),vec3(-0.35775623, -0.5301969, -0.43581226),vec3(-0.3169221, 0.106360726, 0.015860917),vec3(0.010350345, -0.58698344, 0.0046293875),vec3(-0.08972908, -0.49408212, 0.3287904),vec3(0.7119986, -0.0154690035, -0.09183723),vec3(-0.053382345, 0.059675813, -0.5411899),vec3(0.035267662, -0.063188605, 0.54602677),vec3(-0.47761092, 0.2847911, -0.0271716));
//const vec3 pSphere[8] = vec3[](vec3(0.24710192, 0.6445882, 0.033550154),vec3(0.00991752, -0.21947019, 0.7196721),vec3(0.25109035, -0.1787317, -0.011580509),vec3(-0.08781511, 0.44514698, 0.56647956),vec3(-0.011737816, -0.0643377, 0.16030222),vec3(0.035941467, 0.04990871, -0.46533614),vec3(-0.058801126, 0.7347013, -0.25399926),vec3(-0.24799341, -0.022052078, -0.13399573));
//const vec3 pSphere[12] = vec3[](vec3(-0.13657719, 0.30651027, 0.16118456),vec3(-0.14714938, 0.33245975, -0.113095455),vec3(0.030659059, 0.27887347, -0.7332209),vec3(0.009913514, -0.89884496, 0.07381549),vec3(0.040318526, 0.40091, 0.6847858),vec3(0.22311053, -0.3039437, -0.19340435),vec3(0.36235332, 0.21894878, -0.05407306),vec3(-0.15198798, -0.38409665, -0.46785462),vec3(-0.013492276, -0.5345803, 0.11307949),vec3(-0.4972847, 0.037064247, -0.4381323),vec3(-0.024175806, -0.008928787, 0.17719103),vec3(0.694014, -0.122672155, 0.33098832));
//const vec3 pSphere[10] = vec3[](vec3(-0.010735935, 0.01647018, 0.0062425877),vec3(-0.06533369, 0.3647007, -0.13746321),vec3(-0.6539235, -0.016726388, -0.53000957),vec3(0.40958285, 0.0052428036, -0.5591124),vec3(-0.1465366, 0.09899267, 0.15571679),vec3(-0.44122112, -0.5458797, 0.04912532),vec3(0.03755566, -0.10961345, -0.33040273),vec3(0.019100213, 0.29652783, 0.066237666),vec3(0.8765323, 0.011236004, 0.28265962),vec3(0.29264435, -0.40794238, 0.15964167));
   // grab a normal for reflecting the sample rays later on
   vec3 fres = normalize((texture2D(rnm,uv*offset).xyz*2.0) - vec3(1.0));
 
   vec4 currentPixelSample = texture2D(normalMap,uv);
 
   float currentPixelDepth = currentPixelSample.a;
 
   // current fragment coords in screen space
   vec3 ep = vec3(uv.xy,currentPixelDepth);
   // get the normal of current fragment
   vec3 norm = currentPixelSample.xyz;

   float bl = 0.0;
   // adjust the sampling radius for the depth (not sure if this is good..)
   float radD = rad/currentPixelDepth;

   vec3 ray, se, occNorm;
   float depthDifference, normDiff;
 
   for(int i=0; i<SAMPLES;++i)
   {
      // take a sample vector (random point inside a sphere with radius 1.0) and reflect it about the random normal
      ray = radD*reflect(pSphere[i],fres);

      // if the ray is outside the hemisphere then change direction
      se = ep + sign(dot(ray,norm) )*ray;

      // get the depth of the occluder fragment
      vec4 occluderFragment = texture2D(normalMap,se.xy);

      // get the normal of the occluder fragment
      occNorm = occluderFragment.xyz;

      // if depthDifference is negative = occluder is behind current fragment
      depthDifference = currentPixelDepth-occluderFragment.a;

      // calculate the difference between the normals as a weight
      normDiff = (1.0-dot(occNorm,norm));

      // the falloff equation: starts at falloff and fades out smoothly towards strength
      bl += step(falloff,depthDifference)*normDiff*(1.0-smoothstep(falloff,strength,depthDifference));
   }
 
   // output the result
   float ao = 1.0-totStrength*bl*invSamples;
   gl_FragColor.r = ao;
 
}
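
To make the falloff term in the loop easier to read, the per-sample weight can be pulled out into a small function. This is just a restatement of the expression above, not a change to the algorithm. The function name occlusionWeight and its parameter names are hypothetical.

```glsl
// Per-sample occlusion weight from the loop above, written as a function (sketch).
// nearEdge corresponds to the falloff uniform, farEdge to the strength uniform.
float occlusionWeight(float depthDifference, float normDiff, float nearEdge, float farEdge)
{
   // 0.0 unless the occluder is at least nearEdge in front of the current fragment
   float inFront = step(nearEdge, depthDifference);

   // 1.0 at depthDifference == nearEdge, fading smoothly to 0.0 at depthDifference >= farEdge
   float rangeFade = 1.0 - smoothstep(nearEdge, farEdge, depthDifference);

   // surfaces facing the same way as the current fragment (normDiff near 0.0) occlude very little
   return inFront * normDiff * rangeFade;
}
```

Inside the loop this would be called as bl += occlusionWeight(depthDifference, normDiff, falloff, strength); so with the values listed earlier only occluders between 0.000002 and 0.07 depth units in front of the fragment contribute.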

This is an optimized version of the same shader. It is a little harder to read and understand:

uniform sampler2D rnm;
uniform sampler2D normalMap;
varying vec2 uv;
const float totStrength = 1.38;
const float strength = 0.07;
const float offset = 18.0;
const float falloff = 0.000002;
const float rad = 0.006;
#define SAMPLES 10 // 10 is good
const float invSamples = -totStrength/float(SAMPLES); // totStrength and the sign are folded into the constant
void main(void)
{
// these are the random vectors inside a unit sphere
vec3 pSphere[10] = vec3[](vec3(-0.010735935, 0.01647018, 0.0062425877),vec3(-0.06533369, 0.3647007, -0.13746321),vec3(-0.6539235, -0.016726388, -0.53000957),vec3(0.40958285, 0.0052428036, -0.5591124),vec3(-0.1465366, 0.09899267, 0.15571679),vec3(-0.44122112, -0.5458797, 0.04912532),vec3(0.03755566, -0.10961345, -0.33040273),vec3(0.019100213, 0.29652783, 0.066237666),vec3(0.8765323, 0.011236004, 0.28265962),vec3(0.29264435, -0.40794238, 0.15964167));
 
   // grab a normal for reflecting the sample rays later on
   vec3 fres = normalize((texture2D(rnm,uv*offset).xyz*2.0) - vec3(1.0));
 
   vec4 currentPixelSample = texture2D(normalMap,uv);
 
   float currentPixelDepth = currentPixelSample.a;
 
   // current fragment coords in screen space
   vec3 ep = vec3(uv.xy,currentPixelDepth);
   // get the normal of current fragment
   vec3 norm = currentPixelSample.xyz;

   float bl = 0.0;
   // adjust the sampling radius for the depth (not sure if this is good..)
   float radD = rad/currentPixelDepth;

   float depthDifference;
   vec4 occluderFragment;
   vec3 ray;
   for(int i=0; i<SAMPLES;++i)
   {
      // take a sample vector (random point inside a sphere with radius 1.0) and reflect it about the random normal
      ray = radD*reflect(pSphere[i],fres);

      // get the depth of the occluder fragment
      occluderFragment = texture2D(normalMap,ep.xy + sign(dot(ray,norm) )*ray.xy);
      // if depthDifference is negative = occluder is behind current fragment
      depthDifference = currentPixelDepth-occluderFragment.a;

      // the difference between the normals is used as a weight;
      // the falloff equation starts at falloff and fades out smoothly towards strength
      bl += step(falloff,depthDifference)*(1.0-dot(occluderFragment.xyz,norm))*(1.0-smoothstep(falloff,strength,depthDifference));
   }
 
   // output the result
   gl_FragColor.r = 1.0+bl*invSamples;
 
}

To use the SSAO effect, render a fullscreen quad over the screen with the following vertex shader and the previous SSAO fragment shader.

varying vec2  uv;
 
void main(void)
{
   gl_Position = ftransform();
   // push each vertex of the quad out to a corner of the screen (coordinates become -1 or +1)
   gl_Position = sign( gl_Position );

   // Texture coordinate for screen aligned (in correct range):
   uv = (vec2( gl_Position.x, - gl_Position.y ) + vec2( 1.0 ) ) * 0.5;
}

Here’s the source for a school project we did in six weeks (working half-time). It’s a simple FPS game, but it does demonstrate the SSAO. Use the “O” key to toggle between the different SSAO modes. Unfortunately, the source is hard to build since it uses a lot of third-party libraries (Ogre3D, boost, fmod, ode …).

http://sourceforge.net/svn/?group_id=244295

And here’s the compiled version. Just unzip it and run the exe.

Download zip


29 thoughts on “SSAO”

  1. Paul Usul Fluegel

    Fantastic! was looking forward to when you would do this one :)

    Thanks for all the good comments and hard work, hope i can learn from this.

  2. gcard28

    Yes a release of a C++ demo with source would be superb I would love to see this running on my computer. Excellent work and website!

  3. Alex

    I can’t get it to work correctly. It’s possible my normals are transformed to the wrong space. Also, I’m not sure what it means to scale the linear depth from 0 to 1.

    Thanks :)

  4. admin Post author

    If you use RenderMonkey, you can see what the different passes look like. Very good for debugging :)
    The only problem with it is that it doesn’t handle multiple render targets or float textures very well.

  5. Oscar

    Hi! Your work looks really amazing. I’m starting to read the same documents to implement SSAO too. I have some doubts about your code. First of all, let me say I’m used to working with HLSL. I’ve never done anything with GLSL (just to give some background).

    About the normals: when you say you render them in view space, do you mean multiplied by the WorldView, or by the WorldViewProjection? Then, when you read the normals from your normal map, why you don’t resize them from (0,1) to (-1,1) (That’s maybe some trick with GLSL?)

    I guess you’re working all the time in screen space, never in view space. Correct me if I’m wrong. Then, what does this mean (varying vec2 uv)? Is that the interpolated view direction vector?

    And here I get lost about reading directly from screen space.

    // current fragment coords in screen space
    vec3 ep = vec3(uv.xy,currentPixelDepth);

    // get the depth of the occluder fragment
    vec4 occluderFragment = texture2D(normalMap,se.xy);

    Sorry if my post is so long, but this info will be helpful. Thanks in advance!

  6. fang

    Hi! Thanks for sharing these helpful codes.

    The second shader you posted is the optimized version, but the only difference from the first version I can see is that you squeezed some math expressions together into function parameters, saving the use of several variables. Are there other optimizations I missed?

    I translated both versions into HLSL and found out that the optimization I mentioned above does not decrease the instruction slot count in the assembly code, and both versions render very slowly (~12fps on my nVidia Quadro NVS 210S). Is it because my card is really that bad, or because I didn’t translate the code correctly? Some performance benchmark information would be great. Thanks again!

  7. fang

    PS.

    My implementation in HLSL ended up in an assembly shader of as many as 346 instructions when sample ray number is 16. What’s your number?

    1. admin Post author

      I haven’t checked since I was more concerned about the real performance instead of number of instructions when doing this shader.

  8. admin Post author

    – Then, when you read the normals from your normal map, why you don’t resize them from (0,1) to (-1,1) (That’s maybe some trick with GLSL?)

    I don’t need to resize them since I save the normals in a float texture.

  9. keaton

    I’m not sure I understand how “the combination with the original screen can be as simple as just multiplying this AO term with the already rendered screen”. Presumably this applies only if the scene uses no direct lighting?

    1. admin Post author

      First of all, SSAO is a crude approximation so you’re free to do whatever you want. But in real-life, there can be ambient occlusion even on a direct lit surface.

  10. david

    The SSAO is not working for me when I press “O”, also I can’t find this shader in SVN revision 120

  11. Edgar

    Hi, I don’t understand this line?

    for(int i=0; i<SAMPLES;++i)

    whats means i&lt?

    Thanks

  12. xahir

    You do convert normals from [0, 1] range to [-1, 1] range (both fres and norm) but you do not convert occNorm. This causes some strange artifacts. Possible fix:
    occNorm = (occluderFragment.xyz * 2.0) - vec3(1.0);

  13. Rambler.JW

    er,I wanna say many words. However,My English is very poor. So, I just could say thanks for your sharing. I’ll often go here,I need learn from you.

  14. Anthony

    Really great help! Thanks for sharing your code. I have implemented it in a school project I am working on, but I get quite a bit of Moire patterning as opposed to the noise I have seen in Starcraft 2’s implementation. Any ideas where I should be looking to try to remedy this so that I can get a smoother blur? I have been reading some of the papers but have had no luck figuring it out yet.

    Thanks again!

  15. Felix

    To the poster with the Moire problems that is most likely a precision problem. Make sure you are using at least 16-bit textures when you are rendering the SSAO.

  16. Pingback: Screen Space Ambient Occlusion | AnKi 3D Engine

  17. Gio

    How is the optimised version different to the first one? They look the same to me… harder to read for sure, but it seems to boil down to exactly the same instructions. Unless I’m missing something?

  18. Pingback: Applying SSAO to scenes « Electronic Meteor

  19. sh code

    you’ve got some pretty weak FPS there, matey :-D
    (sorry, couldn’t help it)

    very good tutorial nevertheless, thank you.

  20. mincomp

    First of all, thanks for your code :) but sampling in world space then just use xy components to sample again in image space looks weird to me. Though the 3d vector is in the hemisphere, the xy components might mean nothing, and can be randomly distributed.

  21. Alex

    I really enjoyed the fact that you named a vector “se” and then later get the “xy” values from it.
