Computer Vision – The Integral Image

September 3, 2010 by Badgerati 19 Comments

The Integral Image, also known as the Summed Area Table, was first introduced in 1984, but wasn’t widely adopted in the world of Computer Vision until 2001, when Viola and Jones used it in the Viola-Jones Object Detection Framework.

The Integral Image is a quick and effective way of calculating the sum of values (pixel values) in a given image, or in any rectangular subset of a grid (the given image).

It is also used, perhaps mainly, for calculating the average intensity within a given image. If one wants to use the Integral Image, it is normally a wise idea to make sure the image is in greyscale first.

How does it work?

So, how does it work? I hear you cry. Well, in all honesty it really isn’t that hard to grasp! Sure, some of the equations may look daunting, but they really aren’t that hard…

So, let us start off with the basics. When creating an Integral Image, we need to create a Summed Area Table. If we go to any point (x,y) in this table, the entry there holds a value. This value is quite interesting, as it is the sum of all the pixel values above and to the left of (x,y), including the original pixel value of (x,y) itself:

$$s(x, y) = \sum_{x' \le x,\; y' \le y} i(x', y')$$
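To make that definition concrete, here is a minimal sketch that computes a single table entry the slow way, by adding up every pixel above and to the left. The function name and the 3×3 pixel values are made up for illustration, and the image is assumed to be stored as a list of rows.

```python
def summed_area_entry(image, x, y):
    """Sum every pixel at or above row y and at or left of column x."""
    total = 0
    for row in range(y + 1):
        for col in range(x + 1):
            total += image[row][col]  # image is indexed as image[row][column]
    return total


# Made-up pixel values, purely for illustration.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

print(summed_area_entry(image, 1, 1))  # 1 + 2 + 4 + 5 = 12
```

This direct approach re-adds the same pixels for every entry, which is exactly the repeated work the Summed Area Table avoids.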

What is really good about the Summed Area Table is that we can construct it with only one pass over the given image. There will be more on complexity later, but for this to work, all we have to do is accept that the value in the Summed Area Table at (x,y) is simply calculated by:

$$s(x, y) = i(x, y) + s(x-1, y) + s(x, y-1) - s(x-1, y-1)$$

That is, we get the original pixel value i(x,y) from the image, and then we add the Summed Area Table values directly to the left of this pixel, s(x-1, y), and directly above this pixel, s(x, y-1). Finally, we subtract the Summed Area Table value directly to the top-left of i(x,y), that is, s(x-1, y-1), because it has been counted twice.
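The one-pass construction can be sketched roughly as follows. This is only an illustration: the function name is made up, the image is assumed to be a list of rows, and any table entry outside the image is treated as 0. The 2×2 pixel values are those of the worked example that follows, with the layout (5 and 2 on the top row, 3 and 6 below) inferred from the text.

```python
def build_summed_area_table(image):
    """Build the Summed Area Table in a single pass over the image."""
    height = len(image)
    width = len(image[0]) if height else 0
    table = [[0] * width for _ in range(height)]

    def s(x, y):
        # Entries outside the table count as 0.
        if x < 0 or y < 0:
            return 0
        return table[y][x]

    for y in range(height):
        for x in range(width):
            # s(x,y) = i(x,y) + s(x-1,y) + s(x,y-1) - s(x-1,y-1)
            table[y][x] = image[y][x] + s(x - 1, y) + s(x, y - 1) - s(x - 1, y - 1)
    return table


image = [
    [5, 2],
    [3, 6],
]

print(build_summed_area_table(image))  # [[5, 7], [8, 16]]
```

Rows are filled top to bottom and left to right, so the three table entries each step needs have always been computed already.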

Here is a better example. Take the following image and its corresponding Summed Area Table:

On the left we have the given image, with its corresponding pixel values. On the right we have the image’s corresponding Summed Area Table. At the moment, we only have one value filled in: s(x-1, y-1) = 5.

Why is this? Well, taking the equation from above, we simply substitute in the values:

i(x,y) = 5 – this is the pixel value from the given image. The next values are from the Summed Area Table.
s(x-1, y) = 0 – why? Because x-1 is outside the image bounds, so it automatically has a value of 0.
s(x, y-1) = 0 – can you see why this one is? That’s right, same as above, but this time because of y-1.
s(x-1, y-1) = 0 – this one should be obvious by now.

All of the s(x’, y’) entries above are out-of-bounds of the Summed Area Table, so they all default to a value of 0. Therefore, 5+0+0-0 = 5, so s(x-1, y-1) gets a value of 5.

This assumes, of course, that the entry labelled s(x-1, y-1) is treated as s(x,y) in the equation above for the time being.
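One common way to get this "out-of-bounds is 0" behaviour without any explicit checks is to give the table an extra row and column of zeros. The sketch below assumes that padded layout, which is a design choice rather than anything the equation requires; the function name is hypothetical, and the pixel layout is inferred from the worked example.

```python
def build_padded_table(image):
    """Summed Area Table with an extra leading row and column of zeros."""
    height = len(image)
    width = len(image[0]) if height else 0
    # Row 0 and column 0 stay at 0 and stand in for the out-of-bounds entries.
    padded = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            # padded[y + 1][x + 1] plays the role of s(x, y).
            padded[y + 1][x + 1] = (
                image[y][x] + padded[y + 1][x] + padded[y][x + 1] - padded[y][x]
            )
    return padded


image = [
    [5, 2],
    [3, 6],
]

padded = build_padded_table(image)
print(padded[1][1])  # top-left entry: 5 + 0 + 0 - 0 = 5
```

The trade-off is that every index is shifted by one, so s(x, y) lives at `padded[y + 1][x + 1]`.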

Now, I am just going to substitute in the values for s(x, y-1) and s(x-1, y). It is up to YOU to check them and see if they are correct, as well as to see if you can use the equation correctly:

If you have actually attempted to calculate the values and ended up with the correct ones, well done! You may give yourselves a pat on the back!

This is what those values represent. Taking the s(x-1, y) entry, we have a value of 8. This value represents the sum of the pixels to the left, above and including itself. In this case we have the pixel itself with value 3, and using the equation, we use the entry in the Summed Area Table directly above (the only result that is not out-of-bounds), which has a value of 5. So 5+3=8, which IS the sum of the pixels left, above and including itself.

Now I will quickly show you the calculation for s(x,y) using all 4 values. If you struggled slightly with trying to solve the values for s(x-1, y) etc. then this should help you a little bit; otherwise, feel free to drop off a comment with any questions.

Here is the completed Summed Area Table:

Some of you may have already calculated it, but here’s what you do.

First of all, we sub in the values from the above tables:

i(x,y) = 6 – Remember, this value comes from the actual given image, which, as marked, is 6.
s(x-1, y) = 8 – This time we need the values in the Summed Area Table. This value here is 8.
s(x, y-1) = 7 – Same as above, so this time the value is 7.
s(x-1, y-1) = 5 – This time we can actually use this value, which is 5.

This time all values can be used, as there are no out-of-bounds results. So now, sticking these values into the equation we get: s(x,y) = 6 + 8 + 7 – 5 = 16. The question is, is this correct? Well yes, it is!

Remembering that 16 is the sum of all pixels top, left and itself, we add up the 4 pixel values in the actual given image: 5 + 2 + 3 + 6 = 16!! Amazing isn’t it!

What next?

Well, once you have used the equation to calculate and fill up your Summed Area Table, the task of calculating the sum of the pixels in any rectangle that is a subset of the original image can be done in constant time. Yes, that’s right, in O(1) complexity!

In order to do this we only need to use 4 values from the Summed Area Table, that is, 4 array references into the Summed Area Table. We then add or subtract these 4 values to get the sum of the pixels within that region. To do this, we use this equation:

$$\sum_{(x', y') \in \text{rectangle}} i(x', y') = s(A) + s(D) - s(B) - s(C)$$

This is fairly similar to the equation further above.
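As a rough sketch, the four-lookup rectangle sum might look like this. The function name and the inclusive left/top/right/bottom parameters are assumptions made for illustration; the table values are those of the completed 2×2 Summed Area Table from earlier, and entries outside the table count as 0.

```python
def rectangle_sum(table, left, top, right, bottom):
    """Sum of the pixels in columns left..right and rows top..bottom (inclusive)."""

    def s(x, y):
        # Entries outside the table count as 0.
        if x < 0 or y < 0:
            return 0
        return table[y][x]

    a = s(left - 1, top - 1)  # A: diagonally up-left of the rectangle
    b = s(right, top - 1)     # B: directly above the rectangle's top-right pixel
    c = s(left - 1, bottom)   # C: directly left of the rectangle's bottom-left pixel
    d = s(right, bottom)      # D: the rectangle's bottom-right pixel
    return a + d - b - c


# The completed Summed Area Table from the 2x2 example.
table = [
    [5, 7],
    [8, 16],
]

print(rectangle_sum(table, 1, 1, 1, 1))  # 5 + 16 - 7 - 8 = 6
print(rectangle_sum(table, 0, 0, 1, 1))  # whole image: 16
```

Note that A, B and C sit just outside the rectangle, which is why the lookups use `left - 1` and `top - 1`.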

Let us now have an example. Let us say that we wish to calculate the sum of the pixel values contained in the green square:

Remember, the value 16 is the total sum of all the squares. But we want just that green square. As you can already see, I have labelled A, B, C and D. This is what each of them is:

Firstly, we have s(A), which includes these squares:

So s(A) is just the green square, which has value 5. Next we have s(B):

Now, s(B) is the value 7, because this is the value of the sum of the values up to that point. s(C) looks fairly similar:

As for s(D), this is the sum of all the values up to that point:

So, with all of this, we have the values:

A = 5
B = 7
C = 8
D = 16

With this, we can substitute them all into the equation:

i(x’, y’) = s(A) + s(D) – s(B) – s(C) = 5 + 16 – 7 – 8 = 6

Therefore, we are left with the value 6. Think it’s wrong? Think again! Go back to the original image pixel values. Now look at the bottom-right pixel value. What’s that, a 6! See told you it worked!

Bigger Example

I am not going to explain the whole process for this; it’s up to you to work it out. But here is an example of a bigger original image (4×4), with its corresponding Summed Area Table. The final 5 images are for calculating the area enclosed by the A, B, C and D labels.

Original Image

Summed Area Table

Calculating an Area

This is the area we want.

This is what A, B, C and D correspond to.

Remember to use A+D-B-C. If you do everything correctly you should get the value 16.

Complexity

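We have already touched a little bit on this. To recap: the Summed Area Table is built in a single pass over the image, so constructing it takes time proportional to the number of pixels. Once it is built, the sum of any rectangle can be evaluated in O(1) [constant] time using four table lookups, rather than by adding up every pixel inside the rectangle, which for an n×n region takes O(n^2) [quadratic] time.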
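To see the difference in cost, the sketch below counts how many values each approach reads for the same rectangle. The 4×4 pixel values and all function names are made up for illustration, and the table uses an extra row and column of zeros so that no bounds checks are needed.

```python
def build_table(image):
    """One pass over the image: cost grows with the number of pixels."""
    height, width = len(image), len(image[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]  # zero-padded
    for y in range(height):
        for x in range(width):
            table[y + 1][x + 1] = (
                image[y][x] + table[y + 1][x] + table[y][x + 1] - table[y][x]
            )
    return table


def naive_sum(image, left, top, right, bottom):
    """Reads every pixel in the rectangle; returns (sum, pixels read)."""
    total = 0
    reads = 0
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            total += image[y][x]
            reads += 1
    return total, reads


def table_sum(table, left, top, right, bottom):
    """Always reads four table entries; returns (sum, entries read)."""
    total = (
        table[top][left]                # A
        + table[bottom + 1][right + 1]  # D
        - table[top][right + 1]         # B
        - table[bottom + 1][left]       # C
    )
    return total, 4


# Made-up pixel values, purely for illustration.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
table = build_table(image)

print(naive_sum(image, 0, 0, 3, 3))  # (136, 16): one read per pixel
print(table_sum(table, 0, 0, 3, 3))  # (136, 4): four reads regardless of size
```

The one-off cost of `build_table` pays for itself as soon as many rectangles are queried on the same image.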

Notice: as time goes on this post will probably get further improved.

Main Sources

[1] Wikipedia: Summed Area Table

[2] Viola-Jones Object Detection Framework

[3] Integral Image-based Representations, paper by Konstantinos G. Derpanis, July 14, 2007 (PDF)

[4] An introduction to the theory behind the integral image algorithm (YouTube)

[5] Badgerati Tutorials – The Integral Image



19 Responses to Computer Vision – The Integral Image

Paulo says:

May 13, 2011 at 12:05 am

Simple explanation. Nice effort in putting this together!
I guess that in the equation after “What’s next”, S(C) should be swapped with S(D).

- Badgerati says:
  
  May 13, 2011 at 12:16 am
  
  Thank you very much!
  And yes, you’re correct, S(C) and S(D) should be swapped – my bad. I shall fix this ASAP, it’s a wonder how I missed that error in the first place. Thank you for pointing it out to me!
  
rot says:

June 19, 2011 at 8:19 pm

Thnx !!!

Vagelis says:

July 8, 2011 at 12:04 pm

beautifully simple!

Silver bullet says:

July 17, 2011 at 11:12 am

This is a very clear and useful tutorial, thank you!

A little typo:
“Therefore, 5+0+0-1 = 5. So s(x-1, y-1) gets a value of 5.”
It should be 5+0+0-0 🙂

- Badgerati says:
  
  November 2, 2011 at 8:16 pm
  
  Thank you very much! 🙂 And thanks for pointing out that typo; I’ve fixed it!
  
Rhonda says:

November 2, 2011 at 8:10 pm

Imagine?

- Badgerati says:
  
  November 2, 2011 at 8:18 pm
  
  That should say “image”. Thanks for pointing it out; I’ve fixed it now.
  
Ankush says:

November 18, 2011 at 5:46 pm

Thnx for such a simple explanation………..it helped me a lot.

royamkhan says:

December 26, 2011 at 8:08 pm

thanks a lot dear , u made it very easy to understand

Sarhat says:

March 6, 2012 at 8:07 pm

It is great Job, thank you very very much…….

Anton Averich says:

April 17, 2012 at 4:48 pm

At last I’ve understood this! Thank you for a good explanation! 🙂

Mani says:

April 26, 2012 at 9:00 pm

Thanks guys for this great tutorial… my head was blowing off… loved your tuts..

Jason says:

June 7, 2012 at 4:41 pm

Great explanation!

Onur says:

October 1, 2012 at 12:56 pm

Clear explanation, very helpul. Thanks for your effort.

Javier C. says:

October 23, 2012 at 10:08 pm

Great explanation! My head almost exploded when I saw the math formulas in some papers, now Im not afraid of them anymore xD.

metritos says:

October 31, 2012 at 4:39 pm

thank you !!! you really make it simple to understand !!!

gautam says:

February 19, 2014 at 5:53 pm

thanks sir!!! nice explanation..
