Computer Vision – The Integral Image


The Integral Image or Summed Area Table, was first introduced to us in 1984, but wasn’t properly introduced to the world of Computer Vision till 2001 by Viola and Jones with the Viola-Jones Object Detection Framework.

The Integral Image is used as a quick and effective way of calculating the sum of values (pixel values) in a given image – or a rectangular subset of a grid (the given image).

It can also, or is mainly, used for calculating the average intensity within a given image. If one wants to use the Integral Image, it is normally a wise idea to make sure the image is in greyscale first.

How does it work?

So, how does it work? I hear you cry. Well, in all honesty it really isn’t that hard to get the grasp of! Sure some of the equations may look daunting, but they really aren’t that hard…

So, let us start off with the basics. When creating an Integral Image, we need to create a Summed Area Table. In this table, if we go to any point (x,y) then at this table entry we will come across a value. This value itself is quite interesting, as it is the sum of all the pixel values above, to the left and of course including the original pixel value of (x,y) itself:

IISumEx

What is really good about the Summed Area Table, is that we are actually able to construct it with only one pass over of the given image. There will be more on complexity later, but, in order for this fact to become true, all we have to do is accept that the value in the Summed Area Table at (x,y) is simply calculated by:

sxyEq

That is, we get the original pixel value i(x,y) form the image, and then we add the values directly above this pixel, and directly left to this pixel from the Summed Area Table at s(x-1, y) and s(x, y-1). Finally, we subtract the value directly top-left of i(x,y) from the Summed Area Table – that is, s(x-1, y-1).

Here is a better example, take the following image and its corresponding Summed Area Table:

adding_1

On the left we have the given image, with its corresponding pixel values. On the right we have the images corresponding Summed Area Table. AT this current moment in time, we only have one value filled in, that is to say, we have filled in s(x-1, y-1) = 5.

Why is this? Well, taking the equation from above. We simply substitute in values:

  • i(x,y) = 5          –   this is the pixel value from the given image. The next values are from the Summed Area Table.
  • s(x-1, y) = 0    –  why? because x-1 is outside the image bounds, so it is automatically a value of 0.
  • s(x, y-1) = 0    – can you see why this one is? That’s right, same as above, but this time because of y-1.
  • s(x-1, y-1) = 0 –  this one should be obvious by now.

In the case of all s(x’, y’) above, they were all out-of-bounds inside the Summed Area Table. So are all defaulted to a value of 0. Therefore, 5+0+0-0 = 5. So s(x-1, y-1) gets a value of 5.

Assuming here of course that s(x-1, y-1) is s(x,y) for the time being for the equation above.

Now, I am just going to substitute in the values for s(x, y-1) and s(x-1, y). It is up to YOU to check them, and see if they’re are correct – as well as to see if you can use the equation correctly:

adding_2

If you have actually attempted to calculate the values, and ended up with the correct ones. Well done! You may give yourselves a pat on the back!

This is what those values represent. taking the s(x-1, y) entry, we have a value of 8. This value represents the sum of the pixels to the left, above and including itself. In this case we have the pixel itself with value 3, and using the equation, we use the entry in the Summed Area Table directly above (only none out-of-bounds result), which has a value of 5. So 5+3=8, which IS the sum of the pixels left, above and including itself.

But, now I will show you quickly the calculation for s(x,y) using 4 values. If you struggled slightly with trying to solve the values for s(x-1, y) etc. then this should help you a little bit; otherwise, feel free to drop off a comment with any questions.

Here, is the Summed Area Table completed:

adding_3

Some of you may have already calculated it, but here’s what you do.

First of all, we sub in the values from the above tables:

  • i(x,y) = 6         –  Remember, this value comes from the actual given image. Which is as marked, 6.
  • s(x-1, y) = 8   –  This time we need the values in the Summed Area Table. this value here is 8.
  • s(x, y-1) = 7   –  Same as above, so this time the value is 7.
  • s(x-1, y-1) = 5  –  This time we can actually use this value, which is a value of 5

This time all values can be used, as there are no out-of-bounds results. So now, sticking these values into the equation we get: s(x,y) = 6 + 8 + 7 – 5  =  16. The question is, is this correct? Well yes, it is!

Remembering that 16 is the sum of all pixels top, left and itself, we add up the 4 pixel values in the actual given image: 5 + 2 + 3 + 6 = 16!! Amazing isn’t it!

What next?

Well, once you have used the equation to calculate and fill up your Summed Area Table, the the task of calculating the sum of pixels in some rectangle which is a subset of the original image can be done in constant time. Yes, that’s right, in O(1) complexity!

In order to do this we only need to use 4 values from the Summed Area Table, that is, 4 array references into the Summed Area Table. With these 4 values, we then add or subtract them for the correct value of the sum of the pixels within that region. To do this, we use this equation:

regionadding_2

This is fairly similar to the equation further above.

Let us now have an example. Let us say that we wish  to calculate the area contained by the green square:

IIexeg

Remember, the value 16 is the total sum of all the squares. But we want just that green square. As you can already see I have labelled on the A, B, C and D. This is what each of them are:

Firstly, we have s(A), which includes these squares:

IIex1

So s(A) is just the green square, which has value 5. Next we have s(B):

IIex2

Now, s(B) is the value 7, because this is the value of the sum of the values up to that point. s(C) looks fairly similar:

IIex3

As for s(D), this is the sum of all the values up to that point:

IIex4

So, with all of this, we have the values:

  • A = 5
  • B = 7
  • C = 8
  • D = 16

With this, we can substitute them all into the equation for,

  • i(x’, y’)   =   s(A) + s(D) – s(B) – s(C)   =   5 + 16 – 7 – 8   =  6

Therefore, we are left with the value 6. Think it’s wrong? Think again! Go back to the original image pixel values. Now look at the bottom-right pixel value. What’s that, a 6! See told you it worked!

Bigger Example

I am not going to explain the whole process for this, it’s up to you to work it out. But here is an example of a bigger original image (4×4), with its corresponding Summed Area Table. The final 5 images are for calculating that area enclosed in A, B,C and D labels.

Original Image

bigexampleimage

Summed Area Table

bigegSAT

Calculating an Area

This is the area we want.

areacalcbig1

This is what A, B, C and D correspond to.

areacalcbig2

areacalcbig3

areacalcbig4

areacalcbig5

Remember to use A+D-B-C. If you do everything correctly you should get the value 16.

Complexity

We have already touched a little bit on this. But as mentioned previously, the complexity for evaluating the Summed Area Table can be done in O(1) [constant] rather than in O(n^2) [quadratic].

Notice: as time goes on this post will probably get further improved.

Main Sources

[1] Wikipedia: Summed Area Table

[2] Viola-Jones Object Detection Framework

[3] Integral Image-based Representations paper by Konstantinos G. Derpanis. in July 14, 2007 (PDF)

[4] An introduction to the theory behind the integral image algorithm (YouTube)

[5] Badgerati Tutorials – The Integral Image

About Badgerati
Computer Scientist, Games Developer, and DevOps Engineer. Fantasy and Sci-fi book lover, also founder of Cadaeic Studios.

19 Responses to Computer Vision – The Integral Image

  1. Paulo says:

    Simple explanation. Nice effort in putting this together!
    I guess that in the equation after “What’s next”, S(C) should be swapped with S(D).

    • Badgerati says:

      Thank you very much!
      And yes, you’re correct, S(C) and S(D) should be swapped – my bad. I shall fix this ASAP, it’s a wonder how I missed that error in the first place. Thank you for pointing it out to me!

  2. rot says:

    Thnx !!!

  3. Vagelis says:

    beautifully simple!

  4. Silver bullet says:

    This is a very clear and useful tutorial, thank you!

    A little typo:
    “Therefore, 5+0+0-1 = 5. So s(x-1, y-1) gets a value of 5.”
    It should be 5+0+0-0 🙂

  5. Rhonda says:

    Imagine?

  6. Ankush says:

    Thnx for such a simple explanation………..it helped me a lot.

  7. royamkhan says:

    thanks a lot dear , u made it very easy to understand

  8. Sarhat says:

    It is great Job, thank you very very much…….

  9. At last I’ve understood this! Thank you for a good explanation! 🙂

  10. Mani says:

    Thanks guys for this great tutorial… my head was blowing off… loved your tuts..

  11. Jason says:

    Great explanation!

  12. Onur says:

    Clear explanation, very helpul. Thanks for your effort.

  13. Javier C. says:

    Great explanation! My head almost exploded when I saw the math formulas in some papers, now Im not afraid of them anymore xD.

  14. metritos says:

    thank you !!! you really make it simple to understand !!!

  15. Pingback: iOS – C/C++ – Speed up Integral Image calculation | QueryPost.com

  16. gautam says:

    thanks sir!!! nice explanation..

Leave a reply to Jason Cancel reply