Computer Vision – The Integral Image
September 3, 2010 19 Comments
The Integral Image or Summed Area Table, was first introduced to us in 1984, but wasn’t properly introduced to the world of Computer Vision till 2001 by Viola and Jones with the Viola-Jones Object Detection Framework.
The Integral Image is used as a quick and effective way of calculating the sum of values (pixel values) in a given image – or a rectangular subset of a grid (the given image).
It can also, or is mainly, used for calculating the average intensity within a given image. If one wants to use the Integral Image, it is normally a wise idea to make sure the image is in greyscale first.
How does it work?
So, how does it work? I hear you cry. Well, in all honesty it really isn’t that hard to get the grasp of! Sure some of the equations may look daunting, but they really aren’t that hard…
So, let us start off with the basics. When creating an Integral Image, we need to create a Summed Area Table. In this table, if we go to any point (x,y) then at this table entry we will come across a value. This value itself is quite interesting, as it is the sum of all the pixel values above, to the left and of course including the original pixel value of (x,y) itself:
What is really good about the Summed Area Table, is that we are actually able to construct it with only one pass over of the given image. There will be more on complexity later, but, in order for this fact to become true, all we have to do is accept that the value in the Summed Area Table at (x,y) is simply calculated by:
That is, we get the original pixel value i(x,y) form the image, and then we add the values directly above this pixel, and directly left to this pixel from the Summed Area Table at s(x-1, y) and s(x, y-1). Finally, we subtract the value directly top-left of i(x,y) from the Summed Area Table – that is, s(x-1, y-1).
Here is a better example, take the following image and its corresponding Summed Area Table:
On the left we have the given image, with its corresponding pixel values. On the right we have the images corresponding Summed Area Table. AT this current moment in time, we only have one value filled in, that is to say, we have filled in s(x-1, y-1) = 5.
Why is this? Well, taking the equation from above. We simply substitute in values:
- i(x,y) = 5 – this is the pixel value from the given image. The next values are from the Summed Area Table.
- s(x-1, y) = 0 – why? because x-1 is outside the image bounds, so it is automatically a value of 0.
- s(x, y-1) = 0 – can you see why this one is? That’s right, same as above, but this time because of y-1.
- s(x-1, y-1) = 0 – this one should be obvious by now.
In the case of all s(x’, y’) above, they were all out-of-bounds inside the Summed Area Table. So are all defaulted to a value of 0. Therefore, 5+0+0-0 = 5. So s(x-1, y-1) gets a value of 5.
Assuming here of course that s(x-1, y-1) is s(x,y) for the time being for the equation above.
Now, I am just going to substitute in the values for s(x, y-1) and s(x-1, y). It is up to YOU to check them, and see if they’re are correct – as well as to see if you can use the equation correctly:
If you have actually attempted to calculate the values, and ended up with the correct ones. Well done! You may give yourselves a pat on the back!
This is what those values represent. taking the s(x-1, y) entry, we have a value of 8. This value represents the sum of the pixels to the left, above and including itself. In this case we have the pixel itself with value 3, and using the equation, we use the entry in the Summed Area Table directly above (only none out-of-bounds result), which has a value of 5. So 5+3=8, which IS the sum of the pixels left, above and including itself.
But, now I will show you quickly the calculation for s(x,y) using 4 values. If you struggled slightly with trying to solve the values for s(x-1, y) etc. then this should help you a little bit; otherwise, feel free to drop off a comment with any questions.
Here, is the Summed Area Table completed:
Some of you may have already calculated it, but here’s what you do.
First of all, we sub in the values from the above tables:
- i(x,y) = 6 – Remember, this value comes from the actual given image. Which is as marked, 6.
- s(x-1, y) = 8 – This time we need the values in the Summed Area Table. this value here is 8.
- s(x, y-1) = 7 – Same as above, so this time the value is 7.
- s(x-1, y-1) = 5 – This time we can actually use this value, which is a value of 5
This time all values can be used, as there are no out-of-bounds results. So now, sticking these values into the equation we get: s(x,y) = 6 + 8 + 7 – 5 = 16. The question is, is this correct? Well yes, it is!
Remembering that 16 is the sum of all pixels top, left and itself, we add up the 4 pixel values in the actual given image: 5 + 2 + 3 + 6 = 16!! Amazing isn’t it!
Well, once you have used the equation to calculate and fill up your Summed Area Table, the the task of calculating the sum of pixels in some rectangle which is a subset of the original image can be done in constant time. Yes, that’s right, in O(1) complexity!
In order to do this we only need to use 4 values from the Summed Area Table, that is, 4 array references into the Summed Area Table. With these 4 values, we then add or subtract them for the correct value of the sum of the pixels within that region. To do this, we use this equation:
This is fairly similar to the equation further above.
Let us now have an example. Let us say that we wish to calculate the area contained by the green square:
Remember, the value 16 is the total sum of all the squares. But we want just that green square. As you can already see I have labelled on the A, B, C and D. This is what each of them are:
Firstly, we have s(A), which includes these squares:
So s(A) is just the green square, which has value 5. Next we have s(B):
Now, s(B) is the value 7, because this is the value of the sum of the values up to that point. s(C) looks fairly similar:
As for s(D), this is the sum of all the values up to that point:
So, with all of this, we have the values:
- A = 5
- B = 7
- C = 8
- D = 16
With this, we can substitute them all into the equation for,
- i(x’, y’) = s(A) + s(D) – s(B) – s(C) = 5 + 16 – 7 – 8 = 6
Therefore, we are left with the value 6. Think it’s wrong? Think again! Go back to the original image pixel values. Now look at the bottom-right pixel value. What’s that, a 6! See told you it worked!
I am not going to explain the whole process for this, it’s up to you to work it out. But here is an example of a bigger original image (4×4), with its corresponding Summed Area Table. The final 5 images are for calculating that area enclosed in A, B,C and D labels.
Summed Area Table
Calculating an Area
This is the area we want.
This is what A, B, C and D correspond to.
Remember to use A+D-B-C. If you do everything correctly you should get the value 16.
We have already touched a little bit on this. But as mentioned previously, the complexity for evaluating the Summed Area Table can be done in O(1) [constant] rather than in O(n^2) [quadratic].
Notice: as time goes on this post will probably get further improved.