Distributed Systems – Massive Distribution for Performance
September 10, 2009
Definition of a Distributed System
A distributed system is one in which independent, self-sufficient – often autonomous or heterogeneous – spatially-separated components must use a common interconnect to exchange information and coordinate actions, allowing the whole to appear to the user as a single coherent system.
Knowledge is POWER!
Think for a moment: what could you do with one computer? Yeah, you're right – sod all really, especially if it's Windows 😛 . Now, what about 100 computers? That's right, now we're getting somewhere! Now let's look at something better: an entire data centre comprising thousands of computers! Well, put simply – holy monkeys, that's a lot of computers. It's… it's like being in heaven 😀 !
Parallel vs. Distributed
Do you remember which parallelization scheme a distributed system falls into? If it helps: a distributed system is MIMD (Multiple Instruction, Multiple Data) parallel computing where the interconnect is on a network scale.
A Parallelization Example
Say you were given a list ‘a’ containing 60 items, along with some integer ‘b’.
Now, you and two other people are asked to produce a different list, ‘x’, which is identical to list ‘a’ except that every item has been incremented by 1 and the first occurrence of the value ‘b+3’ has been removed. How would you do it?
Let’s say it is possible to partition the work so that all three people can work in parallel, finishing the job much more quickly. This idea is commonly known as the divide-and-conquer strategy, and a parallelization pattern called master-and-workers applies here.
Master-and-workers is when you split the workload evenly between a number of processes that work in parallel. Using the example above, we have 3 people and a list of 60 elements, so each person gets a 20-element slice of ‘a’:
- a1 = a[0:20]
- a2 = a[20:40]
- a3 = a[40:60]
Each worker produces a new list – x1, x2 and x3 – in which every element has been incremented by 1.
Now what we do is take all three new ‘x’ lists and concatenate them to get the one list ‘x’: every element of ‘a’, incremented by 1.
Finally, all we need to do is remove the first item in the list that is equal to ‘b+3’.
This process is called Split, Spawn and Merge: first we split the list, then we spawn workers that each return a new partial list, then we merge the partial lists together to obtain the one new list ‘x’.
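The three steps above can be sketched in Python. This is a minimal illustration using threads (the function names are my own, hypothetical choices, not part of any particular framework):

```python
from concurrent.futures import ThreadPoolExecutor

def increment_chunk(chunk):
    # the "spawned" work: increment every element of one partition
    return [item + 1 for item in chunk]

def split_spawn_merge(a, b, workers=3):
    size = len(a) // workers
    # Split: partition 'a' into equal-sized chunks
    chunks = [a[i * size:(i + 1) * size] for i in range(workers)]
    # Spawn: process each chunk in parallel
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(increment_chunk, chunks))
    # Merge: concatenate the partial results into one list 'x'
    x = [item for part in parts for item in part]
    # Finally, remove the first occurrence of b + 3
    if b + 3 in x:
        x.remove(b + 3)
    return x

a = list(range(60))
print(split_spawn_merge(a, b=4))  # 1..60 with the first 7 removed
```

Note that each worker only touches its own chunk, so no locking is needed; the only coordination point is the merge at the end.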
The map-reduce model – would you believe it – comprises the map and reduce second-order (higher-order) functions.
In its simplest form, map takes a unary function ‘f’ and a collection [c1, …, cn], and then returns the collection:
- [f(c1), …, f(cn)]
As you can see, map will apply the given function to each element of the collection. Here’s an example:
- Say we have a collection ‘A’:
>>> A = [1, 2, 3, 4, 5]
- Then we have a function, say incr(x), which returns the value ‘x+1’.
- Now we declare a new collection ‘B’ = map(incr, A):
>>> B = map(incr, A)
- Note how the first argument is the function incr, and the second is the collection ‘A’.
- Once this is done, map applies the function to each element of ‘A’. And when we print out the new collection ‘B’ we get:
[2, 3, 4, 5, 6]
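A note if you try this yourself: in Python 3 (unlike the Python 2 of this post's era), map returns a lazy iterator rather than a list, so you need to wrap it in list() to see the elements:

```python
def incr(x):
    # unary function: add one to its argument
    return x + 1

A = [1, 2, 3, 4, 5]
B = list(map(incr, A))  # list() forces the lazy map object in Python 3
print(B)  # [2, 3, 4, 5, 6]
```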
In its simplest form, reduce takes a binary, associative function Θ (with an optional initial value ‘i’) and a collection [c1, …, cn], and then returns a single new value:
i Θ c1 Θ … Θ cn
So reduce will apply the function between the elements of the given collection, folding them into one value. Let’s have an example:
- Let’s use the collection ‘B’ from the map above:
>>> B = map(incr, A)
- Then we define a binary, associative function that takes two numbers as arguments.
- Now we apply the function to the collection ‘B’ using reduce.
- When printed, we obtain the value 184,320.
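The post doesn’t show the function definition or the reduce call itself, so here is a minimal sketch assuming the binary, associative function is plain multiplication (an assumption on my part, which is why the value below differs from the 184,320 quoted above):

```python
from functools import reduce  # reduce lives in functools in Python 3

def mult(x, y):
    # hypothetical binary, associative function; the original isn't shown
    return x * y

B = [2, 3, 4, 5, 6]     # the result of map(incr, A) from above
print(reduce(mult, B))  # 2*3*4*5*6 = 720
```

An optional initial value ‘i’ is passed as a third argument, e.g. reduce(mult, B, i).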
Map-Reduce Programming Model
Users implement two functions, known as the mapper and the reducer, with the following interfaces:
- Map (in_key, in_value) ->
(out_key, intermediate_value) list
- Reduce (out_key, intermediate_value list) ->
out_value list
The map-reduce infrastructure takes care of splitting, spawning and merging, as well as fault tolerance and load balancing.
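As a concrete sketch of these two interfaces, here is the classic word-count example in Python. The tiny in-memory "infrastructure" below is my own toy stand-in, not a real map-reduce framework:

```python
from collections import defaultdict

def mapper(in_key, in_value):
    # in_key: document name, in_value: document text;
    # emits a list of (out_key, intermediate_value) pairs
    return [(word, 1) for word in in_value.split()]

def reducer(out_key, intermediate_values):
    # folds all intermediate values for one key into an out_value list
    return [sum(intermediate_values)]

def run_map_reduce(documents):
    # toy single-process infrastructure: a real one would also
    # distribute work, load-balance and handle failures
    groups = defaultdict(list)
    for key, value in documents.items():
        for out_key, inter in mapper(key, value):
            groups[out_key].append(inter)  # the "shuffle" step
    return {k: reducer(k, vs) for k, vs in groups.items()}

docs = {"d1": "the quick fox", "d2": "the lazy dog"}
print(run_map_reduce(docs))
```

Notice the shape: the user only writes mapper and reducer, and everything between them (grouping intermediate values by key) is the framework's job.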