Distributed Systems – Logical Clocks and Processes
September 10, 2009
A fat client is a client in a client-server architecture network which typically provides rich functionality independently of the central server. Originally known simply as a ‘client’ (and also called a ‘thick client’), the name contrasts with thin client, which describes a computer heavily dependent on a server’s applications.
A thin client is a client computer or client software in a client-server architecture network which depends primarily on the central server for processing activities, and mainly focuses on conveying input and output between the user and the remote server. In contrast, a thick or fat client does as much processing as possible and passes only data for communications and storage to the server.
Let’s say we have a logical clock, LCi, for each processor. Whenever an event happens on processor i, we increment LCi.
If a processor X sends a message to processor Y, then X also sends LCX, which is that processor’s logical clock.
When processor Y receives this message, then we do:
If LCY < (LCX + 1):
LCY = LCX + 1
In order to update processor Y’s logical clock.
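The update rules above can be sketched in Python. This is a minimal illustration; the class and method names are my own:

```python
class LamportClock:
    """A minimal sketch of a logical clock following the rules above."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # A local event happened: increment the clock.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; tick, then attach the clock value to the message.
        return self.tick()

    def receive(self, sender_time):
        # On receipt: if LCY < (LCX + 1), set LCY = LCX + 1.
        if self.time < sender_time + 1:
            self.time = sender_time + 1
        return self.time
```

For example, if processor X sends its first message (LCX becomes 1), a fresh processor Y that receives it ends up with LCY = 2.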
Let’s now say that we have a processor ‘A’ and ‘B’. There are a few things we can say:
- If A precedes (happens before) B, then we can write A –> B
- If A and B are concurrent events, then sadly we can’t say anything about their ordering.
If A –> B is true, then it must also be true that LCA < LCB. However, the converse does not hold: LCA < LCB does not mean that A –> B. Therefore, we cannot infer a causal ordering of events just by looking at their timestamps.
An atomic operation refers to a set of operations or events that can be combined so that they appear to the rest of the system to be a single operation with only two possible outcomes: success or failure.
To read up more on this please visit:
In concurrent programming, a critical section is a piece of code that accesses a shared resource that must not be concurrently accessed by more than one thread of execution. [Assuming you read the link above, this can solve the problem between the 2 processes accessing the same memory location].
A critical section will usually terminate in a fixed time, and a thread, task or process will only have to wait a fixed time to enter the shared resource.
By carefully controlling which variables are modified inside and outside the critical section (usually, by accessing important state only from within), concurrent access to that state is prevented. A critical section is typically used when a multithreaded program must update multiple related variables without a separate thread making conflicting changes to that data. In a related situation, a critical section may be used to ensure a shared resource, for example a printer, can only be accessed by one process at a time.
Want more? Try here: http://en.wikipedia.org/wiki/Critical_section
So this leaves one little simple question:
Can we safely and atomically update the state on 2 different machines?
The answer? NO! 🙂
Mutual Exclusion (mutex)
Mutual exclusion algorithms are used in concurrent programming to avoid the simultaneous use of a common resource, such as a global variable, by pieces of computer code called critical sections.
For example, suppose a piece of code is altering a piece of data over several program steps when another thread starts executing. This second thread may read the data while it is only partially updated, and if it then overwrites that data, the ensuing state will probably be unrecoverable. Shared data accessed by critical sections of code must therefore be protected, so that other processes which read from or write to the chunk of data are excluded from running.
A mutex is also a common name for a program object that negotiates mutual exclusion among threads, also called a lock.
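As an illustration, here is a minimal Python sketch of a critical section guarded by a mutex. Without the lock, the read-modify-write on `counter` could interleave between threads and lose updates; the names here are my own:

```python
import threading

counter = 0
mutex = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # The critical section: a read-modify-write on shared state,
        # protected so only one thread executes it at a time.
        with mutex:
            counter += 1

threads = [threading.Thread(target=increment, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```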
A Distributed Version of mutex
A process ‘K’ wants to enter a critical section:
- It generates a timestamp TSK
- It then sends a request (K, TSK) to everyone
- Waits for a reply from everyone to see if it is OK to proceed
- Process ‘K’ enters its critical section.
When another process – different from ‘K’ – receives a request:
- If this other process is in its critical section, it defers its reply
- Otherwise, it sends a reply, unless it also wishes to enter the critical section
- If both processes want to enter the critical section, the one with the smaller timestamp goes first: we reply to the requestor if its timestamp is smaller than ours, otherwise we defer our reply.
If process ‘K’ does not get a reply from everyone, then it does not enter its critical section.
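The reply rules above (which resemble the Ricart–Agrawala algorithm) can be sketched as a single decision function. The state names, and the tie-breaking on process id when timestamps are equal, are my own additions:

```python
def handle_request(my_state, my_timestamp, my_id, req_timestamp, req_id):
    """Decide whether to reply to a mutex request now or defer it.

    my_state is one of 'RELEASED', 'WANTED', 'HELD' (illustrative names).
    Timestamp ties are broken by process id, so all nodes decide consistently.
    """
    if my_state == 'HELD':
        return 'DEFER'          # we are inside our critical section
    if my_state == 'WANTED':
        # Both of us want the critical section: smaller timestamp goes first.
        if (req_timestamp, req_id) < (my_timestamp, my_id):
            return 'REPLY'      # the requestor is ahead of us
        return 'DEFER'
    return 'REPLY'              # we don't want it, let them in
```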
A Centralized Version of mutex
- Send a request to the server for a lock on the mutex
- When a reply comes back, start the critical section
- When finished, send a message to the server to release the mutex.
The Lock Server:
- If the mutex is available, mark it as being used by that client, and send a reply. Otherwise, queue the request.
- When the mutex is released by the client, if someone else wants it, pass it to them, otherwise mark it as being available.
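The lock server’s logic can be sketched as below. The API is illustrative; a real server would receive these calls as messages from clients:

```python
from collections import deque

class LockServer:
    """A sketch of the centralized mutex server described above."""

    def __init__(self):
        self.holder = None       # which client currently holds the mutex
        self.queue = deque()     # clients waiting for the mutex

    def acquire(self, client):
        # Grant the mutex if it is available; otherwise queue the request.
        if self.holder is None:
            self.holder = client
            return True          # reply sent: the client may enter
        self.queue.append(client)
        return False             # the client must wait for a later reply

    def release(self, client):
        # Pass the mutex to the next waiter, or mark it as available.
        assert self.holder == client
        if self.queue:
            self.holder = self.queue.popleft()
            return self.holder   # this client now receives its reply
        self.holder = None
        return None
```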
Two Phase Commit (2PC)
A coordinator requests a transaction, and sends a request to all participants.
For example: C1 sends a request to remove X pounds from an account and C2 sends a request to add X pounds to an account.
All participants respond as to whether they are willing and able to execute the request, and send either VOTE_COMMIT or VOTE_ABORT. Then each participant logs their vote.
The coordinator looks at all the votes: if everyone has voted to commit, the coordinator sends a GLOBAL_COMMIT to everyone; otherwise, it sends a GLOBAL_ABORT.
On receiving the decision from the coordinator, each participant records the decision locally.
If the decision is ABORT, then all participants ROLL BACK to their previous safe states, otherwise, they continue on, completing the transaction.
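The decision and roll-back steps can be sketched as two small functions (the function names and state representation are my own):

```python
def coordinator_decide(votes):
    """Phase two: commit only if every participant voted VOTE_COMMIT."""
    if votes and all(v == 'VOTE_COMMIT' for v in votes):
        return 'GLOBAL_COMMIT'
    return 'GLOBAL_ABORT'

def participant_apply(decision, new_state, previous_safe_state):
    """Keep the new state on commit; roll back to the safe state on abort."""
    if decision == 'GLOBAL_COMMIT':
        return new_state
    return previous_safe_state
```

So in the bank-transfer example, if either account's participant sends VOTE_ABORT, both roll back and no money moves.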
The Bully Algorithm
In the example of 2PC described above, how do we decide which node is the coordinator?
For this, we use an algorithm called the Bully Algorithm. This relies on some ordering and numbering of the nodes. Once this is done, we can do this:
- Node ‘P’ sends an ELECTION message to all other nodes with a higher number than itself.
- If no one responds, then node ‘P’ wins the election and becomes the coordinator.
- If one of the higher-numbered nodes answers, it takes over, and the algorithm repeats from that higher-numbered node.
What if no one responds? Then quite clearly we need a time-out.
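One election round can be sketched as below, with real message passing replaced by a set of ‘alive’ nodes that would answer an ELECTION message (the structure and names are my own):

```python
def run_election(caller, alive, node_ids):
    """Sketch of a Bully election started by `caller`.

    alive: the set of node ids that will answer an ELECTION message
    node_ids: all node ids, ordered by their assigned numbers
    Returns the id of the elected coordinator.
    """
    higher = [n for n in node_ids if n > caller]
    responders = [n for n in higher if n in alive]
    if not responders:
        return caller  # no higher node answered: caller wins the election
    # A higher-numbered node answered; it now runs its own election.
    return run_election(min(responders), alive, node_ids)
```

With nodes 1–5 and only nodes 1–3 alive, an election started by node 1 elects node 3, the highest-numbered live node.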
view the Bully Algorithm notes on this blog here!: Bully Algorithm
Global Clock no more!
Until the mid-1800s, all that existed was ‘local time’ based on the sun’s position. This changed by about 1 minute for every 12 miles travelled. So clearly, a global clock back then was just nonsense.
So how do we synchronise clocks today? One simple approach, with a processor ‘P’ asking a time server ‘S’:
- ‘P’ requests the time from ‘S’
- ‘S’ then prepares a response to send the time ‘T’, and then sends this response at the last possible moment.
- ‘P’ sets its time to be ‘T + RoundTripTime/2’
The problem with this is that the algorithm assumes the message delay is the same in both directions (half the RoundTripTime each way), so what if it isn’t?
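These steps are essentially Cristian’s algorithm. A sketch, with the real network call to ‘S’ replaced by a stand-in function:

```python
import time

def cristian_sync(request_server_time):
    """Estimate the server's current time, per the steps above.

    request_server_time() plays the role of asking 'S' for the time 'T';
    here it is just a stand-in function, not a real network call.
    """
    t0 = time.monotonic()
    server_time = request_server_time()   # 'S' replies at the last moment
    round_trip = time.monotonic() - t0
    # Assume the reply took half the round trip to reach us.
    return server_time + round_trip / 2
```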
An alternative is to average the clocks across all processes:
- Choose a master node
This master node can then:
- Request the time from each process.
- It then observes the RoundTripTime and estimates the time in each process and its own time.
- Averages the clock times, ignoring any outlying values.
- Sends each process a relative update to their clock, which can be positive or negative.
A problem here is what happens to a node that has the master set its clock backwards…
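This averaging scheme is essentially the Berkeley algorithm. A sketch of the master’s computation, assuming the RTT-corrected clock estimates have already been gathered (the function name, parameters, and outlier threshold are my own):

```python
def berkeley_adjustments(master_time, reported_times, max_skew=10.0):
    """Compute a relative clock correction for each process.

    reported_times maps each process name to the master's RTT-corrected
    estimate of that process's clock. Estimates further than max_skew
    from the master are treated as outliers: they are ignored when
    averaging, but those processes still receive a correction.
    """
    clocks = {'master': master_time, **reported_times}
    usable = [t for t in clocks.values() if abs(t - master_time) <= max_skew]
    average = sum(usable) / len(usable)
    # Each process gets a relative update, which can be positive or negative.
    return {name: average - t for name, t in clocks.items()}
```

For example, with the master at 100.0 and processes reporting 102.0 and 98.0, the average is 100.0, so the fast process is told to adjust by −2.0 and the slow one by +2.0.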