Saturday, July 19, 2014

Failure of a Coordinator Site

Handling of the Coordinator Site failure by 2PC / How does 2 Phase Commit Protocol handle the failure of a coordinator site? / Steps involved in handling coordinator failure by 2PC protocol





Handling the Failure of a Coordinator Site

Suppose the coordinator site fails during the execution of the 2 Phase Commit (2PC) protocol for a transaction T. This situation can be handled in two ways:

  • The other sites participating in transaction T may try to decide the fate of the transaction themselves. That is, they may try to decide whether to commit or abort T using the log records for T (<ready T>, <commit T>, <abort T>) available at each active site.

  • The second way is to wait until the coordinator site recovers.

Method 1
Let us see in detail how the final status of transaction T can be decided through the first method.
[a] If an active site has a <commit T> record in its log – The commit decision is made only by the coordinator site, and a participating site writes <commit T> into its log only after it receives that decision from the coordinator. So the coordinator must have decided to commit before it failed. Hence, the decision is to commit the transaction T.
[b] If an active site has an <abort T> record in its log – This clearly shows that the decision taken by the coordinator site before it failed was to abort the transaction T. Hence, the decision is to abort T.
[c] If some active site does not hold a <ready T> record in its log – As stated in the 2PC protocol, a site without a <ready T> record has not responded positively to the coordinator's <prepare T> message. The coordinator can commit only after every participating site has replied <ready T>, so it cannot have decided to commit T. It may or may not have already decided to abort, but aborting is the only safe outcome, and it is preferable to waiting for the coordinator to recover. So, we abort T.

Method 2

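If none of the cases [a], [b], and [c] holds, then every active site has a <ready T> record in its log but no <commit T> or <abort T> record. The active sites cannot tell which decision the coordinator took, or whether it took one at all, so only the second way of handling the coordinator failure applies. That is, we need to wait until the transaction coordinator recovers. Until then T remains undecided and continues to hold its locks at the active sites; this is the blocking problem of 2PC.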
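
The decision rules of Method 1 and Method 2 can be sketched as a small Python function. This is only an illustrative sketch, not part of any real database system: the names Decision and decide_fate are hypothetical, and each active site's log is assumed to be modelled as a set of record names for T ("ready", "commit", "abort").

```python
from enum import Enum
from typing import Iterable, Set


class Decision(Enum):
    COMMIT = "commit T"
    ABORT = "abort T"
    WAIT = "wait for the coordinator to recover"


def decide_fate(active_site_logs: Iterable[Set[str]]) -> Decision:
    """Decide the fate of T from the logs of the active participating sites.

    Assumption: each log is a set holding any of "ready", "commit", "abort",
    standing for the <ready T>, <commit T> and <abort T> records.
    """
    logs = list(active_site_logs)

    # Case [a]: some active site logged <commit T>.
    if any("commit" in log for log in logs):
        return Decision.COMMIT

    # Case [b]: some active site logged <abort T>.
    if any("abort" in log for log in logs):
        return Decision.ABORT

    # Case [c]: some active site never logged <ready T>, so the
    # coordinator cannot have decided to commit.
    if any("ready" not in log for log in logs):
        return Decision.ABORT

    # Method 2: every active site is ready but none knows the decision.
    return Decision.WAIT
```

For example, decide_fate([{"ready"}, {"ready"}]) falls through all three cases and returns Decision.WAIT, while decide_fate([{"ready"}, set()]) returns Decision.ABORT because the second site has no <ready T> record.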