Handling of the Coordinator Site failure by 2PC / How does 2 Phase Commit Protocol handle the failure of a coordinator site? / Steps involved in handling coordinator failure by 2PC protocol
Handling the Failure of a Coordinator Site
Let us suppose that the coordinator site fails during execution of the 2 Phase Commit (2PC) protocol for a transaction T. This situation can be handled in two ways:
- The other sites participating in the transaction T may try to decide the fate of the transaction themselves. That is, they may try to decide whether to commit or abort T using the control records (<ready T>, <commit T>, <abort T>) available in their own logs.
- The second way is to wait until the coordinator site recovers.
Method 1
Let us see in detail how the final status of the transaction T can be decided using the first method.
[a] If an active site has a <commit T> record in its log – The commit decision is taken only by the coordinator site, and a participating site can write <commit T> into its log only after the coordinator has sent it the <commit T> message. So the coordinator must have decided to commit before it failed. Hence, the decision is to commit the transaction T.
[b] If an active site has an <abort T> record in its log – This clearly shows that the decision taken by the coordinator site before it failed was to abort the transaction T. Hence, the decision is to abort T.
[c] If some active site does not hold a <ready T> record in its log – As stated in the 2PC protocol, a site writes <ready T> into its log before it replies to the coordinator's <prepare T> message. A site without that record cannot have sent a ready reply, so the coordinator cannot have decided to commit T. It may have decided to abort T, or it may have failed before reaching any decision; in either case aborting is safe. So, we abort T.
Method 2
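If none of the cases [a], [b], and [c] holds, then every active site has a <ready T> record in its log but none has a <commit T> or <abort T> record. The active sites then cannot tell which decision the coordinator took, so only the second way of handling the failure of the coordinator site applies. That is, we need to wait until the transaction coordinator recovers.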
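The checks in cases [a], [b], and [c], together with the fallback to Method 2, can be sketched as a small decision function. This is only an illustrative sketch, not part of the 2PC protocol itself: the name `decide_fate`, the `Decision` enum, and the representation of each site's log as a set of record names ("ready", "commit", "abort") for the transaction T are all assumptions made for this example.

```python
from enum import Enum


class Decision(Enum):
    COMMIT = "commit"
    ABORT = "abort"
    WAIT = "wait"  # Method 2: block until the coordinator recovers


def decide_fate(active_site_logs):
    """Decide the fate of T from the logs of the active sites.

    active_site_logs is a hypothetical structure: a dict mapping a site
    name to the set of control records that site holds for T,
    e.g. {"S1": {"ready", "commit"}, "S2": {"ready"}}.
    """
    logs = list(active_site_logs.values())

    # Case [a]: some active site holds <commit T>.
    if any("commit" in log for log in logs):
        return Decision.COMMIT

    # Case [b]: some active site holds <abort T>.
    if any("abort" in log for log in logs):
        return Decision.ABORT

    # Case [c]: some active site never wrote <ready T>, so the
    # coordinator cannot have decided to commit.
    if any("ready" not in log for log in logs):
        return Decision.ABORT

    # None of [a], [b], [c] holds: every active site has only <ready T>.
    return Decision.WAIT


if __name__ == "__main__":
    print(decide_fate({"S1": {"ready", "commit"}, "S2": {"ready"}}))
    print(decide_fate({"S1": {"ready"}, "S2": set()}))
    print(decide_fate({"S1": {"ready"}, "S2": {"ready"}}))
```

The checks run in the same order as the cases above. In a correct run of 2PC, <commit T> and <abort T> never appear together for the same transaction, so the order of the first two checks does not change the outcome.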