Realistic

Paxos

Assumption (weak and realistic):

The system is partially synchronous (in reality, it can even be asynchronous).
Communication between processes can be unstable: messages can be lost, duplicated, or reordered.
Corrupted messages can be detected (and subsequently, ignored).
All the operations are deterministic: once the execution starts, is known its output.
Processes can exhibit crash failures, but not arbitrary failures.
Processes cannot conspire.

Deductions of Paxos algorithm

A client-server configuration is assumed, initially with a primary server.
To make the server more robust, we start by adding a backup server.
To ensure that all commands are executed in the same order in both servers, the primary server gives unique sequence numbers to each and every command. In Paxos, the primary is called the leader.
Assuming that all the current commands can always be restored (either by the clients or the servers) -> only consider control messages.

Dealing with lost messages

Some of Paxos terminologies:

The leader sends an accepting message ACCEPT(o, t) to the backups when it sets the timestamp t to the command o.
The backup responds by sending a learn message: LEARN(o, t).
When the leader notices that the operation o hasn't been learned yet, it retransmits the ACCEPT(o, t) with the original timestamp.

Two servers and a crash

Problem

Primary crashes after executing an operation, but the backup never received the accept message.

Solution

Never execute an operation before it has been learned.

Fundamental rule

In Paxos, a server S cannot execute an operation o without having received a LEARN(o) from all operational servers.

Failure detection

Reliable failure detection is practically impossible. The solution passes by a set of timeouts, but it is necessary to have in mind that failure detection can sometimes trigger a false positive.

Number of necessary servers

At least 3 servers.
In Paxos with 3 servers, the server S cannot execute an operation o without having received at least one LEARN message, and thus knowing that the majority of servers will execute o.
If a backup server crashes Paxos will continue running correctly: operations in operational servers occur in the same order.

Leader crashes after executing o1

S3 completely ignores the activity in S1:

S2 receives the ACCEPT(o, 1), detects the cash, and becomes the leader.
S3 didn't even receive the ACCEPT(o, 1).
S2 sends the ACCEPT(o2, 2) -> S3 sees an unexpected timestamp and informs S2 that it has lost o1.
S2 retransmits the ACCEPT(o1, 1) which allows S3 to recover.

S2 didn't received the ACCEPT(o1, 1):

S2 detects the crash and becomes the leader.
S2 sends the ACCEPT(o1, 1) -> S3 retransmits the LEARN(o1).
S2 sends the ACCEPT(o2, 2) -> S3 informs S2 that apparently it has lost the ACCEPT(o1,1) from S1 which allows for S2 to recover.

Leader crashes after sending the ACCEPT(o1,1)

S3 completely ignores the activity of S1:

As soon as S2 announces that o2 is to be accepted, S3 will notice that it has lost an operation and that it can ask S2 to help recover it.

S2 does not receive the ACCEPT(o1,1):

As soon as S2 proposes an operation, it will use an obsolete timestamp, which allows S3 to inform S2 that it has lost operation o1.

Paxos (with 3 servers) behaves correctly when a server crashes, independently of when the server crashed.

False crash detection

S3 receives an ACCEPT(o1, 1), but much after of the ACCEPT(o2,1). If it knew who the current leader is, it could reject the delayed message.

Leaders should include their IDs in their messages.

Normal operation

Paxos defines 3 roles:

Proposers;
Acceptors;
Learners.

P1a - Prepare -> Anchor the timestamp.
P1b - Promise -> Close the timestamp.
P2a - Accept -> Send its value.
P2b - Learn -> Executes the operation.

PreviousConsensus NextConsensus in arbitrary failures

Last updated 2 years ago