February 22, 2008 (Lecture 17)

Reading

Chow and Johnson, et al: 6.3
Coulouris, et al: 12.5

Exam #1

...will be out this afternoon. It is due at 11:59PM one week from today. Please submit it as you would a lab -- please use a common file format and simple pagination. We care about the quality of your thoughts -- not your graphic design capability :-)

Timestamp Ordering

Another approach to preventing inconsistency in light of concurrency is known as inconsistency avoidance. Instead of preventing inconsistency in the transaction manager, before the transaction is executed, inconsistency avoidance allows the transaction to execute, but causes it to abort if it cannot complete execution without violating the isolation property.
The most common approaches in this category are based on logical timestamps. As transactions enter the system, they are given a timestamp. This timestamp is used to determine which operations can be completed and which operations should cause a transaction to be aborted.
To implement this approach, we need a few things:

A facility for applying logical timestamps to each transaction as it enters the transaction manager
Some information about each object:

The last time a committed transaction read the object (RD)
The last time a committed transaction changed the object (WR)
A sorted list of pending (uncommitted) operations (PENDING)
A pointer to the oldest pending write in PENDING (WR-MIN)

Given this information, the scheduler can speculate and allow operations to execute tentatively. These operations will only become visible if they are eventually committed. Furthermore, it is possible that they may be aborted as the result of interference from other operations.
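The per-object bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not the lecture's implementation: the names PendingOp, ObjectState, enqueue, and wr_min are assumptions made for the example, and WR-MIN is computed by scanning the queue rather than stored as a pointer.

```python
import bisect
from dataclasses import dataclass, field


@dataclass(order=True)
class PendingOp:
    # Hypothetical record for one uncommitted operation.
    ts: int    # logical timestamp of the issuing transaction
    kind: str  # "read" or "write"


@dataclass
class ObjectState:
    # Hypothetical per-object record; field names mirror the notes.
    rd: int = 0  # RD: timestamp of the last committed read
    wr: int = 0  # WR: timestamp of the last committed write
    pending: list = field(default_factory=list)  # PENDING, sorted by ts

    def enqueue(self, op):
        # Keep PENDING sorted by timestamp as operations arrive.
        bisect.insort(self.pending, op)

    @property
    def wr_min(self):
        # WR-MIN: the oldest pending write, or None if there is none.
        for op in self.pending:
            if op.kind == "write":
                return op
        return None


obj = ObjectState()
obj.enqueue(PendingOp(7, "write"))
obj.enqueue(PendingOp(3, "read"))
obj.enqueue(PendingOp(5, "write"))
print([op.ts for op in obj.pending], obj.wr_min)
```

Operations may arrive in any order; the queue stays sorted by timestamp, so the oldest pending write is always the first write found from the front.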
Reads:
Since reads cannot conflict with each other, we don't need to worry about read-read conflicts. Instead, when a new read operation is received, we only need to worry about a conflict with a write. If the read operation arrives too late, after a write with a later timestamp has committed, the reading transaction must be aborted -- it is impossible to undo the committed write to deliver the proper value.
If the read operation arrived in a timely way, we add it to the queue. If there is an earlier pending write, we wait for this write to commit or abort before returning a value. Otherwise, we hope for the best and return the value to the transaction (this allows the transaction to speculatively continue). If an earlier write shows up and commits before the transaction using the speculative read value commits, the reading transaction must abort -- that's why we enqueued the read for record keeping.
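The read rule can be summarized as a three-way decision. The sketch below is a simplification under stated assumptions: schedule_read and its string results are hypothetical names, timestamps are plain integers, and the record keeping (enqueueing the read) is left out so that only the decision itself is shown.

```python
def schedule_read(ts, wr, pending_write_ts):
    """Decide how to handle a read stamped `ts` (hypothetical helper).

    wr               -- WR, timestamp of the last committed write
    pending_write_ts -- timestamps of uncommitted writes in PENDING
    """
    if ts < wr:
        # A later write has already committed; the proper value is gone.
        return "abort"
    if any(w < ts for w in pending_write_ts):
        # An earlier write is still pending: wait for it to commit or abort.
        return "wait"
    # No earlier pending write: return the value speculatively.
    return "return_speculative"


print(schedule_read(5, wr=8, pending_write_ts=[]))
print(schedule_read(5, wr=3, pending_write_ts=[4]))
print(schedule_read(5, wr=3, pending_write_ts=[9]))
```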

Writes:
Just as with reads, a write request must be rejected if it arrives late. In other words, if the writing transaction's timestamp is earlier than the object's WR, the transaction must be aborted. The write must also be aborted if a read with a later timestamp has already been committed, that is, if the writing transaction's timestamp is earlier than the object's RD. This is because we can't undo the read and give the reading transaction a new value.
If a write request arrives in a timely way, we place it into the queue according to its time.
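A matching sketch for writes, under the same assumptions as before: schedule_write is a hypothetical name, timestamps are integers, and PENDING is modeled as a sorted list of (timestamp, kind) tuples. A write is treated as late if it is older than either the last committed write (WR) or the last committed read (RD).

```python
import bisect


def schedule_write(ts, rd, wr, pending):
    """Decide how to handle a write stamped `ts` (hypothetical helper).

    rd, wr  -- RD and WR of the object
    pending -- PENDING, a list of (timestamp, kind) tuples sorted by timestamp
    """
    if ts < wr or ts < rd:
        # Too late: a later write or a later read has already committed.
        return "abort"
    # Timely: insert the write into the queue at its timestamp position.
    bisect.insort(pending, (ts, "write"))
    return "queued"


queue = [(4, "read"), (9, "write")]
print(schedule_write(6, rd=2, wr=3, pending=queue), queue)
print(schedule_write(1, rd=2, wr=3, pending=queue))
```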

Commit:
If a transaction that contains only reads commits, three things need to happen:

The RD times need to be updated.
If speculative writes showed up with earlier timestamps (ahead in queue), they must be aborted.
The read operation needs to be removed from the object's queue -- a write can no longer interfere with it. The write would be aborted because of the RD time.

It is important to note that a read cannot commit if it was queued behind pending writes at the time that it was issued. This is because it must wait for those writes to commit or abort -- the transaction is blocked until this happens and the read can return. The only pending writes that might need to be aborted are those that arrived late-ish -- after the speculative read returned a value. Please remember that they were not aborted, because the read had not committed -- there was still hope (until now).
When speculative writes are committed, transactions associated with reads and writes that occurred before them must be aborted and removed from the queue. This is because their effect will be forever lost once the write modifies the object. The write operation, itself, must be removed from the object's queue, and any reads that were blocked waiting for the write to commit or abort can be unblocked and permitted to return a value to their transaction. Of course, the WR time should be updated.
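The commit rules for both cases can be sketched together. This is a simplified illustration, not a full scheduler: commit is a hypothetical function, the object is a plain dict with keys "rd", "wr", and "pending" (a list of (timestamp, kind) tuples), and waking up blocked readers is omitted.

```python
def commit(obj, ts, kind):
    """Apply the commit rules for one operation (hypothetical helper).

    Returns the pending operations that must be aborted as a result.
    """
    earlier = [op for op in obj["pending"] if op[0] < ts]
    if kind == "read":
        # A committed read dooms only earlier speculative writes.
        aborted = [op for op in earlier if op[1] == "write"]
        obj["rd"] = max(obj["rd"], ts)
    else:
        # A committed write dooms every earlier pending read and write.
        aborted = earlier
        obj["wr"] = max(obj["wr"], ts)
    # Drop the aborted operations and the committed operation itself.
    obj["pending"] = [
        op for op in obj["pending"]
        if op not in aborted and op != (ts, kind)
    ]
    return aborted


obj = {"rd": 0, "wr": 0,
       "pending": [(2, "write"), (3, "read"), (5, "write"), (6, "read")]}
print(commit(obj, 5, "write"), obj)
```

In this example, committing the write stamped 5 aborts the operations stamped 2 and 3, sets WR to 5, and leaves only the read stamped 6 in the queue.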

Abort:
If a transaction containing a read is aborted, the read is removed from the object's queue -- no other action needs to be taken, since a read cannot affect other transactions. If a transaction containing a write is aborted, the write should be removed from the object's queue and any reads which were blocked awaiting the write should be allowed to continue.

Optimistic Transaction Scheduling
During the last couple of classes, we discussed 2PL as an example of inconsistency prevention and timestamp ordering as an example of inconsistency avoidance. Today we are going to discuss another approach, known as validation.
Instead of preventing inconsistency by structuring safe transactions using locks, or avoiding inconsistency by aborting transactions that might go awry as soon as a problematic operation appears, we are going to optimistically speculate and complete each transaction. Once the transaction is speculatively done, the system will validate it to ensure that the result, as a whole, is safe.
This approach considers a transaction to have a three-phase life cycle: execution, validation, and update.

During the execution phase, the transaction manager will initialize the transaction's workspace and make shadow copies of the objects into this workspace. These objects will be labeled with version numbers to help with validation.
Once the execution phase has completed, the speculative results of the transaction are known -- it is time for validation. The validation phase ensures that the results of the transaction are consistent and won't cause problems for other uncommitted transactions.
The update phase of the transaction is responsible for committing the validated transactions to the objects. Once validated, a transaction must be committed to atomic storage. Pending updates may occur in any order.
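The execution phase's workspace can be pictured as a set of private shadow copies tagged with the version each copy was taken from. The sketch below is an assumption-laden illustration: Workspace, base_versions, and the store layout (a dict mapping an object name to a (version, value) pair) are all hypothetical and not taken from the lecture.

```python
import copy


class Workspace:
    """Hypothetical private workspace for one transaction."""

    def __init__(self, store, names):
        # Remember the version each shadow copy was taken from.
        self.base_versions = {n: store[n][0] for n in names}
        # Deep-copy the values so speculative changes stay private.
        self.shadow = {n: copy.deepcopy(store[n][1]) for n in names}

    def read(self, name):
        return self.shadow[name]

    def write(self, name, value):
        # Only the shadow copy changes; the shared store is untouched.
        self.shadow[name] = value


store = {"x": (1, [10]), "y": (4, [20])}
ws = Workspace(store, ["x"])
ws.write("x", [11])
print(ws.read("x"), store["x"], ws.base_versions)
```

Nothing in the shared store changes until the update phase, which is what makes it cheap to discard a transaction that fails validation.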

In order to validate a transaction, certain timestamps are needed:

The object must keep track of the last committed read (RD) and write (WR)
Each transaction must be timestamped at the beginning of execution (TS) and at the end of execution/beginning of validation (TV).

The validation protocol itself is a two-phase protocol initiated by the transaction manager. It sends a message containing TS, TV, and the sets of read objects (RO) and write objects (WO) to participating transaction managers (transaction managers on systems executing subtransactions). The transaction is validated only after the coordinating transaction manager has received the OK from all participants.
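The coordinator's side of this exchange can be sketched as follows. It is a minimal illustration under assumptions: ValidationRequest and coordinate_validation are hypothetical names, participants are modeled as local callables that return True for OK, and real message passing, timeouts, and failures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationRequest:
    # Hypothetical message carrying the fields named in the notes.
    ts: int        # TS: timestamp at the start of execution
    tv: int        # TV: timestamp at the start of validation
    ro: frozenset  # RO: objects the transaction read
    wo: frozenset  # WO: objects the transaction wrote


def coordinate_validation(request, participants):
    """Return True only if every participant answers OK."""
    votes = [participant(request) for participant in participants]
    return all(votes)


req = ValidationRequest(ts=10, tv=15, ro=frozenset({"a"}), wo=frozenset({"b"}))
always_ok = lambda r: True
rejects_b = lambda r: "b" not in r.wo
print(coordinate_validation(req, [always_ok, always_ok]))
print(coordinate_validation(req, [always_ok, rejects_b]))
```

A single rejection is enough to fail validation, which matches the requirement that the coordinator hear OK from all participants.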
Each participant will use the following rules to determine if the transaction can be validated:

If there are no pending transactions, the validation request can be immediately granted -- there is no possible conflict

Since this protocol serializes transactions with respect to TV, the transaction must be rejected if any pending transaction on the participant has a lower TV.

If there exist one or more pending transactions in the validation phase, read-write conflicts are a concern. This is because the pending transaction may be validated and begin to write objects. If this happens, the speculatively read value may become inconsistent with the actual value at the time of validation. Read-write conflicts can be detected by looking for intersections between the RO of the transaction requesting validation and the WOs of the transactions already in validation. If a conflict exists, the requesting transaction must be aborted.
If the validation of the transaction begins while another transaction is being updated, write-write conflicts are also a concern. This is because enqueued updates can happen in any order. There is no telling which of the transactions' updates will occur last and "stick". As a result, different operations may stick from different transactions. To avoid this, the WO of the transaction requesting validation must be checked against the WOs of the transactions in the update phase. If there is any conflict, the validating transaction must be aborted.
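The set-intersection checks in the last two rules, together with the shortcut for an idle participant, can be sketched as below. The sketch deliberately leaves out the TV ordering rule and covers only the read-write and write-write checks. The name participant_vote and its argument layout are assumptions made for illustration.

```python
def participant_vote(ro, wo, validating_wos, updating_wos):
    """Return True (OK) or False (abort) for a validation request.

    ro, wo         -- read and write sets of the requesting transaction
    validating_wos -- write sets of transactions already in validation
    updating_wos   -- write sets of transactions in the update phase
    """
    if not validating_wos and not updating_wos:
        # Nothing pending: no conflict is possible.
        return True
    for other_wo in validating_wos:
        if ro & other_wo:
            # Read-write conflict: a value we read may be overwritten.
            return False
    for other_wo in updating_wos:
        if wo & other_wo:
            # Write-write conflict: updates could be applied in any order.
            return False
    return True


print(participant_vote({"a"}, {"b"}, [], []))
print(participant_vote({"a"}, {"b"}, [{"a", "c"}], []))
print(participant_vote({"a"}, {"b"}, [{"c"}], [{"b"}]))
print(participant_vote({"a"}, {"b"}, [{"c"}], [{"d"}]))
```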