Transactions and isolation
VDSF and transactions
Why does VDSF use transactions? The answer is pretty simple: to simplify the programs written for VDSF and to avoid inconsistent data.
The simplicity aspect: once in a while things go wrong, for any number of reasons. When this happens you want to be able to go back to a pristine state (and start over, or log the error, or...). Transactions will do that for you: a single function call allows you to undo all your changes since the previous commit. Without transactions, programmers would be required to handle this manually, which is ok until your data is correlated (which brings us to the next point).
Let's start with an example: a simple program reading a request from an input queue and storing the result of its processing in two output queues. The three operations should be part of the same unit of work - they either are all executed or none are executed. Otherwise you'll get inconsistent results (like a client receiving a notification that its request is completed although the record was never added to the database, or the request falling through the cracks and never being processed). With transactions, the three operations are done as if they were a single operation. To go back to our example, your notification program will not be able to see the data in the notification queue until the request has been committed by the other program.
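The unit of work described above can be sketched with a toy model. This is only an illustration in Python over in-process deques, not the VDSF API: ToySession, process_one and the queue names are hypothetical. The toy also buffers new items until commit, which is a simplification and not how the engine is described further down.

```python
from collections import deque


class ToySession:
    """Toy unit of work over in-process deques; not the VDSF API."""

    def __init__(self):
        self.undo_log = []  # callables that put a popped item back
        self.pending = []   # (queue, item) pushes waiting for commit

    def pop(self, queue):
        item = queue.popleft()
        # Remember how to restore the item if the work is rolled back.
        self.undo_log.append(lambda: queue.appendleft(item))
        return item

    def push(self, queue, item):
        # Hidden from readers of `queue` until commit.
        self.pending.append((queue, item))

    def commit(self):
        for queue, item in self.pending:  # publish in execution order
            queue.append(item)
        self.pending.clear()
        self.undo_log.clear()

    def rollback(self):
        for undo in reversed(self.undo_log):  # undo in reverse order
            undo()
        self.undo_log.clear()
        self.pending.clear()


def process_one(session, inbox, results, notifications, fail=False):
    """Pop one request and write both outputs, or undo everything."""
    request = session.pop(inbox)
    session.push(results, "result:" + request)
    if fail:  # simulated processing error
        session.rollback()
        return False
    session.push(notifications, "done:" + request)
    session.commit()
    return True
```

If process_one is called with fail=True, the request is back in the input queue and both output queues are untouched; a later successful call moves the request and fills both outputs together.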
At least, that's the theory... in the real world other sessions will be partially aware of the changes, which is why we need an isolation policy (in other words, an explanation of the rules governing the interactions between sessions accessing the same data).
VDSF isolation policy
The architecture of the product imposes some constraints on the VDSF isolation policy, as explained below.
VDSF transactions should never fail. To clarify: vdsCommit and vdsRollback can indeed return errors, but only in the preprocessing phase (an invalid session handle, for example) - once they start the real work, they should never fail. Why? Mainly a matter of simplicity: simplicity at the API level and simplicity in the engine itself (note: introducing transactions in the engine added a certain level of complexity to the code - adding the ability to recover from errors on top of it was never an option).
To accomplish this goal, new data and new objects are inserted in the shared memory at their proper place upon insertion, and deleted data and objects are not removed (until the deletion is committed and all accesses have terminated). One of the consequences of inserting objects at their proper place is that a complete serialization of transactions is impossible.
An example to make this clear: if transactions were completely serialized, two different sessions inserting the same data (with the same key) in a hash map (with unique keys) would both succeed at insertion time. The first session to commit would succeed, but the second session's attempt to commit would fail since the key is already there. As we saw previously, a failure to commit is not part of the architecture.
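A minimal sketch of the alternative, assuming a toy in-process map: the key is placed in the map at insertion time and marked with its owning session, so a duplicate key is rejected immediately and commit has nothing left that can fail. ToyMap, ToySession and KeyInUse are hypothetical names for illustration, not VDSF types or error codes.

```python
class KeyInUse(Exception):
    """Raised at insert time; hypothetical name, not a VDSF error code."""


class ToySession:
    def __init__(self):
        self.inserted = []  # (map, key) pairs inserted by this session

    def commit(self):
        # Only clears ownership marks, so nothing here can conflict.
        for toy_map, key in self.inserted:
            toy_map.slots[key]["owner"] = None
        self.inserted.clear()

    def rollback(self):
        # Remove this session's uncommitted keys, most recent first.
        for toy_map, key in reversed(self.inserted):
            del toy_map.slots[key]
        self.inserted.clear()


class ToyMap:
    def __init__(self):
        self.slots = {}  # key -> {"value": ..., "owner": session or None}

    def insert(self, session, key, value):
        # The conflict is detected here, not at commit time.
        if key in self.slots:
            raise KeyInUse(key)
        self.slots[key] = {"value": value, "owner": session}
        session.inserted.append((self, key))
```

With two sessions inserting the same key, the second insert raises KeyInUse right away, and both sessions can still commit whatever else they did without error.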
The basic rules are:
- The creation and destruction of data containers are included in the transactions.
- A data record will only support one transaction at a time (for example, you will not be able to insert a record and then remove or modify it in the same transaction). The same rule applies to the creation and destruction of data containers.
- Deleted data (and objects) will remain visible (as much as possible) until the deletion is committed.
- Inserted data (or created containers) will just be visible enough to tell other transactions why there might be a conflict when they try to insert/create the same data.
- On commit, the elements of a transaction are processed in the order in which they were executed; on rollback, they are processed in the reverse order.
There is one issue with our isolation policy: since transactions cannot fail, you cannot take all the locks in advance before starting to process the transaction (otherwise you would either get some transaction failures or worse, deadlocks if you use locks without timeouts). The main consequence: the transaction cannot be seen as one single operation.
In most cases, this is not a problem, but in some cases it might be. For example, if you were to store an IP address and an IP port as two different items in a hash map, one of your programs using this information might pick up the old address and the new port.
There are usually solutions to these problems. In the previous example, you could have an additional data record to inform the program if the config data needs to be re-read or not (changing this item last ensures that the program won't re-read the data until it's all changed). Or you can implement a "soft lock" (which requires two commits, one to "lock", one to do the changes and then "unlock"). Both these workarounds use the rule that elements of a transaction are processed in the order they were executed.
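The first workaround can be sketched as follows. This is a toy model under stated assumptions: a plain dict stands in for the hash map, commit_steps is a hypothetical helper that applies a transaction's updates one at a time in execution order, and the "version" key plays the role of the additional data record. There is a single writer, and none of these names come from VDSF.

```python
def commit_steps(store, updates):
    """Apply updates one at a time, in the order given; pause after each."""
    for key, value in updates:
        store[key] = value
        yield key  # a reader may run between two steps


class Reader:
    """Caches the config and re-reads it only when the flag changes."""

    def __init__(self, store):
        self.store = store
        self.seen_version = None
        self.config = None
        self.poll()

    def poll(self):
        version = self.store["version"]
        if version != self.seen_version:
            # The flag is changed last, so both items are already new.
            self.config = (self.store["address"], self.store["port"])
            self.seen_version = version
        return self.config


store = {"address": "10.0.0.1", "port": 80, "version": 1}
reader = Reader(store)
steps = commit_steps(
    store,
    [("address", "10.0.0.2"), ("port", 8080), ("version", 2)],
)
```

Calling next(steps) once or twice leaves reader.poll() returning the old pair; only after the third step, when the flag changes, does the reader pick up the new address and the new port together.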
A problem ticket has been added on this issue - to see if a solution can be found which does not require special tricks as described in the previous paragraph.
VDSF isolation policy for data containers (objects)
Starting with version 0.2, newly created objects (queues, maps, etc.) will be visible from other sessions but it will not be possible to open them until they are committed. An "object in use" error code will be returned (in version 0.1, the error code was "does not exist"). As before, the session that created the object will be able to open it and to populate it (if needed).
Uncommitted deleted objects will be visible to other sessions and sessions will be able to open them and to add/remove/view the data as they wish. It will be as if the object is not deleted, with one exception: no session can delete this object (obviously). For the session that deleted the object, the object becomes inaccessible (error: "object is deleted"). This is a major change to the current situation (where uncommitted deleted objects cannot be opened and, if already open, become read only).
Committed deleted objects become invisible to all sessions. All sessions having such an object open will receive the error "object is deleted" the next time they try to access it. However, transactions which include data items from the deleted object will be able to proceed as if the object is still present. A new object with the same name can be created even if the original object is still in shared memory (waiting to be removed from memory until all its reference counters go to zero).
The transaction status of the objects will be returned whenever possible (the status field of the struct vdsFolderEntry or the struct vdsObjStatus, for example).
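The rules for opening an object can be summarized as a small decision function. This is a sketch only: ObjState and open_outcome are hypothetical names, the returned strings are the error texts quoted above rather than actual error codes, and committed deleted objects are left out because the text does not say which error a new open attempt receives.

```python
from enum import Enum


class ObjState(Enum):
    COMMITTED = "committed"
    CREATED_UNCOMMITTED = "created, not yet committed"
    DELETED_UNCOMMITTED = "deleted, not yet committed"


def open_outcome(state, is_owner):
    """Return "ok" or the error text for an attempt to open an object.

    is_owner is True for the session that created or deleted the object
    in its current, uncommitted transaction.
    """
    if state is ObjState.CREATED_UNCOMMITTED:
        # Visible to others, but only the creator may open it.
        return "ok" if is_owner else "object in use"
    if state is ObjState.DELETED_UNCOMMITTED:
        # Others carry on as if nothing happened; the deleter is locked out.
        return "object is deleted" if is_owner else "ok"
    return "ok"
```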
VDSF isolation policy for data records
In version 0.2, just like version 0.1, it will not be possible to access newly created data records until they are committed. The major difference between the two versions: a "data in use" error code will now be returned instead of the version 0.1 error "no such item" (for queues, the next available record will be accessed, as usual). As before, the session that created a record will be able to access it (read only).
Uncommitted deleted data records will be visible to other sessions and sessions will be able to access them. Of course, no other transaction can be done on the data record; it becomes read only. For the session that deleted the data record, it becomes inaccessible (error: "data in use" for hash maps; for queues the next available record will be accessed, as usual). This is a major change to the current situation (where uncommitted deleted records cannot be viewed by other sessions).
For uncommitted updated data records (hash maps only), the rule is very simple. The session that updated the record sees the new value; other sessions see the old value.
Committed deleted data records become invisible to all sessions. A new record with the same key (for key-oriented records) can be created even if the original record is still in shared memory (waiting to be removed from memory until its reference counter goes to zero).
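The update rule can be illustrated with a toy record that keeps the committed value and at most one pending value. ToyRecord is a hypothetical class, not a VDSF structure, and the error raised on a second concurrent update is an assumption derived from the "one transaction at a time" rule; the text does not name the actual error code.

```python
class ToyRecord:
    """Toy hash map record with one optional uncommitted update."""

    def __init__(self, value):
        self.committed = value
        self.pending = None  # (session, new_value) while uncommitted

    def update(self, session, value):
        if self.pending is not None:
            # Assumption: a record supports one transaction at a time.
            raise RuntimeError("record already has a pending transaction")
        self.pending = (session, value)

    def read(self, session):
        # The updating session sees its new value, everyone else the old one.
        if self.pending is not None and self.pending[0] == session:
            return self.pending[1]
        return self.committed

    def commit(self):
        if self.pending is not None:
            self.committed = self.pending[1]
            self.pending = None

    def rollback(self):
        self.pending = None
```

Until commit is called, a read by the updating session and a read by any other session return different values; after commit or rollback, all sessions agree again.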
Additional rule for queues: if the queue contains valid data (anything other than committed deleted data records) that cannot be accessed by a session based on the previous rules, the error "data in use" will be returned instead of the current one, "is empty".
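The record rules for queues, including this additional one, can be sketched as a single scan over the queue. This is a toy model: records are plain (value, state, owner) tuples, the state strings and the function name first_readable are hypothetical, and the returned status strings are the error texts quoted above rather than real error codes.

```python
def first_readable(records, session):
    """Return ("ok", value) for the first record `session` may read.

    records is a list of (value, state, owner) tuples, where owner is the
    session holding the uncommitted transaction on that record (or None).
    """
    saw_valid = False
    for value, state, owner in records:
        if state == "deleted_committed":
            continue  # invisible to everybody, does not count as data
        saw_valid = True
        if state == "committed":
            return ("ok", value)
        if state == "inserted_uncommitted" and owner == session:
            return ("ok", value)  # only the creator may read it
        if state == "deleted_uncommitted" and owner != session:
            return ("ok", value)  # everyone except the deleter may read it
        # Otherwise skip to the next available record.
    # Valid data exists but none of it is readable by this session.
    return ("data in use", None) if saw_valid else ("is empty", None)


records = [
    ("a", "inserted_uncommitted", "s1"),
    ("b", "deleted_uncommitted", "s2"),
]
```

With the sample records, session s1 reads "a", a third session reads "b", and session s2 gets "data in use" because the queue holds valid data that s2 cannot access.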
Last updated on March 21, 2008.