System.Data.SQLite
View Ticket
Not logged in
Ticket UUID: 1f7bfff46779802fe2d4e37be290d363ff276360
Title: SQLiteTransaction: critical rollback bug
Status: Closed Type: Code_Defect
Severity: Severe Priority: Blocker
Subsystem: Transaction Resolution: Not_Backwards_Compatible
Last Modified: 2016-11-17 20:32:39
Version Found In: 1.0.102.0
User Comments:
anonymous added on 2016-10-27 08:18:26:
SQLiteTransaction.Rollback() has a critical bug which can lead to data loss and / or corruption.

The problem appears when a nested SQLiteTransaction is rolled back. (I'm not sure why SQLiteTransaction is even allowing nested transactions in the first place, since the SQLite library itself doesn't support this. However, the SQLiteTransaction implementation clearly allows nested transactions by intention, as can be seen in the source code.)

I think nested System.Data.SQLite transactions should be disallowed at all, since SQLite itself also doesn't actually support this.

The test code to reproduce the problem is given below and on Pastebin: [http://pastebin.com/0ZF98DhH]



[TestMethod]
public void TestTransactions() {
	SQLiteConnection connection = ...;

	// Create a test table.
	using (SQLiteCommand command = new SQLiteCommand("CREATE TABLE test(column TEXT);", connection)) {
		command.ExecuteNonQuery();
	}

	// Insert a single row.
	using (SQLiteCommand command = new SQLiteCommand("INSERT INTO test VALUES ('no transaction');", connection)) {
		command.ExecuteNonQuery();
	}

	// Assert that the table contains a single row. Test PASSES.
	using (SQLiteCommand command = new SQLiteCommand("SELECT COUNT(*) FROM test;", connection)) {
		Assert.AreEqual((long)1, command.ExecuteScalar());
	}

	// Begin a transaction.
	using (SQLiteTransaction transaction = connection.BeginTransaction()) {
				
		// Insert another row into the table.
		using (SQLiteCommand command = new SQLiteCommand("INSERT INTO test VALUES ('in transaction');", connection)) {
			command.ExecuteNonQuery();
		}

		// Assert that the table contains two rows. Test PASSES.
		using (SQLiteCommand command = new SQLiteCommand("SELECT COUNT(*) FROM test;", connection)) {
			Assert.AreEqual((long)2, command.ExecuteScalar());
		}

		// Begin another transaction. System.Data.SQLite apparently allows me to do this, as it doesn't throw any exception.
		// Also, the implementation of SQLiteTransaction suggests that transactions can actually be nested.
		using (SQLiteTransaction nestedTransaction = connection.BeginTransaction()) {

			// Insert another row into the table.
			using (SQLiteCommand command = new SQLiteCommand("INSERT INTO test VALUES ('in nested transaction');", connection)) {
				command.ExecuteNonQuery();
			}

			// Assert that the table contains three rows. Test PASSES.
			using (SQLiteCommand command = new SQLiteCommand("SELECT COUNT(*) FROM test;", connection)) {
				Assert.AreEqual((long)3, command.ExecuteScalar());
			}

			// Note: the nested transaction goes out of scope here without a call to Commit(), so an implicit Rollback() is done.
		}

		// The nested transaction has been rolled back again. Assert that the table now contains two rows again. Test FAILS,
		// since actual row count == 1.
		using (SQLiteCommand command = new SQLiteCommand("SELECT COUNT(*) FROM test;", connection)) {
			Assert.AreEqual((long)2, command.ExecuteScalar());
		}

		// Note: transaction goes out of scope here without a call to Commit(), so an implicit Rollback() is done.
	}

	// All transactions have been rolled back again. Assert that the table now contains a single row again. Test PASSES.
	using (SQLiteCommand command = new SQLiteCommand("SELECT COUNT(*) FROM test;", connection)) {
		Assert.AreEqual((long)1, command.ExecuteScalar());
	}
}

mistachkin added on 2016-10-27 20:23:17:
I've reviewed the code in the SQLiteTransaction class and it does not appear to
make sense when applies to nested transactions.

One possible fix may be to prevent "nested transactions" from issuing any
implicit rollbacks.

mistachkin added on 2016-10-27 20:43:30:
Draft fix on branch, check-in [0b11868e929386dd].  Still needs tests.

anonymous added on 2016-10-28 10:50:39:
mistachkin, thank you very much for your fast reaction on this issue.

Regarding your draft fix: Just simply doing nothing on Rollback() for a nested transaction doesn't solve the problem either way. It's arguable that a nested Commit() does nothing, since we're still contained within an outer transaction, which must be separately committed before any changes can be effective.

However, it's wrong to do nothing on a nested Rollback(), because the whole point of a transaction (nested or not) is that it *can* actually be rolled back to its initial state. If this isn't possible, then the fundamental ACID database principle is violated, which is absolutely intolerable.

There are two ways to fix this:

1) Suggested easy fix: throw an InvalidOperationException() when trying to start a nested transaction. SQLite doesn't support nested transactions, so why should SQLiteTransaction?

2) Use SAVEPOINT's to implement the logic of nested transactions: [https://www.sqlite.org/lang_savepoint.html] However, be sure to test this carefully and extensively, since a database core feature like transactions should really just work rock-solid.

mistachkin added on 2016-10-28 20:10:16:
Maybe something like [712e256f4f0433ab]?

anonymous added on 2016-10-29 10:59:43:
Looks good, basically.

One remaining issue though. The current implementation will cause SQLiteTransaction objects to be ambiguous, which is still a problem. Consider this code snippet:


SQLiteConnection connection = ...;
using (SQLiteTransaction outerTransaction = connection.CreateTransaction()) {
	using (SQLiteTransaction innerTransaction = connection.CreateTransaction()) {
		// Some SQL statements here...
		
		// Decide to commit the OUTER transaction.
		outerTransaction.Commit();
	}
	
	// Fatal loss of data here: the call to outerTransaction.Commit() actually committed
	// the INNER transaction instead, and the follow-up implicit innerTransaction.Rollback()
	// rolled back the OUTER transaction afterwards.
}


To fix this, the commit / rollback information must be tracked per SQLiteTransaction object instance instead of per transaction nesting level. As a suggestion, I'd do the following (pseudo code & error handling omitted):

class SQLiteTransaction {
	private SQLiteConnection _connection;
	private string _commitCommand;
	private string _rollbackCommand;
	
	void BeginTransaction() {
		if (!_connection.hasAnyOpenTransaction()) {
			_connection.ExecuteCommand("BEGIN;");
			_commitCommand = "COMMIT;";
			_rollbackCommand = "ROLLBACK;";
		} else {
			// Returns some per-connection-unique savepoint name
			string savepointName = _connection.CreateSavepointName();
			
			_connection.ExecuteCommand("SAVEPOINT " + savepointName + ";");
			_commitCommand = "RELEASE " + savepointName + ";";
			_rollbackCommand = "ROLLBACK TO " + savepointName + ";";
		}
	}
	
	void Commit() {
		_connection.ExecuteCommand(_commitCommand);
	}
	
	void Rollback() {
		_connection.ExecuteCommand(_rollbackCommand);
	}
}


I'd like to point out an important consequence of this implementation, however: For transaction nesting level N, there are (2! * 2^N) possible different SQL command sequences being sent to the SQLite engine (see below). If only one of them doesn't work as expected, then there may still be some fatal ACID violation and therefore also some unexpected fatal loss of data possible.

Ideally, tests should cover (at least some of the) combinations up to level 3 (i.e. using two savepoints plus the outer transaction). If this seems to be too much work, please consider implementing the easy fix of just disallowing nested transactions at all instead. This will be much better than a potential fatal ACID bug in System.Data.SQLite (as it is now).


// Possible command sequences for nesting level N = 1
// 1! * 2^1 = 2 combinations
BEGIN + COMMIT
BEGIN + ROLLBACK


// Possible command sequences for nesting level N = 2
// 2! * 2^2 = 8 combinations
BEGIN + SAVEPOINT s1 + RELEASE s1 + COMMIT
BEGIN + SAVEPOINT s1 + ROLLBACK s1 + COMMIT
BEGIN + SAVEPOINT s1 + RELEASE s1 + ROLLBACK
BEGIN + SAVEPOINT s1 + ROLLBACK s1 + ROLLBACK
BEGIN + SAVEPOINT s1 + COMMIT + RELEASE s1
BEGIN + SAVEPOINT s1 + COMMIT + ROLLBACK s1
BEGIN + SAVEPOINT s1 + ROLLBACK + RELEASE s1
BEGIN + SAVEPOINT s1 + ROLLBACK + ROLLBACK s1


// Possible command sequences for level N = 3
// 3! * 2^3 = 48 combinations
// Not listed here - too many...

mistachkin added on 2016-10-29 18:00:58:
To summarize: the savepoint name needs to be unique every time so that exiting
and re-entering a transaction level does not have unintended consequences?

anonymous added on 2016-10-29 19:22:35:
Yes, it must be unique. Otherwise it could happen that we reuse a savepoint name which is still in use by another SQLiteTransaction object.

Maybe just use a simple incrementing integer to generate unique savepoint names. Synchronize this if necessary, as multi-threading issues may apply here.

Note that I haven't tested whether SQLite allows a connection to see a savepoint which has been created by a different connection. If this should be the case, then savepoint names must actually be unique per DATABASE, not just per CONNECTION.

mistachkin added on 2016-10-29 22:22:14:
I think check-in [09db2a0a1b37532d] should address the predictability issue.

Also, I'm relatively sure that SAVEPOINT names are scoped to the connection (i.e.
two different connections could have the same SAVEPOINT name and they would not
be referring to the same SAVEPOINT).

anonymous added on 2016-10-30 10:16:11:
Well, unique savepoint names alone are unfortunatley only part of the story. We still have this major issue here:


SQLiteConnection connection = ...;
using (SQLiteTransaction outerTransaction = connection.CreateTransaction()) {
	using (SQLiteTransaction innerTransaction = connection.CreateTransaction()) {
		// Some SQL statements here...
		outerTransaction.Commit();
	}
	
	// Fatal loss of data here: outer transaction is ROLLED BACK instead of COMMITTED !!!
}


The commit / rollback information *MUST* be stored per SQLiteTransaction object instance. Counting transaction nesting levels is (and has always been) *NOT* sufficient. Remove that obsolete SQLiteConnection._transactionLevel member altogether, this is just messed up incorrect legacy code.

You may consider the SQLiteTransaction pseudo code given above. It will correctly handle transactions and savepoints.

Btw, I checked that savepoints are actually scoped to connections, so we're OK to have per-connection unique savepoint names. (However, please consider having just a simple incrementing SQLiteConnection._savePointCounter for this, instead of using non-deterministic random numbers.)

mistachkin added on 2016-10-30 18:54:37:
The outer transaction always overrides the inner ones.  You'll need to add a 
commit in the outer using block.

mistachkin added on 2016-10-30 18:55:14:
Wait, I misread your example.  That should work right now (on the branch).

anonymous added on 2016-10-30 19:47:33:
I don't think so. Please try the above example and check yourself. (Note that the Commit() call is on the OUTER transaction, but in the INNER scope.)

mistachkin added on 2016-10-30 20:34:52:
You're right, the implementation on the branch did not take out-of-order COMMIT
into account (among other things).  I've added an appropriate test, very similar
to yours.  With the latest code on the branch, everything appears to be correct.

That being said, I'm starting to get nervous about getting nested transaction
semantics entirely correct.

anonymous added on 2016-10-31 09:23:08:
I'm sorry to say that it's still not correct. Out-of-order commits and rollbacks of nested transactions can still fail. One example (out of many possible ones):


SQLiteTransaction t1 = connection.CreateTransaction();
SQLiteTransaction t2 = connection.CreateTransaction();
SQLiteTransaction t3 = connection.CreateTransaction();
t2.Rollback();   // Fatal error: will rollback t3 instead of t2


I'll repeat it once again here:

Counting transaction nesting levels is (and has always been) *NOT* sufficient. Remove that obsolete SQLiteConnection._transactionLevel member altogether, this is just messed up incorrect legacy code.


I'll attach a fixed version of SQLiteTransaction.cs to this bug report. I urge you to take a look at it and to consider it to eventually get this critical bug fixed. Many thanks.

anonymous added on 2016-10-31 09:33:46:
Fixed version of SQLiteTransaction attached.

Additional changes required in SQLiteConnection:

--> Remove SQLiteConnection._transactionLevel member.
--> Add new boolean SQLiteConnection._inTransaction member.

anonymous added on 2016-10-31 09:52:51:
Minor correction in attached file:

Lines 235, 240 and 241 should use local variable "cnn" instead of class member "_cnn".

mistachkin added on 2016-10-31 13:27:53:
It seems like the primary change necessary here is just to stop using the level
for constructing SAVEPOINT names, which makes sense.

However, removing _transactionLevel from the connection and replacing it with a
boolean probably isn't strictly necessary.

anonymous added on 2016-10-31 14:56:40:
You're right, it isn't strictly necessary. However, if it's not used anymore, then it also should be removed, in order to keep the code clean.

mistachkin added on 2016-10-31 18:56:01:
Retaining the _transactionLevel may be useful for the following reasons:

1. When debugging, it's easier to see how deeply nested things are.
2. Just in case people are using the private field (i.e. unsupported).

I wouldn't mind breaking people relying on #2; however, the #1 use case seems
fairly compelling, especially given how much debugging I've personally done on
this issue.

anonymous added on 2016-10-31 19:21:20:
OK, the current code from the branch looks good now. I think it should work. (I didn't do any real-life tests with it, however.)


Note that it's still possible for SQLiteConnection._transactionLevel to contain incorrect values. An example:


SQLiteTransaction t1 = connection.CreateTransaction();
SQLiteTransaction t2 = connection.CreateTransaction();
SQLiteTransaction t3 = connection.CreateTransaction();
t2.Commit();
// Actual transaction level == 1 (due to implicit t3.Commit() done by SQLite)
// SQLiteConnection._transactionLevel == 2


In fact, it's not easily possible to correctly track the current transaction nesting level of SQLite. Luckily, we don't have to. All we need to know is whether we're on the top level or not - hence the check for SQLiteConnection._transactionLevel == 0 in the SQLiteTransaction constructor.

Since SQLiteConnection._transactionLevel may contain incorrect values, I strongly suggest to remove it, and to replace it by a boolean SQLiteConnection._hasOpenTransaction member or so. This will just make the code a lot cleaner.

From an algorithmic point of view, the current implementation should be good, however.

mistachkin added on 2016-11-04 02:06:31:
I've decided to minimize the potential for breaking applications of the existing
transaction semantics by making nested transactions opt-in via a new connection
flag, "AllowNestedTransactions".

There are still some changes that are visible to applications, most notably the
fact that SQLiteTransaction is now an unsealed class; however, it does keep the
underlying database semantics identical, with the exception of now throwing an
exception if a nested transaction is attempted without the new connection flag
enabled.

mistachkin added on 2016-11-17 19:31:48:
After discussing this issue with the rest of the team, the decision has been made
to retain the existing (broken) behavior for nested transactions.  This behavior
is specifically tested for by the legacy test suite (i.e. "test.exe").  This seems
to strongly indicate that changing this behavior at this point would be a severe
break with backward compatibility.

To opt-in to the new (fixed) nested transaction behavior, please use the new
connection flag, "AllowNestedTransactions".

mistachkin added on 2016-11-17 20:31:30:
The changes described are now on trunk, via check-in [f73c069fb9089a6e].