System.Data.SQLite
View Ticket
Not logged in
Ticket UUID: 7e1dd697dc41f0eb92d7a83ccbb672459cfbbe18
Title: Consecutive transaction scope support
Status: Closed Type: Feature_Request
Severity: Minor Priority: Blocker
Subsystem: Connection Resolution: Fixed
Last Modified: 2018-01-30 21:17:54
Version Found In: 1.0.105.1
User Comments:
anonymous added on 2017-05-16 14:18:00:
SQLiteConnection does not currently support enlisting with a "null" transaction. It considers it an error and throw an ArgumentNullException. This prevents re-using a connection having participating in a previous transaction after it has completed.

Many other connection does support it. I have tested: SqlConnection, OleDbConnection, OdbcConnection, oracle managed connection, Firebird v2 connection, postgresql, mysql. 

The remarks section of `Db​Connection.​Enlist​Transaction` implies that it should at least be able of enlisting in a new transaction if the previous one is completed:

> Once a connection is explicitly enlisted in a transaction, it cannot be unenlisted or enlisted in another transaction until the first transaction finishes.

https://docs.microsoft.com/en-us/dotnet/api/system.data.common.dbconnection.enlisttransaction

But current code seems to have logic for throwing if a previous enlistment is there, without checking if that previous enlistment transaction is still active.

This issue was discovered while investigating solutions for NHibernate current issues with transaction scope handling. It might force us to use some special handling for the SQLite case. (See https://nhibernate.jira.com/browse/NH-4011 for more on those NHibernate issues, see https://github.com/nhibernate/nhibernate-core/pull/410 for the case having revealed this SQLite limitation (along with SqlCe, to be complete).)

mistachkin added on 2017-05-16 16:16:07:
Do you have an example in C# that demonstrates the issue?

anonymous (claiming to be fredericDelaporte) added on 2018-01-27 08:28:05:
As somewhat written in another ticket:
https://system.data.sqlite.org/index.html/tktview/5cee5409f84da5f6217221a09a5020bb49ac1ec7

This report is not much adequate. Enlisting with null is a workaround working for some other DB provider but which should not be actually needed.

The real trouble is when willing to chain distributed scopes usage on the same connection: it may fail with "Already enlisted in a transaction" message.

Note that this is a low occurrences trouble. My test usually do not trigger it within one thousand iterations, unless I put a "Thread.Sleep(500)" in the enlistment Commit right before the lock (added in the patch from linked ticket above). In such case the trouble triggers at the second iteration.

Here is the test I use:

[Test]
public void Loop2()
{
	var cnx = Sfi.ConnectionProvider.GetConnection();
	try
	{
		for (var i = 0; i < 1000; i++)
		{
			try
			{
				using (var tx = new TransactionScope())
				{
					ForceEscalationToDistributedTx.Escalate();

					Assert.AreNotEqual(
						Guid.Empty,
						System.Transactions.Transaction.Current.TransactionInformation.DistributedIdentifier,
						"Transaction lacks a distributed identifier");

					cnx.EnlistTransaction(System.Transactions.Transaction.Current);
					using (var cmd = cnx.CreateCommand())
					{
						cmd.CommandText = "insert into Person (Id, NotNullData) values (@id, '')";
						var param = cmd.CreateParameter();
						param.DbType = DbType.Int32;
						param.ParameterName = "@id";
						param.Value = i;
						cmd.Parameters.Add(param);
						cmd.ExecuteNonQuery();
					}

					tx.Complete();
				}
			}
			catch
			{
				Console.WriteLine("Failed at iteration " + i);
				throw;
			}
		}
	}
	finally
	{
		Sfi.ConnectionProvider.CloseConnection(cnx);
	}
}


(Where Sfi is a NHibernate thing that I use to get the SQLiteConnection on a test database with some tables in it, purged at each test run. And source code of ForceEscalationToDistributedTx is at: https://github.com/nhibernate/nhibernate-core/blob/5.0.3/src/NHibernate.Test/SystemTransactions/DistributedSystemTransactionFixture.cs#L737 )

> For the consecutive transaction scopes (the other ticket), I do not fully understand the problem.  The _enlistment field is cleared out when the transaction is committed -OR- rolled back via the enlistment.  Also, it is cleared out when the connection is closed.
> 
> When else would it need to be cleared out?

No where else unfortunately, this is a race condition. It is cleared where it needs to be cleared. But when the previous scope is distributed, its actual completion may be delayed past the scope disposal, causing the Commit or Rollback to actually occurs after the scope disposal. So when that occurs, the connection is unable to enlist in a new scope right after the previous one, because its previous enlistment is not yet cleared.

Supporting that would require a mechanism causing the enlistment in second scope to wait for the end of first enlistment, instead of throwing "Already enlisted in a transaction".
Of course such a wait should be done only if the first enlistment is indeed in the process of completing itself. It may be checked through a flag set on it in Prepare phase, since the prepare phase is guaranteed to occur before the scope disposal finishes, when there is a prepare phase. ( https://github.com/npgsql/npgsql/issues/1571#issuecomment-308651461 ) There may be no prepare phase when roll-backing unfortunately. The other way to detect whether an enlistment seems to be processing its completion is to have kept a reference on the system transaction it is bound to, and check whether it is already disposed or no more active.

mistachkin added on 2018-01-27 19:58:43:
I've been giving this problem a lot of thought.  I think I have a clean (and
simple) way to help.  In the following check-in, I've added an experimental
WaitForEnlistmentReset method to the SQLiteConnection class:

[/info/2d7e7593419da23c]

Opinion?

mistachkin added on 2018-01-27 22:36:13:
I'm marking this as "fixed", because:

  1. The changes for the other ticket fix the fundamental race condition(s).

  2. Using the new method, the inability to use consecutive transaction scopes
     can now be fully avoided.

The associated tests for ticket [5cee5409f8] should confirm both these
assertions.

anonymous (claiming to be fredericDelaporte) added on 2018-01-29 12:30:44:
> I think I have a clean (and simple) way to help.  In the following check-in, I've added an experimental WaitForEnlistmentReset method to the SQLiteConnection class

I personally avoid `Sleep` solutions excepted for tests. But well, this solution should work. Now for avoiding waiting on enlistment which are not in their completion process, I would add some code to check `_enlistment._scope` state, doing `return false;` immediately if it is still ongoing. (`_enlistment._scope.TransactionInformation.Status == TransactionStatus.Active`, but guarded in a `try catch (ObjectDisposedException)` as accessing `TransactionInformation` property on `_scope` would throw if the transaction is already disposed of.

// After acquiring the lock and having check for `_enlistment` not being `null`.
try
{
  if (_enlistment._scope.TransactionInformation.Status == TransactionStatus.Active)
    return false;
}
catch (ObjectDisposedException)
{
  // The transaction is already disposed of, meaning the enlistment can only be in the process of completing.
}
// Do the wait for completion.

By the way, I would rather use a `StopWatch` for checking if we have exhausted the wait timeout, rather than computing on `UtcNow`.

Now this method being public, I guess you were intending to instruct users to call it if they are exposed to the race condition of this ticket. This is clearly a "no-go" for a data-provider agnostic solution like NHibernate, it needs to stick to the `DbConnection` interface. So instead this method would have to be called from inside the `EnlistTransaction` to fix the trouble in a generic way, without requiring some ad-hoc code from users. The timeout would then need some configuration parameter.

> I'm marking this as "fixed", because:
>
>  1. The changes for the other ticket fix the fundamental race condition(s).

I do not really think so. It fixes the race condition on connection close case. It does not fix the race condition on transaction scope enlistment after having disposed a previous scope in which the connection was enlisted. They are different race conditions in my mind.

The test in the comment above showcases this race easily only if, just for the test purpose, a Sleep is introduced at the start of the enlistment commit. (At least it does this with the last zip you had provided on the other ticket.) Otherwise it may succeed most of the times, while the race condition could still be there, occurring at rare occasions, which is a bad thing for overall reliability.

>  2. Using the new method, the inability to use consecutive transaction scopes can now be fully avoided.

For user code that can accept using `SQLiteConnection` specific methods, yes. But for code which needs to stick to `DbConnection` interface, that is not really viable.

mistachkin added on 2018-01-29 19:52:53:
Yes, using Sleep is not an ideal solution; however, given that there is no other
obvious way to synchronize with the .NET Framework (or MSDTC?) asynchronous
transaction mechanism, I'm not sure what else to do.

I'm going to fine-tune the new method a bit, including adding the scope check.

I'm a bit weary of using StopWatch as I do not have personal experience with its
reliability.

Limiting your interaction with System.Data.SQLite to only those methods exposed
via the ADO.NET interface is a valid choice; however, changing the semantics of
those methods at this point would probably break backward compatibility.  Also,
it would be much riskier than adding a new method.

mistachkin added on 2018-01-29 19:56:39:
Wait a minute.  I was just thinking.  If I check the scope and return true,
that could still cause a problem because the point of this method was to
attempt to avoid the exception being thrown by EnlistTransaction, which
requires the _enlistment field to be null.  In other words, we would also
have to null out the _enlistment field within WaitForEnlistmentReset, which
just seems wrong.

anonymous (claiming to be fredericDelaporte) added on 2018-01-30 12:31:52:
> Wait a minute.  I was just thinking.  If I check the scope and return true, that could still cause a problem because the point of this method was to attempt to avoid the exception being thrown by EnlistTransaction, which requires the _enlistment field to be null.  In other words, we would also have to null out the _enlistment field within WaitForEnlistmentReset, which just seems wrong.

Yes it would be wrong to nullify _enlistment, that is not the point of this method in my mind. It just here to give the opportunity to wait for a previous disposed scope to actually end its work. But current code for the method can only return true if the enlistment ends up null (tanks to the wait or because it was already null), so I do not see the point here. Maybe its about another potential race condition, if the user use scopes on the connection from another thread: then yes it would not be enough for protecting the enlistment. But this would be an user bug, and this ticket should not try to fix user triggered threading issues in my mind, only those from MSDTC/.Net integration of MSDTC.

Anyway in my mind, the whole WaitForEnlistmentReset should be inlined into EnlistTransaction, allowing to do the check, acquiring the lock in it, wait a not yet finished but in the process of completing scope if needed, throw on timeout or on scope fully active, and furthermore do what EnlistTransaction needs to do inside the lock too, if conditions are met.

> changing the semantics of those methods at this point would probably break backward compatibility

I do not see a semantic change here. The EnlistTransaction is meant to allow enlisting in a scope, included after a previous one has ended. Fro the user viewpoint, scope should be ended after its disposal. That is only a MSDTC glitch if it is not truly the case. (For me it is even a bug in the way MSDTC is integrated in .Net Framework: why not having its asynchronism, but the scope disposal should handle that and not expose user code following it to its asynchronism. Unfortunately that is a decades old subject and I do not hope Microsoft will fix that.) So in the race condition case, respecting the semantic should be "dodge the not yet completed previous disposed scope by waiting a bit for it" rather than "throw due to that disposed previous scope not yet actually finished". Of course this is somewhat doing Microsoft job by fixing troubles caused by a flaw in their MSDTC/.Net integration, but currently only code having access to the enlistment can fix this. So it is then a matter of choosing between letting users be exposed helplessly to the trouble or handle it for them.

Of course I ackownledge that is still some added complexity in the enlistment, so still an added opportunity at introducing bugs.

> I'm a bit weary of using StopWatch as I do not have personal experience with its reliability.

I have never seen trouble with it and I trust its reliability, although I have only used it for measuring performances of some features (included in production), rather than actually implementing the feature thanks to it.

mistachkin added on 2018-01-30 18:28:56:
Changing EnlistTransaction the way you describe would certainly be a break with
backward compatibility.

It's too late for this release cycle; however, I suppose I could add a new
connection flag to enable this type of behavior in EnlistTransaction, so it
would not be enabled by default.  This would mitigate any backward compatbility
issues.

mistachkin added on 2018-01-30 21:17:54:
New connection flag and connection string property on trunk.