2015-03-04
| ||
19:27 | • Closed ticket [e122d26e70]: Is the call to ColumnAffinity in SQLiteDataReader.GetSQLiteType(SQLiteConnectionFlags flags, int i) needed? plus 2 other changes artifact: 2d19ff718c user: mistachkin | |
2014-11-18
| ||
22:12 | • Fixed ticket [e122d26e70]. artifact: 22319572f7 user: mistachkin | |
22:08 | • Ticket [e122d26e70]: 4 changes artifact: e567a35f34 user: mistachkin | |
2014-10-08
| ||
20:56 | • Ticket [e122d26e70]: 3 changes artifact: 2eb926d64a user: mistachkin | |
18:49 | Various minor performance enhancements to the SQLiteDataReader class. Pursuant to [e122d26e70]. check-in: e6cc5d000b user: mistachkin tags: trunk | |
2014-10-07
| ||
18:06 | • Pending ticket [e122d26e70]: Is the call to ColumnAffinity in SQLiteDataReader.GetSQLiteType(SQLiteConnectionFlags flags, int i) needed? plus 4 other changes artifact: d020fdfed8 user: mistachkin | |
17:55 | • Ticket [e122d26e70]: 3 changes artifact: 668a4c8e8e user: mistachkin | |
17:52 | More performance adjustments to the data reader class. Pursuant to [e122d26e70]. Closed-Leaf check-in: 53e8310fda user: mistachkin tags: experimental | |
2014-10-06
| ||
18:21 | • Ticket [e122d26e70] Is the call to ColumnAffinity in SQLiteDataReader.GetSQLiteType(SQLiteConnectionFlags flags, int i) needed? status still Open with 3 other changes artifact: 80887650b4 user: mistachkin | |
2014-10-02
| ||
10:23 | • Ticket [e122d26e70]: 4 changes artifact: 8948e652c8 user: anonymous | |
2014-09-30
| ||
14:05 | • Ticket [e122d26e70]: 3 changes artifact: 9954618b3d user: mistachkin | |
14:03 | More minor performance enhancements. Pursuant to [e122d26e70]. check-in: 684f381cd4 user: mistachkin tags: experimental | |
2014-09-29
| ||
16:31 | • Ticket [e122d26e70] Is the call to ColumnAffinity in SQLiteDataReader.GetSQLiteType(SQLiteConnectionFlags flags, int i) needed? status still Open with 4 other changes artifact: abb633082d user: anonymous | |
2014-09-26
| ||
04:02 | • Ticket [e122d26e70]: 3 changes artifact: 7c27eb00e6 user: mistachkin | |
03:39 | • Ticket [e122d26e70]: 3 changes artifact: 5dc7e34bb1 user: mistachkin | |
2014-09-25
| ||
17:37 | • Ticket [e122d26e70]: 4 changes artifact: 4d8cf6f615 user: anonymous | |
2014-09-24
| ||
05:57 | • Ticket [e122d26e70]: 3 changes artifact: 7c430af946 user: mistachkin | |
05:56 | • Ticket [e122d26e70]: 3 changes artifact: 8c06191c41 user: mistachkin | |
04:15 | Experimental changes to remove superfluous calls from the SQLiteDataReader class. These changes are not yet completely correct. Pursuant to [e122d26e70]. check-in: ab425e79d7 user: mistachkin tags: experimental | |
2014-09-23
| ||
14:22 | • Ticket [e122d26e70] Is the call to ColumnAffinity in SQLiteDataReader.GetSQLiteType(SQLiteConnectionFlags flags, int i) needed? status still Open with 4 other changes artifact: bc4a2d084a user: anonymous | |
2014-09-22
| ||
19:42 | • Ticket [e122d26e70]: 6 changes artifact: 6359a77cd4 user: mistachkin | |
17:09 | • New ticket [e122d26e70]. artifact: 65fdaa03a7 user: anonymous | |
Ticket Hash: | e122d26e70f7d9d4d05f06d2f72ac71d4b03711c | |||
Title: | Is the call to ColumnAffinity in SQLiteDataReader.GetSQLiteType(SQLiteConnectionFlags flags, int i) needed? | |||
Status: | Closed | Type: | Performance | |
Severity: | Important | Priority: | Blocker | |
Subsystem: | Data_Reader | Resolution: | Fixed | |
Last Modified: | 2015-03-04 19:27:30 | |||
Version Found In: | trunk | |||
User Comments: | ||||
anonymous added on 2014-09-22 17:09:05:
The virtual function call _activeStatement._sql.ColumnAffinity(_activeStatement, i) in SQLiteDataReader.GetSQLiteType(SQLiteConnectionFlags flags, int i) showed up in our perf profiles. Now, I'm wondering why the call is needed here. Shouldn't the type affinity already be set if the type is being retrieved from the _fieldTypeArray cache? IOW, why is typ.Affinity reset if it is != TypeAffinity.Uninitialized? If the call is not needed an alternative implementation could look like this (totally untested code, may erase all your data)? private SQLiteType GetSQLiteType(SQLiteConnectionFlags flags, int i) { // Initialize the field types array if not already initialized if (_fieldTypeArray == null) _fieldTypeArray = new SQLiteType[VisibleFieldCount]; if (_fieldTypeArray[i] != null) return _fieldTypeArray[i]; var type = new SQLiteType(); _fieldTypeArray[i] = type; var connection = GetConnection(this); var columnType = _activeStatement._sql.ColumnType(_activeStatement, i, out type.Affinity); type.Type = SQLiteConvert.TypeNameToDbType(connection, columnType, flags); return type; } mistachkin added on 2014-09-22 19:42:05: The call is present as far back as the check-in [67f18e8aaa] on 2007-10-01. anonymous (claiming to be Alexander Sieb) added on 2014-09-23 14:22:38: Some background. Our program is reading ~1.5M rows using the SQLiteDataReader which is 100% cpu bound. So we are looking for ways to decrease the time spent in the data reader. The costs are roughly like this: GetDouble(100%) get_VisibleFieldCount(~25% of caller) GetDouble(~25%) VerifyType(~50%) CheckClosed(~35%) GetSQLiteType(~50%) ColumnAffinity(~80%) So removing ColumnAffinity() should result in better perf. Alternatively, VerifyType() could cache its results per column index. Btw, the call to CheckClosed() in VerifyType can be removed since all callers of VerifyType() seem to call it via VisibleFieldCount prior to calling VerifyType(). Also, CheckDisposed() is called twice in GetDouble(), GetInt64(), etc: First line and through VisibleFieldCount. Thanks for your help! mistachkin added on 2014-09-24 05:56:33: Experimentation reveals that the call to ColumnAffinity in the SQLiteDataReader.GetSQLiteType method is absolutely required as its value can change on a per-row basis. mistachkin added on 2014-09-24 05:57:27: Several other minor optimizations have been checked in on the 'experimental' branch. Can you rebuild System.Data.SQLite from that branch and re-run your benchmarks against it? anonymous (claiming to be Alexander Sieb) added on 2014-09-25 17:37:25: With System.Data.SQLite-8aa9c8c9e4 the perf profile is as follows (x86, i7 3770K): GetDouble(100%) GetDouble(~27%) CheckClosed(~21%) ?JIT_GetField64@@..(~52%) - Access to _version which is of type long. VerifyType(~42%) GetSQLiteType(~91%) ColumnAffinity(~82%) Overall perf increase seems to be about 3% compared to 1.0.94.0, but other changes in sqlite itself could be a factor too. I'll try to compare 8aa9c8c9e4 to trunk without your improvements get a clearer picture. Is there a way to set _throwOnDisposed to false? It's internal and set to true in the ctor. CheckClosed() would return early without the 64 bit access to _version if we could set it to false. FYI, inside SQLite the top five functions wrt. cpu load are: sqliteVdebExec memcpy sqlite3BtreeMovetoUnpacked _RtlLeaveCriticalSection _RtleEnterCriticalSection mistachkin added on 2014-09-26 03:39:43: Trust me when I say that the code is CheckClosed has been carefully tested to make sure that native resources are not accessed after being freed and that the previously fixed issues with resource cleanup stay fixed. That being said, I *think* the _version field can be changed to simply "int", which should speed things up a bit (thanks for pointing out it was a "long", as that had not occurred to me before this point). Given the highly dynamic nature of SQLite, the GetSQLiteType method basically must call ColumnAffinity each time to make sure the correct type affinity handling is used for a given column (and row). mistachkin added on 2014-09-26 04:02:48: I've updated the 'experimental' branch with the change of '_version' from long to int. Can you try this change and see if it improves the performance even further? anonymous (claiming to be Alexander Sieb) added on 2014-09-29 16:31:13: So I made the following additional changes. 1. I changed the last if in CheckClosed() to: SQLiteConnection cn = _command.Connection; if (cn._version != _version || cn.State != ConnectionState.Open) throw new InvalidOperationException("Connection was closed, statement was terminated"); a) I assigned _command.Connection to a variable to remove the double access. The compiler/JIT won't (aren't allowed to) do that and it was visible in the profiler. b) I changed the condition to test _version first. It looks more fluent to me since _version is used in the previous if statement. Also, _version should still be *hot* because of that. With those changes the runtime of CheckClosed() is reduced from 202 Inclusive Samples (trunk) to 60 Inclusive Samples (experimental + above changes). So the change from long to int and the other changes certainly help. 2. I believe that double is used more often than float, so I made the test for DbType.Double the first test in VerifyType 'case TypeAffinity.Double:' to help branch prediction and to execute less code on average. The difference is small, but visible in my tests on an Intel 3770K. case TypeAffinity.Int64: if (typ == DbType.Int16) return affinity; if (typ == DbType.Int32) return affinity; if (typ == DbType.Int64) return affinity; if (typ == DbType.DateTime) return affinity; if (typ == DbType.Double) return affinity; if (typ == DbType.Single) return affinity; if (typ == DbType.Decimal) return affinity; if (typ == DbType.Boolean) return affinity; if (typ == DbType.SByte) return affinity; if (typ == DbType.Byte) return affinity; break; case TypeAffinity.Double: --> if (typ == DbType.Double) return affinity; if (typ == DbType.Single) return affinity; ... Compared to trunk I get about 100000 rows/s better read throughput (1000000 instead of 900000 rows/s). mistachkin added on 2014-09-30 14:05:31: I've made the changes on the experimental branch. But now I'm wondering if there are more opportunities for optimization of the VerifyType method, e.g. for the Int64 affinity. anonymous (claiming to be Alexander Sieb) added on 2014-10-02 10:23:47: I believe DbType.Int16 is less likely to occur in real world scenarios, so I would move it after DbType.Int64. case TypeAffinity.Int64: if (typ == DbType.Int32) return affinity; if (typ == DbType.Int64) return affinity; --> if (typ == DbType.Int16) return affinity; if (typ == DbType.DateTime) return affinity; if (typ == DbType.Double) return affinity; if (typ == DbType.Single) return affinity; if (typ == DbType.Decimal) return affinity; if (typ == DbType.Boolean) return affinity; if (typ == DbType.SByte) return affinity; if (typ == DbType.Byte) return affinity; Looking at GetSQLiteType again. The lazy init of _fieldTypeArray: // Initialize the field types array if not already initialized if (_fieldTypeArray == null) _fieldTypeArray = new SQLiteType[PrivateVisibleFieldCount]; could be moved to where _fieldCount is initialized in NextResult() (near line 1395), so that it isn't executed for every single field access. This should be safe since _fieldCount has to be initialized before GetSQLiteType is called. Also _fieldTypeArray is set to null in NextResult(), so instead we could initialize the array there: // Ahh, we found a row-returning resultset eligible to be returned! _activeStatement = stmt; _fieldCount = fieldCount; _fieldIndexes = null; --> _fieldTypeArray = new SQLiteType[fieldCount]; The other thing I was thinking about is if the following two if statements mean same thing: // Initialize this column's field type instance if (_fieldTypeArray[i] == null) _fieldTypeArray[i] = new SQLiteType(); typ = _fieldTypeArray[i]; // If not initialized, then fetch the declared column datatype and attempt to convert it // to a known DbType. if (typ.Affinity == TypeAffinity.Uninitialized) { If typ is constructed as result of the first if statement typ.Affinity is TypeAffinity.Uninitialized. TypeAffinity.Uninitialized is 0 and SQLiteBase.ColumnAffinity should never return 0. If that's true if (typ.Affinity == TypeAffinity.Uninitialized) can be eliminated and its body joined with the first if statement. Plus, VerifyType would throw if typ.Affinity is left uninitialized. My simplified version would look like this (untested): private SQLiteType GetSQLiteType(SQLiteConnectionFlags flags, int i) { SQLiteType typ; // _fieldTypeArray is initialized at the of NextResult() // If not initialized, then fetch the declared column datatype and attempt to convert it // to a known DbType. if (_fieldTypeArray[i] == null) { typ = new SQLiteType(); _fieldTypeArray[i] = typ; typ.Type = SQLiteConvert.TypeNameToDbType( GetConnection(this), _activeStatement._sql.ColumnType( _activeStatement, i, out typ.Affinity), flags); } else { typ = _fieldTypeArray[i]; typ.Affinity = _activeStatement._sql.ColumnAffinity( _activeStatement, i); } return typ; } A more defensive version retaining the TypeAffinity.Uninitialized check would be: private SQLiteType GetSQLiteType(SQLiteConnectionFlags flags, int i) { SQLiteType typ; // Initialize this column's field type instance if (_fieldTypeArray[i] == null) { typ = new SQLiteType(); // If not initialized, then fetch the declared column datatype and attempt to convert it // to a known DbType. typ.Type = SQLiteConvert.TypeNameToDbType( GetConnection(this), _activeStatement._sql.ColumnType( _activeStatement, i, out typ.Affinity), flags); --> if (typ.Affinity != TypeAffinity.Uninitialized) --> _fieldTypeArray[i] = typ; } else { typ = _fieldTypeArray[i]; typ.Affinity = _activeStatement._sql.ColumnAffinity( _activeStatement, i); } return typ; } And IMHO I would invert the if to have the common case in the if branch: private SQLiteType GetSQLiteType(SQLiteConnectionFlags flags, int i) { SQLiteType typ; if (_fieldTypeArray[i] != null) { typ = _fieldTypeArray[i]; typ.Affinity = _activeStatement._sql.ColumnAffinity(_activeStatement, i); } else { typ = new SQLiteType(); // If not initialized, then fetch the declared column datatype and attempt to convert it // to a known DbType. typ.Type = SQLiteConvert.TypeNameToDbType( GetConnection(this), _activeStatement._sql. ColumnType(_activeStatement, i, out typ.Affinity), flags); if (typ.Affinity != TypeAffinity.Uninitialized) _fieldTypeArray[i] = typ; } return typ; } I'll try to test the changes in the next days with our setup. mistachkin added on 2014-10-06 18:21:54: I'll try to evaluate these changes this week. mistachkin added on 2014-10-07 17:55:06: I've adjusted the code on the 'experimental' branch (see check-in [53e8310fda]) in order to simplify things. I think this is the most optimization that can be done to the GetSQLiteType method while still retaining the existing semantics. mistachkin added on 2014-10-08 20:56:55: The changes have now been merged to trunk. |